The Online Services Development Association (OSDA)
a proposal for a
PUBLIC access RESOURCE DATABASE NETWORK (PRDN)
==============================================
(c) MARCH 1983
WHAT IS OSDA:
This is simply a working group title for a number of sysops
and other individuals who have been active in the development
of online services. OSDA is a non-profit organisation and has a
good working relationship with a number of helpful
organisations.
INTRODUCTION:
The general concept of a distributed database is not new.
However this proposed application of the concept is
different, innovative and, more importantly, significant for
what it may achieve.
Simply, the proposal is to develop a low cost, Public access
Resource Database Network (PRDN).
Sounds expensive...?
We hasten to add that due to the different and innovative
nature of this proposal, it is not so.
Perhaps a surprise to some, a great deal of the initial and
innovative ground work has been quietly going on for a number
of years.
Much of the credit is due to the many dedicated, hard working
volunteers and organisations, including enlightened
elements in the computer industry.
WHY A PRDN:
There are a great many important and beneficial reasons why
the proposed PRDN should be developed. To touch briefly on a
few:
There is a very real danger that 'Information Poverty' (IP)
will continue to develop alongside our apparent economic
ills, leading to more immense and difficult to solve
developmental problems in the future.
A developed PRDN would make a very significant contribution
to development in all areas, made possible by the
availability of useful information and resources.
There are still tremendous perceptual problems amongst the
population at large about the computer's role in society. Few
fully appreciate the computer as a useful tool.
A suitably presented PRDN could do a great deal to promote a
wider public understanding and acceptance of the benefits of
computers as a useful tool.
The greatest wealth of useful information is held by society
as a whole. A PRDN can tap this COMMON RESOURCE and begin to
make it AVAILABLE and AFFORDABLE to all.
Parallel benefits to the development of a PRDN would be
economic development for makers and providers of requisite
equipment, software and services. The traditional costs
associated with obtaining Information and Resources could be
reduced dramatically.
By providing for the user to contribute useful information to
the PRDN, purposeful, creative and satisfying employment will
be possible for many. This would also do much to build a
sense of belonging and community into fragmented elements of
society.
WHY NOT A CENTRALISED SYSTEM:
Generally, traditional systems are centralised, which often
means relatively high overheads, high investment and slow
adaptation to change, so they can become obsolete very quickly.
The methods of networking (information distribution) use
currently expensive high speed data links to provide quick
(real time) access to the distant centralised store of
information.
Such systems command relatively high service charges which in
themselves limit wider access.
Information is often kept secret and protected against
open publication so its value stays high to ensure sufficient
return.
This may be considered good business practice, but it can
also slow the wider and more general development of an
efficient society and feed Information Poverty.
To put such systems in perspective, while they remain
relatively expensive to use, they are best suited to
specialist information services for which there is little or
no need by the majority.
DEVELOPING A SUITABLE PRDN:
There is already an embryonic backbone of such a system with
a number of willing system owners and operators.
This backbone is made up of a number of Nodes (computers
with substantial data storage facilities). These already
network with each other to exchange information, electronic
mail (email) and data, mainly via the telephone system on a
need to connect basis (i.e. local node information changes or
new email requiring delivery automatically trigger the node
to make appropriate connections on a priority basis).
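The need to connect behaviour described above could be sketched as a small priority queue of pending calls. This is only an illustration: the names CallQueue, trigger and next_call, the node numbers and the priority values are assumptions for the example, not part of any existing Node software.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class PendingCall:
    # Lower number = more urgent. seq keeps equal priorities in arrival order.
    priority: int
    seq: int
    target_node: int = field(compare=False)
    reason: str = field(compare=False)


class CallQueue:
    """Holds the connections a Node still has to make (hypothetical sketch)."""

    def __init__(self):
        self._heap = []
        self._seq = 0

    def trigger(self, target_node, reason, priority):
        # Called when local information changes or new email needs delivery.
        heapq.heappush(
            self._heap, PendingCall(priority, self._seq, target_node, reason)
        )
        self._seq += 1

    def next_call(self):
        # Return the most urgent pending call, or None when nothing is waiting.
        return heapq.heappop(self._heap) if self._heap else None


queue = CallQueue()
queue.trigger(target_node=7, reason="index change", priority=5)
queue.trigger(target_node=3, reason="email", priority=1)
first = queue.next_call()
print(first.target_node, first.reason)
```

Here the email for node 3 is dialled first because it was given the more urgent priority, even though the index change was queued earlier.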
Collectively these individual computers contain a growing
wealth of useful information.
A simple to use automatic RESOURCE DATABASE system is
required to enable Nodes to provide access to the information
on the collective system.
There are a number of ways of achieving this; what is
required is general agreement on the best method.
RESOURCE DATABASE OUTLINE:
One method is as follows:
The Resource Database would offer a simple menu system for
subject, title, description and keyword search of the
collective information.
Immediate access would be given to information held on the
local node, while other information would be requested
from the node holding the information. Subject to priority
this would be transferred immediately or later.
Regularly requested information could be held locally to save
on networking overheads. Local Node software would
automatically decide what to hold locally.
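One simple way a Node might decide what to hold locally is to count requests and keep a copy once a file has been asked for often enough. The sketch below assumes such a request count rule; the class LocalStore, its methods and the threshold value are hypothetical and stand in for whatever method is finally agreed.

```python
from collections import Counter


class LocalStore:
    """Hypothetical local holding policy based on request counts."""

    def __init__(self, hold_threshold=3):
        self.hold_threshold = hold_threshold  # assumed tuning value
        self.held = {}             # file number -> contents held on this Node
        self.requests = Counter()  # file number -> times requested here
        self.pending = []          # file numbers to request from other Nodes

    def lookup(self, file_number):
        # Immediate access if held locally, otherwise queue a network request.
        self.requests[file_number] += 1
        if file_number in self.held:
            return self.held[file_number]
        self.pending.append(file_number)
        return None

    def receive(self, file_number, contents):
        # A requested file has arrived; keep a copy only if it is popular.
        if self.requests[file_number] >= self.hold_threshold:
            self.held[file_number] = contents
        return contents


store = LocalStore(hold_threshold=2)
store.lookup("0000000001z")            # first request, fetched remotely
store.receive("0000000001z", b"text")
store.lookup("0000000001z")            # second request reaches the threshold
store.receive("0000000001z", b"text")
print("0000000001z" in store.held)
```

A real Node would also need a rule for discarding local copies when storage runs short, which is left out here.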
All nodes would use the same Resource Database files, with
changes and additions automatically updated by a central
processing node with at least one full backup node.
Individual nodes would automatically forward details of
changes to their content for inclusion.
It would not be necessary to redistribute the entire
database, as Node based software would automatically update
the main database files from small Content Change files.
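As an illustration of updating from Content Change files, the sketch below treats the main database as a table keyed by system file number and a change file as a list of small records. The record layout (operation, file number, entry) and the function name apply_changes are assumptions made for this example only.

```python
def apply_changes(index, changes):
    """Apply one Content Change file to a Node's copy of the index.

    index   -- dict mapping file number to its index entry
    changes -- iterable of (operation, file_number, entry) records,
               where operation is "add", "update" or "delete"
    """
    for operation, file_number, entry in changes:
        if operation in ("add", "update"):
            index[file_number] = entry
        elif operation == "delete":
            # Ignore deletes for entries this Node never received.
            index.pop(file_number, None)
        else:
            raise ValueError("unknown operation: " + operation)
    return index


index = {"0000000001a": {"title": "Bus timetable"}}
change_file = [
    ("add", "0000000001b", {"title": "Library opening hours"}),
    ("update", "0000000001a", {"title": "Bus timetable 1983"}),
    ("delete", "0000000000z", None),
]
apply_changes(index, change_file)
print(sorted(index))
```

Only the three small change records travel over the network; the rest of the index stays where it is.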
As a PRDN develops and traffic increases faster data links
may become economic, providing wider real time access.
Some suitable form of reimbursement to cover costs may be
considered necessary. It is important that whatever method is
introduced, it does not prevent those most in need from using
such a system.
Some Sponsorship Funding may be found by encouraging some
forms of advertising. Within limits the General Public might
be freely permitted to Advertise Items for Sale while business
advertising could provide sponsorship. An ADvert INDEX
(Adindex) could enable fast location of items needed
and improved use of resources.
Teachers frequently reinvent the wheel when they create
teaching material their colleagues have created countless
times before. Such resources could be put in an Education
Library with open access and contributions from all.
It may also be appropriate to include the facility for
Authors of material to charge some suitable small amount each
time their information is transferred to an End User. This
could greatly encourage the provision of useful information,
electronic publications and software.
SOLUTIONS TO SOME TECHNICAL PROBLEMS
====================================
VOLUME OF DATA:
With low cost Gigabyte storage for computers just round the
corner we need to plan for a large amount of potential growth.
While a PRDN is embryonic the index files will not be large,
perhaps taking up megabytes. However when considering the
techniques we innovate or adopt, we should keep an eye on
future growth to avoid later problems.
Longer term, we should perhaps expect a minimum of 1000
gigabytes of storage within a national PRDN (It is difficult
to estimate this).
Allowing for localised duplication of files held on other
Nodes we may expect the main INDEX files to cope with
5,000,000 file titles and descriptions in the not too distant
future.
INDEX DATA STRUCTURE:
We need something fairly simple and compact in use of storage
space and network transfer time.
A Subject number system would save considerable space, three
bytes being required for 16,777,216 subject areas. The local
Node would use a look up table for conversion.
1. SUBJECT (3 byte number with look up table)
2. TITLE (word number token string)
3. KEYWORDS (word number token string with look up table)
4. FILE DESCRIPTION (COMPEX compressed string data)
5. INFORMATION STATUS (one byte with look up table)
6. PRIMARY LOCATION (holding node address 4 byte number)
7. SYSTEM FILE NUMBER (Read/write name 11 bytes)
8. FILE SIZE (2 byte number nearest K)
The above index format uses the least space but may need
modification if data processing time is likely to become a
problem.
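To show how compact such an index entry could be, here is a minimal sketch that packs the eight fields in the order listed. The field widths for items 1 and 5 to 8 come from the list above. The one byte length prefix on the three variable fields, the big-endian byte order and the function names are assumptions, and the title, keyword and description bytes are taken to be already tokenised or compressed.

```python
import struct

# Items 5 to 8: status (1 byte), node address (4), file number (11), size in K (2).
TAIL = struct.Struct(">BI11sH")


def _prefixed(data):
    # Assumed framing: one length byte, so each variable field is 0 to 255 bytes.
    if len(data) > 255:
        raise ValueError("field too long for a one byte length prefix")
    return bytes([len(data)]) + data


def pack_entry(subject, title, keywords, description,
               status, node, file_number, size_k):
    # subject is item 1: a 3 byte number, 0 to 16,777,215.
    return (subject.to_bytes(3, "big")
            + _prefixed(title) + _prefixed(keywords) + _prefixed(description)
            + TAIL.pack(status, node, file_number, size_k))


def unpack_entry(record):
    subject = int.from_bytes(record[:3], "big")
    pos = 3
    variable = []
    for _ in range(3):  # title, keywords, description
        length = record[pos]
        variable.append(record[pos + 1:pos + 1 + length])
        pos += 1 + length
    status, node, file_number, size_k = TAIL.unpack(record[pos:pos + TAIL.size])
    return (subject, *variable, status, node, file_number, size_k)


record = pack_entry(1234, b"\x01\x02", b"\x03", b"", 2, 123456,
                    b"0000000001z", 64)
print(len(record), unpack_entry(record)[0])
```

Under these assumptions the fixed part of every entry is 21 bytes, plus three length bytes and whatever the title, keywords and description need.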
COMPRESSED TEXT STRINGS:
The COMPEX protocol permits fast real time compression and
decompression of string text.
This works on the basis that about 75% of text uses only
thirteen or so characters from the alphabet. It uses a nibble
code algorithm to code and decode text accordingly. This
could provide a reduction in size of between 30% and 50%,
subject to the use of characters.
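The general idea of a nibble code can be sketched as follows. This is not the COMPEX protocol itself, whose details are not given here: the choice of thirteen common characters, the escape and padding codes and the function names are all assumptions made for illustration, and the sketch handles ASCII text only.

```python
COMMON = " etaoinshrdlu"  # assumed set of thirteen frequent characters
ESC = 13                  # assumed: next two nibbles hold a raw character
PAD = 14                  # assumed: filler when the nibble count is odd


def compress(text):
    nibbles = []
    for byte in text.encode("ascii"):
        code = COMMON.find(chr(byte))
        if code >= 0:
            nibbles.append(code)                     # one nibble
        else:
            nibbles += [ESC, byte >> 4, byte & 0x0F]  # three nibbles
    if len(nibbles) % 2:
        nibbles.append(PAD)
    # Pack two nibbles into each output byte.
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


def decompress(data):
    nibbles = []
    for byte in data:
        nibbles += [byte >> 4, byte & 0x0F]
    out = []
    i = 0
    while i < len(nibbles):
        code = nibbles[i]
        if code == PAD:
            break
        if code == ESC:
            out.append(chr((nibbles[i + 1] << 4) | nibbles[i + 2]))
            i += 3
        else:
            out.append(COMMON[code])
            i += 1
    return "".join(out)


sample = "the rain in the north"
packed = compress(sample)
print(len(sample), len(packed), decompress(packed) == sample)
```

With this particular scheme, text made up entirely of the common characters halves in size, while text in which 75% of characters are common averages 1.5 nibbles per character, a saving of about a quarter. A less costly escape than the one assumed here would be needed to do better on ordinary text.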
COLLECTIVE SYSTEM FILE NUMBER:
To maintain downwards compatibility we will be stuck with the
eight character file name and three character type part used
by many systems.
A suitable number scheme is to use characters "0" through "9"
and "a" through "z" to provide a 36 base number system. Across
all eleven characters this provides for around
130,000,000,000,000,000
unique file names, sufficient for 26,000,000 file names each
for every member of the world's population.
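The arithmetic can be checked with a short sketch that converts between an ordinary number and an eleven character base 36 file number. The function names and the zero padding to full width are assumptions for the example; the eight plus three split follows the file name convention mentioned above.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"
WIDTH = 11  # eight character name plus three character type


def to_file_number(n):
    # Convert a number to a fixed width base 36 string, padded with "0".
    if not 0 <= n < 36 ** WIDTH:
        raise ValueError("number out of range for an 11 character name")
    chars = []
    for _ in range(WIDTH):
        n, remainder = divmod(n, 36)
        chars.append(DIGITS[remainder])
    return "".join(reversed(chars))


def from_file_number(name):
    # Python's int() accepts base 36 directly.
    return int(name, 36)


def as_name_and_type(name):
    # Present the file number in the familiar 8.3 form.
    return name[:8] + "." + name[8:]


print(36 ** WIDTH)
print(as_name_and_type(to_file_number(1983)))
```

The exact total is 36 to the power 11, or 131,621,703,842,267,136, which agrees with the rounded figure of around 130,000,000,000,000,000.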
COMPLETE SYSTEM ACCOUNTING:
Complete system accounting with the ability to check
costs, level of use for information provided, files, Users
and traffic would be essential.