FramerD allows the computer to create, access, and manipulate descriptions and systems of description. Most computer applications work by manipulating descriptions in some systematic way. A system of description is the set of conventions, expectations, and procedures used to manipulate descriptions in a particular application area. For example, in a scheduling application, the system of description might specify:
Descriptive systems can be programmed by people or generated by machines. In either event, when users or programs add new descriptions or extend existing systems, FramerD automatically generates consequences from the additions or extensions.
FramerD was developed to support research in artificial intelligence (AI) involving the construction of artifacts which demonstrate something like human understanding and intelligence. For example, in our current research we use FramerD to encode a text database where relations and meanings are used in retrieval and matching. Each natural language phrase in the original text database is described by a different frame in FramerD; relations between these frames describe both the structure of sentences (e.g. "Bush" is the subject of "flew") and possible meanings ("flew" might mean "drove the plane" or "rode in the plane"). Taking ideas from past work in artificial intelligence, FramerD is built to describe conceptual objects and their relationships to one another. Unlike this past work, however, FramerD is designed to scale to millions or tens of millions of objects.
FramerD was designed to simplify:
If you need to describe complicated and interconnected structures and want to be able to store and share these structures, it's worthwhile looking at FramerD. In particular, if your work is currently (or constitutionally) in "development mode" and incremental changes to your database are common, FramerD may be what you are looking for.
FramerD is optimized for descriptions consisting primarily of relations to other descriptions. Relations between descriptions can be either structural relations or semantic relations. Structural relations connect elements within a particular context, for example
Semantic relations, on the other hand, connect a description to some "meaning" description elsewhere in the database, for example
Descriptions can also include simple attributes whose values are numbers or strings, but FramerD is optimized for the kinds of complicated relational structures common in artificial intelligence systems. In particular, FramerD has special operations for delayed loading and caching of objects which make it inexpensive to load objects which refer to other objects.
The kernel of FramerD functions is quite small and is easily ported between platforms and architectures. It has been ported to most versions of Unix as well as the major PC operating systems. For instance, under Windows, the command line program for accessing FramerD services takes less than 125KB (120KB of which is a shared library). The entire FramerD script interpreter and libraries are under 500KB, including index and object access, inference and search facilities, and HTML parsing and generation. For comparison, this is substantially smaller than the BASH command shell and its associated libraries on the same platform.
A FramerD database can be maintained on a server, allowing clients to access the data without the overhead of a local copy of the entire database. Likewise, particular functions --- like indexing or real-time sensor access --- can be delegated to particular servers with particular resources. A FramerD server can also locally consolidate or mirror other databases in a way which maintains a local cache of references to FramerD objects.
Here are pointers to the sections of the document, which you can also read linearly. Each section describes a different component of FramerD.
FramerD is a system and database for describing complicated structures and systems to the computer. It is based on a set of structures and protocols called DTypes which can represent richly interconnected multi-level structures. The basic structures in DTypes are directly derived from the Lisp language, but analogs to most of them exist (or are easily implemented) in nearly every programming language.
DTypes are dynamically typed: when a program refers to a DType structure, it does not need to know what kind of structure it is referring to. It might be a string, a number, a vector of strings and numbers, or a vector of such vectors. This allows programmers to write procedures which manipulate data generically without regard to its particular representation. Furthermore, because the program checks the type of the data it is operating on, it is less prone to the accidental or intentional inconsistencies of improvising or malicious programmers.
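As a rough illustration of this generic style, the following Python sketch walks a nested structure and decides what to do with each element by checking its type at run time. The function name `count_strings` is hypothetical, and Python lists merely stand in for DType vectors; none of this is FramerD's own API.

```python
def count_strings(value):
    """Count the strings in a value that may be a string, a number,
    or an arbitrarily nested list (standing in for a DType vector)."""
    if isinstance(value, str):
        return 1
    if isinstance(value, (int, float)):
        return 0
    if isinstance(value, list):
        return sum(count_strings(item) for item in value)
    # The run-time type check lets the procedure reject unexpected data
    # instead of silently misinterpreting it.
    raise TypeError(f"unsupported value: {value!r}")


nested = [1, "a", ["b", 2.5, ["c"]]]
total = count_strings(nested)
```

The same procedure handles a bare string, a flat vector, or a vector of vectors without knowing in advance which one it will receive.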
DTypes have an external representation which can be easily communicated across networks or saved in files. This external representation is common to all the software and hardware platforms where DTypes are implemented. Memory permitting, a palmtop computer running a version of FramerD can transmit a DType structure to a multi-processor compute server for processing and receive the result of that processing. DTypes allow the management of data and computation to be divided as technologically or administratively appropriate.
The external representation of DTypes provides a way for applications to package their own internal data types for transmission or storage. A transmission or storage facility does not need to know the details or purpose of the data's implementation in order to receive, store, or transmit it. This means that a programmer does not need to change their general tools or database format in order to add a new data type to their program. It also means that two clients can share data structures through a storage or communication facility which need not know the data's underlying implementation.
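To make the idea of a platform-neutral external representation concrete, here is a minimal Python sketch of a tagged binary encoding. It is not the actual DType wire format: the tag values, the `Packaged` class, and the `encode`/`decode` functions are all assumptions made for illustration. The point it tries to show is that an application-defined type can travel as a named, opaque payload which a store or transport can pass along without interpreting.

```python
import struct
from dataclasses import dataclass

# Hypothetical one-byte tags; the real DType codes are not given in this document.
TAG_INT, TAG_STR, TAG_VEC, TAG_PKG = 1, 2, 3, 4


@dataclass(frozen=True)
class Packaged:
    """An application-defined type: a name plus bytes only the application interprets."""
    kind: str
    payload: bytes


def encode(value):
    """Encode a value as tag byte + big-endian body (signed 64-bit ints only)."""
    if isinstance(value, int):
        return bytes([TAG_INT]) + struct.pack(">q", value)
    if isinstance(value, str):
        data = value.encode("utf-8")
        return bytes([TAG_STR]) + struct.pack(">I", len(data)) + data
    if isinstance(value, list):
        body = b"".join(encode(item) for item in value)
        return bytes([TAG_VEC]) + struct.pack(">I", len(value)) + body
    if isinstance(value, Packaged):
        header = encode(value.kind) + struct.pack(">I", len(value.payload))
        return bytes([TAG_PKG]) + header + value.payload
    raise TypeError(f"cannot encode {value!r}")


def decode(data, pos=0):
    """Decode one value starting at pos; return (value, next_position)."""
    tag = data[pos]
    pos += 1
    if tag == TAG_INT:
        return struct.unpack_from(">q", data, pos)[0], pos + 8
    if tag == TAG_STR:
        length = struct.unpack_from(">I", data, pos)[0]
        pos += 4
        return data[pos:pos + length].decode("utf-8"), pos + length
    if tag == TAG_VEC:
        count = struct.unpack_from(">I", data, pos)[0]
        pos += 4
        items = []
        for _ in range(count):
            item, pos = decode(data, pos)
            items.append(item)
        return items, pos
    if tag == TAG_PKG:
        kind, pos = decode(data, pos)
        length = struct.unpack_from(">I", data, pos)[0]
        pos += 4
        return Packaged(kind, data[pos:pos + length]), pos + length
    raise ValueError(f"unknown tag {tag}")


message = [1, "two", Packaged("temperature-log", b"\x00\x01")]
wire = encode(message)
restored, end = decode(wire)
```

A storage facility in this sketch only needs `decode` and `encode`; it never has to understand what a "temperature-log" payload means.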
DTypes provide a simple object database which associates numeric object identifiers (OIDs) with DType structures. DType structures can contain these object identifiers so that one DType structure can point to the location in the object database where another DType structure is stored. A scheme for assigning object identifiers ensures (as much as possible) that two structures will only use the same object identifier when they are actually referring to the same object.
The chief advantage of using object identifiers to refer to structures is that the structures can be changed without having to change the references. Because programs operate on these references by getting the associated structures, they will always get the "latest version" as changed or updated by other programs. For example, an object identifier might be used to record some changing or accumulating value like the outside temperature or a history of such temperatures. More commonly, an object might contain a description of attributes and relations where different users or processes are modifying and/or updating those attributes or relations. Because users (and other objects) go through the object reference, these changes are propagated to these users and objects.
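The indirection described here can be sketched in a few lines of Python. The `OID` and `ObjectDB` classes below are hypothetical stand-ins for an object database, assuming a simple in-memory table; they show only why a holder of a reference sees updates made by someone else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OID:
    """A reference to an object, distinct from an ordinary number."""
    address: int


class ObjectDB:
    """Toy in-memory object database mapping identifiers to structures."""

    def __init__(self):
        self._store = {}
        self._next = 0

    def allocate(self, value):
        oid = OID(self._next)
        self._next += 1
        self._store[oid] = value
        return oid

    def get(self, oid):
        return self._store[oid]

    def set(self, oid, value):
        self._store[oid] = value


db = ObjectDB()
temperature = db.allocate(18.5)
# The report stores the reference, not a copy of the reading.
report = db.allocate({"title": "weather", "reading": temperature})

before = db.get(db.get(report)["reading"])
db.set(temperature, 21.0)          # some other program updates the value
after = db.get(db.get(report)["reading"])
```

The report itself is never modified, yet anyone following its `reading` reference gets the updated temperature.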
Object identifiers are 64-bit (8 byte) integers, ranging from zero to 18,446,744,073,709,551,615 (2^64 - 1). Needless to say, this allows a large number of potential objects. This large range is divided into sub-ranges called "pools" used to guarantee that no two users or programs use the same object identifier for different purposes. A pool is a range of object ids assigned to a particular user, project, or program. This entity is responsible for managing the identifiers within that pool and no other entity is permitted to modify the mapping of the pool's object ids into structures. However, the entity (user, project, or program) may choose to further subdivide the pool and delegate control over the fragments to yet other users or programs.
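The following Python sketch models a pool as a contiguous range of identifiers with an owner, and shows how an owner might subdivide it. The `Pool` class, its field names, and the equal-sized split are assumptions for illustration; the text does not say how FramerD actually records or divides pools.

```python
from dataclasses import dataclass

OID_SPACE = 2 ** 64  # identifiers run from 0 to 2**64 - 1


@dataclass(frozen=True)
class Pool:
    base: int       # first identifier in the pool
    capacity: int   # number of identifiers in the pool
    owner: str      # the user, project, or program responsible for it

    def __post_init__(self):
        if self.base < 0 or self.capacity <= 0 or self.base + self.capacity > OID_SPACE:
            raise ValueError("pool must lie inside the 64-bit identifier space")

    def contains(self, oid):
        return self.base <= oid < self.base + self.capacity

    def subdivide(self, owners):
        """Split the pool into equal fragments, one per new owner."""
        if self.capacity % len(owners) != 0:
            raise ValueError("capacity must divide evenly among the owners")
        size = self.capacity // len(owners)
        return [Pool(self.base + i * size, size, owner)
                for i, owner in enumerate(owners)]


project = Pool(base=0x1000, capacity=1024, owner="project")
alice, bob = project.subdivide(["alice", "bob"])
```

Because the fragments are disjoint ranges, each delegate can allocate identifiers independently without colliding with the other.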
A frame is an object identifier whose reference is a set of attributes and relations. These attributes and relations are implemented as sets of slots and values. Each slot is either a symbol (a special kind of string) or another frame; for a given frame, the slot is associated with some number of particular values, each of which is either a DType object or another frame. When the values of a slot point to another frame, we say that the slot encodes a relation between the frames.
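A minimal Python sketch of this structure follows. The `Frame` class and its method names (`add`, `get`, `test`, `drop`) are hypothetical, as are the names used for the two people; a real frame would be reached through an object identifier rather than held directly in memory.

```python
from collections import defaultdict


class Frame:
    """Toy frame: each slot maps to a set of values."""

    def __init__(self, name):
        self.name = name
        self.slots = defaultdict(set)

    def add(self, slot, value):
        self.slots[slot].add(value)

    def get(self, slot):
        # Return a copy so callers cannot modify the frame by accident.
        return set(self.slots.get(slot, ()))

    def test(self, slot, value):
        return value in self.slots.get(slot, ())

    def drop(self, slot, value):
        self.slots[slot].discard(value)

    def __repr__(self):
        return f"<Frame {self.name}>"


person_a = Frame("Person A")
person_b = Frame("Person B")

person_a.add("name", "Person A")       # a literal attribute
person_a.add("alias", "A.")
person_a.add("alias", "Pat")           # a slot can hold several values
person_a.add("spouse", person_b)       # a value that is a frame encodes a relation
person_b.add("spouse", person_a)
```

Here the `spouse` slot relates the two frames to each other, while `name` and `alias` hold plain string attributes.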
For example, a biographical database might include frames describing particular individuals. The attributes of these frames might include literal values like names or dates as well as relations to other individuals or more abstract entities (like positions or organizations). A fragment of such a database might look like this:
where the values associated with the spouse and parent slots describe the corresponding relations. The cryptic string @55/55B1"Martha Washington" refers to an object by both its numeric address and its "human name"; many interfaces to FramerD hide the numeric identifier (the 55/55B1) from the user, using the human name whenever possible.
Operations on frames generally also specify a slot to operate upon. These operations include:
For readers familiar with conventional database systems, a frame is a kind of record whose fields can contain multiple values, including pointers to other frames. Also, because the data language is dynamically typed, these values need not have a pre-determined type. For readers familiar with various dialects of Lisp, frames provide (at their basic level) a kind of persistent property list facility where object identifiers, rather than symbols, carry the associated properties.
Unlike record structures in conventional languages or databases, it is easy to add new slots (fields) to a frame. This allows programs and programmers to say something about a particular object without having to say something about all similar objects. Databases can be constructed incrementally and in pieces, making development faster and more flexible. There is a small performance penalty for this flexibility, but we have found that the gain in ease of development more than repays it. FramerD can be considered an "interpreted" database environment, where the structures in the database have the same kind of flexibility as functions and variables in an interpreted environment.
When a slot is itself a frame, the slot may have a complicated behavior. In particular, operations specifying a `slot-frame' may trigger attached procedures when values are retrieved, examined, or modified. These might compute additional values or update other data structures. These attached procedures allow the system to automatically make "inferences" based on the data it is given; for instance, it might combine the fact that MIT is in Cambridge.MA and Cambridge.MA is in the United.States to infer that MIT is in the United.States. It would have to be given rules, in the form of attached procedures, to infer this fact, but once it had received these rules it could make the inference automatically.
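The MIT example can be sketched in Python as follows. Everything here is an assumption made for illustration: the `Frame` and `SlotFrame` classes, the `on_get` hook, and the `transitive_get` rule are hypothetical and do not reproduce FramerD's actual mechanism for attaching procedures.

```python
class Frame:
    def __init__(self, name):
        self.name = name
        self.slots = {}          # slot -> set of values

    def add(self, slot, value):
        self.slots.setdefault(slot, set()).add(value)

    def __repr__(self):
        return f"<Frame {self.name}>"


class SlotFrame(Frame):
    """A slot that is itself a frame and may carry an attached procedure."""

    def __init__(self, name, on_get=None):
        super().__init__(name)
        self.on_get = on_get


def get(frame, slot):
    """Retrieve values, running the slot's attached procedure if it has one."""
    if isinstance(slot, SlotFrame) and slot.on_get is not None:
        return slot.on_get(frame, slot)
    return set(frame.slots.get(slot, ()))


def transitive_get(frame, slot):
    """Attached rule: follow the slot repeatedly and collect everything reached."""
    found, pending = set(), [frame]
    while pending:
        current = pending.pop()
        for value in current.slots.get(slot, ()):
            if isinstance(value, Frame) and value not in found:
                found.add(value)
                pending.append(value)
    return found


located_in = SlotFrame("located-in", on_get=transitive_get)
mit = Frame("MIT")
cambridge = Frame("Cambridge.MA")
usa = Frame("United.States")

mit.add(located_in, cambridge)
cambridge.add(located_in, usa)

places = get(mit, located_in)
```

Only two facts are stored, but retrieving the `located-in` slot of MIT also yields the United States, because the rule attached to the slot computes the extra value on demand.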
An object identifier gives no cues as to the structure associated with it. Nonetheless, sometimes we need to identify objects based on what their structure actually is. FramerD provides indices which associate keys (arbitrary DType structures) to sets of other DType structures. For example, an index might go from words (represented as strings) to the set of descriptions of their possible meanings. Indices can also be used to implement relations which are not (for several possible reasons) implemented by slots and values. For instance, an index in an email system might store a mapping between descriptions of individuals and descriptions of the messages you've received from them. It might be expensive to update those individuals' descriptions whenever a new message is received, but indices are designed to support small, incremental changes of this sort. Indices can also be used for more complicated operations, like pattern matching or object retrieval.
Slot indices map slot-value pairs into the objects possessing those slot values.
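A small Python sketch may help make both kinds of index concrete. The `Index` class and the `index_slots` helper are hypothetical, object names are plain strings for brevity, and the sample entries are invented for illustration.

```python
from collections import defaultdict


class Index:
    """Toy index: maps a hashable key to a set of values."""

    def __init__(self):
        self._table = defaultdict(set)

    def add(self, key, value):
        self._table[key].add(value)

    def get(self, key):
        return set(self._table.get(key, ()))


def index_slots(index, obj, slots):
    """Record obj under every (slot, value) pair it possesses."""
    for slot, values in slots.items():
        for value in values:
            index.add((slot, value), obj)


# A plain index: from a sender to the messages received from them.
mail = Index()
mail.add("sender-1", "message-17")
mail.add("sender-1", "message-42")   # a small incremental update

# A slot index: from slot-value pairs to the objects that have them.
by_slot = Index()
index_slots(by_slot, "frame-a", {"color": {"red"}, "shape": {"round"}})
index_slots(by_slot, "frame-b", {"color": {"red"}, "shape": {"square"}})

red_things = by_slot.get(("color", "red"))
```

Adding a message touches only the index entry for its sender, which is why this kind of relation can be cheaper to maintain in an index than in the sender's own description.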
Pools are managed either through access to disk files (local or remote) or through communication with remote FramerD servers. Standard operations include allocating a new unique identifier, getting the structure associated with an identifier, or changing the associated structure. When a file or server manages a particular pool, we say that it is the "provider" for the pool. A provider may provide only limited or selective access to a pool; for instance, some pools may only permit access, not allocation or modification, while others may permit allocation or modification only in particular ways or to particular clients.
Pools provided by network servers may actually be copies or combinations of other pools. For instance, the official provider for the MIT version of WordNet may be a server at MIT; however, a site distant from MIT may wish to set up its own server providing a copy of the same database. The mirror site cannot be modified, of course, but it can provide the information it has gotten as a complete or partial snapshot from MIT. (In fact, the server at MIT is really a mirror of an internal site and can't normally be modified!) Similarly, a server may consolidate several pools into a single "point of service". Suppose there were a dozen different FramerD servers at site A which are of interest to a remote site. That site can set up a single server for processing requests which reroutes each request to the appropriate server at site A. Mirroring and consolidation become important as networked databases become more widely used. Sites with particularly popular databases may wish to set up "external mirrors" for the outside world while keeping the real "internal" database reserved for their own work.
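Mirroring and consolidation can be sketched in Python as below. The `PoolProvider` and `Consolidator` classes are hypothetical, in-memory dictionaries stand in for real servers, and routing by identifier range is an assumption about how requests might be dispatched.

```python
class PoolProvider:
    """Stand-in for a server (or file) that provides one pool of identifiers."""

    def __init__(self, base, capacity, objects, writable=True):
        self.base, self.capacity = base, capacity
        self.objects = dict(objects)
        self.writable = writable

    def handles(self, oid):
        return self.base <= oid < self.base + self.capacity

    def fetch(self, oid):
        return self.objects[oid]

    def store(self, oid, value):
        if not self.writable:
            raise PermissionError("this provider is a read-only mirror")
        self.objects[oid] = value

    def mirror(self):
        """Take a snapshot of this pool as a read-only copy."""
        return PoolProvider(self.base, self.capacity, self.objects, writable=False)


class Consolidator:
    """Single point of service that reroutes each request to the right provider."""

    def __init__(self, providers):
        self.providers = list(providers)

    def fetch(self, oid):
        for provider in self.providers:
            if provider.handles(oid):
                return provider.fetch(oid)
        raise KeyError(f"no provider for identifier {oid}")


lexicon = PoolProvider(0, 100, {1: "book"})
sensors = PoolProvider(100, 100, {100: 18.5})
front = Consolidator([lexicon, sensors])

snapshot = lexicon.mirror()
lexicon.store(1, "tome")        # the original changes after the snapshot
```

A client talking to `front` never needs to know which provider holds a given identifier, and the snapshot keeps serving the data it copied even after the original is updated.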
Super pools are used to organize the allocation of pools at a particular site. A super pool is a pool of roughly 4 billion object identifiers sharing the same initial 32 bits of identifier. In the 64-bit FramerD object identifier address space, there are also roughly 4 billion super pools. The first such super pool (the "zero" pool) contains objects describing the remaining super pools. New pools can be allocated through the "zero pool server" (currently at MIT) which can be accessed either directly (through DType protocol connections) or through a form on the FramerD web page at "http://framerd.www.media.mit.edu/".
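The arithmetic behind super pools is simple bit manipulation, sketched here in Python. The function names are hypothetical, and the printed form `@high/low` in hexadecimal is an assumption based on the `@55/55B1` notation used earlier in this document.

```python
SUPER_POOL_BITS = 32
OFFSET_MASK = (1 << SUPER_POOL_BITS) - 1   # low 32 bits


def make_oid(super_pool, offset):
    """Combine a super pool number and an offset into one 64-bit identifier."""
    if not (0 <= super_pool <= OFFSET_MASK and 0 <= offset <= OFFSET_MASK):
        raise ValueError("both halves must fit in 32 bits")
    return (super_pool << SUPER_POOL_BITS) | offset


def split_oid(oid):
    """Return (super_pool, offset) for a 64-bit identifier."""
    return oid >> SUPER_POOL_BITS, oid & OFFSET_MASK


def format_oid(oid):
    """Render an identifier as @high/low in hexadecimal (assumed notation)."""
    high, low = split_oid(oid)
    return f"@{high:x}/{low:x}"


oid = make_oid(0x55, 0x55B1)
super_pool, offset = split_oid(oid)
label = format_oid(oid)

identifiers_per_super_pool = 1 << SUPER_POOL_BITS   # 4,294,967,296
```

Two identifiers belong to the same super pool exactly when their high 32 bits are equal, and an identifier whose high half is zero falls in the "zero" pool.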
FramerD was first designed to allow diverse users at the MIT Media Lab to share data, provide lab-wide utilities, etc. As a result, access to FramerD was made available from many different platforms. Basic libraries allow FramerD data and services to be directly accessed from C, Lisp, and Java.
FramerD includes FDScript, a simple scripting language based on Scheme. FDScript can be used to write command line scripts (it is smaller than many shells), implement specialized network servers, or provide World Wide Web (CGI) access to FramerD data and services.
Basic FramerD services are built on the DType protocol. DType services can be accessed through libraries for Perl and TCL, from the command line (with the dtcall executable), or from the C, Lisp, or Java libraries.
C does not natively provide most of the datatypes used in FramerD structures. Pairs, lists, heterogeneous vectors, and interned symbols need to be implemented by the C function libraries connecting to FramerD. The libraries for accessing FramerD provide these data structures, access to the DType protocol, object maintenance and caching functions, and the FDScript interpreter. Despite this, they are relatively lightweight, as shown in this table:
|Library||Binary Library Size|
data structures, network access
object, index, and frame access
R4RS evaluator, inference procedures, etc
Lisp already provides most of the native DType structures. The FramerD libraries in Lisp and Scheme are loaded to provide the same level of access as the C libraries, including network access, local caching, and FDScript execution.
FDScript is a lightweight Lisp interpreter which incorporates the FramerD kernel functions. It is appropriate for casual database access, writing command line scripts, and implementing CGI scripts for accessing FramerD databases from the World Wide Web. FDScript uses a uniform parenthesized syntax for expressions and operations. For instance, in the following expression: (index-get "email@example.com" "book") the operation named INDEX-GET is being called with two strings as arguments. The arguments can also be expressions, so that: (fget (index-get "wn15@wn15" "book") 'senses) applies the operation FGET to two arguments: the result of the embedded INDEX-GET and the symbol 'senses. (The quote "'" protects the object that follows it from being evaluated.)