The Internal Data Format of RIB
Send questions or comments to rib@nhse.org
October 24, 1997
Introduction
Repository In a Box (RIB) is a toolkit for creating software repositories that can share cataloging information over the Internet. RIB uses the Basic Interoperability Data Model (BIDM), an IEEE standard for software cataloging on the Internet. If this data model needs to be modified to meet the needs of a particular software repository, RIB allows it to be customized through a configuration file. This document describes the BIDM, how a configuration file can be used to customize the BIDM, and how RIB uses an HTML binding of the BIDM to store its catalog information. With a good understanding of these topics, you will be able to create applications that read a RIB repository's configuration file and create object HTML files for that repository without using the RIB web interface. This is useful for those who have a large collection of existing data that they would like to import directly into a RIB repository.
The BIDM - an object-oriented data model
The BIDM is an object-oriented data model that borrows many concepts from object-oriented programming. If you are familiar with object-oriented programming then you have a head start on understanding how the BIDM works. For the purpose of the BIDM, an object is an entity that is described by its attributes and relationships. Examples of objects in the BIDM are Asset and Organization. The attributes of an object are a list of properties that uniquely describe that object. For example, the attributes of an Asset object could be its Name, Abstract, Cost, and Restrictions. The attributes of an Organization object could be its Name, Email, Telephone, and Address. A relationship is used to make a connection between two objects. For example, a relationship between an Organization and a Library could be ContactIs. Figure 1 is a graphical representation of how two BIDM objects can be connected by a relationship.
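The idea of an object described by its attributes and relationships can be sketched in a few lines of Python. This is only an illustration of the concept, not part of RIB: the BidmObject class, its field names, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BidmObject:
    """Hypothetical in-memory picture of one BIDM object."""
    class_name: str                                    # e.g. "Asset" or "Organization"
    attributes: dict = field(default_factory=dict)     # attribute name -> list of values
    relationships: dict = field(default_factory=dict)  # relationship name -> list of objects

    def relate(self, name, target):
        """Connect this object to another object through a named relationship."""
        self.relationships.setdefault(name, []).append(target)


# Made-up example values: an Asset connected to an Organization by ContactIs.
org = BidmObject("Organization", {"Name": ["Example Org"], "Email": ["info@example.org"]})
asset = BidmObject("Asset", {"Name": ["Example Asset"], "Abstract": ["A sample asset."]})
asset.relate("ContactIs", org)
```

Each attribute maps to a list of values here because, as described later in this document, a configuration file may allow an attribute or relationship to appear more than once.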
Figure 1: a relationship between two objects
The full version of the BIDM as approved by the IEEE has a very rich vocabulary of object types, relationships, and attributes. A graphical representation of the full BIDM is available in the figures section of the RIB User's Guide.
The RIB configuration file
RIB allows the BIDM to be customized through the use of a configuration file. A default configuration file that uses a subset of the characteristics of the full BIDM is copied into each repository when it is first created by RIB. This configuration file controls what types of objects that repository can contain and specifies the attributes of those objects and the relationships between them.
The format of the configuration file is SGML, which will look familiar to anyone who has used HTML. Three types of tags are used in the BIDM configuration file: class, attribute, and relationship. The class tag has two fields: name and extends. The name field specifies the name of the class. The extends field specifies the parent class of the class (if it has one). A class that has a parent class inherits all of the attributes and relationships of its parent. The attribute and relationship tags each have name, req, and mult fields. The name field specifies the name of the attribute or relationship. The presence of the req field specifies that the attribute or relationship is required, and the presence of the mult field specifies that there can be multiple instances of the attribute or relationship. The attribute tag also has a dtype field, which specifies the data type of the attribute. Legal values for the dtype field are string, text, url, and date. The relationship tag has a dest field, which specifies the target class of the relationship. Anything between the opening of an attribute or relationship tag and its closing tag is a description of that attribute or relationship.
For example, a configuration file for the data model depicted in Figure 1 above could look like the following:
<class name="RigObject">
  <attribute name="Name" req mult dtype="string">
    Name or title for the object.
  </attribute>
</class>

<class name="Asset" extends="RigObject">
  <relationship name="ContactIs" dest="Organization" req mult>
    An organization that originated or produced this asset.
  </relationship>
  <attribute name="Abstract" req dtype="text">
    General definition or explanation of the asset.
  </attribute>
  <attribute name="Cost" req dtype="string">
    Any costs associated with the use or possession of the asset.
  </attribute>
  <attribute name="Restrictions" dtype="text">
    Legal information governing the use of the asset, including possibly
    copyright, data rights, disclaimers, export restrictions, and licenses.
  </attribute>
</class>

<class name="Organization" extends="RigObject">
  <attribute name="Email" req dtype="email">
    An electronic mail address for the organization.
  </attribute>
  <attribute name="Telephone" dtype="string">
    A (voice) telephone number for the organization.
  </attribute>
  <attribute name="Address" dtype="string">
    A postal address for the organization.
  </attribute>
</class>
Although the syntax of the configuration file above may look a little overwhelming at first, it is actually quite simple once you get the hang of it and is in fact very close to the default "Simple" configuration file used by RIB for a new repository.
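As a rough sketch of how an application might read such a configuration file, the Python below uses the standard library's html.parser module, which tolerates unknown tags and valueless fields such as req and mult. It is not RIB's own parser. The RibConfigParser class, the members_of helper, the dictionary layout, and the shortened sample text are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class RibConfigParser(HTMLParser):
    """Hypothetical reader for class, attribute, and relationship tags."""

    def __init__(self):
        super().__init__()
        self.classes = {}    # class name -> {"extends": parent or None, "members": [...]}
        self._class = None   # class currently being read
        self._member = None  # attribute or relationship currently being read

    def handle_starttag(self, tag, attrs):
        fields = dict(attrs)  # valueless fields such as req map to None
        if tag == "class":
            self._class = {"extends": fields.get("extends"), "members": []}
            self.classes[fields["name"]] = self._class
        elif tag in ("attribute", "relationship") and self._class is not None:
            self._member = {
                "kind": tag,
                "name": fields["name"],
                "req": "req" in fields,    # presence alone means "required"
                "mult": "mult" in fields,  # presence alone means "multiple allowed"
                "dtype": fields.get("dtype"),
                "dest": fields.get("dest"),
                "description": "",
            }
            self._class["members"].append(self._member)

    def handle_data(self, data):
        if self._member is not None:
            self._member["description"] += data

    def handle_endtag(self, tag):
        if tag in ("attribute", "relationship") and self._member is not None:
            # Collapse line breaks and indentation inside the description.
            self._member["description"] = " ".join(self._member["description"].split())
            self._member = None
        elif tag == "class":
            self._class = None


def members_of(classes, name):
    """Return the members of a class, with inherited members first."""
    parent = classes[name]["extends"]
    inherited = members_of(classes, parent) if parent is not None else []
    return inherited + classes[name]["members"]


# Shortened sample in the format described above.
SAMPLE = """
<class name="RigObject">
  <attribute name="Name" req mult dtype="string">Name or title for the object.</attribute>
</class>
<class name="Asset" extends="RigObject">
  <relationship name="ContactIs" dest="Organization" req mult>
    An organization that originated or produced this asset.
  </relationship>
  <attribute name="Restrictions" dtype="text">Legal information.</attribute>
</class>
"""

parser = RibConfigParser()
parser.feed(SAMPLE)
parser.close()
asset_members = members_of(parser.classes, "Asset")
```

Here asset_members holds the inherited Name attribute followed by the members declared on Asset itself, which mirrors the inheritance rule for the extends field.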
The HTML binding of the BIDM
Since HTML files are a very convenient format for documents that are transported over the Internet, RIB uses an HTML binding of the BIDM to store its catalog records in HTML files. These files are called object description files. RIB uses the <META> and <LINK> tags, which according to the HTML specification must reside between the <HEAD> and </HEAD> tags of a file. <META> tags are used for attributes and <LINK> tags are used for relationships. For example, RIB could use the following object description file to describe an Asset named "Bench++" which has a ContactIs relationship to an Organization:
<HTML>
<HEAD>
<META name="BIDM.Asset.Abstract" content="The suite is based on the Ada PIWG suite, and therefore is designed to test individual language features, with some small applications and &quot;traditional&quot; benchmarks thrown in.">
<LINK rel="BIDM.Asset.ContactIs.Organization" href="http://www.nhse.org/rib/repositories/benchweb/objects/Organization/joeorost.html">
<META name="BIDM.Asset.DateOfInformation" content="Wed Jul 9 16:13:08 1997">
<META name="BIDM.Asset.Domain" content="Benchmark and Example Programs!CPU">
<META name="BIDM.Asset.Name" content="Bench++">
<META name="BIDM.Asset.TargetEnvironment" content="Workstation">
</HEAD>
<BODY>The name of the Asset stored in this file is Bench++</BODY>
</HTML>
<META> tags use the name and content fields to specify each attribute's name and value. <LINK> tags use the rel field to specify the name of a relationship and the href field to specify the URL where an object description file for the target object can be found. Anything between the <BODY> and </BODY> tags is not used by RIB when it parses the file, so this area can be left empty. However, when RIB creates an object description file it fills this area with an HTML table that describes the object, so that the object's information is visible in a WWW browser (information between the <HEAD> and </HEAD> tags is not displayed by a WWW browser).
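A minimal sketch of reading an object description file back into attributes and relationships follows, again using Python's standard html.parser module. It is not RIB's own parser. The ObjectFileParser class and its dictionary layout are hypothetical, and the sample is a shortened version of the Bench++ file shown above.

```python
from html.parser import HTMLParser


class ObjectFileParser(HTMLParser):
    """Hypothetical reader that collects META and LINK tags found inside HEAD."""

    def __init__(self):
        super().__init__()
        self.attributes = {}     # META name -> list of content values
        self.relationships = {}  # LINK rel -> list of href values
        self._in_head = False

    def handle_starttag(self, tag, attrs):
        fields = dict(attrs)  # the parser lowercases tag and field names
        if tag == "head":
            self._in_head = True
        elif self._in_head and tag == "meta" and fields.get("name") is not None:
            self.attributes.setdefault(fields["name"], []).append(fields.get("content") or "")
        elif self._in_head and tag == "link" and fields.get("rel") is not None:
            self.relationships.setdefault(fields["rel"], []).append(fields.get("href") or "")

    def handle_endtag(self, tag):
        if tag == "head":
            self._in_head = False  # anything in BODY is ignored


SAMPLE = """<HTML>
<HEAD>
<META name="BIDM.Asset.Name" content="Bench++">
<META name="BIDM.Asset.TargetEnvironment" content="Workstation">
<LINK rel="BIDM.Asset.ContactIs.Organization" href="http://www.nhse.org/rib/repositories/benchweb/objects/Organization/joeorost.html">
</HEAD>
<BODY>The name of the Asset stored in this file is Bench++</BODY>
</HTML>"""

parser = ObjectFileParser()
parser.feed(SAMPLE)
parser.close()
```

Values are stored in lists because a configuration file may permit an attribute or relationship to occur more than once in the same object.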
Every object that is stored in a RIB repository is represented by an HTML file in that repository's objects directory. For example, if an Asset called "foobar" were created with the RIB web interface then you could look in the objects/Asset directory of that repository (if you have access to the filesystem) and you would see a file named "foobar.html". This file is the object description file for that Asset. Similarly, creating an Organization named "goo" would cause an object description file named "goo.html" to appear in the objects/Organization directory.
If you understand what RIB expects to see in an object description file then you can create your own object description files in the appropriate subdirectory of the objects directory. These files will automatically be detected by RIB, so they can be managed by the RIB web interface after their insertion. However, care should be taken to make sure that any files placed in the objects directory are -completely- HTML compliant so that RIB can parse them correctly. Remember that a foreign repository might pick up the HTML files you create, so HTML parse errors could affect more repositories than just your own. Care should also be taken to be sure that any file placed into one of the objects directories obeys the rules specified in the configuration file for that type of object (class). For example, the configuration file for a repository specifies which attributes and relationships are required for its objects and which can be multiply defined. Disobeying the rules in the configuration file creates inconsistencies in your object description files, which can cause undesirable side effects.
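The two precautions above, checking an object against the configuration rules and emitting well-formed HTML, could be sketched as follows in Python. This is an illustration under stated assumptions rather than RIB's own code: the check_rules and render_object_file functions and the layout of the rules list are hypothetical, and escaping quotation marks as &quot; is simply standard HTML practice, not something this document says RIB requires.

```python
from html import escape


def check_rules(rules, attributes, links):
    """Return a list of rule violations; an empty list means none were found.

    rules      -- list of dicts with keys kind, name, req, mult (hypothetical layout)
    attributes -- dict mapping an attribute name to a list of values
    links      -- list of (relationship name, destination class, href) tuples
    """
    counts = {("attribute", name): len(values) for name, values in attributes.items()}
    for name, _dest, _href in links:
        key = ("relationship", name)
        counts[key] = counts.get(key, 0) + 1
    problems = []
    for rule in rules:
        n = counts.get((rule["kind"], rule["name"]), 0)
        if rule["req"] and n == 0:
            problems.append(f"missing required {rule['kind']} {rule['name']}")
        if not rule["mult"] and n > 1:
            problems.append(f"{rule['kind']} {rule['name']} may appear only once")
    return problems


def render_object_file(class_name, attributes, links):
    """Build the text of an object description file with escaped field values."""
    lines = ["<HTML>", "<HEAD>"]
    for name, values in attributes.items():
        for value in values:
            lines.append(f'<META name="BIDM.{class_name}.{name}" content="{escape(value)}">')
    for name, dest, href in links:
        lines.append(f'<LINK rel="BIDM.{class_name}.{name}.{dest}" href="{escape(href)}">')
    lines += ["</HEAD>", "<BODY></BODY>", "</HTML>"]  # BODY may be left empty
    return "\n".join(lines) + "\n"


# Rules matching part of the sample configuration file shown earlier.
rules = [
    {"kind": "attribute", "name": "Name", "req": True, "mult": True},
    {"kind": "attribute", "name": "Abstract", "req": True, "mult": False},
    {"kind": "relationship", "name": "ContactIs", "req": True, "mult": True},
]
attributes = {"Name": ["Bench++"], "Abstract": ['Includes "traditional" benchmarks.']}
links = [("ContactIs", "Organization",
          "http://www.nhse.org/rib/repositories/benchweb/objects/Organization/joeorost.html")]

problems = check_rules(rules, attributes, links)
page = render_object_file("Asset", attributes, links)
```

Running the check before writing anything into the objects directory keeps a malformed or incomplete record from reaching RIB or a foreign repository. The resulting text would then be saved under the class's subdirectory, for example objects/Asset, with a file name chosen by the caller.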