DLSD’s entity-relationship data model for the DAMS Metadata Registry combines the findings of the International Federation of Library Associations and Institutions (IFLA) "Functional Requirements for Bibliographic Records" (FRBR) with Carl Lagoze’s research for the Harmony Project on relationships as entities.
The element of FRBR of greatest interest to DLSD is its distinction among the different states in which an intellectual work can exist. As the IFLA document explains:
"The entities in the first group (as depicted in the figure below) represent the different aspects of user interests in the products of intellectual or artistic endeavour. The entities defined as work (a distinct intellectual or artistic creation) and expression (the intellectual or artistic realization of a work) reflect intellectual or artistic content. The entities defined as manifestation (the physical embodiment of an expression of a work) and item (a single exemplar of a manifestation), on the other hand, reflect physical form. The relationships depicted in the diagram indicate that a work may be realized through one or more than one expression (hence the double arrow on the line that links work to expression). An expression, on the other hand, is the realization of one and only one work (hence the single arrow on the reverse direction of that line linking expression to work). An expression may be embodied in one or more than one manifestation; likewise a manifestation may embody one or more than one expression. A manifestation, in turn, may be exemplified by one or more than one item; but an item may exemplify one and only one manifestation."
Defining the characteristics, or attributes, of the different instantiations of an intellectual resource makes it possible to accommodate the wide variety of intellectual efforts represented in UT Austin’s digital assets.
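The cardinalities in the IFLA passage quoted above can be sketched in code. The following is a minimal illustration, not DLSD’s actual schema: the class and field names, the labels, and the identifiers are all hypothetical. It assumes only the rules stated in the quotation: an expression realizes exactly one work, a manifestation embodies one or more expressions, and an item exemplifies exactly one manifestation.

```python
from dataclasses import dataclass, field
from typing import List


# eq=False keeps comparison by identity, which avoids recursing
# through the back-references between the entities.
@dataclass(eq=False)
class Work:
    title: str
    expressions: List["Expression"] = field(default_factory=list)


@dataclass(eq=False)
class Expression:
    label: str
    work: Work  # an expression realizes one and only one work
    manifestations: List["Manifestation"] = field(default_factory=list)

    def __post_init__(self):
        self.work.expressions.append(self)


@dataclass(eq=False)
class Manifestation:
    label: str
    expressions: List[Expression]  # one or more expressions
    items: List["Item"] = field(default_factory=list)

    def __post_init__(self):
        if not self.expressions:
            raise ValueError("a manifestation must embody at least one expression")
        for expression in self.expressions:
            expression.manifestations.append(self)


@dataclass(eq=False)
class Item:
    identifier: str
    manifestation: Manifestation  # an item exemplifies one and only one manifestation

    def __post_init__(self):
        self.manifestation.items.append(self)


# Illustrative values only; the labels and identifiers are placeholders.
work = Work("Gone With the Wind")
text = Expression("original English text", work)
edition = Manifestation("a print edition", [text])
copy_a = Item("copy-a", edition)
copy_b = Item("copy-b", edition)
```

Storing the link in both directions is a convenience for navigation in this sketch; a relational implementation would more likely use foreign keys and a join table for the many-to-many link between expressions and manifestations.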
Further consideration of UT Austin’s particular situation revealed the importance of well-defined relationships between digital assets. Here, Carl Lagoze’s publications on the relationship entities investigated in the Harmony Project pointed to an approach that would accommodate the extensible and organic nature of digital objects.
The relationships that exist between digital objects are complex. Digital information objects exist in many forms, and the ease with which they can be reproduced makes them difficult to manage. Keeping track of multiple drafts of a report produced in a word processing application only begins to convey the complexity of serving digital content to audiences via the World Wide Web, where surrogates are created on the fly from master digital files.
For example, it is common for smaller JPEG images to be created on demand from TIFF masters through special programming in a file server system. In this case, the JPEG image may or may not be retained. With an adequate data model for a metadata registry, the relationship between master and derivative can be identified and recorded, for instance to accrue usage statistics. If this interest is planned for in the development of the Metadata Registry’s database, the relationship can be recorded automatically at the time the JPEG derivative is created and stored in the Metadata Registry database for future reference.
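As a rough sketch of the scenario above, the code below records one derivation event each time a JPEG is generated from a TIFF master and then counts the events per master as a simple usage statistic. The `MetadataRegistry` class, its method names, and the file identifiers are hypothetical, and the in-memory list merely stands in for the Metadata Registry database.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class DerivationEvent:
    master_id: str          # the TIFF master the derivative was made from
    derivative_id: str      # the JPEG that was produced
    derivative_format: str
    retained: bool          # whether the derivative was kept after delivery
    occurred_at: datetime


class MetadataRegistry:
    """Hypothetical in-memory stand-in for the registry database."""

    def __init__(self) -> None:
        self._events: List[DerivationEvent] = []

    def record_derivation(
        self,
        master_id: str,
        derivative_id: str,
        derivative_format: str,
        retained: bool,
        occurred_at: Optional[datetime] = None,
    ) -> DerivationEvent:
        # Called by the image server at the moment a derivative is created.
        event = DerivationEvent(
            master_id,
            derivative_id,
            derivative_format,
            retained,
            occurred_at or datetime.now(timezone.utc),
        )
        self._events.append(event)
        return event

    def derivation_count(self, master_id: str) -> int:
        # Usage statistic: how often derivatives were made from this master.
        return sum(1 for e in self._events if e.master_id == master_id)


registry = MetadataRegistry()
first = registry.record_derivation("master-001.tif", "thumb-001a.jpg", "JPEG", retained=False)
second = registry.record_derivation("master-001.tif", "thumb-001b.jpg", "JPEG", retained=True)
```

Because the event is recorded even when `retained` is false, the registry keeps the usage history of a master file after a discarded derivative is gone.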
This example describes the tracking of an event, together with information about that event, for later use by the metadata system. The relationship between the input and output entities is recorded, as is the event itself. The Harmony Project investigated this event-aware approach in depth. In the paper "An Event-Aware Model for Metadata Interoperability," the authors outline a data model for event relationships:
"These concepts are illustrated below. The larger circles represent manifestations of a resource as it moves through a set of event transitions; the events are represented by the squares interspersed between the circles. For example, event E1 may be a creation event that produces resource R1. This resource may then be acted on by a translation event - event E2 - producing resource R2 and so on. The rectangles at the bottom of the figure represent metadata descriptions (instances of particular metadata vocabularies), and the ellipses that enclose part of the resource/event lifecycle represent the snapshot of the lifecycle addressed by that particular metadata description. For example, the larger dark-shaded ellipse represents the snapshot described by desc1, and the smaller light-shaded ellipse the snapshot described by desc2. The smaller circles within each descriptive record are the actual elements, or attributes, of the description. The dotted lines (and the color of each circle) indicate the linkage of the metadata element to an event - as shown the elements in desc1 are actually associated with three different events that are implicit in the snapshot. For example, the attributes (moving from left to right) may describe creator, translator, and publisher, which are actually "agents" of the events. As shown, the three rose colored elements are all associated with a single event E3, implying a relationship between them that can be exploited in mapping between the two descriptive vocabularies that form the basis for the different descriptions."
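One way to read the quoted passage is that each element of a description carries a link to the event it describes, so that elements sharing an event can be grouped together. The sketch below illustrates that idea only. The identifiers E1, E2, E3, R1, and R2 follow the quotation; the class names, the element names, the placeholder values, the resource R3, and the choice of publication as the kind of E3 are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Event:
    event_id: str
    kind: str
    input_resource: Optional[str]  # None for a creation event
    output_resource: str


@dataclass(frozen=True)
class Element:
    name: str      # e.g. "creator"
    value: str
    event_id: str  # the event this element is associated with


def elements_by_event(description: List[Element]) -> Dict[str, List[str]]:
    """Group the element names of one description by their linked event."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for element in description:
        grouped[element.event_id].append(element.name)
    return dict(grouped)


# A resource lifecycle as a chain of events; E3 and R3 are assumed here.
lifecycle = [
    Event("E1", "creation", None, "R1"),
    Event("E2", "translation", "R1", "R2"),
    Event("E3", "publication", "R2", "R3"),
]

# One metadata description whose elements point at three different events.
desc1 = [
    Element("creator", "(agent name)", "E1"),
    Element("translator", "(agent name)", "E2"),
    Element("publisher", "(agent name)", "E3"),
    Element("date_published", "(date)", "E3"),
]

grouped = elements_by_event(desc1)
```

In this sketch, `publisher` and `date_published` both land under E3, which is the kind of implied relationship the authors suggest can be exploited when mapping between descriptive vocabularies.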
Currently, DLSD is investigating how the integration of these two metadata models will need to be adapted to fit UT Austin’s circumstances. The primary concern is to develop a system flexible enough to accommodate unexpected metadata requirements that may emerge later.
Discussion continues as the Metadata Registry data model evolves, but integrating FRBR’s lifecycle of an intellectual work as a resource with the Harmony Project’s concept of information entities appears to be the most favorable path. The data model will be refined and expanded as more practical situations are tested against it. Until then, the model provides a layered framework that accommodates the levels of granularity required for the metadata sets of digital assets.
A practical example of how this data model might work follows. The example uses Gone With the Wind to trace the intellectual work’s lifecycle in the context of the different resource and event entities that produce metadata to be recorded and maintained for the information object. The example spans both the physical manifestations of an intellectual work and its more dynamic and complex digital versions.