CTWatch
August 2007
The Coming Revolution in Scholarly Communications & Cyberinfrastructure
Herbert Van de Sompel, Los Alamos National Laboratory
Carl Lagoze, Cornell University

4. Towards a Solution for Compound Information Objects

The work to date on the formulation of an ORE interoperability layer over the web architecture consists of three components:

  • Using Resource Maps for describing compound objects;
  • Referencing compound object resources; and
  • Discovering Resource Maps.

Like other efforts in the scholarly community to develop interoperable, machine-readable representations of research artifacts, this ORE work is inspired by activities in the semantic web community. Two notable influences are Linked Data [11] and named graphs. We note, however, that the semantic web influence on ORE work does not mandate an implementation built on semantic web technologies such as RDF, RDFS, RDF/XML, and OWL. Rather, OAI-ORE might employ more lightweight technologies that are easier to adopt and that can be used in applications that typically do not require explicit semantic information (e.g., existing popular search engines, blogs). These simpler formats could then be transformed by mechanisms such as GRDDL [14] to extract data for more complex semantic applications.

4.1 Using Resource Maps to Describe Compound Objects

As explained above, a major issue with representing compound objects on the web is the loss of the notion of the logical whole because the web architecture has no standardized facility for representing aggregations of resources. As such, a major focus of OAI-ORE is to add this missing logical boundary information and to do so in a machine-readable manner.

We are examining the use of named graphs [12,13] as a model for expressing this boundary information. A named graph is a set of nodes and arcs. In this context, a node is a URI-identified web resource, and an arc is a directional link between two such resources typed by a URI that indicates a relationship type. The named graph itself is identified by means of a URI. This URI-identification means that the named graph itself is a logical unit and is an addressable resource on the web. As such, named graphs provide a mechanism for describing the basic relationship between a compound object and its components that is missing when a compound object is published to the web. In addition, named graphs provide a means of associating a proxy URI identity (the URI of the named graph) with the compound object. Finally, in addition to expressing this basic boundary-type information, these graphs can express semantically richer information because they may contain arbitrarily typed resources (nodes) and relationships (arcs) that, for example, meet the requirements of a specific application domain.
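The named graph model described above can be sketched as a small data structure. This is only an illustration under stated assumptions, not an ORE serialization: the class names `Arc` and `NamedGraph`, the `example.org` URIs, and the `hasPart` relationship type are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Arc:
    """A directional link between two URI-identified resources."""
    source: str    # URI of the resource the arc starts from
    relation: str  # URI that indicates the relationship type
    target: str    # URI of the resource the arc points to


@dataclass
class NamedGraph:
    """A set of arcs that is itself identified by a URI."""
    uri: str
    arcs: set = field(default_factory=set)

    def add_arc(self, source, relation, target):
        self.arcs.add(Arc(source, relation, target))

    def nodes(self):
        # Every resource that appears at either end of an arc is a node.
        return {a.source for a in self.arcs} | {a.target for a in self.arcs}


# Hypothetical URIs, used only for illustration.
HAS_PART = "http://example.org/rel/hasPart"
PAPER = "http://example.org/paper-1"

graph = NamedGraph(uri="http://example.org/graphs/paper-1")
graph.add_arc(PAPER, HAS_PART, "http://example.org/paper-1/article.pdf")
graph.add_arc(PAPER, HAS_PART, "http://example.org/paper-1/dataset.csv")
```

Because the graph carries its own URI (`graph.uri`), it can be referenced as a unit, which is the proxy identity the text describes.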

Figure 5. Publishing a Resource Map to expose logical boundaries in the web graph.

The ORE interoperability layer intends to leverage named graphs by publishing Resource Maps that describe compound objects. A Resource Map is a named graph in which the nodes are resources that correspond with a compound object and its components, as well as resources that are related to these (e.g., the citations of a scholarly paper). A Resource Map must unambiguously distinguish between the “internal” resources (the compound object and its components) and the “external” resources (those merely related to them). The arcs of a Resource Map are typed relationships between those resources. We envision a core relationship ontology and the ability to extend this core with discipline-specific ontologies. For practical reasons, a Resource Map is identified by means of a protocol-based URI (e.g., HTTP URI). This makes it possible to obtain machine-readable representations of the Resource Map through content negotiation. As a result, a Resource Map is considered an information resource as per [15].
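As a rough sketch of these two requirements, the Python below keeps internal and external resources in separate sets and prepares (without sending) an HTTP request whose `Accept` header asks for a particular representation of the Resource Map. The `ResourceMap` class, the `negotiation_request` helper, the `example.org` URIs, the `cites` relationship, and the choice of `application/rdf+xml` as media type are assumptions made for illustration; the text above does not prescribe any of them.

```python
from dataclasses import dataclass, field
from urllib.request import Request


@dataclass
class ResourceMap:
    """Hypothetical model of a Resource Map: a URI plus classified nodes."""
    uri: str
    internal: set = field(default_factory=set)  # compound object and components
    external: set = field(default_factory=set)  # related resources, e.g. citations
    arcs: list = field(default_factory=list)    # (source, relation, target) tuples

    def add_internal(self, resource_uri):
        if resource_uri in self.external:
            raise ValueError(f"already external: {resource_uri}")
        self.internal.add(resource_uri)

    def add_external(self, resource_uri):
        if resource_uri in self.internal:
            raise ValueError(f"already internal: {resource_uri}")
        self.external.add(resource_uri)

    def relate(self, source, relation, target):
        # Only resources already classified may take part in an arc, so the
        # internal/external distinction stays unambiguous.
        known = self.internal | self.external
        for node in (source, target):
            if node not in known:
                raise ValueError(f"unclassified resource: {node}")
        self.arcs.append((source, relation, target))


def negotiation_request(resource_map, media_type):
    """Build, but do not send, a request for one representation of the map."""
    return Request(resource_map.uri, headers={"Accept": media_type})


# Hypothetical URIs and relationship type, used only for illustration.
CITES = "http://example.org/rel/cites"
rem = ResourceMap(uri="http://example.org/rem/paper-1")
rem.add_internal("http://example.org/paper-1/article.pdf")
rem.add_external("http://example.org/other-paper")
rem.relate("http://example.org/paper-1/article.pdf", CITES,
           "http://example.org/other-paper")

request = negotiation_request(rem, "application/rdf+xml")
```

Passing `request` to `urllib.request.urlopen` would perform the actual retrieval; a server that supports content negotiation could then return the representation matching the `Accept` header.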

Published Resource Maps overlay the web graph and effectively become part of (are merged into) it. This is illustrated in Figure 5, where the top pane shows the web graph without the information contained in the Resource Map and the bottom pane illustrates how the boundary of the object and relationships among the components are now visible in the web graph. The URI of the Resource Map (R in Figure 5) provides a web-based handle to the aggregate of multiple resources and their inter-relationships in the Resource Map. This URI can be referenced by standard web applications.
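The overlay effect can be illustrated by treating both the web graph and a Resource Map as sets of (source, relation, target) arcs and merging them with a set union. This is a simplified sketch: the URIs, the `linksTo` and `hasPart` relationship types, and the helper names are hypothetical, and the real web graph is of course not available as a single set.

```python
# Hypothetical URIs and relationship types, used only for illustration.
LINKS_TO = "http://example.org/rel/linksTo"
HAS_PART = "http://example.org/rel/hasPart"
R = "http://example.org/rem/paper-1"  # URI of the Resource Map
PDF = "http://example.org/paper-1/article.pdf"
DATA = "http://example.org/paper-1/dataset.csv"
BLOG = "http://example.org/blog/post"

# The web graph before publication: plain links, no boundary information.
web_graph = {
    (BLOG, LINKS_TO, PDF),
    (PDF, LINKS_TO, DATA),
}

# The arcs contributed by the published Resource Map.
resource_map_arcs = {
    (R, HAS_PART, PDF),
    (R, HAS_PART, DATA),
}


def merge(graph, map_arcs):
    """Overlay the Resource Map on the web graph (set union of arcs)."""
    return graph | map_arcs


def components_of(graph, handle, relation):
    """Resources reachable from the handle URI via the given relation."""
    return {t for s, r, t in graph if s == handle and r == relation}


before = components_of(web_graph, R, HAS_PART)  # boundary not visible
merged = merge(web_graph, resource_map_arcs)
after = components_of(merged, R, HAS_PART)      # boundary now visible
```

Before the merge, nothing in the graph groups the PDF and the dataset; afterwards, the URI `R` acts as the handle from which both components can be found.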


Reference this article
Van de Sompel, H., Lagoze, C. "Interoperability for the Discovery, Use, and Re-Use of Units of Scholarly Communication," CTWatch Quarterly, Volume 3, Number 3, August 2007. http://www.ctwatch.org/quarterly/articles/2007/08/interoperability-for-the-discovery-use-and-re-use-of-units-of-scholarly-communication/
