Connecting RIB to MDS and the Grid

Introduction

Automatic resource allocation. Transparent running of computational jobs. Ubiquitous compute power. Grid-based computing promises all of this and more.

Grid computing is essentially metadata-based. That is, large amounts of metadata must be brought together to enable the ubiquitous, transparent computing the grid promises. Consider, for instance, a programmer submitting a job to the grid. Suppose the job requires an SGI Origin 2000 with LAPACK installed. The programmer must locate such a machine and marshal whatever other resources are necessary (e.g., machine time and accounts).

The holy grail is the automation of this resource search. Doing so requires storing large amounts of metadata about grid resources: metadata about each machine connected to the grid, metadata about software, and, perhaps most importantly, metadata about software deployments (which software is installed on which machine).
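As a rough illustration of the automated search described above, the following minimal Python sketch matches a job's requirements against machine and deployment metadata. The record types, field names, and sample machines ("alpha", "beta", "gamma") are invented for this example; they do not reflect the actual RIB or MDS data models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    name: str   # hypothetical host name
    model: str  # hardware model, e.g. "SGI Origin 2000"


@dataclass(frozen=True)
class Deployment:
    machine: str   # name of the machine the software is installed on
    software: str  # name of the installed package


def find_machines(machines, deployments, model, software):
    """Return the names of machines of the given model that have the software deployed."""
    deployed_on = {d.machine for d in deployments if d.software == software}
    return sorted(m.name for m in machines
                  if m.model == model and m.name in deployed_on)


# Illustrative data only.
machines = [
    Machine("alpha", "SGI Origin 2000"),
    Machine("beta", "SGI Origin 2000"),
    Machine("gamma", "IBM SP2"),
]
deployments = [
    Deployment("alpha", "LAPACK"),
    Deployment("gamma", "LAPACK"),
    Deployment("beta", "ScaLAPACK"),
]

matches = find_machines(machines, deployments, "SGI Origin 2000", "LAPACK")
print(matches)
```

Note that the query cannot be answered from machine metadata or software metadata alone; it is the deployment records that tie the two together.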

Key Players

Several tools have been developed within the research community for dealing with metadata and metadata-based grid computing. These include the Repository in a Box (RIB) toolkit and the Globus toolkit. RIB is a toolkit for building and maintaining metadata repositories. Globus is a toolkit for building compute grids.

Repository in a Box

Repository in a Box (RIB) is a product of the National High Performance Software Exchange (NHSE). RIB provides tools for the creation and maintenance of metadata repositories. It also provides a customizable web-based interface for browsing and searching metadata.

Although RIB is capable of storing metadata conforming to an arbitrary data model, its primary use is the storage and cataloging of software metadata. By default, this metadata conforms to the Basic Interoperability Data Model (BIDM), an IEEE standard.

The NHSE has extended the BIDM to encompass other data concerning software. A key extension is the inclusion of software deployment metadata. This dovetails nicely with one of RIB's features -- the ability to generate matrices displaying related metadata. In particular, this feature has been used to generate matrices of software deployment information, with machine names on one axis and software packages on the other. Information concerning the deployment of a software package on a particular machine is contained within the corresponding matrix cell.
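The deployment matrix idea can be sketched in a few lines of Python. This is not RIB's implementation; the tuple layout, the function names, and the sample data (machine names and version strings) are assumptions made purely for illustration.

```python
def build_matrix(deployments):
    """Index (machine, package, info) tuples by machine and package."""
    machines = sorted({m for m, _, _ in deployments})
    packages = sorted({p for _, p, _ in deployments})
    cells = {(m, p): info for m, p, info in deployments}
    return machines, packages, cells


def render_matrix(machines, packages, cells):
    """Render a plain-text matrix: one row per package, one column per machine."""
    header = ["package"] + machines
    rows = [header]
    for p in packages:
        # "-" marks a package that is not deployed on that machine.
        rows.append([p] + [cells.get((m, p), "-") for m in machines])
    widths = [max(len(row[i]) for row in rows) for i in range(len(header))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )


# Illustrative data only: (machine, package, deployed version).
deployments = [
    ("alpha", "LAPACK", "3.0"),
    ("beta", "LAPACK", "2.0"),
    ("alpha", "MPICH", "1.2"),
]

machines, packages, cells = build_matrix(deployments)
text = render_matrix(machines, packages, cells)
print(text)
```

Here each cell holds only a version string; in practice a cell could hold or link to any deployment information recorded for that machine and package pair.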

Metacomputing Directory Service

The Metacomputing Directory Service (MDS) is the component of the Globus toolkit concerned with metadata. Metadata concerning all objects important to the grid is stored in this service. MDS is built upon an LDAP directory server, providing an easy-to-search, hierarchical representation of grid metadata.

In particular, MDS contains (or will contain) information concerning machines, software, and software deployments. Automated tools will search MDS to marshal the resources necessary to compute on the grid. Currently, only machine information is stored within MDS. A suitable representation of software metadata has not yet been decided upon. Software deployment information is also not yet available from MDS.
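Because MDS sits on an LDAP server, an automated tool would ultimately express its resource requirements as an LDAP search filter. The sketch below builds such a filter string in plain Python, using the standard LDAP filter syntax and escaping rules. The attribute names and the object class shown are hypothetical, since the MDS schema for software and deployments is not settled.

```python
def escape_filter_value(value):
    """Escape characters that are special inside an LDAP search filter value."""
    out = []
    for ch in value:
        if ch in "\\*()\0":
            # Special characters are written as a backslash plus two hex digits.
            out.append("\\%02x" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


def and_filter(**attrs):
    """Build an AND filter requiring every attribute to equal the given value."""
    parts = "".join(
        "(%s=%s)" % (name, escape_filter_value(value))
        for name, value in sorted(attrs.items())
    )
    return "(&%s)" % parts


# Hypothetical attribute names and object class.
query = and_filter(objectClass="computeResource", model="SGI Origin 2000")
print(query)
```

The resulting string could be handed to any LDAP client as the search filter; the search base and scope would depend on how the directory tree is laid out.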

Project Overview

Given that RIB and MDS both deal with software and deployment metadata, it is natural for the two projects to co-operate in some manner. The goal of the RIB-MDS integration project is to finalize the representation of software and deployment metadata and to provide that metadata to MDS. This integration will be accomplished by the creation of software to extract data from RIB, convert it as necessary, and place it in MDS.
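The extract, convert, and place pipeline can be pictured with a small conversion step. The following Python sketch turns one deployment record into an LDIF entry, the text format commonly used to load data into an LDAP server. It is only a sketch: the record layout, the distinguished name structure, the `softwareDeployment` object class, and the base DN are all assumptions, and the names are assumed to contain no characters that need escaping inside a DN.

```python
import base64


def ldif_line(attr, value):
    """Format one LDIF attribute line, base64-encoding values that are not plain safe text."""
    safe = (
        value[:1] not in (" ", ":", "<")
        and not value.endswith(" ")
        and all(ch not in "\0\r\n" and ord(ch) < 128 for ch in value)
    )
    if safe:
        return "%s: %s" % (attr, value)
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return "%s:: %s" % (attr, encoded)


def deployment_to_ldif(record, base_dn):
    """Convert a deployment record (a dict of strings) into one LDIF entry."""
    # Hypothetical layout: the deployment entry sits beneath its machine entry.
    dn = "cn=%s,cn=%s,%s" % (record["software"], record["machine"], base_dn)
    lines = [ldif_line("dn", dn), ldif_line("objectClass", "softwareDeployment")]
    for attr in sorted(record):
        lines.append(ldif_line(attr, record[attr]))
    return "\n".join(lines) + "\n"


# Illustrative record, as it might look after extraction from a repository.
record = {"machine": "alpha", "software": "LAPACK", "version": "3.0"}
entry = deployment_to_ldif(record, "o=Example Grid")
print(entry)
```

Producing LDIF text rather than writing to the directory directly keeps the conversion step easy to inspect and test before anything is loaded into the server.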

Adoption of a suitable data model for software and deployments will allow the project scope to be extended considerably. For instance, the RIB deployment matrix can be leveraged for MDS use. Additionally, RIB provides a somewhat easier-to-use data entry facility -- this can be integrated with MDS as well.

Key Technologies

Relevant Documents

Relevant Links