CTWatch
May 2006
Designing and Supporting Science-Driven Infrastructure
Charlie Catlett, Pete Beckman, Dane Skow and Ian Foster, The Computation Institute, University of Chicago and Argonne National Laboratory

3.1 Software and Resource Integration and Support

The TeraGrid software environment involves four areas. Grid middleware, including the Globus Toolkit, Condor, and other tools, provides capabilities for harnessing TeraGrid resources as an integrated system. These Grid middleware components are deployed, along with libraries, tools, and virtualization constructs, as the Coordinated TeraGrid Software and Services (CTSS) system, which provides users with a common development and runtime environment across heterogeneous platforms. This common environment lowers barriers users encounter in exploiting the diverse capabilities of the distributed TeraGrid facility to build and run applications. A software deployment validation and verification system, Inca, continuously monitors this complex environment, providing users, administrators, and operators with real-time and historical information about system functionality. In addition, users are provided with login credentials and an allocations infrastructure that allows a single allocation to be used on any TeraGrid system through the Account Management Information Exchange (AMIE) [24] account management and distributed accounting system.
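To make Inca's role concrete, the sketch below shows the kind of check an Inca-style reporter performs: verifying that an expected middleware command is installed on a resource and reporting a pass/fail result. This is a minimal illustration in Python, not the actual Inca reporter API; the command names, version flags, and output format are assumptions chosen for the example.

```python
#!/usr/bin/env python
"""Minimal sketch of an Inca-style software check (not the real Inca API).

Verifies that expected Grid middleware commands are installed and that a
version string can be read from them, emitting simple timestamped
PASS/FAIL records of the sort a monitoring system could collect.
"""
import re
import shutil
import subprocess
import time


def check_command(name, version_flag):
    """Return (passed, detail) for one software-availability check."""
    path = shutil.which(name)
    if path is None:
        return False, f"{name}: not found on PATH"
    argv = [path] if version_flag is None else [path, version_flag]
    try:
        out = subprocess.run(argv, capture_output=True, text=True, timeout=30)
    except subprocess.TimeoutExpired:
        return False, f"{name}: version query timed out"
    match = re.search(r"\d+\.\d+(\.\d+)?", out.stdout + out.stderr)
    if match is None:
        return False, f"{name}: could not parse a version number"
    return True, f"{name} {match.group(0)} at {path}"


if __name__ == "__main__":
    # Hypothetical checks for components named in the CTSS software stack;
    # the version flags are assumptions and may differ by installation.
    for cmd, flag in (("globus-url-copy", "-version"), ("condor_version", None)):
        passed, detail = check_command(cmd, flag)
        stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
        print(f"{stamp}\t{'PASS' if passed else 'FAIL'}\t{detail}")
```

In a production validation system, results like these would be collected centrally and exposed both as current status and as a historical record, which is the role the article describes for Inca.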

These four components, combined with related administrative policies and procedures, must work seamlessly together to deliver TeraGrid services to users. The software integration effort must ensure that these components can be readily deployed, updated, and managed by resource provider staff, while also working with science partners to harden and enhance the capabilities of the overall system, and with the NSF Middleware Initiative (NMI) project to implement an independent external test process.

Increasingly, TeraGrid is also providing service-hosting capabilities that allow science communities to leverage the operational infrastructure of this national-scale grid. For example, data and collections-hosting services are provided as part of the TeraGrid resource provider activities at SDSC. Users may request storage space, specifying their desired access protocols, ranging from remote file I/O via a wide-area parallel filesystem to GridFTP [25] and Storage Resource Broker (SRB) [26]. Similarly, communities are provided with software areas on all TeraGrid computational resources, thus enabling community-supported software stacks and applications to be deployed TeraGrid-wide.
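As an illustration of how a community might script access to such hosted storage, the following sketch wraps the Globus Toolkit's globus-url-copy client to push a file into a hosted collection over GridFTP. The server name and paths are hypothetical, and a valid Grid proxy credential (for example, one created with grid-proxy-init) is assumed to already exist.

```python
#!/usr/bin/env python
"""Illustrative sketch: staging a file to hosted storage over GridFTP.

Wraps the Globus Toolkit's globus-url-copy command-line client. The
endpoint and paths below are hypothetical, and a valid Grid proxy
credential is assumed to be in place before this script runs.
"""
import subprocess


def gridftp_copy(src_url, dst_url, parallel_streams=4):
    """Copy src_url to dst_url via globus-url-copy; raise on failure."""
    cmd = [
        "globus-url-copy",
        "-p", str(parallel_streams),  # parallel TCP streams help on wide-area links
        src_url,
        dst_url,
    ]
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical endpoints: push a local result file into a hosted collection.
    gridftp_copy(
        "file:///home/user/results/run42.dat",
        "gsiftp://gridftp.example.org/collections/mycommunity/run42.dat",
    )
```

The same pattern applies to the other access protocols the paragraph mentions; for SRB, a community would substitute the corresponding SRB client commands.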

A general-purpose facility such as TeraGrid must evolve constantly in concert with the changing and growing needs and ideas of its user community. Sometimes the need for a new capability, or for the improvement of an existing one, is obvious from operational experience or from groundswell requests from the user community. In other cases, multiple competing approaches may arise within particular communities or subsets of the facility; these must then either be replaced by a single common component or integrated into a coherent system. TeraGrid services are defined as part of the CTSS package, with major releases at roughly six-month intervals used to introduce new capabilities.

The costs of integrating, deploying, and operating these software systems can be significant. The TeraGrid project applies 10 FTEs to the tasks of integrating, verifying, validating, and deploying new capabilities. These staff work with 42 resource integration and support FTEs at the resource provider facilities, who are responsible for the support and administration of the specific computational, information management, visualization, and data resources that those facilities operate.
