The computer science community faces a new stage in the development of interfaces for computation and resource sharing: wide-area, high-performance computing, or computational power grids. The burgeoning movement toward computational grids is rapidly transforming the context in which computer science research occurs. While the vision of advanced research environments based on computational grids holds a strong attraction for researchers in many disciplines, the increased complexity and distributed nature of computational grids make it clear that computer scientists must play a leading role in the effort to develop these systems. Because of its long experience in turning innovative computer science research into usable software solutions for the research community, the group of investigators that led this project was well positioned to address the challenges of this national endeavor.
As a foundation for this effort, we built a computational grid for Computer Science research that mirrored, within the University of Tennessee, Knoxville (UTK) campus environment, both the underlying technologies and the types of research collaboration that were structuring the growth of the national technology Grid. We called this infrastructure the Scalable Intracampus Research Grid, or SInRG. The purpose of the SInRG infrastructure was to make it possible to engage in a set of complementary investigations that embraced both the core research needed for basic grid-enabled computing and the interdisciplinary collaborations needed to help drive Computer Science forward in response to the demands of advanced applications from other domains.
SInRG had four properties that we thought were essential to achieving this end:
- It was a genuine grid, realistically mirroring the essential features that make computational grids both promising and problematic. At a minimum, such realism depends on the fact that the computing resources that compose a grid are usually owned and controlled by different parties, geographically distributed, and heterogeneous in their architectures and implementations.
- It was designed for research, i.e. with the kind of flexibility that permitted investigators to rapidly deploy new ideas and prototypes for testing and experimentation.
- It was centered on one campus, so as to naturally support and encourage the communication necessary for collaborative research. Any barrier to interpersonal communication presented by its physically distributed nature could be bridged with a local phone call or a fifteen-minute walk.
- It proved to be highly scalable in the number of routine users it could support and the level of resources it could effectively distribute, since we wanted to explore the use of computational grids as integrated parts of a normal research environment.
Figure 1: The Scalable Intracampus Research Grid at the University of Tennessee

Figure 1 shows the key elements of the plan we had for building and using SInRG for research on grid-based computing and advanced applications: the component architecture for SInRG's nodes, the kind of research activities focused around the nodes, and the character of the middleware that unified the whole.
Component Architecture of SInRG: Grid Service Clusters
The primary building block of the SInRG architecture was the Grid Service Cluster (GSC). A GSC is an ensemble of hardware and software that is constructed and administered by a single workgroup, but is also optimized to make its resources easily available for use by the grid-enabled applications of that group and others. In the former aspect, a GSC is basically a concentration of computational resources in an advanced local area network. It can be as homogeneous as required to allow effective resource management, and as customized as required to serve the special needs of the research group that controls it. Several of the GSCs that were constructed were built with such customizations. In the latter aspect, every GSC has hardware and software elements that open it up to aggregation with other heterogeneous resources for controlled exploitation by grid applications.
Both these facets are reflected in the three basic hardware components that made up the GSCs deployed on SInRG. First, as shown in Figure 1, each GSC was built around an advanced data switch (at least 1 Gb/s on each link) in order to provide advanced forms of Quality of Service (QoS) within the cluster itself. Second, each GSC contained a large data storage unit attached directly to its switch in order to facilitate (for both local and remote SInRG users) the localization of remote data for efficient processing within the GSC. Third, each GSC contained computational and other special resources that SInRG users could both draw on and be served by. While this computing component was typically a commodity cluster, different SInRG research projects required customized GSCs, such as an SMP or a set of Adaptive Computing devices. In all cases, however, both the typical and the specialized resources of a given GSC were, under appropriate conditions, as available across the network to the other users of SInRG as they were to the GSC's owners.
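The three-component GSC structure described above can be sketched abstractly in code. The following Python fragment is purely illustrative: the class, field, and function names are our own invention for exposition and were not part of any actual SInRG software.

```python
from dataclasses import dataclass, field

# Hypothetical model of a Grid Service Cluster (GSC); names are
# illustrative only, not drawn from any real SInRG component.
@dataclass
class GridServiceCluster:
    owner: str            # the single workgroup that administers the cluster
    switch_gbps: float    # per-link speed of the central data switch
    storage_tb: float     # data storage unit attached directly to the switch
    compute_nodes: list = field(default_factory=list)  # commodity or specialized resources

    def meets_qos(self, required_gbps: float) -> bool:
        # Every SInRG GSC switch provided at least 1 Gb/s per link
        return self.switch_gbps >= required_gbps

def available_clusters(grid, required_gbps):
    """Aggregate heterogeneous GSCs: select those meeting a QoS requirement."""
    return [g.owner for g in grid if g.meets_qos(required_gbps)]

grid = [
    GridServiceCluster("Medical Imaging", 1.0, 2.0, ["smp-node"]),
    GridServiceCluster("Materials Design", 2.0, 4.0, ["x86-1", "x86-2"]),
]
print(available_clusters(grid, 1.5))  # -> ['Materials Design']
```

The point of the sketch is the dual role noted in the text: each cluster is owned and tuned by one workgroup, yet exposes a uniform description that lets grid-level software aggregate and select among heterogeneous clusters.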
Surrounding the other CS research activities carried out on SInRG was a set of interdisciplinary research collaborations, each of which developed applications targeted directly at SInRG and other computational grids. Each collaboration involved one or more computer science PIs who were addressing significant computer science research problems. The requirements of these advanced applications drove the research on SInRG, just as similar applications were pushing computer science research within the national grid community. As shown in Figure 1, four of these collaborations -- Medical Imaging, Materials Design, Computational Ecology, and Advanced Machine Design -- managed the resources of their own GSC, each of which was in some way specially adapted to the needs of that group. To ensure that every student and faculty member on campus had the opportunity to take advantage of SInRG, an additional GSC was administered by the University's Office of Information Technology and was available for anyone associated with the University. The importance of the SInRG infrastructure to the success of these applications was shown by the fact that the collaborating researchers from each of these projects committed personnel and other resources to the support of both their GSCs and SInRG generally. These commitments created a technical support community for SInRG that extended across departments and disciplines.
Wider Impact of the SInRG Project
The SInRG project had a significant, positive impact on the research community at the campus, regional, and national levels. At the campus level, the commitment of UTK's Computing and Academic Services office to buy its own GSC and to support SInRG for the campus science and engineering community reflected a desire to prototype and experiment with the SInRG model for providing centralized campus support for distributed research computing. At the regional level, the respected Joint Institute for Computational Science worked with the SInRG project to provide educational outreach to its partner Historically Black Colleges and Universities on the new forms of grid-based, high-performance computing that SInRG made available. Finally, the software and technologies developed for SInRG achieved national impact through the project's affiliation with the Logistical Backbone (LBone) and NSF Middleware Initiative (NMI) projects. Both of these projects provided the SInRG group with a deployment vehicle of unparalleled reach for the software and technology that resulted from their work.