CTWatch
November 2007
Software Enabling Technologies for Petascale Science
Fred Johnson, Acting Director, Computational Science Research & Partnerships (SciDAC) Division
Office of Advanced Scientific Computing Research
DOE Office of Science

As described in the Bethel, Johnson et al. article, the Visualization and Analytics Center for Enabling Technologies (VACET) group is developing solutions to this problem that combine “query-driven” strategies, which pre-filter the data to be visualized for relevance and interest, with “context and focus” user interface designs, which enable scientists to control their field of attention while navigating complex data spaces. The success of this approach depends on finding fast and efficient ways to index and search targeted data sets; VACET is collaborating closely with other centers in researching this problem. The work of the Institute for Ultrascale Visualization (Ultraviz Institute), described in Kwan-Liu Ma’s article, also (by necessity) puts the question of interface design at the center of its research agenda, especially for cases requiring the exploration of time-varying multivariate volume data. The Institute’s investigation of “in situ” visualization attempts to address problems at the other end of the visualization pipeline. To overcome the severe problems of data logistics involved in managing the rendering of multi-terabyte data sets in networked environments, in situ visualization performs the necessary calculations while the data still resides on the supercomputer that was used to generate it.

Similar problems of petascale data logistics are central to the mission of the three CETs that focus on large-scale data management for distributed environments. As the leaders of the Center for Enabling Distributed Petascale Science (CEDPS) make clear in their article (Schopf et al.), such questions of “data placement” are central to the end-to-end effectiveness of SciDAC’s highly distributed collaboration environments. The authors describe their development of a policy-driven data placement service, which builds on their experience working with several leading application communities, including HEP, Fusion Energy, Combustion, and Earth Systems. Complementary efforts on automated scientific workflow, using the well-known Kepler middleware, are also underway at the Scientific Data Management (SDM) Center. But in order to help investigators manage and analyze the data deluge they confront, the SDM Center research portfolio extends farther down the storage middleware stack. In the Shoshani et al. article, the authors describe the ensemble of software tools and middleware that they are developing to help scientists explore their data through automatic feature extraction and highly scalable indexing of massive data sets, and to optimize their use of storage resources through low-level parallel I/O libraries and in situ processing on the storage nodes (“active storage”).

The third CET focused on data management – the Earth System Grid Center for Enabling Technologies (ESG-CET) – revolves, as the name suggests, around a single major application community, viz. the climate research community. This community has been at the forefront of the data grid movement for many years, aggressively developing and deploying data grid technology to show how high-impact data sharing can be implemented on a global scale, even while the volume of data continues to escalate. Their discussion (Williams et al.) of the past successes, the current implementation, and the future plans for the Earth System Grid describes a model that several other application communities would do well to emulate as we enter the era of petascale data.

Reflecting on the range and diversity of the work on software cyberinfrastructure presented in this issue of CTWatch Quarterly, it’s hard to avoid the conclusion that the relentless movement toward petascale science, in which the DOE SciDAC program has played such a leading role, has generated a software ecosystem whose continued vitality seems more and more essential to success on the new frontiers of research. But we cannot be complacent. The push beyond petascale is just around the corner and, as before, the effort to scale up even further is certain to bring up uniquely difficult problems that we have not yet anticipated. We must hope, therefore, that the next generation of enabling software technology researchers contains the same kind of energetic, dedicated, and creative pioneers who have led the current one.

Reference this article
Johnson, F. "Introduction," CTWatch Quarterly, Volume 3, Number 4, November 2007. http://www.ctwatch.org/quarterly/articles/2007/11/introduction/
