CTWatch
March 2008
Urgent Computing: Exploring Supercomputing's New Role
Benjamin S. Kirk, NASA Lyndon B. Johnson Space Center
Grant Palmer, NASA Ames Research Center
Chun Tang, NASA Ames Research Center
William A. Wood, NASA Langley Research Center

2.3 Time Criticality

The intent is that the pre-flight mission timeline proceeds uninterrupted while this process is executed on the ground. The damage assessment process is scripted and well-rehearsed so as to fit within a nominal 24-hour timeline. This is absolutely essential to mission success. In this way any damage that may require repair is identified and reported to the Mission Management Team no later than the fifth day of flight, which is the point at which the schedule for the remainder of the mission is finalized. In particular, if a repair must be executed, it must be identified at this point so that adequate resources (e.g., breathable oxygen, water, spacecraft power) can be allocated. Identifying a problem late in the mission may be useless, as there may not be adequate resources available to effect a repair.

3. High-Fidelity CFD Analyses and High-Performance Computing

It is within this compressed timeline that high-fidelity analysis must be performed if it is to be of value to the overall process. Additionally, the data environment is highly dynamic, as new characterizations of a damage site are continuously acquired. The timeline is such that a high-fidelity analysis must have a turnaround time of approximately eight hours or less to be useful. This requirement poses a number of challenges.

3.1 Analysis Tools & Supporting Infrastructure

The two primary CFD codes used by the reentry aerothermodynamic community at NASA are the LAURA and DPLR codes from Langley and Ames Research Centers, respectively. Both are block-structured, finite-volume solvers that model the thermochemical-nonequilibrium Navier-Stokes equations. LAURA [5] was originally written in Fortran 77 and was highly optimized for the vector supercomputers of the day; subsequent modifications have incorporated MPI for distributed-memory parallelism. DPLR [6], written in Fortran 90, is a newer code and was designed from its inception to use MPI on distributed-memory architectures with cache-based commodity processors.
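The communication pattern underlying both of these block-structured, distributed-memory solvers can be illustrated with a deliberately simplified sketch. The real codes exchange ghost-cell data between blocks via MPI; the serial Python below (not taken from LAURA or DPLR, and with hypothetical names) shows only the halo-exchange idea in one dimension, where each block carries one ghost cell per side:

```python
# Schematic, serial sketch of the halo-exchange pattern a block-structured
# finite-volume solver relies on. Real codes such as LAURA and DPLR perform
# this exchange with MPI across distributed-memory ranks; here each "block"
# is simply a Python list of the form [left_ghost, interior..., right_ghost].

def exchange_halos(blocks):
    """Copy each block's outermost interior cell into its
    neighbor's ghost cell (1D, non-periodic)."""
    for left, right in zip(blocks, blocks[1:]):
        right[0] = left[-2]   # left neighbor's last interior cell
        left[-1] = right[1]   # right neighbor's first interior cell
    return blocks

# Two blocks of a 1D field; ghost cells start at 0.0:
blocks = [[0.0, 1.0, 2.0, 0.0],
          [0.0, 3.0, 4.0, 0.0]]
exchange_halos(blocks)
# blocks[0][-1] now holds 3.0 and blocks[1][0] holds 2.0
```

In an actual MPI implementation each block lives on its own rank and the two assignments become paired sends and receives, but the data dependence between neighboring blocks is the same.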

The Columbia supercomputer, installed and maintained by the NASA Advanced Supercomputing Division (NAS), is the primary resource used for these analyses. Columbia is composed of 20 SGI Altix nodes, each of which contains 512 Intel Itanium-2 processors. (Columbia was ranked 20th on the November 2007 Top 500 supercomputer ranking.) Prior to each launch, NAS personnel reserve one node for dedicated mission support and alert the user community that additional resources may be reallocated if necessary. Columbia is augmented with department-level cluster resources to provide redundancy (albeit at reduced capability) in case of emergency.
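As a rough illustration of the figures above, the dedicated reservation can be translated into a CPU-hour budget per analysis window. The node and processor counts and the eight-hour turnaround come from the article; the job sizes and the ideal-scaling assumption in this sketch are hypothetical:

```python
# Back-of-the-envelope capacity check for the dedicated mission-support
# reservation. Hardware figures are from the article; job sizes are
# hypothetical, and ideal parallel scaling is an optimistic simplification.

PROCS_PER_NODE = 512     # Intel Itanium-2 processors per Altix node
RESERVED_NODES = 1       # one node reserved for mission support
TURNAROUND_HOURS = 8     # required wall-clock turnaround per analysis

def fits_in_turnaround(cpu_hours_needed,
                       procs=RESERVED_NODES * PROCS_PER_NODE,
                       window=TURNAROUND_HOURS):
    """True if a job needing `cpu_hours_needed` CPU-hours can finish
    within the turnaround window, assuming ideal scaling."""
    return cpu_hours_needed <= procs * window

# The reserved node offers 512 * 8 = 4096 CPU-hours per window.
print(fits_in_turnaround(3000))   # True  (hypothetical job)
print(fits_in_turnaround(5000))   # False (would spill past the window)
```

The point of the arithmetic is that a single reserved 512-processor node bounds how expensive any one analysis can be within the eight-hour requirement, which is why additional resources may be reallocated when necessary.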

Institutional policies preclude major modification to either the software environment on the machines or the supporting network infrastructure during a “lockdown” period leading up to launch. This helps ensure that resources are available and function as intended when called upon. To provide but one example, this restriction prevents an overzealous firewall modification from blocking access to resources.


Reference this article
Kirk, B. S., Palmer, G., Tang, C., Wood, W. A. "Urgent Computing in Support of Space Shuttle Orbiter Reentry," CTWatch Quarterly, Volume 4, Number 1, March 2008. http://www.ctwatch.org/quarterly/articles/2008/03/urgent-computing-in-support-of-space-shuttle-orbiter-reentry/
