March 2008
Urgent Computing: Exploring Supercomputing's New Role
Suresh Marru, School of Informatics, Indiana University
Dennis Gannon, School of Informatics, Indiana University
Suman Nadella, Computation Institute, The University of Chicago
Pete Beckman, Mathematics and Computer Science Division, Argonne National Laboratory
Daniel B. Weber, Tinker Air Force Base
Keith A. Brewster, Center for Analysis and Prediction of Storms, University of Oklahoma
Kelvin K. Droegemeier, Center for Analysis and Prediction of Storms, University of Oklahoma


LEAD scientists conducted on-demand, dynamically adaptive forecasts over regions of expected hazardous weather, as determined by severe weather watches and/or mesoscale discussions among scientists and forecasters at the SPC. The LEAD on-demand forecasts began in the first week of May and continued until June 8, 2007. LEAD scientists Drs. Dan Weber and Keith Brewster interacted directly with the SPC forecasters and HWT participants to obtain the daily model domain location recommendations, and launched the daily forecasts using the LEAD Portal. Each 9-hour WRF forecast covered a 1000 km x 1000 km region placed in an area of elevated risk of severe weather during the 1500-0000 UTC forecast period. The on-demand forecasting process depicted in Figure 5 illustrates how forecasters respond to the evolving weather to create a customized forecast, something not possible with the current real-time Numerical Weather Prediction (NWP) scheme.

Figure 5. HWT 2007 spring experiments

The on-demand forecasts were initialized using either the 1500 UTC LEAD ARPS Data Assimilation System (ADAS) [19] analysis or the 3-hour North American Model (NAM) forecast initialized at 1200 UTC, interpolated to a horizontal grid spacing of 2 km. The ADAS analysis used radar data and other observations to update the 3-hour NAM forecast from the 1200 UTC initial time. One advantage of this on-demand configuration is the potential for rapid turnaround of a convective-scale forecast using NAM forecasts updated with mid-morning observations. The period selected, from 1500 UTC to 0000 UTC, overlaps with part of the 2007 HWT forecast and verification period for the larger-scale numerical forecasts using 2 km and 4 km grid spacing.
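To give a sense of the problem size, the following minimal sketch works out the horizontal grid implied by a 1000 km x 1000 km domain at 2 km spacing, and the length of the 1500-0000 UTC forecast window. The function names are illustrative and are not part of the LEAD or WRF software. The number of vertical levels and the model time step are not stated here, so they are left out.

```python
def horizontal_cells(domain_km: float, spacing_km: float) -> int:
    """Number of grid cells along one side of a square domain."""
    return round(domain_km / spacing_km)


def forecast_hours(start_utc: int, end_utc: int) -> int:
    """Length in hours of a window given as HHMM UTC times.

    The end time may fall on the following day (e.g. 1500 to 0000).
    Only whole hours are handled in this sketch.
    """
    start_h, end_h = start_utc // 100, end_utc // 100
    return (end_h - start_h) % 24


cells_per_side = horizontal_cells(1000, 2)   # cells along each side
total_columns = cells_per_side ** 2          # horizontal grid columns
window = forecast_hours(1500, 0)             # hours from 1500 to 0000 UTC

print(cells_per_side, total_columns, window)
```

Under these assumptions each forecast covers 500 x 500 horizontal cells, or 250,000 grid columns, over a 9-hour window.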

Each day, one or more six- to nine-hour nested-grid forecasts at 2 km grid spacing were launched automatically over regions of expected severe weather, as determined by mesoscale discussions at SPC and/or tornado watches. In addition, one six- to nine-hour nested-grid forecast per day at 2 km grid spacing was launched manually when and where deemed most appropriate. The production workflows were submitted to the computing resources at the National Center for Supercomputing Applications (NCSA). Because of the load on that machine, including other 2007 HWT computing needs, a workflow often waited in the queue for several hours before 80 processors could be allocated to it. Moreover, the on-demand forecasts were launched based only on the severity of the weather, so their timing could not be planned in advance. Guaranteeing a quick turnaround in this setting means reserving computing resources ahead of time and leaving them idle, which wastes CPU cycles and decreases the throughput of a busy resource.
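The cost of a long queue wait can be made concrete with simple arithmetic. The sketch below assumes a forecast valid until 0000 UTC and uses hypothetical values for the submission time, queue wait, and model run time. None of these numbers are measurements from the experiment.

```python
def useful_lead_hours(submit_utc_h: float,
                      queue_wait_h: float,
                      run_time_h: float,
                      valid_end_utc_h: float = 24.0) -> float:
    """Hours of the forecast that still lie in the future when the run finishes.

    All times are hours after 0000 UTC on the forecast day, so 24.0 stands
    for 0000 UTC on the following day (the end of the forecast window).
    """
    finish = submit_utc_h + queue_wait_h + run_time_h
    return max(0.0, valid_end_utc_h - finish)


# Hypothetical scenario: submitted at 1530 UTC with a 2-hour model run.
no_wait = useful_lead_hours(15.5, queue_wait_h=0.0, run_time_h=2.0)
long_wait = useful_lead_hours(15.5, queue_wait_h=4.0, run_time_h=2.0)

print(no_wait, long_wait)
```

In this illustrative scenario, a four-hour queue wait cuts the usable part of the forecast from 6.5 hours to 2.5 hours.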

To tackle this problem, LEAD and SPRUCE researchers collaborated with the University of Chicago/Argonne National Laboratory (UC/ANL) TeraGrid resources to perform real-time, on-demand severe weather modeling. The UC/ANL IA64 machine currently supports preemption for urgent jobs at the highest priority. As an incentive to use the platform even though jobs may be killed, users are given a 10% discount from the standard CPU service unit billing. Which jobs are preempted is decided by an internal scheduler algorithm that considers several factors, such as the elapsed time of the existing job, its number of nodes, and the number of jobs per user. LEAD was given a limited number of tokens for use throughout the tornado season. The LEAD web portal allows users to configure and run a variety of complex forecast workflows. The user initiates a workflow by selecting forecast simulation parameters and a region of the country where severe weather is expected. This selection is done graphically through a “mash-up” of Google Maps and the current weather. We deployed SPRUCE directly into the existing LEAD workflow by adding a SPRUCE Web service call and interface to the LEAD portal. Figure 6 shows how LEAD users can simply enter a SPRUCE token at the required urgency level to activate a session and then submit urgent weather simulations.
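The preemption decision mentioned above can be illustrated with a small sketch. This is not the UC/ANL scheduler's actual algorithm, which is not described here. It is a hypothetical greedy heuristic that uses the three factors named in the text (elapsed time, number of nodes, and jobs per user), with a weighting chosen purely for illustration. The RunningJob type and pick_jobs_to_preempt function are invented names.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RunningJob:
    job_id: str
    user: str
    nodes: int
    elapsed_min: float


def pick_jobs_to_preempt(running, nodes_needed):
    """Greedily choose running jobs to kill until enough nodes are freed.

    Hypothetical ordering: jobs with the least elapsed time go first (least
    work lost); ties go to users with more running jobs, then to larger jobs.
    Returns an empty list if the request cannot be satisfied.
    """
    jobs_per_user = Counter(job.user for job in running)
    ranked = sorted(
        running,
        key=lambda j: (j.elapsed_min, -jobs_per_user[j.user], -j.nodes),
    )
    chosen, freed = [], 0
    for job in ranked:
        if freed >= nodes_needed:
            break
        chosen.append(job)
        freed += job.nodes
    if freed < nodes_needed:
        return []  # not enough preemptible nodes; the urgent job must wait
    return chosen


# Made-up workload for illustration only.
running = [
    RunningJob("A", "alice", nodes=32, elapsed_min=600),
    RunningJob("B", "bob", nodes=16, elapsed_min=10),
    RunningJob("C", "bob", nodes=32, elapsed_min=45),
    RunningJob("D", "carol", nodes=8, elapsed_min=10),
]
victims = [job.job_id for job in pick_jobs_to_preempt(running, nodes_needed=40)]
print(victims)
```

With this made-up workload, the long-running job A is spared and the three most recently started jobs are preempted to free 56 nodes. A production scheduler would weigh these factors differently and would likely also account for policy such as the discounted billing.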

Figure 6. Launching an urgent computing workflow using a SPRUCE token

Reference this article
Marru, S., Gannon, D., Nadella, S., Beckman, P., Weber, D. B., Brewster, K. A., Droegemeier, K. K. "LEAD Cyberinfrastructure to Track Real-Time Storms Using SPRUCE Urgent Computing," CTWatch Quarterly, Volume 4, Number 1, March 2008. http://www.ctwatch.org/quarterly/articles/2008/03/lead-cyberinfrastructure-to-track-real-time-storms-using-spruce-urgent-computing/