Overview

NetSolve is a client-server system that enables users to solve complex scientific problems remotely. The system allows users to access both hardware and software computational resources distributed across a network. NetSolve searches for computational resources on a network, chooses the best one available, solves the problem (retrying on failure for fault tolerance), and returns the answer to the user. The NetSolve system uses a load-balancing policy to ensure good performance by using the available computational resources as efficiently as possible. Our framework is based on the premise that distributed computations involve resources, processes, data, and users, and that secure yet flexible mechanisms for cooperation and communication between these entities are the key to metacomputing infrastructures.
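The select-and-retry behavior described above can be sketched as follows. This is an illustrative sketch only, not NetSolve's actual API: the server list, load metric, and function names are all hypothetical.

```python
# Illustrative sketch (not NetSolve's actual API) of the agent's behavior:
# rank candidate servers by load, try the least-loaded one first, and
# retry on the next server if a run fails.

def solve_with_retry(servers, problem, run):
    """servers: list of (name, load) pairs; run(name, problem) returns
    the answer or raises RuntimeError on server failure."""
    failures = []
    for name, _load in sorted(servers, key=lambda s: s[1]):
        try:
            return run(name, problem)           # answer goes back to the user
        except RuntimeError as exc:
            failures.append((name, str(exc)))   # fault tolerance: try the next server
    raise RuntimeError(f"all servers failed: {failures}")
```

The key design point is that failure handling is the agent's job, not the user's: the client call either returns an answer or exhausts every candidate server.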

Some goals of the NetSolve project are:

  • ease of use for the user,
  • efficient use of the resources, and
  • the ability to integrate any arbitrary software component as a resource into the NetSolve system.

Interfaces in Fortran, C, Matlab, and Mathematica have been designed and implemented to enable users to access and use NetSolve more easily. An agent-based design has been implemented to ensure efficient use of system resources.

One of the key characteristics of any software system is versatility. In order to ensure the success of NetSolve, the system has been designed to incorporate any piece of software with relative ease. There are no restrictions on the type of software that can be integrated into the system.


Related Materials

Supercomputing 2002 Poster
View [.JPG]
Download [.PDF]

Innovative Computing Laboratory 2002 Report
Color [.PDF] 2.75 MB



 

Related Work

Ninf
NEOS
Globus

 

Applications

A wide range of applications currently integrate and make use of the NetSolve system. Here, we identify a few of these applications and summarize how they have taken advantage of NetSolve.

IPARS (Integrated Parallel Accurate Reservoir Simulators)
IPARS, developed under the directorship of Mary Wheeler at the Center for Subsurface Modeling of the University of Texas' Institute for Computational and Applied Mathematics (TICAM), is a framework for developing parallel models of subsurface flow and transport through porous media. It can currently simulate single-phase (water only), two-phase (water and oil), or three-phase (water, oil, and gas) flow through a multi-block 3D porous medium. IPARS can be applied to model water table decline due to overproduction near urban areas, or enhanced oil and gas recovery in industrial applications.

We have built a NetSolve interface to the IPARS system that allows users to access the full functionality of IPARS. Accessing the system via any of the MATLAB, C, Mathematica, or FORTRAN interfaces automatically executes simulations on a cluster of dual-node workstations, allowing much quicker execution than would be possible on a single local machine. The NetSolve system also post-processes the output, using the third-party software TECPLOT to render the 3D output images. Among other things, NetSolve provides a gateway to the IPARS system without requiring users to download and install the IPARS code, so IPARS can be used even on platforms to which it has not yet been ported. We further facilitate this interface by embedding it in an HTML form within a web browser, so that with nothing more than a web browser one can enter input parameters and submit a request for execution of the IPARS simulator to a NetSolve system. The output images are then brought back and displayed in the web browser. This interaction shows how the NetSolve system can be used to create a robust grid computing environment in which powerful modeling software, like IPARS, becomes easier both to use and to administer.

MCell
MCell is a general Monte Carlo simulator of cellular microphysiology. MCell uses Monte Carlo diffusion and chemical reaction algorithms in 3D to simulate the complex biochemical interactions of molecules inside and outside of living cells. MCell is a collaborative effort between the Terry Sejnowski lab at the Salk Institute, and the Miriam Salpeter lab at Cornell University.

NetSolve is very well suited to MCell's needs, and this project aims at writing a NetSolve-based framework to support large MCell runs. One of the central pieces of that framework is a scheduler that takes advantage of MCell input data requirements to minimize turn-around time. This scheduler is part of the larger AppLeS project at the University of California, San Diego. The use of NetSolve isolates the scheduler from the resource-management details and allows researchers to focus only on the design of the scheduler.
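As an illustration of the idea behind such data-aware scheduling (a sketch, not the actual AppLeS scheduler), independent tasks can be assigned to favor servers that already cache a task's input files, since re-transferring shared input data dominates turn-around time. All names and cost values below are hypothetical:

```python
# Hypothetical sketch of data-aware scheduling: each independent task goes
# to the server with the smallest estimated turnaround, where every input
# file not already cached on the server adds a transfer cost.

def assign_tasks(tasks, caches, compute_cost=1.0, transfer_cost=5.0):
    """tasks: list of (task, set of input files); caches: {server: set of
    cached files}. Returns {task: server}; caches are updated as files
    are staged so later tasks can reuse them."""
    assignment = {}
    busy_until = {server: 0.0 for server in caches}
    for task, inputs in tasks:
        def eta(server):
            missing = inputs - caches[server]
            return busy_until[server] + compute_cost + transfer_cost * len(missing)
        best = min(caches, key=eta)
        busy_until[best] = eta(best)       # includes transfer of missing files
        caches[best] |= inputs             # staged inputs remain cached
        assignment[task] = best
    return assignment
```

Because staged files stay cached, later tasks that share input data gravitate to servers that already hold it, which is the affinity effect the scheduler exploits.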

This collaboration was exhibited at SuperComputing '99 in Portland, Oregon, where, on the exhibit floor, members of the NetSolve, MCell, and AppLeS teams harnessed the resources of over a hundred computers on a global area network. These computers had a total of more than 150 nodes and were located on six continents in different countries, states, and organizations. On the floor, a simulation that required 360 independent tasks was run, but this same setup has been used to achieve simulations using thousands of tasks, running on even more machines. Here is a photograph of the demonstration.

SCIRun

SCIRun is a scientific programming environment that allows the interactive construction, debugging, and steering of large-scale scientific computations. SCIRun can be used interactively for:

  • Changing 2D and 3D geometry models (meshes).
  • Controlling and changing numerical simulation methods and parameters.
  • Performing scalar and vector field visualization.

Currently, NetSolve is being integrated into SCIRun as the broker for computational resources. This integration will allow for increased parallelism and performance in the SCIRun paradigm.

CENTS (Collaborative Environment for Nuclear Technology Software):
The goal of this project is to develop a prototype environment for the Collaborative Environment for Nuclear Technology Software (CENTS). CENTS aims to lay the foundation for a Web-based distance computing facility for executing nuclear engineering codes. Through its Web-based interfaces, CENTS will allow users to focus on the problem to be solved instead of the specifics of a particular nuclear code. Both parallel (two similar codes operating on the same data) and serial (one code feeding another) integration of nuclear codes will be supported. Via the Web, users will submit input data with computing options for execution, monitor the status of their submissions, retrieve calculation results, and use CENTS tools for viewing and analyzing result data. Stronger-than-password security mechanisms will prevent unauthorized access to computing resources and ensure a user's input and output data are not compromised. For computational services, CENTS will employ a collection of heterogeneous computer systems logically clustered and managed for optimal resource utilization. Moreover, CENTS will be constructed in a modular, open-ended style in order to address the immediate needs of the Department of Energy and the nuclear community while accommodating future expansion due to yet unforeseen demands and advances in computer architecture and network capabilities.

The prototype environment was accomplished by implementing the NetSolve problem-solving tool and using one of the more popular computer codes disseminated by RSICC, Monte Carlo N-Particle (MCNP). NetSolve was installed on an IBM RS/6000 computer, a Sun workstation, and a DEC Alpha workstation, and an executable copy of the MCNP computer code was installed on each of these systems. From the user end, a CGI (Common Gateway Interface) script was written to enable one to access the prototype through a web browser. The only requirement for the user is to supply the input problem for the MCNP code. This is accomplished by clicking a link from the RSICC web server. After the user supplies the input, NetSolve sends the problem to one of the workstations in the environment. The problem is solved and the output (4 files) is sent back to the user via the web interface.

DIPS (Distributed Image Processing Shell):

DIPS is a software tool, developed at the Computer Graphics and Vision unit of the Graz University of Technology in Austria, which allows remote computing for image processing. DIPS extends the ImageJ Java image processing application to provide remote access to the high-performance ImageVision library by Silicon Graphics.

At its core, DIPS uses NetSolve as its metacomputing resource to provide unprecedented computing power by aggregating distributed resources on the Internet to a single system.

Environment for Ground Cover Classification
The quality of remote sensing images is constantly being increased by the invention of new sensors and satellites. Available data sources range from SAR images to multispectral and even hyperspectral images with hundreds of sensor channels. The aim of this project is to cluster and classify remotely sensed images from various sensors and combine them to achieve better results in ground cover classification.

As the processed image data is large, a distributed computation environment based on NetSolve and the NetSolve ImageVision methods is used as the backend. This allows for the handling of such large datasets and complex methods through a lightweight client.


Integrations


NetSolve has integrated numerous systems (either in part or in whole) to help in its functionality. Here we make mention of these systems and summarize what components we have integrated and the reasons for doing so.

Ninf
Ninf and NetSolve are remote computing systems oriented toward providing numerical computations. The two systems are very similar in design and motivation. Adapters have been implemented to enable each system to use numerical routines installed on the other.

Legion
Legion has been incorporated in such a way as to allow the client-user to program using the NetSolve interface while leveraging the Legion metacomputing resources. The NetSolve client side uses Legion data-flow graphs to keep track of data dependencies. This effort has been extended only to the FORTRAN interfaces and was done by the Legion group at the University of Virginia.

Globus
"The Globus project is developing the fundamental technology that is needed to build computational grids, execution environments that enable an application to integrate geographically-distributed instruments, displays, and computational and information resources. Such computations may link tens or hundreds of these resources."
NetSolve currently uses Globus' "Heartbeat Monitor" to detect failed servers. A new NetSolve client, currently in its testing phase, implements a proxy that allows the client to utilize the Globus grid infrastructure when it is available; otherwise, the client resorts to its present behavior.
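The proxy's fallback logic can be sketched as follows; the probe and submission functions are hypothetical stand-ins, not the actual proxy API:

```python
# Hypothetical sketch of the proxy's fallback: probe the Globus
# infrastructure first, and if the probe fails or reports it absent,
# use the client's present (non-Globus) submission path.

def submit(problem, probe_globus, via_globus, via_netsolve):
    """probe_globus() -> bool; may raise OSError if unreachable."""
    try:
        globus_up = probe_globus()
    except OSError:
        globus_up = False                  # probe failed: treat Globus as absent
    if globus_up:
        return via_globus(problem)         # route through the Globus grid
    return via_netsolve(problem)           # present behavior, unchanged
```

The design choice here is that the client's existing path is untouched: Globus support is purely additive, engaged only when the probe succeeds.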

CONDOR
"Condor is a software system that runs on a cluster of workstations to harness wasted CPU cycles. A Condor pool consists of any number of machines, of possibly different architectures and operating systems, that are connected by a network."
NetSolve currently has the ability to access CONDOR pools as computational resources. With little effort, the server can be configured to submit the client's request to an existing CONDOR pool, collect the results, and send them to the client.

IBP (Internet Backplane Protocol)
IBP is a storage management system that serves up writable storage as a wide-area network resource, allows for the remote direction of storage activities, and decouples the notion of user identification from storage.
IBP-enabled clients and servers for NetSolve are currently being developed; these will allow NetSolve to allocate and schedule storage resources as part of its resource brokering, which should lead to much improved performance and fault tolerance when resources fail.

NWS (Network Weather Service)
NWS is a system that uses sensor processes on workstations to monitor CPU load and network connections. It constantly collects statistics on these entities and can apply statistical models to the collected data to generate a forecast of future behavior.
NetSolve is currently integrating NWS into its agent to help determine which computational servers would yield results to the client most efficiently.
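As a minimal illustration of the forecasting idea (a sketch, not NWS's actual implementation, which runs several predictors and tracks which has been most accurate historically), a single exponential-smoothing predictor over a series of measurements looks like this:

```python
# Minimal forecasting sketch: exponentially smooth a measurement series
# to produce a one-step-ahead prediction of, e.g., CPU availability.
# NWS itself combines multiple predictors and picks the historically best.

def forecast(measurements, alpha=0.5):
    """Return a prediction of the next value; alpha weights recent data."""
    if not measurements:
        raise ValueError("need at least one measurement")
    estimate = measurements[0]
    for x in measurements[1:]:
        estimate = alpha * x + (1 - alpha) * estimate   # favor recent samples
    return estimate
```

The agent can feed such a forecast, rather than the latest raw sample, into its server-selection policy, smoothing out transient load spikes.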


 

  Innovative Computing Laboratory
 
Contact NetSolve: netsolve@cs.utk.edu
Computer Science Department
University of Tennessee