Overview

HARNESS (Heterogeneous Adaptable Reconfigurable Networked SystemS) is an experimental metacomputing framework built around the services of a highly customizable and reconfigurable distributed virtual machine (DVM). A DVM is a tightly coupled computation and resource grid that provides a flexible environment to manage and coordinate parallel application execution.

The system is designed to support a wide range of DVM sizes, from users building personal DVMs to enterprise and widely distributed DVMs. Collaboration and resource sharing between different entities are achieved by temporarily merging and later splitting DVMs. The term virtual machine (VM), borrowed from PVM, refers to a system whose computing resources can be viewed as a single, large, distributed-memory computing resource.
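The temporary merging and splitting of DVMs can be pictured with a small sketch. This is only an illustration under stated assumptions, not HARNESS code: a DVM is modelled as nothing more than a named set of host names, and the names DVM, MergedDVM, merge, and split are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DVM:
    """Toy model (assumption): a DVM is just a named set of host names."""
    name: str
    hosts: frozenset


@dataclass(frozen=True)
class MergedDVM:
    """A temporary union of DVMs that remembers its members for a later split."""
    members: tuple

    @property
    def hosts(self):
        # The merged view exposes every host of every member DVM.
        return frozenset().union(*(m.hosts for m in self.members))


def merge(*dvms):
    """Combine several DVMs into one temporary shared view."""
    return MergedDVM(members=tuple(dvms))


def split(merged):
    """End the collaboration and hand back the original DVMs unchanged."""
    return list(merged.members)


personal = DVM("personal", frozenset({"laptop", "workstation"}))
lab = DVM("lab", frozenset({"node01", "node02"}))

shared = merge(personal, lab)
print(sorted(shared.hosts))
print([d.name for d in split(shared)])
```

Keeping the member DVMs intact inside the merged view is a design choice of this sketch: it makes the split trivial and leaves each owner's original resource set untouched.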

HARNESS was designed to support:

  • multiple DVMs
  • operation with no single point of failure
  • a scalable infrastructure through replicated state
  • flexible, reconfigurable functionality, with new components or plug-ins added and removed dynamically
  • legacy message-passing code
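The dynamic addition and removal of plug-ins listed above can be sketched as a small registry. This is a minimal illustration, not the HARNESS kernel API: the Kernel class, its load, unload, call, and loaded methods, and the UppercaseService plug-in are all hypothetical names chosen for the example.

```python
class Kernel:
    """Toy kernel (assumption): keeps a table of loaded components by name."""

    def __init__(self):
        self._components = {}

    def load(self, name, factory):
        """Instantiate a component at run time and register it under `name`."""
        if name in self._components:
            raise ValueError(f"component {name!r} is already loaded")
        self._components[name] = factory()

    def unload(self, name):
        """Remove a component; raises KeyError if it was never loaded."""
        del self._components[name]

    def call(self, name, method, *args):
        """Invoke a method on a loaded component by name."""
        return getattr(self._components[name], method)(*args)

    def loaded(self):
        """Return the names of all currently loaded components."""
        return sorted(self._components)


class UppercaseService:
    """Hypothetical plug-in used only to exercise the registry."""

    def handle(self, message):
        return message.upper()


kernel = Kernel()
kernel.load("upper", UppercaseService)
print(kernel.call("upper", "handle", "hello"))
kernel.unload("upper")
print(kernel.loaded())
```

Because components are looked up by name on every call, one can be unloaded and replaced while the kernel keeps running, which is the property the list above describes.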

The architecture is built on kernels, daemons, DVMs, and services provided by standard components. The kernel is implemented as a set of core functions for loading and running components, either locally or in response to remote requests. A HARNESS daemon is composed of a kernel (the HARNESS Core, or Hcore) and a minimal set of required components that provide basic services. These services include maintaining state, communication between components, and remote invocation of components and new daemons. A HARNESS DVM is composed of a set of cooperating daemons that together present the basic services of communication, process control, resource management, and fault detection. Robustness and reliability are important goals of HARNESS. All information and control within HARNESS are to be built on a Symmetric Peer-to-Peer Distributed Control (SPDC) algorithm; thus, HARNESS will not have a single point of failure, but instead a user-configurable level of fault tolerance.
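The idea of a user-configurable level of fault tolerance can be illustrated with a toy replicated store. This is not the SPDC algorithm, whose details are not given here; it is a simple sketch that assumes each piece of state is copied to a chosen number of peer daemons, so that the state survives as long as one copy remains. Peer, ReplicatedState, and the placement rule are all assumptions made for the example.

```python
class Peer:
    """Toy daemon (assumption): holds a local copy of some shared state."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.state = {}


class ReplicatedState:
    """Stores each key on `replicas` distinct peers; no peer is special."""

    def __init__(self, peers, replicas):
        if not 1 <= replicas <= len(peers):
            raise ValueError("replicas must be between 1 and the number of peers")
        self.peers = list(peers)
        self.replicas = replicas

    def holders(self, key):
        # Deterministic placement: consecutive peers starting at an index
        # derived from the key's bytes (an arbitrary rule for this sketch).
        start = sum(key.encode()) % len(self.peers)
        return [self.peers[(start + i) % len(self.peers)]
                for i in range(self.replicas)]

    def put(self, key, value):
        # Write the value to every live holder of the key.
        for peer in self.holders(key):
            if peer.alive:
                peer.state[key] = value

    def get(self, key):
        # Any surviving holder can answer the request.
        for peer in self.holders(key):
            if peer.alive and key in peer.state:
                return peer.state[key]
        raise KeyError(key)


peers = [Peer(f"daemon{i}") for i in range(4)]
store = ReplicatedState(peers, replicas=3)
store.put("task-7", "running")

# Two of the three holders fail; the value is still readable.
first, second, _ = store.holders("task-7")
first.alive = False
second.alive = False
print(store.get("task-7"))
```

With replicas set to k, this sketch tolerates the loss of up to k - 1 of the peers holding a given key; raising k trades storage and update cost for resilience.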

Two plug-ins have been developed so that standard message-passing codes can be used directly; between them they support PVM3 and a subset of the MPI-2 API. These plug-ins allow existing applications to run on HARNESS without any code modification. The MPI plug-in, known as FT_MPI, provides additional functionality to support fault-tolerant applications.

The PVM and MPI plug-ins allow thousands of existing message-passing applications to run under the HARNESS system. HARNESS itself is expected to serve as a natural upgrade path for existing PVM users, as well as for users who wish to build their own personal computational grids without having to buy into a global grid framework.

The HARNESS system provides a number of benefits over existing VM-based systems, such as reliability, scalability, ease of adding new functionality, and the sharing of resources by merging VMs.



Jun 30 2022