This issue of CTWatch Quarterly contains four articles that provide an overview of some of the major Grid projects in Europe. All of these projects aim to develop distributed, collaborative research capabilities for scientists, built on a persistent middleware infrastructure deployed on top of high-bandwidth research networks. This combination of middleware services running on top of high-speed networks is called ‘e-Infrastructure’ in Europe and ‘Cyberinfrastructure’ in the USA. In this brief article we abstract the key elements of such an e-Infrastructure from these projects and from our experience in the UK e-Science program. We also look at the problems of creating and implementing a sustainable, global e-Infrastructure that will enable multidisciplinary, collaborative research across a wide range of disciplines and communities.
The UK e-Science Initiative began in April 2001, and over the last four years more than £250M has been invested in science applications and middleware development. In addition, the program has created a pipeline from the science base to genuine industrial applications of this technology and, most importantly, has enabled the creation of a vibrant, multidisciplinary e-Science community. This community comes together at the UK’s annual e-Science All Hands Meeting, which is held each September. More than 650 people now attend to share experience and technologies. These meetings have brought together an exciting mix of scientists, computer scientists, IT professionals, industrial collaborators and, more recently, social scientists and researchers in the arts and humanities.
Researchers from all domains of science and engineering (particle physics, astronomy, chemistry, physics, all flavours of engineering, environmental science, bioinformatics, medical informatics and social science), as well as from the arts and humanities, are beginning to appreciate the need for e-Science technologies that will allow them to make progress with the next generation of research problems. In most cases, researchers now face the increasingly difficult burden of managing and storing vast amounts of data, and of analyzing, combining and mining that data to extract useful information and knowledge. This often involves automating the annotation of data with relevant metadata, as well as constructing sophisticated search engines and workflows that capture complex usage patterns of distributed data and compute resources. Most of these problems, and the tools and techniques needed to tackle them, are similar across many different types of application. It makes no sense for each community to develop these basic tools in isolation.
We need to identify and capture a set of generic middleware services and deploy them on top of the high-bandwidth research networks to constitute a reusable e-Infrastructure. In the UK e-Science Initiative, the task of identifying and implementing the key features of a national e-Infrastructure was the remit of the Core Program.
The term e-Infrastructure (or Cyberinfrastructure in the US) is used to emphasize that these applications will be facilitated by a set of services that permit easy but controlled access to the traditional infrastructure of science: supercomputers, high-performance clusters, networks, databases and experimental facilities. The e-Science challenge is to provide a set of Grid middleware services that are sufficiently robust, powerful and easy to use that application scientists are freed from re-inventing such low-level ‘plumbing’ and can concentrate on their science. A second challenge is to make this combination of middleware and hardware into a truly sustainable e-Infrastructure, one that we can take for granted in much the same way as we do the research networks of today.