CTWatch
August 2006
Trends and Tools in Bioinformatics and Computational Biology
Eric Jakobsson, National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign

Integration of Basic Science with Engineering Design

Biology differs from other basic sciences such as chemistry and physics in that adaptation for function is integral to all biological phenomena. Physical and chemical phenomena have only one type of cause: the underlying laws of physics. Biological phenomena have two types of cause: 1) the underlying laws of physics, and 2) the imperatives of evolution, which select the actualities of biology out of all the ways one could imagine living systems being organized and functioning. In this sense, biological systems are like engineered systems: purpose, together with the laws of physics, defines the nature of both biological and human-engineered systems. Elaborate and sophisticated computer-aided design (CAD) systems have been developed to guide the creation of human-engineered devices, materials, and systems. The principles underlying CAD systems (optimization, network analysis, multiscale modeling, etc.) need to be incorporated into our computational approaches to understanding biology. A direct target for such cross-fertilization of biology and engineering is nanotechnology, where we seek to engineer devices and processes on the size scale of biomolecular complexes. The Network for Computational Nanotechnology2 is a notable cyberinfrastructure project in nanotechnology, with a versatile computational environment delivered through its nanoHUB web site.3
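As a concrete illustration of one such CAD-style principle, the short Python sketch below fits a Hill dose-response model to synthetic data using SciPy's curve_fit. The model choice and all numbers are illustrative assumptions, not taken from the article.

    # Illustrative sketch: optimization, a core CAD principle, applied to a toy biological model.
    # Assumes NumPy and SciPy are installed; the data below are synthetic.
    import numpy as np
    from scipy.optimize import curve_fit

    def hill(dose, vmax, k, n):
        """Hill equation: a standard empirical dose-response model."""
        return vmax * dose**n / (k**n + dose**n)

    # Hypothetical "measured" responses at six doses.
    doses = np.array([0.1, 0.3, 1.0, 3.0, 10.0, 30.0])
    response = np.array([0.05, 0.15, 0.42, 0.78, 0.95, 0.99])

    # Fit the model parameters to the data, much as a CAD tool fits a design model to specifications.
    params, _ = curve_fit(hill, doses, response, p0=[1.0, 1.0, 1.0])
    vmax, k, n = params
    print(f"fitted Vmax={vmax:.2f}, K={k:.2f}, Hill coefficient n={n:.2f}")

The same pattern of treating a fitted model as a design object generalizes to the network analysis and multiscale modeling principles mentioned above.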

Integration of Algorithmic Development with Computing Architecture

The different types of biological computing have vastly different patterns of computer utilization. Some applications are very CPU-intensive, some require large amounts of memory, some must access enormous data stores, some are much more readily parallelizable than others, and the requirements for bandwidth between disk, memory, and processor vary widely. We need much more extensive mutual tuning of computer architecture and application software in order to do more with existing and projected computational resources. A remarkable instance of such tuning is the molecular simulation code Blue Matter, written specifically to exploit the architecture of the IBM Blue Gene supercomputer. The Blue Matter-Blue Gene combination has performed biomolecular dynamics at an unprecedented scale and is directly enabling fundamentally new discoveries.4
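To illustrate why such tuning matters, the back-of-envelope Python sketch below makes a roofline-style estimate of whether a simple pairwise-force kernel is limited by compute or by memory bandwidth. All machine and kernel numbers are hypothetical assumptions, not measurements of Blue Gene or Blue Matter.

    # Roofline-style estimate: is a toy pairwise-force kernel compute- or bandwidth-bound?
    # All numbers below are assumed for illustration only.
    peak_flops = 5.6e9        # hypothetical per-node peak, FLOP/s
    mem_bandwidth = 5.5e9     # hypothetical sustained memory bandwidth, bytes/s

    flops_per_pair = 30       # rough floating-point cost of one pair interaction
    bytes_per_pair = 72       # positions, parameters, and force accumulation in double precision

    arithmetic_intensity = flops_per_pair / bytes_per_pair    # FLOP performed per byte moved
    machine_balance = peak_flops / mem_bandwidth               # FLOP the machine can do per byte it can move

    if arithmetic_intensity < machine_balance:
        print("kernel is memory-bandwidth bound on this hypothetical machine")
    else:
        print("kernel is compute bound on this hypothetical machine")

Which side of that balance an application falls on determines whether faster processors, more memory bandwidth, or better data layout will help most, which is exactly the kind of mutual tuning argued for above.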

Finally, there is one more critical issue with respect to the development of a suitable cyberinfrastructure for biology. Our society is not training nearly enough prospective workers in computational biology, nor enough quantitative biology researchers in general, to make progress in biological computing commensurate with the increased availability and power of computing resources. We need focused training at both the undergraduate and graduate levels to produce a generation of computational biologists capable of integrating the physics, systems, and informatics approaches to biological computing, and a generation of biologists able to use computational tools in the service of quantitative design perspectives on living systems. The need for such training is well articulated in the National Academy of Sciences report BIO 2010.5 The first university to unreservedly embrace the BIO 2010 recommendations by fully integrating computing into all levels of its biology curriculum is the University of California at Merced.6

References
1 Biology Workbench - workbench.sdsc.edu/
2 Network for Computational Nanotechnology - www.ncn.purdue.edu/
3 nanoHUB - www.nanohub.org/
4 see Grossfield, A., Feller, S. E., Pitman, M. C. "A role for direct interactions in the modulation of rhodopsin by ω-3 polyunsaturated lipids." PNAS 103: 4888-4893, 2006.
5 BIO 2010: Transforming Undergraduate Education for Future Research Biologists - www.nap.edu/books/0309085357/html/
6 University of California, Merced - biology.ucmerced.edu/


Reference this article
Jakobsson, E. "Specifications for the Next-Generation Computational Biology Infrastructure," CTWatch Quarterly, Volume 2, Number 3, August 2006. http://www.ctwatch.org/quarterly/articles/2006/08/specifications-for-the-next-generation-computational-biology-infrastructure/
