Cyber Science Infrastructure Initiative for Boosting Japan’s Scientific Research
February 2006
The Cyber Science Infrastructure (CSI) is a new comprehensive framework in which Japanese universities and research institutions are collaboratively constructing an information technology (IT) based environment to boost scientific research and education activities. Various initiatives are reorganized and included in CSI, such as the national research grid initiative, the university PKI and authentication system initiative, and the academic digital contents projects, as well as the project for a next-generation high-speed network. CSI was launched in late 2004 as a collaborative effort of leading universities, research institutions, and the National Institute of Informatics (NII).
NII is an interuniversity research institution that was established in April 2000 to conduct comprehensive research on informatics; it is the only academic research institution dedicated to informatics and IT. The Institute has also been assigned a pivotal role in developing a scientific information and networking infrastructure for Japan. NII therefore also has a service operation arm for networking and providing scholarly information. NII puts priority on developing cutting-edge technologies for networks and information services and on maintaining an infrastructure that will improve the networking and information environment of Japanese universities and research institutions.
In this article, we describe the current CSI activities being coordinated by NII as a collaborative effort with Japanese universities and research institutions.
For more than 20 years, NII and its predecessor institution, NACSIS, have provided a network and information service infrastructure for Japanese universities and research institutions, with the budgetary support of Japan's Ministry of Education, Culture, Sports, Science and Technology (MEXT). In this undertaking, NII is responsible for planning a better information environment and being a coordinator that can meet the diverse expectations of research communities and higher education institutions.
After the corporatization of Japanese national universities in April 2004, which significantly impacted research and educational communities and brought about the transformation of university administration systems, NII began a new network coordination initiative with leading universities and research communities that were shifting towards IT-based research. NII believes that some sort of core body is necessary to lead an initiative that deals with issues shared by many research and educational institutions. In addition to the network coordination, it considers the following three goals to be urgent for the Japanese research community:
Accomplishing these goals is NII's most important mission, so NII has integrated its activities for developing information infrastructures into the Cyber Science Infrastructure (CSI) initiative, incorporating the prospect of cooperating with researchers outside NII who share these three goals. The executive organization for network planning and coordination was created by NII in early 2005. It identifies CSI as a joint initiative among universities and research institutions aimed at evolving the nation's scientific information infrastructure.
In 2005, the CSI initiative obtained support from MEXT and the Council for Science and Technology Policy of the Japanese government, which emphasized CSI's importance as a national initiative by ensuring that NII's budget for fiscal year 2006 would include funds for CSI. The executive framework for development and dissemination of scholarly information is also part of this initiative, as is the framework for network-related activities, which include the next-generation high-speed optical network, the national research grid initiative, and PKI implementation.
NII started its network operations in 1987 as a major service for universities and research institutions throughout Japan. At first, it employed an X.25 packet switching network to connect computer centers in universities. In 1992, when improving campus information environments was an important issue, it adopted Internet Protocols for interconnecting campus networks. Since then, this academic Internet backbone, called the Science Information Network (SINET)1, has been expanded in an effort to promote research and education and to disseminate scholarly information for universities and research institutions. As shown in Fig. 1, 44 network nodes are currently deployed at major universities and research institutions. The lines from other institutions are aggregated at routers on the nodes. In 2005, 722 institutions were using SINET as an indispensable infrastructure for their daily activities. All of the nodes are connected by optical networks of 1 Gbps or faster, and the backbone speed is 10 Gbps.
SINET operates overseas connections as well. The 10-Gbps line to Abilene in the U.S. is the most heavily used Internet connection to Western countries, and GEANT in Europe is connected through this circuit. SINET also has a 2.4-Gbps line to Los Angeles in the U.S.
In January 2006, SINET launched SINET/Asia operations by deploying 622-Mbps international lines to Singapore and Hong Kong. These lines are used as Internet connections among Asian countries and Australia. Through cooperation with the TEIN2 project, initiated by the European Union, SINET will also eventually have another path to Europe via this southbound internetworking.
In 2002, NII launched the Super SINET initiative to provide a cutting-edge network infrastructure for research projects involving large amounts of data and supercomputers. Super SINET provides a 10-Gbps network environment and high-speed routers at selected nodes to meet the enormous bandwidth requirements of high-speed network-dependent research on high-energy physics, nuclear fusion science, space and astronomical science, genome analysis, and nanotechnology research. It is interconnected with SINET, and both networks are being operated in a multi-tier manner. Super SINET is also employed for the grid initiative called NAREGI (National Research Grid Initiative).
Although SINET and Super SINET have served admirably as the Japanese science infrastructure, their traffic volume has been rapidly growing, and their user requirements have become more diversified. A future science infrastructure that supports research on a variety of leading-edge applications will have to provide adequate network services to its users and respond flexibly to changes in their requirements. NII is therefore planning to construct a next-generation network that integrates both SINET and Super SINET.
Figure 2 shows a high-level description of the network architecture for this next-generation SINET. The transport network is a hybrid IP and optical network that accommodates multiple-layer services such as IP, Ethernet, and layer-1 (dedicated line) services. The network resource, i.e., network bandwidth, is flexibly assigned to each layer according to demand. For example, the bandwidth for a layer-1 connection can be assigned on demand by reducing the bandwidth for layers 2 and 3. Next-generation SDH/SONET technologies such as the Generic Framing Procedure (GFP), Virtual Concatenation (VCAT), and the Link Capacity Adjustment Scheme (LCAS) will be essential to realizing this flexible network architecture. Generalized Multi-Protocol Label Switching (GMPLS) is also a key technology for achieving bandwidth on demand (BoD) services. IP routers provide a converged IP/MPLS platform for layer 2 and 3 services. L2 and L3 Virtual Private Networks (VPNs) over MPLS, IPv4/IPv6 dual stack, and sophisticated QoS will be common infrastructural functions.
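The layer-1 example above can be illustrated with a minimal sketch. It models a single link whose capacity is shared between dedicated layer-1 circuits and layer-2/3 packet traffic. The class name `SharedLink`, the `min_packet_gbps` floor, and the numbers are assumptions made for illustration; the sketch does not model GMPLS signaling, VCAT member groups, or LCAS, and it is not the SINET design itself.

```python
from dataclasses import dataclass


@dataclass
class SharedLink:
    """One link whose capacity is split between layer-1 circuits and layer-2/3 traffic."""

    capacity_gbps: float
    min_packet_gbps: float = 1.0   # assumed floor kept for IP/Ethernet traffic
    l1_reserved_gbps: float = 0.0  # total currently held by layer-1 circuits

    @property
    def packet_gbps(self) -> float:
        """Bandwidth currently left for layer-2 and layer-3 services."""
        return self.capacity_gbps - self.l1_reserved_gbps

    def request_l1(self, gbps: float) -> bool:
        """Grant a layer-1 circuit by shrinking the packet share, if the floor allows."""
        if gbps <= 0:
            raise ValueError("requested bandwidth must be positive")
        if self.packet_gbps - gbps < self.min_packet_gbps:
            return False  # would starve layer-2/3 services, so reject
        self.l1_reserved_gbps += gbps
        return True

    def release_l1(self, gbps: float) -> None:
        """Return a layer-1 circuit's bandwidth to the packet layers."""
        if gbps <= 0 or gbps > self.l1_reserved_gbps:
            raise ValueError("cannot release more than is reserved")
        self.l1_reserved_gbps -= gbps


# Usage: a hypothetical 10-Gbps link.
link = SharedLink(capacity_gbps=10.0)
granted = link.request_l1(2.5)   # fits: the packet share drops to 7.5 Gbps
rejected = link.request_l1(7.0)  # would leave only 0.5 Gbps for packets
```

Keeping a floor for the packet layers is one possible policy; a real controller would more likely weigh measured demand on each layer before reassigning capacity.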
To optimize network resource utilization and to enhance network resilience, the network has to supervise multiple layers and find the optimal resource assignment dynamically. The growth in the traffic of layer 1 services means that we will need a network control platform that can efficiently and dynamically monitor and control multiple layers.
A service control platform that helps to accelerate development of user applications by enabling collaboration between users and the network will be very important in the future infrastructure. The platform will include on-demand bandwidth assignment, advanced network security, and collaboration between middleware/applications and the network. We plan to start deployment of the next-generation SINET in 2007. We believe that this improved network will provide better environments for Japanese universities and research communities.
To facilitate interoperability and availability of the next-generation network and the grid middleware among universities, we also have to ensure a secure and reliable operation environment for universities that have diversified research and education policies. It is for this purpose that we launched the University Public Key Infrastructure (UPKI) initiative in 2005. The initiative will bring together informatics researchers at major universities and research institutions, and it is initially intended to be a means to exchange course credits among universities and to authenticate cooperative research activities. The developers are designing a universal scheme for the campus-wide authentication and authorization system and its applications for the Japanese university environment. The UPKI initiative is expected to be the key to promoting the sharing of computing facilities and information resources among institutions participating in CSI.
The National Research Grid Initiative (NAREGI)2 is one of the core projects within CSI. It started as the five-year Japanese National Grid Project (2003–2007), with a dedicated budget of over $100 million. It is hosted by NII and led by Ken-ichi Miura. The chief aim of NAREGI is to develop a set of grid middleware to serve as a basis for CSI and other international grid infrastructural efforts. It will also serve as the centerpiece for constructing a nationwide research grid infrastructure. (A decision has recently been made to extend NAREGI beyond 2007 by consolidating it with the recent national petascale computing project.)
To achieve these goals, over 100 professional software developers from computing vendors such as Fujitsu, NEC, and Hitachi are working together in groups called working packages (WPs) on a comprehensive middleware stack (Figure 3). NII professors technically supervise the respective WPs; this is a unique combination of academic and industrial talent not seen in other major international grid projects. NAREGI is also investing strongly in the standardization of grid middleware, especially the OGSA (Open Grid Services Architecture) in GGF (the Global Grid Forum)3, and has actively adopted the standards of other organizations whenever possible. NAREGI will also collaborate with other international grid infrastructure projects, such as TeraGrid and OSG in the U.S., EGEE and DEISA in Europe, and national-level projects in the Asia-Pacific region. Since 2005, NAREGI has held several interoperability meetings with TeraGrid and EGEE, and it plans to set milestones so that the middleware stacks of the different projects will be interoperable at various levels in order to host virtual organizations on an international scale.
Another essential aspect of NAREGI is the inclusion of nanoscience as an "early adopter" application area and a virtual organization (VO). The Institute for Molecular Science (IMS) located in Okazaki, Japan, is serving as the VO hosting institution. The experimental deployment of grid-enabled nanoscience applications over the NAREGI testbed, together with a production environment that will be hosted by the computing centers, will be significant in terms of the scale of the computational requirements, and it will affect academia and industry users who work in nanoscience and nanotechnology.
NII, which hosts the overall project as well as the Center for Grid Middleware Research and Development, and IMS collectively operate the dedicated NAREGI testbed. The testbed uses the Super SINET optical network as the underlying network infrastructure. It currently provides nearly 18 Tflops of computing power distributed over nearly 3000 processors (Figure 4) and is in heavy use for middleware development and grid-enabled nanoscience applications. The presence of a dedicated non-commercial testbed has proven essential in enabling large teams to collaborate toward reaching R&D goals and fostering integration.
The project's initial milestone of internal alpha delivery of middleware was met in the spring of 2005, and the near-production quality and standards-setting/compliant beta version will be ready for the Global Grid Forum 17 in Tokyo in May 2006. Several virtual organizations will also begin operation in 2006, including ones for nanoscience, high-energy physics, and the consortium of major university centers. There will be releases over the next two years to refine and enhance the middleware capabilities so that NAREGI can be used in a 24/7 fashion at major computing centers around the world.
As of early 2006, the project target of achieving a 100 Tflop-scale CSI production grid seems easily achievable in terms of the computing capacity of the facilities involved in the project. Thus, to support grids that are beyond the petascale, the NAREGI project will be "upscaled" starting April 2006. That is, NAREGI will become part of the new petascale high-performance computing initiative, which aims to construct a computing environment centered around a next-generation Earth Simulator-class machine that will exceed 10 Pflops by 2011. This machine will be supplemented by a fleet of centers, each of which will host a petascale machine, constituting a nationwide multi-petascale grid. The NAREGI project will thus last until the spring of 2011 or even beyond, with developments focusing on hosting a grid of enormous computing capacity. NAREGI will collaborate with other CSI efforts, such as the national-scale UPKI, so that every researcher and student can have access to the national grid.
NII provides a wide range of academic digital contents services that contribute to the research activities of CSI. In April 2005, NII launched GeNii4 as a unified portal of databases on various academic subjects. GeNii currently offers four services: CiNii, Webcat Plus, KAKEN, and NII-DBR. GeNii provides a meta-search interface that allows simultaneous searches across the four databases as well as individual search interfaces that fully utilize the features of each database. CiNii offers comprehensive information on research articles written in Japanese. NII provides nearly 2.4 million full-text articles from Japanese academic journals and research bulletins published by universities. Webcat Plus provides bibliographic information on books and serials by means of the Union Catalog Database of NACSIS-CAT, which was established by NII and Japanese university libraries in 1985. NII has added many bibliographic records to Webcat Plus, which now includes nearly 12 million records. One of Webcat Plus's features is its "associative search" function. This function makes it possible to conduct intuitive searches in a way that is similar to human thinking processes. The KAKEN database offers information on the research activities supported by Grants-in-Aid for Scientific Research from MEXT, and NII-DBR is a collection of multi-disciplinary databases on Japanese academic societies and researchers.
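The meta-search idea described above, one query sent to several databases at once with the hits merged into a single list, can be sketched as follows. This is only an illustration of the fan-out pattern: the in-memory `BACKENDS` table and its titles are invented placeholders, and the functions `search_one` and `meta_search` are hypothetical names, not GeNii's actual interface.

```python
from concurrent.futures import ThreadPoolExecutor

# Invented placeholder records standing in for the four GeNii services.
BACKENDS = {
    "CiNii": ["grid middleware for nanoscience", "optical network design"],
    "Webcat Plus": ["introduction to grid computing"],
    "KAKEN": ["grant: research grid infrastructure"],
    "NII-DBR": ["directory of informatics researchers"],
}


def search_one(name, query):
    """Case-insensitive substring search within one backend."""
    q = query.lower()
    return [(name, title) for title in BACKENDS[name] if q in title.lower()]


def meta_search(query):
    """Send the query to every backend concurrently and merge the hits.

    Results keep the backend order of BACKENDS, and each hit is tagged
    with the name of the service it came from.
    """
    names = list(BACKENDS)
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        per_backend = pool.map(lambda n: search_one(n, query), names)
        return [hit for hits in per_backend for hit in hits]


hits = meta_search("grid")
```

Tagging each hit with its source lets a portal offer both the merged view and a jump into the individual search interface of each database.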
The university libraries' consortia and NII jointly operate an electronic journal repository, called NII-REO. The repository provides long-term access to electronic journals that are indispensable for academic research. About 60,000 articles from several scholarly publishers are currently stored on the REO server. Furthermore, the large-scale archival digital contents of Springer-Verlag and Oxford University Press journals, which go back to the 19th century, will be added to this collection in 2006. These materials are being utilized by the participating universities. NII and the university libraries are planning to add more titles to this digital archive upon the completion of negotiations with other publishers.
Institutional repositories are digital collections that manage and disseminate scholarly materials created by an institution and its members. They address two strategic issues facing academic institutions: they provide a central component for reforming scholarly communications and they serve as tangible indicators of an institution's quality and activities, thus increasing its visibility, prestige, and public value.
In recent years, more and more academic libraries have started to construct institutional repositories. In 2005, NII started a collaborative project with 19 university libraries aimed at the deployment and coordination of institutional repositories in Japan. By using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), NII will harvest metadata from institutional repositories and develop a centralized metadata repository, on top of which it intends to offer various value-added services to the scholarly community. Other roles that NII will assume in this initiative include conducting activities to aid the formation of an institutional repository community and providing technical support and advocacy services.
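To make the harvesting step concrete, here is a minimal sketch of the client side of OAI-PMH: building a `ListRecords` request and extracting identifiers, Dublin Core titles, and the resumption token from a response. The base URL, the sample response, and the function names are assumptions for illustration; the sketch performs no network access and omits error handling, and it does not describe NII's actual harvester.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def build_list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token replaces the other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, token); token is None when the list is complete."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        titles = [t.text or "" for t in rec.iterfind(".//dc:title", NS)]
        records.append({"identifier": identifier, "titles": titles})
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    return records, (token.strip() or None)


# Invented sample response from a hypothetical repository.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>tok-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

first_url = build_list_records_url("https://repository.example.ac.jp/oai")
records, token = parse_list_records(SAMPLE)
next_url = build_list_records_url("https://repository.example.ac.jp/oai", resumption_token=token)
```

A harvester would repeat the request with each returned token until the repository sends none, then load the collected records into the central metadata store.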
The Cyber Science Infrastructure (CSI) is the new initiative for evolving Japan's academic information infrastructure. It was launched in 2005 by NII and is supported by Japanese universities and research institutions. Its executive organization is working to elaborate the comprehensive concept behind CSI; some issues have already been addressed. For example, the detailed specification of the next-generation network will soon be settled on and will reflect the technical advice of many researchers. Procurements will begin in April 2006 and deployment will start in April 2007. Furthermore, the budget for developing the UPKI software has been partially appropriated for the fiscal year 2006. The progress of the CSI initiative will be reported in various research papers as well as on our web site.