News and Announcements

TOP500 – November 2017

Celebrating its 50th edition and more than two decades of tracking supercomputing progress, the TOP500 continues to provide a reliable historical record of supercomputers around the world. The list clearly lays out critical HPC metrics across all 500 machines and draws a rich picture of the state of the art in performance, energy consumption, and power efficiency.

In November 2017, the 50th TOP500 list showed China overtaking the United States in the number of ranked systems, with 202 Chinese machines to 143 US machines. China now has more machines in the TOP500 than ever before, while the number of US machines has dropped to a 25-year low. As for the rankings themselves, China’s Sunway TaihuLight system, installed at the National Supercomputing Center in Wuxi, holds the number one spot for the fourth consecutive list, with an HPL run of 93.01 petaFLOP/s.

Regarding the TOP500’s longevity as a benchmark, ICL’s Jack Dongarra (co-author of the TOP500 list) had this to say:

I have had the good luck of being in the right place for the creation of several important projects in high-performance computing: LINPACK, TOP500, BLAS, MPI, ScaLAPACK, and ATLAS, to name a few. The LINPACK benchmark and the TOP500 list have a life of their own today as a recognized standard for performance measurement.

Below are the top 5 entries from the November 2017 list.

  1. Sunway TaihuLight – Sunway MPP, Sunway SW26010 260C 1.45GHz, Sunway, NRCPC
     National Supercomputing Center in Wuxi, China
     Cores: 10,649,600; Rmax: 93,014.6 teraFLOP/s; Rpeak: 125,435.9 teraFLOP/s; Power: 15,371 kW
  2. Tianhe-2 (MilkyWay-2) – TH-IVB-FEP Cluster, Intel Xeon E5-2692 12C 2.200GHz, TH Express-2, Intel Xeon Phi 31S1P, NUDT
     National Super Computer Center in Guangzhou, China
     Cores: 3,120,000; Rmax: 33,862.7 teraFLOP/s; Rpeak: 54,902.4 teraFLOP/s; Power: 17,808 kW
  3. Piz Daint – Cray XC50, Xeon E5-2690v3 12C 2.6GHz, Aries interconnect, NVIDIA Tesla P100, Cray Inc.
     Swiss National Supercomputing Centre (CSCS), Switzerland
     Cores: 361,760; Rmax: 19,590.0 teraFLOP/s; Rpeak: 25,326.3 teraFLOP/s; Power: 2,272 kW
  4. Gyoukou – ZettaScaler-2.2 HPC system, Xeon D-1571 16C 1.3GHz, Infiniband EDR, PEZY-SC2 700MHz, ExaScaler
     Japan Agency for Marine-Earth Science and Technology, Japan
     Cores: 19,860,000; Rmax: 19,135.8 teraFLOP/s; Rpeak: 28,192.0 teraFLOP/s; Power: 1,350 kW
  5. Titan – Cray XK7, Opteron 6274 16C 2.200GHz, Cray Gemini interconnect, NVIDIA K20x, Cray Inc.
     DOE/SC/Oak Ridge National Laboratory, United States
     Cores: 560,640; Rmax: 17,590.0 teraFLOP/s; Rpeak: 27,112.5 teraFLOP/s; Power: 8,209 kW
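To see how these metrics fit together, the short C program below works through two derived quantities for the number one system: HPL efficiency (Rmax/Rpeak) and power efficiency (Rmax per watt). It is just an illustrative calculation using the figures from the list above.

```c
/* Illustrative calculation using the Sunway TaihuLight figures above:
 * HPL efficiency = Rmax/Rpeak, power efficiency = Rmax per watt. */
#include <stdio.h>

int main(void) {
    double rmax_tflops  = 93014.6;    /* sustained HPL performance (teraFLOP/s) */
    double rpeak_tflops = 125435.9;   /* theoretical peak (teraFLOP/s) */
    double power_kw     = 15371.0;    /* measured power draw (kW) */

    double efficiency = rmax_tflops / rpeak_tflops;
    /* teraFLOP/s -> gigaFLOP/s is *1e3 and kW -> W is *1e3, so the
     * factors cancel: GFLOP/s per watt is simply Rmax/Power. */
    double gflops_per_watt = rmax_tflops / power_kw;

    printf("HPL efficiency:   %.1f%%\n", 100.0 * efficiency);            /* ~74.2% */
    printf("Power efficiency: %.2f gigaFLOP/s per watt\n", gflops_per_watt); /* ~6.05 */
    return 0;
}
```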

Michela Taufer to Join EECS

In 2017, Jack Dongarra created an endowment for the Jack Dongarra Professorship in High-Performance Computing within the Department of Electrical Engineering and Computer Science in UTK’s Tickle College of Engineering. The professorship targets individuals with expertise and experience in high-performance and scientific computing.

Professor Michela Taufer, an Association for Computing Machinery Distinguished Scientist and J.P. Morgan Chase Scholar currently at the University of Delaware, will be the first Jack Dongarra Professor in High-Performance Computing. Prof. Taufer, who is an expert in HPC and director of the University of Delaware’s Global Computing Lab, will join UTK’s Department of Electrical Engineering and Computer Science in June 2018.

For our part, ICL is looking forward to a fruitful collaboration with Michela and her team. Welcome to the fold, Prof. Taufer!

2018 ICL Winter Reception

After a whirlwind autumn and winter, the ICL Winter Reception offered a welcome reprieve where members of ICL could relax, eat, drink, and enjoy the camaraderie. For 2018, we returned to Calhoun’s on the River, which graciously hosted around 50 ICLers and their invited guests. Above, we have provided some photos of the event for your entertainment. Here’s to another productive year at ICL!

Conference Reports

SC17

The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), established in 1988, is a staple of ICL’s November itinerary. SC17 was held in Denver, Colorado, on November 12–17. As usual, ICL had a significant presence at SC, with faculty, research staff, and students giving talks, presenting papers, and leading “Birds of a Feather” sessions.

Since the University of Tennessee did not have a booth at SC this year, ICL offered an online “virtual booth” through which interested parties could keep tabs on ICL-related events, including a list of attendees and a detailed schedule of talks.

Click here to view the SC17 photo gallery, and click here to see more details about ICL’s SC17 activities.

ICL Retreat – 2017

The 2017 ICL retreat moved back to the Tremont Lodge & Resort in Townsend, Tennessee, which provided a good venue for two days of talks covering student projects and summer internships; the lab’s progress in linear algebra, distributed computing, benchmarking, and performance analysis; and recaps of administrative procedures.

There was some fun to be had as well, and Miss Lilly’s Cafe opened its doors to host the ICL crew for some delicious southern cooking and beverages of all sorts. Serving as a kickoff to the fall semester, the 2017 retreat offered a platform for ICLers new and old to get their bearings and hit the ground running for another great year at ICL!

Recent Releases

2018 ICL Annual Report

For seventeen years, ICL has produced an annual report to provide a concise profile of our research, including information about the people and external organizations who make it all happen. Please download a copy and check it out.

Big Data and Extreme-Scale Computing: Pathways to Convergence

At a birds-of-a-feather (BoF) session during SC17, the Big Data and Extreme-scale Computing (BDEC) project issued a report, “Pathways to Convergence: Towards a Shaping Strategy for a Future Software and Data Ecosystem for Scientific Inquiry.” This report covers the five international workshops staged by the BDEC principals over the last four years—workshops designed to explore the highly disruptive effects of the big data revolution on the development of future cyberinfrastructure for science.

These meetings drew participation from all main institutional branches of the HPC community, including academia, government laboratories, and private industry; and they included leaders from major scientific domains and from all different segments of the research community involved in designing, developing, and deploying cyberinfrastructure for science.

Commenting on the work for HPCWire, ICL’s Jack Dongarra—a co-author of the report—said, “Computing is at a profound inflection point, economically and technically. The end of Dennard scaling and its implications for continuing semiconductor-design advances, the shift to mobile and cloud computing, the explosive growth of scientific, business, government, and consumer data and opportunities for data analytics and machine learning, and the continuing need for more powerful computing systems to advance science and engineering are the context for the debate over the future of exascale computing and big data analysis.”

In the report, the BDEC community examines the progress toward or potential for convergence at three different levels.

  1. Examples of successful convergence in large-scale scientific applications: Several different application communities have already shown that HPC and HDA technologies can be combined and integrated in multi-stage application workflows to open new frontiers of research.
  2. Progress toward convergence on large-scale HPC platforms: The potential for a fusion between simulation-centric and data analysis–centric methods has motivated substantial progress toward the more flexible forms of system management that such workflows will require. The report itemizes and analyzes some of the different challenges and opportunities that remain.
  3. Absence of community convergence on a next generation distributed services platform: The erosion of the classic Internet paradigm under the onslaught of big data—and the proliferation of competing approaches to create a new type of distributed services platform (e.g., “edge” or “fog” computing) to replace it—is perhaps the most important challenge that the cyberinfrastructure community will confront in the coming decade. To help frame the discussion moving forward, the BDEC report seeks to articulate the most fundamental requirements—common, open, shareable, interoperable—that might be reasonably agreed to as a starting point for any community codesign effort in this area.

Read HPCWire’s article on the BDEC report or click here to download the report in its entirety.

LAPACK 3.8.0

LAPACK 3.8.0 is now available. The Linear Algebra PACKage (LAPACK) is a widely used library for efficiently solving dense linear algebra problems, and ICL has been a major contributor to its development and maintenance since the package’s inception. LAPACK itself is sequential; it relies on the BLAS and inherits parallelism from a multi-threaded BLAS implementation when one is available.

LAPACK 3.8.0, released in November 2017, adds a communication-avoiding symmetric-indefinite factorization: Aasen’s triangular tridiagonalization implemented as a two-stage algorithm built on level-3 BLAS.
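As a quick way to try the new solver, here is a minimal sketch of a call through LAPACK’s Fortran interface. The routine name dsysv_aa_2stage and its argument list and workspace sizes follow our reading of the LAPACK 3.8.0 documentation; treat them as assumptions to be verified against your installation.

```c
/* A sketch of solving A*x = b with the two-stage Aasen factorization added
 * in LAPACK 3.8.0. Routine name and argument list per the LAPACK docs;
 * verify against your installation before relying on this. */
#include <stdio.h>

extern void dsysv_aa_2stage_(const char *uplo, const int *n, const int *nrhs,
                             double *a, const int *lda,
                             double *tb, const int *ltb,
                             int *ipiv, int *ipiv2,
                             double *b, const int *ldb,
                             double *work, const int *lwork, int *info);

int main(void) {
    const int n = 3, nrhs = 1, lda = 3, ldb = 3;
    /* Symmetric indefinite matrix, column-major; "U" means the upper
     * triangle is referenced. */
    double a[9] = { 4.0,  1.0,  2.0,
                    1.0, -3.0,  0.5,
                    2.0,  0.5,  1.0 };
    double b[3] = { 1.0, 2.0, 3.0 };    /* right-hand side; overwritten with x */

    int    ltb = 4 * n;      /* band storage for the tridiagonal stage */
    double tb[12];
    int    ipiv[3], ipiv2[3], info;
    int    lwork = 64 * n;   /* generous; lwork = -1 performs a workspace query */
    double work[192];

    dsysv_aa_2stage_("U", &n, &nrhs, a, &lda, tb, &ltb,
                     ipiv, ipiv2, b, &ldb, work, &lwork, &info);

    printf("info = %d, x = [%g, %g, %g]\n", info, b[0], b[1], b[2]);
    return 0;
}
```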

Click here to download the tarball.

MAGMA 2.3.0

MAGMA 2.3.0 is now available. Matrix Algebra on GPU and Multicore Architectures (MAGMA) is a collection of next-generation linear algebra (LA) libraries for heterogeneous architectures. The MAGMA package supports interfaces for current LA packages and standards (e.g., LAPACK and BLAS) to allow computational scientists to easily port any LA-reliant software components to heterogeneous architectures.

MAGMA 2.3 features LAPACK-compliant routines for multi-core CPUs enhanced with NVIDIA GPUs (including the Volta V100). MAGMA now includes more than 400 routines, covering one-sided dense matrix factorizations and solvers, and two-sided factorizations and eigen/singular-value problem solvers, as well as a subset of highly optimized BLAS for GPUs.
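To give a feel for the drop-in interface, below is a minimal sketch that solves a small dense system through MAGMA’s LAPACK-style CPU interface. It assumes the magma_dgesv entry point (which mirrors LAPACK’s dgesv) and the standard magma_init/magma_finalize setup from the MAGMA 2.x headers.

```c
/* A minimal sketch of MAGMA's LAPACK-style CPU interface: data starts in
 * host memory and MAGMA offloads the factorization to the GPU internally.
 * Assumes the magma_dgesv entry point, mirroring LAPACK's dgesv. */
#include <stdio.h>
#include "magma_v2.h"

int main(void) {
    magma_init();

    magma_int_t n = 2, nrhs = 1, info;
    magma_int_t ipiv[2];
    /* Column-major storage, as in LAPACK. */
    double A[4] = { 4.0, 1.0,     /* column 1 */
                    2.0, 3.0 };   /* column 2 */
    double B[2] = { 1.0, 2.0 };   /* right-hand side; overwritten with x */

    /* Solve A*x = b via GPU-accelerated LU with partial pivoting. */
    magma_dgesv(n, nrhs, A, n, ipiv, B, n, &info);
    printf("info = %lld, x = [%g, %g]\n", (long long) info, B[0], B[1]);

    magma_finalize();
    return 0;
}
```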

Other updates and features include:

  • Moved MAGMA’s repository to Bitbucket: https://bitbucket.org/icl/magma
  • Added support for Volta GPUs
  • Improved performance for batched LU and QR factorizations on small square sizes up to 32
  • Added test matrix generator to many testers
  • Launched data analytics tools (MagmaDNN 0.1 Alpha) using MAGMA as the computational back end

Additional MAGMA-sparse features:

  • Added support for CUDA 9.0
  • Improved the stability and scalability of the ParILUT algorithm
  • Added ParICT, a symmetry-exploiting version of the ParILUT algorithm

Click here to download the tarball.

PAPI 5.6.0

PAPI 5.6.0 is now available. The Performance Application Programming Interface (PAPI) supplies a consistent interface and methodology for collecting performance counter information from various hardware and software components, including most major CPUs, GPUs and accelerators, interconnects, I/O systems, and power interfaces, as well as virtual cloud environments.

PAPI 5.6.0 contains a major cleanup of the source code and the build system to ensure consistent code structure, eliminate errors, and reduce redundancies. A number of validation tests have been added to verify PAPI’s preset events. Multiple PAPI components have also seen improvements and changes, ranging from support for new events to fixes in component testing.
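In practice, PAPI’s event-set workflow is compact. The sketch below counts the PAPI_TOT_INS preset event around a region of interest (error handling trimmed for brevity):

```c
/* A minimal sketch of PAPI's event-set workflow: initialize the library,
 * add a preset event, and count it around a region of interest. */
#include <stdio.h>
#include <stdlib.h>
#include "papi.h"

int main(void) {
    int eventset = PAPI_NULL;
    long long count;

    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) {
        fprintf(stderr, "PAPI init failed\n");
        return EXIT_FAILURE;
    }
    PAPI_create_eventset(&eventset);
    PAPI_add_event(eventset, PAPI_TOT_INS);  /* total instructions (preset) */

    PAPI_start(eventset);
    /* ... region of interest ... */
    volatile double x = 0.0;
    for (int i = 0; i < 1000000; i++) x += i * 0.5;
    PAPI_stop(eventset, &count);

    printf("Instructions completed: %lld\n", count);
    return 0;
}
```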

For specific and detailed information on changes made in this release, see ChangeLogP560.txt for keywords of interest or go directly to the PAPI git repository.

Click here for more information on downloading and installing PAPI.

ULFM 2.0

ULFM 2.0 is now available. User Level Failure Mitigation (ULFM) is a set of new interfaces for MPI that enables message-passing applications to restore MPI functionality affected by process failures. The MPI implementation is spared the expense of taking protective and corrective actions internally; instead, it prevents fault-related deadlock by reporting any operation whose completion was rendered impossible by a failure, leaving recovery to the application.
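The sketch below shows the basic recovery pattern these semantics enable, assuming the ULFM-enabled Open MPI with its MPIX_ extensions: errors are returned to the application rather than aborting the job, and the survivors shrink the communicator and continue.

```c
/* A minimal sketch of the ULFM recovery pattern: run communication with
 * errors returned (not aborted), and shrink the communicator when a peer
 * failure is reported. MPIX_Comm_shrink comes from the ULFM extension. */
#include <mpi.h>
#include <mpi-ext.h>   /* ULFM prototypes (MPIX_*) in ULFM-enabled Open MPI */
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    MPI_Comm comm = MPI_COMM_WORLD;

    /* Report failures to the application instead of aborting. */
    MPI_Comm_set_errhandler(comm, MPI_ERRORS_RETURN);

    int rc = MPI_Barrier(comm);
    if (rc != MPI_SUCCESS) {
        /* A failure was reported (e.g., MPIX_ERR_PROC_FAILED): build a
         * working communicator from the surviving ranks and continue. */
        MPI_Comm shrunk;
        MPIX_Comm_shrink(comm, &shrunk);
        comm = shrunk;
    }

    int rank;
    MPI_Comm_rank(comm, &rank);
    printf("rank %d still running\n", rank);

    MPI_Finalize();
    return 0;
}
```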

For ULFM 2.0, the team focused on integrating ULFM into the current Open MPI master and improving performance and stability.

The ULFM team is also providing a Docker package for users who might wish to further investigate ULFM’s capabilities before launching a full-scale install of the software.

A full list of ULFM 2.0 features and changes is available here.

Interview

Gerald Ragghianti

Where are you from, originally?

I have always lived in Tennessee. I grew up near Memphis and then came to east Tennessee for school.

Can you summarize your educational background?

In high school I was a pretty well-rounded student; I was very active in music, science, and math. I finished the equivalent of Calculus III while in high school. I attended the Governor’s School for the Sciences (hosted here at UTK) in 1997, and that experience encouraged me to pursue Physics when I came to UTK as a freshman in 1998. I majored in Physics with minors in Mathematics and trumpet performance. I was offered a graduate fellowship here and completed a Master’s thesis in high-energy physics with an emphasis on computational methods. I worked on the BaBar experiment at the Stanford Linear Accelerator Center and the Compact Muon Solenoid (CMS) experiment at the LHC (Large Hadron Collider). During this time I designed, built, and operated a number of Linux compute clusters that were used for data analysis and data simulation for these experiments. My work also involved applying neural networks to the BaBar experiment’s statistical analysis methods.

Where did you work before joining ICL?

I was very lucky that a new computing program at UT, the Newton HPC Program, was getting started just as I was graduating. The university was looking for someone with computational science research and IT experience to run a centralized computing program that would bring HPC capability to all research disciplines on campus. I enjoyed the opportunity to serve a wider research community and went on to lead the technical and program direction of the Newton HPC Program for the next ten years. During that time, the program grew from a small Linux cluster of 32 compute nodes and a handful of users to over 200 compute nodes and more than 400 regular users.

How did you first hear about the lab and what made you want to work here?

I quickly learned of ICL upon first coming to UT because of the TOP500 list. At the time, HPC seemed quite mysterious and exclusive, but I knew that the lab had a strong worldwide reputation, and I always had the thought in the back of my mind of one day working here. When I was in grad school, my research lab was considering the use of some of the logistical networking technologies that Terry Moore was involved with, and I have continued to keep in touch with him over the years. It was fortuitous that I happened to bump into him recently just as a high-level restructuring of my job created new “opportunities” for my professional future. It was quite unexpected that ICL would be interested in bringing me on board, and I’m very excited to be able to expand my horizons here.

What is your focus here at ICL?  What are you working on?

My first priority is to get all of the ICL research infrastructure (development computer hardware) into a well-defined, managed state that will allow the lab to rely on it for their work. This involves use-case evaluation, taking inventory of the hardware that we have, implementing improvements, and planning for future needs. Eventually, this will take less time, and I’ll be able to expand my view into other areas such as application integration of the lab’s libraries.

What are your interests/hobbies outside of work?

My primary hobby is collecting new hobbies.  I love researching new fields, learning new skills, and collecting new tools that make me more independent. Most recently, I have been doing diesel engine repair, Perl programming, landscape design, and IoT development.

Tell us something about yourself that might surprise people.

I am only interested in the newest technology to the extent that it can solve problems without creating new ones. I don’t enjoy tech gadgets for their own sake. I don’t use Facebook. Yes, I am proud of that.

If you weren’t working at ICL, where would you like to be working and why?

I think I like working in, or closely with, academic research because it generally involves smarter people and a more “pure” driving force than a profit-driven enterprise. I also appreciate the multicultural environment that comes along with working at a research university.

Recent Papers

  1. Tomov, S., M. Gates, and A. Haidar, “Accelerating Linear Algebra with MAGMA,” ECP Annual Meeting 2018, Tutorial, Knoxville, TN, February 2018.
  2. Haidar, A., A. Abdelfattah, S. Tomov, and J. Dongarra, “Harnessing GPU's Tensor Cores Fast FP16 Arithmetic to Speedup Mixed-Precision Iterative Refinement Solvers and Achieve 74 Gflops/Watt on Nvidia V100,” GPU Technology Conference (GTC), Poster, San Jose, CA, March 2018.
  3. Anzt, H., M. Kreutzer, E. Ponce, G. D. Peterson, G. Wellein, and J. Dongarra, “Optimization and Performance Evaluation of the IDR Iterative Krylov Solver on GPUs,” The International Journal of High Performance Computing Applications, vol. 32, no. 2, pp. 220–230, March 2018. DOI: 10.1177/1094342016646844
  4. Hoemmen, M., and I. Yamazaki, “Production Implementations of Pipelined & Communication-Avoiding Iterative Linear Solvers,” SIAM Conference on Parallel Processing for Scientific Computing, Tokyo, Japan, March 2018.
  5. Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Tensor Contractions using Optimized Batch GEMM Routines,” GPU Technology Conference (GTC), Poster, San Jose, CA, March 2018.

Recent Conferences

  1. FEB – ECP 2nd Annual Meeting, Knoxville, Tennessee
     Ahmad Abdelfattah, Anthony Danalis, Aurelien Bouteiller, Azzam Haidar, Damien Genet, Earl Carr, Frank Winkler, George Bosilca, Gerald Ragghianti, Hartwig Anzt, Heike Jagode, Ichitaro Yamazaki, Jack Dongarra, Jakub Kurzak, Jamie Finney, John Henry, Mark Gates, Piotr Luszczek, Stanimire Tomov, Terry Moore, Thomas Herault
  2. FEB – MPI Forum, Portland, Oregon
     Aurelien Bouteiller
  3. FEB – George Bosilca
  4. MAR – George Bosilca, Ichitaro Yamazaki, Jack Dongarra, Piotr Luszczek
  5. MAR – Damien Genet, George Bosilca, Thomas Herault
  6. MAR – Dong Zhong, George Bosilca, Thananon Patinyasakdikul
  7. MAR – BDEC Workshop, Chicago, Illinois
     Jack Dongarra, Terry Moore, Tracy Rafferty
  8. MAR – GTC 2018, San Jose, California
     Ahmad Abdelfattah, Azzam Haidar, Ichitaro Yamazaki, Stanimire Tomov

Upcoming Conferences

  1. APR – Piotr Luszczek
  2. APR – Piotr Luszczek
  3. APR – JLESC Workshop, Barcelona, Spain
     Anthony Danalis, George Bosilca, Jack Dongarra, Thomas Herault
  4. APR – 2018 NSF SI2 PI Meeting, Washington, District of Columbia
     Azzam Haidar, George Bosilca, Heike Jagode, Piotr Luszczek
  5. MAY – IPDPS 2018, Vancouver, Canada
     Ichitaro Yamazaki, Jakub Kurzak
  6. MAY – Stanimire Tomov

Recent Lunch Talks

  1. FEB 2 – Yves Robert (ENS-Lyon): Checkpointing Workflows for Fail-Stop Errors
  2. FEB 9 – Hartwig Anzt (Karlsruhe Institute of Technology): Ginkgo: Towards an Accelerator-Focused C++ Sparse Linear Algebra Library
  3. FEB 16 – Gerald Ragghianti: Local ECP resources, Spack, and Time Machine backups
  4. FEB 23 – Sam Crawford: Interdisciplinary Graduate Minor in Computational Science
  5. MAR 2 – Anthony Danalis: Low-Level Benchmarking: A Dive down the Rabbit Hole
  6. MAR 9 – Thananon Patinyasakdikul: Injection Rate in Multithreaded MPI

Upcoming Lunch Talks

  1. APR 6 – Scott Emrich (EECS): High Throughput Genome Analysis using Makeflow and Friends
  2. APR 13 – Jiali Li: PCP Component in PAPI
  3. APR 20 – Hartwig Anzt (Karlsruhe Institute of Technology): Variable-Size Batched Condition Number Calculation on GPUs
  4. APR 27 – Yves Robert (ENS-Lyon): A Performance Model to Execute Workflows on High-Bandwidth Memory Architectures
  5. MAY 4 – Ahmad Abdelfattah: MAGMA Update: High Performance and Energy Efficient LU Factorization
  6. MAY 11 – Ana Gainaru (Vanderbilt University): Scheduling Solutions for Data-Driven Large-Scale Applications
  7. MAY 18 – Pratik Nayak (Karlsruhe Institute of Technology): Using Iterative Methods for Local Solves in Asynchronous Schwarz Methods
  8. MAY 25 – Yaohung Tsai: Pseudo-Assembly Programming on NVIDIA GPU

Visitors

  1. Mawussi Zounon
    Mawussi Zounon from the University of Manchester will be visiting from March 17 through March 23 to work with Jakub and Stan's research group.
  2. Srikara Pranesh
    Srikara Pranesh from the University of Manchester will be visiting from March 17 through March 23 to work with Jakub and Stan's research group.
  3. Pierre Blanchard
    Pierre Blanchard from the University of Manchester will be visiting from March 17 through March 23 to work with Jakub and Stan's research group.

People

  1. Qinglei Cao
    Qinglei Cao joined ICL as a graduate research assistant in May 2017 and is working with the DisCo group.
  2. Jamie Finney
    Jamie Finney joined ICL in November 2017 as a software engineer for the ECP/SLATE project. Welcome aboard, Jamie!
  3. John Henry
    John Henry joined the performance group (PICL) in October 2017 as part of the ECP/Exa-PAPI effort. Welcome, John!
  4. Sam Crawford
    Sam Crawford returned to ICL in July 2017 after a year working as a Technical Editor at Oak Ridge National Laboratory. Welcome back!
  5. Jiali Li
    Jiali Li joined the performance group (PICL) as a graduate research assistant in June 2017. Welcome to ICL, Jiali!
  6. Gerald Ragghianti
    Gerald Ragghianti joined ICL in June 2017 as a Research Leader. Welcome to ICL, Geri!
  7. Scott Gibson
    Scott Gibson left ICL in June 2017 to take a new position at Oak Ridge National Laboratory. Congratulations, Scott!

Dates to Remember

ICL Retreat

The 2018 ICL retreat has been set for August 20–21. Location to be determined. Mark your calendars!