News and Announcements

TOP500 – June 2019

The 53rd TOP500 list was just unveiled at the ISC High Performance 2019 conference in Frankfurt, Germany. The United States has kept the top two spots with the Department of Energy’s Summit (at Oak Ridge National Laboratory) and Sierra (at Lawrence Livermore National Laboratory).

For the first time, every entrant on the TOP500 now exceeds 1 petaFLOP/s on the HPL benchmark, with #500 coming in at 1.022 petaFLOP/s. Also of note is that Piz Daint has been pushed out of the top 5 by Frontera—a new Dell machine installed at the Texas Advanced Computing Center.

1. Summit – IBM Power System AC922, IBM POWER9 22C 3.07GHz, NVIDIA Volta GV100, Dual-rail Mellanox EDR InfiniBand, IBM
   DOE/SC/Oak Ridge National Laboratory, United States
   Cores: 2,414,592 | Rmax: 148,600.0 TFLOP/s | Rpeak: 200,794.9 TFLOP/s | Power: 10,096 kW

2. Sierra – IBM Power System S922LC, IBM POWER9 22C 3.1GHz, NVIDIA Volta GV100, Dual-rail Mellanox EDR InfiniBand, IBM / NVIDIA / Mellanox
   DOE/NNSA/LLNL, United States
   Cores: 1,572,480 | Rmax: 94,640.0 TFLOP/s | Rpeak: 125,712.0 TFLOP/s | Power: 7,438 kW

3. Sunway TaihuLight – Sunway MPP, Sunway SW26010 260C 1.45GHz, Sunway, NRCPC
   National Supercomputing Center in Wuxi, China
   Cores: 10,649,600 | Rmax: 93,014.6 TFLOP/s | Rpeak: 125,435.9 TFLOP/s | Power: 15,371 kW

4. Tianhe-2A – TH-IVB-FEP Cluster, Intel Xeon E5-2692v2 12C 2.2GHz, TH Express-2, Matrix-2000, NUDT
   National Super Computer Center in Guangzhou, China
   Cores: 4,981,760 | Rmax: 61,444.5 TFLOP/s | Rpeak: 100,678.7 TFLOP/s | Power: 18,482 kW

5. Frontera – Dell C6420, Xeon Platinum 8280 28C 2.7GHz, Mellanox InfiniBand HDR, Dell EMC
   Texas Advanced Computing Center, United States
   Cores: 448,448 | Rmax: 23,516.4 TFLOP/s | Rpeak: 38,745.9 TFLOP/s
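Since the list reports both Rmax (achieved HPL performance) and Rpeak (theoretical peak), each machine’s HPL efficiency falls out directly as Rmax/Rpeak. A short script over the values above, transcribed from the table, illustrates how the top systems compare:

```python
# HPL efficiency = Rmax / Rpeak, using the June 2019 TFLOP/s values above.
systems = {
    "Summit":            (148600.0, 200794.9),
    "Sierra":            (94640.0,  125712.0),
    "Sunway TaihuLight": (93014.6,  125435.9),
    "Tianhe-2A":         (61444.5,  100678.7),
    "Frontera":          (23516.4,  38745.9),
}

for name, (rmax, rpeak) in systems.items():
    # Fraction of theoretical peak actually achieved on the HPL benchmark.
    print(f"{name}: {100 * rmax / rpeak:.1f}% of peak")
```

The GPU-accelerated IBM systems land around 74–75% of peak, while the two newest entries sit closer to 61%.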

HPCG – June 2019

The latest results for the High Performance Conjugate Gradients (HPCG) benchmark were just released at ISC High Performance in Frankfurt, Germany. A joint effort between ICL and Sandia National Laboratories, HPCG is designed to measure performance representative of modern HPC workloads by simulating the compute and communication patterns of the sparse iterative solvers commonly found in science and engineering applications.

HPCG results are released twice per year alongside the TOP500 rankings to show how real-world applications might fare on a given machine. The full list of HPCG rankings is available on the HPCG website.

1. Summit – IBM, POWER9, NVIDIA Volta V100 (DOE/SC/ORNL, USA)
   HPL: 148.6 PFLOP/s | TOP500 Rank: 1 | HPCG: 2.926 PFLOP/s | %Peak: 1.5%

2. Sierra – IBM, POWER9, NVIDIA Tesla V100 (DOE/NNSA/LLNL, USA)
   HPL: 94.64 PFLOP/s | TOP500 Rank: 2 | HPCG: 1.796 PFLOP/s | %Peak: 1.4%

3. K Computer – Fujitsu, SPARC64 (RIKEN/AIST, Japan)
   HPL: 10.51 PFLOP/s | TOP500 Rank: 20 | HPCG: 0.603 PFLOP/s | %Peak: 5.3%

4. Trinity – Cray XC40, Intel Xeon E5-2698 v3, Xeon Phi 7250 (DOE/NNSA/LANL/SNL, USA)
   HPL: 20.159 PFLOP/s | TOP500 Rank: 7 | HPCG: 0.546 PFLOP/s | %Peak: 1.3%

5. AI Bridging Cloud Infrastructure – PRIMERGY CX2570 M4, Xeon Gold 6148 20C 2.4GHz, NVIDIA Tesla V100 (AIST, Japan)
   HPL: 19.880 PFLOP/s | TOP500 Rank: 8 | HPCG: 0.509 PFLOP/s | %Peak: 1.6%
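For readers curious why %Peak sits so far below the HPL numbers, here is a minimal sketch (not HPCG itself) of an unpreconditioned conjugate gradient loop on a 1-D Laplacian in pure Python. The sparse matrix-vector products and dot products it performs are the same memory-bound operations HPCG stresses; the real benchmark adds a multigrid preconditioner and distributed-memory communication on top:

```python
# Illustrative only: unpreconditioned CG on A = tridiag(-1, 2, -1),
# the kind of sparse solver pattern that HPCG is built around.

def matvec(x):
    """Apply the sparse tridiagonal matrix A to vector x."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        y[i] = 2.0 * x[i]
        if i > 0:
            y[i] -= x[i - 1]
        if i < n - 1:
            y[i] -= x[i + 1]
    return y

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cg(b, tol=1e-10, max_iter=1000):
    """Solve A x = b by conjugate gradients, starting from x = 0."""
    n = len(b)
    x = [0.0] * n
    r = b[:]          # residual r = b - A x (x is zero initially)
    p = r[:]          # search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        rr_new = dot(r, r)
        if rr_new < tol:
            break
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x

x = cg([1.0] * 32)
residual = max(abs(bi - axi) for bi, axi in zip([1.0] * 32, matvec(x)))
print(f"max residual: {residual:.2e}")
```

Because each iteration touches every matrix entry only once, arithmetic intensity is low and performance is bound by memory bandwidth rather than floating-point throughput, which is why HPCG numbers land one to two orders of magnitude below HPL on the same machines.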

The coffee machine is dead. Long live the coffee machine!

Our beloved Saeco has finally ground its last bean and caffeinated its last ICLer. However, with some quick and creative thinking by ICL admin and accounting, we have a new and shiny DeLonghi superautomatic, and the coffee must flow.

Please be kind to the new machine. Our last one was quite the trooper and performed beyond expectations.

—Ed.

Conference Reports

MIT GPU Hackathon

Piotr Luszczek made his way to Cambridge, MA on June 3–7, where he served as a hacking mentor for the MIT GPU Hackathon. These hackathons divide developers into teams, which work to port their code to GPUs or further optimize their applications for the latest and greatest in GPU hardware, all with the help of a team mentor.

One benefit of being so close to MIT was that the Julia Lab software team joined in with the team that was porting Julia-based code for ocean flow models—which involve multiple layers of water columns and mixing between them—to GPUs.

Another team was working on plasma-confinement calculations and leveraging the compatibility interface of the Software for Linear Algebra Targeting Exascale (SLATE), which serves as an easy-to-use API for current ScaLAPACK users.

A team from Bosch was looking at energy potential modeled by neural networks and was mixing GPU code with Python.

Piotr’s team had a large code developed by GE for flow optimization around wind turbines. This code used a higher-order, explicit Runge-Kutta solver with some components from PETSc and hypre (both part of xSDK).

Piotr also continued his push to improve application integration with respect to the Extreme-scale Scientific Software Development Kit (xSDK), and he will continue to do so at several of this year’s 11 GPU Hackathons (a record)—the next of which will be held at Princeton University.

The editor would like to thank Piotr Luszczek for his contributions to this article.

BDEC2: Poznań


The Big Data and Extreme-Scale Computing 2 (BDEC2) workshop series rolls on. The latest meeting, held on May 14–16 in Poznań, Poland, focused on the need for a new cyberinfrastructure platform: one that connects the HPC resources required for intense data analysis, deep learning, and neural networks with the vast number of data generators that are rarely co-located with facilities capable of that scale of computing.

This workshop brings together eminent representatives of the scientific computing community, including members from industry, academia, and government with expertise in algorithms, computer system architecture, operating systems, workflow middleware, compilers, libraries, languages, and applications. Together, they are endeavoring to map out the ways in which big-data challenges are changing the traditional cyberinfrastructure paradigm.

The ICL team played a prominent role in this meeting, with Terry Moore acting as a key BDEC conductor, David Rogers designing the meeting’s website and logo, and Tracy Rafferty, Joan Snoderly, and Sam Crawford coordinating and providing additional support on site.

For more information on the BDEC effort, please check out the most recent BDEC report, “Pathways to Convergence: Towards a Shaping Strategy for a Future Software and Data Ecosystem for Scientific Inquiry.”

The next BDEC2 meeting is slated for fall 2019 in San Diego, CA.

The editor would like to thank Tracy Rafferty and Terry Moore for their contributions to this article—and for bringing him along.

Recent Releases

MAGMA 2.5.1 Alpha

MAGMA 2.5.1 Alpha is now available. Matrix Algebra on GPU and Multicore Architectures (MAGMA) is a collection of next-generation linear algebra (LA) libraries for heterogeneous architectures. The MAGMA package supports interfaces for current LA packages and standards (e.g., LAPACK and BLAS) to allow computational scientists to easily port any LA-reliant software components to heterogeneous architectures.

Updates and features in MAGMA 2.5.1 Alpha include:

  • Updates and improvements in CMakeLists.txt for improved/friendlier CMake and Spack installations;
  • Fixes related to MAGMA installation on GPUs and CUDA versions that do not support FP16 arithmetic;
  • Added support for Turing GPUs;
  • Removed some C++ features from MAGMA Sparse for friendlier compilation (using nvcc and various CPU compilers).

The MAGMA 2.5.1 Alpha tarball is available for download on the MAGMA website.

Interview

Hejer Shaiek

Where are you from, originally?
I was born in Tunisia, which is a small country in North Africa. It was originally occupied by Berbers, but it hosted a lot of different civilizations throughout history, like Carthage, the Roman Empire, and the Ottoman Empire, and it was a French colony from 1881 until 1956. It is the country that started the Arab Spring, and it is now considered a democracy, though it still struggles from serious economic difficulties.

But I cannot answer this question without mentioning my French side. I’m not French (yet), but I have been living in France for several years now, and my love for that country and its people is, for me, stronger than any official paper stating my nationality.

Can you summarize your educational background?
So, I started with “Classes Préparatoires aux Grandes Écoles,” which are a special couple of years in the French educational system where you prepare for a competitive exam to be admitted (or not) to engineering schools. During those two years, I learned a lot of math and physics and a bit of computer science, chemistry, and engineering. I then applied to my current school—ENSEEIHT, in Toulouse, France—to major in computer science and applied mathematics, specializing in HPC and Big Data. I am now finishing my third and final year there, and I will earn my Master’s degree in September.

Tell us how you first learned about ICL.
I told one of my favorite professors at ENSEEIHT that I wanted to do a research internship in HPC in the United States, and he suggested ICL.

What made you want to visit ICL?
I would say the subjects of the papers published by ICL. A lot of them, especially those of the Linear Algebra group, are based on things I saw in class. But at the same time, they target new state-of-the-art challenges. So, I thought I had the tools to learn a lot of things, and being surrounded by the people who wrote those papers is my best chance to do so.

What are your research interests?
Honestly, this question is difficult for me because I’m just starting my research journey. I know that my love for linear algebra is what got me here. But now, I would say, as long as I am learning new things, I’m happy.

What are you working on during your visit with ICL?
I am working on the FFT project, the goal of which is to build an FFT library that targets current machines.

What are your interests/hobbies outside work?
This answer is always subject to change. Sometimes I like practicing sports or reading, but the most consistent things would be hanging out with friends outside or watching series. Currently, most of my free time is about playing Zelda.

Tell us something about yourself that might surprise people.
I speak 4 languages, and my favorite one is when I can mix them up. I also can be very sociable when I’m comfortable with the language I speak, and that, I believe, may surprise people here.

Recent Papers

  1. Antoniu, G., A. Costan, O. Marcu, M. S. Pérez, N. Stojanovic, R. M. Badia, M. Vázquez, S. Girona, M. Beck, T. Moore, et al., “A Collection of White Papers from the BDEC2 Workshop in Poznan, Poland,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-10: University of Tennessee, Knoxville, May 2019.
  2. Ribizel, T., and H. Anzt, “Approximate and Exact Selection on GPUs,” 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, IEEE, May 2019. DOI: 10.1109/IPDPSW.2019.00088
  3. Anzt, H., and G. Flegar, “Are we Doing the Right Thing? – A Critical Analysis of the Academic HPC Community,” 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, IEEE, May 2019. DOI: 10.1109/IPDPSW.2019.00122
  4. Kaya, O., and Y. Robert, “Computing Dense Tensor Decompositions with Optimal Dimension Trees,” Algorithmica, vol. 81, issue 5, pp. 2092–2121, May 2019. DOI: 10.1007/s00453-018-0525-3
  5. Abdelfattah, A., S. Tomov, and J. Dongarra, “Fast Batched Matrix Multiplication for Small Sizes using Half Precision Arithmetic on GPUs,” 33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.
  6. Bai, Z., J. Dongarra, D. Lu, and I. Yamazaki, “Matrix Powers Kernels for Thick-Restart Lanczos with Explicit External Deflation,” International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.
  7. Anzt, H., T. Ribizel, G. Flegar, E. Chow, and J. Dongarra, “ParILUT – A Parallel Threshold ILU for GPUs,” IEEE International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019. DOI: 10.1109/IPDPS.2019.00033
  8. Aupy, G., A. Gainaru, V. Honoré, P. Raghavan, Y. Robert, and H. Sun, “Reservation Strategies for Stochastic Jobs,” 33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS 2019), Rio de Janeiro, Brazil, IEEE Computer Society Press, May 2019.
  9. Danalis, A., H. Jagode, T. Herault, P. Luszczek, and J. Dongarra, “Software-Defined Events through PAPI,” 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, IEEE, May 2019. DOI: 10.1109/IPDPSW.2019.00069
  10. Zaitsev, D., S. Tomov, and J. Dongarra, “Solving Linear Diophantine Systems on Parallel Architectures,” IEEE Transactions on Parallel and Distributed Systems, vol. 30, issue 5, pp. 1158–1169, May 2019. DOI: 10.1109/TPDS.2018.2873354
  11. Wong, K., S. Tomov, and J. Dongarra, “Hands-on Research and Training in High-Performance Data Sciences, Data Analytics, and Machine Learning for Emerging Environments,” ISC High Performance, Frankfurt, Germany, Springer International Publishing, June 2019.
  12. Danalis, A., H. Jagode, and J. Dongarra, “Is Your Scheduling Good? How Would You Know?,” 14th Scheduling for Large Scale Systems Workshop, Bordeaux, France, June 2019.
  13. Kurzak, J., M. Gates, A. Charara, A. YarKhan, and J. Dongarra, “Least Squares Solvers for Distributed-Memory Machines with GPU Accelerators,” ACM International Conference on Supercomputing (ICS '19), Phoenix, Arizona, ACM, pp. 117–126, June 2019. DOI: 10.1145/3330345.3330356
  14. Nichols, D., N-S. Tomov, F. Betancourt, S. Tomov, K. Wong, and J. Dongarra, “MagmaDNN: Towards High-Performance Data Analytics and Machine Learning for Data-Driven Scientific Computing,” ISC High Performance, Frankfurt, Germany, Springer International Publishing, June 2019. DOI: 10.1007/978-3-030-34356-9_37
  15. Dongarra, J., M. Gates, A. Haidar, J. Kurzak, P. Luszczek, P. Wu, I. Yamazaki, A. YarKhan, M. Abalenkovs, N. Bagherpour, et al., “PLASMA: Parallel Linear Algebra Software for Multicore Using OpenMP,” ACM Transactions on Mathematical Software, vol. 45, issue 2, June 2019. DOI: 10.1145/3264491
  16. Canon, L-C., A. K. W. Chang, Y. Robert, and F. Vivien, “Scheduling Independent Stochastic Tasks under Deadline and Budget Constraints,” International Journal of High Performance Computing Applications, vol. 34, issue 2, pp. 246–264, June 2019. DOI: 10.1177/1094342019852135
  17. Kurzak, J., M. Gates, A. Charara, A. YarKhan, and J. Dongarra, “SLATE Working Note 12: Implementing Matrix Inversions,” SLATE Working Notes, no. 12, ICL-UT-19-04: Innovative Computing Laboratory, University of Tennessee, June 2019.
  18. Anzt, H., Y-C. Chen, T. Cojean, J. Dongarra, G. Flegar, P. Nayak, E. S. Quintana-Orti, Y. M. Tsai, and W. Wang, “Towards Continuous Benchmarking,” Platform for Advanced Scientific Computing Conference (PASC 2019), Zurich, Switzerland, ACM Press, June 2019. DOI: 10.1145/3324989.3325719

Recent Conferences

  1. MAY – PETTT Annual PPE Review, Vicksburg, Mississippi: Stanimire Tomov
  2. MAY – BDEC2, Poznań, Poland: Joan Snoderly, Sam Crawford, Terry Moore, Tracy Rafferty
  3. MAY – IPDPS, Rio de Janeiro, Brazil: Anthony Danalis, Hartwig Anzt, Ichitaro Yamazaki, Jack Dongarra
  4. MAY – 2019 OLCF User Meeting, Oak Ridge, Tennessee: Stanimire Tomov
  5. MAY – MPI Forum, Chicago, Illinois: Aurelien Bouteiller
  6. MAY – DoE/MEXT, Chicago, Illinois: Mark Gates
  7. JUN – MIT GPU Hackathon, Cambridge, Massachusetts: Piotr Luszczek
  8. JUN – ISC High Performance 2019, Frankfurt, Germany: Hartwig Anzt, Heike Jagode, Jack Dongarra, Stanimire Tomov
  9. JUN – Anthony Danalis, Thomas Herault
  10. JUN – ICS'19, Phoenix, Arizona: Jakub Kurzak

Upcoming Conferences

  1. JUL – Alan Ayala
  2. JUL – Gerald Ragghianti
  3. JUL – Heike Jagode, Jakub Kurzak
  4. JUL – OLCF/ECP OpenMP Hackathon, Knoxville, Tennessee: Ali Charara
  5. JUL – 2019 Scalable Tools Workshop, Tahoe City, California: Anthony Danalis
  6. JUL – Daniel Barry

Recent Lunch Talks

  1. MAY 3 – Stephen Thomas (Global Computing Laboratory): Analytics4MD: In-Situ Data Analytics for Next Generation Molecular Dynamics (MD) Workflows
  2. MAY 10 – Qinglei Cao: Task Bench: A Parameterized Benchmark for Evaluating Parallel Runtime Performance
  3. MAY 17 – Ali Charara: SLATE Data Coherence: An Adapted Cache Coherence Model
  4. MAY 24 – Thomas Herault: Comparing the Performance of Rigid, Moldable and Grid-Shaped Applications on Failure-Prone HPC Platforms
  5. MAY 31 – Reazul Hoque: Dynamic Task Discovery in a Data Flow Task-Based Runtime
  6. JUN 6 – Vipin Kumar (University of Minnesota): Physics Guided Machine Learning: A New Paradigm for Modeling Dynamical Systems
  7. JUN 7 – Yves Robert (ENS-Lyon): Replication is More Efficient Than You Think

Upcoming Lunch Talks

  1. JUL 26 – Thananon Patinyasakdikul: Improving Multithreaded MPI for Current Hardware Architecture