News and Announcements

TOP500 – June 2015

The 45th TOP500 rankings were presented at this year’s ISC High Performance conference in Frankfurt, Germany. For the 5th consecutive time, China’s Tianhe-2 has remained at the top of the ranking with 33.863 petaflop/s on the High Performance LINPACK benchmark. Tianhe-2 has 16,000 nodes, each with two Intel Xeon Ivy Bridge processors and three Xeon Phi coprocessors for a combined total of 3,120,000 computing cores.
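The combined core count follows directly from the per-node configuration (12-core Ivy Bridge CPUs and 57-core Xeon Phi parts, as listed in the HPCG table later in this issue):

```python
# Tianhe-2 core count: 16,000 nodes, each with two 12-core Intel Xeon
# Ivy Bridge processors and three 57-core Xeon Phi coprocessors.
nodes = 16_000
cores_per_node = 2 * 12 + 3 * 57   # 24 CPU cores + 171 coprocessor cores = 195
total_cores = nodes * cores_per_node
print(total_cores)  # 3120000
```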

As for the rest of the list, the top 5 machines remain unchanged, but there is a new system at No. 7. Shaheen II, a Cray XC40 system installed at King Abdullah University of Science and Technology (KAUST), achieved 5.536 petaflop/s on the LINPACK benchmark, making it the highest-ranked Middle East system in the 22-year history of the list and the first to reach the Top 10.

More details on the 45th edition of the TOP500 are available in the official press release.

Rank Site System Rmax (TFlop/s)
1 National Super Computer Center in Guangzhou, China Tianhe-2 (MilkyWay-2) – TH-IVB-FEP Cluster, NUDT 33,862.7
2 DOE/SC/Oak Ridge National Laboratory, United States Titan – Cray XK7, Cray Inc. 17,590.0
3 DOE/NNSA/LLNL, United States Sequoia – BlueGene/Q, IBM 17,173.2
4 RIKEN Advanced Institute for Computational Science (AICS), Japan K computer, SPARC64 VIIIfx, Fujitsu 10,510.0
5 DOE/SC/Argonne National Laboratory, United States Mira – BlueGene/Q, IBM 8,586.6

See the full list at TOP500.org.


In the interview above, Jack Dongarra and Erich Strohmaier discuss possible reasons for the performance plateau evident in recent TOP500 lists. They also share their expectations, including a leap in performance in the near future and the arrival of Exascale in 2022-2023.

DARE Funded

ICL’s Data-driven Autotuning for Runtime Execution (DARE) project was recently funded by the NSF. DARE will investigate techniques for empirical autotuning of execution schedules for parallel applications on scalable multicore-plus-accelerator hybrid systems using a software workbench—the DARE framework. The DARE framework will optimize data and work placement, granularity, and scheduling decisions for maximum performance of a given application on a given hardware system. Congratulations to the DARE team!
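The empirical-autotuning loop that DARE will investigate can be illustrated with a toy example: time a kernel under several candidate configurations and keep the fastest. This is a minimal sketch of the general idea, not the DARE framework itself; the kernel and candidate block sizes are hypothetical.

```python
import time
import numpy as np

def time_kernel(block_size, n=256, trials=3):
    """Time a blocked matrix multiply for one candidate block size."""
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    best = float("inf")
    for _ in range(trials):
        t0 = time.perf_counter()
        c = np.zeros((n, n))
        for i in range(0, n, block_size):
            for j in range(0, n, block_size):
                c[i:i+block_size, j:j+block_size] = (
                    a[i:i+block_size, :] @ b[:, j:j+block_size])
        best = min(best, time.perf_counter() - t0)
    return best

def autotune(candidates=(16, 32, 64, 128)):
    """Empirically pick the fastest candidate on this machine."""
    timings = {bs: time_kernel(bs) for bs in candidates}
    return min(timings, key=timings.get)

best_block = autotune()
```

A real autotuner searches a much larger space (data placement, granularity, schedules) and caches results per machine, but the measure-and-select loop is the same.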

HPCG June 2015 Results

The June 2015 results for the High Performance Conjugate Gradients (HPCG) benchmark were released on July 15th at the ISC-HPC conference. Intended as a new HPC metric, HPCG is designed to measure performance representative of modern HPC capability by simulating computation and data-access patterns commonly found in real science and engineering applications.

To keep pace with the changing hardware and software infrastructures, HPCG results will be used to augment the TOP500 rankings to show how real world applications might fare on a given machine. In the table below, you can see how the HPCG benchmark would have ranked its top 10 machines, and where those machines ranked on the LINPACK-based TOP500 list. The full list of rankings is available here.
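HPCG times a preconditioned conjugate gradient iteration dominated by sparse matrix-vector products and global reductions. A minimal NumPy sketch of that pattern (with a Jacobi preconditioner on a 1-D Laplacian; an illustration only, not the benchmark's reference implementation):

```python
import numpy as np

def pcg(A, b, tol=1e-10, max_iter=500):
    """Preconditioned CG with a Jacobi (diagonal) preconditioner."""
    M_inv = 1.0 / np.diag(A)          # Jacobi preconditioner
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p                    # the matrix-vector product HPCG stresses
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Example: 1-D Laplacian (SPD tridiagonal), the kind of stencil HPCG models.
n = 100
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = pcg(A, b)
```

Because the iteration is bound by memory bandwidth and communication rather than floating-point throughput, HPCG fractions of peak (see the %Peak column below) are far smaller than LINPACK's.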

Site Computer HPL (Pflop/s) TOP500 Rank HPCG (Pflop/s) HPCG Rank %Peak
NSCC / Guangzhou Tianhe-2 NUDT, Xeon 12C 2.2GHz + Intel Xeon Phi 57C + Custom 33.863 1 0.5800 1 1.1%
RIKEN Advanced Institute for Computational Science K computer, SPARC64 VIIIfx 2.0GHz, Tofu interconnect 10.510 4 0.4608 2 4.1%
DOE/SC/Oak Ridge Nat Lab Titan – Cray XK7, Opteron 6274 16C 2.200GHz, Cray Gemini interconnect, NVIDIA K20x 17.590 2 0.3223 3 1.2%
DOE/SC/Argonne National Laboratory Mira – BlueGene/Q, Power BQC 16C 1.60GHz, Custom 8.587 5 0.1670 4 1.7%
NASA / Mountain View Pleiades – SGI ICE X, Intel E5-2680, E5-2680V2, E5-2680V3, Infiniband FDR 4.089 11 0.1319 5 2.7%
Swiss National Supercomputing Centre (CSCS) Piz Daint – Cray XC30, Xeon E5-2670 8C 2.600GHz, Aries interconnect, NVIDIA K20x 6.271 6 0.1246 6 1.6%
KAUST / Jeddah Shaheen II – Cray XC40, Intel Haswell 2.3 GHz 16C, Cray Aries 5.537 7 0.1139 7 1.6%
Texas Advanced Computing Center/Univ. of Texas Stampede – PowerEdge C8220, Xeon E5-2680 8C 2.700GHz, Infiniband FDR, Intel Xeon Phi SE10P 5.168 8 0.0968 8 1.0%
Leibniz Rechenzentrum SuperMUC – iDataPlex DX360M4, Xeon E5-2680 8C 2.70GHz, Infiniband FDR 2.897 20 0.0833 9 2.6%
EPSRC/University of Edinburgh ARCHER – Cray XC30, Intel Xeon E5 v2 12C 2.700GHz, Aries interconnect 1.643 34 0.0808 10 3.2%

Conference Reports

ISC High Performance 2015

This year’s ISC meeting, now known as ISC High Performance (ISC-HPC), was held on July 12-16 in Frankfurt, Germany. ICL was well represented at ISC-HPC with Jack Dongarra, Stan Tomov, Jakub Kurzak, Terry Moore, and Tracy Rafferty all making their way to Frankfurt for the conference.

Jack was in high demand as usual. He started off the week by presenting the June 2015 TOP500 awards on Monday, and then gave a talk, “Anatomy of Optimizing an Algorithm for Exascale,” as a distinguished speaker on Wednesday. Jack also presented the HPCG results for June 2015 and chaired the workshop on Big Data and Extreme-scale Computing (BDEC).

Stan and Jakub, along with ICL collaborator Mike Heroux, gave a linear algebra tutorial during Sunday’s tutorial session. Stan stayed busy as well, presenting two papers, “Framework for Batched & GPU-resident Factorization Algorithms Applied to Block Householder Transformations” and “On the Design, Development & Analysis of Optimized Matrix-Vector Multiplication Routines for Coprocessors,” and giving a talk on MAGMA MIC at the Intel booth.

As mentioned above, the one-day workshop on Big Data and Extreme-scale Computing (BDEC)—premised on the need to systematically map out the ways in which the major issues associated with Big Data intersect and interact with plans for achieving Exascale computing—was co-located with ISC-HPC and featured 15 individual talks and 25 participants.

ISC-HPC wasn’t all paper presentations and talks, however; the ICL team also met with representatives from Cavium, who will be giving ICL access to a dual-socket, 96-core (48+48) ARM system. All in all, ISC-HPC—even under a different name—was as productive as ever for ICL.

Recent Releases

MAGMA MIC 1.4.0 Released

MAGMA MIC 1.4.0 is now available. MAGMA (Matrix Algebra on GPU and Multicore Architectures) is a collection of next generation linear algebra (LA) libraries for heterogeneous architectures. The MAGMA package supports interfaces for current LA packages and standards, e.g., LAPACK and BLAS, to allow computational scientists to easily port any LA-reliant software components to heterogeneous architectures. MAGMA allows applications to fully exploit the power of current heterogeneous systems of multi/many-core CPUs and multi-GPUs/co-processors to deliver the fastest possible time to accurate solution within given energy constraints.

MAGMA MIC provides implementations for MAGMA’s one-sided (LU, QR, and Cholesky) and two-sided (Hessenberg, bi- and tridiagonal reductions) dense matrix factorizations, as well as linear and eigenproblem solvers for Intel Xeon Phi Coprocessors. More information on the approach is given in this presentation.

The MAGMA MIC 1.4.0 release adds:

  • Added a port of MAGMA Sparse, including CG, GMRES, and BiCGSTAB (with both hybrid and native versions), auxiliary routines, and preconditioned versions.
  • Added mixed-precision iterative refinement auxiliary routines and a solver for symmetric positive definite matrices: {zc|ds}posv_mic.
  • Improved dsymv and dgemv in the expert interface.
  • Added auxiliary bulge chasing routines used in two-stage eigensolvers.
  • Accelerated reductions to tridiagonal (dsytrd) and upper Hessenberg form (dgehrd) using the expert dsymv and dgemv, respectively.
  • Added test drivers and benchmarking routines.
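The mixed-precision refinement added above follows a standard pattern: factor once in fast single precision, then correct the solution with double-precision residuals. Below is a NumPy/SciPy sketch of the idea on the CPU; the function name is hypothetical, and this is not MAGMA's {zc|ds}posv_mic interface.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def dsposv_sketch(A, b, tol=1e-12, max_refine=30):
    """Solve the SPD system A x = b: one cheap float32 Cholesky
    factorization, then iterative refinement with float64 residuals."""
    c32 = cho_factor(A.astype(np.float32))               # low-precision factor
    x = cho_solve(c32, b.astype(np.float32)).astype(np.float64)
    for _ in range(max_refine):
        r = b - A @ x                                    # residual in double
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        # correction solved with the already-computed float32 factor
        x += cho_solve(c32, r.astype(np.float32)).astype(np.float64)
    return x

# Well-conditioned SPD test matrix: Gram matrix plus a diagonal shift.
rng = np.random.default_rng(0)
n = 200
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
b = rng.standard_normal(n)
x = dsposv_sketch(A, b)
```

The payoff on accelerators is that the O(n³) factorization runs at single-precision speed while the O(n²) refinement steps restore double-precision accuracy.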

Visit the MAGMA software page to download the tarball.

HPCC 1.5.0b Released

HPCC 1.5.0 beta is now available. The HPCC (HPC Challenge) benchmark suite is designed to establish, through rigorous testing and measurement, the bounds of performance on many real-world applications for computational science at extreme scale. To this end, the benchmark includes a suite of tests for sustained floating point operations, memory bandwidth, rate of random memory updates, interconnect latency, and interconnect bandwidth.
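Of those kernels, the memory-bandwidth test (the STREAM triad) is the easiest to illustrate. The following is a rough NumPy sketch of the access pattern, not the benchmark's MPI reference code; NumPy evaluates the triad in two passes, so the reported figure is only an approximation.

```python
import time
import numpy as np

def stream_triad(n=5_000_000, scalar=3.0, trials=5):
    """Approximate the STREAM triad a = b + scalar*c; return GB/s.
    The byte count assumes 2 reads + 1 write of 8-byte doubles, a
    lower bound since NumPy needs two passes instead of a fused loop."""
    b = np.random.rand(n)
    c = np.random.rand(n)
    a = np.empty(n)
    best = float("inf")
    for _ in range(trials):
        t0 = time.perf_counter()
        np.multiply(c, scalar, out=a)   # a = scalar * c
        a += b                          # a = b + scalar * c
        best = min(best, time.perf_counter() - t0)
    gb_moved = 3 * n * 8 / 1e9
    return gb_moved / best

bandwidth_gbs = stream_triad()
```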

This latest release consists of minor additions and bug fixes:

  • Added new targets to the main make(1) file.
  • Fixed bug introduced while updating to MPI STREAM 1.7 with spurious global communicator (reported by NEC).
  • Added make(1) file for OpenMPI from MacPorts.
  • Fixed bug introduced while updating to MPI STREAM 1.7 that caused some ranks to use NULL communicator.
  • Fixed bug introduced while updating to MPI STREAM 1.7 that caused syntax errors.

Visit the HPCC software page to download the tarball.

Interview

Chongxiao Cao

Where are you from, originally?

I’m from Hangzhou, a city located in eastern China.

Can you summarize your educational background?

I spent 7 years in Xi’an, an ancient city in China, for undergraduate and graduate studies at Xi’an Jiaotong University. I earned my Bachelor’s degree in Automation Engineering in 2008 and my Master’s degree in System Engineering in 2011. In August 2011, I came to the other side of the world to pursue my PhD in Computer Science here at UTK.

Tell us how you first learned about ICL.

In the summer of 2010, while interning as a software engineer for Alibaba.com, I was impressed by the distributed system design and development work behind the largest B2B and C2C website in China, and I decided to study abroad in this area. I talked with professors at my university, and they recommended a book for me to read: the “Sourcebook of Parallel Computing” (the Chinese translation, which is still on the shelf of my office). From this classic book, I first learned about Jack and his lab – ICL.

What made you want to work for ICL?

Jack is a famous professor, and ICL is a leading lab in parallel and distributed computing. The projects here impressed me and convinced me to come here.

What are you working on while at ICL?

I’m working with the DisCo group, focusing on developing fault-tolerant features in PaRSEC.

If you weren’t working at ICL, where would you like to be working and why?

I would probably work as a software engineer for Alibaba.com in China, because the headquarters of this company is located in Hangzhou, China—my hometown.

What are your interests/hobbies outside work?

I like swimming, hiking, and playing tennis. I also play computer games; my favorite one is “Hearthstone.”

Tell us something about yourself that might surprise people.

I have a twin brother, but his career path is very different from mine. I chose engineering as my major, and he majored in business. He is now an accountant at an IT company in China.

Recent Papers

  1. Reed, D., and J. Dongarra, “Exascale Computing and Big Data,” Communications of the ACM, vol. 58, no. 7, ACM, pp. 56-68, July 2015. DOI: 10.1145/2699414
  2. Chow, E., H. Anzt, and J. Dongarra, “Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs,” International Supercomputing Conference (ISC 2015), Frankfurt, Germany, July 2015.
  3. Benoit, A., S. K. Raina, and Y. Robert, “Efficient Checkpoint/Verification Patterns,” International Journal on High Performance Computing Applications, July 2015. DOI: 10.1177/1094342015594531
  4. Haidar, A., T. Dong, S. Tomov, P. Luszczek, and J. Dongarra, “Framework for Batched and GPU-resident Factorization Algorithms Applied to Block Householder Transformations,” ISC High Performance, Frankfurt, Germany, Springer, July 2015.
  5. Kabir, K., A. Haidar, S. Tomov, and J. Dongarra, “On the Design, Development, and Analysis of Optimized Matrix-Vector Multiplication Routines for Coprocessors,” ISC High Performance 2015, Frankfurt, Germany, July 2015.
  6. YarKhan, A., A. Haidar, C. Cao, P. Luszczek, S. Tomov, and J. Dongarra, “Cholesky Across Accelerators,” 17th IEEE International Conference on High Performance Computing and Communications (HPCC 2015), Newark, NJ, IEEE, August 2015.
  7. Haidar, A., A. YarKhan, C. Cao, P. Luszczek, S. Tomov, and J. Dongarra, “Flexible Linear Algebra Development and Scheduling with Cholesky Factorization,” 17th IEEE International Conference on High Performance Computing and Communications (HPCC 2015), Newark, NJ, August 2015.
  8. Anzt, H., E. Chow, and J. Dongarra, “Iterative Sparse Triangular Solves for Preconditioning,” Euro-Par 2015, Vienna, Austria, Springer Berlin, August 2015. DOI: 10.1007/978-3-662-48096-0_50

Recent Conferences

  1. JUL – ISC High Performance 2015, Frankfurt, Germany: Jack Dongarra, Jakub Kurzak, Stanimire Tomov, Terry Moore, Tracy Rafferty
  2. AUG – OpenSHMEM Workshop, Annapolis, Maryland: Aurelien Bouteiller
  3. AUG – EuroPar 2015, Vienna, Austria: Hartwig Anzt
  4. AUG – HPCC 2015, Newark, New Jersey: Jack Dongarra, Piotr Luszczek
  5. AUG – Stanimire Tomov

Upcoming Conferences

  1. SEP – 9th Parallel Tools Workshop, Dresden, Germany: Heike McCraw
  2. SEP – PPAM 2015, Krakow, Poland: Heike McCraw
  3. SEP – IEEE Cluster 2015, Chicago, Illinois: Anthony Danalis
  4. SEP – ORNL scientific seminar series, Oak Ridge, Tennessee: Azzam Haidar
  5. SEP – Piotr Luszczek
  6. SEP – EuroMPI 2015, Bordeaux, France: George Bosilca
  7. SEP – Intel Big Data Retreat, Hillsboro, Oregon: Piotr Luszczek, Thomas Herault
  8. SEP – Jack Dongarra

Recent Lunch Talks

  1. JUL 1 – Ed Valeev (Virginia Tech): Tensor Computation for Chemistry: Sparsity and More
  2. JUL 1 – Torsten Hoefler (ETH Zürich): Towards Fully Automated Interpretable Performance Models
  3. JUL 17 – Sangamesh Ragate: PC Sampling in GPU
  4. JUL 31 – Joseph Schuchart (TU Dresden): HPC energy-efficiency research at ZIH, Or: What the HAEC is HDEEM?
  5. AUG 7 – Ian Masliah (University of Paris-Sud): Towards C++ and Beyond
  6. AUG 21 – Yaohung Tsai: Convolutional Layers in RaPyDLI
  7. AUG 28 – Tingxing Dong: Batched Linear Algebra Problems on Hardware Accelerators Based on GPUs

Upcoming Lunch Talks

  1. SEP 4 – Mathieu Faverge (Inria): Blocking Strategy Optimizations for Sparse Direct Linear Solver on Heterogeneous Architectures
  2. SEP 11 – Asim YarKhan: OpenMP Tasks and PLASMA
  3. SEP 18 – Mark Gates: Accelerating Collaborative Filtering Using Concepts from High Performance Computing
  4. SEP 25 – Ichitaro Yamazaki: Random Sampling to Update Truncated SVD

Visitors

  1. Valentin Le Fevre
    Valentin Le Fevre from ENS de Lyon will be visiting from May 17 through August 8. Valentin is visiting ICL for several months and will be working with the Distributed Computing group.
  2. Ian Masliah
    Ian Masliah from University of Paris-Sud will be visiting from July 27 through August 17. Ian will be working with the Linear Algebra Group.

People

  1. Mathieu Faverge
    ICL alumni Mathieu Faverge will be visiting ICL and working with the Linear Algebra Group for a month beginning in August. Welcome back, Mathieu!
  2. Ian Masliah
    Ian Masliah from the University of Paris-Sud will be visiting from July 27 through August 17. Ian will be working with the Linear Algebra Group.
  3. Joe Dorris
    Joe Dorris will be joining ICL this fall semester as an MS student working with the Linear Algebra Group. Welcome aboard, Joe!
  4. Phil Vaccaro
    Phil Vaccaro will be joining ICL this fall semester as an MS student working with the Performance Analysis Group. Welcome to ICL, Phil!


Congratulations

Mr. and Mrs. Gates

On July 25th, ICL’s Mark Gates married Elaine Gates née Davis. Congratulations to the bride and groom!

Dates to Remember

ICL Annual Retreat

The 2015 ICL Annual Retreat, the 16th such retreat in the lab’s history, has been set for August 13th-14th at the Tremont Lodge in Townsend, TN. Dinner on Thursday will be held at Miss Lily’s Cafe. Mark your calendars.