News and Announcements

SC16

When this year’s International Conference for High Performance Computing, Networking, Storage, and Analysis, SC16, happens November 13–18 in Salt Lake City, Utah, ICL will again be a substantial contributor to the activities, presenting BoFs, papers, tutorials, and a poster.

Although the University of Tennessee will not have a booth this year, ICL has created its own virtual booth where you can keep up with what ICLers will be doing throughout the conference.

Dongarra Becomes a Foreign Member of the Russian Academy of Sciences

Vladimir Fortov, president of the Russian Academy of Sciences (RAS), informed Jack on October 31 that the organization has elected him as a foreign member.

In joining this esteemed list of well-known scientists from various countries and disciplines, Dongarra is in the company of other notable computer scientists elected earlier, such as Americans Don Knuth, professor emeritus at Stanford, and Mike Stonebraker, adjunct professor at the Massachusetts Institute of Technology. The RAS roster of foreign members also includes seven Nobel laureates, among them the American statesman Henry Kissinger.

“The academy’s prestige stems from its long-standing role as a global network of scientists and scholars from an array of institutes and laboratories dedicated to advancing science for the betterment of humanity,” Dongarra said. “So, being elected to the RAS community is not only an honor but also another effective avenue for sharing what we learn from our experimental computer science work.”

Established in 1724 in St. Petersburg, Russia, the RAS is that country’s highest scientific society and principal coordinating body for research in natural and social sciences, technology, and production. Membership is by election and is divided into three ranks: academician, corresponding member, and foreign member. The organization has more than 1,500 members, with some 800 corresponding members, 500 academicians, and 200 foreign members.

The RAS trains students, publicizes scientific achievements and knowledge, maintains ties with many international scientific institutions, and collaborates with foreign academies. It also directs the research of other scientific institutions and institutions of higher learning.

More on the RAS elections is available via a translated webpage. The complete list of foreign members of the organization is available on Wikipedia.

PULSE and BONSAI Grants Awarded

During the summer, ICL won two grants from the National Science Foundation, under the Software Infrastructure for Sustained Innovation (SI2) program. The awarded projects are called PULSE and BONSAI, and each is funded for approximately $500,000 over three years.

PULSE — the PAPI Unifying Layer for Software-defined Events project — aims to enable cross-layer and integrated modeling and analysis of the entire hardware system by extending PAPI (Performance API) with the capability to expose performance metrics for key software components found in the HPC software stack.

PULSE will extend the abstraction and unification layer that PAPI provides for hardware events so that it also encompasses software events from MPI, OpenMP, LAPACK, MAGMA, and task-based runtimes.

As for BONSAI — the BEAST OpeN Software Attuning Infrastructure project — it uses the knowledge and software that resulted from ICL’s research to create a modular, easy-to-use software toolkit that will enable domain scientists and library developers to efficiently explore and optimize the execution of a wide variety of computational kernels on current and future hybrid platforms.

Dongarra Shares Insights on the TOP500 and More

Comments from Jack Dongarra are featured in an article published in the October 19 issue of Computerworld delving into the history and characteristics of the TOP500 list and related topics.

The article conveys how the speed of supercomputers has progressed over time, industry’s prominent use of simulations and big data, China’s rise as a formidable competitor of the US in high-performance computing, and the status of progress in the quest for exascale computing.

Also, back in June when the TOP500 list was published at the International Supercomputing Conference, Jack was interviewed by the Wall Street Journal and the New York Times concerning the latest rankings and China’s prominence in them. See related stories in Tennessee Today and on the SC16 website.

Conference Reports

CCDSC 2016

The Workshop on Clusters, Clouds, and Data for Scientific Computing (CCDSC) continued its tradition of addressing numerous themes for developing and using both clusters and computational clouds when it convened on October 3–6, 2016, at La Maison des Contes, 427 Chemin de Chanzé, France.

Held every two years and alternating between the US and France, these by-invitation-only workshops evaluate the state of the art and future trends in cluster computing and the use of computational clouds for scientific computing. The meeting is a continuation of a series of workshops titled “Workshop on Environments and Tools for Parallel Scientific Computing” that began in 1992.

ICL’s Jack Dongarra and former ICLer Bernard Tourancheau (now a professor at the Université Grenoble Alpes) co-chaired, while Yves Robert and Anthony Danalis gave invited talks. Yves’ talk was on “Failure Detection and Propagation in HPC Systems,” and Anthony’s was on “Dataflow programming: Do we need it for exascale?”

The meeting hosted more than 50 attendees and featured 50 individual talks.

ASCR

About 150 representatives from the U.S. Department of Energy (DOE) national laboratories and a number of academic institutions met in Rockville, MD, September 27–29 for the Exascale Requirements Review for Advanced Scientific Computing Research (ASCR), one of a series of workshops DOE’s Office of Science conducts to determine the exascale requirements of application scientists and computer scientists.

The workshops address the computation, data analysis, software, workflows, HPC services, and the complete range of computer requirements necessary to support advanced computing research through 2025.

Attending the meeting from ICL were Asim YarKhan and George Bosilca, both of whom participated in breakout sessions focused on system software requirements for production systems and emerging systems. In addition, George took part in the session on system software for early-access machines.

HPEC ’16

ICL’s Stanimire Tomov attended the 20th annual IEEE High Performance Extreme Computing (HPEC) Conference, September 13–15 at the Westin Hotel in Waltham, MA.

Two ICL papers were presented at HPEC ’16.

One of the ICL papers, “LU, QR, and Cholesky Factorizations: Programming Model, Performance Analysis and Optimization Techniques for the Intel Knights Landing Xeon Phi,” resulted from work on numerical libraries with Intel® collaborators on Intel’s newest Xeon Phi™ processor, Knights Landing.

The other paper, “Performance Analysis and Acceleration of Explicit Integration for Large Kinetic Networks using Batched GPU Computations,” featured physics collaborators from Oak Ridge National Laboratory and the University of Tennessee. This publication incorporates some of ICL’s newest batched processing techniques to accelerate applications from core-collapse supernovae to realistic atmospheric simulations.

HPEC is the largest computing conference in New England and addresses the convergence of high-performance and embedded computing. Held each year in the Boston area, HPEC brings together academic, industry, and federal DOE/DOD researchers in high-performance computing, computing hardware, software, systems, and applications in which performance matters.

EuroMPI

The 23rd annual EuroMPI conference took place on September 25–28 in Edinburgh, Scotland, providing participants the opportunity for networking, discussion, and skills building amidst the theme “Modern Challenges to MPI’s Dominance in HPC.”

ICL’s George Bosilca was there. “This looked like a rebirth of the conference, with the number of participants going up for the first time in the last few years,” he says. “The paper acceptance rate also improved; it is now below 45 percent. So EuroMPI is getting back its leadership role in MPI-related developments. And, as a side note, Edinburgh in September was an excellent choice.”

George presented a tutorial titled “Survival in an MPI World,” in which he explained a holistic approach to fault tolerance. His tutorial introduced multiple fault-management techniques while maintaining the focus on what is called User Level Failure Mitigation (ULFM) — a minimal extension of the MPI specification that aims to provide users with the basic building blocks and tools to construct higher-level abstractions and introduce resilience in their applications.

Recent Releases

PAPI 5.5.0

PAPI 5.5.0 has been released!

The Performance API (PAPI) provides simultaneous access to performance counters on CPUs, GPUs, and other components of interest (e.g., network and I/O systems). Provided as a linkable library of shared objects, PAPI can be called directly in a user program, or used transparently through a variety of third-party tools, making it a de facto standard for hardware counter analysis.

The PAPI 5.5.0 release includes a new component that provides read and write access to the information and controls exposed via the Linux powercap interface. The PAPI powercap component supports measuring and capping power usage on recent Intel® architectures.
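For readers curious about what the underlying interface looks like, here is a minimal Python sketch. It reads the Linux powercap sysfs files directly and does not use PAPI or its API. The zone path and the helper names `read_microjoules` and `average_watts` are assumptions made for illustration; the actual path varies by machine and may need elevated permissions to read.

```python
from pathlib import Path

# Assumed location of the first RAPL package zone in the powercap sysfs tree.
# This is an illustration only; check which zones exist on your own system.
DEFAULT_ZONE = Path("/sys/class/powercap/intel-rapl:0")


def read_microjoules(zone: Path, name: str) -> int:
    """Read one integer attribute (for example energy_uj) from a zone directory."""
    return int((zone / name).read_text().strip())


def average_watts(e_start_uj: int, e_end_uj: int, seconds: float, max_range_uj: int) -> float:
    """Average power between two energy samples, allowing one counter wraparound."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta = e_end_uj - e_start_uj
    if delta < 0:
        # The energy counter wrapped around after reaching max_energy_range_uj.
        delta += max_range_uj
    # Microjoules to joules, then joules per second gives watts.
    return delta / 1e6 / seconds
```

On a machine that exposes the zone, one would take two `energy_uj` samples a known interval apart, read `max_energy_range_uj` once, and pass all of them to `average_watts`.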

ICL has added core support for Knights Landing as well as power monitoring via the RAPL and powercap components. Uncore support will be provided later.

Visit the PAPI website for more information and to download the software.

MAGMA 2.1

MAGMA 2.1 has been released!

Matrix Algebra on GPU and Multicore Architectures (MAGMA) is a collection of next-generation linear algebra (LA) libraries for heterogeneous architectures. The MAGMA package supports interfaces for current LA packages and standards (e.g., LAPACK and BLAS) to allow computational scientists to easily port any LA-reliant software components to heterogeneous architectures.

MAGMA enables applications to fully exploit the power of current heterogeneous systems of multi/many-core CPUs and multi-GPUs/coprocessors to deliver the fastest possible time to accurate solution within given energy constraints.

Following are the new features and updates included in MAGMA 2.1:

  • Variable size batched routines (gemm, gemv, syrk, syr2k)
  • Improved SVD performance for tall (m >> n) or wide (m << n) matrices
  • Preconditioned QMR
  • Expanded Doxygen documentation
  • For MAGMA v1 compatibility, initializes default queue for each GPU on first use, instead of in magma_init.

Visit the MAGMA website to download the software.
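To give a feel for what "variable size batched" means in the feature list above, the following pure-Python sketch multiplies a batch of matrix pairs in which each pair has its own dimensions. It illustrates the semantics only: the function names are hypothetical, this is not the MAGMA interface, and MAGMA runs such batches on GPUs rather than in a Python loop.

```python
def gemm(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def batched_gemm_variable(batch_a, batch_b):
    """Apply gemm to each (A_i, B_i) pair; every pair may have its own shape."""
    if len(batch_a) != len(batch_b):
        raise ValueError("batches must have the same length")
    return [gemm(a, b) for a, b in zip(batch_a, batch_b)]


# Two problems of different sizes handled by a single batched call:
# a 2x2 times 2x2 product and a 1x3 times 3x1 product.
batch_a = [[[1, 2], [3, 4]], [[1, 2, 3]]]
batch_b = [[[5, 6], [7, 8]], [[1], [2], [3]]]
results = batched_gemm_variable(batch_a, batch_b)
```

A fixed-size batched routine would require every problem in the batch to share one shape; the variable-size form removes that restriction.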

Open MPI 2.0.1

Open MPI 2.0.1 has been released!

The Open MPI project is an open source message passing interface (MPI) implementation that is developed and maintained by a consortium of academic, research, and industry partners.

MPI primarily addresses the message-passing parallel programming model, in which data is moved from the address space of one process to that of another process through cooperative operations on each process.

Open MPI integrates technologies and resources from several other projects (HARNESS/FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI) to build the best MPI library available.

The list of changes to Open MPI implemented in version 2.0.1 is provided on GitHub.

Visit the Open MPI website to download the software.

Interview

Nuria Losada

Where are you from originally?

I am from Spain. I was born in Vigo, and I moved to A Coruña to study; both are cities in Galicia. It is a green land in the northwest of Spain, with a beautiful coastline full of beaches, islands, and cliffs.

Can you summarize your background?

I earned a college degree in computer engineering in 2013, then a master’s degree in HPC in 2014, both from the University of A Coruña, where I am currently doing my PhD and working as a researcher.

Tell us how you first learned about ICL.

ICL is a world reference in research in high-performance computing and has developed many software packages that are well-known in the scientific community. My research in fault tolerance has led me to use ULFM; however, the first ICL projects I learned about were Open MPI, PAPI, and the TOP500 list.

What made you want to work for ICL?

I want to improve my knowledge of HPC and fault tolerance, and this research visit brings me the perfect opportunity.

What are you working on at ICL?

I’m working on resilience. We intend to integrate ULFM and the checkpointing tool CPPC (developed by the Computer Architecture Group of A Coruña) to obtain a local recovery fault tolerance solution by using message logging.

What are your interests/hobbies outside work?

I like spending time with my friends, watching movies and TV series, music, traveling, and painting.

Tell us something about yourself that might surprise people.

I grew up fishing with my dad and I ended up being a pretty good fisher.

Recent Papers

  1. Dongarra, J., J. Demmel, J. Langou, and J. Langou, “2016 Dense Linear Algebra Software Packages Survey,” University of Tennessee Computer Science Technical Report, no. UT-EECS-16-744 / LAWN 290: University of Tennessee, September 2016.
  2. Haidar, A., A. Abdelfattah, V. Dobrev, I. Karlin, T. Kolev, S. Tomov, and J. Dongarra, “Accelerating Tensor Contractions for High-Order FEM on CPUs, GPUs, and KNLs,” Gatlinburg, TN, Smoky Mountains Computational Sciences and Engineering Conference (SMC16), Poster, September 2016.
  3. Anzt, H., E. Chow, D. Szyld, and J. Dongarra, “Domain Overlap for Iterative Sparse Triangular Solves on GPUs,” Software for Exascale Computing - SPPEXA, vol. 113: Springer International Publishing, pp. 527–545, September 2016.
  4. Haidar, A., S. Tomov, K. Arturov, M. Guney, S. Story, and J. Dongarra, “LU, QR, and Cholesky Factorizations: Programming Model, Performance Analysis and Optimization Techniques for the Intel Knights Landing Xeon Phi,” IEEE High Performance Extreme Computing Conference (HPEC'16), Waltham, MA, IEEE, September 2016.
  5. Haidar, A., B. Brock, S. Tomov, M. Guidry, J. Jay Billings, D. Shyles, and J. Dongarra, “Performance Analysis and Acceleration of Explicit Integration for Large Kinetic Networks using Batched GPU Computations,” 2016 IEEE High Performance Extreme Computing Conference (HPEC'16), Waltham, MA, IEEE, September 2016.
  6. Dongarra, J., “Sunway TaihuLight Supercomputer Makes Its Appearance,” National Science Review, vol. 3, issue 3, pp. 256–266, September 2016.
  7. Yamazaki, I., S. Nooshabadi, S. Tomov, and J. Dongarra, “High Performance Realtime Convex Solver for Embedded Systems,” University of Tennessee Computer Science Technical Report, no. UT-EECS-16-745, October 2016.
  8. Anzt, H., S. Tomov, and J. Dongarra, “On the performance and energy efficiency of sparse linear algebra on GPUs,” International Journal of High Performance Computing Applications, October 2016.
  9. Yamazaki, I., S. Tomov, and J. Dongarra, “Stability and Performance of Various Singular Value QR Implementations on Multicore CPU with a GPU,” ACM Transactions on Mathematical Software (TOMS), vol. 43, issue 2, October 2016.
  10. Anzt, H., E. Chow, T. Huckle, and J. Dongarra, “Batched Generation of Incomplete Sparse Approximate Inverses on GPUs,” Proceedings of the 7th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, pp. 49–56, November 2016.
  11. Bosilca, G., A. Bouteiller, A. Guermouche, T. Herault, Y. Robert, P. Sens, and J. Dongarra, “Failure Detection and Propagation in HPC Systems,” Proceedings of The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'16), Salt Lake City, Utah, IEEE Press, pp. 27:1–27:11, November 2016.
  12. Anzt, H., J. Dongarra, and E. S. Quintana-Orti, “Fine-grained Bit-Flip Protection for Relaxation Methods,” Journal of Computational Science, November 2016.
  13. Yamazaki, I., S. Tomov, and J. Dongarra, “Non-GPU-resident Dense Symmetric Indefinite Factorization,” Concurrency and Computation: Practice and Experience, November 2016.
  14. Anzt, H., E. Chow, and J. Dongarra, “On block-asynchronous execution on GPUs,” LAPACK Working Note, no. 291, November 2016.
  15. Lopez, M. G., V. Larrea, W. Joubert, O. Hernandez, A. Haidar, S. Tomov, and J. Dongarra, “Towards Achieving Performance Portability Using Directives for Accelerators,” The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'16), Third Workshop on Accelerator Programming Using Directives (WACCPD), Salt Lake City, Utah, Innovative Computing Laboratory, University of Tennessee, November 2016.

Recent Conferences

  1. SEP
    Stanimire Tomov
  2. SEP
    EuroMPI Edinburgh, United Kingdom
    George Bosilca
  3. OCT
    CCDSC Lyon, France
    Anthony Danalis, Jack Dongarra

Upcoming Conferences

  1. NOV
    George Bosilca
  2. NOV
    SC16 Salt Lake City, UT
    Anthony Danalis, Aurelien Bouteiller, George Bosilca, Hartwig Anzt, Jack Dongarra, Jakub Kurzak, Piotr Luszczek, Reazul Hoque, Terry Moore, Thananon Patinyasakdikul, Thomas Herault, Tracy Rafferty, Yaohung Tsai
  3. NOV
    ECP PI Meeting Lemont, IL
    George Bosilca, Heike Jagode, Jack Dongarra

Recent Lunch Talks

  1. SEP 9
    Yves Robert from INRIA
    Computing the expected longest path of task graphs in the presence of silent errors
  2. SEP 16
    Dmitry Lyakh from ORNL
    Dense/sparse numeric tensor algebra: Scalable, hardware-agnostic design for performance portability
  3. SEP 22
    Phil Vaccaro
    PAPI Component: Powercap
  4. SEP 30
    Azzam Haidar
    A note on the Power and Performance Analysis of Dense Linear Algebra on Intel Xeon Phi Processors
  5. OCT 7
    Piotr Luszczek
    What Deep Learning?!?
  6. OCT 14
    Yves Robert from INRIA
    Failure Detection and Propagation in HPC systems
  7. OCT 21
    Harry Hughes
    A Simulation-based System to Optimize Tile Size Parameters in PLASMA
  8. OCT 28
    Frank Winkler from ORNL
    Performance Analysis at Scale: The Score-P Tools Infrastructure

Upcoming Lunch Talks

  1. NOV 4
    Reazul Hoque
    Dynamic Task Discovery in PaRSEC
  2. NOV 11
    Thananon Patinyasakdikul
    Multithreaded MPI

People

Visitors

  1. Khairul Kabir
    Khairul Kabir from NVIDIA will be visiting from October 24 through November 5, working on research with the Linear Algebra group.
  2. Camille Coti
    Camille Coti from Université Paris 13 will be visiting from November 7 through November 8. Camille will be working with the Distributed Computing group.

Congratulations

Amina Guermouche

As a member of the Distributed Computing group led by George Bosilca, Amina worked mainly on the Task-based Environment for Scientific Simulation at Extreme Scale (TESSE). She now has a new role as associate professor at Telecom SudParis. Best wishes, Amina!

Sam Crawford

All the very best to Sam, formerly an information specialist at ICL, in his new role as a technical editor and writer at ORNL.

Sofia Tomov

At the age of 12, Sofia Tomov, daughter of ICL’s Stanimire Tomov, is already taking on one of the great challenges of modern medicine: the prevention of adverse or life-threatening reactions to medication. To address the problem, Sofia has devised a computer algorithm that would allow doctors to determine if their patients have genetic mutations indicative of high risk for a dangerous reaction. Her outstanding work has garnered her a spot as a finalist in the 2016 Discovery Education 3M Young Scientist Challenge, a prestigious science competition for middle school students. Sofia says she hopes that one day her algorithm will help save lives around the world and that she believes its use will become extremely widespread. For more about this amazing, mighty girl, see her Facebook page.

Dates to Remember

ICL Dinner at SC16

Caffé Molise in downtown Salt Lake City will be the locale for the ICL SC16 dinner, which takes place on November 16 at 7:00pm. The restaurant is located at 55 West 100 South.

To confirm your attendance, contact Tracy Rafferty (rafferty@icl.utk.edu).