News and Announcements

Joint Laboratory for Extreme Scale Computing

ICL is now an Associate Partner of the Joint Laboratory for Extreme Scale Computing (JLESC). JLESC, founded in 2009 by the French Institute for Research in Computer Science and Automation (INRIA) and the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign, is an international, virtual organization that aims to enhance the ability of member organizations and investigators to overcome software challenges found in extreme-scale, high-performance computers.

JLESC engages computer scientists, engineers, application scientists, and industry leaders to ensure that the research facilitated by the joint laboratory addresses science and engineering’s most critical needs and takes advantage of the continuing evolution of computing technologies. Other partners include Argonne National Laboratory, the Barcelona Supercomputing Center, Jülich Supercomputing Center, and the RIKEN Center for Computational Science.

Conference Reports

8th JLESC Workshop

The 8th JLESC workshop was held on April 17–19 in Barcelona, Spain. This workshop brought together leading researchers in HPC from the JLESC partners—INRIA, the University of Illinois at Urbana-Champaign, Argonne National Laboratory, the Barcelona Supercomputing Center, Jülich Supercomputing Center, RIKEN AICS, and ICL—to explore the most recent and critical issues facing HPC as we enter the extreme-scale era.

Since this was ICL’s first JLESC workshop, we sent a member of each research group to present ICL’s latest work in linear algebra libraries (LA), distributed computing (DisCo), and performance monitoring (PICL).

Representing the LA efforts and ICL as a whole, Jack Dongarra gave a presentation on “Experiments with Energy Savings and Short Precision.” Thomas Herault presented DisCo’s latest work on resilience (SMURFS project) in “Performance of (Un)cooperative Periodic Checkpointing in Shared Environments.” He wasn’t finished there, however, as he also talked about DisCo’s foray into deep learning and machine learning.

Also representing the DisCo effort, George Bosilca presented his thoughts on “Termination detection: Are we done yet?” while also serving on the panel for the discussion of (international) supercomputer users. Finally, Anthony Danalis discussed PICL’s latest efforts on the PAPI project in a presentation titled “PAPI: Counting outside the Box.”

The 9th JLESC workshop is already set for April 2019, and ICL/UTK will host the next iteration in Knoxville, TN.

The editor would like to thank Thomas Herault and Rosa Badia for their contributions to this article.

2018 NSF SI2 PI Workshop

The 2018 National Science Foundation (NSF) Software Infrastructure for Sustained Innovation (SI²) Principal Investigator (PI) workshop was held on April 30 and May 1 at The Westin in Washington, DC. This year’s workshop, which was also the last SI² PI workshop hosted by the NSF, was co-located with the 18th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (CCGrid’18).

Around 180 participants attended the workshop, where each SI² PI gave a brief talk introducing their project and presented a poster during one of the four poster sessions.

ICL’s Heike Jagode presented the latest work on the SI2-SSE: PAPI Unifying Layer for Software-Defined Events (PULSE) project, which focuses on enabling cross-layer and integrated monitoring of whole application performance by extending PAPI with the capability to expose performance metrics from key software components found in the HPC software stack.
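
Since PULSE’s software-defined events were still under development at the time, the sketch below is hedged: it shows roughly how a library might expose an internal counter through the papi_sde registration interface that later shipped with PAPI. The library name, event name, and function names here are illustrative assumptions, not PULSE deliverables.

    /* Hedged sketch: exposing a library-internal counter as a PAPI
     * software-defined event (interface per papi_sde_interface.h). */
    #include "papi_sde_interface.h"

    static long long solver_iterations = 0;   /* the library's own counter */

    void mylib_init_sde(void)
    {
        /* Register the counter once. A PAPI consumer can then add the
         * event "sde:::MyLib::solver_iterations" to an event set and
         * read it exactly like a hardware counter. */
        papi_handle_t handle = papi_sde_init("MyLib");
        papi_sde_register_counter(handle, "solver_iterations",
                                  PAPI_SDE_RO | PAPI_SDE_DELTA,
                                  PAPI_SDE_long_long, &solver_iterations);
    }

    void mylib_solver_step(void)
    {
        solver_iterations++;   /* normal library work updates the counter */
    }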

Piotr Luszczek showed off his team’s work on SI2-SSE: An Open Software Infrastructure for Parallel Autotuning of Computational Kernels (BONSAI). BONSAI aims to develop software infrastructure for using scalable, parallel hybrid systems to carry out large, concurrent autotuning sweeps in order to dramatically accelerate the optimization process of computational kernels for GPU accelerators and many-core coprocessors.
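
The core loop of an autotuning sweep is easy to picture; BONSAI’s contribution is the infrastructure to run many such evaluations concurrently and manage the results. The toy, self-contained C example below (not BONSAI code) tunes the tile size of a blocked matrix multiply by brute force:

    /* Toy autotuning sweep: time one kernel run per candidate tile size
     * and keep the fastest. BONSAI distributes sweeps like this across
     * many GPUs/nodes; this sketch runs them sequentially on the CPU. */
    #include <stdio.h>
    #include <time.h>

    #define N 512
    static double A[N][N], B[N][N], C[N][N];

    /* The kernel being tuned: a tiled matrix multiply, C += A*B. */
    static void gemm_tiled(int t)
    {
        for (int ii = 0; ii < N; ii += t)
        for (int jj = 0; jj < N; jj += t)
        for (int kk = 0; kk < N; kk += t)
            for (int i = ii; i < ii + t && i < N; i++)
            for (int j = jj; j < jj + t && j < N; j++) {
                double s = C[i][j];
                for (int k = kk; k < kk + t && k < N; k++)
                    s += A[i][k] * B[k][j];
                C[i][j] = s;
            }
    }

    int main(void)
    {
        const int candidates[] = { 8, 16, 32, 64, 128 };
        int best = 0;
        double best_time = 1e30;

        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++) { A[i][j] = 1.0; B[i][j] = 2.0; }

        /* C accumulates across trials; only the timing matters here. */
        for (int c = 0; c < 5; c++) {
            clock_t t0 = clock();
            gemm_tiled(candidates[c]);
            double sec = (double)(clock() - t0) / CLOCKS_PER_SEC;
            printf("tile %3d: %.3f s\n", candidates[c], sec);
            if (sec < best_time) { best_time = sec; best = candidates[c]; }
        }
        printf("best tile size: %d\n", best);
        return 0;
    }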

George Bosilca was on deck to show the DisCo team’s latest work on the SI2-SSI: Task Based Environment for Scientific Simulation at Extreme Scale (TESSE) project. TESSE uses an application-driven design to create a general-purpose, production-quality software framework that attacks the twin challenges of programmer productivity and portable performance for advanced scientific applications on the massively parallel, hybrid, many-core systems of today and tomorrow.

George also presented ICL’s latest work in the SI2-SSI: Evolve: Enhancing the Open MPI Software for Next Generation Architectures and Applications project. Evolve, a collaborative effort between ICL and the University of Houston, expands the capabilities of Open MPI to support the NSF’s critical software infrastructure missions. Core challenges include: extending the software to scale to 10,000–100,000 processes; ensuring support for accelerators; enabling highly asynchronous execution of communication and I/O operations; and ensuring resilience.
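
Evolve’s work happens inside Open MPI itself, but the asynchronous execution it aims to strengthen is the standard nonblocking pattern visible at the application level. A minimal sketch of that pattern follows (a ring exchange overlapped with independent computation):

    /* Nonblocking ring exchange: post communication early, compute while
     * messages are in flight, then wait. Standard MPI; build with mpicc. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        enum { COUNT = 1024 };
        double sendbuf[COUNT], recvbuf[COUNT];
        for (int i = 0; i < COUNT; i++) sendbuf[i] = rank + i;

        int right = (rank + 1) % size;
        int left  = (rank + size - 1) % size;

        MPI_Request reqs[2];
        MPI_Irecv(recvbuf, COUNT, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
        MPI_Isend(sendbuf, COUNT, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

        /* Independent computation overlaps with the communication. */
        double local = 0.0;
        for (int i = 0; i < COUNT; i++) local += sendbuf[i] * sendbuf[i];

        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
        printf("rank %d: local sum %.1f, first received value %.1f\n",
               rank, local, recvbuf[0]);

        MPI_Finalize();
        return 0;
    }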

Azzam Haidar presented a poster for the SI2-SSE: MAtrix, TEnsor, and Deep-learning Optimized Routines (MATEDOR) project. MATEDOR will perform the research required to define a standard interface for batched operations and to provide a performance-portable software library that demonstrates batching routines for a significant number of kernels. This research is critical, given that the performance opportunities inherent in solving many small batched matrices often yield more than a 10× speedup over the current classical approaches.
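
The interface idea is that a single call operates on an array of many small problems instead of looping over one library call per matrix. The self-contained reference sketch below illustrates the shape of such an interface with a batched Cholesky factorization; the routine name and signature are hypothetical (defining the real standard interface is precisely MATEDOR’s job), and a tuned implementation would process the batch concurrently on a GPU rather than in a loop:

    /* Hypothetical batched interface: factorize 'batch' small n-by-n SPD
     * matrices (column-major) in one call. Build with -lm for sqrt(). */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    static void dpotrf_batched_ref(int n, double **A_array, int lda,
                                   int *info_array, int batch)
    {
        for (int b = 0; b < batch; b++) {
            double *A = A_array[b];
            info_array[b] = 0;
            for (int j = 0; j < n; j++) {
                double d = A[j + j*lda];
                for (int k = 0; k < j; k++)
                    d -= A[j + k*lda] * A[j + k*lda];
                if (d <= 0.0) { info_array[b] = j + 1; break; }
                A[j + j*lda] = sqrt(d);
                for (int i = j + 1; i < n; i++) {
                    double s = A[i + j*lda];
                    for (int k = 0; k < j; k++)
                        s -= A[i + k*lda] * A[j + k*lda];
                    A[i + j*lda] = s / A[j + j*lda];
                }
            }
        }
    }

    int main(void)
    {
        const int n = 4, lda = 4, batch = 1000;
        double **A_array = malloc(batch * sizeof(double*));
        int *info = malloc(batch * sizeof(int));

        for (int b = 0; b < batch; b++) {
            A_array[b] = calloc((size_t)lda * n, sizeof(double));
            for (int i = 0; i < n; i++)
                A_array[b][i + i*lda] = n + b % 7;  /* SPD: positive diagonal */
        }

        dpotrf_batched_ref(n, A_array, lda, info, batch);
        printf("first matrix: info=%d, L(0,0)=%.3f\n", info[0], A_array[0][0]);

        for (int b = 0; b < batch; b++) free(A_array[b]);
        free(A_array);
        free(info);
        return 0;
    }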

Vince Weaver—now a professor at the University of Maine—presented work on the SI2-SSI: Collaborative Proposal: Performance Application Programming Interface for Extreme-Scale Environments (PAPI-EX) project. PAPI’s new powercap component enables power measurement without root privilege as well as power control. This new component exposes running average power limit (RAPL) functionality to enable users to read and write power parameters via PAPI_read() and PAPI_write().
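
In practice, this looks like adding a powercap event to a PAPI event set and using the standard read and write calls. A hedged sketch follows; powercap event names are platform-dependent (they can be listed with the papi_native_avail utility), so the ZONE0 name below is an assumption:

    /* Hedged sketch: read and adjust a package power cap through PAPI's
     * powercap component, using the standard PAPI event-set calls. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <papi.h>

    int main(void)
    {
        int es = PAPI_NULL;
        long long val[1];

        if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
            exit(1);
        if (PAPI_create_eventset(&es) != PAPI_OK)
            exit(1);

        /* Package-0 power limit in microwatts (assumed event name; check
         * papi_native_avail on the target machine). */
        if (PAPI_add_named_event(es, "powercap:::POWER_LIMIT_A_UW:ZONE0") != PAPI_OK)
            exit(1);

        PAPI_start(es);
        PAPI_read(es, val);                 /* read the current cap */
        printf("current cap: %lld uW\n", val[0]);

        val[0] = 100000000;                 /* request a 100 W cap; works
                                               without root if the powercap
                                               sysfs files are writable */
        PAPI_write(es, val);
        PAPI_stop(es, val);
        return 0;
    }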

The editor would like to thank Heike Jagode and Piotr Luszczek for their contributions to this article.

BDEC: Next Generation

The latest Big Data and Extreme Scale Computing (BDEC) workshop, dubbed “BDEC: Next Generation” or “BDEC: NG,” was held in Chicago, IL on March 26–28. BDEC: NG marks the 10th entry in a series premised on the idea that we must begin to systematically map out and account for the ways in which the major issues associated with big data intersect with, impinge upon, and potentially change the national (and international) plans that are now being laid out for achieving exascale computing.

Where the original BDEC workshops focused on the effects of big data on high-performance computing (HPC) and science (computational science in particular), BDEC: NG expands the scope of big data’s traditional role within scientific computing, from the usual simulation and modeling to data analytics used to search the noise for phenomena that might require additional study and to verify existing models and simulations.

A reorientation of the BDEC effort—in the form of BDEC: NG—is essential for planning a new road map for the changing landscape of big data and exascale computing. Previous BDEC efforts cultivated and solidified an international cooperative, and BDEC: NG will leverage these relationships.

The editor would like to thank Terry Moore for his contribution to this article.

Interview

Tony Castaldo

Where are you from, originally?

I come from a military Air Force background. I was born in Washington, DC, and we were immediately transferred to Hawaii, which we left when I was in the first grade. My father was stationed in five- and six-year tours; after Hawaii we were stationed in San Antonio and then Illinois. When my father retired, my family moved back to San Antonio, but I got a job and stayed in Illinois to finish the last two years of high school.

Can you summarize your educational background?

Mostly unintentional. In high school, my interest in mathematics led me to programming. I joined the military and taught myself more programming; after that I used the GI Bill to finish an associate’s degree in computer programming. I programmed boot-level assembly for embedded devices and OS- and assembly-level code for about twenty-five years. After the dot-com crash (I had cashed out more than a year before, out of analytic alarm), contracting gigs dried up, and I decided to return to college for a few years to let the market recover. It was great fun and turned into ten years. I finished a Bachelor’s and Master’s in Mathematics, then a Master’s and PhD in CS, and then became a full-time research assistant professor (with Clint Whaley, my PhD advisor).

Where did you work before joining ICL?

Many places, most for a year or two at a time as an on-and-off contractor. I have worked full-time at a large bank (processing credit card authorizations), a communications device manufacturer (boot-level device code, communications, and compression algorithms), and a large hospital (as an IBM mainframe systems programmer). My most recent job was with Minimal Metrics, a contract software company owned by Phil Mucci.

How did you first hear about the lab, and what made you want to work here?

In CS graduate school, my advisor was Clint Whaley, and I worked on the ATLAS linear algebra package. Clint was contacted by Phil Mucci, and I ended up on a contract with Phil to adapt ATLAS to install on a new kind of hardware setup. That was successful, and I continued to contract with Phil; some years later I agreed to join a venture with him to develop a commercial product based on PerfMiner. We failed to raise venture capital before we ran out of money, and we both began looking for other work. Both Clint and Phil had told me about ICL, and Phil noticed the job opening and sent it to me. I am a fan of lower-level software tools and statistical analysis, and also a fan of the academic environment and open-source code in general. I worked with Clint Whaley longer than with any other person in my life, and I enjoyed it immensely. ICL seemed like a good fit.

What is your focus here at ICL? What are you working on?

Exascale PAPI, supporting a DoD project. At the moment, my introductory project is the automatic discovery (via benchmarks and statistical methods) of PAPI events that track common hardware events (e.g., which events count mispredicted branches). I expect to move on to more project-oriented tasks in the near future.

What are your interests/hobbies outside of work?

I am currently teaching myself to write fantasy fiction, and I’ve completed two (unpublished) books. I also read extensively, primarily non-fiction that leans scientific; last year I read Piketty’s Capital and Seyfried’s Cancer as a Metabolic Disease (a new approach to the biology and treatment of cancer).

Tell us something about yourself that might surprise people.

I don’t believe in the Big Bang, Inflation, String Theory, Dark Matter, or Dark Energy. I suspect they are all the naming of unresolved problems in physics to pretend they are solved, except for a few pesky details nobody can figure out, like theoretical particles nobody can find. Of course I am on the outside looking in, and I’d be happy to see proof I am wrong!

If you weren’t working at ICL, where would you like to be working and why?

I suppose I should say if I were not working at ICL, I would like to be working at ICL!

Pretty much any applied scientific research interests me if it has real-world impact, from the subatomic to (statistically justified versions of) economics, biology, sociology, psychology, and alternative energy. So, wherever cool work can be done. I don’t really have specific career goals or an ambition to achieve anything in particular. When looking for work, I am more inclined to look for interesting problems or ideas where I believe I can contribute to the effort.

Recent Papers

  1. Marques, O., J. Demmel, and P. B. Vasconcelos, “Bidiagonal SVD Computation via an Associated Tridiagonal Eigenproblem,” LAPACK Working Note, no. 295, ICL-UT-18-02: University of Tennessee, April 2018.
  2. Haidar, A., H. Jagode, P. Vaccaro, A. YarKhan, S. Tomov, and J. Dongarra, “Investigating Power Capping toward Energy-Efficient Scientific Applications,” Concurrency and Computation: Practice and Experience, vol. 2018, issue e4485, pp. 1–14, April 2018. DOI: 10.1002/cpe.4485
  3. Haidar, A., S. Tomov, A. Abdelfattah, I. Yamazaki, and J. Dongarra, “MAtrix, TEnsor, and Deep-learning Optimized Routines (MATEDOR),” poster, NSF PI Meeting, Washington, DC, April 2018. DOI: 10.6084/m9.figshare.6174143.v3
  4. Danalis, A., H. Jagode, and J. Dongarra, “PAPI: Counting outside the Box,” 8th JLESC Meeting, Barcelona, Spain, April 2018.
  5. Kurzak, J., M. Gates, A. YarKhan, I. Yamazaki, P. Wu, P. Luszczek, J. Finney, and J. Dongarra, “Parallel BLAS Performance Report,” SLATE Working Notes, no. 05, ICL-UT-18-01: University of Tennessee, April 2018.
  6. Haidar, A., A. Abdelfattah, M. Zounon, S. Tomov, and J. Dongarra, “A Guide for Achieving High Performance with Very Small Matrices on GPUs: A Case Study of Batched LU and Cholesky Factorizations,” IEEE Transactions on Parallel and Distributed Systems, vol. 29, issue 5, pp. 973–984, May 2018. DOI: 10.1109/TPDS.2017.2783929
  7. Dong, T., A. Haidar, S. Tomov, and J. Dongarra, “Accelerating the SVD Bi-Diagonalization of a Batch of Small Matrices using GPUs,” Journal of Computational Science, vol. 26, pp. 237–245, May 2018. DOI: 10.1016/j.jocs.2018.01.007
  8. Gates, M., S. Tomov, and J. Dongarra, “Accelerating the SVD Two Stage Bidiagonal Reduction and Divide and Conquer Using GPUs,” Parallel Computing, vol. 74, pp. 3–18, May 2018. DOI: 10.1016/j.parco.2017.10.004
  9. Yamazaki, I., A. Abdelfattah, A. Ida, S. Ohshima, S. Tomov, R. Yokota, and J. Dongarra, “Analyzing Performance of BiCGStab with Hierarchical Matrix on GPU Clusters,” IEEE International Parallel and Distributed Processing Symposium (IPDPS), Vancouver, BC, Canada, IEEE, May 2018.
  10. Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Batched One-Sided Factorizations of Tiny Matrices Using GPUs: Challenges and Countermeasures,” Journal of Computational Science, vol. 26, pp. 226–236, May 2018. DOI: 10.1016/j.jocs.2018.01.005
  11. Caniou, Y., E. Caron, A. K. W. Chang, and Y. Robert, “Budget-Aware Scheduling Algorithms for Scientific Workflows with Stochastic Task Weights on Heterogeneous IaaS Cloud Platforms,” 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Vancouver, BC, Canada, IEEE, May 2018. DOI: 10.1109/IPDPSW.2018.00014
  12. Bouteiller, A., G. Bosilca, T. Herault, and J. Dongarra, “Data Movement Interfaces to Support Dataflow Runtimes,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-03: University of Tennessee, May 2018.
  13. Jagode, H., A. Danalis, R. Hoque, M. Faverge, and J. Dongarra, “Evaluation of Dataflow Programming Models for Electronic Structure Theory,” Concurrency and Computation: Practice and Experience: Special Issue on Parallel and Distributed Algorithms, vol. 2018, issue e4490, pp. 1–20, May 2018. DOI: 10.1002/cpe.4490
  14. Herault, T., Y. Robert, A. Bouteiller, D. Arnold, K. Ferreira, G. Bosilca, and J. Dongarra, “Optimal Cooperative Checkpointing for Shared High-Performance Computing Platforms,” 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Best Paper Award, Vancouver, BC, Canada, IEEE, May 2018. DOI: 10.1109/IPDPSW.2018.00127

Recent Conferences

  1. APR: Piotr Luszczek
  2. APR: Piotr Luszczek
  3. APR: JLESC Workshop, Barcelona, Spain (Anthony Danalis, George Bosilca, Jack Dongarra, Thomas Herault)
  4. APR: 2018 NSF SI2 PI Meeting, Washington, DC (Azzam Haidar, George Bosilca, Heike Jagode, Piotr Luszczek)
  5. MAY: IPDPS 2018, Vancouver, Canada (Ichitaro Yamazaki, Jakub Kurzak)
  6. MAY: Stanimire Tomov

Upcoming Conferences

  1. JUN: GPU Hackathon, Boulder, Colorado (Piotr Luszczek)
  2. JUN: Heike Jagode
  3. JUN: SC Conference Meeting, Dallas, Texas (Gerald Ragghianti, Jack Dongarra)
  4. JUN: HPDC, Tempe, Arizona (Xi Luo)
  5. JUN: Thomas Herault
  6. JUN: ISC High Performance 2018, Frankfurt, Germany (Piotr Luszczek)
  7. JUN: PMAA18 and CLay-GPU Workshop, Zurich, Switzerland (George Bosilca)

Recent Lunch Talks

  1. APR 6: Scott Emrich (EECS), “High Throughput Genome Analysis using Makeflow and Friends”
  2. APR 13: Jiali Li, “PCP Component in PAPI”
  3. APR 20: Hartwig Anzt (Karlsruhe Institute of Technology), “Variable-Size Batched Condition Number Calculation on GPUs”
  4. APR 27: Yves Robert (ENS-Lyon), “A Performance Model to Execute Workflows on High-Bandwidth Memory Architectures”
  5. MAY 4: Ahmad Abdelfattah, “MAGMA Update: High Performance and Energy Efficient LU Factorization”
  6. MAY 11: Ana Gainaru (Vanderbilt University), “Scheduling Solutions for Data-Driven Large-Scale Applications”
  7. MAY 18: Pratik Nayak (Karlsruhe Institute of Technology), “Using Iterative Methods for Local Solves in Asynchronous Schwarz Methods”
  8. MAY 25: Yaohung Tsai, “Pseudo-Assembly Programming on NVIDIA GPU”

Upcoming Lunch Talks

  1. JUN 1: Anne Benoit (Georgia Tech), “Combining Checkpointing and Replication for Reliable Execution of Linear Workflows”
  2. JUN 8: Lou Gross (NIMBioS), “A Rational Basis for Hope: Human Behavior Modeling and Climate Change”

People

  1. Pratik Nayak
    Pratik Nayak is a PhD student from the Karlsruhe Institute of Technology studying under ICL collaborator Hartwig Anzt. During his visit, Pratik will work with Ichitaro Yamazaki on asynchronous Schwarz methods.
  2. Tony Castaldo
    Tony Castaldo joined the Performance ICL (PICL) team last month as a Research Scientist. Welcome to ICL, Tony!

Dates to Remember

ICL Retreat

The 2018 ICL retreat has been set for August 20–21. Location to be determined. Mark your calendars!