News and Announcements

Winter Reception 2017

In celebration of another outstanding year, the ICL Winter Reception took place at Calhoun’s on the River, once again bringing together a sizable number of ICLers and their guests for an enjoyable evening. Here’s to further success in 2017!

ECP First Annual Meeting — Photo Gallery

The US Department of Energy’s Exascale Computing Project (ECP) has published a brief article and image gallery to convey the essence of its first annual meeting January 29–February 3.

ICL representatives from the seven ECP projects in which we are involved were among the 450 attendees, and are featured in some of the pictures.

See the content here.

Conference Reports

Open MPI Developers Meeting

On January 24–26, ICL’s George Bosilca attended the Open MPI Developers Meeting in San Jose, CA, in which Open MPI’s core developers addressed various subjects ranging from technical topics to software release.

“One of the most interesting outcomes was that we decided to switch the software releases to a time-based approach—similar to the Linux kernel—in which we will issue a stable release every four to six months, with a two-month stabilization period,” George said.

In addition to discussing the current status of different software release versions, the attendees selected the release managers for version 3.0, pending confirmation by their respective institutions. They also held a number of technical discussions concerning the future of the code base, vendor participation, user feedback, and other major roadmap items for the next six months, the period preceding the next meeting.

PPoPP 2017

ICL’s Hartwig Anzt and Ahmad Ahmad were speakers at workshops held in conjunction with the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP) 2017, which took place February 4–8 in Austin, TX.

Hartwig gave a talk on batched Gauss-Jordan elimination for block-Jacobi preconditioning during the 8th International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM). Also during the symposium, Hartwig attended a tutorial on heterogeneous computing.

Ahmad was a speaker at the 10th workshop about general purpose processing using GPUs (GPGPU-10), where he presented a technical paper on performing the one-sided factorization using only GPUs. In contrast with the MAGMA hybrid CPU+GPU routines, this is a new category of algorithms in which the CPU is not involved in performing any computation.

Ahmad demonstrated the ability of the new algorithms to be within 90 percent of the performance of the hybrid techniques while using fewer resources and consuming less power.

SRA International Winter Intensive Training Programs

ICL’s Tracy Rafferty and Cindy Knisley were in Atlanta February 16–17 for the Society of Research Administrators International 2017 Winter Intensive Training Programs, where they were able to sharpen their knowledge and skills in the development and business sides of proposal work, respectively.

The conference drew approximately fifty attendees. Its sessions ran concurrently over two days, which allowed participants to focus on a specific area of interest that would promote their professional growth.

Tracy attended the Grant Writing and Proposal Development session. Her goal was to learn more about how to cultivate successful proposals, including positioning projects for funding, planning submissions, and satisfying reviewers.

The workshop leaders were Tolise Dailey, training development specialist in the Office of Contracts and Grants at the University of Colorado, and Johnny Toma, deputy director of finance and administration at Georgetown University Medical Center.

“The two presenters were very good and accomplished in their work,” Tracy said. “The session included actual examples of what works and what doesn’t, and plenty of Q&A. As it turns out, there were three attendees from the UT Knoxville Faculty Development team, as well as two from UT Chattanooga.”

Cindy went to the Departmental Administration workshop, which was aimed at departmental administrators of research-sponsored awards. Both pre- and post-award components were discussed, beginning with the basic elements of proposal development and continuing through managing the day-to-day activities of sponsored awards.

The workshop emphasized minimizing audit risks, as well as budgets, budget justifications relative to allowable direct costs, direct vs indirect costs, travel on sponsored awards, sponsored projects vs gifts, grants vs cooperative agreements vs contracts, and methods of determining whether a cost is appropriate to charge to a sponsored award.

The workshop also incorporated how to communicate with principal investigators concerning proposal development and daily management needs.

NSF 2017 SI2 PI Meeting

About 150 principal investigators who are currently funded by the National Science Foundation’s Software Infrastructure for Sustained Innovation (SI2) program met in Arlington, VA, February 21–22, for a mix of presentations and breakout sessions focused on fostering collaborations and discussion about topics of interest to the attendees. Jakub Kurzak and Anthony Danalis represented ICL.

Jakub presented on the BONSAI project during the poster session. BONSAI, which stands for BEAST on OpeN Software Attuning Infrastructure, will develop a software infrastructure for using parallel hybrid systems at any scale to execute large, concurrent autotuning sweeps that will make the optimization of computational kernels for GPU accelerators and manycore coprocessors faster.

Workshop on Batched, Reproducible, and Reduced-precision BLAS

ICL was integrally involved in the second Workshop on Batched, Reproducible, and Reduced-precision Basic Linear Algebra Subprograms (BLAS) in Atlanta, February 24–25. Participants sought to investigate the possibility of extending the currently accepted standards to provide greater parallelism for small-size operations, reproducibility, and reduced-precision support.

Presentations and discussions centered around the separate issues related to defining a standard interface for the batched BLAS, reproducible BLAS, and reduced-precision BLAS.

Hardware and software vendors and developers reported on their needs relative to numerical linear algebra software for current and future systems. The authors of the batched, reproducible, and reduced-precision BLAS presented the current proposal and various implementations (reference and more-specific ones). The participants discussed various aspects of the plans.

Jack Dongarra delivered the welcome and introduced the participants, whose affiliations were ICL/UT Knoxville; the University of Manchester; the University of California, Berkeley; Sandia National Laboratories; Umea University; Aachen University; King Abdullah University of Science & Technology; Texas A&M; ARM; Numerical Algorithms Group (NAG); MathWorks; Cray; and Adobe.

ICLers presented on the following topics:

  • Autotuning—Jakub Kurzak and Piotr Luszczek
  • A Proposed Modification to the Batch BLAS Interface—Ahmad Ahmad
  • MAGMA Batched Computations: Approaches and Applications—Stan Tomov
  • High Performance Design of Batched Tensor Computations: Performance Analysis, Modeling, Tuning, and Optimization—Azzam Haidar
  • Half-precision Benchmarks—Piotr

The workshop was sponsored in part by Intel, ICL, and Georgia Tech.

SIAM Conference

ICL gave numerous talks at the SIAM Conference on Computational Science and Engineering, February 27–March 3, in Atlanta. SIAM is the Society for Industrial and Applied Mathematics.

ICL Talks at SIAM 2017

  • Accelerating the Singular Value Decomposition with a Hybrid Two-Stage Algorithm—Mark Gates
  • Batch Linear Algebra for GPU-Accelerated High Performance Computing Environments—Ahmad Ahmad
  • Dense LA at Exascale—Jakub Kurzak
  • Implementation Techniques for Point Set Registration on Multicore and Heterogeneous Hardware—Piotr Luszczek
  • Multilevel Parallelism for SU/PG Discretization Within HPCMP Create™-AV COFFE for Compressible RANS Equations on Fully Tetrahedral Meshes—Stephen Wood
  • Preconditioning on Parallel and Hybrid Architectures—Hartwig Anzt
  • Programming Model, Performance Analysis and Optimization Techniques for the Intel Knights Landing Xeon Phi—Azzam Haidar
  • Randomized Algorithm for Computing or Updating SVD—Ichitaro Yamazaki
  • Tensor Contractions with MAGMA Batched—Stan Tomov
  • The Design and Implementation of a Dense Linear Algebra Library for Extreme Parallel Computers—Jack Dongarra
  • Toward Distributed Eigenvalue and Singular Value Solver Using Data Flow System—Azzam Haidar

SIAM exists to ensure the strongest interactions between mathematics and other scientific and technological communities through membership activities, publication of journals and books, and conferences.

The SIAM Conference was sponsored by the SIAM Activity Group on Computational Science and Engineering.

ICLer Gives Dissertation Talk at Sandia

Stephen Wood, an ICL postdoctoral research associate, gave a talk based on his dissertation titled “Lattice Boltzmann Methods for Wind Energy Analysis” to seventeen members of the Sandia National Laboratories Wind Energy Technologies (WET) department, led by Dave Minster.

The talk covered steps Stephen followed to build, verify, and validate a lattice Boltzmann implementation for wind turbine analysis. Emphasis was placed on the assessment of turbulence modeling options and the analysis of simulation results.

Stephen also met with Srini Arunajatesan, leader of the High Fidelity Modeling group, and Paul Crosier, leader of the software development efforts for the Multiscale Computational Materials Methods group.

David Maniacci, the rotor blade and wind plant aerodynamics lead within WET, invited Stephen to give his talk at Sandia following discussions regarding the details of wind turbine simulation at the American Institute of Aeronautics and Astronautics (AIAA) SciTech Forum in January.

“The opportunities for collaboration through, and surrounding, the [US Department of Energy’s] Exascale Computing Project were recognized by all,” Stephen said. “We will continue our discussions and begin planning collaborations.”

Interview

Matthew Bachstein

Where are you from originally?

I mostly grew up in Lebanon, TN, though I moved around a lot when I was younger.

Can you summarize your educational background?

I have my BS in mathematical sciences from Clemson, and my MS in mathematics from UT Knoxville, both focused in applied mathematics.

What are your research interests?

Currently, optimization of GPU kernels.

What made you want to work for ICL?

I realized that I really liked the implementation and optimization aspects of numerical algorithms, which led to the switch from mathematics to computer science.

What are you working on while at ICL?

I am currently working on the BONSAI project with Piotr and Jakub.

If you weren’t working at ICL, where would you like to be working and why?

Not sure.

What are your interests/hobbies outside work?

Video games, of course. But I also enjoy fly-fishing and small-scale woodworking.

Tell us something unique about yourself that might surprise some people.

I very much enjoy cooking.

Recent Papers

  1. Anzt, H., J. Dongarra, G. Flegar, and E. S. Quintana-Orti, “Batched Gauss-Jordan Elimination for Block-Jacobi Preconditioner Generation on GPUs,” Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores, New York, NY, USA, ACM, pp. 1–10, February 2017. DOI: 10.1145/3026937.3026940
  2. Haidar, A., A. Abdelfattah, S. Tomov, and J. Dongarra, “High-performance Cholesky Factorization for GPU-only Execution,” Proceedings of the General Purpose GPUs (GPGPU-10), Austin, TX, ACM, February 2017. DOI: 10.1145/3038228.3038237
  3. Abdelfattah, A., M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, et al., “Accelerating Tensor Contractions in High-Order FEM with MAGMA Batched,” Atlanta, GA, SIAM Conference on Computational Science and Engineering (SIAM CSE17), Presentation, March 2017.
  4. Baboulin, M., J. Dongarra, A. Remy, S. Tomov, and I. Yamazaki, “Solving Dense Symmetric Indefinite Systems using GPUs,” Concurrency and Computation: Practice and Experience, vol. 29, issue 9, March 2017. DOI: 10.1002/cpe.4055

Recent Conferences

  1. FEB
    Hartwig Anzt
  2. FEB
    GPGPU-10 Workshop, Austin, Texas
    Ahmad Abdelfattah Ahmad
  3. FEB
    SIAM CSE + Batched BLAS, Atlanta, Georgia
    Ahmad Abdelfattah Ahmad, Anthony Danalis, Aurelien Bouteiller, Azzam Haidar, George Bosilca, Hartwig Anzt, Ichitaro Yamazaki, Jack Dongarra, Jakub Kurzak, Mark Gates, Piotr Luszczek, Stanimire Tomov, Stephen Wood, Thomas Herault, Yaohung Tsai, Yves Robert
  4. FEB
    Damien Genet, George Bosilca
  5. FEB
    Damien Genet, George Bosilca
  6. FEB
    MPI Forum, Portland, Oregon
    Aurelien Bouteiller
  7. MAR
    BDEC, Wuxi, China
    Jack Dongarra, Piotr Luszczek, Terry Moore, Tracy Rafferty
  8. MAR
    TESSE Workgroup Meeting, Roanoke, Virginia
    Damien Genet, George Bosilca, Jeffrey Steill, Thomas Herault

Upcoming Conferences

  1. APR
    Future Online Analysis Platform, Arlington, Virginia
    Piotr Luszczek, Terry Moore

Recent Lunch Talks

  1. FEB 3
    George Ostrouchov, ORNL
    Bringing Modern Statistical Science to HPC
  2. FEB 10
    Scott Klasky, ORNL
    Creating ADIOS-2 for scientific exascale data
  3. FEB 17
    Arvind Ramanathan, ORNL
    Computer to Cure Cancer: Building Exascale Deep Learning Tools for Effective Cancer Surveillance
  4. MAR 10
    Nuria Losada
    Stop&Restart vs Resilient MPI applications
  5. MAR 17
    Stephen Wood
    Multilevel Parallelism for CFD: Preconditioning via Parallel ILU Factorization
  6. MAR 31
    Piotr Luszczek
    Implementation Techniques for Point Set Registration on Multicore and Heterogeneous Hardware

Upcoming Lunch Talks

  1. APR 7
    Benjamin Hernandez, ORNL
    GPU Refinement Learning for Crowd Simulations
  2. APR 21
    Sangamesh Ragate
    Sequence Labeling Using RNN
  3. APR 28
    Azzam Haidar
    Toward Distributed Eigenvalue and Singular Value Solver Using Data Flow System

Visitors

  1. Sticks Mabakane
    Sticks Mabakane from the Council for Scientific and Industrial Research (South Africa) will be visiting from February 25 through March 12. Sticks is working with the PAPI group on performance visualizations.

Congratulations

New Arrival

ICL’s Terry Moore is now a granddad! Rowan McMillan Aultman was 21.5 inches long and weighed 8.9 pounds when he was born on February 20 in Tallahassee, FL. Mother and baby are doing fine. Congratulations, Terry!

13-year-old Daughter of ICLer to Attend UT

You may recall having read previously in this newsletter about Sophia Tomov, the daughter of ICL’s Stanimire Tomov, and her winning a spot as a finalist in the 2016 Discovery Education 3M Young Scientist Challenge, a prestigious science competition for middle school students.

Now a new development to report: Sophia, who was accepted into UT as a visiting high school student during the fall of 2016 at the age of 12, will take a computer science course in coding this fall at UT.

Sophia, who hopes to pursue a career in sustainable energy, is the first Tennessee recipient of the Caroline D. Bradley Scholarship, a scholarship for gifted high school students. She is currently a home-schooled student in the eighth grade.

Read more about Sophia’s story in UT’s Daily Beacon publication.

ICL Student Worked on Innovative TV Switcher

Before coming to UT and ICL to pursue higher studies, graduate research assistant Hanumantharayappa had the opportunity to contribute his talents to the team developing a technology product that is receiving high-profile attention.

The product, called Caavo, is a TV switcher, or hub, with eight HDMI ports that unifies streaming television boxes and services such as cable, satellite, apps, and subscriptions.

You can read more about Caavo in the magazine Fast Company and on the website TechCrunch.