News and Announcements
Winter Reception 2017
In celebration of another outstanding year, the ICL Winter Reception took place at Calhoun’s on the River, once again bringing together a sizable number of ICLers and their guests for an enjoyable evening. Here’s to further success in 2017!
ECP First Annual Meeting — Photo Gallery
The US Department of Energy’s Exascale Computing Project (ECP) has published a brief article and image gallery to convey the essence of its first annual meeting, held January 29–February 3.
ICL representatives from the seven ECP projects in which we are involved were among the 450 attendees, and are featured in some of the pictures.
Conference Reports
Open MPI Developers Meeting
On January 24–26, ICL’s George Bosilca attended the Open MPI Developers Meeting in San Jose, CA, where Open MPI’s core developers addressed subjects ranging from technical topics to software releases.
“One of the most interesting outcomes was that we decided to switch the software releases to a time-based approach—similar to the Linux kernel—in which we will issue a stable release every four to six months, with a two-month stabilization period,” George said.
In addition to discussing the status of the current release versions, the attendees selected the release managers for version 3.0, pending confirmation by their respective institutions. They also held technical discussions on the future of the code base, vendor participation, user feedback, and other major roadmap items for the next six months, the period preceding the next meeting.
PPoPP 2017
ICL’s Hartwig Anzt and Ahmad Ahmad were speakers at workshops held in conjunction with the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP) 2017, which took place February 4–8 in Austin, TX.
Hartwig gave a talk on batched Gauss-Jordan elimination for block-Jacobi preconditioning during the 8th International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM). Also during the symposium, Hartwig attended a tutorial on heterogeneous computing.
Ahmad was a speaker at the 10th Workshop on General Purpose Processing Using GPUs (GPGPU-10), where he presented a technical paper on performing one-sided factorizations using only GPUs. In contrast with the MAGMA hybrid CPU+GPU routines, this is a new category of algorithms in which the CPU performs none of the computation.
Ahmad showed that the new algorithms come within 90 percent of the performance of the hybrid techniques while using fewer resources and consuming less power.
SRA International Winter Intensive Training Programs
ICL’s Tracy Rafferty and Cindy Knisley were in Atlanta February 16–17 for the Society of Research Administrators International 2017 Winter Intensive Training Programs, where they were able to sharpen their knowledge and skills in the development and business sides of proposal work, respectively.
The conference, which drew approximately fifty attendees, ran its sessions concurrently over two days, allowing participants to focus on a specific area of interest that would promote their professional growth.
Tracy attended the Grant Writing and Proposal Development session. Her goal was to learn more about how to cultivate successful proposals, including positioning projects for funding, planning submissions, and satisfying reviewers.
The workshop leaders were Tolise Dailey, training development specialist in the Office of Contracts and Grants at the University of Colorado, and Johnny Toma, deputy director of finance and administration at Georgetown University Medical Center.
“The two presenters were very good and accomplished in their work,” Tracy said. “The session included actual examples of what works and what doesn’t, and plenty of Q&A. As it turns out, there were three attendees from the UT Knoxville Faculty Development team, as well as two from UT Chattanooga.”
Cindy went to the Departmental Administration workshop, which was aimed at departmental administrators of research-sponsored awards. Both pre- and post-award components were discussed, beginning with the basic elements of proposal development and continuing through managing the day-to-day activities of sponsored awards.
The workshop emphasized budgets, budget justifications relative to allowable direct costs, direct vs. indirect costs, travel on sponsored awards, sponsored projects vs. gifts, grants vs. cooperative agreements vs. contracts, methods of determining whether a cost is appropriate to charge to a sponsored award, and ways to minimize audit risks.
The workshop also covered how to communicate with principal investigators about proposal development and daily management needs.
NSF 2017 SI2 PI Meeting
About 150 principal investigators who are currently funded by the National Science Foundation’s Software Infrastructure for Sustained Innovation (SI2) program met in Arlington, VA, February 21–22, for a mix of presentations and breakout sessions focused on fostering collaborations and discussion about topics of interest to the attendees. Jakub Kurzak and Anthony Danalis represented ICL.
Jakub presented the BONSAI project during the poster session. BONSAI, which stands for BEAST on OpeN Software Attuning Infrastructure, will develop a software infrastructure for using parallel hybrid systems at any scale to execute large, concurrent autotuning sweeps, speeding up the optimization of computational kernels for GPU accelerators and manycore coprocessors.
Workshop on Batched, Reproducible, and Reduced-precision BLAS

ICL was integrally involved in the second Workshop on Batched, Reproducible, and Reduced-precision Basic Linear Algebra Subprograms (BLAS) in Atlanta, February 24–25. Participants sought to investigate the possibility of extending the currently accepted standards to provide greater parallelism for small-size operations, reproducibility, and reduced-precision support.
Presentations and discussions centered on the separate issues involved in defining a standard interface for the batched BLAS, reproducible BLAS, and reduced-precision BLAS.
Hardware and software vendors and developers reported on their needs relative to numerical linear algebra software for current and future systems. The authors of the batched, reproducible, and reduced-precision BLAS presented the current proposal and various implementations (reference and more specialized ones). The participants discussed various aspects of the plans.
Jack Dongarra delivered the welcome and introduced the participants, whose affiliations were ICL/UT Knoxville; the University of Manchester; the University of California, Berkeley; Sandia National Laboratories; Umeå University; Aachen University; King Abdullah University of Science & Technology; Texas A&M; ARM; Numerical Algorithms Group (NAG); MathWorks; Cray; and Adobe.
ICLers presented on the following topics:
- Autotuning—Jakub Kurzak and Piotr Luszczek
- A Proposed Modification to the Batch BLAS Interface—Ahmad Ahmad
- MAGMA Batched Computations: Approaches and Applications—Stan Tomov
- High Performance Design of Batched Tensor Computations: Performance Analysis, Modeling, Tuning, and Optimization—Azzam Haidar
- Half-precision Benchmarks—Piotr Luszczek
The workshop was sponsored in part by Intel, ICL, and Georgia Tech.
SIAM Conference

ICL researchers gave numerous talks at the SIAM Conference on Computational Science and Engineering, February 27–March 3, in Atlanta. SIAM is the Society for Industrial and Applied Mathematics.
ICL Talks at SIAM 2017
- Accelerating the Singular Value Decomposition with a Hybrid Two-Stage Algorithm—Mark Gates
- Batch Linear Algebra for GPU-Accelerated High Performance Computing Environments—Ahmad Ahmad
- Dense LA at Exascale—Jakub Kurzak
- Implementation Techniques for Point Set Registration on Multicore and Heterogeneous Hardware—Piotr Luszczek
- Multilevel Parallelism for SU/PG Discretization Within HPCMP Create™-AV COFFE for Compressible RANS Equations on Fully Tetrahedral Meshes—Stephen Wood
- Preconditioning on Parallel and Hybrid Architectures—Hartwig Anzt
- Programming Model, Performance Analysis and Optimization Techniques for the Intel Knights Landing Xeon Phi—Azzam Haidar
- Randomized Algorithm for Computing or Updating SVD—Ichitaro Yamazaki
- Tensor Contractions with MAGMA Batched—Stan Tomov
- The Design and Implementation of a Dense Linear Algebra Library for Extreme Parallel Computers—Jack Dongarra
- Toward Distributed Eigenvalue and Singular Value Solver Using Data Flow System—Azzam Haidar
SIAM exists to ensure the strongest interactions between mathematics and other scientific and technological communities through membership activities, publication of journals and books, and conferences.
The SIAM Conference was sponsored by the SIAM Activity Group on Computational Science and Engineering.
ICLer Gives Dissertation Talk at Sandia
Stephen Wood, an ICL postdoctoral research associate, gave a talk based on his dissertation titled “Lattice Boltzmann Methods for Wind Energy Analysis” to seventeen members of the Sandia National Laboratories Wind Energy Technologies (WET) department, led by Dave Minster.
The talk covered steps Stephen followed to build, verify, and validate a lattice Boltzmann implementation for wind turbine analysis. Emphasis was placed on the assessment of turbulence modeling options and the analysis of simulation results.
Stephen also met with Srini Arunajatesan, leader of the High Fidelity Modeling group, and Paul Crosier, leader of the software development efforts for the Multiscale Computational Materials Methods group.
David Maniacci, the rotor blade and wind plant aerodynamics lead within WET, invited Stephen to give his talk at Sandia following discussions regarding the details of wind turbine simulation at the American Institute of Aeronautics and Astronautics (AIAA) SciTech Forum in January.
“The opportunities for collaboration through, and surrounding, the [US Department of Energy’s] Exascale Computing Project were recognized by all,” Stephen said. “We will continue our discussions and begin planning collaborations.”
Interview

Matthew Bachstein
Where are you from originally?
I mostly grew up in Lebanon, TN, though I moved around a lot when I was younger.
Can you summarize your educational background?
I have my BS in mathematical sciences from Clemson, and my MS in mathematics from UT Knoxville, both focused in applied mathematics.
What are your research interests?
Currently, optimization of GPU kernels.
What made you want to work for ICL?
I realized that I really liked the implementation and optimization aspects of numerical algorithms, which led to the switch from mathematics to computer science.
What are you working on while at ICL?
I am currently working on the BONSAI project with Piotr and Jakub.
If you weren’t working at ICL, where would you like to be working and why?
Not sure.
What are your interests/hobbies outside work?
Video games, of course. But I also enjoy fly-fishing and small-scale woodworking.
Tell us something unique about yourself that might surprise some people.
I very much enjoy cooking.

You may recall having read previously in this newsletter about Sophia Tomov, the daughter of ICL’s Stanimire Tomov, and her winning a spot as a finalist in the 2016 Discovery Education 3M Young Scientist Challenge, a prestigious science competition for middle school students.