News and Announcements

NVIDIA recognizes MAGMA at GTC 2012

The CUDA Centers of Excellence (CCOEs) include some of the world’s top universities engaged in cutting-edge work with CUDA and GPU computing.  To highlight and reward this excellent research, each of the world’s 18 CCOEs was asked to submit an abstract describing their top achievement in GPU computing over the past year and a half.  A panel of experts, led by NVIDIA Chief Scientist Bill Dally, selected four CCOEs to present their achievements at a special event during GTC 2012.

UTK was selected as one of these four CCOE finalists, and Stan Tomov presented the team’s work on MAGMA: A Breakthrough in Solvers for Eigenvalue Problems.  Each of the four finalists received an HP ProLiant SL250 Gen8 system configured with dual NVIDIA Tesla K10 GPU accelerators. For more information see the NVIDIA press release.

ACM SIGHPC

 

Jack “Top500” Dongarra explains the benefits of joining ACM’s Special Interest Group on High Performance Computing (SIGHPC). Watch, learn, and accept no substitutions.

Conference Reports

26th IEEE International Parallel & Distributed Processing Symposium

ICL’s Teng Ma and Anthony Danalis attended this year’s IEEE International Parallel & Distributed Processing Symposium (IPDPS) in Shanghai, China.  Teng and Anthony were in good company, as several ICL collaborators and alumni were also in attendance, including Julien Langou, Hatem Ltaief, Marc Baboulin, Haihang You, Laura Grigori, and Hartwig Anzt, to name just a few.

Teng accepted a Best Paper Award on behalf of himself and the co-authors (George, Aurelien, and Jack) for HierKNEM: An Adaptive Framework for Kernel-Assisted and Topology-Aware Collective Communications on Many-core Clusters.  He also gave a talk on HierKNEM for everyone attending the Best Paper Session.

Anthony gave a talk on Efficient Quality Threshold Clustering for Parallel Architectures during the GPU Acceleration Session.  In addition to giving a talk, Anthony put his photography skills to work and provided us with some really good shots of the symposium and of Shanghai.

ATIP – A*CRC Workshop

Jack recently flew to Singapore to attend the ATIP – A*CRC Workshop on Accelerator Technologies in High Performance Computing.  The workshop centered on Asia’s rise in HPC as well as the use of hybrid CPU-GPGPU systems.  To put that in perspective, as of the November 2011 Top500 rankings, four of the five most powerful supercomputers in the world were built in Japan or China, and three of these five systems use GPU accelerators.

Jack gave a talk entitled Algorithmic and Software Challenges when Moving Towards Exascale where he discussed the importance of software re-design in the pursuit of future extreme scale and hybrid HPC systems.  In total, the workshop included two days of presentations followed by 1.5 days of site visits to relevant laboratories and research institutes in Singapore.

Recent Releases

MAGMA 1.2 Released

MAGMA 1.2 is now available for download.  This release provides the following new functionalities:

  • Two-stage reduction to tridiagonal form;
  • Updated eigensolvers for standard and generalized eigenproblems for symmetric/Hermitian matrices;
  • Tracing functions;
  • Improved error checking, interfaces, and documentation;
  • Extended LAPACK testing.

Visit the MAGMA website to download the tarball.

ScaLAPACK 2.0.2 Released

ScaLAPACK 2.0.2 is now available for download.  New features include:

  • Miscellaneous minor bug fixes;
  • CMAKE shared library support;
  • Bug fixes to the PxHSEQR Nonsymmetric Eigenvalue Problem routines;
  • New xLAMOV routines replace calls to xLACPY to avoid array argument overlap;
  • Modifications to BLACS to better support MPICH-based MPI libraries;
  • New PMPIM2 and PMPCOL routines to avoid code duplication in MRRR routines.

Visit the ScaLAPACK website to download the tarball.

clMAGMA 0.2 Released

clMAGMA 0.2 is now available for download.  This release provides the following new functionalities:

  • Solvers for general and symmetric dense matrices;
  • Least squares solver;
  • Multiple precision support for the LU factorization;
  • Support for s/d/c/z precision arithmetic for all routines.

Visit the MAGMA website to download the tarball.

Interview

Julien Herrmann

Where are you from, originally?

I come from Saint-Etienne, 30 miles southwest of Lyon, France. I spent my childhood there, in the “dark city.” This nickname comes from the many coal mines in the region and the fact that Saint-Etienne was a big industrial center during the last century. I don’t have memories of this industrial period because, a couple of decades ago, the city started to change. Saint-Etienne is now focusing on sustainable development and design, and more museums, technology parks, and cultural events are emerging. Although it still has a bad reputation (especially in Lyon), the “dark city” is turning into the “green city.”

Can you summarize your educational background?

I studied in Lyon when I finished high school. I did my “classe préparatoire” in the Lycée Les Lazaristes, with a major in Mathematics and Physics. Then, I entered ENS Lyon and earned my Bachelor’s and Master’s degrees in Theoretical Computer Science.

Tell us how you first learned about ICL. What made you want to visit ICL?

Last year I did an internship at the University of Paris-Sud under the direction of Marc Baboulin. That’s how I learned about Jack Dongarra and ICL. I also worked on Butterfly Transformations and had contact with Stan Tomov. ICL’s software is world-renowned, and with my reading of ICL papers, I was already impressed with the quality of ICL’s researchers. So, when the opportunity came to visit ICL and work with ICL’s people, I knew that it would be a unique chance for me to extend my knowledge in the high-performance computing area.

What are your research interests? What are you working on during your visit with ICL?

I am interested in numerical algorithms and algorithmics in general. This year, I worked on a new kind of linear solver using QR and LU routines, under the direction of Yves Robert. The idea was to design algorithms that are faster than a QR algorithm while retaining good stability. I implemented several of them on DAGUE. It was really interesting to work in cooperation with the DAGUE team, and I am really thankful to Mathieu Faverge, Thomas Herault, and George Bosilca for their help.

What are your interests/hobbies outside of work?

I like dancing Rock’n Roll, although I have not done so regularly for a long time. During my stay in Knoxville, I discovered swing dancing, which is very similar to Rock’n Roll. In Lyon, I also play Ultimate Frisbee. The sport is not well known in France, but it is taught in middle schools.

Tell us something about yourself that might surprise people.

That’s a tough question. People may be surprised when I say that I did what we call in France “rock sauté” (which is the equivalent of acrobatic Rock’n Roll for kids) for 6 years when I was a child. Check it out on YouTube to see what it looks like. My partner and I went to the semi-final of the National Junior Championship, and we also did some acrobatic Rock’n Roll.

Recent Papers

  1. Haidar, A., H. Ltaief, P. Luszczek, and J. Dongarra, “A Comprehensive Study of Task Coalescing for Selecting Parallelism Granularity in a Two-Stage Bidiagonal Reduction,” IPDPS 2012, Shanghai, China, May 2012.
  2. Baboulin, M., D. Becker, and J. Dongarra, “A Parallel Tiled Solver for Symmetric Indefinite Systems On Multicore Architectures,” IPDPS 2012, Shanghai, China, May 2012.
  3. Bland, W., “Enabling Application Resilience With and Without the MPI Standard,” 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Ottawa, Canada, May 2012.
  4. Dongarra, J., M. Faverge, T. Herault, J. Langou, and Y. Robert, “Hierarchical QR Factorization Algorithms for Multi-Core Cluster Systems,” IPDPS 2012, the 26th IEEE International Parallel and Distributed Processing Symposium, Shanghai, China, IEEE Computer Society Press, May 2012.
  5. Ma, T., G. Bosilca, A. Bouteiller, and J. Dongarra, “HierKNEM: An Adaptive Framework for Kernel-Assisted and Topology-Aware Collective Communications on Many-core Clusters,” IPDPS 2012 (Best Paper), Shanghai, China, May 2012.
  6. Dongarra, J., and A. J. van der Steen, “High Performance Computing Systems: Status and Outlook,” Acta Numerica, vol. 21, Cambridge, UK, Cambridge University Press, pp. 379-474, May 2012.
  7. Tomov, S., J. Dongarra, A. Haidar, I. Yamazaki, T. Dong, T. Schulthess, and R. Solcà, “MAGMA: A Breakthrough in Solvers for Eigenvalue Problems,” San Jose, CA, GPU Technology Conference (GTC12), Presentation, May 2012.
  8. Baboulin, M., S. Donfack, J. Dongarra, L. Grigori, A. Remi, and S. Tomov, “A Class of Communication-Avoiding Algorithms for Solving General Dense Linear Systems on CPU/GPU Parallel Machines,” Proc. of the International Conference on Computational Science (ICCS), vol. 9, pp. 17-26, June 2012.
  9. Song, F., and J. Dongarra, “A Scalable Framework for Heterogeneous GPU-Based Clusters,” The 24th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2012), Pittsburgh, PA, USA, ACM, June 2012.
  10. Anzt, H., S. Tomov, M. Gates, J. Dongarra, and V. Heuveline, “Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems,” ICCS 2012, Omaha, NE, June 2012.
  11. Song, F., S. Tomov, and J. Dongarra, “Enabling and Scaling Matrix Computations on Heterogeneous Multi-Core and Multi-GPU Systems,” 26th ACM International Conference on Supercomputing (ICS 2012), San Servolo Island, Venice, Italy, ACM, June 2012.
  12. Du, P., P. Luszczek, and J. Dongarra, “High Performance Dense Linear System Solver with Resilience to Multiple Soft Errors,” ICCS 2012, Omaha, NE, June 2012.
  13. Yamazaki, I., S. Tomov, and J. Dongarra, “One-Sided Dense Matrix Factorizations on a Multicore with Multiple GPU Accelerators,” The International Conference on Computational Science (ICCS), June 2012.
  14. Bosilca, G., A. Bouteiller, E. Brunet, F. Cappello, J. Dongarra, A. Guermouche, T. Herault, Y. Robert, F. Vivien, and D. Zaidouni, “Unified Model for Assessing Checkpointing Protocols at Extreme-Scale,” University of Tennessee Computer Science Technical Report (also LAWN 269), no. UT-CS-12-697, June 2012.

Recent Lunch Talks

  1. May 4: Rick Weber (EECS), “Approaching 5 Tflops in a $6000 box of commodity parts: xgemm on Southern Islands”
  2. May 11: Vince Weaver, “Measuring Power and Energy with PAPI”
  3. May 18: Lou Gross (NIMBioS), “Computational Irreducibility and Prediction in Ecology”
  4. May 25: Omar Zenati, “LU Decompositions over DAGuE”
  5. June 1: Kiran Kasichayanula, “Power Aware Computing on GPUs”
  6. June 8: James Plank (EECS), “Smoke, Mirrors and Erasure Codes”
  7. June 15: Mark Gates, “Multi-GPU Hessenberg reduction in MAGMA”
  8. June 22: Branislav Jansik (Aarhus University), “DEC code implementation and parallelization”
  9. June 29: Shirley Moore, “Shirley Moore Photo Gallery”

Upcoming Lunch Talks

  1. July 6: Mathieu Faverge, “Algorithms mixing LU and QR kernels for multi-core cluster systems”
  2. July 13: Aurelien Bouteiller, “Fault Tolerance Methods: Challenges and Opportunities”
  3. July 20: Erlin Yao, “A Case Study of Designing Efficient Algorithm-based Fault Tolerant Application for Exascale Parallelism”
  4. July 27: Anthony Danalis, “From Serial Loops to Parallel Execution on Distributed Systems”

Visitors

  1. Vincent Cohen-Addad
    Vincent Cohen-Addad from ENS Lyon will be visiting from June 15 through August 17. Vincent will be working with the Distributed Computing group.
  2. Branislav Jansik
    Branislav Jansik from the Department of Chemistry, Aarhus University, will be visiting on Friday, June 22. He is visiting ORNL to prepare a Gordon Bell application and is interested in using PAPI.
  3. Ahmad Ahmad
    Ahmad Ahmad from King Abdullah University of Science and Technology (KAUST) will be visiting from June 29 through August 29. Ahmad will be working as a student intern.

People

  1. George Bosilca
    George Bosilca will be on sabbatical in France for the next few months, but hopes to return to ICL in the near future.
  2. Kiran Kumar Kasichayanula
    Kiran Kumar Kasichayanula, an EECS/ICL graduate student, recently earned his Master's from UTK and is now working at MulticoreWare, Inc.  Congratulations, Kiran!
  3. Mitch Horton
    Mitch Horton recently finished up his Post Doc work at ICL and is headed to Georgia Tech.  Good luck, Mitch!

Dates to Remember

ICL Annual Retreat

This year’s ICL Retreat has been set for August 16th & 17th.  Be sure to mark it on your calendars!