News and Announcements

UT’s Kraken Decommissioned

The National Institute for Computational Sciences (NICS) decommissioned the Kraken supercomputer on April 30, 2014. The Cray XT5 machine was the first academic supercomputer to break the 1 petaflop/s barrier and had served the XSEDE and TeraGrid communities, in one form or another, since 2008. Over that six-year span, Kraken ran 43% of XSEDE/TeraGrid’s computing allocations with 96% up-time.

Users who previously utilized Kraken resources will still be able to access other machines through the XSEDE network. More details about Kraken, including a FAQ about the decommissioning and a history by-the-numbers, can be found on NICS’s website.

Staff Lot 9 Closing for Repaving


Beginning Monday, May 12th, Staff Lot 9 will close for refurbishment. In Phase 1, the lot will be leveled, repaved, and re-striped, and new lighting will be installed and trees planted. Also during this time, the southbound lane of Phillip Fulmer Way from Middle Drive to Peyton Manning Pass will be closed while workers build a new retaining wall and sidewalk; pedestrians will need to walk on the stadium side of the street. Margaret’s Alley, the upper lot closer to Claxton, will remain open during Phase 1.

Phase 2 starts in mid-July, and includes the repaving and re-striping of the Margaret’s Alley section (upper lot) of the Staff 9 lot.

Throughout the construction and closure, Staff 9 permit holders will be able to park in the G10 garage on Phillip Fulmer Way next to Thompson-Boling Arena. All work will be completed by mid-August. For more information on campus road closures and construction projects, visit the Cone Zone.

NICS HPC Seminar Series

The National Institute for Computational Sciences presents the Seminar Series on High Performance Computing every Tuesday and Thursday from 2:10 pm to 3:10 pm in the NICS conference room, Claxton 351. This is a joint effort among several leadership organizations (NICS, JICS, OLCF, ICL) to increase HPC awareness within the academic community.

Different topics will be introduced starting with the most basic and building up to more advanced topics in HPC. No registration is required for the seminar.

Calendar of topics to be covered in May:

Date    Title
May 1   Art and Science of using Python in HPC
May 6   From equation to code (Part 1)
May 8   From equation to code (Part 2)

Recent Releases

PaRSEC / DPLASMA 1.2.1 Released

A new version of the PaRSEC runtime, together with the DPLASMA dense linear algebra library, is now available. This release brings many changes to the runtime, as well as to the DPLASMA layer.

At the runtime level, many existing features have been stabilized and optimized:

  • Only support for C is required from MPI.
  • Revert to an older LEX syntax (not (?i:xxx)).
  • Don’t recursively call the MPI communication engine progress function.
  • Protect the runtime headers from inclusion in C++ files.
  • Fix a memory leak so that remote data is released once used.
  • Integrate the new profiling system and the python interface (added Panda support).
  • Change several default parameters for the DPLASMA tests.
  • Add Fortran support for the PaRSEC and the profiling API and add tests to validate it.
  • Backport all the profiling features from the devel branch (panda support, simpler management, better integration with python, support for HDF5, etc.).
  • Change the colorscheme of the .dot generator.
  • Correctly compute the identifier of the tasks (ticket #33).
  • Allocate the list items from the corresponding list using the requested gap.
  • Add support for older lex (without support for ?i).
  • Add support for 128-bit atomics and resolve the lingering ABA issue. When 128-bit atomics are not available, fall back to an atomic lock implementation of the most basic data structures.
  • Require Cython 0.19.1 or later.
  • Completely disconnect the ordering of parameters and locals in the JDF.
  • Many other minor bug fixes, code readability improvements, and corrected typos.

In addition to these capabilities, the current release brings a more efficient runtime that delivers more performance to the DPLASMA library:

  • Add doxygen documentation generation.
  • Improve ztrmm with all matrix reads unified.
  • Add support for potri functions (trtri+lauum) and corresponding tests.
  • Fix a bug in symmetric/hermitian norms when A.mt < P.

The DPLASMA library itself is the leading implementation of a dense linear algebra package for distributed heterogeneous systems. DPLASMA’s feature set includes:

  • Linear systems of equations (Cholesky, LU [inc. pivoting, PP], LDL [prototype]).
  • Least squares (QR & LQ).
  • Symmetric eigenvalue problem (Reduction to Band [prototype]).
  • Level 3 Tile BLAS (GEMM, TRSM, TRMM, HEMM/SYMM, HERK/SYRK, HER2K/SYR2K).
  • Covers double real, double complex, single real, and single complex (D, Z, S, C) precisions.
  • Provides a ScaLAPACK-compatible interface for matrices in F77 column-major layout.
  • Supports Linux, Windows, Mac OS X, and UN*X (depends on MPI and hwloc).

Visit the PaRSEC software page to download the tarball.

MAGMA 1.5.0 Beta 1 Released

MAGMA 1.5.0 beta 1 is now available. This release provides performance improvements for SVD and eigenvector routines. More information is available in the presentation MAGMA: a New Generation of Linear Algebra Libraries for GPU and Multicore Architectures, as well as in the MAGMA Quick Reference Guide. The MAGMA 1.5.0 release adds the following new functionality:

  • SVD using Divide and Conquer (gesdd);
  • Nonsymmetric eigenvector computation is multi-threaded (trevc3_mt).

Parameters (trans, uplo, etc.) now use symbolic constants (MagmaNoTrans, MagmaLower, etc.) instead of characters (‘N’, ‘L’, etc.). Converters are provided to translate these to LAPACK, CUBLAS, and CBLAS constants.

Visit the MAGMA software page to download the tarball.

Interview

Felix Wolf

Where are you from, originally?

I was born in Munich, Germany, but spent most of my childhood and youth in Jülich, a small town in the Rhineland area—not too far away from the Belgian and Dutch border. Some colleagues and alumni from ICL have already been there—also because it is home of the Jülich Supercomputing Centre. The center, which is a part of Forschungszentrum Jülich, a major national lab, is located in a forest about a 30 minutes’ walk away from the street where I grew up. Further landmarks of the area include two strip mines, a sugar factory, plenty of farmland, mostly for sugar beets, and some historical fortifications. In addition, they recently constructed an experimental solar tower on the edge of town, which you can hardly look at during operation because its glare is so bright.

Can you summarize your educational background?

The high school I went to was located in an unusual place—within the walls of a centuries-old renaissance fortress in downtown Jülich, called the “citadel.” Every time I went to school, I had to cross a moat and pass through a tunnel like a knight, only I didn’t ride on horseback. The tunnel was dark because it was designed with a kink in it to prevent cannon fire from reaching the interior of the fortress.

Since I liked math but also had an affinity with language, I decided after graduation to study computer science. It was not actually that I felt a strong attraction to computer hardware or the desire to spend much of my time sitting in front of it. For my minor subject I chose physics, which enabled me later to pretend that I understood what all the simulation codes were about. I received both my master’s and my Ph.D. degree from RWTH Aachen University. My enthusiasm for supercomputing was sparked during my master’s thesis, as part of which I developed a performance-tool prototype that served both as inspiration and foundation for most of my research work to date.

How did you get introduced to ICL?

This was funny. I actually never applied directly for a position at ICL. A CV sent to another university in the U.S. that didn’t have a position for me was forwarded without my knowledge to Jack, who invited me to Knoxville for an interview. His email came as quite a surprise. The interview took the whole day, during which I had very stimulating face-to-face discussions with a number of future colleagues. After the technical part of the interview was over, Tracy Rafferty gave me a very nice tour of the campus. In the evening, Kenny Roche and Noel Black took me out to a restaurant and a few bars, including a martini jazz bar near Western Plaza and the legendary Cup A Joe. The interview was finished the following day in the early hours of the morning, after which I knew that working at ICL would be a rewarding experience and also a lot of fun.

What did you work on during your time at ICL?

Although I was technically a member of the PAPI team, Shirley Moore gave me the freedom to pursue my own R&D agenda, for which I am still very grateful. Essentially, I enjoyed the luxury of occupying a “bring-your-own-project” position. During my time at ICL, I therefore continued to work on KOJAK, a performance-analysis tool that helped HPC application developers to spot inefficiencies in their code. KOJAK had emerged from my Ph.D. thesis in collaboration with Bernd Mohr from the Jülich Supercomputing Centre. While I was in Knoxville, Bernd and I managed to move the project from a prototype to a release version—in addition to the many new features we added as a result of our research. That it was turned into a transatlantic partnership also made it easier to promote it in the U.S. Three ICL students, Fengguang Song, Nikhil Bhatia, and Farzona Pulatova, joined our efforts and helped us a great deal—especially with the interactive exploration of performance data. One year after my departure from Knoxville, Farzona visited my group in Germany as a summer intern.

What are some of your favorite memories from your time at ICL?

Because there are so many fond memories of my time at ICL, this is a tough question. One thing that made a vivid impression on me was the good sense of humor—sometimes even dry humor—I found among many people at ICL, and which I enjoyed so much. A magnificent scene that I still see in front of me was a lunch talk that Tracy Rafferty was supposed to announce but whose title she couldn’t remember. Instead, she stood in front of the audience with a straight face and said: “It’s about something scalable.”

Also, whenever I got stuck in a project, I went to Paul Peltz and Joe Thomas in the ICL help office, where we would crack some jokes, most of which I probably shouldn’t tell here. After that, I would return to my office in a more relaxed mood and untie the knot I was struggling with. In this sense, ICL help’s mission went far beyond purely technical questions and extended well into moral support.

Tell us where you are and what you’re doing now.

Since 2009, I have been head of the Laboratory for Parallel Programming at the German Research School for Simulation Sciences. The school is a joint venture of Forschungszentrum Jülich and RWTH Aachen University, committed to research and education in the applications and methods of HPC-based computer simulation. The lab is located on the university campus in Aachen, the city where I now live with my family. In addition, I am a professor at RWTH Aachen University, which gives me the opportunity to train students and, in particular, supervise thesis projects.

The lab’s mission is to develop methods, tools, and infrastructure to exploit massive parallelism on modern computer architectures. One of our key projects is the performance-analysis tool Scalasca, the successor to KOJAK, which we are developing jointly with the Jülich Supercomputing Centre. In Aachen, our focus is on automatic performance modeling with applications in requirements engineering for the co-design of exascale hardware and software. A major but relatively new topic that is separate from Scalasca is parallelism discovery, which covers techniques to guide the parallelization of sequential programs. Although it usually does not involve as many processors as our other projects, the number of applications that can benefit is quite large.

In what ways did working at ICL prepare you for what you do now, if at all?

Again, there are numerous things that I learned from working at ICL. What I would like to emphasize here is how much the success of a scientist depends on a powerful support infrastructure like the one provided at ICL, including but not limited to an effective business office, capable technical support, experienced communication specialists, and talented graphic designers. Another thing I learned to appreciate is a culture that values production software. As long as you are only creating prototypes to write articles in academic journals, only a few people, if any, will be able to build directly on your results, often reducing the practical impact of your work. Furthermore, you will hardly get any actual user feedback. Although not always flattering, such feedback usually contains very valuable guidance on where to invest your efforts.

Tell us something about yourself that might surprise some people.

As an eighteen-year-old, I had an uncommon hobby. I imitated voices—one voice to be precise. It was the voice of a very popular literary critic, who presented his book reviews on TV in an incredibly entertaining manner and whom I greatly admired. What started off as a party joke became quite serious when my classmates talked me into giving the farewell address at graduation, mimicking his voice and characteristic mannerisms. The subject I chose for my speech was a discussion of how literature was taught at our school. I then experienced intense stage fright until I tried the following advice I had found in a rhetoric manual. Before the speech, drink one glass of champagne—but not two.

Recent Papers

  1. Anzt, H., S. Tomov, and J. Dongarra, “Implementing a Sparse Matrix Vector Product for the SELL-C/SELL-C-σ formats on NVIDIA GPUs,” University of Tennessee Computer Science Technical Report, no. UT-EECS-14-727, University of Tennessee, April 2014.
  2. Haidar, A., R. Solcà, M. Gates, S. Tomov, T. C. Schulthess, and J. Dongarra, “A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks,” International Journal of High Performance Computing Applications, vol. 28, issue 2, pp. 196-209, May 2014. DOI: 10.1177/1094342013502097
  3. Dong, T., V. Dobrev, T. Kolev, R. Rieben, S. Tomov, and J. Dongarra, “A Step towards Energy Efficient Computing: Redesigning A Hydrodynamic Application on CPU-GPU,” IPDPS 2014, Phoenix, AZ, IEEE, May 2014.
  4. Dongarra, J., M. Faverge, H. Ltaief, and P. Luszczek, “Achieving numerical accuracy and high performance using recursive tile LU factorization with partial pivoting,” Concurrency and Computation: Practice and Experience, vol. 26, issue 7, pp. 1408-1431, May 2014. DOI: 10.1002/cpe.3110
  5. Bosilca, G., A. Bouteiller, T. Herault, Y. Robert, and J. Dongarra, “Assessing the Impact of ABFT and Checkpoint Composite Strategies,” 16th Workshop on Advances in Parallel and Distributed Computational Models, IPDPS 2014, Phoenix, AZ, IEEE, May 2014.
  6. Cao, C., J. Dongarra, P. Du, M. Gates, P. Luszczek, and S. Tomov, “clMAGMA: High Performance Dense Linear Algebra with OpenCL,” International Workshop on OpenCL, Bristol University, England, May 2014.
  7. Yamazaki, I., J. Kurzak, P. Luszczek, and J. Dongarra, “Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime,” Workshop on Large-Scale Parallel Processing, IPDPS 2014, Phoenix, AZ, IEEE, May 2014.
  8. Faverge, M., J. Herrmann, J. Langou, B. Lowery, Y. Robert, and J. Dongarra, “Designing LU-QR Hybrid Solvers for Performance and Stability,” IPDPS 2014, Phoenix, AZ, IEEE, May 2014. DOI: 10.1109/IPDPS.2014.108
  9. Donfack, S., S. Tomov, and J. Dongarra, “Dynamically balanced synchronization-avoiding LU factorization with multicore and GPUs,” Fourth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2014, May 2014.
  10. Benoit, A., Y. Robert, and S. K. Raina, “Efficient checkpoint/verification patterns for silent error detection,” Innovative Computing Laboratory Technical Report, no. ICL-UT-14-03, University of Tennessee, May 2014.
  11. Lukarski, D., H. Anzt, S. Tomov, and J. Dongarra, “Hybrid Multi-Elimination ILU Preconditioners on GPUs,” International Heterogeneity in Computing Workshop (HCW), IPDPS 2014, Phoenix, AZ, IEEE, May 2014.
  12. Yamazaki, I., H. Anzt, S. Tomov, M. Hoemmen, and J. Dongarra, “Improving the performance of CA-GMRES on multicores with multiple GPUs,” IPDPS 2014, Phoenix, AZ, IEEE, May 2014.
  13. Haidar, A., P. Luszczek, and J. Dongarra, “New Algorithm for Computing Eigenvectors of the Symmetric Eigenvalue Problem,” Workshop on Parallel and Distributed Scientific and Engineering Computing, IPDPS 2014 (Best Paper), Phoenix, AZ, IEEE, May 2014. DOI: 10.1109/IPDPSW.2014.130
  14. Tomov, S., P. Luszczek, I. Yamazaki, J. Dongarra, H. Anzt, and W. Sawyer, “Optimizing Krylov Subspace Solvers on Graphics Processing Units,” Fourth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2014, Phoenix, AZ, IEEE, May 2014.
  15. Lacoste, X., M. Faverge, P. Ramet, S. Thibault, and G. Bosilca, “Taking Advantage of Hybrid Systems for Sparse Direct Solvers via Task-Based Runtimes,” 23rd International Heterogeneity in Computing Workshop, IPDPS 2014, Phoenix, AZ, IEEE, May 2014.
  16. Haidar, A., C. Cao, J. Dongarra, P. Luszczek, and S. Tomov, “Unified Development for Mixed Multi-GPU and Multi-Coprocessor Environments using a Lightweight Runtime Environment,” IPDPS 2014, Phoenix, AZ, IEEE, May 2014.

Recent Lunch Talks

  1. APR 4: Jakub Kurzak, Some Techniques for Optimizing CUDA
  2. APR 10: Dorian Arnold (UNM), A Simulation-based Framework for Evaluating Resilience Strategies at Scale
  3. APR 25: George Bosilca, Toward composite fault management strategies: a quantitative evaluation
  4. MAY 2: Ichitaro Yamazaki, Performance of s-step GMRES to avoid communication on/between GPUs
  5. MAY 9: Hartwig Anzt, Hybrid Multi-Elimination ILU Preconditioners on GPUs
  6. MAY 14: Thomas Herault, DPLASMA/PaRSEC
  7. MAY 16: Azzam Haidar, MAGMA: LU Factorization for Small Matrices
  8. MAY 30: Grigori Fursin (INRIA), Collective Mind: community-driven systematization and automation of program optimization

Upcoming Lunch Talks

  1. JUN 6: Kris Garrett (ORNL), A Nonlinear QR Algorithm for Banded Nonlinear Eigenvalue Problems
  2. JUN 13: Tingxing Dong, A Step towards Energy Efficient Computing: Redesigning A Hydrodynamic Application on CPU-GPU
  3. JUN 20: Yves Robert, Algorithms for coping with silent errors
  4. JUN 27: Ryan Glasby (JICS), Comparison of SU/PG and DG Finite-Element Techniques for the Compressible Navier-Stokes Equations on Anisotropic Unstructured Meshes

Visitors

  1. Damien Genet from INRIA Bordeaux will be visiting from May 19 through June 6.
  2. Marc Baboulin from Inria will be visiting from May 27 through May 31.
  3. Grigori Fursin from Inria will be visiting from May 27 through May 31.


Congratulations

Amy Josephine Rogers

Amy Josephine Rogers was born to David and Wendy Rogers on May 2nd at 8:10 AM, weighing in at 7 lbs., 2 oz. and measuring 19.5″ long. Everything went smoothly, and mom and baby are doing just fine.

Big sister Eva is pictured holding Amy Jo. Congratulations to the Rogers family!


Dates to Remember

2014 ICL Retreat

Mark your calendars for August 14th – 15th for the 2014 ICL Retreat!