|
2022/01/14 |
Mark Gates |
ICL |
Parallel divide & conquer eigenvector computation in SLATE |
mark-gates-parallel-divid-2022-01-14.pdf |
|
2022/01/07 |
Qinglei Cao |
ICL |
Dense, Mixed-Precision and Tile Low-Rank GEMM and Cholesky on Fugaku Using PaRSEC |
qinglei-cao-dense-mixed-pr-2022-01-07.pdf |
|
2021/12/10 |
Daniel Mishler |
|
HHL: a Quantum Algorithm for Exponential Speedup on Systems of Linear Equations |
daniel-mishler-2021-12-10.pdf |
|
2021/12/03 |
Anthony Danalis |
ICL |
SDE library internals, a.k.a. watching the making of sausage |
|
|
2021/11/12 |
Mohsen Mahmoudi-Aznaveh |
Texas A&M |
ParU: Parallel Unsymmetric Multifrontal sparse LU factorization |
mohsen-mahmoudi-aznaveh-paru-parallel-2021-11-12.pdf |
|
2021/11/05 |
Grzegorz Kwasniewski |
ETH Zurich |
From graph pebbling to I/O optimal and high-performance code |
grzegorz-kwasniewski-from-graph-pebb-2021-11-05.pdf |
|
2021/10/29 |
Rabab Al-Omairy |
ICL |
Communication Avoiding LU with Tournament Pivoting in SLATE |
|
|
2021/10/22 |
Tony Castaldo |
ICL |
Parallel Spectrum Slicing for Selected Eigenpairs with Tall-Skinny QR Orthogonalization of Eigenvectors |
|
|
2021/10/15 |
Hartwig Anzt |
ICL |
Batched Iterative Solvers for Sparse Linear Problems |
|
|
2021/10/08 |
Aurelien Bouteiller |
ICL |
|
|
|
2021/10/01 |
Daniel Bielich |
ICL |
Delayed Classical Gram-Schmidt with Reorthogonalization (DCGS2), in the context of QR and Arnoldi |
|
|
2021/09/24 |
Ed Valeev |
Virginia Tech |
Tensor-Centric View of Electrons in Molecules and Materials |
|
|
2021/09/17 |
Anouar Benali |
Argonne National Laboratory |
Towards predictive simulations of molecules and solids using quantum Monte Carlo methods |
|
|
2021/09/10 |
Stephen Herbein |
Lawrence Livermore National Laboratory |
HPC + Cloud Convergence at LLNL |
stephen-herbein-hpc-cloud-con-2021-09-10.pdf |
|
2021/09/03 |
Tu Mai Anh Do |
USC Information Sciences Institute |
Assessing Resource Provisioning and Allocation of Ensembles of In Situ Workflows |
tu-mai-anh-do-assessing-resou-2021-09-03.pdf |
|
2021/08/27 |
Scott Klasky |
Oak Ridge National Laboratory |
Data Reduction via the MultiGrid Adaptive Reduction of Data (MGARD) |
scott-klasky-data-reduction-2021-08-27.pdf |
|
2021/08/20 |
Thomas Herault |
ICL |
Templated Task Graph: a new task programming interface in C++ |
|
|
2021/08/06 |
Michele Benzi |
Scuola Normale Superiore in Pisa |
Walk-based measures of centrality, communicability, and robustness in networks |
|
|
2021/07/30 |
Ahmad Abdelfattah |
ICL |
The underlying complexity of optimizing batch routines |
|
|
2021/07/23 |
Jamie Coble |
UTK's Department of Nuclear Engineering |
Data-Driven Decision Making for Improved Economics of Nuclear Power |
|
|
2021/07/16 |
Kate Keahey |
Argonne National Laboratory |
Chameleon: An Innovation Platform for Repeatable Computer Science Research |
|
|
2021/07/09 |
Piotr Luszczek |
ICL |
Towards linear algebra in the standard C++ library |
|
|
2021/07/02 |
Stanimire Tomov |
ICL |
MAGMA: Evolution and Revolution |
|
|
2021/06/25 |
Rabab Al-Omairy |
ICL |
|
|
|
2021/06/18 |
George Bosilca |
ICL |
50 Shades of Cholesky |
Bosilca-50-shades-of-Cholesky.pdf |
|
2021/06/11 |
Hartwig Anzt |
KIT |
Porting Ginkgo to Intel GPUs using DPC++ |
|
|
2021/06/04 |
Jack Dongarra |
ICL |
|
|
|
2021/05/28 |
Fan Zhang |
UTK's Nuclear Engineering Department |
Enhancing the Cybersecurity of Nuclear Facilities through Data Aggregation |
|
|
2021/05/21 |
Dr. Katie Cahill |
Howard H. Baker Jr. Center for Public Policy |
Returning with Care: COVID-19 and Campus Reopening |
|
|
2021/05/14 |
Pedro Valero Lara |
Oak Ridge National Laboratory |
Tasking for Linear Algebra Kernels |
|
|
2021/05/07 |
Wissam Sid Lakhdar |
ICL |
Multitask and Transfer Learning for Autotuning Exascale Applications |
|
|
2021/04/30 |
Natalie Beams |
ICL |
ROCm Road: Milestones, Mishaps, and Mysteries from a Year of Preparing libCEED for an AMD GPU Future |
|
|
2021/04/23 |
Joseph Schuchart |
ICL |
Callback-Based Completion Notification Using MPI Continuations |
|
|
2021/04/09 |
Yu Pei |
ICL |
|
|
|
2021/03/26 |
Dong Zhong |
ICL |
Toward Reliable and Efficient Message Passing Software for HPC Systems: Fault Tolerance and Vector Extension |
|
|
2021/03/19 |
Cade Brown |
ICL |
The Wild World of Derivatives, Gamma Squeezes, and (of course) GameStop |
|
|
2021/03/12 |
Sajal Dash |
Oak Ridge National Laboratory |
Scaling Out a Combinatorial Algorithm for Discovering Carcinogenic Gene Combinations to Thousands of GPUs |
|
|
2021/03/05 |
Azzam Haidar |
NVIDIA |
Tensor Core Accelerated Iterative Refinement Solvers and Their Impact on Scientific Computing |
Haidar-Tensor-Core-Accelerated-Iterative-Refinement-Solvers-and-Its-Impact-on-Scientific-Computing-03-05-2021.pdf |
|
2021/02/26 |
Terry Moore |
ICL |
Some Reflections About the Life of ICL in the 21st Century |
Moore-Some-Reflections-on-the-Life-of-ICL-in-the-21st-Century-02-26-2021.pdf |
|
2021/02/19 |
Sebastien Cayrols |
ICL |
Design and Optimization of MPI_Alltoall for Mixed-Precision Algorithms |
|
|
2021/02/12 |
Qinglei Cao |
ICL |
Accelerating Geostatistical Modeling and Prediction With Mixed-Precision Computations: A High-Productivity Approach with PaRSEC |
|
|
2021/02/05 |
Daniel Barry |
ICL |
The Linear Algebra of Native Hardware Event Identification |
Barry-The-Linear-Algebra-of-Native-Hardware-Event-Identification-02-05-2021.pdf |
|
2021/01/29 |
Alan Ayala |
ICL |
Tuning FFTs for Exascale |
|
|
2021/01/22 |
Seetharami Seelam |
IBM |
Future of HPC on Cloud: A Researcher Perspective |
|
|
2021/01/15 |
Yves Robert |
ENS-Lyon |
Resilient Scheduling of Moldable Jobs on Failure-Prone Platforms |
|
|
2021/01/08 |
Maksim Melnichenko |
ICL |
Randomized Algorithms for the Low Rank Matrix Approximation |
|
|
2020/12/18 |
Ian Lumsden |
Global Computing Laboratory |
|
|
|
2020/12/11 |
Nigel Tan |
Global Computing Laboratory |
Optimizing Vector Particle-In-Cell (VPIC) for Memory Constrained Systems Using Half-Precision |
Nigel-Tan-Optimizing-Vector-Particle-in-Cell-for-Memory-Constrained-Systems-Using-Half-Precision-12-11-2020.pdf |
|
2020/12/04 |
Anthony Danalis |
ICL |
|
|
|
2020/11/13 |
Asim YarKhan |
ICL |
Profiling and Performance Improvements in SLATE: Experience with the Cholesky Factorization |
|
|
2020/11/06 |
Heike Jagode |
ICL |
Dataflow Execution for Computational Chemistry Methods |
|
|
2020/10/30 |
Piotr Luszczek |
ICL |
Scalable Data Generation for Evaluating Mixed-Precision Solvers |
|
|
2020/10/23 |
Sunita Chandrasekaran |
University of Delaware |
Developing Software for Today's and Tomorrow's platform - Fun or a Nightmare! |
|
|
2020/10/16 |
Florent Lopez |
ICL |
Running SLATE using the PaRSEC Runtime System |
Lopez-Running-SLATE-using-the-PaRSEC-Runtime-System-10-16-2020.pdf |
|
2020/10/09 |
Thomas Herault |
ICL |
|
|
|
2020/10/02 |
Micah Beck |
EECS |
One Device To Rule Them All |
|
|
2020/09/25 |
Anthony Skjellum |
UT Chattanooga SimCenter |
|
|
|
2020/09/11 |
Richard Barrett |
Sandia National Laboratories |
Chapel Performance for Some Graph Analytics |
Barrett-Chapel-Performance-for-Some-Graph-Analytics-09-11-2020.pdf |
|
2020/09/04 |
Sudip Seal |
Oak Ridge National Laboratory |
Revisiting Parallel Algorithms for Block Tridiagonal Systems |
Seal-Revisiting-Parallel-Algorithms-for-Block-Tridiagonal-Systems-09-04-2020.pdf |
|
2020/08/28 |
Clark Hathaway and Sebastian Mobo |
Global Computing Laboratory |
A Framework for Linking Urban Traffic and Vehicle Emissions in Smart Cities |
Hathaway-Mobo-A-Framework-for-Linking-Urban-Traffic-and-Vehicle-Emissions-in-Smart-Cities-08-28-2020.pdf |
|
2020/08/21 |
Phil Mucci |
Minimal Metrics |
Automating Workflow Support with EPMT |
Mucci-Automating-Workflow-Support-with-EPMT-08-21-2020.pdf |
|
2020/08/07 |
Chad Steed |
Oak Ridge National Laboratory |
Interactive Visual Analysis of the TOP500 List |
Steed-Interactive-Visual-Analysis-of-the-TOP500-List-08-07-2020.pdf |
|
2020/07/31 |
Dalal Sukkari |
ICL |
Leveraging Task-Based Polar Decomposition Using SLATE on Massively Parallel Systems with Hardware Accelerators |
|
|
2020/07/24 |
Richard Archibald |
Oak Ridge National Laboratory |
Inverse Modeling for Experimental Science and Data Analytics |
Archibald-Inverse-Modeling-for-Experimental-Science-and-Data-Analytic-07-24-2020.pdf |
|
2020/07/17 |
Joan Snoderly |
ICL |
Getting Started with Concur: UT's Online Self-Service Travel Booking Tool |
Snoderly-CONCUR-07-17-2020.pdf |
|
2020/07/10 |
Marcus Ritter and Felix Wolf |
Technical University of Darmstadt |
Learning Cost-Effective Sampling Strategies for Empirical Performance Modeling |
Ritter_Wolf-Learning-Cost-Effective-Sampling-Strategies-for-Empirical-Performance-Modeling-07-10-2020.pdf |
|
2020/06/26 |
Tony Castaldo |
ICL |
Controlling Power on NVIDIA and AMD GPUs |
|
|
2020/06/19 |
Michael Wyatt |
Global Computing Laboratory |
AI4IO: A Suite of AI-Based Tools for IO-Aware HPC Resource Management |
|
|
2020/06/12 |
Hartwig Anzt |
ICL |
Porting Linear Algebra Libraries to the AMD Ecosystem |
Anzt-Porting-Linear-Algebra-Libraries-to-the-AMD-Ecosystem-06-12-2020.pdf |
|
2020/06/05 |
Frank Winkler |
ICL |
PIKA: Center-Wide and Job-Aware Cluster Monitoring |
Winkler-PIKA-Job-Monitoring-06-05-2020.pdf |
|
2020/05/29 |
Azzam Haidar |
NVIDIA |
How CUDA Math Libraries Can Help You Unleash the Power of the New NVIDIA A100 GPU |
|
|
2020/05/22 |
Aurelien Bouteiller |
ICL |
Here's What's Fresh with MPI-4 |
Bouteiller-Whats-Fresh-with-MPI-4-05-22-2020.pdf |
|
2020/05/15 |
Jiali Li |
ICL |
Optimizing all-to-all Operation with Awareness of Network Topology |
Li-Optimizing-all-to-all-Operation-with-Awareness-of-Network-Topology-05-15-2020.pdf |
|
2020/05/08 |
Paula Olaya |
Global Computing Laboratory |
Building Containerized Environments for Reproducibility and Traceability of Scientific Workflows |
Olaya-Building-Containerized-Environments-for-Reproducibility-and-Traceability-of-Scientific-Workflows-05-08-2020.pdf |
|
2020/05/01 |
Stanimire Tomov |
ICL |
HeFFTe: FFT-ECP API and High-Performance Library Prototype for Multidimensional FFTs on Large-Scale Heterogeneous Systems |
Tomov-FFT-ECP-05-01-2020.pdf |
|
2020/04/24 |
Dong Zhong |
ICL |
Using Arm Scalable Vector Extension to Optimize Open MPI |
Zhong-Using-ARM-Scalable-Vector-Extension-to-Optimize-Open-MPI-04-24-2020.pdf |
|
2020/04/17 |
Mark Gates |
ICL |
Test Matrix Generation |
Gates-Test-Matrix-Generation-04-17-2020.pdf |
|
2020/04/03 |
Yu Pei |
ICL |
Communication Avoiding 2D Stencil Implementations over PaRSEC Task-Based Runtime |
Pei-CA-Stencil-PaRSEC-04-03-2020.pdf |
|
2020/03/13 |
Ahmad Abdelfattah |
ICL |
Recent Developments for Mixed Precision Solvers in MAGMA |
|
|
2020/03/06 |
Chris Gropp |
Global Computing Laboratory |
Understanding Machine Learning Systems using Adversarial Approaches |
|
|
2020/02/28 |
Mohammed Al Farhan |
ICL |
Optimizing the Cholesky factorization in SLATE |
AlFarhan-SLATE-02-28-2020.pdf |
|
2020/02/21 |
Daniel Schultz |
ICL |
The Effects of MPI Oversubscription & Multi-Process Service on Distributed-memory FFT Libraries |
Schultz-The-Effects-of-MPI-Oversubscription-&-Multi-Process-Service-on-Distributed-memory-FFT-Libraries-02-21-2020.pdf |
|
2020/01/31 |
Neil Lindquist |
ICL |
Improving the Performance of GMRES with Mixed Precision |
Lindquist-Improving-the-Performance-of-GMRES-with-Mixed-Precision-01-31-2020.pdf |
|
2020/01/29 |
Joan Snoderly |
ICL |
ICL Performance Evaluation Process |
Snoderly-ICL-Performance-Evaluation-Process-01-29-2020.pdf |
|
2020/01/24 |
Jeff Larkin |
NVIDIA |
To OpenACC 3.0 and Beyond! |
Larkin-Open-ACC3-01-24-2020.pdf |
|
2020/01/17 |
George Bosilca |
ICL |
Towards Portable Online Prediction of Network Utilization using MPI-level Monitoring |
Bosilca-Towards-Portable-Online-Prediction-of-Network-Utilization-using-MPI-level-Monitoring-01-17-2020.pdf |
|
2020/01/10 |
Hidehiko Hasegawa |
University of Tsukuba |
Predicting the Convergence of BiCG Method from Grayscale Matrix Images |
Hasegawa-Predicting-the-Convergence-of-BiCG-Method-from-Grayscale-Matrix-Images-01-10-2020.pdf |
|
2019/12/13 |
Piotr Luszczek |
ICL |
Building Community through xSDK Software Policies |
Luszczek-xSDK-Community-Policies-12-13-2019.pdf |
|
2019/12/06 |
Damien Genet |
ICL |
PAPI Updates on Power9 |
Genet-PAPI-Updates-on-Power9-12-06-2019.pdf |
|
2019/11/15 |
Anthony Danalis |
ICL |
Questions about hardware events? CAT has the answers. |
|
|
2019/11/08 |
Sticks Mabakane |
ICL |
Effective Callgraph Visualisations for Optimisation of Parallel-Programs |
Mabakane-Effective-Callgraph-Visualisations-for-Optimisation-of-Parallel-Programs_a-Design-Study.pdf |
|
2019/11/01 |
David Eberius |
ICL |
A Flexible MPI Benchmark For Fast Assessment of Multithreaded Communication Performance |
Eberius-Multirate-A-Flexible-MPI-Benchmark-For-Fast-Assessment-of-Multithreaded-Communication-Performance-11-01-2019.pdf |
|
2019/10/25 |
Yaohung Tsai |
ICL |
Autotuning in Deep Learning Kernels |
Tsai-Autotunning-Deep-Learning-10-25-2019.pdf |
|
2019/10/18 |
Alan Ayala |
ICL |
heFFTe: Highly Efficient FFT for Exascale |
Ayala-heFFTe-10-18-2019.pdf |
|
2019/10/11 |
Axel Huebl |
Lawrence Berkeley National Laboratory |
Scalable, Performance-Portable Particle-in-Cell Simulations and PByte-Scale Data-Challenges |
|
|
2019/10/04 |
Yves Robert |
ENS-Lyon |
Scheduling Independent Stochastic Tasks on Heterogeneous Cloud Platforms |
Robert-Scheduling-Independent-Stochastic-Tasks-on-Heterogeneous-Cloud-Platforms-10-04-2019.pdf |
|
2019/09/27 |
Srinivas Aluru |
Georgia Tech |
Parallel Machine Learning Approaches for Reverse Engineering Genome-Scale Networks |
Aluru-Parallel-Machine-Learning-Approaches-for-Reverse-Engineering-Genome-Scale-Networks-09-27-2019.pdf |
|
2019/09/20 |
Oscar Hernandez |
ORNL |
Filling in the Gaps between Applications and the OpenMP Specification for Exascale |
|
|
2019/09/13 |
Nuria Losada |
ICL |
Asynchronous Receiver-Driven Replay for Local Rollback of MPI Applications |
Losada-Asynchronous-Receiver-Driven-Replay-for-Local-Rollback-of-MPI-Applications-09-13-2019.pdf |
|
2019/09/06 |
Asim YarKhan |
ICL |
Linear Systems Solvers for Distributed-Memory Machines with GPU Accelerators |
YarKhan-SLATE-Linear-Systems-09-06-2019.pdf |