2019/03/29 |
Qinglei Cao |
ICL |
Understanding the Performance of PaRSEC in the "Task Bench" Benchmark |
|
2019/03/26 |
Torsten Hoefler |
ETH Zürich |
High-Performance Communication in Machine Learning |
Hoefler-High-Performance-Communication-in-Machine-Learning-03-26-2019.pdf |
2019/03/22 |
Jamie Finney |
ICL |
DevOps: Tools and Practices |
Finney-DevOps-Tools-and-Practices-03-22-2019.pdf |
2019/03/15 |
Danny Rorabaugh |
Global Computing Laboratory |
A Workflow for Soil Moisture Analytics |
Rorabaugh-A-Workflow-for-Soil-Moisture-Analytics-03-15-2019.pdf |
2019/03/08 |
Stephen Herbein |
Lawrence Livermore National Laboratory |
Flux: Using Next-Generation Resource Management and Scheduling Infrastructure for Exascale Workflows |
Herbein-Flux-Using-Next-Generation-Resource-Management-and-Scheduling-Infrastructure-for-Exascale-Workflows-03-08-2019.pdf |
2019/02/22 |
Stanimire Tomov |
ICL |
A Novel Approach to Using Fused Batched BLAS to Accelerate High-Order Discretizations |
Tomov-A-Novel-Approach-to-Using-Batched-BLAS-to-Accelerate-High-Order-Discretizations-CEED-02-22-2019.pdf |
2019/02/15 |
Yves Robert |
ENS-Lyon |
Scheduling Independent Stochastic Tasks under Deadline and Budget Constraints |
Robert-Scheduling-Independent-Stochastic-Tasks-02-15-2019.pdf |
2019/02/08 |
Aurelien Bouteiller |
ICL |
Extensions of ULFM for Easier Application Design |
Bouteiller-Extensions-of-ULFM-for-Easier-Application-Design-02-08-2019.pdf |
2019/02/01 |
Yu Pei |
ICL |
Preliminary Results: Integrating Future as Async Structure into PaRSEC |
Pei-Async-PaRSEC-02-01-2019.pdf |
2019/01/25 |
Dong Zhong |
ICL |
Runtime Level Failure Detection and Propagation in HPC Systems |
Zhong-Resilient-PRRTE-01-25-2019.pdf |
2019/01/11 |
Joseph Schuchart |
High Performance Computing Center, Stuttgart (HLRS) |
MPI-3 RMA in Practice: Interesting Issues and Where to Find Them |
|
2019/01/04 |
Jack Dongarra |
ICL |
ICL in the New Year |
|
2018/12/14 |
Heike Jagode |
ICL |
Software-Defined Events through PAPI |
2018/Jagode-SDEs-through-PAPI-12-14-2018.pdf |
2018/12/07 |
Jakub Kurzak |
ICL |
2018 SLATE Recap |
Kurzak-2018-SLATE-Recap-12-07-2018.pdf |
2018/11/30 |
John Levesque |
Cray |
Can We Get to an EXAFLOP in Sustained Performance and Still Be Performance Portable? |
|
2018/11/09 |
Damien Genet |
ICL |
Tensor Contraction on Distributed Hybrid Architectures using a Task-Based Runtime System |
Genet-TESSE-11-09-2018.pdf |
2018/11/02 |
Hartwig Anzt |
Karlsruhe Institute of Technology |
Convolutional Neural Networks for the Efficient Preconditioner Generation |
Anzt-Machine_Learning-11-2-2018.pdf |
2018/10/26 |
Michael Wyatt |
Global Computing Laboratory |
PRIONN: Predicting Runtime and IO using Neural Networks |
|
2018/10/19 |
Travis Johnston |
ORNL |
167-PetaFLOPs Deep Learning for Electron Microscopy: From Learning Physics to Atomic Manipulation |
|
2018/10/12 |
Pierre Blanchard |
University of Manchester |
Optimizing the Polar Decomposition for Modern Computer Architectures |
Blanchard-Optimizing-Polar-Decomposition-10-12-2018.pdf |
2018/10/05 |
Yuechao Lu |
Osaka University |
Randomized SVD and its Application |
|
2018/09/28 |
Mark Gates |
ICL |
The Singular Value Decomposition: Anatomy of Optimizing an Algorithm for Extreme Scale |
Gates-Singular-Value-Decomposition-09-28-2018.pdf |
2018/09/21 |
Jakub Sistek |
University of Manchester |
PLASMA INTERTWinING |
Sistek-PLASMA-INTERTWinING-09-21-2018.pdf |
2018/09/14 |
Xi Luo |
ICL |
ADAPT: An Event-Based Adaptive Collective Communication Framework |
Xi-Luo-ADAPT-09-14-2018.pdf |
2018/09/07 |
Terry Moore |
ICL |
Big Data and Extreme-scale Computing: Past, Present, and Future |
Moore-BDEC2-09-07-2018.pdf |
2018/08/31 |
Arm Patinyasakdikul |
ICL |
One-Sided MPI Implementation |
Arm-One-Sided-MPI-Implementation-08-31-2018.pdf |
2018/08/24 |
Tracy Rafferty |
ICL |
ICL Meeting Space |
Rafferty-Reserving-Space-8-24-2018.pdf |
2018/06/08 |
Lou Gross |
NIMBioS |
A Rational Basis for Hope: Human Behavior Modeling and Climate Change |
|
2018/06/01 |
Anne Benoit |
Georgia Tech |
Combining Checkpointing and Replication for Reliable Execution of Linear Workflows |
|
2018/05/25 |
Yaohung Tsai |
ICL |
Pseudo-Assembly Programming on NVIDIA GPU |
Tsai-Pseudo-Assembly-Programming-on-NVIDIA-GPU-05-25-2018.pdf |
2018/05/18 |
Pratik Nayak |
Karlsruhe Institute of Technology |
Using Iterative Methods for Local Solves in Asynchronous Schwarz Methods |
|
2018/05/11 |
Ana Gainaru |
Vanderbilt University |
Scheduling Solutions for Data-Driven Large-Scale Applications |
Gainaru-Scheduling-Solutions-05-11-2018.pdf |
2018/05/04 |
Ahmad Abdelfattah |
ICL |
MAGMA Update: High Performance and Energy Efficient LU Factorization |
Ahmad-MAGMA-05-04-2018.pdf |
2018/04/27 |
Yves Robert |
ENS-Lyon |
A Performance Model to Execute Workflows on High-Bandwidth Memory Architectures |
Robert-A-Performance-Model-to-Execute-Workflows-on-High-Bandwidth-Memory-Architectures-04-27-2018.pdf |
2018/04/20 |
Hartwig Anzt |
Karlsruhe Institute of Technology |
Variable-Size Batched Condition Number Calculation on GPUs |
Anzt-Variable-Size-04-20-2018.pdf |
2018/04/13 |
Jiali Li |
ICL |
PCP Component in PAPI |
Li-PCP-PAPI-04-13-2018.pdf |
2018/04/06 |
Scott Emrich |
EECS |
High Throughput Genome Analysis using Makeflow and Friends |
Emrich-Integrated-genome-analysis-using-Makeflow-friends-04-06-2018.pdf |
2018/03/09 |
Thananon Patinyasakdikul |
ICL |
Injection Rate in Multithreaded MPI |
|
2018/03/02 |
Anthony Danalis |
ICL |
Low-Level Benchmarking: A Dive down the Rabbit Hole |
|
2018/02/23 |
Sam Crawford |
ICL |
Interdisciplinary Graduate Minor in Computational Science |
Crawford-IGMCS-02-23-2018.pdf |
2018/02/16 |
Gerald Ragghianti |
ICL |
Local ECP resources, Spack, and Time Machine backups |
|
2018/02/09 |
Hartwig Anzt |
Karlsruhe Institute of Technology |
Ginkgo: Towards an Accelerator-Focused C++ Sparse Linear Algebra Library |
Anzt-Ginkgo-On-the-way-towards-an-accelerator-focused-C-plus-plus-sparse-linear-algebra-library-02-09-2018.pdf |
2018/02/02 |
Yves Robert |
ENS-Lyon |
Checkpointing Workflows for Fail-Stop Errors |
Robert-Checkpointing-workflows-for-fail-stop-errors-02-02-2018.pdf |
2018/01/26 |
David Eberius |
ICL |
Performance Counters in Open MPI Using MPI_T |
Eberius-Performance-Counters-in-Open-MPI-Using-MPI_T-01-26-2018.pdf |
2018/01/19 |
Keita Teranishi |
Sandia National Laboratories |
Portability and Scalability of Sparse Tensor Decompositions on CPU/MIC/GPU Architectures |
Teranishi-Portability-and-Scalability-of-Sparse-Tensor-Decompositions-on-CPU_MIC_GPU-Architectures-01-19-2018.pdf |
2018/01/11 |
Edgar Gabriel |
University of Houston |
File I/O in High Performance Computing and Big Data Environments |
|
2018/01/05 |
Azzam Haidar |
ICL |
Investigating Half Precision Arithmetic to Accelerate Dense Linear System Solvers |
|
2017/12/15 |
Reazul Hoque |
ICL |
Dynamic Task Discovery in PaRSEC: A Data-Flow Task-Based Runtime |
Hoque-Dynamic-Task-Discovery-in-PaRSECA-Data-Flow-Task-Based-Runtime-12-15-2017.pdf |
2017/12/08 |
Yves Robert |
ENS-Lyon |
Budget-Aware Scheduling Algorithms for Scientific Workflows with Stochastic Task Weights on IaaS Cloud Platforms |
Robert-Budget-Aware-Scheduling-Algorithms-for-Scientific-Workflows-with-Stochastic-Task-Weights-on-IaaS-Cloud-Platforms-12-08-2017.pdf |
2017/12/01 |
Matthew Bachstein |
ICL |
BONSAI: Benchtesting OpeN Software Autotuning Infrastructure |
Bachstein-BONSAI-12-01-2017.pdf |
2017/11/08 |
Leighanne Sisk and Teresa Finchum |
ICL |
Traveling and You: A Primer on Getting To and Fro |
|
2017/11/03 |
Yves Robert |
ENS-Lyon |
A Few Recent Results and Open Problems for Resilience at Scale |
|
2017/10/27 |
Hanumantharayappa |
ICL |
PAPI Counter Inspection Tool (CIT) |
|
2017/10/20 |
Piotr Luszczek |
ICL |
MAGMA at OLCF Hackathons |
Luszczek-MAGMA-OLCF-Hackathon-10-20-2017.pdf |
2017/10/06 |
Thananon Patinyasakdikul |
ICL |
1337 c0d3 |
Arm-1337c0d3-10-06-2017.pdf |
2017/09/29 |
Joseph Schuchart |
High Performance Computing Center Stuttgart |
Global Task Data Dependencies in PGAS Applications |
Schuchart-Global-Task-Data-Dependencies-in-PGAS-Applications-09-29-2017.pdf |
2017/09/22 |
Charles Phillips |
EECS |
Advances in Graph-Theoretical Algorithms, Machine Learning and Big Data Applications |
Phillips-Advances-in-Graph-Theoretical-Algorithms-09-22-2017.pdf |
2017/09/15 |
Dmitry Zaitsev |
|
Clans of Linear Systems |
Zaitsev-Clans-of-Linear-Systems-09-15-2017.pdf |
2017/09/08 |
Jeff Larkin |
NVIDIA |
NVIDIA Updates: V100 + CUDA 9 |
Larkin-NVIDIA-Update-V100-CUDA-9-2017-09-01.pdf |
2017/09/01 |
Hartwig Anzt |
Karlsruhe Institute of Technology |
Adaptive Precision in Block-Jacobi Preconditioning for Iterative Sparse Linear System Solvers |
Anzt-Adaptive-Precision-in-Block-Jacobi-Preconditioning-for-Iterative-Sparse-Linear-System-Solvers-2017-09-01.pdf |
2017/08/25 |
Terry Moore |
ICL |
ASG and TSG: Strengths, Weaknesses, Opportunities, and Threats |
|
2017/07/28 |
Sushil Prasad |
National Science Foundation |
Parallel Processing over Spatial-Temporal Datasets from Geo, Bio, Climate, and Social Science Communities: A Research Roadmap |
Prasad-BigData-Visionary-07-28-2017.pdf |
2017/06/30 |
Saeid Nooshabadi |
Michigan Technological University |
Algorithm and Architecture for Density Estimation for Data Intensive Application in High Dimension |
Nooshabadi-Estimation-06-40-2017.pdf |
2017/05/26 |
Markus Eisenbach |
ORNL |
A Linear Scaling Multiple Scattering Code for First Principles Calculations of the Ground State and Statistical Physics of Materials |
markus-eisenbach-slides-05-26-2017.pdf |
2017/05/19 |
Chris Davis and Sophie Voisin |
ORNL |
Combining NVIDIA Docker and Databases to Enhance Agile Development and Optimize Resource Allocation |
Chris_Davis-Sophie_Voisin-slides-05-19-2017.pdf |
2017/05/05 |
Lynne Parker |
EECS |
Some Lessons Learned from My 2-Year Stint at NSF: The Abbreviated Version |
Parker-NSF-Lessons-Learned-May-2017.pdf |
2017/04/28 |
Azzam Haidar |
|
Toward Distributed Eigenvalue and Singular Value Solver Using Data Flow System |
azzam-haidar-slides-04-28-2017.pdf |
2017/04/21 |
Sangamesh Ragate |
|
Sequence Labeling Using RNN |
sangamesh-ragate-slides-04-21-2017.pdf |
2017/04/07 |
Benjamin Hernandez |
ORNL |
GPU Refinement Learning for Crowd Simulations |
benjamin-hernandez-slides-04-07-2017.pdf |
2017/03/31 |
Piotr Luszczek |
|
Implementation Techniques for Point Set Registration on Multicore and Heterogeneous Hardware |
piotr-luszczek-slides-03-31-2017.pdf |
2017/03/17 |
Stephen Wood |
|
Multilevel Parallelism for CFD: Preconditioning via Parallel ILU Factorization |
stephen-wood-slides-03-17-2017.pdf |
2017/03/10 |
Nuria Losada |
|
Stop&Restart vs Resilient MPI applications |
nuria-losada-slides-03-10-2017.pdf |
2017/02/17 |
Arvind Ramanathan |
ORNL |
Computer to Cure Cancer: Building Exascale Deep Learning Tools for Effective Cancer Surveillance |
arvind-ramanathan-slides-02-17-2017.pdf |
2017/02/10 |
Scott Klasky |
ORNL |
Creating ADIOS-2 for scientific exascale data |
scott-klasky-slides-02-10-2017.pdf |
2017/02/03 |
George Ostrouchov |
ORNL |
Bringing Modern Statistical Science to HPC |
george-ostrouchov-slides-02-03-2017.pdf |
2017/01/27 |
Ryan Glasby |
JICS |
Results from HPCMP CREATE™-AV Kestrel Component COFFE for 3-D Aircraft Configurations with Higher-Order Meshes Generated by Pointwise, Inc. |
ryan-glasby-slides-01-27-2017.pdf |
2017/01/20 |
Yves Robert |
|
Bidiagonalization with Parallel Tiled Algorithms |
yves-robert-slides-01-20-2017.pdf |
2017/01/13 |
David Eberius |
|
Profiling PaRSEC and Using Software Events in PAPI |
david-eberius-slides-01-13-2017.pdf |
2017/01/06 |
Kwai Wong |
JICS |
An Interoperable Workflow Platform for Multidisciplinary Simulations—openDIEL |
kwai-wong-slides-01-06-2017.pdf |
2016/12/16 |
Chongxiao Cao |
|
|
|
2016/12/09 |
Wei Wu |
|
Topology-aware Collective of CUDA-aware Open MPI |
|
2016/12/02 |
Stephen Richmond |
|
UCX as Communication Backend for PaRSEC |
|
2016/11/11 |
Thananon Patinyasakdikul |
ICL |
Multithreaded MPI |
arm-slides-11-11-2016.pdf |
2016/11/04 |
Reazul Hoque |
|
Dynamic Task Discovery in PaRSEC |
reazul-hoque-slides-11-4-2016.pdf |
2016/10/28 |
Frank Winkler |
ORNL |
Performance Analysis at Scale: The Score-P Tools Infrastructure |
frank-winkler-slides-10-28-2016.pdf |
2016/10/21 |
Harry Hughes |
ICL |
A Simulation-based System to Optimize Tile Size Parameters in PLASMA |
harry-hughes-slides-10-21-2016.pdf |
2016/10/14 |
Yves Robert |
INRIA |
Failure Detection and Propagation in HPC Systems |
yves-robert-slides-10-14-2016.pdf |
2016/10/07 |
Piotr Luszczek |
|
What Deep Learning?!? |
piotr-luszczek-slides-10-07-2016.pdf |
2016/09/30 |
Azzam Haidar |
|
A Note on the Power and Performance Analysis of Dense Linear Algebra on Intel Xeon Phi Processors |
|
2016/09/22 |
Phil Vaccaro |
|
PAPI Component: Powercap |
philip-vaccaro-slides-09-22-2016.pdf |
2016/09/16 |
Dmitry Lyakh |
ORNL |
Dense/sparse numeric tensor algebra: Scalable, hardware-agnostic design for performance portability |
dmitry-lyakh-slides-09-16-2016.pdf |
2016/09/09 |
Yves Robert |
INRIA |
Computing the expected longest path of task graphs in the presence of silent errors |
yves-robert-slides-09-09-2016.pdf |
2016/08/05 |
Joe Dorris |
ICL |
Patent Data Visualization and Processing |
|
2016/07/22 |
Myungho Lee |
Soongsil University |
Memory-Efficient Parallelization of 3D Lattice Boltzmann Flow Solver on a GPU |
|
2016/07/01 |
Emmanuel Jeannot |
INRIA |
Topology-Aware Data Management |
Jeannot-Topology-Aware-Data-Management-07-01-16.pdf |
2016/06/17 |
Julien Langou |
University of Colorado |
A Makespan Lower Bound for the Scheduling of the Tiled Cholesky Factorization based on ALAP scheduling |
Langou-A-Makespan-Lower-Bound-for-the-Scheduling-of-the-Tiled-Cholesky-Factorization-based-on-ALAP-scheduling-06-17-16.pdf |
2016/06/10 |
Emmanuel Agullo |
INRIA |
Overview of Task-based Sparse and Data-sparse Solvers on Top of Runtime Systems |
|
2016/05/27 |
Azzam Haidar |
ICL |
Heterogeneous Computation: The Current Challenge |
|
2016/05/20 |
Iain Duff |
Numerical Analysis Group, Scientific Computing Department, Science and Technology Facilities Council (UK) |
Scalability of Sparse Direct Codes |
Duff-Scalability-of-Sparse-Direct-Codes-05-20-16.pdf |
2016/05/13 |
George Bosilca |
ICL |
PaRSEC - Yet another runtime? |
Bosilca-PaRSEC-Yet-Another-Runtime-05-13-16.pdf |