|
2014/04/25 |
George Bosilca |
ICL |
Toward composite fault management strategies: a quantitative evaluation |
Bosilca-Assessing-the-Impact-of-ABFT-Checkpoint-Composite-Strategies-2014-04-25.pdf |
|
2014/04/10 |
Dorian Arnold |
UNM |
A Simulation-based Framework for Evaluating Resilience Strategies at Scale |
|
|
2014/04/04 |
Jakub Kurzak |
ICL |
Some Techniques for Optimizing CUDA More |
Kurzak-Some-Techniques-for-Optimizing-CUDA-More-2014-04-04.pdf |
|
2014/03/28 |
Hartwig Anzt |
ICL |
Optimizing Krylov Subspace Solvers on Graphics Processing Units |
Anzt-Optimizing-Krylov-Subspace-Solvers-on-GPUs-2014-03-28.pdf |
|
2014/03/21 |
Mark Gates |
ICL |
Accelerating eigenvector computation |
Gates-Accelerating-Computation-of-Eigenvectors-2014-03-21.pdf |
|
2014/03/13 |
Atsushi Hori |
RIKEN |
A New Process/Thread Model for Many-core Era |
|
|
2014/03/07 |
Mathieu Faverge |
INRIA |
Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes |
Faverge-Sparse-Linear-Algebra-over-DAG-Runtimes-2014-03-07.pdf |
|
2014/02/28 |
Samuel Thibault |
INRIA |
StarPU: Task Graphs from Heterogeneous Platforms to Clusters Thereof |
Thibault-StarPU-Task-Graphs-from-Heterogeneous-Platforms-to-Clusters-Thereof-2014-02-28.pdf |
|
2014/02/21 |
Simplice Donfack |
ICL |
Improving multicore capabilities in hybrid CPUs/GPUs applications (Case of MAGMA) |
Donfack-Improving-Multicore-Capabilities-in-Hybrid-CPUs-GPUs-Applications-2014-02-21.pdf |
|
2014/02/14 |
Yves Robert |
ICL |
Scheduling Data Sensor Retrieval for Boolean Tree Query Processing |
|
|
2014/02/07 |
Aurelien Bouteiller |
ICL |
Fault Tolerant MPI |
Bouteiller-Fault-Tolerant-MPI-2014-02-07.pdf |
|
2014/01/31 |
Thomas Herault |
ICL |
Assessing the Impact of ABFT and Checkpointing Composite Strategies |
Herault-ABFT-Periodic-Checkpointing-2014-01-31.pdf |
|
2014/01/24 |
Blake Haugen |
ICL |
Latent Semantic Analysis |
Haugen-Latent-Semantic-Analysis-2014-01-24.pdf |
|
2014/01/17 |
Kirk Cameron |
Virginia Tech |
Power and Energy Whack-a-mole in HPC |
|
|
2014/01/10 |
Julien Herrmann |
|
Designing LU-QR hybrid solvers for performance and stability |
Herrmann-LUQR-2014-01-10.pdf |
|
2013/12/20 |
Matt Johnson |
Garmin |
An Overview of Garmin Flight Control Systems |
|
|
2013/12/13 |
Gabriel Marin |
ICL |
Performance Diagnosis Using MIAMI: A Few Motivating Case Studies |
Marin-Performance-Diagnosis-Using-MIAMI-2013-12-13.pdf |
|
2013/12/06 |
Anthony Danalis |
ICL |
Does the cache matter? When? Why? and How Much? |
|
|
2013/11/15 |
Jakub Kurzak |
ICL |
BEAST: A Transparent Autotuner for Accelerators |
Kurzak-BEAST-2013-11-15.pdf |
|
2013/11/08 |
Jean-Charles Vasnier |
ORNL |
One OpenCL To Rule Them All |
Vasnier-One-OpenCL-To-Rule-Them-All-2013-11-08.pdf |
|
2013/11/01 |
Azzam Haidar |
ICL |
Unified Development for Mixed Multi-GPU and Multi-Coprocessor Environments using a Lightweight Runtime Environment |
Haidar-Unified-Development-for-Mixed-Multi-GPU-and-Multi-Coprocessor-Environments-2013-11-01.pdf |
|
2013/10/25 |
Mark Dean |
EECS |
Technology Trends and Innovation Opportunities |
Dean-Beyond-von-Neumann-2013-10-25.pdf |
|
2013/10/18 |
Rainer Keller |
University of Stuttgart |
Tools for High-Performance Computing |
Keller-Tools-for-High-Performance-Computing-2013-10-18.pdf |
|
2013/10/11 |
Alisa Meador |
UTK Center for International Education |
International SOS |
|
|
2013/10/04 |
Aurelien Bouteiller |
ICL |
Multi-criteria Checkpointing Strategies: Response-time vs Resource Utilization |
Bouteiller-Multi-criteria-Checkpointing-Strategies-2013-10-04.pdf |
|
2013/09/27 |
Dan Terpstra |
ICL |
Small Scale Water Treatment in Developing Countries |
Terpstra-Living-Waters-for-the-World-2013-09-28.pdf |
|
2013/09/20 |
Piotr Luszczek |
ICL |
Energy and Power Consumption Trends |
Luszczek-Energy-Power-Trends-HPC.pdf |
|
2013/09/13 |
Jakub Kurzak |
ICL |
Parallel Ultra Light Systolic Array Runtime |
Kurzak-PULSAR-2013-09-13.pdf |
|
2013/09/06 |
Yves Robert |
ICL |
On the Combination of Silent Error Detection and Checkpointing |
Robert-Silent-Error-Detection-and-Checkpointing-2013-09-06.pdf |
|
2013/08/30 |
Jeff Larkin |
NVIDIA |
OpenACC 2.0 Highlights |
Larkin-OpenACC-Introduction-2013-08-30.pdf |
|
2013/08/23 |
Michela Taufer |
University of Delaware |
On the effectiveness of application-aware self-management for scientific discovery in volunteer computing systems |
Taufer-On-the-effectiveness-of-application-aware-volunteer-computing-2013-08-23.pdf |
|
2013/07/19 |
Anthony Danalis |
ICL |
Creating a new operation with DPLASMA: a step by step guide |
Danalis-Creating-a-new-operation-with-DPLASMA-2013-07-19.pdf |
|
2013/06/28 |
Haihang You |
NICS |
Optimizing utilization across XSEDE resources |
|
|
2013/06/21 |
Hartwig Anzt |
ICL |
Energy Efficiency on Emerging Hardware |
Anzt-Energy-Efficiency-on-Emerging-Hardware-2013-06-21.pdf |
|
2013/06/14 |
Dong Li |
ORNL |
Toward Reliable and Power Efficient Exascale Systems |
Li-Toward-Reliable-and-Power-Efficient-Exascale-Systems-2013-06-14.pdf |
|
2013/06/07 |
Vincent C. Betro |
NICS |
Performance of the fusion code GYRO on four generations of Cray Computers |
Betro-Performance-of-the-fusion-code-GYRO-2013-06-07.pdf |
|
2013/05/31 |
Volodymyr Turchenko |
ICL |
Batch Pattern Parallelization Scheme of NNs on Many-core Architectures |
Turchenko-Batch-Pattern-Parallelization-Scheme-of-Neural-Networks-2013-05-31.pdf |
|
2013/05/24 |
Piotr Luszczek |
ICL |
Competitive Proposal Writing |
Luszczek-Competitive-Proposal-Writing-2013-05-24.pdf |
|
2013/05/17 |
Julien Herrmann |
ENS |
Tree traversals with task-memory affinities on hybrid platforms |
Herrmann-Tree-traversals-with-task-memory-affinities-2013-05-17.pdf |
|
2013/05/10 |
Yves Robert |
ICL |
Energy-efficient scheduling |
Robert-Energy-efficient-scheduling-2013-05-10.pdf |
|
2013/05/03 |
Aurelien Bouteiller |
ICL |
Making DPLASMA with PaRSEC, the Cookbook |
Bouteiller-Parsec-2013-05-03.pdf |
|
2013/04/26 |
Bhanu Rekepalli |
NICS |
Solving Life Sciences Data Deluge Problems using JICS Resources |
|
|
2013/04/19 |
Blake Haugen |
ICL |
Simulation Using Dynamic Schedulers |
Haugen-Simulation-Using-Dynamic-Schedulers-13-04-19.pdf |
|
2013/04/12 |
Erlin Yao |
ICL |
Algorithm-Based Fault Tolerance (ABFT) for Dense Linear Algebra |
|
|
2013/04/05 |
Yves Robert |
ICL |
Revisiting the double checkpointing algorithm (a.k.a. the buddy algorithm) |
Robert-Buddy-Algorithm-2013-04-05.pdf |
|
2013/03/28 |
Julien Langou |
University of Colorado at Denver |
Greedy Trees for MPI Reductions |
Langou-Greedy-Trees-for-MPI-Reductions-2013-03-28.pdf |
|
2013/03/22 |
Patrick Worley |
ORNL |
Capturing Computer Performance Variability in Production Jobs |
Worley-PerfVar-UTK_2013-03-22.pdf |
|
2013/03/15 |
Simplice Donfack |
ICL |
Dynamically balanced synchronization-avoiding LU factorization with multicore and GPUs |
Simplice-Dynamically-balanced-synchronization-avoiding-LU-2013-03-15.pdf |
|
2013/03/08 |
Ichitaro Yamazaki |
ICL |
Low-rank approximation on a GPU and its integration into an HSS solver |
Yamazaki-Low-rank-approximation-on-a-GPU-2013-03-08.pdf |
|
2013/03/01 |
Tracy Rafferty |
ICL |
Effort Certifications |
Rafferty-NSF-Changes-2013-01-04.pdf |
|
2013/02/22 |
Amy Szczepanski |
EECS |
The Art of Making Mistakes on Purpose |
Szczepanski-The-Art-of-Making-Mistakes-on-Purpose-2013-02-22.pdf |
|
2013/02/15 |
Mark Gates |
ICL |
QR with column pivoting in PLASMA |
Gates-QR-with-column-pivoting-in-PLASMA-2013-02-15.pdf |
|
2013/02/08 |
James Plank |
EECS |
Two Talks at Usenix FAST Next Week |
Plank-FAST-2013-02-08.pdf |
|
2013/02/01 |
Khairul Kabir |
ICL |
MAGMA MIC - Linear Algebra Library for Intel Xeon Phi Coprocessors |
|
|
2013/01/18 |
Christal Yost |
NICS |
Science Communication |
Yost-Science-Communication-2013-01-18.pdf |
|
2013/01/11 |
Thomas Herault |
ICL |
Integration of ABFT ScaLAPACK and SciDB using the Checkpoint on Failure Protocol |
Herault-Integration-of-ABFT-ScaLAPACK-and-SciDB -2013-01-11.pdf |
|
2013/01/04 |
Tracy Rafferty |
ICL |
NSF Changes |
Rafferty-NSF-Changes-2013-01-04.pdf |
|
2012/12/14 |
Peng Du, Matt Johnson, Teng Ma, Asim YarKhan |
ICL |
ICL Graduates |
|
|
2012/12/07 |
Ichitaro Yamazaki |
ICL |
Using sparse and dense direct solvers to solve coupled linear systems |
|
|
2012/11/30 |
Azzam Haidar |
ICL |
MAGMA: toward fast Eigensolver |
Haidar-MAGMA-toward-fast-Eigensolver-2012-11-30.pdf |
|
2012/11/09 |
George Bosilca |
ICL |
Communication patterns and their integration at different levels of the software stack |
Bosilca-Taking-advantage-of-communication-patterns-2012-11-09.pdf |
|
2012/11/02 |
Mitch Horton |
GATech |
Domain Science at Scale on Keeneland |
Horton-BEAST-BEAGLE-Phylogenetics-Software-2012-11-02.pdf |
|
2012/10/26 |
Gabriel Marin |
ORNL |
How fast should my application run? |
Marin-How-Fast-Should-My-Application-Run-2012-10-26.pdf |
|
2012/10/19 |
Nicholas Nagle |
Department of Geography |
Machine learning perspectives for Sample Surveys and Small Area Estimation |
Nagle-Machine-learning-perspectives-for-survey-weighting-2012-10-19.pdf |
|
2012/10/12 |
Yves Robert |
ICL |
Impact of fault prediction on checkpointing strategies |
Robert-On-the-impact-of-fault-prediction-on-checkpointing-2012-10-12.pdf |
|
2012/10/05 |
Piotr Luszczek |
ICL |
Anatomy of a Globally Recursive Embedded LINPACK Benchmark |
Luszczek-Anatomy-of-a-Globally-Recursive-Embedded-LINPACK-benchmark-2012-10-05.pdf |
|
2012/09/28 |
Sticks Mabakane |
University of Cape Town |
Designing a better visualization system for supercomputing users |
Mabakane-Designing-a-better-visualization-system-for-supercomputing-users-2012-09-26.pdf |
|
2012/09/21 |
Volodymyr Turchenko |
Research Institute of Intelligent Computer Systems, Ternopil National Economic University, Ukraine |
Parallelization of Neural Networks Training |
Turchenko-Parallelization-of-Neural-Networks-2010-09-21.pdf |
|
2012/09/14 |
Jakub Kurzak |
ICL |
Virtual Systolic Array for QR Decomposition |
Kurzak-Virtual-Systolic-Array-for-QR-Decomposition-2012-09-14.pdf |
|
2012/09/07 |
Tingxing "Tim" Dong |
ICL |
Accelerating the BLAST code with hybrid MPI + OpenMP + CUDA programming |
Dong-Accelerating-the-BLAST-code-2012-09-07.pdf |
|
2012/08/31 |
Yutaka Ishikawa and Mitsuhisa Sato |
Japan |
Updates on XcalableMP PGAS Language |
Sato-Updates-on-XcalableMP-PGAS-Language-2012-08-29.pdf |
|
2012/08/24 |
Azzam Haidar |
ICL |
Eigenproblem Algorithms |
Haidar-Eigenproblem-algorithms-2012-08-24.pdf |
|
2012/08/10 |
Ichitaro Yamazaki |
ICL |
Pivoting strategies for solving symmetric indefinite linear systems |
Yamazaki-Pivoting-strategies-for-a-symmetric-indefinite-matrix-2012-08-10.pdf |
|
2012/08/03 |
Stanimire Tomov |
ICL |
clMAGMA: Heterogeneous High-Performance Linear Algebra with OpenCL |
Tomov-clMAGMA-2012-08-03.pdf |
|
2012/07/27 |
Anthony Danalis |
ICL |
From Serial Loops to Parallel Execution on Distributed Systems |
Danalis-From-Serial-Loops-to-Parallel-Execution-on-Distributed-Systems-2012-07-27.pdf |
|
2012/07/20 |
Erlin Yao |
ICL |
A Case Study of Designing Efficient Algorithm-based Fault Tolerant Application for Exascale Parallelism |
Yao-Algorithm-based-Fault-Tolerant-Application-for-Exascale-Parallelism-2012-07-20.pdf |
|
2012/07/13 |
Aurelien Bouteiller |
ICL |
Fault Tolerance Methods: Challenges and Opportunities |
Bouteiller-Fault-Tolerance-Techniques-for-MPI-Programs-2012-07-13.pdf |
|
2012/07/06 |
Mathieu Faverge |
ICL |
Algorithms mixing LU and QR kernels for multi-core cluster systems |
Faverge-Mixing-LU-and-QR-on-multi-core-cluster-systems-2012-07-06.pdf |
|
2012/06/29 |
Shirley Moore |
ICL |
Shirley Moore Photo Gallery |
http://icl.cs.utk.edu/newsletter/2012/06/shirley-photo-gallery/ |
|
2012/06/22 |
Branislav Jansik |
Aarhus University |
DEC code implementation and parallelization |
Janisk-DEC-code-implementation-and-Parallelization-2012-06-22.pdf |
|
2012/06/15 |
Mark Gates |
ICL |
Multi-GPU Hessenberg reduction in MAGMA |
Gates-Multi-GPU-Hessenberg-reduction-in-MAGMA-2012-06-15.pdf |
|
2012/06/08 |
James Plank |
EECS |
Smoke, Mirrors and Erasure Codes |
Plank-Smoke-Mirrors-and-Erasure-Codes-2012-06-08.pdf |
|
2012/06/01 |
Kiran Kasichayanula |
ICL |
Power Aware Computing on GPUs |
Kasichayanula-Power-Aware-Computing-on-GPUs-2012-06-01.pdf |
|
2012/05/25 |
Omar Zenati |
ICL |
LU Decompositions over DAGuE |
Zenati-LU-Decompositions-over-DAGuE-2012-05-25.pdf |
|
2012/05/18 |
Lou Gross |
NIMBioS |
Computational Irreducibility and Prediction in Ecology |
Gross-Computational-Irreducibility-and-Prediction-in-Ecology-2012-05-18.pdf |
|
2012/05/11 |
Vince Weaver |
ICL |
Measuring Power and Energy with PAPI |
Weaver-Measuring-Energy-and-Power-with-PAPI-2012-05-11.pdf |
|
2012/05/04 |
Rick Weber |
EECS |
Approaching 5 Tflops in a $6000 box of commodity parts: xgemm on Southern Islands |
Weber-ICL-xgemm-2012-05-04.pdf |
|
2012/04/27 |
Yves Robert |
ENS Lyon and IUF / ICL |
A unified model and an assessment of checkpointing protocols at extreme scale |
Robert-Unified-Model-for-Assessing-Checkpointing-Protocols-at-Extreme-Scale-2012-04-27.pdf |
|
2012/04/20 |
Jakub Kurzak |
ICL |
Preliminary Results of Autotuning GEMM Kernels for the NVIDIA Kepler GPU |
Kurzak-Autotuning-GEMM-Kernels-for-the-NVIDIA-Kepler-GPU-2012-04-20.pdf |
|
2012/04/13 |
Heike Jagode |
ICL |
Performance Counter Monitoring for the Blue Gene/Q Architecture |
Jagode-Performance-Counter-Monitoring-for-the-Blue-GeneQ-Architecture-2012-04-13.pdf |
|
2012/03/30 |
Piotr Luszczek |
ICL |
Energize and power-up your solvers |
Luszczek-Energize-and-Power-up-Your-Solvers-2012-03-30.pdf |
|
2012/03/23 |
Dorian Arnold |
University of New Mexico |
Fault Tolerance at Extreme Scales |
Arnold-Fault-Tolerance-at-Extreme-Scales-2012-03-23.pdf |
|
2012/03/21 |
Guillaume Mercier |
INRIA Bordeaux Sud-Ouest - Institut Polytechnique de Bordeaux |
Bind, Reorder, Improve |
Mercier-Bind-Reorder-Improve-2012-03-21.pdf |
|
2012/03/16 |
Hidehiko Hasegawa |
University of Tsukuba |
High Performance Iterative Methods using Accurate Computations |
Hasegawa-High-Performance-Iterative-Methods-using-Accurate-Computations-2012-03-16.pdf |
|
2012/03/09 |
Wesley Bland |
ICL |
A Proposal for User-Level Failure Mitigation |
Bland-Process-Failure-Mitigation-in-the-MPI-3.0-Standard-2012-03-09.pdf |
|
2012/03/02 |
Shirley Moore |
ICL |
Forging Partnerships between Computer and Computational Science: The Road Ahead |
Moore-Forging-Partnerships-2012-03-02.pdf |
|
2012/02/24 |
Yulu Jia |
ICL |
Multi-GPU Implementation of LU Factorization |
Jia-Multi-GPU-Implementation-of-LU-Factorization-2012-02-24.pdf |
|
2012/02/17 |
Dulceneia Becker |
ICL |
Random Butterfly Transformation for Symmetric Systems |
Becker-Random-Butterfly-Transformations-for-Symmetric-Systems-2012-02-17.pdf |
|
2012/02/10 |
Laura Grigori |
INRIA |
Recent advances on communication avoiding dense factorizations |
Grigori-Recent-results-in-avoiding-communication-for-dense-factorizations-2012-02-10.pdf |
|
2012/02/03 |
Daichi Mukunoki |
University of Tsukuba |
Quadruple Precision BLAS Subroutines on GPUs |
Mukunoki-Quadruple-Precision-BLAS-Subroutines-on-GPUs-2012-02-03.pdf |