Publications

2011

2012

Du, P., R. Weber, P. Luszczek, S. Tomov, G. D. Peterson, and J. Dongarra, “From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming,” Parallel Computing, vol. 38, no. 8, pp. 391-407, August 2012.

(1.64 MB)

Anzt, H., P. Luszczek, J. Dongarra, and V. Heuveline, “GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement,” EuroPar 2012 (also LAWN 260), Rhodes Island, Greece, August 2012.

(662.98 KB)

Dongarra, J., M. Gates, Y. Jia, K. Kabir, P. Luszczek, and S. Tomov, MAGMA MIC: Linear Algebra Library for Intel Xeon Phi Coprocessors , Salt Lake City, UT, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC12), November 2012.

(6.4 MB)

Weaver, V. M., M. Johnson, K. Kasichayanula, J. Ralph, P. Luszczek, D. Terpstra, and S. Moore, “Measuring Energy and Power with PAPI,” International Workshop on Power-Aware Systems and Architectures, Pittsburgh, PA, September 2012. DOI: 10.1109/ICPPW.2012.39

(146.79 KB)

Kasichayanula, K., D. Terpstra, P. Luszczek, S. Tomov, S. Moore, and G. D. Peterson, “Power Aware Computing on GPUs,” SAAHPC '12 (Best Paper Award), Argonne, IL, July 2012.

(658.06 KB)

Kurzak, J., P. Luszczek, M. Faverge, and J. Dongarra, “Programming the LU Factorization for a Multicore System with Accelerators,” Proceedings of VECPAR’12, Kobe, Japan, April 2012.

(414.33 KB)

2013

Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, P. Luszczek, and J. Dongarra, “Dense Linear Algebra on Distributed Heterogeneous Hardware with a Symbolic DAG Approach,” Scalable Computing and Communications: Theory and Practice: John Wiley & Sons, pp. 699-735, March 2013.

(1.01 MB)

Aupy, G., M. Faverge, Y. Robert, J. Kurzak, P. Luszczek, and J. Dongarra, “Implementing a systolic algorithm for QR factorization on multicore clusters with PaRSEC,” Lawn 277, no. UT-CS-13-709, May 2013.

(298.63 KB)

Haidar, A., P. Luszczek, J. Kurzak, and J. Dongarra, “An Improved Parallel Singular Value Algorithm and Its Implementation for Multicore Hardware,” University of Tennessee Computer Science Technical Report (also LAWN 283), no. ut-eecs-13-720: University of Tennessee, October 2013.

(1.23 MB)

Kurzak, J., P. Luszczek, and J. Dongarra, “LU Factorization with Partial Pivoting for a Multicore System with Accelerators,” IEEE Transactions on Parallel and Distributed Computing, vol. 24, issue 8, pp. 1613-1621, August 2013. DOI: http://doi.ieeecomputersociety.org/10.1109/TPDS.2012.242

(1.08 MB)

Kurzak, J., P. Luszczek, A. YarKhan, M. Faverge, J. Langou, H. Bouwmeester, and J. Dongarra, “Multithreading in the PLASMA Library,” Multi and Many-Core Processing: Architecture, Programming, Algorithms, & Applications: Taylor & Francis, 00 2013.

(536.28 KB)

Jia, Y., P. Luszczek, and J. Dongarra, “Transient Error Resilient Hessenberg Reduction on GPU-based Hybrid Architectures,” UT-CS-13-712: University of Tennessee Computer Science Technical Report, June 2013.

(206.42 KB)

2014

Dongarra, J., M. Faverge, H. Ltaeif, and P. Luszczek, “Achieving numerical accuracy and high performance using recursive tile LU factorization with partial pivoting,” Concurrency and Computation: Practice and Experience, vol. 26, issue 7, pp. 1408-1431, May 2014. DOI: 10.1002/cpe.3110

(1.96 MB)

Main menu

Pages