piotr luszczek :: publications

2016

Task based Cholesky decomposition on Xeon Phi architectures using OpenMP Joseph Dorris, Asim YarKhan, Jakub Kurzak, Piotr Luszczek and Jack Dongarra, International Journal of Computational Science and Engineering Vol. 17, No. 3, DOI 10.1504/IJCSE.2018.095851 [paper]

2015

Performance of Random Sampling for Computing Low-rank Approximations of a Dense Matrix on GPUs, Théo Mary, Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Jack Dongarra, Proceedings of SC15, Austin, TX, USA, November 15-20, 2015.
[paper, slides]

2014

Unified Development for Mixed Multi-GPU and Multi-Coprocessor Environments using a Lightweight Runtime Environment, Azzam Haidar, Chongxiao Cao, Asim YarKhan, Piotr Luszczek, Stanimire Tomov, Khairul Kabir, Jack Dongarra, Proceedings of the 28th IEEE International Parallel & Distributed Processing Symposium, May 19-23, 2014, Arizona Grand Resort, Phoenix, Arizona, USA
[PDF]

2013

An Improved Parallel Singular Value Algorithm and Its Implementation for Multicore Hardware, Azzam Haidar, Piotr Luszczek, Jakub Kurzak, Proceedings of SC13, November 17-21, 2013, Denver, CO, USA; 10.1145/2503210.2503292 University of Tennessee Technical Report
[PDF]
BlackjackBench: Portable Hardware Characterization with Automated Results' Analysis, Anthony Danalis, Piotr Luszczek, Gabriel Marin, Jeffrey S. Vetter, Jack Dongarra, The Computer Journal, first published online June 28, 2013. DOI: 10.1093/comjnl/bxt057
[PDF]

2012

Energy Footprint of Advanced Dense Numerical Linear Algebra using Tile Algorithms on Multicore Architecture, Jack Dongarra, Hatem Ltaief, Piotr Luszczek, and Vince M. Weaver, submitted to The 2nd International Conference on Cloud and Green Computing (CGC 2012), November 1-3, 2012, Xiangtan, Hunan, China.
[PDF]
BlackjackBench: portable hardware characterization Danalis, Anthony and Luszczek, Piotr and Marin, Gabriel and Vetter, Jeffrey S. and Dongarra, Jack, SIGMETRICS Perform. Eval. Rev. 40(2), pp 74-79, 2012. ISSN 0163-5999. DOI: 10.1145/2381056.2381074
[PDF]
Anatomy of a Globally Recursive Embedded LINPACK Benchmark Piotr Luszczek and Jack Dongarra, accepted in 2012 IEEE High Performance Extreme Computing Conference (HPEC 2012), Westin Hotel, Waltham, Massachusetts, September 10-12, 2012. IEEE Catalog Number: CFP12HPE-CDR, ISBN: 978-1-4673-1574-6.
[PDF | slides]
From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming Peng Du, Rick Weber, Piotr Luszczek, Stanimire Tomov, Gregory Peterson, Jack Dongarra Parallel Computing, Vol. 38 No. 8, pages 391-407, August 2012. 10.1016/j.parco.2011.10.002, ISSN: 0167-8191
[PDF | publisher page ]
Programming the LU Factorization for a Multicore System with Accelerators Jakub Kurzak, Piotr Luszczek, Mathieu Faverge, and Jack Dongarra, In Proceedings of VECPAR 2012 Kobe, Japan, July 17-20, 2012. Springer LNCS ????
[PDF]
Recent Advances in Dense Matrix Computations for Two-Sided Reduction Algorithms, Hatem Ltaief, Azzam Haidar, Piotr Luszczek, and Jack Dongarra, Presentation at 7th International Workshop on Parallel Matrix Algorithms and Applications (PMAA 2012), Birkbeck University of London, UK, 28-30 June 2012.
[PDF]
High Performance Dense Linear System Solver with Resilience to Multiple Soft Errors, Peng Du, Piotr Luszczek, and Jack Dongarra, International Conference on Computational Science, ICCS 2012, Omaha NE.
[PDF]

2011

Exploiting Fine-Grain Parallelism in Recursive LU Factorization Jack Dongarra, Mathieu Faverge, Hatem Ltaief, and Piotr Luszczek In Proceedings of ParCo 2011 30 August - 2 September 2011, Ghent, Belgium
[extended abstract PDF]
Profiling High Performance Dense Linear Algebra Algorithms on Multicore Architectures for Power and Energy Efficiency Hatem Ltaief, Piotr Luszczek, Jack Dongarra In Proceedings International Conference on Energy-Aware High Performance Computing, September 07-09, 2011, Hamburg, Germany (University of Tennessee Technical Report ut-cs-11-674, LAWN 251)
[PDF]
High Performance Bidiagonal Reduction using Tile Algorithms on Homogeneous Multicore Architectures Hatem Ltaief, Piotr Luszczek, and Jack Dongarra UTK CS Technical Report ut-cs-11-673. Submitted to ACM TOMS.
[PDF]
Two-Stage Tridiagonal Reduction for Dense Symmetric Matrices using Tile Algorithms on Multicore Architectures Piotr Luszczek, Hatem Ltaief, and Jack Dongarra Proceedings of IPDPS 2011 Anchorage, Alaska, May 16-20 2011.
[PDF]

2010

Mixed-Tool Performance Analysis on Hybrid Multicore Architectures Peng Du, Piotr Luszczek, Stanimire Tomov, Jack Dongarra ICPPW '10: Proceedings of the 2010 39th International Conference on Parallel Processing Workshops Publisher: IEEE Computer Society, San Diego, California, September 13-16, 2010
[PDF]
From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming Peng Du, Rick Weber, Piotr Luszczek, Stanimire Tomov, Gregory Peterson, Jack Dongarra Technical Report UT-CS-10-656, University of Tennessee, Computer Science Department, LAPACK Working Note 228
[PDF]
Reducing the time to tune parallel dense linear algebra routines with partial execution and performance modelling Jack Dongarra and Piotr Luszczek, Proceedings of SC10, New Orleans, Louisiana, USA, November 13-19, 2010; Also: Technical Report UT-CS-10-661, University of Tennessee, Computer Science Department
[PDF | PDF poster]
Analysis of Various Scalar, Vector, and Parallel Implementations of RandomAccess Piotr Luszczek and Jack Dongarra, Innovative Computing Laboratory (ICL) Technical Report, ICL-UT-10-03, June, 2010.
[PDF]
Distributed-Memory Task Execution and Dependence Tracking within DAGuE and the DPLASMA Project George Bosilca, Aurelien Bouteiller, Anthony Danalis, Mathieu Faverge, Azzam Haidar, Thomas Herault, Jakub Kurzak, Julien Langou, Pierre Lemarinier, Hatem Ltaief, Piotr Luszczek, Asim YarKhan, Jack Dongarra, ICL Technical Report, ICL-UT-10-02, April 2010.
[PDF]

2009

Improving Performance of Sparse Numerical Linear Algebra Computations: Algorithmic optimization techniques for sparse direct and sparse iterative numerical solvers of large linear equations: Piotr Luszczek; LAP Lambert Academic Publishing, ISBN-10: 3838334698, ISBN-13: 978-3838334691, (January 11, 2010); [available at Amazon]

Parallel Programming in MATLAB: Piotr Luszczek; The International Journal of High Performance Computing Applications, Volume 23 Issue 3, pages 277-283, July 2009; [PDF]

2007

High Performance Development for High End Computing with Python Language Wrapper (PLW): Piotr Luszczek and Jack Dongarra; The International Journal of High Performance Computing Applications, Volume 21, No. 2, Summer 2007
The Impact of Multicore on Math Software: Alfredo Buttari, Jack Dongarra, Jakub Kurzak, Julien Langou, Piotr Luszczek, Stan Tomov; Proceedings of PARA'06, Umeå, Sweden, June 18-21, 2006.
Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance while Achieving 64bit Accuracy: Alfredo Buttari, Jack Dongarra, Jakub Kurzak, Piotr Luszczek, Stan Tomov; Submitted to ACM TOMS, 2007.

2006

Design and Implementation of the HPCC Benchmark Suite: Piotr Luszczek, Jack Dongarra, and Jeremy Kepner; CT Watch Quarterly, 2(4A), November 2006; [BibTeX]
Exploiting the Performance of 32 bit Floating Point Arithmetic in Obtaining 64 bit Accuracy: Julie Langou, Julien Langou, Piotr Luszczek, Jakub Kurzak, Alfredo Buttari and Jack Dongarra; UTK CS Tech Report, CS-06-574, LAPACK Working Note #175, April 2006.; Accepted to SC06.; [PDF]
High Performance Development for High End Computing with Python Language Wrapper (PLW): Piotr Luszczek and Jack Dongarra; Accepted to IJHPCA.; [PDF]
Self adapting numerical software (SANS) effort: Jack Dongarra, George Bosilca, Zizhong Chen, Victor Eijkhout, Graham E. Fagg, Erika Fuentes, Julien Langou, Piotr Luszczek, Jelena Pjesivac-Grbovic, Keith Seymour, Haihang You, Sathish S. Vadhiyar; IBM Journal of Research and Development 50, pp. 223-238, March, 2006.; [PDF | BibTeX ]
Overview of the HPCC Challenge Benchmark Suite: Piotr Luszczek and Jack Dongarra; SPEC Benchmark Workshop 2006, January 23, 2006; Thompson Conference Center, University of Texas, Austin; [HTML]

2005

Introduction to the HPC Challenge Benchmark Suite: Piotr Luszczek, Jack J. Dongarra, David Koester, Rolf Rabenseifner, Bob Lucas, Jeremy Kepner, John McCalpin, David Bailey, and Daisuke Takahashi; [PDF]
HPC Challenge v1.x Benchmark Suite: David Koester and Piotr Luszczek; Technical Program Tutorial, November 13, 2006; Washington State Convention and Trade Center, Seattle, WA; [HTML]
Introduction to the HPC Challenge Benchmark Suite: Jack Dongarra, Piotr Luszczek; ICL Technical Report, ICL-UT-05-01; CS Dept. Tech Report UT-CS-05-544, 2005; [PDF]

2004

LAPACK for Clusters Project: An Example of Self Adapting Numerical Software: Zizhong Chen, Jack Dongarra, Piotr Luszczek, and Kenneth Roche; Hawaii International Conference on System Sciences HICSS-37, Hilton Waikoloa Village, Big Island, Hawaii, January 5-8, 2004.; [ PDF ]
Cray X1 Evaluation Status Report: Agarwal, P., Alexander, R., Apra, E., Balay, S., Bland, A., Colgan, J., D'Azevedo, E., Dongarra, J., Dunigan, T., Fahey, M., Fahey, R., Geist, A., Gordon, M., Harrison, R., Kaushik, D., Krishnakumar, M., Luszczek, P., Mezzacapa, T., Nichols, J., Nieplocha, J., Oliker, L., Packwood, T., Pindzola, M., Schulthess, T., Vetter, J., White, J., Windus, T., Worley, P., Zacharia, T.; Oak Ridge National Laboratory Report, ORNL/TM-2004/13, January 2004.; [PDF]
Design of Interactive Environment for Numerically Intensive Parallel Linear Algebra Calculations: Piotr Luszczek and Jack Dongarra; Proceedings of the 4th International Conference on Computational Science, Krakow, Poland, June 6-9, 2004; Lecture Notes in Computer Science 3039, Springer-Verlag Berlin-Heidelberg, 2004, pp. 270-277, ISBN 3-540-22129-8; [PDF | BibTeX]

2003

Self-Adapting Software for Numerical Linear Algebra and LAPACK for Clusters: Zizhong Chen, Jack Dongarra, Piotr Luszczek, and Kenneth Roche; Parallel Computing Journal, November-December 2003, ISSN 0167-8191; LAPACK Working Note: 160; University of Tennessee CS Technical Report Number: ut-cs-03-499; [ PDF | ScienceDirect | BibTeX ]
The LINPACK Benchmark: Past, Present, and Future: Jack J. Dongarra and Piotr Luszczek and Antoine Petitet; Concurrency and Computation: Practice and Experience 15, pp. 1-18, 2003; [ PDF | BibTeX ]
A Runtime Support for Large-Scale Irregular Computing on Clusters and Grids: Brezany, P., Bubak, M., Luszczek, P., Malawski, M., Zajac, K.; Annual Review of Scalable Computing, in: Kwong, Y. C. (Eds.), Series on Scalable Computing, vol. 5, Singapore University Press and World Scientific, 2003, pp. 30-64
Self Adaptive Software for Numerical Linear Algebra Library Routines on Clusters: Zizhong Chen, Jack Dongarra, Piotr Luszczek, and Kenneth Roche; presented at Workshop on Parallel Linear Algebra, WoPLA'03
A Framework for Check-Pointed Fault-Tolerant Out-of-Core Linear Algebra: Ed D'Azevedo and Piotr Luszczek; presented at SIAM Conference on Computational Science and Engineering (CSE03) February 10-13, 2003 Hyatt Regency Islandia Hotel and Marina, San Diego, CA; [ PDF ]

Performance Improvements of Common Sparse Numerical Linear Algebra Computations

Piotr Luszczek

Ph.D. Thesis, May 2003, University of Tennessee Knoxville

[ HTML ]

2002-1999

Marian Bubak, Dawid Kurzyniec, Piotr Luszczek, Vaidy S. Sunderam, "Creating Java to Native Code Interfaces with Janet", Scientific Programming 9(1): 39-50 (2001)
H.-Y. Lin and P. Luszczek, "Tuning LINPACK N*N for PA-RISC Platforms", Presented at High Performance Computing on Hewlett-Packard Systems Conference, Bremen, Germany, October 7-9, 2001. (PDF, gzipped PostScript, BibTeX, more)
M. Bubak, D. Kurzyniec, and P. Luszczek, "Convenient use of legacy software in Java with Janet package", Future Generation Computer Systems 17(8), pp. 987-997, June 2001. (PDF, gzipped PostScript, BibTeX)
Abstract
As Java becomes an appropriate environment for high performance computing, the interest arises in combining it with existing code written in other languages. Portable Java interfaces to native code may be developed using the Java Native Interface (JNI). However, as a low-level API it is rather inconvenient to be used directly, thus the higher level tools and techniques are desired. We present Janet -- a highly expressive Java language extension and preprocessing tool that enables convenient integration of native code with Java programs.
J. Dongarra, V. Eijkhout, and P. Luszczek, "Recursive approach in sparse matrix LU factorization", Scientific Programming 9(1), pp. 51-60, 2001. (Also: In Proceedings of the 1st SGI Users Conference, pp. 409-418, Cracow, Poland, October 2000; ACC Cyfronet UMM.) (PDF, gzipped PostScript, BibTeX)
Abstract
This paper describes a recursive method for the LU factorization of sparse matrices. The recursive formulation of common linear algebra codes has been proven very successful in dense matrix computations. An extension of the recursive technique for sparse matrices is presented. Performance results given here show that the recursive approach may perform comparable to leading software packages for sparse matrix factorization in terms of execution time, memory usage, and error estimates of the solution.
M. Bubak, P. Luszczek, "Towards Portable Runtime Support for Irregular and Out-of-Core Computations", in: Dongarra, J., Luque, E., Margalef, T., (Eds.), "Recent Advances in Parallel Virtual Machine and Message Passing Interface" Proceedings of 6th European PVM/MPI Users' Group Meeting, Barcelona, Spain, September 1999, Lecture Notes in Computer Science 1697, Springer-Verlag Berlin-Heidelberg, 1999, pp. 59-66. (gzipped PostScript, BibTeX)
Abstract
In this paper, we present the lip - a run time system which enables easy and portable parallelization of irregular and out-of-core computations. Functions for handling irregular data were developed using the same concept as in the CHAOS library. Out-of-core parallelization is based on the idea of in-core section, and functions for out-of-core data are implemented with capabilities provided by MPI-IO. The new library may be used in C, Fortran and Java programs. Results of performance tests for a generic irregular out-of-core program on HP S2000 are presented and possible further extensions are discussed.