piotr luszczek :: publications
2016
-
Task based Cholesky decomposition on Xeon Phi architectures using OpenMP
Joseph Dorris, Asim YarKhan, Jakub Kurzak, Piotr Luszczek and Jack Dongarra,
International Journal of Computational Science and Engineering Vol. 17, No. 3,
DOI 10.1504/IJCSE.2018.095851
[paper]
2015
-
Performance of Random Sampling for Computing Low-rank Approximations of a Dense Matrix on GPUs,
Théo Mary,
Ichitaro Yamazaki,
Jakub Kurzak,
Piotr Luszczek,
Stanimire Tomov,
Jack Dongarra,
Proceedings of SC15, Austin, TX, USA, November 15-20, 2015.
[paper, slides]
2014
-
Unified Development for Mixed Multi-GPU and Multi-Coprocessor Environments using a Lightweight Runtime Environment,
Azzam Haidar, Chongxiao Cao, Asim YarKhan, Piotr Luszczek, Stanimire Tomov, Khairul Kabir, Jack Dongarra,
Proceedings of the 28th IEEE International Parallel & Distributed Processing Symposium, May 19-23, 2014, Arizona Grand Resort, Phoenix, Arizona, USA
[PDF]
2013
-
An Improved Parallel Singular Value Algorithm and Its Implementation for Multicore Hardware,
Azzam Haidar, Piotr Luszczek, Jakub Kurzak,
Proceedings of SC13, November 17-21, 2013, Denver, CO, USA. DOI: 10.1145/2503210.2503292
University of Tennessee Technical Report
[PDF]
-
BlackjackBench: Portable Hardware Characterization with Automated Results' Analysis,
Anthony Danalis, Piotr Luszczek, Gabriel Marin, Jeffrey S. Vetter, Jack Dongarra,
The Computer Journal, first published online June 28, 2013. DOI: 10.1093/comjnl/bxt057
[PDF]
2012
-
Energy Footprint of Advanced Dense Numerical Linear Algebra using Tile Algorithms on Multicore Architecture,
Jack Dongarra, Hatem Ltaief, Piotr Luszczek, and Vince M. Weaver,
submitted to the 2nd International Conference on Cloud and Green Computing (CGC 2012), November 1-3, 2012, Xiangtan, Hunan, China.
[PDF]
-
BlackjackBench: portable hardware characterization
Anthony Danalis, Piotr Luszczek, Gabriel Marin, Jeffrey S. Vetter, Jack Dongarra,
SIGMETRICS Perform. Eval. Rev. 40(2), pp. 74-79, 2012. ISSN 0163-5999. DOI: 10.1145/2381056.2381074
[PDF]
-
Anatomy of a Globally Recursive Embedded LINPACK Benchmark
Piotr Luszczek and Jack Dongarra, accepted in 2012 IEEE
High Performance Extreme Computing Conference (HPEC 2012), Westin Hotel, Waltham, Massachusetts, September 10-12, 2012.
IEEE Catalog Number: CFP12HPE-CDR, ISBN: 978-1-4673-1574-6.
[PDF | slides]
-
From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming
Peng Du, Rick Weber, Piotr Luszczek, Stanimire Tomov, Gregory Peterson, Jack Dongarra
Parallel Computing, Vol. 38 No. 8, pages 391-407, August 2012.
DOI: 10.1016/j.parco.2011.10.002, ISSN: 0167-8191
[PDF | publisher page]
-
Programming the LU Factorization for a Multicore System with Accelerators
Jakub Kurzak, Piotr Luszczek, Mathieu Faverge, and Jack Dongarra,
In Proceedings of VECPAR 2012, Kobe, Japan, July 17-20, 2012. Springer LNCS ????
[PDF]
-
Recent Advances in Dense Matrix Computations for Two-Sided Reduction Algorithms,
Hatem Ltaief, Azzam Haidar, Piotr Luszczek, and Jack Dongarra,
Presentation at 7th International Workshop on Parallel Matrix Algorithms and Applications (PMAA 2012),
Birkbeck, University of London, UK, 28-30 June 2012.
[PDF]
-
High Performance Dense Linear System Solver with Resilience to Multiple Soft Errors,
Peng Du, Piotr Luszczek, and Jack Dongarra,
International Conference on Computational Science, ICCS 2012, Omaha NE.
[PDF]
2011
-
Exploiting Fine-Grain Parallelism in Recursive LU Factorization
Jack Dongarra, Mathieu Faverge, Hatem Ltaief, and Piotr Luszczek
In Proceedings of ParCo 2011, 30 August - 2 September 2011, Ghent, Belgium
[extended abstract PDF]
-
Profiling High Performance Dense Linear Algebra Algorithms on Multicore Architectures for Power and Energy Efficiency
Hatem Ltaief, Piotr Luszczek, Jack Dongarra
In Proceedings of the International Conference on Energy-Aware High Performance Computing, September 7-9, 2011, Hamburg, Germany
(University of Tennessee Technical Report ut-cs-11-674, LAWN 251)
[PDF]
-
High Performance Bidiagonal Reduction using Tile Algorithms on Homogeneous Multicore Architectures
Hatem Ltaief, Piotr Luszczek, and Jack Dongarra
UTK CS Technical Report ut-cs-11-673. Submitted to ACM TOMS.
[PDF]
-
Two-Stage Tridiagonal Reduction for Dense Symmetric Matrices using Tile Algorithms on Multicore Architectures
Piotr Luszczek, Hatem Ltaief, and Jack Dongarra
Proceedings of IPDPS 2011, Anchorage, Alaska, May 16-20, 2011.
[PDF]
2010
-
Mixed-Tool Performance Analysis on Hybrid Multicore Architectures
Peng Du, Piotr Luszczek, Stanimire Tomov, Jack Dongarra
ICPPW '10: Proceedings of the 2010 39th International Conference on Parallel Processing Workshops
Publisher: IEEE Computer Society,
San Diego, California,
September 13-16, 2010
[PDF]
-
From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming
Peng Du, Rick Weber, Piotr Luszczek, Stanimire Tomov, Gregory Peterson, Jack Dongarra
Technical Report UT-CS-10-656, University of Tennessee, Computer Science Department, LAPACK Working Note 228
[PDF]
-
Reducing the time to tune parallel dense linear algebra routines with partial execution and performance modelling
Jack Dongarra and Piotr Luszczek,
Proceedings of SC10, New Orleans, Louisiana, USA, November 13-19, 2010. Also: Technical Report UT-CS-10-661, University of Tennessee, Computer Science Department
[PDF | PDF poster]
-
Analysis of Various Scalar, Vector, and Parallel Implementations of RandomAccess
Piotr Luszczek and Jack Dongarra, Innovative Computing Laboratory (ICL) Technical Report, ICL-UT-10-03, June, 2010.
[PDF]
-
Distributed-Memory Task Execution and Dependence Tracking within DAGuE and the DPLASMA Project
George Bosilca, Aurelien Bouteiller, Anthony Danalis, Mathieu Faverge, Azzam Haidar, Thomas Herault, Jakub Kurzak,
Julien Langou, Pierre Lemarinier, Hatem Ltaief, Piotr Luszczek, Asim YarKhan, Jack Dongarra,
ICL Technical Report, ICL-UT-10-02, April 2010.
[PDF]
2009
- Improving Performance of Sparse Numerical Linear Algebra Computations: Algorithmic optimization techniques for sparse direct
and sparse iterative numerical solvers of large linear equations
-
Piotr Luszczek
-
LAP Lambert Academic Publishing,
ISBN-10: 3838334698,
ISBN-13: 978-3838334691,
(January 11, 2010)
-
[available at Amazon]
- Parallel Programming in MATLAB
-
Piotr Luszczek
-
The International Journal of High Performance Computing Applications,
Volume 23 Issue 3, pages 277-283, July 2009
-
[PDF]
2007
- High Performance Development for High End Computing with Python
Language Wrapper (PLW)
-
Piotr Luszczek and Jack Dongarra
-
The International Journal of High Performance Computing Applications,
Volume 21, No. 2, Summer 2007
- The Impact of Multicore on Math Software
-
Alfredo Buttari, Jack Dongarra, Jakub Kurzak, Julien Langou,
Piotr Luszczek, Stan Tomov
Proceedings of PARA'06, Umeå, Sweden, June 18-21, 2006.
- Using Mixed Precision for Sparse Matrix Computations to Enhance
the Performance while Achieving 64-bit Accuracy
-
Alfredo Buttari, Jack Dongarra, Jakub Kurzak, Piotr Luszczek, Stan Tomov
Submitted to ACM TOMS, 2007.
2006
- Design and Implementation of the HPCC Benchmark Suite
-
Piotr Luszczek, Jack Dongarra, and Jeremy Kepner
- CT Watch Quarterly, 2(4A), November 2006
-
[BibTeX]
- Exploiting the Performance of 32 bit Floating Point Arithmetic
in Obtaining 64 bit Accuracy
- Julie Langou, Julien Langou, Piotr Luszczek, Jakub Kurzak, Alfredo
Buttari and Jack Dongarra
- UTK CS Tech Report, CS-06-574, LAPACK Working Note #175, April 2006.
-
Accepted to SC06.
-
[PDF]
- High Performance Development for High End Computing with Python
Language Wrapper (PLW)
-
Piotr Luszczek and Jack Dongarra
-
Accepted to IJHPCA.
-
[PDF]
- Self adapting numerical software (SANS) effort
-
Jack Dongarra, George Bosilca, Zizhong Chen, Victor Eijkhout, Graham E. Fagg,
Erika Fuentes, Julien Langou, Piotr Luszczek, Jelena Pjesivac-Grbovic, Keith Seymour, Haihang You, Sathish S. Vadhiyar
- IBM Journal of Research and Development 50, pp. 223-238, March, 2006.
- [PDF | BibTeX]
- Overview of the HPCC Challenge Benchmark Suite
- Piotr Luszczek and Jack Dongarra
- SPEC Benchmark Workshop 2006, January 23, 2006
- Thompson Conference Center, University of Texas, Austin
- [HTML]
2005
- Introduction to the HPC Challenge Benchmark Suite
- Piotr Luszczek, Jack J. Dongarra, David Koester, Rolf Rabenseifner,
Bob Lucas, Jeremy Kepner, John McCalpin, David Bailey,
and Daisuke Takahashi
-
[PDF]
- HPC Challenge v1.x Benchmark Suite
- David Koester and Piotr Luszczek
- Technical Program Tutorial, November 13, 2006
- Washington State Convention and Trade Center, Seattle, WA
- [HTML]
- Introduction to the HPC Challenge Benchmark Suite
-
Jack Dongarra, Piotr Luszczek
-
ICL Technical Report, ICL-UT-05-01
-
CS Dept. Tech Report UT-CS-05-544, 2005
-
[PDF]
2004
- LAPACK for Clusters Project: An Example of Self Adapting Numerical
Software
- Zizhong Chen, Jack Dongarra, Piotr Luszczek, and Kenneth
Roche
- Hawaii International Conference on System Sciences (HICSS-37),
Hilton Waikoloa Village, Big Island, Hawaii, January 5-8, 2004.
- [ PDF ]
- Cray X1 Evaluation Status Report
- Agarwal, P., Alexander, R., Apra, E., Balay, S., Bland, A.,
Colgan, J., D'Azevedo, E., Dongarra, J., Dunigan, T., Fahey, M., Fahey, R.,
Geist, A., Gordon, M., Harrison, R., Kaushik, D., Krishnakumar, M.,
Luszczek, P., Mezzacappa, T., Nichols, J., Nieplocha, J., Oliker, L.,
Packwood, T., Pindzola, M., Schulthess, T., Vetter, J., White, J.,
Windus, T., Worley, P., Zacharia, T.
- Oak Ridge National Laboratory Report, ORNL/TM-2004/13, January 2004.
-
[PDF]
- Design of Interactive Environment for Numerically Intensive
Parallel Linear Algebra Calculations
- Piotr Luszczek and Jack Dongarra
- Proceedings of the 4th International Conference on Computational Science,
Krakow, Poland, June 6-9, 2004
- Lecture Notes in Computer Science 3039, Springer-Verlag
Berlin-Heidelberg, 2004, pp. 270-277, ISBN 3-540-22129-8
- [PDF | BibTeX]
2003
- Self-Adapting Software for Numerical Linear Algebra and LAPACK for
Clusters
- Zizhong Chen, Jack Dongarra, Piotr Luszczek, and Kenneth Roche
- Parallel Computing Journal, November-December 2003, ISSN 0167-8191
- LAPACK Working Note: 160
- University of Tennessee CS Technical Report Number: ut-cs-03-499
- [ PDF | ScienceDirect | BibTeX ]
- The LINPACK Benchmark: Past, Present, and Future
- Jack J. Dongarra and Piotr Luszczek and Antoine Petitet
- Concurrency and Computation: Practice and Experience 15, pp. 1-18, 2003
- [ PDF | BibTeX ]
- A Runtime Support for Large-Scale Irregular Computing on Clusters
and Grids
- Brezany, P., Bubak, M., Luszczek, P., Malawski, M., Zajac, K.
- Annual Review of Scalable Computing, in: Kwong, Y. C. (Ed.), Series on
Scalable Computing, vol. 5, Singapore University Press and World Scientific,
2003, pp. 30-64
- Self Adaptive Software for Numerical Linear Algebra Library
Routines on Clusters
- Zizhong Chen, Jack Dongarra, Piotr Luszczek, and Kenneth Roche
- presented at Workshop on Parallel Linear Algebra,
WoPLA'03
- A Framework for Check-Pointed Fault-Tolerant Out-of-Core Linear
Algebra
- Ed D'Azevedo and Piotr Luszczek
- presented at SIAM Conference on Computational Science and Engineering (CSE03),
February 10-13, 2003, Hyatt Regency Islandia Hotel and Marina, San Diego, CA
- [ PDF ]
- Performance Improvements of Common Sparse Numerical Linear Algebra
Computations
- Piotr Luszczek
- Ph.D. Thesis, May 2003, University of Tennessee Knoxville
- [ HTML ]
2002-1999
- Marian Bubak, Dawid Kurzyniec, Piotr Luszczek, Vaidy S. Sunderam,
"Creating Java to Native Code Interfaces with Janet",
Scientific Programming 9(1): 39-50 (2001)
- H.-Y. Lin and P. Luszczek, "Tuning LINPACK N*N for PA-RISC Platforms",
Presented at High Performance Computing on Hewlett-Packard Systems
Conference, Bremen, Germany, October 7-9, 2001.
(PDF, gzipped PostScript, BibTeX, more)
- M. Bubak, D. Kurzyniec, and P. Luszczek, "Convenient use of legacy software
in Java with Janet package", Future Generation Computer Systems 17(8),
pp. 987-997, June 2001.
(PDF, gzipped PostScript, BibTeX)
Abstract
As Java becomes an appropriate environment for high performance computing,
interest arises in combining it with existing code written in other
languages. Portable Java interfaces to native code may be developed using the
Java Native Interface (JNI). However, as a low-level API it is rather
inconvenient to use directly, so higher-level tools and techniques
are desired. We present Janet, a highly expressive Java language extension
and preprocessing tool that enables convenient integration of native code with
Java programs.
- J. Dongarra, V. Eijkhout, and P. Luszczek, "Recursive approach in sparse
matrix LU factorization", Scientific Programming 9(1), pp. 51-60, 2001. (Also:
In Proceedings of the 1st SGI Users Conference, pp. 409-418, Cracow, Poland,
October 2000; ACC Cyfronet UMM.)
(PDF, gzipped PostScript, BibTeX)
Abstract
This paper describes a recursive method for the LU factorization of sparse
matrices. The recursive formulation of common linear algebra codes has
proven very successful in dense matrix computations. An extension of the
recursive technique for sparse matrices is presented. Performance results
given here show that the recursive approach may perform comparably to leading
software packages for sparse matrix factorization in terms of execution time,
memory usage, and error estimates of the solution.
- M. Bubak, P. Luszczek, "Towards Portable Runtime Support for Irregular and
Out-of-Core Computations", in: Dongarra, J., Luque, E., Margalef, T., (Eds.),
"Recent Advances in Parallel Virtual Machine and Message Passing Interface"
Proceedings of 6th European PVM/MPI Users' Group Meeting, Barcelona, Spain,
September 1999, Lecture Notes in Computer Science 1697, Springer-Verlag
Berlin-Heidelberg, 1999, pp. 59-66.
(gzipped PostScript, BibTeX)
Abstract
In this paper, we present lip, a runtime system that
enables easy and portable parallelization of irregular and out-of-core
computations. Functions for handling irregular data were developed using
the same concept as in the CHAOS library. Out-of-core parallelization is based
on the idea of an in-core section, and functions for out-of-core data are
implemented with capabilities provided by MPI-IO. The new library may be used
in C, Fortran and Java programs. Results of performance tests for a generic
irregular out-of-core program on HP S2000 are presented and possible further
extensions are discussed.