Publications

Show only items where

Author

Type

Term

Year

Keyword

Export 1274 results:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Bassi, A., M. Beck, G. Fagg, T. Moore, J. Plank, M. Swany, and R. Wolski, “The Internet BackPlane Protocol: A Study in Resource Sharing,” Proceedings of the second IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID 2002), Berlin, Germany, October 2002.

Bassi, A., and X. Li, “Internet Backplane Protocol - Test Language v. 1.0,” University of Tennessee Computer Science Technical Report, no. UT-CS-01-464, January 2001.

(22.43 KB)

Bassi, A., M. Beck, J. Plank, and R. Wolski, “Internet Backplane Protocol: API 1.0,” University of Tennessee Computer Science Technical Report, no. UT-CS-01-464, January 2001.

(55.33 KB)

Bartlett, R., xSDK4ECP: Extreme-scale Scientific Software Development Kit for ECP (Poster) , Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.

(1.54 MB)

Barry, D., A. Danalis, and H. Jagode, “Effortless Monitoring of Arithmetic Intensity with PAPI’s Counter Analysis Toolkit,” Tools for High Performance Computing 2018/2019: Springer, pp. 195–218, 2021.

Barry, D., H. Jagode, A. Danalis, and J. Dongarra, “Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements,” 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), St. Petersburg, Florida, IEEE, August 2023.

(1.81 MB)

Barry, D., H. Jagode, A. Danalis, and J. Dongarra, Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements , St. Petersburg, FL, 28th HIPS Workshop, May 2023.

(3.99 MB)

Barry, D., A. Danalis, and H. Jagode, “Effortless Monitoring of Arithmetic Intensity with PAPI's Counter Analysis Toolkit,” 13th International Workshop on Parallel Tools for High Performance Computing, Dresden, Germany, Springer International Publishing, September 2020.

(738.47 KB)

Barbut, Q., A. Benoit, T. Herault, Y. Robert, and F. Vivien, “When to checkpoint at the end of a fixed-length reservation?,” Fault Tolerance for HPC at eXtreme Scales (FTXS) Workshop, Denver, United States, August 2023.

Ballard, G., D. Becker, J. Demmel, J. Dongarra, A. Druinsky, I. Peled, O. Schwartz, S. Toledo, and I. Yamazaki, “Communication-Avoiding Symmetric-Indefinite Factorization,” SIAM Journal on Matrix Analysis and Application, vol. 35, issue 4, pp. 1364-1406, July 2014.

(593.18 KB)

Balaprakash, P., J. Dongarra, T. Gamblin, M. Hall, J. Hollingsworth, B. Norris, and R. Vuduc, “Autotuning in High-Performance Computing Applications,” Proceedings of the IEEE, vol. 106, issue 11, pp. 2068–2083, November 2018.

(2.5 MB)

Bak, S., C. Bertoni, S. Boehm, R. Budiardja, B. M. Chapman, J. Doerfert, M. Eisenbach, H. Finkel, O. Hernandez, J. Huber, et al., “OpenMP application experiences: Porting to accelerated nodes,” Parallel Computing, vol. 109, March 2022.

Bak, S., O. Hernandez, M. Gates, P. Luszczek, and V. Sarkar, “Task-graph scheduling extensions for efficient synchronization and communication,” Proceedings of the ACM International Conference on Supercomputing, pp. 88–101, 2021.

Bailey, D., J. Chame, C. Chen, J. Dongarra, M. Hall, J. K. Hollingsworth, P. D. Hovland, S. Moore, K. Seymour, J. Shin, et al., “PERI Auto-tuning,” Proc. SciDAC 2008, vol. 125, Seatlle, Washington, Journal of Physics, January 2008.

(873.75 KB)

Bai, Z., J. Dongarra, D. Lu, and I. Yamazaki, “Matrix Powers Kernels for Thick-Restart Lanczos with Explicit External Deflation,” International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.

(480.73 KB)

Bai, Z., J. Demmel, J. Dongarra, J. Langou, and J. Wang, “LAPACK,” Handbook of Linear Algebra, Second, Boca Raton, FL, CRC Press, 2013.

(223.21 KB)

Badia, R. M., M. Beck, F. Bodin, T. Boku, F. Cappello, A. Choudhary, C. Costa, E. Deelman, N. Ferrier, K. Fujisawa, et al., “A Collection of Presentations from the BDEC2 Workshop in Kobe, Japan,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-09: University of Tennessee, Knoxville, February 2019.

(58.85 MB)

Baboulin, M., D. Becker, and J. Dongarra, “A Parallel Tiled Solver for Symmetric Indefinite Systems On Multicore Architectures,” IPDPS 2012, Shanghai, China, May 2012.

(544.09 KB)

Baboulin, M., J. Dongarra, J. Herrmann, and S. Tomov, “Accelerating Linear System Solutions Using Randomization Techniques,” INRIA RR-7616 / LAWN #246 (presented at International AMMCS’11), Waterloo, Ontario, Canada, July 2011.

(358.79 KB)

Baboulin, M., J. Dongarra, A. Remy, S. Tomov, and I. Yamazaki, “Solving Dense Symmetric Indefinite Systems using GPUs,” Concurrency and Computation: Practice and Experience, vol. 29, issue 9, March 2017.

(1.94 MB)

Baboulin, M., A. Buttari, J. Dongarra, J. Kurzak, J. Langou, J. Langou, P. Luszczek, and S. Tomov, “Accelerating Scientific Computations with Mixed Precision Algorithms,” Computer Physics Communications, vol. 180, issue 12, pp. 2526-2533, December 2009.

(402.69 KB)

Baboulin, M., and S. Gratton, “Using dual techniques to derive componentwise and mixed condition numbers for a linear functional of a linear least squares solution,” University of Tennessee Computer Science Technical Report, UT-CS-08-622 (also LAPACK Working Note 207), January 2008.

(159.65 KB)

Baboulin, M., J. Demmel, J. Dongarra, S. Tomov, and V. Volkov, Enhancing the Performance of Dense Linear Algebra Solvers on GPUs (in the MAGMA Project) , Austin, TX, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC08), November 2008.

(5.28 MB)

Baboulin, M., J. Dongarra, A. Remy, S. Tomov, and I. Yamazaki, “Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures,” Lecture Notes in Computer Science, vol. 9573: Springer International Publishing, pp. 86-95, September 2015, 2016.

(327.14 KB)

Baboulin, M., J. Dongarra, and S. Tomov, “Some Issues in Dense Linear Algebra for Multicore and Special Purpose Architectures,” University of Tennessee Computer Science Technical Report, UT-CS-08-615 (also LAPACK Working Note 200), January 2008.

(289.93 KB)

Baboulin, M., S. Donfack, J. Dongarra, L. Grigori, A. Remi, and S. Tomov, “A Class of Communication-Avoiding Algorithms for Solving General Dense Linear Systems on CPU/GPU Parallel Machines,” Proc. of the International Conference on Computational Science (ICCS), vol. 9, pp. 17-26, June 2012.

Baboulin, M., J. Dongarra, S. Gratton, and J. Langou, “Computing the Conditioning of the Components of a Linear Least Squares Solution,” University of Tennessee Computer Science Technical Report, no. UT-CS-07-604, (also LAPACK Working Note 193), January 2007.

(374.97 KB)

Baboulin, M., J. Dongarra, S. Gratton, and J. Langou, “Computing the Conditioning of the Components of a Linear Least-squares Solution,” Numerical Linear Algebra with Applications, vol. 16, no. 7, pp. 517-533, 00 2009.

(374.97 KB)

Baboulin, M., D. Becker, G. Bosilca, A. Danalis, and J. Dongarra, “An Efficient Distributed Randomized Algorithm for Solving Large Dense Symmetric Indefinite Linear Systems,” Parallel Computing, vol. 40, issue 7, pp. 213-223, July 2014.

(1.42 MB)

Baboulin, M., J. Dongarra, and R. Lacroix, “Computing Least Squares Condition Numbers on Hybrid Multicore/GPU Systems,” International Interdisciplinary Conference on Applied Mathematics, Modeling and Computational Science (AMMCS), Waterloo, Ontario, CA, August 2014.

(130.18 KB)

Baboulin, M., S. Tomov, and J. Dongarra, “Some Issues in Dense Linear Algebra for Multicore and Special Purpose Architectures,” PARA 2008, 9th International Workshop on State-of-the-Art in Scientific and Parallel Computing, Trondheim Norway, May 2008.

Baboulin, M., J. Dongarra, S. Gratton, and J. Langou, “Computing the Conditioning of the Components of a Linear Least Squares Solution,” VECPAR '08, High Performance Computing for Computational Science, Toulouse, France, January 2008.

(374.97 KB)

Baboulin, M., D. Becker, and J. Dongarra, “A parallel tiled solver for dense symmetric indefinite systems on multicore architectures,” University of Tennessee Computer Science Technical Report, no. ICL-UT-11-07, October 2011.

(544.2 KB)

Baboulin, M., J. Dongarra, J. Herrmann, and S. Tomov, “Accelerating Linear System Solutions Using Randomization Techniques,” ACM Transactions on Mathematical Software (also LAWN 246), vol. 39, issue 2, February 2013.

(358.79 KB)

Baboulin, M., D. Becker, G. Bosilca, A. Danalis, and J. Dongarra, “An efficient distributed randomized solver with application to large dense linear systems,” ICL Technical Report, no. ICL-UT-12-02, July 2012.

(626.26 KB)

Baboulin, M., V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, and S. Tomov, Towards a High-Performance Tensor Algebra Package for Accelerators , Gatlinburg, TN, moky Mountains Computational Sciences and Engineering Conference (SMC15), September 2015.

(1.76 MB)

Ayala, A., S. Tomov, M. Stoyanov, A. Haidar, and J. Dongarra, “Accelerating Multi - Process Communication for Parallel 3-D FFT,” 2021 Workshop on Exascale MPI (ExaMPI), St. Louis, MO, USA, IEEE, December 2021.

Ayala, A., S. Tomov, P. Luszczek, S. Cayrols, G. Ragghianti, and J. Dongarra, “Interim Report on Benchmarking FFT Libraries on High Performance Systems,” Innovative Computing Laboratory Technical Report, no. ICL-UT-21-03: University of Tennessee, July 2021.

(2.68 MB)

Ayala, A., S. Tomov, M. Stoyanov, A. Haidar, and J. Dongarra, “Performance Analysis of Parallel FFT on Large Multi-GPU Systems,” 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Lyon, France, IEEE, August 2022.

Ayala, A., S. Tomov, A. Haidar, and J. Dongarra, heFFTe: Highly Efficient FFT for Exascale (Poster) : NVIDIA GPU Technology Conference (GTC2020), October 2020.

(866.88 KB)

Ayala, A., S. Tomov, A. Haidar, and J. Dongarra, “heFFTe: Highly Efficient FFT for Exascale,” International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, June 2020.

(2.62 MB)

Ayala, A., S. Tomov, M. Stoyanov, and J. Dongarra, “Scalability Issues in FFT Computation,” International Conference on Parallel Computing Technologies: Springer, pp. 279–287, 2021.

Ayala, A., S. Tomov, A. Haidar, and J. Dongarra, heFFTe: Highly Efficient FFT for Exascale (Poster) , Seattle, WA, SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP20), February 2020.

(1.54 MB)

Ayala, A., S. Tomov, X. Luo, H. Shaiek, A. Haidar, G. Bosilca, and J. Dongarra, “Impacts of Multi-GPU MPI Collective Communications on Large FFT Computation,” Workshop on Exascale MPI (ExaMPI) at SC19, Denver, CO, November 2019.

(1.6 MB)

Ayala, A., S. Tomov, J. Dongarra, and A. Haidar, heFFTe: Highly Efficient FFT for Exascale (Poster) , Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.

(6.2 MB)

Ayala, A., S. Tomov, A. Haidar, M.. Stoyanov, S. Cayrols, J. Li, G. Bosilca, and J. Dongarra, Accelerating FFT towards Exascale Computing : NVIDIA GPU Technology Conference (GTC2021), 2021.

(27.23 MB)

Ayala, A., S. Tomov, P. Luszczek, S. Cayrols, G. Ragghianti, and J. Dongarra, “FFT Benchmark Performance Experiments on Systems Targeting Exascale,” ICL Technical Report, no. ICL-UT-22-02, March 2022.

(5.87 MB)

Ayala, A., S. Tomov, P. Luszczek, S. Cayrols, G. Ragghianti, and J. Dongarra, “Analysis of the Communication and Computation Cost of FFT Libraries towards Exascale,” ICL Technical Report, no. ICL-UT-22-07: Innovative Computing Laboratory, July 2022.

(5.91 MB)

Aupy, G., Y. Robert, and F. Vivien, “Assuming failure independence: are we right to be wrong?,” The 3rd International Workshop on Fault Tolerant Systems (FTS), Honolulu, Hawaii, IEEE, September 2017.

(597.11 KB)

Aupy, G., A. Benoit, L. Pottier, P. Raghavan, Y. Robert, and M. Shantharam, “Co-Scheduling Algorithms for Cache-Partitioned Systems,” 19th Workshop on Advances in Parallel and Distributed Computational Models, Orlando, FL, IEEE Computer Society Press, May 2017.

(584.76 KB)

Main menu

Publications

Pages