Publications

Export 345 results:
Filters: First Letter Of Last Name is S  [Clear All Filters]
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
N
Seymour, K., A. YarKhan, S. Agrawal, and J. Dongarra, NetSolve: Grid Enabling Scientific Computing Environments,” Grid Computing and New Frontiers of High Performance Processing, no. 14: Elsevier, 00 2005.  (425 KB)
Agrawal, S., J. Dongarra, K. Seymour, and S. Vadhiyar, NetSolve: Past, Present, and Future - A Look at a Grid Enabled Server,” Making the Global Infrastructure a Reality: Wiley Publishing, 00 2003.  (158.19 KB)
Hoefler, T., J. M. Squyres, G. Fagg, G. Bosilca, W. Rehm, and A. Lumsdaine, A New Approach to MPI Collective Communication Implementations,” Distributed and Parallel Systems: Springer US, pp. 45-54, 2007.  (140.2 KB)
Berman, F., H. Casanova, A. Chien, K. Cooper, H. Dail, A. Dasgupta, W. Deng, J. Dongarra, L. Johnsson, K. Kennedy, et al., New Grid Scheduling and Rescheduling Methods in the GrADS Project,” International Journal of Parallel Programming, vol. 33, no. 2: Springer, pp. 209-229, June 2005.  (306.41 KB)
Berman, F., H. Casanova, A. Chien, K. Cooper, H. Dail, A. Dasgupta, W. Deng, J. Dongarra, L. Johnsson, K. Kennedy, et al., New Grid Scheduling and Rescheduling Methods in the GrADS Project,” International Journal of Parallel Programming, vol. 33, no. 2: Springer, pp. 209-229, June 2005.  (306.41 KB)
Li, Y., J. Dongarra, and S. Tomov, A Note on Auto-tuning GEMM for GPUs,” 9th International Conference on Computational Science (ICCS 2009), no. 5544-5545, Baton Rouge, LA, pp. 884-892, May 2009.  (236.02 KB)
Li, Y., J. Dongarra, and S. Tomov, A Note on Auto-tuning GEMM for GPUs,” 9th International Conference on Computational Science (ICCS 2009), no. 5544-5545, Baton Rouge, LA, pp. 884-892, May 2009.  (236.02 KB)
Solcà, R., A. Haidar, S. Tomov, J. Dongarra, and T. C. Schulthess, A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks,” Supercomputing '12 (poster), Salt Lake City, Utah, November 2012.
Solcà, R., A. Haidar, S. Tomov, J. Dongarra, and T. C. Schulthess, A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks,” Supercomputing '12 (poster), Salt Lake City, Utah, November 2012.
Haidar, A., R. Solcà, M. Gates, S. Tomov, T. C. Schulthess, and J. Dongarra, A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks,” International Journal of High Performance Computing Applications, vol. 28, issue 2, pp. 196-209, May 2014.  (1.74 MB)
Haidar, A., R. Solcà, M. Gates, S. Tomov, T. C. Schulthess, and J. Dongarra, A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks,” International Journal of High Performance Computing Applications, vol. 28, issue 2, pp. 196-209, May 2014.  (1.74 MB)
Dongarra, J., I. Duff, D. Sorensen, and H. van der Vorst, Numerical Linear Algebra for High-Performance Computers,” Software, Environments and Tools: SIAM, 1998.
O
Bak, S., C. Bertoni, S. Boehm, R. Budiardja, B. M. Chapman, J. Doerfert, M. Eisenbach, H. Finkel, O. Hernandez, J. Huber, et al., OpenMP application experiences: Porting to accelerated nodes,” Parallel Computing, vol. 109, March 2022.
Bak, S., C. Bertoni, S. Boehm, R. Budiardja, B. M. Chapman, J. Doerfert, M. Eisenbach, H. Finkel, O. Hernandez, J. Huber, et al., OpenMP application experiences: Porting to accelerated nodes,” Parallel Computing, vol. 109, March 2022.
Benoit, A., A. Cavelan, Y. Robert, and H. Sun, Optimal Resilience Patterns to Cope with Fail-stop and Silent Errors,” 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Chicago, IL, IEEE, May 2016.  (603.58 KB)
Hiroyasu, T., M. Miki, J. Sawada, and J. Dongarra, Optimization of Injection Schedule of Diesel Engine Using GridRPC,” Information Processing Society of Japan Symposium Series, vol. 2003, no. 14, pp. 189-197, January 2003.  (520.96 KB)
Hiroyasu, T., M. Miki, H. Shimosaka, and J. Dongarra, Optimization Problem Solving System using Grid RPC,” 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, Tokyo, Japan, March 2003.  (71.6 KB)
Shimosaka, H., T. Hiroyasu, M. Miki, and J. Dongarra, Optimization Problem Solving System Using GridRPC,” IEEE Transactions on Parallel and Distributed Systems (submitted), January 2005.  (740.57 KB)
Hiroyasu, T., M. Miki, H. Shimosaka, Y. Tanimura, and J. Dongarra, Optimization System Using Grid RPC,” Meeting of the Japan Society of Mechanical Engineers, Kyoto University, Kyoto, Japan, October 2002.
Tomov, S., P. Luszczek, I. Yamazaki, J. Dongarra, H. Anzt, and W. Sawyer, Optimizing Krylov Subspace Solvers on Graphics Processing Units,” Fourth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2014, Phoenix, AZ, IEEE, May 2014.  (536.32 KB)
Seymour, K., H. Nakada, S. Matsuoka, J. Dongarra, C. Lee, and H. Casanova, Overview of GridRPC: A Remote Procedure Call API for Grid Computing,” Proceedings of the Third International Workshop on Grid Computing, pp. 274-278, January 2002.  (221.82 KB)
Steen, A J.. van der, and J. Dongarra, Overview of High Performance Computers,” Handbook of Massive Data Sets: Kluwer Academic Publishers, pp. 791-852, January 2001.  (442.71 KB)
P
Hoemmen, M., D. Hollman, C. Trott, D. Sunderland, N. Liber, L-T. Lo, D. Lebrun-Grandie, G. Lopez, P. Caday, S. Knepper, et al., P1673R3: A Free Function Linear algebra Interface Based on the BLAS,” ISO JTC1 SC22 WG22, no. P1673R3: ISO, April 2021.  (858.89 KB)
London, K., S. Moore, P. Mucci, K. Seymour, and R. Luczak, The PAPI Cross-Platform Interface to Hardware Performance Counters,” Department of Defense Users' Group Conference Proceedings, Biloxi, Mississippi, June 2001.  (328.56 KB)
Sid-Lakhdar, W. M., S. Cayrols, D. Bielich, A. Abdelfattah, P. Luszczek, M. Gates, S. Tomov, H. Johansen, D. Williams-Young, T. A. Davis, et al., PAQR: Pivoting Avoiding QR factorization,” ICL Technical Report, no. ICL-UT-22-06, June 2022.  (364.85 KB)
Sid-Lakhdar, W., S. Cayrols, D. Bielich, A. Abdelfattah, P. Luszczek, M. Gates, S. Tomov, H. Johansen, D. Williams-Young, T. Davis, et al., PAQR: Pivoting Avoiding QR factorization,” 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS), St. Petersburg, FL, USA, IEEE, 2023.
Malony, A. D., S. Biersdorff, S. Shende, H. Jagode, S. Tomov, G. Juckeland, R. Dietrich, D. Poole, and C. Lamb, Parallel Performance Measurement of Heterogeneous Parallel Systems with GPUs,” International Conference on Parallel Processing (ICPP'11), Taipei, Taiwan, ACM, September 2011.  (1.41 MB)
Giraud, L., J. Langou, and G.. Sylvand, On the Parallel Solution of Large Industrial Wave Propagation Problems,” Journal of Computational Acoustics (to appear), January 2005.  (1.08 MB)
Youseff, L., K. Seymour, H. You, D. Zagorodnov, J. Dongarra, and R. Wolski, Paravirtualization Effect on Single- and Multi-threaded Memory-Intensive Linear Algebra Software,” Cluster Computing Journal: Special Issue on High Performance Distributed Computing, vol. 12, no. 2: Springer Netherlands, pp. 101-122, 00 2009.  (451.07 KB)
Haidar, A., B. Brock, S. Tomov, M. Guidry, J. Jay Billings, D. Shyles, and J. Dongarra, Performance Analysis and Acceleration of Explicit Integration for Large Kinetic Networks using Batched GPU Computations,” 2016 IEEE High Performance Extreme Computing Conference (HPEC ‘16), Waltham, MA, IEEE, September 2016.  (480.29 KB)
Parker, S., J. Mellor-Crummey, D. H. Ahn, H. Jagode, H. Brunst, S. Shende, A. D. Malony, D. DelSignore, R. Tschuter, R. Castain, et al., Performance Analysis and Debugging Tools at Scale,” Exascale Scientific Applications: Scalability and Performance Portability: Chapman & Hall / CRC Press, pp. 17-50, November 2017.
Worley, P. H., J. Candy, L. Carrington, K. Huck, T. Kaiser, K. Mahinthakumar, A. D. Malony, S. Moore, D. Reed, P. C. Roth, et al., Performance Analysis of GYRO: A Tool Evaluation,” In Proceedings of the 2005 SciDAC Conference, San Francisco, CA, June 2005.  (172.07 KB)
Worley, P. H., J. Candy, L. Carrington, K. Huck, T. Kaiser, K. Mahinthakumar, A. D. Malony, S. Moore, D. Reed, P. C. Roth, et al., Performance Analysis of GYRO: A Tool Evaluation,” In Proceedings of the 2005 SciDAC Conference, San Francisco, CA, June 2005.  (172.07 KB)
Worley, P. H., J. Candy, L. Carrington, K. Huck, T. Kaiser, K. Mahinthakumar, A. D. Malony, S. Moore, D. Reed, P. C. Roth, et al., Performance Analysis of GYRO: A Tool Evaluation,” In Proceedings of the 2005 SciDAC Conference, San Francisco, CA, June 2005.  (172.07 KB)
Worley, P. H., J. Candy, L. Carrington, K. Huck, T. Kaiser, K. Mahinthakumar, A. D. Malony, S. Moore, D. Reed, P. C. Roth, et al., Performance Analysis of GYRO: A Tool Evaluation,” In Proceedings of the 2005 SciDAC Conference, San Francisco, CA, June 2005.  (172.07 KB)
Ayala, A., S. Tomov, M. Stoyanov, A. Haidar, and J. Dongarra, Performance Analysis of Parallel FFT on Large Multi-GPU Systems,” 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Lyon, France, IEEE, August 2022.
Hernandez, O., F. Song, B. Chapman, J. Dongarra, B. Mohr, S. Moore, and F. Wolf, Performance Instrumentation and Compiler Optimizations for MPI/OpenMP Applications,” Second International Workshop on OpenMP, Reims, France, January 2006.  (350.9 KB)
Hernandez, O., F. Song, B. Chapman, J. Dongarra, B. Mohr, S. Moore, and F. Wolf, Performance Instrumentation and Compiler Optimizations for MPI/OpenMP Applications,” Lecture Notes in Computer Science, OpenMP Shared Memory Parallel Programming, vol. 4315: Springer Berlin / Heidelberg, 00 2008.  (350.9 KB)
Dongarra, J., A. D. Malony, S. Moore, P. Mucci, and S. Shende, Performance Instrumentation and Measurement for Terascale Systems,” ICCS 2003 Terascale Workshop, Melbourne, Australia, Springer, Berlin, Heidelberg, June 2003.  (5.36 MB)
Bosilca, G., A. Bouteiller, T. Herault, P. Lemariner, N. Ohm Saengpatsa, S. Tomov, and J. Dongarra, Performance Portability of a GPU Enabled Factorization with the DAGuE Framework,” IEEE Cluster: workshop on Parallel Programming on Accelerator Clusters (PPAC), June 2011.  (290.98 KB)
Shende, S., A. D. Malony, A. Morris, and F. Wolf, Performance Profiling Overhead Compensation for MPI Programs,” In Proc. of the 12th European Parallel Virtual Machine and Message Passing Interface Conference: Springer LNCS, September 2005.  (220.26 KB)
Gates, M., A. Charara, A. YarKhan, D. Sukkari, M. Al Farhan, and J. Dongarra, Performance Tuning SLATE,” SLATE Working Notes, no. 14, ICL-UT-20-01: Innovative Computing Laboratory, University of Tennessee, January 2020.  (1.29 MB)
Bailey, D., J. Chame, C. Chen, J. Dongarra, M. Hall, J. K. Hollingsworth, P. D. Hovland, S. Moore, K. Seymour, J. Shin, et al., PERI Auto-tuning,” Proc. SciDAC 2008, vol. 125, Seatlle, Washington, Journal of Physics, January 2008.  (873.75 KB)
Bailey, D., J. Chame, C. Chen, J. Dongarra, M. Hall, J. K. Hollingsworth, P. D. Hovland, S. Moore, K. Seymour, J. Shin, et al., PERI Auto-tuning,” Proc. SciDAC 2008, vol. 125, Seatlle, Washington, Journal of Physics, January 2008.  (873.75 KB)
Abalenkovs, M., N. Bagherpour, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Relton, J. Sistek, D. Stevens, et al., PLASMA 17 Performance Report,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-11: University of Tennessee, June 2017.  (7.57 MB)
Abalenkovs, M., N. Bagherpour, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Relton, J. Sistek, D. Stevens, et al., PLASMA 17 Performance Report,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-11: University of Tennessee, June 2017.  (7.57 MB)
Abalenkovs, M., N. Bagherpour, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Relton, J. Sistek, D. Stevens, et al., PLASMA 17.1 Functionality Report,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-10: University of Tennessee, June 2017.  (1.8 MB)
Abalenkovs, M., N. Bagherpour, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Relton, J. Sistek, D. Stevens, et al., PLASMA 17.1 Functionality Report,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-10: University of Tennessee, June 2017.  (1.8 MB)
Dongarra, J., M. Gates, A. Haidar, J. Kurzak, P. Luszczek, P. Wu, I. Yamazaki, A. YarKhan, M. Abalenkovs, N. Bagherpour, et al., PLASMA: Parallel Linear Algebra Software for Multicore Using OpenMP,” ACM Transactions on Mathematical Software, vol. 45, issue 2, June 2019.  (7.5 MB)
Castain, R., J. Hursey, A. Bouteiller, and D. Solt, PMIx: Process Management for Exascale Environments,” Parallel Computing, vol. 79, pp. 9–29, January 2018.

Pages