Publications

Y
YarKhan, A., A. Haidar, C. Cao, P. Luszczek, S. Tomov, and J. Dongarra, “Cholesky Across Accelerators,” 17th IEEE International Conference on High Performance Computing and Communications (HPCC 2015), Elizabeth, NJ, IEEE, August 2015.
YarKhan, A., K. Seymour, K. Sagi, Z. Shi, and J. Dongarra, “Recent Developments in GridSolve,” International Journal of High Performance Computing Applications (Special Issue: Scheduling for Large-Scale Heterogeneous Platforms), vol. 20, no. 1: Sage Science Press, 2006.
YarKhan, A., J. Kurzak, and J. Dongarra, “QUARK Users' Guide: QUeueing And Runtime for Kernels,” University of Tennessee Innovative Computing Laboratory Technical Report, no. ICL-UT-11-02, 2011.
YarKhan, A., J. Kurzak, P. Luszczek, and J. Dongarra, “Porting the PLASMA Numerical Library to the OpenMP Standard,” International Journal of Parallel Programming, June 2016.
YarKhan, A., and J. Dongarra, “Biological Sequence Alignment on the Computational Grid Using the GrADS Framework,” Future Generation Computing Systems, vol. 21, no. 6: Elsevier, pp. 980-986, June 2005.
YarKhan, A., and J. Dongarra, “Experiments with Scheduling Using Simulated Annealing in a Grid Environment,” Grid Computing - GRID 2002, Third International Workshop, vol. 2536, Baltimore, MD, Springer, pp. 232-242, November 2002.
YarKhan, A., J. Kurzak, A. Abdelfattah, and J. Dongarra, “An Empirical View of SLATE Algorithms on Scalable Hybrid System,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-08: University of Tennessee, Knoxville, September 2019.
YarKhan, A., M. Al Farhan, D. Sukkari, M. Gates, and J. Dongarra, “SLATE Performance Report: Updates to Cholesky and LU Factorizations,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-14: University of Tennessee, October 2020.
YarKhan, A., Dynamic Task Execution on Shared and Distributed Memory Architectures, 2012.
Yi, Q., K. Kennedy, H. You, K. Seymour, and J. Dongarra, “Automatic Blocking of QR and LU Factorizations for Locality,” 2nd ACM SIGPLAN Workshop on Memory System Performance (MSP 2004), Washington, DC, ACM, June 2004.
You, H., Q. Liu, Z. Li, and S. Moore, “The Design of an Auto-tuning I/O Framework on Cray XT5 System,” Cray Users Group Conference (CUG'11) (Best Paper Finalist), Fairbanks, Alaska, May 2011.
You, H., K. Seymour, and J. Dongarra, “An Effective Empirical Search Method for Automatic Software Tuning,” ICL Technical Report, no. ICL-UT-05-02, January 2005.
You, H., K. Seymour, J. Dongarra, and S. Moore, “Empirical Tuning of a Multiresolution Analysis Kernel using a Specialized Code Generator,” ICL Technical Report, no. ICL-UT-07-02, January 2007.
You, H., K. Seymour, J. Dongarra, and S. Moore, “Automated Empirical Tuning of a Multiresolution Analysis Kernel,” ICL Technical Report, no. ICL-UT-07-01, pp. 10, January 2007.
You, H., B. Rekapalli, Q. Liu, and S. Moore, “Autotuned Parallel I/O for Highly Scalable Biosequence Analysis,” TeraGrid'11, Salt Lake City, Utah, July 2011.
Youseff, L., K. Seymour, H. You, J. Dongarra, and R. Wolski, “The Impact of Paravirtualized Memory Hierarchy on Linear Algebra Computational Kernels and Software,” ACM/IEEE International Symposium on High Performance Distributed Computing, Boston, MA, June 2008.
Youseff, L., K. Seymour, H. You, D. Zagorodnov, J. Dongarra, and R. Wolski, “Paravirtualization Effect on Single- and Multi-threaded Memory-Intensive Linear Algebra Software,” Cluster Computing Journal: Special Issue on High Performance Distributed Computing, vol. 12, no. 2: Springer Netherlands, pp. 101-122, 2009.
Z
Zaitsev, D., and P. Luszczek, “Docker Container based PaaS Cloud Computing Comprehensive Benchmarks using LAPACK,” Computer Modeling and Intelligent Systems CMIS-2020, Zaporizhzhia, March 2020.
Zaitsev, D., S. Tomov, and J. Dongarra, “Solving Linear Diophantine Systems on Parallel Architectures,” IEEE Transactions on Parallel and Distributed Systems, vol. 30, no. 5, pp. 1158-1169, May 2019.
Zhao, Y., L. Wan, W. Wu, G. Bosilca, R. Vuduc, J. Ye, W. Tang, and Z. Xu, “Efficient Communications in Training Large Scale Neural Networks,” ACM MultiMedia Workshop 2017, Mountain View, CA, ACM, October 2017.
Zhong, D., Q. Cao, G. Bosilca, and J. Dongarra, “Using Advanced Vector Extensions AVX-512 for MPI Reduction,” EuroMPI/USA '20: 27th European MPI Users' Group Meeting, Austin, TX, September 2020.
Zhong, D., Q. Cao, G. Bosilca, and J. Dongarra, “Using long vector extensions for MPI reductions,” Parallel Computing, vol. 109, pp. 102871, March 2022.
Zhong, D., A. Bouteiller, X. Luo, and G. Bosilca, “Runtime Level Failure Detection and Propagation in HPC Systems,” European MPI Users' Group Meeting (EuroMPI '19), Zürich, Switzerland, ACM, September 2019.
Zhong, D., P. Shamis, Q. Cao, G. Bosilca, and J. Dongarra, “Using Arm Scalable Vector Extension to Optimize Open MPI,” 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID 2020), Melbourne, Australia, IEEE/ACM, May 2020.
Zhong, D., G. Bosilca, Q. Cao, and J. Dongarra, Using Advanced Vector Extensions AVX-512 for MPI Reduction (Poster), Austin, TX, EuroMPI/USA '20: 27th European MPI Users' Group Meeting, September 2020.
Zunger, A., A. Franceschetti, G. Bester, W. B. Jones, K. Kim, P. A. Graf, L-W. Wang, A. Canning, O. Marques, C. Voemel, et al., “Predicting the electronic properties of 3D, million-atom semiconductor nanostructure architectures,” J. Phys.: Conf. Ser., vol. 46, pp. 292-298, January 2006, doi:10.1088/1742-6596/46/1/040.
