Publications
“Performance, Design, and Autotuning of Batched GEMM for GPUs,”
High Performance Computing: 31st International Conference, ISC High Performance 2016, Frankfurt, Germany, June 19-23, 2016, Proceedings, no. 9697: Springer International Publishing, pp. 21–38, 2016.
DOI: 10.1007/978-3-319-41321-1_2
“Characterization of Power Usage and Performance in Data-Intensive Applications using MapReduce over MPI,”
2019 International Conference on Parallel Computing (ParCo2019), Prague, Czech Republic, September 2019.
“Argobots: A Lightweight Low-Level Threading and Tasking Framework,”
IEEE Transactions on Parallel and Distributed Systems, October 2017.
DOI: 10.1109/TPDS.2017.2766062