%0 Generic
%D 2020
%T Asynchronous SGD for DNN Training on Shared-Memory Parallel Architectures
%A Florent Lopez
%A Edmond Chow
%A Stanimire Tomov
%A Jack Dongarra
%K Asynchronous iterative methods
%K Deep learning
%K GPU
%K multicore CPU
%K Stochastic Gradient Descent
%X We present a parallel asynchronous Stochastic Gradient Descent algorithm for shared-memory architectures. Unlike previous asynchronous algorithms, we consider the case where the gradient updates are not particularly sparse. Within the MagmaDNN framework, we compare the parallel efficiency of the asynchronous implementation with that of the traditional synchronous implementation. Tests are performed for training deep neural networks on multicore CPUs and GPU devices.
%B Innovative Computing Laboratory Technical Report
%I University of Tennessee, Knoxville
%8 2020-03
%G eng
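
Note: the sketch below illustrates only the general idea of asynchronous SGD on a shared-memory machine in the Hogwild style, where worker threads read and update a shared parameter vector without synchronization. It is a minimal illustration on a synthetic least-squares problem, not the MagmaDNN implementation described in the report; the problem setup, thread count, and learning rate are assumptions made for this example.

// Minimal sketch of Hogwild-style asynchronous SGD on shared memory.
// Worker threads update a shared parameter vector without locking; the
// deliberate data races on w are accepted in this style (a production
// version would use relaxed atomics). Compile with: g++ -O2 -pthread
#include <cmath>
#include <cstdio>
#include <random>
#include <thread>
#include <vector>

int main() {
    const int    n_features = 8;
    const int    n_samples  = 4096;
    const int    n_workers  = 4;
    const int    n_epochs   = 20;
    const double lr         = 0.01;

    // Synthetic least-squares data: y = x . w_true (assumed toy problem).
    std::mt19937 gen(42);
    std::normal_distribution<double> dist(0.0, 1.0);
    std::vector<double> w_true(n_features), X(n_samples * n_features), y(n_samples);
    for (auto& w : w_true) w = dist(gen);
    for (int i = 0; i < n_samples; ++i) {
        double dot = 0.0;
        for (int j = 0; j < n_features; ++j) {
            X[i * n_features + j] = dist(gen);
            dot += X[i * n_features + j] * w_true[j];
        }
        y[i] = dot;
    }

    // Shared model parameters, updated asynchronously by all workers.
    std::vector<double> w(n_features, 0.0);

    auto worker = [&](int id) {
        std::mt19937 local_gen(id);
        std::uniform_int_distribution<int> pick(0, n_samples - 1);
        for (int it = 0; it < n_epochs * n_samples / n_workers; ++it) {
            int i = pick(local_gen);
            // Gradient of 0.5 * (x_i . w - y_i)^2, read from possibly
            // stale shared parameters.
            double err = -y[i];
            for (int j = 0; j < n_features; ++j)
                err += X[i * n_features + j] * w[j];
            // Unsynchronized (lock-free) update of the shared parameters.
            for (int j = 0; j < n_features; ++j)
                w[j] -= lr * err * X[i * n_features + j];
        }
    };

    std::vector<std::thread> threads;
    for (int t = 0; t < n_workers; ++t) threads.emplace_back(worker, t);
    for (auto& t : threads) t.join();

    double err_norm = 0.0;
    for (int j = 0; j < n_features; ++j)
        err_norm += (w[j] - w_true[j]) * (w[j] - w_true[j]);
    std::printf("||w - w_true|| = %g\n", std::sqrt(err_norm));
    return 0;
}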