https://www.conftool.net/parco2011/inde ... racts=show
Improving Performance of Triangular Matrix-Vector BLAS Routines on GPUs
Przemyslaw Stpiczynski, Marek Karwacki
Will the improvements documented here be incorporated into a new release of MAGMA soon? I have been in touch with the authors, and they say that they have submitted the code to you for inclusion.