Bug: getrf_batched kernel produces NaNs on singular square inputs of size <=32

Post by **pearu** » Tue Oct 29, 2019 5:23 am

Hi,

I was not able to file a bug report in the MAGMA Mercurial issue tracker, so I am reporting it here.

The subject of this message summarizes the issue. Here is a reproducer based on PyTorch:


>>> import torch
>>> m, n = 3, 3
>>> torch.ones(1, m, n, device='cuda').lu()
(tensor([[[1., 1., 1.],
         [1., 0., 0.],
         [1., nan, nan]]], device='cuda:0'), tensor([[1, 2, 3]], device='cuda:0', dtype=torch.int32))

Note that the nan entries appear only when m == n and m <= 32; in all other cases, getrf_batched works correctly.

The source of this issue is likely in the kernel functions implemented in magmablas/zgetrf_batched_smallsq_shfl.cu and magmablas/zgetrf_batched_smallsq_noshfl.cu.
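For illustration, here is a minimal pure-Python sketch of one plausible mechanism: an LU factorization with partial pivoting that divides by the pivot without checking whether it is zero. This is not MAGMA's kernel code, and I have not confirmed that the small-square kernels fail in this way. The names `ieee_div`, `lu_sketch` and `guard_zero_pivot` are hypothetical. The guarded branch follows the LAPACK getrf convention of recording the first zero pivot in `info` and skipping the scaling step.

```python
import math

def ieee_div(a, b):
    """Divide the way IEEE-754 hardware does: 0/0 -> nan, x/0 -> +/-inf."""
    if b == 0.0:
        if a == 0.0 or math.isnan(a):
            return math.nan
        return math.copysign(math.inf, a)
    return a / b

def lu_sketch(a, guard_zero_pivot):
    """In-place LU with partial pivoting on a square list-of-lists matrix.

    Returns (lu, pivots, info), where pivots are 1-based and info is the
    1-based index of the first exactly-zero pivot (0 if there is none).
    """
    n = len(a)
    pivots = []
    info = 0
    for k in range(n):
        # Choose the row with the largest magnitude in column k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        pivots.append(p + 1)
        a[k], a[p] = a[p], a[k]
        pivot = a[k][k]
        if pivot == 0.0 and info == 0:
            info = k + 1  # remember the first zero pivot
        for i in range(k + 1, n):
            if not (guard_zero_pivot and pivot == 0.0):
                # Unguarded path: 0/0 produces nan for a singular input.
                a[i][k] = ieee_div(a[i][k], pivot)
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    return a, pivots, info

ones = lambda n: [[1.0] * n for _ in range(n)]
print(lu_sketch(ones(3), guard_zero_pivot=False))
print(lu_sketch(ones(3), guard_zero_pivot=True))
```

On the 3x3 matrix of ones, the unguarded call gives the same pattern as the reproducer above: last row [1, nan, nan] with pivots [1, 2, 3]. The guarded call leaves the last row as [1, 0, 0] and reports info = 2.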

Best regards,
Pearu

Post by **mgates3** » Tue Oct 29, 2019 8:11 pm

You need to have a Bitbucket account to post bug reports. I posted this there for tracking:
https://bitbucket.org/icl/magma/issues/ ... es-nans-on

-mark
