`linalg::Svd()` outputs NaN when the input tensor contains only zeros #505

IvanaGyro · 2024-11-09T18:22:44Z

linalg::Svd() outputs NaN when the input tensor contains only zeros. The issue occurs only on GPU, and it does not occur when the data type is float. This bug is the cause of the failing Svd.gpu_U1_zeros_test test case.

Below is the minimal test case to reproduce the bug.

#include <vector>

#include "cuda_runtime_api.h"
#include "gtest/gtest.h"

#include "Generator.hpp"
#include "linalg.hpp"
#include "Tensor.hpp"
#include "Type.hpp"
#include "Device.hpp"

TEST(Svd, GpuZeros) {
  cudaSetDevice(Device.cuda);
  Tensor all_zeros = zeros(64, Type.Double, Device.cuda);
  all_zeros.reshape_(16, 4);
  std::vector<Tensor> svd_output = linalg::Svd(all_zeros, /* is_UvT */ true);
  cudaDeviceSynchronize();
  ASSERT_EQ(svd_output.size(), 3);
  Tensor& singular_values = svd_output[0];
  ASSERT_EQ(singular_values.shape().size(), 1);
  ASSERT_EQ(singular_values.shape()[0], 4);
  EXPECT_EQ(singular_values.at({0}), 0);
  EXPECT_EQ(singular_values.at({1}), 0);
  EXPECT_EQ(singular_values.at({2}), 0);
  EXPECT_EQ(singular_values.at({3}), 0);
}

current output:

: Failure
Expected equality of these values:
  singular_values.at({0})
    Which is: < -nan > dtype: [Double (Float64)]
  0

: Failure
Expected equality of these values:
  singular_values.at({1})
    Which is: < -nan > dtype: [Double (Float64)]
  0

: Failure
Expected equality of these values:
  singular_values.at({2})
    Which is: < -nan > dtype: [Double (Float64)]
  0

: Failure
Expected equality of these values:
  singular_values.at({3})
    Which is: < -nan > dtype: [Double (Float64)]
  0

IvanaGyro · 2024-11-14T15:04:34Z

For matrix dimensions from 1 to 60, the output singular values contain NaN only when the number of singular values, min(m, n), is 4, 23, 28, 33, 48, 51, 53, 58, or 60. Below is the code used to find these numbers, followed by its output. I am not sure why these numbers cause the problem. However, I have confirmed that the NaN is output by cusolverDnXgesvdp(), which is a cuSOLVER function.

#include <iostream>
#include <utility>
#include <vector>

#include "cuda_runtime_api.h"
#include "gtest/gtest.h"

#include "backend/Storage.hpp"
#include "Generator.hpp"
#include "linalg.hpp"
#include "Tensor.hpp"
#include "Type.hpp"
#include "Device.hpp"

TEST(Svd, GetZerosWhenInputtingZeros) {
  cudaSetDevice(Device.cuda);
  auto any_non_zero = [](const Tensor& tensor) -> bool {
    Storage storage = tensor.storage();
    size_t size = storage.size();
    double* ptr = reinterpret_cast<double*>(storage.data());
    for (size_t i = 0; i < size; ++i) {
      if (*ptr++ != 0) {
        return true;
      }
    }
    return false;
  };
  std::vector<std::pair<size_t, size_t>> shapes_get_nan;
  for (size_t m = 1; m < 61; ++m) {
    for (size_t n = 1; n < 61; ++n) {
      Tensor all_zeros = zeros(m * n, Type.Double, Device.cuda);
      all_zeros.reshape_(m, n);
      std::vector<Tensor> svd_output = linalg::Svd(all_zeros, /* is_UvT */ true);
      Tensor& singular_values = svd_output[0];
      if (any_non_zero(singular_values)) {
        shapes_get_nan.emplace_back(m, n);
      }
    }
  }
  std::cout << "{";
  for (const std::pair<size_t, size_t>& shape : shapes_get_nan) {
    std::cout << std::dec << "{" << shape.first << ", " << shape.second << "}, ";
  }
  std::cout << "}" << std::endl;
}

output:

{{4, 4}, {4, 5}, {4, 6}, {4, 7}, {4, 8}, {4, 9}, {4, 10}, {4, 11}, {4, 12}, {4, 13}, {4, 14}, {4, 15}, {4, 16}, {4, 17}, {4, 18}, {4, 19}, {4, 20}, 
{4, 21}, {4, 22}, {4, 23}, {4, 24}, {4, 25}, {4, 26}, {4, 27}, {4, 28}, {4, 29}, {4, 30}, {4, 31}, {4, 32}, {4, 33}, {4, 34}, {4, 35}, {4, 36}, 
{4, 37}, {4, 38}, {4, 39}, {4, 40}, {4, 41}, {4, 42}, {4, 43}, {4, 44}, {4, 45}, {4, 46}, {4, 47}, {4, 48}, {4, 49}, {4, 50}, {4, 51}, {4, 52}, 
{4, 53}, {4, 54}, {4, 55}, {4, 56}, {4, 57}, {4, 58}, {4, 59}, {4, 60}, {5, 4}, {6, 4}, {7, 4}, {8, 4}, {9, 4}, {10, 4}, {11, 4}, {12, 4}, {13, 4}, 
{14, 4}, {15, 4}, {16, 4}, {17, 4}, {18, 4}, {19, 4}, {20, 4}, {21, 4}, {22, 4}, {23, 4}, {23, 23}, {23, 24}, {23, 25}, {23, 26}, {23, 27},
{23, 28}, {23, 29}, {23, 30}, {23, 31}, {23, 32}, {23, 33}, {23, 34}, {23, 35}, {23, 36}, {23, 37}, {23, 38}, {23, 39}, {23, 40}, {23, 41}, 
{23, 42}, {23, 43}, {23, 44}, {23, 45}, {23, 46}, {23, 47}, {23, 48}, {23, 49}, {23, 50}, {23, 51}, {23, 52}, {23, 53}, {23, 54}, {23, 55}, 
{23, 56}, {23, 57}, {23, 58}, {23, 59}, {23, 60}, {24, 4}, {24, 23}, {25, 4}, {25, 23}, {26, 4}, {26, 23}, {27, 4}, {27, 23}, {28, 4}, {28, 23}, 
{28, 28}, {28, 29}, {28, 30}, {28, 31}, {28, 32}, {28, 33}, {28, 34}, {28, 35}, {28, 36}, {28, 37}, {28, 38}, {28, 39}, {28, 40}, {28, 41}, 
{28, 42}, {28, 43}, {28, 44}, {28, 45}, {28, 46}, {28, 47}, {28, 48}, {28, 49}, {28, 50}, {28, 51}, {28, 52}, {28, 53}, {28, 54}, {28, 55}, 
{28, 56}, {28, 57}, {28, 58}, {28, 59}, {28, 60}, {29, 4}, {29, 23}, {29, 28}, {30, 4}, {30, 23}, {30, 28}, {31, 4}, {31, 23}, {31, 28}, 
{32, 4}, {32, 23}, {32, 28}, {33, 4}, {33, 23}, {33, 28}, {33, 33}, {33, 34}, {33, 35}, {33, 36}, {33, 37}, {33, 38}, {33, 39}, {33, 40}, 
{33, 41}, {33, 42}, {33, 43}, {33, 44}, {33, 45}, {33, 46}, {33, 47}, {33, 48}, {33, 49}, {33, 50}, {33, 51}, {33, 52}, {33, 53}, {33, 54}, 
{33, 55}, {33, 56}, {33, 57}, {33, 58}, {33, 59}, {33, 60}, {34, 4}, {34, 23}, {34, 28}, {34, 33}, {35, 4}, {35, 23}, {35, 28}, {35, 33}, 
{36, 4}, {36, 23}, {36, 28}, {36, 33}, {37, 4}, {37, 23}, {37, 28}, {37, 33}, {38, 4}, {38, 23}, {38, 28}, {38, 33}, {39, 4}, {39, 23}, {39, 28},
 {39, 33}, {40, 4}, {40, 23}, {40, 28}, {40, 33}, {41, 4}, {41, 23}, {41, 28}, {41, 33}, {42, 4}, {42, 23}, {42, 28}, {42, 33}, {43, 4}, {43, 23}, 
{43, 28}, {43, 33}, {44, 4}, {44, 23}, {44, 28}, {44, 33}, {45, 4}, {45, 23}, {45, 28}, {45, 33}, {46, 4}, {46, 23}, {46, 28}, {46, 33}, {47, 4}, 
{47, 23}, {47, 28}, {47, 33}, {48, 4}, {48, 23}, {48, 28}, {48, 33}, {48, 48}, {48, 49}, {48, 50}, {48, 51}, {48, 52}, {48, 53}, {48, 54}, 
{48, 55}, {48, 56}, {48, 57}, {48, 58}, {48, 59}, {48, 60}, {49, 4}, {49, 23}, {49, 28}, {49, 33}, {49, 48}, {50, 4}, {50, 23}, {50, 28}, 
{50, 33}, {50, 48}, {51, 4}, {51, 23}, {51, 28}, {51, 33}, {51, 48}, {51, 51}, {51, 52}, {51, 53}, {51, 54}, {51, 55}, {51, 56}, {51, 57}, 
{51, 58}, {51, 59}, {51, 60}, {52, 4}, {52, 23}, {52, 28}, {52, 33}, {52, 48}, {52, 51}, {53, 4}, {53, 23}, {53, 28}, {53, 33}, {53, 48}, 
{53, 51}, {53, 53}, {53, 54}, {53, 55}, {53, 56}, {53, 57}, {53, 58}, {53, 59}, {53, 60}, {54, 4}, {54, 23}, {54, 28}, {54, 33}, {54, 48}, 
{54, 51}, {54, 53}, {55, 4}, {55, 23}, {55, 28}, {55, 33}, {55, 48}, {55, 51}, {55, 53}, {56, 4}, {56, 23}, {56, 28}, {56, 33}, {56, 48}, 
{56, 51}, {56, 53}, {57, 4}, {57, 23}, {57, 28}, {57, 33}, {57, 48}, {57, 51}, {57, 53}, {58, 4}, {58, 23}, {58, 28}, {58, 33}, {58, 48}, 
{58, 51}, {58, 53}, {58, 58}, {58, 59}, {58, 60}, {59, 4}, {59, 23}, {59, 28}, {59, 33}, {59, 48}, {59, 51}, {59, 53}, {59, 58}, {60, 4}, 
{60, 23}, {60, 28}, {60, 33}, {60, 48}, {60, 51}, {60, 53}, {60, 58}, {60, 60}, }
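
The pattern in the output above can be written as a small predicate: a shape is listed when min(m, n) falls in a fixed set of values. The sketch below is only a compact restatement of the observed list for 1 <= m, n <= 60. The names `kObservedBadRanks` and `ShapeWasReported` are hypothetical and not part of Cytnx or cuSOLVER, and nothing here explains why these values trigger the NaN.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Values of min(m, n) for which NaN was observed, read off the output above.
// Empirical only: this is not a documented property of cuSOLVER.
constexpr std::array<std::size_t, 9> kObservedBadRanks = {4,  23, 28, 33, 48,
                                                          51, 53, 58, 60};

// Hypothetical helper: returns true when an m x n all-zero input was listed
// in the output above (valid only for 1 <= m, n <= 60).
bool ShapeWasReported(std::size_t m, std::size_t n) {
  assert(m >= 1 && n >= 1);
  const std::size_t num_singular_values = std::min(m, n);
  return std::find(kObservedBadRanks.begin(), kObservedBadRanks.end(),
                   num_singular_values) != kObservedBadRanks.end();
}
```

For example, `ShapeWasReported(16, 4)` is true, which matches the shape used in the minimal test case, while `ShapeWasReported(5, 5)` is false.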

IvanaGyro · 2024-11-14T15:24:03Z

Should we force the SVD function to output all zeros when the input values are all zeros?
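
To make the question concrete, here is a minimal sketch of such a guard on plain host vectors. `SvdBackend`, `IsAllZero`, and `SingularValuesWithZeroGuard` are hypothetical names for illustration and not Cytnx API. The backend is a stand-in for the real solver call. The sketch only covers the singular values and leaves open what U and vT should be.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for the real solver: takes an m x n matrix stored in
// a flat vector and returns min(m, n) singular values.
using SvdBackend = std::function<std::vector<double>(
    const std::vector<double>&, std::size_t, std::size_t)>;

// True when every element compares equal to zero (+0.0 or -0.0).
// NaN compares unequal to zero, so an input containing NaN is not all-zero.
bool IsAllZero(const std::vector<double>& values) {
  return std::all_of(values.begin(), values.end(),
                     [](double v) { return v == 0.0; });
}

// Returns exact zeros for an all-zero input without calling the backend;
// otherwise forwards to the backend unchanged.
std::vector<double> SingularValuesWithZeroGuard(
    const std::vector<double>& matrix, std::size_t m, std::size_t n,
    const SvdBackend& backend) {
  assert(matrix.size() == m * n);
  if (IsAllZero(matrix)) {
    return std::vector<double>(std::min(m, n), 0.0);
  }
  return backend(matrix, m, n);
}
```

The cost of this approach is one extra pass over the input before every SVD. On GPU, that pass would presumably have to be a device-side reduction rather than a host loop.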

ianmccul · 2024-11-14T15:31:58Z

I have posted so many bug reports for cusolver SVD ... it is very disappointing that 6 years later there are still lots of bugs in it. Their test infrastructure must be pathetic.

It is actually useful in some situations if the SVD of a matrix of zeros gives random unitaries for the U and V matrices. The reason is that if you need to select from a vector that has a zero singular value, then you should get a random choice, rather than (1,0,0,0,..), which is a rather biased selection! But probably not something that you want to incorporate into the SVD unconditionally ...
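
As a rough illustration of what a "random choice" of basis could look like, the sketch below builds a random real orthogonal matrix by orthonormalizing Gaussian random columns with Gram-Schmidt. It is a simplification: it is real-valued rather than unitary, it runs on the host, and `RandomOrthogonal` and `MaxOrthogonalityError` are hypothetical helpers, not Cytnx or cuSOLVER functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: returns an n x n real orthogonal matrix (row-major),
// built by orthonormalizing Gaussian random columns.
std::vector<double> RandomOrthogonal(std::size_t n, std::mt19937_64& rng) {
  std::normal_distribution<double> gauss(0.0, 1.0);
  std::vector<double> q(n * n, 0.0);
  for (std::size_t col = 0; col < n; ++col) {
    double norm = 0.0;
    do {
      // Draw a fresh random column.
      for (std::size_t row = 0; row < n; ++row) q[row * n + col] = gauss(rng);
      // Remove components along the previous columns; the second pass
      // cleans up rounding error left by the first.
      for (int pass = 0; pass < 2; ++pass) {
        for (std::size_t prev = 0; prev < col; ++prev) {
          double dot = 0.0;
          for (std::size_t row = 0; row < n; ++row)
            dot += q[row * n + prev] * q[row * n + col];
          for (std::size_t row = 0; row < n; ++row)
            q[row * n + col] -= dot * q[row * n + prev];
        }
      }
      norm = 0.0;
      for (std::size_t row = 0; row < n; ++row)
        norm += q[row * n + col] * q[row * n + col];
      norm = std::sqrt(norm);
    } while (norm < 1e-12);  // Redraw in the unlikely degenerate case.
    for (std::size_t row = 0; row < n; ++row) q[row * n + col] /= norm;
  }
  return q;
}

// Largest absolute deviation of Q^T Q from the identity matrix.
double MaxOrthogonalityError(const std::vector<double>& q, std::size_t n) {
  assert(q.size() == n * n);
  double worst = 0.0;
  for (std::size_t a = 0; a < n; ++a) {
    for (std::size_t b = 0; b < n; ++b) {
      double dot = 0.0;
      for (std::size_t row = 0; row < n; ++row)
        dot += q[row * n + a] * q[row * n + b];
      worst = std::max(worst, std::fabs(dot - (a == b ? 1.0 : 0.0)));
    }
  }
  return worst;
}
```

With such a matrix as U (and another as V), no basis vector is favored when picking a direction associated with a zero singular value, unlike the fixed choice (1,0,0,0,..).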

IvanaGyro added the bug label Nov 10, 2024

IvanaGyro mentioned this issue Nov 12, 2024

Skip test cases testing unimplemented functions #503

Merged

IvanaGyro self-assigned this Nov 14, 2024

IvanaGyro added the question label Nov 14, 2024

IvanaGyro mentioned this issue Dec 27, 2024

Fix gpu unit test. #545

Merged
