Tensor layout should use strides #518

Open

ianmccul opened this issue Nov 24, 2024 · 1 comment

@ianmccul
Contributor

I don't think I'll get it done right now, so I'm putting this here to track progress.

Currently, the shape of a tensor is controlled by _shape, a vector of the dimension of each rank, and _mapper, which I assume is a permutation of the indices. There is also _invmapper, which I assume is the inverse permutation of _mapper.

This design has several disadvantages, including:

  • Getting the memory location of some element (i,j,k,...) of a tensor is quite difficult.
  • It doesn't support strided storage. If you have a tensor T(i,j,k) and you want to slice it by fixing one of the dimensions, say taking T(i,5,k) as a rank-2 tensor, this cannot be done with the current design (unless the 2nd index is the leading dimension of the storage, and even then the start of the tensor is no longer at offset zero).
  • Currently, although permute() is a fast operation, many operations that you might do afterwards require contiguous storage, so they end up copying and rearranging memory anyway, even in cases where in principle they don't need to (such as transposing a matrix and then calling a BLAS or LAPACK function). This could be avoided in the current design in some cases, but the calculation of whether the layout is compatible with BLAS is too difficult.

The representation of the layout should change to store the dimension and stride of each rank. Something like (exposition only, an actual implementation wouldn't use two separate vectors):

vector<int> dim;     // extent of each rank
vector<int> stride;  // step (in elements) between consecutive values of that index
size_type base;      // offset of element (0,0,...) from the start of storage

The location of some element (i,j,k,...) is then Tensor->Mem + base + (stride[0]*i + stride[1]*j + stride[2]*k + ...).
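As a minimal sketch of that offset computation (exposition only; TensorLayout and offset_of are illustrative names, not part of the existing code):

#include <cstddef>
#include <vector>

struct TensorLayout {
  std::vector<std::size_t> dim;     // extent of each rank
  std::vector<std::size_t> stride;  // step (in elements) for each rank
  std::size_t base = 0;             // offset of element (0,0,...) from Tensor->Mem
};

// Linear offset of element (idx[0], idx[1], ...) from the start of storage.
std::size_t offset_of(const TensorLayout& L, const std::vector<std::size_t>& idx) {
  std::size_t off = L.base;
  for (std::size_t r = 0; r < idx.size(); ++r)
    off += L.stride[r] * idx[r];
  return off;  // the element lives at Tensor->Mem + off
}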

Permuting indices is still fast: just apply the permutation to the dim and stride arrays. Slicing a tensor is also a simple operation. For example, if you want a rank-2 tensor that represents T(i,5,k), the layout simply removes that dimension and updates the base index to base += 5 * stride[1]. Similarly, projecting onto some range along a dimension is also fairly simple (like Python's Array[:, 1:3]).
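Continuing the hypothetical sketch above, both operations become pure layout manipulations (permute and fix_index are made-up names for illustration):

// Reorder the layout without touching any data.
TensorLayout permute(const TensorLayout& L, const std::vector<std::size_t>& perm) {
  TensorLayout out;
  out.base = L.base;
  for (std::size_t p : perm) {
    out.dim.push_back(L.dim[p]);
    out.stride.push_back(L.stride[p]);
  }
  return out;
}

// Fix dimension `axis` to index `value`, e.g. T(i,5,k) -> a rank-2 view.
TensorLayout fix_index(const TensorLayout& L, std::size_t axis, std::size_t value) {
  TensorLayout out = L;
  out.base += value * L.stride[axis];   // shift the starting offset
  out.dim.erase(out.dim.begin() + static_cast<std::ptrdiff_t>(axis));
  out.stride.erase(out.stride.begin() + static_cast<std::ptrdiff_t>(axis));
  return out;
}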

Determining whether the layout is compatible with a BLAS or LAPACK call is fairly easy: the rank must be 2, and one of the strides must be 1. If stride[0] == 1 then it is a column-major matrix (the BLAS trans parameter is N), and if stride[1] == 1 then it is row-major (the BLAS trans parameter is T). The leading dimension is the larger of the two strides.
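Again as a sketch on the hypothetical TensorLayout above (BlasView and as_blas_matrix are illustrative names), the check might look like:

struct BlasView {
  char trans;       // 'N' for column-major, 'T' for row-major
  std::size_t ld;   // leading dimension = the larger of the two strides
};

// True if the rank-2 layout can be handed to BLAS/LAPACK without copying.
bool as_blas_matrix(const TensorLayout& L, BlasView& out) {
  if (L.dim.size() != 2) return false;                              // must be rank 2
  if (L.stride[0] == 1) { out = {'N', L.stride[1]}; return true; }  // column-major
  if (L.stride[1] == 1) { out = {'T', L.stride[0]}; return true; }  // row-major
  return false;                                                     // neither stride is 1: a copy is needed
}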

@ianmccul
Contributor Author

> the calculation of whether the layout is compatible with BLAS is too difficult.

Actually this cannot be true: in the current design the only possible _mapper values for a rank-2 tensor are {0,1} and {1,0}, which map onto BLAS normal and transpose, respectively (or the other way around, for column-major storage). Why isn't this done in the existing BLAS/LAPACK wrappers?
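As a rough illustration of this point (trans_from_mapper is a hypothetical helper; whether {0,1} corresponds to 'N' or 'T' depends on the storage order, as noted above):

#include <vector>

// For a rank-2 tensor, the only two possible _mapper values fix the BLAS trans flag.
char trans_from_mapper(const std::vector<int>& mapper) {
  return (mapper[0] == 0 && mapper[1] == 1) ? 'N' : 'T';
}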
