I don't think I'll get it done right now, so putting this here to track progress.
Currently, the shape of a tensor is controlled by _shape, a vector giving the dimension of each rank, and _mapper, which I assume is a permutation of the indices. There is also _invmapper, which I assume is the inverse permutation of _mapper.
This design has several disadvantages, including:
Getting the memory location of some element (i,j,k,...) of a tensor is quite difficult.
It doesn't support strided storage. If you have a tensor T(i,j,k) and you want to slice it to fix one of the dimensions, say T(i,5,k) as a rank-2 tensor, it cannot be done with this design (unless the 2nd index is the leading dimension of the storage, and even then the start of the tensor is no longer at offset zero).
Currently, although permute() is a fast operation, many operations you might do after it require contiguous storage, so they end up copying and rearranging memory anyway, even in cases where in principle no copy is needed (such as transposing a matrix and then calling a BLAS or LAPACK function). In principle this could be avoided with the current design in some cases, but the calculation of whether the layout is compatible with BLAS is too difficult.
The representation of the layout should change to store the dimension and stride of each rank. Something like the following (exposition only; an actual implementation wouldn't use two separate vectors):
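A minimal sketch of what that could look like; the names TensorLayout, dim, stride, and base are illustrative, not taken from the existing code:

```cpp
#include <cstddef>
#include <vector>

// Exposition-only sketch; a real implementation would probably keep
// (dim, stride) pairs together rather than in two separate vectors.
struct TensorLayout {
    std::size_t base = 0;             // offset of element (0,0,...,0) into the storage
    std::vector<std::size_t> dim;     // extent of each rank
    std::vector<std::size_t> stride;  // element distance between consecutive indices of each rank
};
```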
The location of some element (i,j,k,...) is then Tensor->Mem + base + (stride[0]*i + stride[1]*j + stride[2]*k + ...)
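As a sketch against the hypothetical TensorLayout above, the address calculation is just a dot product of strides and indices:

```cpp
// Hypothetical helper: linear offset of element (idx[0], idx[1], ...).
// The element itself then lives at Tensor->Mem + offset_of(layout, idx).
std::size_t offset_of(const TensorLayout& t, const std::vector<std::size_t>& idx) {
    std::size_t off = t.base;
    for (std::size_t r = 0; r < idx.size(); ++r)
        off += t.stride[r] * idx[r];
    return off;
}
```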
Permuting indices is still fast: just apply the permutation to the dim and stride arrays. Slicing a tensor is also a simple operation. For example, if you want a rank-2 tensor that represents T(i,5,j), then the layout simply drops that dimension, and the base index is updated to base += 5 * stride[1]. Similarly, projecting onto some range along some dimension is also fairly simple (like Python's Array[:, 1:3]).
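Sketches of both operations on the hypothetical layout above; each manipulates the layout only and never touches the underlying storage:

```cpp
// Reorder the (dim, stride) pairs according to the permutation.
TensorLayout permute(const TensorLayout& t, const std::vector<std::size_t>& perm) {
    TensorLayout out;
    out.base = t.base;
    for (std::size_t r : perm) {
        out.dim.push_back(t.dim[r]);
        out.stride.push_back(t.stride[r]);
    }
    return out;
}

// e.g. fix_index(t, 1, 5) turns the layout of T(i,j,k) into that of T(i,5,k):
// drop the fixed rank and fold the fixed index into the base offset.
TensorLayout fix_index(TensorLayout t, std::size_t rank, std::size_t value) {
    t.base += value * t.stride[rank];
    t.dim.erase(t.dim.begin() + static_cast<std::ptrdiff_t>(rank));
    t.stride.erase(t.stride.begin() + static_cast<std::ptrdiff_t>(rank));
    return t;
}
```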
Determining whether the layout is compatible with a BLAS or LAPACK call is fairly easy: the rank must be 2, and one of the strides must be 1. If stride[0] == 1 then it is a column-major matrix (the BLAS trans parameter is N), and if stride[1] == 1 then it is row-major (trans is T). The leading dimension is the largest stride.
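A sketch of that check on the hypothetical layout above, assuming the column-major convention just described:

```cpp
// Hypothetical check: a rank-2 layout is BLAS/LAPACK-compatible when one
// stride equals 1; the other stride is then the leading dimension.
bool blas_compatible(const TensorLayout& t, char& trans, std::size_t& ld) {
    if (t.dim.size() != 2) return false;
    if (t.stride[0] == 1) { trans = 'N'; ld = t.stride[1]; return true; }  // column-major
    if (t.stride[1] == 1) { trans = 'T'; ld = t.stride[0]; return true; }  // row-major
    return false;
}
```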
> the calculation of whether the layout is compatible with BLAS is too difficult.
Actually this cannot be true: in the current design the only possible _mapper values for a rank-2 tensor are {0,1} and {1,0}, which map onto BLAS normal and transpose, respectively (or the other way around, for column-major storage). Why isn't this done in the existing BLAS/LAPACK wrappers?