Time dimension in audio datasets after applying ToFrame transform #174
-
Hey,

```python
import tonic
from torch.utils.data import DataLoader

# Bin the raw microsecond timestamps into frames of 1000 µs (1 ms) each.
transform = tonic.transforms.ToFrame(sensor_size=tonic.datasets.SHD.sensor_size, time_window=1000)
trainset = tonic.datasets.SHD(save_to='./data', train=True, transform=transform)

raster, targets = trainset[0]
print(raster.shape)
```
Your code also works; you just have to adjust the time_window parameter. PS: if you post code instead of images, I can easily copy and re-run your examples, which makes it easier for me.
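For illustration, a minimal sketch of what adjusting time_window does, assuming SHD events are returned as a structured NumPy array with a 't' field in microseconds: a larger window means fewer, coarser frames.

```python
import tonic

# Load one raw recording (no transform) to check its duration.
trainset = tonic.datasets.SHD(save_to='./data', train=True)
events, target = trainset[0]
print(events["t"].max())  # recording length in microseconds

# Compare frame counts for 1 ms vs. 10 ms bins.
for time_window in (1000, 10000):
    to_frame = tonic.transforms.ToFrame(
        sensor_size=tonic.datasets.SHD.sensor_size, time_window=time_window
    )
    frames = to_frame(events)
    print(time_window, frames.shape)  # first dimension shrinks as the window grows
```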
-
Personally, I used this paper as a reference. Check out Table 1 in the supplementary material: https://www.biorxiv.org/content/10.1101/2021.03.22.436372v2
-
Hello, the dataset 'tonic.datasets.NTIDIGITS' cannot be found in the current version. Has it been deleted?
-
Hey,
The problem is a different one: you first apply a Downsample transform with default arguments, which divides all timestamps by 1000, turning them from microseconds into milliseconds. Then you apply the ToFrame transform with a time window of 1000, but most recordings are under 1 s long! That's why essentially all events of a recording were binned into a single frame. I recommend always looking at the raw data before transforming to frames. This short piece of code works:
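A minimal sketch of that fix, reconstructed from the SHD example quoted earlier in the thread; it assumes SHD events come as a structured NumPy array with a 't' field in microseconds, and that Downsample's time_factor in the version discussed defaulted to 0.001 (dividing timestamps by 1000).

```python
import tonic

# Look at the raw data first: how long is one recording?
raw_set = tonic.datasets.SHD(save_to='./data', train=True)
events, target = raw_set[0]
print(events["t"].max())  # duration in microseconds, typically under 1e6 (< 1 s)

# The pitfall: Downsample rescales microseconds to milliseconds, so a
# subsequent ToFrame with time_window=1000 (now 1000 ms) bins nearly
# every event of a recording into a single frame.
broken = tonic.transforms.Compose([
    tonic.transforms.Downsample(time_factor=0.001),  # µs -> ms
    tonic.transforms.ToFrame(sensor_size=tonic.datasets.SHD.sensor_size, time_window=1000),
])
print(broken(events).shape)  # first dimension collapses to ~1 frame

# The fix: bin the raw microsecond timestamps directly; time_window=1000 µs
# yields one frame per millisecond.
working = tonic.transforms.ToFrame(sensor_size=tonic.datasets.SHD.sensor_size, time_window=1000)
print(working(events).shape)  # first dimension = number of 1 ms frames
```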