
Time dimension in audio datasets after applying ToFrame transform #174

Answered by biphasic
msbouanane asked this question in Q&A

Hey,
The problem is a different one: you first apply a Downsample transform with default arguments, which divides all timestamps by 1000 and turns them from microseconds into milliseconds. Then you apply the ToFrame transform with a time window of 1000, which is now interpreted as 1000 ms = 1 s. But most recordings are under 1 s long! That's why essentially all events of a recording were binned into a single frame. I recommend always looking at the raw data before transforming to frames; a quick sanity check is sketched below the snippet. This short piece of code works:

import tonic
from torch.utils.data import DataLoader

# time_window is in microseconds here, since no Downsample transform runs first
transform = tonic.transforms.ToFrame(sensor_size=tonic.datasets.SHD.sensor_size, time_window=1000)

trainset = tonic.datasets.SHD(save_to='./data', train=True, transform=transform)
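For contrast, here is a minimal sketch of the pipeline that produced the single-frame output. The Downsample arguments are an assumption reconstructed from the description above (its defaults divide timestamps by 1000):

import tonic

# Sketch of the problematic pipeline (Downsample arguments assumed).
# Downsample scales timestamps by 1000 (microseconds -> milliseconds), so the
# subsequent time_window=1000 now spans 1000 ms = 1 s and bins an entire
# sub-second recording into a single frame.
transform = tonic.transforms.Compose([
    tonic.transforms.Downsample(time_factor=0.001),  # µs -> ms
    tonic.transforms.ToFrame(sensor_size=tonic.datasets.SHD.sensor_size,
                             time_window=1000),      # one 1 s bin per recording
])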

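And the raw-data sanity check mentioned above, as a minimal sketch; the field names of SHD's structured event array are an assumption (e.g. 't', 'x', 'p'), with timestamps in microseconds:

import tonic

# Load SHD without any transform and inspect the raw event timestamps
dataset = tonic.datasets.SHD(save_to='./data', train=True)
events, label = dataset[0]
print(events.dtype.names)                    # field names, e.g. ('t', 'x', 'p')
print(events['t'].min(), events['t'].max())  # timestamp range, in microseconds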