Hello,
First of all, congratulations on this amazing work. I want to ask how you handle the creation of dataloaders for labeled and unlabeled data. As far as I understand from reading the dataloaders, in each iteration you forward pass the same amount of labeled and unlabeled data; in effect, each epoch passes the whole labeled set and randomly samples an equal amount of unlabeled data.
I have read about a couple of approaches for this. The first is to define an epoch as one pass of all the unlabeled data through the network, but then the labeled data is passed through the network multiple times per epoch. The second approach, as you have done, is to use a sampler that draws, at each epoch, an amount of unlabeled data equal to the size of the labeled set. Which of these two techniques would make the model perform better? Generally, I'm a bit confused about how to construct the dataloaders for labeled and unlabeled data in a semi-supervised setting. Any hints would be appreciated!
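For concreteness, here is a minimal PyTorch sketch of the two options as I understand them. The dataset objects, sizes, and shapes are placeholders I made up for illustration, not taken from your code:

```python
import itertools
import torch
from torch.utils.data import DataLoader, RandomSampler, TensorDataset

# Dummy stand-ins for the real datasets (sizes/shapes are illustrative only).
labeled_ds = TensorDataset(torch.randn(1000, 3, 32, 32), torch.randint(0, 10, (1000,)))
unlabeled_ds = TensorDataset(torch.randn(50000, 3, 32, 32))

batch_size = 64

# Option 1: define an epoch over the unlabeled set and cycle the labeled loader,
# so the labeled examples are revisited many times within one epoch.
labeled_loader = DataLoader(labeled_ds, batch_size=batch_size, shuffle=True)
unlabeled_loader = DataLoader(unlabeled_ds, batch_size=batch_size, shuffle=True)
for (x_l, y_l), (x_u,) in zip(itertools.cycle(labeled_loader), unlabeled_loader):
    pass  # forward both batches, combine supervised + unsupervised losses

# Option 2: resample the unlabeled pool each epoch so it matches the size of the
# labeled set, giving both loaders the same number of batches per epoch.
unlabeled_sampler = RandomSampler(unlabeled_ds, replacement=True, num_samples=len(labeled_ds))
unlabeled_loader = DataLoader(unlabeled_ds, batch_size=batch_size, sampler=unlabeled_sampler)
for (x_l, y_l), (x_u,) in zip(labeled_loader, unlabeled_loader):
    pass  # one labeled and one unlabeled batch per optimization step
```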
Thanks in advance.