I was trying to train a DeiT model from scratch using timm, and I'm under the impression that this is not possible out of the box when we want to use a teacher model. Is that right? Also, more generally, if not all the models supported by timm are trainable from scratch, is there a way to find out which ones are? Thanks!
@sinahmr they are all trainable from scratch with the timm scripts, but not necessarily with the specific algorithms from their papers.
So for the distilled DeiT, no: the code isn't there, though it can be hacked on fairly easily (that's essentially what the official impl is, part timm training code and part their own). The non-distilled models and DeiT-3 should be reproducible though.
BEiT, ConvNeXt-V2 (the FCMAE part), MAE, DINO, etc. are the same thing... the models will train from scratch with standard cross-entropy (xent) or BCE, but I have not included reproductions of their specific unsupervised/semi-supervised learning algorithms.
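For anyone wondering what "hacking on" the distilled DeiT objective might involve, here is a minimal, framework-free sketch of the hard-distillation loss described in the DeiT paper, computed for a single sample. It is not timm code and not the official implementation: the function names are hypothetical, and it assumes the student exposes two sets of logits (one from the class-token head, one from the distillation-token head) alongside the teacher's logits.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log softmax(logits)[target]."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def hard_distillation_loss(cls_logits, dist_logits, teacher_logits, label):
    """Hypothetical helper: DeiT-style hard distillation for one sample.

    The class-token head is trained against the ground-truth label, the
    distillation-token head against the teacher's argmax prediction, and
    the two cross-entropy terms are averaged with equal weight.
    """
    teacher_label = max(range(len(teacher_logits)), key=teacher_logits.__getitem__)
    loss_cls = cross_entropy(cls_logits, label)
    loss_dist = cross_entropy(dist_logits, teacher_label)
    return 0.5 * loss_cls + 0.5 * loss_dist
```

In an actual training loop the same arithmetic would be done on batched tensors, with the teacher run in inference mode and no gradients flowing into it. How that plugs into the timm train script is left open here, since the reply above only says the code is not included.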