
Are the pretrained MAE encoder weights available? #94

Open
CheungZeeCn opened this issue Aug 1, 2023 · 2 comments

Comments

@CheungZeeCn

in config:

"mae_checkpoint": "mae_models/mae_pretrain_vit_large_full.pth"

in udop_dual:

self.vision_encoder = mae_model(config.mae_version, config.mae_checkpoint,
                                config.image_size, config.vocab_size,
                                config.max_2d_position_embeddings)

But I found no pretrained weights for the MAE encoder. Are the pretrained MAE encoder weights available now?

Thank you!

@zinengtang (Collaborator)

The MAE weights are included together with the transformer weights in the released checkpoint.
If you want the original MAE weights, you can download them from the original MAE codebase.
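A quick way to verify which MAE parameters actually ship in the released checkpoint is to list its state-dict keys. A minimal sketch, assuming the MAE parameters live under a `vision_encoder.` prefix and using a placeholder file name (neither is confirmed by the repo); a checkpoint loaded with `torch.load()` is just a dict of tensors, so the same filtering applies:

```python
# Hedged sketch: list which vision-encoder (MAE) parameters a checkpoint
# contains. The "vision_encoder." prefix and the file name below are
# assumptions for illustration, not the repo's confirmed names.

def mae_keys(state_dict, prefix="vision_encoder."):
    """Return the MAE-related parameter names found in a state dict."""
    return sorted(k for k in state_dict if k.startswith(prefix))

# In practice: state = torch.load("udop-dual-large.pth", map_location="cpu")
# A simulated state dict stands in for the real checkpoint here:
state = {
    "vision_encoder.patch_embed.proj.weight": None,
    "vision_encoder.pos_embed": None,
    "lm_head.weight": None,
}
print(mae_keys(state))
# ['vision_encoder.patch_embed.proj.weight', 'vision_encoder.pos_embed']
```

If the transformer blocks' keys (e.g. anything like `blocks.0.*`) are absent from the filtered list, only the patch projection and position embeddings were released.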

@znb899

znb899 commented Aug 31, 2023

In the transformer weights, for the MAE there are only weights for patch_embed and special_vis_token (and pos_embed), but not for the blocks. And in the forward method, you indeed only use patch_embed to encode the patches.

Do you not use the full MAE as in udop_dual? Does this simple projection carry all the information about font, line spacing, color, etc.?
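For context on what "only patch_embed" means: in a ViT/MAE, patch_embed is just one linear projection applied to each non-overlapping image patch, with no transformer blocks mixing information between patches. A toy pure-Python sketch (sizes, weights, and function names are illustrative, not UDOP's config):

```python
# Toy illustration: patch_embed alone = split image into patches, then apply
# one shared linear projection per patch. No attention blocks are involved.

def split_patches(image, p):
    """Split an H x W image (nested lists) into p x p patches, each flattened."""
    h, w = len(image), len(image[0])
    patches = []
    for i in range(0, h, p):
        for j in range(0, w, p):
            patches.append([image[r][c] for r in range(i, i + p)
                                        for c in range(j, j + p)])
    return patches

def linear(vec, weight):
    """Apply a projection matrix: one row of `weight` per output dimension."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]

image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]  # 4x4 "image"
patches = split_patches(image, 2)             # four 2x2 patches, each length 4
weight = [[1.0, 0.0, 0.0, 0.0],               # toy 2-dim embedding matrix
          [0.0, 0.0, 0.0, 1.0]]
embeddings = [linear(patch, weight) for patch in patches]
print(len(embeddings), len(embeddings[0]))
# 4 2
```

So the question above amounts to: can a per-patch linear map like this, without the MAE blocks, really encode fine visual detail such as font and color?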
