Stack error #177
Comments
Hi @Charlesliu77, could you point to which line of the code this issue happens at?
llava_arch.py: line 378
Hi, can you try replacing this line with
Thanks a lot, it works. I have another question about the model versions: what's the difference between NVILA and NVILA-Lite?
NVILA-Lite is designed to improve efficiency over NVILA while maintaining competitive performance. The main differences are that NVILA-Lite uses 3x3 downsampling instead of 2x2 in the mm projector, and dynamic res instead of dynamic s2. We will add more details about NVILA-Lite in the next version of our preprint. Stay tuned!
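To give a rough sense of what 3x3 versus 2x2 downsampling means for the number of visual tokens, here is a small arithmetic sketch. The 24x24 patch grid, the helper name `tokens_after_downsample`, and the round-up behaviour for grids that do not divide evenly are all assumptions made for illustration; they are not taken from the NVILA code.

```python
import math


def tokens_after_downsample(grid_h: int, grid_w: int, factor: int) -> int:
    """Count tokens left after merging each factor x factor block of patches.

    Assumption: a grid that is not a multiple of `factor` is padded up,
    so partial blocks still produce one token each.
    """
    return math.ceil(grid_h / factor) * math.ceil(grid_w / factor)


# Hypothetical 24x24 patch grid coming out of the vision encoder.
grid = 24
tokens_2x2 = tokens_after_downsample(grid, grid, 2)  # 12 * 12 = 144
tokens_3x3 = tokens_after_downsample(grid, grid, 3)  # 8 * 8 = 64

print(tokens_2x2, tokens_3x3, tokens_2x2 / tokens_3x3)
```

On this hypothetical grid, 3x3 downsampling leaves 2.25x fewer tokens than 2x2, which is where the efficiency gain would come from.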
The input images have different sizes, and I got the error below when using the dynamic_s2 preprocessing method:
RuntimeError: stack expects each tensor to be equal size, but got [2560, 3584] at entry 0 and [3072, 3584] at entry 1.
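For anyone hitting the same message: `torch.stack` requires every tensor in the list to have exactly the same shape, and here the first dimension differs (2560 vs 3072) while the second matches (3584). The sketch below works on plain shape tuples, so it runs without PyTorch, and shows how to check a batch and what padding would bring the shapes into agreement. The helper names are hypothetical, and this is not the replacement line suggested in this thread, which is not shown here. Padding is only one possible workaround; it may be that concatenating along the first dimension is more appropriate for this model.

```python
from typing import List, Sequence, Tuple

Shape = Tuple[int, ...]


def stackable(shapes: Sequence[Shape]) -> bool:
    """True if all shapes are identical, which is what torch.stack needs."""
    return len(set(shapes)) <= 1


def padded_shape(shapes: Sequence[Shape]) -> Shape:
    """Smallest common shape: the element-wise maximum over each dimension."""
    if not shapes:
        raise ValueError("need at least one shape")
    if len({len(s) for s in shapes}) != 1:
        raise ValueError("all shapes must have the same number of dimensions")
    return tuple(max(dims) for dims in zip(*shapes))


def padding_needed(shapes: Sequence[Shape]) -> List[Shape]:
    """Per-tensor amount to pad along each dimension to reach padded_shape."""
    target = padded_shape(shapes)
    return [tuple(t - d for t, d in zip(target, s)) for s in shapes]


# Shapes taken from the error message above.
shapes = [(2560, 3584), (3072, 3584)]

print(stackable(shapes))       # False
print(padded_shape(shapes))    # (3072, 3584)
print(padding_needed(shapes))  # [(512, 0), (0, 0)]
```

With real tensors, the first entry would need 512 extra rows (for example via `torch.nn.functional.pad`) before stacking, and the padded rows would need to be masked out downstream.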