Some other questions about wkv6_cuda: I observed NaN errors and changed all exp calls to __expf(-__expf(**)), after which the model seemed to perform well (though I honestly don't know why I made this change; perhaps intuition?).
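One possible reason the change helps: exp(-exp(x)) always lands in [0, 1], so a per-step decay of that form cannot blow up a running state, whereas a plain exp(x) exceeds 1 for any positive x. The sketch below illustrates this with a toy scalar recurrence in plain Python. `plain_decay`, `double_exp_decay`, and `run_recurrence` are hypothetical names for illustration, not the kernel's actual code, and whether this is the real cause of the NaNs is an assumption.

```python
import math

def plain_decay(x):
    # Plain exp(x): greater than 1 whenever x > 0.
    return math.exp(x)

def double_exp_decay(x):
    # exp(-exp(x)): exp(x) is positive, so the result stays within [0, 1].
    return math.exp(-math.exp(x))

def run_recurrence(decay_fn, x, steps=2000, kv=1.0):
    # Toy scalar stand-in for a decayed running state: s = s * decay + kv.
    s = 0.0
    for _ in range(steps):
        s = s * decay_fn(x) + kv
    return s

if __name__ == "__main__":
    for x in (-2.0, 0.5, 2.0):
        print(x, run_recurrence(plain_decay, x), run_recurrence(double_exp_decay, x))
```

With a positive x the plain-exp state overflows to infinity over a long sequence, and infinities in later arithmetic are a common source of NaN; the double-exp state stays finite.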
I'm currently using RWKV with a full-length pixel sequence, and found that T_MAX must be greater than or equal to the total number of pixels in an image; a smaller value causes "CUDA error: illegal memory access". But a larger T_MAX leads to an immediate OOM, even when the model itself is really small (<3M params)... (my card: RTX 4090)
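As a stopgap for the illegal memory access, a length check on the Python side at least turns the crash into a readable error. This is only a sketch: `T_MAX` mirrors the constant mentioned above with an assumed value, while `pixel_seq_len` and `check_seq_len` are hypothetical helpers, not part of Vision-RWKV. It does nothing about the OOM.

```python
T_MAX = 4096  # assumed value; must match whatever the kernel was built with

def pixel_seq_len(height, width):
    # One token per pixel for a full-length pixel sequence.
    return height * width

def check_seq_len(seq_len, t_max=T_MAX):
    # Fail early with a clear message instead of an illegal memory access.
    if seq_len > t_max:
        raise ValueError(f"sequence length {seq_len} exceeds T_MAX={t_max}")
    return seq_len
```

For example, calling `check_seq_len(pixel_seq_len(h, w))` before the forward pass would flag images whose pixel count exceeds the compiled limit.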
Any advice on addressing these? I appreciate your help.
I noticed that there are two for loops in https://github.com/OpenGVLab/Vision-RWKV/blob/master/classification/mmcls_custom/models/backbones/cuda_v6/wkv6_cuda.cu, lines 23-57. What is the purpose of the first loop? I compared it with RWKV-LM's CUDA files but couldn't figure it out.