Full attention option
Added an option to the Linformer that switches it to full attention, so the two can be compared directly. Watch out: with this option enabled, attention takes O(n^2) time and space, where n is the sequence length.
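To make the cost difference concrete, here is a minimal NumPy sketch that compares the size of the attention score matrix in both modes. It is not this library's code or API. The function names `full_attention` and `linformer_attention`, the projection matrices `E` and `F`, and the projected length `k_dim` are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def full_attention(q, k, v):
    # Hypothetical helper: standard scaled dot-product attention.
    # The score matrix has shape (n, n), hence O(n^2) time and space.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v, scores.shape


def linformer_attention(q, k, v, E, F):
    # Hypothetical helper: Linformer-style attention.
    # E and F have shape (k_dim, n) and compress the sequence axis of the
    # keys and values, so the score matrix has shape (n, k_dim) instead.
    k_proj = E @ k
    v_proj = F @ v
    scores = q @ k_proj.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v_proj, scores.shape


rng = np.random.default_rng(0)
n, d, k_dim = 512, 64, 32  # illustrative sizes, not library defaults

q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
E = rng.standard_normal((k_dim, n)) / np.sqrt(n)
F = rng.standard_normal((k_dim, n)) / np.sqrt(n)

full_out, full_shape = full_attention(q, k, v)
lin_out, lin_shape = linformer_attention(q, k, v, E, F)

print("full attention scores:", full_shape)      # (512, 512)
print("linformer attention scores:", lin_shape)  # (512, 32)
```

Both functions return an output of shape `(n, d)`, so they are interchangeable in a comparison. Only the intermediate score matrix differs, and for full attention it grows quadratically as `n` increases.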