huggingface / transformers Public


Issues: huggingface/transformers


990 Open 15,483 Closed


Issues list

use_liger_kernel requires much more GPU memory during evaluation than training bug

#35689 opened Jan 14, 2025 by Smu-Tan

2 of 4 tasks

Some weights of the model checkpoint at /models/DeepSeek-V3_bf16 were not used when initializing DeepseekV3ForCausalLM bug

#35688 opened Jan 14, 2025 by Godlovecui

4 tasks

past_key_values cat out of model generate, output appear disorder bug Generation

#35684 opened Jan 14, 2025 by lzlwakeup

2 of 4 tasks

Support LLMs With No Image Placeholder Embedding in LLava-based Models Feature request

Multimodal VLM

#35683 opened Jan 14, 2025 by alex-jw-brooks

FA2 support for Aria Flash Attention Multimodal Vision

#35670 opened Jan 13, 2025 by molbap

Improve Guidance for Using DDP in examples/pytorch Feature request

#35667 opened Jan 13, 2025 by caojiaolong

RLE of SAM can't handle masks with no change bug

#35664 opened Jan 13, 2025 by MSt-10

2 of 4 tasks

About GA loss in the latest transformers version bug

#35663 opened Jan 13, 2025 by hiyouga

4 tasks

AttributeError: 'MERTConfig' object has no attribute 'conv_pos_batch_norm' bug

#35656 opened Jan 13, 2025 by JacopoMadaluni

2 of 4 tasks

PR #35438 introduced a new bug bug

#35649 opened Jan 13, 2025 by techkang

2 of 4 tasks

Unnecessary KV Cache Updates During Training Mode bug

#35648 opened Jan 13, 2025 by Hannibal046

2 of 4 tasks

Will Qwen2VL support sequence classification head in the future? Feature request

#35645 opened Jan 13, 2025 by cv-nlp

tokenizer.decode() and tokenizer.convert_ids_to_tokens() return different results bug

#35641 opened Jan 12, 2025 by thangld201

4 tasks

Expected tensors and new_tensors to have the same type but found <class ‘tuple’> and <class ‘torch.Tensor’> bug

#35640 opened Jan 12, 2025 by Bruce-Azar-Wayne

4 tasks

Breaking change in v4.48.0 and Python 3.9 bug

#35639 opened Jan 12, 2025 by davidmezzetti

4 tasks

FSDP OOM error

#35636 opened Jan 12, 2025 by blurmemo

set_initialized_submodules too slow when loading big model like DeepSeekV3 bug

#35635 opened Jan 12, 2025 by hongchuan666

4 tasks

ValueError: MllamaForConditionalGeneration does not support Flash Attention 2.0 yet bug

#35634 opened Jan 12, 2025 by yxchng

4 tasks

Trying To Convert Paligemma model in npz to hf model format

#35632 opened Jan 12, 2025 by Shaka42

static cache with mixtral will cause CUDA error: device-side assert triggered bug

#35626 opened Jan 11, 2025 by zyxiyy

1 of 4 tasks

Segmentation fault: address not mapped to object at address 0x100000007 bug

#35624 opened Jan 11, 2025 by mrinaldi97

4 tasks done

Unsupported: hasattr SkipFunctionVariable when i compile the mixtral model with muti-gpus bug

#35623 opened Jan 11, 2025 by zyxiyy

4 tasks

running utills.fx.symbolic_trace on gp2 raised an error: torch.fx.proxy.TraceError: Proxy object cannot be iterated, which does not occur in the previous version bug

#35622 opened Jan 11, 2025 by minkiml

4 tasks

The argument "dim" is gone from LlamaRotaryEmbedding initializer. Intentional? bug

#35621 opened Jan 11, 2025 by jeffhataws

4 tasks

from_pretrained fails to save weights.py and layers.py into cache, therefore fails to find them in cache bug

#35619 opened Jan 11, 2025 by openyk

4 tasks
