Pull requests: ggerganov/llama.cpp
Pull requests list
ggml-cuda : add TQ2_0 kernels, for ternary inference on GPU
enhancement
New feature or request
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
performance
Speed related topics
python
python script changes
Review Complexity : High
Generally requires in-depth knowledge of LLMs or GPUs
testing
Everything test related
#11183
opened Jan 10, 2025 by
compilade
gguf-py: Fixed local detection of gguf package
python
python script changes
#11180
opened Jan 10, 2025 by
VJHack
convert : sort print supported models [no ci]
python
python script changes
#11179
opened Jan 10, 2025 by
danbev
SYCL: Add gated linear attention kernel
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#11175
opened Jan 10, 2025 by
qnixsynapse
lora : update API names
breaking change
Changes that break ABIs, APIs, file formats, or other forms of backwards compatibility.
examples
server
#11167
opened Jan 9, 2025 by
ggerganov
vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
#11166
opened Jan 9, 2025 by
jeffbolznv
FR: server: Pre-fill textarea and auto-generate based on query parameters
examples
server
#11150
opened Jan 9, 2025 by
tim-janik
vulkan: optimize coopmat2 q2_k dequant function
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#11130
opened Jan 7, 2025 by
jeffbolznv
llama-bench : add test measuring token generation rate at given prompt length
examples
#11126
opened Jan 7, 2025 by
fairydreaming
llama : functions -> methods
android
Issues specific to Android
breaking change
Changes that break ABIs, APIs, file formats, or other forms of backwards compatibility.
devops
improvements to build systems and github actions
examples
python
python script changes
server
testing
Everything test related
#11110
opened Jan 6, 2025 by
ggerganov
2 tasks done
feat(ci): add visionOS build workflow
devops
improvements to build systems and github actions
ggml
changes relating to the ggml tensor library for machine learning
#11103
opened Jan 6, 2025 by
ggerganov
vulkan: scale caching for k quants + misc fixes
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#11081
opened Jan 5, 2025 by
netrunnereve
Remove obsolete HIP workaround
build
Compilation issues
devops
improvements to build systems and github actions
ggml
changes relating to the ggml tensor library for machine learning
nix
Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment
Nvidia GPU
Issues specific to Nvidia GPUs
#11080
opened Jan 5, 2025 by
sARY77
server : POC OAI-compat TTS using OuteTTS
examples
server
#11070
opened Jan 3, 2025 by
ngxson
feat(ci): add visionOS build workflow
devops
improvements to build systems and github actions
#11065
opened Jan 3, 2025 by
sinkingsugar
llama : remove notion of CLS token
python
python script changes
#11064
opened Jan 3, 2025 by
ggerganov
android : Apply chat template
android
Issues specific to Android
examples
#11059
opened Jan 3, 2025 by
Dhruvanand24
CUDA Graph Compute Function Refactor (precursor for performance improvements)
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#11042
opened Jan 2, 2025 by
aendk
Add VisionOS compatibility by adding missing type definitions
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#11019
opened Dec 30, 2024 by
sinkingsugar
Vulkan: Destroy Vulkan instance on exit
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#10989
opened Dec 26, 2024 by
0cc4m
Removed unnecessary iteration of batch n_tokens on sequence embedding…
examples
#10972
opened Dec 25, 2024 by
Emreerdog