Pull requests: ggml-org/llama.cpp
opencl: use larger workgroup size for get_rows
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
metal: handle command buffer failures gracefully in synchronize
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#20306
opened Mar 9, 2026 by
JulianPscheid
docs: update CPU backend ops to mark POOL_1D as supported
documentation
Improvements or additions to documentation
#20304
opened Mar 9, 2026 by
a3894281
vulkan: partial revert #20084
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#20298
opened Mar 9, 2026 by
jeffbolznv
vulkan: fix OOB check in flash_attn_mask_opt
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#20296
opened Mar 9, 2026 by
jeffbolznv
ci: disable coopmat on ubuntu-24-cmake-vulkan job
devops
improvements to build systems and github actions
#20294
opened Mar 9, 2026 by
0cc4m
[SYCL] fix op ROPE, add ROPE_BACK
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#20293
opened Mar 9, 2026 by
arthw
Gracefully handle undetected tool parser, print error message.
#20286
opened Mar 9, 2026 by
pwilkin
Support refusal content for Responses API
examples
server
#20285
opened Mar 9, 2026 by
pwilkin
[SYCL] fix for failed UT case: ACC, L2_NORM, UPSCALE, GEGLU
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#20283
opened Mar 9, 2026 by
arthw
ggml-cuda: gdn use shared mem for HIP
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
model: add sarvam_moe architecture support
model
Model specific
python
python script changes
#20275
opened Mar 9, 2026 by
sumitchatterjee13
CANN: handle in-place ROPE on non-contiguous f32 tensors
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
#20274
opened Mar 9, 2026 by
noemotiovon
Read the persisted llama_kv_cell_ext for n_pos_per_embd > 1 on state_read for all sequence ids
#20273
opened Mar 9, 2026 by
sprayandwipe
server: support chunked transfer encoding
examples
server
#20269
opened Mar 9, 2026 by
crmky
WIP/POC: NVFP4 with CUDA SM120
documentation
Improvements or additions to documentation
examples
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
testing
Everything test related
#20247
opened Mar 8, 2026 by
michaelw9999
Draft
metal : add Metal backend for GGML_OP_GATED_DELTA_NET
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#20244
opened Mar 8, 2026 by
arkavo-com
2 of 3 tasks
gguf-py: validate metadata values against declared types
python
python script changes
#20242
opened Mar 8, 2026 by
eyupcanakman
webui : add option to copy assistant response without thinking content
examples
server
#20238
opened Mar 8, 2026 by
rankaiyx
Create build-apk.yml
android
Issues specific to Android
examples
#20231
opened Mar 8, 2026 by
subhasishlak123
ggml-webgpu: Add supports for GGML_OP_REPEAT
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
#20230
opened Mar 8, 2026 by
yomaytk