llama-cpp-mtp-hip

Arch Linux PKGBUILD for llama.cpp with MTP (Multi-Token Prediction) speculative decoding, using the HIP/ROCm backend for AMD GPUs. The package exists to run the new generation of MTP-capable models, support for which was introduced in llama.cpp PR #22673.

What is MTP?

Multi-Token Prediction (MTP) lets a draft model guess several tokens at once, which the base model then verifies in parallel. When most guesses are accepted, each verification pass yields more than one token, which can speed up inference significantly on suitable hardware. Upstream support was added in PR #22673. This package builds the mtp-clean branch by am17an, which provides the complete MTP speculative decoding pipeline for ROCm.
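
The draft-then-verify idea can be illustrated with a toy sketch. It assumes greedy verification, where drafted tokens are kept up to the first position at which the base model would have picked something else. The function name count_accepted and the sample tokens are made up for illustration; this is not llama.cpp code and says nothing about how the mtp-clean branch implements verification.

```shell
#!/usr/bin/env bash
# Toy model of greedy draft verification (illustration only, not llama.cpp code).
# $1: space-separated drafted tokens
# $2: space-separated tokens the base model itself picks at the same positions
count_accepted() {
    # Unquoted expansion splits on whitespace; fine for plain-word toy tokens.
    local -a draft=($1) base=($2)
    local i accepted=0
    for ((i = 0; i < ${#draft[@]}; i++)); do
        # Stop at the first position where the base model disagrees.
        [[ "${draft[i]}" == "${base[i]:-}" ]] || break
        accepted=$((accepted + 1))
    done
    echo "$accepted"
}

count_accepted "the cat sat on" "the cat lay on"   # prints 2
```

Here two of the four drafted tokens are accepted, so a single verification pass advances the output by two tokens instead of one.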

Requirements

Arch Linux (or a derivative)
ROCm packages: hip-runtime-amd, hipblas, rocblas
An AMD GPU with ROCm support
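
Before building, it may help to check that the ROCm packages listed above are installed. The following is a minimal sketch; the helper name missing_pkgs is hypothetical, and it relies only on pacman -Q returning a non-zero exit status for packages that are not installed.

```shell
#!/usr/bin/env bash
# Print each of the given packages that pacman does not report as installed.
missing_pkgs() {
    local pkg
    for pkg in "$@"; do
        # pacman -Q exits non-zero when the package is not installed.
        pacman -Q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

missing_pkgs hip-runtime-amd hipblas rocblas
```

No output means all three packages are present. Any names printed can be installed with pacman -S, or left to makepkg -s to resolve if the PKGBUILD declares them as dependencies.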

Build

makepkg -si

License

The PKGBUILD and supporting files in this repository are licensed under GPLv3. The upstream llama.cpp source it builds is MIT-licensed; see the llama.cpp repository for details.
