Add KV cache to the EAGLE-3 draft head #20152

Draft
digantdesai wants to merge 1 commit into gh/digantdesai/56/head from gh/digantdesai/57/head

@digantdesai

Contributor

Adds a flat KV cache and an explicit-mask attention path (forward_cached) to the
draft head so a proposal step reuses cached keys/values instead of recomputing
the prefix's projections and MLP. Attention still scores against the full
max_seq_len buffer under a static causal mask, matching the target's cache and
keeping the path export-friendly. The stateless is_causal forward is unchanged.
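
As a rough illustration of the cached path described above, the sketch below keeps one fixed-size K buffer and one V buffer, writes new rows at `input_pos`, and scores each query against the whole buffer using rows of a precomputed causal mask. The names `FlatKVCache` and `attend_cached` are hypothetical and are not the PR's `Eagle3KVCache` or `forward_cached`; projections, RoPE, and the MLP are omitted, and batch size 1 is assumed.

```python
import torch
import torch.nn.functional as F


class FlatKVCache(torch.nn.Module):
    """Hypothetical flat cache: one [1, n_heads, max_seq_len, head_dim] buffer each for K and V."""

    def __init__(self, n_heads: int, max_seq_len: int, head_dim: int):
        super().__init__()
        shape = (1, n_heads, max_seq_len, head_dim)
        self.register_buffer("k", torch.zeros(shape))
        self.register_buffer("v", torch.zeros(shape))
        # Static causal mask over the full buffer: row p may attend to columns 0..p.
        causal = torch.tril(torch.ones(max_seq_len, max_seq_len, dtype=torch.bool))
        self.register_buffer("causal", causal)

    def update(self, input_pos: torch.Tensor, k_new: torch.Tensor, v_new: torch.Tensor):
        # Write (or overwrite) the rows named by input_pos along the sequence dim.
        self.k.index_copy_(2, input_pos, k_new)
        self.v.index_copy_(2, input_pos, v_new)
        return self.k, self.v


def attend_cached(cache: FlatKVCache, input_pos, q, k_new, v_new):
    """Attend q ([1, H, T, D]) against the full buffer under an explicit mask."""
    k, v = cache.update(input_pos, k_new, v_new)
    # Select one mask row per query position: shape [T, max_seq_len].
    # Slots past each query's position are masked out, so stale or
    # never-written rows of the buffer do not contribute.
    mask = cache.causal[input_pos]
    return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
```

Because the mask rows hide every slot beyond the query's own position, a prefill followed by a single-token step should agree with a stateless `is_causal=True` call over the same tokens, which is the kind of equivalence the PR's tests are described as checking.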

forward_cached requires batch size 1 and writes that are contiguous from
position 0. Overwrites for speculative rollback are allowed; gapped seeds are
not. The invariant is enforced by an eager-only validator that is skipped under
export. Tests cover cached prefill plus single-step decode against the stateless
recompute, rollback reseeding, and rejection of gapped/offset seeds and B>1.
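
One way to read the write invariant above is sketched below in plain Python. This is not the PR's validator: the class name `ContiguousWriteTracker`, its method, and the choice to rewind the filled length on an overwrite are all assumptions made for illustration, and the eager-only/export-skipping behavior is not modeled.

```python
class ContiguousWriteTracker:
    """Hypothetical eager-side check of the contiguous-from-0 write rule."""

    def __init__(self, max_seq_len: int):
        self.max_seq_len = max_seq_len
        self.filled = 0  # number of leading cache slots currently considered valid

    def check_and_record(self, positions, batch_size: int = 1) -> None:
        if batch_size != 1:
            raise ValueError("only batch size 1 is supported")
        positions = list(positions)
        if not positions:
            raise ValueError("empty write")
        start = positions[0]
        end = start + len(positions)
        # The write itself must be one ascending run with no holes.
        if positions != list(range(start, end)):
            raise ValueError("write positions are not contiguous")
        if start < 0 or end > self.max_seq_len:
            raise ValueError("write falls outside the cache buffer")
        # Starting past the filled prefix would leave a gap (this also
        # rejects a first write that does not start at position 0).
        if start > self.filled:
            raise ValueError("write would leave a gap after the filled prefix")
        # Assumption: an overwrite rewinds the valid prefix to the end of
        # the new write, as in a speculative rollback.
        self.filled = end
```

For example, a prefill at positions 0..3 followed by a step at position 4 passes, a rollback write at position 2 passes and shrinks the valid prefix to 3, and a first write starting at position 1 is rejected.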

Authored with assistance from Claude Code.

[ghstack-poisoned]
@pytorch-bot

pytorch-bot Bot commented Jun 9, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20152

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 3 Pending

As of commit d782162 with merge base dc55469 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

return torch.cat((-x2, x1), dim=-1)


class Eagle3KVCache(nn.Module):

Contributor

What distinguishes this from our standard KVCache? Can we just import the standard one?

Labels

CLA Signed

2 participants