[Perf] Split feed repost branch by entity type #794
Merged
raymondjacobson merged 1 commit into main on May 8, 2026
The feed query LEFT JOINed both tracks and playlists onto every repost row to filter out reposts pointing at deleted/unlisted/private entities. Postgres satisfied the playlist side by hashing *every* public playlist (~94k rows) on every call, regardless of how few playlist-type reposts the followee set contained. Splitting the branch by repost_type lets each side use a per-row INNER JOIN against the entity (tracks_pkey or playlists_pkey), removing the upfront 94k-row hash entirely. Cold-cache and tail latency benefit even when warm timings look similar. Adds a regression test exercising both repost branches plus the owned-track and owned-playlist branches; no prior coverage existed.
Contributor
love it
Summary
Rewrite the feed query so the repost branch handles track-type and playlist-type reposts separately. Each side INNER-joins against just the entity type it needs (tracks_pkey or playlists_pkey), eliminating the upfront 94k-row hash that the planner was building over every public playlist on every call.
Why
The old query did:
To satisfy the playlist LEFT JOIN, Postgres scanned the idx_playlist_status partial index for every public playlist (94,306 rows on prod, 4.3 MB hash, 2,420 buffers) on every feed call, even though the average user's repost set contains only a handful of playlist-type reposts.
Per pg_stat_statements, this query had two variants with mean exec times of 860ms and 4,478ms; in Axiom, /v1/users/:userId/feed shows p50 4.4s / p95 13s, the worst signed-in endpoint by total time.
Impact
EXPLAIN ANALYZE on prod read replica, user 20 (1,752 follows):
Warm-cache timings are similar in my tests (PG plan caching obscures the win on the read replica). The savings show up with a cold cache and at the tail; production tail latency should drop noticeably with the upfront hash gone.
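For intuition about where the saving comes from, here is a toy Go model of the two strategies. It is only a sketch under stated assumptions, not the query or planner behaviour from this PR: the names Repost, PlaylistIndex, upfrontHash, and perRowLookup are hypothetical, an in-memory map stands in for playlists_pkey, and "rows touched" is a crude proxy for work done. Track reposts are skipped because only the playlist side is modelled.

```go
package main

import "fmt"

// Repost is a hypothetical, simplified repost row.
type Repost struct {
	Type     string // "track", "playlist", or "album"
	EntityID int
}

// PlaylistIndex is a stand-in for playlists_pkey: playlist ID -> is it public?
type PlaylistIndex map[int]bool

// upfrontHash mimics the old shape: collect every public playlist into a
// hash first, then probe it once per non-track repost.
func upfrontHash(reposts []Repost, idx PlaylistIndex) (visible []int, touched int) {
	public := make(map[int]struct{})
	for id, isPublic := range idx {
		if isPublic {
			public[id] = struct{}{}
			touched++ // one public playlist row read to build the hash
		}
	}
	for _, r := range reposts {
		if r.Type == "track" {
			continue
		}
		if _, ok := public[r.EntityID]; ok {
			visible = append(visible, r.EntityID)
		}
	}
	return visible, touched
}

// perRowLookup mimics the new shape: one keyed lookup per non-track repost,
// with no build phase.
func perRowLookup(reposts []Repost, idx PlaylistIndex) (visible []int, touched int) {
	for _, r := range reposts {
		if r.Type == "track" {
			continue
		}
		touched++ // one index probe for this repost
		if idx[r.EntityID] {
			visible = append(visible, r.EntityID)
		}
	}
	return visible, touched
}

func main() {
	idx := make(PlaylistIndex)
	for id := 1; id <= 94306; id++ { // public playlist count quoted in the PR
		idx[id] = true
	}
	idx[94307] = false // one private playlist, made up for the example

	reposts := []Repost{
		{"track", 1}, {"playlist", 10}, {"track", 2}, {"playlist", 94307}, {"album", 42},
	}

	oldIDs, oldTouched := upfrontHash(reposts, idx)
	newIDs, newTouched := perRowLookup(reposts, idx)
	fmt.Println(oldIDs, oldTouched) // [10 42] 94306
	fmt.Println(newIDs, newTouched) // [10 42] 3
}
```

Both functions return the same visible IDs; only the amount of work differs, and the gap grows with the size of the playlist table rather than with the user's repost set.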
Risk
'track'-typed entities, Branch 1b returns whatever repost_type was ('playlist' or 'album'), same as before. The outer GROUP BY (entity_type, entity_id) and max(created_at) semantics are unchanged.
TestUsersFeed covers both repost branches and the owned-track/owned-playlist branches.
Test plan
go test -count=1 ./api/... (full suite, all green)
TestUsersFeed exercises track-repost, playlist-repost, owned-track, and owned-playlist branches plus the no-followees empty case
/v1/users/Wem1e/feed?limit=20 (Phuture, 1752 follows): 500-750ms warm
🤖 Generated with Claude Code
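As a companion to the Risk note on the outer GROUP BY (entity_type, entity_id) with max(created_at), the following Go sketch shows what that collapse does to rows arriving from several branches. It is an illustration only: FeedRow, groupKey, and collapse are hypothetical names, created_at is reduced to an integer timestamp, and the newest-first ordering is an assumption added to make the output deterministic, not something the PR states.

```go
package main

import (
	"fmt"
	"sort"
)

// FeedRow is a hypothetical row emitted by any branch of the feed query.
type FeedRow struct {
	EntityType string
	EntityID   int
	CreatedAt  int64 // simplified timestamp, e.g. Unix seconds
}

// groupKey mirrors the GROUP BY columns (entity_type, entity_id).
type groupKey struct {
	entityType string
	entityID   int
}

// collapse keeps one row per (entity_type, entity_id) carrying the largest
// CreatedAt seen, which is what max(created_at) yields per group.
func collapse(rows []FeedRow) []FeedRow {
	latest := make(map[groupKey]int64)
	for _, r := range rows {
		k := groupKey{r.EntityType, r.EntityID}
		if cur, ok := latest[k]; !ok || r.CreatedAt > cur {
			latest[k] = r.CreatedAt
		}
	}

	out := make([]FeedRow, 0, len(latest))
	for k, ts := range latest {
		out = append(out, FeedRow{k.entityType, k.entityID, ts})
	}

	// Assumed ordering: newest first, with tie-breaks so results are stable.
	sort.Slice(out, func(i, j int) bool {
		if out[i].CreatedAt != out[j].CreatedAt {
			return out[i].CreatedAt > out[j].CreatedAt
		}
		if out[i].EntityType != out[j].EntityType {
			return out[i].EntityType < out[j].EntityType
		}
		return out[i].EntityID < out[j].EntityID
	})
	return out
}

func main() {
	rows := []FeedRow{
		{"track", 7, 100},    // e.g. the owned-track branch
		{"track", 7, 300},    // e.g. a later repost of the same track
		{"playlist", 7, 200}, // same ID, different entity type: separate group
	}
	for _, r := range collapse(rows) {
		fmt.Println(r.EntityType, r.EntityID, r.CreatedAt)
	}
	// track 7 300
	// playlist 7 200
}
```

Because the grouping key includes the entity type, splitting the repost branch by type cannot merge or split groups as long as each branch still emits the same (entity_type, entity_id) pairs.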