
Add support for Equality Deletes on DeleteFileIndex #3285

Open
rambleraptor wants to merge 3 commits into apache:main from rambleraptor:equality-delete-index

Conversation

@rambleraptor
Contributor

Part of #3270

Rationale for this change

This adds support for indexing and looking up equality deletes in the DeleteFileIndex.

I am deliberately skipping equality deletes in _read_all_delete_files, because reading them there would crash.

Are these changes tested?

I created some equality deletes by hand and had PyIceberg read them to inspect the resulting indexes, which worked as expected. If you know a way to generate equality deletes, I can add tests for those as well.

Are there any user-facing changes?

  • Adds support for equality deletes in DeleteFileIndex

@ndrluis
Collaborator

ndrluis commented Apr 26, 2026

@rambleraptor I think we should add a regression test for schema evolution here. This pruning path assumes the current table type for an equality field is the same type that was used when the data file and equality delete were written, which is not always true after a legal promotion like int -> long. In that case, historical manifests still contain 4-byte int bounds, and decoding them with the current LongType can fail in DeleteFileIndex.for_data_file(...).

For reference, Iceberg Java had to address the same schema-evolution issue in apache/iceberg#15268, where the fix was to avoid assuming the current schema is always the right one for equality-delete field resolution.
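
To make the failure mode concrete, here is a minimal standalone sketch that uses only the standard library rather than PyIceberg's own conversion code. It assumes the Iceberg single-value encoding for bounds (int as 4 bytes little-endian, long as 8 bytes little-endian). The names `decode_bound` and `decode_bound_promotion_safe` are hypothetical and exist only for illustration; the length-based fallback is one possible workaround, not necessarily the approach this PR or the Java fix should take.

```python
import struct

# Assumed encoding for serialized bounds: int = 4 bytes LE, long = 8 bytes LE.
_FORMATS = {"int": "<i", "long": "<q"}


def decode_bound(raw: bytes, type_name: str) -> int:
    """Decode a serialized lower/upper bound using the given primitive type."""
    return struct.unpack(_FORMATS[type_name], raw)[0]


def decode_bound_promotion_safe(raw: bytes, current_type: str) -> int:
    """Hypothetical helper: treat a 4-byte bound on a long column as a
    pre-promotion int instead of decoding it with the current type."""
    if current_type == "long" and len(raw) == 4:
        return decode_bound(raw, "int")
    return decode_bound(raw, current_type)


# A bound written to a manifest while the column was still an int.
historical_bound = struct.pack("<i", 42)

# Decoding it with the current (promoted) type fails on the buffer size.
try:
    decode_bound(historical_bound, "long")
    failed = False
except struct.error:
    failed = True

# The promotion-aware variant recovers the original value.
recovered = decode_bound_promotion_safe(historical_bound, "long")
```

A regression test along these lines would write a data file and an equality delete against an int column, promote the column to long, and then check that the delete lookup for that data file neither raises nor prunes incorrectly.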



2 participants