gh-138907: Support RFC 9309 in robotparser #138908

Merged
serhiy-storchaka merged 9 commits into python:main from serhiy-storchaka:robotparser-rfc9309
May 4, 2026
@serhiy-storchaka serhiy-storchaka commented Sep 15, 2025

  • Empty lines are now always ignored instead of separating groups.
  • A "user-agent" line after a rule starts a new group.
  • Groups matching the same user agent are now merged.
  • The rule with the longest match wins, instead of the first matching rule.
  • If matches are equally long, the "Allow" rule wins over "Disallow".
  • The special characters "$" and "*" are now supported in rules.
  • A full match is preferred for the user agent.
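
To make the rule-matching items above concrete, here is a minimal, self-contained sketch of longest-match selection with "*" and "$" support. It is not the code from this PR: the helper names `pattern_to_regex` and `is_allowed` are hypothetical, specificity is assumed to be the length of the pattern, and empty patterns are assumed to match nothing.

```python
import re


def pattern_to_regex(pattern):
    """Compile a robots.txt path pattern into a regular expression.

    "*" matches any run of characters; a trailing "$" anchors the end
    of the path. (Hypothetical helper, for illustration only.)
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules, path):
    """Decide access for `path` given rules as (allow, pattern) pairs."""
    best_len = -1
    best_allow = True  # no matching rule: access is allowed
    for allow, pattern in rules:
        if not pattern:
            continue  # assumption: an empty pattern matches nothing
        # re.match anchors at the start of the path, like a prefix match.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)  # assumption: specificity = pattern length
            # A longer pattern wins; on a tie, "Allow" wins over "Disallow".
            if length > best_len or (length == best_len and allow):
                best_len, best_allow = length, allow
    return best_allow


rules = [
    (False, "/private/"),
    (True, "/private/public/"),
    (False, "/*.pdf$"),
    (False, "/docs"),
    (True, "/docs"),
]
print(is_allowed(rules, "/private/notes"))        # False: only "/private/" matches
print(is_allowed(rules, "/private/public/page"))  # True: the longer Allow wins
print(is_allowed(rules, "/files/report.pdf"))     # False: "/*.pdf$" matches
print(is_allowed(rules, "/docs/intro"))           # True: equal length, Allow wins
```

Under the previous first-match behaviour described above, the order of the rules would have decided these cases. In this sketch the order is irrelevant.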

Comment thread Lib/urllib/robotparser.py Outdated
@serhiy-storchaka serhiy-storchaka marked this pull request as ready for review April 25, 2026 11:31
@serhiy-storchaka serhiy-storchaka added the "needs backport to 3.13" and "needs backport to 3.14" labels (bugs and security fixes) Apr 25, 2026
@serhiy-storchaka serhiy-storchaka enabled auto-merge (squash) May 4, 2026 17:37
@serhiy-storchaka serhiy-storchaka merged commit bc285e5 into python:main May 4, 2026
52 checks passed
@miss-islington-app

Thanks @serhiy-storchaka for the PR 🌮🎉.. I'm working now to backport this PR to: 3.13, 3.14.
🐍🍒⛏🤖

@miss-islington-app

Sorry, @serhiy-storchaka, I could not cleanly backport this to 3.13 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker bc285e583286c739e553e49c19fd946cb63432c7 3.13

@bedevere-app

bedevere-app Bot commented May 4, 2026

GH-149374 is a backport of this pull request to the 3.14 branch.

@bedevere-app bedevere-app Bot removed the "needs backport to 3.14" label (bugs and security fixes) May 4, 2026
@bedevere-app

bedevere-app Bot commented May 4, 2026

GH-149376 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app Bot removed the "needs backport to 3.13" label (bugs and security fixes) May 4, 2026
serhiy-storchaka added a commit that referenced this pull request May 4, 2026
)

* empty lines are always ignored instead of separating groups
* the "user-agent" line after a rule starts a new group
* groups matching the same user agent are now merged
* the rule with the longest match wins instead of the first matching rule
* in case of equal matches, the “Allow” rule wins over “Disallow”
* special characters “$” and “*” are now supported in rules
* prefer full match for user agent
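
The grouping items in this list (empty lines ignored, a "user-agent" line after a rule starting a new group, groups for the same agent merged, full user-agent match preferred) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the implementation in `Lib/urllib/robotparser.py`: `parse_groups` and `select_rules` are hypothetical names, and treating a "partial" match as a substring match is an assumption.

```python
def parse_groups(text):
    """Return {lowercased user agent: [(allow, pattern), ...]}."""
    groups = {}
    current_agents = []
    seen_rule = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or ":" not in line:
            continue  # empty lines never end a group
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if seen_rule:
                # A user-agent line after a rule starts a new group.
                current_agents = []
                seen_rule = False
            agent = value.lower()
            current_agents.append(agent)
            # setdefault merges repeated groups for the same agent.
            groups.setdefault(agent, [])
        elif key in ("allow", "disallow"):
            if not current_agents:
                continue  # assumption: rules before any group are ignored
            seen_rule = True
            for agent in current_agents:
                groups[agent].append((key == "allow", value))
    return groups


def select_rules(groups, agent):
    """Prefer an exact agent match, then a partial one, then "*"."""
    agent = agent.lower()
    if agent in groups:
        return groups[agent]
    for name, rules in groups.items():
        # Assumption: "partial match" means the group name is a substring.
        if name and name != "*" and name in agent:
            return rules
    return groups.get("*", [])


sample = """\
User-agent: a

User-agent: b
Disallow: /x

User-agent: a
Allow: /x/y
"""
groups = parse_groups(sample)
print(groups["a"])  # [(False, '/x'), (True, '/x/y')]
print(groups["b"])  # [(False, '/x')]
```

In the sample, the blank line between the two "User-agent" lines does not split the group, so agent `a` receives the `Disallow: /x` rule, and the later `a` group is merged into the same rule list.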

(cherry picked from commit bc285e5)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka added a commit that referenced this pull request May 4, 2026
)

(cherry picked from commit bc285e5)