chore: add internal markdown link check #21831
Geethapranay1 wants to merge 3 commits into apache:main
@comphead PTAL

    Run the internal markdown link check locally:
    ```shell
For local runs, we need to document that `mapfile` is required.
Also, `mapfile` is not available on macOS.
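A minimal sketch of a `mapfile`-free way to collect markdown paths, assuming the script only needs to read a newline-separated list into an array. `mapfile` was added in bash 4, while macOS ships bash 3.2 by default; a `while read` loop works on both. The helper name `read_lines_into_array` and the sample input are hypothetical and not taken from the PR's script.

```shell
#!/usr/bin/env bash
# Portable stand-in for: mapfile -t files < <(some_command)
# Works on bash 3.2 (the macOS default) as well as bash 4+.

files=()

# Hypothetical helper: appends each line of stdin to the global `files` array.
read_lines_into_array() {
  local line
  # `|| [ -n "$line" ]` keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    files+=("$line")
  done
}

# Illustrative input; a real script might feed this from `git ls-files '*.md'`.
read_lines_into_array <<'EOF'
docs/source/user-guide/cli/overview.md
docs/example/second-page.md
EOF

printf 'collected %d file(s)\n' "${#files[@]}"
```

Because the here-document feeds the function directly (no pipeline), the loop runs in the current shell and the `files` array keeps its contents afterwards.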
comphead left a comment
Thanks @Geethapranay1 for the PR
I tried to check it locally, but on macOS I'm missing some bash commands. Could you attach to the PR an example of how a bad link is reported?
I'm assuming it should be very clear to the user which link needs to be fixed.
Thanks @comphead for the detailed review and for testing this on macOS. I will:
@comphead PTAL, I have made the changes requested in the review.
```text
[docs/source/user-guide/cli/overview.md]:
[ERROR] file:///.../docs/source/user-guide/cli/missing-page.md | Cannot find file: File not found. Check if file exists and path is correct
```
Oh, lychee doesn't point to a specific line in the md file?
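The lychee output above names the source file and the resolved target, but not the line. A minimal sketch of one way to locate it by hand, assuming the link in the markdown source contains the target's file name: search the reported source file with `grep -n`. The helper name `find_link_line` and the throwaway sample file are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "line:content" for every line of a markdown file
# that mentions the given link target (matched as a fixed string, not a regex).
find_link_line() {
  local md_file=$1 target=$2
  grep -n -F -- "$target" "$md_file"
}

# Self-contained demo using a temporary file.
sample=$(mktemp)
printf '%s\n' \
  '# CLI overview' \
  '' \
  'See the [missing page](missing-page.md) for details.' > "$sample"

find_link_line "$sample" "missing-page.md"
# prints: 3:See the [missing page](missing-page.md) for details.
rm -f "$sample"
```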
comphead left a comment
Thanks @Geethapranay1, this PR makes a lot of sense.
Which issue does this PR close?
Rationale for this change
DataFusion did not have a CI check for broken links in markdown content. The docs workflows build and deploy docs, and the dev workflow checks formatting and spelling, but none of them validate link targets.
This PR adds a dedicated link check for internal markdown links so broken references fail early in PRs.
I kept the scope internal-only to avoid flaky CI failures from external websites and rate limits.
Rust doc comments remain covered by the existing rustdoc CI job.
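To illustrate what an internal-only check means in practice, here is a simplified sketch in plain bash. It is not lychee's algorithm and not the PR's script: it only handles inline links of the form `[text](target)`, skips external and same-page links, and tests whether the referenced file exists relative to the source file. The function name `check_internal_links` and the demo files are hypothetical.

```shell
#!/usr/bin/env bash
# Simplified, hypothetical internal-link check (inline [text](target) links only).
check_internal_links() {
  local md_file=$1 dir target path status=0
  dir=$(dirname "$md_file")
  while IFS= read -r target; do
    target=${target#"]("}   # drop the leading "]("
    target=${target%")"}    # drop the trailing ")"
    case $target in
      http://*|https://*|mailto:*|'#'*) continue ;;  # external or same-page link
    esac
    path=${target%%#*}      # ignore any #fragment
    if [ ! -e "$dir/$path" ]; then
      echo "[ERROR] $md_file -> $target"
      status=1
    fi
  done < <(grep -o -E '\]\([^)[:space:]]+\)' "$md_file")
  return "$status"
}

# Self-contained demo in a throwaway directory.
demo=$(mktemp -d)
touch "$demo/existing.md"
printf '%s\n' \
  '[ok](existing.md#section)' \
  '[external](https://example.com/gone)' \
  '[broken](missing-page.md)' > "$demo/index.md"

check_internal_links "$demo/index.md" || echo "broken internal links found"
rm -rf "$demo"
```

In the demo, the external URL is never fetched, so the result depends only on the files in the repository, which is the property that keeps such a check stable in CI.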
What changes are included in this PR?
- `dev.yml`.
- `LYCHEE_VERSION` pin in `tool_versions.sh`.
- `markdown_link_check.sh` to run lychee on the selected markdown paths.
- `lychee.toml` with internal-link policy and exclusions.
- `.asf.yaml`.
- `roadmap.md`, `49.0.0.md`, `overview.md`, `dataframe.md`, `format_options.md`

Are these changes tested?
Yes,
- `python3 ci/scripts/check_asf_yaml_status_checks.py` passed.
- `bash -n ci/scripts/markdown_link_check.sh` passed.
- `bash ci/scripts/markdown_link_check.sh` passed with 0 errors.
- `cargo fmt --all --check` passed.

OK: All 5 required_status_checks match existing GitHub Actions jobs.
🔍 12824 Total (in 0s) ✅ 490 OK 🚫 0 Errors 👻 12334 Excluded
Are there any user-facing changes?
No,
There is one contributor-facing CI change: PRs now fail when internal markdown links break in the checked markdown files.