fix: preserve leading empty lines in code cells (if any) during formatting#953
fix: preserve leading empty lines in code cells (if any) during formatting#953mcanouil wants to merge 2 commits intoquarto-dev:mainfrom
Conversation
Before this change, formatting an R code cell with Air stripped all leading empty lines between option directives and code. Python cells were unaffected because Python's virtual documents include inject lines (# type: ignore, # flake8: noqa) that pushed user code off position 0, inadvertently buffering it from the formatter. The fix is formatter-agnostic: leading empty lines are stripped from the virtual document before it is passed to any language formatter, and the line offset is adjusted so formatter edits land in the correct position. A normaliseEdit collapses two or more leading empty lines to exactly one; this fires as a Quarto-level operation even when no language formatter is active, and also for cells that consist only of option directives followed by blank lines.
|
Do I understand it correctly that the code you added here undoes some of the work that Air does? |
|
Yes😅 (but before it does it, here it does not even know there were empty lines at the top of the code cells) Ruff/Black would have done the same thing if the extension was not injected a comment at the top of the virtual doc.🤷♂️ |
|
To clarify, this PR creates normalisation of the leading empty lines regardless of languages and formatters, this makes things consistent. |
|
@DavisVaughan does this solution seem reasonable to you? |
| const blockRange = new Range( | ||
| new Position(block.range.start.line, block.range.start.character), | ||
| new Position(block.range.end.line, block.range.end.character) | ||
| ); |
There was a problem hiding this comment.
Is this different from block.range, which already exists?
| if (normalizeEdit && !blockRange.contains(normalizeEdit.range)) { | ||
| return undefined; | ||
| } |
There was a problem hiding this comment.
I don't see how this could happen, it looks like the way you defined normalizeEdit would guarantee that it lies within the range of the block
There was a problem hiding this comment.
(If you remove it here, also remove it again below)
(If you don't remove it, maybe ensure we have a test that shows why it is required)
| // Include normalizeEdit in the guard so it is validated along with formatter | ||
| // edits — all edits must be in range or none are applied. |
There was a problem hiding this comment.
again, it seems like you built the normalize edit from the original range itself, so this feels circuitous to me, but maybe im missing something.
| if (codeLines.every(l => l.trim() === "")) { | ||
| return undefined; | ||
| if (normalizeEdit && !blockRange.contains(normalizeEdit.range)) { | ||
| return undefined; |
There was a problem hiding this comment.
Non blocking note: In terms of code style / ease of maintenance, it might be nice if we could structure this function to not have early exists with hard to remember corner cases like the normalizeEdit behavior.
So, like, in this case what you might consider doing is assigning edits = [] / undefined here when there are no codeLines instead of letting executeFormatDocumentProvider() run, but then let everything else fall through the "happy path" as much as possible with an empty edits array.
In my head, the benefit of this is that the only time normalizeEdits would come into play is at the very end, with
if (normalizeEdit) {
adjustedEdits.push(normalizeEdit);
}
and otherwise you don't have to think about it, which would be quite nice
| test("single leading empty line in R cell is preserved after formatting", async function () { | ||
| const formattedResult = await testFormatter( | ||
| "format-r-leading-empty-lines.qmd", | ||
| [8, 0], | ||
| rAssignmentFormatter, | ||
| "r" | ||
| ); | ||
|
|
||
| assert.ok( | ||
| formattedResult.includes("#| label: one-empty-line"), | ||
| "Option directive should be preserved" | ||
| ); | ||
| assert.ok( | ||
| formattedResult.includes("x <- 1"), | ||
| "Code should be reformatted" | ||
| ); | ||
| assert.ok( | ||
| /one-empty-line\n\n/.test(formattedResult), | ||
| "Single leading empty line should be preserved" | ||
| ); | ||
| assert.ok( | ||
| !/one-empty-line\n\n\n/.test(formattedResult), | ||
| "No extra empty line should be introduced" | ||
| ); | ||
| }); |
There was a problem hiding this comment.
Another non blocking note:
I think these kinds of tests are much easier to skim if they are structured as snapshot tests, kind of like https://github.com/quarto-dev/quarto/blob/5f420f8788de39ea4b37f33dd1cba0a3268db5a8/apps/vscode/src/test/vdoc.test.ts
The idea being that for each test you have some on-disk input.qmd (here, your format-r-leading-empty-lines.qmd) and some known on-disk output.qmd. You:
- Format
content(input.qmd)->formatted - Compare
content(output.qmd) === formatted
That way its fairly easy to just look at output.qmd and ensure it has all the expected properties we care about
Each test could be structured as a folder, like
r-multiple-empty-lines-collapsed-to-one/
input.qmd
output.qmd
|
@DavisVaughan thanks for the review. I'll get back on this PR probably this weekend or next week. |
formatBlocknow strips leading empty lines from the virtual document and adjusts the line offset accordingly, so formatter edits land at the correct position.normalizeEditcollapses two or more leading empty lines to exactly one; this fires even when no language formatter is active.Tested interactively on a VSIX installed build with (my formatters: Ruff and Air):
Which lead to:
Fixes #950