Skip to content

fix: preserve leading empty lines in code cells (if any) during formatting#953

Open
mcanouil wants to merge 2 commits intoquarto-dev:mainfrom
mcanouil:fix/preserve-leading-empty-lines-formatting
Open

fix: preserve leading empty lines in code cells (if any) during formatting#953
mcanouil wants to merge 2 commits intoquarto-dev:mainfrom
mcanouil:fix/preserve-leading-empty-lines-formatting

Conversation

@mcanouil
Copy link
Copy Markdown
Contributor

@mcanouil mcanouil commented Apr 22, 2026

  • Formatting an R code cell with Air stripped all leading empty lines between option directives and code. Python cells were unaffected because their virtual documents include inject lines that push user code off position 0.
  • formatBlock now strips leading empty lines from the virtual document and adjusts the line offset accordingly, so formatter edits land at the correct position.
  • A normalizeEdit collapses two or more leading empty lines to exactly one; this fires even when no language formatter is active.
  • New tests cover: single empty line preserved, multiple collapsed to one, no empty lines unaffected, a hostile formatter proving empty lines are hidden, and normalisation without a formatter — for both R and Python.

Tested interactively on a VSIX installed build with (my formatters: Ruff and Air):

---
title: Leading Empty Lines
format: html
---

```{r}
#| label: one-empty-line

x<-1
```

```{r}
#| label: two-empty-lines


x<-2
```

```{r}
#| label: no-empty-lines
x<-3
```

```{python}
#| label: no-empty-line
x=1;y=2
```

```{python}
#| label: one-empty-line

x=1;y=2
```

```{python}
#| label: two-empty-lines


x=3;y=4
```

Which lead to:

---
title: Leading Empty Lines
format: html
---

```{r}
#| label: one-empty-line

x <- 1
```

```{r}
#| label: two-empty-lines

x <- 2
```

```{r}
#| label: no-empty-lines
x <- 3
```

```{python}
#| label: no-empty-line
x = 1
y = 2
```

```{python}
#| label: one-empty-line

x = 1
y = 2
```

```{python}
#| label: two-empty-lines

x = 3
y = 4
```

Fixes #950

Before this change, formatting an R code cell with Air stripped all
leading empty lines between option directives and code. Python cells
were unaffected because Python's virtual documents include inject
lines (# type: ignore, # flake8: noqa) that pushed user code off
position 0, inadvertently buffering it from the formatter.

The fix is formatter-agnostic: leading empty lines are stripped from
the virtual document before it is passed to any language formatter,
and the line offset is adjusted so formatter edits land in the correct
position. A normaliseEdit collapses two or more leading empty lines to
exactly one; this fires as a Quarto-level operation even when no
language formatter is active, and also for cells that consist only of
option directives followed by blank lines.
@mcanouil mcanouil changed the title fix: preserve leading empty lines in code cells during formatting fix: preserve leading empty lines in code cells (if any) during formatting Apr 22, 2026
@cscheid
Copy link
Copy Markdown
Member

cscheid commented Apr 22, 2026

Do I understand it correctly that the code you added here undoes some of the work that Air does?

@mcanouil
Copy link
Copy Markdown
Contributor Author

mcanouil commented Apr 22, 2026

Yes😅 (but before it does it, here it does not even know there were empty lines at the top of the code cells)

Ruff/Black would have done the same thing if the extension was not injected a comment at the top of the virtual doc.🤷‍♂️

@mcanouil
Copy link
Copy Markdown
Contributor Author

To clarify, this PR creates normalisation of the leading empty lines regardless of languages and formatters, this makes things consistent.

@vezwork vezwork requested a review from DavisVaughan April 24, 2026 15:31
@vezwork
Copy link
Copy Markdown
Member

vezwork commented Apr 24, 2026

@DavisVaughan does this solution seem reasonable to you?

Copy link
Copy Markdown
Member

@vezwork vezwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally and it works how I would expect. The code and tests look good upon quick review.

Comment on lines +330 to +333
const blockRange = new Range(
new Position(block.range.start.line, block.range.start.character),
new Position(block.range.end.line, block.range.end.character)
);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this different from block.range, which already exists?

Comment on lines +354 to +356
if (normalizeEdit && !blockRange.contains(normalizeEdit.range)) {
return undefined;
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how this could happen, it looks like the way you defined normalizeEdit would guarantee that it lies within the range of the block

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(If you remove it here, also remove it again below)

(If you don't remove it, maybe ensure we have a test that shows why it is required)

Comment on lines +391 to +392
// Include normalizeEdit in the guard so it is validated along with formatter
// edits — all edits must be in range or none are applied.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, it seems like you built the normalize edit from the original range itself, so this feels circuitous to me, but maybe im missing something.

Comment on lines 353 to +355
if (codeLines.every(l => l.trim() === "")) {
return undefined;
if (normalizeEdit && !blockRange.contains(normalizeEdit.range)) {
return undefined;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non blocking note: In terms of code style / ease of maintenance, it might be nice if we could structure this function to not have early exists with hard to remember corner cases like the normalizeEdit behavior.

So, like, in this case what you might consider doing is assigning edits = [] / undefined here when there are no codeLines instead of letting executeFormatDocumentProvider() run, but then let everything else fall through the "happy path" as much as possible with an empty edits array.

In my head, the benefit of this is that the only time normalizeEdits would come into play is at the very end, with

  if (normalizeEdit) {
    adjustedEdits.push(normalizeEdit);
  }

and otherwise you don't have to think about it, which would be quite nice

Comment on lines +528 to +552
test("single leading empty line in R cell is preserved after formatting", async function () {
const formattedResult = await testFormatter(
"format-r-leading-empty-lines.qmd",
[8, 0],
rAssignmentFormatter,
"r"
);

assert.ok(
formattedResult.includes("#| label: one-empty-line"),
"Option directive should be preserved"
);
assert.ok(
formattedResult.includes("x <- 1"),
"Code should be reformatted"
);
assert.ok(
/one-empty-line\n\n/.test(formattedResult),
"Single leading empty line should be preserved"
);
assert.ok(
!/one-empty-line\n\n\n/.test(formattedResult),
"No extra empty line should be introduced"
);
});
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another non blocking note:

I think these kinds of tests are much easier to skim if they are structured as snapshot tests, kind of like https://github.com/quarto-dev/quarto/blob/5f420f8788de39ea4b37f33dd1cba0a3268db5a8/apps/vscode/src/test/vdoc.test.ts

The idea being that for each test you have some on-disk input.qmd (here, your format-r-leading-empty-lines.qmd) and some known on-disk output.qmd. You:

  • Format content(input.qmd) -> formatted
  • Compare content(output.qmd) === formatted

That way its fairly easy to just look at output.qmd and ensure it has all the expected properties we care about

Each test could be structured as a folder, like

r-multiple-empty-lines-collapsed-to-one/
  input.qmd
  output.qmd

@mcanouil
Copy link
Copy Markdown
Contributor Author

@DavisVaughan thanks for the review. I'll get back on this PR probably this weekend or next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add empty line after quarto comments

4 participants