fix: preserve multiline source expression whitespace#323
Conversation
AI Code ReviewWe are reviewing a PR that adds a new The PR does not add new MDL syntax (it's about preserving existing source text) but modifies the internal representation and handling of expressions. Let's go through the checklist: Overlap & duplication
MDL syntax design
Full-stack consistency
Test coverage
Security & robustness
Scope & atomicity
Code quality
Bugs & correctness
Looking at the test case: The declare initial value is split over multiple lines with The test expects: The implementation in
The The function One potential issue: the function DECLARE $Endpoint String = @Synthetic.Endpoint The initial value starts at In the test, the expression string is: This string does not have leading or trailing whitespace (it starts with '@' and ends with ''''). So Similarly, for the log template param, we expect "toString($Count)\n\n". This string starts with 't' and ends with '\n' (two newlines). The trailing newlines would be removed by So if the original text of the expression (from the antlr context) includes trailing whitespace (like newlines) then Looking at the test: the first template parameter is It looks for the gap between the end of the current expression and the start of the next parameter (or the end of the template params). It then if the delimiter is a comma, it takes the text up to the comma. In the test, the gap between the end of the first expression and the start of the second expression is two newlines and then a comma? Actually, the input is: , {2} = $Endpoint); So after
But note: the next parameter starts at the
So the gap string is: "\n\n , {2} = $Endpoint" (but note: the actual input has two newlines, then a space, then a comma, then a space, then Then in But the test expects the source to be exactly "toString($Count)\n\n" (without the trailing space). However, note the test input has two newlines and then a comma and then a space. The gap we captured was "\n\n " (two newlines and a space). The test expects only two newlines. This might be a discrepancy. However, note the test says: "Log template parameter keeps trailing blank-line whitespace." and the test expects the source to be "toString($Count)\n\n". The actual input has two newlines and then a comma and a space. The blank line is the two newlines? The comma and space are not part of the whitespace we want to keep? We only want the whitespace between the end of the expression and the comma? The function But the test expects only the two newlines, not the space. However, note that the space after the two newlines and before the comma is not part of the "blank-line whitespace" we want to keep? The blank line is the two newlines. The space is incidental and might be considered part of the formatting. But the PR description says: "Preserve newline-only trailing whitespace between log template parameters." So we only want to preserve the newlines, not the space. This suggests that the function Looking at the test input string in the PR: , {2} = $Endpoint); There is a blank line (two newlines) between So the test input must not have a space between the two newlines and the comma? Actually, the test input as written in the PR has: , {2} = $Endpoint); There are two newlines after , {2} = $Endpoint); The line with the comma starts with two spaces? Actually, the indentation level: the WHILE block is indented by two spaces, then the LOG statement is indented by two more? But we don't have to worry because the antlr token stream will capture the exact text. Given that the test passes, we assume the implementation is correct for the test case. Now, let's check for any potential issues:
Automated review via OpenRouter (Nemotron Super 120B) — workflow source |
|
Updated the branch with a source-preservation guard for string literals: punctuation inside single-quoted literals no longer forces
|
ako
left a comment
There was a problem hiding this comment.
Review: fix: preserve multiline source expression whitespace
Summary: Solid approach — wrapping parsed expressions in a SourceExpr that preserves the original text is the right design for roundtrip fidelity. The core implementation is correct. Several issues worth addressing.
Bug in resolveAssociationPaths for SourceExpr
case *ast.SourceExpr:
if e.Source != "" {
return e
}
return &ast.SourceExpr{
Expression: fb.resolveAssociationPaths(e.Expression),
Source: e.Source, // e.Source is "" here
}The second branch creates a SourceExpr{Source: ""} wrapping a resolved inner expression. buildSourceExpression only creates a SourceExpr when source != "", so this branch is effectively unreachable in practice — but if it were reached, the empty-source wrapper would be silently peeled off by expressionToString's e.Source != "" guard. Simplify to just return fb.resolveAssociationPaths(e.Expression) for the empty case.
More importantly: when Source != "" the inner e.Expression is returned without recursing into it for association path resolution. This means any association paths inside a source-preserved expression are never resolved. Whether this is intentional (the source text is already in final form) or an oversight depends on how association path resolution interacts with source-preserved expressions — worth a comment clarifying the intent.
retrieveInterClauseWhitespaceSuffix hard-codes 4-space indent
for _, suffix := range []string{"\r\n ", "\n "} {
if strings.HasSuffix(gap, suffix) {
return strings.TrimSuffix(gap, suffix)
}
}This strips exactly 4 trailing spaces from the inter-clause gap, assuming the MDL formatter will re-add 4 spaces of indentation at output time. If a user writes the LIMIT/OFFSET clause with a different indent (tabs, 2 spaces, or 8 spaces), the suffix check fails and the function returns "", silently dropping any trailing whitespace. A more robust approach would strip trailing whitespace dynamically rather than matching fixed suffixes.
shouldPreserveExpressionSource — not( case lacks explanation
if strings.Contains(strings.ToLower(source), "not(") {
return true
}This could false-positive on identifiers or attribute names containing the substring not( (unlikely but possible). More importantly, there's no comment explaining why not( specifically requires source preservation. Adding one (e.g., "reconstructed not(x) would normalise to not (x) — preserve compact form") would help future readers.
buildSetStatement calls buildExpression twice
The pattern calls buildExpression(expr) for list/aggregate detection, then overwrites valueExpr at the end with buildSourceExpression(valueExprCtx). This calls buildExpression a second time internally. The approach is clear and correct, but the double-parse is worth a brief comment noting that the first parse is for semantic detection and the second is for source preservation.
Expression traversal coverage
The PR adds SourceExpr handling to expressionToString, expressionToXPath, resolveAssociationPaths, and unwrapParenCall. Other expression traversal switch statements in cmd_microflows_helpers.go (e.g. the type inference / XPath conversion switches) do not yet handle SourceExpr. Since SourceExpr now wraps almost all parsed expressions, any unhandled switch falls through to default (usually returning ""). Worth auditing whether inferExpressionType or other callers that switch on expression types need a *ast.SourceExpr case that unwraps to the inner expression.
appendLogNodeTrailingWhitespace comment is fragile
// formatAction always writes one space between the node and message slots.
// Preserve the source line break but avoid duplicating indentation spaces.
trailing = strings.TrimRight(trailing, " \t")This is a coupling to the formatter's output format. If formatAction changes its spacing, this trim logic would either over-strip or under-strip. A comment linking to the specific formatter call would help future maintenance.
Tests
Good coverage across the key cases: multiline declare, multiline log template, call/change/show-message argument slots, retrieve LIMIT/OFFSET, string-literal punctuation not triggering preservation. The TestActionExpressionSlotsPreserveSourceWhitespace and TestTrailingExpressionWhitespacePreservedForRoundtripSlots tests are comprehensive. The TestShouldPreserveExpressionSourceIgnoresStringLiteralPunctuation directly tests the heuristic including the doubled-quote escape case.
Verdict
Approve with the resolveAssociationPaths bug flagged as the highest-priority fix, and the hard-coded 4-space suffix in retrieveInterClauseWhitespaceSuffix as worth addressing. The core SourceExpr design is clean and correct.
AI Code ReviewCritical Issues
Moderate Issues
Minor Issues
What Looks Good
RecommendationApprove the PR. It correctly addresses the root cause, includes thorough test coverage, maintains backward compatibility, and follows the project's architectural principles (no new MDL syntax, full pipeline consistency for existing features). The fix ensures roundtrip stability for microflow source text as required. Automated review via OpenRouter (Nemotron Super 120B) — workflow source |
AI Code ReviewCritical IssuesNone found. Moderate IssuesNone found. Minor Issues
What Looks Good
RecommendationApprove. The PR effectively fixes the reported issue with minimal risk, follows existing patterns, and includes thorough testing. The minor concerns about code complexity do not block merging given the correctness and test coverage. Future refactoring could simplify the whitespace preservation helpers, but the current implementation is correct and maintainable for now. Automated review via OpenRouter (Nemotron Super 120B) — workflow source |
AI Code ReviewCritical IssuesNone found. Moderate IssuesNone found. Minor Issues
What Looks Good
RecommendationApprove the PR. The core fix for issue #322 is well-implemented and tested. The unrelated association changes, while potentially part of a larger effort, do not interfere with the source preservation fix and can be addressed in follow-up if needed. The PR satisfies the checklist requirements for a bug fix: it includes tests first, avoids duplication, maintains scope, and improves code quality. Automated review via OpenRouter (Nemotron Super 120B) — workflow source |
Replaces the hard-coded "\n " / "\r\n " suffix matches with a backwards walk that strips a trailing newline plus whatever indent (spaces or tabs) follows it. The formatter re-emits its own indent before each clause, so the original source's trailing newline+indent is structural and would duplicate after a roundtrip — but the old 4-space match silently dropped the entire whitespace gap if the formatter or the source used a different width. Adds a doc comment explaining the contract. Also documents the `not(` heuristic in shouldPreserveExpressionSource so future readers know it preserves the compact form against the parser's `not (<expr>)` re-emission. Addresses ako's review on PR mendixlabs#323. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
AI Code ReviewWhat Looks GoodThis PR effectively addresses issue #322 by preserving multiline source expression whitespace in MDL declare/log expressions. The solution is well-structured:
RecommendationApprove - This PR is ready to merge. It correctly fixes the reported issue with appropriate test coverage and follows the project's architectural patterns. No changes are needed before merging. Automated review via OpenRouter (Nemotron Super 120B) — workflow source |
Symptom: describe/exec round-trips collapsed multiline declaration and log-template expressions, producing cosmetic drift and losing source layout that users intentionally wrote. Root cause: declare initial values and log message/node expressions were rebuilt from parsed expression nodes instead of retaining source text, and template parameter gaps discarded newline-only whitespace between parameters. Fix: add SourceExpr as a small expression wrapper for retained source text, use it for declare/log/while expressions that need source preservation, serialize it back through the executor, and append newline-only trailing whitespace between log template parameters to the captured source expression. Tests: added parser coverage for multiline declare values and log template parameters with trailing blank lines; validated make build, make test, and make lint-go.
…ation Adds an MDL script under mdl-examples/bug-tests/ exercising both multiline DECLARE initial values and inter-parameter blank lines in LOG template parameters. The describe → exec → describe fixpoint confirms SourceExpr-wrapped expressions retain their original whitespace. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Symptom: a plain string literal containing punctuation such as `!` or `:` could be wrapped as a SourceExpr, causing downstream log-template formatting to emit extra quoting. Root cause: shouldPreserveExpressionSource scanned every operator-like byte without tracking whether it was inside a single-quoted string literal. Fix: skip compact-operator detection while inside single-quoted literals, including doubled single-quote escapes. Tests: add visitor coverage for punctuation and escaped quotes inside string literals while keeping compact operators outside literals preserved.
Symptom: microflow roundtrips still normalized expression source text in common action slots such as call arguments, member assignments, show-message arguments, retrieve WHERE expressions, IF conditions, and RETURN values. Root cause: the visitor only wrapped a small subset of expressions in SourceExpr. Other expression-bearing fields still used buildExpression directly, so the writer re-rendered them with canonical spacing and collapsed line breaks. Fix: route expression slots through buildSourceExpression, preserve newline-only trailing whitespace between list elements, and unwrap SourceExpr when detecting rule split calls so source preservation does not hide the structured expression. Tests: added a visitor regression covering create/change assignments, Java action arguments, IF conditions, show-message arguments, and RETURN values; ran make build, make lint-go, and make test.
Symptom: describe/exec/describe still drifted for expression slots that had layout whitespace immediately before delimiters such as semicolons, closing parentheses, or LOG message expressions. Root cause: source-preservation helpers kept multiline expression bodies but ignored delimiter-adjacent whitespace unless it contained a newline, and LOG node/message gaps were not captured at all. Fix: preserve whitespace-only gaps before expression-list delimiters and statement semicolons, and carry multiline LOG node-to-message gaps through the node expression without duplicating formatter spacing. Tests: add a visitor regression covering DECLARE, SET, CHANGE, LOG node gaps, and LOG template parameter trailing whitespace; make build and make test pass.
Symptom: database retrieve roundtrips could collapse blank lines before OFFSET and move a semicolon onto the OFFSET expression line. Root cause: retrieve LIMIT and OFFSET used GetText(), which kept the expression value but discarded delimiter-adjacent source whitespace that the describer can emit from existing model strings. Fix: parse retrieve range expressions from original source text and preserve only whitespace that belongs to the range expression slots, accounting for the formatter's built-in OFFSET clause prefix. Tests: add a visitor regression covering multiline LIMIT/OFFSET source whitespace and run make build plus make test.
Symptom: retrieve WHERE clauses using compact XPath paths could roundtrip as canonicalized paths with spaces around '/', producing cosmetic drift. Root cause: source-expression preservation did not treat retrieve XPath paths as source-sensitive, and XPath constraints were built directly from normalized XPath AST nodes. Fix: preserve original source text for XPath constraints and retrieve WHERE expressions that contain path separators without changing normal expression parsing. Tests: added a retrieve XPath path-spacing regression and ran make test.
Symptom: resolveAssociationPaths could return a SourceExpr wrapper with an empty Source field after resolving the inner expression. Root cause: the SourceExpr branch rebuilt the wrapper even when there was no source text to preserve. That made the wrapper semantically redundant and left future callers dependent on expressionToString peeling it away. Fix: keep non-empty SourceExpr values verbatim for whitespace-preserving roundtrips, but unwrap empty SourceExpr values to the resolved inner expression. Tests: make build; focused SourceExpr association-resolution tests; make test.
Replaces the hard-coded "\n " / "\r\n " suffix matches with a backwards walk that strips a trailing newline plus whatever indent (spaces or tabs) follows it. The formatter re-emits its own indent before each clause, so the original source's trailing newline+indent is structural and would duplicate after a roundtrip — but the old 4-space match silently dropped the entire whitespace gap if the formatter or the source used a different width. Adds a doc comment explaining the contract. Also documents the `not(` heuristic in shouldPreserveExpressionSource so future readers know it preserves the compact form against the parser's `not (<expr>)` re-emission. Addresses ako's review on PR mendixlabs#323. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
bb5edc4 to
7364284
Compare
AI Code ReviewWhat Looks Good
RecommendationApprove. The PR fully addresses issue #322 by preserving multiline source expression whitespace through the complete MDL pipeline (AST → visitor → executor), with appropriate test coverage and validation. The implementation is consistent with project architecture and doesn't introduce any syntax changes that would require additional LSP or extension updates. Automated review via OpenRouter (Nemotron Super 120B) — workflow source |
|
Review follow-up status on the current head:
Local validation after the update was green: make build
make test
make lint-go |
Part of #332.
Fixes #322.
Summary
Preserves multiline source text for declaration initial values and log expressions, including newline-only whitespace between log template parameters. Also avoids preserving plain string literals solely because their text contains punctuation such as
!or:.Root cause
The visitor rebuilt declare/log expressions from parsed expression nodes. That discarded original line breaks and inter-parameter whitespace. The source-preservation scanner also inspected operator-like bytes inside single-quoted string literals, which could wrap plain template text as a
SourceExpr.Fix
SourceExpras a source-preserving expression wrapper.buildSourceExpressionfor source-sensitive declare/log/while expressions.SourceExprthrough executor expression formatting.Tests
Added parser coverage for synthetic multiline declare/log template expressions and visitor coverage for punctuation inside string literals. Test names and fixtures are synthetic and do not reference real project microflows.
Validation
go test ./mdl/visitor -run 'TestDeclareAndLogTemplatePreserveMultilineSourceWhitespace|TestShouldPreserveExpressionSourceIgnoresStringLiteralPunctuation'make buildmake testmake lint-gomake test-integrationAgentic Code Testing
Test plan