
v0.6.57: mothership reliability, ashby refactor, tables row count, copilot id fix, bun upgrade#4293

Merged
TheodoreSpeaks merged 9 commits into main from staging
Apr 24, 2026

Conversation

TheodoreSpeaks (Collaborator) commented Apr 24, 2026

TheodoreSpeaks and others added 6 commits April 24, 2026 14:36
…endering (#4287)

* improvement(mothership): stream retry state machine, progressive re-rendering

* address comments
…ct rule (#4268)

AGENTS.md / CLAUDE.md forbid crypto.randomUUID() (non-secure contexts throw
TypeError in browsers). Four copilot server-side files still violated this
rule, left over after PR #3397 polyfilled the client.

Routes through request lifecycle, OAuth draft insertion, persisted message
normalization, and table-row generation now use generateId from @sim/utils/id,
which is a drop-in UUID v4 producer that falls back to crypto.getRandomValues
when randomUUID is unavailable.

Refs #3393.
* refactor(ashby): align tools, block, and triggers with Ashby API

Audit-driven refactor to destructure rich fields per Ashby's API docs,
centralize output shapes via shared mappers in tools/ashby/utils.ts,
and align webhook provider handler with trigger IDs via a shared
action map. Removes stale block outputs left over from prior flat
response shapes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ashby): remove stale noteId output and reject ping events

- Remove stale `noteId` output descriptor from block (create_note
  now returns `id` at the top level via the shared note mapper).
- Explicitly reject `ping` events in the webhook matchEvent before
  falling back to the generic triggerId check, so webhook records
  missing triggerId cannot execute workflows on Ashby ping probes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
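The ordering described above — reject `ping` before the generic fallback — might look roughly like this (all names and the exact matching logic are assumptions, not the actual webhook provider code):

```typescript
// Hypothetical reduction of the matchEvent guard: Ashby sends `ping` probes
// when a webhook is created, and those must never execute workflows — even
// for webhook records that have no specific triggerId configured.
function matchEvent(event: { action: string }, triggerId?: string): boolean {
  if (event.action === 'ping') return false
  // Generic fallback: a record without a triggerId matches any non-ping event.
  // (In the real handler a shared action map narrows this further.)
  if (!triggerId) return true
  return triggerId === event.action
}
```

The key point is that the `ping` check sits before the fallback, so the permissive no-triggerId path can never fire on a probe.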

* fix(ashby): trim optional ID params in create/update tools

Optional ID params in create_application, change_application_stage,
and update_candidate were passed through to the request body without
.trim(), unlike their required ID siblings. Normalize to prevent
copy-paste whitespace errors.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ashby): add subblock migration for removed filterCandidateId

Add SUBBLOCK_ID_MIGRATIONS entry so deployed workflows that previously
used the `filterCandidateId` subBlock on `list_applications` don't break
after the field was removed (Ashby's application.list doesn't filter by
candidateId). Also regenerate docs to sync noteId removal.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ashby): final API alignment from parallel validation

- create_candidate: email is optional per Ashby docs (only name is
  required); tool, types, and block all made non-required.
- list_applications: guard NaN when createdAfter can't be parsed so we
  don't send a bad value to Ashby's API.
- webhook provider: replace createHmacVerifier with explicit
  fail-closed verifyAuth that 401s when secretToken, signature header,
  or signature match is missing (was previously fail-open on missing
  secret).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
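The `createdAfter` NaN guard can be sketched like this (helper and parameter names are assumed, not the actual tool code):

```typescript
// Only forward createdAfter when it parses to a valid epoch-milliseconds
// timestamp. Date.parse returns NaN for unparseable strings; sending that
// to Ashby's API would be a bad value.
function parseCreatedAfter(input: string | undefined): number | undefined {
  if (!input) return undefined
  const ms = Date.parse(input)
  return Number.isNaN(ms) ? undefined : ms
}
```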

* fix(ashby): preserve input.data path in webhook formatInput

Restore the explicit `data` key alongside the spread so deployed
workflows that reference `input.data.application.*`, `input.data.candidate.*`,
etc. keep working. The spread alone dropped those paths.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(ashby): drop legacy input.data key from webhook formatInput

Keep formatInput aligned with the advertised trigger outputs schema
(flat top-level entities) and drop the legacy input.data.* compat path.
Every field declared in each trigger's outputs is now populated 1:1 by
the data spread plus the explicit action key — no undeclared keys.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ashby): trim remaining ID body params for parity

Add .trim() on sourceId (create_candidate), jobId (list_applications),
applicationId and interviewStageId (list_interviews) to match the
trim-on-IDs pattern used across the rest of the Ashby tools and guard
against copy-paste whitespace.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* update docs

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…#4289)

* improvement(tables): race-free row-count trigger + scoped tx timeouts

* fix(tables): close upsert race + serialize replaceTableRows

Two concurrency bugs flagged by review:

1. `upsertRow` insert path: removing FOR UPDATE broke serialization between
   the initial existing-row SELECT and the INSERT. Two concurrent upserts
   on the same conflict target could both find no match, then both insert,
   producing a duplicate that bypasses the app-level unique check. Fixed
   by re-checking for the matching row *after* acquiring the per-table
   advisory lock; if a racing tx already inserted, switch to UPDATE.

2. `replaceTableRows`: under READ COMMITTED each tx's DELETE uses its own
   MVCC snapshot, so two concurrent replaces could each DELETE only the
   rows visible at their start, then both INSERT, leaving the table with
   the union of both row sets. Fixed by acquiring the per-table advisory
   lock at the start of the tx to serialize replaces against each other
   and against auto-position inserts.

Also updated a stale docstring on `upsertRow` that still referenced the
removed FOR UPDATE.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
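The re-check-after-lock pattern in point 1 can be modeled in-process like this — an illustrative TypeScript reduction, where the toy mutex and in-memory table stand in for the Postgres advisory lock and real SQL:

```typescript
type Row = { key: string; value: string }
const table: Row[] = []

// Toy async mutex standing in for the per-table advisory lock.
let chain: Promise<void> = Promise.resolve()
function withLock<T>(fn: () => Promise<T>): Promise<T> {
  const result = chain.then(fn)
  chain = result.then(
    () => undefined,
    () => undefined
  )
  return result
}

async function upsertRow(key: string, value: string): Promise<'inserted' | 'updated'> {
  // First pass: cheap lookup without the lock.
  const existing = table.find((r) => r.key === key)
  if (existing) {
    existing.value = value
    return 'updated'
  }
  // No match: acquire the lock, then RE-CHECK — a racing upsert may have
  // inserted between our first lookup and lock acquisition.
  return withLock(async () => {
    const raced = table.find((r) => r.key === key)
    if (raced) {
      raced.value = value
      return 'updated'
    }
    table.push({ key, value })
    return 'inserted'
  })
}
```

Without the re-check inside the critical section, two concurrent upserts on the same conflict target could both miss on the first pass and both insert — exactly the duplicate described above.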

* fix(tables): serialize explicit-position inserts with advisory lock

The `(table_id, position)` index is non-unique. Concurrent explicit-
position inserts at the same slot can both observe the slot empty, both
skip the shift, then each INSERT — producing duplicate `(table_id,
position)` rows.

Take the per-table advisory lock in the explicit-position branches of
`insertRow` and `batchInsertRows` (the auto-position branches already do
this). Updated the test that asserted the explicit path skipped the lock,
and added coverage for `batchInsertRows` with explicit positions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(tables): dedupe upsert UPDATE path + extract nextAutoPosition

Two pure cleanups on the round-1 changes:

1. Extract `nextAutoPosition(trx, tableId)` — the `SELECT coalesce(max(
   position), -1) + 1` pattern was repeated in three call sites
   (`insertRow` auto branch, `batchInsertRows` auto branch, `upsertRow`
   insert branch). One helper, same behavior.

2. Consolidate `upsertRow` update path. The initial-SELECT match and the
   post-lock re-select match previously had two literal duplicates of the
   same UPDATE + return block. Resolve `matchedRowId` first, then run one
   UPDATE branch. Lock is still only acquired when we don't find a match
   on the first pass.

No behavior change. 98/98 table unit tests unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
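The extracted helper's semantics, reduced to an in-memory sketch (the real `nextAutoPosition` runs the query inside the transaction against the table):

```typescript
// SQL equivalent: SELECT coalesce(max(position), -1) + 1
// An empty table yields -1 + 1 = 0, so auto-positions start at zero.
function nextAutoPosition(positions: number[]): number {
  return positions.reduce((max, p) => Math.max(max, p), -1) + 1
}
```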

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

vercel Bot commented Apr 24, 2026

The latest updates on your projects.

1 Skipped Deployment

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Skipped | Skipped | Apr 24, 2026 11:37pm |


cursor Bot commented Apr 24, 2026

PR Summary

Medium Risk
Medium risk because it changes streaming/SSE behavior and core table row mutation/query logic (locking, timeouts, counts), which could affect performance or correctness under load despite added tests.

Overview
Mothership/Copilot streaming reliability: the chat stream resume endpoint now emits SSE comments (: accepted + periodic keepalive) and tracks last-write time; Go-side stream requests stop retrying non-idempotent failures, treat error events as terminal for telemetry, and ensure spans close correctly on fetch failures. Chat abort polling now heartbeats Redis chat-stream locks via a new extendLock helper (short TTL + periodic renewals) so dead pods don’t strand chats; related tests were added.
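The `extendLock` heartbeat described above amounts to a compare-and-set TTL refresh. A sketch against an in-memory store (all names illustrative — in production the ownership check and expiry refresh run as one Redis Lua script, so they are atomic):

```typescript
type Entry = { value: string; expiresAt: number }
const store = new Map<string, Entry>()

function acquireLock(key: string, owner: string, ttlMs: number): boolean {
  const now = Date.now()
  const existing = store.get(key)
  if (existing && existing.expiresAt > now) return false
  store.set(key, { value: owner, expiresAt: now + ttlMs })
  return true
}

function extendLock(key: string, owner: string, ttlMs: number): boolean {
  const now = Date.now()
  const entry = store.get(key)
  // Compare-and-set: only the current owner may refresh the TTL.
  if (!entry || entry.expiresAt <= now || entry.value !== owner) return false
  entry.expiresAt = now + ttlMs
  return true
}
```

The short TTL plus periodic renewal is what lets dead pods release their chats quickly: if the heartbeat stops, the lock simply expires.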

Tables performance & correctness: row listing APIs add includeTotal to optionally skip COUNT(*) (returning totalCount: null), with client/query hooks updated to pass/handle it and avoid counting in grid views. Table mutations now set per-transaction Postgres timeouts via SET LOCAL, rely on the row-count trigger instead of app-side counting/FOR UPDATE, and use per-table advisory locks to serialize position-sensitive writes (insert/batch insert/replace/upsert) plus targeted position recompaction after bulk deletes; unique-constraint checks and multi-row updates were optimized.
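The `includeTotal` shape described above might reduce to something like this (a hedged sketch, not the actual route code — the query callbacks stand in for the Drizzle queries):

```typescript
// When the caller doesn't need a total (e.g. the grid view), skip COUNT(*)
// entirely and report totalCount: null; otherwise run rows and count in
// parallel via Promise.all.
async function listRows(
  fetchRows: () => Promise<unknown[]>,
  countRows: () => Promise<number>,
  includeTotal: boolean
): Promise<{ rows: unknown[]; totalCount: number | null }> {
  if (!includeTotal) {
    return { rows: await fetchRows(), totalCount: null }
  }
  const [rows, totalCount] = await Promise.all([fetchRows(), countRows()])
  return { rows, totalCount }
}
```

Making `totalCount` explicitly `null` (rather than omitting it) keeps the response shape stable for clients that already destructure it.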

Ashby alignment & UI: Ashby docs/blocks/triggers were updated to reflect richer tool outputs, new/changed inputs (e.g., optional candidate email, job posting expand flags, includeArchived/syncToken for tags), and removal of the candidateId filter from list applications; the chat UI uses useProgressiveList to progressively render/stage messages and avoid expensive rescans for the preceding user query.

Reviewed by Cursor Bugbot for commit 60b80ec.

waleedlatif1 changed the title from "v0.6.57: mothership reliability, ashby refactor, tables row count, copilot id fix" to "v0.6.57: mothership reliability, ashby refactor, tables row count, copilot id fix, bun upgrade" on Apr 24, 2026.

greptile-apps Bot commented Apr 24, 2026

Greptile Summary

This PR bundles five distinct reliability improvements: heartbeat-based Redis chat-stream locks (TTL reduced from 2 h → 60 s, renewed every 20 s), a state-machine retry loop with progressive re-rendering for mothership streams, a race-free Postgres trigger + advisory lock for table row counts, optional COUNT(*) skipping for the grid UI (includeTotal: false), and the crypto.randomUUID() → generateId() copilot fix. The Ashby refactor aligns tool schemas with the upstream API.

  • The new increment_user_table_row_count trigger's atomic conditional UPDATE ... WHERE row_count < max_rows correctly replaces the old TOCTOU-prone SELECT → check → UPDATE pattern. Application-level SELECT FOR UPDATE is cleanly removed, and the one-shot reconciliation migration handles historical drift.

Confidence Score: 5/5

PR is safe to merge — all findings are P2 style/documentation suggestions with no blocking correctness issues.

The trigger change is logically correct (atomic conditional UPDATE eliminates TOCTOU), the advisory lock correctly serializes position writes, the heartbeat mechanism has adequate safety margin (3 beats before lock expiry), and the cursor-advancement fix in the stream route is a genuine correctness improvement. The totalCount → number | null type change is handled correctly at all call sites. No P0/P1 issues found.

No files require special attention. The migration (0198_tables_race_free_trigger.sql) and service.ts are the highest-impact files and both look correct.

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/sim/lib/copilot/request/session/abort.ts | Heartbeat added to startAbortPoller: chat stream lock TTL reduced from 2h to 60s, renewed every 20s via extendLock Lua CAS. Clean separation of heartbeat from abort-poll logic. |
| apps/sim/lib/core/config/redis.ts | New extendLock using Lua CAS (get-and-compare before expire) to safely refresh TTL only if the caller still owns the lock. No-op path for Redis-less deployments. |
| apps/sim/lib/table/service.ts | Application-level FOR UPDATE + COUNT removed; capacity now enforced atomically by the new trigger. Advisory lock serializes position-aware inserts. Scoped SET LOCAL timeouts added to all mutating transactions. |
| packages/db/migrations/0198_tables_race_free_trigger.sql | Replaces TOCTOU-prone two-step trigger with atomic conditional UPDATE; includes one-shot reconciliation to correct any historical row_count drift. |
| apps/sim/app/api/copilot/chat/stream/route.ts | Cursor/state now advanced only after successful enqueue (preventing position drift on controller close). SSE keepalive comments injected every 15s of idle. enqueueComment helper correctly updates lastWriteTime. |
| apps/sim/hooks/use-progressive-list.ts | Refactored to two separate effects (key sync + RAF chain). renderState computed inline during render for flash-free key transitions. Logic is sound but notably more complex than the prior version. |
| apps/sim/app/workspace/[workspaceId]/home/components/mothership-chat/mothership-chat.tsx | Progressive rendering integrated; precedingUserContent now O(n) via memoized index map instead of O(n²) reverse scan per message. Index arithmetic (stagedOffset + localIndex) for precedingUserContentByIndex lookup is correct. |
| apps/sim/hooks/queries/tables.ts | totalCount type changed to number \| null to reflect skipped counts. |
| apps/sim/app/api/table/[tableId]/rows/route.ts | includeTotal query param added; COUNT(*) skipped when false. Row-count trigger error message ("row limit") still matched by existing catch clauses. |
| apps/sim/app/api/v1/tables/[tableId]/rows/route.ts | Same includeTotal treatment as the internal route; when includeTotal=true, rowsPromise and countQuery run via Promise.all (parallel). When false, sequential — correct since only rows are needed. |

Sequence Diagram

```mermaid
sequenceDiagram
    participant C as Client
    participant SR as Stream Route (resume)
    participant AP as AbortPoller
    participant Redis
    participant DB as Postgres

    C->>SR: POST /copilot/chat/stream (resume)
    SR->>SR: enqueueComment("accepted")
    SR->>Redis: acquireLock(chatStreamKey, streamId, TTL=60s)
    SR->>AP: startAbortPoller(streamId, {chatId})

    loop Every 250ms poll
        AP->>DB: Check abort marker
        alt Every 20s heartbeat
            AP->>Redis: extendLock(chatStreamKey, streamId, 60s) [Lua CAS]
            Redis-->>AP: 1 (owned) or 0 (lost)
        end
    end

    loop Every 250ms poll (if no terminal event)
        SR->>DB: readEvents(streamId, cursor)
        DB-->>SR: events[]
        SR->>C: SSE envelopes (cursor updated after successful enqueue)
        alt No write for 15s
            SR->>C: SSE comment("keepalive")
        end
    end

    SR->>Redis: releaseLock(chatStreamKey, streamId)
```

Comments Outside Diff (1)

  1. apps/sim/lib/copilot/request/session/abort.ts, line 306-340 (link)

    P2 Heartbeat blocked while abort-poll is in flight

    The heartbeat block runs only after pollingStreams.delete(streamId) in the finally block, meaning it executes in the same tick that clears the guard. However, if another interval fires while the IIFE is awaiting the abort check (pollingStreams.has(streamId) is true), that interval exits immediately — skipping the heartbeat for that tick. Since CHAT_STREAM_LOCK_HEARTBEAT_INTERVAL_MS (20 s) is checked against lastHeartbeatAt, a single blocked tick doesn't cause lock expiry given the 60 s TTL, but a prolonged slow abort-check could delay a heartbeat beyond the 20 s window. This is unlikely in practice given the abort check is a lightweight Redis GET, but worth noting in a comment alongside CHAT_STREAM_LOCK_HEARTBEAT_INTERVAL_MS.

Reviews (1): Last reviewed commit: "chore(bun): bump bun to 1.3.13 (#4291)"

Comment on lines +71 to +82
```typescript
 *
 * `lock_timeout` is the critical one: without it, a waiter inherits the full
 * `statement_timeout` clock, so one stuck writer can drain the pool.
 *
 * Safe under pgBouncer transaction pooling — `SET LOCAL` is transaction-scoped
 * and cleared at COMMIT/ROLLBACK before the session returns to the pool.
 */
async function setTableTxTimeouts(
  trx: DbTransaction,
  opts?: { statementMs?: number; lockMs?: number; idleMs?: number }
) {
  const s = opts?.statementMs ?? 10_000
```

P2 sql.raw() with string interpolation for SET LOCAL

sql.raw() with template-string number interpolation works here because s, l, and i are always application-computed numbers, but it bypasses Drizzle's parameterization entirely and could silently accept non-numeric values if the call sites ever evolve. PostgreSQL's SET LOCAL doesn't support $1 placeholders, so raw SQL is necessary — a brief inline comment explaining that would help future readers avoid the "why isn't this parameterized?" question.

Comment on lines 62 to +69

```typescript
const completedKeysRef = useRef(new Set<string>())
const prevKeyRef = useRef(key)
const stagingCountRef = useRef(initialBatch)
const [count, setCount] = useState(() => {
  if (items.length <= initialBatch) return items.length
  return initialBatch
})
const renderState =
  state.key === key && (state.count > 0 || items.length === 0 || state.caughtUp)
    ? state
    : createInitialState(key, items.length, initialBatch)

useEffect(() => {
  if (completedKeysRef.current.has(key)) {
    setCount(items.length)
    return
  }
  setState((prev) => {
```

P2 Inline createInitialState during render on key mismatch

When state.key !== key, renderState is computed inline (not from React state), so staged uses the correct initial slice on the very first render after a key change. This avoids a flash of stale content. However, the inline object is recreated on every render while the two clocks are mismatched (before the useEffect fires). This is intentional and cheap, but it means staged is a new array reference on every render during that window — components that memo-compare staged will see unnecessary re-renders. A small comment explaining the intentional dual-clock pattern would help future readers.


…4294)

The visual filter/sort builders read the selected tableId from subBlock id
'tableId', but the Table block stores it under 'tableSelector' (basic) or
'manualTableId' (advanced) via canonicalParamId. The lookup always returned
null, so useTable was disabled and the column picker always showed
"no options available".

Adds useCanonicalSubBlockValue that resolves by canonicalParamId through
the canonical index, mirroring the pattern used by dropdown dependsOn.
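The resolution pattern might look roughly like this — a hypothetical reduction, since useCanonicalSubBlockValue is a React hook over the canonical index and these names and types are assumptions:

```typescript
type SubBlock = { canonicalParamId?: string; value: unknown }

// Resolve a subBlock value by its canonical param id rather than a
// hard-coded subBlock id, so 'tableId' finds the value whether the block
// stored it under 'tableSelector' (basic) or 'manualTableId' (advanced).
function resolveCanonicalValue(
  subBlocks: Record<string, SubBlock>,
  canonicalId: string
): unknown {
  for (const sb of Object.values(subBlocks)) {
    if (sb.canonicalParamId === canonicalId && sb.value != null) return sb.value
  }
  return null
}
```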
@TheodoreSpeaks TheodoreSpeaks merged commit 595c4c3 into main Apr 24, 2026
30 checks passed
