1 change: 1 addition & 0 deletions apps/vscode-e2e/.env.local.sample
@@ -1 +1,2 @@
OPENROUTER_API_KEY=sk-or-v1-...
DEEPSEEK_API_KEY=sk-...
55 changes: 52 additions & 3 deletions apps/vscode-e2e/AGENTS.md
@@ -82,10 +82,42 @@ Record mode uses **record-on-miss**: if an existing fixture already matches a re

If the LLM calls a tool first (e.g. `read_file`) and then calls `attempt_completion` after seeing the result, you need two fixtures:

-- **Turn 1**: match on the task prompt → respond with the tool call
-- **Turn 2**: match on a stable part of the tool _result_ → respond with `attempt_completion`
- **Turn 1**: match on the task prompt (with `sequenceIndex: 0` so it fires only once) → respond with the tool call, giving the tool call a unique `id`
- **Turn 2**: match on `toolCallId` → respond with `attempt_completion`

Using `toolCallId` (the `id` of the tool call emitted in turn 1) is the recommended approach for turn-2 matching. It is:

- **Precise**: fires only when that exact tool call's result is in the conversation
- **Cross-test safe**: each test's tool call ids are unique, so accumulated match counts from previous tests can't interfere
- **Stateless**: no `sequenceIndex` needed on turn-2 fixtures — if the task makes extra API calls they'll keep getting the same `attempt_completion`

Example:

```json
{
"fixtures": [
{
"match": {
"userMessage": "my-e2e-tag:my-test",
"sequenceIndex": 0
},
"response": {
"toolCalls": [{ "name": "read_file", "arguments": "{\"path\":\"marker.txt\"}", "id": "call_my_read" }]
}
},
{
"match": { "toolCallId": "call_my_read" },
"response": {
"toolCalls": [
{ "name": "attempt_completion", "arguments": "{\"result\":\"MY_MARKER\"}", "id": "call_my_done" }
]
}
}
]
}
```

-The tool result is provided by the extension (not the mock), so its content is deterministic if test files have stable names. Use a stable substring from the tool result as the turn-2 match string.
The `model` field can be added to either match when a test targets a specific model.

## 404 errors in logs are expected

@@ -118,6 +150,23 @@ ZAI_API_KEY=<key> TEST_FILE=zai.test pnpm --filter @roo-code/vscode-e2e test:ci
```

When adding a new test to this suite, add a matching fixture to the `installZAiFetchInterceptor` call in `suiteSetup`. Use a short unique prefix (e.g. `"zai-glm-e2e-mytest:"`) that won't appear in `<environment_details>`.

### DeepSeek V4 (`suite/providers/deepseek-v4.test.ts`)

DeepSeek exposes `deepSeekBaseUrl`, so the suite redirects the OpenAI-compatible DeepSeek client through aimock with `deepSeekBaseUrl: ${AIMOCK_URL}/v1`. The test still installs a lightweight fetch capture for request-shape assertions, but responses should come from aimock fixtures or aimock record mode.
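
As a rough sketch of that redirect, the provider settings might be assembled as below. Only `deepSeekBaseUrl` and `AIMOCK_URL` come from this document; the helper name `buildDeepSeekConfig`, the `apiModelId` and `deepSeekApiKey` field names, and the `"mock-key"` placeholder are assumptions made for illustration.

```typescript
// Hypothetical helper: only `deepSeekBaseUrl` and AIMOCK_URL appear in the source text.
type DeepSeekConfig = {
  apiProvider: "deepseek"
  apiModelId: string // assumed field name
  deepSeekApiKey: string // assumed field name
  deepSeekBaseUrl?: string
}

function buildDeepSeekConfig(env: Record<string, string | undefined>, modelId: string): DeepSeekConfig {
  const aimockUrl = env.AIMOCK_URL
  return {
    apiProvider: "deepseek",
    apiModelId: modelId,
    // Assumption: a placeholder key is enough when aimock replays fixtures.
    deepSeekApiKey: env.DEEPSEEK_API_KEY ?? "mock-key",
    // Route the OpenAI-compatible client through aimock only when it is running.
    ...(aimockUrl ? { deepSeekBaseUrl: `${aimockUrl}/v1` } : {}),
  }
}
```

When `AIMOCK_URL` is unset, the sketch leaves `deepSeekBaseUrl` out so the client would fall back to its default endpoint.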

Record DeepSeek fixtures with the targeted file filter so aimock proxies OpenAI-compatible traffic to `https://api.deepseek.com`:

```sh
DEEPSEEK_API_KEY=<key> TEST_FILE=deepseek-v4.test pnpm --filter @roo-code/vscode-e2e test:record
```

After converting the generated `openai-*.json` files into stable named fixtures, verify in mock mode:

```sh
USE_MOCK=true TEST_FILE=deepseek-v4.test pnpm --filter @roo-code/vscode-e2e test:run
```

## Tests that use a non-default provider

If your test calls `api.setConfiguration({ apiProvider: "anthropic", ... })`, point aimock at the
128 changes: 128 additions & 0 deletions apps/vscode-e2e/fixtures/deepseek-v4.json
@@ -0,0 +1,128 @@
{
"fixtures": [
{
"match": {
"model": "deepseek-v4-flash",
"userMessage": "deepseek-v4-e2e:deepseek-v4-flash:reasoning-on",
"sequenceIndex": 0
},
"response": {
"toolCalls": [
{
"name": "read_file",
"arguments": "{\"path\":\"deepseek-v4-e2e-deepseek-v4-flash-reasoning-on.txt\"}",
"id": "call_dsv4_flash_on_read"
}
]
}
},
{
"match": {
"model": "deepseek-v4-flash",
"toolCallId": "call_dsv4_flash_on_read"
},
"response": {
"toolCalls": [
{
"name": "attempt_completion",
"arguments": "{\"result\":\"DEEPSEEK_V4_MARKER_deepseek_v4_flash_reasoning_on\"}",
"id": "call_dsv4_flash_on_done"
}
]
}
},
{
"match": {
"model": "deepseek-v4-flash",
"userMessage": "deepseek-v4-e2e:deepseek-v4-flash:reasoning-off",
"sequenceIndex": 0
},
"response": {
"toolCalls": [
{
"name": "read_file",
"arguments": "{\"path\":\"deepseek-v4-e2e-deepseek-v4-flash-reasoning-off.txt\"}",
"id": "call_dsv4_flash_off_read"
}
]
}
},
{
"match": {
"model": "deepseek-v4-flash",
"toolCallId": "call_dsv4_flash_off_read"
},
"response": {
"toolCalls": [
{
"name": "attempt_completion",
"arguments": "{\"result\":\"DEEPSEEK_V4_MARKER_deepseek_v4_flash_reasoning_off\"}",
"id": "call_dsv4_flash_off_done"
}
]
}
},
{
"match": {
"model": "deepseek-v4-pro",
"userMessage": "deepseek-v4-e2e:deepseek-v4-pro:reasoning-on",
"sequenceIndex": 0
},
"response": {
"toolCalls": [
{
"name": "read_file",
"arguments": "{\"path\":\"deepseek-v4-e2e-deepseek-v4-pro-reasoning-on.txt\"}",
"id": "call_dsv4_pro_on_read"
}
]
}
},
{
"match": {
"model": "deepseek-v4-pro",
"toolCallId": "call_dsv4_pro_on_read"
},
"response": {
"toolCalls": [
{
"name": "attempt_completion",
"arguments": "{\"result\":\"DEEPSEEK_V4_MARKER_deepseek_v4_pro_reasoning_on\"}",
"id": "call_dsv4_pro_on_done"
}
]
}
},
{
"match": {
"model": "deepseek-v4-pro",
"userMessage": "deepseek-v4-e2e:deepseek-v4-pro:reasoning-off",
"sequenceIndex": 0
},
"response": {
"toolCalls": [
{
"name": "read_file",
"arguments": "{\"path\":\"deepseek-v4-e2e-deepseek-v4-pro-reasoning-off.txt\"}",
"id": "call_dsv4_pro_off_read"
}
]
}
},
{
"match": {
"model": "deepseek-v4-pro",
"toolCallId": "call_dsv4_pro_off_read"
},
"response": {
"toolCalls": [
{
"name": "attempt_completion",
"arguments": "{\"result\":\"DEEPSEEK_V4_MARKER_deepseek_v4_pro_reasoning_off\"}",
"id": "call_dsv4_pro_off_done"
}
]
}
}
]
}
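
The eight fixtures above follow one pattern per model and reasoning mode: a `read_file` turn matched on the prompt tag, then an `attempt_completion` turn matched on the first call's `id`. A minimal sketch of a generator that reproduces the same structure is shown here; `buildDeepSeekFixtures` and the type names are hypothetical and are not part of this PR.

```typescript
// Hypothetical generator mirroring the hand-written fixture file above.
type ToolCall = { name: string; arguments: string; id: string }
type Fixture = {
  match: { model: string; userMessage?: string; sequenceIndex?: number; toolCallId?: string }
  response: { toolCalls: ToolCall[] }
}

const MODELS = ["deepseek-v4-flash", "deepseek-v4-pro"]
const REASONING_MODES = ["on", "off"]

function buildDeepSeekFixtures(): { fixtures: Fixture[] } {
  const fixtures: Fixture[] = []
  for (const model of MODELS) {
    const short = model.replace("deepseek-v4-", "") // "flash" or "pro"
    for (const mode of REASONING_MODES) {
      const tag = `${model}-reasoning-${mode}`
      const readId = `call_dsv4_${short}_${mode}_read`
      // Turn 1: match the prompt tag once, respond with a read_file tool call.
      fixtures.push({
        match: { model, userMessage: `deepseek-v4-e2e:${model}:reasoning-${mode}`, sequenceIndex: 0 },
        response: {
          toolCalls: [
            { name: "read_file", arguments: JSON.stringify({ path: `deepseek-v4-e2e-${tag}.txt` }), id: readId },
          ],
        },
      })
      // Turn 2: match on the id of the turn-1 tool call, respond with the marker.
      fixtures.push({
        match: { model, toolCallId: readId },
        response: {
          toolCalls: [
            {
              name: "attempt_completion",
              arguments: JSON.stringify({ result: `DEEPSEEK_V4_MARKER_${tag.replace(/-/g, "_")}` }),
              id: `call_dsv4_${short}_${mode}_done`,
            },
          ],
        },
      })
    }
  }
  return { fixtures }
}
```

Generating the file this way would keep the tool call ids, marker strings, and file names in sync if another model or mode is added, at the cost of one more script to maintain.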
1 change: 1 addition & 0 deletions apps/vscode-e2e/package.json
@@ -5,6 +5,7 @@
"lint": "eslint src --ext=ts --max-warnings=0",
"check-types": "tsc -p tsconfig.esm.json --noEmit",
"format": "prettier --write src",
"test:deepseek-v4": "TEST_FILE=deepseek-v4.test pnpm test:ci",
"test:ci": "pnpm -w bundle && pnpm --filter @roo-code/vscode-webview build && pnpm test:run",
"test:ci:mock": "pnpm -w bundle && pnpm --filter @roo-code/vscode-webview build && USE_MOCK=true pnpm test:run",
"test:record": "AIMOCK_RECORD=true pnpm test:ci",
13 changes: 9 additions & 4 deletions apps/vscode-e2e/src/runTest.ts
@@ -7,8 +7,15 @@ import { LLMock } from "@copilotkit/aimock"

async function main() {
const isRecord = process.env.AIMOCK_RECORD === "true"
const testGrep = process.argv.find((arg, i) => process.argv[i - 1] === "--grep") || process.env.TEST_GREP
const testFile = process.argv.find((arg, i) => process.argv[i - 1] === "--file") || process.env.TEST_FILE
const isDeepSeekTest = testFile?.includes("deepseek-v4") === true

-if (isRecord && !process.env.OPENROUTER_API_KEY) {
if (isRecord && isDeepSeekTest && !process.env.DEEPSEEK_API_KEY) {
Review comment from the PR author:

This adds the DeepSeek-specific record-mode key check, but the `useMock` gate below still ignores `DEEPSEEK_API_KEY`. Running `DEEPSEEK_API_KEY=... TEST_FILE=deepseek-v4.test pnpm --filter @roo-code/vscode-e2e test:ci` still starts aimock and injects `AIMOCK_URL`, so the new suite replays fixtures instead of exercising the live provider this PR is meant to validate. The DeepSeek key needs to count as a real-provider credential for DeepSeek-targeted runs.
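
One way the gate could address this is sketched here under stated assumptions. The actual `useMock` logic is not shown in this diff, so `shouldUseMock` and its rules (record mode or `USE_MOCK=true` forces aimock, otherwise aimock runs only when no real key for the targeted provider is present) are hypothetical.

```typescript
// Hypothetical gate: the real useMock condition in runTest.ts is not visible in this diff.
type Env = Record<string, string | undefined>

function shouldUseMock(env: Env, testFile: string | undefined): boolean {
  const isRecord = env.AIMOCK_RECORD === "true"
  const isDeepSeekTest = testFile?.includes("deepseek-v4") === true

  // Assumption: record mode and an explicit USE_MOCK=true always start aimock.
  if (isRecord || env.USE_MOCK === "true") return true

  // Pick the credential that matters for the targeted suite.
  const realKey = isDeepSeekTest ? env.DEEPSEEK_API_KEY : env.OPENROUTER_API_KEY

  // Assumption: without a real key the run falls back to fixture replay.
  return !realKey
}
```

With a rule like this, a DeepSeek-targeted `test:ci` run with `DEEPSEEK_API_KEY` set would skip aimock and reach the live provider, while an OpenRouter key alone would not count for the DeepSeek suite.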

throw new Error("AIMOCK_RECORD=true requires DEEPSEEK_API_KEY to record DeepSeek fixtures")
}

if (isRecord && !isDeepSeekTest && !process.env.OPENROUTER_API_KEY) {
throw new Error("AIMOCK_RECORD=true requires OPENROUTER_API_KEY to record fixtures")
}

@@ -43,7 +50,7 @@ async function main() {
// Use /api (not /api/v1) — aimock appends the request path (/v1/chat/completions)
// so including /v1 here would produce a doubled /v1/v1 upstream URL.
providers: {
-openai: "https://openrouter.ai/api",
openai: isDeepSeekTest ? "https://api.deepseek.com" : "https://openrouter.ai/api",
// aimock forwards the x-api-key header from the Anthropic SDK to the real API.
anthropic: "https://api.anthropic.com",
},
@@ -84,8 +91,6 @@ async function main() {
// - npm run test:e2e -- --grep "write-to-file"
// - TEST_GREP="apply-diff" npm run test:e2e
// - TEST_FILE="task.test.js" npm run test:e2e
-const testGrep = process.argv.find((arg, i) => process.argv[i - 1] === "--grep") || process.env.TEST_GREP
-const testFile = process.argv.find((arg, i) => process.argv[i - 1] === "--file") || process.env.TEST_FILE

// Pass test filters and mock URL as environment variables to the test runner
const extensionTestsEnv = {
6 changes: 5 additions & 1 deletion apps/vscode-e2e/src/suite/index.ts
@@ -23,7 +23,11 @@ export async function run() {
apiProvider: "openrouter" as const,
// In record mode, forward the real key so aimock can proxy it to OpenRouter.
// In replay mode, "mock-key" is sufficient — aimock never contacts the real API.
-openRouterApiKey: aimockUrl && !isRecord ? "mock-key" : process.env.OPENROUTER_API_KEY!,
openRouterApiKey: aimockUrl
? isRecord
? (process.env.OPENROUTER_API_KEY ?? "mock-key")
: "mock-key"
: process.env.OPENROUTER_API_KEY!,
openRouterModelId: "openai/gpt-4.1",
...(aimockUrl && { openRouterBaseUrl: `${aimockUrl}/v1` }),
})
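
The nested ternary in this hunk can be read as a three-way decision. The sketch below restates it as a standalone function; `resolveOpenRouterKey` is a hypothetical name used only for illustration. A likely motivation, though the PR does not state it, is that DeepSeek record runs only require `DEEPSEEK_API_KEY`, so `OPENROUTER_API_KEY` may be unset while aimock is recording.

```typescript
// Illustrative restatement of the nested ternary; not code from the PR.
function resolveOpenRouterKey(
  aimockUrl: string | undefined,
  isRecord: boolean,
  envKey: string | undefined,
): string | undefined {
  // No aimock: live run, the real key is passed through (the diff asserts it with `!`).
  if (!aimockUrl) return envKey
  // Replay: aimock never contacts the real API, so a placeholder is enough.
  if (!isRecord) return "mock-key"
  // Record: forward the real key when present, otherwise fall back to the placeholder.
  return envKey ?? "mock-key"
}
```

The only behavioral difference from the previous one-line expression is the record-mode fallback: an unset key now yields `"mock-key"` instead of `undefined`.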