
LCORE-1853: Add relevance cutoff score to inline BYOK RAG#1720

Open
syedriko wants to merge 1 commit into lightspeed-core:main from syedriko:syedriko-lcore-1853-2


@syedriko
Contributor

@syedriko syedriko commented May 11, 2026

Description

In the byok_rag section of the Lightspeed Stack configuration, added a "relevance_cutoff_score" configuration variable, a positive float.
When a RAG chunk is retrieved from a BYOK database and its relevance score is less than the cutoff, the chunk is dropped from further consideration. This cutoff filtering happens before score multipliers are applied.
For now, relevance_cutoff_score is handled only in the inline BYOK RAG, because adding it to the tool BYOK RAG requires changes in Llama Stack.
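
The ordering described above (cutoff first, multipliers second) can be illustrated with a minimal sketch. The `Chunk` type, the `filter_then_weight` function, and the `score_multiplier` argument are hypothetical names chosen for illustration. According to the review walkthrough, the PR itself forwards the cutoff to the vector store as `score_threshold` rather than filtering in Python, so this only shows the intended semantics.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Chunk:
    """Hypothetical retrieved chunk with its raw similarity score."""

    text: str
    score: float


def filter_then_weight(
    chunks: list[Chunk], cutoff: float, score_multiplier: float
) -> list[Chunk]:
    """Drop chunks scoring below the cutoff, then apply the multiplier."""
    # The comparison uses the raw score, before any weighting.
    kept = [chunk for chunk in chunks if chunk.score >= cutoff]
    # Return new objects instead of mutating the inputs.
    return [replace(chunk, score=chunk.score * score_multiplier) for chunk in kept]


chunks = [Chunk("a", 0.5), Chunk("b", 0.25), Chunk("c", 0.75)]
weighted = filter_then_weight(chunks, cutoff=0.3, score_multiplier=2.0)
```

With a multiplier of 2.0, chunk "b" would reach 0.5 after weighting and would pass a filter applied afterwards. Filtering on the raw score drops it, which is the behavior the description asks for.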

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Cursor
  • Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Added per-knowledge-source relevance_cutoff_score for BYOK Inline RAG to filter results by minimum similarity (default 0.3); Tool RAG/file_search unaffected.
  • Documentation

    • BYOK guide updated with usage, tuning guidance, Inline vs Tool RAG notes, and a reminder to refresh sources and monitor performance.
  • API

    • OpenAPI schema exposes relevance_cutoff_score with default and validation.
  • Tests

    • Unit tests expanded to cover configuration, validation, propagation, and filtering behavior.


@coderabbitai
Contributor

coderabbitai Bot commented May 11, 2026

Walkthrough

This PR adds per-knowledge-source relevance cutoff score configuration for BYOK RAG, enabling Inline RAG queries to filter vector search results by minimum similarity threshold. The constant, config model, vector search integration, API schema, and comprehensive tests are introduced.

Changes

BYOK RAG Relevance Cutoff Configuration

| Layer / File(s) | Summary |
| --- | --- |
| Default Constant (src/constants.py) | DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE constant set to 0.3 defines the module-level default minimum similarity threshold. |
| Configuration Model (src/models/config.py) | ByokRag model adds relevance_cutoff_score: float field with positive value constraint (gt=0) and default from the constant. |
| Vector Search Integration (src/utils/vector_search.py) | New _relevance_cutoff_for_vector_store() helper reads per-store cutoff from configuration; _query_store_for_byok_rag() signature updated to accept score_threshold and pass it to vector_io.query; _fetch_byok_rag() call site computes per-store threshold and supplies it. |
| API Schema (docs/openapi.json) | relevance_cutoff_score property added with type: number, exclusiveMinimum: 0.0, and default: 0.3. |
| User Guide (docs/byok_guide.md) | Extensive documentation of relevance_cutoff_score, its mapping to vector store score_threshold, explicit note that it applies only to Inline RAG (not Tool RAG), expanded YAML configuration example, and updated Inline vs Tool RAG comparison table. |
| Unit Tests (tests/unit/models/config/test_byok_rag.py, tests/unit/models/config/test_dump_configuration.py, tests/unit/utils/test_vector_search.py) | Tests verify default value assignment, non-default value handling, validation failure for zero values, configuration serialization, and vector query parameter passing for the relevance cutoff. New tests confirm configured cutoff is forwarded as score_threshold and that backend filtering excludes sub-threshold hits. |

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title clearly and specifically summarizes the main change: adding a relevance cutoff score feature to inline BYOK RAG. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 90.48% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/openapi.json`:
- Around line 11809-11815: The OpenAPI schema file docs/openapi.json is out of
sync and should be regenerated rather than edited; run the provided generator
script (uv run scripts/generate_openapi_schema.py docs/openapi.json) to recreate
the schema, then verify the relevance_cutoff_score field in the regenerated JSON
has the correct default value that matches
DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE from src/constants.py and commit the
regenerated file.

In `@src/utils/vector_search.py`:
- Around line 29-34: Update the docstring for _relevance_cutoff_for_vector_store
to follow Google Python conventions: add a brief description line, then a
Parameters section documenting vector_store_id (str) and what it represents, and
a Returns section describing the returned float (either the matched
brag.relevance_cutoff_score or
constants.DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE). If the function can raise
any exceptions, add a Raises section; otherwise omit Raises. Ensure the
docstring text references the function name and types only (no code) and matches
the existing one-line summary.

In `@tests/unit/models/config/test_dump_configuration.py`:
- Line 11: Replace the top-level import "import constants" with explicit
from-imports for the names actually used in this test (e.g., "from constants
import FOO, BAR" — substitute the real constant names referenced in this file)
and update usages that refer to constants.<NAME> (including the reference around
line 996) to use the direct names (FOO, BAR, etc.) so the import style matches
the existing "from X import Y" pattern used elsewhere in the file.
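
The docstring request for `_relevance_cutoff_for_vector_store` above could look roughly like the sketch below. It is an assumption-laden illustration, not the PR's code: the `ByokRagEntry` class, its `vector_db_id` field, and the explicit `entries` argument are hypothetical (the real helper is described as reading the loaded configuration), and the section names follow the repository guideline quoted in this review ("Parameters, Returns, Raises").

```python
from dataclasses import dataclass

# Assumed to mirror DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE in src/constants.py.
DEFAULT_CUTOFF = 0.3


@dataclass(frozen=True)
class ByokRagEntry:
    """Hypothetical stand-in for one byok_rag configuration entry."""

    vector_db_id: str
    relevance_cutoff_score: float = DEFAULT_CUTOFF


def relevance_cutoff_for_vector_store(
    vector_store_id: str, entries: list[ByokRagEntry]
) -> float:
    """Return the relevance cutoff configured for a vector store.

    Parameters:
        vector_store_id: Identifier of the vector store being queried.
        entries: Configured BYOK RAG entries to search.

    Returns:
        The matching entry's relevance_cutoff_score, or DEFAULT_CUTOFF
        when no entry matches the given identifier.
    """
    for entry in entries:
        if entry.vector_db_id == vector_store_id:
            return entry.relevance_cutoff_score
    # No Raises section: an unknown store falls back to the default.
    return DEFAULT_CUTOFF
```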

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f0c9945a-ead4-4245-9d4b-99ea45582ed7

📥 Commits

Reviewing files that changed from the base of the PR and between 70cb0db and 6af93fb.

📒 Files selected for processing (8)
  • docs/byok_guide.md
  • docs/openapi.json
  • src/constants.py
  • src/models/config.py
  • src/utils/vector_search.py
  • tests/unit/models/config/test_byok_rag.py
  • tests/unit/models/config/test_dump_configuration.py
  • tests/unit/utils/test_vector_search.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: black
  • GitHub Check: unit_tests (3.13)
  • GitHub Check: Pylinter
  • GitHub Check: radon
  • GitHub Check: build-pr
  • GitHub Check: E2E: server mode / ci / group 2
  • GitHub Check: E2E: server mode / ci / group 1
  • GitHub Check: E2E: library mode / ci / group 3
  • GitHub Check: E2E: library mode / ci / group 2
  • GitHub Check: E2E: library mode / ci / group 1
  • GitHub Check: E2E: server mode / ci / group 3
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
📓 Path-based instructions (4)
src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Llama Stack imports: Use from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Use async def for I/O operations and external API calls
Use standard log levels with clear purposes: debug() for diagnostic info, info() for program execution, warning() for unexpected events, error() for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes must use ABC with @abstractmethod decorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes

Files:

  • src/models/config.py
  • src/constants.py
  • src/utils/vector_search.py
src/models/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Pydantic models must use @model_validator and @field_validator for validation and complete type annotations for all attributes, avoiding Any type

Files:

  • src/models/config.py
src/constants.py

📄 CodeRabbit inference engine (AGENTS.md)

Use constants.py for shared constants with descriptive comments and type hints using Final[type]

Files:

  • src/constants.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

  • tests/unit/models/config/test_dump_configuration.py
  • tests/unit/utils/test_vector_search.py
  • tests/unit/models/config/test_byok_rag.py
🧠 Learnings (2)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.

Applied to files:

  • src/models/config.py
📚 Learning: 2026-02-25T07:46:33.545Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.

Applied to files:

  • src/models/config.py
🪛 GitHub Actions: OpenAPI (Spectral) / 0_spectral.txt
docs/openapi.json

[error] 1-1: OpenAPI schema out of date. diff detected changes between docs/openapi.json and /tmp/openapi-generated.json. Regenerate with: 'uv run scripts/generate_openapi_schema.py docs/openapi.json'.

🪛 GitHub Actions: OpenAPI (Spectral) / spectral
docs/openapi.json

[error] 1-1: OpenAPI schema is out of date. 'diff -u docs/openapi.json /tmp/openapi-generated.json' failed; regenerate with: 'uv run scripts/generate_openapi_schema.py docs/openapi.json'.

🪛 markdownlint-cli2 (0.22.1)
docs/byok_guide.md

[warning] 89-89: Blank line inside blockquote

(MD028, no-blanks-blockquote)

🔇 Additional comments (15)
src/constants.py (1)

196-198: LGTM! Well-defined constant for BYOK RAG relevance filtering.

The constant follows the established pattern with proper type hints and descriptive comment. The default value of 0.3 is reasonable for similarity-based filtering.

src/models/config.py (1)

1636-1643: LGTM! Proper Pydantic field definition for relevance cutoff.

The field is correctly configured with:

  • Appropriate type annotation and default value from constants
  • gt=0 validation ensuring positive cutoff values
  • Clear description specifying "raw similarity score" (before multipliers)

The validation constraint requiring values > 0 prevents disabling the cutoff entirely, which aligns with the PR requirement for a "positive float."
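
The positive-value constraint can be illustrated without Pydantic. The sketch below swaps in a plain dataclass with a `__post_init__` check as a stand-in for the `gt=0` field constraint. The class name `ByokRagSketch` and the `rag_id` field are hypothetical, and the real model in `src/models/config.py` relies on Pydantic validation instead.

```python
from dataclasses import dataclass
from typing import Final

# Mirrors the default value reported for src/constants.py in this review.
DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE: Final[float] = 0.3


@dataclass(frozen=True)
class ByokRagSketch:
    """Hypothetical, stdlib-only stand-in for the ByokRag config model."""

    rag_id: str
    relevance_cutoff_score: float = DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE

    def __post_init__(self) -> None:
        # Same intent as a gt=0 constraint: zero and negative values fail.
        if not self.relevance_cutoff_score > 0:
            raise ValueError("relevance_cutoff_score must be greater than 0")
```

Because the bound is exclusive, a value of 0.0 is rejected, so the cutoff cannot be switched off by configuration alone; a very small positive value is the closest available setting.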

tests/unit/models/config/test_dump_configuration.py (1)

996-996: LGTM! Test correctly validates default value serialization.

The test properly verifies that the new relevance_cutoff_score field is included in the dumped configuration with the correct default value from constants.

tests/unit/models/config/test_byok_rag.py (3)

7-7: LGTM!

The import of DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE is appropriate for testing the default value behavior.


39-39: LGTM!

The test assertions properly validate both the default and custom relevance_cutoff_score values. The choice of 0.72 as a test value provides good coverage.

Also applies to: 59-59, 68-68


208-220: LGTM!

The validation test correctly verifies the gt=0 constraint on relevance_cutoff_score. Testing the boundary case with 0.0 provides good coverage.

src/utils/vector_search.py (2)

186-213: LGTM!

The function correctly accepts and forwards the score_threshold parameter to client.vector_io.query. The docstring is properly updated to document the new parameter.


418-427: LGTM!

The implementation correctly retrieves the per-vector-store relevance_cutoff_score and passes it to the query function. Each store in the parallel queries will use its own configured cutoff threshold.
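
A minimal sketch of the per-store behavior described above, under stated assumptions: the store identifiers, the `CUTOFFS` mapping, the `FAKE_HITS` data, and both functions are hypothetical, and `asyncio.gather` is used here only as one plausible way to run the queries in parallel. The filtering inside `query_store` stands in for what the vector store backend is said to do with `score_threshold`.

```python
import asyncio

DEFAULT_CUTOFF = 0.3  # assumed default, per the review summary

# Hypothetical per-store configuration: store id -> cutoff.
CUTOFFS = {"store-a": 0.5, "store-b": 0.2}

# Hypothetical raw scores each store would return for one query.
FAKE_HITS = {
    "store-a": [0.9, 0.4],
    "store-b": [0.25, 0.1],
    "store-c": [0.35, 0.05],
}


async def query_store(store_id: str, score_threshold: float) -> list[float]:
    """Stand-in for a vector store query that filters by score_threshold."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [score for score in FAKE_HITS[store_id] if score >= score_threshold]


async def fetch_all(store_ids: list[str]) -> dict[str, list[float]]:
    """Query every store concurrently, each with its own cutoff."""
    tasks = [
        query_store(store_id, CUTOFFS.get(store_id, DEFAULT_CUTOFF))
        for store_id in store_ids
    ]
    results = await asyncio.gather(*tasks)
    return dict(zip(store_ids, results))


results = asyncio.run(fetch_all(["store-a", "store-b", "store-c"]))
```

Here "store-c" has no configured cutoff and falls back to the default, while the other two stores are filtered at their own thresholds.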

docs/byok_guide.md (4)

82-82: LGTM!

The documentation clearly explains the relevance_cutoff_score feature, including its purpose, default value, and how it's applied during Inline RAG retrieval.


297-298: LGTM!

The YAML example clearly demonstrates how to configure relevance_cutoff_score, including helpful comments about the default value and scope.


304-306: LGTM!

The clarification about score space and per-store tuning is valuable guidance for users configuring this feature.


333-336: LGTM!

The comparison table clearly shows that relevance_cutoff_score applies only to Inline RAG (BYOK), providing a helpful reference for users choosing between modes.

tests/unit/utils/test_vector_search.py (3)

430-432: LGTM!

The mock configurations are consistently updated to include relevance_cutoff_score, using the default constant value appropriately.

Also applies to: 472-474, 516-518, 522-524, 752-754


500-504: LGTM!

The assertion correctly verifies that score_threshold is passed to vector_io.query along with the other expected parameters.


561-599: LGTM!

The new test explicitly verifies that a custom relevance_cutoff_score configuration is correctly passed to vector_io.query as score_threshold. This provides comprehensive coverage for the feature.
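
The kind of assertion described above can be sketched with `unittest.mock.AsyncMock`. Everything about the call shape is an assumption made for illustration: the `query_store` wrapper, the `vector_store_id` keyword, and the `params` dictionary layout are hypothetical and may not match the real `vector_io.query` signature. The sketch drives the coroutine with `asyncio.run` to stay self-contained, whereas the repository guidelines call for `pytest.mark.asyncio`.

```python
import asyncio
from unittest.mock import AsyncMock


async def query_store(
    client: AsyncMock, vector_store_id: str, query: str, score_threshold: float
) -> object:
    """Hypothetical wrapper that forwards the cutoff as score_threshold."""
    return await client.vector_io.query(
        vector_store_id=vector_store_id,
        query=query,
        params={"score_threshold": score_threshold},
    )


def test_cutoff_is_forwarded() -> None:
    """Check that the configured cutoff reaches the mocked query call."""
    client = AsyncMock()
    asyncio.run(query_store(client, "store-a", "how do I upgrade?", 0.72))

    # The mocked query must have been awaited exactly once...
    client.vector_io.query.assert_awaited_once()
    # ...and must have received the configured cutoff unchanged.
    kwargs = client.vector_io.query.await_args.kwargs
    assert kwargs["params"]["score_threshold"] == 0.72
```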

@syedriko syedriko force-pushed the syedriko-lcore-1853-2 branch 3 times, most recently from 7187d0b to 8e966b0 Compare May 11, 2026 21:20
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/byok_guide.md`:
- Around line 84-94: Remove the empty line between the two adjacent blockquote
note blocks so they are back-to-back (i.e., merge the `> [!NOTE]` block about
OKP/BYOK `score_multiplier` and chunk constants with the following `> [!NOTE]`
block about `relevance_cutoff_score`) to satisfy MD028; locate the two `>
[!NOTE]` paragraphs in the byok guide and delete the blank line separating them
so markdownlint no longer reports the blockquote-blank rule.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: dc0b1607-f348-4610-b49a-bace5ce7b17c

📥 Commits

Reviewing files that changed from the base of the PR and between 6af93fb and 8e966b0.

📒 Files selected for processing (8)
  • docs/byok_guide.md
  • docs/openapi.json
  • src/constants.py
  • src/models/config.py
  • src/utils/vector_search.py
  • tests/unit/models/config/test_byok_rag.py
  • tests/unit/models/config/test_dump_configuration.py
  • tests/unit/utils/test_vector_search.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: spectral
  • GitHub Check: Pylinter
  • GitHub Check: build-pr
  • GitHub Check: E2E: library mode / ci / group 1
  • GitHub Check: E2E: server mode / ci / group 1
  • GitHub Check: E2E: server mode / ci / group 2
  • GitHub Check: E2E: server mode / ci / group 3
  • GitHub Check: E2E: library mode / ci / group 2
  • GitHub Check: E2E: library mode / ci / group 3
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
🪛 markdownlint-cli2 (0.22.1)
docs/byok_guide.md

[warning] 89-89: Blank line inside blockquote

(MD028, no-blanks-blockquote)

🔇 Additional comments (4)
src/constants.py (1)

196-198: Good addition of shared BYOK relevance cutoff constant.
Typed with Final[float], clearly named, and placed in the correct constants section.

tests/unit/models/config/test_dump_configuration.py (2)

11-11: Import update looks correct and consistent with this test module.


1028-1028: Expected dump payload correctly includes relevance_cutoff_score.
This strengthens coverage for the new BYOK RAG config field serialization path.

docs/openapi.json (1)

11810-11815: Default value is correct.

The default value of 0.3 matches the DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE constant from src/constants.py. The schema definition with exclusiveMinimum: 0.0 correctly enforces positive values and the description accurately reflects the filtering behavior.
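
As a small illustration of what the schema fragment enforces, the sketch below restates the three property keys named in this review as a Python dictionary and checks values against the exclusive bound. The `satisfies_exclusive_minimum` helper is hypothetical and covers only that one keyword; it is not a JSON Schema validator, and the rest of the generated schema is omitted.

```python
# Property fragment as summarized in the review; surrounding schema omitted.
RELEVANCE_CUTOFF_SCHEMA = {
    "type": "number",
    "exclusiveMinimum": 0.0,
    "default": 0.3,
}


def satisfies_exclusive_minimum(value: float, schema: dict) -> bool:
    """Return True when value is strictly greater than exclusiveMinimum."""
    return value > schema["exclusiveMinimum"]


default_ok = satisfies_exclusive_minimum(
    RELEVANCE_CUTOFF_SCHEMA["default"], RELEVANCE_CUTOFF_SCHEMA
)
zero_ok = satisfies_exclusive_minimum(0.0, RELEVANCE_CUTOFF_SCHEMA)
```

The default passes the bound and 0.0 does not, which matches the `gt=0` constraint on the configuration model.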

@syedriko syedriko force-pushed the syedriko-lcore-1853-2 branch from 8e966b0 to fee7578 Compare May 11, 2026 23:52
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
docs/byok_guide.md (1)

89-89: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove the blank blockquote line to satisfy markdownlint MD028.

The empty blockquote line between the two adjacent > [!NOTE] blocks still triggers the MD028 (no-blanks-blockquote) rule violation that was previously flagged.

📝 Proposed fix
 > context, set the `BYOK_RAG_MAX_CHUNKS` and `OKP_RAG_MAX_CHUNKS` constants in `src/constants.py`
 > (defaults: 10 and 5 respectively). For Tool RAG, use `TOOL_RAG_MAX_CHUNKS` (default: 10).
->
 > [!NOTE]
 > `relevance_cutoff_score` applies to Inline RAG only. When the model uses Tool RAG (`file_search`),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/byok_guide.md` at line 89, Remove the empty blockquote line between the
two adjacent blockquote markers so MD028 is satisfied: find the consecutive
lines starting with "> [!NOTE]" and delete the blank line (the standalone
">"/empty blockquote) separating them so the two note blocks are directly
adjacent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/openapi.json`:
- Around line 11809-11815: Regenerate the out-of-sync OpenAPI schema file
docs/openapi.json by running the project script
scripts/generate_openapi_schema.py (via the project's runner e.g., `uv run`)
instead of editing the JSON manually; after regeneration verify the
relevance_cutoff_score default in docs/openapi.json matches the
DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE constant defined in src/constants.py,
then commit the regenerated docs/openapi.json.

---

Duplicate comments:
In `@docs/byok_guide.md`:
- Line 89: Remove the empty blockquote line between the two adjacent blockquote
markers so MD028 is satisfied: find the consecutive lines starting with ">
[!NOTE]" and delete the blank line (the standalone ">"/empty blockquote)
separating them so the two note blocks are directly adjacent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: d4d39bf3-18f3-4c52-9c75-cdb4a6f534b2

📥 Commits

Reviewing files that changed from the base of the PR and between 8e966b0 and fee7578.

📒 Files selected for processing (8)
  • docs/byok_guide.md
  • docs/openapi.json
  • src/constants.py
  • src/models/config.py
  • src/utils/vector_search.py
  • tests/unit/models/config/test_byok_rag.py
  • tests/unit/models/config/test_dump_configuration.py
  • tests/unit/utils/test_vector_search.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: build-pr
  • GitHub Check: integration_tests (3.12)
  • GitHub Check: Pylinter
  • GitHub Check: unit_tests (3.12)
  • GitHub Check: E2E: server mode / ci / group 1
  • GitHub Check: E2E: server mode / ci / group 3
  • GitHub Check: E2E: library mode / ci / group 3
  • GitHub Check: E2E: library mode / ci / group 1
  • GitHub Check: E2E: library mode / ci / group 2
  • GitHub Check: E2E: server mode / ci / group 2
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
🪛 GitHub Actions: OpenAPI (Spectral) / 0_spectral.txt
docs/openapi.json

[error] 1-1: docs/openapi.json is out of date. Step checks openapi schema via uv run python scripts/generate_openapi_schema.py /tmp/openapi-generated.json and diff -u docs/openapi.json /tmp/openapi-generated.json; diff failed. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json.

🪛 GitHub Actions: OpenAPI (Spectral) / spectral
docs/openapi.json

[error] 1-1: docs/openapi.json is out of date. Diff detected between docs/openapi.json and generated /tmp/openapi-generated.json. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json

🔇 Additional comments (11)
src/constants.py (1)

196-197: LGTM!

src/models/config.py (1)

1721-1727: LGTM!

tests/unit/models/config/test_dump_configuration.py (1)

11-11: LGTM!

Also applies to: 1028-1028

tests/unit/models/config/test_byok_rag.py (1)

7-7: LGTM!

Also applies to: 39-39, 59-59, 68-68, 208-220

src/utils/vector_search.py (2)

29-44: LGTM!


196-223: LGTM!

Also applies to: 434-434

docs/byok_guide.md (1)

82-82: LGTM!

Also applies to: 297-298, 303-306, 333-336, 586-586

tests/unit/utils/test_vector_search.py (4)

3-5: LGTM!

Also applies to: 26-26, 31-54


460-462: LGTM!

Also applies to: 502-504, 530-535, 546-548, 552-554, 882-884


591-629: LGTM!


631-674: LGTM!

Also applies to: 676-729
