LCORE-1853: Add relevance cutoff score to inline BYOK RAG by syedriko · Pull Request #1720 · lightspeed-core/lightspeed-stack

syedriko · 2026-05-11T19:12:24Z

Description

In the byok_rag section of the Lightspeed Stack configuration, added a "relevance_cutoff_score" configuration variable, a positive float.
When a RAG chunk is retrieved from a BYOK database and its relevance score is less than the cutoff, the chunk is dropped from further consideration. Specifically, this cutoff filtering happens before score multipliers are applied.
Only added handling of relevance_cutoff_score to the inline BYOK RAG for now, as adding it to the tool BYOK RAG requires changes in Llama Stack.
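
The ordering described above (cutoff first, multipliers second) can be sketched as follows. This is a minimal illustration, not the PR's code: `Chunk` and `filter_then_weight` are hypothetical names, and the multiplier is assumed to be a single float per knowledge source.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Chunk:
    """A retrieved RAG chunk with its raw relevance score (hypothetical type)."""

    text: str
    score: float


def filter_then_weight(
    chunks: list[Chunk], cutoff: float, multiplier: float
) -> list[Chunk]:
    """Drop chunks scoring below the cutoff, then apply the score multiplier."""
    # The comparison uses the raw score, so a large multiplier cannot
    # rescue a chunk that was below the cutoff to begin with.
    kept = [chunk for chunk in chunks if chunk.score >= cutoff]
    return [replace(chunk, score=chunk.score * multiplier) for chunk in kept]


chunks = [Chunk("weak match", 0.25), Chunk("strong match", 0.5)]
weighted = filter_then_weight(chunks, cutoff=0.3, multiplier=2.0)
```

With a multiplier of 2.0 the weak chunk would have reached 0.5 had the multiplier been applied first; filtering on the raw score keeps it out.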

Type of change

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

Assisted-by: Cursor
Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

Related Issue #
Closes # https://redhat.atlassian.net/browse/LCORE-1853

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Please provide detailed steps to perform tests related to this code change.
How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

New Features
- Added per-knowledge-source relevance_cutoff_score for BYOK Inline RAG to filter results by minimum similarity (default 0.3); Tool RAG/file_search unaffected.
Documentation
- BYOK guide updated with usage, tuning guidance, Inline vs Tool RAG notes, and a reminder to refresh sources and monitor performance.
API
- OpenAPI schema exposes relevance_cutoff_score with default and validation.
Tests
- Unit tests expanded to cover configuration, validation, propagation, and filtering behavior.

coderabbitai · 2026-05-11T19:13:14Z

Walkthrough

This PR adds per-knowledge-source relevance cutoff score configuration for BYOK RAG, enabling Inline RAG queries to filter vector search results by minimum similarity threshold. The constant, config model, vector search integration, API schema, and comprehensive tests are introduced.

Changes

BYOK RAG Relevance Cutoff Configuration

| Layer / File(s) | Summary |
| --- | --- |
| Default Constant `src/constants.py` | `DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE` constant set to `0.3` defines the module-level default minimum similarity threshold. |
| Configuration Model `src/models/config.py` | `ByokRag` model adds `relevance_cutoff_score: float` field with positive value constraint (`gt=0`) and default from the constant. |
| Vector Search Integration `src/utils/vector_search.py` | New `_relevance_cutoff_for_vector_store()` helper reads per-store cutoff from configuration; `_query_store_for_byok_rag()` signature updated to accept `score_threshold` and pass it to `vector_io.query`; `_fetch_byok_rag()` call site computes per-store threshold and supplies it. |
| API Schema `docs/openapi.json` | `relevance_cutoff_score` property added with `type: number`, `exclusiveMinimum: 0.0`, and `default: 0.3`. |
| User Guide `docs/byok_guide.md` | Extensive documentation of `relevance_cutoff_score`, its mapping to vector store `score_threshold`, explicit note that it applies only to Inline RAG (not Tool RAG), expanded YAML configuration example, and updated Inline vs Tool RAG comparison table. |
| Unit Tests `tests/unit/models/config/test_byok_rag.py`, `tests/unit/models/config/test_dump_configuration.py`, `tests/unit/utils/test_vector_search.py` | Tests verify default value assignment, non-default value handling, validation failure for zero values, configuration serialization, and vector query parameter passing for the relevance cutoff. New tests confirm configured cutoff is forwarded as `score_threshold` and that backend filtering excludes sub-threshold hits. |

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title clearly and specifically summarizes the main change: adding a relevance cutoff score feature to inline BYOK RAG. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 90.48% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/openapi.json`:
- Around line 11809-11815: The OpenAPI schema file docs/openapi.json is out of
sync and should be regenerated rather than edited; run the provided generator
script (uv run scripts/generate_openapi_schema.py docs/openapi.json) to recreate
the schema, then verify the relevance_cutoff_score field in the regenerated JSON
has the correct default value that matches
DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE from src/constants.py and commit the
regenerated file.

In `@src/utils/vector_search.py`:
- Around line 29-34: Update the docstring for _relevance_cutoff_for_vector_store
to follow Google Python conventions: add a brief description line, then a
Parameters section documenting vector_store_id (str) and what it represents, and
a Returns section describing the returned float (either the matched
brag.relevance_cutoff_score or
constants.DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE). If the function can raise
any exceptions, add a Raises section; otherwise omit Raises. Ensure the
docstring text references the function name and types only (no code) and matches
the existing one-line summary.

In `@tests/unit/models/config/test_dump_configuration.py`:
- Line 11: Replace the top-level import "import constants" with explicit
from-imports for the names actually used in this test (e.g., "from constants
import FOO, BAR" — substitute the real constant names referenced in this file)
and update usages that refer to constants.<NAME> (including the reference around
line 996) to use the direct names (FOO, BAR, etc.) so the import style matches
the existing "from X import Y" pattern used elsewhere in the file.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f0c9945a-ead4-4245-9d4b-99ea45582ed7

📥 Commits

Reviewing files that changed from the base of the PR and between 70cb0db and 6af93fb.

📒 Files selected for processing (8)

docs/byok_guide.md
docs/openapi.json
src/constants.py
src/models/config.py
src/utils/vector_search.py
tests/unit/models/config/test_byok_rag.py
tests/unit/models/config/test_dump_configuration.py
tests/unit/utils/test_vector_search.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)

GitHub Check: black
GitHub Check: unit_tests (3.13)
GitHub Check: Pylinter
GitHub Check: radon
GitHub Check: build-pr
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: E2E: server mode / ci / group 1
GitHub Check: E2E: library mode / ci / group 3
GitHub Check: E2E: library mode / ci / group 2
GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E Tests for Lightspeed Evaluation job

🧰 Additional context used

📓 Path-based instructions (4)

src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Llama Stack imports: Use from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Use async def for I/O operations and external API calls
Use standard log levels with clear purposes: debug() for diagnostic info, info() for program execution, warning() for unexpected events, error() for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes must use ABC with @abstractmethod decorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes

Files:

src/models/config.py
src/constants.py
src/utils/vector_search.py

src/models/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Pydantic models must use @model_validator and @field_validator for validation and complete type annotations for all attributes, avoiding Any type

Files:

src/models/config.py

src/constants.py

📄 CodeRabbit inference engine (AGENTS.md)

Use constants.py for shared constants with descriptive comments and type hints using Final[type]

Files:

src/constants.py

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

tests/unit/models/config/test_dump_configuration.py
tests/unit/utils/test_vector_search.py
tests/unit/models/config/test_byok_rag.py

🧠 Learnings (2)

📚 Learning: 2026-01-12T10:58:40.230Z

Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.

Applied to files:

src/models/config.py

📚 Learning: 2026-02-25T07:46:33.545Z

Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.

Applied to files:

src/models/config.py

🪛 GitHub Actions: OpenAPI (Spectral) / 0_spectral.txt

docs/openapi.json

[error] 1-1: OpenAPI schema out of date. diff detected changes between docs/openapi.json and /tmp/openapi-generated.json. Regenerate with: 'uv run scripts/generate_openapi_schema.py docs/openapi.json'.

🪛 GitHub Actions: OpenAPI (Spectral) / spectral

docs/openapi.json

[error] 1-1: OpenAPI schema is out of date. 'diff -u docs/openapi.json /tmp/openapi-generated.json' failed; regenerate with: 'uv run scripts/generate_openapi_schema.py docs/openapi.json'.

🪛 markdownlint-cli2 (0.22.1)

docs/byok_guide.md

[warning] 89-89: Blank line inside blockquote

(MD028, no-blanks-blockquote)

🔇 Additional comments (15)

src/constants.py (1)

196-198: LGTM! Well-defined constant for BYOK RAG relevance filtering.

The constant follows the established pattern with proper type hints and descriptive comment. The default value of 0.3 is reasonable for similarity-based filtering.

src/models/config.py (1)

1636-1643: LGTM! Proper Pydantic field definition for relevance cutoff.

The field is correctly configured with:

Appropriate type annotation and default value from constants

gt=0 validation ensuring positive cutoff values

Clear description specifying "raw similarity score" (before multipliers)

The validation constraint requiring values > 0 prevents disabling the cutoff entirely, which aligns with the PR requirement for a "positive float."
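
The effect of the `gt=0` constraint can be illustrated with a standard-library stand-in. This sketch deliberately uses a dataclass instead of the Pydantic model the PR modifies, so `ByokRagSketch` and `is_accepted` are hypothetical names and the error type differs from what Pydantic would raise.

```python
from dataclasses import dataclass

# Default value stated in the PR; redefined here so the sketch is self-contained.
DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE = 0.3


@dataclass(frozen=True)
class ByokRagSketch:
    """Stand-in for the ByokRag config model; only the new field is shown."""

    relevance_cutoff_score: float = DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE

    def __post_init__(self) -> None:
        # Mirrors a strict "greater than zero" check: zero and negative
        # values are rejected, so the cutoff cannot be switched off.
        if not self.relevance_cutoff_score > 0:
            raise ValueError("relevance_cutoff_score must be greater than 0")


def is_accepted(value: float) -> bool:
    """Return True if the value passes the positive-value check."""
    try:
        ByokRagSketch(relevance_cutoff_score=value)
    except ValueError:
        return False
    return True
```

Under these assumptions `is_accepted(0.72)` succeeds while `is_accepted(0.0)` fails, which matches the boundary case the unit tests are described as covering.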

tests/unit/models/config/test_dump_configuration.py (1)

996-996: LGTM! Test correctly validates default value serialization.

The test properly verifies that the new relevance_cutoff_score field is included in the dumped configuration with the correct default value from constants.

tests/unit/models/config/test_byok_rag.py (3)

7-7: LGTM!

The import of DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE is appropriate for testing the default value behavior.

39-39: LGTM!

The test assertions properly validate both the default and custom relevance_cutoff_score values. The choice of 0.72 as a test value provides good coverage.

Also applies to: 59-59, 68-68

208-220: LGTM!

The validation test correctly verifies the gt=0 constraint on relevance_cutoff_score. Testing the boundary case with 0.0 provides good coverage.

src/utils/vector_search.py (2)

186-213: LGTM!

The function correctly accepts and forwards the score_threshold parameter to client.vector_io.query. The docstring is properly updated to document the new parameter.

418-427: LGTM!

The implementation correctly retrieves the per-vector-store relevance_cutoff_score and passes it to the query function. Each store in the parallel queries will use its own configured cutoff threshold.
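
A per-store lookup with a fallback to the default might look roughly like this. It is a sketch under assumptions: the real helper reads from the loaded configuration rather than taking a list argument, and `ByokRagEntry` with its `vector_store_id` field name is hypothetical.

```python
from dataclasses import dataclass
from typing import Final

# Default value stated in the PR; redefined here so the sketch is self-contained.
DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE: Final[float] = 0.3


@dataclass(frozen=True)
class ByokRagEntry:
    """Hypothetical slice of one byok_rag configuration entry."""

    vector_store_id: str  # assumed field name, not confirmed by the PR text
    relevance_cutoff_score: float = DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE


def relevance_cutoff_for_vector_store(
    vector_store_id: str, entries: list[ByokRagEntry]
) -> float:
    """Return the cutoff configured for a store, or the default if none matches."""
    for brag in entries:
        if brag.vector_store_id == vector_store_id:
            return brag.relevance_cutoff_score
    return DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE


entries = [ByokRagEntry("docs-store", 0.72), ByokRagEntry("wiki-store")]
```

Resolving the cutoff per store is what lets parallel queries against several knowledge sources each use their own threshold.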

docs/byok_guide.md (4)

82-82: LGTM!

The documentation clearly explains the relevance_cutoff_score feature, including its purpose, default value, and how it's applied during Inline RAG retrieval.

297-298: LGTM!

The YAML example clearly demonstrates how to configure relevance_cutoff_score, including helpful comments about the default value and scope.

304-306: LGTM!

The clarification about score space and per-store tuning is valuable guidance for users configuring this feature.

333-336: LGTM!

The comparison table clearly shows that relevance_cutoff_score applies only to Inline RAG (BYOK), providing a helpful reference for users choosing between modes.

tests/unit/utils/test_vector_search.py (3)

430-432: LGTM!

The mock configurations are consistently updated to include relevance_cutoff_score, using the default constant value appropriately.

Also applies to: 472-474, 516-518, 522-524, 752-754

500-504: LGTM!

The assertion correctly verifies that score_threshold is passed to vector_io.query along with the other expected parameters.

561-599: LGTM!

The new test explicitly verifies that a custom relevance_cutoff_score configuration is correctly passed to vector_io.query as score_threshold. This provides comprehensive coverage for the feature.
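
The forwarding check described here can be approximated with a mocked client. This is only a sketch: `query_store` is a hypothetical wrapper, and the keyword names and the `params` dictionary shape passed to `vector_io.query` are assumptions rather than the confirmed call made in `src/utils/vector_search.py`.

```python
import asyncio
from typing import Any
from unittest.mock import AsyncMock, MagicMock


async def query_store(
    client: Any, vector_store_id: str, query: str, score_threshold: float
) -> Any:
    """Query one store, forwarding the cutoff as score_threshold."""
    return await client.vector_io.query(
        vector_store_id=vector_store_id,  # assumed keyword name
        query=query,
        params={"score_threshold": score_threshold},  # assumed params shape
    )


def run_forwarding_check(cutoff: float) -> dict[str, Any]:
    """Call query_store against a mock client and return the recorded kwargs."""
    client = MagicMock()
    client.vector_io.query = AsyncMock(return_value=None)
    asyncio.run(query_store(client, "store-1", "how do I configure BYOK?", cutoff))
    client.vector_io.query.assert_awaited_once()
    return client.vector_io.query.await_args.kwargs


recorded = run_forwarding_check(0.72)
```

In the repository itself the same idea would be written as a pytest test marked with `pytest.mark.asyncio`, per the project guidelines quoted in this review.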

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/byok_guide.md`:
- Around line 84-94: Remove the empty line between the two adjacent blockquote
note blocks so they are back-to-back (i.e., merge the `> [!NOTE]` block about
OKP/BYOK `score_multiplier` and chunk constants with the following `> [!NOTE]`
block about `relevance_cutoff_score`) to satisfy MD028; locate the two `>
[!NOTE]` paragraphs in the byok guide and delete the blank line separating them
so markdownlint no longer reports the blockquote-blank rule.
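
To locate the offending line before editing, a rough approximation of the MD028 condition can be scripted. This is a simplified sketch, not markdownlint's implementation: the hypothetical `blank_lines_between_blockquotes` only flags a single blank line that has a blockquote line directly before and after it.

```python
def blank_lines_between_blockquotes(markdown: str) -> list[int]:
    """Return 1-based numbers of blank lines directly between two blockquote lines."""
    lines = markdown.splitlines()
    flagged: list[int] = []
    for index in range(1, len(lines) - 1):
        is_blank = lines[index].strip() == ""
        quoted_before = lines[index - 1].lstrip().startswith(">")
        quoted_after = lines[index + 1].lstrip().startswith(">")
        if is_blank and quoted_before and quoted_after:
            flagged.append(index + 1)
    return flagged


sample = "> [!NOTE]\n> first note\n\n> [!NOTE]\n> second note\n"
flagged_lines = blank_lines_between_blockquotes(sample)
```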

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: dc0b1607-f348-4610-b49a-bace5ce7b17c

📥 Commits

Reviewing files that changed from the base of the PR and between 6af93fb and 8e966b0.

📒 Files selected for processing (8)

docs/byok_guide.md
docs/openapi.json
src/constants.py
src/models/config.py
src/utils/vector_search.py
tests/unit/models/config/test_byok_rag.py
tests/unit/models/config/test_dump_configuration.py
tests/unit/utils/test_vector_search.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)

GitHub Check: spectral
GitHub Check: Pylinter
GitHub Check: build-pr
GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E: server mode / ci / group 1
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E: library mode / ci / group 2
GitHub Check: E2E: library mode / ci / group 3
GitHub Check: E2E Tests for Lightspeed Evaluation job

🪛 markdownlint-cli2 (0.22.1)

docs/byok_guide.md

[warning] 89-89: Blank line inside blockquote

(MD028, no-blanks-blockquote)

🔇 Additional comments (4)

src/constants.py (1)

196-198: Good addition of shared BYOK relevance cutoff constant.
Typed with Final[float], clearly named, and placed in the correct constants section.

tests/unit/models/config/test_dump_configuration.py (2)

11-11: Import update looks correct and consistent with this test module.

1028-1028: Expected dump payload correctly includes relevance_cutoff_score.
This strengthens coverage for the new BYOK RAG config field serialization path.

docs/openapi.json (1)

11810-11815: Default value is correct.

The default value of 0.3 matches the DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE constant from src/constants.py. The schema definition with exclusiveMinimum: 0.0 correctly enforces positive values and the description accurately reflects the filtering behavior.
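
A check like the one described can be sketched without knowing where the property sits in the schema. The helper names are hypothetical and the sample document's layout is invented for illustration; only the property name, `exclusiveMinimum`, and the `0.3` default come from this PR.

```python
from typing import Any, Iterator

# Default value stated in the PR; redefined here so the sketch is self-contained.
DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE = 0.3


def find_properties(node: Any, name: str) -> Iterator[dict]:
    """Yield every schema object stored under the key `name`, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == name and isinstance(value, dict):
                yield value
            yield from find_properties(value, name)
    elif isinstance(node, list):
        for item in node:
            yield from find_properties(item, name)


def defaults_in_sync(openapi: dict, expected: float) -> bool:
    """Return True if the property exists and every occurrence has the expected default."""
    found = list(find_properties(openapi, "relevance_cutoff_score"))
    return bool(found) and all(prop.get("default") == expected for prop in found)


# Hypothetical schema layout, used only to exercise the helpers.
sample_schema = {
    "components": {
        "schemas": {
            "ByokRag": {
                "properties": {
                    "relevance_cutoff_score": {
                        "type": "number",
                        "exclusiveMinimum": 0.0,
                        "default": 0.3,
                    }
                }
            }
        }
    }
}
in_sync = defaults_in_sync(sample_schema, DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE)
```

Against the real file, the dictionary would come from `json.load` on `docs/openapi.json`; the authoritative check remains regenerating the schema with the project script and diffing the result.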

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

docs/byok_guide.md (1)
89-89: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove the blank blockquote line to satisfy markdownlint MD028.

The empty blockquote line between the two adjacent > [!NOTE] blocks still triggers the MD028 (no-blanks-blockquote) rule violation that was previously flagged.
📝 Proposed fix
 > context, set the `BYOK_RAG_MAX_CHUNKS` and `OKP_RAG_MAX_CHUNKS` constants in `src/constants.py`
 > (defaults: 10 and 5 respectively). For Tool RAG, use `TOOL_RAG_MAX_CHUNKS` (default: 10).
->
 > [!NOTE]
 > `relevance_cutoff_score` applies to Inline RAG only. When the model uses Tool RAG (`file_search`),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/byok_guide.md` at line 89, Remove the empty blockquote line between the
two adjacent blockquote markers so MD028 is satisfied: find the consecutive
lines starting with "> [!NOTE]" and delete the blank line (the standalone
">"/empty blockquote) separating them so the two note blocks are directly
adjacent.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/openapi.json`:
- Around line 11809-11815: Regenerate the out-of-sync OpenAPI schema file
docs/openapi.json by running the project script
scripts/generate_openapi_schema.py (via the project's runner e.g., `uv run`)
instead of editing the JSON manually; after regeneration verify the
relevance_cutoff_score default in docs/openapi.json matches the
DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE constant defined in src/constants.py,
then commit the regenerated docs/openapi.json.

---

Duplicate comments:
In `@docs/byok_guide.md`:
- Line 89: Remove the empty blockquote line between the two adjacent blockquote
markers so MD028 is satisfied: find the consecutive lines starting with ">
[!NOTE]" and delete the blank line (the standalone ">"/empty blockquote)
separating them so the two note blocks are directly adjacent.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: d4d39bf3-18f3-4c52-9c75-cdb4a6f534b2

📥 Commits

Reviewing files that changed from the base of the PR and between 8e966b0 and fee7578.

📒 Files selected for processing (8)

docs/byok_guide.md
docs/openapi.json
src/constants.py
src/models/config.py
src/utils/vector_search.py
tests/unit/models/config/test_byok_rag.py
tests/unit/models/config/test_dump_configuration.py
tests/unit/utils/test_vector_search.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)

GitHub Check: build-pr
GitHub Check: integration_tests (3.12)
GitHub Check: Pylinter
GitHub Check: unit_tests (3.12)
GitHub Check: E2E: server mode / ci / group 1
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E: library mode / ci / group 3
GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E: library mode / ci / group 2
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: E2E Tests for Lightspeed Evaluation job

🪛 GitHub Actions: OpenAPI (Spectral) / 0_spectral.txt

docs/openapi.json

[error] 1-1: docs/openapi.json is out of date. Step checks openapi schema via uv run python scripts/generate_openapi_schema.py /tmp/openapi-generated.json and diff -u docs/openapi.json /tmp/openapi-generated.json; diff failed. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json.

🪛 GitHub Actions: OpenAPI (Spectral) / spectral

docs/openapi.json

[error] 1-1: docs/openapi.json is out of date. Diff detected between docs/openapi.json and generated /tmp/openapi-generated.json. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json

🔇 Additional comments (11)

src/constants.py (1)

196-197: LGTM!

src/models/config.py (1)

1721-1727: LGTM!

tests/unit/models/config/test_dump_configuration.py (1)

11-11: LGTM!

Also applies to: 1028-1028

tests/unit/models/config/test_byok_rag.py (1)

7-7: LGTM!

Also applies to: 39-39, 59-59, 68-68, 208-220

src/utils/vector_search.py (2)

29-44: LGTM!

196-223: LGTM!

Also applies to: 434-434

docs/byok_guide.md (1)

82-82: LGTM!

Also applies to: 297-298, 303-306, 333-336, 586-586

tests/unit/utils/test_vector_search.py (4)

3-5: LGTM!

Also applies to: 26-26, 31-54

460-462: LGTM!

Also applies to: 502-504, 530-535, 546-548, 552-554, 882-884

591-629: LGTM!

631-674: LGTM!

Also applies to: 676-729

coderabbitai Bot reviewed May 11, 2026

View reviewed changes

Comment thread src/utils/vector_search.py

Comment thread tests/unit/models/config/test_dump_configuration.py Outdated

syedriko force-pushed the syedriko-lcore-1853-2 branch 3 times, most recently from 7187d0b to 8e966b0 Compare May 11, 2026 21:20

coderabbitai Bot reviewed May 11, 2026

View reviewed changes

Comment thread docs/byok_guide.md

LCORE-1853: Add relevance cutoff score to inline BYOK RAG

fee7578

syedriko force-pushed the syedriko-lcore-1853-2 branch from 8e966b0 to fee7578 Compare May 11, 2026 23:52

coderabbitai Bot reviewed May 11, 2026

View reviewed changes

Comment thread docs/openapi.json

LCORE-1853: Add relevance cutoff score to inline BYOK RAG#1720
syedriko wants to merge 1 commit into lightspeed-core:main from syedriko:syedriko-lcore-1853-2

1 participant
