LCORE-1853: Add relevance cutoff score to inline BYOK RAG (#1720)
Conversation
Walkthrough
This PR adds per-knowledge-source relevance cutoff score configuration for BYOK RAG, enabling Inline RAG queries to filter vector search results by a minimum similarity threshold. The constant, config model, vector search integration, API schema, and comprehensive tests are introduced.
Changes
BYOK RAG Relevance Cutoff Configuration
🎯 2 (Simple) | ⏱️ ~12 minutes | 🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/openapi.json`:
- Around line 11809-11815: The OpenAPI schema file docs/openapi.json is out of
sync and should be regenerated rather than edited; run the provided generator
script (uv run scripts/generate_openapi_schema.py docs/openapi.json) to recreate
the schema, then verify the relevance_cutoff_score field in the regenerated JSON
has the correct default value that matches
DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE from src/constants.py and commit the
regenerated file.
In `@src/utils/vector_search.py`:
- Around line 29-34: Update the docstring for _relevance_cutoff_for_vector_store
to follow Google Python conventions: add a brief description line, then a
Parameters section documenting vector_store_id (str) and what it represents, and
a Returns section describing the returned float (either the matched
brag.relevance_cutoff_score or
constants.DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE). If the function can raise
any exceptions, add a Raises section; otherwise omit Raises. Ensure the
docstring text references the function name and types only (no code) and matches
the existing one-line summary.
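For illustration, a lookup function with the requested Google-style docstring might look like the sketch below. The `ByokRag` dataclass and the list-based lookup are hypothetical stand-ins for the project's real config objects; only the field names and the 0.3 default are taken from the review above.

```python
from dataclasses import dataclass

# Hypothetical stand-in for DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE
# in src/constants.py (default documented elsewhere in this review as 0.3).
DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE: float = 0.3


@dataclass
class ByokRag:
    """Hypothetical slice of a BYOK RAG knowledge-source config entry."""

    vector_store_id: str
    relevance_cutoff_score: float


def relevance_cutoff_for_vector_store(
    byok_rags: list[ByokRag], vector_store_id: str
) -> float:
    """Return the relevance cutoff score configured for a vector store.

    Parameters:
        byok_rags: Configured BYOK RAG knowledge sources to search.
        vector_store_id: Identifier of the vector store to look up.

    Returns:
        float: The matching entry's relevance_cutoff_score, or
        DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE when no entry matches.
    """
    for brag in byok_rags:
        if brag.vector_store_id == vector_store_id:
            return brag.relevance_cutoff_score
    return DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE
```

Since no exception is raised on a miss (the default is returned instead), a `Raises` section can be omitted, per the review's note.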
In `@tests/unit/models/config/test_dump_configuration.py`:
- Line 11: Replace the top-level import "import constants" with explicit
from-imports for the names actually used in this test (e.g., "from constants
import FOO, BAR" — substitute the real constant names referenced in this file)
and update usages that refer to constants.<NAME> (including the reference around
line 996) to use the direct names (FOO, BAR, etc.) so the import style matches
the existing "from X import Y" pattern used elsewhere in the file.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f0c9945a-ead4-4245-9d4b-99ea45582ed7
📒 Files selected for processing (8)
docs/byok_guide.md, docs/openapi.json, src/constants.py, src/models/config.py, src/utils/vector_search.py, tests/unit/models/config/test_byok_rag.py, tests/unit/models/config/test_dump_configuration.py, tests/unit/utils/test_vector_search.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: black
- GitHub Check: unit_tests (3.13)
- GitHub Check: Pylinter
- GitHub Check: radon
- GitHub Check: build-pr
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
📓 Path-based instructions (4)
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules: `from authentication import get_auth_dependency`
Llama Stack imports: use `from llama_stack_client import AsyncLlamaStackClient`
Check `constants.py` for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use `logger = get_logger(__name__)` from `log.py` for module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (`str | int`), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Use `async def` for I/O operations and external API calls
Use standard log levels with clear purposes: `debug()` for diagnostic info, `info()` for program execution, `warning()` for unexpected events, `error()` for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes must use ABC with `@abstractmethod` decorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/models/config.py, src/constants.py, src/utils/vector_search.py
src/models/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Pydantic models must use `@model_validator` and `@field_validator` for validation and complete type annotations for all attributes, avoiding the `Any` type
Files:
src/models/config.py
src/constants.py
📄 CodeRabbit inference engine (AGENTS.md)
Use `constants.py` for shared constants with descriptive comments and type hints using `Final[type]`
Files:
src/constants.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use the `pytest.mark.asyncio` marker for async tests
Files:
tests/unit/models/config/test_dump_configuration.py, tests/unit/utils/test_vector_search.py, tests/unit/models/config/test_byok_rag.py
🧠 Learnings (2)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.
Applied to files:
src/models/config.py
📚 Learning: 2026-02-25T07:46:33.545Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.
Applied to files:
src/models/config.py
🪛 GitHub Actions: OpenAPI (Spectral) / 0_spectral.txt
docs/openapi.json
[error] 1-1: OpenAPI schema out of date. diff detected changes between docs/openapi.json and /tmp/openapi-generated.json. Regenerate with: 'uv run scripts/generate_openapi_schema.py docs/openapi.json'.
🪛 GitHub Actions: OpenAPI (Spectral) / spectral
docs/openapi.json
[error] 1-1: OpenAPI schema is out of date. 'diff -u docs/openapi.json /tmp/openapi-generated.json' failed; regenerate with: 'uv run scripts/generate_openapi_schema.py docs/openapi.json'.
🪛 markdownlint-cli2 (0.22.1)
docs/byok_guide.md
[warning] 89-89: Blank line inside blockquote
(MD028, no-blanks-blockquote)
🔇 Additional comments (15)

src/constants.py (1)

196-198: LGTM! Well-defined constant for BYOK RAG relevance filtering.
The constant follows the established pattern with proper type hints and a descriptive comment. The default value of 0.3 is reasonable for similarity-based filtering.

src/models/config.py (1)

1636-1643: LGTM! Proper Pydantic field definition for relevance cutoff.
The field is correctly configured with:
- Appropriate type annotation and default value from constants
- `gt=0` validation ensuring positive cutoff values
- Clear description specifying "raw similarity score" (before multipliers)

The validation constraint requiring values > 0 prevents disabling the cutoff entirely, which aligns with the PR requirement for a "positive float."
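A minimal sketch of such a Pydantic field, assuming pydantic v2; the model name, docstring, and description text here are illustrative, with only the `gt=0` constraint and the 0.3 default taken from the review.

```python
from pydantic import BaseModel, Field, ValidationError

# Mirrors DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE from src/constants.py.
DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE: float = 0.3


class ByokRag(BaseModel):
    """Sketch of the BYOK RAG config model; all other fields omitted."""

    relevance_cutoff_score: float = Field(
        default=DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE,
        gt=0,  # a cutoff of 0.0 (i.e. "disabled") is rejected
        description="Minimum raw similarity score (before multipliers).",
    )


# The gt=0 constraint makes 0.0 invalid, so the cutoff cannot be
# silently disabled through configuration:
try:
    ByokRag(relevance_cutoff_score=0.0)
except ValidationError:
    rejected = True
```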
tests/unit/models/config/test_dump_configuration.py (1)

996-996: LGTM! Test correctly validates default value serialization.
The test properly verifies that the new `relevance_cutoff_score` field is included in the dumped configuration with the correct default value from constants.

tests/unit/models/config/test_byok_rag.py (3)

7-7: LGTM!
The import of `DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE` is appropriate for testing the default value behavior.

39-39: LGTM!
The test assertions properly validate both the default and custom `relevance_cutoff_score` values. The choice of 0.72 as a test value provides good coverage. Also applies to: 59-59, 68-68

208-220: LGTM!
The validation test correctly verifies the `gt=0` constraint on `relevance_cutoff_score`. Testing the boundary case with `0.0` provides good coverage.

src/utils/vector_search.py (2)

186-213: LGTM!
The function correctly accepts and forwards the `score_threshold` parameter to `client.vector_io.query`. The docstring is properly updated to document the new parameter.

418-427: LGTM!
The implementation correctly retrieves the per-vector-store `relevance_cutoff_score` and passes it to the query function. Each store in the parallel queries will use its own configured cutoff threshold.

docs/byok_guide.md (4)
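The per-store fan-out described above can be sketched roughly as follows; the function names and return values are hypothetical stand-ins, and `asyncio.sleep(0)` takes the place of the real `vector_io` call:

```python
import asyncio


async def query_store(vector_store_id: str, cutoff: float) -> list[str]:
    """Hypothetical per-store query; returns ids of chunks above the cutoff."""
    await asyncio.sleep(0)  # placeholder for the actual vector search I/O
    return [f"{vector_store_id}:chunk@>={cutoff}"]


async def query_all(stores: dict[str, float]) -> list[list[str]]:
    """Query every store in parallel, each with its own configured cutoff."""
    return await asyncio.gather(
        *(query_store(sid, cutoff) for sid, cutoff in stores.items())
    )


# Two stores with different per-store cutoffs, queried concurrently.
results = asyncio.run(query_all({"vs-1": 0.3, "vs-2": 0.72}))
```

Because each coroutine captures its own cutoff before `gather` runs them, no store's threshold can leak into another's query.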
82-82: LGTM!
The documentation clearly explains the `relevance_cutoff_score` feature, including its purpose, default value, and how it's applied during Inline RAG retrieval.

297-298: LGTM!
The YAML example clearly demonstrates how to configure `relevance_cutoff_score`, including helpful comments about the default value and scope.

304-306: LGTM!
The clarification about score space and per-store tuning is valuable guidance for users configuring this feature.

333-336: LGTM!
The comparison table clearly shows that `relevance_cutoff_score` applies only to Inline RAG (BYOK), providing a helpful reference for users choosing between modes.

tests/unit/utils/test_vector_search.py (3)

430-432: LGTM!
The mock configurations are consistently updated to include `relevance_cutoff_score`, using the default constant value appropriately. Also applies to: 472-474, 516-518, 522-524, 752-754

500-504: LGTM!
The assertion correctly verifies that `score_threshold` is passed to `vector_io.query` along with the other expected parameters.

561-599: LGTM!
The new test explicitly verifies that a custom `relevance_cutoff_score` configuration is correctly passed to `vector_io.query` as `score_threshold`. This provides comprehensive coverage for the feature.
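A test along these lines might look like the sketch below; `query_vector_store` is a hypothetical stand-in for the project's query helper, and the exact `vector_io.query` signature is an assumption rather than the real API:

```python
import asyncio
from unittest.mock import AsyncMock


async def query_vector_store(
    client: AsyncMock,
    vector_store_id: str,
    query: str,
    score_threshold: float,
) -> object:
    """Hypothetical helper: forward the per-store cutoff to vector_io.query."""
    return await client.vector_io.query(
        vector_db_id=vector_store_id,
        query=query,
        params={"score_threshold": score_threshold},
    )


def test_custom_cutoff_forwarded() -> None:
    """Verify a custom cutoff reaches the mocked vector_io.query call."""
    client = AsyncMock()
    asyncio.run(query_vector_store(client, "vs-1", "how do I deploy?", 0.72))
    # Inspect the recorded call and check the forwarded threshold.
    kwargs = client.vector_io.query.call_args.kwargs
    assert kwargs["params"]["score_threshold"] == 0.72


test_custom_cutoff_forwarded()
```

`AsyncMock` records awaited calls just like `Mock` records plain ones, so `call_args.kwargs` is enough to assert the threshold was forwarded.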
Force-pushed 7187d0b to 8e966b0.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/byok_guide.md`:
- Around line 84-94: Remove the empty line between the two adjacent blockquote
note blocks so they are back-to-back (i.e., merge the `> [!NOTE]` block about
OKP/BYOK `score_multiplier` and chunk constants with the following `> [!NOTE]`
block about `relevance_cutoff_score`) to satisfy MD028; locate the two `>
[!NOTE]` paragraphs in the byok guide and delete the blank line separating them
so markdownlint no longer reports the blockquote-blank rule.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: dc0b1607-f348-4610-b49a-bace5ce7b17c
📒 Files selected for processing (8)
docs/byok_guide.md, docs/openapi.json, src/constants.py, src/models/config.py, src/utils/vector_search.py, tests/unit/models/config/test_byok_rag.py, tests/unit/models/config/test_dump_configuration.py, tests/unit/utils/test_vector_search.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: spectral
- GitHub Check: Pylinter
- GitHub Check: build-pr
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E Tests for Lightspeed Evaluation job
🔇 Additional comments (4)

src/constants.py (1)

196-198: Good addition of shared BYOK relevance cutoff constant.
Typed with `Final[float]`, clearly named, and placed in the correct constants section.

tests/unit/models/config/test_dump_configuration.py (2)

11-11: Import update looks correct and consistent with this test module.

1028-1028: Expected dump payload correctly includes `relevance_cutoff_score`.
This strengthens coverage for the new BYOK RAG config field serialization path.

docs/openapi.json (1)

11810-11815: Default value is correct.
The default value of `0.3` matches the `DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE` constant from `src/constants.py`. The schema definition with `exclusiveMinimum: 0.0` correctly enforces positive values, and the description accurately reflects the filtering behavior.
Force-pushed 8e966b0 to fee7578.
Actionable comments posted: 1
♻️ Duplicate comments (1)
docs/byok_guide.md (1)
89-89: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win — Remove the blank blockquote line to satisfy markdownlint MD028.
The empty blockquote line between the two adjacent `> [!NOTE]` blocks still triggers the MD028 (no-blanks-blockquote) rule violation that was previously flagged.

📝 Proposed fix

> context, set the `BYOK_RAG_MAX_CHUNKS` and `OKP_RAG_MAX_CHUNKS` constants in `src/constants.py`
> (defaults: 10 and 5 respectively). For Tool RAG, use `TOOL_RAG_MAX_CHUNKS` (default: 10).
->
> [!NOTE]
> `relevance_cutoff_score` applies to Inline RAG only. When the model uses Tool RAG (`file_search`),

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/byok_guide.md` at line 89, remove the empty blockquote line between the two adjacent blockquote markers so MD028 is satisfied: find the consecutive lines starting with "> [!NOTE]" and delete the blank line (the standalone ">" empty blockquote) separating them so the two note blocks are directly adjacent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/openapi.json`:
- Around line 11809-11815: Regenerate the out-of-sync OpenAPI schema file
docs/openapi.json by running the project script
scripts/generate_openapi_schema.py (via the project's runner e.g., `uv run`)
instead of editing the JSON manually; after regeneration verify the
relevance_cutoff_score default in docs/openapi.json matches the
DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE constant defined in src/constants.py,
then commit the regenerated docs/openapi.json.
---
Duplicate comments:
In `@docs/byok_guide.md`:
- Line 89: Remove the empty blockquote line between the two adjacent blockquote
markers so MD028 is satisfied: find the consecutive lines starting with ">
[!NOTE]" and delete the blank line (the standalone ">"/empty blockquote)
separating them so the two note blocks are directly adjacent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: d4d39bf3-18f3-4c52-9c75-cdb4a6f534b2
📒 Files selected for processing (8)
docs/byok_guide.md, docs/openapi.json, src/constants.py, src/models/config.py, src/utils/vector_search.py, tests/unit/models/config/test_byok_rag.py, tests/unit/models/config/test_dump_configuration.py, tests/unit/utils/test_vector_search.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: build-pr
- GitHub Check: integration_tests (3.12)
- GitHub Check: Pylinter
- GitHub Check: unit_tests (3.12)
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E Tests for Lightspeed Evaluation job
🪛 GitHub Actions: OpenAPI (Spectral) / 0_spectral.txt
docs/openapi.json
[error] 1-1: docs/openapi.json is out of date. Step checks openapi schema via uv run python scripts/generate_openapi_schema.py /tmp/openapi-generated.json and diff -u docs/openapi.json /tmp/openapi-generated.json; diff failed. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json.
🪛 GitHub Actions: OpenAPI (Spectral) / spectral
docs/openapi.json
[error] 1-1: docs/openapi.json is out of date. Diff detected between docs/openapi.json and generated /tmp/openapi-generated.json. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json
🔇 Additional comments (11)

src/constants.py (1)
196-197: LGTM!

src/models/config.py (1)
1721-1727: LGTM!

tests/unit/models/config/test_dump_configuration.py (1)
11-11: LGTM! Also applies to: 1028-1028

tests/unit/models/config/test_byok_rag.py (1)
7-7: LGTM! Also applies to: 39-39, 59-59, 68-68, 208-220

src/utils/vector_search.py (2)
29-44: LGTM!
196-223: LGTM! Also applies to: 434-434

docs/byok_guide.md (1)
82-82: LGTM! Also applies to: 297-298, 303-306, 333-336, 586-586

tests/unit/utils/test_vector_search.py (4)
3-5: LGTM! Also applies to: 26-26, 31-54
460-462: LGTM! Also applies to: 502-504, 530-535, 546-548, 552-554, 882-884
591-629: LGTM!
631-674: LGTM! Also applies to: 676-729
Description
Added a "relevance_cutoff_score" configuration variable, a positive float, to the byok_rag section of the Lightspeed Stack configuration.
When a RAG chunk is retrieved from a BYOK database and its relevance score is less than the cutoff, the chunk is dropped from further consideration. Specifically, this cutoff filtering happens before score multipliers are applied.
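That ordering can be sketched as follows; the function and the use of `>=` for the comparison are illustrative assumptions, not the project's actual implementation:

```python
def filter_and_scale_chunks(
    chunks: list[tuple[str, float]],
    relevance_cutoff_score: float,
    score_multiplier: float,
) -> list[tuple[str, float]]:
    """Drop chunks below the raw-similarity cutoff, then apply the multiplier.

    Because the cutoff compares against the raw score, a store's
    score_multiplier cannot rescue a chunk that failed the threshold.
    """
    kept = [
        (text, score)
        for text, score in chunks
        if score >= relevance_cutoff_score
    ]
    return [(text, score * score_multiplier) for text, score in kept]
```

For example, with a cutoff of 0.3 and a multiplier of 2.0, a chunk scored 0.2 is dropped before scaling even though 0.2 × 2.0 would have cleared the threshold.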
For now, handling of relevance_cutoff_score was added only to the inline BYOK RAG, as adding it to the tool BYOK RAG would require changes in Llama Stack.
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
New Features
Documentation
API
Tests