- Disable `reportIncompatibleVariableOverride` globally so the Pydantic v2 pattern (`Paper.item_type: Literal["paper"]` narrowing `KnowledgeItem.item_type: str`) does not trigger spurious type errors.
- Make `quantmind/config/registry._discover_flows_in_path` resilient to `OSError` from `Path.rglob` so the local verify loop is reliable on macOS (AppTranslocation tmpdirs can crash mid-scan). Code is transitional and gets deleted in PR5.
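A minimal sketch of what "resilient to `OSError` from `Path.rglob`" could look like. The function name `discover_py_files`, the `*.py` pattern, and the keep-what-was-found policy are assumptions for illustration, not the actual body of `_discover_flows_in_path`.

```python
import tempfile
from pathlib import Path


def discover_py_files(root: Path) -> list[Path]:
    """Collect *.py files under root, tolerating an OSError mid-scan."""
    found: list[Path] = []
    try:
        # rglob is lazy, so an OSError can surface while iterating,
        # not only when the generator is created.
        for path in root.rglob("*.py"):
            found.append(path)
    except OSError:
        # Assumed policy: keep whatever was found so far and stop scanning.
        pass
    return sorted(found)


# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "pkg").mkdir()
    (base / "pkg" / "flow_a.py").write_text("")
    (base / "notes.txt").write_text("")
    names = [p.name for p in discover_py_files(base)]
    missing = discover_py_files(base / "does_not_exist")
```

Wrapping the loop rather than the `rglob` call matters here: the call itself only builds a generator, so a `try` around it alone would not catch a failure that happens several directories into the scan.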
Replaces the flat KnowledgeItem hierarchy with BaseKnowledge + three sibling shapes:

- FlattenKnowledge: atomic cards (News, Earnings, PaperKnowledgeCard)
- TreeKnowledge: hierarchical artifacts (Paper)
- GraphKnowledge: placeholder; subclassing blocked until shape finalises

BaseKnowledge gains typed provenance (SourceRef / ExtractionRef instead of bare strings), an auto UUID id, schema_version, created_at, and an embedding_text() contract that subclasses MUST override so the future store layer knows what to embed. Citations grow optional tree_id / node_id anchors for tree-rooted citations.

Paper is now a TreeKnowledge (sectioned paper); the previous flat Paper becomes PaperKnowledgeCard (the distilled summary card pointing at a paper via paper_id). News and Earnings remain flat and reparent to FlattenKnowledge unchanged in domain payload.

Tests cover the new shapes end-to-end (45 assertions across base, tree, graph, paper, news, earnings); the wider verify loop stays green at 259 tests, 68.4% coverage.

Storage layer (KnowledgeStore Protocol + SQLite + sqlite-vec backend) is specified but lands separately so this PR remains schema-only.
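The two class-level rules described above (subclasses must override `embedding_text()`, and `GraphKnowledge` cannot be subclassed yet) can be sketched with plain Python classes. This is an illustration under assumptions: the real classes are Pydantic models, and how the override is enforced there (the sketch raises from the base method) is not stated in this PR. The `headline` field and the `CausalGraph` name are hypothetical.

```python
class BaseKnowledge:
    def embedding_text(self) -> str:
        # Assumed enforcement: the base method refuses to answer.
        raise NotImplementedError(
            f"{type(self).__name__} must override embedding_text()"
        )


class FlattenKnowledge(BaseKnowledge):
    """Atomic-card shape; intentionally adds nothing of its own."""


class News(FlattenKnowledge):
    def __init__(self, headline: str) -> None:
        self.headline = headline  # hypothetical payload field

    def embedding_text(self) -> str:
        return self.headline


class GraphKnowledge(BaseKnowledge):
    def __init_subclass__(cls, **kwargs):
        # Runs when a subclass is being created, so the class statement fails.
        raise NotImplementedError(
            "GraphKnowledge is a placeholder; subclassing is blocked"
        )


try:
    class CausalGraph(GraphKnowledge):  # hypothetical subclass
        pass
except NotImplementedError as exc:
    blocked = str(exc)
else:
    blocked = ""
```

`GraphKnowledge` itself can still be imported and used in type hints; only an attempt to derive from it fails, and it fails at class-definition time rather than at first use.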
…dtrip tests

Polishes PR3 ahead of merge with four small additions surfaced in review:

- `BaseKnowledge.is_extracted()` / `freshness(now=None)` / `with_tags(*tags)` shared helpers (every shape benefits): provenance check, staleness measurement, and frozen-friendly tag append. `with_tags` is idempotent on duplicates so callers do not have to dedup themselves.
- `Citation.tree_id` / `node_id` are now exercised by a JSON round-trip test that proves UUID anchors survive serialisation.
- `Factor` and `Thesis` ship as stubs (FlattenKnowledge subclasses) so ``from quantmind.knowledge import Factor, Thesis`` works today; the full payloads land with their respective flows.
- New ``test_roundtrip.py`` exercises ``model_dump_json`` → ``model_validate_json`` on every concrete subclass, including ``Paper.nodes: dict[UUID, TreeNode]`` whose JSON-stringified keys must rehydrate back to ``UUID`` keys for the SDK ``output_type=`` contract.

`FlattenKnowledge` itself stays empty intentionally: its subclasses share no payload fields, so any "common method" would be hollow. Cross-shape helpers belong on `BaseKnowledge` instead.

`typing_extensions` (already pulled in transitively by Pydantic) is used for ``Self`` so ``with_tags`` returns the correct subclass type on Python 3.10.
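Two of the behaviours above can be illustrated with the standard library alone: an idempotent, frozen-friendly `with_tags`, and UUID dictionary keys surviving a JSON round trip. This is a sketch under assumptions. The `Card` dataclass stands in for the Pydantic models, the return annotation is the concrete class rather than `Self`, and the manual key conversion shows the problem the round-trip test guards against rather than how Pydantic solves it.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, replace
from uuid import UUID, uuid4


@dataclass(frozen=True)
class Card:  # hypothetical stand-in for a frozen knowledge model
    title: str
    tags: tuple[str, ...] = ()

    def with_tags(self, *tags: str) -> Card:
        """Return a copy with new tags appended; duplicates are skipped."""
        merged = list(self.tags)
        for tag in tags:
            if tag not in merged:
                merged.append(tag)
        # replace() builds a new instance, so the frozen original is untouched.
        return replace(self, tags=tuple(merged))


card = Card("Q3 earnings", tags=("macro",))
tagged = card.with_tags("macro", "rates", "rates")

# JSON object keys are always strings, so UUID keys must be converted
# on the way out and rebuilt on the way back in.
nodes = {uuid4(): "Introduction", uuid4(): "Method"}
payload = json.dumps({str(key): value for key, value in nodes.items()})
restored = {UUID(key): value for key, value in json.loads(payload).items()}
```

After the round trip, `restored` compares equal to `nodes` and its keys are `UUID` instances again, which is the property the commit message says `Paper.nodes` must keep.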
Summary
Lands `quantmind/knowledge/` as a data standard with three shapes plus the `quantmind/configs/` skeleton. The knowledge schema is the contract every downstream module (flows, utils, retrieval, future `KnowledgeStore`) builds on; this PR ships schema only, and the storage layer is specified but lands separately.

`quantmind/knowledge/`: three-shape data standard

- `BaseKnowledge`: root for every shape. Carries `id` (auto UUID), `item_type`, `schema_version`, `as_of` (mandatory), `created_at`, `source: SourceRef` (typed provenance, no bare strings), `extraction: ExtractionRef | None`, `confidence`, `citations`, `tags`, `disclaimers`, plus an `embedding_text()` contract that subclasses MUST override so the future store knows what to embed.
- `FlattenKnowledge`: atomic-card shape. Subclasses: `News`, `Earnings`, `PaperKnowledgeCard`.
- `TreeKnowledge`: hierarchical-artifact shape. Holds `root_node_id` plus a flat `nodes: dict[UUID, TreeNode]` map. Helpers: `root()`, `children_of()`, `walk_dfs()`, `find_path()`. Default `embedding_text()` delegates to root. Subclass: `Paper` (whole paper as section tree).
- `GraphKnowledge`: placeholder. The class exists for type-hinting `BaseKnowledge | FlattenKnowledge | TreeKnowledge | GraphKnowledge`, but `__init_subclass__` raises `NotImplementedError` until the shape is finalised in a later PR.

`Citation` gains optional `tree_id` / `node_id` anchors for tree-rooted citations.

The full design rationale (PageIndex-style navigation, embedding as pre-filter not replacement, future SQLite + `sqlite-vec` backend, Python-interface contract for `KnowledgeStore`) is captured in the local design spec.

`quantmind/configs/`

Unchanged from the original PR3: `BaseFlowCfg` + `BaseInput` plus per-flow `<Name>FlowCfg` and `<Name>Input` discriminated unions for paper / news / earnings.

Other changes

- `openai-agents>=0.14` added as a hard dep so `BaseFlowCfg.model_settings: ModelSettings | None` is honoured.
- `import-linter` contracts: `knowledge` is a leaf; `configs` may only depend on `knowledge`.
- `quantmind/config/registry._discover_flows_in_path` made resilient to `OSError` from `Path.rglob` (transitional code; deleted in PR5).
- `basedpyright`'s `reportIncompatibleVariableOverride` disabled so the Pydantic v2 idiom of narrowing a `str` field to `Literal[...]` in subclasses (used throughout the discriminator pattern) does not produce spurious errors.
- Old `quantmind/models/{content,paper,analysis}.py` and `quantmind/config/*` stay in place; transitional `parsers/`, `sources/`, `flow/`, `llm/` still depend on them and they get deleted in PR4-PR5.

Part of #71.
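The `TreeKnowledge` helpers listed in the summary can be sketched over a flat `nodes` map roughly as follows. This is an illustration, not the PR's implementation: it uses dataclasses instead of Pydantic, and the `TreeNode` fields `title`, `parent_id`, and `child_ids`, as well as the choice to embed the root's title, are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterator
from uuid import UUID, uuid4


@dataclass
class TreeNode:  # field names below are assumed for illustration
    title: str
    id: UUID = field(default_factory=uuid4)
    parent_id: UUID | None = None
    child_ids: list[UUID] = field(default_factory=list)


@dataclass
class Tree:
    root_node_id: UUID
    nodes: dict[UUID, TreeNode]

    def root(self) -> TreeNode:
        return self.nodes[self.root_node_id]

    def children_of(self, node_id: UUID) -> list[TreeNode]:
        return [self.nodes[cid] for cid in self.nodes[node_id].child_ids]

    def walk_dfs(self) -> Iterator[TreeNode]:
        # Pre-order traversal with an explicit stack; children are pushed
        # in reverse so they are visited in their stored order.
        stack = [self.root_node_id]
        while stack:
            node = self.nodes[stack.pop()]
            yield node
            stack.extend(reversed(node.child_ids))

    def find_path(self, node_id: UUID) -> list[TreeNode]:
        # Follow parent links up to the root, then flip to root-first order.
        path: list[TreeNode] = []
        current: UUID | None = node_id
        while current is not None:
            node = self.nodes[current]
            path.append(node)
            current = node.parent_id
        path.reverse()
        return path

    def embedding_text(self) -> str:
        return self.root().title  # assumed meaning of "delegates to root"


paper = TreeNode("Paper")
intro = TreeNode("Introduction", parent_id=paper.id)
method = TreeNode("Method", parent_id=paper.id)
data = TreeNode("Data", parent_id=method.id)
paper.child_ids = [intro.id, method.id]
method.child_ids = [data.id]
tree = Tree(paper.id, {n.id: n for n in (paper, intro, method, data)})

dfs_titles = [n.title for n in tree.walk_dfs()]
path_titles = [n.title for n in tree.find_path(data.id)]
```

Keeping `nodes` as a flat map keyed by id means every helper is a dictionary lookup plus id-following, and the whole tree serialises as one JSON object without nested recursion.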
Test plan
- `bash scripts/verify.sh` passes locally (259 tests, coverage 68.43% ≥ 60%)
- `python -c "from quantmind.knowledge import Paper, PaperKnowledgeCard, News, Earnings, FlattenKnowledge, TreeKnowledge, GraphKnowledge; print('OK')"` prints `OK`

🤖 Generated with Claude Code
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com