Consolidate DB access behind DatabaseInterface and add orchestrator #61

Open
thedeepaksengar wants to merge 3 commits into King-s-Knowledge-Graph-Lab:CodeSplit from thedeepaksengar:CodeSplit

thedeepaksengar commented Apr 19, 2026

Consolidated leaked DB queries. There were 33 raw pymongo calls scattered
across 6 files (functions.py, custom_decorators.py, info.py, queue_manager.py,
ProVe_main_service.py, background_processing.py, ProVe_heuristic_service.py).
All of them now go through methods on MongoDBHandler. Added 16 new methods
(get_latest_status_by_qid, get_html_by_task_id, upsert_summary_by_id,
log_usage, etc). Also folded StatsDBHandler and TMPStatsDBHandler subclasses
back into MongoDBHandler since they were just the same class pointing at
different DB names — the new handler lazily opens the usage DBs on demand.
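
A minimal sketch of the lazy-open idea described above, not the actual MongoDBHandler code. The class name LazyDatabases, the opener callable, and the database name "usage_stats" are hypothetical; in the real handler the opener would presumably be a pymongo client lookup.

```python
from typing import Any, Callable, Dict, List


class LazyDatabases:
    """Open named database handles on first use and reuse them afterwards."""

    def __init__(self, opener: Callable[[str], Any]) -> None:
        # `opener` is a stand-in (assumption) for something like `client[name]`.
        self._opener = opener
        self._handles: Dict[str, Any] = {}

    def get(self, name: str) -> Any:
        # Only call the opener the first time a given name is requested.
        if name not in self._handles:
            self._handles[name] = self._opener(name)
        return self._handles[name]


# Demonstration with a fake opener that records every open call.
opened: List[str] = []


def fake_opener(name: str) -> Dict[str, str]:
    opened.append(name)
    return {"db": name}


dbs = LazyDatabases(fake_opener)
first = dbs.get("usage_stats")   # hypothetical DB name; opens the handle
second = dbs.get("usage_stats")  # served from the cache, no second open
```

With this shape, a process that never touches the usage databases never opens them, which is the behaviour the separate subclasses could not offer.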

Added the DB orchestrator / interface layer. New prove_shared/database/
subpackage with:

  • interface.py — DatabaseInterface that MongoDBHandler (and the future
    PostgreSQLHandler) implement
  • postgres.py — stub for the Postgres implementation, every method raises
    NotImplementedError with notes on what the SQL equivalent should look like
  • orchestrator.py — DatabaseOrchestrator + get_database() factory, reads
    the active backend from config.yaml
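
For orientation, the interface-plus-stub arrangement could look roughly like the sketch below. The method names get_latest_status_by_qid and log_usage come from this PR, but their parameters and return types here are assumptions, and the real interface has many more methods.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class DatabaseInterface(ABC):
    """Backend-agnostic contract; the signatures below are assumed."""

    @abstractmethod
    def get_latest_status_by_qid(self, qid: str) -> Optional[Dict[str, Any]]:
        """Return the newest status document for `qid`, or None."""

    @abstractmethod
    def log_usage(self, endpoint: str) -> None:
        """Record one usage event for `endpoint`."""


class PostgreSQLHandler(DatabaseInterface):
    """Stub: every method raises NotImplementedError naming itself."""

    def get_latest_status_by_qid(self, qid: str) -> Optional[Dict[str, Any]]:
        # The note on the SQL equivalent is illustrative only.
        raise NotImplementedError(
            "PostgreSQLHandler.get_latest_status_by_qid is not implemented; "
            "expected shape: SELECT ... WHERE qid = %s ORDER BY ... DESC LIMIT 1"
        )

    def log_usage(self, endpoint: str) -> None:
        raise NotImplementedError(
            "PostgreSQLHandler.log_usage is not implemented; "
            "expected shape: INSERT INTO usage_log (...) VALUES (...)"
        )
```

Because the stub implements every abstract method, it can be instantiated and checked for interface compliance before any SQL exists.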

All 6 callers now do db = get_database() instead of MongoDBHandler().
Switching DBs is config-only:

database:
  primary: mongo         # or postgres
  fallback: none         # or mongo / postgres
  mode: single           # or dual-write (during migration)
  auto_fallback_on_read: false
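
As an illustration of how the database section above might be validated before a backend is chosen, here is a sketch that works on an already-parsed config dict. The names DatabaseSettings and parse_database_settings, the defaults, and the rule that dual-write needs a fallback are all assumptions, not the PR's actual code.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

VALID_BACKENDS = {"mongo", "postgres"}
VALID_MODES = {"single", "dual-write"}


@dataclass(frozen=True)
class DatabaseSettings:
    primary: str
    fallback: Optional[str]  # None when the config says "none"
    mode: str
    auto_fallback_on_read: bool


def parse_database_settings(config: Mapping[str, Any]) -> DatabaseSettings:
    """Validate the `database` section of a parsed config mapping."""
    section = config["database"]
    primary = section["primary"]
    fallback = section.get("fallback", "none")  # default is an assumption
    mode = section.get("mode", "single")        # default is an assumption

    if primary not in VALID_BACKENDS:
        raise ValueError(f"unknown primary backend: {primary!r}")
    if fallback not in VALID_BACKENDS | {"none"}:
        raise ValueError(f"unknown fallback backend: {fallback!r}")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    # Assumed rule: dual-write only makes sense with a second backend.
    if mode == "dual-write" and fallback == "none":
        raise ValueError("dual-write mode requires a fallback backend")

    return DatabaseSettings(
        primary=primary,
        fallback=None if fallback == "none" else fallback,
        mode=mode,
        auto_fallback_on_read=bool(section.get("auto_fallback_on_read", False)),
    )


# The example config from this PR, as it would look once loaded.
settings = parse_database_settings(
    {
        "database": {
            "primary": "mongo",
            "fallback": "none",
            "mode": "single",
            "auto_fallback_on_read": False,
        }
    }
)
```

Failing fast on an unknown backend or mode keeps a typo in config.yaml from silently selecting the wrong database.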

Review-feedback changes on top of the initial PR.

  • Moved mongo_handler.py into database/ alongside postgres.py for symmetry. Updated every import site; nothing references the old path.
  • Restored full Google-style docstrings (Args/Returns/Raises) on every method, new and existing — so Sphinx/pdoc autodoc picks them up.
  • Renamed the module-level mongo_handler variable to database_handler everywhere (it may now hold a PostgreSQLHandler or an orchestrator, so the old name was misleading). Likewise, the local variables mongo_status / mongo_statuses are now status_doc / status_docs.

Added unit tests. 116 tests, all passing:

  • test_mongo.py — Mongo-specific unit tests for all 16 new methods (mock pymongo, verify call translation) plus the refactored requestItemProcessing helper.
  • test_database_contract.py — backend-agnostic contract tests via a pytest fixture. Currently parametrised over Mongo only; when Postgres lands we add its fixture and the whole suite auto-runs against both.
  • test_orchestrator.py — single / dual-write / read-fallback routing, log-and-continue semantics, claim-op primary-only, factory caching.
  • test_postgres.py — stub interface compliance + every stub method raises NotImplementedError with a message pointing at itself.
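
The routing behaviours that test_orchestrator.py covers can be pictured with the sketch below. It is one reading of "log-and-continue", "read-fallback" and "claim-op primary-only", not the real DatabaseOrchestrator: the class OrchestratorSketch, its write/read/claim methods, and the fake backends are hypothetical.

```python
import logging
from typing import Any, List, Optional

logger = logging.getLogger("orchestrator_sketch")


class OrchestratorSketch:
    def __init__(self, primary: Any, fallback: Optional[Any] = None,
                 mode: str = "single", auto_fallback_on_read: bool = False) -> None:
        self.primary = primary
        self.fallback = fallback
        self.mode = mode
        self.auto_fallback_on_read = auto_fallback_on_read

    def write(self, method: str, *args: Any, **kwargs: Any) -> Any:
        # The primary write must succeed; its result is what callers see.
        result = getattr(self.primary, method)(*args, **kwargs)
        if self.mode == "dual-write" and self.fallback is not None:
            try:
                getattr(self.fallback, method)(*args, **kwargs)
            except Exception:
                # Log-and-continue: a secondary failure never fails the call.
                logger.exception("secondary write %s failed", method)
        return result

    def read(self, method: str, *args: Any, **kwargs: Any) -> Any:
        try:
            return getattr(self.primary, method)(*args, **kwargs)
        except Exception:
            if self.auto_fallback_on_read and self.fallback is not None:
                logger.exception("primary read %s failed, using fallback", method)
                return getattr(self.fallback, method)(*args, **kwargs)
            raise

    def claim(self, method: str, *args: Any, **kwargs: Any) -> Any:
        # Claim operations go to the primary only, so a task cannot be
        # claimed twice through two backends.
        return getattr(self.primary, method)(*args, **kwargs)


class GoodBackend:
    def __init__(self) -> None:
        self.saved: List[Any] = []

    def save(self, doc: Any) -> str:
        self.saved.append(doc)
        return "ok"

    def load(self) -> str:
        return "from-good"


class BrokenBackend:
    def save(self, doc: Any) -> str:
        raise RuntimeError("backend down")

    def load(self) -> str:
        raise RuntimeError("backend down")
```

For example, `OrchestratorSketch(GoodBackend(), BrokenBackend(), mode="dual-write").write("save", {"qid": "Q1"})` still returns the primary's result and only logs the secondary failure.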

Bugs fixed along the way.

  • Race condition in get_summary — the insert-or-update branch now uses upsert=True atomically.
  • Race condition in retry_processing — read-modify-write of retry_count replaced with $inc.
  • process_top_viewed_items / process_pagepile_list were passing 'top_viewed' / 'pagepile_weekly_update' as the queue arg instead of as request_type.
  • custom_decorators.py was opening a fresh Mongo connection on every HTTP request.
  • Latent type-hint bug found by the new tests: pymongo.collection is a module, not a class (the class is pymongo.collection.Collection). Fixed throughout mongo.py.
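
The two race-condition fixes and the type-hint fix share a simple shape, sketched below under assumptions: the function names, the "_id" and "task_id" filter keys, and the "$set" payload are illustrative, while update_one with upsert=True, the $inc operator, the retry_count field and the pymongo.collection.Collection class are as named above.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Any, Dict

if TYPE_CHECKING:
    # The class lives in the pymongo.collection module; annotating with the
    # module itself (pymongo.collection) is the latent bug described above.
    from pymongo.collection import Collection


def upsert_summary(collection: Collection, summary_id: str,
                   fields: Dict[str, Any]) -> None:
    """Insert or update in one server-side operation (no find-then-write)."""
    collection.update_one({"_id": summary_id}, {"$set": fields}, upsert=True)


def increment_retry_count(collection: Collection, task_id: str) -> None:
    """Bump retry_count atomically instead of read-modify-write in Python."""
    collection.update_one({"task_id": task_id}, {"$inc": {"retry_count": 1}})
```

In both cases the check and the write happen in a single database operation, so two concurrent workers cannot interleave between a read and the following write.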

Collaborator

NathanGavenski left a comment

Hey @thedeepaksengar, great PR. Let me know when you have gone through the comments so we can have a short meeting.

Comment thread prove-shared/src/prove_shared/database/mongo.py
Comment thread prove-shared/src/prove_shared/mongo_handler.py
Comment thread prove-api/custom_decorators.py
Comment thread prove-api/functions.py Outdated
Comment thread prove-api/functions.py Outdated
Comment thread prove-api/functions.py
Comment thread prove-processing/background_processing.py Outdated
Comment thread prove-processing/background_processing.py
Comment thread prove-processing/ProVe_heuristic_service.py

2 participants