
chore: sync upstream plastic-labs/honcho main #2

Closed
offendingcommit wants to merge 26 commits into main from upstream-sync

Conversation

@offendingcommit (Owner)

Summary

  • Syncs upstream plastic-labs/honcho main into this fork's main
  • Brings in the new src/llm/ backend package (replaces src/utils/clients.py), new honcho-cli, docs updates, and all other upstream work

Notes

  • Expect merge conflicts against local fork-only commits (CF Gateway auth header, Gemini thought_signature fix, LM Studio/Prometheus/Traefik stack, dreamer specialist overrides) — these were written against the old src/utils/clients.py and will need to be re-ported onto the new src/llm/ abstraction.

Test plan

  • Resolve conflicts, port fork-only features to src/llm/ backends
  • uv run pytest tests/
  • uv run ruff check src/ and uv run basedpyright
  • Local docker compose smoke test with LM Studio + CF gateway paths

LubaKaper and others added 26 commits April 6, 2026 17:06
…#495)

* chore: add .worktrees/ to .gitignore

* feat(examples): add Zo Computer memory skill integration

* feat(examples): add Zo Computer memory skill integration

* fix(examples): address CodeRabbit review on Zo skill integration

  - Fix version inconsistency: SKILL.md matches pyproject.toml (>=2.1.0)
  - Move client.py into tools/ package and use relative imports
  - Add assistant_id parameter to save_memory() for consistency with get_context()
  - Use UUID-based IDs in tests to prevent state leakage between runs
  - Add pytest.mark.skipif guard on integration tests (requires HONCHO_API_KEY)
  - Fix import ordering, move pytest to module level, sort __all__ alphabetically
  - Fix markdown blank lines around fenced code blocks (MD031)
  - Add rate limit delay fixture to avoid hitting Honcho free tier limits

* fix(examples): validate HONCHO_API_KEY early in client initialization

* docs(examples): note cross-peer memory behavior in shared workspaces

* docs(examples): fix save_memory and query_memory signatures in README

* docs(examples): fix markdown linting issues in README

* docs(examples): add assistant_id parameter to save_memory example in
  SKILL.md

---------

Co-authored-by: Luba Kaper <lubakaper@lubas-air.mynetworksettings.com>
…fig guide (plastic-labs#510)

* fix: Inconsistencies in Docs, health endpoint, troubleshooting guide

* fix: (docs) maintain consistency on postgres db name

* chore: (docs) update v2 contributing docs with updates db paths

* docs: overhaul self-hosting docs for provider-agnostic setup

- .env.template: lead with provider options (custom, vllm, google,
  anthropic, openai, groq) instead of baking in vendor-specific keys.
  All provider/model settings commented out so server fails fast until
  configured. Separate endpoint config from per-feature provider+model
  from tuning knobs.
- docker-compose.yml.example: fix healthcheck -d honcho -> -d postgres
  to match POSTGRES_DB=postgres.
- config.toml.example: reorder and document LLM key section with
  OpenRouter and vLLM examples.
- self-hosting.mdx: replace multi-vendor key table with provider options
  table. Add examples for OpenRouter, vLLM/Ollama, and direct vendor
  keys. Remove duplicated key lists from Docker/manual setup sections.
- configuration.mdx: replace scattered provider docs with provider types
  table. Fix Docker Compose snippet to match actual compose file. Note
  code defaults as fallback, not recommended path.
- troubleshooting.mdx: add alternative provider issues section (custom
  provider config, model name format, Docker localhost, structured
  output failures).

* docs: add Docker build troubleshooting for permission errors

- Document BuildKit requirement (RUN --mount syntax)
- AppArmor/SELinux blocking Docker builds on Linux
- Volume mount UID mismatch between host and container app user
- Note in self-hosting docs that Docker path builds from source

* docs: reframe self-hosting as contributor/dev path, point to cloud service

* Revert "docs: reframe self-hosting as contributor/dev path, point to cloud service"

This reverts commit 3e766eb.

* docs: add production compose, model guidance, thinking budget docs

- Add docker-compose.prod.yml for VM/server deployment: no source
  mounts, restart policies, 127.0.0.1-bound ports, cache enabled
- Add model tier guidance and community quick-start link to self-hosting
- Document THINKING_BUDGET_TOKENS gotcha for non-Anthropic providers
- Add reverse proxy examples (Caddy + nginx) to production section
- Add backup/restore commands to production considerations

* docs: simplify self-hosting to single provider, restructure config guide

Self-hosting page now defaults to one OpenAI-compatible endpoint
with one model for all features. Moved model tiers, alternative
providers, and per-feature tuning into the configuration guide.
Eliminated duplicate config priority sections, dev/prod split,
and redundant TOML examples.

* docs: merge compose files, restore provider/model to feature sections in .env.template

Single docker-compose.yml.example with dev sections commented out.
Moved PROVIDER and MODEL back alongside each feature in .env.template
so settings stay colocated with their module. Updated self-hosting
docs to reference single compose file.

* fix: broken anchor links, redundant migration step, minor inconsistencies

Fix 4 broken internal links (#llm-provider-setup, #llm-api-keys,
#which-api-keys-do-i-need, #alternative-providers) to point to
correct headings. Remove redundant Docker migration step (entrypoint
already runs alembic). Fix cache URL missing ?suppress=true in
reference config. Fix uv install command to use official method.

* docs: env template ready to use, simplify self-hosting flow

.env.template now has provider/model lines uncommented with
placeholder values — user just sets endpoint, key, and model name.
Thinking budgets default to 0 for non-Anthropic providers.

Self-hosting page: removed 30-line env var wall, LLM setup now
points to the template. Merged duplicate verify sections.
Removed api_key from SDK examples (auth off by default).

* docs: reorder next steps, configuration guide first

* fix: default embedding provider to openrouter for single-endpoint setup

Without this, embeddings default to openai which requires a separate
LLM_OPENAI_API_KEY. Setting to openrouter routes embeddings through
the same OpenAI-compatible endpoint as everything else.

* fix: review issues — hermes page, thinking budget, production wording

Hermes integration page: replaced inline Docker/manual setup with
link to self-hosting guide, added elkimek community link. Removed
old env var names (OPENAI_API_KEY without LLM_ prefix).

Troubleshooting: removed "or 1" from thinking budget guidance.
Self-hosting: softened "production-ready" to "production-oriented"
since auth is disabled by default.

* docs: model examples in template, expanded LLM setup, better verify flow

.env.template: added "e.g. google/gemini-2.5-flash" hints next to
model placeholders so users know the expected format.

Self-hosting: expanded LLM Setup to show the 3 things users need to
set (endpoint, key, model name) with find-replace tip. Added build
time note, deriver log check, and real smoke test (create workspace)
to verify section. Health check now notes it doesn't verify DB/LLM.

* fix: smoke test uses v3 API path, not v1

* docs: clarify deriver metrics port vs Prometheus host port

* fix: remove deprecated memoryMode from hermes config example

* docs: update hermes page to match current memory provider config

Updated config to match hermes-agent docs: removed apiKey (not needed
for self-hosted), added hermes memory setup CLI command, added config
fields table (recallMode, writeFrequency, sessionStrategy, etc.).

Better verification tests: store-and-recall across sessions, direct
tool calling test. Links to upstream hermes docs for full field list.

* fix: invalid THINKING_BUDGET_TOKENS=0 and missing docker/ in image

Comment out THINKING_BUDGET_TOKENS=0 in .env.template — deriver,
summary, and dream validators require gt=0. Dialectic levels also
commented out since non-thinking models don't need the override.

Add COPY for docker/ directory in Dockerfile so entrypoint.sh is
available when docker-compose.yml.example references it.

* chore: Additional troubleshooting step

---------

Co-authored-by: Vineeth Voruganti <13438633+VVoruganti@users.noreply.github.com>
* fix: further remove extraneous transactions

* fix: (search) use 2 phase function to reduce un-needed transaction

* fix: refactor agent search to perform external operations before making a transaction

* fix: reduce scope of queue manager transaction

* fix: (bench) add concurrency to test bench

* fix: address review findings for search dedup, webhook idempotency, and bench throttling

* Fix Leakage in non-session-scoped chat call (plastic-labs#526)

* fix: (search) reduce scope for peer based searches

* fix: tests

* fix: (test) address coderabbit comment

* fix: drop db param from deliver_webhook

---------

Co-authored-by: Rajat Ahuja <rahuja445@gmail.com>
* chore: (docs) Update changelogs and version numbers

* chore: remove extraneous dep on mintlify
* Simplify Paperclip integration instructions

Clarified instructions for local Honcho setup and removed unnecessary details.


* Update docs.json

* Update links in Paperclip integration guide

* Revise memory initialization instructions in Paperclip guide

Updated instructions for initializing memory and removed optional checks section.
…ic-labs#530)

The HEALTHCHECK directive probes an HTTP endpoint that only the API
serves. The deriver service reuses this image but is a background queue
worker with no HTTP server — the probe can never succeed, so Docker
permanently marks the deriver container as unhealthy.

Remove the HEALTHCHECK from the shared image. Service-level health
checks belong in each service's own configuration (e.g. Kubernetes
readiness/liveness probes on the API Deployment only).

Closes plastic-labs#521

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…ation (plastic-labs#459)

* fix: Add JSON repair for truncated LLM responses across all providers and Gemini thinking budget support

LengthFinishReasonError from OpenAI-compatible providers (custom, openai, groq) was crashing the deriver
with 14k+ occurrences in production. The vLLM path already had repair logic but it was gated on
provider=="vllm", unreachable when routing through litellm as a custom provider.

- Extract shared _repair_response_model_json() helper for all providers
- Catch LengthFinishReasonError in OpenAI/custom parse() path and repair truncated JSON
- Add repair fallback to Anthropic and Gemini response_model paths
- Add repair fallback to Groq response_model path
- Pass thinking_budget_tokens to Gemini 2.5 models via thinking_config
- Add 14 tests covering repair paths for all providers and Gemini thinking budget

Fixes HONCHO-YC

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
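
A minimal sketch of the close-the-brackets strategy such a repair helper can use (an illustration only, not the actual _repair_response_model_json implementation):

```python
import json
from typing import Any


def repair_truncated_json(raw: str) -> dict[str, Any]:
    """Best-effort parse of JSON cut off by a max-token limit.

    Deliberately naive: ignores escaped quotes and braces inside string
    values, which a production repairer has to handle.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    candidate = raw.strip()
    if candidate.count('"') % 2 == 1:  # unterminated string literal
        candidate = candidate[: candidate.rfind('"')]
    candidate = candidate.rstrip().rstrip(",")
    # Close whatever objects/arrays are still open, innermost first.
    stack: list[str] = []
    for ch in candidate:
        if ch in "{[":
            stack.append(ch)
        elif ch in "}]" and stack:
            stack.pop()
    candidate += "".join("}" if ch == "{" else "]" for ch in reversed(stack))
    return json.loads(candidate)
```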

* feat: live llm integration tests

* feat: Consistent Model Config Protocol

* fix: migrate the remaining app callers off the legacy llm_settings path

* fix: Docs and regression tests

* fix: refactor llm runtime path to model-config-only API

* fix: refactor config to nested model-config source of truth

* fix: refactor llm streaming and tool dispatch through backends

* fix: cut over llm config to nested model_config only

* fix: collapse vllm and custom into openai_compatible transport

* feat: refactor llm config to explicit transports and bare model ids

* feat: (embed) Add configurability for embedding model

* fix: tests for embedding provider

* fix: Address Review Comments

* fix: (llm) remove Groq backend and per-vendor base URLs

* chore: move llm tests

* fix: (llm) address review findings — config regressions, backend bugs, dead code

* fix: address silly backend errors

* chore: (docs) update configuration and self-hosting guides

* chore: fix tests

* fix: address code rabbit comments

* fix: add validation to the dream settings

* fix: further address code rabbit comments

* fix: Address Code Rabbit Comments

* fix: Another round of code rabbit

* fix: Address Code Rabbit Nits

* fix: tests

* refactor: rename thinking validator to reflect transport scope

_validate_anthropic_thinking_minimum only enforces the >=1024 rule for
Anthropic and no-ops for other transports, so the name was misleading
now that it's shared across ConfiguredModelSettings, FallbackModelSettings,
and ModelConfig. Renamed to _validate_thinking_constraints with a docstring
clarifying per-transport behavior. No logic change.

* fix(config): drop transport-specific thinking params when env override changes transport

_fill_defaults_for_nested_field previously preserved the default MODEL_CONFIG's
thinking_budget_tokens/thinking_effort across a transport override. This leaked
Gemini-family defaults (e.g. thinking_budget_tokens=1024) into OpenAI-transport
overrides, and the OpenAI backend then correctly rejected the unsupported param
at call time (OpenAI uses reasoning.effort, not a token budget).

The helper now strips thinking_budget_tokens and thinking_effort from the
default dict when the env override supplies a transport different from the
default's. Explicit thinking params in the override are preserved.
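
A minimal sketch of the strip rule, with dict-merge names assumed for illustration rather than taken from the actual helper:

```python
from typing import Any

# Thinking params that only make sense on the transport they were set for.
TRANSPORT_THINKING_KEYS = ("thinking_budget_tokens", "thinking_effort")


def fill_defaults(default: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Merge a default MODEL_CONFIG dict under an env override.

    If the override switches transports, drop the default's thinking params
    so a Gemini token budget never leaks into an OpenAI-transport config.
    Thinking params set explicitly in the override still survive.
    """
    merged = dict(default)
    override_transport = override.get("transport")
    if override_transport and override_transport != default.get("transport"):
        for key in TRANSPORT_THINKING_KEYS:
            merged.pop(key, None)
    merged.update(override)
    return merged
```

For example, fill_defaults({"transport": "gemini", "thinking_budget_tokens": 1024}, {"transport": "openai"}) yields {"transport": "openai"} with no leaked budget.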

* fix(config): apply thinking-param strip to dialectic level merge too

DialecticSettings._merge_level_defaults does its own inline MODEL_CONFIG
merge (parallel to _fill_defaults_for_nested_field), so the previous fix
missed dialectic-level overrides. E.g. flipping
DIALECTIC_LEVELS__minimal__MODEL_CONFIG__TRANSPORT from gemini (default)
to openai still leaked the default thinking_budget_tokens=0 into the
openai config, which the OpenAI backend then rejected at call time.

The level-merge path now applies the same 'strip transport-specific
thinking params when transport changes' rule as the generic helper.
Added a regression test exercising the merge validator directly.

* refactor(llm): wire ModelConfig knobs through, prune clients.py migration leftovers

Three connected fixes to finish carving the LLM stack out of src/utils/clients.py
and into src/llm/:

1. Propagate ModelConfig tuning knobs into backend calls.
   honcho_llm_call_inner built extra_params from only {json_mode, verbosity},
   silently dropping top_p, top_k, frequency_penalty, presence_penalty, seed,
   and operator-supplied provider_params from any ModelConfig. Thread the
   selected config through ProviderSelection and merge
   build_config_extra_params(selected_config) into extra_params; per-call
   kwargs still win over provider_params defaults. Makes
   _build_config_extra_params public as build_config_extra_params so
   clients.py and request_builder.py share one translation. Adds
   TestModelConfigExtraParamsPropagation covering OpenAI/Anthropic knob
   propagation, provider_params passthrough, and per-call override
   precedence (see the sketch after this commit message).

2. Drop dead extract_openai_* duplicates in clients.py.
   extract_openai_reasoning_content, extract_openai_reasoning_details, and
   extract_openai_cache_tokens had no callers outside their own definitions
   — the live implementations live in src/llm/backends/openai.py. -103
   lines from clients.py.

3. Unify on ModelTransport, delete SupportedProviders.
   The "google" vs "gemini" split forced a _provider_for_model_config
   translation shim in two places. Replace all SupportedProviders usages
   with ModelTransport, rename CLIENTS["google"] → CLIENTS["gemini"],
   update provider branches + LLMError labels + reasoning-trace entries
   accordingly. Trace JSONL now writes "provider": "gemini" instead of
   "google" — consistent with the broader env-var rename cutover.

Also tidies up pre-existing basedpyright findings in tests/llm/test_model_config.py
(pydantic before-validator dict inputs + descriptor-proxy call).

ruff: clean. basedpyright: 0 errors, 0 warnings. Tests: 153/153 pass across
tests/utils/test_clients.py, tests/utils/test_length_finish_reason.py,
tests/llm/, tests/dialectic/, tests/deriver/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
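
A minimal sketch of the knob translation and precedence rule described in item 1 (attribute handling assumed; the real ModelConfig surface may differ):

```python
from typing import Any

TUNING_KNOBS = ("top_p", "top_k", "frequency_penalty", "presence_penalty", "seed")


def build_config_extra_params(config: Any) -> dict[str, Any]:
    """Translate ModelConfig tuning knobs into backend extra params."""
    params: dict[str, Any] = {
        knob: getattr(config, knob)
        for knob in TUNING_KNOBS
        if getattr(config, knob, None) is not None
    }
    # Operator-supplied provider_params ride along with the knobs.
    params.update(getattr(config, "provider_params", None) or {})
    return params


def merge_call_params(config: Any, per_call: dict[str, Any]) -> dict[str, Any]:
    merged = build_config_extra_params(config)
    merged.update(per_call)  # per-call kwargs still win over config defaults
    return merged
```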

* refactor(llm): finish the src/utils/clients.py → src/llm/ migration

honcho_llm_call_inner now delegates to request_builder.execute_completion
and execute_stream instead of re-implementing backend call scaffolding
inline. The new _effective_config_for_call helper carries per-call kwargs
(temperature, stop_seqs, thinking_budget_tokens, reasoning_effort) onto
the selected ModelConfig — or synthesizes a minimal config for the
test-only callers that pass provider+model directly. max_output_tokens
is zeroed on the effective config to preserve the current
"per-call max_tokens wins" semantic; honoring ModelConfig.max_output_tokens
is a separable correctness concern.

Side effect of routing through the new path: ConfiguredModelSettings'
thinking_budget_tokens validator now fires on synthesized configs.
test_anthropic_thinking_budget was asserting that a sub-1024 budget
propagated to Anthropic — bumped to 1024 to match what Anthropic actually
accepts.

Unified client construction. Promoted the cached client factories in
src/llm/__init__.py (get_anthropic_client, get_openai_client,
get_gemini_client, get_{anthropic,openai,gemini}_override_client) to
public API and added them to __all__. Promoted
credentials._default_transport_api_key → default_transport_api_key.
Deleted the duplicate _build_client and _default_credentials_for_provider
from clients.py; _client_for_model_config now falls through to the
public factories. CLIENTS dict and _get_backend_for_provider stay as the
mockable seam for the ~50 patch.dict(CLIENTS, {...}) test call sites.

Wired operator-configurable Gemini cached-content reuse end-to-end.
PromptCachePolicy moved from src/llm/caching.py into src/config.py so
ModelConfig can reference it as a field without a circular import;
caching.py re-exports the name for existing imports. Added
cache_policy: PromptCachePolicy | None on ConfiguredModelSettings,
FallbackModelSettings, ResolvedFallbackConfig, and ModelConfig.
resolve_model_config, _resolve_fallback_config, and
_select_model_config_for_attempt copy the field through.
honcho_llm_call_inner passes effective_config.cache_policy into
execute_completion / execute_stream, so operators opt in via
e.g. DERIVER_MODEL_CONFIG__CACHE_POLICY__MODE=gemini_cached_content
and the selection actually fires instead of sitting on a dead path.

New regression test test_cache_policy_reaches_gemini_backend asserts the
PromptCachePolicy object reaches the Gemini backend's extra_params.

ruff + basedpyright: clean. Tests: 154/154 pass across
tests/utils/test_clients.py, tests/utils/test_length_finish_reason.py,
tests/llm/, tests/dialectic/, tests/deriver/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(llm): move all LLM orchestration into src/llm/ and delete clients.py

The 1624-line src/utils/clients.py has been carved up into focused modules
under src/llm/ and deleted. There is now one golden path for LLM
orchestration and no dual entrypoint.

New module layout:

  src/llm/
    __init__.py       thin stable re-export surface
    api.py            public honcho_llm_call with retry + fallback + tool
                      loop delegation
    executor.py       honcho_llm_call_inner (single-call executor); bridges
                      to request_builder.execute_completion / execute_stream
    tool_loop.py      execute_tool_loop + stream_final_response, plus
                      assistant-tool-message and tool-result formatting
    runtime.py        AttemptPlan dataclass (replaces the loose
                      ProviderSelection NamedTuple), effective_config_for_call,
                      plan_attempt, per-retry temperature bump, attempt
                      ContextVar
    registry.py       single owner of CLIENTS dict + cached default and
                      override SDK-client factories + backend/history-adapter
                      selection + high-level get_backend(config)
    conversation.py   count_message_tokens, tool-aware message grouping,
                      truncate_messages_to_fit
    types.py          HonchoLLMCallResponse, HonchoLLMCallStreamChunk,
                      StreamingResponseWithMetadata, IterationData,
                      IterationCallback, ReasoningEffortType, VerbosityType,
                      ProviderClient
    request_builder.py low-level request assembly (ModelConfig → backend
                      complete/stream); no longer owns credential resolution
    credentials.py    default_transport_api_key, resolve_credentials
    caching.py        gemini_cache_store; re-exports PromptCachePolicy
                      from src.config
    backend.py        Protocol + normalized result types
    history_adapters.py provider-specific assistant/tool message shapes
    structured_output.py
    backends/         AnthropicBackend, OpenAIBackend, GeminiBackend

handle_streaming_response had no production callers; it is deleted. The
three tests that used it now drive honcho_llm_call_inner(stream=True,
client_override=...) directly, which exercises the same code path the
public API uses.

Dead credential passthrough removed. The ProviderBackend Protocol and
all three concrete backends no longer accept api_key / api_base — those
are baked into the underlying SDK client at registry construction time
and were being del'd everywhere they appeared. request_builder also
stops resolving and forwarding them.

Client construction is unified. The cached default-client factories
(get_anthropic_client, get_openai_client, get_gemini_client) and override
factories (get_*_override_client) are promoted to public API; the
module-level CLIENTS dict populates from them and remains the
patch.dict(CLIENTS, {...}) mocking seam tests rely on. Old duplicate
helpers (_build_client, _default_credentials_for_provider) are gone.
default_transport_api_key is promoted to public.

Application imports now come from src.llm (dreamer, dialectic, deriver,
summarizer, telemetry-adjacent tests). No code imports from
src.utils.clients anywhere in the repo.

ruff: clean. basedpyright: 0 errors, 0 warnings. Tests: 1013/1013 pass
across the entire non-infra test suite (excluding tests/unified,
tests/bench, tests/live_llm, tests/alembic).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(llm): sanitize tool schemas for Gemini's function_declarations validator

Gemini's native-transport function-declarations validator accepts a narrow
subset of JSON-Schema / OpenAPI: type, format, description, nullable, enum,
properties, required, items, minItems, maxItems, minimum, maximum, title.
Anything else — additionalProperties, allOf, if/then/else, $ref, anyOf,
oneOf, $defs, patternProperties — triggers an INVALID_ARGUMENT 400 at call
time.

Our agent tool schemas in src/utils/agent_tools.py use several of those
(additionalProperties: false, allOf + if/then conditionals) because they
were authored for OpenAI strict-mode + Anthropic, which need the richer
vocabulary. GeminiBackend._convert_tools was passing them straight through.

Add _sanitize_schema(): walks the parameters tree and drops unsupported
keywords while preserving semantics for the keywords that hold user data
(properties maps field-name → sub-schema; required / enum are lists of
literals; items is a single sub-schema). Other backends are untouched and
continue to receive the full strict schemas.
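
A minimal sketch of that recursive walk, using the allowlist quoted above (illustrative; the in-repo _sanitize_schema may differ in detail):

```python
from typing import Any

# Keywords Gemini's function_declarations validator accepts, per the
# commit message above.
GEMINI_ALLOWED_KEYS = {
    "type", "format", "description", "nullable", "enum", "properties",
    "required", "items", "minItems", "maxItems", "minimum", "maximum",
    "title",
}


def sanitize_schema(schema: dict[str, Any]) -> dict[str, Any]:
    """Drop JSON-Schema keywords Gemini rejects, preserving user data."""
    out: dict[str, Any] = {}
    for key, value in schema.items():
        if key not in GEMINI_ALLOWED_KEYS:
            continue  # additionalProperties, allOf, if/then, $ref, $defs, ...
        if key == "properties" and isinstance(value, dict):
            # Maps field-name -> sub-schema: recurse into the values only.
            out[key] = {name: sanitize_schema(sub) for name, sub in value.items()}
        elif key == "items" and isinstance(value, dict):
            out[key] = sanitize_schema(value)  # single sub-schema
        else:
            out[key] = value  # required / enum are lists of literals
    return out
```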

Regression tests:
- test_gemini_sanitize_schema_strips_unsupported_keywords: confirms
  additionalProperties, allOf + if/then, and $defs are stripped at nested
  levels while legitimate fields survive.
- test_gemini_convert_tools_sanitizes_parameters_schema: end-to-end
  _convert_tools output has no forbidden keys.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: fix tool calling syntax for gemini

* refactor(llm): normalize defaults, widen OpenAI reasoning-model routing

* chore: fix test

* fix(llm): address post-migration review feedback

* fix(llm): gemini robustness + dreamer specialist ergonomics

* chore: address review comments

* chore: (docs) unreleased changelog addition

* chore: (docs) merge commit changes

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Erosika <eri@plasticlabs.ai>
* feat: adding honcho-cli package

* feat: adding more support for command-level flags, also including workarounds for getting raw SDK info

* feat: adding peer config

* feat: adding setup commands

* chore: setting up package dependencies for cli

* feat: promote init/doctor to top-level + polish wizard

* feat: make init --yes fall back to existing config

* chore: updating documentation

* chore: updating tagline

* feat: structurally updating recommended settings for CLI

* fix: style

* fix: removing redundant describe method

* fix: delete key generation commands and fixing session ID

* fix: removing defaults and changing config write path.

* chore: paginating conclusions

* chore: require workspace

* fix: polish command surfaces — scoping, validation, perf, consistency

* chore: removing session message

* fix: CLI output shape, destructive-confirm previews, skip needless round-trips

* chore: CLI polish — peer inspect config, drop dead helper, doc/help consistency

* chore: update readme

* chore: updating tests

* chore: doc updates

* fix: config command

* chore: unused code

* fix: doctor command

* fix: removing quiet tag and fixing session key ordering

* fix: config commands and session id command

* fix: removing message_count

* fix: branding circular dependency

* fix: refactor lazy imports to use common.py correctly.

* fix: removing all lazy imports

* chore: cr fixes

* fix: config, env, flag setup

* chore: updating skill

* feat: adding workspace, session, and message create

* fix: init now supports local honcho

* chore: cr

* feat(cli): CLI surface polish — reasoning flag, peer-scoped messages, help sync

Add --reasoning/-r to peer chat (minimal..max), -p peer filter to
message list with newest-first ordering, and a curated welcome panel
with getting-started/memory/commands sections.

Sync the welcome panel and group help strings with the actual
registered commands — drop phantom 'session clone', add the 4 missing
peer commands and 7 missing session commands, fix conclusion/message/
workspace group docstrings that claimed commands that don't exist.

* feat(cli): themed, unified help system with pattern/example

Replace the hand-rolled welcome with a layered system:

- Theme typer.rich_utils (dim borders, brand color) so every --help
  inherits the voice.
- HonchoTyperGroup subclass renders a curated 3-panel welcome
  (getting started / memory / commands) with recipes Typer can't
  auto-generate.
- Unify the front door: bare 'honcho', 'honcho --help', and
  'honcho help' all render the same welcome via one code path;
  sub-groups and leaf commands still get Typer's themed renderer.
- Replace Click's 'Usage: …' line with pattern/example rows at every
  sub-group and leaf command, so the help voice stays consistent from
  top to leaves.

* refactor(cli): address review — typed exceptions, chmod 600, tighter redaction, class-based help, tests

- Replace module-level monkey-patch of TyperGroup/TyperCommand.get_usage
  with HonchoTyperGroup applied via cls= on every sub-Typer. Lives in
  a new _help.py module to avoid circular imports. No longer leaks
  behavior changes into other Typer users in the same process.
- _test_connection dispatches on the SDK's typed exceptions
  (AuthenticationError, ConnectionError, TimeoutError, APIError)
  instead of substring-matching error messages.
- Config.save() now chmods ~/.honcho/config.json to 0o600 after write
  so the plaintext API key isn't world-readable on multi-user hosts
  (see the sketch after this list).
- Tighten api_key redaction to '***<last4>' (was 'header...last4'),
  matching setup._redact for consistency. Short keys fully masked.
- Add test_validation.py covering safe IDs, unsafe chars, path
  traversal, and empty input. Update test_config.py redaction cases
  and add 0o600 permission assertion. Fix stale patch paths in
  test_commands.py that pointed at honcho_cli.main instead of the
  command modules where get_client is actually imported.
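
A minimal sketch of the save-then-chmod behavior from the Config.save() bullet (file layout assumed):

```python
import json
import os
from pathlib import Path


def save_config(data: dict[str, str]) -> None:
    path = Path.home() / ".honcho" / "config.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2))
    # The file holds a plaintext API key; keep it owner-readable only.
    os.chmod(path, 0o600)
```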

* feat(cli): add options panel to welcome menu

Append a fourth panel listing the global flags (-w/-p/-s, --json,
--version, --help) with their env-var counterparts. Discoverable
from bare 'honcho' without needing to hunt for --help.

* chore(cli): drop --version from welcome options panel

* feat(cli): add pixel-honcho icon to banner

Prepend a 13-char ASCII rendering of honcho-pixel.svg to the HONCHO
wordmark. Uses Unicode half-blocks to pack 12 pixel rows into 6 text
rows, faithfully preserving the SVG outline (two eye dots, mouth slit,
tapering foot). Appears in bare 'honcho', 'honcho --help', 'honcho
--version', and 'honcho init'.

* fix: polish Honcho CLI welcome panel and error messages

* fix: honcho workspace inspect speed

* chore: minor fix to session pagination

* fix: removing NDJSON output

* chore: consolidating honcho CLI's dual argv grammar onto Pattern A (command-first)

* chore: clean up imports

* fix: four `-s` consistency fixes applied

* chore: minor changes to memory rows

* fix: changing package name to honcho-cli

* fix: removing pixel face

---------

Co-authored-by: Erosika <eri@plasticlabs.ai>
…lastic-labs#575)

The MCP Worker hardcoded https://api.honcho.dev for every request, forcing
anyone running a self-hosted Honcho instance to patch the source before
deploying their own Worker alongside it.

Route the baseUrl through the Worker env so operators can set
HONCHO_API_URL (via .dev.vars for local development or wrangler secret for
deployed Workers) and point the Worker at their instance. The variable is
intentionally not exposed as a request header: that would let public
clients steer traffic to internal URLs, which is a latency and security
regression.

When HONCHO_API_URL is unset, the Worker falls back to
https://api.honcho.dev, so existing deployments are unaffected.

Closes plastic-labs#508
…patible providers (plastic-labs#586)

* fix: wrap single embed() input in array for OpenAI-compatible provider compatibility

* Fix input format in embedding test assertion
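
A minimal sketch of the wrapping rule, with a hypothetical helper name:

```python
def normalize_embed_input(text: str | list[str]) -> list[str]:
    """Wrap a bare string so OpenAI-compatible servers always see a list."""
    return [text] if isinstance(text, str) else text
```
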
* fix: catch InternalServerError from turbopuffer

* fix: remove unused VectorUpsertResult

* fix: downgrade vector store sync errors to warnings

* fix: remove upsert_with_retry

* fix: (vector) add silent path and explicit path for vector db server errors

---------

Co-authored-by: Vineeth Voruganti <13438633+VVoruganti@users.noreply.github.com>
* docs: adding cli doc

* docs: adding generated script and content and github workflow

* chore: removing workflow

* fix: (docs) re-format and add details to cli-reference docs

---------

Co-authored-by: Vineeth Voruganti <13438633+VVoruganti@users.noreply.github.com>
* fix: moving cli skills to root

* chore: updating cli readme

* chore: updating language

* chore: updating docs
…ig (plastic-labs#587)

* Update deriver.py

* Simplify model configuration in deriver.py

Removed stop_sequences from model configuration.
* docs: adding opencode

* docs: align opencode guide with latest plugin changes

* chore: updating language

---------

Co-authored-by: adavyas <adavyasharma@gmail.com>
…stic-labs#581)

The Surprisal module passes `{"level": levels}` directly to
`get_all_documents()`, but `apply_filter()` expects operator syntax:
`{"level": {"in": levels}}`.

Without the `in` operator, the filter is silently ignored, causing
`_fetch_level_observations()` to return 0 results. This makes the
entire Surprisal phase of the Dream cycle a no-op.

Fixes plastic-labs#559
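
A minimal illustration of the two filter shapes (level values hypothetical):

```python
levels = ["level-1", "level-2"]  # hypothetical level values

# Before: a bare list isn't operator syntax, so apply_filter() silently
# ignores it and the fetch returns 0 observations.
broken_filter = {"level": levels}

# After: the "in" operator matches documents whose level is in the list.
working_filter = {"level": {"in": levels}}
```
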
* docs: adding opencode

* docs: align opencode guide with latest plugin changes

* chore: updating language

* docs: remove interview command from opencode guide

---------

Co-authored-by: ajspig <dragon@monstercode.com>
Resolve conflicts between fork-only commits (CF Gateway auth, Gemini
thought_signature fix, LM Studio/Prometheus/Traefik stack, dreamer
specialist overrides) and upstream's new src/llm/ transport-based
abstraction that replaces src/utils/clients.py.

Port decisions:
- Dropped fork's cf / custom / vllm / groq providers — superseded by the
  new ModelConfig base_url/api_key override mechanism.
- Kept OPENAI_BASE_URL and CF_GATEWAY_AUTH_TOKEN on LLMSettings and wired
  them into src/llm/registry (default + override OpenAI clients) and
  src/embedding_client so CF AI Gateway routing survives the refactor.
- Ported thought_signature extraction into OpenAIBackend and replay into
  OpenAIHistoryAdapter so Gemini thinking models via the CF OpenAI-compat
  route can do multi-turn tool loops without 400ing.
- Dropped fork's DEDUCTION_PROVIDER / INDUCTION_PROVIDER and matching
  THINKING_BUDGET_TOKENS fields — upstream's per-specialist
  DEDUCTION_MODEL_CONFIG / INDUCTION_MODEL_CONFIG (full
  ConfiguredModelSettings) is a strict superset.
- Kept fork's traefik+prometheus+grafana docker-compose stack; kept
  upstream's broader docker/ COPY in the Dockerfile.
basedpyright with reportMissingTypeArgument rejected the bare `dict`
types in the mock fake_post used by the SDK message-batching test,
failing Static Analysis on PR #3. Add `dict[str, Any]` annotations and
an explicit return type so CI stays green.
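
A minimal sketch of that kind of annotation fix (the mock's signature is assumed for illustration):

```python
from typing import Any


# Bare `dict` parameters trip reportMissingTypeArgument under basedpyright;
# parameterizing them and declaring the return type clears the diagnostics.
def fake_post(url: str, json: dict[str, Any]) -> dict[str, Any]:
    return {"url": url, "echo": json}
```
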
basedpyright's default exit code is non-zero whenever any diagnostics
are reported, so the 8 warnings introduced by the fork-only commits
were failing the Static Analysis job on PR #3 even though there were
no errors.

- src/deriver/queue_manager.py: drop `item.created_at is not None`
  guards. created_at is `Mapped[datetime.datetime]` (non-nullable), so
  the checks were always True and basedpyright flagged them as
  reportUnnecessaryComparison.
- tests/sdk/test_session.py: factor out the shared mock-response body
  into a single helper and give the per-branch closures distinct names.
  This clears reportRedeclaration on `calls` / `fake_post` and lets the
  `# pyright: ignore` comments target the actual warning
  (reportPrivateUsage on `_http` / `_async_http_client`) instead of the
  irrelevant reportAttributeAccessIssue that was flagged as an
  unnecessary ignore.
@offendingcommit (Owner, Author)

Closing in favor of #4, which captures everything this PR had plus the most recent 21 upstream commits, and adds the deployment-critical CF Gateway header injection alongside upstream's new src/llm/ architecture (without that, every CF-gateway-bound call would silently fail auth).
