Skip to content

feat(policy): spike policy envelope and narrowness checks #1840

@zredlined

Description

@zredlined

Problem Statement

Enterprise deployments need a way to define an auditable maximum OpenShell policy/profile and prove that local users, agents, reviewers, provider attachments, and policy updates cannot exceed it.

OpenShell should also explore whether policy proposals can be checked for narrowness. A proposal may be inside the maximum envelope but still too broad for automatic approval, such as granting ** path access when the observed denial was one exact request.

This issue is a spike. It should produce convincing fixtures and a feasibility report, not a production admission gate.

Research Hypothesis

The research question is whether policy negotiation can be reduced to satisfiability checks over OpenShell policy semantics.

For maximum-envelope enforcement, model each relevant capability as an abstract action x:

x = {
  binary,
  host,
  port,
  protocol,
  method,
  path,
  query,
  graphql_operation_type,
  graphql_fields,
  mcp_server,
  mcp_tool,
  mcp_resource
}

Then ask the OpenShell prover / Z3 model for a counterexample:

exists x:
  candidate_allows(x)
  AND NOT maximum_allows(x)

If this is unsat, the candidate stays within the modeled maximum. If this is sat, the solver should return a concrete capability that exceeds the maximum, such as:

x = {
  binary: "/usr/bin/gh",
  host: "api.github.com",
  port: 443,
  protocol: "rest",
  method: "POST",
  path: "/repos/NVIDIA/OpenShell/admin"
}

For narrowness, test whether the same style applies against a request-derived witness:

exists x:
  candidate_delta_allows(x)
  AND NOT witness_allows(x)

Also test whether some narrowness rules are better expressed as structural budget constraints:

breadth(candidate_delta) <= configured_budget

The spike should determine whether narrowness works best as SMT containment, structural budget checks, broadness scoring, or a hybrid.

Proposed Spike

Build fixture-driven tests that load:

  • maximum_policy: the security-approved OpenShell policy/profile envelope.
  • current_policy: the runtime's current minimal/effective policy.
  • proposal: a merge operation or proposed rule.
  • observed_denial: optional witness data for narrowness checks.
  • narrowness_budget: optional limits for automatic approval.
  • expected_max_result: within_max, exceeds_max, or unsupported.
  • expected_narrowness_result: within_budget, over_budget, or unsupported.

The spike should prefer concrete, runnable fixtures over product design. A successful spike can still be small if it demonstrates the model catches real broadening and gives useful counterexamples.

Initial Fixtures

Maximum-envelope fixtures should include:

Case Maximum Proposal Expected
Exact REST path GET /repos/NVIDIA/OpenShell/issues/* GET /repos/NVIDIA/OpenShell/issues/123 within max
Broader REST path GET /repos/NVIDIA/OpenShell/issues/* GET /repos/NVIDIA/** exceeds max
Method escalation GET /repos/** POST /repos/** exceeds max
Query broadening GET /search/issues?org=NVIDIA GET /search/issues exceeds max
Deny precedence max full with deny POST /admin/** POST /admin/settings exceeds max
Host wildcard broadening api.github.com:443 *.github.com:443 exceeds max
Binary broadening /usr/bin/gh /usr/bin/* exceeds max
CIDR broadening 10.0.5.0/24 10.0.0.0/8 exceeds max
GraphQL mutation query only mutation createIssue exceeds max
MCP placeholder GitHub MCP get_* only create_pull_request exceeds max or unsupported until #1531

Start narrowness with a small budget model:

narrowness_budget:
  max_endpoints_per_update: 1
  max_binaries_per_update: 1
  allowed_method_breadth: exact_observed
  allowed_host_breadth: exact_observed
  allowed_path_breadth: exact_or_one_segment_wildcard
  forbid_recursive_path_wildcard: true
  forbid_binary_glob: true

Narrowness fixtures should include:

Observed denial Proposal Expected
gh GET api.github.com /repos/NVIDIA/OpenShell/issues/123 exact same host/method/path/binary within budget
same /repos/NVIDIA/OpenShell/issues/* within budget if one-segment wildcard is enabled
same /repos/NVIDIA/OpenShell/** over budget
same ** over budget
same GET denial add POST over budget
same host *.github.com over budget
same binary /usr/bin/* over budget
one denied endpoint add GitHub and Slack in one proposal over budget

Expected Output

  • A small fixture suite.
  • A short feasibility note for maximum-envelope checking.
  • A short feasibility note for narrowness budgets.
  • Example counterexample/redraft guidance for humans and agents, such as:
exceeds maximum: /usr/bin/gh can POST /repos/NVIDIA/OpenShell/admin via api.github.com:443
within maximum, but over narrowness budget: path /repos/NVIDIA/** is broader than the observed denial; try /repos/NVIDIA/OpenShell/issues/*

Code Areas To Inspect

Area Files
Runtime policy semantics crates/openshell-sandbox/data/sandbox-policy.rego, crates/openshell-sandbox/src/opa.rs, crates/openshell-sandbox/src/l7/*
Policy schema and merge crates/openshell-policy/src/lib.rs, crates/openshell-policy/src/merge.rs
Prover crates/openshell-prover/src/policy.rs, crates/openshell-prover/src/model.rs, crates/openshell-prover/src/queries.rs
Gateway proposal flow crates/openshell-server/src/grpc/policy.rs

Non-Goals

  • Do not implement the production admission gate.
  • Do not design the full runtime-profile product.
  • Do not create a multi-issue roadmap before the spike results.
  • Do not require MCP support beyond placeholder fixtures until feat(policy): add MCP L7 inspection and enforcement #1531 defines the policy surface.

Labels

area:policy, area:gateway, area:sandbox, topic:l7, topic:security, topic:testing, spike, state:in-progress

Metadata

Metadata

Assignees

Labels

Type

No type
No fields configured for issues without a type.

Projects

Status
In progress

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions