
Kosli attest #9

Open
gsavage wants to merge 7 commits into main from kosli-attest


gsavage commented May 12, 2026

No description provided.

gsavage and others added 7 commits May 11, 2026 14:59
Adds a follow-up job to the apply workflow that writes a small JSON
record (the triggering SHA plus a hardcoded drift flag) for use by a
later drift-detection step. The file is always published as a GitHub
artifact, and is also uploaded to s3://<bucket>/<repo>/drift.plan.json
when the new optional s3_bucket input is supplied — leaving it unset
skips both the AWS credential exchange and the S3 upload.

The work lives in apply.yml rather than base.yml so that the plan
workflow, which also consumes base.yml, does not produce the file.
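
A minimal sketch of how such a follow-up job could look in apply.yml, not the actual implementation. The `s3_bucket` input, the file name, and the S3 key layout come from the description above, and the job name `reset-drift-detection` is the one a later commit in this PR uses. The upstream job name `apply`, the `aws_role_arn` and `aws_region` inputs, the artifact name, the `sha` field name, the action versions, and the use of OIDC for the credential exchange are all assumptions for illustration.

```yaml
jobs:
  # ... existing apply job (assumed here to be named "apply") ...

  reset-drift-detection:
    needs: apply
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write   # assumption: the AWS credential exchange uses OIDC
    steps:
      - name: Write drift baseline
        run: |
          # Triggering SHA plus a hardcoded drift flag ("sha" key is an assumption)
          printf '{"sha":"%s","drift":false}\n' "$GITHUB_SHA" > drift.plan.json

      - name: Publish baseline as a GitHub artifact
        uses: actions/upload-artifact@v4
        with:
          name: drift-plan          # hypothetical artifact name
          path: drift.plan.json

      - name: Configure AWS credentials
        if: inputs.s3_bucket != ''  # skipped when the optional input is unset
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ inputs.aws_role_arn }}   # hypothetical input
          aws-region: ${{ inputs.aws_region }}         # hypothetical input

      - name: Upload baseline to S3
        if: inputs.s3_bucket != ''
        env:
          S3_BUCKET: ${{ inputs.s3_bucket }}
        run: |
          # Assumes <repo> in the key means the full owner/name slug
          aws s3 cp drift.plan.json \
            "s3://${S3_BUCKET}/${GITHUB_REPOSITORY}/drift.plan.json"
```

Guarding both AWS steps with the same condition is what keeps the unset case free of any AWS interaction; the artifact upload has no guard, so it always runs.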

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces a reusable Detect Drift workflow that consumers can schedule
(via cron, workflow_dispatch, etc.) to compare deployed infrastructure
against the last applied state. It reads the drift baseline written by
the apply workflow from S3, runs a plan against the recorded SHA via
base.yml, and — if the plan contains changes — overwrites drift.plan.json
in S3 with the same SHA and an ISO 8601 timestamp in the drift field.
A third-party monitor watching that object then sees drift != false and
fires an alert.
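
The flow described above could be wired up roughly as follows in detect-drift.yml. This is a sketch under stated assumptions: the job names, the `read-baseline` output name `sha`, the `sha` key in the JSON, and the `s3_bucket` input on this workflow are hypothetical, and the other inputs base.yml requires are omitted. Only the `ref` input and the `has_changes` output of base.yml are named in this commit message.

```yaml
jobs:
  # ... read-baseline job: downloads drift.plan.json and exposes the
  #     recorded SHA as outputs.sha (hypothetical output name) ...

  plan:
    needs: read-baseline
    uses: kosli-dev/tf/.github/workflows/base.yml@main
    with:
      ref: ${{ needs.read-baseline.outputs.sha }}   # plan against the recorded SHA
      # ... other base.yml inputs omitted ...

  flag-drift:
    needs: [read-baseline, plan]
    # Workflow outputs are strings, so compare against 'true'
    if: needs.plan.outputs.has_changes == 'true'
    runs-on: ubuntu-latest
    steps:
      # ... AWS credential setup omitted ...
      - name: Overwrite baseline with a drift timestamp
        env:
          SHA: ${{ needs.read-baseline.outputs.sha }}
          S3_URI: s3://${{ inputs.s3_bucket }}/${{ github.repository }}/drift.plan.json
        run: |
          # Same SHA, with an ISO 8601 UTC timestamp replacing the false flag
          jq -n --arg sha "$SHA" --arg drift "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
            '{sha: $sha, drift: $drift}' > drift.plan.json
          aws s3 cp drift.plan.json "$S3_URI"
```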

To support this, base.yml gains two backward-compatible additions:
  * a `ref` input, threaded into the first actions/checkout so the
    drift workflow can plan against the historical SHA rather than the
    triggering ref;
  * a `has_changes` workflow output, derived from grepping the existing
    plan text for "No changes.", so the caller can decide whether to
    flag drift.
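
One way the two base.yml additions might be expressed, as a sketch rather than the actual diff. The job name `terraform`, the step id `detect`, and the `PLAN_FILE` path are placeholders; only `ref`, `has_changes`, and the "No changes." marker come from the text above.

```yaml
on:
  workflow_call:
    inputs:
      ref:
        description: Git ref to check out; empty means the triggering ref
        type: string
        required: false
        default: ""
    outputs:
      has_changes:
        description: "'true' when the plan text lacks the 'No changes.' marker"
        value: ${{ jobs.terraform.outputs.has_changes }}

jobs:
  terraform:                        # hypothetical job name
    runs-on: ubuntu-latest
    outputs:
      has_changes: ${{ steps.detect.outputs.has_changes }}
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ inputs.ref }}    # an empty value keeps the default checkout ref

      # ... terraform init / plan steps that write the plan text ...

      - name: Detect changes
        id: detect
        env:
          PLAN_FILE: /tmp/plan.txt  # placeholder path for the plan text
        run: |
          if grep -qF 'No changes.' "$PLAN_FILE"; then
            echo "has_changes=false" >> "$GITHUB_OUTPUT"
          else
            echo "has_changes=true" >> "$GITHUB_OUTPUT"
          fi
```

Because the default for `ref` is empty, existing callers that never set it keep their current checkout behavior. A grep-based check of this shape reports `true` for any plan text that lacks the marker, including an empty or truncated file, so it relies on earlier steps failing the job when the plan itself fails.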

The workflow fails loudly when no baseline is present in S3, on the
assumption that a missing baseline reflects a real configuration
problem (apply.yml has never run, or the object was deleted) that
should surface rather than be silently skipped. A top-level concurrency
group keyed on repository + environment prevents overlapping scheduled
runs from racing on the same JSON object.
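
A sketch of the concurrency group and the fail-loud baseline read. The step name `Read baseline` appears later in this PR; the group prefix, the `environment` and `s3_bucket` input names, the `sha` JSON key, and the choice of `cancel-in-progress: false` are assumptions.

```yaml
# Top level of detect-drift.yml: one run at a time per repository + environment
concurrency:
  group: detect-drift-${{ github.repository }}-${{ inputs.environment }}
  cancel-in-progress: false   # assumption: queue the newer run instead of cancelling

jobs:
  read-baseline:
    runs-on: ubuntu-latest
    outputs:
      sha: ${{ steps.read.outputs.sha }}
    steps:
      # ... AWS credential setup omitted ...
      - name: Read baseline
        id: read
        env:
          S3_URI: s3://${{ inputs.s3_bucket }}/${{ github.repository }}/drift.plan.json
        run: |
          # aws s3 cp exits non-zero when the object is missing; the extra
          # message only makes the cause obvious in the job log.
          if ! aws s3 cp "$S3_URI" drift.plan.json; then
            echo "::error::No drift baseline at ${S3_URI}; has apply.yml run?"
            exit 1
          fi
          echo "sha=$(jq -r '.sha' drift.plan.json)" >> "$GITHUB_OUTPUT"
```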

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
apply.yml, plan.yml, and detect-drift.yml previously pinned base.yml to
the literal @main ref. That made end-to-end testing of cross-file
changes painful: a branch that modified both, say, detect-drift.yml and
base.yml could not be exercised from a consumer repo without either
merging to main or temporarily rewriting the @main pin to a feature
branch (and remembering to revert it).

Switching the three callers to the same-repo relative form
"./.github/workflows/base.yml" makes them follow the calling reusable
workflow's own ref. A consumer that pins
"kosli-dev/tf/.github/workflows/detect-drift.yml@<ref>" now transitively
pulls base.yml at the same <ref>, so the one entry-point pin in the
consumer is the only ref knob in the whole chain.
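
The two sides of the chain, sketched as two separate files in one listing. The job names are hypothetical and inputs are omitted; `<ref>` is the placeholder used above.

```yaml
# kosli-dev/tf: .github/workflows/detect-drift.yml
jobs:
  plan:
    # Relative form: resolved in the same repository, at the same commit,
    # as the workflow file that contains this line.
    uses: ./.github/workflows/base.yml
---
# Consumer repository: the only place a ref is pinned
jobs:
  drift:
    uses: kosli-dev/tf/.github/workflows/detect-drift.yml@<ref>
```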

Also folds in the related permissions bump on detect-drift.yml's plan
job (contents: read -> contents: write) so the job can grant base.yml
the contents: write it currently requests — a temporary workaround to
keep the wider-permissions test path working while we evaluate
tightening base.yml itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Callers can now record their Terraform plans, applies, state files, and
drift baselines as Kosli evidence by supplying a trail template via the
new kosli_template_file input (plus kosli_flow/host/org/cli_version and
the kosli_api_token secret). When the template input is empty the new
steps are all skipped, so existing callers that do not yet integrate
with Kosli (e.g. terraform-monitoring-and-alerting) remain unaffected.

base.yml installs the Kosli CLI, ensures the flow exists, begins a trail
named after the head commit SHA, and attests the human-readable plan
(/tmp/<env>.plan.txt). When tf_apply is true it also attests the apply
log. apply.yml's reset-drift-detection job additionally downloads the
state file from S3 and attests both the state file and drift.plan.json
as Kosli artifacts on the same trail. Attestation names are generic
(terraform-plan, terraform-apply, terraform-state, drift-plan) because
each caller is expected to run one flow per environment via a matrix.
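
A rough sketch of how the guarded Kosli steps in base.yml might look. The input and secret names come from the description above, and the attestation name `terraform-plan` is the one listed there. Everything else is an assumption to be checked against the Kosli CLI reference and the actual workflow: the `kosli-dev/setup-cli-action` action and its version, the exact commands and flags, the `environment` input, and using `github.sha` as the head commit SHA.

```yaml
jobs:
  terraform:                          # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      # ... checkout, terraform init / plan ...

      - name: Install Kosli CLI
        if: inputs.kosli_template_file != ''   # every Kosli step carries this guard
        uses: kosli-dev/setup-cli-action@v2
        with:
          version: ${{ inputs.kosli_cli_version }}

      - name: Begin Kosli trail
        if: inputs.kosli_template_file != ''
        env:
          API_TOKEN: ${{ secrets.kosli_api_token }}
          HOST: ${{ inputs.kosli_host }}
          ORG: ${{ inputs.kosli_org }}
          FLOW: ${{ inputs.kosli_flow }}
          TEMPLATE: ${{ inputs.kosli_template_file }}
        run: |
          # Ensure the flow exists, then open a trail named after the commit SHA
          kosli create flow "$FLOW" --use-empty-template \
            --host "$HOST" --org "$ORG" --api-token "$API_TOKEN"
          kosli begin trail "$GITHUB_SHA" --flow "$FLOW" \
            --template-file "$TEMPLATE" \
            --host "$HOST" --org "$ORG" --api-token "$API_TOKEN"

      - name: Attest plan
        if: inputs.kosli_template_file != ''
        env:
          API_TOKEN: ${{ secrets.kosli_api_token }}
          HOST: ${{ inputs.kosli_host }}
          ORG: ${{ inputs.kosli_org }}
          FLOW: ${{ inputs.kosli_flow }}
          PLAN_TXT: /tmp/${{ inputs.environment }}.plan.txt   # hypothetical input name
        run: |
          kosli attest generic --name terraform-plan \
            --flow "$FLOW" --trail "$GITHUB_SHA" \
            --attachments "$PLAN_TXT" \
            --host "$HOST" --org "$ORG" --api-token "$API_TOKEN"
```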

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The flow name is always terraform-<environment>-plan when invoked from
plan.yml and terraform-<environment>-apply when invoked from apply.yml,
so the kosli_flow input added a parameter the caller would otherwise
have to repeat. Drop the input and compute the value at job level using
inputs.tf_apply to choose the suffix; reset-drift-detection always uses
the apply suffix since it only runs in the apply workflow.

While here, lift KOSLI_ORG / KOSLI_HOST / KOSLI_API_TOKEN / KOSLI_FLOW /
KOSLI_TRAIL to job-level env so each Kosli step reduces to a single
command line — the CLI picks the values up from the environment.
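
A sketch of the job-level env and the suffix computation this commit describes. The `environment` input name and the job name are assumptions, as is the attest command line; `tf_apply` is still a string at this point in the history, hence the comparison against `'true'`.

```yaml
jobs:
  terraform:                          # hypothetical job name
    runs-on: ubuntu-latest
    env:
      KOSLI_ORG: ${{ inputs.kosli_org }}
      KOSLI_HOST: ${{ inputs.kosli_host }}
      KOSLI_API_TOKEN: ${{ secrets.kosli_api_token }}
      # Expression idiom for a ternary: <cond> && <if-true> || <if-false>
      KOSLI_FLOW: terraform-${{ inputs.environment }}-${{ inputs.tf_apply == 'true' && 'apply' || 'plan' }}
      KOSLI_TRAIL: ${{ github.sha }}
      PLAN_TXT: /tmp/${{ inputs.environment }}.plan.txt
    steps:
      # ... install CLI, begin trail ...
      - name: Attest plan
        if: inputs.kosli_template_file != ''
        # Org, host, token, flow, and trail are read from the KOSLI_* variables
        run: kosli attest generic --name terraform-plan --attachments "$PLAN_TXT"
```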

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
base.yml has no business knowing whether it is invoked for a plan or an
apply, and the inline ternary that derived the suffix from tf_apply was
unpleasant to read. Restore kosli_flow as a base.yml input, and compute
it in plan.yml (terraform-<env>-plan) and apply.yml (terraform-<env>-
apply) where the workflow identity is already known. The same
terraform-<env>-apply value is already hardcoded in apply.yml's
reset-drift-detection job.
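
With the input restored, each caller could pass the flow name along these lines. Two files are shown in one listing; the job names and the `environment` input are assumptions, and the remaining inputs and secrets are omitted.

```yaml
# .github/workflows/plan.yml
jobs:
  plan:
    uses: ./.github/workflows/base.yml
    with:
      kosli_flow: terraform-${{ inputs.environment }}-plan
      # ... other inputs omitted ...
---
# .github/workflows/apply.yml
jobs:
  apply:
    uses: ./.github/workflows/base.yml
    with:
      kosli_flow: terraform-${{ inputs.environment }}-apply
      # ... other inputs omitted ...
```

base.yml then only needs `KOSLI_FLOW: ${{ inputs.kosli_flow }}` in its job-level env, with no knowledge of which workflow called it.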

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three classes of pre-existing actionlint warnings, all now resolved:

* tf_apply was declared as a string with values "true" / "false", so
  actionlint flagged every assignment as a bool-into-string mismatch.
  Switch tf_apply to a boolean input and update the if-conditions in
  base.yml from `inputs.tf_apply == 'true'` to `inputs.tf_apply`.
* Quote $GITHUB_PATH and $GITHUB_STEP_SUMMARY in the shell snippets
  (SC2086).
* Group consecutive redirects into single { ... } >> file blocks
  (SC2129).

The shell fixes are in base.yml's Plan summary / Apply summary steps
and detect-drift.yml's Read baseline step.

`actionlint .github/workflows/*.yml` now exits clean.
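
For illustration, the resolved patterns could look like this in base.yml. The job name and the summary text are placeholders; the input name, the condition, and the step name come from the list above.

```yaml
on:
  workflow_call:
    inputs:
      tf_apply:
        type: boolean        # was: type: string with "true" / "false"
        required: false
        default: false

jobs:
  terraform:                 # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - name: Apply summary
        if: inputs.tf_apply  # was: inputs.tf_apply == 'true'
        run: |
          # SC2129: one grouped redirect instead of several separate ">>" lines
          # SC2086: the target variable is quoted
          {
            echo "## Apply summary"
            echo ""
            echo "Commit: ${GITHUB_SHA}"
          } >> "$GITHUB_STEP_SUMMARY"
```

Callers that previously passed `tf_apply: "true"` would need to pass the unquoted boolean `true` after this change.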

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>