Skip to content

feat: telemetry#39

Open
pratikbin wants to merge 11 commits intomainfrom
feat/posthog-telemetry
Open

feat: telemetry#39
pratikbin wants to merge 11 commits intomainfrom
feat/posthog-telemetry

Conversation

@pratikbin
Copy link
Copy Markdown
Contributor

No description provided.

pratikbin added 11 commits May 1, 2026 23:49
Phase 1 of PostHog telemetry plan. No-op when PostHogAPIKey
ldflag is empty (default). Adds:

- internal/telemetry/{disabled,distinct_id,properties,redact,errors,client}.go
- internal/telemetry/os_release_{unix,windows,other}.go (build-tagged)
- internal/config/identity.go (Identity store, separate from OAuthSession)
- deps: posthog-go@v1.12.4, machineid@v1.0.1
Phase 2 of PostHog telemetry plan. Adds:

- main.go: telemetry.Init before app.Run, finalizeTelemetry after.
- cmd/root/root.go Before: stash telemetry_start + telemetry_arg_first.
- cmd/root/telemetry.go: wrapActions (recurses Subcommands), buildInvokedProps,
  resolveProjectID, coarseCommandFromArgs, FinalizeTelemetry, helpEmittedThisProcess.

Action wrapper emits command_invoked unless c.Bool("help") (Phase 3).
Finalizer emits command_completed/command_failed with duration_ms,
error_category, api_status_code. Shutdown bounded to 500ms.
Phase 3 of PostHog telemetry plan. urfave/cli's --help and --version
flags short-circuit before subcommand dispatch, so override
HelpPrinter and VersionPrinter to emit command_invoked.

helpEmittedThisProcess atomic prevents double-emit when subcommand
--help reaches both override and Action wrapper. currentAppPtr
exposes App to HelpPrinter (which receives no *cli.Context).
No Flush calls — natural batch interval + 500ms Shutdown handles flush.
Phase 4 of PostHog telemetry plan.

login: After credential save, GetUser() → SaveIdentity (preserving
AliasedForUserID when same user) → RebindIdentity → auth_event success.
GetUser failure is silent — login still succeeds. Pre-save errors emit
auth_event success:false (no user_id since rebind not yet run).

logout: DeleteToken + DeleteOAuthSession + DeleteIdentity, then emit
auth_event action=logout. distinctID still bound in-process so attribution
is correct.

refresh: cmd/root Before hook emits auth_event action=refresh on
RefreshTokens success/fail and SaveOAuthSession fail.
Phase 5 of PostHog telemetry plan. Deferred single emit at the end
of the upgrade Action covers both success and failure paths
(from_version, to_version, success, failure_reason). No-op short
circuits (already up-to-date, version-ahead) do not emit.

ask/vms ssh untouched — natural posthog-go 5s batch interval flushes
command_invoked while parent CLI waits on the child subprocess.
No Flush API; relies on main.go finalizer's 500ms Shutdown.
Phase 6 of PostHog telemetry plan. Release + nightly workflows
inject PostHogAPIKey from secrets.POSTHOG_API_KEY and PostHogHost
literal at build time. CI release artifacts therefore emit
telemetry without any per-user env var.

README adds a Telemetry section disclosing data collected and the
CREATEOS_DO_NOT_TRACK=1 opt-out variable.

NOTE: GitHub Actions repo secret POSTHOG_API_KEY must be added by
a repo admin BEFORE the next release tag is pushed.
Addresses code-review findings:

1. redact.go (HIGH): `createos login -t <token>` leaked the raw
   token because LocalFlagNames returned the alias "t" and
   RedactFlagValue's substring match did not catch it.
   FlagsFromContext now resolves the cli.Flag via Lineage and
   redacts when ANY of its Names() (canonical or alias) matches
   the denylist.

2. login.go (MEDIUM): If GetUser or SaveIdentity failed,
   RebindIdentity could load a stale .identity from a previous
   user and mis-attribute login telemetry. Now we only rebind
   when SaveIdentity succeeds for the current account; otherwise
   delete any stale identity file and skip rebind.

3. errors.go (LOW): "non-interactive mode: use --token flag to
   sign in" was bucketed as auth because the auth needles ran
   first. Re-ordered: user_input substrings now check first.

4. README.md (LOW): Telemetry disclosure no longer claims data is
   anonymous unconditionally; clarifies that events are anonymous
   pre-login and tied to account_id (and project_id when
   applicable) post-login.
End-to-end test against PostHog (project 392811) revealed two issues:

1. urfave/cli/v2 v2.27.7's c.Command.FullName() returns only the leaf
   name ("list"), not the full path ("projects list") despite plan §Phase 0
   §C suggesting otherwise. Replaced with commandPath() that walks
   c.Lineage() and joins all parent command names, skipping the synthesized
   root command whose Name == App.Name.

2. 500ms Shutdown timeout silently dropped events on cold-start because the
   first TCP/TLS handshake to us.i.posthog.com exceeds 500ms. Plan §Phase 3
   step 4 already anticipated this — bumped to 3 seconds. CLI exit gains up
   to ~3s in the worst case (cold-start + slow network); typical case is
   sub-200ms once the connection is warm.

Verified via PostHog HogQL query: command_invoked + command_completed
both arrive with command="projects list" within ~5s of CLI exit.
Login now passes the API User struct's Email, DisplayName, Username, and
CreatedAt as PostHog Person-level properties via Identify. Mutable fields
(email, name, username) use $set; signup_date uses $set_once so it cannot
be overwritten by subsequent logins.

Person properties go ONLY to PostHog's person record via Identify — they
are NEVER attached to Capture event payloads.

Implementation:
- internal/telemetry/client.go: new SetPersonProperties(props) method;
  RebindIdentity merges them into the Identify event.
- cmd/auth/login.go: bindIdentityAndCapture builds props from User and
  calls SetPersonProperties before RebindIdentity.

Verified end-to-end against PostHog: person record now shows email +
signup_date alongside the existing user_id binding. Events still carry
no email — only user_id.
Add a Telemetry section to CLAUDE.md so future Claude sessions adding new
commands or integrations don't miss PostHog event wiring or accidentally
break the redaction / identity / Person-property invariants.

Covers:
- Most new commands need ZERO telemetry code (Action wrapper handles it).
- When to add a domain Capture call (lifecycle events only).
- Hard rules: no posthog-go imports outside internal/telemetry, no Flush
  method, no PII in event payloads, no per-loop Capture calls.
- How to add a new sensitive flag (denyKeywords vs canonical name).
- How project_id auto-attaches and where to update resolveProjectID.
- How to smoke-test against a staging PostHog key.f
@pratikbin pratikbin requested a review from BhautikChudasama May 3, 2026 09:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant