MAINT BREAK: Move include_baseline from Scenario constructor to initi…#1700

Open
adrian-gavrila wants to merge 3 commits into microsoft:main from adrian-gavrila:adrian-gavrila/include-baseline-on-initialize-async

@adrian-gavrila adrian-gavrila commented May 8, 2026

Description

Drops include_default_baseline from Scenario.__init__ and from every subclass constructor. Adds include_baseline: bool | None = None as a keyword-only parameter on Scenario.initialize_async, under @apply_defaults.

This treats baseline like every other common runtime parameter (objective_target, scenario_strategies, dataset_config, max_concurrency, max_retries, memory_labels), so it can be overridden per run from a config file, the CLI, or the frontend without rebuilding the scenario object.
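
As a rough illustration of the per-run override, the sketch below reuses one scenario object across two runs whose keyword arguments come from config-like dicts. DemoScenario, run_with_configs, the "demo-target" placeholder, and the fallback to True are all hypothetical stand-ins for illustration; this is not the real Scenario class and it does not model @apply_defaults.

```python
import asyncio
from typing import Any, Optional


class DemoScenario:
    """Hypothetical stand-in, not the real Scenario class."""

    def __init__(self) -> None:
        # The constructor no longer takes any baseline argument.
        self.include_baseline: Optional[bool] = None

    async def initialize_async(
        self, *, objective_target: Any, include_baseline: Optional[bool] = None
    ) -> None:
        # None means "caller did not specify"; this sketch falls back to True.
        self.include_baseline = True if include_baseline is None else include_baseline


async def run_with_configs(configs: list[dict[str, Any]]) -> list[bool]:
    scenario = DemoScenario()  # built once, reused for every run
    results = []
    for config in configs:
        # Each config dict plays the role of a config file, CLI, or frontend layer.
        await scenario.initialize_async(objective_target="demo-target", **config)
        results.append(scenario.include_baseline)
    return results


results = asyncio.run(run_with_configs([{}, {"include_baseline": False}]))
```

Using None as the default keeps "not specified" distinct from an explicit False, which is what lets a per-run value take precedence over a class-level default.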

Class-level controls

Two ClassVar[bool] flags on the Scenario base, mirroring the existing TARGET_REQUIREMENTS pattern:

class Scenario:
    SUPPORTS_DEFAULT_BASELINE: ClassVar[bool] = True   # capability: can it have one?
    DEFAULT_INCLUDE_BASELINE: ClassVar[bool] = True    # default when caller does not specify

  • AdversarialBenchmark and Psychosocial set SUPPORTS_DEFAULT_BASELINE = False. Passing include_baseline=True explicitly to either raises ValueError.
  • Jailbreak sets DEFAULT_INCLUDE_BASELINE = False. It supports a baseline but does not include one by default (templates already dominate the run). Callers can opt in per run.

Resolution checks capability first, then default. A forbidden scenario can never silently inherit a True default.
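
The resolution order described above could look roughly like the following. This is a minimal sketch, not the code in this PR: the helper name resolve_include_baseline and the error message are assumptions, and the classes are stripped-down stand-ins that model only the two flags.

```python
from typing import ClassVar, Optional


class Scenario:
    # Stand-in for the real base class; only the two flags are modeled.
    SUPPORTS_DEFAULT_BASELINE: ClassVar[bool] = True
    DEFAULT_INCLUDE_BASELINE: ClassVar[bool] = True

    @classmethod
    def resolve_include_baseline(cls, include_baseline: Optional[bool] = None) -> bool:
        """Hypothetical helper: capability check first, then the class default."""
        if not cls.SUPPORTS_DEFAULT_BASELINE:
            if include_baseline:
                # An explicit opt-in on an unsupported scenario is an error.
                raise ValueError(f"{cls.__name__} does not support a baseline")
            # Unspecified or False: never fall through to the default.
            return False
        if include_baseline is None:
            return cls.DEFAULT_INCLUDE_BASELINE
        return include_baseline


class AdversarialBenchmark(Scenario):
    SUPPORTS_DEFAULT_BASELINE = False  # cannot have a baseline


class Jailbreak(Scenario):
    DEFAULT_INCLUDE_BASELINE = False  # supported, but off unless requested
```

Because the capability branch returns before DEFAULT_INCLUDE_BASELINE is read, AdversarialBenchmark resolves to False even though it inherits a True default from the base class.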

Migration

# Before
scenario = RedTeamAgent(include_baseline=False)
await scenario.initialize_async(objective_target=target, ...)

# After
scenario = RedTeamAgent()
await scenario.initialize_async(objective_target=target, include_baseline=False, ...)

Testing

  • pytest tests/unit/scenario: 555 passed.
  • pytest tests/unit: 7460 passed.
  • ruff check, ruff format, pre-commit (incl. ty type check): clean.
  • Notebook docs regenerated end-to-end against a live target.
  • New tests: initialize_async(include_baseline=True) on AdversarialBenchmark raises ValueError; include_baseline=False on the same is accepted.

@adrian-gavrila adrian-gavrila force-pushed the adrian-gavrila/include-baseline-on-initialize-async branch from 0d921bb to 206f1b4 Compare May 8, 2026 21:18
…alize_async

Treats include_baseline like every other common runtime parameter on
initialize_async. Subclasses control behavior via two ClassVar flags:
SUPPORTS_DEFAULT_BASELINE (capability) and DEFAULT_INCLUDE_BASELINE
(default when caller doesn't specify).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@adrian-gavrila adrian-gavrila force-pushed the adrian-gavrila/include-baseline-on-initialize-async branch from 206f1b4 to a21dbbb Compare May 8, 2026 21:30