Skip to content

FEAT: Adding Scenario run to the REST API#1696

Open
rlundeen2 wants to merge 12 commits intomicrosoft:mainfrom
rlundeen2:users/rludneen/scenario-rest
Open

FEAT: Adding Scenario run to the REST API#1696
rlundeen2 wants to merge 12 commits intomicrosoft:mainfrom
rlundeen2:users/rludneen/scenario-rest

Conversation

@rlundeen2
Copy link
Copy Markdown
Contributor

Adds backend endpoints for running, monitoring, and cancelling scenario executions via the REST UI.

  • New ScenarioRunService with in-memory run tracking, background asyncio execution, and cancellation support (max 3
    concurrent runs)
  • New routes: POST /runs, GET /runs, GET /runs/{id}, DELETE /runs/{id}, GET /runs/{id}/results
  • Comprehensive unit tests for service and route layers

Comment thread pyrit/backend/models/scenarios.py Outdated
Comment thread pyrit/backend/models/scenarios.py Outdated
Comment thread pyrit/backend/models/scenarios.py Outdated
Comment thread pyrit/backend/models/scenarios.py Outdated
Comment thread pyrit/backend/models/scenarios.py Outdated
Comment thread pyrit/backend/routes/scenarios.py Outdated
class ScenarioRunStatus(str, Enum):
"""Status of a scenario run."""

PENDING = "pending"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: what does pending mean ? like scheduled ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and do we need to have another status? We have on in PyRIT core, right?

Comment thread pyrit/backend/models/scenarios.py
Comment thread pyrit/backend/services/scenario_run_service.py
@@ -64,6 +64,7 @@ def __init__(
objective_scorer_identifier: "ComponentIdentifier",
scenario_run_state: ScenarioRunState = "CREATED",
labels: Optional[dict[str, str]] = None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want an error message property here too ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100% and great catch; it is even in my notes. But it has some bits I'd like to tackle as a follow up

Comment thread pyrit/backend/services/scenario_run_service.py
Comment thread pyrit/backend/services/scenario_run_service.py Outdated
Comment thread pyrit/backend/models/attacks.py Outdated
Comment thread pyrit/backend/services/scenario_service.py Outdated
active.task = task

response = self._build_response(scenario_result_id=scenario_result_id)
assert response is not None # guaranteed: we just inserted into DB via initialize_async
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have it then? For typing reasons? Asserts don't run in production code.

attack_result_id: str = Field(..., description="Database-assigned unique ID for this AttackResult")
conversation_id: str = Field(..., description="Primary conversation of this attack result")
attack_type: str = Field(..., description="Attack class name (e.g., 'CrescendoAttack', 'ManualAttack')")
attack_type: str = Field("", description="Attack class name (e.g., 'CrescendoAttack', 'ManualAttack')")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the change?



class ScenarioListResponse(BaseModel):
class ListRegisteredScenarioResponse(BaseModel):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ScenarioS plural?

Raises:
ValueError: If the run is not in a completed state.
"""
memory = CentralMemory.get_memory_instance()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This happens in 4 methods as the first command. Perhaps we should set it on the constructor as it's not going to change?

_service_instance: ScenarioRunService | None = None


def get_scenario_run_service() -> ScenarioRunService:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm somewhat confused by this. It's quite different from how we do this in other services and I don't know why.

Is it because of max concurrent runs?

objective_target_identifier: Mapped[dict[str, str]] = mapped_column(JSON, nullable=False)
objective_scorer_identifier: Mapped[Optional[dict[str, str]]] = mapped_column(JSON, nullable=True)
scenario_run_state: Mapped[Literal["CREATED", "IN_PROGRESS", "COMPLETED", "FAILED"]] = mapped_column(
scenario_run_state: Mapped[Literal["CREATED", "IN_PROGRESS", "COMPLETED", "FAILED", "CANCELLED"]] = mapped_column(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this Literal["CREATED", "IN_PROGRESS", "COMPLETED", "FAILED", "CANCELLED"] become a type so that we can use it in multiple places?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually it did. I see it in models/scenario_result.py. But not used here

objective_scorer_identifier: "ComponentIdentifier",
scenario_run_state: ScenarioRunState = "CREATED",
labels: Optional[dict[str, str]] = None,
created_at: Optional[datetime] = None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the guidelines say we don't do Optional anymore but rather datetime | None following whatever the latest PEP is.

Comment on lines +102 to 103
self.created_at = created_at if created_at is not None else datetime.now(timezone.utc)
self.completion_time = completion_time if completion_time is not None else datetime.now(timezone.utc)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice if it matched. created_at and completed_at

Comment thread .pyrit_conf_example
# ----------------------------
# Maximum number of scenario runs that can execute concurrently in the backend.
# Applies only to the pyrit_backend server.
max_concurrent_scenario_runs: 3
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I struggle with this a bit. For example, if we use the same adversarial model it's bound to become a bottleneck when there are too many runs that need it. But what if it's just prompt sending attack which doesn't need it at all (maybe for scorers depending on configuration)?

On the other hand, I could run different scenarios against different endpoints in parallel. Let's say it's 1000 TAP attacks against endpoint A in scenario A, and same for B and C. Then I might be worried about running more than one TAP attack with the same endpoint at a time, but there's no reason to hold off on B and C. I suspect we don't control that kind of parallel execution, right?

None, description="Preview of the last message (truncated to ~100 chars)"
)
score_value: str | None = Field(None, description="Score value from the objective scorer")
executed_turns: int = Field(0, ge=0, description="Number of turns executed")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: num_executed_turns

class RunScenarioRequest(BaseModel):
"""Request body for starting a scenario run."""

scenario_name: str = Field(..., description="Registry key of the scenario to run")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
scenario_name: str = Field(..., description="Registry key of the scenario to run")
scenario_name: str = Field(..., description="Name of registered scenario from the ScenarioRegistry")

matching the target_name description below which makes more sense to me

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or if its not necessarily the name, rename to scenario_registry_key


class ScenarioListResponse(BaseModel):
class ListRegisteredScenarioResponse(BaseModel):
"""Response for listing scenarios."""
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
"""Response for listing scenarios."""
"""Response for listing registered scenarios."""

except KeyError as e:
raise ValueError(str(e)) from None

# Validate and run initializers
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: make lines 215-226 an initializer validation function, 229-241 a resolve target function, and 253-266 a resolve strategy function

identifier_filters (Optional[Sequence[IdentifierFilter]], optional):
A sequence of IdentifierFilter objects that allows filtering by identifier JSON properties.
Defaults to None.
limit (int | None): Maximum number of results to return. Defaults to None (no limit).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: limit is not super intuitive to me, i'd prefer something like max_results or result_limit

Copy link
Copy Markdown
Contributor

@hannahwestra25 hannahwestra25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nits but looks good!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants