21 changes: 21 additions & 0 deletions doc/code/targets/0_prompt_targets.md
@@ -107,6 +107,27 @@ target = MyHTTPTarget(custom_configuration=config, ...)

The full implementation lives in [`pyrit/prompt_target/common/target_capabilities.py`](https://github.com/microsoft/PyRIT/blob/main/pyrit/prompt_target/common/target_capabilities.py) and [`pyrit/prompt_target/common/target_configuration.py`](https://github.com/microsoft/PyRIT/blob/main/pyrit/prompt_target/common/target_configuration.py). For runnable examples — inspecting capabilities on a real target, comparing known model profiles, and `ADAPT` vs `RAISE` in action — see [Target Capabilities](./6_1_target_capabilities.ipynb).

### Querying live target capabilities

Declared capabilities describe what a target *should* support. For deployments where actual behavior is uncertain — custom OpenAI-compatible endpoints, gateways that strip features, models whose support drifts — you can probe what the target *actually* accepts at runtime:

```python
from pyrit.prompt_target import (
    query_target_capabilities_async,
    verify_target_async,
    verify_target_modalities_async,
)

# Probe a single dimension:
verified_caps = await query_target_capabilities_async(target=target)
verified_modalities = await verify_target_modalities_async(target=target)

# Or do both at once and get a populated TargetCapabilities back:
verified = await verify_target_async(target=target)
```

Each probe sends a minimal request (bounded by `per_probe_timeout_s`, default 30s, with one retry on transient errors) and only marks a capability or modality as supported if the call returns cleanly. "Supported" here means *the request was accepted* — a target that silently ignores a system prompt or `response_format` directive is still reported as supporting it, so validate response content out of band when the distinction matters. These functions are not safe to call concurrently with other operations on the same target instance: they temporarily mutate `target._configuration` and write probe rows to memory (rows are tagged with `prompt_metadata["capability_probe"] == "1"` for filtering). See [Target Capabilities](./6_1_target_capabilities.ipynb) for runnable examples.
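
For example, downstream analysis can drop probe traffic by filtering on that metadata key. A minimal sketch, where `pieces` stands in for whatever your memory accessor returns (the retrieval call is backend-specific and omitted here):

```python
# Keep only rows that did not come from a capability probe.
non_probe_pieces = [
    piece
    for piece in pieces
    if (piece.prompt_metadata or {}).get("capability_probe") != "1"
]
```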

## Multi-Modal Targets

Like most of PyRIT, targets can be multi-modal.
203 changes: 194 additions & 9 deletions doc/code/targets/6_1_target_capabilities.ipynb
@@ -53,13 +53,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"No new upgrade operations detected.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"No new upgrade operations detected.\n",
"supports_multi_turn: True\n",
"supports_editable_history: True\n",
"supports_system_prompt: True\n",
@@ -462,8 +456,199 @@
"try:\n",
" no_editable_history.ensure_can_handle(capability=CapabilityName.EDITABLE_HISTORY)\n",
"except ValueError as exc:\n",
" print(exc)\n",
"# ---"
" print(exc)"
]
},
{
"cell_type": "markdown",
"id": "19",
"metadata": {},
"source": [
"## 7. Querying live target capabilities\n",
"\n",
"Declared capabilities describe what a target *should* support. For deployments where the actual\n",
"behavior is uncertain — custom OpenAI-compatible endpoints, gateways that strip features, models\n",
"whose support drifts over time — you can probe what the target *actually* accepts at runtime with\n",
"`query_target_capabilities_async`, `verify_target_modalities_async`, or the convenience wrapper\n",
"`verify_target_async` that runs both and returns a populated `TargetCapabilities`.\n",
"\n",
"`query_target_capabilities_async` walks each capability that has a registered probe (currently\n",
"`SYSTEM_PROMPT`, `MULTI_MESSAGE_PIECES`, `MULTI_TURN`, `JSON_OUTPUT`, `JSON_SCHEMA`), sends a\n",
"minimal request, and includes the capability in the returned set only if the call succeeds.\n",
"During probing the target's configuration is temporarily replaced with a permissive one so\n",
"`ensure_can_handle` does not short-circuit a probe for a capability the target declares as\n",
"unsupported. The original configuration is restored before the function returns.\n",
"\n",
"`verify_target_modalities_async` does the same for input modality combinations declared in\n",
"`capabilities.input_modalities`, sending a small payload built from optional `test_assets`.\n",
"\n",
"Each probe call is bounded by `per_probe_timeout_s` (default 30s) and is retried once on\n",
"transient errors before being declared failed. \"Supported\" here means *the request was\n",
"accepted* — a target that silently ignores a system prompt or `response_format` directive will\n",
"still be reported as supporting that capability.\n",
"\n",
"These functions are **not safe to call concurrently** with other operations on the same target\n",
"instance: they temporarily mutate `target._configuration` and write probe rows to\n",
"`target._memory`. Probe-written memory rows are tagged with\n",
"`prompt_metadata[\"capability_probe\"] == \"1\"` so consumers can filter them.\n",
"\n",
"Typical usage against a real endpoint:\n",
"\n",
"```python\n",
"from pyrit.prompt_target import verify_target_async\n",
"\n",
"verified = await verify_target_async(target=target)\n",
"print(verified)\n",
"```\n",
"\n",
"Below we mock the target's underlying transport (`_send_prompt_to_target_async`) so the notebook\n",
"stays self-contained — the result shape is the same as a live run. We mock the protected method\n",
"rather than `send_prompt_async` so the probe still exercises the real validation and memory\n",
"pipeline."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "20",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"verified capabilities:\n",
" - supports_editable_history\n",
" - supports_json_output\n",
" - supports_json_schema\n",
" - supports_multi_message_pieces\n",
" - supports_multi_turn\n",
" - supports_system_prompt\n"
]
}
],
"source": [
"from unittest.mock import AsyncMock\n",
"\n",
"from pyrit.models import MessagePiece\n",
"from pyrit.prompt_target import (\n",
" query_target_capabilities_async,\n",
" verify_target_async,\n",
")\n",
"\n",
"\n",
"def _ok_response():\n",
" return [\n",
" Message(\n",
" [\n",
" MessagePiece(\n",
" role=\"assistant\",\n",
" original_value=\"ok\",\n",
" original_value_data_type=\"text\",\n",
" conversation_id=\"probe\",\n",
" response_error=\"none\",\n",
" )\n",
" ]\n",
" )\n",
" ]\n",
"\n",
"\n",
"probe_target = OpenAIChatTarget(model_name=\"gpt-4o\", endpoint=\"https://example.invalid/\", api_key=\"sk-not-a-real-key\")\n",
"probe_target._send_prompt_to_target_async = AsyncMock(return_value=_ok_response()) # type: ignore[method-assign]\n",
"\n",
"verified = await query_target_capabilities_async(target=probe_target, per_probe_timeout_s=5.0) # type: ignore\n",
"print(\"verified capabilities:\")\n",
"for capability in sorted(verified, key=lambda c: c.value):\n",
" print(f\" - {capability.value}\")"
]
},
{
"cell_type": "markdown",
"id": "21",
"metadata": {},
"source": [
"To narrow the probe to specific capabilities (faster, fewer calls), pass `capabilities=`:\n",
"\n",
"```python\n",
"from pyrit.prompt_target.common.target_capabilities import CapabilityName\n",
"\n",
"verified = await query_target_capabilities_async(\n",
" target=target,\n",
" capabilities=[CapabilityName.JSON_SCHEMA, CapabilityName.SYSTEM_PROMPT],\n",
")\n",
"```\n",
"\n",
"`verify_target_async` is the most common entry point: it runs both the capability and modality\n",
"probes and assembles a `TargetCapabilities` you can drop straight into a `TargetConfiguration`,\n",
"so the rest of PyRIT (attacks, scorers, the normalization pipeline) operates on capabilities\n",
"that have been observed to work end-to-end."
]
},
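{
"cell_type": "markdown",
"id": "21a",
"metadata": {},
"source": [
"A sketch of that hand-off; the `capabilities=` constructor argument is an\n",
"assumption for illustration, so check `target_configuration.py` for the exact\n",
"signature:\n",
"\n",
"```python\n",
"from pyrit.prompt_target.common.target_configuration import TargetConfiguration\n",
"\n",
"verified = await verify_target_async(target=target)\n",
"config = TargetConfiguration(capabilities=verified)  # hypothetical kwarg\n",
"hardened = MyHTTPTarget(custom_configuration=config)  # as in 0_prompt_targets.md\n",
"```"
]
},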
{
"cell_type": "code",
"execution_count": null,
"id": "22",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"verify_target_async result:\n",
" supports_multi_turn: True\n",
" supports_system_prompt: True\n",
" supports_multi_message_pieces: True\n",
" supports_json_output: True\n",
" supports_json_schema: True\n",
" input_modalities: [['text']]\n"
]
}
],
"source": [
"probe_target._send_prompt_to_target_async = AsyncMock(return_value=_ok_response()) # type: ignore[method-assign]\n",
"\n",
"verified_caps = await verify_target_async(target=probe_target, per_probe_timeout_s=5.0) # type: ignore\n",
"print(\"verify_target_async result:\")\n",
"print(f\" supports_multi_turn: {verified_caps.supports_multi_turn}\")\n",
"print(f\" supports_system_prompt: {verified_caps.supports_system_prompt}\")\n",
"print(f\" supports_multi_message_pieces: {verified_caps.supports_multi_message_pieces}\")\n",
"print(f\" supports_json_output: {verified_caps.supports_json_output}\")\n",
"print(f\" supports_json_schema: {verified_caps.supports_json_schema}\")\n",
"print(f\" input_modalities: {sorted(sorted(m) for m in verified_caps.input_modalities)}\")"
]
},
{
"cell_type": "markdown",
"id": "23",
"metadata": {},
"source": [
"### Discovering undeclared modalities\n",
"\n",
"By default `verify_target_async` only probes modality combinations the target already\n",
"**declares** in `capabilities.input_modalities`. For an OpenAI-compatible endpoint that\n",
"claims text-only but might actually accept images, pass `test_modalities=` (and matching\n",
"`test_assets=`) explicitly to probe combinations beyond the declared baseline:\n",
"\n",
"```python\n",
"verified = await verify_target_async(\n",
" target=target,\n",
" test_modalities={frozenset({\"text\"}), frozenset({\"text\", \"image_path\"})},\n",
" test_assets={\"image_path\": \"/path/to/test_image.png\"},\n",
")\n",
"```\n",
"\n",
"Similarly, when narrowing the probe set with `capabilities=`, capabilities NOT in the\n",
"narrowed set are copied from the target's declared values rather than being reset to\n",
"`False` — narrowing controls *what is re-verified*, not what the returned dataclass\n",
"reports. This makes incremental probing safe:\n",
"\n",
"```python\n",
"# Re-verify only JSON support; other declared flags pass through unchanged.\n",
"verified = await verify_target_async(\n",
" target=target,\n",
" capabilities={CapabilityName.JSON_OUTPUT, CapabilityName.JSON_SCHEMA},\n",
")\n",
"```"
]
}
],
136 changes: 135 additions & 1 deletion doc/code/targets/6_1_target_capabilities.py
@@ -245,4 +245,138 @@
    no_editable_history.ensure_can_handle(capability=CapabilityName.EDITABLE_HISTORY)
except ValueError as exc:
    print(exc)
# ---

# %% [markdown]
# ## 7. Querying live target capabilities
#
# Declared capabilities describe what a target *should* support. For deployments where the actual
# behavior is uncertain — custom OpenAI-compatible endpoints, gateways that strip features, models
# whose support drifts over time — you can probe what the target *actually* accepts at runtime with
# `query_target_capabilities_async`, `verify_target_modalities_async`, or the convenience wrapper
# `verify_target_async` that runs both and returns a populated `TargetCapabilities`.
#
# `query_target_capabilities_async` walks each capability that has a registered probe (currently
# `SYSTEM_PROMPT`, `MULTI_MESSAGE_PIECES`, `MULTI_TURN`, `JSON_OUTPUT`, `JSON_SCHEMA`), sends a
# minimal request, and includes the capability in the returned set only if the call succeeds.
# During probing the target's configuration is temporarily replaced with a permissive one so
# `ensure_can_handle` does not short-circuit a probe for a capability the target declares as
# unsupported. The original configuration is restored before the function returns.
#
# `verify_target_modalities_async` does the same for input modality combinations declared in
# `capabilities.input_modalities`, sending a small payload built from optional `test_assets`.
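#
# For a target that declares image input, you could supply a sample file via
# `test_assets` (a sketch; the path is a placeholder you would point at a real file):
#
# ```python
# from pyrit.prompt_target import verify_target_modalities_async
#
# supported = await verify_target_modalities_async(
#     target=target,
#     test_assets={"image_path": "/path/to/test_image.png"},
# )
# print(supported)  # the modality combinations the target accepted
# ```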
#
# Each probe call is bounded by `per_probe_timeout_s` (default 30s) and is retried once on
# transient errors before being declared failed. "Supported" here means *the request was
# accepted* — a target that silently ignores a system prompt or `response_format` directive will
# still be reported as supporting that capability.
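#
# When the distinction matters, validate response content out of band. A generic
# sketch (no PyRIT API involved) for checking that a JSON-mode response actually
# parses:
#
# ```python
# import json
#
#
# def is_valid_json(text: str) -> bool:
#     # A probe only proves the request was accepted; this checks the content.
#     try:
#         json.loads(text)
#         return True
#     except json.JSONDecodeError:
#         return False
# ```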
#
# These functions are **not safe to call concurrently** with other operations on the same target
# instance: they temporarily mutate `target._configuration` and write probe rows to
# `target._memory`. Probe-written memory rows are tagged with
# `prompt_metadata["capability_probe"] == "1"` so consumers can filter them.
#
# Typical usage against a real endpoint:
#
# ```python
# from pyrit.prompt_target import verify_target_async
#
# verified = await verify_target_async(target=target)
# print(verified)
# ```
#
# Below we mock the target's underlying transport (`_send_prompt_to_target_async`) so the notebook
# stays self-contained — the result shape is the same as a live run. We mock the protected method
# rather than `send_prompt_async` so the probe still exercises the real validation and memory
# pipeline.

# %%
from unittest.mock import AsyncMock

from pyrit.models import MessagePiece
from pyrit.prompt_target import (
    query_target_capabilities_async,
    verify_target_async,
)


def _ok_response():
    return [
        Message(
            [
                MessagePiece(
                    role="assistant",
                    original_value="ok",
                    original_value_data_type="text",
                    conversation_id="probe",
                    response_error="none",
                )
            ]
        )
    ]


probe_target = OpenAIChatTarget(model_name="gpt-4o", endpoint="https://example.invalid/", api_key="sk-not-a-real-key")
probe_target._send_prompt_to_target_async = AsyncMock(return_value=_ok_response()) # type: ignore[method-assign]

verified = await query_target_capabilities_async(target=probe_target, per_probe_timeout_s=5.0) # type: ignore
print("verified capabilities:")
for capability in sorted(verified, key=lambda c: c.value):
print(f" - {capability.value}")

# %% [markdown]
# To narrow the probe to specific capabilities (faster, fewer calls), pass `capabilities=`:
#
# ```python
# from pyrit.prompt_target.common.target_capabilities import CapabilityName
#
# verified = await query_target_capabilities_async(
#     target=target,
#     capabilities=[CapabilityName.JSON_SCHEMA, CapabilityName.SYSTEM_PROMPT],
# )
# ```
#
# `verify_target_async` is the most common entry point: it runs both the capability and modality
# probes and assembles a `TargetCapabilities` you can drop straight into a `TargetConfiguration`,
# so the rest of PyRIT (attacks, scorers, the normalization pipeline) operates on capabilities
# that have been observed to work end-to-end.
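#
# A sketch of that hand-off; the `capabilities=` constructor argument is an
# assumption for illustration, so check `target_configuration.py` for the exact
# signature:
#
# ```python
# from pyrit.prompt_target.common.target_configuration import TargetConfiguration
#
# verified = await verify_target_async(target=target)
# config = TargetConfiguration(capabilities=verified)  # hypothetical kwarg
# hardened = MyHTTPTarget(custom_configuration=config)  # as in 0_prompt_targets.md
# ```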

# %%
probe_target._send_prompt_to_target_async = AsyncMock(return_value=_ok_response()) # type: ignore[method-assign]

verified_caps = await verify_target_async(target=probe_target, per_probe_timeout_s=5.0) # type: ignore
print("verify_target_async result:")
print(f" supports_multi_turn: {verified_caps.supports_multi_turn}")
print(f" supports_system_prompt: {verified_caps.supports_system_prompt}")
print(f" supports_multi_message_pieces: {verified_caps.supports_multi_message_pieces}")
print(f" supports_json_output: {verified_caps.supports_json_output}")
print(f" supports_json_schema: {verified_caps.supports_json_schema}")
print(f" input_modalities: {sorted(sorted(m) for m in verified_caps.input_modalities)}")

# %% [markdown]
# ### Discovering undeclared modalities
#
# By default `verify_target_async` only probes modality combinations the target already
# **declares** in `capabilities.input_modalities`. For an OpenAI-compatible endpoint that
# claims text-only but might actually accept images, pass `test_modalities=` (and matching
# `test_assets=`) explicitly to probe combinations beyond the declared baseline:
#
# ```python
# verified = await verify_target_async(
#     target=target,
#     test_modalities={frozenset({"text"}), frozenset({"text", "image_path"})},
#     test_assets={"image_path": "/path/to/test_image.png"},
# )
# ```
#
# Similarly, when narrowing the probe set with `capabilities=`, capabilities **not** in the
# narrowed set are copied from the target's declared values rather than being reset to
# `False` — narrowing controls *what is re-verified*, not what the returned dataclass
# reports. This makes incremental probing safe:
#
# ```python
# # Re-verify only JSON support; other declared flags pass through unchanged.
# verified = await verify_target_async(
#     target=target,
#     capabilities={CapabilityName.JSON_OUTPUT, CapabilityName.JSON_SCHEMA},
# )
# ```