Tester YAML Reference
Schema version: 0.2 (SUPPORTED_TESTER_SCHEMA_VERSION in tester_config.py).
Unsupported schema_version values raise ConfigError.
Document structure
schema_version: "0.2"
run: { }
benchmark: { }
harness: { }
verification: { }  # optional

Unknown keys at any level raise ConfigError.
run
| Field | Type | Required | Description |
|---|---|---|---|
| id | string | yes | Run identifier stored in each JSONL record |
| output_dir | string | yes | Output directory; relative paths resolve from the tester YAML location |
benchmark
| Field | Type | Required | Description |
|---|---|---|---|
| manifest | string | yes | Path to the benchmark manifest YAML |
| tasks | string | yes | Path to the JSONL file of task rows |
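
As a minimal sketch, the run and benchmark sections described above might be filled in as follows. The run identifier, directory, and file names are hypothetical placeholders, not values taken from the SecureBench repo.

```yaml
schema_version: "0.2"
run:
  id: demo-run                 # hypothetical identifier; stored in each JSONL record
  output_dir: results/demo-run # hypothetical; relative to the tester YAML location
benchmark:
  manifest: manifest.yaml      # hypothetical path to the benchmark manifest YAML
  tasks: tasks.jsonl           # hypothetical path to the JSONL task rows
```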
harness
| Field | Type | Required | Description |
|---|---|---|---|
| type | string | yes | One of codex, claude_code, or command |
| env | string[] | no | Names of environment variables to forward; names only, without =value |
| config | object | no | Harness-specific settings; defaults to {} |
harness.config for command
| Field | Type | Default | Description |
|---|---|---|---|
| command | string or string[] | required | Shell command run in the container |
| artifact_path | string | none | Workspace-relative file for candidate extraction |
| task_file | string | task.json | Agent task JSON path |
| timeout_seconds | number | none | Harness timeout |
| allowed_domains | string[] | [] | Extra egress domains |
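
A command harness using the fields above could look roughly like this. The script name, artifact path, timeout, and domain are illustrative assumptions; only the field names and the task.json default come from the table.

```yaml
harness:
  type: command
  config:
    # Hypothetical agent entry point; command may also be a single string.
    command: ["python", "agent.py", "--task", "task.json"]
    artifact_path: output/candidate.patch  # hypothetical workspace-relative file
    task_file: task.json                   # documented default, shown explicitly
    timeout_seconds: 600                   # illustrative; no default is documented
    allowed_domains:
      - pypi.org                           # illustrative extra egress domain
```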
harness.config for codex
| Field | Type | Default | Description |
|---|---|---|---|
| model | string | required | Codex model name |
| version | string | latest | Codex CLI version for the overlay build |
| task_file | string | task.json | Agent task JSON path |
| timeout_seconds | number | 900 | Harness timeout |
| allowed_domains | string[] | [] | Extra egress domains, added to the default api.openai.com |
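
A Codex harness might be configured as sketched below. The model name and the extra domain are placeholders chosen for illustration; substitute a model your Codex CLI actually supports.

```yaml
harness:
  type: codex
  config:
    model: example-codex-model  # placeholder; model is required and has no default
    version: latest             # documented default, shown explicitly
    timeout_seconds: 900        # documented default
    allowed_domains:
      - github.com              # illustrative; allowed in addition to api.openai.com
```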
harness.config for claude_code
Same fields as Codex with these defaults:
| Field | Default |
|---|---|
| model | sonnet |
| version | latest |
| task_file | task.json |
| timeout_seconds | 900 |
Default egress includes api.anthropic.com. Requires ANTHROPIC_API_KEY in harness.env.
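
Putting the defaults and the API key requirement together, a Claude Code harness might look like the following sketch. The timeout override is an illustrative choice, not a recommended value.

```yaml
harness:
  type: claude_code
  env:
    - ANTHROPIC_API_KEY    # variable name only; no value is written here
  config:
    model: sonnet          # documented default, shown explicitly
    timeout_seconds: 1200  # illustrative override of the 900 default
```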
verification
| Field | Type | Default | Description |
|---|---|---|---|
| disallow_dangerous_commands | boolean | true | Blocks dangerous commands requested through the benchmark's needed_commands; set to false to allow them |
| deny_commands | string[] | [] | Explicit deny list (known commands only: chroot) |
Omitted verification section uses defaults (fail-closed for dangerous commands).
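
For illustration, a verification section that spells out the fail-closed default and adds an explicit deny entry could be written as:

```yaml
verification:
  disallow_dangerous_commands: true  # same as the default when the section is omitted
  deny_commands:
    - chroot                         # the only known command listed in this reference
```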
Path resolution
_config_path() resolves relative paths against the tester YAML parent directory. Absolute paths are used as-is.
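
The sketch below illustrates the rule using output_dir. The file location /work/benchmarks/demo/tester-demo.yaml is a hypothetical assumption made only to show the resulting paths.

```yaml
# Assumed location of this file: /work/benchmarks/demo/tester-demo.yaml
run:
  id: demo-run
  # Relative path: would resolve to /work/benchmarks/demo/results/demo-run
  output_dir: results/demo-run
  # Absolute alternative, used as-is:
  # output_dir: /var/tmp/demo-run
```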
Errors
All validation errors raise securebench.errors.ConfigError (subclass of ValueError).
Common messages:
- Unsupported tester schema_version '…'
- harness.type must be one of: claude_code, codex, command
- harness.env[N] must be a non-empty environment variable name without '='
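
As an example of the last message, a fragment like the following would presumably be rejected, because the env entry carries a value instead of a bare name. The value shown is a placeholder.

```yaml
harness:
  type: claude_code
  env:
    # Invalid: contains '=', so validation should raise ConfigError.
    - ANTHROPIC_API_KEY=placeholder-value
    # Valid form would be the bare name:
    # - ANTHROPIC_API_KEY
```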
Examples
See Tester Configuration and benchmarks/*/tester-*.yaml in the SecureBench repo.