Tester YAML Reference

Schema version: 0.2 (SUPPORTED_TESTER_SCHEMA_VERSION in tester_config.py).

Unsupported schema_version values raise ConfigError.

Document structure

    schema_version: "0.2"
    run: { }
    benchmark: { }
    harness: { }
    verification: { }   # optional

Unknown keys at any level raise ConfigError.

run

| Field | Type | Required | Description |
|---|---|---|---|
| id | string | yes | Run identifier stored in each JSONL record |
| output_dir | string | yes | Output directory; relative paths resolve from the tester YAML location |

benchmark

| Field | Type | Required | Description |
|---|---|---|---|
| manifest | string | yes | Path to benchmark manifest YAML |
| tasks | string | yes | Path to JSONL task rows |
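
As a minimal sketch, the run and benchmark sections might be filled in as follows. All values are hypothetical; only the field names come from this reference.

```yaml
# Hypothetical values throughout; only the field names are documented.
run:
  id: demo-run-001                # stored in each JSONL record
  output_dir: results/demo-run    # relative, so resolved from the tester YAML location
benchmark:
  manifest: manifest.yaml         # benchmark manifest YAML
  tasks: tasks.jsonl              # JSONL task rows
```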

harness

| Field | Type | Required | Description |
|---|---|---|---|
| type | string | yes | codex, claude_code, or command |
| env | string[] | no | Names of environment variables to forward; names only, without an =value part |
| config | object | no | Harness-specific; default {} |

harness.config for command

| Field | Type | Default | Description |
|---|---|---|---|
| command | string or string[] | required | Shell command run in container |
| artifact_path | string | none | Workspace-relative file for candidate extraction |
| task_file | string | task.json | Agent task JSON path |
| timeout_seconds | number | none | Harness timeout in seconds |
| allowed_domains | string[] | [] | Extra egress domains |
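
A command harness could be configured along these lines. This is a sketch only: the command, file names, and domain are hypothetical placeholders, not values taken from SecureBench.

```yaml
harness:
  type: command
  config:
    # Hypothetical command in the string[] form; a single string is also accepted.
    command: ["python", "solve.py", "--task", "task.json"]
    artifact_path: out/candidate.txt   # hypothetical workspace-relative file
    task_file: task.json               # the documented default, shown explicitly
    timeout_seconds: 600               # hypothetical; the default is none
    allowed_domains:
      - pypi.org                       # hypothetical extra egress domain
```

Only command is required; the other keys can be dropped to fall back to the defaults in the table.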

harness.config for codex

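| Field | Type | Default | Description |
|---|---|---|---|
| model | string | required | Codex model name |
| version | string | latest | Codex CLI version for overlay build |
| task_file | string | task.json | Agent task JSON path |
| timeout_seconds | number | 900 | Harness timeout in seconds |
| allowed_domains | string[] | [] | Extra egress domains, added to the default api.openai.com |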
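
A codex harness sketch, assuming a placeholder model name and a hypothetical extra domain. harness.env is left out because this reference does not name the variable Codex expects.

```yaml
harness:
  type: codex
  config:
    model: your-codex-model       # placeholder; model is required and has no default
    version: latest               # the documented default
    timeout_seconds: 900          # the documented default
    allowed_domains:
      - files.example.com         # hypothetical; allowed in addition to api.openai.com
```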

harness.config for claude_code

Same fields as Codex with these defaults:

| Field | Default |
|---|---|
| model | sonnet |
| version | latest |
| task_file | task.json |
| timeout_seconds | 900 |

Default egress includes api.anthropic.com. Requires ANTHROPIC_API_KEY in harness.env.
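
A claude_code harness that relies mostly on the defaults might look like this. The timeout override is a hypothetical value chosen for illustration.

```yaml
harness:
  type: claude_code
  env:
    - ANTHROPIC_API_KEY           # variable name only, no "=value"
  config:
    model: sonnet                 # the documented default, shown explicitly
    timeout_seconds: 1800         # hypothetical override of the 900 default
```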

verification

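| Field | Type | Default | Description |
|---|---|---|---|
| disallow_dangerous_commands | boolean | true | Blocks dangerous commands requested in the benchmark's needed_commands unless set to false |
| deny_commands | string[] | [] | Explicit deny list; only known commands are accepted (chroot) |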

If the verification section is omitted, the defaults apply (fail-closed for dangerous commands).
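
For illustration, a verification section that keeps the default and adds an explicit deny entry could be written as below. chroot is the only known command this reference names.

```yaml
verification:
  disallow_dangerous_commands: true   # the default; false lifts the block
  deny_commands:
    - chroot                          # must be a known command
```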

Path resolution

_config_path() resolves relative paths against the tester YAML parent directory. Absolute paths are used as-is.
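
A worked example of the rule, assuming the tester YAML lives at the hypothetical path /srv/bench/tester-demo.yaml. That the rule also covers the benchmark paths is an assumption; this reference states it explicitly only for output_dir.

```yaml
# Assumed location of this file: /srv/bench/tester-demo.yaml (hypothetical)
run:
  id: demo
  output_dir: results/demo              # relative: /srv/bench/results/demo
benchmark:
  manifest: manifest.yaml               # relative: /srv/bench/manifest.yaml (assumed)
  tasks: /data/securebench/tasks.jsonl  # absolute: used as-is
```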

Errors

All validation errors raise securebench.errors.ConfigError (subclass of ValueError).

Common messages:

  • Unsupported tester schema_version '…'
  • harness.type must be one of: claude_code, codex, command
  • harness.env[N] must be a non-empty environment variable name without '='
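
The fragments below sketch inputs that would be expected to trigger each message. They are deliberately invalid, use hypothetical values, and are separated as independent YAML documents. Which check fires first when several problems are combined is not stated in this reference.

```yaml
# Expected: Unsupported tester schema_version '…'
schema_version: "9.9"             # hypothetical unsupported version
---
# Expected: harness.type must be one of: claude_code, codex, command
harness:
  type: docker                    # not an accepted harness type
---
# Expected: harness.env[N] must be a non-empty environment variable name without '='
harness:
  type: command
  env:
    - API_TOKEN=abc               # carries a value; only the name is allowed
```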

Examples

See Tester Configuration and benchmarks/*/tester-*.yaml in the SecureBench repo.
