Agent testing harness

ModelMirror

An offline human prompt-pack debugger for agent workflows. Inspect exactly what your runtime would send, emulate the model's next move, and record deterministic PASS/FAIL artifacts before the LLM call happens.

What it does

ModelMirror lets a human step into the model's seat before an agent runtime calls an LLM. It loads a prompt bundle, validates the structure, shows messages, tools, retrieved context, and constraints, then captures whether the expected next move is a response, tool call, clarification, or refusal.

Prompt-pack inspection

See system, developer, user, retrieval, tools, runtime constraints, and raw payload fields without sending them to a model provider.

Human emulation loop

Record what a reasonable model should do next: respond, call a tool, clarify, or refuse, with structured artifacts.

Local by design

No telemetry, no network runtime calls, no tool execution. The base debugger stays deterministic and non-executing.

Run it locally

Clone the repo, install editable, validate a bundle, then open the debugger UI.

git clone https://github.com/Roop-World/ModelMirror.git cd ModelMirror python3 -m pip install -e . python3 -m model_mirror run --bundle templates/packed_prompt.example.json --out outputs --operator demo_operator
Open repository