Standards become real when you can run them.
Without a working reference implementation, “a standard” often means:
- the contracts are theoretically correct,
- but nobody can show you a full end-to-end run,
- and implementers end up guessing the intended semantics.
JARVIS exists to prevent that.
What JARVIS is
JARVIS is the first-party open-source reference stack that implements ARP Standard v1 end-to-end.
It is designed to be:
- opinionated enough to be runnable and debuggable,
- swappable enough that you can replace pieces without rewrites,
- and spec-aligned so implementers can treat it as a baseline.
What JARVIS is for
1) Proving the Standard in reality
JARVIS demonstrates:
- bounded candidate selection,
- enforceable constraint envelopes,
- policy checkpoints,
- durable run artifacts.
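To make the first item concrete, here is a minimal sketch of bounded candidate selection in Python. It is not JARVIS code: the `Candidate` class, the `select_bounded` function, and the `billing.close_account` node type are hypothetical names invented for illustration, and ranking by a single score is an assumption. The point is only that the decision-maker sees a capped, ranked menu instead of every capability in the system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    # Hypothetical shape: one selectable capability and its relevance score.
    node_type_id: str
    score: float


def select_bounded(candidates: list[Candidate], top_k: int) -> list[Candidate]:
    """Return at most top_k candidates, highest score first."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    return ranked[:top_k]


# Four capabilities are available, but the menu is capped at three.
pool = [
    Candidate("support.escalate_to_human", 0.61),
    Candidate("billing.initiate_refund", 0.90),
    Candidate("billing.create_case_for_agent", 0.72),
    Candidate("billing.close_account", 0.15),  # hypothetical, illustration only
]
menu = select_bounded(pool, top_k=3)
print([c.node_type_id for c in menu])
```

Whatever scoring a real selector uses, the bound is what matters: anything outside the menu cannot be chosen for that step.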
2) Giving teams a “golden path” onramp
Most teams don’t want to build the platform first.
JARVIS is the “run it now” path:
- start the stack,
- run a workflow,
- inspect artifacts,
- iterate.
3) Providing a baseline for swapping components
If you want to implement your own services:
- start from a known-good baseline,
- swap one component at a time,
- validate your changes against stable contracts.
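One way to picture that workflow is a contract check that runs unchanged against the baseline and against your replacement. The sketch below is an assumption-heavy illustration, not the ARP contract itself: `Selector`, `BaselineSelector`, `CustomSelector`, and `contract_violations` are hypothetical names, and the two checked rules (menu size and field names) are stand-ins for whatever the Standard actually specifies.

```python
from typing import Protocol


class Selector(Protocol):
    """Hypothetical contract for the component being swapped."""

    def select(self, subtask: str, top_k: int) -> list[dict]: ...


class BaselineSelector:
    """Stand-in for the known-good baseline component."""

    def select(self, subtask: str, top_k: int) -> list[dict]:
        menu = [
            {"node_type_id": "billing.initiate_refund", "score": 0.9},
            {"node_type_id": "billing.create_case_for_agent", "score": 0.72},
            {"node_type_id": "support.escalate_to_human", "score": 0.61},
        ]
        return menu[:top_k]


class CustomSelector:
    """Your replacement: different internals, same outer shape."""

    def select(self, subtask: str, top_k: int) -> list[dict]:
        names = ["support.escalate_to_human", "billing.initiate_refund"]
        return [{"node_type_id": n, "score": 0.5} for n in names][:top_k]


def contract_violations(selector: Selector, top_k: int = 2) -> list[str]:
    """Return a list of contract problems; an empty list means it passed."""
    problems = []
    menu = selector.select("Initiate refund if eligible", top_k)
    if len(menu) > top_k:
        problems.append(f"menu has {len(menu)} entries, bound is {top_k}")
    for entry in menu:
        if set(entry) != {"node_type_id", "score"}:
            problems.append(f"unexpected fields: {sorted(entry)}")
    return problems


# The same check runs against both implementations.
for impl in (BaselineSelector(), CustomSelector()):
    print(type(impl).__name__, contract_violations(impl))
```

Because the check only looks at the outer shape of the result, it does not care how either selector ranks candidates internally.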
What’s inside, conceptually
JARVIS includes conformant implementations for the Standard’s core roles, plus practical durability primitives so you can inspect real runs.
It ships:
- bounded selection,
- constraint enforcement,
- durable event and artifact storage,
- a runnable local profile for quick iteration.
Exact packaging and versions are tracked in the quickstart docs and release repo.
How teams use JARVIS
Learn
Run it locally and build intuition for how bounded orchestration and durable artifacts work.
Extend
Add your own capabilities and watch them appear in candidate menus and run timelines.
Replace
Swap in your own implementation of one component while leaving everything else in place.
This is how ARP stays framework-agnostic: your internals can differ while the outer artifacts remain stable.
Quickstart
If you want the fastest “see it work” loop:
- Quickstart: /quickstart
- JARVIS overview: /jarvis
The key thing to look at first
When you run your first workflow, don’t focus on the model output.
Focus on the artifacts:
- the bounded candidate menu that constrained decisions,
- the constraint envelope that limited blast radius,
- the policy checkpoints that gated side effects,
- the durable events that let you replay “what happened and why.”
Simplified example:
{
  "candidate_set_id": "cs_01J...",
  "subtask": "Initiate refund if eligible",
  "top_k": 3,
  "candidates": [
    { "node_type_id": "billing.initiate_refund", "score": 0.9 },
    { "node_type_id": "billing.create_case_for_agent", "score": 0.72 },
    { "node_type_id": "support.escalate_to_human", "score": 0.61 }
  ]
}
And a simplified event timeline:
RunStarted run_01J...
CandidateSetCreated cs_01J... subtask=t3 top_k=3
PolicyCheckpoint pre_invoke allow billing.initiate_refund
NodeRunCompleted nr_01J... billing.initiate_refund status=ok
RunCompleted run_01J...
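Durable events are useful because you can check them mechanically. As a rough sketch, the Python below reads the event lines above and reports any node run that was not preceded by an allowing policy checkpoint. The whitespace-separated line format is taken from the simplified example only and is not a documented JARVIS format; `parse_event` and `ungated_node_runs` are hypothetical helper names.

```python
TIMELINE = """\
RunStarted run_01J...
CandidateSetCreated cs_01J... subtask=t3 top_k=3
PolicyCheckpoint pre_invoke allow billing.initiate_refund
NodeRunCompleted nr_01J... billing.initiate_refund status=ok
RunCompleted run_01J...
"""


def parse_event(line: str) -> tuple[str, list[str]]:
    """Split one line into its event kind and remaining fields."""
    kind, *fields = line.split()
    return kind, fields


def ungated_node_runs(timeline: str) -> list[str]:
    """Return node types that ran without an earlier 'allow' checkpoint."""
    allowed: set[str] = set()
    ungated: list[str] = []
    for line in timeline.splitlines():
        if not line.strip():
            continue
        kind, fields = parse_event(line)
        # Assumed layout: PolicyCheckpoint <stage> <decision> <node_type_id>
        if kind == "PolicyCheckpoint" and fields[1] == "allow":
            allowed.add(fields[2])
        # Assumed layout: NodeRunCompleted <node_run_id> <node_type_id> status=...
        elif kind == "NodeRunCompleted" and fields[1] not in allowed:
            ungated.append(fields[1])
    return ungated


print(ungated_node_runs(TIMELINE))
```

For the timeline shown, the list comes back empty because the refund node was allowed before it ran. Remove the PolicyCheckpoint line and the same check flags `billing.initiate_refund`.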
That’s the difference between a demo and a system.
Next in the series: ARP in 10 minutes: one run, five artifacts