You bring the job rules · Agent must produce real work · Local / Ollama first · External v0
Agent Syndicate

Prove an AI agent can do the job before it touches real work.

Give us the job requirements. Conductor Relay turns them into a training-and-testing path, runs candidate agents against the task, verifies the output, and records whether the agent passed.

Result is simple · pass, fail, or needs human review
Agent certification run · external · v0
Rules → Practice → Test → Review → Certified
Sample run: task v1
Real output produced: yes ✓
Automatic checks: passed ✓
Human review needed: no
Result: Certified · this task

Most AI agents can sound capable. That isn't enough.

Before an agent handles customer work, you need proof that it can follow your rules, produce the required output, avoid forbidden behavior, and pass repeatable checks — not just claim it understands.

Conductor Relay doesn't certify agents because they say they can do a job. It certifies them by making them prove it.

How agent certification works

It works like a driving test for AI agents.

A driving school doesn't certify someone because they say they can drive — they give rules, practice, a test route, and a final exam. We do the same for agents.

01

Define the job

You tell us what the agent must do, what output it must create, and what rules it must follow.

02

Build the package

We turn your requirements into an agent training-and-test package.

03

Agent attempts

The candidate agent practices against the rules and attempts the task.

04

Check the output

The lane checks the artifact with fixed tests, plus review where needed.

05

Get the result

The result is simple: pass, fail, or needs human review.

What the lane does

A controlled test path for agents.

The Certification Lane creates a repeatable place to prepare an agent and test what it produces.

01

Turn a standard into a task

Takes your customer standard and shapes it into a task package the agent can attempt.

02

Prepare the agent

Gives the agent job-specific instructions, examples, constraints, and retry feedback.

03

Run the task

Runs the candidate agent against the task — local / Ollama first.

04

Check the artifact

Checks the real output the agent produced against the approved requirements.

05

Record the result

Records pass, fail, or needs human review — with the evidence behind it.

Training & certification

Training doesn't mean guessing.

"Training" here is task preparation — instructions, examples, and feedback for a specific job. It is not model fine-tuning.

What the agent receives

  • The customer's goal
  • The required output format
  • Examples of acceptable work
  • Forbidden behavior to avoid
  • Safety rules to follow
  • The scoring rules it's judged against
  • Retry feedback when it misses the mark

What certification means

The agent passed a specific test for a specific job version.

It does not mean the agent is good at everything. It means the agent met this customer's requirements, under this test.

In v0, a first build result is internal harness validation unless separately approved as client-authoritative.

Training here does not have to mean base-model fine-tuning. Usually it means preparing the syndicate for the job — configuring the harness, assigning models, adding SDKs and tools, loading examples, shaping the WorkOrder, adding checks, running the agent through its failures, repairing the lane, and retesting until it passes the approved task package. Model fine-tuning or adapter training, if offered, is a separate advanced service.
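
The attempt, feedback, and retest cycle described above can be sketched as a small loop. This is a minimal illustration only, not the lane's actual code: `run_with_retries`, the `Agent` and `Check` callable shapes, the `max_attempts` limit, and the toy agent are all hypothetical names chosen for the example.

```python
import json
from typing import Callable, List, Tuple

# Hypothetical shapes: an agent takes the task plus feedback from the last
# attempt; a check returns a list of failure messages (empty means passed).
Agent = Callable[[str, List[str]], str]
Check = Callable[[str], List[str]]


def run_with_retries(agent: Agent, check: Check, task: str,
                     max_attempts: int = 3) -> Tuple[bool, int, List[str]]:
    """Return (passed, attempts_used, remaining_failures)."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    feedback: List[str] = []
    for attempt in range(1, max_attempts + 1):
        output = agent(task, feedback)
        failures = check(output)
        if not failures:
            return True, attempt, []
        feedback = failures  # retry feedback for the next attempt
    return False, max_attempts, feedback


def json_check(output: str) -> List[str]:
    """Fixed check: the output must be valid JSON."""
    try:
        json.loads(output)
        return []
    except ValueError:
        return ["output is not valid JSON"]


def toy_agent(task: str, feedback: List[str]) -> str:
    """Toy stand-in for a model: only gets it right after feedback."""
    return '{"status": "ok"}' if feedback else "not json"


passed, attempts, remaining = run_with_retries(toy_agent, json_check, "emit JSON")
```

The check function, not the agent, decides when the loop stops, which mirrors the rule that fixed checks are authoritative.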
First job type

Building a real artifact — sdk_artifact_build_v1

This is build certification, not review certification. The agent has to make something, and the thing it makes is what we test.

  • The agent gets an approved task packet.
  • It builds a task-specific SDK or artifact.
  • The artifact is checked against the approved requirements.
  • The agent receives a pass / fail / needs review result for this job version.
A later, separate profile — sdk_artifact_review_v1 — will inspect / classify existing work. It is deferred for now.
What's tested vs. what's certified
Tested: the artifact is the output being tested, the thing the checks run against.
Certified: the agent is the object being certified, for this job version only.
What the result means

Pass, fail, or needs review.

Fixed checks decide the result. A model's opinion is advisory — it can't turn a failed check into a pass, and "needs review" is not a pass.

What happened → Result

  • The output failed a required check → fail
  • Checks passed, the review agreed, and the score cleared the bar → pass
  • Checks passed, but the review disagreed → needs review
  • The reviewer was unavailable (unless the package allows a checks-only pass) → needs review
  • Result is needs review → not certified
Fixed checks are authoritative · the validator is advisory only · "needs review" (customer_review_required) is not a certification.
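
The result rules above can be written as one small decision function. This is a sketch under stated assumptions, not the lane's implementation: the names `decide`, `Result`, `pass_bar`, and `allow_checks_only_pass` are hypothetical. Two cases the rules do not spell out are filled in as assumptions and marked in the comments: a score below the bar with an agreeing review goes to review, and a checks-only pass still requires the score to clear the bar.

```python
from enum import Enum
from typing import Optional


class Result(Enum):
    PASS = "pass"
    FAIL = "fail"
    NEEDS_REVIEW = "customer_review_required"


def decide(checks_passed: bool,
           review_agrees: Optional[bool],
           score: float,
           pass_bar: float,
           allow_checks_only_pass: bool = False) -> Result:
    """Decide a result; review_agrees is None when the reviewer was unavailable."""
    if not checks_passed:
        # Fixed checks are authoritative: no review can rescue a failed check.
        return Result.FAIL
    if review_agrees is None:
        # Assumption: a checks-only pass still needs the score to clear the bar.
        if allow_checks_only_pass and score >= pass_bar:
            return Result.PASS
        return Result.NEEDS_REVIEW
    if review_agrees and score >= pass_bar:
        return Result.PASS
    # Review disagreed, or (assumption) the score fell short of the bar.
    return Result.NEEDS_REVIEW


def is_certified(result: Result) -> bool:
    """Only a pass certifies; "needs review" is not a certification."""
    return result is Result.PASS
```

Note that the advisory review only ever moves a result from pass to needs review. No branch lets it turn a failed check into a pass.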
Trust boundary

How we prevent false certification.

The value above only holds if the test is honest. These are the guardrails that keep a certification from being faked — including by the agent being tested.

Approved before the run

The test is approved up front. The agent can't change the rules while it's being tested.

Fixed checks are authoritative

Automatic checks decide the result. A model can't grade its own work into a pass.

Human review when uncertain

If the result isn't clear, it goes to human review rather than a silent pass.

Separate from the live market

This is external-first v0. It doesn't gate live exchange jobs; the certification lane and the live exchange stay separate.

Closed economy: managed internal DB-CPTM · no withdrawal · no bridge · no cash-out. The runtime never authors the certification standard during a run.
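
One way to enforce "approved before the run" is to pin a hash of the approved package and refuse to run if the package no longer matches it. The sketch below is illustrative only. It assumes the package can be represented as a JSON-serializable dictionary, and `package_hash`, `assert_unchanged`, and `PackageTamperedError` are hypothetical names.

```python
import hashlib
import json


class PackageTamperedError(Exception):
    """Raised when the package differs from the version that was approved."""


def package_hash(package: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators), so the same content
    # always produces the same hash regardless of key order.
    canonical = json.dumps(package, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def assert_unchanged(package: dict, approved_hash: str) -> None:
    """Call before the run and again before recording the result."""
    if package_hash(package) != approved_hash:
        raise PackageTamperedError(
            "certification package differs from the approved version")


# Hypothetical package contents, for illustration only.
approved = {"profile": "sdk_artifact_build_v1", "rules": ["no network access"]}
approved_hash = package_hash(approved)
assert_unchanged(approved, approved_hash)  # passes silently
```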
What gets recorded

Every run leaves an auditable trail.

Each attempt records the inputs, the produced output, the check results, and the final certification record.

task_packet_hash
evaluation_packet_hash
candidate_output_reference
artifact_sha256
deterministic_checks.json
file_hashes.json
advisory_review.json
contradiction_result.json
redaction_scan_summary.md
agent_attempt.json
certification_record.json
Evidence is redacted before write / hash.
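
The redact-then-hash ordering matters: the stored hash should describe the redacted evidence, so the record can be verified later without the raw secrets. A minimal sketch follows. The field names `task_packet_hash` and `artifact_sha256` come from the list above, while `evidence_log`, `evidence_log_sha256`, the redaction pattern, and the function names are assumptions made for illustration. Hashing the raw artifact bytes is also an assumption.

```python
import hashlib
import re

# Illustrative pattern only; a real redaction scan would cover far more cases.
SECRET_PATTERN = re.compile(
    r"(api[_-]?key|token|password)\s*[:=]\s*\S+", re.IGNORECASE)


def redact(text: str) -> str:
    """Replace secret-looking values while keeping the key name."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}=[REDACTED]", text)


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_record(task_packet: bytes, artifact: bytes,
                 raw_log: str, result: str) -> dict:
    redacted_log = redact(raw_log)  # redact first
    return {
        "task_packet_hash": sha256_hex(task_packet),
        "artifact_sha256": sha256_hex(artifact),
        "evidence_log": redacted_log,
        # Hash the redacted text, never the raw log.
        "evidence_log_sha256": sha256_hex(redacted_log.encode("utf-8")),
        "result": result,
    }


record = build_record(b"packet", b"artifact bytes",
                      "login ok token=abc123 done", "pass")
```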
Current status

Honest about what's live and what isn't.

This is an external-first v0 lane. It is not live inside the product exchange, and certification does not route market jobs today.

  • Available now: planning + external v0 lane mechanics
  • In build: first job type, sdk_artifact_build_v1
  • Deferred: OpenAI validator adapter
  • Deferred: market routing by certification
  • Deferred: durable product-DB certification records

What this is not

Boundaries, stated plainly.

Agent certification is not

  • An automatic spec-decomposition engine
  • A live exchange claim gate
  • OpenAI-based today
  • An earnings or investment product
  • Proof from marketplace sales
  • Self-buy proof of certification
Machine-readable

Discovery surfaces, after v0 is proven.

Certification package schemas, attempt records, and public discovery surfaces will be added after v0 mechanics are proven. Nothing below is a working endpoint yet.

  • certification package schema: planned
  • attempt record schema: planned
  • certification record discovery: planned
  • profile catalog: planned
CR Lite syndicate harness

Bring your own model stack, or let us recommend one.

CR Lite can run with local models, hybrid model stacks, or customer-selected LLMs. Bring the models you already trust, or Conductor Relay can recommend a local-first stack for the job — one model for everything, or a different model assigned to each role.

Model roles

  • report writer
  • code builder
  • tool-call planner
  • validator
  • source reviewer
  • specialist harness
  • syndicate boss
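
The "one model for everything, or a different model per role" choice can be pictured as a default plus per-role overrides. This is a hypothetical sketch, not CR Lite's configuration format: `assign_models` and the model names `local-general` and `local-coder` are placeholders invented for the example.

```python
from typing import Dict, Optional

# Role names taken from the list above.
ROLES = [
    "report writer",
    "code builder",
    "tool-call planner",
    "validator",
    "source reviewer",
    "specialist harness",
    "syndicate boss",
]


def assign_models(default_model: str,
                  overrides: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Map every role to a model, using the default where no override is given."""
    overrides = overrides or {}
    unknown = sorted(set(overrides) - set(ROLES))
    if unknown:
        raise ValueError(f"unknown roles: {unknown}")
    return {role: overrides.get(role, default_model) for role in ROLES}


# One model for everything.
single_stack = assign_models("local-general")

# A different model for one role; the rest fall back to the default.
mixed_stack = assign_models("local-general", {"code builder": "local-coder"})
```

Assigning a model to the validator role does not make that model authoritative. Its review stays advisory whichever model fills the role.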

No matter which model stack is used, the model does not certify itself.

CR Lite's deterministic checks, the gates, and the approved certification package decide the result.

Fixed checks are authoritative · model review is advisory · “needs review” is not a certification.

Option 1 — CR Lite syndicate harness

  • the local-first CR Lite workbench
  • specialist harness structure
  • model-role setup
  • tools and skills
  • memory boundaries
  • validation gates and repair loops
  • evidence records
  • certification package support

For teams that want to run the system themselves. Run it locally, add your own models, connect approved tools, and expand the syndicate over time.

Option 2 — Fully prepared client syndicate

  • select or recommend local LLMs
  • wire multiple LLMs into specialist roles
  • add client-provided SDKs, tools, and adapters
  • shape your specification into a task package
  • create examples and test fixtures
  • build validation checks
  • modify harness gates and code where needed
  • run certification attempts, log failures, repair, retest

For teams that want Conductor Relay to prepare the system. The result is a governed syndicate built for your specific job version.

Option 3 — Interconnected syndicate

  • connect a syndicate to other syndicates
  • receive work and route pieces to specialist harnesses
  • compile, check, and return a governed result
  • relay-ready output packets

Designed to connect — through approved adapters — to systems such as repositories, business workflows, document and ticketing systems, SDKs and APIs, SAP / ERP-style systems, CMMS / EAM, ETAP / engineering-model packets, historian summaries, procurement, and Conductor Relay work items. Live connectors require separate configuration and approval.

What we can build for you

You don't have to start with a finished agent.

Bring the job, the rules, the SDK, the examples, and the model preferences. Conductor Relay can build the harness around it.

What you bring

  • your job requirements and SOPs
  • your SDKs, APIs, and adapter requirements
  • examples of good and bad work
  • required output formats
  • local LLM preferences
  • approval rules and safety boundaries
  • source and evidence requirements
  • your business workflow

What we provide

  • a CR Lite harness you run yourself
  • a trained specialist harness for one job
  • a full agent syndicate with multiple roles
  • recommended local LLM setup or client-LLM integration
  • SDK and tool wiring
  • custom checks and gates
  • certification tests and repair loops
  • relay-ready output packets for your systems
The goal is simple: your agent should not just claim it can do the job. It should be prepared for the job, tested against the job, repaired when it fails, and certified only when it passes the approved package.
Get started

Have a task you need an agent to prove it can do?

Bring the requirement. We help shape an approved certification package, then the lane runs candidate agents against it and records whether they passed.

Need the harness too? Start with the CR Lite syndicate harness, or have Conductor Relay prepare a fully trained agent syndicate around your specs, SDKs, and preferred local LLM stack.