Accenzio

Multi-agent AI systems. Shipped to production.

I architect and ship custom agentic AI products for operations-heavy teams. Multi-agent orchestration. Eval-set rigor. Live in weeks, not quarters.

[Diagram: agent-graph · prod · ● healthy. Inputs (Documents, Streams, Datasets) feed a multi-agent graph that produces outputs (Artifact, Update, Alert). Team A · Workers: Extract, Reason, Synthesize. Team B · Quality: Validate, Score, Review. Team C · Tools: Retrieve, Search. Control nodes: Coordinator, Orchestrator. Last 24h: p50 1.2 s · error 0.04% · 247 runs/day.]

Built on agentic primitives

Claude · Vision · Multi-agent · Tool-use · MCP · RAG · Schemas · Eval-sets

Capabilities

What I actually do. And how I do it.

Five things I'm hired for. Each one shipped to production at least once — most of them to the same customer.

Coordinator · workers · reviewer

Multi-agent orchestration

Coordinator, reviewer, and chat-as-orchestrator patterns built for the job — not pre-baked LangChain templates. Tool-use APIs, structured outputs, and stateful sub-agents that actually compose.

  • Custom orchestration code, no template scaffolding
  • Coordinator → workers → reviewer quality gate
  • Chat-as-orchestrator for steerable pipelines
  • Stateful sub-agents with shared memory
5+ agents in active production
[Diagram: agent-graph · prod · ● live. Nodes: Coord, Worker, Worker, Tools, Review.]
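A minimal sketch of the coordinator → workers → reviewer quality gate, assuming plain Python functions stand in for the agents. The `Artifact` type, the `extract` and `price` workers, and the review rules are all hypothetical placeholders; in a real system each worker would wrap a model call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Artifact:
    """Output of one worker, plus the status the reviewer assigns."""
    worker: str
    payload: dict
    status: str = "draft"  # stays "draft" until the reviewer promotes it
    issues: list[str] = field(default_factory=list)


# Hypothetical workers: stand-ins for model-backed agents.
def extract(doc: str) -> dict:
    return {"items": [line.strip() for line in doc.splitlines() if line.strip()]}


def price(doc: str) -> dict:
    # Made-up flat rate of 100.0 per extracted line item.
    return {"total": 100.0 * len(extract(doc)["items"])}


def review(artifact: Artifact) -> Artifact:
    """Quality gate: promote draft -> ready only when every check passes."""
    if artifact.worker == "extract" and not artifact.payload["items"]:
        artifact.issues.append("no line items extracted")
    if artifact.worker == "price" and artifact.payload["total"] <= 0:
        artifact.issues.append("non-positive total")
    artifact.status = "ready" if not artifact.issues else "draft"
    return artifact


def coordinate(doc: str, workers: dict[str, Callable[[str], dict]]) -> list[Artifact]:
    """Fan the input out to each worker, then pass every result through review."""
    drafts = [Artifact(worker=name, payload=fn(doc)) for name, fn in workers.items()]
    return [review(a) for a in drafts]


WORKERS = {"extract": extract, "price": price}
results = coordinate("steel duct\nfan unit\n", WORKERS)
empty = coordinate("", WORKERS)

print([(a.worker, a.status) for a in results])
print([(a.worker, a.status, a.issues) for a in empty])
```

Keeping the reviewer as a separate step means nothing reaches "ready" without passing the same checks, regardless of which worker produced it.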

Process

How I build production AI systems

From the first scoping call to a system your team can run without me. Same shape every engagement.

Phase 01 · 15–30 min call

Diagnose

Map the highest-pain workflow on your team. Identify time cost, error rate, and AI feasibility. You walk away with a written audit and a reference architecture — whether or not we work together.

Deliverables
  • Workflow audit document
  • Reference architecture sketch
  • Eval criteria + success metrics
  • Fixed-fee proposal (if there's a fit)
workflow-audit · v1
● live
Findings
  • Workflow inventory: 12 mapped
  • Time cost / wk: ~17 hrs
  • Error rate: 8%
  • AI feasibility: High
  • Reference architecture: Drafted
audit complete · proposal ready

Every system ships with eval sets, monitoring, and a runbook. No black boxes.

The actual stack — production code, not no-code glue

  • Anthropic Claude: Model + tool-use APIs
  • Multi-agent orchestration: Coordinator · workers · reviewer
  • Tool-use & function calling: Stateful sub-agents
  • RAG + retrieval: Memory, search, citations
  • Eval-set rigor: Regression baselines from day one
  • Production observability: Tracing · alerting · runbooks

Clients

Companies I'm building with.

EPCM Holdings · Industrial engineering · 90-person firm
NodAxis · AI infrastructure · behind-the-meter power

Selective engagements. Each build phase-gated and shipped to production.

Featured Work

Tender Intelligence. An RFQ pipeline, in production.

A multi-agent AI product I built for an industrial engineering firm. Live in beta. The case study for everything else on this page.

Industrial engineering · 90-person firm

From a 3-day RFQ cycle to a priced, scheduled, client-ready proposal in under 4 hours.

Drop a tender package — PDFs, Word docs, engineering drawings — into the system. Five specialised Claude agents extract the BOQ, suggest pricing against past wins, build the man-hour schedule, run a top-down sanity check, and produce a Word-ready proposal. Every number is editable, every artifact is auditable, and the system learns from each won project.

Anthropic Claude (tool_use) · Next.js 15 · FastAPI · Postgres · Clerk · Sentry · Vercel · Railway
See if I can build something like this for you
[Image: Tender Intelligence project dashboard. ATI Dust Collector RFQ with 36 line items, confidence breakdown, and coordinator-flagged issues.]
Architecture · Six specialised agents, one pipeline
Extractor

Reads tender PDFs, Word, and engineering drawings. Pulls structured BOQ items and specs.

Pricing

Suggests pricing per line item from historical wins. Benchmarks against the firm’s actuals.

Schedule

Builds editable man-hour and plant-hour schedules. Reconciles back to the BOQ.

Top-down estimator

Independent sanity-check on the bottom-up build. Flags variance against past projects.

Coordinator + Reviewer

Cross-checks every artifact. Reviewer is the draft→ready quality gate.

Chat-as-orchestrator

Estimator talks to the system; chat invokes regenerate tools. No deep-menu hunting.
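A rough sketch of how chat can invoke a regenerate tool, under stated assumptions. The tool definition follows the shape the Anthropic Messages API uses for tools (`name`, `description`, `input_schema`), and the dispatcher returns a `tool_result` block keyed by `tool_use_id`. The `regenerate_schedule` tool, its parameters, and the 40 man-hour base are hypothetical, and the `tool_use` block is simulated so the example runs without a network call.

```python
import json

# Tool definition in the shape the Anthropic Messages API expects.
# The tool itself, regenerate_schedule, is hypothetical.
REGENERATE_SCHEDULE_TOOL = {
    "name": "regenerate_schedule",
    "description": "Rebuild the man-hour schedule for one BOQ line item.",
    "input_schema": {
        "type": "object",
        "properties": {
            "line_item_id": {"type": "string"},
            "productivity_factor": {"type": "number"},
        },
        "required": ["line_item_id"],
    },
}


def regenerate_schedule(line_item_id: str, productivity_factor: float = 1.0) -> dict:
    """Stand-in implementation: scales a made-up base of 40 man-hours."""
    return {"line_item_id": line_item_id, "man_hours": round(40 / productivity_factor, 1)}


HANDLERS = {"regenerate_schedule": regenerate_schedule}


def handle_tool_use(block: dict) -> dict:
    """Turn a tool_use content block into the tool_result block sent back to the model."""
    handler = HANDLERS.get(block["name"])
    if handler is None:
        return {"type": "tool_result", "tool_use_id": block["id"],
                "content": f"unknown tool: {block['name']}", "is_error": True}
    result = handler(**block["input"])
    return {"type": "tool_result", "tool_use_id": block["id"], "content": json.dumps(result)}


# Simulated tool_use block, as the model might emit after "redo the schedule for item 7".
simulated = {"type": "tool_use", "id": "toolu_example", "name": "regenerate_schedule",
             "input": {"line_item_id": "7", "productivity_factor": 1.25}}
reply = handle_tool_use(simulated)
print(reply)
```

In a live system the list of tool definitions would be passed with each chat request, and `handle_tool_use` would run on every `tool_use` block in the response.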

  • −94% response time: 3 days → <4 hrs
  • −93% eng. hours per RFQ: 6 hrs → 0.4 hrs
  • +55% win rate uplift: 22% → 34%
  • 70% drawings extraction: up from 6% baseline

Tender Intelligence is built by Accenzio for an industrial engineering firm. Metrics measured against the firm's pre-system baselines on real RFQ packages. Confidence figure refers to BOQ extraction with engineering drawings.
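The headline percentages follow from the before and after figures quoted with them. A short check, assuming the 3-day cycle is counted as 72 elapsed hours (an assumption, since the page does not say whether these are calendar or working days) and that win rate uplift is relative rather than in percentage points:

```python
def relative_change(before: float, after: float) -> float:
    """Relative change from before to after, as a percentage."""
    return (after - before) / before * 100


response_time = relative_change(72, 4)  # 3 days taken as 72 hrs -> 4 hrs
eng_hours = relative_change(6, 0.4)     # 6 hrs -> 0.4 hrs
win_rate = relative_change(22, 34)      # 22% -> 34%, relative uplift

print(round(response_time), round(eng_hours), round(win_rate))  # -94 -93 55
```

Because the new response time is stated as under 4 hours, the 94% reduction is a lower bound on the improvement.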

Engagements

Three ways to work together. Fixed scope. Fixed price. Code you own.

Milestone-gated payments and a written scope you can read in fifteen minutes. Pick the engagement that fits — stop at any phase.

Door-opener

Discovery & Audit

A senior-engineer-grade read on whether AI is the right tool — before you commit a build budget.

1 week
  • Diagnosis call + workflow walkthrough (Loom)
  • Written audit: time cost, error rate, AI feasibility
  • Reference architecture (agent graph + stack)
  • Eval criteria + success metrics defined
  • Fixed-fee proposal for the build, if there's a fit

Best for: Teams who want a senior-engineer-grade read on AI feasibility — and a credible plan to build it right.

Book a Discovery
Most engagements

Custom AI Product Build

A multi-agent system, custom code, shipped to production.

6–12 weeks
  • Multi-agent architecture, custom code
  • Anthropic Claude tool-use, structured outputs
  • Eval sets + regression baselines from day one
  • Production deploy, monitored end-to-end
  • Sentry · structlog · runbook · full handoff docs
  • Phase-gated billing — exit ramp after Spec
Proof · Latest build: Tender Intelligence — multi-agent RFQ proposal automation for an industrial engineering firm, in production.

Best for: Teams with one painful, high-value workflow that needs a real product — not a Zapier patchwork.

Start with a Discovery
Post-launch

AI Ops Partner

Once the build is live: the system stays sharp, monitored, and improving.

ongoing
  • Eval-set regression checks on every release
  • Sentry on-call, weekly ops brief
  • Model + tool updates as the field moves
  • Continuous improvement against measured baselines
  • Standard tier absorbs new automation work

Best for: Teams whose system is in production and needs a senior engineer keeping it tuned.

Book a call

Typical path

Discovery → Build → optional AI Ops Partner · stop at any phase

Talk through what fits
Pieter Le Roux
Founder · AI Systems Architect

About

You'll work with me. Not a project manager.

I'm Pieter — an AI systems architect with twenty years of engineering across the stack. Full-stack development, cloud infrastructure, security engineering, and the last several years deep on agentic AI in production. Accenzio is the studio I run for selective custom builds. Based in Austin. Available worldwide.

I build two things well: multi-agent systems that actually hold up in production and the eval-set discipline that proves they do. Everything ships with logs, monitoring, runbooks, and a handoff that means your team can run the system without me.

20+
Years engineering across the stack
100+
Agents shipped to production
Solo
Selective. Direct. Fixed scope.
Book a 15-min diagnosis call

FAQ

Common questions. Honest answers.

The things buyers always ask before booking a call. Answered upfront.

Tailor-made productized services. I ship production-grade code using Claude, agents, and the right tools for the job. I don't lean on no-code or low-code platforms unless they genuinely fit the problem, which is rare. Every engagement starts with a comprehensive diagnostic of your current workflow and the bottlenecks inside it. What I propose is specific to what I find. Nothing cookie-cutter, nothing templated.

Templates and abstraction frameworks make demos easy and production hard. The moment you need a coordinator agent that reconciles three sub-agent outputs, or a reviewer that gates draft to ready, or chat-as-orchestrator where the user steers the pipeline, you fight the framework instead of the problem. I use the Anthropic SDK directly because it's the same surface area Anthropic optimises for, and it doesn't lock you into anyone else's roadmap.

Because there's a chasm between a single-shot prompt and a system that runs unattended in production for months. ChatGPT and Claude Code are excellent assistants for one-off work. They are not eval-tested, monitored, version-controlled, regression-gated systems. Real production AI is multiple agents reconciling each other's outputs, structured schemas you can trust, fallback paths, observability, and a runbook your ops team can use at 2am. That's what I build, and most teams that try to do it themselves spend six months reinventing the same plumbing.

I don't guarantee absence of bugs. Anyone who does is lying. I guarantee detection. Every agent has an eval set with seed cases that lock the baseline before code ships. Every release runs the eval set, and regressions block merge. Automated alerts fire the moment something errors. Structured logs with correlation IDs let you trace any single run end-to-end. When something does go wrong, you find out in minutes, not when a customer calls.
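One way an eval-set regression gate can look, as a minimal sketch. The `EvalCase` type, the two seed cases, and both toy agents are hypothetical; a real eval set would score model outputs with domain-specific checks instead of exact string matches.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    """One seed case: an input and the output the agent must keep producing."""
    name: str
    input: str
    expected: str


def run_eval(agent: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases the agent answers exactly as expected."""
    passed = sum(1 for c in cases if agent(c.input) == c.expected)
    return passed / len(cases)


def gate(score: float, baseline: float) -> bool:
    """True means the release may merge; a drop below the baseline blocks it."""
    return score >= baseline


# Hypothetical seed cases and agents, for illustration only.
CASES = [
    EvalCase("units", "12 m", "12"),
    EvalCase("thousands", "1,500 kg", "1500"),
]


def agent_v1(text: str) -> str:
    return text.split()[0].replace(",", "")


def agent_v2(text: str) -> str:  # regression: forgets to strip separators
    return text.split()[0]


baseline = run_eval(agent_v1, CASES)   # locked before the code ships
candidate = run_eval(agent_v2, CASES)  # scored on every release

print(baseline, candidate, gate(candidate, baseline))  # 1.0 0.5 False
```

In CI, a `False` from `gate` would translate into a non-zero exit code so the merge is blocked.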

A Discovery and Audit takes one to two weeks. A full custom AI product build typically runs six to twelve weeks, depending on integration surface and how messy the source data is. For the right scope, a multi-agent system can be in production beta inside seven weeks. Numbers shift if your stack is non-standard or if access takes a while. I tell you that on the diagnosis call.

Yes. That's the point of how I build. Every system ships with a runbook for step-by-step troubleshooting, structured logs, eval sets your team can re-run, and full handoff docs. The architecture is intentionally readable, so any competent senior engineer can take it over. The optional AI Ops Partner retainer covers the parts most teams don't want to own (model and tool updates, monthly eval regression, on-call), but it's opt-in, not lock-in.
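For readers who want to see what structured logs with a correlation ID amount to, here is a small sketch using only the Python standard library. The logger name, event names, and `run_pipeline` function are hypothetical, and a production build might use a library such as structlog instead of a hand-written formatter.

```python
import contextvars
import io
import json
import logging
import uuid

# Holds the ID of the current run; every log line emitted during the run carries it.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "event": record.getMessage(),
            "correlation_id": correlation_id.get(),
        })


stream = io.StringIO()  # in-memory sink so the example is self-contained
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("pipeline")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False


def run_pipeline(doc_name: str) -> str:
    """Tag every log line of one run with the same freshly generated ID."""
    run_id = uuid.uuid4().hex
    token = correlation_id.set(run_id)
    try:
        log.info("extract.started %s", doc_name)
        log.info("review.finished %s", doc_name)
    finally:
        correlation_id.reset(token)
    return run_id


run_id = run_pipeline("tender.pdf")
lines = [json.loads(line) for line in stream.getvalue().splitlines()]
print(lines)
```

Filtering the log store on one `correlation_id` value then returns every step of that single run in order.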

Any industry. I have experience across many domains, from ops and sales to compliance and document-heavy workflows. The multi-agent patterns I use (extract, enrich, reconcile, review, ship) are portable. The architecture works anywhere an expert reads long documents to produce structured output, or anywhere a repetitive workflow is currently being done by hand. The agent prompts and eval sets get built for your specific domain after the diagnostic.

Yes. You own the codebase, the prompts, the eval sets, and the runbooks. They land in your GitHub org or wherever you keep code. Co-credited builder + customer IP arrangements are possible if you'd rather brand the system as yours and credit Accenzio as the engineering partner.

A real conversation, not a sales pitch. In fifteen minutes we walk through your most painful workflow, what it costs you in time and errors, and where the bottlenecks live. I tell you what I would deliver, how I would approach it, and a rough sense of timeline and shape. If there is a fit, I follow up with a written audit and a proposal. If there isn't, you still walk away with a clearer view of the problem. No deck, no recording, no obligation.

Got a different question? Bring it to the diagnosis call.

Book a 15-min call

Got a workflow that needs a real product behind it?

Free 15-minute diagnosis. No slide deck. We'll find the highest-ROI workflow on your team's plate and tell you, honestly, what it would take to ship the multi-agent system that solves it.

No sales call
Written audit if we don't fit
Reply within 24h