Multi-agent AI systems. Shipped to production.

I architect and ship custom agentic AI products for operations-heavy teams. Multi-agent orchestration. Eval-set rigor. Live in weeks, not quarters.

Book free 15-min diagnosis · See how it works

Years engineering

Agents in production

Drawings extraction

agent-graph · prod

● healthy

p50 1.2s · error 0.04% · 247 runs/day · last 24h

Built on agentic primitives

Claude · Vision · Multi-agent · Tool-use · MCP · RAG · Schemas · Eval-sets

Capabilities

What I actually do. And how I do it.

Five things I'm hired for. Each one shipped to production at least once — most of them to the same customer.

Coordinator · workers · reviewer

Multi-agent orchestration

Coordinator, reviewer, and chat-as-orchestrator patterns built for the job — not pre-baked LangChain templates. Tool-use APIs, structured outputs, and stateful sub-agents that actually compose.

Custom orchestration code, no template scaffolding
Coordinator → workers → reviewer quality gate
Chat-as-orchestrator for steerable pipelines
Stateful sub-agents with shared memory
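
The pattern in the list above can be sketched in a few lines of plain Python. This is a minimal illustration, not the production code: the workers are stubs standing in for model calls, and every name here (Artifact, SharedMemory, run_pipeline, the worker functions) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Artifact:
    """Output of one worker, plus its review status."""
    worker: str
    payload: dict
    status: str = "draft"  # becomes "ready" only if the reviewer approves


@dataclass
class SharedMemory:
    """State that every sub-agent can read and write."""
    notes: dict = field(default_factory=dict)


Worker = Callable[[str, SharedMemory], dict]


def extract_worker(task: str, memory: SharedMemory) -> dict:
    # Stub: a real worker would call a model with tool-use here.
    items = [word for word in task.split() if word.isalpha()]
    memory.notes["item_count"] = len(items)
    return {"items": items}


def summary_worker(task: str, memory: SharedMemory) -> dict:
    # Reads what the previous worker left in shared memory.
    return {"summary": f"{memory.notes.get('item_count', 0)} items found"}


def reviewer(artifact: Artifact) -> Artifact:
    # Quality gate: any empty payload value keeps the artifact in draft.
    if all(artifact.payload.values()):
        artifact.status = "ready"
    return artifact


def run_pipeline(task: str, workers: dict[str, Worker]) -> list[Artifact]:
    """Coordinator: run each worker in order, then gate each result."""
    memory = SharedMemory()
    artifacts = []
    for name, worker in workers.items():
        payload = worker(task, memory)
        artifacts.append(reviewer(Artifact(worker=name, payload=payload)))
    return artifacts


results = run_pipeline(
    "steel duct 36 fan",
    {"extract": extract_worker, "summary": summary_worker},
)
```

The design point is that the reviewer, not the worker, decides when an artifact moves from draft to ready, so a weak worker output cannot promote itself.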

agents in active production


Process

How I build production AI systems

From the first scoping call to a system your team can run without me. Same shape every engagement.

Phase 01 · 15–30 min call

Diagnose

Map the highest-pain workflow on your team. Identify time cost, error rate, and AI feasibility. You walk away with a written audit and a reference architecture — whether or not we work together.

Deliverables

Workflow audit document
Reference architecture sketch
Eval criteria + success metrics
Fixed-fee proposal (if there's a fit)

workflow-audit · v1

● live

Findings

Workflow inventory: 12 mapped
Time cost / wk: ~17 hrs
Error rate: 8%
AI feasibility: High
Reference architecture: Drafted

audit complete · proposal ready

Every system ships with eval sets, monitoring, and a runbook. No black boxes.

The actual stack — production code, not no-code glue

Anthropic Claude

Model + tool-use APIs

Multi-agent orchestration

Coordinator · workers · reviewer

Tool-use & function calling

Stateful sub-agents

RAG + retrieval

Memory, search, citations

Eval-set rigor

Regression baselines from day one

Production observability

Tracing · alerting · runbooks

Clients

Companies I'm building with.

EPCM Holdings: Industrial engineering · 90-person firm

NodAxis: AI infrastructure · behind-the-meter power

Selective engagements. Each build phase-gated and shipped to production.

Featured Work

Tender Intelligence. An RFQ pipeline, in production.

A multi-agent AI product I built for an industrial engineering firm. Live in beta. The case study for everything else on this page.

Industrial engineering · 90-person firm

From a 3-day RFQ cycle to a priced, scheduled, client-ready proposal in under 4 hours.

Drop a tender package — PDFs, Word docs, engineering drawings — into the system. Five specialised Claude agents extract the BOQ, suggest pricing against past wins, build the man-hour schedule, run a top-down sanity check, and produce a Word-ready proposal. Every number is editable, every artifact is auditable, and the system learns from each won project.

Anthropic Claude (tool_use) · Next.js 15 · FastAPI · Postgres · Clerk · Sentry · Vercel · Railway

See if I can build something like this for you

Tender Intelligence project dashboard — ATI Dust Collector RFQ with 36 line items, confidence breakdown, and coordinator-flagged issues

Architecture · Six specialised agents, one pipeline

Extractor

Reads tender PDFs, Word, and engineering drawings. Pulls structured BOQ items and specs.

Pricing

Suggests pricing per line item from historical wins. Benchmarks against the firm’s actuals.

Schedule

Builds editable man-hour and plant-hour schedules. Reconciles back to the BOQ.

Top-down estimator

Independent sanity-check on the bottom-up build. Flags variance against past projects.

Coordinator + Reviewer

Cross-checks every artifact. Reviewer is the draft→ready quality gate.

Chat-as-orchestrator

Estimator talks to the system; chat invokes regenerate tools. No deep-menu hunting.
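
One way to picture "chat invokes regenerate tools" is a small registry that routes a structured tool call to the function that rebuilds one artifact. This is a sketch under assumptions: the tool names, argument fields, and the shape of the tool-call dict are hypothetical and are not taken from the Tender Intelligence codebase or from any SDK.

```python
from typing import Callable

# Hypothetical registry: tool name -> function that regenerates one artifact.
REGENERATE_TOOLS: dict[str, Callable[[dict], str]] = {}


def tool(name: str):
    """Decorator that registers a regenerate function under a tool name."""
    def register(fn: Callable[[dict], str]) -> Callable[[dict], str]:
        REGENERATE_TOOLS[name] = fn
        return fn
    return register


@tool("regenerate_schedule")
def regenerate_schedule(args: dict) -> str:
    # Stub: a real handler would re-run the Schedule agent with new inputs.
    return f"schedule rebuilt with crew size {args['crew_size']}"


@tool("regenerate_pricing")
def regenerate_pricing(args: dict) -> str:
    # Stub: a real handler would re-run the Pricing agent with new inputs.
    return f"pricing rebuilt with margin {args['margin_pct']}%"


def dispatch(tool_call: dict) -> str:
    """Route one structured tool call to its handler, or report it as unknown."""
    handler = REGENERATE_TOOLS.get(tool_call["name"])
    if handler is None:
        return f"unknown tool: {tool_call['name']}"
    return handler(tool_call["input"])


reply = dispatch({"name": "regenerate_schedule", "input": {"crew_size": 4}})
```

In this shape, the estimator's request ("rebuild the schedule with a crew of four") becomes one tool call, and the chat layer never needs to know how the schedule is built.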

Response time: −94% (3 days → <4 hrs)
Eng. hours per RFQ: −93% (6 hrs → 0.4 hrs)
Win rate uplift: +55% (22% → 34%)
Drawings extraction: 70% (up from 6% baseline)

Tender Intelligence is built by Accenzio for an industrial engineering firm. Metrics measured against the firm's pre-system baselines on real RFQ packages. Confidence figure refers to BOQ extraction with engineering drawings.
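
The headline percentages follow directly from the before and after figures. A quick check, assuming "3 days" means 72 elapsed hours and treating "<4 hrs" as its upper bound of 4 (both are assumptions; the page does not say whether days are calendar or working days):

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


# Assumption: 3 days = 72 elapsed hours; "<4 hrs" taken as 4.
response_time = percent_change(72, 4)
eng_hours = percent_change(6, 0.4)
# Relative uplift on the win rate; the absolute gain is 34 - 22 = 12 points.
win_rate = percent_change(22, 34)

print(round(response_time), round(eng_hours), round(win_rate))
```

Note that the win-rate figure is a relative uplift: 22% to 34% is a gain of 12 percentage points, which is about 55% of the starting rate.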

Engagements

Three ways to work together. Fixed scope. Fixed price. Code you own.

Milestone-gated payments and a written scope you can read in fifteen minutes. Pick the engagement that fits — stop at any phase.

Door-opener

Discovery & Audit

A senior-engineer-grade read on whether AI is the right tool — before you commit a build budget.

1 week

Diagnosis call + workflow walkthrough (Loom)
Written audit: time cost, error rate, AI feasibility
Reference architecture (agent graph + stack)
Eval criteria + success metrics defined
Fixed-fee proposal for the build, if there's a fit

Best for: Teams who want a senior-engineer-grade read on AI feasibility — and a credible plan to build it right.

Book a Discovery →

Most engagements

Custom AI Product Build

A multi-agent system, custom code, shipped to production.

6–12 weeks

Multi-agent architecture, custom code
Anthropic Claude tool-use, structured outputs
Eval sets + regression baselines from day one
Production deploy, monitored end-to-end
Sentry · structlog · runbook · full handoff docs
Phase-gated billing — exit ramp after Spec

Proof · Latest build: Tender Intelligence, a multi-agent RFQ proposal automation system for an industrial engineering firm, in production.

Best for: Teams with one painful, high-value workflow that needs a real product — not a Zapier patchwork.

Start with a Discovery →

Post-launch

AI Ops Partner

Once the build is live: the system stays sharp, monitored, and improving.

ongoing

Eval-set regression checks on every release
Sentry on-call, weekly ops brief
Model + tool updates as the field moves
Continuous improvement against measured baselines
Standard tier absorbs new automation work

Best for: Teams whose system is in production and needs a senior engineer keeping it tuned.

Book a call →

Typical path

Discovery → Build → optional AI Ops Partner · stop at any phase

Talk through what fits

Pieter Le Roux

Founder · AI Systems Architect

About

You'll work with me. Not a project manager.

I'm Pieter — an AI systems architect with twenty years of engineering across the stack. Full-stack development, cloud infrastructure, security engineering, and the last several years deep on agentic AI in production. Accenzio is the studio I run for selective custom builds. Based in Austin. Available worldwide.

I build two things well: multi-agent systems that actually hold up in production and the eval-set discipline that proves they do. Everything ships with logs, monitoring, runbooks, and a handoff that means your team can run the system without me.

20+

Years engineering across the stack

100+

Agents shipped to production

Solo

Selective. Direct. Fixed scope.

Book a 15-min diagnosis call

FAQ

Common questions. Honest answers.

The things buyers always ask before booking a call. Answered upfront.

Tailor-made productized services. I ship production-grade code using Claude, agents, and the right tools for the job. I don't lean on no-code or low-code platforms unless they genuinely fit the problem, which is rare. Every engagement starts with a comprehensive diagnostic of your current workflow and the bottlenecks inside it. What I propose is specific to what I find. Nothing cookie-cutter, nothing templated.

Templates and abstraction frameworks make demos easy and production hard. The moment you need a coordinator agent that reconciles three sub-agent outputs, or a reviewer that gates draft to ready, or chat-as-orchestrator where the user steers the pipeline, you fight the framework instead of the problem. I use the Anthropic SDK directly because it's the same surface area Anthropic optimises for, and it doesn't lock you into anyone else's roadmap.

Because there's a chasm between a single-shot prompt and a system that runs unattended in production for months. ChatGPT and Claude Code are excellent assistants for one-off work. They are not eval-tested, monitored, version-controlled, regression-gated systems. Real production AI is multiple agents reconciling each other's outputs, structured schemas you can trust, fallback paths, observability, and a runbook your ops team can use at 2am. That's what I build, and most teams that try to do it themselves spend six months reinventing the same plumbing.

I don't guarantee absence of bugs. Anyone who does is lying. I guarantee detection. Every agent has an eval set with seed cases that lock the baseline before code ships. Every release runs the eval set, and regressions block merge. Automated alerts fire the moment something errors. Structured logs with correlation IDs let you trace any single run end-to-end. When something does go wrong, you find out in minutes, not when a customer calls.
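
A regression gate of the kind described above can be sketched as follows. This is a minimal illustration under assumptions: the seed cases, the exact-match scoring, and the names SeedCase, run_eval, and gate_release are all hypothetical, and real eval sets usually need richer scoring than string equality.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SeedCase:
    """One locked input/expected-output pair in the eval set."""
    name: str
    input_text: str
    expected: str


def run_eval(agent: Callable[[str], str], cases: list[SeedCase]) -> float:
    """Return the fraction of seed cases the agent answers exactly right."""
    passed = sum(1 for case in cases if agent(case.input_text) == case.expected)
    return passed / len(cases)


def gate_release(score: float, baseline: float, tolerance: float = 0.0) -> bool:
    """True if the release may merge: the score has not dropped below baseline."""
    return score >= baseline - tolerance


SEED_CASES = [
    SeedCase("uppercase", "boq", "BOQ"),
    SeedCase("strip", "  rfq ", "RFQ"),
]


def candidate_agent(text: str) -> str:
    # Stand-in for the agent version that set the baseline.
    return text.strip().upper()


def regressed_agent(text: str) -> str:
    # Stand-in for a later release that forgets to strip whitespace.
    return text.upper()


baseline = run_eval(candidate_agent, SEED_CASES)
new_score = run_eval(regressed_agent, SEED_CASES)
may_merge = gate_release(new_score, baseline)
```

Wired into CI, a False result from gate_release would fail the build, which is what "regressions block merge" means in practice.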

A Discovery and Audit takes one to two weeks. A full custom AI product build typically runs six to twelve weeks, depending on integration surface and how messy the source data is. For the right scope, a multi-agent system can be in production beta inside seven weeks. Timelines stretch if your stack is non-standard or if access takes a while. If either applies, I'll tell you on the diagnosis call.

Yes. That's the point of how I build. Every system ships with a runbook for step-by-step troubleshooting, structured logs, eval sets your team can re-run, and full handoff docs. The architecture is intentionally readable, so any competent senior engineer can take it over. The optional AI Ops Partner retainer covers the parts most teams don't want to own (model and tool updates, monthly eval regression, on-call), but it's opt-in, not lock-in.

Any industry. I have experience across many domains, from ops and sales to compliance and document-heavy workflows. The multi-agent patterns I use (extract, enrich, reconcile, review, ship) are portable. The architecture works anywhere an expert reads long documents to produce structured output, or anywhere a repetitive workflow is currently being done by hand. The agent prompts and eval sets get built for your specific domain after the diagnostic.

Yes. You own the codebase, the prompts, the eval sets, and the runbooks. They land in your GitHub org or wherever you keep code. Co-credited builder + customer IP arrangements are possible if you'd rather brand the system as yours and credit Accenzio as the engineering partner.

A real conversation, not a sales pitch. In fifteen minutes we walk through your most painful workflow, what it costs you in time and errors, and where the bottlenecks live. I tell you what I would deliver, how I would approach it, and a rough sense of timeline and shape. If there is a fit, I follow up with a written audit and a proposal. If there isn't, you still walk away with a clearer view of the problem. No deck, no recording, no obligation.

Got a different question? Bring it to the diagnosis call.

Book a 15-min call

Got a workflow that needs a real product behind it?

Free 15-minute diagnosis. No slide deck. We'll find the highest-ROI workflow on your team's plate and tell you, honestly, what it would take to ship the multi-agent system that solves it.

Book a free 15-min call · Email accenzio@gmail.com

No sales call

Written audit if we don't fit

Reply within 24h