I design AI systems for environments where wrong answers have consequences.
Decision-critical workflows. Regulated constraints. Production AI with human oversight designed in, not bolted on.

// Design Philosophy
How I Design AI Systems
> Deterministic logic before probabilistic models
Rules handle what can be encoded. LLMs handle what cannot. The tradeoff: slower iteration, higher reliability.
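A minimal sketch of that ordering, with illustrative names and rules (the real routing logic is not shown here): deterministic checks run first, and only cases the rules cannot classify fall through to a model call.

```python
# Rules before models: encode what can be encoded; defer the rest.
from typing import Optional

def apply_rules(text: str) -> Optional[str]:
    # Illustrative rules only; a real system encodes domain policy here.
    lowered = text.lower()
    if "refund" in lowered:
        return "billing"
    if "password" in lowered:
        return "account"
    return None  # rule coverage exhausted -> defer to the model

def llm_classify(text: str) -> str:
    # Placeholder for a model call (hypothetical).
    return "needs_review"

def classify_request(text: str) -> str:
    rule_result = apply_rules(text)
    if rule_result is not None:
        return rule_result   # encoded knowledge: fast, auditable
    return llm_classify(text)  # residual ambiguity: probabilistic
```

The deterministic path stays cheap to audit; the probabilistic path only handles what the rules could not.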
> Human review as a system component, not a fallback
Workflows are designed for oversight from day one. Review output becomes structured data that improves retrieval and future drafts.
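One way to make review output machine-usable, sketched with assumed field names (the actual schema is not public): each reviewer action becomes a record instead of a free-form approve/reject.

```python
# Review as structured data: field names are illustrative assumptions.
from dataclasses import dataclass, field, asdict

@dataclass
class ReviewRecord:
    draft_id: str
    section: str
    action: str                  # e.g. "kept" | "edited" | "removed"
    reviewer_note: str = ""
    cited_evidence: list = field(default_factory=list)

def to_training_signal(record: ReviewRecord) -> dict:
    # Structured records can feed retrieval ranking and future drafts.
    return asdict(record)
```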
> Evidence-first outputs when stakes are high
The system produces recommendations plus evidence. Humans make the final call. Less liability, fewer silent failures.
> Auditability and traceability over “clever prompts”
Every output must be reproducible and debuggable. If you can’t trace it, you can’t ship it.
> Fail closed, not confidently wrong
“No match found” is a feature. Guardrails beat improvisation when the cost of being wrong is real.
// Flagship Case Study
Public-Sector Decision Support
Some implementation details are intentionally generalized due to NDA and public-sector confidentiality. What’s shown here is the architecture, guardrails, and evaluation approach, without customer data or proprietary workflows.
Evidence-Grounded Report Drafting + Human Review
A production workflow that drafts decision-support reports under strict constraints: evidence-first retrieval, deterministic rules, and structured human review. Designed to be auditable, explainable, and safe under regulatory scrutiny.
// Leadership
How I Led and Shipped
> Ownership snapshot
Led a team of 4 engineers delivering GenAI features across core workflows in a startup environment. Owned roadmap and technical direction, ran weekly release reviews with Quality Management and Operations, and set guardrails for what the system can and cannot do in production.
Worked directly with C-level leadership to prioritize AI work, communicate constraints, and align stakeholders on risk, rollout, and reliability requirements.
// Tech Stack
Production Stack
> What I used to run these systems in production
// Supporting Systems
Subsystems (Case Studies)
Document Understanding Pipeline
Extracts structured fields from messy PDFs using validation layers and “fail closed” behavior.
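A simplified sketch of one fail-closed validation step, assuming a date field as the example (the production pipeline and its validators are generalized): a field either passes its checks or is returned as explicitly missing, never guessed.

```python
# Fail-closed field extraction: validate or return None, never invent.
import re
from typing import Optional

def extract_date(raw_text: str) -> Optional[str]:
    match = re.search(r"\b(\d{4})-(\d{2})-(\d{2})\b", raw_text)
    if match is None:
        return None  # fail closed: no pattern, no guess
    year, month, day = map(int, match.groups())
    if not (1 <= month <= 12 and 1 <= day <= 31):
        return None  # validation layer rejects plausible-looking junk
    return match.group(0)
```

Downstream code treats `None` as "human must look", not as an error to paper over.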
Semantic Matching System
Matches user profiles to program catalogs using retrieval, constraints, and controlled reasoning.
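The shape of that design, sketched with placeholder scoring (real similarity would come from embeddings, not tag overlap): retrieval proposes candidates, deterministic eligibility rules filter them, and ranking never overrides a hard constraint.

```python
# Retrieval proposes, constraints dispose. Names and scores are illustrative.
def similarity(profile: dict, program: dict) -> float:
    # Placeholder for embedding similarity: tag overlap stands in here.
    shared = set(profile.get("tags", [])) & set(program.get("tags", []))
    return float(len(shared))

def eligible(profile: dict, program: dict) -> bool:
    # Hard constraint: a high score never beats ineligibility.
    return profile.get("age", 0) >= program.get("min_age", 0)

def match_programs(profile: dict, catalog: list[dict]) -> list[dict]:
    ranked = sorted(catalog, key=lambda p: similarity(profile, p), reverse=True)
    return [p for p in ranked if eligible(profile, p)]
```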
Agentic Onboarding Flow
Stateful intake with routing logic, stopping conditions, and explicit uncertainty handling.
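A toy version of that control flow, with assumed state names and an assumed confidence threshold: explicit states, a fixed transition table, and a stopping condition that hands off to a human when confidence is too low.

```python
# Stateful intake sketch: explicit routing plus uncertainty handling.
def next_state(state: str, confidence: float) -> str:
    ESCALATE_BELOW = 0.6           # assumed threshold, not the real value
    if confidence < ESCALATE_BELOW:
        return "human_handoff"     # explicit uncertainty handling
    transitions = {
        "collect_basics": "collect_documents",
        "collect_documents": "review",
        "review": "done",
    }
    return transitions.get(state, "done")  # unknown state -> stop
```

The point is that the flow can always say where it is and why it stopped, which is what makes it debuggable in production.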
// Restraint
Where I Did Not Use AI
Knowing what not to automate is a design decision. In regulated environments, “don’t guess” beats “sounds smart.”
Final decisions
The system drafts recommendations with supporting evidence, but never makes the final call. Humans retain authority and context.
Policy interpretation
When guidelines changed, I rejected “LLM decides what the rule means.” We updated explicit rules manually to keep outputs defensible.
Failure mode: plausible but wrong details
Early versions produced plausible text when retrieval returned no matches. Fixed with explicit “no match found” logic and strict constraints: only reference retrieved evidence.
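The guardrail reduces to a simple precondition, sketched here with illustrative names: drafting only proceeds when retrieval actually returned evidence, and output is bound to that evidence.

```python
# "No match found" as a feature: no evidence, no draft.
def draft_section(query: str, retrieved: list[str]) -> str:
    if not retrieved:
        return "No match found."   # fail closed, by design
    evidence = "\n".join(f"- {doc}" for doc in retrieved)
    return f"Recommendation (evidence-bound):\n{evidence}"
```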
Failure mode: review fatigue
“Approve or reject” reports trained reviewers to skim. Redesigned to draft sections that require active editing, which improved attention and error detection.
// Side Projects
Explorations & Systems
POCO Steering OS
A Product Intelligence OS that turns chaotic feature requests into traceable, capacity-aware decisions. Deterministic prioritization + AI-assisted capacity planning + explicit tradeoffs for teams drowning in scope creep.
// Background
Context & Environment
Applied AI Engineer and Technical Product Owner with experience designing and shipping GenAI systems in production. Worked at Taleroo (Germany). Details are generalized where needed due to NDA.
I build systems where correctness, traceability, and operational adoption matter as much as model quality. That usually means collaborating closely with operations and compliance-minded stakeholders, not just engineers.
My work sits at the intersection of LLM capabilities and deterministic logic: knowing when to use each, and how to make them behave predictably under real constraints.