Independent AI review
A structured read of the existing code, prompts, evals, infra and architecture decisions. You get a written assessment: what is sound, what is risky, what you can ship from here, and where the repair has to start.
Chapter one of six — a customer-first page.
Stand-ups every morning. Sprint reviews every two weeks. A board full of tickets that move from left to right on schedule. And yet the thing that’s supposed to ship — the AI feature the business is waiting on — keeps slipping. The ceremonies are full. The software is empty.
Chapter two
A notebook convinced the room. A demo went well in a Teams call. Somewhere between that moment and the first real user, the answer got slower, stranger, more expensive — and nobody can point to the file where the regression lives. The gap between “it works on my laptop” and “it works for ten thousand users” is where most AI initiatives quietly die.
Chapter three
Burn-down charts descend politely. Satisfaction scores float near the top. Requirements coverage is a comforting 92%. Meanwhile, the actual users are waiting for something that hasn’t moved in three months, and the steering committee is about to ask a very uncomfortable question at the next review. Metrics are measuring the ritual, not the outcome.
Chapter four
A big-brand consultancy came in, ran a slick kickoff, left behind a seventy-slide deck and a pilot that wowed the execs. Then the junior consultants rotated off, the champion changed role, and the code became the team’s problem to carry. The polish didn’t survive contact with the backlog. Now the internal engineers are fixing things they didn’t design and can’t quite explain.
Chapter five
The staffing firm sent four people in week one, two more in week three, all billable by the hour, all waiting for someone to tell them what the right answer looks like. The hours tick. The tickets close. The architecture hasn’t been decided yet. An AI system built without a few hard, opinionated calls at the start is very expensive scaffolding around a very expensive hole.
Chapter six
They’ve read the papers. They’ve prototyped on weekends. They could probably get this over the line — if they were allowed to stop doing four other things at once. What they don’t need is another motivational kickoff. What they need is a senior engineer alongside them who has shipped this kind of system before, who can take the hard calls off the lead’s desk, and who leaves before the dependency becomes a habit.
What it actually looks like to work with me.
A structured read of the existing code, prompts, evals, infra and architecture decisions. You get a written assessment: what is sound, what is risky, what you can ship from here, and where the repair has to start.
The notebook or demo is real, but the path to production isn’t. I work inside your codebase and take the system from “it worked once” to something your team can operate, monitor and change without me.
I sit in your repo, your stand-ups and your reviews, as a hands-on peer to your tech lead. The goal is always to leave with the team stronger than when I arrived, and with an exit date on the calendar.
Not “what is a transformer.” A working session for senior devs: prompt architecture, evaluation, cost, failure modes, and how to integrate LLMs into a .NET or polyglot codebase without losing your testing discipline.
A focused session on a single choice you’re about to make: vendor, architecture, hire, go/no-go. I’ve been on the other side of this call many times. Sometimes the most useful thing is someone with nothing to sell you that day.
Evidence, not endorsements.
I won’t quote customers I can’t clear. Here are things you can actually read, use or run yourself:
Good. I’m not trying to replace a delivery partner. An independent review is often most useful precisely when there already is a vendor — because I’m the person in the room with no follow-on sale to protect.
Then don’t hire me for training. Hire me as a senior engineer for a fixed window, with a defined exit. If you come away more capable as a team, that’s a side-effect, not an invoice line.
Yes. A lot of my recent work is exactly that: LLM-backed features in C#/F# codebases, where the team doesn’t want to rewrite half of production in Python to get there.
Because every engagement has a named exit, a written handover, and a team that’s measurably more capable at the end than the start. If I can’t describe that up-front for your situation, I’ll say so and decline.
That’s fine. A 90-minute second-opinion call costs you an afternoon and often saves a quarter. Start there.
About the person doing the work.
Independent engineer. Two decades of production software, the last several years focused on the space where LLM-based systems meet real codebases, real teams and real constraints. Co-founder of Cumin & Potato GmbH and the PXL Clock product. Maintainer of open-source libraries in the .NET and F# community. German-based, working in English and German, remote-first, on-site where it helps.
Describe your situation in a paragraph. I’ll reply within two working days with one of three things: a yes, a no, or a better person for your problem.