Emmanuel Genard

Scientific Management For AI Workers

We need Scientific Management (Taylorism) for AI coding agents. Specify-then-verify isn’t enough. Agents still decide how to implement the specification, which produces inconsistent results even with the best models. LLM-generated code that demands constant rework is a waste, even when the LLM does the rework. Scientific management can get high-quality output the first time.

The core idea: managers study the work, break jobs into small tasks, identify the best method for each task, then train and supervise workers to follow that method. Terrible for humans. Perfect for AI. And we already do most of it.

We encode best practices in frameworks. We already break features into small tasks. And within a specific programming language, application framework, domain, codebase, business stage, and set of business resources, there is one best way to do something.

Take adding a feature: there’s a specific way to add a DB column, update a service, add a controller, update the UI, and decide what and how to test. Each of those steps breaks into smaller steps, each with a defined method.
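One way to picture that breakdown, as a minimal sketch: each task carries its own prescribed method, and the agent's instructions are generated from it rather than left open. The `Task` class, the `render_prompt` helper, and the step text below are hypothetical, made up for illustration, and not taken from a real codebase or tool.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A unit of work small enough to have one prescribed method."""
    name: str
    method: list[str] = field(default_factory=list)      # ordered steps the agent must follow
    subtasks: list["Task"] = field(default_factory=list)  # smaller tasks, each with its own method


def render_prompt(task: Task, depth: int = 0) -> str:
    """Flatten a task tree into indented, numbered instructions."""
    pad = "  " * depth
    lines = [f"{pad}{task.name}:"]
    for i, step in enumerate(task.method, start=1):
        lines.append(f"{pad}  {i}. {step}")
    for sub in task.subtasks:
        lines.append(render_prompt(sub, depth + 1))
    return "\n".join(lines)


# Hypothetical example; the steps are placeholders for a team's own conventions.
add_feature = Task(
    name="Add feature",
    subtasks=[
        Task("Add DB column", ["Write a reversible migration", "Give the column a default"]),
        Task("Update service", ["Extend the existing service class", "Return the new field"]),
    ],
)

prompt = render_prompt(add_feature)
print(prompt)
```

The point of the structure is that the method lives in data that can be reviewed and versioned, so the agent is handed the how along with the what.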

I’m still experimenting with how to apply this broadly, but I’ve already found that rigorously defined, small-enough tasks produce consistent, reliable output from cheaper open-source models. Over time, the best way to enforce best practices will be to embed them in the codebase itself. LLMs are amazing pattern-matching machines. Give them the best patterns you can.

When I gave an agent a specification and a verification method, I thought I’d covered enough. I hadn’t. The agent falls back on heuristics from its training data, and those are almost never the best approach for my specific context. Scientific management closes that gap: rigorous task definitions yield more reliable output, and once each task is rigorously defined, I can find the cheapest way to get it done.
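Finding the cheapest way to get a task done can be sketched as a simple search: try candidates from cheapest to most expensive and keep the first whose output passes the task's verification. Everything here is an assumption for illustration: `Candidate`, `cheapest_passing`, the cost figures, and the stand-in generators, which return fixed strings instead of calling a real model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Candidate:
    name: str
    cost: float                      # hypothetical cost per task
    generate: Callable[[str], str]   # stand-in for a real model call


def cheapest_passing(
    task: str,
    candidates: list[Candidate],
    verify: Callable[[str], bool],
) -> Optional[Candidate]:
    """Return the lowest-cost candidate whose output passes verification."""
    for candidate in sorted(candidates, key=lambda c: c.cost):
        if verify(candidate.generate(task)):
            return candidate
    return None


def verify(code: str) -> bool:
    """Run the generated code and check one known case.

    exec is acceptable here only because the inputs are the fixed strings below.
    """
    namespace: dict = {}
    exec(code, namespace)
    return namespace["add"](2, 3) == 5


# Fake generators: "tiny" returns a buggy answer, the others a correct one.
candidates = [
    Candidate("large", 1.00, lambda task: "def add(a, b):\n    return a + b"),
    Candidate("small", 0.05, lambda task: "def add(a, b):\n    return a + b"),
    Candidate("tiny", 0.01, lambda task: "def add(a, b):\n    return a - b"),
]

choice = cheapest_passing("Write add(a, b)", candidates, verify)
```

With these stand-ins, `choice` is the "small" candidate: "tiny" is cheaper but fails verification. A search like this only makes sense once the task definition and its check are rigorous enough to trust.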

Published: 2026-04-14

Last Edited: 2026-04-14