Reading list — AI-engineer talk

Reference material

The sources specifically referenced or used in the talk.

The Goal: A Process of Ongoing Improvement

Eliyahu M. Goldratt & Jeff Cox · 1984

The foundational text of the Theory of Constraints. Written as a novel to illustrate the concepts narratively. In the grand tradition of business novels, it is not a good novel, but it illustrates the point.
Time Warp: The Gap Between Developers' Ideal vs Actual Workweeks in an AI-Driven Era

Sukrit Kumar, Drishti Goel, Thomas Zimmermann, Brian Houck, B. Ashok, Chetan Bansal (Microsoft Research) · 2025

The peer-reviewed anchor for the talk's 'coding is only ~11% of the workweek' claim. ICSE-SEIP 2025 Distinguished Paper; survey of 484 Microsoft developers. Actual workweek shares: Communication & Meetings ≈12%, Coding ≈11%, Debugging ≈9%, Architecting & Design ≈6%, Code Review ≈5%. Developers' *ideal* workweek would push coding to ≈20% — even at the ideal, coding is a minority of the week. This is the empirical floor under the talk's Amdahl-style arithmetic about how much faster the system can possibly get.
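
The Amdahl-style arithmetic this entry refers to can be sketched as below. This is a minimal illustration rather than the talk's own calculation: it assumes the survey's ≈11% coding share is the only part of the week that AI accelerates, and the function name `overall_speedup` is hypothetical.

```python
def overall_speedup(fraction: float, local_speedup: float) -> float:
    """Amdahl's law: overall speedup when only `fraction` of the total
    time is accelerated by a factor of `local_speedup`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if local_speedup <= 0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Assumption: coding is the only accelerated share of the workweek.
CODING_SHARE = 0.11

for label, factor in (("2x", 2.0), ("10x", 10.0), ("unbounded", float("inf"))):
    gain = overall_speedup(CODING_SHARE, factor) - 1.0
    print(f"coding speedup {label}: whole week {gain:.1%} faster")
```

Under that assumption, even an unbounded coding speedup caps the whole-week gain at 1 / 0.89, roughly 12%.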
AI and Engineering Velocity: A Longitudinal Analysis

DX (Noda & Houck) · 2026

The source of the talk's headline enterprise number: across roughly four hundred organisations tracked for sixteen months, AI tool usage rose about sixty-five percent while median PR throughput rose only about ten percent. That is a real but modest uplift in the 5–15% range, not the 10× that weekend experience suggests. It is the cleanest published instance of the pattern the whole talk is about: the adoption line climbs, the cost line climbs, the throughput line barely does.
The One Number You Need to Increase ROI Per Engineer

DX (getdx.com)

The basis for the budget arithmetic on Slides 14 and 20. DX publishes a calibration linking movement in its Developer Experience Index (DXI) to engineering time — about thirteen minutes per developer, per week, per DXI point, which annualises to roughly ten hours per engineer, per DXI point, per year. The talk uses that ten-hour conversion directly: three DXI points across six hundred engineers ≈ 18,000 hours/year ≈ $1.5M at a blended rate. Vendor-published (DX sells the engineering-intelligence platform SEEK runs), so it is presented as a vendor figure rather than an independent result — but it is the actual source of the talk's dollar claims.
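
The budget arithmetic in this entry can be reproduced in a few lines. This is a sketch under stated assumptions: the function name `annual_hours_saved` is hypothetical, and the blended hourly rate is not given in the entry, so it is back-solved here from the quoted $1.5M and 18,000 hours rather than taken from DX or the talk.

```python
def annual_hours_saved(dxi_points: float, engineers: int,
                       hours_per_point: float = 10.0) -> float:
    """Hours per year implied by a DXI movement, using the rounded
    ten-hours-per-engineer-per-point conversion quoted above."""
    return dxi_points * engineers * hours_per_point


hours = annual_hours_saved(dxi_points=3, engineers=600)

# Assumption: the blended rate is inferred from the quoted totals,
# not stated in the source.
implied_rate = 1_500_000 / hours

# Cross-check of the annualisation: 13 minutes a week over a full
# 52-week year, before any allowance for leave or holidays.
hours_over_52_weeks = 13 * 52 / 60

print(f"{hours:,.0f} hours/year")
print(f"implied blended rate: ${implied_rate:.2f}/hour")
print(f"13 min/week over 52 weeks: {hours_over_52_weeks:.1f} hours/year")
```

The full-year figure comes out slightly above ten hours, so the "roughly ten hours" conversion is consistent with counting somewhat fewer than 52 working weeks.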

State of AI-assisted Software Development 2025

DORA (Google Cloud) · 2025

AI doesn't fix a team; it amplifies what's already there. DORA's 2025 reading is a partial reversal of 2024 — a positive relationship between AI adoption and delivery throughput, but still a negative one with delivery stability. Teams with loosely coupled architectures and fast feedback loops see the gains; tightly coupled, slow-process teams do not. Large, multi-year survey instrument; correlational not causal.
The Phoenix Project: A Novel about IT, DevOps, and Helping Your Business Win

Gene Kim, Kevin Behr & George Spafford · 2013

The widely-read application of Goldratt's Theory of Constraints to software delivery.
The Principles of Product Development Flow

Donald G. Reinertsen · 2009

A rigorous treatment of product-development flow grounded in queueing theory and economics — the same 'manage the bottleneck, manage the queue' logic as Theory of Constraints, with the maths attached. The ideas the talk leans on: invisible, unmanaged queues are the root cause of poor product-development performance; large batch sizes increase cycle time and delay feedback; the *boundary object* a team aligns around (a specification, a design doc) governs the speed of cross-functional alignment. Practitioner book built on established theory; well-respected in Lean product development.
Accelerate: The Science of Lean Software and DevOps

Nicole Forsgren, Jez Humble & Gene Kim · 2018

The empirical backbone underneath every DORA report since. Defines the four delivery metrics — deployment frequency, lead time for changes, change-failure rate, and mean time to recovery — that the talk treats as the real measure of system throughput, as distinct from coding-segment activity. Repeatedly finds that architectural and process decoupling (teams that can deploy without cross-team approval) is among the strongest predictors of delivery performance, which is exactly the mechanism behind DORA 2025's 'AI amplifies what's already there' finding.
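
As a rough illustration of how the four delivery metrics differ from coding-segment activity, the sketch below computes them from a list of deployment records. The `Deployment` record shape, the `four_key_metrics` function, and the sample data are all hypothetical; the book and the DORA reports define the metrics, not this implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime                  # change was committed
    deployed_at: datetime                   # change reached production
    failed: bool = False                    # deployment caused a failure
    restored_at: Optional[datetime] = None  # service restored, if it failed


def four_key_metrics(deployments: list[Deployment], window_days: int) -> dict:
    """Compute the four delivery metrics over an observation window."""
    if not deployments:
        raise ValueError("need at least one deployment")
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    recoveries = [d.restored_at - d.deployed_at
                  for d in failures if d.restored_at is not None]
    return {
        "deployment_frequency_per_day": len(deployments) / window_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_recovery": (sum(recoveries, timedelta()) / len(recoveries)
                                  if recoveries else None),
    }


# Hypothetical sample: four deployments in one week, one of which failed.
t0 = datetime(2025, 1, 6, 9, 0)
sample = [
    Deployment(t0, t0 + timedelta(hours=4)),
    Deployment(t0, t0 + timedelta(hours=8), failed=True,
               restored_at=t0 + timedelta(hours=9)),
    Deployment(t0 + timedelta(days=1), t0 + timedelta(days=1, hours=6)),
    Deployment(t0 + timedelta(days=2), t0 + timedelta(days=2, hours=2)),
]
metrics = four_key_metrics(sample, window_days=7)
print(metrics)
```

None of the four inputs counts commits, lines, or suggestions accepted, which is the distinction the entry draws between system throughput and coding activity.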
The SPACE of Developer Productivity: There's more to it than you think

Nicole Forsgren, Margaret-Anne Storey, Chandra Maddila, Thomas Zimmermann, Brian Houck, Jenna Butler · 2021

Peer-reviewed; the direct response to 'measure developers by lines of code or commits.' Argues productivity is multidimensional across five axes — Satisfaction & well-being, Performance, Activity, Communication & collaboration, Efficiency & flow — and that activity metrics in isolation mislead. The talk uses SPACE to defend the move from adoption metrics (an activity proxy) to a developer-experience composite (DXI), which is the find-the-constraint mechanism the SEEK case study is built on.
DevEx: What Actually Drives Productivity

Abi Noda, Margaret-Anne Storey, Nicole Forsgren & Michaela Greiler · 2023

Peer-reviewed sequel to SPACE from the same author lineage. Distils developer experience into three drivers: feedback loops, cognitive load, and flow state. The framework the talk uses to explain *why* the SEEK interventions worked where they worked — CI/CD speed, migrations, and documentation moved feedback loops and cognitive load locally — and why the cross-team coordination and decision-making drivers were stubborn until the spec-driven-development play. The academic framing behind the DXI calibration the talk costs out.
Seeing Like a State: How Certain Schemes to Improve the Human Condition Have Failed

James C. Scott · 1998

Referenced only vaguely and in passing, but it has important ideas for thinking about organisational change. Scott's central thesis, that large administrative systems can only represent complex social reality through a 'heroic and greatly schematised process of abstraction and simplification', is the philosophical warrant for why top-down productivity metrics mislead. Any metric that flattens a software organisation's actual working patterns (trust, tacit knowledge, informal co-ordination) into a single legibility schema repeats Scott's 'high-modernist' error. The talk's practical response (ask 20 people, run with it, measure the results, reorient) is a practitioner's version of Scott's 'metis': cheap local knowledge that works in an irreducibly illegible environment.
The AI Inventory Trap: Why Faster Upstream Makes You Slower End-to-End

Eric Bowman · 2026

The closest external article to the talk's exact spine. Bowman argues that AI accelerates coding but increases the rate of work creation faster than the rate of work completion — so when downstream stations are near capacity, work-in-progress grows, lead time rises even as cycle time falls, and the constraint shifts downstream. Includes the Amdahl-flavoured arithmetic — a 2× speedup of one-third of the cycle yields less than 17% end-to-end improvement — that the talk also relies on. Practitioner essay, no controlled study, but a clean independent statement of the framing.
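
Both halves of the argument summarised in this entry can be checked with a small sketch. The first part reproduces the one-third arithmetic; the second is a deliberately crude, deterministic two-station model of work piling up in front of a downstream station. The function `simulate_wip` and all of the rates are hypothetical illustration values, not figures from Bowman's essay.

```python
# Part 1: a 2x speedup applied to one-third of the cycle.
accelerated_share = 1 / 3
new_cycle = (1 - accelerated_share) + accelerated_share / 2
improvement = 1 - new_cycle
print(f"end-to-end time saved: {improvement:.1%}")


# Part 2: work created upstream versus work completed downstream.
def simulate_wip(days: int, created_per_day: int,
                 downstream_capacity: int) -> list[int]:
    """Return the queue length in front of the downstream station
    at the end of each day."""
    queue = 0
    history = []
    for _ in range(days):
        queue += created_per_day                  # upstream hands work over
        queue -= min(queue, downstream_capacity)  # downstream finishes what it can
        history.append(queue)
    return history


# Assumed rates: downstream can complete 10 items a day in both runs.
before = simulate_wip(days=20, created_per_day=9, downstream_capacity=10)
after = simulate_wip(days=20, created_per_day=18, downstream_capacity=10)

print(f"queue after 20 days, before: {before[-1]} items")
print(f"queue after 20 days, after:  {after[-1]} items")
print(f"wait for the newest item, after: {after[-1] / 10:.0f} days")
```

In this toy model, doubling the creation rate against a fixed downstream capacity leaves completed output unchanged at 10 items a day while the queue, and with it the wait for any new item, grows without bound.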

Further reading