Three Things AI Hasn't Replaced — A Cross-Domain Field Guide | Outline

Outline only. v0.2 working outline — restructured from "flagship comprehensive merger of A + B" to "cross-domain field guide." Now meaningfully different from B: B is poker-anchored with adjacent-domain coda; C is cross-domain with poker as one case study among several. v1 prose drafted only on flagship-outlet ask. Editorial questions in yellow callouts.

Audience: AI researchers, decision-science readers, executives in fields where AI is being marketed as expert replacement (medicine, law, finance, education, gameplay), policy makers thinking about AI's impact on expert workforces, sophisticated cultural readers.
Outlets: Atlantic feature / HBR feature / New Yorker feature / IEEE Spectrum feature / MIT Tech Review feature.
Length: ~6,000–7,500 words target.
Tone: Cross-domain, evidence-grounded, sharp on structural argument, accessible across expert audiences. Less poker-jargon than B; more medicine/law/finance vocabulary.
Sub-series role: C is the cross-domain field guide. Companions: A (poker-community pedagogy) and B (poker-anchored cultural piece). C uses poker examples but is not poker-anchored.

Working title and alternates

⭐ "Three Things AI Hasn't Replaced — A Cross-Domain Field Guide" — preserves the "Three Things" spine; signals multi-domain scope
"What AI Doesn't Replace: A Field Guide for Expert Decision-Makers" — cleaner, drops the numbered list framing
"The Three Limits of AI Across Expert Decision-Making" — academic register
"Where AI Stops: A Field Guide for Doctors, Lawyers, Traders, and Coaches" — explicit audience listing
"When the Algorithm Meets the Expert: Three Gaps That Haven't Closed" — narrative, op-ed-leaning

One-line goal

Three structural gaps (autonomy, outperformance, pedagogy) that appear in every expert decision domain where AI is being marketed as replacement. Poker, medicine, law, finance, education each get treated as parallel case studies — not as adjacent-domain footnotes. The cross-domain pattern is the real argument: this isn't a poker-specific problem.

Story arc

The cultural narrative — "AI is replacing the radiologist, the lawyer, the trader, the coach, the editor" — is being sold across many expert domains simultaneously. Each domain has its own marketed AI products, its own boom industry, its own headline experiments. When examined carefully, the narrative fails on the same three structural fronts everywhere: AI methodology depends on humans (autonomy), AI hasn't been demonstrated to outperform top experts at scale (outperformance), and AI doesn't transmit the tacit knowledge experts apprentice into (pedagogy). The article walks through the pattern across domains; the cross-domain consistency is the argument.

Section-by-section beats

§1 · ~500 words

Open: the cultural narrative is the same in every domain

The radiologist's AI replacement, the contract-review AI, the algorithmic trader, the AI poker coach, the AI editor. Each one has its own marketed product, its own boom industry, its own breathless headline experiment. They all share the same cultural arc: "the algorithm is catching up to the expert; the expert's job is at risk." Position this article as a cross-domain field guide: examined carefully, the narrative fails on the same three structural fronts wherever it's tested.

§2 · ~1,200 words

The autonomy gap — AI methodology is human-in-the-loop in every expert domain

The structural property: AI in expert decision domains doesn't run autonomously. Humans build it, audit it, override it, curate its training data, define its edge cases, decide when to deploy it and when to suspend it. Examples per domain:

Poker — tree designers (Tombos21's "heart of the problem" quote), reward functions, anti-cheat review queues (DLv4 + analyst loops)
Medicine — radiology-AI training-data curation, clinician override workflows, edge-case review by senior radiologists, FDA approval gating
Law — contract-review AI's attorney-review queues, confidentiality / privilege flagging, judgment about ambiguous clauses still escalates to humans
Finance — algorithmic-trading systems' human risk officers, regime-shift kill switches, regulatory approval and audit
Education — adaptive-learning tutoring AI's curriculum designers, content moderation, teacher escalation queues

The cross-domain takeaway: even the most automated-looking AI runs on humans. The question for any AI claim in any domain is where the humans are, not whether. The marketing erases the humans; the operating reality includes them.

§3 · ~1,600 words

The outperformance gap — even where AI matches, it's narrow-setting overfitting, not generalization

Two layers — both important; the second is what the marketing erases.

Layer 1 — the empirical record across domains. No expert decision domain has a clean public demonstration of AI outperforming the best experts at the variety of work the actual job entails:

Poker — Cepheus, Libratus, DeepStack, Pluribus, with caveats their own authors disclose
Go — AlphaGo, AlphaGo Zero, MuZero (each in narrow ruleset; no generalized board-game master AI)
Medicine — radiology AI matches but doesn't surpass top radiologists at the scope of cases the job actually presents (Stanford / Google Health / DeepMind studies all show match-or-near-match, not surpass)
Law — contract-review AI matches junior associates on narrow tasks; doesn't approach partner-level judgment on novel situations
Finance — algorithmic trading succeeds in specific market regimes; fails under regime shifts; the best systematic shops still hire human researchers and risk officers
Editing / writing — LLM AI matches grammar-and-style review but doesn't approach editor-level judgment on what's worth saying

Layer 2 — the overfitting layer. Even where AI matches top experts on a specific benchmark, the matching is typically achieved by throwing massive data and compute at one narrow setting — one game, one image type, one document genre, one market regime, one curriculum — and overfitting the model to that setting. Change the setting and performance degrades. The pattern holds across game-playing AI: AlphaStar plays one version of StarCraft against a specific opponent distribution; OpenAI Five plays one specific Dota 2 hero matchup; Suphx plays Mahjong at a specific ruleset; Pluribus plays 6-max NLHE at exactly the configuration its training expected. The pattern holds across professional-AI: radiology AI trained on one population's images degrades on another's; legal AI trained on US case law fails in other jurisdictions.

The claim "AI matches the best" is in practice "AI matches the best in this one narrow setting we trained for." When the setting shifts — when the rules change, the population changes, the distribution shifts, when true reasoning and adaptation is required — current AI overfits. The technology hasn't demonstrated the leap to generalized expert reasoning that the marketing implies. The two layers compound: rare rigorous demonstrations across domains, and even those are narrow-setting matching, not the generalized expert-equivalent reasoning the headlines suggest.

§4 · ~1,400 words

The pedagogy gap — AI doesn't transmit the tacit knowledge experts apprentice into

The third structural property: in every expert domain, the formalism (the textbook, the chart, the model's output) is the floor. The expert's role is the application of the formalism plus everything the formalism leaves out. AI implements the formalism. It doesn't apprentice you into the judgment.

Domain-by-domain:

Poker coaching — solver charts give you the equilibrium frequencies. Coaches teach the deviation, the read, the tilt management, the study workflow. The chart is the floor; the coach raises the ceiling. (Cross-link to A.)
Medical training — textbooks and AI-assisted decision support give you guidelines. Residency, mentorship, and clinical reasoning under uncertainty are what produce the senior physician. AI can summarize a chart; it can't conduct rounds.
Legal apprenticeship — AI can produce a clean memo from a database. Junior associates learn what cases to bring up in front of a partner, what questions a judge will ask, what risk a client is willing to absorb. Tacit, transmitted in chambers.
Trading and risk — quants ship models. Senior traders teach what model failure looks like, how to size during regime shifts, what stories the market is telling. AI runs the model; the trader runs the meta-model.
Teaching — adaptive AI delivers content. Master teachers diagnose why a student isn't learning, sequence the next lesson, hold the room together. Apprenticed in classrooms, not in textbooks.

The cross-domain takeaway: in every expert domain, the formalism is necessary but insufficient. The expert's role is the structured application of the formalism plus the irreducible tacit knowledge of how to apply it. AI implements the first; humans transmit the second.

§5 · ~700 words

Synthesis — three gaps in every expert decision domain

Stack the three gaps. Each is independent. Each appears in every expert domain examined carefully. Each is an open structural problem, not a "we just need more compute" problem.

Building the AI (autonomy) — the methodology depends on humans across all domains
Outperforming the AI (outperformance) — narrow-setting matching, not generalized surpassing
Teaching past the AI (pedagogy) — formalism is the floor, tacit transmission is the ceiling

The right diagnostic for any AI claim in any expert domain: "Where is the human in the loop? Where has the AI's outperformance been measured against the best of you, across the variety of work the actual job entails? Where does the formalism the AI implements stop reaching the practice?" When all three answers are clean, the AI claim survives. When any one is muddy, the claim doesn't.

§6 · ~800 words

Implications for expert decision-makers

What experts in domains being told an AI is replacing them should make of this. The three gaps as a diagnostic to apply to any specific AI claim in their field. The next decade isn't AI versus experts; it's experts who learn to work with AI versus experts who don't. The amplification, not the replacement, is the operating model that survives the three gaps.

Specific implications by domain:

Coaches teaching with AI tools they understand the limits of
Radiologists working alongside AI for triage; final read still human
Lawyers using AI for discovery, judgment for strategy
Traders running AI models with human risk oversight
Teachers using adaptive AI for delivery, themselves for diagnosis and motivation

The amplification model is what scales. The replacement model fails the three gaps test in every domain.

§7 · ~400 words

Close

The cultural narrative will keep selling the same story across new domains as AI products launch. The diagnostic stays the same. Three gaps. Three independent open problems. Three reasons the expert's role has been amplified, not replaced. The marketing keeps trying to write that out of the picture; the operating reality keeps putting it back in.

Length budget

§	Beat	Words
§1	Open (cross-domain narrative)	500
§2	Autonomy gap (cross-domain)	1,200
§3	Outperformance gap (with overfitting layer, cross-domain)	1,600
§4	Pedagogy gap (cross-domain)	1,400
§5	Synthesis	700
§6	Implications for expert decision-makers	800
§7	Close	400
Total		~6,600

Within the 6,000–7,500 target. Tight enough for HBR feature length; expand §3 or §4 to 7,500 for Atlantic / New Yorker if needed.

How C differs from A and B

Article	Anchor	Audience	Domain coverage
A	Coach pedagogy in poker	Coaches running schools, serious students	Poker only
B	Three gaps in poker (with adjacent-domain coda)	Tech-media + cultural readers	Poker primary; medicine/law/finance brief mention at end
C (this piece)	Three gaps across expert decision-making	AI researchers, decision-science readers, executives + policy	Poker as one case study; medicine, law, finance, education each get full treatment

Cross-references

A — coach pedagogy in poker. C cites for the poker-pedagogy example in §4.
B — poker-anchored cultural piece. C and B share the three-gap framework but apply it differently (B = poker depth; C = cross-domain breadth).
Manifesto (A1) — field-level argument behind the poker examples in §3.

Open Editor's Qs

⚑ Q1 — Title Pick from the candidates above. My recommendation: ⭐ "Three Things AI Hasn't Replaced — A Cross-Domain Field Guide" (preserves the "Three Things" spine, signals multi-domain). Alternates: "What AI Doesn't Replace: A Field Guide for Expert Decision-Makers" (cleaner) or "When the Algorithm Meets the Expert" (op-ed).

⚑ Q2 — Domain coverage depth Five domains (poker, medicine, law, finance, education) each get 1–2 paragraphs per gap-section. Is that the right balance? Alternative: drop to three domains (poker + 2) for tighter argument; or expand to six domains (add policy / military / engineering). My recommendation: five as planned — gives the cross-domain argument enough breadth without diluting depth.

⚑ Q3 — Domain expertise We have public-knowledge-level depth in poker and AI generally; we don't have deep expertise in medicine, law, finance, or education. The piece relies on cited examples (specific products, specific named studies, specific failure modes). Each cited example needs to be a real, sourced reference. My recommendation: research each domain to get one specific named AI product + one specific cited study + one specific failure mode + one specific expert workflow. ~half-day per domain.

⚑ Q4 — Length 6,000 / 7,000 / 7,500 words? My recommendation: 6,500 — tight enough for HBR feature, can scale up to 7,500 for Atlantic / New Yorker if needed.

⚑ Q5 — Production timing Write A first (poker community), B second (Atlantic/HBR poker), C only on flagship-outlet ask? Or write C as a parallel ambitious piece and pitch it directly to Atlantic / HBR / New Yorker? My recommendation: still write C only on flagship-outlet ask — the cross-domain version requires real research per domain (Q3), and we don't ship without an outlet on the line.

⚑ Q6 — Co-byline Solo (Thanh) or co-byline with a domain expert? E.g., Thanh + Annie Duke / Maria Konnikova for the decision-science angle; or Thanh + a senior medical-AI researcher; or Thanh + a senior legal-tech researcher. Co-byline lifts the cross-domain authority significantly.