Outline only. This is the v0.1 working outline for Article B in the Coach + AI Critique sub-series. v1 prose will be drafted later, replacing the section-beat summaries below with full prose. Editorial questions are visible in yellow callouts.
Audience
Decision-science and AI-curious cultural readers; sophisticated coaches and players who want the broader frame; tech-media editors evaluating how to think about AI claims in adjacent decision domains.
Outlets
The Atlantic / HBR / IEEE Spectrum / Wired / New Yorker (op-ed-leaning).
Length
~5,000 words target.
Tone
Op-ed-leaning, cultural, evidence-grounded. Less "how to study," more "what does this technology actually do, and what doesn't it do?"
Sub-series role
B is the cultural / decision-science version. Companions: A (practical poker-community) and C (flagship comprehensive).

Working title and alternates

Original "AI Still Needs You. AI Still Hasn't Beaten You." was the seed; B expands the scope to include the pedagogy gap, making it a fuller decision-science piece.

One-line goal

A cultural-frame argument for the irreducibility of human judgment in modern gameplay AI. Three structural gaps the marketing keeps trying to write out: autonomy (AI methodology depends on humans), outperformance (AI hasn't beaten top humans in any published controlled setting), and pedagogy (GTO study alone doesn't make a player). The article uses poker as the case study but reaches for adjacent decision domains where the same pattern holds.

Story arc

The cultural assumption — across poker, finance, medicine, law, anywhere AI is being marketed as a replacement for expert judgment — is that the algorithms are catching up to humans. In poker, where we have the cleanest measurement environment for this kind of claim, the assumption fails on three fronts. The article walks through the three gaps with poker-specific evidence, then generalizes the lesson for other decision domains.

Section-by-section beats

§1 · ~400 words

Open: the cultural narrative we keep being sold

Not a poker-specific open. A cultural one: the AI that's going to replace your coach, your doctor, your lawyer, your editor. Poker is the cleanest case study because measurement is unambiguous — you can play hands and count the result. So poker is the right place to look for whether the marketing claim survives contact with the data.

§2 · ~1,000 words

The autonomy gap: AI methodology is human-in-the-loop

Walk through where humans are required at every step:

The cultural takeaway: even the most automated-looking AI runs on humans. The question for any AI claim in any domain is where the humans are, not whether.

§3 · ~1,400 words

The outperformance gap: even where AI matches, it's overfitting, not generalization

Two layers to this gap.

Layer 1: the empirical record. No public controlled setting has shown the best gameplay AI outperforming the best human players and coaches at scale.

Layer 2: the overfitting layer. Even where AI matches top humans on a specific benchmark, the match is typically achieved by throwing massive data and compute at one narrow setting (one game, one ruleset, one stack depth, one opponent pool) and overfitting the model to that setting. Change the setting and performance degrades. This pattern holds across game-playing AI more broadly: AlphaStar plays one version of StarCraft II against a specific opponent distribution; OpenAI Five plays Dota 2 with a restricted hero pool; Suphx plays Mahjong under one specific ruleset; Pluribus plays 6-max NLHE at exactly the configuration its training expected. There is no generalized AI poker player, meaning one that outperforms humans across the variety the actual game presents (different formats, stack depths, opponent populations, exotic variants, ruleset shifts).

The claim "AI matches the best" is in practice "AI matches the best in this one narrow setting we trained for." When the setting shifts (the rules change, true reasoning and adaptation are required, or the opponent pool falls outside the training distribution), the overfitting shows and performance degrades. The technology hasn't demonstrated the leap to general game-playing reasoning that the marketing implies.

The cultural takeaway: the headline "AI beats humans" claim, in poker as elsewhere, usually rests on one paper, one experiment, one narrow setting, and a set of caveats the marketing erases. When you pull the experiment apart, it's almost never what the headlines said. And even where the matching claim survives, what's been demonstrated is narrow-setting overfitting, not generalized reasoning. The two layers compound.

§4 · ~1,200 words

The pedagogy gap: GTO study alone doesn't make a player

Compressed version of Article A's §2 + §3. Eight foundation skills the chart teaches; eight ceiling skills the chart doesn't reach (live reads, opponent-specific deviations, ICM-heavy spots, tilt management, table selection, study-plan personalization, format-cross-pollination, multi-way / exotic formats).

The cultural takeaway: in any expert domain, the formalism (the textbook, the chart, the algorithm's output) is the floor. The expert's role is the application of the formalism plus everything the formalism leaves out.

§5 · ~700 words

Synthesis: three irreducible roles of human judgment

Stack the three gaps:

Each gap is independent. Each is an open problem, not a "we just need more compute" problem. Each generalizes beyond poker to any decision domain where AI is being marketed as a replacement for expert judgment.

The right question for any AI claim isn't "can the AI do this?" It's "which gap is this claim quietly conceding?"

§6 · ~700 words

What this means for AI in adjacent decision domains

Generalize. In medicine, law, finance, and education, the same three gaps appear. The cultural narrative ("the algorithm will replace the expert") fails on the same three fronts. The article points at examples (legal-doc AI that depends on attorney review queues; medical-imaging AI whose performance ceiling matches but doesn't beat top radiologists; financial-trading AI whose strongest deployments are exploitative rather than equilibrium-seeking).

This is where the piece earns the cultural / decision-science register. The poker case study isn't just about poker; it's a clean diagnostic instrument for a much broader class of AI claims.

§7 · ~500 words

Close: what coaches (and other experts) should make of this

Practical for the audience that reads Atlantic / HBR. If you're an expert in a domain being told an AI is replacing you, the three gaps are the diagnostic. Where is the human in the loop? Where has the AI's outperformance been measured against the best of you? Where does the formalism the AI implements stop reaching real practice? When all three answers are clean, the AI claim survives. When any one is muddy, the claim doesn't.

The next decade isn't AI versus experts. It's experts who learn to teach with AI versus experts who don't.

Length budget

§      Beat                 Words
§1     Open                 400
§2     Autonomy gap         1,000
§3     Outperformance gap   1,200
§4     Pedagogy gap         1,200
§5     Synthesis            700
§6     Adjacent domains     700
§7     Close                500
Total                       ~5,700

About 700 words above the 5,000-word target. Trim §6 if needed to meet an outlet's length cap.

Cross-references in the sub-series

Open Editor's Qs

⚑ Q1: Adjacent-domain claims in §6. How aggressive on the medicine / law / finance comparisons? My recommendation: 1 specific example per domain, sourced. Don't generalize beyond what the data supports.
⚑ Q2: Length. 5,000 vs 6,000 words? §6 is the swing variable. My recommendation: 5,000 with §6 tighter; expand to 6,000 if a flagship outlet wants the longer cultural feature.
⚑ Q3: Title. Pick from the candidates above. My recommendation: ⭐ "What the Algorithms Can't Do (And Why You Still Need Coaches and Human Experts)".
⚑ Q4: Outlet pitch order. HBR for the business-decision frame, Atlantic for the cultural frame, Wired as backup for the AI-critique frame? Or a different order?
⚑ Q5: Coach byline / co-byline. Should B carry a named-coach co-byline (Petrangelo, Brad Wilson, Annie Duke / Maria Konnikova for the decision-science angle)? Optional, but it lifts the piece for tier-1 outlets.