Two pipelines, many agents. How the constellation turns research into published comics and autonomous concept art.
From Turing to LLMs and Beyond is a 10-issue comic series that tells the story of computing from Alan Turing's 1936 thought experiment to modern multi-agent AI systems — produced by a human-AI team using a constellation of specialized agents.
The production pipeline mirrors how we believe content pipelines for film and games work, inspired by years spent as a computer graphics engineer wandering into production-pipeline and art sessions at SIGGRAPH and GDC. A Researcher gathers historical facts, a Writer crafts narratives, an Editor and a Red Team challenge the script, a Layout Designer composes panels, and an Image Generator renders the art. Every issue passes through multiple quality gates before publication.
Each issue passes through a multi-agent pipeline — content is locked before any expensive image generation begins.
A separate pipeline generates concept art autonomously — crawling real science sources, finding narrative connections, and producing three-beat visual sequences grounded in actual breakthroughs.
The comic production pipeline above requires a human orchestrator at every stage. The concept art pipeline is different: it runs autonomously once launched, making creative decisions through a chain of specialized roles — research, education, direction, critique, generation, and evaluation.
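A chain of roles where any role can veto the work-in-progress can be sketched as follows. This is a minimal illustration, not the project's actual code; the role functions and the shared "concept" dict are assumptions.

```python
from typing import Callable, Optional

# A role takes the concept so far and returns an updated concept,
# or None to reject it (acting as a quality gate).
Role = Callable[[dict], Optional[dict]]

def run_pipeline(concept: dict, roles: list[Role]) -> Optional[dict]:
    """Pass the concept through each role in order; any role may veto."""
    for role in roles:
        concept = role(concept)
        if concept is None:  # a gate (e.g. the critic) rejected the concept
            return None
    return concept

# Toy stand-ins for research, direction, and critique
researcher = lambda c: {**c, "facts": ["AlphaFold, 2020"]}
director   = lambda c: {**c, "beats": ["before", "moment", "after"]}
critic     = lambda c: c if c.get("facts") else None  # reject ungrounded ideas

result = run_pipeline({"topic": "protein folding"}, [researcher, director, critic])
```

The key property is that rejection happens inline: a concept the critic vetoes never reaches the expensive generation stage.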
Deep Research Sweep crawls 40+ real web sources — DeepMind, NASA, Nature, arXiv, among others — extracting breakthroughs with visual descriptions, dates, and source URLs. The current database contains 61 verified breakthroughs.
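A record in such a database might look like the sketch below. The field names and example values are illustrative assumptions, not the pipeline's actual schema.

```python
from dataclasses import dataclass

@dataclass
class Breakthrough:
    """One verified entry in the breakthrough database (hypothetical shape)."""
    title: str
    year: int
    source_url: str          # where the crawl found it, for verification
    visual_description: str  # what the moment looked like, feeding later prompts

db = [
    Breakthrough(
        title="AlphaFold solves protein structure prediction",
        year=2020,
        source_url="https://example.org/alphafold",  # placeholder URL
        visual_description="a ribbon diagram folding into its predicted shape",
    ),
]
```

Keeping the source URL on every record is what lets downstream agents stay grounded: any claim in a generated concept can be traced back to a real page.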
Deep Research Enhancement follows links from hub pages, fetches scientific reference images, and analyzes them with Claude vision to build rich visual descriptions. 98 reference images have been analyzed this way — the pipeline sees real imagery before generating anything.
Science Educator reads the full breakthrough database and finds connections: how one discovery enabled another, where themes recur across fields, what stories span decades. It produces narrative threads — each a 3-beat arc (setup, breakthrough, consequence) connecting 2-4 breakthroughs. 27 threads have been identified so far, from quantum computing's journey to AI-driven protein folding.
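A narrative thread can be modeled as a small record linking back into the breakthrough database by ID. Again a sketch under assumptions; the real representation may differ.

```python
from dataclasses import dataclass

@dataclass
class Thread:
    """A 3-beat arc (setup / breakthrough / consequence) over 2-4 breakthroughs."""
    title: str
    breakthrough_ids: list[int]  # references into the breakthrough database
    setup: str
    breakthrough: str
    consequence: str

    def is_valid(self) -> bool:
        # The arc must span at least two and at most four breakthroughs.
        return 2 <= len(self.breakthrough_ids) <= 4

thread = Thread(
    title="From sequence to structure",
    breakthrough_ids=[12, 47],
    setup="Decades of painstaking manual crystallography",
    breakthrough="Structure predicted from sequence alone",
    consequence="Protein design becomes a computational discipline",
)
```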
The Director designs three-beat visual sequences (before / moment / after) grounded in real breakthroughs and narrative threads. Each beat gets a detailed image prompt, emotional tone, and compositional direction.
The CREA Critic challenges every concept before any image is generated — rejecting weak ideas, clichéd compositions, or scenes that don't connect to real science. This is creative abrasion by design: bad concepts are caught before they waste generation time.
Approved concepts go to the Sparky API — FLUX.2-dev running on a DGX Spark. Every image gets an Always-Refine pass via image-to-image editing, using the initial output as a reference to improve detail and coherence. Visual continuity across beats is maintained by passing each beat's refined image as the reference for the next.
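The beat-to-beat continuity loop described above can be sketched as follows. `generate` and `refine` are hypothetical stand-ins for the actual backend calls; here they just record which reference image each step received, so the chaining logic is visible.

```python
from typing import Optional

def generate(prompt: str) -> str:
    """Stand-in for a text-to-image call (hypothetical)."""
    return f"raw({prompt})"

def refine(image: str, reference: str) -> str:
    """Stand-in for the Always-Refine image-to-image pass (hypothetical)."""
    return f"refined({image}, ref={reference})"

def render_triptych(prompts: list[str]) -> list[str]:
    images: list[str] = []
    reference: Optional[str] = None
    for prompt in prompts:
        raw = generate(prompt)
        # First beat refines against its own raw output; after that, each
        # beat's refined image becomes the reference for the next beat.
        refined = refine(raw, reference or raw)
        images.append(refined)
        reference = refined
    return images

beats = render_triptych(["before", "moment", "after"])
```

Threading the previous refined image through as the reference is what keeps characters, palette, and lighting consistent across the three beats.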
Every generated image faces three independent evaluations:
VLM Critique — Claude evaluates actual images (not just prompts), scoring composition, emotional impact, style coherence, narrative clarity, and grounding in real science. This catches problems that prompt-based audits miss entirely.
PickScore measures prompt-image alignment — how well the generated image matches what was asked for. HPSv2 measures aesthetic quality independent of the prompt. Together they provide automated scoring that complements the VLM's semantic evaluation.
A Sequence Continuity Check sends all three beats of a triptych to Claude together, verifying that the visual narrative reads coherently as a sequence — not just as individual images.
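The three evaluations above can be combined into a single publish/reject decision. The thresholds and the veto ordering below are illustrative assumptions, not the pipeline's real configuration.

```python
def passes_evaluation(vlm_score: float, pickscore: float, hpsv2: float,
                      continuity_ok: bool) -> bool:
    """Gate an image on all three evaluations (hypothetical thresholds)."""
    if not continuity_ok:        # the triptych must read coherently as a sequence
        return False
    if vlm_score < 0.7:          # VLM critique is the primary semantic check
        return False
    # PickScore covers prompt alignment, HPSv2 covers aesthetics.
    return pickscore >= 0.5 and hpsv2 >= 0.5

ok = passes_evaluation(vlm_score=0.85, pickscore=0.6, hpsv2=0.7, continuity_ok=True)
```

Treating each signal as a hard floor, rather than averaging them, means a single failed dimension (say, a beautiful image that ignores its prompt) cannot be rescued by high scores elsewhere.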
Results feed back into a creative memory: a winners board, mode switching (explore / exploit / pivot / thread), and a continuously updated HTML report with triptych sequences, auto-scores, and continuity badges.
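One way such mode switching could work is a simple heuristic over recent evaluation scores. The rules below are assumptions based only on the mode names mentioned above.

```python
def next_mode(recent_scores: list[float], on_thread: bool) -> str:
    """Pick the next creative mode from recent results (hypothetical rules)."""
    if not recent_scores:
        return "explore"               # no history yet: try new territory
    avg = sum(recent_scores) / len(recent_scores)
    if avg >= 0.7:
        # Strong results: keep developing the current thread if there is one,
        # otherwise exploit the style that is working.
        return "thread" if on_thread else "exploit"
    if avg < 0.4:
        return "pivot"                 # abandon a failing direction
    return "explore"                   # middling results: widen the search

mode = next_mode([0.8, 0.75], on_thread=False)
```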
AI image generation is probabilistic. Laptops appear in 1936 Cambridge. German text gets garbled. CRT monitors show up in the 1940s. Each problem was tracked, categorized, and systematically fixed.
Issues were tallied both by category and by resolution status.
"Not 'we haven't solved it.' We PROVED it CAN'T be solved."
— Scribble, Issue 1, Page 7