Deterministic Scaffolding for VLM Image Generation

The Pitch
Frontier models like Gemini 3.0 Pro and GPT-5 still cannot natively handle complex spatial tasks such as numbering a 50-step spiral game board (source: samcollins.blog). The Underdrawing Method uses deterministic SVG or Python scripts to create a structural scaffold before any pixels are generated. By separating logic from aesthetics, developers can force 100% accuracy in text and numbering, something native one-shot prompting still fails to deliver as of May 2026.
Under the Hood
Gemini 3.0 Pro and ChatGPT Images 2 consistently fail to number 50 consecutive items along a spiral when prompted natively (source: samcollins.blog). Asking GPT-5 to number a spiral is currently the quickest way to turn a logic problem into a surrealist painting. The Underdrawing Method avoids these hallucinations with a two-phase workflow: Layer 1 is a deterministic outline produced with SVG or Python, and Layer 2 uses a generative image-to-image model to apply textures over that outline (source: samcollins.blog).
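As a rough illustration of what Layer 1 could look like, here is a minimal Python sketch that places 50 numbered cells along an Archimedean spiral and emits them as SVG. It is not taken from the original post: the function names (`spiral_positions`, `build_scaffold_svg`), the canvas size, the number of turns, and the cell styling are all assumptions chosen for the example. Layer 2 (the image-to-image pass) is deliberately left out, since it depends on whichever model and API you use.

```python
import math


def spiral_positions(n=50, width=1000, height=1000, turns=3.5, margin=60):
    """Return n (x, y) cell centres along an Archimedean spiral, outermost first.

    Hypothetical helper: cells are spaced evenly in angle, not in arc length,
    so outer cells sit further apart than inner ones. Requires n >= 2.
    """
    cx, cy = width / 2, height / 2
    r_max = min(width, height) / 2 - margin  # outer radius, kept inside the canvas
    r_min = r_max * 0.15                     # inner radius where the spiral ends
    points = []
    for i in range(n):
        t = i / (n - 1)                      # 0.0 at the first cell, 1.0 at the last
        theta = 2 * math.pi * turns * t
        r = r_max - (r_max - r_min) * t      # radius shrinks linearly with angle
        points.append((cx + r * math.cos(theta), cy + r * math.sin(theta)))
    return points


def build_scaffold_svg(n=50, width=1000, height=1000, cell_radius=22):
    """Build the Layer 1 scaffold: one outlined circle and one number per cell."""
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" '
        f'height="{height}" viewBox="0 0 {width} {height}">',
        f'<rect width="{width}" height="{height}" fill="white"/>',
    ]
    # The numbering comes from enumerate(), so it cannot skip or repeat a value.
    for number, (x, y) in enumerate(spiral_positions(n, width, height), start=1):
        parts.append(
            f'<circle cx="{x:.1f}" cy="{y:.1f}" r="{cell_radius}" '
            f'fill="none" stroke="black" stroke-width="2"/>'
        )
        parts.append(
            f'<text x="{x:.1f}" y="{y:.1f}" font-size="18" '
            f'text-anchor="middle" dominant-baseline="central">{number}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)
```

Writing the return value of `build_scaffold_svg()` to a file such as `scaffold.svg` gives a plain black-and-white board that can be rasterized and handed to the image-to-image step as its structural reference. The point of the design is that the counting happens in ordinary code, so the generative model is only asked to restyle the board, never to number it.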
Research from WACV 2026 suggests that current AI editors fulfill only about 33% of precise editing requests correctly (source: WACV 2026 Paper #2231-2241). That figure is consistent with a persistent gap in the 2026 stack, one that external geometric constraints are meant to close. The Hacker News community views the method as a sophisticated evolution of early Stable Diffusion img2img workflows, now adapted for VLM reasoning (source: HN comment by vunderba).
Current limitations and unknowns:
- High technical friction requiring knowledge of SVG, Python, or Mermaid.
- Potential "Prompt Neglect" where models ignore descriptive style adjectives (source: HN).
- Increased agentic latency due to the multi-step code-and-vision execution.
- No public library yet exists to automate Layer 1 for non-engineers.
- Performance deltas between Claude 4.5 Opus and Gemini 3.0 Pro are currently undocumented.
Marcus's Take
This is the only viable way to ship production assets involving data visualization or precise spatial layouts in May 2026. If your product relies on GPT-5's intuition to place 50 numbers correctly, you are shipping broken features. It is a cumbersome workflow that increases latency and friction, but until vision models can actually count, you must use it for any project where accuracy is non-negotiable.
Ship clean code,
Marcus.

Marcus Webb - Senior Backend Analyst at UsedBy.ai