News · Published 2026-07-02

Cold Open — AIEWF draws the software factory blueprint

At the AI Engineer World's Fair, one phrase echoed across every stage: 'software factory'. Cursor's forward deployed engineers, Vercel's 'eve' framework, and Adobe's self-assembling sites all point in the same direction. Plus: Simon Willison builds a coding agent in an afternoon, and a 95-comment HN thread debates how to keep Fable under control.

Thursday, July 2, 2026. We scanned 582 fresh items off the wire today and three made the cut. One of them is a conference that is, right now, rewriting the vocabulary of software delivery.

Listen to today's episode on the Cold Open podcast page.

The lead · AIEWF draws the software factory blueprint

AI Engineer World's Fair is running in San Francisco this week, and if you read every Latent Space dispatch from the floor, one phrase lands in almost every talk: software factory. Not as a metaphor. As a deployment model.

Cursor is now staffing Forward Deployed Engineers: humans who embed with enterprise clients and stand up agent-based software delivery systems, essentially turning software teams into production pipelines that run with and through AI. Pauline Brunet, who leads that team, described the model plainly: her engineers show organizations how to implement agents, how to structure sandboxes, and how to make agent output reliable enough to ship. "Software factories" is how she described what they are building.

Alongside that, Vercel's Andrew Qu introduced eve, Vercel's internal agent framework, and walked through the design decisions that led there. The three that caught the room: skills (portable instruction sets the agent loads on demand), sandboxes (isolated execution environments so agents can try things safely), and agent-readable websites (structured pages the agent can consume without scraping). The third point is quiet but significant — it suggests the web is starting to bifurcate into human-readable and agent-readable layers.
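
To make the "skills loaded on demand" idea concrete, here is a minimal Python sketch. It is not eve's API, which the talk coverage does not detail; the names Skill, SkillRegistry, catalog, and load are all hypothetical. The assumption is that the agent always sees a cheap one-line summary of each skill and pulls the full instructions into context only when it decides to use one.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Skill:
    """Hypothetical skill record: a cheap summary plus lazily loaded instructions."""
    name: str
    description: str
    loader: Callable[[], str]  # fetches the full instructions when called
    _body: Optional[str] = field(default=None, repr=False)

    def instructions(self) -> str:
        # Load once, on first use, then serve the cached text.
        if self._body is None:
            self._body = self.loader()
        return self._body


class SkillRegistry:
    """Hypothetical registry for illustration; not eve's actual interface."""

    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def catalog(self) -> List[str]:
        # One-line summaries are small enough to include in every prompt.
        return [f"{s.name}: {s.description}" for s in self._skills.values()]

    def load(self, name: str) -> str:
        # Full instructions enter the context only when the agent asks.
        return self._skills[name].instructions()


registry = SkillRegistry()
registry.register(
    Skill(
        "sql-review",
        "Check SQL for injection risks",
        loader=lambda: "Prefer parameterized queries over string formatting.",
    )
)
print(registry.catalog())
print(registry.load("sql-review"))
```

In a real setup the loader would probably read a file from a skill folder; a callable keeps the sketch self-contained and makes the lazy behavior easy to see.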

Adobe showed a prototype of "agentic sites" that generate each page around a specific visitor's intent rather than serving a static template. Each visitor effectively gets a site assembled for them in the moment.

Geoffrey Litt's keynote offered the sharpest counterpoint of the day: "understand to participate." As agents write increasingly large and complex changes, Litt argued, builders risk accumulating cognitive debt — a growing gap between what they believe the code does and what it actually does. The prescription is not to slow down but to demand that agents explain what they built, not just show the output.

Why it matters

The convergence is the story. In a single conference day, a leading AI coding tool (Cursor), a leading web deployment platform (Vercel), and Adobe all described agents not as an add-on but as the production mechanism. The builders in that room are the ones writing the job description for what a software engineer does next year.

Litt's caveat deserves separate attention because it is easy to miss when the demos are moving fast: the cognitive-debt risk is real, and the teams that ignore it will eventually find themselves shipping code they cannot debug, own, or explain. The antidote, forcing comprehension rather than just delegation, is a design choice, not a feature that ships automatically.

The fine print

Most of what was shown at AIEWF is either early-stage (eve is an internal framework, not a public product yet), demo-quality (Adobe's agentic sites are an experiment), or a services pitch (Cursor's FDE model is paid consulting). The direction is credible; the timelines are not. Treat these as the 2026 roadmap, not the 2026 release notes.

Sources: latent.space/p/cursor-forward-deployed-engineers · latent.space/p/vercel-agents-new-software · latent.space/p/the-website-of-the-future · simonwillison.net

02 · Simon Willison ships llm-coding-agent 0.1a0

Simon Willison published llm-coding-agent 0.1a0 today — a working coding agent built on his LLM library using Claude Fable 5 as the backend. He bootstrapped it in a single session, starting from his python-lib-template-repository and two prompts: the first to write a spec, the second to implement it.

The more interesting note is buried in the project intro: his LLM library has "evolved into more of an agent framework" over the past year. The coding agent is not a new product — it is a proof of what the library became when he was not looking.

Why it matters. This is the current builder reality in one afternoon: take a library you already maintain, hand the scaffolding work to Fable 5, and ship a working agent before the day is done. The practical signal is the time-to-agent, not the agent itself — it is now short enough that it can be incidental to another project.

Source: simonwillison.net/2026/Jul/2/llm-coding-agent

03 · The short leash AI coding method

On Hacker News today: "The short leash AI coding method for beating Fable" (blog.okturtles.org), 83 points and 95 comments. The premise is that when the best coding model available is also the most autonomous, the risk is not that it underperforms — it is that it goes too far in the wrong direction. The "short leash" method is a structured approach to keeping AI coding sessions tight: frequent checkpoints, explicit scope boundaries, and deliberate human review at each step.
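
One way to picture the "explicit scope boundaries" and "frequent checkpoints" ideas is a small gate that runs before each human review. The sketch below is an illustration only, not the method from the blog post: the function names, the glob-based scope list, and the max_files threshold are all assumptions.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence


def out_of_scope(changed: Iterable[str], allowed: Sequence[str]) -> List[str]:
    """Return the changed paths that match none of the allowed glob patterns."""
    return [
        path for path in changed
        if not any(fnmatchcase(path, pattern) for pattern in allowed)
    ]


def checkpoint_ok(changed: Iterable[str], allowed: Sequence[str],
                  max_files: int = 5) -> bool:
    """Pass only if the change set is small and stays inside the agreed scope."""
    paths = list(changed)
    return len(paths) <= max_files and not out_of_scope(paths, allowed)


# Hypothetical session: the agent was asked to touch only the parser and its tests.
scope = ["src/parser/*", "tests/parser/*"]
edits = ["src/parser/lexer.py", "README.md"]
print(out_of_scope(edits, scope))
print(checkpoint_ok(edits, scope))
```

A failed check would be the cue to stop the session and review, rather than letting the agent continue past the boundary.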

Why it matters. The timing is almost too on-the-nose. AIEWF is debating software factories while Fable 5 ships and a 95-comment HN thread debates how to keep it from rewriting everything it touches. Both conversations are real, and neither cancels the other.

Source: blog.okturtles.org/2026/07/short-leash-ai-method · HN discussion

Also on the radar

Video for any LLM: claude-real-video (HN: 97 pts), a tool that lets any LLM watch a video by extracting frames in real time and feeding them as context. The approach removes the need for a natively video-capable model.
Voice AI: Hugging Face and Cerebras brought Gemma 4 to real-time voice AI, pairing Google DeepMind's open multimodal model with Cerebras inference hardware for voice responses with sub-100 ms latency.
Infrastructure: NVIDIA is opening its AI compute infrastructure to capital partners, explicitly framing the play as powering "AI factories that generate tokens at scale."
Autoresearch: Introspection's Roland Gavrilescu presented self-improving agent loops at AIEWF, in which agents write their own "recipes" and iterate on them, with humans staying in the loop at the judgment layer.
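
The frame-extraction idea behind the claude-real-video item above can be sketched without any video library. This is not that tool's implementation, which the item does not describe; sample_timestamps is a hypothetical helper, and the assumption is a clip of known duration and a fixed budget of frames that fits the model's context.

```python
from typing import List


def sample_timestamps(duration_s: float, max_frames: int) -> List[float]:
    """Pick evenly spaced timestamps (in seconds), one per equal-length slice."""
    if duration_s <= 0 or max_frames <= 0:
        return []
    step = duration_s / max_frames
    # Use the midpoint of each slice so the first and last frames are not
    # wasted on fade-ins and fade-outs.
    return [round(step * (i + 0.5), 3) for i in range(max_frames)]


# Hypothetical 10-second clip with a budget of 4 frames.
print(sample_timestamps(10.0, 4))
```

Each timestamp would then be handed to whatever decoder extracts the image, and the resulting frames passed to the model as ordinary image inputs.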

Trends in dev tools

What moved today in the tools engineers actually ship with.

Enterprise AI deployment is becoming a staffing model. Cursor's FDE team does not sell software; it goes on-site and builds agent-based delivery pipelines for large organizations. The pitch is that implementation expertise is the bottleneck, not the tool. Source: latent.space/p/cursor-forward-deployed-engineers
DSPy is entering the prompt-optimization conversation for real. Simon Willison used DSPy this morning — prompted by an AIEWF keynote — to evaluate and improve the SQL system prompt in his Datasette Agent. The pattern: fire off the DSPy optimization task in Claude Code for web while the keynote is still running. Source: simonwillison.net/2026/Jul/2/dspy-datasette-agent-prompts
Vercel's agent framework names the primitives. "Skills, sandboxes, and agent-readable websites" is a clean vocabulary that other frameworks will probably adopt or argue with. Watch for these terms to show up in competitor announcements. Source: latent.space/p/vercel-agents-new-software
Evaluating agents is becoming a discipline. A new arXiv paper introduces PACE, a lightweight proxy for agentic capability evaluation designed to replace expensive, infrastructure-heavy benchmarks like SWE-bench. If it holds up under scrutiny, it could become the fast-feedback loop that agentic development is currently missing.

Popular skills

The agent-skills wave — portable instruction folders loaded on demand — keeps picking up vocabulary and research this week.

SkillCoach formalizes the evaluation gap. A new paper (arxiv.org/abs/2607.01874) proposes self-evolving rubrics for evaluating how well an agent actually uses skills in practice — not just whether the skill file exists, but whether the agent applies it correctly in real tasks. This is the measurement layer the skills ecosystem has been missing.
Vercel names skills as the composability unit. In the eve framework, skills are the canonical way to package reusable agent behavior. This is the same pattern that started in Claude Code and is now appearing in a major web deployment platform's agent architecture.
Skill engineering as a design discipline. Paul Bakaus' conversation at AIEWF (latent.space/p/skill-engineering-design) makes the case that skills require the same design rigor as UI components — they are not just prompt files, they are interfaces that agents and humans both depend on.

AI fun fact

Marvin Minsky's 1986 book The Society of Mind described intelligence as emerging from the interaction of many small, specialized agents, each one too simple to be intelligent alone. His concept of "K-lines" (knowledge-lines), structures that record which mental agents were active while solving a problem and reactivate them together later, loosely prefigures today's agent-skills pattern: a portable unit of expertise that gets loaded and composed on demand. Forty years later, the builders at AIEWF are implementing something close to Minsky's architecture in production code.

Source: Minsky, Marvin. The Society of Mind. Simon & Schuster, 1986. (simonwillison.net)

Sources

Cursor Forward Deployed Engineers — latent.space
Vercel's eve agent framework — latent.space
Adobe agentic sites — latent.space
Geoffrey Litt / "understand to participate" — simonwillison.net
Simon Willison / llm-coding-agent — simonwillison.net
DSPy / Datasette Agent prompts — simonwillison.net
Short leash AI coding — blog.okturtles.org · HN
claude-real-video — github.com · HN
Gemma 4 voice AI — huggingface.co
NVIDIA AI infrastructure — blogs.nvidia.com
PACE benchmark paper — arxiv.org
SkillCoach paper — arxiv.org
Skill engineering design — latent.space
Minsky, The Society of Mind (1986)