Overview
Today felt like a tour of where AI work is heading next: agents moving into the tools teams already live in, models learning to click around the real world, and companies wrestling with the boring-but-decisive bits like process, security, and capex. There was also a fresh reminder that access to frontier models is becoming both a product feature and a geopolitical headache.
The big picture
The centre of gravity is drifting from “chat with a model” to “assign work to an agent” and “let it operate inside your systems”. That raises the stakes on integration, permissions, and cost, while the competitive lines harden around compute, custom silicon, and who can keep their models from being copied. In the background, organisational habits still matter, sometimes more than the tech.
Notion turns agents into teammates, not tabs
Notion’s new External Agents idea is simple: the work already happens in boards and docs, so the agents should sit there too. You can @-mention Claude or Cursor like colleagues, assign tasks from a shared board, and keep runs visible to the whole team.
The interesting part is governance. Notion is framing this as controlled access with custom context and permissions, which is exactly what teams will demand once agents start touching real code and real data.
Cursor’s Notion integration pushes code changes straight to PR
Cursor followed up with the practical angle: mention @Cursor in Notion, give it a spec, and it can investigate, plan, build, test, then open a PR the team can review. It’s “agent as delivery pipeline”, anchored to the workflow people already trust: code review.
It also hints at where Cursor is heading with its SDK, turning its runtime into something other products can embed without rebuilding the whole agent stack from scratch.
Gemini 3.5 Flash gets “computer use”, and it looks like proper office automation
Google DeepMind is showing Gemini 3.5 Flash reasoning and acting across browser, mobile, and desktop environments. The demo follows the now-familiar pattern: step-by-step planning, clicks and inputs, then a clean summary at the end.
This matters less as a flashy demo than as a way to clear the dull tasks teams keep meaning to automate, like filing tickets and chasing forms. The open question is whether it is now reliable enough to run without supervision.
NVIDIA’s pitch: agents that can watch the footage for you
Metropolis Blueprint VSS 3 is NVIDIA aiming at a neglected corner of “agent work”: video libraries and live streams. The idea is that you can ask for what you need in plain language and get search, summaries, alerts, and reports back, even across large archives.
What stands out is the product packaging: open-source repo, Docker and Helm support, and a menu of new “agent skills”. It’s a reminder that for enterprise uptake, deployment details often beat model trivia.
Anthropic accuses Alibaba of mass fraud to extract Claude capabilities
Polymarket surfaced a sharp allegation from Anthropic against Alibaba: nearly 25,000 fraudulent accounts were used to generate tens of millions of Claude exchanges, apparently aimed at extracting software engineering and agent-style reasoning behaviour. It’s a classic distillation fear, but at a scale that makes it hard to wave away.
If this becomes the new normal, expect tougher identity checks, stricter region controls, and more tension between “make it easy to adopt” and “stop people siphoning the model”.
Fable 5 might be edging back into wider access
Two separate threads converged: Claude Code update strings that hint at included weekly allotments with no separate purchase, plus reports that Fable 5 has reappeared in Amazon Bedrock. Taken together, it reads like Anthropic testing more stable packaging for its high-end model.
Given how much developer sentiment swings on model availability, “it’s back” can be as important as any benchmark chart.
GPT-5.5 Instant gets a tune-up, and the replies are telling
OpenAI shipped another incremental update to GPT-5.5 Instant, pitching better intent understanding, improved constraint handling, and stronger shopping and local recommendations. It’s the sort of release that sounds small until you remember how many people use the “Instant” tier every day.
The louder story, though, is the comments: plenty of users still arguing about personality, preference, and what they feel they lost from earlier model behaviour.
“No approvals, only vetos”: Vercel’s take on shipping speed
Malte Ubl dropped a sharp critique of slow approval culture, describing it as a status game that turns into a doom loop. His fix at Vercel is blunt: ship by default unless someone speaks up with a veto.
It’s a useful counterpoint to all the agent tooling news. Even with smarter systems, teams still get stuck if decision-making is built around waiting.
Hyperscaler free cash flow panic, explained as capex reality
Chamath Palihapitiya pushed back on the “FCF is collapsing” narrative around hyperscalers, arguing the current numbers reflect an investment cycle where capex is swallowing operating cash flow. The core claim is that this looks ugly short-term, but can build a moat if the spend turns into defensible infrastructure.
It’s a tidy lens for the AI buildout: the real debate is not whether the bill is huge, it’s whether it buys lasting advantage.
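To make the arithmetic behind that argument concrete, here is a minimal sketch using the common definition of free cash flow as operating cash flow minus capex. The figures are invented for illustration and are not taken from any company's filings.

```python
def free_cash_flow(operating_cash_flow: float, capex: float) -> float:
    """Common definition: operating cash flow minus capital expenditure."""
    return operating_cash_flow - capex


# Hypothetical figures, in billions, chosen only to show the shape of the effect.
before_buildout = free_cash_flow(operating_cash_flow=100.0, capex=40.0)
during_buildout = free_cash_flow(operating_cash_flow=120.0, capex=110.0)

print(before_buildout)  # 60.0
print(during_buildout)  # 10.0
```

In this made-up case operating cash flow grows by a fifth while free cash flow still drops sharply, which is the pattern the "investment cycle" reading describes. Whether the spend later pays off is exactly what the numbers cannot show.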
MCP gets heat for heavy session overhead
dax took aim at MCP’s assumption that one process equals one session, arguing the spec forces you to spawn new servers per active session and creates pointless overhead. The critique is familiar in standards work: lock in too early and you fossilise the wrong architecture.
If stateless changes land soon, this becomes a footnote. If not, developers will keep routing around it with simpler CLI approaches.
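A rough sketch of the trade-off being argued about, assuming nothing about MCP's actual API: `SessionBoundServer` and `stateless_handle` are hypothetical names, and a class instance stands in for a whole server process. It only illustrates why tying state to a process multiplies the number of servers, while keying state by session lets one handler serve everyone.

```python
from dataclasses import dataclass, field


@dataclass
class SessionBoundServer:
    """Hypothetical stand-in for a server process tied to exactly one session."""
    session_id: str
    history: list = field(default_factory=list)

    def handle(self, message: str) -> int:
        # State lives inside this instance, so every session needs its own.
        self.history.append(message)
        return len(self.history)


def stateless_handle(store: dict, session_id: str, message: str) -> int:
    """Hypothetical stateless handler: state sits in a shared store keyed by session."""
    history = store.setdefault(session_id, [])
    history.append(message)
    return len(history)


sessions = ["a", "b", "c"]

# Session-bound model: one server instance (think: one process) per active session.
servers = {sid: SessionBoundServer(sid) for sid in sessions}
for sid in sessions:
    servers[sid].handle("hello")

# Stateless model: a single handler serves every session from one shared store.
store = {}
for sid in sessions:
    stateless_handle(store, sid, "hello")

print(len(servers), "server instances vs 1 stateless handler")
```

The overhead in the first model grows with the number of active sessions. The second model moves that cost into a store lookup, which is the kind of change the critique is asking for.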
Episode #442: 25 June 2026