Overview
Today’s threads centre on AI crossing the gap from clever demos to hands-on work. Google’s Gemini 3 Deep Think is pitching hard at labs and workshops, while OpenAI’s Codex-Spark shows what real-time coding looks like when latency almost disappears. Around that, devs debate control and rate limits, founders push agents-as-staff, creators test new video models, and we still make time for a pre-dawn ISS launch and a liquid robot straight from the future.
The big picture
DeepMind upgrades Gemini 3 Deep Think for real science and engineering
DeepMind spotlights Duke’s Wang Lab using Deep Think to tune semiconductor material recipes in under two hours, alongside new benchmark highs: 84.6% on ARC-AGI-2 for reasoning, plus top scores across maths and science tests. Access lands in the Gemini app for Google AI Ultra subscribers and via Vertex AI Early Access.
Google AI shows Deep Think moving from theory to build-tested utility
Google AI shares case studies and a video of Anupam Pathak using Deep Think to draft Python for CAD workflows and speed up turbine blade iteration. Benchmarks include a 3455 Elo in competitive programming, hinting at code competency that starts to rival seasoned developers.
Gemini app rolls out Deep Think to subscribers - with mixed reception
Google’s Gemini account frames Deep Think as a tool for researchers and engineers, including a sketch-to-STL pipeline that turns a hand-drawn laptop stand into a printable model. The catch noted in replies: a $250-per-month paywall, tight daily query caps, and messy chat organisation, all of which limit access for many.
OpenAI introduces GPT-5.3-Codex-Spark for real-time coding
OpenAI announces a text-only, ultra-fast coding model streaming at over 1,000 tokens per second on Cerebras hardware, with a 128k context window and reduced latency. It is live as a research preview for ChatGPT Pro users in the Codex app and VS Code extension, and it posts strong SWE-Bench Pro numbers.
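To put the quoted throughput in perspective, here is a back-of-envelope sketch of how long streaming takes at that rate. The 1,000 tokens-per-second floor and the 128k window come from the announcement; reading 128k as 128,000 tokens, the 2,000-token response size, the 50 tokens-per-second comparison rate, and the function name are all assumptions for illustration, not published figures.

```python
def stream_seconds(tokens: int, tokens_per_second: float) -> float:
    """Return the time in seconds needed to stream `tokens` at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


SPARK_RATE = 1_000        # lower bound quoted in the announcement
CONTEXT_TOKENS = 128_000  # assumption: "128k" read as 128,000 tokens
RESPONSE_TOKENS = 2_000   # hypothetical size of one generated file
BASELINE_RATE = 50        # hypothetical slower model, for comparison only

print(stream_seconds(RESPONSE_TOKENS, SPARK_RATE))     # 2.0
print(stream_seconds(RESPONSE_TOKENS, BASELINE_RATE))  # 40.0
print(stream_seconds(CONTEXT_TOKENS, SPARK_RATE))      # 128.0
```

Under these assumptions a file-sized response arrives in about two seconds rather than forty, which is roughly the gap between waiting on a model and editing alongside it.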
How developers get it: where to try Codex-Spark today
The developer account clarifies distribution across the Codex CLI and IDE extension, pitching it as the first in a family of low-latency assistants. Replies are excited about speed, but the #Keep4o chorus returns, arguing that pushing pro tooling should not come at the expense of the broader access and stability that many users relied on.
Inside the demo: “is this sped up?”
An engineer’s post shows Codex-Spark generating and running a Rust Snake game almost instantly. The disbelief is telling, and so is the note that this pace depends on heavy-duty infrastructure that not everyone will see right away.
Keep control: Antigravity’s “Implementation Plan” before code
Google’s Antigravity IDE nudges teams to request an editable plan first, review architecture in markdown, then approve execution. The replies ask for deeper reasoning and complain about rate limits and bans for unsupported API use - a reminder that guardrails matter, but so does developer trust.
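The plan-first workflow described above can be sketched as a simple approval gate. This is an illustration of the general pattern only, not Antigravity's API: the class, method, and function names are hypothetical, and the choice to reset approval after any edit is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ImplementationPlan:
    """Hypothetical editable plan that must be approved before execution."""
    steps: List[str]
    approved: bool = False

    def to_markdown(self) -> str:
        # Render the plan as a numbered markdown list for human review.
        lines = ["# Implementation Plan"]
        lines += [f"{i}. {step}" for i, step in enumerate(self.steps, start=1)]
        return "\n".join(lines)

    def edit_step(self, index: int, new_text: str) -> None:
        # Assumption: any edit invalidates a previous approval.
        self.steps[index] = new_text
        self.approved = False

    def approve(self) -> None:
        self.approved = True


def execute(plan: ImplementationPlan, run_step: Callable[[str], None]) -> int:
    """Run every step in order, refusing to start without approval."""
    if not plan.approved:
        raise PermissionError("plan must be approved before execution")
    for step in plan.steps:
        run_step(step)
    return len(plan.steps)


plan = ImplementationPlan(["Define data model", "Write API handlers"])
print(plan.to_markdown())
plan.edit_step(1, "Write API handlers with tests")
plan.approve()
execute(plan, run_step=print)
```

The point of the gate is that the reviewer sees and can change the architecture before any code is generated, so a bad plan costs one edit rather than a discarded implementation.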
Dario Amodei on the short “centaur” window
Anthropic’s CEO argues the era of human-AI pair programming could be brief before full automation takes the lead. He points to chess as precedent and suggests code could be entirely machine-written within a year, with sharp labour shocks alongside big productivity gains.
Agents as employees: YC cheers Tensol AI’s OpenClaw stack
Tensol pitches autonomous agents that handle repetitive workflows across support and engineering with tool access and business context. The promise is fast bug triage and customer outreach in minutes, all within a secure setup that aims to lower integration friction.
Seedance 2.0 steals the creative spotlight
Min Choi curates ten wild clips from ByteDance’s new video model, from ad-grade spots to anime and surreal scenes. Motion and lip sync look stable, generation often lands in minutes, and the internet takes notice despite the familiar complaint about janky text in-frame.
Nine cents to remake an F1 money shot
A post shows an AI recreation of the priciest scene from the F1 film for pocket change, placing more pressure on VFX budgets. Whether you liked the film or not, the cost curve is bending fast toward cheap iteration and shot design at scale.
Why short-form videos all sound the same
A thread breaks down the hook-research-insight formula behind viral clips and argues it can sap expert credibility. The broader question: do we want engagement spikes, or do we want trust that lasts more than a news cycle?
NASA lines up Crew-12 to the ISS
Liftoff is set for no earlier than 13 February at 05:15 EST, with coverage from 03:15. The crew will fly on a reused Falcon 9 and Dragon under the Commercial Crew Program, then spend eight months running hundreds of experiments.
A robot made of liquid
Pascal Bornet shares soft robotics research where ultrasound steers a droplet that can split, merge, squeeze through gaps, then heal. Early days, yes, but the potential spans medicine and exploration, with fresh safety and control questions once AI enters the loop.
CodeWiki turns repos into interactive guides
Google’s CodeWiki converts a GitHub repo into a navigable wiki with diagrams and a code-aware chatbot. It reduces onboarding pain for large projects and, thanks to tight GitHub integration, could tidy up open-source collaboration.
Why it matters
AI is getting hands-on with the physical world. Deep Think’s mix of sketch-to-STL and lab-grade optimisation brings models into workshops and wet labs, not just whiteboards. If those benchmark gains translate to stable tooling, we will see shorter design cycles in materials, hardware, and robotics.
Software pace is breaking old habits. Codex-Spark’s speed makes conversational coding feel like live editing. That demands stronger process gates, which tools like Antigravity’s planning step try to provide. Teams that keep human review up front will waste fewer cycles on brittle code.
Workflows are unbundling into agents and approvals. Tensol’s pitch and Amodei’s warning sit on the same curve: automate the boring parts today, then the complex bits tomorrow. Expect junior roles to compress, senior roles to tilt toward supervision and systems thinking, and policy debates to heat up.
Media economics are shifting. Seedance 2.0 and nine-cent VFX rewrites make high-end visuals accessible to small teams, which is thrilling for creators and unsettling for big-budget pipelines. The trust debate around short-form formulas will only grow as synthetic content floods feeds.
Beyond AI, spaceflight and soft robotics keep the horizon wide. Crew-12 is a reminder that reusable launch is now routine, feeding research that benefits Earth. Liquid robotics hints at machines that move where rigid ones cannot. Together, they mark a year defined by faster loops between idea, simulation, and real-world results.







