Overview
Today had two clear threads: multimodal AI moving from demos to developer tools, and the messy human bits that sit around it, from quality in AI-written code to bot-filled social platforms. There was also a dose of old-fashioned hardware spectacle, with SpaceX teasing Flight 12 testing and defence tech hype colliding with reality.
The big picture
AI is getting easier to build with and harder to trust at the edges. On the builder side, embeddings and coding agents keep compressing the work needed to ship useful systems. On the social side, the same acceleration risks flooding platforms with synthetic behaviour, while founders and product teams are reminded that growth still comes down to people choosing to stay.
Google’s Gemini Embedding 2 pushes multimodal search into the mainstream
Google is putting a proper stake in the ground with Gemini Embedding 2: text, images, video, audio, and PDFs in the same vector space. The interesting part is not the buzz; it's the practical implication for teams doing retrieval, content understanding, and cross-media search without stitching together separate models.
Preview access via the Gemini API and Vertex AI also hints at how quickly this is meant to land in real products, not just research blogs.
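The point of a shared vector space is that retrieval stops caring about modality: every item, whatever its format, becomes a vector, and search is just nearest-neighbour lookup. A minimal sketch of that idea, with a hypothetical `embed()` standing in for a multimodal embedding call (the demo vectors and file names are invented for illustration, not from any real API):

```python
import numpy as np

# Hypothetical embed(): stands in for a multimodal embedding service.
# In a real system each call would hit an embedding API; here it just
# returns fixed demo vectors so the retrieval logic is runnable.
DEMO_VECTORS = {
    "report.pdf":  np.array([0.9, 0.1, 0.0]),  # text-heavy document
    "launch.mp4":  np.array([0.1, 0.9, 0.2]),  # video
    "podcast.mp3": np.array([0.0, 0.2, 0.9]),  # audio
}

def embed(item: str) -> np.ndarray:
    # Unknown items (e.g. a text query) get a fixed demo vector
    # that happens to sit near the "document" region of the space.
    return DEMO_VECTORS.get(item, np.array([0.8, 0.2, 0.1]))

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def search(query: str, corpus: list[str]) -> list[tuple[str, float]]:
    # Rank every item, regardless of modality, by similarity to the query.
    q = embed(query)
    scored = [(doc, cosine(q, embed(doc))) for doc in corpus]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

results = search("quarterly results", list(DEMO_VECTORS))
```

The search function never branches on file type; that is the whole appeal of a unified embedding space versus stitching per-modality models together.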
Claude Code vs Codex, a grounded comparison from someone who actually uses them
@Hesamation’s write-up is the sort of agent comparison people want: not vibes, but what happens when you hand these tools a real pipeline and judge the output. The headline is that Claude Code seems to hold up better on longer, messier tasks, while Codex still earns points for clean, configurable engineering.
The takeaway is refreshingly human: tooling choices are about fit with your workflow, not a single scoreboard.
Benchmark culture continues, with GPT-5.4 getting the “special” label
LisanBench is making the rounds as another attempt to measure planning and vocabulary under constraints, and the claim here is that GPT-5.4 explores a wider space of possibilities than Opus 4.6 or Gemini 3.1 Pro. Whether you buy the framing or not, people are hungry for tests that feel less gameable than the usual leaderboard loop.
It’s also a reminder that “reasoning” debates are still being fought with charts, videos, and a fair bit of interpretation.
“Fighting slop” is becoming a serious software practice
@swyx surfaced a small set of rules from OpenCode that land because they are boring in the right way: don’t ship features just because you can, leave the code better, fix process over adding more. That is the sort of discipline LLM-assisted teams need if they do not want to wake up buried in brittle glue code.
It’s a useful counterweight to the current rush to automate everything that can be automated.
Hermes Agent vs OpenClaw, the open-source agent debate rolls on
@gregisenberg’s question captures the mood: are we seeing genuine progress in open agents, or just a new name and a new demo loop? Hermes Agent is being framed as more mature, with memory and skill-building, plus safety choices like isolation, but the scepticism is healthy given how quickly “autonomy” gets oversold.
For most people watching, the real test will be boring reliability, not a flashy terminal video.
Meta and the fear of bot-filled platforms gets louder
@birdabo’s post is a joke with teeth: the idea of millions of autonomous agents buying, selling, posting, and scamming across Facebook, Instagram, and WhatsApp hits a nerve because it feels plausible. Even without any new acquisition, people already experience the bot problem as a daily annoyance.
If platforms cannot separate human intent from automated behaviour, trust becomes the scarce resource, and it is hard to win back once lost.
Naval’s take, AI drains moats built on scarcity
@naval summed up a growing founder anxiety in a single sentence: AI is going to drain a lot of moats. The implied follow-up is where people are landing now, that defensibility moves towards judgment, relationships, distribution, and accountability, the things you cannot copy-paste from a model output.
It is a clean framing for why “we have secret sauce” sounds weaker every month.
Churn as the harshest kind of feedback
Paul Graham’s reminder is blunt and useful: slow growth is bad, but slow growth because people try your product and then leave is worse. Churn is not a marketing problem; it’s a product problem, and it means users made it past the threshold and still decided it was not worth it.
This is the sort of metric founders should stare at before polishing their next launch post.
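The arithmetic behind the warning is worth making concrete: with a steady intake of new users and a fixed monthly churn rate, the user base does not compound, it plateaus at intake divided by churn. A toy simulation, with hypothetical numbers chosen purely for illustration:

```python
# Toy illustration (hypothetical numbers): why churn caps growth.
# Each month the base loses a fraction `churn` and gains `intake`
# new users, so it converges to intake / churn rather than compounding.

def simulate(intake: float, churn: float, months: int) -> float:
    users = 0.0
    for _ in range(months):
        users = users * (1 - churn) + intake
    return users

# 1,000 new users/month at 20% monthly churn plateaus near 5,000,
# no matter how many months of "growth" you stack on top.
plateau = 1000 / 0.20          # analytic steady state
after_two_years = simulate(1000, 0.20, 24)
```

Doubling marketing spend doubles the plateau; halving churn also doubles it, which is why fixing the product usually beats buying more top-of-funnel.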
Starship Flight 12 teasing, the theatre of iteration
NASASpaceflight captured SpaceX doing what SpaceX does: load propellant, kick on the deluge, and let everyone argue whether it was a static fire attempt or a spin prime. Even when nothing lights, the process is public enough that the community treats it as an event.
It also underlines how much the Starship programme now runs on rapid ground testing as a form of progress in itself.
The “miniature fighter jet” clip, defence tech hype meets scrutiny
@CollinRugg’s viral clip about Mach Industries’ Viper shows how quickly language outruns reality in defence tech. People hear “fighter jet” and imagine a certain class of aircraft, while the details point to something closer to a VTOL cruise-missile-style system.
The comments split neatly between excitement about lower-cost capability and suspicion that this is branding doing too much work.