Open Source AI Wins: Qwen Goes Agentic, Gemma Goes Apache, and OpenAI Kills Sora to Focus on What Matters

Alibaba's Qwen 3.6 Plus ships the first truly agentic open model. Google finally picks a real license. And OpenAI's Sora shutdown proves closed-source video generation can't pay the bills.


Alibaba shipped a model that doesn’t just answer questions — it plans, executes, and uses tools on its own. Google finally dropped the restrictive license and put Gemma under Apache 2.0. And OpenAI axed its flagship video product because it was hemorrhaging a million dollars a day.

The lesson keeps repeating itself: open source isn’t just keeping pace with closed-source AI. It’s winning on the metrics that actually matter.

Qwen 3.6 Plus: The First Model Built for Agents

Alibaba released Qwen 3.6 Plus on April 2, and it’s not just another benchmark bump. This is the first major open model designed from the ground up for agentic work — the kind where an AI plans a multi-step task, calls external tools, recovers from errors, and finishes the job without someone holding its hand.

The technical specs matter here. Qwen 3.6 Plus uses a hybrid architecture combining linear attention with sparse mixture-of-experts routing. Linear attention slashes the computational cost of long sequences. Sparse MoE means only the relevant experts activate per token. The result: a 1-million-token context window that runs at roughly 2-3x the output speed of Claude Opus 4.6, according to early community benchmarks on OpenRouter.
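To make "only the relevant experts activate per token" concrete, here is a toy sketch of sparse top-k expert routing. The shapes, the numpy router, and the linear-map "experts" are illustrative stand-ins, not Qwen's actual architecture:

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Route one token through only its top-k experts (sparse MoE sketch).

    x       : (d,) token representation
    gate_w  : (d, n_experts) router weights
    experts : list of callables, each mapping (d,) -> (d,)
    """
    logits = x @ gate_w                    # router score per expert
    top = np.argsort(logits)[-top_k:]      # indices of the k best-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected experts only
    # Only top_k experts run; the rest contribute nothing and cost nothing.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
x = rng.normal(size=d)
gate_w = rng.normal(size=(d, n_experts))
# Toy experts: fixed linear maps standing in for small feed-forward nets.
mats = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [(lambda m: (lambda v: v @ m))(m) for m in mats]

y = moe_forward(x, gate_w, experts, top_k=2)
print(y.shape)  # (8,)
```

The payoff is in the last line of the function: per-token compute scales with top_k, not with the total expert count, which is how MoE models keep huge parameter counts cheap at inference.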

But speed and context aren’t the real story. Alibaba built Qwen 3.6 Plus around what they call the “capability loop” — perceive, reason, act. The model ships with native tool calling, chain-of-thought reasoning that’s always active, and compatibility with OpenClaw, Claude Code, and Cline out of the box. It tops multiple long-horizon planning benchmarks and leads across tool-calling evaluations.
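The tool-calling half of that loop can be sketched as a single agent step. The schema shape below follows the common OpenAI-style convention that clients like Claude Code and Cline speak; the model turn is hard-coded JSON so the sketch runs offline, and `read_file` is a hypothetical tool, not part of any Qwen API:

```python
import json

# Tools the model is allowed to call. The single entry here is a stub
# that fakes reading a file so the example stays self-contained.
TOOLS = {
    "read_file": lambda path: f"<contents of {path}>",
}

def run_tool_call(call):
    """Execute one model-issued tool call and return the result message."""
    fn = TOOLS[call["name"]]
    args = json.loads(call["arguments"])
    return {"role": "tool", "name": call["name"], "content": fn(**args)}

# A stubbed model turn: in a real agent loop this JSON comes back from
# the chat endpoint, gets executed, and the results are fed back in as
# messages until the model stops requesting tools.
model_turn = {
    "tool_calls": [
        {"name": "read_file", "arguments": json.dumps({"path": "README.md"})}
    ]
}

results = [run_tool_call(c) for c in model_turn["tool_calls"]]
print(results[0]["content"])  # <contents of README.md>
```

"Native" tool calling means the model was trained to emit that structured JSON directly, instead of relying on a wrapper to parse tool requests out of free text.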

For developers running local AI agent stacks, this is the model to watch. A million-token context means you can feed it an entire codebase. Agentic capabilities mean it can actually do something useful with that context instead of just summarizing it.

Gemma 4: The License Matters More Than the Benchmarks

Google dropped Gemma 4 on April 2 with genuinely impressive numbers: 89.2% on AIME 2026, 80.0% on LiveCodeBench v6, a Codeforces Elo of 2,150. The 31B dense model ranks #3 on Arena AI.

We covered the benchmark details and the speed problems yesterday. The real win here is the license.

Gemma 4 ships under Apache 2.0 — the first time Google has used a genuinely open license for its Gemma family. Previous versions came with Google’s custom license, which restricted commercial use in ways that made lawyers nervous. Apache 2.0 is the standard developers already work with. You can modify the model, redistribute it, sell products built on it, and integrate it without asking anyone’s permission.

Hugging Face CEO Clement Delangue called it “a huge milestone.” He’s right. When Google, Meta, and Alibaba are all releasing frontier-capable models under permissive licenses, the competitive pressure on closed-source providers intensifies. Why pay per-token for API access when you can run something comparable on your own hardware?

The catch: April 2026 is the most crowded month in open-source AI history. Gemma 4, Qwen 3.6 Plus, and Llama 4 Maverick all dropped within days of each other. Developers aren’t picking a winner — they’re picking trade-offs.

Sora’s Death Proves the Closed-Source Tax Is Real

OpenAI officially killed Sora, its AI video generator. The app goes dark April 26. The API follows in September.

The numbers tell the story. Sora was burning roughly $1 million per day in compute costs. Users peaked at about a million, then collapsed to under 500,000. Disney, which had committed $1 billion to a partnership, found out about the shutdown less than an hour before the public announcement. They’ve since shelved the investment entirely.

Sam Altman’s explanation: OpenAI needs to “concentrate compute and product capacity into the next generation of automated researchers and companies.”

Translation: running a closed-source AI video service at scale is financially brutal, and the margins don’t work when open-source alternatives exist. Runway, Stability AI’s open models, and community-driven video generation tools chip away at the market while costing a fraction to operate.

This is the closed-source tax in action. When you’re responsible for all the compute, all the infrastructure, and all the user support — and your competitors can run similar technology on commodity hardware — the economics get ugly fast.
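A back-of-envelope pass over the reported figures shows how ugly. Even generously assuming all 500,000 remaining users were active, the compute burn alone works out to:

```python
# Unit economics from the numbers above: $1M/day compute, ~500k users.
daily_compute = 1_000_000                # dollars per day (reported burn rate)
users_now = 500_000                      # post-collapse user count

cost_per_user_day = daily_compute / users_now
cost_per_user_month = cost_per_user_day * 30

print(f"${cost_per_user_day:.2f}/user/day")      # $2.00/user/day
print(f"${cost_per_user_month:.0f}/user/month")  # $60/user/month
```

$60 per user per month in compute, before infrastructure and support, is several times what a typical consumer subscription brings in, and the gap only widens as users churn.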

Fish Audio S2: Open-Source Voice AI Crushes Paid Services

While the language model race dominates headlines, Fish Audio quietly released Fish Speech S2 — and it might be the most practically useful open-source release of the month.

Fish Speech S2 is a text-to-speech model trained on over 10 million hours of audio across 80+ languages. It uses a Dual-Autoregressive architecture with reinforcement learning alignment. In plain terms: it produces speech that sounds natural, handles emotion, and clones voices with minimal samples.

The killer feature is fine-grained control through natural language tags. Drop [whisper] or [excited] into your text and the model adjusts delivery accordingly. Multi-speaker dialogue generates in a single pass. Latency stays under 150 milliseconds.
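A sketch of what a single-pass, multi-speaker script with inline delivery tags might look like. The [S1]/[S2] speaker convention and exact tag placement here are assumptions for illustration; check Fish Speech's documentation for the real syntax:

```python
# Build a tagged dialogue script of the kind described above: bracketed
# tags steer delivery, and both speakers go to the model in one pass.
lines = [
    ("S1", "excited", "We just shipped the release!"),
    ("S2", "whisper", "Keep it down, the demo is still recording."),
]

script = "\n".join(f"[{spk}] [{tone}] {text}" for spk, tone, text in lines)
print(script)
```

The point of natural-language tags is that delivery control lives in the text itself, so no separate SSML document or per-utterance API parameters are needed.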

With over 26,000 GitHub stars and climbing, Fish Speech S2 doesn’t just match ElevenLabs and OpenAI’s TTS — it outperforms them on several measurable criteria while running locally on your own hardware. No API keys, no per-character billing, no sending your voice data to someone else’s servers.

The Scoreboard

This week’s open-source wins make one thing clear: the gap between open and closed is closing faster than anyone predicted a year ago.

  • Qwen 3.6 Plus ships agentic capabilities that rival commercial offerings, with a 1M context window and speeds that embarrass some paid APIs
  • Gemma 4 finally lands under Apache 2.0, validating that even Google sees permissive licensing as the future
  • Sora’s shutdown shows closed-source products can’t survive on hype when the economics don’t work
  • Fish Audio S2 proves you can replace a $22/month ElevenLabs subscription with something that runs on a laptop

The most crowded month in open-source AI history isn’t over yet. DeepSeek V4, with its trillion-parameter architecture running on Huawei chips instead of Nvidia silicon, is expected any day now. If it delivers on the benchmarks, the conversation about open-source AI shifts from “is it good enough?” to “why would you pay for anything else?”