devtake.dev
Topic

AI models

The model layer moves weekly. We follow capability jumps (SWE-bench, CursorBench, long-context retrieval), the regressions the marketing decks don’t mention, and the widening gap between what labs claim and what independent testers measure. We also cover the open-weights side closely: when a 35B MoE on a laptop out-draws a frontier API, that’s the kind of story you won’t read on a lab blog.

8 articles in this topic

Cloudflare Unweight tensor compression announcement social graphic
Open Source

Cloudflare open-sourced a lossless LLM compressor that shaves 22% off model weights

Unweight is Cloudflare Research's new BF16 weight compressor. 22% smaller bundles, 13% smaller inference footprint, 30-40% throughput overhead, BSD license.

Anthropic's Claude Design announcement illustration, a quill on a cactus-green background
AI

Anthropic shipped Claude Design. Figma stock dropped 7% the same day.

On April 17, Anthropic launched Claude Design, a prompt-to-prototype tool that exports to Canva, not Figma. Figma's stock closed down 7% the same day.

Screenshot of the updated OpenAI Codex Mac app with background computer-use panel
AI

OpenAI's Codex now drives your Mac, not just your code

OpenAI shipped a Codex update that can pilot desktop apps with a cursor, generate images inline, and run parallel agents. It's the opening move in a real fight with Claude Code.

Header card from Simon Willison's 'Qwen3.6 beats Opus' post comparing pelican SVGs
AI

Qwen 3.6-35B-A3B: the open MoE beating Opus 4.7 on Simon Willison's laptop

Alibaba's Qwen 3.6-35B-A3B is a 35B-parameter mixture-of-experts model with only 3B parameters active. It's Apache 2.0 licensed, runs on consumer GPUs, and is already winning real tasks.

Claude Opus 4.7 launch artwork from the Anthropic news post
AI

Claude Opus 4.7 is here, and the long-context benchmarks got worse

Anthropic's Opus 4.7 is state-of-the-art on SWE-bench and CursorBench, but independent tests show regressions on long-context retrieval and thematic reasoning.

Google Gemini app running on a Mac desktop showing the mini chat interface
AI

Google Gemini finally has a Mac app, and it's gunning for ChatGPT's desktop lead

Google shipped a native Swift Gemini app for macOS with screen sharing, voice, and Deep Research. Here's what it does, what it doesn't, and how it stacks up.

Abstract visualization of cybersecurity and AI defense systems
AI

OpenAI launches GPT-5.4-Cyber for defensive security, opens access to thousands

OpenAI's new cybersecurity-tuned model can reverse-engineer binaries and analyze malware. It's restricted to verified defenders through the Trusted Access program.

Claude wordmark on Anthropic's introducing-Routines announcement
AI

Claude Code Routines: what they actually do, and when to use them over GitHub Actions

Anthropic just shipped Routines: Claude Code sessions as cron jobs, webhooks, and GitHub-event reactors. Here's what they replace, what they don't, and one rule to follow.