#ai-models

Days after Anthropic opened Fable 5 to the public, a US government order forced the company to pull it

A Commerce Department export directive forced Anthropic to disable Fable 5 and Mythos 5 for all users, days after the company opened Fable 5 to the public.

AI·4 days ago

Claude Fable 5 is Anthropic's first public Mythos-class model. It tops SWE-Bench Pro at 80.3%.

Claude Fable 5 hits 80.3% on SWE-Bench Pro and ships on Bedrock and Copilot at $10/$50 per million tokens. It is free on paid plans only through June 22.

The South Facade of the White House in Washington, with the fountain and South Lawn in the foreground.

Policy·5 days ago

Sriram Krishnan is leaving the White House AI job to build an outside policy institution

Sriram Krishnan, the a16z partner who co-wrote the AI Action Plan, leaves his White House senior AI advisor role at the end of June 2026. Here's what changes.

The White House in Washington, D.C., where the executive order was signed

Policy·last week

Trump dropped the mandatory AI model review after Silicon Valley pushed back

Trump's June 2 AI executive order asks for a voluntary 30-day model review, down from a mandatory 90-day one. Here's what got cut and who pushed.

The Stanford Law School building on Stanford University's campus

AI·last week

Stanford tested AI against law professors. The pros picked the AI 75% of the time.

A blinded Stanford Law study had 16 professors grade AI tutoring answers against their own. Here's what the 75% win rate actually measures, and what it doesn't.

AI·2 weeks ago

Claude Opus 4.8 flags the bugs it writes four times more often than Opus 4.7

Anthropic's Opus 4.8 posts 69.2% on SWE-Bench Pro, lets code flaws slip 4x less often, and ships parallel subagents in Claude Code. Here's what matters.

AI·3 weeks ago

DeepSeek locked in the 75% V4-Pro cut. The API now undercuts every Western frontier model.

On May 23 DeepSeek told customers the V4-Pro discount becomes its standard price after May 31. Output drops from $3.48 to $0.87 per million tokens.

Diagram of an artificial neural network with input, hidden, and output layers

AI·3 weeks ago

Andrej Karpathy joined Anthropic. The OpenAI founding member's job: use Claude to train Claude.

Karpathy started this week at Anthropic on Nick Joseph's pre-training team. His mandate is using Claude to accelerate Claude's own training.

Cactus Compute YouTube thumbnail showing the team behind Needle

AI·last month

Cactus Compute distilled Gemini into a 26M tool-calling model. The trick: no feed-forward layers.

Needle is a 26M-parameter function caller distilled from Gemini 3.1 Flash-Lite. The Simple Attention Network drops MLPs and runs at 6,000 tok/s prefill on edge silicon.

Illustration accompanying ChinaTalk's investigation into grey-market Claude API proxy networks

AI·last month

Chinese proxy networks sell Claude API access at 90% off. They harvest every prompt that passes through.

A ChinaTalk investigation reveals how 'transfer stations' resell Anthropic API access using stolen credentials, model substitution, and prompt harvesting.

The DELEGATE-52 project repository on GitHub, showing Microsoft's benchmark for testing LLM document editing fidelity

AI·last month

Microsoft tested 19 LLMs as document editors. Even the best ones corrupted 25% of the content.

The DELEGATE-52 benchmark tests AI editing across 52 professional domains. Frontier models corrupt a quarter of document content over long workflows.

A mathematics lecture hall with equations on blackboards

AI·last month

Timothy Gowers gave GPT 5.5 an open math problem. It returned a novel proof in 17 minutes.

The 1998 Fields Medal winner reports GPT 5.5 Pro produced a novel proof for an unsolved math problem in 17 minutes, and says the era of owning theorems is ending.

Cartoon Claude Code terminal flexing two muscular arms against a terracotta background

AI·last month

Anthropic doubled Claude Code's limits by renting 220,000 GPUs from xAI

Anthropic doubled Claude Code's 5-hour limits, killed peak-hours throttling, and raised Opus API tiers. The capacity comes from xAI's Colossus 1, via a SpaceX deal.

A smartphone screen showing the Snapchat app interface

AI·last month

Perplexity's $400M Snapchat search deal is dead. Snap pulled it from guidance.

Snap revealed in its Q1 2026 earnings that its November $400M deal to put Perplexity inside Snapchat 'amicably ended' before any broader rollout shipped.

Anthropic CEO Dario Amodei photographed at Bloomberg House during the World Economic Forum.

AI·last month

Anthropic is fielding offers at a $900B valuation. The round closes in two weeks and tops OpenAI.

Preemptive bids put Anthropic at $850B-$900B with a $50B raise. Run rate hit $30B in March, up from $9B at year-end 2025.

Microsoft and OpenAI logos paired on a navy gradient backdrop.

AI·2 months ago

Microsoft and OpenAI just rewrote their deal. Exclusivity is dead, and so is the AGI clause.

Microsoft loses exclusive rights to OpenAI's models. The revenue share now caps at 2030 and stops depending on AGI. Here's what actually changed and who it benefits.

Arcee AI Trinity branding from the Trinity-Large-Thinking blog post.

Open Source·2 months ago

Arcee's Trinity-Large-Thinking is a 399B open MoE that costs 96% less than Opus

Arcee released Trinity-Large-Thinking on April 1: a 399B-param sparse MoE with 13B active, Apache 2.0 weights, $0.88 per million output tokens, and a PinchBench score just behind Opus 4.6.

AI·2 months ago

OpenAI just retired SWE-bench Verified. The headline coding benchmark of 2025 is officially saturated.

OpenAI says SWE-bench Verified is saturated and contaminated, and 60% of remaining problems are unsolvable. Here's what comes next, and why every coding leaderboard is suspect.