devtake.dev — #ai-models

Articles tagged ai-models on devtake.dev.
https://devtake.dev/
Language: en-us

Days after opening Fable 5 to the public, a US government order forced Anthropic to pull it
https://devtake.dev/article/anthropic-fable-mythos-government-suspension/
A Commerce Department export directive forced Anthropic to disable Fable 5 and Mythos 5 for all users, days after opening Fable 5 to the public.
Sat, 13 Jun 2026 14:30:00 GMT
Category: ai
Tags: ai-models, anthropic, claude, claude-mythos, ai-security, policy
Author: dieter-morelli

Claude Fable 5 is Anthropic's first public Mythos-class model. It tops SWE-Bench Pro at 80.3%.
https://devtake.dev/article/claude-fable-5-launch/
Claude Fable 5 hits 80.3% on SWE-Bench Pro and ships on Bedrock and Copilot at $10/$50 per million tokens, free on paid plans only through June 22.
Tue, 09 Jun 2026 18:55:00 GMT
Category: ai
Tags: ai-models, anthropic, claude, claude-mythos, benchmarks, llm, agentic-coding
Author: dieter-morelli

Sriram Krishnan is leaving the White House AI job to build an outside policy institution
https://devtake.dev/article/sriram-krishnan-leaves-white-house-ai/
Sriram Krishnan, the a16z partner who co-wrote the AI Action Plan, leaves his White House senior AI advisor role at the end of June 2026. Here's what changes.
Mon, 08 Jun 2026 08:45:00 GMT
Category: policy
Tags: policy, ai-policy, regulation, national-security, ai-models
Author: clara-wexler

Trump dropped the mandatory AI model review after Silicon Valley pushed back
https://devtake.dev/article/trump-narrower-ai-executive-order/
Trump's June 2 AI executive order asks for a voluntary 30-day model review, down from a mandatory 90-day one. Here's what got cut and who pushed.
Wed, 03 Jun 2026 13:00:00 GMT
Category: policy
Tags: policy, regulation, ai-security, national-security, ai-models
Author: clara-wexler

Stanford tested AI against law professors. The pros picked the AI 75% of the time.
https://devtake.dev/article/stanford-ai-beats-law-professors/
A blinded Stanford Law study had 16 professors grade AI tutoring answers against their own. Here's what the 75% win rate actually measures, and what it doesn't.
Wed, 03 Jun 2026 11:15:00 GMT
Category: ai
Tags: ai, llm, benchmarks, legal-ai, ai-models, gemini, rag, ai-eval
Author: dieter-morelli

Claude Opus 4.8 flags the bugs it writes four times more often than Opus 4.7
https://devtake.dev/article/claude-opus-4-8-launch/
Anthropic's Opus 4.8 posts 69.2% on SWE-Bench Pro, lets code flaws slip 4x less often, and ships parallel subagents in Claude Code. Here's what matters.
Fri, 29 May 2026 07:20:00 GMT
Category: ai
Tags: ai-models, anthropic, claude, llm, benchmarks, agentic-coding, claude-code, opus-4-7
Author: dieter-morelli

DeepSeek locked in the 75% V4-Pro cut. The API now undercuts every Western frontier model.
https://devtake.dev/article/deepseek-v4-pro-price-cut-permanent/
On May 23 DeepSeek told customers the V4-Pro discount becomes its standard price after May 31. Output drops from $3.48 to $0.87 per million tokens.
Sun, 24 May 2026 10:30:00 GMT
Category: ai
Tags: deepseek, ai-models, llm, anthropic, openai, gemini, ai-chips, china
Author: dieter-morelli

Andrej Karpathy joined Anthropic. The OpenAI founding member's job: use Claude to train Claude.
https://devtake.dev/article/karpathy-joins-anthropic-pretraining/
Karpathy started this week at Anthropic on Nick Joseph's pre-training team. His mandate is using Claude to accelerate Claude's own training.
Thu, 21 May 2026 12:00:00 GMT
Category: ai
Tags: anthropic, openai, claude, andrej-karpathy, ai-models, llm, pre-training, ai-talent
Author: dieter-morelli

Cactus Compute distilled Gemini into a 26M tool-calling model. The trick: no feed-forward layers.
https://devtake.dev/article/needle-cactus-compute-tool-calling/
Needle is a 26M-parameter function caller distilled from Gemini 3.1 Flash-Lite. The Simple Attention Network drops MLPs and runs at 6,000 tok/s prefill on edge silicon.
Wed, 13 May 2026 10:00:00 GMT
Category: ai
Tags: ai-models, gemini, open-weights, function-calling, tool-calling, edge-ai, on-device-ai, small-models
Author: dieter-morelli

Chinese proxy networks sell Claude API access at 90% off. They harvest every prompt that passes through.
https://devtake.dev/article/chinese-grey-market-claude-api-stolen-credentials/
A ChinaTalk investigation reveals how 'transfer stations' resell Anthropic API access using stolen credentials, model substitution, and prompt harvesting.
Sun, 10 May 2026 09:30:00 GMT
Category: ai
Tags: anthropic, claude, ai-security, credential-theft, china, supply-chain, ai-models
Author: dieter-morelli

Microsoft tested 19 LLMs as document editors. Even the best ones corrupted 25% of the content.
https://devtake.dev/article/llms-corrupt-documents-delegation-errors/
The DELEGATE-52 benchmark tests AI editing across 52 professional domains. Frontier models corrupt a quarter of document content over long workflows.
Sun, 10 May 2026 09:00:00 GMT
Category: ai
Tags: llm, ai-models, benchmarks, microsoft, delegation, vibe-coding
Author: dieter-morelli

Timothy Gowers gave GPT 5.5 an open math problem. It returned a novel proof in 17 minutes.
https://devtake.dev/article/fields-medal-gowers-gpt-open-problems/
The 1998 Fields Medal winner reports GPT 5.5 Pro produced a novel proof for an unsolved math problem in 17 minutes, and says the era of owning theorems is ending.
Sat, 09 May 2026 07:30:00 GMT
Category: ai
Tags: openai, llm, ai-models, benchmarks
Author: dieter-morelli

Anthropic doubled Claude Code's limits by renting 220,000 GPUs from xAI
https://devtake.dev/article/claude-code-rate-limits-doubled-may-2026/
Anthropic doubled Claude Code's 5-hour limits, killed peak-hours throttling, and raised Opus API tiers. The capacity comes from xAI's Colossus 1, via a SpaceX deal.
Thu, 07 May 2026 12:30:00 GMT
Category: ai
Tags: anthropic, claude-code, claude-opus, spacex, xai, ai-infrastructure, ai-models, colossus
Author: dieter-morelli

Perplexity's $400M Snapchat search deal is dead. Snap pulled it from guidance.
https://devtake.dev/article/snap-perplexity-400m-deal-ended/
Snap revealed in its Q1 2026 earnings that its November $400M deal to put Perplexity inside Snapchat 'amicably ended' before any broader rollout shipped.
Thu, 07 May 2026 11:00:00 GMT
Category: ai
Tags: perplexity, snap, snapchat, ai-search, ai-models, evan-spiegel
Author: dieter-morelli

Anthropic is fielding offers at a $900B valuation. The round closes in two weeks and tops OpenAI.
https://devtake.dev/article/anthropic-900b-valuation-50b-round/
Preemptive bids put Anthropic at $850B-$900B with a $50B raise. Run rate hit $30B in March, up from $9B at year-end 2025.
Sat, 02 May 2026 09:45:00 GMT
Category: ai
Tags: anthropic, claude, funding, valuation, openai, ai-models, finance
Author: dieter-morelli

Microsoft and OpenAI just rewrote their deal. Exclusivity is dead, and so is the AGI clause.
https://devtake.dev/article/microsoft-openai-deal-revenue-share-end/
Microsoft loses exclusive rights to OpenAI's models. The revenue share now caps at 2030 and stops depending on AGI. Here's what actually changed and who it benefits.
Mon, 27 Apr 2026 19:00:00 GMT
Category: ai
Tags: openai, microsoft, ai-models, azure, llm, ai-infrastructure, anthropic, gpt-5-5
Author: dieter-morelli

Arcee's Trinity-Large-Thinking is a 399B open MoE that costs 96% less than Opus
https://devtake.dev/article/arcee-trinity-large-thinking-reasoning/
Arcee released Trinity-Large-Thinking on April 1: a 399B-param sparse MoE with 13B active, Apache 2.0 weights, $0.88 per million output tokens, and PinchBench just behind Opus 4.6.
Mon, 27 Apr 2026 13:00:00 GMT
Category: open-source
Tags: arcee, trinity, llm, ai-models, open-weights, moe, reasoning, apache-2-0
Author: soren-vanek

OpenAI just retired SWE-bench Verified. The headline coding benchmark of 2025 is officially saturated.
https://devtake.dev/article/openai-retires-swe-bench-verified/
OpenAI says SWE-bench Verified is saturated and contaminated, and 60% of remaining problems are unsolvable. Here's what comes next, and why every coding leaderboard is suspect.
Mon, 27 Apr 2026 10:00:00 GMT
Category: ai
Tags: openai, swe-bench, benchmarks, ai-models, llm, ai-coding, evaluations, claude-opus
Author: dieter-morelli

OpenAI's Privacy Filter is a 1.5B PII redactor that ships under Apache 2.0. Here's what it actually does.
https://devtake.dev/article/openai-privacy-filter/
OpenAI released Privacy Filter on April 22 as an open-weight on-device model for masking eight types of PII. F1 of 96%. Runs in a browser. Here's the catch.
Sun, 26 Apr 2026 13:00:00 GMT
Category: ai
Tags: openai, privacy, pii, open-weights, ai-models, llm, hugging-face, data-privacy
Author: dieter-morelli

DeepSeek V4 lands: 1.6T-param open MoE, 1M-token context, and SWE-bench within 0.2 of Opus 4.6
https://devtake.dev/article/deepseek-v4-release/
DeepSeek shipped V4-Pro and V4-Flash under MIT on April 24. V4-Pro hits 80.6% on SWE-bench Verified. V4-Flash is $0.14 in / $0.28 out.
Fri, 24 Apr 2026 21:30:00 GMT
Category: ai
Tags: deepseek, deepseek-v4, llm, ai-models, open-weights, moe, benchmarks, open-source
Author: dieter-morelli

Google is putting up to $40B into Anthropic. That's five days after Amazon's $5B.
https://devtake.dev/article/google-anthropic-40b-investment/
Google committed $10B upfront and up to $40B total at a $350B valuation, plus five gigawatts of Google Cloud capacity. It's Anthropic's second nine-figure deal in a week.
Fri, 24 Apr 2026 20:30:00 GMT
Category: ai
Tags: anthropic, google, alphabet, claude, ai-infrastructure, ai-models, funding, tpu
Author: dieter-morelli

Anthropic admits three Claude Code bugs quietly tanked quality for six weeks
https://devtake.dev/article/anthropic-claude-code-quality-postmortem/
Anthropic's April 23 postmortem names three bugs that degraded Claude Code between March 4 and April 20. Usage limits are being reset for every subscriber.
Fri, 24 Apr 2026 11:30:00 GMT
Category: ai
Tags: claude-code, anthropic, claude, opus-4-7, ai-agents, ai-models, postmortem, sonnet-4-6
Author: dieter-morelli

OpenAI shipped GPT-5.5 seven weeks after 5.4. API tokens now cost twice as much.
https://devtake.dev/article/openai-gpt-5-5-launch/
OpenAI released GPT-5.5 (codename Spud) on April 23. The API runs at $5/$30 per million tokens, double GPT-5.4, with Pro at $30/$180.
Thu, 23 Apr 2026 18:30:00 GMT
Category: ai
Tags: openai, gpt-5-5, chatgpt, codex, ai-models, api-pricing, llm, agentic-ai
Author: dieter-morelli

Claude Opus 4.7 is here, and the long-context benchmarks got worse
https://devtake.dev/article/anthropic-claude-opus-4-7-launch/
Anthropic's Opus 4.7 is state-of-the-art on SWE-bench and CursorBench, but independent tests show regressions on long-context retrieval and thematic reasoning.
Fri, 17 Apr 2026 09:30:00 GMT
Category: ai
Tags: claude, anthropic, opus-4-7, llm, benchmarks, mythos, ai-models
Author: dieter-morelli