<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>devtake.dev — #ai-models</title><description>Articles tagged ai-models on devtake.dev.</description><link>https://devtake.dev/</link><language>en-us</language><item><title>Days after opening Fable 5 to the public, a US government order forced Anthropic to pull it</title><link>https://devtake.dev/article/anthropic-fable-mythos-government-suspension/</link><guid isPermaLink="true">https://devtake.dev/article/anthropic-fable-mythos-government-suspension/</guid><description>A Commerce Department export directive forced Anthropic to disable Fable 5 and Mythos 5 for all users, days after opening Fable 5 to the public.</description><pubDate>Sat, 13 Jun 2026 14:30:00 GMT</pubDate><category>ai</category><category>ai-models</category><category>anthropic</category><category>claude</category><category>claude-mythos</category><category>ai-security</category><category>policy</category><author>dieter-morelli</author></item><item><title>Claude Fable 5 is Anthropic&apos;s first public Mythos-class model. 
It tops SWE-Bench Pro at 80.3%.</title><link>https://devtake.dev/article/claude-fable-5-launch/</link><guid isPermaLink="true">https://devtake.dev/article/claude-fable-5-launch/</guid><description>Claude Fable 5 hits 80.3% on SWE-Bench Pro and ships on Bedrock and Copilot at $10/$50 per million tokens, free on paid plans only through June 22.</description><pubDate>Tue, 09 Jun 2026 18:55:00 GMT</pubDate><category>ai</category><category>ai-models</category><category>anthropic</category><category>claude</category><category>claude-mythos</category><category>benchmarks</category><category>llm</category><category>agentic-coding</category><author>dieter-morelli</author></item><item><title>Sriram Krishnan is leaving the White House AI job to build an outside policy institution</title><link>https://devtake.dev/article/sriram-krishnan-leaves-white-house-ai/</link><guid isPermaLink="true">https://devtake.dev/article/sriram-krishnan-leaves-white-house-ai/</guid><description>Sriram Krishnan, the a16z partner who co-wrote the AI Action Plan, leaves his White House senior AI advisor role at the end of June 2026. Here&apos;s what changes.</description><pubDate>Mon, 08 Jun 2026 08:45:00 GMT</pubDate><category>policy</category><category>ai-policy</category><category>regulation</category><category>national-security</category><category>ai-models</category><author>clara-wexler</author></item><item><title>Trump dropped the mandatory AI model review after Silicon Valley pushed back</title><link>https://devtake.dev/article/trump-narrower-ai-executive-order/</link><guid isPermaLink="true">https://devtake.dev/article/trump-narrower-ai-executive-order/</guid><description>Trump&apos;s June 2 AI executive order asks for a voluntary 30-day model review, down from a mandatory 90-day one. 
Here&apos;s what got cut and who pushed.</description><pubDate>Wed, 03 Jun 2026 13:00:00 GMT</pubDate><category>policy</category><category>regulation</category><category>ai-security</category><category>national-security</category><category>ai-models</category><author>clara-wexler</author></item><item><title>Stanford tested AI against law professors. The pros picked the AI 75% of the time.</title><link>https://devtake.dev/article/stanford-ai-beats-law-professors/</link><guid isPermaLink="true">https://devtake.dev/article/stanford-ai-beats-law-professors/</guid><description>A blinded Stanford Law study had 16 professors grade AI tutoring answers against their own. Here&apos;s what the 75% win rate actually measures, and what it doesn&apos;t.</description><pubDate>Wed, 03 Jun 2026 11:15:00 GMT</pubDate><category>ai</category><category>llm</category><category>benchmarks</category><category>legal-ai</category><category>ai-models</category><category>gemini</category><category>rag</category><category>ai-eval</category><author>dieter-morelli</author></item><item><title>Claude Opus 4.8 flags the bugs it writes four times more often than Opus 4.7</title><link>https://devtake.dev/article/claude-opus-4-8-launch/</link><guid isPermaLink="true">https://devtake.dev/article/claude-opus-4-8-launch/</guid><description>Anthropic&apos;s Opus 4.8 posts 69.2% on SWE-Bench Pro, lets code flaws slip 4x less often, and ships parallel subagents in Claude Code. Here&apos;s what matters.</description><pubDate>Fri, 29 May 2026 07:20:00 GMT</pubDate><category>ai</category><category>ai-models</category><category>anthropic</category><category>claude</category><category>llm</category><category>benchmarks</category><category>agentic-coding</category><category>claude-code</category><category>opus-4-7</category><author>dieter-morelli</author></item><item><title>DeepSeek locked in the 75% V4-Pro cut. 
The API now undercuts every Western frontier model.</title><link>https://devtake.dev/article/deepseek-v4-pro-price-cut-permanent/</link><guid isPermaLink="true">https://devtake.dev/article/deepseek-v4-pro-price-cut-permanent/</guid><description>On May 23 DeepSeek told customers the V4-Pro discount becomes its standard price after May 31. Output drops from $3.48 to $0.87 per million tokens.</description><pubDate>Sun, 24 May 2026 10:30:00 GMT</pubDate><category>ai</category><category>deepseek</category><category>ai-models</category><category>llm</category><category>anthropic</category><category>openai</category><category>gemini</category><category>ai-chips</category><category>china</category><author>dieter-morelli</author></item><item><title>Andrej Karpathy joined Anthropic. The OpenAI founding member&apos;s job: use Claude to train Claude.</title><link>https://devtake.dev/article/karpathy-joins-anthropic-pretraining/</link><guid isPermaLink="true">https://devtake.dev/article/karpathy-joins-anthropic-pretraining/</guid><description>Karpathy started this week at Anthropic on Nick Joseph&apos;s pre-training team. His mandate is using Claude to accelerate Claude&apos;s own training.</description><pubDate>Thu, 21 May 2026 12:00:00 GMT</pubDate><category>ai</category><category>anthropic</category><category>openai</category><category>claude</category><category>andrej-karpathy</category><category>ai-models</category><category>llm</category><category>pre-training</category><category>ai-talent</category><author>dieter-morelli</author></item><item><title>Cactus Compute distilled Gemini into a 26M tool-calling model. The trick: no feed-forward layers.</title><link>https://devtake.dev/article/needle-cactus-compute-tool-calling/</link><guid isPermaLink="true">https://devtake.dev/article/needle-cactus-compute-tool-calling/</guid><description>Needle is a 26M-parameter function caller distilled from Gemini 3.1 Flash-Lite. 
The Simple Attention Network drops MLPs and runs at 6,000 tok/s prefill on edge silicon.</description><pubDate>Wed, 13 May 2026 10:00:00 GMT</pubDate><category>ai</category><category>ai-models</category><category>gemini</category><category>open-weights</category><category>function-calling</category><category>tool-calling</category><category>edge-ai</category><category>on-device-ai</category><category>small-models</category><author>dieter-morelli</author></item><item><title>Chinese proxy networks sell Claude API access at 90% off. They harvest every prompt that passes through.</title><link>https://devtake.dev/article/chinese-grey-market-claude-api-stolen-credentials/</link><guid isPermaLink="true">https://devtake.dev/article/chinese-grey-market-claude-api-stolen-credentials/</guid><description>A ChinaTalk investigation reveals how &apos;transfer stations&apos; resell Anthropic API access using stolen credentials, model substitution, and prompt harvesting.</description><pubDate>Sun, 10 May 2026 09:30:00 GMT</pubDate><category>ai</category><category>anthropic</category><category>claude</category><category>ai-security</category><category>credential-theft</category><category>china</category><category>supply-chain</category><category>ai-models</category><author>dieter-morelli</author></item><item><title>Microsoft tested 19 LLMs as document editors. Even the best ones corrupted 25% of the content.</title><link>https://devtake.dev/article/llms-corrupt-documents-delegation-errors/</link><guid isPermaLink="true">https://devtake.dev/article/llms-corrupt-documents-delegation-errors/</guid><description>The DELEGATE-52 benchmark tests AI editing across 52 professional domains. 
Frontier models corrupt a quarter of document content over long workflows.</description><pubDate>Sun, 10 May 2026 09:00:00 GMT</pubDate><category>ai</category><category>llm</category><category>ai-models</category><category>benchmarks</category><category>microsoft</category><category>delegation</category><category>vibe-coding</category><author>dieter-morelli</author></item><item><title>Timothy Gowers gave GPT-5.5 an open math problem. It returned a novel proof in 17 minutes.</title><link>https://devtake.dev/article/fields-medal-gowers-gpt-open-problems/</link><guid isPermaLink="true">https://devtake.dev/article/fields-medal-gowers-gpt-open-problems/</guid><description>The 1998 Fields Medal winner reports GPT-5.5 Pro produced a novel proof for an unsolved math problem in 17 minutes, and says the era of owning theorems is ending.</description><pubDate>Sat, 09 May 2026 07:30:00 GMT</pubDate><category>ai</category><category>openai</category><category>llm</category><category>ai-models</category><category>benchmarks</category><author>dieter-morelli</author></item><item><title>Anthropic doubled Claude Code&apos;s limits by renting 220,000 GPUs from xAI</title><link>https://devtake.dev/article/claude-code-rate-limits-doubled-may-2026/</link><guid isPermaLink="true">https://devtake.dev/article/claude-code-rate-limits-doubled-may-2026/</guid><description>Anthropic doubled Claude Code&apos;s 5-hour limits, killed peak-hours throttling, and raised Opus API tiers. The capacity comes from xAI&apos;s Colossus 1, via a SpaceX deal.</description><pubDate>Thu, 07 May 2026 12:30:00 GMT</pubDate><category>ai</category><category>anthropic</category><category>claude-code</category><category>claude-opus</category><category>spacex</category><category>xai</category><category>ai-infrastructure</category><category>ai-models</category><category>colossus</category><author>dieter-morelli</author></item><item><title>Perplexity&apos;s $400M Snapchat search deal is dead. 
Snap pulled it from guidance.</title><link>https://devtake.dev/article/snap-perplexity-400m-deal-ended/</link><guid isPermaLink="true">https://devtake.dev/article/snap-perplexity-400m-deal-ended/</guid><description>Snap revealed in its Q1 2026 earnings that its November $400M deal to put Perplexity inside Snapchat &apos;amicably ended&apos; before any broader rollout shipped.</description><pubDate>Thu, 07 May 2026 11:00:00 GMT</pubDate><category>ai</category><category>perplexity</category><category>snap</category><category>snapchat</category><category>ai-search</category><category>ai-models</category><category>evan-spiegel</category><author>dieter-morelli</author></item><item><title>Anthropic is fielding offers at a $900B valuation. The round closes in two weeks and tops OpenAI.</title><link>https://devtake.dev/article/anthropic-900b-valuation-50b-round/</link><guid isPermaLink="true">https://devtake.dev/article/anthropic-900b-valuation-50b-round/</guid><description>Preemptive bids put Anthropic at $850B-$900B with a $50B raise. Run rate hit $30B in March, up from $9B at year-end 2025.</description><pubDate>Sat, 02 May 2026 09:45:00 GMT</pubDate><category>ai</category><category>anthropic</category><category>claude</category><category>funding</category><category>valuation</category><category>openai</category><category>ai-models</category><category>finance</category><author>dieter-morelli</author></item><item><title>Microsoft and OpenAI just rewrote their deal. Exclusivity is dead, and so is the AGI clause.</title><link>https://devtake.dev/article/microsoft-openai-deal-revenue-share-end/</link><guid isPermaLink="true">https://devtake.dev/article/microsoft-openai-deal-revenue-share-end/</guid><description>Microsoft loses exclusive rights to OpenAI&apos;s models. The revenue share now caps at 2030 and stops depending on AGI. 
Here&apos;s what actually changed and who it benefits.</description><pubDate>Mon, 27 Apr 2026 19:00:00 GMT</pubDate><category>ai</category><category>openai</category><category>microsoft</category><category>ai-models</category><category>azure</category><category>llm</category><category>ai-infrastructure</category><category>anthropic</category><category>gpt-5-5</category><author>dieter-morelli</author></item><item><title>Arcee&apos;s Trinity-Large-Thinking is a 399B open MoE that costs 96% less than Opus</title><link>https://devtake.dev/article/arcee-trinity-large-thinking-reasoning/</link><guid isPermaLink="true">https://devtake.dev/article/arcee-trinity-large-thinking-reasoning/</guid><description>Arcee released Trinity-Large-Thinking on April 1: a 399B-param sparse MoE with 13B active, Apache 2.0 weights, $0.88 per million output tokens, and PinchBench just behind Opus 4.6.</description><pubDate>Mon, 27 Apr 2026 13:00:00 GMT</pubDate><category>open-source</category><category>arcee</category><category>trinity</category><category>llm</category><category>ai-models</category><category>open-weights</category><category>moe</category><category>reasoning</category><category>apache-2-0</category><author>soren-vanek</author></item><item><title>OpenAI just retired SWE-bench Verified. The headline coding benchmark of 2025 is officially saturated.</title><link>https://devtake.dev/article/openai-retires-swe-bench-verified/</link><guid isPermaLink="true">https://devtake.dev/article/openai-retires-swe-bench-verified/</guid><description>OpenAI says SWE-bench Verified is saturated and contaminated, and 60% of remaining problems are unsolvable. 
Here&apos;s what comes next, and why every coding leaderboard is suspect.</description><pubDate>Mon, 27 Apr 2026 10:00:00 GMT</pubDate><category>ai</category><category>openai</category><category>swe-bench</category><category>benchmarks</category><category>ai-models</category><category>llm</category><category>ai-coding</category><category>evaluations</category><category>claude-opus</category><author>dieter-morelli</author></item><item><title>OpenAI&apos;s Privacy Filter is a 1.5B PII redactor that ships under Apache 2.0. Here&apos;s what it actually does.</title><link>https://devtake.dev/article/openai-privacy-filter/</link><guid isPermaLink="true">https://devtake.dev/article/openai-privacy-filter/</guid><description>OpenAI released Privacy Filter on April 22 as an open-weight on-device model for masking eight types of PII. F1 of 96%. Runs in a browser. Here&apos;s the catch.</description><pubDate>Sun, 26 Apr 2026 13:00:00 GMT</pubDate><category>ai</category><category>openai</category><category>privacy</category><category>pii</category><category>open-weights</category><category>ai-models</category><category>llm</category><category>hugging-face</category><category>data-privacy</category><author>dieter-morelli</author></item><item><title>DeepSeek V4 lands: 1.6T-param open MoE, 1M-token context, and SWE-bench within 0.2 of Opus 4.6</title><link>https://devtake.dev/article/deepseek-v4-release/</link><guid isPermaLink="true">https://devtake.dev/article/deepseek-v4-release/</guid><description>DeepSeek shipped V4-Pro and V4-Flash under MIT on April 24. V4-Pro hits 80.6% on SWE-bench Verified. 
V4-Flash is $0.14 in / $0.28 out.</description><pubDate>Fri, 24 Apr 2026 21:30:00 GMT</pubDate><category>ai</category><category>deepseek</category><category>deepseek-v4</category><category>llm</category><category>ai-models</category><category>open-weights</category><category>moe</category><category>benchmarks</category><category>open-source</category><author>dieter-morelli</author></item><item><title>Google is putting up to $40B into Anthropic. That&apos;s five days after Amazon&apos;s $5B.</title><link>https://devtake.dev/article/google-anthropic-40b-investment/</link><guid isPermaLink="true">https://devtake.dev/article/google-anthropic-40b-investment/</guid><description>Google committed $10B upfront and up to $40B total at a $350B valuation, plus five gigawatts of Google Cloud capacity. It&apos;s Anthropic&apos;s second multibillion-dollar deal in a week.</description><pubDate>Fri, 24 Apr 2026 20:30:00 GMT</pubDate><category>ai</category><category>anthropic</category><category>google</category><category>alphabet</category><category>claude</category><category>ai-infrastructure</category><category>ai-models</category><category>funding</category><category>tpu</category><author>dieter-morelli</author></item><item><title>Anthropic admits three Claude Code bugs quietly tanked quality for six weeks</title><link>https://devtake.dev/article/anthropic-claude-code-quality-postmortem/</link><guid isPermaLink="true">https://devtake.dev/article/anthropic-claude-code-quality-postmortem/</guid><description>Anthropic&apos;s April 23 postmortem names three bugs that degraded Claude Code between March 4 and April 20. 
Usage limits are being reset for every subscriber.</description><pubDate>Fri, 24 Apr 2026 11:30:00 GMT</pubDate><category>ai</category><category>claude-code</category><category>anthropic</category><category>claude</category><category>opus-4-7</category><category>ai-agents</category><category>ai-models</category><category>postmortem</category><category>sonnet-4-6</category><author>dieter-morelli</author></item><item><title>OpenAI shipped GPT-5.5 seven weeks after 5.4. API tokens now cost twice as much.</title><link>https://devtake.dev/article/openai-gpt-5-5-launch/</link><guid isPermaLink="true">https://devtake.dev/article/openai-gpt-5-5-launch/</guid><description>OpenAI released GPT-5.5 (codename Spud) on April 23. The API runs at $5/$30 per million tokens, double GPT-5.4, with Pro at $30/$180.</description><pubDate>Thu, 23 Apr 2026 18:30:00 GMT</pubDate><category>ai</category><category>openai</category><category>gpt-5-5</category><category>chatgpt</category><category>codex</category><category>ai-models</category><category>api-pricing</category><category>llm</category><category>agentic-ai</category><author>dieter-morelli</author></item><item><title>Claude Opus 4.7 is here, and the long-context benchmarks got worse</title><link>https://devtake.dev/article/anthropic-claude-opus-4-7-launch/</link><guid isPermaLink="true">https://devtake.dev/article/anthropic-claude-opus-4-7-launch/</guid><description>Anthropic&apos;s Opus 4.7 is state-of-the-art on SWE-bench and CursorBench, but independent tests show regressions on long-context retrieval and thematic reasoning.</description><pubDate>Fri, 17 Apr 2026 09:30:00 GMT</pubDate><category>ai</category><category>claude</category><category>anthropic</category><category>opus-4-7</category><category>llm</category><category>benchmarks</category><category>mythos</category><category>ai-models</category><author>dieter-morelli</author></item></channel></rss>