<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>devtake.dev — #open-weights</title><description>Articles tagged open-weights on devtake.dev.</description><link>https://devtake.dev/</link><language>en-us</language><item><title>Running a coding agent fully on Apple Silicon, no cloud, is now an off-the-shelf stack</title><link>https://devtake.dev/article/local-coding-agents-mac/</link><guid isPermaLink="true">https://devtake.dev/article/local-coding-agents-mac/</guid><description>A popular Hacker News how-to walked through a fully local coding agent on Apple Silicon. Here&apos;s the realistic 2026 stack: runner, model, and harness.</description><pubDate>Sat, 13 Jun 2026 12:30:00 GMT</pubDate><category>ai</category><category>llm</category><category>local-inference</category><category>ai-agents</category><category>agentic-coding</category><category>open-weights</category><category>mac</category><category>moe</category><author>dieter-morelli</author></item><item><title>Cactus Compute distilled Gemini into a 26M tool-calling model. The trick: no feed-forward layers.</title><link>https://devtake.dev/article/needle-cactus-compute-tool-calling/</link><guid isPermaLink="true">https://devtake.dev/article/needle-cactus-compute-tool-calling/</guid><description>Needle is a 26M-parameter function caller distilled from Gemini 3.1 Flash-Lite. 
The Simple Attention Network drops MLPs and runs at 6,000 tok/s prefill on edge silicon.</description><pubDate>Wed, 13 May 2026 10:00:00 GMT</pubDate><category>ai</category><category>ai-models</category><category>gemini</category><category>open-weights</category><category>function-calling</category><category>tool-calling</category><category>edge-ai</category><category>on-device-ai</category><category>small-models</category><author>dieter-morelli</author></item><item><title>Arcee&apos;s Trinity-Large-Thinking is a 399B open MoE that costs 96% less than Opus</title><link>https://devtake.dev/article/arcee-trinity-large-thinking-reasoning/</link><guid isPermaLink="true">https://devtake.dev/article/arcee-trinity-large-thinking-reasoning/</guid><description>Arcee released Trinity-Large-Thinking on April 1: a 399B-param sparse MoE with 13B active, Apache 2.0 weights, $0.88 per million output tokens, and PinchBench just behind Opus 4.6.</description><pubDate>Mon, 27 Apr 2026 13:00:00 GMT</pubDate><category>open-source</category><category>arcee</category><category>trinity</category><category>llm</category><category>ai-models</category><category>open-weights</category><category>moe</category><category>reasoning</category><category>apache-2-0</category><author>soren-vanek</author></item><item><title>OpenAI&apos;s Privacy Filter is a 1.5B PII redactor that ships under Apache 2.0. Here&apos;s what it actually does.</title><link>https://devtake.dev/article/openai-privacy-filter/</link><guid isPermaLink="true">https://devtake.dev/article/openai-privacy-filter/</guid><description>OpenAI released Privacy Filter on April 22 as an open-weight on-device model for masking eight types of PII. F1 of 96%. Runs in a browser. 
Here&apos;s the catch.</description><pubDate>Sun, 26 Apr 2026 13:00:00 GMT</pubDate><category>ai</category><category>openai</category><category>privacy</category><category>pii</category><category>open-weights</category><category>ai-models</category><category>llm</category><category>hugging-face</category><category>data-privacy</category><author>dieter-morelli</author></item><item><title>DeepSeek V4 lands: 1.6T-param open MoE, 1M-token context, and SWE-bench within 0.2 of Opus 4.6</title><link>https://devtake.dev/article/deepseek-v4-release/</link><guid isPermaLink="true">https://devtake.dev/article/deepseek-v4-release/</guid><description>DeepSeek shipped V4-Pro and V4-Flash under MIT on April 24. V4-Pro hits 80.6% on SWE-bench Verified. V4-Flash is $0.14 in / $0.28 out.</description><pubDate>Fri, 24 Apr 2026 21:30:00 GMT</pubDate><category>ai</category><category>deepseek</category><category>deepseek-v4</category><category>llm</category><category>ai-models</category><category>open-weights</category><category>moe</category><category>benchmarks</category><category>open-source</category><author>dieter-morelli</author></item><item><title>Qwen 3.6-35B-A3B: the open MoE beating Opus 4.7 on Simon Willison&apos;s laptop</title><link>https://devtake.dev/article/qwen-3-6-35b-a3b-beats-opus-on-laptop/</link><guid isPermaLink="true">https://devtake.dev/article/qwen-3-6-35b-a3b-beats-opus-on-laptop/</guid><description>Alibaba&apos;s Qwen 3.6-35B-A3B is a 35B-param mixture-of-experts with only 3B active. Apache 2.0, runs on consumer GPUs, and it&apos;s already winning real tasks.</description><pubDate>Fri, 17 Apr 2026 10:00:00 GMT</pubDate><category>ai</category><category>qwen</category><category>alibaba</category><category>open-source</category><category>moe</category><category>llm</category><category>local-inference</category><category>open-weights</category><author>dieter-morelli</author></item></channel></rss>