devtake.dev — #local-inference

Articles tagged local-inference on devtake.dev.
https://devtake.dev/en-us

Running a coding agent fully on Apple Silicon, no cloud, is now an off-the-shelf stack
https://devtake.dev/article/local-coding-agents-mac/
A popular Hacker News how-to walked through a fully local coding agent on Apple Silicon. Here's the realistic 2026 stack: runner, model, and harness.
Sat, 13 Jun 2026 12:30:00 GMT
Tags: ai, llm, local-inference, ai-agents, agentic-coding, open-weights, mac, moe
Author: dieter-morelli

A crafted Ollama model file leaks the whole server's memory. 300,000 instances are exposed.
https://devtake.dev/article/ollama-bleeding-llama-cve-2026-7482/
Cyera disclosed CVE-2026-7482 on May 1, a CVSS 9.1 unauthenticated heap read in Ollama. Three API calls dump prompts, env vars, and API keys from any open instance.
Mon, 11 May 2026 10:00:00 GMT
Tags: security, ollama, llm, cve-2026-7482, local-inference, memory, cyera, ai-security
Author: luca-reinhardt

AMD's 'Gorgon Halo' refresh leaks with 192GB memory. Strix Halo tops out at 128GB.
https://devtake.dev/article/amd-gorgon-halo-ryzen-ai-max-495-leak/
A leaked Geekbench listing puts AMD's Ryzen AI Max+ 495 on a 192GB platform with a Radeon 8065S iGPU. The Strix Halo chip it replaces capped at 128GB.
Mon, 04 May 2026 09:30:00 GMT
Tags: hardware, amd, ryzen-ai-max, gorgon-halo, strix-halo, local-inference, laptop, lpcamm2
Author: hiro-tanaka

Apple killed the $599 Mac mini. The cheapest one is now $799 with 512GB.
https://devtake.dev/article/apple-mac-mini-base-discontinued-799/
Apple quietly pulled the 256GB Mac mini from its store on May 1. Tim Cook had warned the day before that demand was outpacing supply for months.
Sat, 02 May 2026 09:15:00 GMT
Tags: apple, mac-mini, m4, pricing, hardware, mac, local-inference
Author: naomi-park

Qwen 3.6-35B-A3B: the open MoE beating Opus 4.7 on Simon Willison's laptop
https://devtake.dev/article/qwen-3-6-35b-a3b-beats-opus-on-laptop/
Alibaba's Qwen 3.6-35B-A3B is a 35B-param mixture-of-experts with only 3B active. Apache 2.0, runs on consumer GPUs, and it's already winning real tasks.
Fri, 17 Apr 2026 10:00:00 GMT
Tags: ai, qwen, alibaba, open-source, moe, llm, local-inference, open-weights
Author: dieter-morelli