Ollama 0.19 Preview: MLX-Powered Apple Silicon Acceleration Specifically Optimized for AI Coding Agents and OpenClaw

Ollama announced the v0.19 preview on March 31, 2026: a fundamental architecture shift from llama.cpp to Apple's MLX framework on Apple Silicon, optimized specifically for AI coding agents and personal-assistant workloads.
Key Performance Gains (Qwen3.5-35B-A3B with NVFP4 quantization on M5 chips):
- 1851 tokens/s prefill speed, a large improvement over the previous Q4_K_M implementation
- 134 tokens/s decode speed
- Leverages the GPU Neural Accelerators in the M5/M5 Pro/M5 Max to improve both time-to-first-token and generation speed
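At the quoted speeds, per-turn latency is easy to estimate. A minimal back-of-envelope sketch, assuming throughput stays constant across prompt lengths (real prefill and decode curves only approximate this):

```python
# Back-of-envelope latency from the quoted M5 throughput numbers.
# Assumes constant throughput regardless of prompt length, which is
# an approximation of real prefill/decode behavior.

PREFILL_TOK_S = 1851  # prompt processing speed (tokens/s)
DECODE_TOK_S = 134    # generation speed (tokens/s)

def turn_latency(prompt_tokens: int, output_tokens: int) -> float:
    """Seconds for one request: time-to-first-token + generation time."""
    ttft = prompt_tokens / PREFILL_TOK_S
    generation = output_tokens / DECODE_TOK_S
    return ttft + generation

# A typical agent turn: a large tool-laden prompt, a short tool-call reply.
print(f"{turn_latency(8000, 200):.1f}s")  # ~4.3s prefill + ~1.5s decode
```

For agent workloads, where prompts are long and replies are short, prefill speed dominates, which is why the cache improvements below matter so much.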
Agent-Specific Optimizations: The release explicitly targets agentic coding use cases with three cache improvements:
- Cross-conversation cache reuse: lower memory use and more cache hits when conversations branch from a shared system prompt, which is critical for tool-calling agents like Claude Code that resend the same system prompt on every turn
- Intelligent checkpoints: Cache snapshots at strategic prompt locations, reducing prompt processing and enabling faster responses during multi-turn agent interactions
- Smarter eviction: Shared prefixes survive longer even when older branches are dropped
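The reuse and eviction behaviors above can be illustrated with a toy token-prefix tree. This is a sketch of the general prefix-caching idea, not Ollama's actual implementation; the names `PrefixCache` and `evict_private_branches` are invented for illustration:

```python
# Illustrative sketch (not Ollama's code) of prefix-cache reuse and
# prefix-aware eviction: tokens shared by several conversations form a
# shared prefix in the tree, and eviction drops single-use branches
# while letting shared prefixes survive.

class PrefixCache:
    def __init__(self):
        self.root = {"children": {}, "refs": 0}

    def insert(self, tokens):
        """Store one conversation; return how many tokens were cache hits."""
        node, hits = self.root, 0
        for tok in tokens:
            if tok in node["children"]:
                hits += 1  # this prefix token was already cached
            else:
                node["children"][tok] = {"children": {}, "refs": 0}
            node = node["children"][tok]
            node["refs"] += 1  # count conversations using this prefix
        return hits

    def evict_private_branches(self):
        """Drop subtrees used by only one conversation; shared prefixes stay."""
        def prune(node):
            node["children"] = {
                t: prune(c) for t, c in node["children"].items() if c["refs"] > 1
            }
            return node
        prune(self.root)

cache = PrefixCache()
system = ["<sys>", "You", "are", "a", "coding", "agent"]
cache.insert(system + ["fix", "bug"])
# A second conversation branching off the same system prompt reuses it:
print(cache.insert(system + ["write", "tests"]))  # prints 6
```

Checkpointing, the second improvement, amounts to snapshotting KV state at interior nodes of such a tree so a new branch can resume from the deepest shared ancestor instead of reprocessing the whole prompt.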
NVFP4 Production Parity: Ollama now quantizes with NVIDIA's NVFP4 format, the same format inference providers use in production. Local development with Ollama therefore produces the same quality results as cloud deployments, which matters for agent developers testing locally before deploying.
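The rough shape of NVFP4-style quantization can be sketched as blockwise rounding onto the FP4 (E2M1) grid with a shared per-block scale. Real NVFP4 stores that scale in FP8 (E4M3) alongside a second per-tensor scale; this toy version omits both and keeps everything in Python floats:

```python
# Hedged sketch of NVFP4-style block quantization: each block of up to
# 16 values shares one scale, and each value is rounded to the nearest
# magnitude representable in FP4 (E2M1). Real NVFP4 additionally stores
# the block scale in FP8 (E4M3) plus a per-tensor scale, omitted here.

E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # positive FP4 magnitudes

def quantize_block(block):
    """Return (scale, codes) for one block of floats."""
    scale = max(abs(x) for x in block) / 6.0 or 1.0  # map max |x| to 6.0
    codes = []
    for x in block:
        mag = min(E2M1, key=lambda g: abs(abs(x) / scale - g))
        codes.append(-mag if x < 0 else mag)
    return scale, codes

def dequantize_block(scale, codes):
    return [c * scale for c in codes]

scale, codes = quantize_block([0.12, -0.6, 0.31, 0.05])
print([round(v, 3) for v in dequantize_block(scale, codes)])
# prints [0.1, -0.6, 0.3, 0.05]
```

The appeal of format parity is that this rounding behavior, and hence the model's numerical error profile, matches what the cloud deployment sees, rather than differing as it would between Q4_K_M locally and FP4 in production.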
Explicit Agent Integrations: The blog post specifically highlights integration with:
- OpenClaw personal assistants
- Claude Code coding agent
- OpenCode
- Codex
Each can be started with a command-line launcher, e.g. 'ollama launch claude --model qwen3.5:35b-a3b-coding-nvfp4' or 'ollama launch openclaw --model qwen3.5:35b-a3b-coding-nvfp4'.
The release requires a Mac with 32GB+ unified memory. Within 4 hours of posting it reached #2 on Hacker News with 187 points and 77 comments.