WWDC 2026: Apple Finally Shipped the AI It Promised — A Developer's Breakdown

I watched the WWDC 2026 keynote (June 8) the same way I watched Microsoft Build last week — with a developer's notebook open and one question on repeat: what can I actually build with this? Two years after Apple Intelligence was announced and then famously under-delivered, this was the year Apple stopped apologizing. The rebuilt Siri is real, every OS jumped to version 27, and — buried under the consumer headlines — Apple quietly shipped the most developer-friendly local-AI stack of any platform vendor.

It was also the end of an era: this was Tim Cook's farewell keynote. John Ternus takes over as CEO on September 1.

There's a lot here, so let me sort it the way I sorted Build: the headline, the consumer stuff worth knowing, and then the part that actually got me excited as a developer.

The headline: Siri AI, rebuilt with Google

The new assistant is branded Siri AI, and Apple was unusually candid about how it was built: the underlying Apple Foundation Models were "custom-built in collaboration with Google and its Gemini models," then adapted for Apple hardware and Apple's Private Cloud Compute.

What it does now — the things Siri was supposed to do in 2024:

Holds a real conversation. Multi-turn, with follow-ups, and a new standalone Siri app that keeps conversation history. It's on iPad and Mac too.
Sees your screen. Ask about whatever is in front of you.
Knows your personal context. It can dig through your own messages, mail, and photos to answer questions — the long-delayed "personal context" feature, finally shipping.

The privacy framing was classic Apple — Craig Federighi: "data is only used to execute your request." The interesting strategic read: Apple didn't train a frontier model alone. They partnered for the model tech and kept ownership of the privacy and serving layer (Private Cloud Compute). It's the Google-pays-to-be-Safari's-default deal, running in reverse.

Jargon check: Private Cloud Compute (PCC) is Apple's own AI datacenter built on Apple silicon servers, architected so that even Apple can't read your requests: stateless processing, no prompt storage, independently auditable. Think "on-device privacy, cloud-sized model." It matters later in this post, because Apple is now giving small developers free access to it.

The consumer round-up (fast)

Everything is version 27 — iOS, iPadOS, macOS, watchOS, tvOS, visionOS. No more mismatched numbers.
iOS 27 runs on iPhone 11 and later — Apple claims the widest iOS rollout ever. Notable because AI features usually shrink device support.
Performance as a feature: photos load up to 70% faster, AirDrop transfers up to 80% faster, smarter CPU scheduling.
Liquid Glass got dialed back. Last year's divisive transparent design now has a toned-down default and an opacity slider. The backlash worked.
The Passwords app became an agent. This one stopped me: Apple Intelligence plus Safari will "agentically take action on your behalf" — going to each website with a weak or leaked password and changing it for you. That's a mainstream OS shipping a browser-driving agent as a checkbox feature.
Photos gets Reframe (AI perspective correction), Extend (outpainting to change aspect ratio), and a much better Cleanup.
Image Playground was rebuilt on Private Cloud Compute and can finally do photorealistic output — with generated images excluded from training data.
Grab bag: system-wide AI dictation, full-screen widgets, separate volume for alarms vs. alerts, much stronger parental controls (mandatory child accounts under 13, "Ask to Browse"), menopause tracking in Health.

Fine. Now the good part.

The developer story: one protocol to rule every model

The single biggest idea Apple shipped this year is a Swift protocol. I'm not joking.

The Foundation Models framework — Apple's Swift API for calling LLMs from your app — now sits behind a LanguageModel protocol. Any model that conforms can back a session:

SystemLanguageModel — Apple's small on-device model (now rebuilt, with vision)
PrivateCloudComputeLanguageModel — Apple's bigger server model: 32K context, adjustable reasoning levels, no API key, no account setup
CoreAILanguageModel and MLXLanguageModel — open-sourced by Apple, for running local open-weight models on your Mac's GPU and Neural Engine
Official Swift packages from Anthropic and Google — drop AnthropicLanguageModel() in via Swift Package Manager and Claude backs your session; nothing else in your code changes

import FoundationModels

// The whole pitch in four lines:
let session = LanguageModelSession(model: SystemLanguageModel())   // free, on-device
// let session = LanguageModelSession(model: PrivateCloudComputeLanguageModel()) // free-tier cloud
// let session = LanguageModelSession(model: AnthropicLanguageModel())           // Claude
// let session = LanguageModelSession(model: MLXLanguageModel(/* local Qwen */)) // your GPU

// uiImage is an image you have already loaded elsewhere in your app.
let response = try await session.respond {
    "What animal is this?"
    Attachment(uiImage)   // image input is new this year
}

If you've used the OpenAI-compatible API convention in the open-source world, this is the same idea — write once, swap the backend — except typed, native, and OS-level. After spending last month wiring Claude Code to Ollama by overriding base URLs and env vars, seeing this as a first-class protocol felt almost unfair.
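To make the swap-the-backend idea concrete without leaning on beta APIs, here is a minimal, self-contained sketch of the pattern in plain Swift. Everything in it (TextModel, LocalStubModel, RemoteStubModel, Session) is a hypothetical stand-in I made up for illustration, not Apple's LanguageModel protocol or any vendor's package. The only point is that the session talks to a protocol, so the concrete model becomes a one-line choice at the call site.

```swift
// Hypothetical sketch: none of these types are Apple's. They only illustrate
// the "write once, swap the backend" shape described above.

/// Anything that can turn a prompt into text can back a session.
protocol TextModel {
    var name: String { get }
    func complete(_ prompt: String) -> String
}

/// Stand-in for a small on-device model.
struct LocalStubModel: TextModel {
    let name = "local-stub"
    func complete(_ prompt: String) -> String {
        "[\(name)] \(prompt.count) characters received"
    }
}

/// Stand-in for a hosted model; a real one would make a network call here.
struct RemoteStubModel: TextModel {
    let name = "remote-stub"
    func complete(_ prompt: String) -> String {
        "[\(name)] echo: \(prompt)"
    }
}

/// The session depends only on the protocol, never on a concrete backend.
struct Session {
    let model: any TextModel
    private(set) var transcript: [String] = []

    init(model: any TextModel) {
        self.model = model
    }

    /// Sends the prompt to whichever backend was injected and records the turn.
    mutating func respond(to prompt: String) -> String {
        let reply = model.complete(prompt)
        transcript.append(prompt)
        transcript.append(reply)
        return reply
    }
}

// Swapping the backend is a one-line change at the call site.
var local = Session(model: LocalStubModel())
var remote = Session(model: RemoteStubModel())
print(local.respond(to: "What animal is this?"))
print(remote.respond(to: "What animal is this?"))
```

The real framework is async and far richer than this. I kept the sketch synchronous so it runs as a plain script, and the stubs make it easy to unit-test app logic without any model at all.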

Other Foundation Models upgrades worth your time:

Image input on the on-device model (vision), any size or aspect ratio — larger images just cost more tokens.
Built-in system tools the model can call: an OCRTool, a BarcodeReaderTool, and a Spotlight search tool — effectively local RAG over the user's own data, no vector DB required.
Dynamic Profiles — a declarative, SwiftUI-flavored API for agent "modes": run the cheap on-device model with one toolset in analysis mode, then switch the same session to Private Cloud Compute with deep reasoning for brainstorming. Multi-agent workflows as a switch statement.
A new Evaluations framework for testing AI features systematically — Apple's phrase: "beyond what unit tests alone can catch." The industry-wide evals problem now has a first-party Apple answer.
An fm CLI and a Python SDK on macOS 27 — free local LLM calls from shell scripts. Sleeper hit of the conference for me.
The framework itself is going open source later this summer (the framework, mind you — not the model weights).
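On the evals point in the list above: I don't have the Evaluations framework's API to show, so the following is explicitly not it. It is a minimal sketch of what an eval is in general, under my own assumptions: a set of prompts, a pass/fail check per output, and a pass-rate threshold instead of the exact-equality assertions a unit test would use. EvalCase, EvalReport, runEval, and stubModel are all hypothetical names.

```swift
import Foundation

// Hypothetical sketch: not Apple's Evaluations framework. It only illustrates
// the general shape of an eval run.

/// One eval case: a prompt plus a check that accepts or rejects an output.
struct EvalCase {
    let prompt: String
    let passes: (String) -> Bool
}

/// Aggregate result of a run.
struct EvalReport {
    let passed: Int
    let total: Int
    var passRate: Double { total == 0 ? 0 : Double(passed) / Double(total) }
}

/// Run every case through the model and count how many outputs pass.
func runEval(cases: [EvalCase], model: (String) -> String) -> EvalReport {
    let passed = cases.filter { $0.passes(model($0.prompt)) }.count
    return EvalReport(passed: passed, total: cases.count)
}

/// Stub standing in for a real model call; it only knows one fact.
func stubModel(_ prompt: String) -> String {
    prompt.lowercased().contains("capital of france")
        ? "The capital is Paris."
        : "I am not sure."
}

let cases = [
    EvalCase(prompt: "What is the capital of France?") { $0.contains("Paris") },
    EvalCase(prompt: "Name the capital of France in one word.") { $0.contains("Paris") },
    EvalCase(prompt: "What is the capital of Peru?") { $0.contains("Lima") },
]

let report = runEval(cases: cases, model: stubModel)
print("passed \(report.passed)/\(report.total)")

// Gate on a threshold rather than requiring every case to pass.
let threshold = 0.6
print(report.passRate >= threshold ? "eval OK" : "eval FAILED")
```

The design choice worth noticing is the threshold: model output is not deterministic enough for all-or-nothing assertions, so an eval tracks a rate over many cases and flags regressions when that rate drops.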

Core AI: bring-your-own-model becomes an OS feature

Alongside Foundation Models (Apple's models) there's a brand-new framework called Core AI for running your models:

A memory-safe Swift API to load, specialize, and run models entirely on-device — Apple's pitch is literally "zero server dependencies and zero token costs."
Ahead-of-time compilation so big models load fast, with fine-grained control over inference memory, zero-copy data paths, and stateful execution.
Python tools for converting PyTorch models to run on Apple silicon.
Architecture "optimized for the unified memory and Neural Engine of Apple silicon" — and Apple says Core AI is what powers the new Siri itself.

Jargon check — Core ML vs. Core AI: Core ML (2017) was built for the classifier era — small task-specific models. Core AI is built for the LLM era — big generative models that need streaming, KV-cache-style state, and careful memory management. Core ML isn't dead, but Core AI is clearly the new front door for bringing your own models.
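Why does "stateful execution" matter so much for generative models? Here is a toy analogy in plain Swift. To be clear, this is not Core AI's API and not a real KV cache: the "model" just sums token IDs, and statelessOutput and StatefulDecoder are names I invented. What it does show is the cost difference between reprocessing the whole prefix on every step and carrying state forward.

```swift
// Toy analogy only: this is not Core AI's API and not a real KV cache.
// The "model" just sums token IDs, so the state it carries is one integer.

/// Stateless: every call reprocesses the whole prefix from scratch.
func statelessOutput(prefix: [Int], work: inout Int) -> Int {
    var sum = 0
    for token in prefix {
        sum += token
        work += 1   // one unit of work per token touched
    }
    return sum
}

/// Stateful: the running result is kept between calls,
/// so each step only processes the newly arrived token.
struct StatefulDecoder {
    private(set) var runningSum = 0
    private(set) var work = 0

    mutating func step(_ token: Int) -> Int {
        runningSum += token
        work += 1
        return runningSum
    }
}

let tokens = [3, 1, 4, 1, 5]

// Stateless generation: call with a growing prefix at every step.
var statelessWork = 0
var statelessOutputs: [Int] = []
for end in 1...tokens.count {
    let prefix = Array(tokens[0..<end])
    statelessOutputs.append(statelessOutput(prefix: prefix, work: &statelessWork))
}

// Stateful generation: feed one token at a time.
var decoder = StatefulDecoder()
let statefulOutputs = tokens.map { decoder.step($0) }

print("same outputs:", statelessOutputs == statefulOutputs)
print("stateless work:", statelessWork, "stateful work:", decoder.work)
```

Both paths produce the same outputs, but the stateless one does work that grows quadratically with sequence length while the stateful one grows linearly. That gap is the reason a framework aimed at LLMs would treat state as a first-class concern.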

I have a lot more to say about this one — it directly collides with my local-AI setup — but I'm giving it its own post.

Private Cloud Compute is free for small developers

This is the announcement I'd put on a poster: if you're in the App Store Small Business Program and your app has fewer than 2 million first-time downloads, you can call the next-generation Apple Foundation Model on Private Cloud Compute at no cloud API cost. No keys, no accounts, prompts not stored, and users with iCloud+ get higher limits.
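Spelled out as code, the eligibility rule as I understood it from the announcement looks like the sketch below. The type and function names are mine, and the real program terms may well carry conditions that were not on the slide, so treat this as a reading of the two stated criteria and nothing more.

```swift
// Hypothetical helper: the names are mine. Only the two conditions come from
// the announcement as described above.
struct AppStanding {
    let inSmallBusinessProgram: Bool
    let firstTimeDownloads: Int
}

/// "Fewer than 2 million first-time downloads", so the cap itself is excluded.
let freeTierDownloadCap = 2_000_000

func qualifiesForFreePCC(_ app: AppStanding) -> Bool {
    app.inSmallBusinessProgram && app.firstTimeDownloads < freeTierDownloadCap
}

let indie = AppStanding(inSmallBusinessProgram: true, firstTimeDownloads: 150_000)
let atCap = AppStanding(inSmallBusinessProgram: true, firstTimeDownloads: 2_000_000)
print(qualifiesForFreePCC(indie), qualifiesForFreePCC(atCap))
```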

Compare that with what an indie app pays today for GPT/Claude/Gemini API calls at even modest scale. Apple just made "ship an AI feature without a cloud bill" the default for small developers — and in exchange, of course, those developers build on Apple's stack. It's aggressive, and it will work.

Xcode 27: Claude Code moves in

Xcode 27 ships native agentic coding with the agents you already know — Claude Code, Gemini, and OpenAI Codex — selectable per task, with support for MCP (Model Context Protocol) and the Agent Client Protocol. Your existing MCP servers work inside Xcode.

Two design choices stood out:

A dual-engine split. Real-time Swift autocomplete runs on a local Neural Engine model — your source never leaves the Mac — while the heavyweight agents handle planning, refactors, and reviews in the cloud.
Agents can validate their own work. They can write and run tests, experiment in Playgrounds, check visual output via previews, and drive the simulator through the new Device Hub. Anyone who's watched an agent claim "done!" without running anything knows why this matters.

Plus the housekeeping: Xcode is 30% smaller, Apple-silicon-only now, settings sync via iCloud, and Xcode Cloud builds up to 2× faster.

My takeaway

Build 2026 was Microsoft declaring the agent-first era from the cloud down. WWDC 2026 was Apple answering from the device up: a free on-device model with vision, a free cloud model for small apps, an open protocol so any model — Claude, Gemini, or a quantized Qwen on your own GPU — slots into the same Swift API, and an IDE where the agents check their own work.

The thing I keep chewing on: Apple is the only platform vendor whose AI strategy gets cheaper for developers as it gets better. Microsoft's MAI models are metered APIs. Apple's answer is "the model is in the OS, and the cloud tier is free if you're small."

All of this is developer-beta-only right now, shipping publicly with the OS 27 releases this fall — which gives the rest of us a summer to play. My own list for the coming months: the fm CLI, the MLXLanguageModel, and Claude Code inside Xcode 27. I'll write up each as I get to them.

Tim Cook closed his last keynote on stage with the hardware-and-silicon story he spent fifteen years building. Fittingly, it's the silicon — unified memory, the Neural Engine, PCC servers — that makes the whole AI story credible. Over to you, John Ternus.

Next up: what Core AI, MLX, and the new local-model stack mean for my own local AI setup — that post is coming tomorrow.

