Designing Context On-Device: What Foundation Models Mean for AI-Native Apps
For the past few months, I’ve been diving deep into MCP (Model Context Protocol) — shaping inputs, modularising prompts, and designing smarter ways to talk to models like GPT-4.
I even wrote a post about it: Designing Context.
It was a personal reckoning: prompting alone won’t scale. You need structure. You need relevance. You need context.
And then WWDC25 dropped.
Apple unveiled Foundation Models, and suddenly, the conversation shifted.
What happens when the same context design philosophy I’ve been building on — the same modular, structured input mindset — moves on-device?
This post is my first attempt to answer that.
From Cloud Context to Local Context
If you’ve been using OpenAI’s models, you know the deal: you send input to the cloud, shape the response, and (if you're like me) obsess over how to structure it so the model gets it right.
This has been the backbone of how AteIQ works — structuring meal logs and user data into clean, modular context blocks, then feeding that to GPT for insight.
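To make that concrete, here's roughly the shape of one of those blocks. The fields below are illustrative, not AteIQ's actual schema:

```swift
import Foundation

// Illustrative only: not AteIQ's real schema, just the shape of a modular context block.
struct MealContextBlock: Codable {
    let meal: String          // e.g. "grilled chicken salad"
    let loggedAt: Date        // when the user logged it
    let dailyCalorieGoal: Int // pulled from the user's profile
    let recentTrend: String   // e.g. "protein intake trending low this week"
}

// The block gets serialized and prepended to the prompt before the cloud call.
func contextPreamble(for block: MealContextBlock) throws -> String {
    let encoder = JSONEncoder()
    encoder.dateEncodingStrategy = .iso8601
    let json = try encoder.encode(block)
    return "CONTEXT:\n" + String(decoding: json, as: UTF8.self)
}
```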
But Apple’s Foundation Models reframe the equation entirely.
Now you can run models locally, on-device, with blazing speed and total user privacy.
And that’s more than a technical shift.
It’s a contextual shift.
What Apple’s Foundation Models Actually Are
Let’s recap what Apple introduced:
- A set of pre-trained large models for language, vision, and multimodal input
- Available through Swift APIs (`FoundationModel`, `GenerateTextRequest`, etc.)
- Fine-tuned for privacy, efficiency, and on-device performance
- Integrated with tools like Core ML, App Intents, and SwiftData
In plain English?
You can now run a mini GPT-style experience without the cloud.
And that unlocks a ton of new use cases — especially when paired with smart context design.
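To make that concrete, here's what a single on-device generation might look like. The type names follow the session-style API Apple showed at WWDC25, so treat them as an assumption rather than the final shape of the framework:

```swift
import FoundationModels

// A minimal on-device generation: no network call, no API key.
// Names follow Apple's WWDC25 session examples and may shift between SDK releases.
func quickInsight(for mealSummary: String) async throws -> String {
    let session = LanguageModelSession(
        instructions: "You are a concise nutrition coach. Answer in two sentences."
    )
    let response = try await session.respond(to: "Give one insight about this meal: \(mealSummary)")
    return response.content
}
```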
Designing Context for Foundation Models
Even though Apple doesn’t say “MCP,” the underlying ideas are strikingly familiar:
Foundation Models still rely on good context to be useful.
It just takes a different form.
You're no longer just sending a raw prompt string; you're constructing typed inputs, often tied to the current user session, device state, or app domain.
Think:
- A journaling app that references a user’s recent moods
- A coaching app that adapts its tone based on time of day
- A food app (like AteIQ 👀) that responds differently to a midnight snack than a post-workout meal
All of these require context-aware logic.
That logic doesn’t go away with local models.
If anything, it becomes more powerful — because it’s private, fast, and real-time.
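Take the food-app example: a rough sketch of those typed inputs could look like this, with hypothetical fields and deliberately simple framing rules:

```swift
// A hypothetical typed context: the model only sees what we explicitly put here,
// so the framing logic lives in our code, not in the prompt alone.
struct SnackContext {
    let description: String
    let hour: Int                 // from the device clock
    let minutesSinceWorkout: Int? // nil if no workout logged today

    var framing: String {
        if let minutes = minutesSinceWorkout, minutes < 90 {
            return "This is a post-workout meal; focus on recovery."
        } else if hour >= 22 || hour < 5 {
            return "This is a late-night snack; keep the tone gentle, not judgmental."
        } else {
            return "This is a regular meal; give a balanced, neutral take."
        }
    }

    var prompt: String {
        "\(framing)\nMeal: \(description)"
    }
}
```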
What I’ve Learned So Far
Here’s what’s clicking for me as I explore Foundation Models with a context-first mindset:
1. The Model Doesn’t Know Anything — Until You Tell It
Local models don’t have memory, history, or a backend. That means you define what they know, every time.
That’s not a limitation. That’s a superpower.
If you treat each inference call like a fresh teammate dropping into the room, your job is to say:
“Here’s what’s happening right now. Here’s what matters. Now help me with this.”
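In code, that briefing can be as simple as a small struct you rebuild before every call. The fields are illustrative; the point is that nothing carries over between calls:

```swift
// Rebuilt from scratch on every inference call, because the model remembers nothing.
struct Briefing {
    let whatIsHappening: String   // "User just logged dinner at 21:40."
    let whatMatters: String       // "They're short on protein for the day."
    let ask: String               // "Suggest one small, encouraging adjustment."

    var asPrompt: String {
        """
        Here's what's happening right now: \(whatIsHappening)
        Here's what matters: \(whatMatters)
        Now help me with this: \(ask)
        """
    }
}
```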
2. Context Is Now Local by Default
In the OpenAI world, context often lives in a backend.
With Foundation Models, context lives on-device — in memory, user defaults, local files, even sensors.
It’s like designing with a new palette:
- What can I infer from recent user actions?
- What can I pull from app state?
- What can I leave out that the model can infer from the environment?
Context design becomes not just a question of data structures… but an architectural choice.
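Here's a small sketch of what "local by default" can mean in practice. The defaults key and the fields are hypothetical:

```swift
import Foundation

// Every signal below comes from the device, not a backend.
func localContext() -> [String: String] {
    var context: [String: String] = [:]

    // App state persisted locally.
    if let goal = UserDefaults.standard.string(forKey: "currentGoal") {
        context["goal"] = goal
    }

    // Environmental signals the model can't infer on its own.
    let hour = Calendar.current.component(.hour, from: Date())
    context["timeOfDay"] = hour < 12 ? "morning" : (hour < 18 ? "afternoon" : "evening")

    return context
}
```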
3. You Can Design for Speed and Structure
Because inference is fast (sub-second in many cases), you can iterate much more freely.
That means experimenting with:
- Context modules (e.g., `userProfile`, `timeOfDay`, `recentGoal`)
- Reusable templates for different app flows
- Feedback loops that reshape context based on results
It's like building a local AI teammate that doesn’t talk to the cloud but still knows the whole room.
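As a rough sketch, those context modules can be as simple as a small protocol. The module names above are placeholders, not a real API, and a "template" is just the ordered set of modules a given flow needs:

```swift
// Each module renders its own slice of context; templates compose them per flow.
protocol ContextModule {
    func render() -> String
}

struct UserProfileModule: ContextModule {
    let name: String
    let goal: String
    func render() -> String { "User: \(name). Goal: \(goal)." }
}

struct TimeOfDayModule: ContextModule {
    let hour: Int
    func render() -> String { "Local time: \(hour):00." }
}

func assembleContext(_ modules: [any ContextModule]) -> String {
    modules.map { $0.render() }.joined(separator: "\n")
}
```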
Bridging the Two Worlds
So where does this leave me?
I'm still using OpenAI.
But I’m starting to explore what happens when some of AteIQ’s features — like snack suggestions, nudges, or affirmations — shift from the cloud to the device.
That’s the future I see:
- Use Foundation Models for fast, contextual nudges, tone adjustments, and real-time inference
- Use cloud models (like GPT-4) for deeper reasoning, long-form analysis, and personalized logic
Think of it like rendering:
- Local = fast, ambient, real-time
- Cloud = powerful, precise, asynchronous
And the glue between them?
Still the same: well-designed context.
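In code, that split might be nothing more than a routing decision sitting on top of a shared context. The heuristic and threshold below are made up, just to show the shape:

```swift
// The same context object can feed either path; only the destination changes.
enum InferenceRoute {
    case onDevice   // fast, ambient, real-time
    case cloud      // powerful, precise, asynchronous
}

func route(estimatedOutputTokens: Int, needsDeepReasoning: Bool) -> InferenceRoute {
    if needsDeepReasoning || estimatedOutputTokens > 500 {
        return .cloud
    }
    return .onDevice
}
```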
What I’m Exploring Next
I’m still early in this journey — but here’s what’s on my whiteboard:
- 🧠 Can I use Foundation Models to give users instant summaries of their day?
- 🥗 Can context-aware prompts adjust tone in AteIQ based on how someone’s week is going?
- ✍️ Can I explore on-device journaling, where reflections and coaching stay 100% private?
- 🔁 Can I blend both models — cloud + local — through a shared context schema?
These aren’t just experiments. They’re product opportunities.
Because when context is modular, and models are available everywhere, you unlock a new kind of design freedom.
Why This Matters
Foundation Models aren’t just a new API.
They represent a shift in how we think about AI-native UX.
Instead of shipping prompts, you’re shipping reasoning.
Instead of storing user data in the cloud, you’re processing it locally.
Instead of hacking together workarounds, you’re designing for intelligence from the start.
MCP gave me the lens.
Foundation Models gave me a new canvas.
Together, they push me toward one core belief:
Context is the real product layer of AI.
The better we get at designing it, the more human our apps will feel.
This post explored how Foundation Models open up a new frontier in context design — moving from cloud-based reasoning to real-time, private, on-device intelligence. I’ll be sharing more — from prototypes to production features inside AteIQ.
If you're experimenting with this too, I'd genuinely love to connect. Drop me a line.
And if you’re just getting started, I hope this blog becomes a place you can revisit and grow alongside.
Until next time — structure that context.