
The Million-Dollar Question: Can Your iPhone Do Math?
You'd think a flagship smartphone, especially one marketed for its powerful AI capabilities, could handle basic arithmetic. But what if your shiny new iPhone 16 Pro Max, a device costing well over a grand, started spitting out numerical garbage when running local LLMs?
That's exactly the baffling scenario I encountered. While my iPhone 15 Pro and MacBook Pro handled the same MLX LLM code perfectly, the iPhone 16 Pro Max consistently produced tensor outputs off by an order of magnitude. My strong suspicion? A potential hardware defect within its Neural Engine or another critical ML-related system. And yes, it was a massive pain to debug, but at least you get a blog post out of it!
The Project That Sparked the Problem
This whole saga began with what was supposed to be a relaxing side project to unwind with. After long hours on my custom chatbot project, Schmidt, I decided to build a straightforward expense-tracking app. The goal was simple yet ambitious:
- Automatically log expenses upon payment.
- Update an Apple Watch complication with the monthly budget percentage.
- Categorize purchases using an LLM for later analysis.
I envisioned using MiniMax M2.1 for the categorization, integrating it with Apple's new developer APIs for on-device LLM capabilities. The idea came from a robust feature I missed from my previous banking app, so I thought, "Why not build it better?"
Apple Intelligence: The Promise vs. The Reality
With the recent explosion in LLM innovation, Apple's push into on-device AI for developers seemed like a perfect fit. Integrating with their APIs was straightforward on paper: check for feature availability, then query the model. I quickly got the app registering purchases and moved on to the classification feature.
My first test: a purchase at "Kasai Kitchin." The result? "Unknown."
Logs revealed the issue: model support wasn't downloading. This wasn't entirely surprising; Apple's services can sometimes be finicky. After a frustrating four-hour wait and confirming many others faced similar download woes, I eventually coaxed the feature into what I thought was working order.
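For readers who want to see the shape of that check-then-query flow, here is a minimal sketch in Python. It does not use Apple's actual API: `OnDeviceModel`, `classify_purchase`, and the prompt wording are hypothetical names invented for illustration, and the "ready" model is a fake that always returns the same answer. The point is only to show why an unavailable model surfaces as "Unknown" to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class OnDeviceModel:
    """Hypothetical stand-in for an on-device model handle (not Apple's API)."""
    available: bool
    respond: Optional[Callable[[str], str]] = None


def classify_purchase(merchant: str, model: OnDeviceModel) -> str:
    """Return a category for the merchant, or "Unknown" if the model cannot be used."""
    # Step 1: check availability before querying (e.g. the model is not downloaded yet).
    if not model.available or model.respond is None:
        return "Unknown"
    # Step 2: query the model and tidy up its reply.
    answer = model.respond(f"Categorize this purchase in one word: {merchant}").strip()
    return answer or "Unknown"


# Model not ready: the caller only ever sees the fallback value.
not_ready = OnDeviceModel(available=False)
print(classify_purchase("Kasai Kitchin", not_ready))

# A fake model that always answers "Restaurant", for illustration only.
ready = OnDeviceModel(available=True, respond=lambda prompt: " Restaurant ")
print(classify_purchase("Kasai Kitchin", ready))
```

With a fallback like this, a missing model and a confused model look identical from the outside, which is why the logs were the only place the real cause showed up.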
When "Working" Meant "Broken"
Once the model finally seemed to be running on my iPhone 16 Pro Max, I tried the classification again. Still "unknown." What was going on?
I decided to simplify the task and just ask the LLM to add two numbers, expecting a straightforward "3" for "1+2". Instead, I got "30". For "100+200", I got "3000". It was consistently off by an order of magnitude when running a simple math operation.
This didn't look like an ordinary model error; it looked like a miscalculation much lower in the stack. Digging deeper into the tensor outputs, it became terrifyingly clear: the raw numerical values themselves were incorrect, consistently wrong by a factor of ten or one hundred. It was as if the phone were multiplying the results by 10 for every operation.
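One way to put a number on "off by an order of magnitude" is to take the base-10 logarithm of the ratio between what came back and what should have come back. The sketch below does that for the two additions quoted above. The helper name `magnitude_error` is my own invention for this post, and it assumes both values are non-zero and share a sign.

```python
import math


def magnitude_error(expected: float, observed: float) -> float:
    """Return log10(observed / expected): 0 means correct, 1 means ten times too large."""
    if expected == 0 or observed == 0 or (expected > 0) != (observed > 0):
        raise ValueError("ratio is undefined for zero or sign-flipped values")
    return math.log10(observed / expected)


# The two additions from the text: (prompt, correct sum, value that came back).
cases = [("1+2", 3, 30), ("100+200", 300, 3000)]
for prompt, expected, observed in cases:
    print(prompt, round(magnitude_error(expected, observed), 6))
```

Both cases report an error of exactly one order of magnitude. An error that constant is a hint that something systematic is scaling the numbers, rather than the model simply guessing badly.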
The Diagnosis: A $1000 Hardware Headache
To isolate the problem, I ran the exact same code on my iPhone 15 Pro and my MacBook Pro. Both devices executed the MLX LLM tasks perfectly, returning correct numerical outputs and accurate classifications. This stark contrast led to an inescapable conclusion: the fault lay squarely with my iPhone 16 Pro Max.
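The comparison itself can be sketched in a few lines of Python: dump the same intermediate tensor on a known-good device and on the suspect one, then flag every element that differs by more than a small tolerance. This is an illustrative sketch, not the code I ran. It assumes the tensors have already been flattened into plain lists of floats; `find_mismatches` and the tolerance values are hypothetical, and the sample numbers are made up rather than measured.

```python
import math
from typing import List, Sequence, Tuple


def find_mismatches(
    reference: Sequence[float],
    suspect: Sequence[float],
    rel_tol: float = 1e-3,
    abs_tol: float = 1e-6,
) -> List[Tuple[int, float, float]]:
    """Return (index, reference value, suspect value) for every element that disagrees."""
    if len(reference) != len(suspect):
        raise ValueError("tensors must have the same number of elements")
    return [
        (i, r, s)
        for i, (r, s) in enumerate(zip(reference, suspect))
        if not math.isclose(r, s, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# Made-up values for illustration; not measurements from the devices in this post.
reference_dump = [0.12, -1.50, 3.00]
healthy_dump = [0.12001, -1.5001, 3.0002]   # tiny floating-point drift only
faulty_dump = [1.2, -15.0, 30.0]            # every element ten times too large

print(find_mismatches(reference_dump, healthy_dump))
print(find_mismatches(reference_dump, faulty_dump))
```

A relative tolerance matters here because different chips legitimately disagree in the last few digits. Small drift should pass, while a tenfold jump should fail on every element.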
The consistent numerical errors in the tensor outputs strongly suggest a hardware anomaly within the iPhone 16 Pro Max's Neural Engine or another core system essential for machine learning operations. It's a shocking discovery for a brand-new, top-tier device touted for its AI prowess.
Lessons Learned (and Frustration Acknowledged)
Debugging this issue was a nightmare: I had to sift through low-level tensor data to find a numerical discrepancy that shouldn't exist. It highlights the often-unseen complexities (and potential pitfalls) of on-device machine learning and the critical importance of reliable underlying hardware.
So, the next time your thousand-dollar iPhone struggles with what seems like a simple task, remember: sometimes, even the most advanced tech can have a fundamental flaw lurking beneath the surface.