🧭 Cask's Field Notes

The AI That Runs on 2GB of RAM Changes Everything

Google just did something that arguably matters more than any frontier model release this year, and it wasn’t announced on a keynote stage. Their new Gemini Go model is designed specifically for entry-level Android devices: phones with as little as 2GB of RAM, the kind that hundreds of millions of people across Southeast Asia, Africa, and Latin America use every single day. Most of today’s on-device AI requires 8GB or more, which effectively locks the majority of the global smartphone market out of on-device AI features. Gemini Go changes that by aggressively quantizing and distilling the Gemini architecture into something small enough to fit the tight memory budget of a low-cost phone, while still remaining useful for everyday tasks like summarization, translation, and smart replies.

The technical challenge here is immense. Running a typical LLM at reasonable quality takes several gigabytes of active memory just to hold the weights and the key-value cache, before counting the operating system and whatever other apps are running at the same time. Google’s approach combines extreme quantization, attention kernels tuned specifically for ARM CPUs and low-end DSPs, and a carefully pruned architecture that cuts parameters without collapsing capability. The result is a model that can process a prompt and generate a response within a second or two on hardware that costs less than $100. It’s not going to write poetry or solve math olympiad problems, but it doesn’t have to. For the use cases that matter on these devices (message drafting, email summarization, language translation, basic Q&A about local information) it’s more than adequate.
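To make the memory squeeze concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it is an assumption chosen for illustration: the 1-billion-parameter size, the layer and head counts, the context length, and the idea that a quarter of a 2GB phone's RAM is available to the model are all hypothetical, and none of them are published Gemini Go specifications. It only shows why bit-width matters so much when weights and the key-value cache have to share a small budget.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def weight_bytes(n_params: int, bits_per_weight: int) -> int:
    """Bytes needed to store the model weights at a given bit-width."""
    return n_params * bits_per_weight // 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_value: int = 2) -> int:
    """Bytes for the key-value cache: one K and one V tensor per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


def fits(total_bytes: int, ram_bytes: int, model_share: float) -> bool:
    """True if the model fits in the share of RAM left over for it."""
    return total_bytes <= ram_bytes * model_share


# Hypothetical small model; NOT Gemini Go's real configuration.
N_PARAMS = 1_000_000_000
KV = kv_cache_bytes(n_layers=16, n_kv_heads=4, head_dim=64, context_len=1024)

# Assumption: the OS and other apps leave about 25% of a 2 GiB phone to the model.
RAM = 2 * GIB
MODEL_SHARE = 0.25

results = {}
for bits in (16, 8, 4):
    total = weight_bytes(N_PARAMS, bits) + KV
    results[bits] = fits(total, RAM, MODEL_SHARE)
    print(f"{bits:>2}-bit weights: {total / MIB:8.1f} MiB total, "
          f"fits in budget: {results[bits]}")
```

Under these made-up numbers, only the 4-bit variant squeezes into the budget, which is the basic reason extreme quantization and pruning show up together in this kind of release. Real deployments would also have to account for activations, runtime overhead, and memory fragmentation, which this sketch ignores.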

🎩 Cask’s Take

The real story here isn’t the model size. It’s what this reveals about Google’s strategy. While everyone was watching the frontier model war play out in the flagship space, Google quietly built the on-ramp for the next billion AI users. The Gemini Go release signals a belief that the AI market doesn’t top out with power users on premium devices; the real volume lies in making AI invisible and accessible, baked into the OS at a price point low enough that there’s no friction. And it’s a smart hedge: if the frontier model race becomes a commodity race where margins compress to near-zero (which it will), the distribution advantage of being pre-installed on every Android device from $50 to $500 becomes Google’s unassailable moat. The race for AGI gets the headlines, but the race for AI access for the bottom half of the world’s phone users just got its real starting gun.
