Cambridge, MA — September 25, 2025 — Liquid today announced a breakthrough in AI model training and customization that enables 350M–2.6B parameter foundation models (“Nanos”) to deliver GPT-4o-class performance on specialized agentic tasks—while running on phones, laptops, and embedded devices. In internal and partner evaluations, Liquid Nanos perform competitively with models up to hundreds of times larger. The result: planet-scale AI agents with cloud-free economics.
The cost and energy demands of serving large frontier models from data centers have been a major barrier to broad deployment. Liquid Foundation Models (LFMs), the efficient, lightweight, and multimodal large language models designed by Liquid, change that. Using advanced training schemes and a specialized form of fine-tuning, even extremely small LFMs, the Nanos, approach frontier-level reliability on the building blocks of agentic AI: precise data extraction, structured output generation, multilingual translation (10+ languages), retrieval-augmented generation (RAG), mathematical reasoning, and tool calling.
“Liquid Nanos provide the task-specific performance of large frontier models at zero marginal inference cost. They are the ideal solution for generative AI applications with strict KPIs on response latency, inference cost, and privacy,” said Mathias Lechner, Liquid AI CTO. “Our enterprise customers have successfully deployed Liquid Nanos in scenarios ranging from high-throughput cloud instances at massive scale to fully local execution on low-power embedded devices.”
Liquid has launched six task-specific Nanos in the initial release, with plans to expand to many more:
- LFM2-350M-ENJP-MT: A 350M parameter model for bidirectional English/Japanese translation. It surpasses the quality of generalist open-source models more than 10x its size and delivers translation quality competitive with GPT-4o, a model estimated to be more than 500x its size.
- LFM2-350M-Extract: A 350M parameter multilingual data extraction model designed for structured data extraction from unstructured input sources, such as extracting information from an invoice email into JSON (see the sketch after this list). It outperforms Gemma 3 4B at this task, a model more than 11x its size.
- LFM2-1.2B-Extract: A 1.2B parameter multilingual data extraction model. It can output complex objects in multiple languages at a quality exceeding Gemma 3 27B, a model 22.5x its size, and delivers results competitive with GPT-4o, estimated to be 160x larger. The model provides a significant boost in the validity, accuracy, and faithfulness of data extracted in structured format from a given document.
- LFM2-350M-Math: A 350M parameter reasoning model capable of solving mathematical problems.
- LFM2-1.2B-RAG: A 1.2B parameter model designed for answering questions based on long contexts for retrieval-augmented generation (RAG) use cases.
- LFM2-1.2B-Tool: A 1.2B parameter model designed for function-calling use cases, e.g., for agentic workflows.
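To make the extraction task concrete, the sketch below runs the 350M extraction Nano through Hugging Face transformers. The repo id LiquidAI/LFM2-350M-Extract, the prompt wording, and the invoice fields are illustrative assumptions rather than an official recipe:

```python
# Minimal extraction sketch; the repo id and prompt format are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2-350M-Extract"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Unstructured input: a snippet of a made-up invoice email.
email = (
    "Hi team, invoice INV-2047 from Acme GmbH is attached. "
    "Total due: EUR 1,250.00 by October 15, 2025."
)
messages = [{
    "role": "user",
    "content": "Extract the invoice as JSON with keys invoice_id, vendor, "
               "amount, currency, and due_date.\n\n" + email,
}]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, i.e., the JSON object.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because the model is small, a snippet like this can run without a GPU, which is what makes local, per-device deployment practical.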
At the core of this approach, Liquid combines proprietary software for automated evaluation, knowledge distillation, reinforcement learning, and model merging to iteratively improve each model's performance.
“Nanos flip the deployment model,” said Ramin Hasani, Liquid AI CEO. “Instead of shipping every token to a data center, we ship intelligence to the device. That unlocks speed, privacy, resilience, and a cost profile that finally scales to everyone.”
Nanos can lead to a step change in the economic and environmental impact of AI systems. People won’t rely on a single general assistant. Instead, they will use many small, task-specific agents that live across their devices (phone, laptop, car, watch, home hub), their apps (mail, calendar, documents, browser, shopping, travel, finance), and the services they interact with (each bank, telco, retailer, or healthcare provider often runs per-user agents for personalization, support, risk, and compliance). Add the ephemeral micro-agents spun up during workflows (RAG fetchers, extractors, translators, tool callers) and background security and automation agents, and you quickly reach ~100 agents per person.
For a population of 10M, that means ~1B agents across consumer, enterprise, and public-sector workloads. Serving those 1 billion cloud-hosted agents for 10 million people would cost an estimated ~$43.5B or more per year and consume ~16 TWh of energy, roughly the annual electricity use of 1.45 million U.S. homes. With Liquid’s on-device Nanos, delivering the same capability costs up to 50x less per year with a 100x lower energy footprint (assuming 99% of inference runs locally on users’ devices).
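As a back-of-the-envelope check, the per-agent figures implied by these estimates work out as follows; the sketch below simply restates the release’s own numbers and assumptions:

```python
# Implied per-agent figures from the cloud estimates above
# (restating the release's assumptions, not measured data).
agents = 100 * 10_000_000            # ~100 agents/person x 10M people = 1B agents
cloud_cost_usd = 43.5e9              # ~$43.5B per year, cloud-hosted
cloud_energy_kwh = 16e9              # ~16 TWh = 16 billion kWh per year

print(cloud_cost_usd / agents)       # ~$43.50 per agent per year in the cloud
print(cloud_energy_kwh / agents)     # ~16 kWh per agent per year in the cloud

# With up to 50x lower cost and ~100x lower energy on-device:
print(cloud_cost_usd / 50 / agents)      # ~$0.87 per agent per year
print(cloud_energy_kwh / 100 / agents)   # ~0.16 kWh per agent per year
```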
“I find it very impressive that Liquid’s novel pre-training and post-training technique enables their fast and small LLMs to perform on par with frontier models such as GPT-4o, which is orders of magnitude larger, on specialized tasks,” said Mikhail Parakhin, Shopify CTO. “Liquid is simultaneously raising the bar for both performance and speed in foundation models, pushing beyond the state of the art. That is why we are excited to utilize their models across Shopify’s platforms and services.”
Liquid’s task-specific Nanos are designed for the reality of most AI deployments. Their advantages include:
- Purpose-built efficiency: Most AI use cases are task-specific, making smaller, fine-tuned models a natural fit.
- Composable systems: Multiple Nanos can be combined to cover more complex use cases while remaining more efficient than a single 100B+ parameter, general-purpose model (see the sketch after this list).
- On-device deployment: Running models locally removes cloud dependencies, reducing costs and latency while keeping data private and secure.
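To illustrate composability, the hypothetical sketch below chains two Nanos, one translating a Japanese invoice email and one extracting structured fields from the translation; the repo ids and prompts are assumptions:

```python
# Hypothetical composition of two Nanos via Hugging Face pipelines;
# repo ids and prompt wording are assumptions for illustration.
from transformers import pipeline

translate = pipeline("text-generation", model="LiquidAI/LFM2-350M-ENJP-MT")
extract = pipeline("text-generation", model="LiquidAI/LFM2-350M-Extract")

japanese_email = "請求書番号 INV-2047、株式会社Acme、合計 125,000円、支払期限 2025年10月15日。"

# Step 1: translate the Japanese email into English.
english = translate(
    [{"role": "user", "content": japanese_email}],
    max_new_tokens=256,
)[0]["generated_text"][-1]["content"]

# Step 2: extract structured fields from the translation.
result = extract(
    [{"role": "user", "content": "Extract vendor, amount, and due_date as JSON:\n" + english}],
    max_new_tokens=128,
)[0]["generated_text"][-1]["content"]
print(result)
```

Each stage stays small enough to run locally, which is the efficiency argument the list above makes.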
“Deloitte is excited about the opportunity to collaborate with Liquid AI and their new Nanos model, which has the potential to drive performance comparable to larger models at a lower cost,” said Ranjit Bawa, Chief Strategy and Technology Officer, Deloitte.
Beyond on-device inference, Nanos can also scale across GPUs (on-prem or in the cloud) to power big-data workflows. Their compact size delivers exceptional GPU throughput, enabling large-scale document ingestion and structured data extraction, ETL-style pipelines, or log parsing. For industries such as finance and e-commerce, this provides both the assurance of internal execution for sensitive workloads and the flexibility to leverage GPUs when throughput and scale are paramount. Liquid’s new model development pipeline can also scale Nanos systematically to 1,000x their size, aligning with new hardware architectures to unlock vastly more powerful systems with attractive, predictable economics.
“Liquid’s Nanos represent a powerful inflection point for AI PCs, delivering frontier-level performance in a compact, energy-efficient form. At AMD, we share this focus on performance-per-watt leadership and see on-device intelligence as key to scaling AI broadly and sustainably,” said Mark Papermaster, Chief Technology Officer and Executive Vice President at AMD.
Liquid Nanos are now available on the Liquid Edge AI platform, LEAP, for download and integration on iOS and Android phones and on laptops, and can be tested on Apollo. The models are also available on Hugging Face and are broadly accessible under an open license for academics, developers, and small businesses, so developers can use them directly out of the box. Liquid is already working with multiple Fortune 500 enterprises to deploy customized versions of task-specific Nanos in industries such as consumer electronics, automotive, e-commerce, and finance.
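As one out-of-the-box example, the function-calling Nano could be driven through transformers’ tool-use chat template support, assuming the repo id LiquidAI/LFM2-1.2B-Tool and a chat template that accepts tool schemas; the weather tool below is hypothetical:

```python
# Hedged function-calling sketch; the repo id and tool are hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2-1.2B-Tool"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

def get_weather(city: str) -> str:
    """
    Get the current weather for a city.

    Args:
        city: The name of the city.
    """
    return f"Sunny, 22 C in {city}"  # stub; a real app would call a weather API

messages = [{"role": "user", "content": "What's the weather in Boston right now?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],             # transformers converts the function to a JSON schema
    add_generation_prompt=True,
    return_tensors="pt",
)
output = model.generate(inputs, max_new_tokens=128)
# The model should emit a structured tool call for the host application to execute.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

In the standard tool-use loop, the host application parses the emitted call, runs the tool, appends the result as a tool message, and asks the model to continue.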
About Liquid AI
Liquid AI is a foundation model company spun out of MIT, focused on building highly efficient AI systems designed for real-world environments. Liquid’s proprietary Liquid Foundation Models (LFMs) and its LEAP development platform enable seamless deployment of advanced AI directly on devices where latency, privacy, and performance matter most. Learn more at liquid.ai.
Caroline Hoogland