THE CHALLENGE
A global automaker wanted to bring real-time voice and vision AI to its vehicles, but off-the-shelf models were too slow for the mid-tier CPUs already in its cars. Despite months of effort with llama.cpp, slow inference and hardware limitations blocked deployment.
Key Obstacles:
- Performance bottlenecks: even small vision-language models (VLMs) ran too slowly on the existing hardware
- Integration hurdles: time-to-first-token (TTFT) was unacceptable for in-car UX (a way to measure TTFT is sketched below)
- Resource constraints: the existing compute could not support efficient AI inference without costly upgrades
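TTFT is the wall-clock delay between submitting a prompt and receiving the first generated token. Here is a minimal sketch of measuring it with the llama-cpp-python bindings for the llama.cpp stack mentioned above; the model file, prompt, and thread count are illustrative placeholders, not the automaker's actual configuration:

```python
import time
from llama_cpp import Llama  # pip install llama-cpp-python

# Placeholder GGUF model; any local quantized model works for this measurement.
llm = Llama(model_path="model-q8_0.gguf", n_ctx=2048, n_threads=4, verbose=False)

prompt = "Describe the road conditions ahead."
start = time.perf_counter()

# Stream the completion so the first token can be timestamped as it arrives.
for chunk in llm(prompt, max_tokens=32, stream=True):
    print(f"TTFT: {time.perf_counter() - start:.3f}s")
    break
```

On a CPU, TTFT is largely governed by prompt processing (prefill), which is why it is so sensitive to model size and memory bandwidth.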
OUR SOLUTION
Liquid AI delivered a hardware-optimized VLM with 10x faster time-to-first-token on the automaker's existing CPUs. Using our Edge SDK, we reduced model size by 50% without sacrificing accuracy, and deployed a production-ready solution in one week.
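A 50% reduction is consistent with halving the bytes stored per weight, for example moving from 16-bit to 8-bit weights. A back-of-envelope sketch, assuming a hypothetical 2B-parameter model (the actual model size and the Edge SDK's compression technique are not disclosed here):

```python
# Rough model footprint at different weight precisions,
# for a hypothetical 2B-parameter model (illustrative only).
params = 2_000_000_000

for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    gib = params * bits / 8 / 2**30
    print(f"{name}: {gib:.2f} GiB")

# FP16 -> INT8 halves the bytes per weight, matching a ~50% size reduction.
```

Smaller weights also mean less memory traffic per token, which is one reason model-size reduction and TTFT improvement tend to move together on CPU-bound deployments.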
THE RESULTS
The automaker achieved real-time AI interactions directly in vehicles, with no hardware upgrades needed.
- 10x faster time-to-first-token than the llama.cpp baseline
- 50% smaller model size (no performance loss)
- Deployment slashed from months to 1 week
- Enabled real-time voice/vision AI on existing hardware