We’re excited to release LFM2-VL-3B, the newest and most capable addition to our family of vision LFMs (450M and 1.6B). Built on the LFM2-2.6B backbone, this 3B parameter model targets applications that require higher accuracy while preserving the speed advantage of the LFM2 architecture. It is available today on LEAP and Hugging Face.

Flexible Architecture

LFM2-VL-3B follows the recipe adopted for our previous VLMs. It builds on our most powerful dense model, LFM2-2.6B, and integrates a SigLIP2 400M NaFlex encoder, enabling image processing at native resolutions with variable aspect ratios. Its flexible architecture lets developers balance accuracy and speed by adjusting the number of vision tokens per image, offering finer control for deployment, especially in edge environments.

You can find more information about the architecture in our LFM2-VL blog post.
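If you want to try the vision-token trade-off yourself, the sketch below shows how it might look with the Hugging Face transformers integration used by the earlier LFM2-VL checkpoints. The repository name `LiquidAI/LFM2-VL-3B`, the `max_image_tokens` processor option, and the sample image URL are assumptions for illustration; check the model card on Hugging Face for the exact names.

```python
# Minimal inference sketch, assuming the 3B checkpoint follows the same
# transformers integration as earlier LFM2-VL releases.
from transformers import AutoModelForImageTextToText, AutoProcessor
from transformers.image_utils import load_image

model_id = "LiquidAI/LFM2-VL-3B"  # assumed repository name
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    device_map="auto",       # requires accelerate
    torch_dtype="bfloat16",
    trust_remote_code=True,
)
processor = AutoProcessor.from_pretrained(
    model_id,
    trust_remote_code=True,
    max_image_tokens=256,  # assumed knob: caps vision tokens per image to trade accuracy for speed
)

image = load_image("https://example.com/sample.jpg")  # placeholder image URL
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Describe this image."},
        ],
    },
]
inputs = processor.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(outputs, skip_special_tokens=True)[0])
```

Lowering the vision-token budget shrinks the prefill cost per image, which is where most of the latency savings on edge hardware would come from.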

Broader Capabilities

LFM2-VL-3B delivers competitive results across open-source evaluations, achieving an impressive 51.8% on MM-IFEval and 71.4% on RealWorldQA. The model shows strong performance in single- and multi-image comprehension and English OCR, with low hallucination rates on the POPE benchmark.

Its language-only benchmark scores remain comparable to those of its LFM2-2.6B backbone, at 30% on GPQA and 63% on MMLU. In addition, we have significantly expanded multilingual capabilities, extending visual understanding beyond English to Japanese, French, Spanish, German, Italian, Portuguese, Arabic, Chinese, and Korean.

| Model | Average | MMStar | MMMU (val) | MathVista | BLINK | InfoVQA (val) | MMBench (dev en) | OCRBench | POPE | RealWorldQA | MME | MM-IFEval | SEEDBench |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| InternVL3_5-2B | 66.63 | 57.67 | 51.78 | 61.60 | 50.97 | 69.29 | 78.18 | 834.00 | 87.17 | 60.78 | 2,128.83 | 47.31 | 75.41 |
| Qwen2.5-VL-3B | 66.61 | 56.13 | 51.67 | 62.50 | 48.97 | 76.12 | 80.41 | 824.00 | 86.17 | 65.23 | 2,163.29 | 38.62 | 73.88 |
| InternVL3-2B | 66.46 | 61.10 | 48.70 | 57.60 | 53.10 | 66.10 | 81.10 | 831.00 | 90.10 | 65.10 | 2,186.40 | 38.49 | 74.95 |
| SmolVLM2-2.2B | 54.85 | 46.00 | 41.60 | 51.50 | 42.30 | 37.75 | 69.24 | 725.00 | 85.10 | 57.50 | 1,792.50 | 19.42 | 71.30 |
| LFM2-VL-3B | 67.31 | 57.73 | 45.33 | 62.20 | 51.03 | 67.37 | 79.81 | 822.00 | 89.01 | 71.37 | 2,050.90 | 51.83 | 76.55 |
We calculated the scores for all models using VLMEvalKit. Qwen3-VL-2B is not included in this table because it was released only the day before this post.

Open and Available

LFM2-VL-3B is now available on Hugging Face under our LFM Open License, and through our LEAP platform, making cutting-edge efficient AI accessible to developers and researchers worldwide. 

The LFM2 series continues to push the boundaries of efficient AI. We're proving that with the right architecture and approach, smaller models can deliver enterprise-grade performance without the computational overhead. In the future, we will continue to scale our foundation models to bring this level of efficiency to more devices and unlock new applications.
