MarkTechPost

Baidu Unveils Compact ERNIE-4.5-VL-28B-A3B-Thinking Model


Baidu’s latest open‑source contribution, ERNIE‑4.5‑VL‑28B‑A3B‑Thinking, represents a significant step forward in compact multimodal reasoning. The model is specifically tuned for understanding complex inputs such as documents, charts, and videos—a domain traditionally dominated by much larger, resource‑intensive models. With a 28‑billion‑parameter backbone that activates only about 3 billion parameters per token, Baidu demonstrates that high‑performance multimodal inference can be achieved without the enormous compute footprints of its predecessors.

What sets ERNIE‑4.5‑VL‑28B‑A3B‑Thinking apart is its “active parameter” design. Rather than running the entire 28‑billion‑parameter backbone for every token, the model routes each token through only a small subset of parameters during inference, dramatically reducing per‑token compute and inference latency. This efficiency allows the model to be deployed in production scenarios—such as real‑time document classification or interactive video summarization—where computational resources are at a premium. Additionally, the open‑source release includes comprehensive training scripts and evaluation datasets, enabling the research community to fine‑tune the model for niche applications.
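To make the active‑parameter idea concrete, here is a minimal Python sketch of sparse mixture‑of‑experts routing, the general technique behind such designs. All names, shapes, and the top‑k routing scheme are illustrative assumptions for exposition—this is not Baidu’s implementation.

```python
import numpy as np

def moe_forward(x, experts, router, top_k=2):
    """Sparse MoE layer sketch: run only the top_k highest-scoring experts.

    x:       input vector of dimension d
    experts: list of (d, d) weight matrices (hypothetical dense experts)
    router:  (n_experts, d) matrix scoring each expert for this token
    """
    logits = router @ x
    chosen = np.argsort(logits)[-top_k:]          # indices of selected experts
    gates = np.exp(logits[chosen] - logits[chosen].max())
    gates /= gates.sum()                          # softmax over selected experts only
    # Only the chosen experts compute; the rest contribute nothing this token.
    return sum(g * (experts[i] @ x) for g, i in zip(gates, chosen))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
router = rng.standard_normal((n_experts, d))
y = moe_forward(rng.standard_normal(d), experts, router, top_k=2)
```

With 16 experts and top‑2 routing, only 1/8 of the expert parameters do work per token, which is the source of the compute savings the article describes (full‑scale models apply the same principle at the billions‑of‑parameters level).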

Baidu’s strategy reflects a broader trend in AI toward “model distillation” and parameter efficiency. By providing the ERNIE‑4.5 family with a compact yet powerful multimodal tool, the company is fostering innovation across industries that rely on document analytics, data visualization, and multimedia content. Developers can now experiment with state‑of‑the‑art reasoning capabilities in a cost‑effective manner, accelerating the adoption of multimodal AI in sectors ranging from finance to education.

Key takeaway: Baidu’s ERNIE‑4.5‑VL‑28B‑A3B‑Thinking shows that compact, open‑source multimodal models can match large‑model performance while enabling practical production deployment.
