Baidu ERNIE multimodal AI beats GPT and Gemini in benchmarks

Baidu’s latest ERNIE model, a super-efficient multimodal AI, is beating GPT and Gemini on key benchmarks and targets enterprise data often ignored by text-focused models. For many businesses, valuable insights are locked in engineering schematics, factory-floor video feeds, medical scans, and logistics dashboards. Baidu’s new model, ERNIE-4.5-VL-28B-A3B-Thinking, is designed to fill this gap. What’s interesting…

Read More

Flawed AI benchmarks put enterprise budgets at risk

A new academic review suggests AI benchmarks are flawed, potentially leading an enterprise to make high-stakes decisions on “misleading” data. Enterprise leaders are committing budgets of eight or nine figures to generative AI programmes. These procurement and development decisions often rely on public leaderboards and benchmarks to compare model capabilities. A large-scale study, ‘Measuring what…

Read More

Samsung benchmarks real productivity of enterprise AI models

Samsung is overcoming limitations of existing benchmarks to better assess the real-world productivity of AI models in enterprise settings. The new system, developed by Samsung Research and named TRUEBench, aims to address the growing disparity between theoretical AI performance and its actual utility in the workplace. As businesses worldwide accelerate their adoption of large language…

Read More