Crowdsourced AI benchmarks have serious flaws, some experts say

AI labs are increasingly relying on crowdsourced benchmarking platforms such as Chatbot Arena to probe the…

AI benchmarking platform Chatbot Arena forms a new company

Chatbot Arena, the crowdsourced benchmarking project major AI labs rely on to test and market their…

Debates over AI benchmarking have reached Pokémon

Not even Pokémon is safe from AI benchmarking controversy. Last week, a post on X went…

People are benchmarking AI by having it make balls bounce in rotating shapes

The list of informal, weird AI benchmarks keeps growing. Over the past few days, some in…

Even some of the best AI can’t beat this new benchmark

The nonprofit Center for AI Safety (CAIS) and Scale AI, a company that provides a number…

AI benchmarking organization criticized for waiting to disclose funding from OpenAI

An organization developing math benchmarks for AI didn’t disclose that it had received funding from OpenAI…

Will Smith eating spaghetti and other weird AI benchmarks that took off in 2024

When a company releases a new AI video generator, it’s not long before someone uses it…

For the Analyst: Peer Benchmarking Methods to Improve Earnings Forecasts

Finding suitable peers for financial analysis is a vexing task that requires careful consideration of firms’…