Run Benchmark
Math Word Problems
This benchmark is in Tier 1 (Screening).
Benchmark Info
Evaluates a model's ability to read math word problems, extract the relevant numbers, and compute the correct answer. Approximately one third of the questions contain distractor (unused) information.
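To make the description concrete, here is a minimal sketch of what an item with a distractor and a simple accuracy score might look like. Everything in it is an assumption for illustration: the `WordProblem` fields, the sample questions, the numeric tolerance, and the `score` and `distractor_share` helpers are hypothetical and do not come from the benchmark itself, whose actual data format and scoring rule are not stated here.

```python
from dataclasses import dataclass


@dataclass
class WordProblem:
    """Hypothetical item layout; the real benchmark schema is not documented here."""
    text: str
    answer: float
    has_distractor: bool  # True if the text contains a number that is not needed


# Illustrative items only (not taken from the benchmark).
PROBLEMS = [
    WordProblem("Ana has 3 apples and buys 4 more. How many apples does she have?", 7, False),
    WordProblem("A train travels 60 km per hour for 2 hours. How far does it travel?", 120, False),
    # Ben's age (12) is unused information: a distractor.
    WordProblem("Ben is 12 years old and has 5 marbles. He gives away 2. How many are left?", 3, True),
]


def is_correct(predicted: float, expected: float, tol: float = 1e-6) -> bool:
    """Assumed scoring rule: numeric match within a small tolerance."""
    return abs(predicted - expected) <= tol


def score(predictions: list[float], problems: list[WordProblem]) -> float:
    """Fraction of problems answered correctly."""
    correct = sum(is_correct(p, q.answer) for p, q in zip(predictions, problems))
    return correct / len(problems)


def distractor_share(problems: list[WordProblem]) -> float:
    """Fraction of problems that contain unused information."""
    return sum(q.has_distractor for q in problems) / len(problems)
```

In this sketch, a model that answers the third question with 10 (subtracting from Ben's age instead of his marble count) would be marked wrong on that item, which is the kind of error distractor questions are meant to surface.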