Run Benchmark

Math Word Problems

Select Model
Select the model to run this benchmark against.
Benchmarks can be run only by clients connecting directly from a local or private network IP address.
Benchmark Info

A benchmark that evaluates a model's ability to read math word problems, extract the relevant numbers, and compute the correct answer. Approximately one third of the questions contain distractor (unused) information.
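
To make the description concrete, here is a minimal sketch of what a distractor question and an exact-answer scorer might look like. Everything in it is an assumption made for illustration: the `WordProblem` class, the sample questions, and the `extract_numbers` and `accuracy` helpers are hypothetical and are not taken from the benchmark's actual data or scoring code.

```python
import re
from dataclasses import dataclass


@dataclass
class WordProblem:
    """Hypothetical benchmark item (not the benchmark's real schema)."""
    text: str
    answer: float
    has_distractor: bool


def extract_numbers(text: str) -> list[float]:
    """Return every number that appears in the text, relevant or not."""
    return [float(m) for m in re.findall(r"-?\d+(?:\.\d+)?", text)]


def accuracy(problems: list[WordProblem], answers: list[float], tol: float = 1e-6) -> float:
    """Fraction of answers that match the expected value within tol."""
    correct = sum(
        1 for p, a in zip(problems, answers) if abs(p.answer - a) <= tol
    )
    return correct / len(problems)


# Illustrative items only; one of the three contains unused information.
problems = [
    WordProblem("Ana has 3 apples and buys 4 more. How many apples does she have?", 7.0, False),
    WordProblem("Ben has 12 marbles and gives away 5. How many marbles are left?", 7.0, False),
    WordProblem(
        "A shop sells pens for 2 dollars each. Cara, who is 9 years old, "
        "buys 6 pens. How much does she pay?",
        12.0,
        True,  # Cara's age is irrelevant to the price
    ),
]

# The distractor question exposes three numbers, but only two are needed.
print(extract_numbers(problems[2].text))

# A model that multiplies every number it sees gets the distractor item wrong.
naive_answers = [7.0, 7.0, 2.0 * 9.0 * 6.0]
print(accuracy(problems, naive_answers))
```

In this sketch the third question is the only one a "use every number" strategy fails, which is the kind of error that distractor information is meant to surface.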