Math Word Problems
A benchmark that evaluates a model's ability to read math word problems, extract the relevant numbers, and compute the correct answer. Approximately one third of questions are described as containing distractor (unused) information; in the 40 questions listed here, 10 carry the "distractor" tag.
Questions: 40
Total Runs: 18
Best Score: 100/100
Leaderboard
| Rank | Model | Score | Run Date |
|---|---|---|---|
| 1 | Gemma 2 9B (LMStudio), 5800 MB | 100 | 2026-02-28 15:01:01 |
| 2 | Gemma 3 12B (LMStudio), 8100 MB | 100 | 2026-02-28 15:53:28 |
| 3 | Llama 3 8B (LMStudio), 4900 MB | 100 | 2026-02-28 16:46:27 |
| 4 | Ministral 8B (LMStudio), 4900 MB | 100 | 2026-02-28 17:23:11 |
| 5 | OLMo 3 7B (LMStudio), 4300 MB | 100 | 2026-02-28 17:37:15 |
| 6 | Qwen3 VL 8B (LMStudio), 5000 MB | 100 | 2026-02-28 18:41:02 |
| 7 | GPT-5 mini | 97 | 2026-02-26 01:18:54 |
| 8 | GPT-5 nano | 95 | 2026-02-26 00:51:00 |
| 9 | Qwen3 1.7B (LMStudio), 1100 MB | 92 | 2026-02-28 18:05:22 |
| 10 | Qwen3 4B (LMStudio), 2800 MB | 92 | 2026-02-28 18:13:31 |
| 11 | Granite 3.2 8B (LMStudio), 4900 MB | 90 | 2026-02-28 16:18:48 |
| 12 | Gemma 2 2B (LMStudio), 1500 MB | 75 | 2026-02-28 03:22:58 |
| 13 | Llama 3.2 1B (LMStudio), 1300 MB | 75 | 2026-02-28 17:13:05 |
| 14 | Llama 3.1 8B (LMStudio), 4900 MB | 70 | 2026-02-28 17:01:52 |
| 15 | Llama 2 7B (LMStudio), 4900 MB | 67 | 2026-02-28 03:05:27 |
| 16 | Gemma 2B (LMStudio), 1500 MB | 67 | 2026-02-28 15:35:02 |
| 17 | SmolLM2 1.7B (LMStudio), 1100 MB | 62 | 2026-02-28 19:19:22 |
| 18 | Phi-3.5 Mini (LMStudio), 2500 MB | 20 | 2026-02-28 17:49:34 |
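Rows with equal scores in the leaderboard appear in order of run date, earliest first. That is an observation from the listed rows, not a documented rule. A minimal sketch of a sort that reproduces this ordering, using three rows from the leaderboard (the tuple layout and the `rank_key` helper are hypothetical):

```python
from datetime import datetime

# (model, score, run date) tuples copied from the leaderboard.
runs = [
    ("Qwen3 4B (LMStudio)", 92, "2026-02-28 18:13:31"),
    ("GPT-5 nano", 95, "2026-02-26 00:51:00"),
    ("Qwen3 1.7B (LMStudio)", 92, "2026-02-28 18:05:22"),
]

def rank_key(run):
    """Sort by score descending, then by run date ascending (assumed tie-break)."""
    _, score, run_date = run
    return (-score, datetime.strptime(run_date, "%Y-%m-%d %H:%M:%S"))

ranked = sorted(runs, key=rank_key)
for rank, (model, score, _) in enumerate(ranked, start=1):
    print(rank, model, score)
```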
Questions
Question
There are 7 rows of stickers with 8 stickers in each row. How many stickers are there in total?
Question payload
{
"question_text": "There are 7 rows of stickers with 8 stickers in each row. How many stickers are there in total?",
"answer_type": "numeric",
"correct_answer": 56.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"multiplication"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
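The page does not say how "exact_match" and "tolerance" interact when a numeric answer is graded. One plausible reading is that a response passes when the number it gives is within "tolerance" of "correct_answer". The sketch below implements that reading; `extract_number` and `is_correct` are hypothetical helpers, and taking the last number in the response as the answer is an assumption.

```python
import math
import re

def extract_number(response: str):
    """Return the last number found in a response, or None (assumed convention)."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", response.replace(",", ""))
    return float(matches[-1]) if matches else None

def is_correct(response: str, payload: dict) -> bool:
    """Assumed rule: pass if the answer is within `tolerance` of `correct_answer`."""
    value = extract_number(response)
    if value is None:
        return False
    tolerance = payload["evaluation_criteria"]["tolerance"]
    return math.isclose(value, payload["correct_answer"],
                        rel_tol=0.0, abs_tol=tolerance)

# Fields taken from the payload above; the other fields are omitted for brevity.
payload = {
    "correct_answer": 56.0,
    "evaluation_criteria": {"exact_match": True, "tolerance": 0.001},
}
print(is_correct("7 x 8 = 56 stickers", payload))     # True
print(is_correct("There are 54 stickers.", payload))  # False
```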
Question
There are 7 rows of cookies with 12 cookies in each row. How many cookies are there in total?
Question payload
{
"question_text": "There are 7 rows of cookies with 12 cookies in each row. How many cookies are there in total?",
"answer_type": "numeric",
"correct_answer": 84.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"multiplication"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
Emma has 13 cards and David has 12 cards. How many cards do they have together?
Question payload
{
"question_text": "Emma has 13 cards and David has 12 cards. How many cards do they have together?",
"answer_type": "numeric",
"correct_answer": 25.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"addition"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
George had 41 dollars. After spending 23 dollars, how much money does George have left?
Question payload
{
"question_text": "George had 41 dollars. After spending 23 dollars, how much money does George have left?",
"answer_type": "numeric",
"correct_answer": 18.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
A car travels at 55 km/h. How far does it travel in 8 hours?
Question payload
{
"question_text": "A car travels at 55 km/h. How far does it travel in 8 hours?",
"answer_type": "numeric",
"correct_answer": 440.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"multiplication"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
A car travels at 60 km/h. How far does it travel in 3 hours?
Question payload
{
"question_text": "A car travels at 60 km/h. How far does it travel in 3 hours?",
"answer_type": "numeric",
"correct_answer": 180.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"multiplication"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
Carlos had 51 dollars. After spending 50 dollars, how much money does Carlos have left?
Question payload
{
"question_text": "Carlos had 51 dollars. After spending 50 dollars, how much money does Carlos have left?",
"answer_type": "numeric",
"correct_answer": 1.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
Carlos baked 21 cookies and wants to give an equal number to each of 7 friends. How many cookies does each friend get?
Question payload
{
"question_text": "Carlos baked 21 cookies and wants to give an equal number to each of 7 friends. How many cookies does each friend get?",
"answer_type": "numeric",
"correct_answer": 3.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"division"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
Fatima baked 72 cookies and wants to give an equal number to each of 8 friends. How many cookies does each friend get?
Question payload
{
"question_text": "Fatima baked 72 cookies and wants to give an equal number to each of 8 friends. How many cookies does each friend get?",
"answer_type": "numeric",
"correct_answer": 9.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"division"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
A store had 61 apples. They sold 14 apples. How many apples are left?
Question payload
{
"question_text": "A store had 61 apples. They sold 14 apples. How many apples are left?",
"answer_type": "numeric",
"correct_answer": 47.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
Hannah had 44 dollars. After spending 21 dollars, how much money does Hannah have left?
Question payload
{
"question_text": "Hannah had 44 dollars. After spending 21 dollars, how much money does Hannah have left?",
"answer_type": "numeric",
"correct_answer": 23.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
There are 4 rows of apples with 7 apples in each row. How many apples are there in total?
Question payload
{
"question_text": "There are 4 rows of apples with 7 apples in each row. How many apples are there in total?",
"answer_type": "numeric",
"correct_answer": 28.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"multiplication"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
Carlos had 79 dollars. After spending 59 dollars, how much money does Carlos have left?
Question payload
{
"question_text": "Carlos had 79 dollars. After spending 59 dollars, how much money does Carlos have left?",
"answer_type": "numeric",
"correct_answer": 20.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
A store has 28 apples, 28 oranges, and 21 bananas. A customer buys 20 oranges. How many oranges does the store have left?
Question payload
{
"question_text": "A store has 28 apples, 28 oranges, and 21 bananas. A customer buys 20 oranges. How many oranges does the store have left?",
"answer_type": "numeric",
"correct_answer": 8.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction",
"distractor"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
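This is the first question tagged "distractor": the apple and banana counts are never used. A short illustration of the intended computation next to one plausible distracted answer (the variable names are hypothetical, and the "distracted" calculation is only an example of a mistake, not something the benchmark reports):

```python
# Quantities stated in the question; only the oranges matter.
stock = {"apples": 28, "oranges": 28, "bananas": 21}
bought = {"oranges": 20}

def remaining(item: str) -> int:
    """Stock left for one item after the purchase."""
    return stock[item] - bought.get(item, 0)

correct = remaining("oranges")         # 28 - 20
distracted = sum(stock.values()) - 20  # wrongly uses every fruit count
print(correct, distracted)  # 8 57
```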
Question
A store has 27 apples, 44 oranges, and 27 bananas. A customer buys 5 oranges. How many oranges does the store have left?
Question payload
{
"question_text": "A store has 27 apples, 44 oranges, and 27 bananas. A customer buys 5 oranges. How many oranges does the store have left?",
"answer_type": "numeric",
"correct_answer": 39.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction",
"distractor"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
A store had 74 books. They sold 32 books. How many books are left?
Question payload
{
"question_text": "A store had 74 books. They sold 32 books. How many books are left?",
"answer_type": "numeric",
"correct_answer": 42.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
A store has 14 apples, 24 oranges, and 30 bananas. A customer buys 16 oranges. How many oranges does the store have left?
Question payload
{
"question_text": "A store has 14 apples, 24 oranges, and 30 bananas. A customer buys 16 oranges. How many oranges does the store have left?",
"answer_type": "numeric",
"correct_answer": 8.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction",
"distractor"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
A train travels 40 km in the first hour, 85 km in the second hour, and 59 km in the third hour. How far does it travel in the first two hours?
Question payload
{
"question_text": "A train travels 40 km in the first hour, 85 km in the second hour, and 59 km in the third hour. How far does it travel in the first two hours?",
"answer_type": "numeric",
"correct_answer": 125.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"addition",
"distractor"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
A train travels 63 km in the first hour, 62 km in the second hour, and 110 km in the third hour. How far does it travel in the first two hours?
Question payload
{
"question_text": "A train travels 63 km in the first hour, 62 km in the second hour, and 110 km in the third hour. How far does it travel in the first two hours?",
"answer_type": "numeric",
"correct_answer": 125.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"addition",
"distractor"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
There are 5 rows of stickers with 3 stickers in each row. How many stickers are there in total?
Question payload
{
"question_text": "There are 5 rows of stickers with 3 stickers in each row. How many stickers are there in total?",
"answer_type": "numeric",
"correct_answer": 15.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"multiplication"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
Bob has 22 marbles and George has 22 marbles. How many marbles do they have together?
Question payload
{
"question_text": "Bob has 22 marbles and George has 22 marbles. How many marbles do they have together?",
"answer_type": "numeric",
"correct_answer": 44.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"addition"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
A store has 15 apples, 22 oranges, and 27 bananas. A customer buys 4 oranges. How many oranges does the store have left?
Question payload
{
"question_text": "A store has 15 apples, 22 oranges, and 27 bananas. A customer buys 4 oranges. How many oranges does the store have left?",
"answer_type": "numeric",
"correct_answer": 18.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction",
"distractor"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
A car travels at 43 km/h. How far does it travel in 2 hours?
Question payload
{
"question_text": "A car travels at 43 km/h. How far does it travel in 2 hours?",
"answer_type": "numeric",
"correct_answer": 86.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"multiplication"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
Julia had 40 dollars. After spending 19 dollars, how much money does Julia have left?
Question payload
{
"question_text": "Julia had 40 dollars. After spending 19 dollars, how much money does Julia have left?",
"answer_type": "numeric",
"correct_answer": 21.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
There are 10 rows of cookies with 2 cookies in each row. How many cookies are there in total?
Question payload
{
"question_text": "There are 10 rows of cookies with 2 cookies in each row. How many cookies are there in total?",
"answer_type": "numeric",
"correct_answer": 20.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"multiplication"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
Bob baked 24 cookies and wants to give an equal number to each of 2 friends. How many cookies does each friend get?
Question payload
{
"question_text": "Bob baked 24 cookies and wants to give an equal number to each of 2 friends. How many cookies does each friend get?",
"answer_type": "numeric",
"correct_answer": 12.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"division"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
Bob had 80 dollars. After spending 58 dollars, how much money does Bob have left?
Question payload
{
"question_text": "Bob had 80 dollars. After spending 58 dollars, how much money does Bob have left?",
"answer_type": "numeric",
"correct_answer": 22.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
A train travels 81 km in the first hour, 90 km in the second hour, and 72 km in the third hour. How far does it travel in the first two hours?
Question payload
{
"question_text": "A train travels 81 km in the first hour, 90 km in the second hour, and 72 km in the third hour. How far does it travel in the first two hours?",
"answer_type": "numeric",
"correct_answer": 171.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"addition",
"distractor"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
A train travels 101 km in the first hour, 68 km in the second hour, and 42 km in the third hour. How far does it travel in the first two hours?
Question payload
{
"question_text": "A train travels 101 km in the first hour, 68 km in the second hour, and 42 km in the third hour. How far does it travel in the first two hours?",
"answer_type": "numeric",
"correct_answer": 169.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"addition",
"distractor"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
A store had 77 cards. They sold 44 cards. How many cards are left?
Question payload
{
"question_text": "A store had 77 cards. They sold 44 cards. How many cards are left?",
"answer_type": "numeric",
"correct_answer": 33.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
George has 9 marbles, Alice has 14 marbles, and Julia has 28 marbles. How many marbles do Alice and Julia have together?
Question payload
{
"question_text": "George has 9 marbles, Alice has 14 marbles, and Julia has 28 marbles. How many marbles do Alice and Julia have together?",
"answer_type": "numeric",
"correct_answer": 42.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"addition",
"distractor"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
Julia had 44 dollars. After spending 13 dollars, how much money does Julia have left?
Question payload
{
"question_text": "Julia had 44 dollars. After spending 13 dollars, how much money does Julia have left?",
"answer_type": "numeric",
"correct_answer": 31.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
A store had 20 cookies. They sold 7 cookies. How many cookies are left?
Question payload
{
"question_text": "A store had 20 cookies. They sold 7 cookies. How many cookies are left?",
"answer_type": "numeric",
"correct_answer": 13.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
A store has 44 apples, 35 oranges, and 10 bananas. A customer buys 28 oranges. How many oranges does the store have left?
Question payload
{
"question_text": "A store has 44 apples, 35 oranges, and 10 bananas. A customer buys 28 oranges. How many oranges does the store have left?",
"answer_type": "numeric",
"correct_answer": 7.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction",
"distractor"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
There are 6 rows of books with 12 books in each row. How many books are there in total?
Question payload
{
"question_text": "There are 6 rows of books with 12 books in each row. How many books are there in total?",
"answer_type": "numeric",
"correct_answer": 72.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"multiplication"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
Hannah had 30 dollars. After spending 21 dollars, how much money does Hannah have left?
Question payload
{
"question_text": "Hannah had 30 dollars. After spending 21 dollars, how much money does Hannah have left?",
"answer_type": "numeric",
"correct_answer": 9.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
There are 5 rows of apples with 4 apples in each row. How many apples are there in total?
Question payload
{
"question_text": "There are 5 rows of apples with 4 apples in each row. How many apples are there in total?",
"answer_type": "numeric",
"correct_answer": 20.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"multiplication"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
A car travels at 79 km/h. How far does it travel in 8 hours?
Question payload
{
"question_text": "A car travels at 79 km/h. How far does it travel in 8 hours?",
"answer_type": "numeric",
"correct_answer": 632.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"multiplication"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
Fatima had 65 dollars. After spending 19 dollars, how much money does Fatima have left?
Question payload
{
"question_text": "Fatima had 65 dollars. After spending 19 dollars, how much money does Fatima have left?",
"answer_type": "numeric",
"correct_answer": 46.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"subtraction"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
Question
David has 18 apples and Ivan has 7 apples. How many apples do they have together?
Question payload
{
"question_text": "David has 18 apples and Ivan has 7 apples. How many apples do they have together?",
"answer_type": "numeric",
"correct_answer": 25.0,
"category": "word_problems",
"difficulty": "medium",
"tags": [
"word_problem",
"addition"
],
"evaluation_criteria": {
"exact_match": true,
"case_sensitive": false,
"contains": false,
"required_fields": [],
"tolerance": 0.001,
"alternatives": []
}
}
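The "tags" field makes it straightforward to summarize the question set, for example to check what share of questions carry the "distractor" tag. A minimal sketch, assuming the payloads have been loaded into a list of dictionaries; `tag_counts` is a hypothetical helper and the three sample entries reproduce only the "tags" field:

```python
from collections import Counter

def tag_counts(payloads):
    """Count how many payloads carry each tag."""
    counts = Counter()
    for payload in payloads:
        counts.update(payload.get("tags", []))
    return counts

# Three tag lists in the shape used by the payloads above.
sample = [
    {"tags": ["word_problem", "multiplication"]},
    {"tags": ["word_problem", "subtraction", "distractor"]},
    {"tags": ["word_problem", "addition", "distractor"]},
]
counts = tag_counts(sample)
share = counts["distractor"] / len(sample)
print(counts["distractor"], round(share, 2))  # 2 0.67
```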