Run Benchmark
Validate Translation (voras)
This benchmark is in Tier 3 (Advanced).
Benchmark Info
A regression benchmark for the voras agent's validate_all_translations_for_word() function. It tests whether the LLM correctly flags translations that are semantically incorrect or not in lemma (dictionary) form across multiple target languages.
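As a rough illustration of what a regression check of this kind might look like, here is a minimal sketch. Everything in it is an assumption made for the example: the real signature and return type of validate_all_translations_for_word() are not shown on this page, so the sketch uses a hypothetical validator that takes a word plus a mapping of language code to translation and returns the set of language codes it flags. The Case type, the score() helper, the sample cases, and stub_validator are likewise hypothetical stand-ins, not part of voras.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Set

# Hypothetical shape: language code -> candidate translation.
Translations = Dict[str, str]
# Hypothetical validator signature: returns the language codes it flags as invalid.
Validator = Callable[[str, Translations], Set[str]]


@dataclass(frozen=True)
class Case:
    word: str
    translations: Translations
    expected_invalid: FrozenSet[str]  # language codes that should be flagged


def score(validator: Validator, cases: List[Case]) -> Dict[str, float]:
    """Compare flagged languages against the labels and tally the results."""
    tp = fp = fn = 0
    for case in cases:
        # Ignore any flagged code that is not one of the case's languages.
        flagged = set(validator(case.word, case.translations)) & set(case.translations)
        expected = set(case.expected_invalid)
        tp += len(flagged & expected)   # bad translations correctly flagged
        fp += len(flagged - expected)   # good translations wrongly flagged
        fn += len(expected - flagged)   # bad translations that were missed
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall}


# Illustrative cases only: "maisons" is a plural (non-lemma) form,
# and "perro" means "dog", so it is semantically wrong for "house".
CASES = [
    Case("house", {"de": "Haus", "fr": "maisons", "es": "perro"}, frozenset({"fr", "es"})),
    Case("to eat", {"de": "essen", "fr": "manger", "es": "comer"}, frozenset()),
]


def stub_validator(word: str, translations: Translations) -> Set[str]:
    """Stand-in for the LLM-backed function; it only knows one bad string."""
    known_bad = {"perro"}
    return {lang for lang, text in translations.items() if text in known_bad}


if __name__ == "__main__":
    print(score(stub_validator, CASES))
```

In this sketch the stub catches the semantic error but misses the non-lemma form, so recall drops while precision stays perfect. Tracking both numbers per model is one way a regression in either failure mode could be surfaced.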