Run Benchmark
Validate Definition (lokys)
This benchmark is in Tier 3 (Advanced).
Select Model
Benchmark Info
A regression benchmark for the lokys agent's validate_definition() function. Tests whether the LLM correctly identifies well-formed vs. problematic word definitions (e.g. circular definitions, translations used as definitions).