Validate Definition (lokys)
A regression benchmark for the lokys agent's validate_definition() function. Tests whether the LLM correctly identifies well-formed vs. problematic word definitions (e.g. circular definitions, translations used as definitions).
Questions
0
Leaderboard Entries
0
Best Score
-
Leaderboard
No runs yet for this benchmark.
Run it now!
Questions
No questions generated for this benchmark yet.