Run Benchmark

Validate Definition (lokys)

This benchmark is in Tier 3 (Advanced).

Select Model
Models whose capability level is too low for this benchmark tier are shown as disabled.
Benchmark execution is allowed only from direct local/private network IPs.
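A restriction like this can be sketched with Python's standard `ipaddress` module. This is only an illustration, not the application's actual check: the function name `is_local_or_private` is hypothetical, and the sketch assumes "direct" means the address of the connecting peer itself rather than one taken from a forwarding header.

```python
import ipaddress


def is_local_or_private(addr: str) -> bool:
    """Return True if addr is a loopback or private-range IP address.

    Hypothetical helper: the real application may apply different rules.
    """
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        # Not a valid IPv4/IPv6 literal, so refuse it.
        return False
    return ip.is_loopback or ip.is_private


if __name__ == "__main__":
    for candidate in ["127.0.0.1", "192.168.1.20", "8.8.8.8", "not-an-ip"]:
        print(candidate, is_local_or_private(candidate))
```

A server using a check like this would typically pass in the socket's peer address, since client-supplied headers can be forged.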
Benchmark Info

A regression benchmark for the lokys agent's validate_definition() function. It tests whether the LLM correctly distinguishes well-formed word definitions from problematic ones (e.g. circular definitions, or translations used as definitions).
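To illustrate the shape of such a benchmark, here is a minimal sketch of labeled test cases scored against a validator. Everything in it is an assumption: the `Case` structure, the example definitions, and `naive_validator` (a crude heuristic standing in for the LLM-backed validate_definition(), whose real signature is not shown here).

```python
from typing import Callable, List, NamedTuple


class Case(NamedTuple):
    word: str
    definition: str
    expected_valid: bool


# Hypothetical examples; the real benchmark's cases are not shown here.
CASES: List[Case] = [
    Case("river", "a large natural stream of water flowing to the sea or a lake", True),
    Case("happiness", "the state of feeling happiness", False),  # circular
    Case("gato", "cat", False),  # translation used as a definition
]


def naive_validator(word: str, definition: str) -> bool:
    """Crude stand-in for the LLM call: rejects definitions that reuse
    the headword or consist of a single word."""
    tokens = definition.lower().split()
    if word.lower() in tokens:
        return False  # definition repeats the word being defined
    if len(tokens) < 2:
        return False  # one-word gloss, treated here as a likely translation
    return True


def accuracy(validator: Callable[[str, str], bool], cases: List[Case]) -> float:
    """Fraction of cases where the validator's verdict matches the label."""
    correct = sum(validator(c.word, c.definition) == c.expected_valid for c in cases)
    return correct / len(cases)


if __name__ == "__main__":
    print(f"accuracy: {accuracy(naive_validator, CASES):.2f}")
```

Used as a regression check, the same case list would be re-run after each model or prompt change and the accuracy compared against an earlier baseline.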