Run Benchmark
Syllogism Validity
This benchmark is in Tier 2 (Core).
Benchmark Info
This benchmark evaluates whether a model can determine if short categorical syllogisms are logically valid.
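To make "logically valid" concrete, here is a minimal Python sketch of a reference checker for categorical syllogisms. The page does not show the benchmark's item format or scoring, so everything here is an assumption for illustration: the tuple encoding, the term names `S`, `M`, `P`, and the helpers `holds` and `is_valid` are hypothetical. The sketch also assumes the modern (Boolean) reading, in which "All X are Y" does not imply that any X exists. It counts an argument as valid only if no arrangement of inhabited and empty regions makes the premises true and the conclusion false.

```python
from itertools import product

# Hypothetical encoding: a statement is (form, subject, predicate), where form is
#   "A" = All x are y,   "E" = No x are y,
#   "I" = Some x are y,  "O" = Some x are not y.
TERMS = ("S", "M", "P")

# A region is one membership combination over (S, M, P),
# e.g. (True, False, True) means "in S, not in M, in P".
REGIONS = list(product([False, True], repeat=len(TERMS)))


def holds(statement, inhabited):
    """Return True if the statement is true when exactly `inhabited` regions are non-empty."""
    form, x, y = statement
    i, j = TERMS.index(x), TERMS.index(y)
    some_x_is_y = any(r[i] and r[j] for r in inhabited)
    some_x_is_not_y = any(r[i] and not r[j] for r in inhabited)
    if form == "A":
        return not some_x_is_not_y
    if form == "E":
        return not some_x_is_y
    if form == "I":
        return some_x_is_y
    if form == "O":
        return some_x_is_not_y
    raise ValueError(f"unknown form: {form!r}")


def is_valid(premises, conclusion):
    """Brute-force check over all 2**8 models: valid iff no model is a counterexample."""
    for mask in product([False, True], repeat=len(REGIONS)):
        inhabited = [r for r, on in zip(REGIONS, mask) if on]
        premises_true = all(holds(p, inhabited) for p in premises)
        if premises_true and not holds(conclusion, inhabited):
            return False  # found a counterexample model
    return True


# All M are P; all S are M; therefore all S are P.
barbara = is_valid([("A", "M", "P"), ("A", "S", "M")], ("A", "S", "P"))

# All P are M; all S are M; therefore all S are P (undistributed middle).
undistributed = is_valid([("A", "P", "M"), ("A", "S", "M")], ("A", "S", "P"))

print(barbara, undistributed)
```

Under these assumptions the first argument is valid and the second is not. A form such as "All M are P; all M are S; therefore some S are P" is rejected here because the checker does not assume M is non-empty. A benchmark that follows the traditional reading would label it differently.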