Open Beta: We’re learning fast - your sessions and feedback directly shape AI CogniFit.
Season Zero Active

Human AI Deception Benchmark Arena

A living benchmark measuring how well humans detect AI errors, hallucinations, and reasoning flaws. Contribute your judgments and see how you compare.

Human Performance

Track how well humans detect AI mistakes across diverse scenarios

Error Patterns

Identify common blind spots where humans struggle to catch AI failures

Skill Growth

Watch your judgment improve over time with structured practice

The Human AI Deception Benchmark measures human ability to detect AI failures across five skill pillars:

AI Literacy

Understanding how AI systems work and why they fail

Logic & Reasoning

Spotting flawed arguments and reasoning chains

Risk & Safety

Assessing when AI outputs are safe to use

Authenticity Detection

Distinguishing human-written from AI-generated content

Calibration

Knowing how well your confidence matches how often you're actually right

We also measure response time and confidence calibration to understand how people make judgment calls under pressure.
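The page doesn't specify how confidence calibration is scored, so the sketch below is illustrative only, not the Arena's actual method. It uses Expected Calibration Error (ECE), a common metric that compares stated confidence against observed accuracy across confidence bins; the function name and example values are hypothetical.

```python
# Illustrative only: one common way to score confidence calibration is
# Expected Calibration Error (ECE). This is an assumption for clarity,
# not necessarily how the Arena computes its calibration scores.

def expected_calibration_error(confidences, correct, n_bins=10):
    """Compare stated confidence to observed accuracy across equal-width bins.

    confidences: floats in [0, 1] (how sure the participant was)
    correct:     0/1 values (whether the judgment was actually right)
    """
    total = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        # Collect judgments whose confidence falls in this bin.
        idx = [i for i, c in enumerate(confidences)
               if lo < c <= hi or (b == 0 and c == 0)]
        if not idx:
            continue
        avg_conf = sum(confidences[i] for i in idx) / len(idx)
        accuracy = sum(correct[i] for i in idx) / len(idx)
        # Weight each bin's confidence/accuracy gap by its share of judgments.
        ece += (len(idx) / total) * abs(avg_conf - accuracy)
    return ece

# Example: a slightly overconfident participant (lower ECE = better calibration).
print(expected_calibration_error([0.9, 0.8, 0.95, 0.6], [1, 0, 1, 1]))
```

A perfectly calibrated participant, who is right 70% of the time whenever they report 70% confidence, would score an ECE near zero.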

Ready to Contribute?

Your participation helps build the most comprehensive benchmark of human judgment of AI. Start your journey today.
