Human AI Deception
Benchmark Arena
A living benchmark measuring how well humans detect AI errors, hallucinations, and reasoning flaws. Contribute your judgments and see how you compare.
Human Performance
Track how well humans detect AI mistakes across diverse scenarios
Error Patterns
Identify common blind spots where humans struggle to catch AI failures
Skill Growth
Watch your judgment improve over time with structured practice
The Human AI Deception Benchmark measures human ability to detect AI failures across five skill pillars:
AI Literacy
Understanding how AI systems work and why they fail
Logic & Reasoning
Spotting flawed arguments and reasoning chains
Risk & Safety
Assessing when AI outputs are safe to use
Authenticity Detection
Distinguishing human-written from AI-generated content
Calibration
Knowing how well your confidence matches your actual accuracy
We also measure response time and confidence calibration to understand how people make judgment calls under pressure.
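As one illustrative way to see what confidence calibration means in practice, the minimal sketch below scores a set of self-reported confidences against judgment correctness using a Brier score and a simple overconfidence gap. The data, field layout, and scoring choices are assumptions for this example, not the benchmark's actual method.

# Minimal sketch (assumed scoring, not the benchmark's actual method):
# measures confidence calibration from self-reported confidences and correctness.

from statistics import mean

# Hypothetical data: (confidence in [0, 1], was the judgment correct?)
judgments = [
    (0.9, True),
    (0.8, False),
    (0.6, True),
    (0.95, True),
    (0.5, False),
]

# Brier score: mean squared gap between confidence and outcome (lower is better).
brier = mean((conf - (1.0 if correct else 0.0)) ** 2 for conf, correct in judgments)

# Overconfidence gap: average confidence minus actual accuracy.
# A well-calibrated participant has a gap near zero; positive suggests overconfidence.
gap = mean(conf for conf, _ in judgments) - mean(
    1.0 if correct else 0.0 for _, correct in judgments
)

print(f"Brier score: {brier:.3f}")
print(f"Overconfidence gap: {gap:+.3f}")

A participant who reports 90% confidence but is right only 60% of the time would show a large positive gap, which is the kind of pattern the calibration pillar is meant to surface.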
Ready to Contribute?
Your participation helps build the most comprehensive benchmark of human judgment of AI. Start your journey today.