Research Standards
Evidence Levels & Quality Standards
How we grade the confidence level of our AI evaluation research and recommendations
How to Use Evidence Levels Safely
- ✓ Match evidence level to decision risk: Level A for strategic commitments, B for pilots, C for exploration
- ✓ Always check publication date: AI evidence older than 12 months may be outdated
- ✓ Consider your context: Validate that findings match your industry, scale, and technical maturity
What NOT to do:
- Never use Level C evidence alone for major investments
- Don't ignore contradictory evidence from different sources
- Avoid generalizing from single-vendor case studies
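The matching and recency rules above can be sketched as a small check. This is only an illustration, not part of the grading methodology: the names `MIN_LEVEL`, `check_evidence`, and `months_between`, and the decision labels "strategic", "pilot", and "exploration", are assumptions made for this example.

```python
from datetime import date

# Hypothetical mapping from decision type to the minimum suggested level,
# following the guidance above (A: strategic, B: pilots, C: exploration).
MIN_LEVEL = {"strategic": "A", "pilot": "B", "exploration": "C"}
RANK = {"A": 3, "B": 2, "C": 1}
MAX_AGE_MONTHS = 12  # recency threshold stated in the guidance


def months_between(earlier, later):
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month is not yet complete
    return months


def check_evidence(level, decision, published, today):
    """Return a list of warnings; an empty list means nothing was flagged."""
    warnings = []
    required = MIN_LEVEL[decision]
    if RANK[level] < RANK[required]:
        warnings.append(
            f"Level {level} is below the Level {required} "
            f"suggested for {decision} decisions"
        )
    if months_between(published, today) > MAX_AGE_MONTHS:
        warnings.append("evidence is older than 12 months and may be outdated")
    return warnings
```

For example, `check_evidence("C", "strategic", date(2023, 1, 15), date(2024, 6, 1))` flags both the level mismatch and the age of the evidence. An empty result only means these two rules passed; contradictory sources and context fit still need human review.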
Quick Reference for Executives
- Level A: Gold standard evidence - safe to base strategic decisions on
- Level B: Industry-validated - appropriate for pilot programs and controlled rollouts
- Level C: Emerging insights - useful for exploration but requires validation
How We Assign Evidence Levels
Our evidence grading follows established scientific standards adapted for AI evaluation contexts. Each claim, metric, and recommendation receives a grade based on:
- Study design quality - Randomized controlled trials (RCTs) and systematic reviews earn higher grades
- Sample size and diversity - Larger, more representative samples increase confidence
- Reproducibility - Findings replicated across contexts receive higher ratings
- Quantitative rigor - Findings that report statistical significance and effect sizes earn higher grades
- Recency - More recent evidence (especially for AI) carries more weight
Evidence Level Definitions
Evidence A
Strong Evidence
Important Limits & Potential Misuse
- Evidence levels indicate confidence in research quality, not guaranteed outcomes
- Context matters: Strong evidence in one setting may not transfer to yours
- AI capabilities evolve rapidly; evidence older than 12 months may be outdated
- Never use single studies to justify major decisions; seek converging evidence
- Consider your organization's unique constraints and capabilities
Evidence B
Moderate Evidence
Evidence C
Limited Evidence