Pre-Retro: Data to Gather
Don't run this retro without data. Opinions without numbers become arguments without resolution.
Before the meeting, collect:
| Metric | Source | Target |
|--------|--------|--------|
| Tasks completed with AI | Sprint board | All AI-tagged items |
| Estimated vs. actual time | Ticket history | Δ calculation |
| Review cycles per task | PR/doc comments | Compare AI vs. manual |
| TLX scores (if collected) | Survey/Slack | Team average |
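If your tracker can export tickets, the sprint-level numbers above can be computed rather than hand-tallied. A minimal sketch, assuming tickets come out as dicts keyed the way the tracker fields later in this guide are named (`ai_assisted`, `estimated_min`, `actual_min`, `review_cycles`, `tlx_score` are assumed names, not any tool's real export schema):

```python
def preretro_metrics(tickets):
    """Aggregate the pre-retro metrics from exported ticket dicts.

    Field names are assumptions; map them to your tracker's export.
    """
    ai = [t for t in tickets if t.get("ai_assisted")]
    # Delta: negative means the task ran longer than estimated.
    deltas = [
        (t["estimated_min"] - t["actual_min"]) / t["estimated_min"] * 100
        for t in ai if t.get("estimated_min")
    ]
    tlx = [t["tlx_score"] for t in ai if t.get("tlx_score") is not None]
    return {
        "ai_task_count": len(ai),
        "avg_delta_pct": sum(deltas) / len(deltas) if deltas else None,
        "avg_review_cycles": (
            sum(t["review_cycles"] for t in ai) / len(ai) if ai else None
        ),
        "avg_tlx": sum(tlx) / len(tlx) if tlx else None,
    }
```

Run it once before the meeting and paste the results into the template's data panel; the point is that nobody argues about the numbers live.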
The 5 Retro Questions
Use these verbatim. They're designed to surface friction that teams avoid discussing.
1. "Which AI-assisted task took longer than expected?"
What you're looking for: Tasks where generation was fast but review/revision was slow.
Follow-up: "How much of the total time was fixing AI output versus your own work?"
Action if common: Add review time estimates to AI task planning.
2. "Where did AI output require the most editing?"
What you're looking for: Patterns in AI failure modes (formatting, accuracy, tone, completeness).
Follow-up: "Could a better prompt have prevented this, or is this task unsuited for AI?"
Action if common: Build prompt templates for high-edit tasks or reclassify them as manual.
3. "Did anyone feel more mentally drained using AI than expected?"
What you're looking for: Hidden cognitive load—checking output, maintaining context, managing prompts.
Follow-up: "What specifically caused the fatigue?" (Switching contexts? Verifying accuracy? Prompt iteration?)
Action if common: Implement micro-TLX checks after AI tasks.
4. "Which task should we stop using AI for?"
What you're looking for: Honest admission that some AI applications aren't working.
Follow-up: "What made you realize it? Time, quality, or frustration?"
Action if raised: Remove from AI workflow immediately. Test again in 2 sprints with better prompts.
5. "What's one AI use case we should expand?"
What you're looking for: Genuine wins that can scale.
Follow-up: "What made it work well? Can we template that approach?"
Action if raised: Document the pattern. Add to team playbook.
Tracker Fields to Copy
Add these to your sprint board or project tracker:
AI_ASSISTED: [Yes/No]
ESTIMATED_TIME_MIN: [number]
ACTUAL_TIME_MIN: [number]
REVIEW_CYCLES: [number]
DELTA_PERCENT: [calculated: (estimated - actual) / estimated * 100]
TLX_SCORE: [0-100, optional]
AI_NOTES: [free text - what worked/didn't]
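If your tracker supports computed fields or scripting, DELTA_PERCENT can be derived instead of typed in. A one-function sketch of the formula as defined above:

```python
def delta_percent(estimated_min, actual_min):
    """DELTA_PERCENT as defined in the tracker fields.

    Negative means the task took longer than estimated.
    """
    return (estimated_min - actual_min) / estimated_min * 100
```

For the sample ticket's 30-minute estimate and 55-minute actual, this yields roughly -83%, i.e. the task ran 83% over.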
Sample Ticket Entry
Task: Draft Q4 planning doc
AI_ASSISTED: Yes
ESTIMATED_TIME_MIN: 30
ACTUAL_TIME_MIN: 55
REVIEW_CYCLES: 3
DELTA_PERCENT: -83% (took 83% longer than expected)
TLX_SCORE: 72
AI_NOTES: AI draft missed key context from Q3.
Had to rewrite intro and add 4 missing sections.
Prompt: "Draft Q4 planning doc based on Q3 template"
Retro Template (Copy to Miro/FigJam)
┌─────────────────────────────────────────────────────────┐
│ AI SPRINT RETRO - [Sprint Name] │
│ Date: ___________ Facilitator: ___________ │
├─────────────────────────────────────────────────────────┤
│ │
│ 📊 THIS SPRINT'S DATA │
│ ┌────────────────────┬────────────────────┐ │
│ │ Tasks with AI │ _____ │ │
│ │ Avg Δ │ _____% │ │
│ │ Avg TLX │ _____ │ │
│ │ Time saved (claimed)│ _____ hrs │ │
│ │ Time saved (actual) │ _____ hrs │ │
│ └────────────────────┴────────────────────┘ │
│ │
│ ✅ KEEP DOING (AI wins) │
│ ┌─────────────────────────────────────────┐ │
│ │ │ │
│ │ │ │
│ └─────────────────────────────────────────┘ │
│ │
│ 🛑 STOP DOING (AI friction) │
│ ┌─────────────────────────────────────────┐ │
│ │ │ │
│ │ │ │
│ └─────────────────────────────────────────┘ │
│ │
│ 🧪 TRY NEXT SPRINT │
│ ┌─────────────────────────────────────────┐ │
│ │ Task: _______________ Owner: __________ │ │
│ │ Hypothesis: __________________________ │ │
│ │ Success metric: ______________________ │ │
│ └─────────────────────────────────────────┘ │
│ │
│ ⚠️ ACTION ITEMS │
│ □ _______________ Owner: _____ Due: _____ │
│ □ _______________ Owner: _____ Due: _____ │
│ □ _______________ Owner: _____ Due: _____ │
│ │
└─────────────────────────────────────────────────────────┘
Warning Signs to Watch
- Δ trending up over 3+ sprints — Team is getting less calibrated, not more
- TLX above 65 for multiple team members — Burnout risk, even if velocity looks good
- Review cycles increasing — AI is generating more work for reviewers
- "We don't have time to track" — Measurement debt is accumulating
- Same person always editing AI output — Hidden bottleneck forming
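The first three warning signs are mechanical enough to check automatically from per-sprint summaries. A sketch, assuming summary dicts with hypothetical keys `avg_delta_pct`, `avg_tlx_by_member`, and `avg_review_cycles` (not any tool's real schema):

```python
def warning_signs(sprints):
    """Flag the quantifiable warning signs from a list of sprint
    summaries, oldest first. Key names are assumptions."""
    flags = []
    if len(sprints) >= 3:
        deltas = [s["avg_delta_pct"] for s in sprints[-3:]]
        # Delta moving further from zero = estimates getting worse.
        if all(abs(a) < abs(b) for a, b in zip(deltas, deltas[1:])):
            flags.append("delta trending up over 3+ sprints")
    latest = sprints[-1]
    high_tlx = [m for m, s in latest["avg_tlx_by_member"].items() if s > 65]
    if len(high_tlx) > 1:
        flags.append("TLX above 65 for: " + ", ".join(sorted(high_tlx)))
    cycles = [s["avg_review_cycles"] for s in sprints]
    if len(cycles) >= 2 and cycles[-1] > cycles[-2]:
        flags.append("review cycles increasing")
    return flags
```

The last two signs ("we don't have time to track", one person always editing AI output) are cultural, not numeric; they still need a facilitator watching for them.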
Team Resources
- Delta Logging in Sprints — Full guide to Δ tracking in agile workflows
- Micro-TLX Fatigue Check — The 2-slider protocol for workload monitoring
- When AI Slows You Down — Diagnostic framework for negative ROI tasks
- Fair Trial Protocol — How to run valid AI vs. manual comparisons
Questions and metrics aligned with the AI CogniFit Methodology. Tracker fields compatible with Jira, Linear, Asana, and Notion.