AI prompts for unmoderated usability testing: task writing, results analysis, and variant comparison

Ready-to-use AI prompts for unmoderated usability testing — write test tasks, analyze completion rates and click paths, and compare design variants with statistical rigor.

How to use

Copy any prompt below, replace the bracketed placeholders with your own details, and paste it into your AI assistant chat

These prompts cover the stages of unmoderated usability testing where AI saves the most time: writing clear test tasks, analyzing quantitative results, and comparing design variants.

Write unmoderated usability test tasks from product goals

I am setting up an unmoderated usability test for [product description]. The test will run on [Maze / UserTesting / Lyssna / other tool] with [N] participants.

Product goals being tested:
[list 3-5 product goals, e.g., "Users can find and apply a discount code during checkout"]

For each product goal, write a task scenario that:
1. Describes a realistic situation in 1-2 sentences (no interface terminology)
2. Gives enough context for the participant to understand what they need to do WITHOUT a facilitator
3. Has an unambiguous success state that the testing tool can detect (e.g., "reaches the confirmation screen")
4. Can be completed in 2-4 minutes by a user who has never seen the product

Also provide:
- A post-task question for each task (beyond the SEQ, the standard Single Ease Question; something specific to the goal being tested)
- A list of potential misunderstandings in the task wording and how to prevent them
- An estimated total study duration (sum of all tasks + questionnaires)

Keep the total study under 15 minutes to minimize dropout.

Analyze unmoderated usability test results

Here are the results from an unmoderated usability test with [N] participants on [product description].

Task results:
[paste task-level data: completion rate, median time-on-task, SEQ scores, common failure points]

Open-ended responses to "What was the most confusing part of this experience?":
[paste all responses]

Post-study SUS (System Usability Scale) scores:
[paste scores or average]

Analyze the data and produce:

1. TASK PERFORMANCE SUMMARY: Table with each task, completion rate, median time, SEQ average, and a traffic-light rating (green = meets benchmark, yellow = borderline, red = below benchmark). Use these benchmarks: completion ≥ 78% = green, 60–77% = yellow, < 60% = red. (A scoring sketch follows this prompt.)

2. PROBLEM AREAS: For each task with a yellow or red rating, describe the likely problem based on the available data (completion rate, time, SEQ mismatch patterns, open-ended responses).

3. OPEN-ENDED THEMES: Code all open-ended responses into 3-5 themes with frequency counts and representative quotes.

4. SUS INTERPRETATION: Convert the average SUS score to a letter grade (A–F) and percentile rank. Compare to industry benchmarks if available. (Grade bands are sketched in the code after this prompt.)

5. PRIORITIZED RECOMMENDATIONS: Top 3-5 actions the team should take, ranked by impact. For each, state: what to change, which task it affects, and what metric should improve.
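
If you want to sanity-check the AI's arithmetic in steps 1 and 4, the sketch below reproduces both calculations. The traffic-light thresholds mirror the prompt above; the SUS scoring formula is the standard one, but the letter-grade bands are an assumed simplification of the Sauro–Lewis grading scale, not a fixed standard.

```python
# Minimal sketch reproducing the benchmarks above. The traffic-light
# thresholds come from the prompt; the SUS grade bands are an assumed
# simplification of the Sauro-Lewis grading scale, not a fixed standard.

def traffic_light(completion_rate: float) -> str:
    """Rate a task's completion rate (0-1) against the prompt's benchmarks."""
    if completion_rate >= 0.78:
        return "green"
    if completion_rate >= 0.60:
        return "yellow"
    return "red"

def sus_score(responses: list[int]) -> float:
    """Score one participant's ten SUS responses (1-5 Likert scale).

    Odd-numbered items contribute (response - 1), even-numbered items
    contribute (5 - response); the sum is scaled by 2.5 to give 0-100.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    total = sum(
        (r - 1) if i % 2 == 0 else (5 - r)  # 0-based index: even index = odd-numbered item
        for i, r in enumerate(responses)
    )
    return total * 2.5

def sus_grade(score: float) -> str:
    """Map a 0-100 SUS score to a letter grade (assumed bands; 68 is average)."""
    if score >= 80.3:
        return "A"
    if score >= 74.0:
        return "B"
    if score >= 68.0:
        return "C"
    if score >= 51.0:
        return "D"
    return "F"

print(traffic_light(11 / 15))  # 73% completion -> "yellow"
print(sus_grade(sus_score([5, 2, 4, 1, 4, 2, 5, 1, 4, 2])))  # score 85 -> "A"
```

Swap in your own per-task rates and raw questionnaire responses; the thresholds are easy to adjust if your team uses different benchmarks.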

Compare two design variants from A/B usability test data

I ran an unmoderated usability test comparing two design variants:
- Variant A: [brief description]
- Variant B: [brief description]

Each variant was tested with [N] participants on the same [M] tasks.

Results:
Variant A: [paste completion rates, time-on-task, SEQ scores per task]
Variant B: [paste completion rates, time-on-task, SEQ scores per task]

Compare the variants:

1. TASK-BY-TASK COMPARISON: Table showing both variants' metrics side by side for each task. Flag statistically significant differences (use a threshold of p < 0.05 where raw pass/fail data is available; where only rates are available, flag gaps larger than 15 percentage points as practically significant). A significance-check sketch follows this prompt.

2. OVERALL WINNER: Which variant performs better overall? Is the advantage consistent across all tasks or driven by one task?

3. TRADE-OFFS: Are there tasks where the losing variant actually performs better? If so, what element of that variant might be worth preserving?

4. RECOMMENDATION: Proceed with Variant [X], with the following modifications based on where Variant [Y] performed better: [specific changes]. If the data is inconclusive, recommend what additional testing would resolve the question.
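
To verify the significance flags from step 1 yourself, here is a minimal sketch assuming you have raw pass/fail counts per variant (the counts in the example call are hypothetical). It uses a plain two-proportion z-test; with the small samples typical of unmoderated tests, an exact alternative such as Fisher's test is often the safer choice.

```python
# Minimal sketch of the significance check in step 1, assuming raw
# pass/fail counts per variant (the counts below are hypothetical).
# Uses a plain two-proportion z-test; for the small samples typical
# of unmoderated tests, an exact test such as Fisher's is often safer.
from math import sqrt

from scipy.stats import norm

def two_proportion_test(pass_a: int, n_a: int, pass_b: int, n_b: int) -> tuple[float, float]:
    """Return (difference in completion rates, two-sided p-value)."""
    p_a, p_b = pass_a / n_a, pass_b / n_b
    pooled = (pass_a + pass_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:  # both variants all-pass or all-fail: no detectable difference
        return p_a - p_b, 1.0
    z = (p_a - p_b) / se
    return p_a - p_b, 2 * norm.sf(abs(z))

diff, p = two_proportion_test(pass_a=18, n_a=20, pass_b=12, n_b=20)
if p < 0.05:
    verdict = "statistically significant"
elif abs(diff) > 0.15:
    verdict = "practically significant"
else:
    verdict = "inconclusive"
print(f"diff = {diff:+.0%}, p = {p:.3f} -> {verdict}")
```

The same thresholds as the prompt apply: p < 0.05 for statistical significance, and a 15-percentage-point gap as the fallback test for practical significance when raw data is unavailable.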