
How to design a UX survey: a practical guide with AI prompts

What is a UX survey?

A survey is a structured research method that collects self-reported data from a large number of respondents through a standardized set of questions — closed-ended (multiple choice, Likert scales, rankings) and open-ended (free text). Surveys are among the most accessible UX research methods because they scale easily, cost relatively little per response, and can reach users who would never be available for an interview or lab session. Their primary value lies in quantifying attitudes, preferences, satisfaction levels, and behavioral self-reports across a broad population, producing data that supports statistical analysis, segmentation, and trend tracking over time.

What question does it answer?

  • How satisfied are users with the product overall and with specific features or workflows?
  • Which features do users consider most important, and which are rarely used or poorly understood?
  • What are the most common pain points, and how frequently do users encounter them?
  • How do attitudes, satisfaction, or feature usage differ across user segments (role, tenure, device, geography)?
  • What are users’ self-reported reasons for churning, downgrading, or not adopting a feature?
  • How has user satisfaction or perception changed since the last measurement or release?

When to use

  • When the team needs to quantify user attitudes or satisfaction across a large population — surveys collect data from hundreds or thousands of users at a fraction of the cost of interviews.
  • When prior qualitative research (interviews, usability tests) has identified themes and the team needs to know how widespread those themes are across the user base.
  • When the product team wants to prioritize a feature backlog based on what users say matters most to them, using data from a representative sample rather than the opinions of a few vocal users.
  • When measuring the impact of a change over time — running the same survey before and after a redesign produces comparable data points that track progress.
  • When segmenting users by demographics, behavior, or attitudes to identify distinct groups with different needs.
  • When collecting feedback at a specific moment in the user journey — post-onboarding, post-purchase, post-support-interaction — to capture experience while memory is fresh.

Not the right method when the team needs to understand why users behave a certain way in depth — surveys capture what people say, not what they do, and self-reported data is subject to recall bias, social desirability bias, and satisficing. For understanding root causes, pair surveys with qualitative methods like interviews or contextual inquiry. Surveys are also a poor choice when the team does not yet know what questions to ask — if you are exploring a new problem space, start with interviews to learn the vocabulary and concerns of users, then build a survey to quantify those findings. Finally, surveys fail when the achievable sample is too small: with fewer than 50-100 respondents, meaningful statistical analysis is not possible.

What you get (deliverables)

  • Quantitative dataset with response distributions for every closed-ended question — ready for statistical analysis, cross-tabulation, and visualization.
  • Satisfaction scores calculated from standardized instruments (NPS, CSAT, SUS, UMUX-Lite, CES) that can be benchmarked against industry averages or tracked over time (see the sketch after this list).
  • Segmented analysis breaking down responses by user type, demographics, usage frequency, or any other variable collected in the survey.
  • Coded open-ended responses grouped into themes, with frequency counts showing how many respondents mentioned each theme.
  • Priority ranking of features, pain points, or improvement areas based on user-reported importance and satisfaction gaps.
  • Summary report with key findings, segment differences, trend comparisons (if repeating), and actionable recommendations for the product team.
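
As a minimal illustration of how one of these scores is derived, the sketch below computes NPS from a column of 0-10 ratings: the share of promoters (ratings 9-10) minus the share of detractors (ratings 0-6), expressed as a number from -100 to +100. The column name and sample data are hypothetical.

```python
import pandas as pd

# Hypothetical export: one row per respondent, "nps" holds the 0-10 rating.
df = pd.DataFrame({"nps": [10, 9, 7, 3, 8, 10, 6, 9]})

promoters = (df["nps"] >= 9).mean()   # share of respondents rating 9-10
detractors = (df["nps"] <= 6).mean()  # share of respondents rating 0-6
nps = (promoters - detractors) * 100  # standard NPS, ranges -100 to +100

print(f"NPS: {nps:.0f}")  # 25 for this sample
```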

Participants and duration

  • Respondents: A minimum of 100 completed responses for basic statistical reliability. For segment comparisons, aim for at least 30 responses per segment. Response rates for in-product surveys range from 10-30%; email surveys typically see 5-15% (see the worked example after this list).
  • Survey length: 5-10 minutes (15-25 questions). Completion rates drop sharply after 10 minutes.
  • Design time: 3-7 days for drafting, internal review, pilot testing with 5-10 people, and revisions.
  • Field time: 5-14 days for data collection.
  • Analysis time: 2-5 days for cleaning, statistical analysis, open-ended coding, visualization, and report writing.
  • Total timeline: 2-4 weeks from research question definition to final report.
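
To turn these targets and response rates into a recruitment plan, a quick back-of-the-envelope calculation helps. The sketch below uses the guideline figures from the list above; the pessimistic 5% email rate is a deliberate assumption to avoid under-recruiting.

```python
# How many email invitations are needed to reach the target sample?
target_completes = 100       # minimum for basic statistical reliability
email_response_rate = 0.05   # low end of the 5-15% email range

invites_needed = target_completes / email_response_rate
print(f"Send at least {invites_needed:.0f} invitations")  # 2000

# Segment comparisons need roughly 30 completes per segment.
segments = 4  # hypothetical number of segments
print(f"Completes needed for segment analysis: {segments * 30}")  # 120
```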

How to design and run a UX survey (step-by-step)

1. Define the research questions and how the data will be used

Write down the specific decisions this survey will inform before writing a single question. “We want feedback” is not a research question. “Which of our three onboarding variants produces higher self-reported confidence?” is. For each research question, note what decision it supports and what you will do if the answer goes one way versus the other. If a question does not connect to a decision, cut it.

2. Choose question types and write the questions

Build the questionnaire around four question types: closed-ended single-select (for demographics and categorical data), Likert scales (for attitudes and satisfaction), multiple-select (for behaviors and feature usage), and open-ended (sparingly, for context and explanation). Follow five principles from survey design research: neutrality (no leading wording), specificity (precise time frames and behaviors instead of vague language), singularity (one concept per question — never double-barreled), answerability (every respondent can honestly answer, with “N/A” options where needed), and clarity (6th-grade reading level, no jargon). Where standardized instruments exist (SUS, NPS, CSAT, CES, SEQ), use them instead of inventing your own.

3. Design the survey flow and logic

Group related questions together to reduce context-switching. Place the most important questions early. Use branching logic to skip irrelevant questions. Randomize answer option order for multiple-choice questions to counteract primacy bias, but do not randomize scales. Add a progress indicator. Keep the total under 25 questions and under 10 minutes.
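
A minimal sketch of the randomization rule from this step: shuffle the options of categorical multiple-choice questions per respondent, but never reorder an ordinal scale. The question structure shown is hypothetical, not any specific platform's format.

```python
import random

def display_options(question: dict) -> list[str]:
    """Return answer options, shuffled only for non-ordinal questions."""
    options = list(question["options"])
    if not question.get("ordinal", False):
        random.shuffle(options)  # counteract primacy bias
    return options

likert = {"ordinal": True,
          "options": ["Very dissatisfied", "Dissatisfied", "Neutral",
                      "Satisfied", "Very satisfied"]}
feature_pick = {"ordinal": False,
                "options": ["Dashboards", "Integrations", "Reporting", "Mobile app"]}

print(display_options(likert))        # scale order preserved
print(display_options(feature_pick))  # order varies per respondent
```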

4. Pilot test

Run a pilot with 5-10 people who are similar to your target audience. Ask them to complete the survey while thinking aloud (cognitive interview technique): “Was anything confusing? How did you arrive at your answer? What does this question mean to you?” Also pilot on mobile. Fix every issue the pilot surfaces: confusing wording, missing response options, broken logic paths, questions that take too long.

5. Distribute the survey

Choose the distribution channel based on who you need to reach. In-product intercepts produce the highest relevance because respondents are in context. Email invitations reach users who are not currently active. Panel providers reach non-users or prospective users. Include a clear introduction: who is running the survey, why, how long it takes, and how responses will be used.

6. Monitor collection and close the survey

Track response count and completion rate daily. If the completion rate is below 50%, investigate. Do not analyze partial data or draw conclusions before the target sample is reached. Close the survey at the planned deadline or when the target sample is met.

7. Clean the data

Remove responses from straightliners (same answer for every Likert question), speeders (completed in less than one-third of the median time), and respondents who failed attention-check questions. For open-ended responses, remove spam, gibberish, and off-topic answers.
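
A minimal pandas sketch of these three cleaning rules, assuming a hypothetical CSV export with one column per Likert item (prefixed q_likert_), a duration_sec column, and an attention-check column whose correct answer is "Agree":

```python
import pandas as pd

df = pd.read_csv("responses.csv")  # hypothetical export
likert_cols = [c for c in df.columns if c.startswith("q_likert_")]

# Straightliners: the same answer on every Likert item.
straightliner = df[likert_cols].nunique(axis=1) == 1

# Speeders: finished in under one-third of the median completion time.
speeder = df["duration_sec"] < df["duration_sec"].median() / 3

# Attention check, e.g. "Select 'Agree' for this question."
failed_check = df["attention_check"] != "Agree"

clean = df[~(straightliner | speeder | failed_check)]
print(f"Kept {len(clean)} of {len(df)} responses")
```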

8. Analyze and visualize

For closed-ended questions, calculate response distributions, means, medians, and standard deviations. Cross-tabulate by segments and run statistical tests (chi-square for categorical comparisons, t-test or ANOVA for scale comparisons). For open-ended questions, code responses into themes — either manually or with AI coding tools — and count frequencies.
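
A minimal sketch of the tests named above, using scipy; the column names (segment, churn_reason, role, satisfaction, plan) are hypothetical.

```python
import pandas as pd
from scipy import stats

df = pd.read_csv("clean_responses.csv")  # hypothetical cleaned dataset

# Chi-square: is the stated churn reason independent of user segment?
table = pd.crosstab(df["segment"], df["churn_reason"])
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi-square: chi2={chi2:.2f}, p={p:.3f}")

# t-test: do two roles differ on a 1-5 satisfaction scale?
admins = df.loc[df["role"] == "admin", "satisfaction"]
members = df.loc[df["role"] == "member", "satisfaction"]
t, p = stats.ttest_ind(admins, members)
print(f"t-test: t={t:.2f}, p={p:.3f}")

# ANOVA: compare satisfaction across three or more plans.
groups = [g["satisfaction"] for _, g in df.groupby("plan")]
f, p = stats.f_oneway(*groups)
print(f"ANOVA: F={f:.2f}, p={p:.3f}")
```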

9. Report findings and recommend actions

Structure the report around the research questions. For each research question, present the data (what), explain why it matters (so what), and recommend an action (now what). Highlight segment differences. Include a limitations section. Close with a prioritized list of recommended actions.

How AI changes this method

AI compatibility: partial — AI dramatically accelerates open-ended response analysis and can assist with question design review, but cannot replace human judgment in defining research questions, choosing what to measure, or interpreting results within business context.

What AI can do

  • Code open-ended responses at scale: AI tools can read thousands of free-text responses, generate a codebook of themes, assign each response to themes, and produce frequency counts — work that manually takes days can be done in hours (see the prompt sketch after this list).
  • Check question design for bias: An LLM can review draft questions and flag leading language, double-barreled constructions, loaded assumptions, and jargon.
  • Generate draft questions: Given research questions and context, an LLM can produce a first draft of survey questions in the correct format, which a researcher then refines.
  • Summarize and cluster open-ended feedback: Beyond coding, AI can generate narrative summaries of what respondents said, grouping responses by sentiment, topic, or suggested action.
  • Translate surveys: LLMs produce high-quality first-draft translations for multi-language surveys, which a native-speaking researcher then reviews.
  • Spot anomalies in response data: AI can flag straightliners, speeders, and contradictory response patterns during data cleaning.
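
As a sketch of the first item in this list, the function below builds a coding prompt that constrains the model to a researcher-approved codebook and forces an explicit "uncoded" escape hatch. It only constructs the prompt; sending it to ChatGPT, Claude, or any other model is left to whatever client you use.

```python
def build_coding_prompt(responses: list[str], codebook: list[str]) -> str:
    """Build an LLM prompt that codes open-ended responses against a codebook."""
    numbered = "\n".join(f"{i + 1}. {r}" for i, r in enumerate(responses))
    return (
        "You are coding open-ended survey responses.\n"
        f"Codebook (use only these themes): {', '.join(codebook)}.\n"
        "For each response, list its number and every matching theme. "
        "If no theme fits, answer 'uncoded' so a researcher can review it.\n\n"
        f"Responses:\n{numbered}"
    )

print(build_coding_prompt(
    ["The setup wizard kept failing", "I never found the invite button"],
    ["onboarding friction", "collaboration", "navigation"],
))
```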

What requires a human researcher

  • Defining what to measure and why: The decision about which constructs to include and what business decisions the data will inform requires understanding the product strategy and stakeholder needs.
  • Judging question validity: An LLM can flag surface-level issues, but only a researcher can assess whether a question actually measures the intended construct.
  • Interpreting results in context: A 65% satisfaction score means different things depending on the industry, the competitive landscape, and the company’s history.
  • Navigating ethics and privacy: Deciding what is appropriate to ask, ensuring informed consent, and complying with regulations require human judgment.

AI-enhanced workflow

Before AI, open-ended survey analysis was the biggest time sink. A researcher would export hundreds or thousands of text responses, read each one, develop a codebook, tag every response, count frequencies, and write a summary. For a survey with 500 open-ended responses, this alone could take 2-3 full days.

With AI coding tools integrated into the workflow, the researcher uploads the raw responses, the tool generates an initial codebook and assigns codes, and the researcher reviews and adjusts. What took days now takes hours, and the researcher spends that recovered time on interpretation and recommendation rather than tagging.

The second major gain is in question design review. Instead of waiting for a pilot to reveal that a question is double-barreled or leading, a researcher can paste the draft into an LLM and get feedback in seconds. This does not replace the pilot — real-user reactions are still essential — but it catches the most obvious issues before the pilot begins.
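
A sketch of such a review prompt, using the double-barreled example from the "Beginner mistakes" section below as the draft question:

```
Review the survey questions below for leading language, double-barreled
constructions, loaded assumptions, and jargon. For each issue, quote the
question, name the problem, and suggest a neutral rewrite at a 6th-grade
reading level.

1. How satisfied are you with the speed and reliability of our product?
```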

Tools

Survey platforms: Qualtrics, SurveyMonkey, Typeform, Google Forms, Survicate, Alchemer, Lensym.

In-product survey tools: Hotjar Surveys, Pendo, Userpilot, Sprig (Userleap).

AI open-ended analysis: BTInsights, Fathom, Qualz.ai, ChatGPT / Claude.

Data analysis: Excel / Google Sheets, R, Python (pandas, scipy), SPSS, JASP.

Distribution and panels: Prolific, UserTesting, Respondent.io.

Works well with

  • In-depth Interview (Di): Surveys quantify how widespread a pattern is; interviews explain why it exists. Run interviews first to discover themes, then survey to measure their prevalence.
  • Usability Testing Moderated (Ut): Surveys measure satisfaction and self-reported experience; usability testing reveals what actually happens when users interact with the product.
  • A/B Testing (Ab): A survey can identify which features users say they want improved; A/B testing then validates whether a specific change moves behavior.
  • NPS / CSAT / SUS (Np): Standardized satisfaction instruments are often embedded within surveys. The survey provides context that makes the score interpretable.
  • Persona Building (Ps): Survey data on goals, frustrations, behaviors, and demographics feeds directly into persona creation, grounding personas in quantitative evidence.

Example from practice

A B2B SaaS company offering project management software noticed that trial-to-paid conversion had dropped from 12% to 8% over two quarters. The product team suspected that the onboarding flow was the problem, but they had no data on what specifically frustrated trial users. Running interviews with 50+ churned trial users was impractical given the timeline, so the team designed a survey targeting users who signed up for a trial but did not convert.

The survey — 18 questions, estimated 7 minutes — included Likert scales measuring satisfaction with specific onboarding steps, a multiple-choice question about the primary reason for not converting, and two open-ended questions. The team distributed it via email to 2,400 former trial users and received 312 completed responses (13% response rate).

Analysis revealed three findings the team had not expected. First, 41% of respondents selected “I couldn’t figure out how to set up my first project” as their primary reason for not converting — a step the team considered straightforward. Second, the Likert data showed that satisfaction with the “invite teammates” step was significantly lower than all other steps (mean 2.8 vs. 4.1 on a 5-point scale). Third, open-ended responses repeatedly mentioned that the product felt “empty” during the solo trial because the collaboration features only work with a team. The product team redesigned onboarding with a guided project setup wizard, a “try with sample data” mode, and a simplified team invitation flow. The next quarter’s trial-to-paid conversion recovered to 11.5%.

Beginner mistakes

Writing questions before defining research questions

The most common mistake is jumping into question writing without articulating what decisions the data will inform. Before writing a single question, write down the research questions and the decisions they support. If a question does not map to a decision, do not include it.

Using leading or double-barreled questions

“How satisfied are you with our excellent new feature?” is leading. “How satisfied are you with the speed and reliability?” is double-barreled. Both are common because they are invisible to the writer. The fix is a pilot test with cognitive interviews.

Making the survey too long

Every additional question costs completions. After 10 minutes, drop-off accelerates. A 10-question survey with a 70% completion rate produces more useful data than a 40-question survey with a 20% rate.

Relying too heavily on open-ended questions

Open-ended questions are skipped frequently, produce thin responses on mobile, and take significantly longer to analyze. If your survey has more than 2-3, consider whether interviews might be more appropriate.

Skipping the pilot test

A five-person pilot catches confusing wording, missing response options, broken logic, and mobile display issues before they corrupt real data. Skipping it to save a day often costs the entire dataset.

AI prompts for this method

4 ready-to-use AI prompts with placeholders — copy-paste and fill in with your context. See all prompts for UX surveys →.