
How to conduct in-depth interviews: a practical guide with AI prompts

In-depth interviews are one-on-one semi-structured conversations between a researcher and a participant. The goal is to understand motivations, behaviors, pain points, and decision-making processes that surveys and analytics cannot reveal. The researcher follows a discussion guide but adapts in real time, probing deeper where the conversation is most productive.

This method works best when you need to understand the “why” behind user behavior — at the beginning of a project, after quantitative data has raised questions, or when exploring sensitive topics that people would not discuss in a group.

What you learn

  • Why users behave a certain way and what drives their decisions
  • How users experience a problem today, including workarounds they have developed
  • What unmet needs exist that the current product does not address
  • Where the gap lies between what users report and what they actually do

What you get

  • Session recordings and transcripts (audio, video, and text)
  • Affinity diagrams with clustered themes from across interviews
  • Key insight documents: patterns, contradictions, and unexpected findings with supporting quotes
  • Persona inputs, journey map raw material, or jobs-to-be-done frameworks
  • Research-backed recommendations for product or design decisions

When to use (and when not to)

Use in-depth interviews when:

  • You are at the beginning of a project and need to understand the problem space before designing solutions
  • Quantitative data (analytics, surveys) has shown a pattern but has not explained why it exists
  • The topic is sensitive or complex — something users would not discuss openly in a group
  • You are building or validating personas, journey maps, or JTBD frameworks
  • Existing assumptions about user needs have not been tested with real users

Skip in-depth interviews when you need statistically representative data, when you already understand the “why” and need to measure “how many,” or when time permits only a quick validation (consider concept testing or a survey instead).

Participants and timing

Participants: 5–8 users per segment. Saturation typically occurs at 5–6 interviews for a homogeneous group — after that point, new themes become rare. For two distinct segments, plan for 10–16 participants total.

Session length: 45–60 minutes per interview. Shorter sessions stay on the surface; longer sessions cause fatigue and lower the quality of responses.

Total timeline: 1–3 weeks, including:

  • Recruitment: 3–5 days
  • Interviews: 3–5 days (no more than 2–3 sessions per day to avoid researcher fatigue)
  • Analysis and synthesis: 2–4 days

How to conduct in-depth interviews

1. Define the research objective

Write one to three specific questions this study must answer. Avoid vague goals like “understand users better.” Instead: “Why do small business owners abandon the expense reporting flow at the receipt upload step?”

2. Write screening criteria

Define who qualifies as a participant. Prioritize behavioral criteria (has used the product in the last 30 days, has completed at least 3 transactions) over demographics alone. Exclude people who work in your industry — they respond differently than actual users.
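
To make the behavioral-first logic concrete, here is a minimal Python sketch of a screener check; the field names and thresholds are hypothetical placeholders, not recommendations for any particular product.

    # Hypothetical screener check: behavioral criteria first, with an
    # industry-insider disqualifier. Field names and thresholds are
    # placeholders for illustration only.
    def qualifies(answers: dict) -> bool:
        # Disqualify industry insiders first: they answer differently
        # than actual users.
        if answers.get("works_in_industry", False):
            return False
        # Behavioral criteria: recent and real usage of the product.
        used_recently = answers.get("days_since_last_use", 999) <= 30
        enough_activity = answers.get("transactions_completed", 0) >= 3
        return used_recently and enough_activity

    print(qualifies({"days_since_last_use": 12, "transactions_completed": 5}))  # True
    print(qualifies({"works_in_industry": True,
                     "days_since_last_use": 12, "transactions_completed": 5}))  # False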

3. Recruit participants

Sources: existing customer database, user research panels (User Interviews, Respondent, Ethnio), social media, or site intercepts. Offer incentives appropriate to your audience — $50–150 for consumer studies, $150–300+ for B2B specialists. Over-recruit by 20–30% to account for no-shows.

4. Write the discussion guide

Structure the guide into four sections:

Warm-up (5 minutes). Build rapport by asking about the participant’s role, daily routine, or relationship to the topic. No product-specific questions yet.

Context setting (5–10 minutes). Understand the broader situation. “Walk me through a typical day when you need to [task related to your research question].”

Core exploration (25–35 minutes). The main research questions. Use open-ended questions grounded in past behavior, not hypothetical scenarios. Instead of “Would you use feature X?” ask “Tell me about the last time you tried to solve [problem]. What happened?”

For each core question, prepare follow-up probes: “Can you tell me more about that?”, “What happened next?”, “Why was that important to you?”

Wrap-up (5 minutes). Close with “Is there anything I should have asked but didn’t?” This question often produces the most candid responses.
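
This structure is also a good target for the AI-assisted drafting covered later in this guide. As a minimal sketch, assuming the OpenAI Python SDK and an API key in the environment (the model name is a placeholder, and the objective is borrowed from step 1):

    # Minimal sketch: drafting a discussion guide with an LLM.
    # Assumes the OpenAI Python SDK (v1+) and OPENAI_API_KEY set in the
    # environment; the model name is a placeholder.
    from openai import OpenAI

    client = OpenAI()

    objective = ("Why do small business owners abandon the expense "
                 "reporting flow at the receipt upload step?")

    prompt = f"""You are an experienced UX researcher.
    Draft a 45-60 minute in-depth interview discussion guide for this
    research objective: {objective}

    Use four sections with timings:
    1. Warm-up (5 min): rapport only, no product questions.
    2. Context setting (5-10 min): the participant's broader situation.
    3. Core exploration (25-35 min): open-ended questions grounded in
       past behavior, never hypotheticals, each with 2-3 follow-up probes.
    4. Wrap-up (5 min): end with "Is there anything I should have asked?"
    """

    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; use whatever model you have access to
        messages=[{"role": "user", "content": prompt}],
    )
    print(response.choices[0].message.content)  # a draft to edit, not to ship

Treat the output as raw material: your domain knowledge decides what stays in the guide.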

5. Pilot the guide

Run one or two pilot interviews with colleagues or friendly users. Check whether the flow feels natural, whether questions are clear, and whether the timing is realistic. Revise before the first real session.

6. Conduct the interviews

  • Start recording after receiving verbal consent.
  • Follow the guide but stay flexible — if a participant raises something unexpected and relevant, explore it.
  • Aim for the participant to speak 70–80% of the time.
  • Take brief notes on key moments, body language, and emotional reactions.
  • Use neutral probes rather than expressing agreement or disagreement: “That’s interesting, tell me more.”

7. Debrief after each session

Within 15 minutes of each interview, write down the top three things that surprised you, any new questions that emerged, and any patterns you are starting to notice across sessions.

8. Analyze and synthesize

  • Transcribe interviews using Otter.ai, Dovetail, or a similar tool.
  • Code transcripts by marking recurring themes, contradictions, and emotional moments (a minimal tallying sketch follows this list).
  • Group codes into themes and themes into insights using an affinity diagram.
  • Write insights as “observation + implication” statements: “Users store receipts in email rather than in the app because they do not trust the app to retain data long-term. This means trust-building features should take priority over convenience features.”
  • Select 5–8 representative quotes to support each insight.
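
To make the coding tally concrete, here is a minimal Python sketch that counts coded excerpts across transcripts; the codes and participant IDs are hypothetical. A code spread across many participants is a theme candidate, and a shrinking number of new codes per interview is evidence of saturation.

    # Minimal sketch: tallying coded excerpts across transcripts.
    # Codes and participant IDs are hypothetical placeholders; real
    # coding usually lives in a tool like Dovetail.
    from collections import Counter, defaultdict

    coded_excerpts = [  # (participant_id, code) pairs from coding
        ("P1", "distrusts data retention"), ("P1", "email as backup"),
        ("P2", "email as backup"), ("P3", "distrusts data retention"),
        ("P4", "email as backup"), ("P5", "distrusts data retention"),
    ]

    frequency = Counter(code for _, code in coded_excerpts)
    reach = defaultdict(set)
    for pid, code in coded_excerpts:
        reach[code].add(pid)

    for code, count in frequency.most_common():
        print(f"{code}: {count} excerpts, {len(reach[code])} participants")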

9. Share findings

Lead your report with insights and recommendations, not with a description of your process. Stakeholders want to know what to do next, not how many interviews you ran.

How AI changes this method

AI compatibility: partial — AI can handle preparation (discussion guide drafting, screening questionnaire creation) and post-interview analysis (transcription, thematic coding, insight extraction) but cannot replace the live conversation between researcher and participant. The value of in-depth interviews lies in real-time rapport, follow-up probes, and reading non-verbal cues — all of which require a human researcher in the session.

What AI can do

  • Transcription and speaker separation: Tools like Otter.ai and Speak AI convert recorded interviews to text in minutes, replacing hours of manual transcription.
  • Discussion guide drafting: Given a research objective and target audience, an LLM can generate a structured guide with open-ended questions, probes, and timing suggestions. The researcher then edits based on domain knowledge.
  • Screening questionnaire generation: An LLM can draft behavioral screening criteria and disqualifying questions based on study goals, saving 30–60 minutes of manual work.
  • Thematic coding across transcripts: After interviews are transcribed, AI can identify recurring themes, tag quotes by topic, and flag contradictions between participants (a minimal sketch follows this list).
  • Insight statement drafting: Given coded data, an LLM can draft observation-implication statements that the researcher then validates against the raw data.
  • Report and presentation drafting: AI can structure findings into stakeholder-ready formats, including executive summaries and prioritized recommendations.
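
As a concrete example of the analysis tasks above, here is a minimal sketch of LLM-assisted thematic coding with the OpenAI Python SDK; the model name and file name are placeholders, and the output is a set of hypotheses to verify against the raw transcript, not findings.

    # Minimal sketch: LLM-assisted thematic coding of one transcript.
    # Assumes the OpenAI Python SDK (v1+); model and file names are
    # placeholders for whatever your stack actually uses.
    from openai import OpenAI

    client = OpenAI()
    transcript = open("transcript.txt").read()  # export from your transcription tool

    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{
            "role": "user",
            "content": (
                "Code this user interview transcript. List recurring themes, "
                "give 2-3 verbatim supporting quotes per theme, and flag any "
                "contradictions in the participant's answers. Do not "
                "paraphrase quotes.\n\n" + transcript
            ),
        }],
    )
    print(response.choices[0].message.content)  # hypotheses, pending human validation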

What requires a human researcher

  • Conducting the interview itself: Real-time rapport, empathy, silence management, and reading body language cannot be automated. The quality of the data depends on the interviewer’s ability to follow unexpected threads and ask the right follow-up at the right moment.
  • Interpreting context and nuance: AI can identify that five participants mentioned “trust” but cannot determine whether they mean trust in the product, trust in the company, or trust in their own ability.
  • Ethical judgment during sessions: Deciding when to stop probing a sensitive topic, recognizing participant discomfort, and adjusting the approach in real time require human judgment.
  • Final insight validation: AI-generated themes and insights are hypotheses. The researcher must verify them against the raw data and their own session notes before presenting them as findings.

AI-enhanced workflow

The most significant change AI brings to in-depth interviews is compressing the analysis phase. Before AI tools, a researcher who conducted 8 interviews would spend 2–3 days on transcription alone, followed by another 2–3 days on manual coding and synthesis. With AI transcription and assisted coding, that same analysis can happen in 1–2 days — freeing time for more interviews or deeper interpretation.

The preparation phase also benefits: instead of writing a discussion guide from scratch (typically 2–4 hours), a researcher can prompt an LLM to generate a draft in 10 minutes (as sketched in step 4 above), then spend an hour refining it.

The one part of the workflow AI does not change is the interview itself. There is no AI shortcut for sitting across from a person and asking “Tell me more about that.” The 45–60 minutes of live conversation remains entirely human, and this is where the method’s value is generated.

Tools

  • Recording and transcription: Zoom (built-in recording), Otter.ai (real-time transcription), Rev (human-reviewed transcription)
  • Recruitment: User Interviews, Respondent, Ethnio, Calendly (scheduling)
  • Analysis and synthesis: Dovetail (coding, tagging, insight repository), Miro (affinity diagrams), EnjoyHQ, Notably
  • Note-taking: Notion, Google Docs (timestamped notes during sessions)
  • AI-assisted analysis: Speak AI (transcription with automated theme detection), Marvin (research repository with AI search)

Common mistakes

Asking leading or hypothetical questions

“Don’t you think the dashboard is confusing?” leads the participant to agree. “Would you use feature X?” invites speculation rather than reality. Ground every question in past behavior: “Tell me about the last time you tried to find [information] in the product. What happened?”

Talking more than listening

New interviewers often fill silences, share their own experiences, or rephrase what the participant said. The result is that you hear your own assumptions reflected back. Practice the 5-second rule: after the participant finishes speaking, count to five before saying anything. Often they will continue with the most important part of their answer.

Skipping the pilot

Running the discussion guide for the first time with a real participant wastes one of your limited sessions. Questions that looked clear on paper may confuse users or take twice the expected time. Always run at least one pilot — with a colleague, a friendly user, or someone adjacent to the target audience.

Interviewing without synthesizing along the way

Some teams run 15–20 interviews before analyzing any of them. By interview 12, themes are repeating but the researcher has no structured notes to prove saturation. Debrief after every session. Start coding after interview 3. You will recognize saturation when new interviews confirm existing themes rather than introducing new ones.

Treating quotes as insights

A quote is evidence, not an insight. “I hate the search” is a data point. The insight is: “Users rely on search as a primary navigation path, but the current search does not support the terms they use internally, which causes repeated zero-result queries.” Write every insight as an observation paired with its implication.

Works well with

  • Surveys: Run interviews first to identify themes, then use a survey to measure how common each theme is across the user base.
  • Journey mapping: Interview data provides the raw material for journey maps — touchpoints, emotions, and friction points taken directly from user stories.
  • Persona building: Interviews reveal behavioral patterns, goals, and frustrations that form the foundation of evidence-based personas.
  • Concept testing: After interviews surface unmet needs, test early concepts with the same or similar users to validate direction.
  • Diary studies: A diary study captures behavior over time; follow-up interviews explain the reasoning behind what the diary entries recorded.

Example

A B2B SaaS company saw that 40% of trial users never completed onboarding. Analytics showed they dropped off at the “connect your data source” step, and the product team assumed the integration was technically difficult.

Seven in-depth interviews with users who had abandoned onboarding told a different story: users understood how to connect the data source, but they were afraid of exposing internal company data to a tool they had not yet evaluated. The trust barrier came before the technical one.

The team added a sandbox mode with sample data so that users could explore the product before connecting anything real. Trial-to-paid conversion increased by 28% over two quarters.