How to run cohort analysis: track retention and measure product improvement
What is cohort analysis?
Cohort analysis is a quantitative research method that groups users by a shared characteristic — most commonly the date they signed up — and tracks how each group’s behavior changes over time. Instead of looking at aggregate retention that blends new users with veterans, cohort analysis separates users into distinct groups and measures each group’s engagement independently. This reveals whether the product is getting better at retaining new users, whether a specific change improved early retention, or whether certain acquisition channels bring users who stay longer.
What question does it answer?
- Is the product getting better at retaining new users over time?
- How does the shape of the retention curve differ between cohorts?
- Which activation events predict long-term retention?
- Does a specific product change show up as improvement in the cohorts that experienced it?
- Which acquisition channels produce users with the highest long-term retention?
- At what point in the lifecycle does the biggest engagement drop happen?
When to use
- When the team needs to measure retention honestly — aggregate DAU/MAU can grow while retention declines if acquisition compensates for churn.
- When evaluating a product change’s impact on retention by comparing cohorts before and after.
- When diagnosing where in the lifecycle engagement breaks down.
- When comparing quality of different user segments or acquisition channels.
- When building a business case for product investment by quantifying retention improvement.
- When setting retention targets using recent cohort curves as baselines.
Not the right method when the product has very few users (aim for 50-100+ per cohort). It does not explain why users churn — pair it with qualitative methods. It also requires consistent event tracking over time.
What you get (deliverables)
- Cohort retention table (rows: cohorts, columns: time periods, cells: retention %).
- Retention curves for visual comparison across cohorts.
- Day-1, day-7, day-30 benchmarks compared to industry and historical data.
- Behavioral cohort comparison (e.g., completed onboarding vs. skipped).
- Acquisition channel retention comparison.
- Improvement trend showing how retention metrics change across cohorts.
Participants and duration
- Participants: No recruited participants — uses live product data. At least 50-100 per cohort; 500+ for segment analysis.
- Data window: 3-6 months minimum for meaningful cross-cohort patterns.
- Setup time: 1-3 days with existing tracking; 1-2 weeks for new instrumentation.
- Analysis time: 1-3 days for focused review.
How to run cohort analysis (step-by-step)
1. Define the cohort type and grouping period
Acquisition cohorts group by signup date (most common). Behavioral cohorts group by a specific action. Choose weekly or monthly grouping based on user volume and product rhythm.
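A minimal sketch of weekly acquisition cohorting with pandas. The column names (`user_id`, `signup_date`) are illustrative, not from any specific tool:

```python
import pandas as pd

# Hypothetical signup records.
signups = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "signup_date": pd.to_datetime(
        ["2024-01-02", "2024-01-04", "2024-01-09", "2024-01-15"]
    ),
})

# Truncate each signup date to the Monday of its week;
# that week label becomes the user's acquisition cohort.
signups["cohort_week"] = signups["signup_date"].dt.to_period("W").dt.start_time

print(signups["cohort_week"].dt.strftime("%Y-%m-%d").tolist())
# → ['2024-01-01', '2024-01-01', '2024-01-08', '2024-01-15']
```

Switching the period alias from `"W"` to `"M"` gives monthly cohorts when weekly groups would be too small.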
2. Define the retention event
What counts as “retained”? Any session (broad) or a core action like completing a task (narrow and more meaningful). Document the definition precisely — changing it later invalidates comparisons.
3. Set up the cohort report
Configure in your analytics tool with cohort dimension, retention event, and time intervals (Day 0, 1, 3, 7, 14, 30, 60, 90). Run for the past 3-6 months.
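If you work from raw event data rather than a packaged analytics tool, the same table can be computed directly. A sketch with pandas, using an illustrative event log where each row is one user-day of activity and each user's first active day stands in for the signup date:

```python
import pandas as pd

# Hypothetical event log: one row per user per active day.
events = pd.DataFrame({
    "user_id":   [1, 1, 1, 2, 2, 3],
    "event_day": pd.to_datetime(
        ["2024-01-01", "2024-01-02", "2024-01-08",
         "2024-01-01", "2024-01-02",
         "2024-01-08"]
    ),
})

# Cohort = each user's first active day; day_n = days since joining.
events["cohort"] = events.groupby("user_id")["event_day"].transform("min")
events["day_n"] = (events["event_day"] - events["cohort"]).dt.days

# Rows: cohorts; columns: days since signup; cells: distinct active users.
active = events.pivot_table(index="cohort", columns="day_n",
                            values="user_id", aggfunc="nunique")

# Divide by each cohort's day-0 size to get the retention percentage.
retention = active.div(active[0], axis=0)
print(retention)
```

The result is exactly the table described above: each row decays left to right, and empty cells mean no user in that cohort returned at that offset.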
4. Read the retention table
Read it from three angles: across each row (how a single cohort decays over time), down each column (whether the same time point is improving for newer cohorts), and for anomalies (cohorts with unusually high or low retention that correlate with releases, campaigns, or seasonality).
5. Compare behavioral cohorts
Split by key activation events. Users who complete onboarding vs. those who skip — if the retention gap is large, that event is a strong retention predictor.
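The comparison is a simple grouped retention rate. A sketch with illustrative field names and data:

```python
import pandas as pd

# Hypothetical per-user flags: did they complete onboarding,
# and were they still active on day 30?
users = pd.DataFrame({
    "completed_onboarding": [True, True, True, False, False, False, False, False],
    "active_day_30":        [True, True, False, False, False, True, False, False],
})

# Day-30 retention rate for each behavioral cohort.
gap = users.groupby("completed_onboarding")["active_day_30"].mean()
print(gap)
```

A large spread between the two rates (here roughly 67% vs. 20%) marks onboarding completion as a strong retention predictor worth investigating.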
6. Compare acquisition channels
Retention by traffic source shows which channels bring durable users, even if some channels cost more per acquisition.
7. Measure product change impact
Identify the first cohort to experience a change and compare its retention curve with that of the cohort immediately before it.
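The before/after comparison reduces to a per-checkpoint difference. A sketch with illustrative retention values:

```python
# Retention at each checkpoint (days since signup) for the cohort
# before the change and the first cohort exposed to it.
checkpoints = [1, 7, 30]
before = {1: 0.60, 7: 0.35, 30: 0.22}
after  = {1: 0.68, 7: 0.41, 30: 0.31}

# Absolute retention change, in points, at each checkpoint.
lift = {d: round(after[d] - before[d], 2) for d in checkpoints}
print(lift)  # → {1: 0.08, 7: 0.06, 30: 0.09}
```

A lift concentrated at early checkpoints suggests the change improved activation; a lift that persists to day 30 suggests durable habit formation.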
8. Report and set targets
Present retention curves, behavioral comparisons, and a clear assessment. Set targets for upcoming cohorts.
How AI changes this method
AI compatibility: full — AI automates cohort segmentation, retention curve generation, anomaly detection, and churn prediction.
What AI can do
- Auto-detect retention inflection points and identify which behavioral events correlate with long-term retention.
- Predict churn risk per user using ML models trained on cohort data.
- Generate natural-language retention summaries from cohort tables.
- Alert on anomalies when a new cohort’s curve deviates from historical norms.
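The anomaly-alert idea in the last bullet can be sketched without any ML: flag a new cohort whose early retention falls outside the historical band. The threshold of two standard deviations and the numbers below are illustrative assumptions:

```python
import statistics

# Day-7 retention of recent complete cohorts (hypothetical values).
historical_day7 = [0.34, 0.31, 0.33, 0.30, 0.32]
new_cohort_day7 = 0.22

mean = statistics.mean(historical_day7)
stdev = statistics.stdev(historical_day7)

# Alert when the new cohort deviates by more than two standard deviations.
is_anomaly = abs(new_cohort_day7 - mean) > 2 * stdev
print(is_anomaly)  # → True
```

Production tools use richer models, but the principle is the same: compare each new cohort's early curve against the distribution of its predecessors.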
What requires a human researcher
- Defining what “retained” means — a product strategy decision.
- Interpreting why retention changed — requires context beyond the data.
- Setting realistic targets based on industry benchmarks and business strategy.
- Deciding where to invest when multiple factors correlate with retention.
AI-enhanced workflow
Before AI, cohort analysis was a monthly exercise by a data analyst. With AI-enhanced analytics, retention monitoring is continuous — the tool generates tables for each new cohort, compares early data against historical patterns, and alerts the team if a cohort tracks below expectations.
Tools
Analytics with cohorts: Amplitude, Mixpanel, GA4, PostHog, Heap, CleverTap.
SaaS retention: Userpilot, Appcues, ChartMogul.
Data warehousing: BigQuery, Snowflake, Redshift.
AI/predictive: Amplitude Predict, Mixpanel Spark, ChatGPT / Claude.
Visualization: Looker Studio, Tableau, Metabase.
Works well with
- Analytics / Clickstream (An): Analytics provides the event data cohort analysis is built on.
- Funnel Analysis (Fa): Funnels measure conversion through a flow; cohorts measure whether engagement persists over time.
- Survey (Sv): Cohort analysis shows when users churn; a survey asks why.
- In-depth Interview (Di): Interviews with users from a low-retention cohort reveal the cause.
- NPS / CSAT / SUS (Np): Tracking satisfaction by cohort connects subjective experience to behavioral retention.
Example from practice
A project management SaaS had 25,000 MAU, but cohort analysis revealed day-30 retention was declining: 34% (January), 28% (March), 22% (May). Growing acquisition masked the decline. The steepest drop occurred between day 1 and day 3, with 55-60% never returning after their first session.
Behavioral comparison showed users who created a project in their first session retained at 48% on day 30, vs. 11% for those who did not. The team redesigned onboarding to guide first-session project creation. The June cohort showed 31% day-30 retention, reversing the trend, and the day-1-to-3 drop narrowed from 55% to 40%.
Beginner mistakes
Looking at aggregate retention instead of cohorts
Aggregate 30-day retention blends dozens of cohorts. Growing acquisition can mask declining retention. Always look per cohort.
Using too small a cohort size
A 20-person cohort produces wildly swinging percentages. Use 50+ for trends, 200+ for segment comparisons, or switch to monthly grouping.
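The swing is predictable from sampling noise. Treating retention as a binomial proportion, its standard error shrinks with cohort size; a quick sketch (the 30% baseline rate is illustrative):

```python
import math

def retention_standard_error(rate: float, n: int) -> float:
    """Approximate standard error of an observed retention proportion."""
    return math.sqrt(rate * (1 - rate) / n)

# ±2 SE band, in percentage points, around a true 30% retention rate.
for n in (20, 50, 200):
    se = retention_standard_error(0.30, n)
    print(n, round(2 * se * 100, 1))
```

With 20 users the band is roughly ±20 points, so a "34% vs. 26%" difference between cohorts is pure noise; at 200 users it tightens to about ±6.5 points.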
Changing the retention event definition
If “retained” changed meaning between periods, the data is not comparable. Keep the definition consistent.
Confusing correlation with causation
Users who complete onboarding have higher retention — but motivated users do both. Test causation with A/B experiments.
Ignoring incomplete cohorts
The most recent cohort’s day-30 data is only available 30 days after signup. Missing data looks like a retention drop. Always mark incomplete data points.
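One way to enforce this in reporting code is to null out cells the cohort has not lived long enough to earn. A minimal sketch, with hypothetical function and field names:

```python
from datetime import date

def mask_incomplete(cohort_start: date, today: date, cells: dict) -> dict:
    """Replace day-N retention with None when day N has not elapsed yet."""
    age_days = (today - cohort_start).days
    return {day: (value if day <= age_days else None)
            for day, value in cells.items()}

# A 14-day-old cohort: day-1 and day-7 are valid, day-30 is not yet known.
cells = {1: 0.62, 7: 0.38, 30: 0.25}
print(mask_incomplete(date(2024, 6, 1), date(2024, 6, 15), cells))
# → {1: 0.62, 7: 0.38, 30: None}
```

Rendering `None` as blank (rather than zero) keeps incomplete cohorts from reading as a retention collapse.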