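The core of SQL funnel analysis is counting the distinct users at each step, in order and within a time window, and computing conversion rates. You need to define the step events, the user identifier, and the time range; identify behavior sequences with JOINs or conditional aggregation; and pay attention to deduplication, ordering checks, and session splitting.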

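At its core, funnel analysis in SQL counts the distinct users who reach each step of a behavior path and computes the conversion rate between adjacent steps. The key is to use JOINs, or conditional aggregation (with window functions where needed), to identify and consolidate the order of one user's actions across multiple events.
Define the funnel steps and time constraints
A funnel is not a simple count of each event. It requires users to complete a series of actions in order and within a reasonable time window (for example: view product → add to cart → submit order → payment succeeded). Define the following up front: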
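- The event name or page path for each step (e.g. event_name IN ('view_product', 'add_to_cart', 'submit_order', 'pay_success'))
- The user identifier column (usually user_id; take care to distinguish anonymous IDs from logged-in IDs)
- The time range (e.g. the last 7 days) and the maximum span of a single session (e.g. a path only counts if it is completed within 24 hours)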
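As a concrete reference for these definitions, here is a minimal sketch of an event table with a few sample rows, assuming PostgreSQL. The table name events and its three columns are the ones the queries in this article use; the column types and the sample data are assumptions made for illustration only.

```sql
-- Hypothetical schema: the column types are assumptions, not a required design.
CREATE TABLE events (
    user_id    BIGINT      NOT NULL,  -- stable user identifier
    event_name TEXT        NOT NULL,  -- one of the funnel step names
    event_time TIMESTAMPTZ NOT NULL   -- when the event happened
);

-- Sample data, relative to the current time so it falls inside a 7-day range:
--   user 1 completes all four steps,
--   user 2 stops after adding to cart,
--   user 3 only views a product.
INSERT INTO events (user_id, event_name, event_time) VALUES
    (1, 'view_product', NOW() - INTERVAL '3 hours'),
    (1, 'add_to_cart',  NOW() - INTERVAL '2 hours'),
    (1, 'submit_order', NOW() - INTERVAL '1 hour'),
    (1, 'pay_success',  NOW() - INTERVAL '30 minutes'),
    (2, 'view_product', NOW() - INTERVAL '5 hours'),
    (2, 'add_to_cart',  NOW() - INTERVAL '4 hours'),
    (3, 'view_product', NOW() - INTERVAL '1 hour');

-- Distinct users per event, ignoring order: a quick sanity check
-- before building the ordered funnel.
SELECT event_name, COUNT(DISTINCT user_id) AS users
FROM events
WHERE event_name IN ('view_product', 'add_to_cart', 'submit_order', 'pay_success')
  AND event_time >= NOW() - INTERVAL '7 days'
GROUP BY event_name;
```

On this sample the check returns 3, 2, 1, and 1 users for the four events. Because every user's events are already in funnel order, an ordered four-step funnel over the same rows gives the same counts.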
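Match user behavior sequences with self-joins
For a multi-step path by the same user, you can chain several LEFT JOINs to filter level by level. Taking a four-step funnel as an example: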
-- PostgreSQL syntax (NOW(), INTERVAL '7 days'); adapt the date arithmetic for other databases.
-- One CTE per step: every (user, timestamp) pair for that event in the last 7 days.
WITH step1 AS (
    SELECT DISTINCT user_id, event_time AS t1
    FROM events
    WHERE event_name = 'view_product'
      AND event_time >= NOW() - INTERVAL '7 days'
),
step2 AS (
    SELECT DISTINCT user_id, event_time AS t2
    FROM events
    WHERE event_name = 'add_to_cart'
      AND event_time >= NOW() - INTERVAL '7 days'
),
step3 AS (
    SELECT DISTINCT user_id, event_time AS t3
    FROM events
    WHERE event_name = 'submit_order'
      AND event_time >= NOW() - INTERVAL '7 days'
),
step4 AS (
    SELECT DISTINCT user_id, event_time AS t4
    FROM events
    WHERE event_name = 'pay_success'
      AND event_time >= NOW() - INTERVAL '7 days'
)
SELECT
    -- DISTINCT collapses the join fan-out from users with several events per step.
    COUNT(DISTINCT s1.user_id) AS step1_cnt,
    COUNT(DISTINCT s2.user_id) AS step2_cnt,
    COUNT(DISTINCT s3.user_id) AS step3_cnt,
    COUNT(DISTINCT s4.user_id) AS step4_cnt,
    -- NULLIF avoids division by zero when the previous step has no users.
    ROUND(COUNT(DISTINCT s2.user_id) * 100.0 / NULLIF(COUNT(DISTINCT s1.user_id), 0), 2) AS rate_1to2,
    ROUND(COUNT(DISTINCT s3.user_id) * 100.0 / NULLIF(COUNT(DISTINCT s2.user_id), 0), 2) AS rate_2to3,
    ROUND(COUNT(DISTINCT s4.user_id) * 100.0 / NULLIF(COUNT(DISTINCT s3.user_id), 0), 2) AS rate_3to4
FROM step1 s1
-- Each step must come after the previous one and within 24 hours of it
-- (a per-step window, not a cap on the whole path).
LEFT JOIN step2 s2
    ON s1.user_id = s2.user_id
   AND s2.t2 > s1.t1
   AND s2.t2 <= s1.t1 + INTERVAL '24 hours'
LEFT JOIN step3 s3
    ON s2.user_id = s3.user_id
   AND s3.t3 > s2.t2
   AND s3.t3 <= s2.t2 + INTERVAL '24 hours'
LEFT JOIN step4 s4
    ON s3.user_id = s4.user_id
   AND s4.t4 > s3.t3
   AND s4.t4 <= s3.t3 + INTERVAL '24 hours';
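Note: the JOIN conditions include both an ordering check and a time-window limit, so that out-of-order events and events too far apart in time are not counted as conversions.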
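Joining every event of one step to every event of the next can produce many intermediate rows for active users. One possible variation, sketched here under the assumption of PostgreSQL and the same events table, keeps a single row per user per step by chaining "earliest qualifying event" CTEs. The CTE names s1 to s4 are illustrative. Note the trade-off: this version anchors on each user's first view in the range, so a user whose first view did not convert within 24 hours but whose later view did is not counted at step 2, whereas the all-pairs self-join counts that user.

```sql
-- Sketch: one row per user per step.
-- Each CTE keeps the earliest event that follows the previous step within 24 hours.
WITH s1 AS (
    SELECT user_id, MIN(event_time) AS t1
    FROM events
    WHERE event_name = 'view_product'
      AND event_time >= NOW() - INTERVAL '7 days'
    GROUP BY user_id
),
s2 AS (
    SELECT s1.user_id, MIN(e.event_time) AS t2
    FROM s1
    JOIN events e
      ON e.user_id = s1.user_id
     AND e.event_name = 'add_to_cart'
     AND e.event_time > s1.t1
     AND e.event_time <= s1.t1 + INTERVAL '24 hours'
    GROUP BY s1.user_id
),
s3 AS (
    SELECT s2.user_id, MIN(e.event_time) AS t3
    FROM s2
    JOIN events e
      ON e.user_id = s2.user_id
     AND e.event_name = 'submit_order'
     AND e.event_time > s2.t2
     AND e.event_time <= s2.t2 + INTERVAL '24 hours'
    GROUP BY s2.user_id
),
s4 AS (
    SELECT s3.user_id, MIN(e.event_time) AS t4
    FROM s3
    JOIN events e
      ON e.user_id = s3.user_id
     AND e.event_name = 'pay_success'
     AND e.event_time > s3.t3
     AND e.event_time <= s3.t3 + INTERVAL '24 hours'
    GROUP BY s3.user_id
)
-- Each CTE already holds one row per user, so COUNT(*) is a distinct-user count.
SELECT
    (SELECT COUNT(*) FROM s1) AS step1_cnt,
    (SELECT COUNT(*) FROM s2) AS step2_cnt,
    (SELECT COUNT(*) FROM s3) AS step3_cnt,
    (SELECT COUNT(*) FROM s4) AS step4_cnt;
```

Because each step is built only from users present in the previous step, the counts cannot increase from one step to the next.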
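A simpler version with conditional aggregation (recommended)
A more concise and efficient approach is to first flag, per user, whether each step was completed, and then aggregate. It suits funnels with many steps or steps that change often: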
WITH user_steps AS (
    SELECT
        user_id,
        -- 1 if the user has at least one event of that type in the range, else 0
        MAX(CASE WHEN event_name = 'view_product' THEN 1 ELSE 0 END) AS has_view,
        MAX(CASE WHEN event_name = 'add_to_cart'  THEN 1 ELSE 0 END) AS has_cart,
        MAX(CASE WHEN event_name = 'submit_order' THEN 1 ELSE 0 END) AS has_order,
        MAX(CASE WHEN event_name = 'pay_success'  THEN 1 ELSE 0 END) AS has_pay,
        -- Optional: earliest occurrence of each step, used to validate ordering
        MIN(CASE WHEN event_name = 'view_product' THEN event_time END) AS t_view,
        MIN(CASE WHEN event_name = 'add_to_cart'  THEN event_time END) AS t_cart,
        MIN(CASE WHEN event_name = 'submit_order' THEN event_time END) AS t_order,
        MIN(CASE WHEN event_name = 'pay_success'  THEN event_time END) AS t_pay
    FROM events
    WHERE event_name IN ('view_product','add_to_cart','submit_order','pay_success')
      AND event_time >= NOW() - INTERVAL '7 days'
    GROUP BY user_id
    -- Keep only users who entered the funnel (at least one product view)
    HAVING
        MIN(CASE WHEN event_name = 'view_product' THEN event_time END) IS NOT NULL
),
valid_paths AS (
    -- Order check on the earliest timestamps only; no 24-hour window is enforced here.
    -- Users who fail the check are dropped from every step, including step 1.
    -- A later step without the preceding one (e.g. an order with no cart event)
    -- also fails, because a comparison with NULL is not true.
    SELECT *
    FROM user_steps
    WHERE
        (t_cart  IS NULL OR t_cart  > t_view) AND
        (t_order IS NULL OR t_order > t_cart) AND
        (t_pay   IS NULL OR t_pay   > t_order)
)
SELECT
    -- One row per user at this point, so COUNT(*) is already a distinct-user count.
    COUNT(*)       AS step1_cnt,
    SUM(has_cart)  AS step2_cnt,
    SUM(has_order) AS step3_cnt,
    SUM(has_pay)   AS step4_cnt,
    ROUND(SUM(has_cart)  * 100.0 / NULLIF(COUNT(*), 0), 2)       AS rate_1to2,
    ROUND(SUM(has_order) * 100.0 / NULLIF(SUM(has_cart), 0), 2)  AS rate_2to3,
    ROUND(SUM(has_pay)   * 100.0 / NULLIF(SUM(has_order), 0), 2) AS rate_3to4
FROM valid_paths;
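This approach is easy to follow and easy to extend with more steps, and it also makes it straightforward to add dimensions later (for example, grouping by channel or device).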
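When the order check is applied as a row filter, a user with out-of-order timestamps disappears from the funnel entirely. A possible alternative, sketched here under the assumption of PostgreSQL and the same events table, turns the order check into per-step flags instead: every viewer stays in step 1, and each later step counts only if all earlier steps were reached in order. The reached_* column names are made up for illustration.

```sql
WITH user_steps AS (
    -- Earliest timestamp of each step per user, over the last 7 days.
    SELECT
        user_id,
        MIN(CASE WHEN event_name = 'view_product' THEN event_time END) AS t_view,
        MIN(CASE WHEN event_name = 'add_to_cart'  THEN event_time END) AS t_cart,
        MIN(CASE WHEN event_name = 'submit_order' THEN event_time END) AS t_order,
        MIN(CASE WHEN event_name = 'pay_success'  THEN event_time END) AS t_pay
    FROM events
    WHERE event_name IN ('view_product', 'add_to_cart', 'submit_order', 'pay_success')
      AND event_time >= NOW() - INTERVAL '7 days'
    GROUP BY user_id
),
reached AS (
    -- A comparison with NULL is not true, so a missing step yields 0.
    -- Each flag repeats the earlier conditions, so flags can only switch off
    -- as the funnel gets deeper.
    SELECT
        user_id,
        CASE WHEN t_cart > t_view THEN 1 ELSE 0 END AS reached_cart,
        CASE WHEN t_cart > t_view AND t_order > t_cart THEN 1 ELSE 0 END AS reached_order,
        CASE WHEN t_cart > t_view AND t_order > t_cart AND t_pay > t_order
             THEN 1 ELSE 0 END AS reached_pay
    FROM user_steps
    WHERE t_view IS NOT NULL  -- only users who entered the funnel
)
SELECT
    COUNT(*)           AS step1_cnt,
    SUM(reached_cart)  AS step2_cnt,
    SUM(reached_order) AS step3_cnt,
    SUM(reached_pay)   AS step4_cnt,
    ROUND(SUM(reached_cart)  * 100.0 / NULLIF(COUNT(*), 0), 2)           AS rate_1to2,
    ROUND(SUM(reached_order) * 100.0 / NULLIF(SUM(reached_cart), 0), 2)  AS rate_2to3,
    ROUND(SUM(reached_pay)   * 100.0 / NULLIF(SUM(reached_order), 0), 2) AS rate_3to4
FROM reached;
```

This sketch still compares only the earliest timestamp of each step. A user whose first add-to-cart happened before their first view is counted at step 1 only, even if a later add-to-cart followed the view.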
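Caveats and common pitfalls
Funnel analysis tends to go wrong in these places:
- Users not deduplicated: a plain COUNT(*) over raw event rows counts a user once per event, so use COUNT(DISTINCT user_id), or aggregate to one row per user first
- Time order ignored: a user paying before placing the order makes no sense, so the JOIN, WHERE, or HAVING conditions must check that each step follows the previous one
- Session attribution mixed up: without session splitting, a user's actions on different days may be linked into one path; consider splitting sessions by device and time before running the funnel
- Data latency and missing values: payment events may land in the database seconds or even minutes after the order event, so leave some margin in the WHERE clause's time range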
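The session-splitting step mentioned in the list can be sketched with window functions, assuming PostgreSQL and the same events table. The 30-minute inactivity threshold and the session_seq column name are assumptions for illustration. The sketch splits by user and time only; if the table had a device column (a hypothetical device_id, for instance), it could be added to both PARTITION BY clauses.

```sql
WITH ordered AS (
    -- For each event, look up the same user's previous event time.
    SELECT
        user_id,
        event_name,
        event_time,
        LAG(event_time) OVER (PARTITION BY user_id ORDER BY event_time) AS prev_time
    FROM events
    WHERE event_time >= NOW() - INTERVAL '7 days'
),
flagged AS (
    -- A session starts at the user's first event or after a gap longer than 30 minutes.
    SELECT
        user_id,
        event_name,
        event_time,
        CASE
            WHEN prev_time IS NULL
              OR event_time - prev_time > INTERVAL '30 minutes' THEN 1
            ELSE 0
        END AS is_new_session
    FROM ordered
)
SELECT
    user_id,
    event_name,
    event_time,
    -- Running count of session starts gives a per-user session number: 1, 2, 3, ...
    SUM(is_new_session) OVER (
        PARTITION BY user_id
        ORDER BY event_time
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS session_seq
FROM flagged;
```

With this result stored as a CTE or view, a funnel can group by (user_id, session_seq) instead of user_id alone, so that steps from different sessions are not joined into one path.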