按分组取最新一条记录 - ClickHouse & MySQL
时间:2025-8-26 22:25 作者:wanzi 分类: 数据库
按分组取最新一条记录 - ClickHouse & MySQL
ClickHouse
1. 新版本(ClickHouse 24.0+ 支持 QUALIFY)
SELECT *
FROM ids_data
WHERE event_type = 'alert' AND level IN ('middle','high')
QUALIFY row_number() OVER (PARTITION BY threat_behavior_category ORDER BY timestamp DESC) = 1
ORDER BY timestamp DESC
LIMIT 15 OFFSET 0;
特点:
- 语法直观,类似 MySQL 8.0 的窗口函数
- 逻辑:先过滤 → 分组排序 → 取每组最新一条 → 分页
- 推荐在 ClickHouse 24.0+ 使用
适用场景:
- 日志、告警、行为数据等,需要每个分类最新一条数据
- 版本 >= 24.0
2. 老版本(不支持 QUALIFY)
SELECT
threat_behavior_category,
argMax(id, timestamp) AS id,
argMax(threat_behavior_tactic, timestamp) AS threat_behavior_tactic,
argMax(level, timestamp) AS level,
argMax(task_id, timestamp) AS task_id,
argMax(user_id, timestamp) AS user_id,
argMax(src_ip, timestamp) AS src_ip,
argMax(dest_ip, timestamp) AS dest_ip,
argMax(dest_addr, timestamp) AS dest_addr,
argMax(dest_port, timestamp) AS dest_port,
argMax(threat_behavior, timestamp) AS threat_behavior,
argMax(keyword, timestamp) AS keyword,
argMax(is_normal, timestamp) AS is_normal,
max(timestamp) AS timestamp
FROM (
SELECT *
FROM ids_data
WHERE event_type = 'alert' AND level IN ('middle','high')
) AS t
GROUP BY threat_behavior_category
ORDER BY timestamp DESC
LIMIT 15 OFFSET 0;
特点:
- 用
argMax(col, timestamp)
取每组最新一条 - 所有版本都支持,兼容性好
- SQL 比 QUALIFY 写法长一些
适用场景:
- ClickHouse 旧版本(< 24.0)
- 需要兼容性好的方案
3. 如果不需要分组,只取最近 N 条
SELECT *
FROM ids_data
WHERE event_type = 'alert' AND level IN ('middle','high')
ORDER BY timestamp DESC
LIMIT 15 OFFSET 0;
适用场景:
- 实时告警列表、最新日志流
- 只关心最新 N 条,不分组
MySQL
1. MySQL 8.0+
SELECT *
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY threat_behavior_category ORDER BY timestamp DESC) AS rn
FROM ids_data
WHERE event_type = 'alert' AND level IN ('middle','high')
) t
WHERE t.rn = 1
ORDER BY timestamp DESC
LIMIT 15 OFFSET 0;
特点:
- 语法与 ClickHouse QUALIFY 类似
- 窗口函数 MySQL 8.0+ 支持
2. 老版本 MySQL(无窗口函数)
SELECT t.*
FROM ids_data t
JOIN (
SELECT threat_behavior_category, MAX(timestamp) AS max_ts
FROM ids_data
WHERE event_type = 'alert' AND level IN ('middle','high')
GROUP BY threat_behavior_category
) m
ON t.threat_behavior_category = m.threat_behavior_category AND t.timestamp = m.max_ts
ORDER BY t.timestamp DESC
LIMIT 15 OFFSET 0;
特点:
- 先分组取最大时间,再 JOIN 回原表
- MySQL 5.7 或更低版本使用