最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

Keep track of first row in ClickHouse - Stack Overflow

programmeradmin3浏览0评论

I have a table with users actions:

user_id action timestamp
1 PAGEVIEW 00:01
1 PAGEVIEW 00:02
1 PAY 00:05

I have a table with users actions:

user_id action timestamp
1 PAGEVIEW 00:01
1 PAGEVIEW 00:02
1 PAY 00:05

How do I create a materialized view to have timestamps of first user's appearance (when first action was made)?

In this case, the result should be 1 (user_id), 00:01 (first action)

It is like ReplacingMergeTree backwards, I need to keep track of the first, not latest row, in my ordering.

Share Improve this question edited Mar 31 at 6:19 TechGeii asked Mar 31 at 6:14 TechGeiiTechGeii 4195 silver badges14 bronze badges
Add a comment  | 

2 Answers 2

Reset to default 1

You may use a version column with calculated expression which value is higher on a lower timestamp argument.

CREATE TABLE tab
(
    user_id     UInt32
  , action      LowCardinality(String)
  , timestamp   DateTime
) Engine = MergeTree()
ORDER BY user_id;

CREATE TABLE tab_rmt
(
    user_id     UInt32
  , action      LowCardinality(String)
  , timestamp   DateTime
  , __ver       UInt32
) Engine = ReplacingMergeTree(__ver)
ORDER BY user_id;

CREATE MATERIALIZED VIEW tab_mv TO tab_rmt AS
SELECT *, 4294967295 - toUnixTimestamp(timestamp) AS __ver
FROM tab;

INSERT INTO tab
SELECT *
FROM VALUES 
(
    'user_id UInt32
  , action  LowCardinality(String)
  , timestamp   DateTime'
, (1, 'PAGEVIEW', toDateTime('2025-01-01 00:01:00'))
, (1, 'PAGEVIEW', toDateTime('2025-01-01 00:02:00'))
, (1, 'PAY',      toDateTime('2025-01-01 00:03:00'))
);

SELECT * FROM tab_rmt FINAL FORMAT Vertical;

The result is:

Row 1:
──────
user_id:   1
action:    PAGEVIEW
timestamp: 2025-01-01 00:01:00
__ver:     2559277635 -- 2.56 billion

fiddle

A limit query might be what you want here:

SELECT *
FROM actions
WHERE user_id = 1
ORDER BY timestamp
LIMIT 1;

If instead you want a query for all users, then use ROW_NUMBER:

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY timestamp) rn
    FROM actions
)

SELECT user_id, action, timestamp
FROM cte
WHERE rn = 1;
发布评论

评论列表(0)

  1. 暂无评论