I have a table with users actions:
user_id | action | timestamp |
---|---|---|
1 | PAGEVIEW | 00:01 |
1 | PAGEVIEW | 00:02 |
1 | PAY | 00:05 |
I have a table with users actions:
user_id | action | timestamp |
---|---|---|
1 | PAGEVIEW | 00:01 |
1 | PAGEVIEW | 00:02 |
1 | PAY | 00:05 |
How do I create a materialized view to have timestamps of first user's appearance (when first action was made)?
In this case, the result should be 1 (user_id), 00:01 (first action)
It is like ReplacingMergeTree
backwards, I need to keep track of the first, not latest row, in my ordering.
2 Answers
Reset to default 1You may use a version column with calculated expression which value is higher on a lower timestamp
argument.
CREATE TABLE tab
(
user_id UInt32
, action LowCardinality(String)
, timestamp DateTime
) Engine = MergeTree()
ORDER BY user_id;
CREATE TABLE tab_rmt
(
user_id UInt32
, action LowCardinality(String)
, timestamp DateTime
, __ver UInt32
) Engine = ReplacingMergeTree(__ver)
ORDER BY user_id;
CREATE MATERIALIZED VIEW tab_mv TO tab_rmt AS
SELECT *, 4294967295 - toUnixTimestamp(timestamp) AS __ver
FROM tab;
INSERT INTO tab
SELECT *
FROM VALUES
(
'user_id UInt32
, action LowCardinality(String)
, timestamp DateTime'
, (1, 'PAGEVIEW', toDateTime('2025-01-01 00:01:00'))
, (1, 'PAGEVIEW', toDateTime('2025-01-01 00:02:00'))
, (1, 'PAY', toDateTime('2025-01-01 00:03:00'))
);
SELECT * FROM tab_rmt FINAL FORMAT Vertical;
The result is:
Row 1:
──────
user_id: 1
action: PAGEVIEW
timestamp: 2025-01-01 00:01:00
__ver: 2559277635 -- 2.56 billion
fiddle
A limit query might be what you want here:
SELECT *
FROM actions
WHERE user_id = 1
ORDER BY timestamp
LIMIT 1;
If instead you want a query for all users, then use ROW_NUMBER
:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY timestamp) rn
FROM actions
)
SELECT user_id, action, timestamp
FROM cte
WHERE rn = 1;