I have a BIG NUMBER WITH TIMETREND chart in Superset which has Time grain = Quarter, create_date on temporal axes, and a metric COUNT_DISTINCT(id). The dataset is used by many other charts. I have to change the metric to do rolling count distinct to quarter instead of count distinct per quarter.
Lets say this is the source data:
create_date | quarter | id |
---|---|---|
2024-08-11 | 2024Q3 | 3 |
2024-09-15 | 2024Q3 | 1 |
2024-09-24 | 2024Q3 | 2 |
2024-10-10 | 2024Q4 | 3 |
2024-11-01 | 2024Q4 | 4 |
2024-12-11 | 2024Q4 | 3 |
2025-01-18 | 2025Q1 | 5 |
2025-01-20 | 2025Q1 | 6 |
2025-01-20 | 2025Q1 | 7 |
I have a BIG NUMBER WITH TIMETREND chart in Superset which has Time grain = Quarter, create_date on temporal axes, and a metric COUNT_DISTINCT(id). The dataset is used by many other charts. I have to change the metric to do rolling count distinct to quarter instead of count distinct per quarter.
Lets say this is the source data:
create_date | quarter | id |
---|---|---|
2024-08-11 | 2024Q3 | 3 |
2024-09-15 | 2024Q3 | 1 |
2024-09-24 | 2024Q3 | 2 |
2024-10-10 | 2024Q4 | 3 |
2024-11-01 | 2024Q4 | 4 |
2024-12-11 | 2024Q4 | 3 |
2025-01-18 | 2025Q1 | 5 |
2025-01-20 | 2025Q1 | 6 |
2025-01-20 | 2025Q1 | 7 |
Now I have this in superset:
quarter | COUNT_DISTINCT(id) |
---|---|
2024Q3 | 3 |
2024Q4 | 2 |
2025Q1 | 3 |
And I want this to be shown:
quarter | rolling COUNT_DISTINCT(id) |
---|---|
2024Q3 | 3 |
2024Q4 | 4 |
2025Q1 | 7 |
How can I achieve this in superset?
Thanx in advance!
Share Improve this question edited Jan 20 at 19:47 user1664961 asked Jan 20 at 19:40 user1664961user1664961 359 bronze badges1 Answer
Reset to default 0To the best of my knowledge having worked with Superset daily for the past year, none of the current features will allow this as it requires a window function which cannot be used in the visuals.
Nothing is stopping you from creating a virtual dataset in SQLLab and creating a column with a window function there.
Note: I think there is a minor error in your question as 3+2 is 5 not 4. So the result should be 3,5,8 instead of 3,4,7
WITH grouped_ids AS
(SELECT COUNT(DISTINCT id) as ids, quarter FROM timeseries GROUP BY quarter)
SELECT quarter,
sum(t.ids) OVER w AS total,
count(*) OVER w AS count
FROM grouped_ids t
WINDOW w AS (
ORDER BY quarter
)
ORDER BY quarter
Here is a db fiddle with the implementation of the above code in PostGreSQL based off of this answer.