I have data from a PLC coming in ONLY ON CHANGE into a table with the format (TIMESTAMP, TAG, VALUE). I have a visualisation tool (Seeq) that queries this base table in Snowflake and shows the data on a time series chart. If a user selects a long time range, the data needs to be aggregated (max 2000 points per time series plot). I want this aggregation (an average) to be weighted by how long a tag held a value before it changed. For example, if tag = 'cheese' has a value of 100 from t=0 to t=5, then a value of 500 from t=6 to t=100, and the user in Seeq selects this tag over a long window (i.e. spanning t=0 to t=100000), the data registered for this tag has to be aggregated to (5*100 + 95*500)/100 = 480, plotted at t=50 (the midpoint). How do I write a query for this in Snowflake against this base table of (TIMESTAMP, TAG, VALUE)?
I tried cross joining a tag dimension table to a time dimension table, then using a LEAD function to spread the raw data out to one row per second and weighting it accordingly. It was not very performant in terms of speed.
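To make the weighting concrete, the example above works out to (5*100 + 95*500)/100 = 480 rather than the plain average of 300. A minimal sketch of the per-tag calculation I have in mind (READINGS, ts, tag, val are placeholder names for my base table, timestamps treated as integer seconds):

with changes as (
    -- each value holds from its own timestamp until the next change for that tag
    select
        tag,
        ts,
        val,
        coalesce(lead(ts) over (partition by tag order by ts),
                 100) as next_ts          -- 100 = end of the query window (placeholder)
    from readings
)
select
    tag,
    sum(val * (next_ts - ts)) / sum(next_ts - ts) as time_weighted_avg
from changes
group by tag;

What I don't know is how to extend this so the window is split into up to 2000 buckets without exploding the data to one row per second.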
Comment: something like NTILE (docs.snowflake/en/sql-reference/functions/ntile) might be a way to chunk the data, and then do some average/weighted operation. – Simeon Pilgrim, Mar 17 at 22:15
1 Answer
So I am not really sure what you are trying to do, but an explosive way of doing something like what you describe can be done like so:
with d0 as (
    -- change events as (tag, start, end, value) intervals
    select * from values
        ('cheese', 0, 5, 100),
        ('cheese', 6, 100, 500)
        t(tag, _s, _e, val)
), d1 as (
    -- explode each interval into one row per second, then split the rows
    -- for each tag into 10 equal-sized chunks
    select
        tag,
        value::number as rn,
        val,
        ntile(10) over (partition by tag order by rn) as tile
    from d0,
        table(flatten(array_generate_range(_s, _e + 1)))
)
select
    tag,
    tile,
    avg(rn) as mid,
    avg(val) as val
from d1
group by 1, 2
order by 1, 2;
which gives:
TAG    | TILE | MID       | VAL
-------|------|-----------|------------
cheese |    1 |  5.000000 | 281.818182
cheese |    2 | 15.500000 | 500.000000
cheese |    3 | 25.500000 | 500.000000
cheese |    4 | 35.500000 | 500.000000
cheese |    5 | 45.500000 | 500.000000
cheese |    6 | 55.500000 | 500.000000
cheese |    7 | 65.500000 | 500.000000
cheese |    8 | 75.500000 | 500.000000
cheese |    9 | 85.500000 | 500.000000
cheese |   10 | 95.500000 | 500.000000
Those rows do not really need expanding; the interpolation can be driven against a d0-like table of intervals, if that is how your data is sourced.
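As a rough sketch of that interval-based form (READINGS, ts, tag, val and the window values are placeholders, and it assumes the rows are already filtered to the query window): build the change rows as intervals with LEAD, build the bucket grid with a generator, and weight each value by how much of its interval overlaps each bucket.

with params as (
    select 0 as win_start, 100 as win_end, 10 as n_buckets   -- n_buckets would be ~2000 for Seeq
), intervals as (
    -- each change row holds its value until the next change (or the window end)
    select
        tag,
        ts as i_start,
        coalesce(lead(ts) over (partition by tag order by ts),
                 (select win_end from params)) as i_end,
        val
    from readings
), buckets as (
    select row_number() over (order by seq4()) - 1 as bucket_no
    from table(generator(rowcount => 10))                    -- must match n_buckets
), bucket_edges as (
    select
        b.bucket_no,
        p.win_start + (p.win_end - p.win_start) * b.bucket_no       / p.n_buckets as b_start,
        p.win_start + (p.win_end - p.win_start) * (b.bucket_no + 1) / p.n_buckets as b_end
    from buckets b
    cross join params p
)
-- weight each value by the length of its overlap with each bucket
select
    i.tag,
    e.bucket_no,
    (e.b_start + e.b_end) / 2 as mid,
    sum(i.val * (least(i.i_end, e.b_end) - greatest(i.i_start, e.b_start)))
      / sum(least(i.i_end, e.b_end) - greatest(i.i_start, e.b_start)) as time_weighted_avg
from intervals i
join bucket_edges e
  on i.i_start < e.b_end
 and i.i_end   > e.b_start
group by 1, 2, 3
order by 1, 2;

This avoids the one-row-per-second explosion entirely: the row count going into the aggregation is (number of change rows that overlap the window) x (buckets they straddle), which stays small for change-only data.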