I am facing issue while filtering data with array
I have columns userid,event_name,attributes,ti
Attributes column have value like this
{"bool_sample":true,"array_int":[10,20,25,38],"array_string":["hello","world","i am fine"]}
My query
SELECT * FROM "events-data" CROSS JOIN UNNEST(CAST(json_extract(attributes, '$["array_int"]') AS array<int>)) AS t(val)
WHERE val > 9;
This query filtering data but it giving me multiple row for same record, for above attributes column record it giving me 4 rows
userid,event_name,attributes,ti,val
test-userid,test-event,test-attributes,10
test-userid,test-event,test-attributes,20
test-userid,test-event,test-attributes,25
test-userid,test-event,test-attributes,38
I do not need multiple rows
I am facing issue while filtering data with array
I have columns userid,event_name,attributes,ti
Attributes column have value like this
{"bool_sample":true,"array_int":[10,20,25,38],"array_string":["hello","world","i am fine"]}
My query
SELECT * FROM "events-data" CROSS JOIN UNNEST(CAST(json_extract(attributes, '$["array_int"]') AS array<int>)) AS t(val)
WHERE val > 9;
This query filtering data but it giving me multiple row for same record, for above attributes column record it giving me 4 rows
userid,event_name,attributes,ti,val
test-userid,test-event,test-attributes,10
test-userid,test-event,test-attributes,20
test-userid,test-event,test-attributes,25
test-userid,test-event,test-attributes,38
I do not need multiple rows
Share Improve this question edited Jan 19 at 19:45 Guru Stron 144k11 gold badges168 silver badges209 bronze badges asked Jan 18 at 14:51 Nishant DixitNishant Dixit 5,5225 gold badges21 silver badges31 bronze badges 1- "I do not need multiple rows" - and what do you need? What is the desired output here? – Guru Stron Commented Jan 19 at 19:30
2 Answers
Reset to default 0No need CROSS JOIN with UNNEST extracted data. Check exists
for.
Try
SELECT * FROM "events-data"
WHERE exists (
select val
from UNNEST(CAST(json_extract_path(attributes, '$.["array_int"]') AS int[])) AS t(val)
where val>9)
;
That is what UNNEST
does - it expands array in multiple rows (one row per array element, see the docs).
If the goal is to fetch rows with array_int
containing values more than 9 then you can use JSON functions, for example json_exists
:
-- sample data
WITH dataset(userid,event_name,attributes) as (
values ('test-userid', 'test-event', '{"bool_sample":true,"array_int":[10,20,25,38],"array_string":["hello","world","i am fine"]}')
)
-- query
select *
from dataset
where json_exists(attributes, 'lax $.array_int[*]?(@ > 9)');
Output:
userid | event_name | attributes |
---|---|---|
test-userid | test-event | {"bool_sample":true,"array_int":[10,20,25,38],"array_string":["hello","world","i am fine"]} |
If you need to get the array itself with filtered data then you can use the casting+filtering:
-- query
select userid, event_name, vals
from (
select *, filter(cast(json_extract(attributes, '$.array_int') as array(int)), v -> v > 9) vals
from dataset)
where cardinality(vals) > 0;
Output:
userid | event_name | vals |
---|---|---|
test-userid | test-event | {10,20,25,38} |