sql - Group IDs from one table, then count groups having different values in another table

I have the following PostgreSQL tables:

create table t1(id,sval)as values
 (88,      'X129de')
,(4,       'YHK33e')
,(44,      '1K4892')
,(53,      'YHK33e')
,(42,      'YHK33e')
,(2,       '1K4892')
,(1,       '1K4892')
,(99,      'X129de');

create table t2(id,isin)as values
 (88,      'XXXXXX')
,(4,       'UUUUUU')
,(44,      'IIIIII')
,(53,      'UUUUUU')
,(42,      'IIIIII')
,(2,       'UUUUUU')
,(1,       'UUUUUU')
,(99,      'XXXXXX');

I am trying to put together a query on the first table that gathers all the ids with the same sval. In the above example there would be three groups:

[4, 53, 42]
[44, 2, 1]
[88, 99]

From these ids I want to query the second table to see whether, within each of these groups, any of the isin values differ. For example:

[4, 53, 42] - YES, since 42 has a different isin than 4 and 53.
[88, 99] - NO, because their isin values are the same.
[44, 2, 1] - YES, because 44 has a different isin value than 1 and 2.

I am not sure how to organise the last part, but I just want to know for each group whether there is at least one different isin value, and how many groups have different values.

Asked Feb 14 at 22:12 by kahhengchong; edited Feb 16 at 13:41 by Zegarek.

In what format do you need the output to be? – Milos Stojanovic, commented Feb 14 at 22:16
Hi @MilosStojanovic, sorry for the late response. There isn't a particular format; I am just trying to figure out how many of the groups have at least one different ISIN and, if it is easy to show, what the different ISINs are. – kahhengchong, commented Feb 17 at 8:46

5 Answers

You can join the two tables on their ids to get a mapping between svals and isins. Then, for each sval, you can use window functions to find the highest and lowest isin. If they are the same, all the isins for that sval are the same. If not, the sval group has at least one different isin. From there, it's just a matter of using a case expression to put it all together:

select t1.id, 
       CASE MAX(isin) OVER (PARTITION BY sval)
            WHEN MIN(isin) OVER (PARTITION BY sval) THEN 'NO' 
            ELSE 'YES' 
       END
from t1
join t2 on t1.id = t2.id
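
A minimal sketch of how the window-function query above could be collapsed to one row per sval group rather than one row per id. It assumes the t1 and t2 tables from the question; the alias has_different_isin is an illustrative name, not part of the original answer.

```sql
-- Sketch: the window functions are evaluated first, then DISTINCT
-- removes the repeated (sval, flag) pairs so each group appears once.
select distinct
       t1.sval,
       case when max(t2.isin) over (partition by t1.sval)
               = min(t2.isin) over (partition by t1.sval)
            then 'NO'
            else 'YES'
       end as has_different_isin   -- illustrative alias
from t1
join t2 on t1.id = t2.id
order by t1.sval;
```

With the sample data this should report YES for 1K4892 and YHK33e, and NO for X129de.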

Join using id, group by sval, count distinct isin values for each group:

select array_agg(id)          as id_group
     , count(distinct isin)>1 as has_many_isins
from t1 join t2 using(id)
group by sval;

id_group  | has_many_isins
{1,2,44}  | true
{99,88}   | false
{42,53,4} | true

With more context:

select sval
     , array_agg(id)            as id_group
     , count(distinct isin)>1   as has_many_isins
     , count(distinct isin)     as "#distinct_isins"
     , array_agg(distinct isin) as distinct_isin_vals
from t1 join t2 using(id)
group by sval;

sval   | id_group  | has_many_isins | #distinct_isins | distinct_isin_vals
1K4892 | {1,2,44}  | true           | 2               | {IIIIII,UUUUUU}
X129de | {99,88}   | false          | 1               | {XXXXXX}
YHK33e | {42,53,4} | true           | 2               | {IIIIII,UUUUUU}

To just list those groups that have multiple different ISINs in the other table, move that to HAVING:

select array_agg(id) as id_group
from t1 join t2 using(id)
group by sval
having count(distinct isin)>1;
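
The question also asks how many groups have differing values. One way to get that number, sketched here on the t1 and t2 tables from the question, is to wrap the grouped query in a subquery and count the groups. The aliases n_isins, groups_with_different_isins and total_groups are illustrative. Note that the inner join drops any id without a row in t2.

```sql
-- Sketch: count the sval groups whose ids map to more than one isin.
select count(*) filter (where n_isins > 1) as groups_with_different_isins
     , count(*)                            as total_groups
from (
    select sval
         , count(distinct isin) as n_isins   -- distinct isins per group
    from t1 join t2 using (id)
    group by sval
) as g;
```

With the sample data this should return 2 groups with different isins out of 3 groups in total.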

I think the OP wants the sval group and whether that group contains multiple isin values. The approach by Mureinik is good, but I think it falls short in not showing the originating group. The following is a variant of that approach that identifies the appropriate group.

with grps(sval, idarray) as    -- convert individual rows to group and id array
     ( select sval, array_agg(id order by id)
         from t1
        group by sval
     ) -- select * from grps;
select distinct on (sval) 
       grps.* 
     , case when hm  then 'YES'  else 'NO' 
       end "Multiple ISIN Values" 
  from grps 
 cross join lateral           -- compare min and max isin for the ids of each group
            (select not (max(t2.isin) =  min(t2.isin)) hm
               from t2
              where t2.id = any (grps.idarray) 
            ) jl
  order by sval;

The grps CTE builds an array of ids for each sval group. The query then joins each group with the min and max isin found for its ids. Finally, it returns a single row for each group (see distinct on).
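
In a comment, the OP also asked which ISINs differ. As a rough sketch, the same lateral pattern can collect the distinct isin values per group and keep only the groups with more than one. This assumes the t1 and t2 tables from the question; the aliases d and isins are illustrative.

```sql
-- Sketch: list each group's distinct isins, keeping groups with several.
with grps(sval, idarray) as
     ( select sval, array_agg(id order by id)
         from t1
        group by sval
     )
select grps.sval
     , grps.idarray
     , d.isins
  from grps
 cross join lateral
            (select array_agg(distinct t2.isin) as isins
               from t2
              where t2.id = any (grps.idarray)
            ) d
 where cardinality(d.isins) > 1   -- NULL (no matching rows) is filtered out too
 order by grps.sval;
```

With the sample data this should keep the 1K4892 and YHK33e groups, each with the values IIIIII and UUUUUU.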

Step by step: first find the groups of ids with equal sval in the first table as arrays in cte1, then count the distinct isin values per group in the second table using a scalar subquery in cte2 and finally pick only these groups (as arrays) with count > 1.

with cte1 as 
(
 -- one array of ids per sval group
 select array_agg(id) grp from t1 group by sval
),
cte2 as 
(
 -- number of distinct isin values among the ids of each group
 select grp, 
    (select count(distinct(isin)) from t2 where id = any(grp)) cnt
 from cte1
)
select grp from cte2 where cnt > 1;

Assuming you just want a grouped list of id values, and the number of distinct isin values, you can just join them together and use normal grouping functions.

select
  t1.sval, 
  count(*) as total_ids,
  count(distinct isin) as different_isin
from t1
join t2 on t1.id = t2.id
group by
  t1.sval
having min(t2.isin) <> max(t2.isin);

If a particular id might not have a match in t2 at all, you can use a left join and check the count from t2.

select
  t1.sval, 
  count(*) as total_ids_in_t1,
  count(t2.id) as total_ids_in_t2,
  count(distinct isin) as different_isin
from t1
left join t2 on t1.id = t2.id
group by
  t1.sval
having min(t2.isin) <> max(t2.isin) or count(t2.id) = 0;
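
To see why the extra count(t2.id) = 0 condition matters, here is a small sketch that adds a hypothetical row to t1 with no counterpart in t2 (id 7 and sval 'ZZ0000' are made up for illustration) and exposes the comparison as a column. It runs inside a transaction that is rolled back, so the sample data stays unchanged.

```sql
begin;

-- Hypothetical row: id 7 exists in t1 but not in t2.
insert into t1 (id, sval) values (7, 'ZZ0000');

select
  t1.sval,
  count(*)                     as total_ids_in_t1,
  count(t2.id)                 as total_ids_in_t2,
  count(distinct isin)         as different_isin,
  min(t2.isin) <> max(t2.isin) as isins_differ  -- NULL when the group has no t2 rows
from t1
left join t2 on t1.id = t2.id
group by
  t1.sval;

rollback;
```

For the unmatched group, min and max are both NULL, so the comparison yields NULL rather than true; without the count(t2.id) = 0 condition, the HAVING clause would drop that group.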
