最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

sql - Finding parent records that has missing child records from a collection of children - Stack Overflow

programmeradmin1浏览0评论

Suppose I have a parent table (t_parent) like this:

ID NAME MAX_CHILDREN_KEY
1 Parent 1 3
2 Parent 2 4
3 Parent 3 1
4 Parent 4 7
5 Parent 5 7
6 Parent 6 2
7 Parent 7 2

Suppose I have a parent table (t_parent) like this:

ID NAME MAX_CHILDREN_KEY
1 Parent 1 3
2 Parent 2 4
3 Parent 3 1
4 Parent 4 7
5 Parent 5 7
6 Parent 6 2
7 Parent 7 2

And a children table (t_children) like this:

ID PARENT_ID SORT_KEY NAME
1 1 1 Parent 1 Child 1
2 1 2 Parent 1 Child 2
3 1 3 Parent 1 Child 3
4 2 1 Parent 2 Child 1
5 2 4 Parent 2 Child 4
6 3 1 Parent 3 Child 1
7 4 1 Parent 4 Child 1
8 5 7 Parent 5 Child 7
9 6 2 Parent 6 Child 2
10 6 2 Parent 6 Child 2 (duplicate)
11 7 3 Parent 7 Child 3
12 7 4 Parent 7 Child 4

How can I query all the parent records with missing children records matching the t_parent.max_children_key and t_children.sort_key? Base from this two dataset, the query should return:

  • Parent 2 - Because Child 2 and Child 3 are missing.
  • Parent 4 - Because Child 2 to Child 7 are missing.
  • Parent 5 - Because Child 1 to Child 6 are missing.
  • Parent 6 - Because despite having 2 children, Child 1 is missing.
  • Parent 7 - Because despite having 2 children, Child 1 and Child 2 are missing.

Right now, I'm matching the count of the children using having count(1) but it fails to include Parent 6 and Parent 7.

Share Improve this question asked Mar 7 at 7:34 GideonGideon 1,6542 gold badges27 silver badges64 bronze badges
Add a comment  | 

3 Answers 3

Reset to default 0
SELECT *
FROM t_parent tp
WHERE EXISTS (
    SELECT 1
    FROM t_children tc
    WHERE tc.PARENT_ID = tp.ID
    HAVING 
        COUNT(DISTINCT tc.SORT_KEY) <> tp.MAX_CHILDREN_KEY
        OR MAX(tc.SORT_KEY) <> tp.MAX_CHILDREN_KEY
);

This query compares MAX_CHILDREN_KEY with the maximum value of SORT_KEY and the number of its unique values ​​for each PARENT_ID.

db<>fiddle

you can try the following, you can set limits according to your needs:

WITH NumberSeries AS (
    SELECT 1 AS num
    UNION ALL
    SELECT num + 1 
    FROM NumberSeries
    WHERE num < 10  -- Set a reasonable upper limit 
),
ExpectedChildren AS (
    SELECT p.ID AS PARENT_ID, ns.num AS EXPECTED_SORT_KEY
    FROM t_parent p
    CROSS JOIN NumberSeries ns
    WHERE ns.num <= p.MAX_CHILDREN_KEY
),
MissingChildren AS (
    SELECT ec.PARENT_ID, ec.EXPECTED_SORT_KEY
    FROM ExpectedChildren ec
    LEFT JOIN t_children c
        ON ec.PARENT_ID = c.PARENT_ID AND ec.EXPECTED_SORT_KEY = c.SORT_KEY
    WHERE c.ID IS NULL
)
SELECT DISTINCT p.ID, p.NAME
FROM MissingChildren mc
JOIN t_parent p ON mc.PARENT_ID = p.ID
ORDER BY p.ID
OPTION (MAXRECURSION 100);

Detailed form:

WITH RECURSIVE
cte (ID, NAME, CUR_CHILDREN_KEY, MAX_CHILDREN_KEY) AS (
  SELECT ID, NAME, 1, MAX_CHILDREN_KEY
  FROM t_parent
  UNION ALL
  SELECT ID, NAME, 1 + CUR_CHILDREN_KEY, MAX_CHILDREN_KEY
  FROM cte 
  WHERE CUR_CHILDREN_KEY < MAX_CHILDREN_KEY
)
SELECT t1.ID, t1.NAME, t1.CUR_CHILDREN_KEY LOST_CHILDREN_KEY
FROM cte t1
LEFT JOIN t_children t2 
  ON (t1.ID, t1.CUR_CHILDREN_KEY) = (t2.PARENT_ID, t2.SORT_KEY)
WHERE t2.PARENT_ID IS NULL
ORDER BY ID, LOST_CHILDREN_KEY
ID NAME LOST_CHILDREN_KEY
2 Parent 2 2
2 Parent 2 3
4 Parent 4 2
4 Parent 4 3
4 Parent 4 4
4 Parent 4 5
4 Parent 4 6
4 Parent 4 7
5 Parent 5 1
5 Parent 5 2
5 Parent 5 3
5 Parent 5 4
5 Parent 5 5
5 Parent 5 6
6 Parent 6 1
7 Parent 7 1
7 Parent 7 2

Compact form:

WITH RECURSIVE
cte (ID, NAME, CUR_CHILDREN_KEY, MAX_CHILDREN_KEY) AS (
  SELECT ID, NAME, 1, MAX_CHILDREN_KEY
  FROM t_parent
  UNION ALL
  SELECT ID, NAME, 1 + CUR_CHILDREN_KEY, MAX_CHILDREN_KEY
  FROM cte 
  WHERE CUR_CHILDREN_KEY < MAX_CHILDREN_KEY
)
SELECT t1.ID, t1.NAME, 
       COUNT(*) LOST_CHILDREN_AMOUNT,
       GROUP_CONCAT(t1.CUR_CHILDREN_KEY ORDER BY t1.CUR_CHILDREN_KEY) LOST_CHILDREN_KEYS
FROM cte t1
LEFT JOIN t_children t2 
  ON (t1.ID, t1.CUR_CHILDREN_KEY) = (t2.PARENT_ID, t2.SORT_KEY)
WHERE t2.PARENT_ID IS NULL
GROUP BY t1.ID, t1.NAME
ORDER BY ID
ID NAME LOST_CHILDREN_AMOUNT LOST_CHILDREN_KEYS
2 Parent 2 2 2,3
4 Parent 4 6 2,3,4,5,6,7
5 Parent 5 6 1,2,3,4,5,6
6 Parent 6 1 1
7 Parent 7 2 1,2

fiddle

PS. Thanks to id'7238 for source sample data.

PPS. The query does not check for child rows whose SORT_KEY value is greater than MAX_CHILDREN_KEY value for its parent (or less than 1).

发布评论

评论列表(0)

  1. 暂无评论