sql - In a select with a group by, get the last value of a column not in the group by - Stack Overflow

Given a table of data like this:

a b c d e
1 test 9 h 2024-10-22 08:00:00.000
1 test 9 l 2024-10-23 08:00:00.000
1 test 9 q 2024-10-22 08:00:00.000

I want to group my data by columns a, b, c and show the value from column d that has the newest date in column e.

So I would expect to get one row of data back like this:

a b c d
1 test 9 l

I would have liked something as simple as a "last()" like below but as far as I can find there isn't anything so simple?

SELECT 
    a, b, c,
    last(d)
FROM
    dbo.items 
GROUP BY 
    a, b, c

The only example I can find that is remotely close to what I want is LAST_VALUE ... OVER (PARTITION BY ...), but it doesn't work in a GROUP BY:

LAST_VALUE(d) OVER (PARTITION BY d ORDER BY e) AS d

And I know similar things are possible for accessing columns that are not in the GROUP BY. For example, if b wasn't in the GROUP BY, I would still be able to STRING_AGG all the values like so:

STRING_AGG(b, ',') AS b 

and get "test,test,test" as the value.

Asked Nov 18, 2024 at 13:55 by Aurelius; edited Nov 18, 2024 at 19:08 by Dale K.
Comment: Please tag your RDBMS, see here why that's important: meta.stackoverflow/questions/388759/… (Bart McEndree, Nov 18, 2024 at 14:15)

2 Answers

Using ROW_NUMBER() might work if you are using SQL Server:

SELECT a, b, c, d
FROM (
    -- Number the rows in each (a, b, c) group, newest e first
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY a, b, c ORDER BY e DESC) AS rn
    FROM dbo.items
) t
WHERE rn = 1

Output:

a b c d
1 test 9 l

There are some hacky and some more standard solutions.

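The standard approach is to pick out the newest value in a subquery and aggregate it later, something like:

select a, STRING_AGG(b, ',') WITHIN GROUP(ORDER BY a,c) as b, c, max(lastD) as d
from (
    -- Rows are ordered newest first, so FIRST_VALUE picks the d with the latest e.
    -- (LAST_VALUE with this ordering and the default frame only sees rows up to
    -- the current row and its peers, so it would not reliably return 'l'.)
    select a, b,c,d, e, first_value(d) over(partition by a,b,c order by e desc) as lastD
    from (
        VALUES  (1, N'test', 9, N'h', N'2024-10-22 08:00:00.000')
        ,   (1, N'test', 9, N'l', N'2024-10-23 08:00:00.000')
        ,   (1, N'test', 9, N'q', N'2024-10-22 08:00:00.000')
    ) t (a,b,c,d,e)
    ) x
GROUP BY a,b,c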
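
A note on the window frame: when an OVER clause has an ORDER BY but no explicit frame, SQL Server defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so LAST_VALUE only looks as far as the current row and its peers. Below is a minimal sketch, assuming the same three sample rows, of how LAST_VALUE could be used with an explicit full-partition frame and an ascending sort so that the newest row comes last. The DISTINCT is my own choice for collapsing the repeated rows; it is not taken from the answer above.

```sql
-- Sketch only: LAST_VALUE with an explicit frame covering the whole partition.
SELECT DISTINCT
       a, b, c,
       LAST_VALUE(d) OVER (
           PARTITION BY a, b, c
           ORDER BY e ASC
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS d
FROM (
    VALUES (1, N'test', 9, N'h', N'2024-10-22 08:00:00.000'),
           (1, N'test', 9, N'l', N'2024-10-23 08:00:00.000'),
           (1, N'test', 9, N'q', N'2024-10-22 08:00:00.000')
) t (a, b, c, d, e);
```

With the sample rows this should return a single row with d = 'l', because 'l' is the only row with the newest e.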

The hacky way is something I'd call reconstruction: you combine the ordering value and the value you want into one string, take the maximum, and then split the result apart again, something like:

SELECT  a, STRING_AGG(b, ',') WITHIN GROUP(ORDER BY a,c) AS b, c
,   STUFF(MAX(CONCAT(CONVERT(VARCHAR(30), cast(e AS datetime), 121), d)), 1,23, '') AS d
FROM
(
    VALUES  (1, N'test', 9, N'h', N'2024-10-22 08:00:00.000')
    ,   (1, N'test', 9, N'l', N'2024-10-23 08:00:00.000')
    ,   (1, N'test', 9, N'q', N'2024-10-22 08:00:00.000')
    ) t (a,b,c,d,e)
GROUP BY a,b,c

Here, combining a varchar representation of the date with your d value gives a string that sorts naturally in ascending date order, so one can use MAX. Style 121 always produces 23 characters (yyyy-mm-dd hh:mi:ss.mmm), so after getting the highest value, the STUFF function can remove those first 23 characters and leave just the d value.
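
To make the mechanism easier to follow, here is a small sketch, assuming the same sample rows, that shows the combined string and the value recovered from it for each row before any MAX is applied. The column names packed and unpacked are hypothetical names chosen for illustration.

```sql
-- Sketch only: show the intermediate strings the MAX(...) would compare.
SELECT d,
       CONCAT(CONVERT(VARCHAR(30), CAST(e AS datetime), 121), d) AS packed,
       STUFF(CONCAT(CONVERT(VARCHAR(30), CAST(e AS datetime), 121), d), 1, 23, '') AS unpacked
FROM (
    VALUES (1, N'test', 9, N'h', N'2024-10-22 08:00:00.000'),
           (1, N'test', 9, N'l', N'2024-10-23 08:00:00.000'),
           (1, N'test', 9, N'q', N'2024-10-22 08:00:00.000')
) t (a, b, c, d, e);
-- packed is the 23-character date followed by d, e.g. '2024-10-23 08:00:00.000l';
-- unpacked strips those 23 characters and is equal to d again.
```

Because the date part has a fixed width and sorts as text in chronological order, the largest packed string belongs to the row with the newest date.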

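This has some caveats, especially if you concatenate non-string columns, since every part has to be converted to a string that still sorts correctly. Also, you cannot choose your own tie-breaker when several rows share the same date: MAX simply returns the one with the largest d. The upside is that it avoids the extra window aggregation step.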
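
If an explicit tie-breaker matters, one option is to fall back to ROW_NUMBER and add a second sort key. The sketch below assumes a hypothetical fourth row ('z') that shares the newest date with 'l'; that row is not part of the original data and is there only to create a tie.

```sql
-- Sketch only: the 'z' row is hypothetical and added to create a tie on the newest date.
SELECT a, b, c, d
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY a, b, c
               ORDER BY e DESC, d ASC   -- second key is the explicit tie-breaker
           ) AS rn
    FROM (
        VALUES (1, N'test', 9, N'h', N'2024-10-22 08:00:00.000'),
               (1, N'test', 9, N'l', N'2024-10-23 08:00:00.000'),
               (1, N'test', 9, N'q', N'2024-10-22 08:00:00.000'),
               (1, N'test', 9, N'z', N'2024-10-23 08:00:00.000')
    ) t (a, b, c, d, e)
) x
WHERE rn = 1;
```

Here the tie is resolved in favour of 'l' because of d ASC, whereas the MAX(CONCAT(...)) approach on the same four rows would end up with 'z'.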
