DB Fiddle
CREATE TABLE vouchers (
id SERIAL PRIMARY KEY,
collected_date DATE,
collected_volume INT);
INSERT INTO vouchers(collected_date, collected_volume)VALUES
('2024-02-15', 1),
('2024-03-09', 900),
('2024-04-20', 300),
('2024-04-20', 800),
('2024-05-24', 400),
('2025-01-17', 200),
('2025-02-15', 800),
('2025-02-15', 150);
Expected Result
collected_date | collected_volume |
---|---|
2024-02-15 | 1 |
2025-02-15 | -1 |
2024-03-09 | 900 |
2025-03-09 | -900 |
2024-04-20 | 1100 |
2025-04-20 | -1100 |
2024-05-24 | 400 |
2025-05-24 | -400 |
2025-01-17 | 200 |
2026-01-17 | -200 |
2025-02-15 | 950 |
2026-02-15 | -950 |
To get from the raw data to the expected results these steps are required:
- Calculate an `expire_date`, which is always 12 months after the `collected_date`.
- Put the `expire_date` in the row below the `collected_date`. Not in a separate column!
- Put the total `collected_volume` per `collected_date` as a negative number in the row with the `expire_date`.

If a generated row with `expire_date` and negative `collected_volume` coincides with one of the original, base `collected_date` rows, they should stay separate and not be summed up: the `('2024-02-15',1)` case generates `('2025-02-15',-1)`, which stays separate from `('2025-02-15',950)`.
So far I have been able to develop this query:
select collected_date as collected_date
, (collected_date + interval '12 months')::date as expire_date
, sum(collected_volume) as collected_volume
from vouchers
group by 1,2
order by 1,2;
However, I have no clue what I need to change to display the `expire_date` below each `collected_date` and not in a separate column.
Do you have any idea how to solve this?
Asked Mar 31 at 13:59 by Michi, edited Apr 1 at 8:04 by Zegarek.

3 Answers

You can use a lateral join against a `values` constructor, then group by those values.
This is better in that it only requires reading the table once.
select
t.collected_date,
sum(t.collected_volume) as collected_volume
from vouchers v
cross join lateral (values
(collected_date, collected_volume),
((collected_date + interval '12 months')::date, -collected_volume)
) t(collected_date, collected_volume)
group by
v.collected_date,
t.collected_date
order by
v.collected_date,
t.collected_date;
Or you can split it after aggregating, using a derived table, which will be a bit more efficient.
select
t.collected_date,
t.collected_volume
from (
select
v.collected_date,
sum(v.collected_volume) as collected_volume
from vouchers v
group by
v.collected_date
) v
cross join lateral (values
(v.collected_date, v.collected_volume),
((v.collected_date + interval '12 months')::date, -v.collected_volume)
) t(collected_date, collected_volume)
order by
v.collected_date,
t.collected_date;
db<>fiddle
Another option, giving a different result (from an accounting perspective), would be to group only by the date coming out of the `values` constructor.
select
t.collected_date,
sum(t.collected_volume) as collected_volume
from vouchers v
cross join lateral (values
(collected_date, collected_volume),
((collected_date + interval '12 months')::date, -collected_volume)
) t(collected_date, collected_volume)
group by
t.collected_date
order by
t.collected_date;
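To make the accounting difference concrete, here is a small sketch against the sample `vouchers` data from the question, restricted to the one date where a collection and an expiry coincide. The `where` filter and the `::date` cast are my additions for illustration only.

```sql
-- Net volume on 2025-02-15 when grouping only by the generated date:
-- the collections of 800 and 150 on that day are summed together with
-- the -1 produced by the voucher collected on 2024-02-15.
select
    t.collected_date,
    sum(t.collected_volume) as collected_volume
from vouchers v
cross join lateral (values
    (v.collected_date, v.collected_volume),
    ((v.collected_date + interval '12 months')::date, -v.collected_volume)
) t(collected_date, collected_volume)
where t.collected_date = date '2025-02-15'
group by
    t.collected_date;
-- one row: 2025-02-15 | 949
```

So instead of the two separate rows `950` and `-1` that the question asks for, this variant nets them into a single `949`.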
- Aggregate in a CTE first
- For each resulting row, emit the row itself plus a copy with the expiration date and the negated sum.
- Differentiate between the original and the copy by adding a boolean and keeping an unchanged date, to sort by them later.
It doesn't matter much whether you do the copying by taking a `union` of the CTE with itself:
with cte(cd,cv)as(
select collected_date, sum(collected_volume)
from vouchers
group by collected_date
),cte2 as(
select*,(cd,false) as orderby
from cte
union all
select (cd+'12mon'::interval)::date
, -cv
, (cd,true)
from cte)
select cd as collected_date
, cv as collected_volume
from cte2
order by orderby;
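In case the `(cd,false)` / `(cd,true)` sort key looks odd: as far as I know, PostgreSQL compares such row values field by field, left to right, and `false` sorts before `true`. A tiny standalone sketch (the column names `d` and `is_expiry` are made up for this illustration):

```sql
-- Rows are listed out of order on purpose.
select d, is_expiry
from (values
    (date '2024-03-09', false),
    (date '2024-02-15', true),
    (date '2024-02-15', false)
) as t(d, is_expiry)
-- Sort by the pair: first by date, then false before true,
-- so the base row comes before its generated counterpart.
order by (d, is_expiry);
```

That is why keeping the unchanged base date in the key places every generated expiry row directly below the row it was derived from.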
Or by joining each row to a set of two rows:
with cte(cd,cv)as(
select collected_date, sum(collected_volume)
from vouchers
group by collected_date)
select collected_date, collected_volume
from cte cross join lateral
(values((cd+'12months'::interval)::date, -cv, true)
,(cd, cv, false) )as v(collected_date,collected_volume,is_added)
order by cd, is_added;
The important part is aggregating first, and holding on to values that dictate your desired order. In a test on 700k rows, these take about 300 ms:
demo at db<>fiddle
variant | exec time |
---|---|
agg_first_union | 293.493 ms |
agg_first_join | 314.945 ms |
@Tim's agg_first_union (this unfortunately doesn't sort things right or add expiration dates yet) | 333.805 ms |
@Charlieface's join_vals | 2310.740 ms |
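If you want to try a comparison like this yourself, something along these lines should get you a similarly sized data set. This is only a sketch and not necessarily the setup behind the numbers above: the table name `vouchers_test`, the date range and the volume range are all assumptions.

```sql
-- Hypothetical test table with the same shape as vouchers.
create table vouchers_test (
    id serial primary key,
    collected_date date,
    collected_volume int
);

-- 700k rows with random dates (2020-01-01 plus up to 1500 days)
-- and random volumes between 0 and 1000.
insert into vouchers_test (collected_date, collected_volume)
select date '2020-01-01' + (random() * 1500)::int,
       (random() * 1000)::int
from generate_series(1, 700000);

-- Time the aggregate-first join variant; swap in any other variant to compare.
explain (analyze, buffers)
with cte(cd, cv) as (
    select collected_date, sum(collected_volume)
    from vouchers_test
    group by collected_date
)
select v.collected_date, v.collected_volume
from cte
cross join lateral (values
    (cd, cv, false),
    ((cd + interval '12 months')::date, -cv, true)
) as v(collected_date, collected_volume, is_added)
order by cd, is_added;
```

Absolute timings will depend on hardware and on how many distinct dates the data contains, so treat the numbers as relative only.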
If you just want to view this output, then define the aggregation in a CTE and use a union query:
WITH cte AS (
SELECT collected_date, SUM(collected_volume) AS collected_volume
FROM vouchers
GROUP BY collected_date
)
SELECT collected_date, collected_volume FROM cte
UNION ALL
SELECT collected_date, -1.0*collected_volume FROM cte
ORDER BY collected_date, collected_volume DESC;
`INSERT` or as a `SELECT`? – Charlieface Commented Mar 31 at 14:05

`2026-02-15` example: both the positive and negative values are on the same date. 2) In Step 2 you specified to put the `expire_date` in the row below, but your expiration date `2025-03-09` for the `('2024-03-09', '900')` record is placed above it, and so are all others. 3) If an expiration date for one row happens to be on another's collection date, do you sum them up? Say you have a `('2023-03-09',2)`: do you subtract that `2` from the `collected_volume` on `2024-03-09` instead of generating a separate "expiration" row? – Zegarek Commented Mar 31 at 17:45