Given the following example data:
ACCOUNTID | ENTRYDATE | TRANSACTION | TYPE | AMOUNT |
---|---|---|---|---|
18 | 2025-07-02 08:00 | PAYMENT | CASH | -2359.53 |
18 | 2024-19-12 08:04 | SALE | NUTS | 133 |
18 | 2024-19-12 08:04 | SALE | NUTS | 75 |
18 | 2024-19-12 07:36 | SALE | BOLTS | 1085.71 |
18 | 2024-25-06 07:40 | SALE | BOLTS | 1065.82 |
18 | 2024-14-01 15:30 | PAYMENT | CASH | -1065.82 |
18 | 2023-26-12 13:07 | SALE | BOLTS | 1065.82 |
18 | 2023-10-08 12:36 | PAYMENT | CASH | -1017.91 |
18 | 2023-27-06 10:53 | SALE | BOLTS | 1017.91 |
18 | 2023-07-02 10:44 | PAYMENT | CASH | -1017.91 |
18 | 2022-28-12 09:54 | SALE | BOLTS | 1017.91 |
18 | 2022-04-08 00:00 | PAYMENT | CASH | -941.07 |
18 | 2022-23-06 06:45 | SALE | BOLTS | 941.07 |
Given the following example data:
ACCOUNTID | ENTRYDATE | TRANSACTION | TYPE | AMOUNT |
---|---|---|---|---|
18 | 2025-07-02 08:00 | PAYMENT | CASH | -2359.53 |
18 | 2024-19-12 08:04 | SALE | NUTS | 133 |
18 | 2024-19-12 08:04 | SALE | NUTS | 75 |
18 | 2024-19-12 07:36 | SALE | BOLTS | 1085.71 |
18 | 2024-25-06 07:40 | SALE | BOLTS | 1065.82 |
18 | 2024-14-01 15:30 | PAYMENT | CASH | -1065.82 |
18 | 2023-26-12 13:07 | SALE | BOLTS | 1065.82 |
18 | 2023-10-08 12:36 | PAYMENT | CASH | -1017.91 |
18 | 2023-27-06 10:53 | SALE | BOLTS | 1017.91 |
18 | 2023-07-02 10:44 | PAYMENT | CASH | -1017.91 |
18 | 2022-28-12 09:54 | SALE | BOLTS | 1017.91 |
18 | 2022-04-08 00:00 | PAYMENT | CASH | -941.07 |
18 | 2022-23-06 06:45 | SALE | BOLTS | 941.07 |
I need to add 2 columns which calculate the how much of each transaction should be applied to each category (BOLTs or NUTS). The SALE ROWs are easy enough. It is the payments that are giving me trouble. The results should look something like this:
ACCOUNTID | ENTRYDATE | TRANSACTION | TYPE | AMOUNT | NUT_AMT | BOLT_AMT |
---|---|---|---|---|---|---|
18 | 2025-07-02 08:00 | PAYMENT | CASH | -2359.53 | -208 | -2151.53 |
18 | 2024-19-12 08:04 | SALE | NUTS | 133 | 133 | 0 |
18 | 2024-19-12 08:04 | SALE | NUTS | 75 | 75 | 0 |
18 | 2024-19-12 07:36 | SALE | BOLTS | 1085.71 | 0 | 1085.71 |
18 | 2024-25-06 07:40 | SALE | BOLTS | 1065.82 | 0 | 1065.82 |
18 | 2024-14-01 15:30 | PAYMENT | CASH | -1065.82 | 0 | -1065.82 |
18 | 2023-26-12 13:07 | SALE | BOLTS | 1065.82 | 0 | 1065.82 |
18 | 2023-10-08 12:36 | PAYMENT | CASH | -1017.91 | 0 | -1017.91 |
18 | 2023-27-06 10:53 | SALE | BOLTS | 1017.91 | 0 | 1017.91 |
18 | 2023-07-02 10:44 | PAYMENT | CASH | -1017.91 | 0 | -1017.91 |
18 | 2022-28-12 09:54 | SALE | BOLTS | 1017.91 | 0 | 1017.91 |
18 | 2022-04-08 00:00 | PAYMENT | CASH | -941.07 | 0 | -941.07 |
18 | 2022-23-06 06:45 | SALE | BOLTS | 941.07 | 0 | -941.07 |
I have written a couple window functions which essentially perform a cumulative sum over each of the two new columns. But it is the payments I can't figure out. I would greatly appreciate some help.
Thanks in advance.
Share edited Feb 10 at 22:19 Dale K 27.4k15 gold badges58 silver badges83 bronze badges asked Feb 10 at 21:53 user944796user944796 113 bronze badges 4- How do you know which payment belongs to which sales? If we simply order the rows by date/time, we get a payment of -941.07 first, then a sale of 941.07, then a sale of 1017.91, then two payments of -1017.91, then a sale of 1065.82, ... This seems quite mixed up. Have you made an error with the date/times or is it just that I don't see the rule we need to apply here? – Thorsten Kettner Commented Feb 10 at 22:10
- Be great to see the SQL you have which isn't quite working. – Dale K Commented Feb 10 at 22:20
- have a look here: dbfiddle.uk/I39fePrf it uses the assumption that the "closest following sale" will be the bolt/nut category - and according to that assumption (and your example data) that all cash payments are "bolt". You need to define the correct rule/s to apply - and if you cannot do that then you will always get unexpected (and quite likely) incorrect answers. – Paul Maxwell Commented Feb 11 at 1:17
- Thanks. I tried to give a simplified example, which might have not simplified as much. Imagine a company that sells bolts. You sell bolt, Customers pay for bolts. Then one day you start selling nuts. Customers continue ordering bolts. But then they start ordering Nuts as well. But they keep paying in the same manner. Before you knew that the payments were to be credited to Bolts. But now, you need to know what part of the larger payments go toward bolts, and what part goes to nuts. Based on what the balances owed for both Bolts and Nuts at the time of the payment. – user944796 Commented Feb 11 at 2:06
2 Answers
Reset to default 0Let's say that any payment should be distributed between all sales that occurred since the last time the balance passed 0;
and implement it methodically:
- use a windowed
sum
to follow the due balance - determine the tipping points (when balance went down to 0 and started growing again)
- group together all sales and payments between two tipping points
- distribute the payment proportionally to the sales over that debt cycle
Note that:
- /!\ This will not work in case of prepayments /!\
due to 2., because a payment is only associated to past sales.
However it might be possible to modify the query to work on prepayments, by artificially shifting payments: ordering bycase when type in ('CASH') then entrydate + interval '1' month else entrydate end
- a partial payment will be dispatched between future sales too
due to 4., because the distributing group runs until the settling payment.
Thus a "1. buy 15 of bolts - 2. pay 10 - 3. buy 5 of nuts - 4. pay the remaining 10"
will distribute both payment 7.5 for A / 2.5 for B,
instead of the (commercially correct) "2. partially pays for 1. (bolts), 4. the remains (5 of bolts, 5 of nuts)
(see it running on your data in a fiddle)
with
-- DUE Total:
-- follow each account's balance, where sales increase debt and payments decrease it
-- (hopefully making it return to 0).
duet as -- DUE Total
(
select
row_number() over (partition by accountid order by entrydate) id, -- Within accountId
t.*,
sum(amount) over (partition by accountid order by entrydate) due
from transac t
),
-- Flag each transaction which starts a debt
-- (due amount > 0 whereas last transaction had a due amount <= 0)
debtstart as
(
select
d.*,
-- This could have been computed directly into duet (with a row between … 1 precedenting),
-- but let's give it its own table for clarity.
case when lag(due) over (partition by accountid order by entrydate) > 0 then 0 else 1 end ds
from duet d
),
-- Now for each transaction, get the last id when the debt balance was negative.
-- This will slice our history, grouping together sales until a payment settles them all
-- (by setting the balance to 0 again, starting a new cycle).
indebtsince as
(
select
ds.*,
max(case when ds = 1 then id else 0 end) over (partition by accountid order by entrydate) indebtentryid
from debtstart ds
),
-- Here we have two options, depending on the count of products we sell:
-- - either we hardcode the products, by adding 1 column per product on our transactions table
-- (simpler to start with, because we keep 1 row per transaction; but each entry to the catalog will require to duplicate our code for a new column)
-- - or we work product by product
-- (adaptative, the code will run for any number of references; but requires a new, by-product, table)
-- We take this last path to be future-proof.
-- Let's decline debt by product type, summing it over each debt cycle.
byproduct as
(
select accountid, indebtentryid, type, sum(amount) amount
from indebtsince
where amount > 0
group by accountid, indebtentryid, type
),
byproductproportion as
(
select
p.*,
amount / sum(amount) over (partition by accountid, indebtentryid) proportion
from byproduct p
),
-- Apply the proportions of each debt cycle to all payments of that cycle.
bypayment as
(
select pay.id, pp.type, pay.amount * pp.proportion amount
from indebtsince pay join byproductproportion pp
on pp.accountid = pay.accountid
and pp.indebtentryid = pay.indebtentryid
where pay.type in ('CASH')
-- And aggregate the sales too, to benefit from pivot (next table) calculus.
union all
select id, type, amount
from duet
where type not in ('CASH')
),
-- Pivot time! Only now do we specialize our columns, to get one line per payment (instead of one per product per payment).
pivot as
(
select
sum(case when type = 'NUTS' then amount end) nut_amt,
sum(case when type = 'BOLTS' then amount end) bolt_amt,
id
from bypayment
group by id
)
-- Grand total
select d.*, p.nut_amt, p.bolt_amt
from duet d join pivot p on p.id = d.id
order by entrydate desc;
You can use the MODEL clause:
select accountid, entrydate, transaction, type, amount, nut_amt, bolt_amt
from (
select d.*,
row_number() over(partition by accountid order by entrydate) as rn
from data d
)
model
partition by ( accountid )
dimension by ( rn )
measures( amount as amount, 0 as nut_amt, 0 as bolt_amt, 0 as tot_nut, 0 as tot_bolt,
transaction as transaction, type as type, entrydate as entrydate )
rules(
tot_nut[any] = case type[cv()] when 'NUTS' then nvl(tot_nut[cv()-1],0) + amount[cv()]
when 'CASH' then 0 else nvl(tot_nut[cv()-1],0) end,
tot_bolt[any] = case type[cv()] when 'BOLTS' then nvl(tot_bolt[cv()-1],0) + amount[cv()]
when 'CASH' then 0 else nvl(tot_bolt[cv()-1],0) end,
nut_amt[any] = case type[cv()] when 'NUTS' then amount[cv()]
when 'CASH' then -tot_nut[cv()-1] else 0 end,
bolt_amt[any] = case type[cv()] when 'BOLTS' then amount[cv()]
when 'CASH' then -tot_bolt[cv()-1] else 0 end
)
order by entrydate desc
;