I have a table with logs of going inside and outside the building. The table looks like that:
user_id | datetime | direction |
---|---|---|
1 | 17/2/2025, 18:25:02.000 | in |
1 | 17/2/2025, 20:09:10.000 | out |
2 | 17/2/2025, 09:55:57.000 | in |
2 | 17/2/2025, 20:48:37.000 | out |
I have a table with logs of going inside and outside the building. The table looks like that:
user_id | datetime | direction |
---|---|---|
1 | 17/2/2025, 18:25:02.000 | in |
1 | 17/2/2025, 20:09:10.000 | out |
2 | 17/2/2025, 09:55:57.000 | in |
2 | 17/2/2025, 20:48:37.000 | out |
In fact I have hundreds of people every day and long period, not only one day. I need to count maximum number of people inside the building within each hour. I think it could be easier if I could take first entrance and last leaving within each day, but it will be incorrect since person may go outside and inside of building several times within a day and I only need to count hours, where person is inside.
The result table should look like this:
hour | count |
---|---|
17/2/2025, 09:00 | 1 |
17/2/2025, 10:00 | 1 |
17/2/2025, 11:00 | 1 |
17/2/2025, 12:00 | 1 |
17/2/2025, 13:00 | 1 |
17/2/2025, 14:00 | 1 |
17/2/2025, 15:00 | 1 |
17/2/2025, 16:00 | 1 |
17/2/2025, 17:00 | 1 |
17/2/2025, 18:00 | 2 |
17/2/2025, 19:00 | 2 |
17/2/2025, 20:00 | 2 |
is there a way to calculate this in PostgreSQL?
Share Improve this question edited yesterday richyen 10k4 gold badges17 silver badges34 bronze badges asked yesterday Daniel GDaniel G 838 bronze badges 2- You might use cumulative sum on a case of direction to calculate the delta. Here is a similar question but for SQL-Server stackoverflow/questions/25857117/… – Bart McEndree Commented yesterday
- If you need a calendar tables then consider this: stackoverflow/questions/5821915/… – Bart McEndree Commented yesterday
2 Answers
Reset to default 3You can use generate_series, a range type and the overlap operator in a window function:
WITH input AS (
SELECT *
, tsrange(lag(datetime) OVER inout, datetime, '[)') AS timeframe
FROM mytable
WINDOW inout as (PARTITION BY user_id ORDER BY datetime)
)
SELECT ts,
count(user_id)
FROM generate_series('2025-02-17 08:00'::timestamp, '2025-02-18 14:00'::timestamp, interval '1 hour') g(ts)
LEFT JOIN input ON timeframe && tsrange(ts, ts + interval '1 hour')
AND direction = 'out' -- must have checked out
GROUP BY 1
ORDER BY 1;
See also https://dbfiddle.uk/HQG5DQOM
Three steps:
- Get all hours since the first hour in the table to the last one. This is done with
generate_series
in PostgreSQL. - Outer join the ins and outs and get a rolling sum of people in.
- Aggregate to get the maximum persons per hour.
The query:
with
hours (hour) as
(
select *
from generate_series( (select date_trunc('hour', min(datetime)) from mytable),
(select date_trunc('hour', max(datetime)) from mytable),
interval '1 hour') gen
),
cumulated as
(
select
h.hour,
sum(case t.direction when 'in' then 1 when 'out' then -1 end) over (order by h.hour, t.datetime) as users
from hours h
left join mytable t on t.datetime >= h.hour
and t.datetime < h.hour + interval '1 hour'
)
select hour, max(users)
from cumulated
group by hour
order by hour;
Demo: https://dbfiddle.uk/UCsCd8DU