I'm trying to sum up data of selected date, month of selected date and previous month of selected date but don't know how to do. Below is my sample data and my expected Output:
Sample data:
import pandas as pd
import numpy as np
df = pd.read_excel('(1).xlsx', sheet_name='Data')
df
COA Code USDConversion Amount Base Date 2
0 19010000000 26924582.44 2024-10-01
1 19010000000 38835600.44 2024-10-02
2 19010000000 46794586.57 2024-10-03
3 19010000000 57117346.49 2024-10-06
4 19010000000 69256132.98 2024-10-07
... ... ... ...
65 58000000000 38082130.88 2024-11-12
66 58000000000 38140016.13 2024-11-13
67 58000000000 38160089.27 2024-11-14
68 58000000000 38233974.54 2024-11-17
69 58000000000 38323598.99 2024-11-18
So if I select date of November (for example 2024-11-18
, I want to group by selected date, month of selected date and previous month of selected date.
Output:
COA Code 2024-11-18 October November
0 19010000000 42625047.24 1354513618.61 584813860.97
1 58000000000 38323598.99 820927014.08 456265522.64
I'm trying to sum up data of selected date, month of selected date and previous month of selected date but don't know how to do. Below is my sample data and my expected Output:
Sample data:
import pandas as pd
import numpy as np
df = pd.read_excel('https://github/hoatranobita/hoatranobita/raw/refs/heads/main/Check%20data%20(1).xlsx', sheet_name='Data')
df
COA Code USDConversion Amount Base Date 2
0 19010000000 26924582.44 2024-10-01
1 19010000000 38835600.44 2024-10-02
2 19010000000 46794586.57 2024-10-03
3 19010000000 57117346.49 2024-10-06
4 19010000000 69256132.98 2024-10-07
... ... ... ...
65 58000000000 38082130.88 2024-11-12
66 58000000000 38140016.13 2024-11-13
67 58000000000 38160089.27 2024-11-14
68 58000000000 38233974.54 2024-11-17
69 58000000000 38323598.99 2024-11-18
So if I select date of November (for example 2024-11-18
, I want to group by selected date, month of selected date and previous month of selected date.
Output:
COA Code 2024-11-18 October November
0 19010000000 42625047.24 1354513618.61 584813860.97
1 58000000000 38323598.99 820927014.08 456265522.64
Share
Improve this question
edited Nov 19, 2024 at 9:38
hoa tran
asked Nov 19, 2024 at 4:13
hoa tranhoa tran
1,7411 gold badge6 silver badges19 bronze badges
1
- So, you want to provide a date and get the value and that of the month and previous month? Can you have several COA codes per date? – mozway Commented Nov 19, 2024 at 5:35
1 Answer
Reset to default 1The exact generalization of your question is not fully clear, but assuming that you want to group by COA Code, you could ensure everything is a datetime/periods, then select the appropriate rows with boolean indexing and between
, finally perform a groupby.sum
of those rows and concat
to the original date rows.
Here as a function for clarity:
def get_previous(df, date, date_col='Base Date 2'):
# ensure working with datetime/period objects
date = pd.Timestamp(date)
period = date.to_period('M')
dt = pd.to_datetime(df[date_col])
p = dt.dt.to_period('M')
# select rows to keep
m = p.between(period-1, period, inclusive='both')
# produce rows with original date
# aggregate previous and current month
# combine and rename the columns
return pd.concat([
df[df[date_col].eq(date)]
.set_index('COA Code')['USDConversion Amount']
.rename(date.strftime('%Y-%m-%d')),
df[m].groupby(['COA Code', p])['USDConversion Amount']
.sum().unstack(date_col)
.rename(columns=lambda x: x.strftime('%B'))
], axis=1).reset_index()
out = get_previous(df, '2024-11-18')
Output:
COA Code 2024-11-18 October November
0 58000000000 38323598.99 8.209270e+08 4.562655e+08
NB. you can replace groupby.sum
+unstack
by pivot_table
(df[m].assign(col=p).pivot_table(index='COA Code', columns='col', values='USDConversion Amount', aggfunc='sum')
).