I'm trying to prepare data for visualization with seaborn, so I need the number of sessions of each type for a multi-line chart.
With
session_cnt = df.groupby(df['EVENT_DATETIME'].dt.date, df['CUSTOMER_ID']).agg(session_count=('SESSION_ID', 'nunique'), app_session_cnt=('APP_SESSION_ID', 'nunique')).reset_index()
I got
TypeError: unhashable type: 'Series'
So I tried this instead:
session_cnt = df.groupby(df['EVENT_DATETIME'].dt.date, df['CUSTOMER_ID'].values).agg(session_count=('SESSION_ID', 'nunique'), app_session_cnt=('APP_SESSION_ID', 'nunique')).reset_index()
But got
TypeError: unhashable type: 'numpy.ndarray'
I'd like to understand which column to check when groupby raises a TypeError, because right now I'm only guessing. Maybe I need to read a good article on this error.
Asked Jan 18 at 12:53 by march_1 (edited Jan 18 at 17:43)

1 Answer
groupby takes a "mapping, function, label, pd.Grouper or list of such" as its by argument, so the grouping keys have to be passed together as one list:
session_cnt = (df.groupby([df['EVENT_DATETIME'].dt.date, df['CUSTOMER_ID']])
                 .agg(session_count=('SESSION_ID', 'nunique'),
                      app_session_cnt=('APP_SESSION_ID', 'nunique'))
                 .reset_index()
              )
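
To try the list form end to end, here is a minimal sketch on a small made-up DataFrame. Only the column names come from the question; the sample values and the intermediate name keys are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "EVENT_DATETIME": pd.to_datetime([
        "2024-01-01 09:00", "2024-01-01 10:30",
        "2024-01-01 11:00", "2024-01-02 08:15",
    ]),
    "CUSTOMER_ID": [1, 1, 2, 1],
    "SESSION_ID": [10, 11, 20, 12],
    "APP_SESSION_ID": ["a", "a", "b", "c"],
})

# All grouping keys go into ONE list, which is passed as the `by` argument.
keys = [df["EVENT_DATETIME"].dt.date, df["CUSTOMER_ID"]]

session_cnt = (
    df.groupby(keys)
      .agg(session_count=("SESSION_ID", "nunique"),
           app_session_cnt=("APP_SESSION_ID", "nunique"))
      .reset_index()
)
print(session_cnt)
```

The result should have one row per (date, customer) pair, with the two key columns named after the Series that were used as keys.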
hashing means something is being used as a dict key. A string or number works. A list, array, or Series doesn't. – hpaulj Commented Jan 18 at 17:33

.values after df['CUSTOMER_ID']. How can I find out which column is a list/array or Series? I checked the dtypes, thinking that meant something (I noticed that APP_SESSION_ID was object but SESSION_ID was int64). I tried making both object, but the hashing problem still exists. – march_1 Commented Jan 18 at 18:00

df[name] is always a Series, a dataframe column. .values makes an array from that. The trick here is to understand what the groupby by argument should be. – hpaulj Commented Jan 18 at 19:26
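
The hashing point from the comments can be checked directly. The sketch below suggests the error is about the kind of object passed as an argument, not about the contents or dtype of any column. In pandas 1.x and 2.x the second positional parameter of groupby is axis, which is probably why a Series or array placed there gets looked up like a dict key. The sample Series and the candidates dict are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_hashable

# Hypothetical stand-in for df['CUSTOMER_ID'].
customer_ids = pd.Series([1, 1, 2], name="CUSTOMER_ID")

# Objects one might be tempted to pass as a separate positional argument.
candidates = {
    "column label (str)": "CUSTOMER_ID",
    "Series": customer_ids,
    "ndarray from .values": customer_ids.values,
    "list of Series": [customer_ids],
}

# is_hashable reports whether hash(obj) succeeds, i.e. whether the object
# could be used as a dict key.
hashable = {name: is_hashable(obj) for name, obj in candidates.items()}
for name, ok in hashable.items():
    print(f"{name}: hashable={ok}")
```

Changing a column's dtype does not change this: df[name] is still a Series and .values is still an array, so both stay unhashable. A list is unhashable as well, but groupby accepts one as its by argument, so the list never has to be hashed.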