TypeError: unhashable type 'series' / 'numpy.ndarray' - Stack Overflow

I'm trying to prepare data for visualization with seaborn, so I need the number of sessions of each type for a multi-line chart.

With
session_cnt = df.groupby(df['EVENT_DATETIME'].dt.date, df['CUSTOMER_ID']).agg(session_count=('SESSION_ID', 'nunique'), app_session_cnt=('APP_SESSION_ID', 'nunique')).reset_index()

I got

TypeError: unhashable type 'series'

So I did that:

session_cnt = df.groupby(df['EVENT_DATETIME'].dt.date, df['CUSTOMER_ID'].values).agg(session_count=('SESSION_ID', 'nunique'), app_session_cnt=('APP_SESSION_ID', 'nunique')).reset_index()

But got

TypeError: unhashable type 'numpy.ndarray'

I'd like to understand which column to check when using groupby and getting TypeError, because now I'm only guessing. Maybe I need to read a good article on that error.

Asked Jan 18 at 12:53 by march_1, edited Jan 18 at 17:43. Comments:
  • I don't see a difference between the 2 tries. Show the full error message. hashing means something is being used as a dict key. A string or number works. A list, array, or Series don't. – hpaulj Commented Jan 18 at 17:33
  • @hpaulj Oh, sorry, my bad. I've corrected the 1st try, it was without .values after df['CUSTOMER_ID']. How to get which column is list/array or series? I checked dtypes, thought that meant something (as I noticed that APP_SESSION_ID was object but SESSION_ID was int64). I tried making both object but the hashing problem still exists. – march_1 Commented Jan 18 at 18:00
  • df[name] is always a Series (a DataFrame column), and .values makes an array from it. The trick here is to understand what the groupby by argument should be. – hpaulj Commented Jan 18 at 19:26
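
To illustrate the hashing point made in the comments above, here is a minimal sketch. The helper `is_hashable` and the sample Series are made up for this illustration and are not part of the original question. It only shows that a column label can be hashed while a Series or an array cannot, which is why passing one where pandas expects a plain label or option value raises a TypeError.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, only for illustration.
s = pd.Series([1, 2, 3], name="CUSTOMER_ID")


def is_hashable(obj):
    """Return True if hash(obj) succeeds, i.e. obj could be used as a dict key."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


for label, obj in [
    ("column label (str)", "CUSTOMER_ID"),
    ("pandas Series", s),
    ("numpy array", s.values),
    ("list", [1, 2, 3]),
]:
    print(f"{label}: hashable={is_hashable(obj)}")
```

So when a groupby call raises an "unhashable type" error, a reasonable first check is the shape of the call (what was passed in which argument) rather than the dtypes of the columns.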

1 Answer

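groupby takes all of its keys through a single by argument, which the pandas documentation describes as a "mapping, function, label, pd.Grouper or list of such". In the question, the second Series was passed as a separate positional argument, so pandas did not treat it as a grouping key. To group by several keys, put them in one list: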

session_cnt = (df.groupby([df['EVENT_DATETIME'].dt.date, df['CUSTOMER_ID']]) 
                 .agg(session_count=('SESSION_ID', 'nunique'), 
                      app_session_cnt=('APP_SESSION_ID', 'nunique'))
                 .reset_index()
)
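
The fix above can be tried on a small frame. The following is a sketch under stated assumptions: the sample rows are invented, only the column names come from the question, and renaming the date key to `EVENT_DATE` is an optional choice made here so the output column is easy to tell apart from the original timestamp.

```python
import pandas as pd

# Made-up sample data; only the column names come from the question.
df = pd.DataFrame({
    "EVENT_DATETIME": pd.to_datetime([
        "2024-01-01 09:00", "2024-01-01 10:00",
        "2024-01-01 11:00", "2024-01-02 09:30",
    ]),
    "CUSTOMER_ID": [1, 1, 2, 1],
    "SESSION_ID": [10, 11, 12, 13],
    "APP_SESSION_ID": ["a", "a", "b", "c"],
})

# Derived grouping key; the name EVENT_DATE is an assumption for this example.
event_date = df["EVENT_DATETIME"].dt.date.rename("EVENT_DATE")

# Both keys go into ONE list, which is passed as the single `by` argument.
session_cnt = (
    df.groupby([event_date, df["CUSTOMER_ID"]])
      .agg(session_count=("SESSION_ID", "nunique"),
           app_session_cnt=("APP_SESSION_ID", "nunique"))
      .reset_index()
)
print(session_cnt)
```

Because CUSTOMER_ID is already a column of df, the plain label should work as well: df.groupby([event_date, "CUSTOMER_ID"]).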