最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

python - pandas how to use columns value as pointer to another multindex to look up values - Stack Overflow

programmeradmin2浏览0评论

Say I have a dataset with multiindex i1 i2 i3 and columns A B1 .. B_n where A stores the i3 level index of a value I want to look up from another observation. I want to use this value to look up the B2 value. There can be pointers to a value not in the df or multiple pointers to the same value. However I am not getting what I want and suspect merging might not be the best way of doing this.

For example row (1,10,102) should have looked up i3=101's B2_x value here:

            A  B1  B2_x  c1  B2_y  b2_wanted
i1 i2 i3                                     
1  10 101  103   2     1   1   2.0        2.0
      102  101   2     2   2   NaN        1.0
      103  100   1     2   2   1.0        NaN
   11 100  102   1     2   2   NaN        NaN
      101  103   1     2   2   NaN        NaN
   12 101  103   2     1   2   1.0        1.0
      103  101   1     1   2   1.0        1.0
2  10 100  103   1     2   1   NaN        NaN
      101  103   1     1   2   1.0        NaN
      102  101   1     1   1   NaN        1.0

This is the code for my attempt

# generate test df
import pandas as pd
import numpy as np
np.random.seed(2)
range1 = range(1, 4) 
range2 = range(10, 13) 
range3 = range(100, 104)  
multi_index = pd.MultiIndex.from_product([range1, range2, range3], names=['i1', 'i2', 'i3'])

df = pd.DataFrame(index=multi_index, data={'A': np.random.randint(100, 104, size=len(multi_index)),
                                           'B1': np.random.randint(1, 3, size=len(multi_index)),
                                           'B2': np.random.randint(1, 3, size=len(multi_index)),
                                           'c1': np.random.randint(1, 3, size=len(multi_index))})

df = df[df.index.get_level_values("i3")!=df["A"]].head(10).copy()

# attempt to solve starts here

vars_to_lookup=df[["A","B2"]].copy().reset_index().set_index(["i1","i2","A"])
vars_to_lookup.index.names=["i1","i2","i3"]
vars_to_lookup.drop("i3",axis=1,inplace=True)

temp_df=pd.merge(df,vars_to_lookup,left_index=True,right_index=True,how="left")
temp_df["b2_wanted"]= pd.Series([2,1,np.nan,np.nan,np.nan,1,1,np.nan,np.nan,1 ],index=df.index)

Say I have a dataset with multiindex i1 i2 i3 and columns A B1 .. B_n where A stores the i3 level index of a value I want to look up from another observation. I want to use this value to look up the B2 value. There can be pointers to a value not in the df or multiple pointers to the same value. However I am not getting what I want and suspect merging might not be the best way of doing this.

For example row (1,10,102) should have looked up i3=101's B2_x value here:

            A  B1  B2_x  c1  B2_y  b2_wanted
i1 i2 i3                                     
1  10 101  103   2     1   1   2.0        2.0
      102  101   2     2   2   NaN        1.0
      103  100   1     2   2   1.0        NaN
   11 100  102   1     2   2   NaN        NaN
      101  103   1     2   2   NaN        NaN
   12 101  103   2     1   2   1.0        1.0
      103  101   1     1   2   1.0        1.0
2  10 100  103   1     2   1   NaN        NaN
      101  103   1     1   2   1.0        NaN
      102  101   1     1   1   NaN        1.0

This is the code for my attempt

# generate test df
import pandas as pd
import numpy as np
np.random.seed(2)
range1 = range(1, 4) 
range2 = range(10, 13) 
range3 = range(100, 104)  
multi_index = pd.MultiIndex.from_product([range1, range2, range3], names=['i1', 'i2', 'i3'])

df = pd.DataFrame(index=multi_index, data={'A': np.random.randint(100, 104, size=len(multi_index)),
                                           'B1': np.random.randint(1, 3, size=len(multi_index)),
                                           'B2': np.random.randint(1, 3, size=len(multi_index)),
                                           'c1': np.random.randint(1, 3, size=len(multi_index))})

df = df[df.index.get_level_values("i3")!=df["A"]].head(10).copy()

# attempt to solve starts here

vars_to_lookup=df[["A","B2"]].copy().reset_index().set_index(["i1","i2","A"])
vars_to_lookup.index.names=["i1","i2","i3"]
vars_to_lookup.drop("i3",axis=1,inplace=True)

temp_df=pd.merge(df,vars_to_lookup,left_index=True,right_index=True,how="left")
temp_df["b2_wanted"]= pd.Series([2,1,np.nan,np.nan,np.nan,1,1,np.nan,np.nan,1 ],index=df.index)
Share Improve this question asked Mar 17 at 3:27 mother brainmother brain 255 bronze badges
Add a comment  | 

1 Answer 1

Reset to default 0

Here's one approach:

df['B2_wanted'] = (
    df.reindex(
        df.droplevel('i3').set_index('A', append=True).index
    )['B2'].to_numpy()
)

Output:

             A  B1  B2  c1  B2_wanted
i1 i2 i3                             
1  10 101  103   2   1   1        2.0
      102  101   2   2   2        1.0
      103  100   1   2   2        NaN
   11 100  102   1   2   2        NaN
      101  103   1   2   2        NaN
   12 101  103   2   1   2        1.0
      103  101   1   1   2        1.0
2  10 100  103   1   2   1        NaN
      101  103   1   1   2        NaN
      102  101   1   1   1        1.0

Explanation

  • Use df.droplevel + df.set_index to exchange level 'i3' for column 'A' and select df.index.
  • Pass the new index to df.reindex and assign 'B2' values via Series.to_numpy.
发布评论

评论列表(0)

  1. 暂无评论