python - How to specify column name with the suffix based on another column value

#Column X contains the suffix of one of V* columns. Need to put the value from V(X) in column Y.

import pandas as pd
import numpy as np

# sample dataframes
df = pd.DataFrame({
    'EMPLID': [12, 13, 14, 15, 16, 17, 18],
    'V1': [2,3,4,50,6,7,8],
    'V2': [3,3,3,3,3,3,3],
    'V3': [7,15,8,9,10,11,12],
    'X': [2,3,1,3,3,1,2]
})

# Expected output:

   EMPLID  V1  V2  V3  X  Y  
0      12   2   3   7  2  3
1      13   3   3  15  3  15
2      14   4   3   8  1  4
3      15  50   3   9  3  9
4      16   6   3  10  3  10
5      17   7   3  11  1  7
6      18   8   3  12  2  3

Example code I've tried (all got syntax error): df['Y'] = df[f"V + df['X']"]

Any suggestion is appreciated. Thank you.

#Column X contains the suffix of one of V* columns. Need to put the value from V(X) in column Y.

import pandas as pd
import numpy as np

# sample dataframes
df = pd.DataFrame({
    'EMPLID': [12, 13, 14, 15, 16, 17, 18],
    'V1': [2,3,4,50,6,7,8],
    'V2': [3,3,3,3,3,3,3],
    'V3': [7,15,8,9,10,11,12],
    'X': [2,3,1,3,3,1,2]
})

# Expected output:

   EMPLID  V1  V2  V3  X  Y  
0      12   2   3   7  2  3
1      13   3   3  15  3  15
2      14   4   3   8  1  4
3      15  50   3   9  3  9
4      16   6   3  10  3  10
5      17   7   3  11  1  7
6      18   8   3  12  2  3

Example code I've tried (all got syntax error): df['Y'] = df[f"V + df['X']"]

Any suggestion is appreciated. Thank you.

Share Improve this question asked Nov 20, 2024 at 16:57 bione_g dgrace 711 silver badge6 bronze badges

Add a comment |

3 Answers 3

Sorted by: Reset to default 2

The canonical way would be to use indexing lookup, however since the column names and X values are a bit different, you first need to convert to string and prepend V:

idx, cols = pd.factorize('V'+df['X'].astype(str))

df['Y'] = df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]

Output:

   EMPLID  V1  V2  V3  X   Y
0      12   2   3   7  2   3
1      13   3   3  15  3  15
2      14   4   3   8  1   4
3      15  50   3   9  3   9
4      16   6   3  10  3  10
5      17   7   3  11  1   7
6      18   8   3  12  2   3

For the sake of simplicity / clarity, I would use the apply method. It allows you to perform row-wise operations.

So combined with a lamdba function you can do it in one line :

df['Y'] = df.apply(lambda row: row[f"V{row['X']}"], axis=1)

Note the axis=1 is crucial to operate on each, row. By default it is on each column.

import pandas as pd
import numpy as np
df = pd.DataFrame({
    'EMPLID': [12, 13, 14, 15, 16, 17, 18],
    'V1': [2, 3, 4, 50, 6, 7, 8],
    'V2': [3, 3, 3, 3, 3, 3, 3],
    'V3': [7, 15, 8, 9, 10, 11, 12],
    'X': [2, 3, 1, 3, 3, 1, 2]
})
df['Y'] = np.where(df['X'] == 3, df['V3'], df['V1'])
print(df)

科技改变生活-雨落星辰 - 所有的伟大,都源于一个勇敢的开始

python - How to specify column name with the suffix based on another column value - Stack Overflow

3 Answers 3

与本文相关的文章

评论列表(0)