I have a pandas DataFrame where I want to create new columns and fill them based on another column's values. I also want to know whether the new column's value was updated. I have a dictionary like this:
update_values = {"Groub_A": {"aadff2": "Mark", "aasd12": "Otto", "asdd2": "Jhon"},"Groub_B": {"aadfaa": "Josh", "aa1113": "Math", "967323sd": "Marek"}}
And I want my table to look like this:
Column_1 | Column_new_2 | Column_new_3
aadff2 | Mark | Groub_A
aadff2 | Mark | Groub_A
aasd12 | Otto | Groub_A
asdd2 | Jhon | Groub_A
967323sd | Marek | Groub_B
967323sd | Marek | Groub_B
aa1113 | Math | Groub_B
So far I have just copied Column_1 and used df.replace({"Column_new_2": update_values["Groub_A"]}), and done the same thing with Groub_B, but then I don't know how to make Column_new_3.
There must be an easy solution, but I just can't figure it out.
3 Answers
Consider a nested list/dict comprehension to build a list of dictionaries to pass into pandas.DataFrame.from_records. Then, left merge against the current data frame to add the new columns.
new_data = [
    {"Column_1": k, "Column_new_2": v, "Column_new_3": gk}
    for gk, gv in update_values.items()
    for k, v in gv.items()
]
current_df = current_df.merge(
    pd.DataFrame.from_records(new_data), on="Column_1", how="left"
)
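The question also asks how to tell whether a row actually received a value. One option is the `indicator` parameter of `merge`. Below is a minimal sketch, assuming a small sample frame; the key "zzz999" and the column name `matched` are made up for illustration and are not from the question.

```python
import pandas as pd

update_values = {
    "Groub_A": {"aadff2": "Mark", "aasd12": "Otto", "asdd2": "Jhon"},
    "Groub_B": {"aadfaa": "Josh", "aa1113": "Math", "967323sd": "Marek"},
}

# Sample frame; "zzz999" is a hypothetical key that is absent from update_values.
current_df = pd.DataFrame(
    {"Column_1": ["aadff2", "aasd12", "967323sd", "zzz999"]}
)

# Flatten the nested dictionary into one record per key.
new_data = [
    {"Column_1": k, "Column_new_2": v, "Column_new_3": gk}
    for gk, gv in update_values.items()
    for k, v in gv.items()
]

# indicator=True adds a "_merge" column describing where each row came from.
merged = current_df.merge(
    pd.DataFrame.from_records(new_data),
    on="Column_1",
    how="left",
    indicator=True,
)

# "_merge" is "both" for rows that found a match and "left_only" otherwise.
merged["matched"] = merged["_merge"] == "both"
merged = merged.drop(columns="_merge")
```

Rows whose key is not in any group keep NaN in both new columns and get `matched` set to False, so they are easy to filter out afterwards.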
import pandas as pd
# Original DataFrame
df = pd.DataFrame({
    'Column_1': ['aadff2', 'aadff2', 'aasd12', 'asdd2', '967323sd', '967323sd', 'aa1113']
})
# Dictionary with values to update
update_values = {"Groub_A": {"aadff2": "Mark", "aasd12": "Otto", "asdd2": "Jhon"},
                 "Groub_B": {"aadfaa": "Josh", "aa1113": "Math", "967323sd": "Marek"}}
# Create an empty list to store new rows
new_rows = []
# Iterate over the dictionary and create a new DataFrame
for group, values in update_values.items():
    for key, value in values.items():
        new_rows.append([key, value, group])
# Create a DataFrame from the new rows
df_update = pd.DataFrame(new_rows, columns=['Column_1', 'Column_new_2', 'Column_new_3'])
# Merge the original DataFrame with the new DataFrame
df = df.merge(df_update, on='Column_1', how='left')
print(df)
Another possible solution, which uses map to create the two new columns:
df.assign(
    **dict(zip(
        ['Column_2', 'Column_3'],
        zip(*df['Column_1'].map(
            lambda x: [update_values["Groub_A"].get(x), 'Group_A']
            if x in update_values["Groub_A"]
            else [update_values["Groub_B"].get(x), 'Group_B']
        ))
    ))
)
Output:
   Column_1 Column_2 Column_3
0    aadff2     Mark  Group_A
1    aadff2     Mark  Group_A
2    aasd12     Otto  Group_A
3     asdd2     Jhon  Group_A
4  967323sd    Marek  Group_B
5  967323sd    Marek  Group_B
6    aa1113     Math  Group_B
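Note that the lambda above labels any key found in neither group as Group_B, with None as the name. A possible variant, sketched here under the assumption that each key belongs to at most one group, flattens the nested dictionary into two plain lookup dicts and calls `Series.map` once per new column. The names `name_of` and `group_of` and the key "zzz999" are hypothetical.

```python
import pandas as pd

update_values = {
    "Groub_A": {"aadff2": "Mark", "aasd12": "Otto", "asdd2": "Jhon"},
    "Groub_B": {"aadfaa": "Josh", "aa1113": "Math", "967323sd": "Marek"},
}

# "zzz999" is a hypothetical key that is absent from update_values.
df = pd.DataFrame(
    {"Column_1": ["aadff2", "aadff2", "aasd12", "967323sd", "aa1113", "zzz999"]}
)

# key -> name, e.g. "aadff2" -> "Mark"
name_of = {k: v for group in update_values.values() for k, v in group.items()}
# key -> group label, e.g. "aadff2" -> "Groub_A"
group_of = {k: g for g, group in update_values.items() for k in group}

# Series.map with a dict leaves NaN where the key has no entry.
result = df.assign(
    Column_new_2=df["Column_1"].map(name_of),
    Column_new_3=df["Column_1"].map(group_of),
)
```

If the same key could appear in more than one group, the last group in the dictionary would win in both lookups, so the merge-based answers may be the safer choice in that case.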