I'm not well versed in Python/SQL joins in general, but it looks like this inner join produces two columns, my_id and mod_id,
that hold identical values?
df = (
    lhs_df[LHS_COLS]
    .merge(
        rhs_df, left_on="my_id", right_on="mod_id", suffixes=("_lhs", "_rhs"), how="inner"
    )
    .merge(final_df[COLS], on="my_id", how="left")
    .rename(
        columns={
            "my_id": "id_fixed",
            "mod_id": "id_mod",
        }
    )
)
I don't understand the point of keeping both columns, rather than dropping one of them and having downstream consumers that used id_mod
use id_fixed
instead. Am I misunderstanding how the inner join works here?
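
For reference, here is a minimal sketch of what I think is happening, using made-up toy frames (the data and the extra columns "a" and "b" are hypothetical stand-ins, not my real tables). As far as I can tell, when the key names differ and are passed via left_on/right_on, pandas keeps both key columns in the result, and after an inner join they agree on every row:

```python
import pandas as pd

# Hypothetical toy frames standing in for lhs_df and rhs_df
lhs_df = pd.DataFrame({"my_id": [1, 2, 3], "a": ["x", "y", "z"]})
rhs_df = pd.DataFrame({"mod_id": [2, 3, 4], "b": [20, 30, 40]})

merged = lhs_df.merge(rhs_df, left_on="my_id", right_on="mod_id", how="inner")

# With left_on/right_on, both key columns survive the merge
both_keys_kept = {"my_id", "mod_id"} <= set(merged.columns)

# Every row kept by an inner join matched on the keys,
# so the two key columns agree row by row
keys_identical = bool((merged["my_id"] == merged["mod_id"]).all())

# Dropping one key column therefore loses no information here
deduped = merged.drop(columns="mod_id").rename(columns={"my_id": "id_fixed"})

print(merged)
print(both_keys_kept, keys_identical)
print(list(deduped.columns))
```

If that is right, the duplication only seems to matter for how="left" or how="outer", where unmatched rows would leave one of the two key columns as NaN; with how="inner" the second column looks redundant.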