Excel file #1:
Date Data1 Data2
1-1-2025 x1 x2
1-2-2025 x3 x4
Excel file #2:
Date Data1 Data2
1-3-2025 y1 y2
1-4-2025 y3 y4
I want to merge them into a single file:
Date Data1 Data2
1-1-2025 x1 x2
1-2-2025 x3 x4
1-3-2025 y1 y2
1-4-2025 y3 y4
My code:
import os
import pandas as pd

# file_paths: list of paths to the Excel files (defined elsewhere)
data_frames = []
for file_path in file_paths:
    # Get the date from the file name
    file_name = file_path
    file_name = os.path.splitext(file_name)[0]
    file_name = file_name[-8:]
    file_name = file_name.replace('_','/')
    # Read the Excel file
    df = pd.read_excel(file_path, sheet_name='Parte', usecols='O:AN', skiprows=10, nrows=110, header=[0])
    data_frames.append(df)
combined_df = pd.concat(data_frames, ignore_index=True)
combined_df.to_excel('rute1', index=False)
The result is not correct:
Date Data1 Data2
1-1-2025 x1 x2
1-2-2025 x3 x4
1-3-2025 y1 y2
1-4-2025 y3 y4
edited Feb 15 at 3:58
user4157124
asked Jan 28 at 20:45
RodRod
1 Answer
It looks like the Date column isn't recognized as a datetime object. Convert the Date column to a datetime format and, after combining the data, sort the rows by Date:
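# Inside the loop, before data_frames.append(df): parse Date as datetime
df['Date'] = pd.to_datetime(df['Date'])
# After the loop: combine all frames, then sort the rows by Date
combined_df = pd.concat(data_frames, ignore_index=True)
combined_df = combined_df.sort_values(by='Date')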
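To try this without the original workbooks, here is a minimal sketch that uses two in-memory DataFrames as stand-ins for the Excel files in the question. The helper name `combine_by_date` and the `date_format` value are assumptions for illustration: the sample dates are read as month-day-year, which may not match the real files.

```python
import pandas as pd


def combine_by_date(frames, date_col="Date", date_format="%m-%d-%Y"):
    """Concatenate frames, parse the date column, and sort chronologically.

    date_format is an assumption (month-day-year); adjust it to the real data.
    """
    combined = pd.concat(frames, ignore_index=True)
    combined[date_col] = pd.to_datetime(combined[date_col], format=date_format)
    # Sort by the parsed dates and renumber the rows from 0
    return combined.sort_values(by=date_col).reset_index(drop=True)


# Stand-ins for the two Excel files from the question
file1 = pd.DataFrame(
    {"Date": ["1-1-2025", "1-2-2025"], "Data1": ["x1", "x3"], "Data2": ["x2", "x4"]}
)
file2 = pd.DataFrame(
    {"Date": ["1-3-2025", "1-4-2025"], "Data1": ["y1", "y3"], "Data2": ["y2", "y4"]}
)

# Pass the frames out of order to show that sorting restores date order
combined_df = combine_by_date([file2, file1])
print(combined_df)
```

Converting once after `pd.concat` avoids repeating the conversion for every file; converting inside the loop, as in the answer, should give the same result.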
You can reorder the columns of a pd.DataFrame simply by selecting them in your preferred order, e.g. df: pd.DataFrame = df[["columns", "in-the", "order", "required"]]
– Felipe Whitaker Commented Jan 28 at 20:48
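As a small illustration of the column selection suggested in the comment, here is a sketch using the column names from the question's sample tables. The sample frame and the `desired_order` name are made up for the example.

```python
import pandas as pd

# Sample frame whose columns are deliberately out of order
df = pd.DataFrame(
    {"Data2": ["x2", "x4"], "Date": ["1-1-2025", "1-2-2025"], "Data1": ["x1", "x3"]}
)

# Selecting with a list of names returns the columns in that order
desired_order = ["Date", "Data1", "Data2"]
df = df[desired_order]
print(list(df.columns))
```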