python - how to read a file and reformat some columns into new rows - Stack Overflow

I have data * in the following format in a text file.

x1 y1 z1 x2 y2 z2 x3 y3 z3 data1x data1y data1z data2x data2y data2z data3x data3y data3z

I'm trying to reformat the file so that the output looks like this:

x1 y1 z1 data1x data1y data1z 
x2 y2 z2 data2x data2y data2z
x3 y3 z3 data3x data3y data3z

I'm sure there's a smart way to do this using, for example, pandas. I tried pivot and pivot_table, but the output wasn't quite right. Could you please help me?

  • It's node coordinates and the corresponding vector field values (all floats), and in reality each row consists of 10 nodes, and there are thousands of rows in the file

This

all_data = []
with open(old_file, "r") as file1:
    skip_first_row = file1.readline()

    for line in file1:
        line_list = line.split()

        all_data.append(line_list[0:3]+line_list[30:33])
        all_data.append(line_list[3:6]+line_list[33:36])
        all_data.append(line_list[6:9]+line_list[36:39])


with open(new_file, "w") as file2:
    write_headline = file2.write("%s\n" % headline)
    for data in all_data:
        file2.write(' '.join(data) + '\n')

does the job, but I'm sure there's a smarter way!

Asked Mar 14 at 16:56 by user29975383; edited Mar 16 at 19:50 by globglogabgalab.

2 Answers

Another possible solution, based on pandas and numpy:

from io import StringIO

import numpy as np
import pandas as pd

txt = "x1 y1 z1 x2 y2 z2 x3 y3 z3 data1x data1y data1z data2x data2y data2z data3x data3y data3z"

pd.DataFrame(np.hstack(
    pd.read_csv(StringIO(txt), sep=r'\s+', header=None).values
    .reshape(-1, 3, 3))) # replace StringIO(txt) with 'your_file.csv'

It first reads the whitespace-separated data into a pandas DataFrame using pd.read_csv, with StringIO(txt) as a placeholder for the file input. It then extracts the underlying numpy array using .values and reshapes it into a 3D array of shape (-1, 3, 3); for this single sample line that gives two 3x3 blocks, one holding the coordinates and one holding the data. Finally, np.hstack concatenates these blocks horizontally, so each node's coordinates end up next to its data values.

Output:

    0   1   2       3       4       5
0  x1  y1  z1  data1x  data1y  data1z
1  x2  y2  z2  data2x  data2y  data2z
2  x3  y3  z3  data3x  data3y  data3z
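
The reshape to (-1, 3, 3) fits the one-line, three-node sample. A minimal sketch of how the same idea might extend to many rows and any number of nodes per row, assuming each row lists all node coordinates first and then all field values, three components per node. The function name rows_to_nodes and the parameter n_nodes are hypothetical, not part of the answer above:

```python
import numpy as np


def rows_to_nodes(values, n_nodes):
    """Turn rows of [coords..., data...] into one row per node.

    values: 2D array of shape (n_rows, 2 * n_nodes * 3)
    returns: 2D array of shape (n_rows * n_nodes, 6)
    """
    values = np.asarray(values)
    n_rows = values.shape[0]
    # Axes after reshape: (row, coords-or-data, node, component)
    blocks = values.reshape(n_rows, 2, n_nodes, 3)
    # Put the node axis ahead of the coords/data axis, then flatten
    # so each output row is [x, y, z, datax, datay, dataz] for one node.
    return blocks.transpose(0, 2, 1, 3).reshape(n_rows * n_nodes, 6)


# Two input rows with three nodes each, filled with 0..35 for illustration
sample = np.arange(36).reshape(2, 18)
result = rows_to_nodes(sample, n_nodes=3)
print(result)
```

For the real file described in the question, the array passed in could come from pd.read_csv(..., sep=r'\s+', header=None, skiprows=1).to_numpy() with n_nodes=10, assuming the first line is a headline to skip as in the question's own loop.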

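You can use batched (from itertools, Python 3.12+) to split each line into batches of three, and then use zip to pair each coordinate batch with its data batch:

from itertools import batched  # requires Python 3.12 or newer

all_data = []
with open(old_file, "r") as inputfile:
    inputfile.readline()  # skip the first (headline) row

    for line in inputfile:
        # Batches of three: node coordinates first, then field values
        triples = list(batched(line.split(), 3))
        half = len(triples) // 2
        # Join the i-th coordinate triple with the i-th data triple
        all_data += [xyz + data for xyz, data in zip(triples[:half], triples[half:])]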
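
On Python versions before 3.12, where itertools.batched is not available, the same chunking can be sketched with plain list slicing. This is only an illustration under the assumption that each line holds all coordinate triples followed by all data triples; the name split_nodes and its width parameter are hypothetical:

```python
def split_nodes(tokens, width=3):
    """Pair each coordinate chunk with the matching data chunk."""
    # Cut the token list into consecutive chunks of `width` items
    chunks = [tokens[i:i + width] for i in range(0, len(tokens), width)]
    half = len(chunks) // 2
    # First half holds coordinates, second half the field values
    return [xyz + data for xyz, data in zip(chunks[:half], chunks[half:])]


line = ("x1 y1 z1 x2 y2 z2 x3 y3 z3 "
        "data1x data1y data1z data2x data2y data2z data3x data3y data3z")
rows = split_nodes(line.split())
for row in rows:
    print(" ".join(row))
```

Because the split point is computed from the number of chunks, the same function should handle the 10-node rows mentioned in the question without hard-coded slice indices.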