zip - How to read first row of a zipped csv in python - Stack Overflow

I currently have a zip file that holds a single underlying CSV. I would like to read the file row by row without extracting the entire CSV from the zip.

The underlying CSV is simply too big to extract, so I need a workaround.

asked Feb 7 at 15:05 by polliew
  • If you are running under Windows, zip files are treated as folders, so you only have to supply the right path to your Python program and it should be able to read it. – quamrana, Feb 7 at 15:08
1 Answer

You can stream the CSV straight out of the zip archive and read only its first row:

import zipfile

# Open the archive without extracting it; the member is decompressed on demand.
with zipfile.ZipFile("final_analysis_data.zip") as z:  # 100 MB compressed
    with z.open("final_analysis_data.csv") as f:       # 650 MB uncompressed
        first_row = next(f).decode()  # read and decode only the first line
        input("Check memory usage now, press Enter to continue")
print(first_row)

The input() call only pauses the script so you can verify that the entire archive is not being read into memory. In this example, with a 100 MB archive holding a 650 MB CSV, the Python process uses 6 MB of RAM.
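Since the question asks about reading row by row, the same idea can be extended to parsed CSV rows. The following is only a minimal sketch, assuming the CSV is UTF-8 encoded; the helper names iter_csv_rows and first_csv_row are hypothetical and not part of the original answer. It wraps the binary stream from z.open() in io.TextIOWrapper so that csv.reader can handle quoting and embedded commas.

```python
import csv
import io
import zipfile


def iter_csv_rows(zip_path, member, encoding="utf-8"):
    """Yield parsed CSV rows from one zip member, one row at a time.

    Hypothetical helper; the UTF-8 default is an assumption about the data.
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            # newline="" lets the csv module handle line endings itself.
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            yield from csv.reader(text)


def first_csv_row(zip_path, member, encoding="utf-8"):
    """Return the first parsed row, or None if the member is empty."""
    rows = iter_csv_rows(zip_path, member, encoding)
    try:
        return next(rows, None)
    finally:
        rows.close()  # closes the archive promptly instead of waiting for GC
```

With the file names from the example above, first_csv_row("final_analysis_data.zip", "final_analysis_data.csv") would return the header as a list of strings, and a for loop over iter_csv_rows(...) would walk the rest of the file without loading it all at once.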

Note:

If you feel that this resolves your issue, you might consider closing it as a duplicate of "Read a large zipped text file line by line in python" rather than accepting an answer.
