python - Load a csv with consecutive trailing separators as separate null columns in a pandas dataframe - Stack Overflow

I have a csv file of the following format: 12;0;5/15/2008;1:01:09;1;0;0;None;97;39;0.279;;;0;;;;;0;0;0;;;;;;;;;;;;;;;;

Then I read the file into a pandas dataframe:

df = pd.read_csv(config.get("OS", "output_file"), header=None, delimiter=config.get("OS", "delimiter"), decimal=".")
df=df.fillna(value='')
print(df)

and in the output I only have 13 columns ending with the last non-null column.

How can I preserve the number of columns in the dataframe and have the trailing columns in the dataframe set to null?

asked Mar 14 at 1:21 by Shmyg
  • Welcome to SO. What datatype do you want those columns to be? They're defaulting to float64 (for me at least) with 'NaN' as the default. I'm also seeing 36 columns! When you say you only have 13 columns, do you mean the rest are missing, or that you only want 13? Is this on Windows? What does config.get("OS", "delimiter") resolve to? – ticktalk Commented Mar 14 at 1:23
  • Delimiter is set to semicolon. I have only 13 or so. After the last 0, there is nothing. I ran this on both Windows and Linux. I haven't thought about the data type; at the moment, my concern is getting the columns back. – Shmyg Commented Mar 14 at 1:30
  • Show the output of the operations (copy/paste, no screenshots), thanks. – ticktalk Commented Mar 14 at 6:13

1 Answer

Here's what I get (Linux Mint 20.x, Python 3.12.2, pandas 2.2.2):

cat shmyg.py
import pandas as pd
import sys

if len(sys.argv) < 3:
    print(f'usage: {sys.argv[0]} separator filename')
    sys.exit(1)

df = pd.read_csv(sys.argv[2], sep=sys.argv[1], header=None)
df.info()
print(df)

####

cat shmyg.txt 
12;0;5/15/2008;1:01:09;1;0;0;None;97;39;0.279;;;0;;;;;0;0;0;;;;;;;;;;;;;;;;

####
python shmyg.py ';' shmyg.txt 
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1 entries, 0 to 0
Data columns (total 37 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   0       1 non-null      int64  
 1   1       1 non-null      int64  
 2   2       1 non-null      object 
 3   3       1 non-null      object 
 4   4       1 non-null      int64  
 5   5       1 non-null      int64  
 6   6       1 non-null      int64  
 7   7       0 non-null      float64
 8   8       1 non-null      int64  
 9   9       1 non-null      int64  
 10  10      1 non-null      float64
 11  11      0 non-null      float64
 12  12      0 non-null      float64
 13  13      1 non-null      int64  
 14  14      0 non-null      float64
 15  15      0 non-null      float64
 16  16      0 non-null      float64
 17  17      0 non-null      float64
 18  18      1 non-null      int64  
 19  19      1 non-null      int64  
 20  20      1 non-null      int64  
 21  21      0 non-null      float64
 22  22      0 non-null      float64
 23  23      0 non-null      float64
 24  24      0 non-null      float64
 25  25      0 non-null      float64
 26  26      0 non-null      float64
 27  27      0 non-null      float64
 28  28      0 non-null      float64
 29  29      0 non-null      float64
 30  30      0 non-null      float64
 31  31      0 non-null      float64
 32  32      0 non-null      float64
 33  33      0 non-null      float64
 34  34      0 non-null      float64
 35  35      0 non-null      float64
 36  36      0 non-null      float64
dtypes: float64(24), int64(11), object(2)
memory usage: 428.0+ bytes
   0   1          2        3   4   5   6   7   8   9      10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
0  12   0  5/15/2008  1:01:09   1   0   0 NaN  97  39  0.279 NaN NaN   0 NaN NaN NaN NaN   0   0   0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
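
The same check can be sketched without any files on disk. The snippet below is a minimal reproduction, not the asker's actual setup: it feeds the sample row through `io.StringIO`, counts the fields independently of pandas, and compares that with the resulting frame width. The `names=` variant is an assumption about one way to pin the column count explicitly; it is not something the question or the answer above uses. Checking `df.shape` instead of the printed frame also sidesteps any confusion caused by pandas abbreviating wide frames on display.

```python
import io

import pandas as pd

# Sample row copied from the question; the delimiter is a semicolon.
raw = "12;0;5/15/2008;1:01:09;1;0;0;None;97;39;0.279;;;0;;;;;0;0;0;;;;;;;;;;;;;;;;\n"

# Count the fields without pandas: number of delimiters plus one.
expected_cols = raw.rstrip("\n").count(";") + 1

# Plain read, as in the answer: empty fields (including trailing ones) become NaN.
df = pd.read_csv(io.StringIO(raw), sep=";", header=None)

# Assumed alternative: pin the width explicitly by naming every column.
df_pinned = pd.read_csv(
    io.StringIO(raw),
    sep=";",
    header=None,
    names=list(range(expected_cols)),
)

# Compare shapes rather than printed output, which may be abbreviated.
print(expected_cols, df.shape, df_pinned.shape)
```

If `df.shape[1]` matches `expected_cols` here but not in the real script, the difference is more likely in the input file or the configured delimiter than in `read_csv` itself.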