Python pandas read_table with whitespace delimiter
Problem description:
I have this sample txt file, which looks like:
ACW00011604 17.1167 -61.7833 10.1 ST JOHNS COOLIDGE FLD
ACW00011647 17.1333 -61.7833 19.2 ST JOHNS
E000041196 25.3330 55.5170 34.0 SHARJAH INTER. AIRP
and so on.
I need this file to be loaded as:
X X.1 X.3 X.4 X.5
ACW00011604 17.1167 -61.7833 10.1 ST JOHNS COOLIDGE FLD
ACW00011647 17.1333 -61.7833 19.2 ST JOHNS
I tried:
import pandas as pd
ds=pd.read_table("st.txt", delim_whitespace=True, header=None)
But it loads the file as:
X X.1 X.3 X.4 X.5 X.6 X.7 X.8
ACW00011604 17.1167 -61.7833 10.1 ST JOHNS COOLIDGE FLD
ACW00011647 17.1333 -61.7833 19.2 ST JOHNS
E000041196 25.3330 55.5170 34.0 SHARJAH INTER. AIRP
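How can I handle this?
Answer
With delim_whitespace=True every run of whitespace is a separator, so each word of the station name ends up in its own column. Use read_fwf instead, which reads fixed-width formatted files, and pass header=None together with the column names you want: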
In [18]:
import io
import pandas as pd
t="""ACW00011604 17.1167 -61.7833 10.1 ST JOHNS COOLIDGE FLD
ACW00011647 17.1333 -61.7833 19.2 ST JOHNS
E000041196 25.3330 55.5170 34.0 SHARJAH INTER. AIRP"""
df = pd.read_fwf(io.StringIO(t), header=None, names=['X','X.1','X.3','X.4', 'X.5'])
df
Out[18]:
X X.1 X.3 X.4 X.5
0 ACW00011604 17.1167 -61.7833 10.1 ST JOHNS COOLIDGE FLD
1 ACW00011647 17.1333 -61.7833 19.2 ST JOHNS
2 E000041196 25.3330 55.5170 34.0 SHARJAH INTER. AIRP
So in your case, the following should work:
ds=pd.read_fwf("st.txt", header=None, names=['X','X.1','X.3','X.4', 'X.5'])
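Note that read_fwf infers the column boundaries from how the columns line up, so it relies on the file really being fixed-width. If the fields are separated by single spaces only, one alternative is to split each line on the first four runs of whitespace and keep the rest of the line as the name. The following is a minimal sketch under that assumption, not part of the original answer; the helper name read_station_table is hypothetical, and it assumes every line has all five fields.

```python
import io
import pandas as pd

sample = """ACW00011604 17.1167 -61.7833 10.1 ST JOHNS COOLIDGE FLD
ACW00011647 17.1333 -61.7833 19.2 ST JOHNS
E000041196 25.3330 55.5170 34.0 SHARJAH INTER. AIRP"""


def read_station_table(source, names=("X", "X.1", "X.3", "X.4", "X.5")):
    """Hypothetical helper: split each line into len(names) fields.

    Only the first len(names) - 1 whitespace runs act as separators,
    so the last field keeps its internal spaces.
    """
    rows = [
        line.strip().split(None, len(names) - 1)
        for line in source
        if line.strip()  # skip blank lines
    ]
    df = pd.DataFrame(rows, columns=list(names))
    # The three fields after the ID are numeric; convert them from strings.
    for col in names[1:4]:
        df[col] = pd.to_numeric(df[col])
    return df


# `source` can be any iterable of lines, e.g. an open file object.
df = read_station_table(io.StringIO(sample))
print(df)
```

For the real file, the same helper could be called as read_station_table(open("st.txt")), ideally inside a with block so the file is closed afterwards.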