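Python pandas: special character as separator
Question:
I have a text file that uses the special character [˛] as its separator. I copied and pasted this character as the separator in my read_csv command and got the following warning: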
ParserWarning: Falling back to the 'python' engine because the
separator encoded in utf-8 is > 1 char long, and the 'c' engine does
not support such separators; you can avoid this warning by specifying
engine='python'.
"""Entry point for launching an IPython kernel.
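How can I use a special character as the separator when reading a text file?
Answer:
You only get a warning, not an error, and it is easy to remove: add engine='python'.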
Under the hood pandas uses a fast and efficient parser implemented in C as well as a python implementation which is currently more feature-complete. Where possible pandas uses the C parser (specified as engine='c'), but may fall back to python if C-unsupported options are specified. Currently, C-unsupported options include:
- sep other than a single character (e.g. regex separators)
- skipfooter
- sep=None with delim_whitespace=False
Specifying any of the above options will produce a ParserWarning unless the python engine is selected explicitly using engine='python'.
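The warning text says the separator "encoded in utf-8 is > 1 char long", which can look odd for a single visible character. A minimal sketch of what is being measured, assuming the separator is the ogonek character U+02DB from the question (the helper name `utf8_width` is made up for illustration and is not part of pandas):

```python
def utf8_width(sep):
    """Return the number of bytes the separator occupies in UTF-8."""
    return len(sep.encode("utf-8"))


sep = "\u02db"  # the '˛' character used as separator in the question

print(len(sep))          # number of code points in the string
print(utf8_width(sep))   # number of bytes once encoded as UTF-8
print(utf8_width(","))   # an ASCII separator, for comparison
```

The string holds one code point but two UTF-8 bytes, whereas a comma is a single byte. That is consistent with the C parser rejecting it as a separator and pandas falling back to the python engine.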
import pandas as pd
from io import StringIO  # pandas.compat.StringIO is not available in newer pandas versions

# sample data using '˛' as the field separator
temp = u"""a˛b˛c
1˛3˛5
7˛8˛1
"""
# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp), sep="˛", engine='python')
print(df)
   a  b  c
0  1  3  5
1  7  8  1
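The same call should work on a real file once StringIO is swapped for a path. The following is only a sketch under stated assumptions: the file is UTF-8 encoded, the file name `sample.txt` and the helper `read_ogonek_file` are hypothetical, and the sample file is written first so the snippet runs on its own.

```python
import os
import tempfile

import pandas as pd


def read_ogonek_file(path):
    """Read a text file whose fields are separated by U+02DB ('˛')."""
    # engine='python' is selected explicitly so pandas does not need to fall back
    return pd.read_csv(path, sep="\u02db", engine="python", encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name
    # write the same sample data as above, encoded as UTF-8
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("a\u02dbb\u02dbc\n1\u02db3\u02db5\n7\u02db8\u02db1\n")
    df = read_ogonek_file(path)

print(df)
```

If the file was saved in a different encoding, the encoding argument would need to match it. Otherwise the separator may not be recognised.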