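How to read a returned value (from a previous function) into pandas, Python?
Problem description:
In the following program, how can I read the values returned by a previous function into pandas?
I want to access (pipe) the data produced by one function in a downstream function.
The Python code looks something like this: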
import pandas as pd

def main():
    data1, data2, data3 = read_file()
    do_calc(data1, data2, data3)

def read_file():
    data1 = ""
    data2 = ""
    data3 = ""
    file1 = open('file1.txt', 'r+').read()
    for line in file1:
        # do something ...
        data1 += calculated_values
    file2 = open('file2.txt', 'r+').read()
    for line in file2:
        # do something ...
        data2 += calculated_values
    file1 = open('file1.txt', 'r+').read()
    for line in file1:
        # do something ...
        data3 += calculated_values
    return data1, data2, data3

def do_calc(data1, data2, data3):
    d1_frame = pd.read_table(data1, sep='\t')
    d2_frame = pd.read_table(data2, sep='\t')
    d3_frame = pd.read_table(data3, sep='\t')
    all_data = [d1_frame, d2_frame, d3_frame]

main()
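What is wrong with the given code? It looks like pandas cannot read the input correctly, even though the values of data1, data2 and data3 print to the screen fine.
read_hdf seems to read the file, but not correctly. Is there a way to read the data returned by a function directly into pandas (without writing it to a file and reading it back)?
Error message: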
Traceback (most recent call last):
  File "calc.py", line 757, in <module>
    main()
  File "calc.py", line 137, in main
    merge_tables(pop1_freq_table, pop2_freq_table, f1_freq_table)
  File "calc.py", line 373, in merge_tables
    df1 = pd.read_table(pop1_freq_table, sep='\t')
  File "/home/everestial007/.local/lib/python3.5/site-packages/pandas/io/parsers.py", line 645, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/everestial007/.local/lib/python3.5/site-packages/pandas/io/parsers.py", line 388, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/everestial007/.local/lib/python3.5/site-packages/pandas/io/parsers.py", line 729, in __init__
    self._make_engine(self.engine)
  File "/home/everestial007/.local/lib/python3.5/site-packages/pandas/io/parsers.py", line 922, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/home/everestial007/.local/lib/python3.5/site-packages/pandas/io/parsers.py", line 1389, in __init__
    self._reader = _parser.TextReader(src, **kwds)
  File "pandas/parser.pyx", line 373, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:4019)
  File "pandas/parser.pyx", line 665, in pandas.parser.TextReader._setup_parser_source (pandas/parser.c:7967)
FileNotFoundError: File b'0.667,0.333\n2\t15800126\tT\tT,A\t0.667,0.333\n2\t15800193\tC\tC,T\t0.667,0.333\n2\t15800244\tT\tT,C\......
I would appreciate any explanation.
Answer
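pd.read_table(data1, sep='\t') treats data1 as a file path because it has no read method. You can see in the stack trace that it tries to open a file whose name is the contents of the CSV file.
From the read_table help:
Parameters
----------
filepath_or_buffer : str, pathlib.Path, py._path.local.LocalPath or any object with a read() method (such as a file handle or StringIO)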
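To make the distinction concrete, here is a minimal sketch of the duck-typing idea behind that parameter description. The helper name looks_like_buffer and the sample row are made up for illustration, and the check pandas performs internally is more involved than this.

```python
import io


def looks_like_buffer(obj):
    """Rough duck-typing check: does the object expose a read() method?"""
    return hasattr(obj, "read")


# Hypothetical sample: one tab-separated row held as a plain string.
data1 = "2\t15800126\tT\n"

# The same text wrapped in a file-like object.
buffer1 = io.StringIO(data1)

print(looks_like_buffer(data1))    # False: a str is taken to be a path
print(looks_like_buffer(buffer1))  # True: StringIO can be read like a file
```

A plain string fails the check, which is why its contents end up being used as a file name.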
You should convert it into an io.StringIO object so that it can be read.
The quick fix:
pd.read_table(io.StringIO(data1), sep='\t')
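As a runnable sketch of the quick fix, assuming the string returned by read_file() holds tab-separated text with a header row: the column names and values below are invented for illustration. pd.read_csv with sep="\t" is used here, which parses the same way as read_table for this input.

```python
import io

import pandas as pd

# Hypothetical stand-in for the string returned by read_file().
data1 = "chrom\tpos\tref\n2\t15800126\tT\n2\t15800193\tC\n"

# Wrapping the string gives pandas an object with a read() method,
# so it parses the contents instead of looking for a file of that name.
d1_frame = pd.read_csv(io.StringIO(data1), sep="\t")

print(d1_frame.shape)
print(list(d1_frame.columns))
```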
However, that creates a copy of the data. The best solution would be to create an io.StringIO buffer directly:
import io

def read_file():
    data1 = io.StringIO()
    file1 = open('file1.txt', 'r+').read()
    for line in file1:
        # do something ...
        data1.write(calculated_values)
    # in the end
    data1.seek(0)  # reset to start of "file"
    return data1
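The seek(0) call matters because writing leaves the buffer's position at the end. A small sketch, using made-up rows, of what a reader would see before and after rewinding:

```python
import io

buf = io.StringIO()

# Hypothetical rows standing in for calculated_values.
for row in (("2", "15800126", "T"), ("2", "15800193", "C")):
    buf.write("\t".join(row) + "\n")

# After writing, the position sits at the end, so nothing is left to read.
at_end = buf.read()

# Rewinding makes the whole contents available again.
buf.seek(0)
from_start = buf.read()

print(repr(at_end))
print(repr(from_start))
```

A buffer rewound this way can then be passed to pd.read_table in place of a file name.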
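(1) I would suggest using pd.read_csv(file1, sep="\t") instead of read_table. (2) If each file contains one column, why not read it directly into pandas and do the calculation there? That would make your life easier than writing a separate function to read it. – Rohit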
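A minimal sketch of what this comment suggests, assuming the input is already a tab-separated file with a header row. The temporary file, the column names and the sum are all placeholders for the real files and calculation.

```python
import os
import tempfile

import pandas as pd

# Write a small tab-separated file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("pos\tcount\n15800126\t2\n15800193\t1\n")
    path = handle.name

try:
    # Read the file straight into a DataFrame; no intermediate string needed.
    frame = pd.read_csv(path, sep="\t")
    # Placeholder calculation done directly on the column.
    total = int(frame["count"].sum())
finally:
    os.remove(path)

print(total)
```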