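Converting JSON data to CSV using Python
I'm trying to pull stats tables from NHL.com and convert them to CSV so I can use them in Excel later. I can pull the tables, but I'm having trouble converting them to CSV. I've found plenty of questions about converting JSON to CSV, but none of the solutions worked for me. Some of them use pandas, which for some reason keeps giving me a traceback error. Here is the code up to the point just before the CSV conversion.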
import requests
import lxml.html
from pprint import pprint
from sys import exit
import json
import csv
import datetime
import dateutil.relativedelta
now = datetime.datetime.now()
one_month_ago = now + dateutil.relativedelta.relativedelta(months=-15)
today_date = now.strftime('%Y-%m-%d')
one_month_ago_date = one_month_ago.strftime('%Y-%m-%d')
url = 'http://www.nhl.com/stats/rest/individual/skaters/basic/game/skatersummary?cayenneExp=gameDate%3E=%22'+one_month_ago_date+'T04:00:00.000Z%22%20and%20gameDate%3C=%22'+today_date+'T03:59:59.999Z%22%20and%20gameLocationCode=%22H%22%20and%20gameTypeId=%222%22&factCayenneExp=shots%3E=1&sort=[{%22property%22:%22points%22,%22direction%22:%22DESC%22},{%22property%22:%22goals%22,%22direction%22:%22DESC%22},{%22property%22:%22assists%22,%22direction%22:%22DESC%22}]'
resp = requests.get(url).text
resp = json.loads(resp)
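Any help would be greatly appreciated!
Edit: Some of the CSV conversion methods I tried include the top-rated answers from How can I convert JSON to CSV?. I had trouble pasting and formatting them here, so I just provided the link.
This is the output I get when I try to use pandas.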
Traceback (most recent call last):
  File "NHL Data Scrape.py", line 1, in <module>
    from pandas.io.json import json_normalize
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\__init__.py", line 13, in <module>
    __import__(dependency)
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\__init__.py", line 142, in <module>
    from . import add_newdocs
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\lib\__init__.py", line 8, in <module>
    from .type_check import *
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\lib\type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\__init__.py", line 35, in <module>
    from . import _internal # for freeze programs
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\_internal.py", line 18, in <module>
    from .numerictypes import object_
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\numerictypes.py", line 962, in <module>
    _register_types()
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\numerictypes.py", line 958, in _register_types
    numbers.Integral.register(integer)
AttributeError: module 'numbers' has no attribute 'Integral'
------------------
(program exited with code: 1)
Press any key to continue . . .
You can use json_normalize() from pandas.io.json, like this:
In []:
from pandas.io.json import json_normalize
...
resp = requests.get(url).json()
json_normalize(resp, 'data')
Out[]:
assists faceoffWinPctg gameWinningGoals gamesPlayed goals otGoals ...
0 31 0.0967 2 41 20 1 ...
1 27 0.0000 3 38 22 0 ...
2 35 0.5249 4 41 14 2 ...
3 34 0.4866 3 41 14 1 ...
...
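To get from that DataFrame to a CSV file, something like the following may work. This is a minimal sketch, not part of the original answer: it assumes pandas 1.0 or later, where json_normalize is also available as pandas.json_normalize, and it uses a small made-up sample in place of the live response. The helper name response_to_csv_text is hypothetical; the field names are taken from the output above.

```python
import io

import pandas as pd

# Hypothetical sample shaped like the NHL response: a dict whose "data" key
# holds a list of per-player records. The values are made up.
sample_resp = {
    "data": [
        {"assists": 31, "goals": 20, "gamesPlayed": 41},
        {"assists": 27, "goals": 22, "gamesPlayed": 38},
    ],
    "total": 2,
}


def response_to_csv_text(resp):
    """Flatten resp['data'] into a DataFrame and return it as CSV text."""
    df = pd.json_normalize(resp, record_path="data")
    buffer = io.StringIO()
    # index=False leaves the 0, 1, 2, ... row labels out of the CSV.
    df.to_csv(buffer, index=False)
    return buffer.getvalue()


csv_text = response_to_csv_text(sample_resp)
print(csv_text)
```

To write straight to disk instead, pass a file name to to_csv, for example df.to_csv('nhl_players.csv', index=False).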
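This format is exactly what I want! The only problem is that when I try to run it, I get a long traceback error. It first references "from pandas.io.json import json_normalize", then gives the file location of __init__.py in the pandas folder, and then does the same for various numpy modules. I have them installed and located in the Python36 folder, so I'm not sure why it's doing this. I'm working on figuring it out. –
Thanks for the response, though! If I can get pandas working in the future I'll be sure to use it, since it seems super useful and simple. –
It looks like you have a bad install of numpy. I would try reinstalling it. – AChampion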
You can use Python's built-in csv.DictWriter:
resp = requests.get(url).json()  # parse the response body as JSON
# resp['data'] is a list of dicts, one per player.
# The keys of the first dict, resp['data'][0].keys(), serve as the CSV header.
# newline='' keeps the csv module from writing blank rows between records on Windows.
with open('nhl_players.csv', 'w', newline='') as f:
    w = csv.DictWriter(f, resp['data'][0].keys())
    w.writeheader()
    w.writerows(resp['data'])
Here is the output CSV file: https://www.dropbox.com/s/1mmprenx0eniflg/nhl_players.csv?dl=0
Hope this helps.
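One caveat with taking the header from the first record: csv.DictWriter raises ValueError by default if a later record has a key that is not in the header. Whether the NHL response ever does this is not something this thread confirms, so treat the following as a defensive sketch. It collects the union of keys across all records and fills missing values with an empty string. The helper name records_to_csv_text and the sample records are made up for illustration.

```python
import csv
import io


def records_to_csv_text(records):
    """Write a list of dicts to CSV text, tolerating records with differing keys."""
    # Collect every key across all records, in first-seen order.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO(newline="")
    # restval is the value written for keys a record does not have.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical records; the second has a key the first lacks.
sample_records = [
    {"goals": 20, "assists": 31},
    {"goals": 22, "assists": 27, "otGoals": 1},
]

csv_text = records_to_csv_text(sample_records)
print(csv_text)
```

With the real response, resp['data'] would take the place of sample_records, and the returned text could be written to a file opened with newline=''.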
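Thank you very much! This works perfectly! –
What have you tried to convert to CSV? Could you also provide a small sample of the JSON and your expected result (CSV)? That would make it much easier to answer! –
What version of pandas are you using? What happens when you simply run 'import pandas'? – MattR
I'm using pandas 0.20.3, which I installed via pip. I just reinstalled Python and all the packages to see if that would help, and nothing changed. Also, when I simply use 'import pandas', I get the same error –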
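Since reinstalling did not help, it may be worth checking which numbers module Python is actually importing. The traceback ends in numpy calling numbers.Integral, which the standard-library numbers module defines. One possible cause, which is an assumption and not something confirmed in this thread, is that another file named numbers.py (for example in the script's own folder) sits earlier on the import path and shadows the standard-library module. A minimal diagnostic sketch, with a hypothetical helper name module_origin:

```python
import importlib
import importlib.util


def module_origin(name):
    """Return the file path Python would load `name` from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


numbers_module = importlib.import_module("numbers")

# If this path points into the project folder rather than the Python
# installation's lib directory, a local file is shadowing the standard library.
print("numbers is loaded from:", module_origin("numbers"))
print("has Integral:", hasattr(numbers_module, "Integral"))
```

Run it from the same folder as the failing script. If the path printed is not inside the Python installation, renaming that local file would be the first thing to try.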