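Converting JSON data to CSV with Python

Problem description:

I'm trying to pull stats tables from NHL.com and convert them to CSV for later use in Excel. I can pull the tables, but I'm having trouble converting them to CSV. I've found many questions about converting JSON to CSV, but none of the solutions worked for me. Some of them use pandas, which for some reason keeps giving me a traceback error. Here is my code up to the point just before the CSV conversion.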

import requests 
import lxml.html 
from pprint import pprint 
from sys import exit 
import json 
import csv 
import datetime 
import dateutil.relativedelta 


now = datetime.datetime.now() 
one_month_ago = now + dateutil.relativedelta.relativedelta(months=-15) 

today_date = now.strftime('%Y-%m-%d') 
one_month_ago_date = one_month_ago.strftime('%Y-%m-%d') 

url = 'http://www.nhl.com/stats/rest/individual/skaters/basic/game/skatersummary?cayenneExp=gameDate%3E=%22'+one_month_ago_date+'T04:00:00.000Z%22%20and%20gameDate%3C=%22'+today_date+'T03:59:59.999Z%22%20and%20gameLocationCode=%22H%22%20and%20gameTypeId=%222%22&factCayenneExp=shots%3E=1&sort=[{%22property%22:%22points%22,%22direction%22:%22DESC%22},{%22property%22:%22goals%22,%22direction%22:%22DESC%22},{%22property%22:%22assists%22,%22direction%22:%22DESC%22}]' 
resp = requests.get(url).text 
resp = json.loads(resp) 

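Any help would be greatly appreciated!

Edit: The CSV conversion methods I've tried include the top-rated answers from "How can I convert JSON to CSV?". I had trouble pasting and formatting them here, so I'm only pointing to that question.

This is the output when I try to use pandas.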

Traceback (most recent call last):
  File "NHL Data Scrape.py", line 1, in <module>
    from pandas.io.json import json_normalize
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\__init__.py", line 13, in <module>
    __import__(dependency)
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\__init__.py", line 142, in <module>
    from . import add_newdocs
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\lib\__init__.py", line 8, in <module>
    from .type_check import *
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\lib\type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\__init__.py", line 35, in <module>
    from . import _internal # for freeze programs
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\_internal.py", line 18, in <module>
    from .numerictypes import object_
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\numerictypes.py", line 962, in <module>
    _register_types()
  File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\numerictypes.py", line 958, in _register_types
    numbers.Integral.register(integer)
AttributeError: module 'numbers' has no attribute 'Integral'


------------------ 
(program exited with code: 1) 

Press any key to continue . . . 
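
What have you tried so far to convert it to CSV? Could you also provide a small sample of the JSON and your expected result (the CSV)? That would make it much easier to answer!

Which version of pandas are you using? What happens when you simply run 'import pandas'? – MattR

I'm using pandas 0.20.3, which I installed via pip. I just reinstalled Python and all the packages to see if that would help, and nothing changed. Also, when I simply use "import pandas", I get the same error.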

You can use json_normalize() from pandas.io.json, like this:

In []: 
from pandas.io.json import json_normalize 

... 
resp = requests.get(url).json() 
json_normalize(resp, 'data') 

Out[]: 
    assists faceoffWinPctg gameWinningGoals gamesPlayed goals otGoals ... 
0   31   0.0967     2   41  20  1 ... 
1   27   0.0000     3   38  22  0 ... 
2   35   0.5249     4   41  14  2 ... 
3   34   0.4866     3   41  14  1 ... 
... 
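
For readers who cannot import pandas, here is a rough standard-library sketch of the same idea: take the list of records under 'data' and flatten any nested dicts into dotted column names before writing them out. The flatten helper, the sample payload, and the nested "team" field are all hypothetical and only illustrate the shape; this is not how json_normalize is implemented.

```python
import csv
import io


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical payload shaped like the API response: records live under "data".
# The nested "team" dict is invented purely to show the flattening step.
sample = {
    "data": [
        {"assists": 31, "goals": 20, "team": {"abbrev": "AAA"}},
        {"assists": 27, "goals": 22, "team": {"abbrev": "BBB"}},
    ],
    "total": 2,
}

rows = [flatten(record) for record in sample["data"]]

# Write to an in-memory buffer so the sketch runs without touching the disk.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

With a real response, sample would be replaced by the parsed JSON and the buffer by a file opened with newline=''.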
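
That format is exactly what I want it to look like! The only problem is that when I try to run it, I get a long traceback error. It first references "from pandas.io.json import json_normalize", then gives the file location of __init__.py in the pandas folder, and then does the same for various numpy modules. I have them installed and they are located in the Python36 folder, so I'm not sure why it's doing this. I'm working on figuring it out.

Thank you for the response though! If I can get pandas to work in the future, I'll experiment with it, as it seems super useful and simple.

It looks like you have a bad install of numpy. I would try reinstalling it. – AChampion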
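
If reinstalling does not help, it may be worth checking which file Python actually loads for the standard library module named in the error. One common cause of "module 'numbers' has no attribute 'Integral'" is a local file called numbers.py sitting next to the script and shadowing the standard library; that is only an assumption here, since the thread never confirms the cause. A minimal diagnostic sketch, with hypothetical helper names:

```python
import importlib.util
import os
import sysconfig


def find_module_origin(name):
    """Return the path Python would load for top-level module `name`, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


def is_from_stdlib(path):
    """Rough check: does `path` live under the standard library directory?"""
    stdlib_dir = os.path.realpath(sysconfig.get_paths()["stdlib"])
    return os.path.realpath(path).startswith(stdlib_dir)


origin = find_module_origin("numbers")
print("numbers is loaded from:", origin)
if origin is not None and not is_from_stdlib(origin):
    # Assumption: a path outside the stdlib means some other file shadows it.
    print("This is not the standard library copy; consider renaming that file.")
```

Run it from the same folder as the failing script so that the module search path matches.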

You can use Python's built-in csv.DictWriter:

import csv
import requests

# url is the request URL built in the question above.
resp = requests.get(url).json()  # parse the response body as JSON

# resp['data'] is a list of dicts, one per player.
# The keys of the first dict are used as the CSV header.
# newline='' keeps the csv module from writing blank rows on Windows.
with open('nhl_players.csv', 'w', newline='') as f:
    w = csv.DictWriter(f, fieldnames=list(resp['data'][0].keys()))
    w.writeheader()
    w.writerows(resp['data'])

Here is the output CSV file: https://www.dropbox.com/s/1mmprenx0eniflg/nhl_players.csv?dl=0

Hope this helps.
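
One caveat with taking the header from the first record: csv.DictWriter raises ValueError if a later record has a key that is not in fieldnames. Whether the NHL response ever has uneven records is not something this thread shows, so the following is only a sketch under that assumption. It builds the header from the union of all keys and fills gaps with an empty string. The records are made-up sample values and the output goes to an in-memory buffer.

```python
import csv
import io

# Illustrative records; the second one deliberately lacks "gamesPlayed"
# and carries an extra "otGoals" key.
records = [
    {"assists": 31, "goals": 20, "gamesPlayed": 41},
    {"assists": 27, "goals": 22, "otGoals": 0},
]

# Collect every key that appears, keeping first-seen order.
fieldnames = []
for record in records:
    for key in record:
        if key not in fieldnames:
            fieldnames.append(key)

buffer = io.StringIO()
# restval is written for any key a record does not have.
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(records)
csv_text = buffer.getvalue()
print(csv_text)
```

For a real file, swap the buffer for open('nhl_players.csv', 'w', newline='').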

Thank you very much! This worked perfectly!