Importing JSON into pandas
Question:
I have the following JSON from an API (call it my_json). The array of entities is stored under a key called entities:
{
"action" : "get",
"application" : "4d97323f-ac0f-11e6-b1d4-0eec2415f3df",
"params" : {
"limit" : [ "2" ]
},
"path" : "/businesses",
"entities" : [
{
"uuid" : "508d56f1-636b-11e7-9928-122e0737977d",
"type" : "business",
"size" : 730 },
{
"uuid" : "2f3bd4dc-636b-11e7-b937-0ad881f403bf",
"type" : "business",
"size" : 730
} ],
"timestamp" : 1499469891059,
"duration" : 244,
"count" : 2
}
I tried to load it into a DataFrame as follows:
import pandas as pd
pd.read_json(my_json['entities'], orient='split')
I get the following error:
ValueError: Invalid file path or buffer object type: <type 'list'>
I also tried the records orientation, but it still doesn't work.
Answer
The way you use my_json['entities'] makes it look like my_json is a Python dict.
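A quick way to check this is to look at the types involved. The sketch below uses a trimmed-down, hypothetical stand-in for the API response (the variable raw is an assumption, not something from the question); if the HTTP client has already parsed the body, my_json will be a dict and my_json['entities'] a list, which is what the error message complains about.

```python
import json

# Hypothetical stand-in for the raw response body, trimmed to the relevant key.
raw = (
    '{"entities": [{"uuid": "508d56f1-636b-11e7-9928-122e0737977d", '
    '"type": "business", "size": 730}]}'
)

# Many API clients hand back the already-parsed form, which json.loads mimics here.
my_json = json.loads(raw)

print(type(my_json).__name__)              # container type of the whole response
print(type(my_json["entities"]).__name__)  # container type of the entities value
```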
According to the pandas documentation, read_json accepts "a valid JSON string or file-like". You can convert a dict to a JSON string as follows:
import json
json_str = json.dumps(my_json["entities"])
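Put together, the round trip might look like the sketch below. It assumes my_json is a dict shaped like the response in the question (shortened here), and it wraps the string in StringIO so that read_json receives a file-like object; recent pandas releases deprecate passing the JSON text directly.

```python
import json
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the parsed API response (a dict, not a string).
my_json = {
    "path": "/businesses",
    "entities": [
        {"uuid": "508d56f1-636b-11e7-9928-122e0737977d", "type": "business", "size": 730},
        {"uuid": "2f3bd4dc-636b-11e7-b937-0ad881f403bf", "type": "business", "size": 730},
    ],
    "count": 2,
}

# Serialize only the list of entities back into JSON text.
json_str = json.dumps(my_json["entities"])

# Hand read_json a file-like object built from that text.
df = pd.read_json(StringIO(json_str))
print(df)
```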
The data under the key "entities", as you describe it, does not fit the format that orient="split" expects. It looks like you will need to use orient="list":
import pandas as pd
my_json = """{
"entities": [
{
"type": "business",
"uuid": "199bca3e-baf6-11e6-861b-0ad881f403bf",
"size": 918
},
{
"type": "business",
"uuid": "054a7650-b36a-11e6-a734-122e0737977d",
"size": 984
}
]
}"""
print(pd.read_json(my_json, orient='list'))
Which produces:
entities
0 {u'type': u'business', u'uuid': u'199bca3e-baf...
1 {u'type': u'business', u'uuid': u'054a7650-b36...
Or:
import pandas as pd
my_json = """[
{
"type": "business",
"uuid": "199bca3e-baf6-11e6-861b-0ad881f403bf",
"size": 918
},
{
"type": "business",
"uuid": "054a7650-b36a-11e6-a734-122e0737977d",
"size": 984
}
]"""
print(pd.read_json(my_json, orient='list'))
Which produces:
size type uuid
0 918 business 199bca3e-baf6-11e6-861b-0ad881f403bf
1 984 business 054a7650-b36a-11e6-a734-122e0737977d
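One caveat: "list" is not among the orient values listed in the read_json documentation ("split", "records", "index", "columns", "values", "table"). For a top-level array of row objects, the documented choice is "records". A minimal sketch under that assumption, reusing the sample data above (the name entities_json is just for illustration):

```python
from io import StringIO

import pandas as pd

# JSON text whose top level is an array of row objects.
entities_json = """[
  {
    "type": "business",
    "uuid": "199bca3e-baf6-11e6-861b-0ad881f403bf",
    "size": 918
  },
  {
    "type": "business",
    "uuid": "054a7650-b36a-11e6-a734-122e0737977d",
    "size": 984
  }
]"""

# orient="records" tells read_json that each array element is one row.
df = pd.read_json(StringIO(entities_json), orient="records")
print(df)
```

The column order in the printed frame may differ from the output shown above, depending on the pandas version.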
Answer
danielcorin pointed me in the right direction. I ended up having to do:
pd.read_json(json.dumps(b_j['entities']) , orient='list')
The read_json method needs a string, so I dumped the entities collection and used that.
Answer
If my_json is a dictionary, as I suspect, then you can skip pd.read_json and just do:
pd.DataFrame(my_json['entities'])
size type uuid
0 730 business 508d56f1-636b-11e7-9928-122e0737977d
1 730 business 2f3bd4dc-636b-11e7-b937-0ad881f403bf
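As a self-contained sketch of the same idea, assuming my_json is a dict like the one in the question (shortened here): the DataFrame constructor accepts a list of dicts directly, and the optional columns argument is used only to pin the column order to the one shown above.

```python
import pandas as pd

# Hypothetical stand-in for the parsed API response.
my_json = {
    "path": "/businesses",
    "entities": [
        {"uuid": "508d56f1-636b-11e7-9928-122e0737977d", "type": "business", "size": 730},
        {"uuid": "2f3bd4dc-636b-11e7-b937-0ad881f403bf", "type": "business", "size": 730},
    ],
    "count": 2,
}

# Each dict in the list becomes one row; no JSON serialization step is needed.
df = pd.DataFrame(my_json["entities"], columns=["size", "type", "uuid"])
print(df)
```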
Could you please add the contents of 'my_json' to your question? – Infinity