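Comparing lists of dictionaries in Python
I have read through various questions but haven't found one that exactly matches this case, and I can't get my head around it.
I want to compare two lists of dictionaries. I don't want to check individual key-value pairs; I want to compare each whole dictionary against the dictionaries in the other list. However, the dictionaries in one of the lists have an extra item, 'id', that the dictionaries in the other list don't have, and I don't need to compare it.
status_code and desc are not unique.
Only the description may change, but as far as I'm concerned the whole entry has then changed.
Sample data: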
data_db = [
{ "id": 1, "status_code": 2, "desc": "Description sample1" },
{ "id": 2, "status_code": 4, "desc": "Description sample2" },
{ "id": 3, "status_code": 5, "desc": "Description sample3" },
{ "id": 4, "status_code": 5, "desc": "Description sample4" }
]
data_api = [
{ "status_code": 1, "desc": "Description sample5" },
{ "status_code": 4, "desc": "Description sample6" },
{ "status_code": 5, "desc": "Description sample3" }
]
Expected output (as it is confusing enough for me!):
missing_from_db = [
{ "status_code": 1, "desc": "Description sample5" },
{ "status_code": 4, "desc": "Description sample6" } # because in data_db its desc is different
]
missing_from_api = [1,2,4] # This can just be the ids from data_db
I hope this makes sense.
Code-wise I haven't come up with anything remotely close or useful. The closest idea I have had is to reformat data_db into this:
data_db = [
{
"id": 1,
"data": { "status_code": 2, "desc": "Description sample1" }
},
{
"id": 2,
"data": { "status_code": 4, "desc": "Description sample2" }
},
{
"id": 3,
"data": { "status_code": 5, "desc": "Description sample3" }
},
{
"id": 4,
"data": { "status_code": 5, "desc": "Description sample4" }
}
]
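A minimal sketch of how that reshaping could be done from the flat rows. The helper name `nest_row` and the variable `nested_db` are made up for illustration, and only two of the sample rows are repeated here to keep it short:

```python
# Two of the flat rows from the sample data above.
data_db = [
    {"id": 1, "status_code": 2, "desc": "Description sample1"},
    {"id": 2, "status_code": 4, "desc": "Description sample2"},
]


def nest_row(row):
    # Hypothetical helper: keep 'id' at the top level and move every
    # other field into a 'data' sub-dict that can be compared as a whole.
    return {
        "id": row["id"],
        "data": {key: value for key, value in row.items() if key != "id"},
    }


nested_db = [nest_row(row) for row in data_db]
print(nested_db[0])
```

Because the 'data' sub-dict has exactly the same keys as the entries in data_api, it can then be compared to them with plain `==` or `in`.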
Thanks!
Reformatting your data_db should work:
data_db = [
{
"id": 1,
"data": { "status_code": 2, "desc": "Description sample1" }
},
{
"id": 2,
"data": { "status_code": 4, "desc": "Description sample2" }
},
{
"id": 3,
"data": { "status_code": 5, "desc": "Description sample3" }
},
{
"id": 4,
"data": { "status_code": 5, "desc": "Description sample4" }
}
]
data_api = [
{ "status_code": 1, "desc": "Description sample5" },
{ "status_code": 4, "desc": "Description sample6" },
{ "status_code": 5, "desc": "Description sample3" }
]
# checking the dicts in data_api against the 'data' sub-dicts in data_db
missing_from_db = [d for d in data_api if d not in [x['data'] for x in data_db]]
# using similar comprehension to extract the 'id' vals of the 'data' in data_db which aren't in data_api
missing_from_api = [d['id'] for d in data_db if d['data'] not in data_api]
Result:
print(missing_from_db)
[{'status_code': 1, 'desc': 'Description sample5'},
{'status_code': 4, 'desc': 'Description sample6'}]
print(missing_from_api)
[1, 2, 4]
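Thanks. Spot on, short and sweet/pythonic! – s27840
Cheers. There is also a way to do it with your original data structure, but you would need an extra function or lambda, and I couldn't wrap my head around it either. –
This isn't a very nice solution, since it relies on your data having this particular structure, but it works: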
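For what it's worth, one way the "extra function" idea from the comment above might look with the original flat structure. This is only a sketch; `without_id` and `db_payloads` are names invented here, not something from the answer:

```python
data_db = [
    {"id": 1, "status_code": 2, "desc": "Description sample1"},
    {"id": 2, "status_code": 4, "desc": "Description sample2"},
    {"id": 3, "status_code": 5, "desc": "Description sample3"},
    {"id": 4, "status_code": 5, "desc": "Description sample4"},
]
data_api = [
    {"status_code": 1, "desc": "Description sample5"},
    {"status_code": 4, "desc": "Description sample6"},
    {"status_code": 5, "desc": "Description sample3"},
]


def without_id(row):
    # Hypothetical helper: a copy of the row with the 'id' key left out,
    # so it can be compared directly with the dicts in data_api.
    return {key: value for key, value in row.items() if key != "id"}


# The data_db rows with 'id' stripped, computed once.
db_payloads = [without_id(row) for row in data_db]

# Entries of data_api that match no data_db row.
missing_from_db = [d for d in data_api if d not in db_payloads]
# Ids of data_db rows that match no data_api entry.
missing_from_api = [row["id"] for row in data_db if without_id(row) not in data_api]

print(missing_from_db)
print(missing_from_api)
```

This leaves data_db untouched and gives the same two lists as the reformatted version, at the cost of building the stripped copies on the fly.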
data_db = [
{ "id": 1, "status_code": 2, "desc": "Description sample1" },
{ "id": 2, "status_code": 4, "desc": "Description sample2" },
{ "id": 3, "status_code": 5, "desc": "Description sample3" },
{ "id": 4, "status_code": 5, "desc": "Description sample4" }
]
data_api = [
{ "status_code": 1, "desc": "Description sample5" },
{ "status_code": 4, "desc": "Description sample6" },
{ "status_code": 5, "desc": "Description sample3" }
]
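# Dicts from data_api with no match in data_db. Only the keys of the
# data_api dict are compared, so the extra "id" key is ignored.
lst = []
for dct in data_api:
    for dct2 in data_db:
        if all(dct[key] == dct2[key] for key in dct):
            break
    else:
        # for/else: runs only when the inner loop ended without a break
        lst.append(dct)

# Ids of the data_db dicts with no match in data_api.
lst2 = []
for dct2 in data_db:
    for dct in data_api:
        if all(dct[key] == dct2[key] for key in dct):
            break
    else:
        lst2.append(dct2["id"])

print(lst)
print(lst2)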
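The nested loops above compare every pair of dictionaries. If the lists get large, a different technique is to build hashable keys and look them up in sets instead. The sketch below assumes every value of status_code and desc is hashable, and the helper `key_of` is a made-up name:

```python
data_db = [
    {"id": 1, "status_code": 2, "desc": "Description sample1"},
    {"id": 2, "status_code": 4, "desc": "Description sample2"},
    {"id": 3, "status_code": 5, "desc": "Description sample3"},
    {"id": 4, "status_code": 5, "desc": "Description sample4"},
]
data_api = [
    {"status_code": 1, "desc": "Description sample5"},
    {"status_code": 4, "desc": "Description sample6"},
    {"status_code": 5, "desc": "Description sample3"},
]


def key_of(row):
    # Hypothetical helper: a hashable key built only from the fields
    # that should be compared (the 'id' is deliberately left out).
    return (row["status_code"], row["desc"])


# Sets give constant-time membership tests on average.
db_keys = {key_of(row) for row in data_db}
api_keys = {key_of(row) for row in data_api}

missing_from_db = [row for row in data_api if key_of(row) not in db_keys]
missing_from_api = [row["id"] for row in data_db if key_of(row) not in api_keys]

print(missing_from_db)
print(missing_from_api)
```

Unlike the loop version, this hard-codes which fields are compared, so `key_of` would need updating if the dictionaries gain more keys.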
Would this help?
def find_missing(data1, data2):
    # Collect the dicts in data2 that have no counterpart in data1.
    missing_from_data = list()
    for i in range(0, len(data2)):
        found = False
        for j in range(0, len(data1)):
            # status_code and desc must both match within the same record
            if (data2[i]['status_code'] == data1[j]['status_code']
                    and data2[i]['desc'] == data1[j]['desc']):
                found = True
                break
        if not found:
            missing_from_data.append(data2[i])
    return missing_from_data
data_db = [
{ "id": 1, "status_code": 2, "desc": "Description sample1" },
{ "id": 2, "status_code": 4, "desc": "Description sample2" },
{ "id": 3, "status_code": 5, "desc": "Description sample3" },
{ "id": 4, "status_code": 5, "desc": "Description sample4" }
]
data_api = [
{ "status_code": 1, "desc": "Description sample5" },
{ "status_code": 4, "desc": "Description sample6" },
{ "status_code": 5, "desc": "Description sample3" }
]
missing_from_data_db = find_missing(data_db, data_api)
missing_from_api = find_missing(data_api, data_db)
# Keep only the ids of the data_db rows that are missing from data_api
missing_from_api_1 = [row['id'] for row in missing_from_api]
print(missing_from_data_db)
print(missing_from_api_1)
Output:
[{'status_code': 1, 'desc': 'Description sample5'}, {'status_code': 4, 'desc': 'Description sample6'}]
[1, 2, 4]
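Go with reformatting 'data_db'. –
Writing this question out actually helped me work through a lot of it, and I stuck with the idea. In future, when you are stuck or confused, give that a try! – s27840
The same thing has happened to me many times :) –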