Comparing lists of dictionaries in Python

Question:

I've read through various questions but haven't found anything that quite matches this case, and I can't get my head around it.

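I want to compare two lists of dictionaries. I don't want to check individual key-value pairs; I want to compare each whole dictionary against the dictionaries in the other list. However, the dictionaries in one list have an extra item, 'id', that the dictionaries in the other list lack, and I don't need to compare that item.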

status_code and desc are not unique.

Only the description might change, but as far as I'm concerned, that means the whole entry has changed.

Sample data:

data_db = [ 
    { "id": 1, "status_code": 2, "desc": "Description sample1" }, 
    { "id": 2, "status_code": 4, "desc": "Description sample2" }, 
    { "id": 3, "status_code": 5, "desc": "Description sample3" }, 
    { "id": 4, "status_code": 5, "desc": "Description sample4" } 
] 

data_api = [ 
    { "status_code": 1, "desc": "Description sample5" }, 
    { "status_code": 4, "desc": "Description sample6" }, 
    { "status_code": 5, "desc": "Description sample3" } 
] 

Expected output (because this is confusing enough for me!):

missing_from_db = [ 
    { "status_code": 1, "desc": "Description sample5" }, 
    { "status_code": 4, "desc": "Description sample6" } # because in data_db its desc is different 
] 

missing_from_api = [1,2,4] # This can just be the ids from data_db 

I hope this makes sense.

Code-wise, I haven't come up with anything remotely close or useful. The closest idea I've had is to reformat data_db into this:

data_db = [ 
    { 
     "id": 1, 
     "data": { "status_code": 2, "desc": "Description sample1" } 
    }, 
    { 
     "id": 2, 
     "data": { "status_code": 4, "desc": "Description sample2" } 
    }, 
    { 
     "id": 3, 
     "data": { "status_code": 5, "desc": "Description sample3" } 
    }, 
    { 
     "id": 4, 
     "data": { "status_code": 5, "desc": "Description sample4" } 
    } 
] 

Thanks!

Go with reformatting data_db.

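Writing out this question actually helped me work through a lot of it and settle on that idea. In future, when you're stuck or confused, give it a try! – s27840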

The same thing has happened to me many times :)

Reformatting your data_db should work:

data_db = [ 
    { 
     "id": 1, 
     "data": { "status_code": 2, "desc": "Description sample1" } 
    }, 
    { 
     "id": 2, 
     "data": { "status_code": 4, "desc": "Description sample2" } 
    }, 
    { 
     "id": 3, 
     "data": { "status_code": 5, "desc": "Description sample3" } 
    }, 
    { 
     "id": 4, 
     "data": { "status_code": 5, "desc": "Description sample4" } 
    } 
] 

data_api = [ 
    { "status_code": 1, "desc": "Description sample5" }, 
    { "status_code": 4, "desc": "Description sample6" }, 
    { "status_code": 5, "desc": "Description sample3" } 
] 

# checking the dicts in data_api against the 'data' sub-dicts in data_db 
missing_from_db = [d for d in data_api if d not in [x['data'] for x in data_db]] 

# using similar comprehension to extract the 'id' vals of the 'data' in data_db which aren't in data_api 
missing_from_api = [d['id'] for d in data_db if d['data'] not in data_api] 

Result:

print(missing_from_db)

[{'status_code': 1, 'desc': 'Description sample5'}, 
{'status_code': 4, 'desc': 'Description sample6'}] 

print(missing_from_api)

[1, 2, 4] 
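
The reformatted data_db above is written out by hand. If the records arrive in the original flat shape, the nested form could be built with a comprehension instead. This is only a minimal sketch: the names flat_db and nested_db are made up for illustration, and it assumes every record has an 'id' key.

```python
# Flat records in the shape given in the question (same sample data).
flat_db = [
    {"id": 1, "status_code": 2, "desc": "Description sample1"},
    {"id": 2, "status_code": 4, "desc": "Description sample2"},
    {"id": 3, "status_code": 5, "desc": "Description sample3"},
    {"id": 4, "status_code": 5, "desc": "Description sample4"},
]

# Split each record into its id and a 'data' sub-dict holding every other key.
# flat_db and nested_db are illustrative names, not part of the original answer.
nested_db = [
    {"id": row["id"], "data": {k: v for k, v in row.items() if k != "id"}}
    for row in flat_db
]

print(nested_db[0])
```

The two list comprehensions from the answer can then be run against nested_db unchanged.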

Thanks. Spot on, short and sweet/pythonic! – s27840

Cheers. There's also a way to do it with your original data structure, but you'd need an extra function or a lambda, and I couldn't wrap my head around it either.
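
One possible reading of the "extra function" idea in the comment above is a small helper that strips 'id' before comparing, so the original flat data_db can stay as it is. This is a sketch under that assumption, not the commenter's actual code; without_id and db_payloads are hypothetical names.

```python
def without_id(record):
    """Return a copy of record without the 'id' key (hypothetical helper)."""
    return {k: v for k, v in record.items() if k != "id"}

data_db = [
    {"id": 1, "status_code": 2, "desc": "Description sample1"},
    {"id": 2, "status_code": 4, "desc": "Description sample2"},
    {"id": 3, "status_code": 5, "desc": "Description sample3"},
    {"id": 4, "status_code": 5, "desc": "Description sample4"},
]

data_api = [
    {"status_code": 1, "desc": "Description sample5"},
    {"status_code": 4, "desc": "Description sample6"},
    {"status_code": 5, "desc": "Description sample3"},
]

# Strip the ids once, so the list is not rebuilt for every API entry.
db_payloads = [without_id(row) for row in data_db]

# API entries with no identical payload in the database.
missing_from_db = [d for d in data_api if d not in db_payloads]

# ids of database entries whose payload does not appear in the API data.
missing_from_api = [row["id"] for row in data_db if without_id(row) not in data_api]

print(missing_from_db)
print(missing_from_api)
```

Because dictionaries compare equal when they hold the same key-value pairs, dropping 'id' is enough to make entries from the two lists directly comparable.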

This isn't a great solution, and it relies on your particular structure, but it works:

data_db = [ 
    { "id": 1, "status_code": 2, "desc": "Description sample1" }, 
    { "id": 2, "status_code": 4, "desc": "Description sample2" }, 
    { "id": 3, "status_code": 5, "desc": "Description sample3" }, 
    { "id": 4, "status_code": 5, "desc": "Description sample4" } 
] 

data_api = [ 
    { "status_code": 1, "desc": "Description sample5" }, 
    { "status_code": 4, "desc": "Description sample6" }, 
    { "status_code": 5, "desc": "Description sample3" } 
] 

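# Entries in data_api that have no match in data_db
lst = [] 
for dct in data_api: 
    for dct2 in data_db: 
        # Compare only the keys present in the API entry, so 'id' is ignored
        if all(dct[key] == dct2[key] for key in dct): 
            break 
    else: 
        # The inner loop finished without a break, so no match was found
        lst.append(dct) 

# ids of the entries in data_db that have no match in data_api
lst2 = [] 
for dct2 in data_db: 
    for dct in data_api: 
        if all(dct[key] == dct2[key] for key in dct): 
            break 
    else: 
        lst2.append(dct2["id"]) 

print(lst) 
print(lst2) 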

Would this help?

def find_missing(data1, data2): 
    # Return the entries of data2 that have no entry in data1 with the
    # same status_code and the same desc
    missing_from_data = [] 
    for item in data2: 
        matched = any( 
            item['status_code'] == other['status_code'] 
            and item['desc'] == other['desc'] 
            for other in data1 
        ) 
        if not matched: 
            missing_from_data.append(item) 

    return missing_from_data 

data_db = [ 
    { "id": 1, "status_code": 2, "desc": "Description sample1" }, 
    { "id": 2, "status_code": 4, "desc": "Description sample2" }, 
    { "id": 3, "status_code": 5, "desc": "Description sample3" }, 
    { "id": 4, "status_code": 5, "desc": "Description sample4" } 
] 

data_api = [ 
    { "status_code": 1, "desc": "Description sample5" }, 
    { "status_code": 4, "desc": "Description sample6" }, 
    { "status_code": 5, "desc": "Description sample3" } 
] 

missing_from_data_db = find_missing(data_db, data_api) 
missing_from_api = find_missing(data_api, data_db) 

# Keep only the ids of the data_db entries that are missing from the API data
missing_from_api_ids = [entry['id'] for entry in missing_from_api] 

print(missing_from_data_db) 
print(missing_from_api_ids) 

Output:

[{'status_code': 1, 'desc': 'Description sample5'}, {'status_code': 4, 'desc': 'Description sample6'}] 
[1, 2, 4]
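
All of the answers above compare every entry in one list against every entry in the other, which is fine for small lists. For longer ones, a possible alternative is to reduce each entry to a hashable (status_code, desc) tuple and use sets for the membership test. This is a minimal sketch, assuming both values are hashable and that these two keys are the only ones that matter; key_of, db_keys and api_keys are names invented for the example.

```python
def key_of(record):
    """Reduce a record to the tuple of fields being compared (hypothetical helper)."""
    return (record["status_code"], record["desc"])

data_db = [
    {"id": 1, "status_code": 2, "desc": "Description sample1"},
    {"id": 2, "status_code": 4, "desc": "Description sample2"},
    {"id": 3, "status_code": 5, "desc": "Description sample3"},
    {"id": 4, "status_code": 5, "desc": "Description sample4"},
]

data_api = [
    {"status_code": 1, "desc": "Description sample5"},
    {"status_code": 4, "desc": "Description sample6"},
    {"status_code": 5, "desc": "Description sample3"},
]

# Build each set once; lookups in a set do not scan the whole collection.
db_keys = {key_of(row) for row in data_db}
api_keys = {key_of(row) for row in data_api}

# Iterate over the original lists so the result keeps their order.
missing_from_db = [row for row in data_api if key_of(row) not in db_keys]
missing_from_api = [row["id"] for row in data_db if key_of(row) not in api_keys]

print(missing_from_db)
print(missing_from_api)
```

Since status_code and desc are not unique on their own, the pair is used as the key; duplicate pairs collapse in the sets, which does not affect the membership checks.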