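How do I search a nested list of dictionaries in Python?
In Python 3.6, I have a list like the one below and can't figure out how to search its values properly. Given the search string below, I need to search the title and tags values and return the id of the image with the most matches. If several different images (ids) have the same number of matches, the one whose title comes first alphabetically should be returned. The search should also be case-insensitive. In the code, search holds the terms to look for; it should return the first id value, but it returns different values instead.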
image_info = [
    {
        "id" : "34694102243_3370955cf9_z",
        "title" : "Eastern",
        "flickr_user" : "Sean Davis",
        "tags" : ["Los Angeles", "California", "building"]
    },
    {
        "id" : "37198655640_b64940bd52_z",
        "title" : "Spreetunnel",
        "flickr_user" : "Jens-Olaf Walter",
        "tags" : ["Berlin", "Germany", "tunnel", "ceiling"]
    },
    {
        "id" : "34944112220_de5c2684e7_z",
        "title" : "View from our rental",
        "flickr_user" : "Doug Finney",
        "tags" : ["Mexico", "ocean", "beach", "palm"]
    },
    {
        "id" : "36140096743_df8ef41874_z",
        "title" : "Someday",
        "flickr_user" : "Thomas Hawk",
        "tags" : ["Los Angeles", "Hollywood", "California", "Volkswagen", "Beatle", "car"]
    }
]
my_counter = 0
search = "CAT IN BUILding"
search = search.lower().split()
matches = {}
for image in image_info:
    for word in search:
        word = word.lower()
        if word in image["title"].lower().split(" "):
            my_counter += 1
            print(my_counter)
        if word in image["tags"]:
            my_counter +=1
            print(my_counter)
    if my_counter > 0:
        matches[image["id"]] = my_counter
        my_counter = 0
Here is a variation of the code in which I pre-index the data before searching. It is a very basic version of how CloudSearch or ElasticSearch would index and search.
import itertools
from collections import Counter
image_info = [
    {
        "id" : "34694102243_3370955cf9_z",
        "title" : "Eastern",
        "flickr_user" : "Sean Davis",
        "tags" : ["Los Angeles", "California", "building"]
    },
    {
        "id" : "37198655640_b64940bd52_z",
        "title" : "Spreetunnel",
        "flickr_user" : "Jens-Olaf Walter",
        "tags" : ["Berlin", "Germany", "tunnel", "ceiling"]
    },
    {
        "id" : "34944112220_de5c2684e7_z",
        "title" : "View from our rental",
        "flickr_user" : "Doug Finney",
        "tags" : ["Mexico", "ocean", "beach", "palm"]
    },
    {
        "id" : "36140096743_df8ef41874_z",
        "title" : "Someday",
        "flickr_user" : "Thomas Hawk",
        "tags" : ["Los Angeles", "Hollywood", "California", "Volkswagen", "Beatle", "car"]
    }
]
my_counter = 0
search = "CAT IN BUILding california"
search = set(search.lower().split())
matches = {}
index = {}
# Building a rudimentary search index
for info in image_info:
    bag = info["title"].lower().split(" ")
    # we want to be able to hit "los angeles" as well as "los" and "angeles"
    tags = [t.lower().split(" ") for t in info["tags"]]
    tags = list(itertools.chain.from_iterable(tags))
    for k in (bag + tags):
        if k in index:
            index[k].append(info["id"])
        else:
            index[k] = [info["id"]]
#print(index)
hits = []
for s in search:
    if s in index:
        hits += index[s]
print(Counter(hits).most_common(1)[0][0])
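The index above prints the id with the most hits, but `Counter.most_common` orders equal counts by first appearance, not by title. The following is a minimal sketch of adding the alphabetical tie-break the question asks for. The two-image sample and the helper names `build_index` and `search_images` are assumptions made for illustration and are not part of the original answer.

```python
from collections import Counter, defaultdict

# Small sample in the same shape as image_info above (assumed for illustration).
images = [
    {"id": "34694102243_3370955cf9_z", "title": "Eastern",
     "tags": ["Los Angeles", "California", "building"]},
    {"id": "36140096743_df8ef41874_z", "title": "Someday",
     "tags": ["Los Angeles", "Hollywood", "California", "Volkswagen", "Beatle", "car"]},
]

def build_index(images):
    """Map each lowercase word from titles and tags to the set of ids containing it."""
    index = defaultdict(set)
    for info in images:
        words = info["title"].lower().split()
        for tag in info["tags"]:
            words += tag.lower().split()
        for word in words:
            index[word].add(info["id"])
    return index

def search_images(query, images, index):
    """Return the id with the most matching words; ties go to the alphabetically first title."""
    titles = {info["id"]: info["title"].lower() for info in images}
    hits = Counter()
    for word in set(query.lower().split()):
        hits.update(index.get(word, ()))
    if not hits:
        return None
    # Highest count first, then title in alphabetical order.
    return min(hits, key=lambda image_id: (-hits[image_id], titles[image_id]))

index = build_index(images)
print(search_images("CAT IN california", images, index))
```

Both sample images match "california" once, so the tie goes to "Eastern". Because the index stores sets, this sketch counts each query word at most once per image, which differs slightly from counting title and tag hits separately.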
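You are creating a new entry in the matches dictionary with matches[image["id"]] = my_counter. If you want to keep only one entry in that dictionary, holding both the image id and the count, I have modified your dictionary and the condition. Hope this helps.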
my_counter = 0
search_term = "CAT IN BUILding"
search = search_term.lower().split()
matches = {}
matches[search_term] = {}
for image in image_info:
    for word in search:
        word = word.lower()
        if word in image["title"].lower().split(" "):
            my_counter += 1
            print(my_counter)
        if word in image["tags"]:
            my_counter +=1
            print(my_counter)
    if my_counter > 0:
        if not matches[search_term].values() or my_counter > matches[search_term].values()[0]:
            matches[search_term][image["id"]] = my_counter
        my_counter = 0
I tried running your modified code and now get the error: TypeError: 'dict_values' object does not support indexing – Gray
In Python 3.4, dict.values() returns a dict_values view rather than a list. Just wrap matches[search_term].values() in list(). It should look like list(matches[search_term].values())[0] –
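A short sketch of the difference described in this comment, using a one-entry dictionary made up for illustration. `next(iter(...))` is shown as an alternative that avoids building a list; it is an addition here, not something the commenter suggested.

```python
# Hypothetical one-entry result dictionary, in the shape used by the answer above.
matches = {"34694102243_3370955cf9_z": 1}

values = matches.values()
try:
    values[0]  # dict_values is a view and cannot be indexed
except TypeError as exc:
    print("TypeError:", exc)

# Converting the view to a list makes indexing possible.
first_count = list(matches.values())[0]

# Alternative: take the first value without copying the whole view.
also_first = next(iter(matches.values()))

print(first_count, also_first)
```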
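You can also lowercase the tags list, as one of the users above pointed out. –
What do you mean when you say "return"? You aren't returning anything. What is your expected output, and how does it differ from what you get? Can you be more specific? –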
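I ran your code, and it gives me the first id in the matches dictionary. There is a bug with the tags, though. You lowercase the words in the search string but not the words in the tags, and the tags contain some capitalized words. For example, you won't be able to match Los Angeles. – bouma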
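The tag problem described in this comment can be sketched on a single tag list. The search string below is made up for illustration. Note that lowercasing whole tags fixes "California" but still misses a multi-word tag such as "Los Angeles", because the search string is split into single words; splitting the tags into words as well covers both cases.

```python
tags = ["Los Angeles", "California", "building"]
search = "CALIFORNIA building Los Angeles".lower().split()

# Check from the question: lowercase search words against tags in their original case.
original_hits = [w for w in search if w in tags]

# Lowercasing whole tags matches "california" but not "los" or "angeles".
lowered_tags = [t.lower() for t in tags]
whole_tag_hits = [w for w in search if w in lowered_tags]

# Lowercasing and splitting tags into words matches every search word here.
tag_words = {w for t in tags for w in t.lower().split()}
word_hits = [w for w in search if w in tag_words]

print(original_hits)
print(whole_tag_hits)
print(word_hits)
```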
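@juanpa.arrivillaga So, I use the search term "CAT IN BUILding" to search the title and tags values in the list of dictionaries, and I want the function to return the match it finds. For "CAT IN BUILding" it should return 1, with the matching id 34694102243_3370955cf9_z. If the search term were "building in Mexico beach", it should return 34944112220_de5c2684e7_z, because that image has 2 matches in its tags. – Gray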
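The behaviour described in this comment could be wrapped in a function along the following lines. This is only a sketch under stated assumptions: the name `best_match` is hypothetical, the sample holds just two of the images, tags are split into lowercase words, and each distinct search word is counted once for the title and once for the tags.

```python
# Two of the images from the question (assumed sample for illustration).
images = [
    {"id": "34694102243_3370955cf9_z", "title": "Eastern",
     "tags": ["Los Angeles", "California", "building"]},
    {"id": "34944112220_de5c2684e7_z", "title": "View from our rental",
     "tags": ["Mexico", "ocean", "beach", "palm"]},
]

def best_match(query, images):
    """Return (id, count) for the best-matching image, or None if nothing matches."""
    words = set(query.lower().split())
    scored = []
    for image in images:
        title_words = image["title"].lower().split()
        tag_words = [w for tag in image["tags"] for w in tag.lower().split()]
        count = sum(w in title_words for w in words) + sum(w in tag_words for w in words)
        if count:
            # Negative count sorts the largest count first; the title breaks ties.
            scored.append((-count, image["title"].lower(), image["id"]))
    if not scored:
        return None
    neg_count, _, image_id = min(scored)
    return image_id, -neg_count

print(best_match("CAT IN BUILding", images))
print(best_match("building in Mexico beach", images))
```

Returning the id together with its count keeps both pieces of information the comment asks for, and `None` signals that no word matched.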