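How to insert a dictionary as a value into a dictionary using a loop in Python
I'm currently having trouble turning my CSV data into a dictionary.
I have 3 columns in the file that I want to use: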
userID, placeID, rating
U1000, 12222, 3
U1000, 13333, 2
U1001, 13333, 4
The result I want looks like this:
{'U1000': {'12222': 3, '13333': 2},
'U1001': {'13333': 4}}
In other words, I want my data structure to look like:
sample = {}
sample["U1000"] = {}
sample["U1001"] = {}
sample["U1000"]["12222"] = 3
sample["U1000"]["13333"] = 2
sample["U1001"]["13333"] = 4
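But I have a lot of data to process. I'd like to get this result with a loop, but I've been trying for 2 hours and failed..
---The following code may confuse you---
My result currently looks like this: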
{'U1000': ['12222', 3],
'U1001': ['13333', 4]}
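- The value in the dictionary is a list, not a dictionary
- User "U1000" appears multiple times in the data, but only once in my result
I think my code has many mistakes.. if you don't mind, please take a look: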
import numpy as np
import pandas as pd

reader = np.array(pd.read_csv("rating_final.csv"))
included_cols = [0, 1, 2]
sample = {}
target = []
target1 = []
for row in reader:
    content = list(row[i] for i in included_cols)
    target.append(content[0])
    target1.append(content[1:3])
sample = dict(zip(target, target1))
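How can I improve the code? I have looked through Stack Overflow, but I couldn't work it out on my own. Could anyone please help me?
Thank you very much!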
This should do what you want:
import collections
reader = ...
sample = collections.defaultdict(dict)
for user_id, place_id, rating in reader:
    rating = int(rating)
    sample[user_id][place_id] = rating
print(sample)
# -> defaultdict(<class 'dict'>, {'U1000': {'12222': 3, '13333': 2}, 'U1001': {'13333': 4}})
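defaultdict is a convenient tool that supplies a default value whenever you try to access a key that is not in the dictionary. If you don't like that (for example, because you want sample['non-existent-user-id'] to fail with a KeyError), use: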
reader = ...
sample = {}
for user_id, place_id, rating in reader:
    rating = int(rating)
    if user_id not in sample:
        sample[user_id] = {}
    sample[user_id][place_id] = rating
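To make the difference between the two versions concrete, here is a small sketch. The in-memory rows list stands in for reader, and the user ID "U9999" is made up for illustration. It shows that both approaches build the same data, and that only indexing a defaultdict creates a missing key; a membership test or .get() does not.

```python
from collections import defaultdict

# Stand-in for `reader`: the three sample rows from the question.
rows = [("U1000", "12222", 3), ("U1000", "13333", 2), ("U1001", "13333", 4)]

auto = defaultdict(dict)   # missing users get an empty dict on first access
plain = {}                 # missing users must be created explicitly
for user_id, place_id, rating in rows:
    auto[user_id][place_id] = rating
    if user_id not in plain:
        plain[user_id] = {}
    plain[user_id][place_id] = rating

same_contents = (auto == plain)

# Membership tests and .get() do not create keys in a defaultdict...
probe_in = "U9999" in auto
probe_get = auto.get("U9999")
size_before = len(auto)

# ...but indexing does: this silently inserts "U9999" with an empty dict.
created = auto["U9999"]
size_after = len(auto)

# The plain dict raises KeyError instead.
try:
    plain["U9999"]
    plain_raised = False
except KeyError:
    plain_raised = True

print(same_contents, probe_in, probe_get, size_before, size_after, plain_raised)
# True False None 2 3 True
```

If you only ever write to the dictionary in the loop and read it with in or .get() afterwards, the defaultdict version behaves much like the plain one.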
Thanks for the clarification, it really helps!
The expected output in your example is not possible, because {'1333': 2} would not be associated with a key. You could get {'U1000': {'12222': 3, '1333': 2}, 'U1001': {'13333': 4}} though, with a dict of dicts:
sample = {}
for row in reader:
    userID, placeID, rating = row[:3]
    sample.setdefault(userID, {})[placeID] = rating  # Possibly int(rating)?
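For a version you can run end to end, here is a minimal sketch of the same setdefault loop. It assumes the file looks like the sample in the question (a header row, then comma-plus-space separated values) and uses the standard csv module with an in-memory string in place of rating_final.csv; swap in open("rating_final.csv", newline="") for real data.

```python
import csv
import io

# Stand-in for the real file, copied from the sample in the question.
csv_text = """userID, placeID, rating
U1000, 12222, 3
U1000, 13333, 2
U1001, 13333, 4
"""

sample = {}
# skipinitialspace=True drops the space that follows each comma.
reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
next(reader)  # skip the header row
for row in reader:
    userID, placeID, rating = row[:3]
    # setdefault returns the existing inner dict, or inserts and returns a new one.
    sample.setdefault(userID, {})[placeID] = int(rating)

print(sample)
# {'U1000': {'12222': 3, '13333': 2}, 'U1001': {'13333': 4}}
```

Note that csv.reader yields every field as a string, so the place IDs stay strings and the rating needs the int() conversion to match the desired output.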
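Alternatively, use collections.defaultdict(dict) to avoid the need for setdefault (or for the other approaches, a try/except KeyError or an if userID in sample: check, which sacrifice the atomicity of setdefault in exchange for not creating empty dicts unnecessarily):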
import collections
sample = collections.defaultdict(dict)
for row in reader:
    userID, placeID, rating = row[:3]
    sample[userID][placeID] = rating
# Optional conversion back to plain dict
sample = dict(sample)
Converting back to a plain dict ensures that future lookups don't autovivify keys and raise KeyError as normal, and it looks like a normal dict if you print it.
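A quick sketch of what the conversion buys you, using the three sample ratings and a made-up user ID "U9999": after dict(sample), a lookup for an unknown user raises KeyError and leaves the dictionary unchanged.

```python
from collections import defaultdict

sample = defaultdict(dict)
sample["U1000"]["12222"] = 3
sample["U1000"]["13333"] = 2
sample["U1001"]["13333"] = 4

# Shallow copy: the outer mapping becomes a plain dict,
# while the inner dicts are shared rather than duplicated.
frozen = dict(sample)

try:
    frozen["U9999"]          # unknown user: no key is created
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(lookup_failed)  # True
print(len(frozen))    # 2
print(frozen)
# {'U1000': {'12222': 3, '13333': 2}, 'U1001': {'13333': 4}}
```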
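If included_cols is important (because the names or column indices might change), you can use operator.itemgetter to speed up and simplify extracting all the desired columns at once: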
from collections import defaultdict
from operator import itemgetter
included_cols = (0, 1, 2)
# If columns in data were actually:
# rating, foo, bar, userID, placeID
# we'd do this instead, itemgetter will handle all the rest:
# included_cols = (3, 4, 0)
get_cols = itemgetter(*included_cols) # Create function to get needed indices at once
sample = defaultdict(dict)
# map(get_cols, ...) efficiently converts each row to a tuple of just
# the three desired values as it goes, which also lets us unpack directly
# in the for loop, simplifying code even more by naming all variables directly
for userID, placeID, rating in map(get_cols, reader):
    sample[userID][placeID] = rating  # Possibly int(rating)?
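Here is a runnable sketch of the reordered-column case mentioned in the comments above. The row layout (rating, foo, bar, userID, placeID) and the filler values "x" and "y" are hypothetical, chosen only to show that changing included_cols is the single adjustment needed.

```python
from collections import defaultdict
from operator import itemgetter

# Hypothetical rows laid out as: rating, foo, bar, userID, placeID
rows = [
    ("3", "x", "y", "U1000", "12222"),
    ("2", "x", "y", "U1000", "13333"),
    ("4", "x", "y", "U1001", "13333"),
]

included_cols = (3, 4, 0)              # userID, placeID, rating
get_cols = itemgetter(*included_cols)  # returns a 3-tuple in that order

sample = defaultdict(dict)
for userID, placeID, rating in map(get_cols, rows):
    sample[userID][placeID] = int(rating)

sample = dict(sample)  # back to a plain dict
print(sample)
# {'U1000': {'12222': 3, '13333': 2}, 'U1001': {'13333': 4}}
```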
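Thanks for your answer, it really helps!
It seems like you want the dictionaries as _values_, not _keys_. Maybe correct the title to match? – ShadowRanger
Thanks for the reminder. I've corrected the title as well as the content!
Also, your example has {'U1000': {'12222': 3}, {'1333': 2}, 'U1001': {'13333': 4}}, but that has values for 'U1000' and 'U1001' and no key (or no value) associated with {'1333': 2}. You could have {'U1000': {'12222': 3, '1333': 2}, 'U1001': {'13333': 4}} or {'U1000': [{'12222': 3}, {'1333': 2}], 'U1001': [{'13333': 4}]}, but not what you provided. – ShadowRanger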
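For completeness, the second shape mentioned in that comment (a list of single-entry dicts per user) could be built along these lines. This is only a sketch using the question's sample rows; the dict-of-dicts form from the answers is usually easier to query.

```python
from collections import defaultdict

# Stand-in for `reader`: the three sample rows from the question.
rows = [("U1000", "12222", "3"), ("U1000", "13333", "2"), ("U1001", "13333", "4")]

by_user = defaultdict(list)
for userID, placeID, rating in rows:
    # Each rating becomes its own one-item dict appended to the user's list.
    by_user[userID].append({placeID: int(rating)})

by_user = dict(by_user)
print(by_user)
# {'U1000': [{'12222': 3}, {'13333': 2}], 'U1001': [{'13333': 4}]}
```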