Can anyone explain this list comprehension?
Question:
def unpack_dict(matrix, map_index_to_word):
    table = sorted(map_index_to_word, key=map_index_to_word.get)
    data = matrix.data
    indices = matrix.indices
    indptr = matrix.indptr
    num_doc = matrix.shape[0]
    return [{k:v for k,v in zip([table[word_id] for word_id in
                                 indices[indptr[i]:indptr[i+1]] ],
                                data[indptr[i]:indptr[i+1]].tolist())} \
            for i in range(num_doc) ]
wiki['tf_idf'] = unpack_dict(tf_idf, map_index_to_word)
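map_index_to_word is a dictionary that maps each word to its index, with several thousand entries. tf_idf is a sparse TF-IDF matrix. The wiki DataFrame is the one shown in the screenshot.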
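For reference, a minimal sketch of what the table line produces, assuming map_index_to_word maps each word to its column index. The toy dictionary below is made up for illustration and is not the real data.

```python
# Hypothetical toy mapping: word -> column index (not the real data).
map_index_to_word = {"cat": 2, "apple": 0, "dog": 1}

# Sorting the keys by their values orders the words by column index,
# so table[word_id] returns the word stored at that index.
table = sorted(map_index_to_word, key=map_index_to_word.get)

print(table)     # ['apple', 'dog', 'cat']
print(table[2])  # cat
```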
Answer
[{k: v for k, v in zip([table[word_id] for word_id in indices[indptr[i]:indptr[i + 1]]],data[indptr[i]:indptr[i + 1]].tolist())} for i in range(num_doc)]
is the same as:
final_list = []
for i in range(num_doc):
    new_list = []
    for word_id in indices[indptr[i]:indptr[i + 1]]:
        new_list.append(table[word_id])
    new_dict = {}
    for k, v in zip(new_list, data[indptr[i]:indptr[i + 1]].tolist()):
        new_dict[k] = v
    final_list.append(new_dict)
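A quick way to check that the two forms agree is to run both on toy data. This is only a sketch: the values are made up, and plain Python lists stand in for the NumPy arrays that matrix.data, matrix.indices and matrix.indptr would normally be, so .tolist() is dropped.

```python
# Toy stand-ins for the real inputs (all values are hypothetical).
table = ["apple", "dog", "cat"]  # word for each column index
data = [0.5, 1.5, 2.0]           # non-zero values, stored row by row
indices = [0, 2, 1]              # column index of each value in data
indptr = [0, 2, 3]               # row i spans indptr[i]:indptr[i + 1]
num_doc = len(indptr) - 1        # two rows (documents)

# Comprehension form.
as_comprehension = [
    {k: v for k, v in zip([table[word_id] for word_id in indices[indptr[i]:indptr[i + 1]]],
                          data[indptr[i]:indptr[i + 1]])}
    for i in range(num_doc)
]

# Explicit loop form.
final_list = []
for i in range(num_doc):
    new_list = []
    for word_id in indices[indptr[i]:indptr[i + 1]]:
        new_list.append(table[word_id])
    new_dict = {}
    for k, v in zip(new_list, data[indptr[i]:indptr[i + 1]]):
        new_dict[k] = v
    final_list.append(new_dict)

print(final_list)                      # [{'apple': 0.5, 'cat': 1.5}, {'dog': 2.0}]
print(final_list == as_comprehension)  # True
```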
Answer
This one?
[{k:v for k,v in zip([table[word_id] for word_id in
indices[indptr[i]:indptr[i+1]] ],
data[indptr[i]:indptr[i+1]].tolist())} \
for i in range(num_doc) ]
The outer comprehension is:
[... for i in range(num_doc) ]
This is just a simple loop that runs num_doc times.
Inside it is a dictionary comprehension:
{k:v for k,v in zip()}
zip takes the keys k from:
[table[word_id] for word_id in indices[indptr[i]:indptr[i+1]] ]
and the values v from:
data[indptr[i]:indptr[i+1]].tolist()
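To see what zip contributes here, a small sketch with made-up keys and values for a single row. It also shows that, for this pattern, dict(zip(...)) builds the same dictionary as the comprehension.

```python
keys = ["apple", "cat"]  # hypothetical words for one row
values = [0.5, 1.5]      # hypothetical weights for the same row

# zip pairs the i-th key with the i-th value.
pairs = list(zip(keys, values))

# The dictionary comprehension turns each (k, v) pair into an entry.
row_dict = {k: v for k, v in zip(keys, values)}

# dict() accepts an iterable of pairs directly and gives the same result.
same = dict(zip(keys, values))

print(pairs)             # [('apple', 0.5), ('cat', 1.5)]
print(row_dict == same)  # True
```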
So i, the outer variable, creates the slice range indptr[i]:indptr[i+1].
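The slicing can be illustrated on its own. The sketch below assumes the matrix is stored in CSR layout, where indptr marks the start and end of each row inside indices and data. The numbers are made up, and plain lists stand in for the NumPy arrays.

```python
# Hypothetical CSR-style arrays for a matrix with two rows.
indices = [0, 2, 1]     # column index of each stored value
data = [0.5, 1.5, 2.0]  # the stored values themselves
indptr = [0, 2, 3]      # row i occupies positions indptr[i]:indptr[i + 1]

row_slices = []
for i in range(len(indptr) - 1):
    start, stop = indptr[i], indptr[i + 1]
    # The same slice is applied to indices and data, so they stay aligned.
    row_slices.append((indices[start:stop], data[start:stop]))

for i, (cols, vals) in enumerate(row_slices):
    print(i, cols, vals)
# 0 [0, 2] [0.5, 1.5]
# 1 [1] [2.0]
```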
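So the result is a list of dictionaries. Each dictionary's keys are table[word_id], where word_id comes from that row's slice of indices, and its values are the corresponding slice of data.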