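Match array elements using a string, in any order
Question:
I'm new to Python and am trying to find out whether a tweet contains any of the lookup elements.
For example, if the lookup word is cats, it should also match cat, and cute kittens should match in any order. But from what I understand so far, I can't find a solution. Any guidance is appreciated.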
import re
lookup_table = ['cats', 'cute kittens', 'dog litter park']
tweets = ['that is a cute cat',
          'kittens are cute',
          'that is a cute kitten',
          'that is a dog litter park',
          'no wonder that dog park is bad']
for tweet in tweets:
    lookup_found = None
    # Alternation of the lookup entries: only matches an entry that appears verbatim
    print(re.findall(r"(?=(" + '|'.join(lookup_table) + r"))", tweet.lower()))
Output:
['cat']
[]
[]
['dog litter park']
[]
Expected output:
that is a cute cat > cats
kittens are cute > cute kittens
that is a cute kitten > cute kittens
that is a dog litter park > dog litter park
no wonder that dog park is bad > dog litter park
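Answer
For lookup entries that are only a single word, you can use
for word in tweet.split()
For lookup entries such as "cute kitten", where the words may appear in any order, just split the entry and look for each part in the tweet string.
This is what I tried. It isn't efficient, but it works. Try running it.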
lookup_table = ['cat', 'cute kitten', 'dog litter park']
tweets = ['that is a cute cat',
          'kittens are cute',
          'that is a cute kitten',
          'that is a dog litter park',
          'no wonder that dog park is bad']
for word in lookup_table:
    for tweet in tweets:
        # Split multi-word entries into their individual words
        if " " in word:
            temp = word.split(sep=" ")
        else:
            temp = [word]
        for x in temp:
            # Substring test: print the tweet as soon as any one word is found
            if x in tweet:
                print(tweet)
                break
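The loop above prints a tweet as soon as any single word of an entry is found. If you instead want every word of an entry to be present, in any order, a minimal sketch could look like this. The helper name find_matches is made up for illustration, and it keeps the same substring test, so "kitten" also hits "kittens" (and, as a side effect, "cat" would also hit "category").

```python
lookup_table = ['cat', 'cute kitten', 'dog litter park']
tweets = ['that is a cute cat',
          'kittens are cute',
          'that is a cute kitten',
          'that is a dog litter park',
          'no wonder that dog park is bad']


def find_matches(tweet, lookup_table):
    """Return the lookup entries whose words all occur in the tweet (any order).

    Hypothetical helper; uses a plain substring test like the loop above.
    """
    text = tweet.lower()
    return [phrase for phrase in lookup_table
            if all(part in text for part in phrase.split())]


for tweet in tweets:
    print(tweet, '>', find_matches(tweet, lookup_table))
```

With this stricter rule the last tweet matches nothing, because "litter" is missing from it.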
Answer
Here is how I would do it. I don't think lookup_table needs to be too strict, and we can avoid plurals:
import re
lookup_table = ['cat', 'cute kitten', 'dog litter park']
tweets = ['that is a cute cat',
          'kittens are cute',
          'that is a cute kitten',
          'that is a dog litter park',
          'no wonder that dog park is bad']
for data in lookup_table:
    words = data.split(" ")
    for word in words:
        # Search all tweets at once, joined with commas; the character class
        # excludes ",", so each match stays inside a single tweet
        result = re.findall(r'[\w\s]*' + word + r'[\w\s]*', ','.join(tweets))
        if len(result) > 0:
            print(result)
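If you want to stay with regular expressions but require all words of an entry in any order, one common trick is to chain one lookahead per word. This is only a sketch: any_order_pattern is a name I made up, and the "s?" is an assumed, very crude plural rule.

```python
import re

lookup_table = ['cat', 'cute kitten', 'dog litter park']
tweets = ['that is a cute cat',
          'kittens are cute',
          'that is a cute kitten',
          'that is a dog litter park',
          'no wonder that dog park is bad']


def any_order_pattern(phrase):
    """Build a regex that needs every word of `phrase`, in any order.

    Hypothetical helper: one lookahead per word, anchored at the start,
    with an optional trailing "s" as a crude plural rule.
    """
    lookaheads = "".join(
        r"(?=.*\b" + re.escape(word) + r"s?\b)" for word in phrase.split()
    )
    return re.compile("^" + lookaheads, re.IGNORECASE)


for tweet in tweets:
    found = [p for p in lookup_table if any_order_pattern(p).search(tweet)]
    print(tweet, '>', found)
```

Unlike a plain substring test, the \b word boundaries keep "cat" from matching inside a longer word such as "category".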
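Answer
Problem 1:
Singular/plural: just to get things rolling, I would use inflect, a Python package, to get rid of the singular and plural problem, e.g. ...
Problem 2:
Splitting and joining: I wrote a small script to demonstrate how you could use it. It is not robustly tested, but it should get you moving.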
import inflect
p = inflect.engine()
lookup_table = ['cats', 'cute kittens', 'dog litter park']
tweets = ['that is a cute cat',
          'kittens are cute',
          'that is a cute kitten',
          'that is a dog litter park',
          'no wonder that dog park is bad']
for tweet in tweets:
    matched = []
    for lt in lookup_table:
        # p.compare() is truthy when two words are equal or differ only in number
        match_result = [lt for mt in lt.split() for word in tweet.split() if p.compare(word, mt)]
        if any(match_result):
            # Record each matching lookup entry once
            matched.append(lt)
    print(tweet, '>>', matched)
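To get exactly one entry per tweet, as in the expected output of the question, one option is to score each entry by the fraction of its words found in the tweet and keep the best one. The sketch below is standard library only. The helpers singular and best_match are made-up names, and singular is a crude stand-in (strip one trailing "s") for a real inflection package, so treat it as an assumption rather than a proper plural rule.

```python
lookup_table = ['cats', 'cute kittens', 'dog litter park']
tweets = ['that is a cute cat',
          'kittens are cute',
          'that is a cute kitten',
          'that is a dog litter park',
          'no wonder that dog park is bad']


def singular(word):
    """Crude, assumed singularization: lowercase and strip one trailing 's'."""
    word = word.lower()
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def best_match(tweet, lookup_table):
    """Return the entry with the largest share of its words in the tweet.

    Hypothetical helper; returns None when no word of any entry is found.
    Assumes no lookup entry is an empty string.
    """
    tweet_words = {singular(w) for w in tweet.split()}
    best, best_score = None, 0.0
    for phrase in lookup_table:
        phrase_words = {singular(w) for w in phrase.split()}
        score = len(phrase_words & tweet_words) / len(phrase_words)
        if score > best_score:
            best, best_score = phrase, score
    return best


for tweet in tweets:
    print(tweet, '>', best_match(tweet, lookup_table))
```

Because partial overlap still counts, 'no wonder that dog park is bad' is assigned to 'dog litter park' even though "litter" is missing.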
Use the singular form.
You should also tell us the output you actually need.
@KarolyHorvath I'm not sure what you mean – user6083088