Extracting patterns from a text file using Python

Question:

I have text saved in a file in the following format:

VP VB go 
NP PRP$ your NN left

These two lines are parser output. I want to read this text file and write the following result to a new text file:

NP NN left

How can I do this in Python?

Thanks in advance for any help.

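On what basis do we select 'NP NN left' from the text file? Without such an explanation, print('NP NN left') is a valid solution. – unutbu 2013-03-16 22:19:11

@unutbu I want to print every pattern where the line starts with NP and NN appears before a word on the same line. – Mcolorz 2013-03-16 22:26:20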

Answer

Edit: is this better?

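# Read all lines of the file and strip newline characters
with open("myfile") as f:
    lines = [line.strip() for line in f]

for line in lines:
    tokens = line.split()
    try:
        n = tokens.index("NN")
    except ValueError:
        continue  # no NN tag on this line
    # Require NP at the start of the line and a word after NN
    if tokens[0] == "NP" and n != len(tokens) - 1:
        print(tokens[0], tokens[n], tokens[n + 1])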
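
The question asks for the result in a new text file rather than on the screen. One way to do that is to wrap the same token scan in a small helper that writes each match to an output file. This is only a sketch: the function name `extract_np_nn` and both file paths are hypothetical, and it assumes the tags are separated by whitespace and that only the first NN on a line matters.

```python
def extract_np_nn(in_path, out_path):
    """Write 'NP NN <word>' for each matching line of in_path to out_path.

    Returns the number of lines written.
    """
    count = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            tokens = line.split()
            # Keep only lines that start with NP and contain an NN tag
            if not tokens or tokens[0] != "NP" or "NN" not in tokens:
                continue
            n = tokens.index("NN")  # position of the first NN tag
            if n + 1 < len(tokens):  # NN must be followed by a word
                dst.write(f"{tokens[0]} {tokens[n]} {tokens[n + 1]}\n")
                count += 1
    return count
```

Calling `extract_np_nn("myfile", "output.txt")` on the two sample lines from the question should leave a single line, `NP NN left`, in the output file.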

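The file has 1000 lines of this kind, and the number of words between NP and NN is not fixed, so indexing into an array at fixed positions based on the first line won't work. – Mcolorz 2013-03-16 22:56:50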

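Answer

If I understand you correctly, you want

NP NN word

for every matching line. In that case, you can use a regular expression to find NP, NN, and the word that follows: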

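import re

# NP at the start of the line, then the first NN and the word after it
regex = re.compile(r'^(NP).*?(NN) (\w+).*?$')
with open('file.txt') as f:
    for line in f:
        match = regex.search(line)
        if match:  # skip lines that do not fit the pattern
            print(' '.join(match.groups()))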
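
The regular expression above also accepts lines where NP or NN is only part of a longer tag (for example a line starting with `NPX`). If that matters, a slightly stricter pattern can require both tags to be whole whitespace-separated tokens. The sketch below is an assumption about what the asker wants, not part of the original answer; the names `PATTERN` and `np_nn_word` and the extra sample lines are made up for illustration.

```python
import re

# Assumed stricter pattern: NP must be the whole first token, and NN must be
# a whole token followed by at least one more token.
PATTERN = re.compile(r'^NP\s(?:.*?\s)?NN\s+(\S+)')

def np_nn_word(line):
    """Return 'NP NN <word>' for a matching line, else None."""
    match = PATTERN.search(line)
    return f"NP NN {match.group(1)}" if match else None

samples = [
    "VP VB go",
    "NP PRP$ your NN left",
    "NP DT the JJ big NN house IN on",  # more words between NP and NN
    "NPX NN nothing",                   # NP is only a prefix here
]
for sample in samples:
    print(sample, "->", np_nn_word(sample))
```

Because the gap between NP and NN is matched lazily, the number of words between the two tags does not need to be fixed: the second and third samples yield `NP NN left` and `NP NN house`, while the other two return `None`.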
