How do I traverse all the nodes of a tree?

Question:

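I want to simplify the node labels of my parse trees: given a node, I want to drop the first hyphen and everything after it. For example, NP-TMP-FG should become NP, SBAR-SBJ should become SBAR, and so on. Here is an example of one of my parse trees: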

((S (S-TPC-2 (NP-SBJ (NP (DT The) (NN asbestos) (NN fiber)) (, ,) 
(NP (NN crocidolite)) (, ,)) (VP (VBZ is) (ADJP-PRD (RB unusually) (JJ resilient)) 
(SBAR-TMP (IN once) (S (NP-SBJ (PRP it)) (VP (VBZ enters) (NP (DT the) (NNS lungs))))) 
(, ,) (PP (IN with)(S-NOM (NP-SBJ (NP (RB even) (JJ brief) (NNS exposures)) (PP (TO to) 
(NP (PRP it)))) (VP (VBG causing) (NP (NP (NNS symptoms)) (SBAR (WHNP-1 (WDT that)) 
(S (NP-SBJ (-NONE- *T*-1)) (VP (VBP show) (PRT (RP up)) (ADVP-TMP (NP (NNS decades)) 
(JJ later))))))))))) (, ,) (NP-SBJ (NNS researchers)) (VP (VBD said)(SBAR (-NONE- 0) 
(S (-NONE- *T*-2)))) (. .)))

Here is my code, but it doesn't work.

import re
import nltk
from nltk.tree import *
tree = Tree.fromstring(line)  # Each parse tree is stored in one single line
for subtree in tree.subtrees():
    re.sub('-.*', '', subtree.label())
print(tree)

Edit:

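I think the problem is that subtree.label() only returns the node's label, and the label can't be changed this way because label() is a function. Printing subtree.label() outputs: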

S 
S-TPC-2 
NP-SBJ 
NP 
DT 
NN 
,

and so on...

Answer

I came up with this:

for subtree in tree.subtrees(): 
    s = subtree.label() 
    subtree.set_label(re.sub('-.*', "", s))
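
The reason this works where the original loop did not: re.sub returns a new string rather than modifying anything in place, so its result has to be stored back on the node with set_label. Below is a minimal, self-contained sketch of the same loop. The short parse string is made up for illustration (it is not the full tree from the question), and it assumes NLTK 3, where Tree has label() and set_label().

```python
import re
from nltk.tree import Tree

# Hypothetical short parse string, used only for illustration.
line = "(S (NP-SBJ (DT The) (NN fiber)) (VP (VBZ is) (ADJP-PRD (JJ resilient))))"
tree = Tree.fromstring(line)

for subtree in tree.subtrees():
    # re.sub returns a new string; set_label stores it on the node.
    subtree.set_label(re.sub(r"-.*", "", subtree.label()))

# Collect the labels in traversal order to check the result.
labels = [t.label() for t in tree.subtrees()]
print(labels)
print(tree)
```

Changing labels while iterating is safe here because only the label is touched, not the tree's structure.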

Answer

You could do something like this:

for subtree in tree.subtrees(): 
    first = subtree.label().split('-')[0] 
    subtree.set_label(first)
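
One caveat that applies to both the regex and the split approach: a label that starts with a hyphen, such as the -NONE- nodes in the tree from the question, is reduced to an empty string. If that is not wanted, a small helper can skip those labels. The sketch below uses only the standard library; simplify_label is a hypothetical name, not part of NLTK.

```python
def simplify_label(label):
    """Drop the first hyphen and everything after it.

    Labels that begin with a hyphen (e.g. -NONE-) are returned unchanged,
    since stripping them would leave an empty string.
    """
    if label.startswith("-"):
        return label
    return label.split("-")[0]


# Compare the helper with a plain split on a few labels from the question.
for label in ["NP-TMP-FG", "SBAR-SBJ", "S-TPC-2", "NP", "-NONE-"]:
    plain = label.split("-")[0]
    print(label, "->", repr(simplify_label(label)), "(plain split:", repr(plain) + ")")
```

Inside the loop it would be used as subtree.set_label(simplify_label(subtree.label())).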
