ElementTree: getting the contents of a specific tag in an XML document

Question:

I want to extract the contents of a specific tag in an XML file.

Sample XML:

<facts>
  <fact>
    <name>crash</name>
    <full_name>Crash</full_name>
    <variables>
      <variable>
        <name>id</name>
        <proper_name>Crash Instance</proper_name>
        <type>INT</type>
        <interpretation>key</interpretation>
      </variable>
      <variable>
        <name>accident_key</name>
        <proper_name>Case Identifier</proper_name>
        <interpretation>string</interpretation>
        <type>CHAR(9)</type>
      </variable>
      <variable>
        <name>accident_year</name>
        <proper_name>Crash Year</proper_name>
        <interpretation>dim</interpretation>
        <type>INT</type>
      </variable>
    </variables>
  </fact>
  <fact>
    <name>vehicle</name>
    <full_name>Vehicle</full_name>
    <variables>
      <variable>
        <name>id</name>
        <proper_name>Vehicle Instance</proper_name>
        <type>INT</type>
      </variable>
      <variable>
        <name>crash_id</name>
        <proper_name>Crash Instance</proper_name>
        <type>INT</type>
      </variable>
    </variables>
  </fact>
</facts>

I want to pull the contents of every <name> tag inside the <variable> nodes, but only for the Crash fact.

This is my code so far.

import xml.etree.ElementTree as ET

def header(filename, fact):
    lst = []
    tree = ET.parse(filename)  # read in the XML
    for fact in tree.iter(tag='fact'):
        factname = fact.find('name').text
        if factname == fact:  # choose the fact to pull from
            for var in fact.iter(tag='variable'):
                name = var.find('name').text
                lst.append(name)
    return lst  # return a list of all the <name> tags from the Crash fact

newlst = header('schema.xml', 'crash')

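My output, newlst, should be a list of all the <name> tags in the Crash fact. But it keeps coming back empty.

Oddly, I get the correct output if I hard-code everything (and remove the function):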

import xml.etree.ElementTree as ET

lst = []
tree = ET.parse('schema.xml')
for fact in tree.iter(tag='fact'):
    factname = fact.find('name').text
    if factname == 'crash':
        for var in fact.iter(tag='variable'):
            name = var.find('name').text
            lst.append(name)
print(lst)


Output: ['id', 'accident_key', 'accident_year']

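In the function, you use the name fact both as a parameter and as the loop variable of the first for loop. The loop rebinds fact to an Element, so factname == fact compares a string with an Element and is never true, which is why the list stays empty. Try this version: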

import xml.etree.ElementTree as ET

def header(filename, target_factname):
    lst = []
    tree = ET.parse(filename)  # read in the XML
    for fact in tree.iter(tag='fact'):
        factname = fact.find('name').text
        if factname == target_factname:  # choose the fact to pull from
            for var in fact.iter(tag='variable'):
                name = var.find('name').text
                lst.append(name)
    return lst  # return a list of all the <name> tags from the requested fact
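
To see the name clash in isolation, here is a minimal sketch. The two-fact SAMPLE string and the count_matches_* helpers are made up for illustration (they are not part of the original code); it parses from a string rather than from schema.xml so it runs on its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-fact document, trimmed down from the sample XML in the question.
SAMPLE = """
<facts>
  <fact><name>crash</name></fact>
  <fact><name>vehicle</name></fact>
</facts>
"""

def count_matches_shadowed(root, fact):
    """Buggy: the loop variable reuses the parameter name `fact`."""
    matches = 0
    for fact in root.iter("fact"):        # `fact` is now an Element, not 'crash'
        factname = fact.find("name").text
        if factname == fact:              # compares a str with an Element
            matches += 1
    return matches

def count_matches_fixed(root, target_factname):
    """Fixed: the parameter and the loop variable have distinct names."""
    matches = 0
    for fact in root.iter("fact"):
        if fact.find("name").text == target_factname:
            matches += 1
    return matches

root = ET.fromstring(SAMPLE)
shadowed = count_matches_shadowed(root, "crash")
fixed = count_matches_fixed(root, "crash")
print(shadowed, fixed)  # prints: 0 1
```

The shadowed version never matches because, by the time the comparison runs, the string 'crash' that was passed in is no longer reachable under the name fact.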
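
As a side note, the nested loops can probably be avoided altogether. ElementTree's limited XPath support includes a [tag='text'] predicate, so one findall call can select the <name> elements directly. This is only a sketch: header_xpath and the trimmed SAMPLE string are hypothetical names for illustration, and it assumes the fact name contains no single quotes, since the name is pasted into the path expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical trimmed copy of the sample XML, inlined so the sketch is self-contained.
SAMPLE = """
<facts>
  <fact>
    <name>crash</name>
    <variables>
      <variable><name>id</name></variable>
      <variable><name>accident_key</name></variable>
    </variables>
  </fact>
  <fact>
    <name>vehicle</name>
    <variables>
      <variable><name>crash_id</name></variable>
    </variables>
  </fact>
</facts>
"""

def header_xpath(root, target_factname):
    """Return the text of each variable's <name> within the matching <fact>."""
    # Assumption: target_factname has no single quote in it.
    path = f"fact[name='{target_factname}']/variables/variable/name"
    return [el.text for el in root.findall(path)]

root = ET.fromstring(SAMPLE)   # root is the <facts> element
names = header_xpath(root, "crash")
print(names)  # prints: ['id', 'accident_key']
```

With a file on disk, ET.parse(filename).getroot() would take the place of ET.fromstring(SAMPLE).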

I knew I was making a silly mistake... Thanks! – ale19