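How do I extract the required nodes from this XML file using Perl and XPath?

Problem description:

After executing an XPath expression to extract all the year and value elements related to death rate from an XML DB file, I want to take each node from the node list, find the year node and print it, then find the value node and print it, each on its own line. The problem is that the output shows nothing. How do I extract the required nodes from this XML file using Perl and XPath?

The XML content looks like this: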

<dataset type="country" name="Afghanistan" total="222"> 
... 
     <data> 
      <country id="AFG">Afghanistan</country> 
      <indicator id="SP.DYN.CDRT.IN">Death rate, crude (per 1,000 people)</indicator> 
      <year>2006</year> 
      <value>20.3410000</value> 
      </data> 
      <data> 
      <country id="AFG">Afghanistan</country> 
      <indicator id="SP.DYN.CDRT.IN">Death rate, crude (per 1,000 people)</indicator> 
      <year>2007</year> 
      <value>19.9480000</value> 
      </data> 
      <data> 
      <country id="AFG">Afghanistan</country> 
      <indicator id="SP.DYN.CDRT.IN">Death rate, crude (per 1,000 people)</indicator> 
      <year>2008</year> 
      <value>19.5720000</value> 
      </data> 
      <data> 
      <country id="AFG">Afghanistan</country> 
      <indicator id="IC.EXP.DOCS">Documents to export (number)</indicator> 
      <year>2005</year> 
      <value>7.0000000</value> 
      </data> 
      <data> 
      <country id="AFG">Afghanistan</country> 
      <indicator id="IC.EXP.DOCS">Documents to export (number)</indicator> 
      <year>2006</year> 
      <value>12.0000000</value> 
      </data> 
      <data> 
      <country id="AFG">Afghanistan</country> 
      <indicator id="IC.EXP.DOCS">Documents to export (number)</indicator> 
      <year>2007</year> 
      <value>12.0000000</value> 
      </data> 
... 
</dataset> 

The Perl code looks like this:

# Use the XML::LibXML parser to find elements related to death rate

my $parser = XML::LibXML->new(); 
my $tree = $parser->parse_file($XML_DB); 
my $root = XML::LibXML::XPathContext->new($tree->documentElement()); 
#print $nodeSet->to_literal(); 

foreach my $node ($root->findnodes("/*/data/indicator[\@id = 'SP.DYN.CDRT.IN']/following-sibling::*")) { 
    #print $node->textContent() . "\n"; 
    #print $node->nodeName . "\n"; 
    print $node->find("year") . "\n"; 
} 
exit; 

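The expression find("year") does not work the way you think it does, because your complicated selection does not end up at the data node. It selects the siblings that follow indicator, that is, the year and value elements themselves, so the relative path year looks for a year child that does not exist. Use Xacobeo to debug XPath expressions. This works: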

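use feature 'say';    # say is not enabled by default

foreach my $node ($root->findnodes(q{/*/data/indicator[@id = 'SP.DYN.CDRT.IN']/following-sibling::*})) {
    # Each matched node is a year or value element; print its text children.
    say $_->toString for $node->childNodes;
}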

Output:

2006 
20.3410000 
2007 
19.9480000 
2008 
19.5720000 
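
As an alternative to walking the following siblings, here is a minimal sketch (not part of the original answer) that selects the data elements themselves and reads year and value relative to each one, which keeps every year paired with its value. The inline sample document, the helper name year_value_pairs and its parameters are assumptions made for illustration; the real code would parse $XML_DB instead.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical inline sample mirroring the structure shown in the question.
my $xml = <<'XML';
<dataset type="country" name="Afghanistan" total="222">
  <data>
    <country id="AFG">Afghanistan</country>
    <indicator id="SP.DYN.CDRT.IN">Death rate, crude (per 1,000 people)</indicator>
    <year>2006</year>
    <value>20.3410000</value>
  </data>
  <data>
    <country id="AFG">Afghanistan</country>
    <indicator id="SP.DYN.CDRT.IN">Death rate, crude (per 1,000 people)</indicator>
    <year>2007</year>
    <value>19.9480000</value>
  </data>
  <data>
    <country id="AFG">Afghanistan</country>
    <indicator id="IC.EXP.DOCS">Documents to export (number)</indicator>
    <year>2005</year>
    <value>7.0000000</value>
  </data>
</dataset>
XML

# Hypothetical helper: returns a list of [year, value] pairs for one indicator id.
# Assumes the id contains no single quote, since it is placed in the XPath string.
sub year_value_pairs {
    my ($xml_string, $indicator_id) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml_string);
    my $xpc = XML::LibXML::XPathContext->new($doc);

    # Select the data elements, filtered by the id of their indicator child.
    my $xpath = sprintf q{/*/data[indicator/@id = '%s']}, $indicator_id;

    my @pairs;
    for my $data ($xpc->findnodes($xpath)) {
        # Relative paths now resolve against data, so year and value match.
        push @pairs, [ $data->findvalue('year'), $data->findvalue('value') ];
    }
    return @pairs;
}

# Print one tab-separated year/value pair per line.
printf "%s\t%s\n", @$_ for year_value_pairs($xml, 'SP.DYN.CDRT.IN');
```

Because the context node of each inner lookup is a data element, the relative expressions year and value behave the way find("year") was expected to in the question.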
+0

Thanks a lot for your help! – user338516 2010-05-19 13:48:25

+0

daxim, do you have any sample code that uses Xacobeo? – user338516 2010-05-19 15:57:53

+1

WTF, Xacobeo is a GUI application - just install it and run it. Also, you should accept the answer, see http://*.com/faq#When%20you%20have%20decided%20which%20answer%20is%20the%20most%20helpful%20to%20you – daxim 2010-05-19 16:09:09