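Transforming XML efficiently in Groovy without a SAX parser implementation

Question:

The XML transformation can be done with XmlSlurper or XmlParser, but I am looking for a different solution. My XML files may be larger than 1 GB, and a SAX parser may not be able to handle them.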

INPUT (before transformation):
<response version-api="2.0"> 
<value> 
<ErrorCodes>1, 2, 3, 4</ErrorCodes> 
</value> 
</response> 

OUTPUT (after transformation):
<response version-api='2.0'> 
<value> 
<ErrorCode>1</ErrorCode> 
<ErrorCode>2</ErrorCode> 
<ErrorCode>3</ErrorCode> 
<ErrorCode>4</ErrorCode> 
</value> 
</response>

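In my experience, SAX handles large payloads correctly, so you can safely use it. – Aelexe

Answer

Here is a simple script that performs the transformation described in the question.

Please follow the inline comments: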

import groovy.xml.XmlUtil
// On Groovy 4+, XmlSlurper must be imported explicitly: import groovy.xml.XmlSlurper

def xml = '''<response version-api="2.0">
    <value>
        <ErrorCodes>1, 2, 3, 4</ErrorCodes>
    </value>
</response>'''
def newXml = new XmlSlurper().parseText(xml)
// Get the current error codes into a list
def codes = newXml.value.ErrorCodes.toString().split(',')*.trim()
// Remove the existing ErrorCodes node
newXml.value.ErrorCodes.replaceNode {}
// Build the transformed XML by appending one ErrorCode element per code
newXml.value.appendNode {
    codes.each {
        ErrorCode(it)
    }
}
println XmlUtil.serialize(newXml)

You can quickly try the script in the Groovy web console.

UPDATE: I just fixed a typo in the question.

The user does not want to use XmlSlurper? I only realized that later.

Another approach is to do the transformation with a stylesheet (XSLT).
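As a rough illustration of the stylesheet approach, here is a minimal sketch using the JDK's built-in XSLT 1.0 processor. The stylesheet and the helper name `splitWithXslt` are assumptions made for this example, not something taken from the linked answers. XSLT 1.0 has no tokenize function, so the comma-separated text is split with a recursive named template.

```groovy
import javax.xml.transform.TransformerFactory
import javax.xml.transform.stream.StreamResult
import javax.xml.transform.stream.StreamSource

// Hypothetical helper: applies an illustrative XSLT 1.0 stylesheet that copies
// everything as-is and splits <ErrorCodes> into one <ErrorCode> per value.
String splitWithXslt(String xml) {
    def stylesheet = '''<xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <!-- Identity template: copy every node and attribute unchanged -->
      <xsl:template match="@*|node()">
        <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
      </xsl:template>
      <!-- Replace ErrorCodes with the result of the recursive split -->
      <xsl:template match="ErrorCodes">
        <xsl:call-template name="split">
          <xsl:with-param name="text" select="."/>
        </xsl:call-template>
      </xsl:template>
      <xsl:template name="split">
        <xsl:param name="text"/>
        <xsl:choose>
          <xsl:when test="contains($text, ',')">
            <ErrorCode><xsl:value-of select="normalize-space(substring-before($text, ','))"/></ErrorCode>
            <xsl:call-template name="split">
              <xsl:with-param name="text" select="substring-after($text, ',')"/>
            </xsl:call-template>
          </xsl:when>
          <xsl:otherwise>
            <ErrorCode><xsl:value-of select="normalize-space($text)"/></ErrorCode>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:template>
    </xsl:stylesheet>'''
    def transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(stylesheet)))
    def out = new StringWriter()
    transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out))
    out.toString()
}

println splitWithXslt('<response version-api="2.0"><value><ErrorCodes>1, 2, 3, 4</ErrorCodes></value></response>')
```

Keep in mind that XSLT 1.0 processors typically build the whole input tree in memory, so this may not be a good fit for the 1 GB files mentioned in the question.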

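Perhaps you can try the different approaches and measure how long each one takes.

Found a couple of links:

This question asks the same thing under the java tag.
This one uses XSLT.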

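Answer

The Groovy APIs use lazy evaluation under the covers, but for XML files of 1 GB or more you should consider StAX. Unlike SAX, it is not callback-driven: it is a streaming API built on iterators, where your code pulls events from the parser, which gives you more flexibility in how you write your code.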
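To make the StAX suggestion concrete, here is a minimal sketch that uses the `javax.xml.stream` event API from Groovy. The helper name `splitErrorCodes` is made up for this example, and the code assumes that `ErrorCodes` contains only text. Events are copied from reader to writer one at a time, so the document is never held in memory as a whole.

```groovy
import javax.xml.stream.XMLEventFactory
import javax.xml.stream.XMLInputFactory
import javax.xml.stream.XMLOutputFactory

// Hypothetical helper: streams XML from `input` to `output`, replacing each
// <ErrorCodes>a, b</ErrorCodes> element with one <ErrorCode> per value.
def splitErrorCodes(Reader input, Writer output) {
    def reader = XMLInputFactory.newInstance().createXMLEventReader(input)
    def writer = XMLOutputFactory.newInstance().createXMLEventWriter(output)
    def events = XMLEventFactory.newInstance()
    try {
        while (reader.hasNext()) {
            def event = reader.nextEvent()
            if (event.isStartElement() && event.asStartElement().name.localPart == 'ErrorCodes') {
                // getElementText() reads the text content and consumes the end tag
                def codes = reader.elementText.split(',')*.trim().findAll { it }
                codes.each { code ->
                    writer.add(events.createStartElement('', '', 'ErrorCode'))
                    writer.add(events.createCharacters(code))
                    writer.add(events.createEndElement('', '', 'ErrorCode'))
                }
            } else {
                // Every other event is passed through unchanged
                writer.add(event)
            }
        }
        writer.flush()
    } finally {
        writer.close()
        reader.close()
    }
}

// Small in-memory demo using the sample from the question
def result = new StringWriter()
splitErrorCodes(new StringReader('<response version-api="2.0"><value><ErrorCodes>1, 2, 3, 4</ErrorCodes></value></response>'), result)
println result
```

For a real file you would pass file-backed streams instead, for example by nesting `new File(...).withReader('UTF-8') { ... }` and `new File(...).withWriter('UTF-8') { ... }` around the call; the file names are up to you.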

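Looking at your example again, you could also really benefit from Groovy's StreamingMarkupBuilder or MarkupBuilder classes. The former should be the better choice for large files like this. Both are very easy to use and would be a great way to combine your transformation logic with StAX.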
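As a small sketch of what the StreamingMarkupBuilder part could look like, the snippet below only produces the replacement elements for one comma-separated value. The helper name `writeErrorCodes` is hypothetical, and wiring it into a StAX loop is left out. `bind` returns a `Writable`, so the markup is generated only when it is written to the target `Writer`.

```groovy
import groovy.xml.StreamingMarkupBuilder

// Hypothetical helper: writes a <value> element holding one <ErrorCode>
// per comma-separated entry in `csv` to the given writer.
def writeErrorCodes(String csv, Writer out) {
    def codes = csv.split(',')*.trim().findAll { it }
    def markup = new StreamingMarkupBuilder().bind {
        value {
            codes.each { code -> ErrorCode(code) }
        }
    }
    // Nothing is rendered until the Writable is written out
    markup.writeTo(out)
}

def result = new StringWriter()
writeErrorCodes('1, 2, 3, 4', result)
println result
```

Because the output goes straight to a `Writer`, the same helper could write to a file or a network stream instead of a `StringWriter`.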
