Splitting XML with a SAX parser
Problem description:
I have the following XML file.
<Engineers>
<Engineer>
<Name>JOHN</Name>
<Position>STL</Position>
<Team>SS</Team>
</Engineer>
<Engineer>
<Name>UDAY</Name>
<Position>TL</Position>
<Team>SG</Team>
</Engineer>
<Engineer>
<Name>INDRA</Name>
<Position>Director</Position>
<Team>PP</Team>
</Engineer>
</Engineers>
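When the XPath is given as Engineers/Engineer, I need to split this XML into smaller XML strings, one per matching Engineer element.
The smaller XML strings look like this: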
<Engineers>
<Engineer>
<Name>INDRA</Name>
<Position>Director</Position>
<Team>PP</Team>
</Engineer>
</Engineers>
<Engineers>
<Engineer>
<Name>JOHN</Name>
<Position>STL</Position>
<Team>SS</Team>
</Engineer>
</Engineers>
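So far I have implemented the following using SAX. It lets me read the XML content, but not in the form I want. How can I go on from here to collect my elements?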
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ReadSAX
{
    public static void main(String[] args)
    {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            // Handler that only prints each SAX event as it arrives
            DefaultHandler handler = new DefaultHandler() {
                public void startElement(String uri, String localName,
                        String qName, Attributes attributes)
                        throws SAXException {
                    System.out.println("Start Element :" + qName);
                }
                public void endElement(String uri, String localName,
                        String qName)
                        throws SAXException {
                    System.out.println("End Element :" + qName);
                }
                public void characters(char ch[], int start, int length)
                        throws SAXException {
                    System.out.println(new String(ch, start, length));
                }
            };
            File file = new File("c:\\file.xml");
            InputStream inputStream = new FileInputStream(file);
            Reader reader = new InputStreamReader(inputStream, "UTF-8");
            InputSource is = new InputSource(reader);
            is.setEncoding("UTF-8");
            saxParser.parse(is, handler);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
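One way the printing handler above could be extended is to buffer each Engineer element as it streams past and wrap it in a fresh root element when it closes. The following is only a minimal sketch under stated assumptions: the class name SaxSplitter and its split method are hypothetical, the parser is not namespace-aware (so qName carries the element name), the record elements are direct children of the root, and comments and processing instructions are ignored.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects one small document per record element.
final class SaxSplitter extends DefaultHandler {
    private final String rootName;    // e.g. "Engineers"
    private final String recordName;  // e.g. "Engineer"
    private final List<String> fragments = new ArrayList<>();
    private StringBuilder current;    // non-null only while inside a record
    private boolean rootMatches;      // true if the document root is rootName
    private int depth = 0;

    SaxSplitter(String rootName, String recordName) {
        this.rootName = rootName;
        this.recordName = recordName;
    }

    List<String> getFragments() {
        return fragments;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        depth++;
        if (depth == 1) {
            rootMatches = qName.equals(rootName);
        }
        // A record starts at depth 2, directly under the expected root.
        if (current == null && rootMatches && depth == 2 && qName.equals(recordName)) {
            current = new StringBuilder("<" + rootName + ">");
        }
        if (current != null) {
            current.append('<').append(qName);
            for (int i = 0; i < attrs.getLength(); i++) {
                current.append(' ').append(attrs.getQName(i))
                       .append("=\"").append(escape(attrs.getValue(i))).append('"');
            }
            current.append('>');
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current != null) {
            current.append("</").append(qName).append('>');
            if (depth == 2) {
                // The record is complete: close the wrapper and store it.
                current.append("</").append(rootName).append('>');
                fragments.add(current.toString());
                current = null;
            }
        }
        depth--;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null) {
            current.append(escape(new String(ch, start, length)));
        }
    }

    // Re-escape characters that the parser has already decoded.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Convenience entry point; wraps checked parser exceptions.
    static List<String> split(String xml, String rootName, String recordName) {
        try {
            SaxSplitter handler = new SaxSplitter(rootName, recordName);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.getFragments();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not split XML", e);
        }
    }
}
```

Calling SaxSplitter.split(xml, "Engineers", "Engineer") on the document from the question should return three strings, each holding a single Engineer wrapped in its own Engineers element. Only one record is buffered at a time, but the list of fragments still grows with the input, so for very large files each fragment could be written out as soon as it is complete instead of being collected.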
Answer
Why use such a low-level coding approach?
In XSLT 2.0, it is simply
<xsl:template match="/">
<xsl:for-each select="Engineers/Engineer">
<xsl:result-document href="{position()}.xml">
<Engineers>
<xsl:copy-of select="."/>
</Engineers>
</xsl:result-document>
</xsl:for-each>
</xsl:template>
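If this needs too much memory, use a streaming XSLT 3.0 processor, which will solve that problem.
Answer
I think what you need is VTD-XML's cut-and-paste capability. The paper linked below, a performance analysis of Java XML processing APIs, will tell you more about VTD-XML.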
http://sdiwc.us/digitlib/journal_paper.php?paper=00000582.pdf
import com.ximpleware.*;
import java.io.*;
public class splitXML {
    public static void main(String[] args) throws VTDException, IOException {
        VTDGen vg = new VTDGen();
        // Parse the input file without namespace awareness
        if (!vg.parseFile("d:\\xml\\input.xml", false)) {
            System.out.println("error");
            return;
        }
        VTDNav vn = vg.getNav();
        AutoPilot ap = new AutoPilot(vn);
        // XPath is case-sensitive, so match the element names used in the input
        ap.selectXPath("/Engineers/Engineer");
        int i = 0, n = 0;
        byte[] stag = "<Engineers>".getBytes();
        byte[] etag = "</Engineers>".getBytes();
        while ((i = ap.evalXPath()) != -1) {
            // Open the output file before writing anything to it
            FileOutputStream fos = new FileOutputStream("d:\\xml\\output" + (++n) + ".xml");
            // Lower 32 bits hold the fragment's byte offset, upper 32 bits its length
            long l = vn.getElementFragment();
            fos.write(stag);
            fos.write(vn.getXML().getBytes(), (int) l, (int) (l >> 32));
            fos.write(etag);
            fos.close();
        }
    }
}
I'll add a piece of code that does this with VTD-XML... about 10 lines of code, very efficient and simple...