Splitting XML using a SAX parser

Question:

I have the following XML file:

<Engineers> 
    <Engineer> 
     <Name>JOHN</Name> 
     <Position>STL</Position> 
     <Team>SS</Team> 
    </Engineer> 
    <Engineer> 
     <Name>UDAY</Name> 
     <Position>TL</Position> 
     <Team>SG</Team> 
    </Engineer> 
    <Engineer> 
     <Name>INDRA</Name> 
     <Position>Director</Position> 
     <Team>PP</Team> 
    </Engineer> 
</Engineers> 

When the XPath is given as Engineers/Engineer, I need to split this XML into smaller XML strings.

The smaller XML strings should look like this:

<Engineers> 
    <Engineer> 
     <Name>INDRA</Name> 
     <Position>Director</Position> 
     <Team>PP</Team> 
    </Engineer> 
</Engineers> 

<Engineers> 
    <Engineer> 
     <Name>JOHN</Name> 
     <Position>STL</Position> 
     <Team>SS</Team> 
    </Engineer> 
</Engineers> 

This is what I have implemented so far using SAX. I can get at the inner XML, but not in the form I want. How can I proceed from here?

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ReadSAX
{
    public static void main(String[] args)
    {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();

            // Prints every SAX event; nothing is split or collected yet.
            DefaultHandler handler = new DefaultHandler() {

                @Override
                public void startElement(String uri, String localName,
                        String qName, Attributes attributes)
                        throws SAXException {
                    System.out.println("Start Element :" + qName);
                }

                @Override
                public void endElement(String uri, String localName,
                        String qName)
                        throws SAXException {
                    System.out.println("End Element :" + qName);
                }

                @Override
                public void characters(char ch[], int start, int length)
                        throws SAXException {
                    System.out.println(new String(ch, start, length));
                }
            };

            File file = new File("c:\\file.xml");
            InputStream inputStream = new FileInputStream(file);
            Reader reader = new InputStreamReader(inputStream, "UTF-8");

            InputSource is = new InputSource(reader);
            is.setEncoding("UTF-8");

            saxParser.parse(is, handler);

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
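
A minimal sketch of how a SAX handler like this could buffer each record and wrap it in the root element again, instead of printing events. It assumes the records are the direct children of the root (as with Engineers/Engineer here) rather than evaluating an arbitrary XPath, and it drops attributes on the root, comments, and namespace handling. The class name SaxSplitSketch and its methods are hypothetical, not from the post.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: splits on the children of the root element (an assumption).
class SaxSplitSketch {

    /** Returns one small XML string per child of the root element. */
    static List<String> split(String xml) {
        final List<String> parts = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;      // nesting level of the current element
            private String rootName;    // qName of the document element
            private StringBuilder buf;  // non-null only while inside a record

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                depth++;
                if (depth == 1) {       // remember the root, but do not copy it yet
                    rootName = qName;
                    return;
                }
                if (depth == 2) {       // a new record starts: open a fresh wrapper
                    buf = new StringBuilder("<" + rootName + ">");
                }
                buf.append('<').append(qName);
                for (int i = 0; i < atts.getLength(); i++) {
                    buf.append(' ').append(atts.getQName(i)).append("=\"")
                       .append(escape(atts.getValue(i))).append('"');
                }
                buf.append('>');
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (depth >= 2) {
                    buf.append("</").append(qName).append('>');
                }
                if (depth == 2) {       // record finished: close the wrapper and emit
                    parts.add(buf.append("</").append(rootName).append('>').toString());
                    buf = null;
                }
                depth--;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (buf != null) {      // text between records is ignored
                    buf.append(escape(new String(ch, start, length)));
                }
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not split XML", e);
        }
        return parts;
    }

    // Re-escapes characters that the parser has already unescaped.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

Calling SaxSplitSketch.split(xml) on the document above would give a list with one string per Engineer, each wrapped in its own Engineers element; whitespace inside each record is kept as it appears in the input.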

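I will add a piece of code that does this with VTD-XML... about 10 lines of code, very efficient and simple...

Why use such a low-level coding approach?

In XSLT 2.0, it is just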

<xsl:template match="/"> 
    <xsl:for-each select="Engineers/Engineer"> 
    <xsl:result-document href="{position()}.xml"> 
     <Engineers> 
     <xsl:copy-of select="."/> 
     </Engineers> 
    </xsl:result-document> 
    </xsl:for-each> 
</xsl:template> 

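If this needs too much memory, get a streaming XSLT 3.0 processor, which will solve that problem.

@Michael Kay Thank you, sir. How can I use this from a Java class to get the result? – Hussey123

Have you tried googling "run XSLT transformation from Java"? But rather than Google, if you are doing XML programming from Java, you should use Elliotte Rusty Harold's book: http://www.cafeconleche.org/books/xmljava/

@Michael Kay Thank you very much, sir. – Hussey123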
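
On the question of running a stylesheet from a Java class, a minimal sketch using the standard JAXP API (javax.xml.transform) might look like the following. One caveat: the processor bundled with the JDK only implements XSLT 1.0, so it cannot run xsl:result-document; running an XSLT 2.0 stylesheet would need a 2.0 processor such as Saxon on the classpath. To stay runnable on a plain JDK, the example uses a hypothetical XSLT 1.0 stylesheet (PICK_ONE) that copies one Engineer chosen by an index parameter. The class name, the stylesheet, and the parameter name are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper showing the JAXP calls; not from the original thread.
class XsltFromJavaSketch {

    // Illustrative XSLT 1.0 stylesheet: wraps the index-th Engineer in Engineers.
    static final String PICK_ONE =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:param name='index' select='1'/>"
        + "<xsl:template match='/'>"
        + "<Engineers>"
        + "<xsl:copy-of select='Engineers/Engineer[position() = number($index)]'/>"
        + "</Engineers>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML string and returns the result as text. */
    static String transform(String xslt, String xml, int index) {
        try {
            // Compile the stylesheet into a Transformer.
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            t.setParameter("index", index);
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            // Run the transformation and capture the output in memory.
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }
}
```

Calling transform(PICK_ONE, xml, n) once per record would produce the small documents one at a time. With an XSLT 2.0 processor installed, the same TransformerFactory calls could instead load the xsl:result-document stylesheet and write all files in a single run.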

I think what you need is the cut-and-paste capability of VTD-XML... The paper below, a performance analysis of Java APIs for XML processing, will tell you more about VTD-XML.

http://sdiwc.us/digitlib/journal_paper.php?paper=00000582.pdf

import com.ximpleware.*; 
import java.io.*; 
public class splitXML { 
    public static void main(String[] args) throws VTDException, IOException { 
     VTDGen vg = new VTDGen(); 
     if (!vg.parseFile("d:\\xml\\input.xml", false)){ 
      System.out.println("error"); 
      return; 
     } 
     VTDNav vn = vg.getNav(); 
     AutoPilot ap = new AutoPilot(vn); 
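     // XPath and element names are case-sensitive, so match the input document
     ap.selectXPath("/Engineers/Engineer"); 
     int i=0,n=0; 
     FileOutputStream fos =null; 
     byte[] stag="<Engineers>".getBytes(); 
     byte[] etag="</Engineers>".getBytes(); 
     while((i=ap.evalXPath())!=-1){ 
      // open the output file first, then write the wrapper start tag
      fos = new FileOutputStream("d:\\xml\\output"+(++n)+".xml"); 
      fos.write(stag); 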
      long l = vn.getElementFragment(); 
      fos.write(vn.getXML().getBytes(), (int)l, (int)(l>>32)); 
      fos.write(etag); 
      fos.close(); 
     } 
    } 
}