Training a categorization model with OpenNLP

Problem description:

I'm trying to train a model with the code below, but I keep getting an error on the DocumentCategorizerME.train() call that tells me to change factory to a DoccatFactory. Why?

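import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

import opennlp.tools.doccat.DoccatModel;
import opennlp.tools.doccat.DocumentCategorizerME;
import opennlp.tools.doccat.DocumentSample;
import opennlp.tools.doccat.DocumentSampleStream;
import opennlp.tools.util.InputStreamFactory;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import opennlp.tools.util.TrainingParameters;

public void trainModel()
{
    DoccatModel model = null;

    try
    {
        // Opens the training file; one document per line.
        InputStreamFactory factory = getInputStreamFactory(new File("D:/training.txt"));
        ObjectStream<String> lineStream = new PlainTextByLineStream(factory, Charset.defaultCharset());
        ObjectStream<DocumentSample> sampleStream = new DocumentSampleStream(lineStream);

        TrainingParameters params = new TrainingParameters();
        params.put(TrainingParameters.ITERATIONS_PARAM, "100");
        params.put(TrainingParameters.CUTOFF_PARAM, "0");

        // The error is reported here: the fourth argument is an InputStreamFactory.
        model = DocumentCategorizerME.train("en", sampleStream, params, factory);
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }
}

// Wraps a file in an InputStreamFactory so the stream can be (re)opened on demand.
public static InputStreamFactory getInputStreamFactory(final File file) throws IOException {
    return new InputStreamFactory() {
        @Override
        public InputStream createInputStream() throws IOException {
            return new FileInputStream(file);
        }
    };
}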

Answer

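The fourth argument of DocumentCategorizerME.train(...) has to be a DoccatFactory, not an InputStreamFactory. The InputStreamFactory is only needed to open the training file for PlainTextByLineStream; the DoccatFactory is what configures the categorizer itself (its feature generators, for example). Try: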

model = DocumentCategorizerME.train("en", sampleStream, params, new DoccatFactory());

Hope it helps.
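
One more thing worth checking once the call compiles is the training file itself. As far as I know, DocumentSampleStream expects one document per line, with the category as the first whitespace-separated token followed by the document's tokens, and it rejects lines that are empty or contain only a category. Here is a minimal sketch in plain Java (no OpenNLP needed) that pre-checks a file under that assumption. The class TrainingFileCheck and its methods are made up for illustration and are not part of OpenNLP.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of OpenNLP.
class TrainingFileCheck {

    // Assumption: a usable line holds a category plus at least one token,
    // separated by whitespace, e.g. "sports the match ended in a draw".
    public static boolean isValidSampleLine(String line) {
        String[] tokens = line.trim().split("\\s+");
        return tokens.length > 1;
    }

    // Returns the 1-based numbers of the lines that fail the check above.
    public static List<Integer> findBadLines(List<String> lines) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (!isValidSampleLine(lines.get(i))) {
                bad.add(i + 1);
            }
        }
        return bad;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the path used in the question; pass another path as the first argument.
        Path path = Paths.get(args.length > 0 ? args[0] : "D:/training.txt");
        List<String> lines = Files.readAllLines(path, Charset.defaultCharset());
        System.out.println("Suspicious lines: " + findBadLines(lines));
    }
}
```

If this reports any line numbers, fix or delete those lines before training, since a blank line in the middle of the file is an easy thing to miss.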
