Training a classification model with OpenNLP
Question:
I am trying to train a model with the code below, but I keep getting an error on the DocumentCategorizerME.train() call. It tells me to change factory to DoccatFactory. Why?
public void trainModel()
{
    DoccatModel model = null;
    InputStream dataIn = null;
    try
    {
        // Read the training file line by line.
        InputStreamFactory factory = getInputStreamFactory(new File("D:/training.txt"));
        ObjectStream<String> lineStream = new PlainTextByLineStream(factory, Charset.defaultCharset());
        ObjectStream<DocumentSample> sampleStream = new DocumentSampleStream(lineStream);

        TrainingParameters params = new TrainingParameters();
        params.put(TrainingParameters.ITERATIONS_PARAM, "100");
        params.put(TrainingParameters.CUTOFF_PARAM, "0");

        // The error is reported on this line.
        model = DocumentCategorizerME.train("en", sampleStream, params, factory);
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }
}

// Wraps a file in an InputStreamFactory so the stream can be reopened on demand.
public static InputStreamFactory getInputStreamFactory(final File file) throws IOException {
    return new InputStreamFactory() {
        @Override
        public InputStream createInputStream() throws IOException {
            return new FileInputStream(file);
        }
    };
}
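Answer:
The fourth argument of DocumentCategorizerME.train(...) must be a DoccatFactory, not an InputStreamFactory. The InputStreamFactory is only needed to build the PlainTextByLineStream; the DoccatFactory configures the categorizer itself (for example, its feature generators). Try: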
model = DocumentCategorizerME.train("en", sampleStream, params, new DoccatFactory());
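Once the call compiles, it may also be worth checking the training file. As far as I know, DocumentSampleStream expects one document per line, with the category label as the first whitespace-separated token and the document text after it. The sketch below uses only the JDK to check lines against that layout. DoccatLineFormat and its methods are hypothetical helper names, not part of OpenNLP, and the sample lines are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical helper (not part of OpenNLP). It approximates the assumed
 * training-file layout: "<category> <token> <token> ..." on each line.
 */
public class DoccatLineFormat {

    /** Returns the category label: the first whitespace-separated token. */
    public static String category(String line) {
        String[] tokens = line.trim().split("\\s+");
        return tokens[0];
    }

    /** Returns the document tokens: everything after the category label. */
    public static List<String> textTokens(String line) {
        String[] tokens = line.trim().split("\\s+");
        return Arrays.asList(tokens).subList(1, tokens.length);
    }

    /** A line is usable only if it has a category and at least one text token. */
    public static boolean isValid(String line) {
        return line.trim().split("\\s+").length > 1;
    }

    public static void main(String[] args) {
        // Made-up sample lines; the second one has a label but no text.
        String[] lines = {
            "sports The team won the final",
            "politics"
        };
        for (String line : lines) {
            if (isValid(line)) {
                System.out.println(category(line) + " -> " + textTokens(line));
            } else {
                System.out.println("Skipping malformed line: " + line);
            }
        }
    }
}
```

Running a check like this over D:/training.txt before training can catch lines that carry a label but no text.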
Hope that helps.