Error "Missing elastic.cluster and elastic.host ...." when running the indexer job in Eclipse

Problem description:

I have set up Apache Nutch 1.13 with Solr 5.5.0 and HBase 0.90.6 in Eclipse. I can run the jobs from the injector onward, but the index job throws the error "Missing elastic.cluster and elastic.host ....". I have set indexer-solr under plugin.includes in nutch-site.xml, but I still get the error. Can anyone explain why this happens?

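The problem is with nutch-site.xml. If you look, there are two nutch-site.xml files: one under the conf folder and the other under the src/test folder. We usually configure the one under conf, but when the project is imported into Eclipse, the file under src/test is the one that gets used. So the way to fix this error is to put your settings in the file under src/test. That file normally contains only a very basic configuration, and you need to use the following lines in it: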

<property> 
    <name>plugin.includes</name> 
    <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value> 
    <description>Regular expression naming plugin directory names to 
    include. Any plugin not matching this expression is excluded. 
    In any case you need at least include the nutch-extensionpoints plugin. By 
    default Nutch includes crawling just HTML and plain text via HTTP, 
    and basic indexing and search plugins. In order to use HTTPS please enable 
    protocol-httpclient, but be aware of possible intermittent problems with the 
    underlying commons-httpclient library. Set parsefilter-naivebayes for classification based focused crawler. 
    </description> 
</property> 

Use these lines to replace the default entry in that file, which looks like this:

<property> 
    <name>plugin.includes</name> 
    <value>.*</value> 
    <description>Enable all plugins during unit testing.</description> 
</property> 

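Because the value .* enables every plugin, indexer-elastic is loaded too, and it is the plugin that complains about the missing elastic.cluster and elastic.host settings. If you want to use Solr, list indexer-solr in plugin.includes; for Elasticsearch, list indexer-elastic; and so on.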
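
To confirm which nutch-site.xml Eclipse actually puts on the classpath, a small check like the one below may help. It is only a sketch: the class name ConfigLocator is made up for illustration and is not part of Nutch, and it assumes you run it from the same Eclipse project (same run configuration classpath) as the indexer job.

```java
import java.net.URL;

// Hypothetical helper, not part of Nutch: reports where a classpath
// resource is resolved from, so you can see which copy wins.
public class ConfigLocator {

    /** Returns the classpath location of the named resource, or null if it is absent. */
    static URL locate(String resourceName) {
        return ConfigLocator.class.getClassLoader().getResource(resourceName);
    }

    public static void main(String[] args) {
        URL url = locate("nutch-site.xml");
        if (url == null) {
            System.out.println("nutch-site.xml is not on the classpath");
        } else {
            // The printed path shows whether the conf or the src/test copy is picked up.
            System.out.println("nutch-site.xml loaded from: " + url);
        }
    }
}
```

If the printed path points into src/test (or its build output folder), that is the file whose plugin.includes value needs to be changed.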

Hope this helps others.