AWS EMR step failed as jobs it created failed

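Problem description:

I am trying to use Amazon EMR to analyze the Wikipedia article view dataset. The dataset contains page view statistics for a three-month period (January 1, 2011 to March 31, 2011). I am trying to find the article with the most views during that period. Here is the code I am using: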

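import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public class mostViews { 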

public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> { 

    private final static IntWritable views = new IntWritable(1); 
    private Text article = new Text(); 

    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException { 

     String line = value.toString(); 

     String[] words = line.split(" "); 
     article.set(words[1]); 
     views.set(Integer.parseInt(words[2])); 
     output.collect(article, views); 
    } 
} 

public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> { 

    public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException { 

     int sum = 0; 

     while (values.hasNext()) 
     { 
      sum += values.next().get(); 
     } 
     output.collect(key, new IntWritable(sum)); 
    } 
} 

public static void main(String[] args) throws Exception { 
    JobConf conf = new JobConf(mostViews.class); 
    conf.setJobName("wordcount"); 

    conf.setOutputKeyClass(Text.class); 
    conf.setOutputValueClass(IntWritable.class); 

    conf.setMapperClass(Map.class); 
    conf.setCombinerClass(Reduce.class); 
    conf.setReducerClass(Reduce.class); 

    conf.setInputFormat(TextInputFormat.class); 
    conf.setOutputFormat(TextOutputFormat.class); 

    FileInputFormat.setInputPaths(conf, new Path(args[0])); 
    FileOutputFormat.setOutputPath(conf, new Path(args[1])); 

    JobClient.runJob(conf); 
} 
}

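The code itself works, but when I create a cluster and add the custom JAR, it sometimes fails and sometimes succeeds. Using the whole dataset as input causes it to fail, but using a single month (e.g. January) completes. After a run with the whole dataset, I looked at the "controller" log file and found this, which I think is relevant: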

2015-03-10T11:50:12.437Z INFO Synchronously wait child process to complete :  hadoop jar /mnt/var/lib/hadoop/steps/s-22ZUAWNM... 
2015-03-10T12:05:10.505Z INFO Process still running 
2015-03-10T12:20:12.573Z INFO Process still running 
2015-03-10T12:35:14.642Z INFO Process still running 
2015-03-10T12:50:16.711Z INFO Process still running 
2015-03-10T13:05:18.779Z INFO Process still running 
2015-03-10T13:20:20.848Z INFO Process still running 
2015-03-10T13:35:22.916Z INFO Process still running 
2015-03-10T13:50:24.986Z INFO Process still running 
2015-03-10T14:05:27.056Z INFO Process still running 
2015-03-10T14:20:29.126Z INFO Process still running 
2015-03-10T14:35:31.196Z INFO Process still running 
2015-03-10T14:50:33.266Z INFO Process still running 
2015-03-10T15:05:35.337Z INFO Process still running 
2015-03-10T15:11:37.366Z INFO waitProcessCompletion ended with exit code 1 :  hadoop jar /mnt/var/lib/hadoop/steps/s-22ZUAWNM... 
2015-03-10T15:11:40.064Z INFO Step created jobs: job_1425988140328_0001 
2015-03-10T15:11:50.072Z WARN Step failed as jobs it created failed.  Ids:job_1425988140328_0001

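Can anyone tell me what is going wrong and what I can do to fix it? The fact that it works for one month but not for two or three months makes me think the dataset might be too big, but I am not sure. I am still new to this whole Hadoop/EMR thing, so if there is any information I left out, just let me know. Any help or advice would be greatly appreciated.

Thanks in advance!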

Did you find a solution? – 2015-09-08 03:51:47

Not really, I just reduced the size of the dataset and then it seemed to work. I still don't know why it happened, though. – spoon 2015-09-08 07:52:32

Answer

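These errors typically happen when you run out of space, either on HDFS (the hard disks of the EMR nodes) or in memory.

First, I would start by reading the logs the message points you to: "/mnt/var/lib/hadoop/steps/s-22ZUAWNM..."

Second, I would try creating a bigger EMR cluster (EC2 instances with more disk and RAM, or more core instances).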
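To see whether local disk is actually the constraint before resizing the cluster, a quick check on a node can help. The following is a minimal sketch, not part of the original answer: the class name `DiskSpaceCheck` and the helper `usablePercent` are hypothetical, and the default path `/mnt` is only an assumption taken from the step path in the controller log. It uses only the standard `java.io.File` API, so it reports the partition of the node it runs on and says nothing about HDFS capacity across the cluster.

```java
import java.io.File;

public class DiskSpaceCheck {

    // Returns the percentage of the partition holding `path` that is still
    // usable, or -1 if the path does not exist or its size cannot be read.
    static double usablePercent(File path) {
        long total = path.getTotalSpace(); // 0 when the path names no partition
        if (total <= 0) {
            return -1.0;
        }
        return 100.0 * path.getUsableSpace() / total;
    }

    public static void main(String[] args) {
        // "/mnt" is an assumed default; pass another directory as the first argument.
        File dir = new File(args.length > 0 ? args[0] : "/mnt");
        double pct = usablePercent(dir);
        if (pct < 0) {
            System.out.println("Cannot read disk usage for " + dir);
        } else {
            System.out.printf("%s: %.1f%% usable%n", dir, pct);
        }
    }
}
```

If the usable percentage drops toward zero while the full three-month input is being processed, that would be consistent with the out-of-space explanation above; if it stays high, memory or something in the step logs is the more likely place to look.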
