Hadoop Distributed Cluster Setup (HA) (pitfalls conquered!)


  • 1. Prepare three virtual machines (I used CentOS 7 installed in VirtualBox).

  • 2. Install the JDK and configure the environment variables.

  • 3. Upload the hadoop-2.7.3 package to the Linux systems with Xshell + Xftp and copy it to all three VMs, or configure passwordless SSH and send it over that way; in any case it is best to set up passwordless login between the VMs (see the sketch below).
    Note, note, note!!! This is critical: if SSH keys were generated before, they MUST be deleted first.
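A minimal sketch of that key setup, assuming the hosts are named n1, n2 and node03 (the names used below) and everything runs as root; the loop and exact paths are my own illustration, not commands from the original post:

```bash
# Remove any previously generated keys first (critical, as stressed above)
rm -rf ~/.ssh

# Generate a new RSA key pair without a passphrase
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

# Copy the public key to every node so the machines can reach each other
# without a password
for host in n1 n2 node03; do
    ssh-copy-id root@"$host"
done

# With passwordless login in place, the Hadoop package can be pushed to the
# other machines instead of uploading it three times with Xftp
scp -r /usr/soft/hadoop-2.7.3 root@n2:/usr/soft/
scp -r /usr/soft/hadoop-2.7.3 root@node03:/usr/soft/
```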

  • 4. The IPs of my three machines are shown in a screenshot (omitted here).
    If a leftover node01 entry shows up anywhere later, be sure to delete it.

  • First, set up the hostname mapping on every machine:

  • vi /etc/hosts

  • My namenode (metadata node) is n1.

  • My secondarynamenode (secondary metadata node) is n2.

  • The remaining machine is the datanode (node03); an example /etc/hosts is sketched below.
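A sketch of /etc/hosts (identical on every node); the IP addresses below are placeholders of my own, substitute the real ones from your VMs and make sure no stale node01 line survives:

```
# /etc/hosts -- the IPs here are placeholders, use your own
192.168.56.101  n1        # namenode
192.168.56.102  n2        # secondarynamenode
192.168.56.103  node03    # datanode
```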

  • cd /usr/soft/hadoop-2.7.3/etc/hadoop (go into the Hadoop configuration directory)

  • Modify the four core configuration files:

  • 1. core-site.xml, the core configuration file that Hadoop needs in order to start.

  • The two things to change here are the HDFS address and the tmp directory path (see the sketch below).
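A minimal core-site.xml sketch; the namenode host n1 matches the setup above, while the RPC port and the tmp path are my assumptions and should be adapted to your environment:

```xml
<configuration>
    <!-- Address of HDFS: namenode host and RPC port (port is an assumption) -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://n1:9000</value>
    </property>
    <!-- Base directory for Hadoop's temporary/working files -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/soft/hadoop-2.7.3/tmp</value>
    </property>
</configuration>
```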

  • 2.hdfs-site.xml

  • dfs.replication: for a pseudo-distributed setup, this value must be set to 1.

  • dfs.permissions is set before running tests; it controls HDFS permission checking.

  • dfs.namenode.secondary.http-address configures the HTTP address and port of the secondary namenode.

  • Be sure to configure the name and data directory paths here, and don't forget port 50090 for the secondary namenode (see the sketch below).
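A sketch of hdfs-site.xml covering the properties mentioned above; the local directory paths are my assumptions:

```xml
<configuration>
    <!-- Replication factor: 1 is enough here (only one datanode) -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <!-- Relax HDFS permission checks while testing -->
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
    <!-- HTTP address of the secondary namenode (n2), port 50090 -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>n2:50090</value>
    </property>
    <!-- Local paths for namenode metadata and datanode block storage -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/usr/soft/hadoop-2.7.3/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/usr/soft/hadoop-2.7.3/dfs/data</value>
    </property>
</configuration>
```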

  • 3.yarn-site.xml


  • YARN exists to manage cluster resources better and handle resource scheduling (a minimal sketch follows).
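A minimal yarn-site.xml sketch; placing the ResourceManager on n1 is my assumption:

```xml
<configuration>
    <!-- Host that runs the ResourceManager (assumed to be n1) -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>n1</value>
    </property>
    <!-- Auxiliary service MapReduce needs for the shuffle phase -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
```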

  • 4.mapred-site.xml


  • Set the MapReduce framework name to yarn, so that YARN manages the MapReduce computation framework (sketch below).
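The matching mapred-site.xml only needs the framework name (in 2.7.3 this file is usually created by copying mapred-site.xml.template):

```xml
<configuration>
    <!-- Run MapReduce jobs on YARN instead of the local job runner -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
```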

  • 5. Configure the hadoop-env.sh file


  • Set it to the directory of the currently installed JDK.

  • The JDK is what Hadoop needs to start (after all, Hadoop is written in Java, which is also why it is so portable); a sketch of this step and of bringing the cluster up follows.
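A sketch of this step plus starting the cluster before the test below; the JDK path is a placeholder of mine, and the scp/format/start commands assume the configuration was finished on n1 first:

```bash
# In /usr/soft/hadoop-2.7.3/etc/hadoop/hadoop-env.sh, point JAVA_HOME at the
# installed JDK (the path below is a placeholder -- use your own JDK directory)
export JAVA_HOME=/usr/soft/jdk1.8.0_131

# List the datanode host(s) in etc/hadoop/slaves (here: node03), then push the
# finished configuration to the other nodes
scp -r /usr/soft/hadoop-2.7.3/etc/hadoop root@n2:/usr/soft/hadoop-2.7.3/etc/
scp -r /usr/soft/hadoop-2.7.3/etc/hadoop root@node03:/usr/soft/hadoop-2.7.3/etc/

# Format HDFS once (on n1 only), then start HDFS and YARN
/usr/soft/hadoop-2.7.3/bin/hdfs namenode -format
/usr/soft/hadoop-2.7.3/sbin/start-dfs.sh
/usr/soft/hadoop-2.7.3/sbin/start-yarn.sh
```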

  • A small test example: prepare a file.

  • The contents are as follows:

Dear friends,

On behalf of the Students’ Union, I am writing to extend my warm welcome to all the international students that are coming to our university. I firmly believe that your entering our university will bring us new ideas from different cultures and promote communication.

Also, we have some suggestions before you come to our university. As you all know, we have a great much free time in college and the teachers don’t oversee us strictly. Therefore, it is recommended that you can arrange your time properly and avoid being indulged in computer games, etc. Moreover, in order to make yourselves quickly adjusted to our campus life, it would be of great help if you can learn Chinese as well as you can. Last but not the least, it is necessary to take part in some activities so that you can improve your communication abilities and expand horizon.

We are looking forward to seeing you soon and wish everything goes well.

Yours sincerely

Li Ming

  • Upload the file to the HDFS file system.
  • Drag the file into the /root directory of a datanode machine (via Xftp); the namenode only stores metadata, so here I placed it on node03. Then run the following commands:
  • 1. Create a directory in HDFS.
  • Command: hadoop fs -mkdir -p /input
  • Command: hadoop fs -ls -R /   (recursively list all directories)
  • 2. Copy the local Linux file into the file system.
  • Command: hadoop fs -put /root/didid.txt /input
  • cd /usr/soft/hadoop-2.7.3/share/hadoop/mapreduce/
  • The jar we use is the prepackaged MapReduce examples package (the test cases).
  • The test case is the word count program written earlier.
    • This example is essentially the "Hello World" of Hadoop.
  • Type the following command (the jar name can be auto-completed with the Tab key):
    • hadoop jar hadoop-mapreduce-examples-2.7.3.jar wordcount /input/didid.txt /usr/soft/out

    • The results are written to the /usr/soft/out directory; the name of the output directory does not matter, but it must not already exist, since the job creates it and uses it only to hold the results.


    • Enter the command: hadoop fs -ls /usr/soft/out

    • You will see that a part-r-00000 file has been generated.

  • Enter the command: hadoop fs -text /usr/soft/out/part-r-00000
    • If you can see the word counts, the cluster is alive and kicking.
    • Finally, open the web UI on port 50070: Live Nodes now shows 1, because only one datanode was set up!
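The same check also works from the shell, assuming the HDFS daemons are running:

```bash
# Prints cluster capacity plus a "Live datanodes (1)" section that should
# match the Live Nodes count shown on the 50070 page
hdfs dfsadmin -report
```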

Run jps on each node to check which daemons are running (screenshots omitted).