Sqoop Import and Export
I. Sqoop Import
First, create a test database and table in MySQL and insert some test rows:
create database testdb;
use testdb;
create table user(
id int not null auto_increment,
account varchar(10) default null,
password varchar(10) default null,
primary key(id)
);
Insert the data:
insert into user (account,password) values ("aaa","123");
insert into user values (null,"aaa","123");
update user set password="213" where id=2;
insert into user values (null,"ccc","321");
insert into user values (null,"ddd","456");
insert into user values (null,"eee","546");
insert into user(account,password) values("fff","654");
insert into user(account,password) values("ggg","789");
insert into user(account,password) values("hhh","879");
insert into user(account,password) values("iii","987");
1) Import: MySQL -> HDFS
(1) Full-table import
bin/sqoop import \
--connect jdbc:mysql://linux1:3306/testdb \
--username root \
--password 123456 \
--table user \
--target-dir /input/sqoop/import1 \
--num-mappers 1 \
--fields-terminated-by ","
The same import, this time prompting for the password instead of passing it in plain text. The options used:
-P                      prompt for the database password (avoids a plain-text password on the command line)
--target-dir            HDFS directory to import the data into
--delete-target-dir     delete the target directory first if it already exists
--num-mappers           number of map tasks used for the import
--fields-terminated-by  field delimiter in the output files
bin/sqoop import \
--connect jdbc:mysql://linux1:3306/testdb \
--username root \
-P \
--table user \
--target-dir /input/sqoop/import2 \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by ","
(2) Query-based import: the where clause of --query must contain the literal token $CONDITIONS (a fixed requirement)
bin/sqoop import \
--connect jdbc:mysql://linux1:3306/testdb \
--username root \
-P \
--query 'select id,account from user where password>600 and $CONDITIONS' \
--target-dir /input/sqoop/import3 \
-m 1 \
--delete-target-dir \
--fields-terminated-by "\t"
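The query above is wrapped in single quotes so the shell passes $CONDITIONS through literally for Sqoop to substitute at run time; inside double quotes the dollar sign must be escaped, or the shell expands $CONDITIONS to an empty string. A quick local check of the two quoting styles (no Sqoop needed):

```shell
# Single quotes: the shell does not expand $CONDITIONS.
Q1='select id,account from user where password>600 and $CONDITIONS'
# Double quotes: the dollar sign must be escaped to survive shell expansion.
Q2="select id,account from user where password>600 and \$CONDITIONS"
echo "$Q1"
echo "$Q2"
```

Both echoes print the identical string, ending in the literal $CONDITIONS.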
(3) Incremental import: only rows whose --check-column value is greater than --last-value are imported
bin/sqoop import \
--connect jdbc:mysql://linux1:3306/testdb \
--username root \
--password 123456 \
--table user \
--target-dir /input/sqoop/append \
-m 1 \
--fields-terminated-by " " \
--check-column id \
--incremental append \
--last-value 2
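What --incremental append does can be sketched locally: only rows whose check column exceeds the last recorded value are pulled. A stand-in using a CSV file and awk (the file path and sample rows are made up for illustration):

```shell
# Mimic --check-column id --incremental append --last-value 2.
LAST_VALUE=2
cat > /tmp/user_sample.csv <<'EOF'
1,aaa,123
2,aaa,213
3,ccc,321
4,ddd,456
EOF
# Keep only rows whose id (column 1) is greater than LAST_VALUE.
awk -F, -v last="$LAST_VALUE" '$1 > last' /tmp/user_sample.csv
```

After a real run, Sqoop logs the value to pass as --last-value on the next invocation, so repeated runs pick up only newly appended rows.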
2) Import: MySQL -> Hive
bin/sqoop import \
--connect jdbc:mysql://linux1:3306/testdb \
--username root \
--password 123456 \
--table user \
-m 1 \
--delete-target-dir \
--fields-terminated-by "\t" \
--hive-import \
--hive-overwrite \
--hive-database db01 \
--hive-table user
3) Export: from HDFS or a Hive table to MySQL
A Hive database and table are just a directory and files on HDFS, so exporting from a Hive table simply means exporting from the corresponding HDFS directory.
First create a table in MySQL to receive the exported data:
create table hive2mysql(
deptno int,
deptname varchar(20),
address varchar(20)
);
bin/sqoop export \
--connect jdbc:mysql://linux1:3306/testdb \
--username root \
--password 123456 \
--table hive2mysql \
--num-mappers 1 \
--export-dir /user/hive/warehouse/db01.db/dept \
--fields-terminated-by "\t"
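For the export to succeed, the files under --export-dir must use the same delimiter given to --fields-terminated-by, and each row must split into one field per column of the target table. A local sketch with two illustrative tab-delimited rows shaped like hive2mysql (deptno, deptname, address; the values and path are made up):

```shell
# Two sample rows, tab-delimited to match --fields-terminated-by "\t".
printf '10\tACCOUNTING\tNEW YORK\n20\tRESEARCH\tDALLAS\n' > /tmp/dept_sample
# Each row should split into exactly 3 fields, one per MySQL column;
# a mismatch here is a common cause of export failures.
awk -F'\t' '{print NF}' /tmp/dept_sample
```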
4) Sqoop command options can be written to a file, one option or value per line, and run by passing the file with --options-file.
vim sqoop_script
export
--connect
jdbc:mysql://linux1:3306/testdb
--username
root
--password
123456
--table
hive2mysql
--num-mappers
1
--export-dir
/user/hive/warehouse/db01.db/dept
--fields-terminated-by
"\t"
bin/sqoop --options-file ~/sqoop_script
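The options file can also be generated without an editor; a quoted heredoc preserves the one-option-per-line layout exactly. A sketch writing the same options to a hypothetical /tmp/sqoop_script (target table hive2mysql from section 3):

```shell
# One option or one value per line; no backslash continuations are used
# in an options file.
cat > /tmp/sqoop_script <<'EOF'
export
--connect
jdbc:mysql://linux1:3306/testdb
--username
root
--password
123456
--table
hive2mysql
--num-mappers
1
--export-dir
/user/hive/warehouse/db01.db/dept
--fields-terminated-by
"\t"
EOF
wc -l < /tmp/sqoop_script
```

The file would then be run with bin/sqoop --options-file /tmp/sqoop_script.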