Hive(六)案例sql语句练习
掌握了以下sql语句,以后的项目执行sql语句,对于你来说完全们难度。
本案例是##################雇员表和部门表##########################
相当于是对sql语句的练习,难度不小,希望小伙伴们做好心理准备
准备数据:
需要的数据被我上传在博客上,大家可以去下载
地址是: https://download.****.net/my
将下载好的数据上传到linux上/opt/datas
当然也可以自己创建文件
比如我的文件:
首先创建数据库
CREATE DATABASE IF NOT EXISTS db_0816; |
db_0816是数据库名
使用数据库
use db_0816; |
创建雇员表
CREATE TABLE db_0816.emp( empno INT, ename STRING, job STRING, mgr INT, hiredate STRING, sal DOUBLE, comn DOUBLE, deptno INT )ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY "\n" |
CREATE TABLE db_0816.dept( deptno INT, dname STRING, loc STRING )ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY "\n" |
导入数据
LOAD DATA LOAD INPATH "/opt/data/emp.txt" OVERWRITE INTO TABLE db_0816.emp |
LOAD DATA LOAD INPATH "/opt/data/dept.txt" OVERWRITE INTO TABLE db_0816.dept |
验证
select count(1) from db_0816.emp |
select count(1) from db_0816.dept |
数据库和数据表都整好了,下面来实现需求
#查询部门编号是30的员工并且薪资大于500的
select empno,ename,sal,deptno from emp where deptno=30 and sal>500; |
#查询薪资最高的3条记录
select empno,ename,sal,deptno from emp order by sal desc limit 3; |
#查询员工表中有哪些部门
select distinct deptno from emp; |
#查询员工号大于7800的员工编号对应的部门名称
select e.empno,e.ename,e.deptno,d.dname from ( select empno,ename,sal,deptno from emp where empno>7800 )e JOIN dept d on e.deptno=d.deptno; |
#查新薪资在800到1500之间的员工姓名,工号,职位,薪水
select ename,empno,job,sal from emp where sal>800 and sal<1500; #或者 where sal between 800 and 1500 |
#查询奖金不为空的且薪资不低于500的
select ename,empno,job,sal,comn from emp where comn is not null and sal>=500 |
#每个部门的平均工资
select deptno,round(avg(sal),2) as sal_avg from emp group by deptno |
|
#每个部门中每个岗位的最高薪资
select deptno,job,max(sal) as max_sal from emp group by deptno,job; |
#平均工资大于2000的部门名称和平均工资
select d.dname,e.avg_sal from( select deptno,round(avg(sal),2) as avg_sal from emp group by deptno having round(avg(sal),2)>2000 )e join dept d on d.deptno=e.deptno; |
#等值连接(inner)
select * from dept d join emp e on e.deptno=e.deptno |
#左连接
select d.dname,e.empno,ename from dept d left outer join emp e on e.deptno = e.deptno |
#右连接
select d.dname,e.empno,ename from dept d right outer join emp e on e.deptno = e.deptno |
#全连接
select d.dname,e.empno,ename from dept d full outer join emp e on e.deptno = e.deptno |
统计出每个部门工资最高的前3个人的信息
--如果有3个部门,那么有9天
select empno,ename,sal,hiredate,deptno, ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY sal DESC) as rnk from db_0822.emp |
前三
select t.empno,t.ename,t.sal,t.deptno,t.rnk from( select empno,ename,sal,hiredate,deptno, ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY sal DESC) as rnk from db_0625.emp )t WHERE t.rnk<=3 |
出现相同工资的,相同的值,正常来说,排序应该一样
--比如:如果有2个第一名,那么第二名就没有了。
select t.empno,t.ename,t.sal,t.deptno,t.rnk from( select empno,ename,sal,hiredate,deptno, RANK() OVER (PARTITION BY deptno ORDER BY sal DESC) as rnk from db_0822.emp )t WHERE t.rnk<=3 |
小案例到这就结束了,希望以上的sql语句对大家有所帮助!