使用SQL统计出每个用户的累积访问次数

我们有如下的用户访问数据

userId

visitDate

visitCount

u01

2017/1/21

5

u02

2017/1/23

6

u03

2017/1/22

8

u04

2017/1/20

3

u01

2017/1/23

6

u01

2017/2/21

8

U02

2017/1/23

6

U01

2017/2/22

4

要求使用SQL统计出每个用户的累积访问次数,如下表所示:

用户id

月份

小计

累积

u01

2017-01

11

11

u01

2017-02

12

23

u02

2017-01

12

12

u03

2017-01

8

8

u04

2017-01

3

3

数据集

u01     2017/1/21       5

u02     2017/1/23       6

u03     2017/1/22       8

u04     2017/1/20       3

u01     2017/1/23       6

u01     2017/2/21       8

u02     2017/1/23       6

u01     2017/2/22       4

1)创建表

create table action

(userId string,

visitDate string,

visitCount int)

row format delimited fields terminated by "\t";

2)加载数据

hive (task)> load data local inpath '/opt/temp/action.txt' overwrite into table action;

执行sql

with t1 as(select userId,date_format(regexp_replace(visitDate,'/','-'),'yyyy-MM') as visitDate ,visitCount from action),
t2 as(select userId,visitDate,sum(visitCount) as visitCount_sum from t1 group by userId,visitDate),
t3 as(select userId,visitDate,visitCount_sum,sum(visitCount_sum) over(partition by userId order by visitDate asc rows between UNBOUNDED PRECEDING and current row) as heji  from t2)
select * from t3 order by userId;

使用SQL统计出每个用户的累积访问次数