Hive SQL数据分析实战(二)

有以下几张数据表,请写出Hive SQL语句,实现以下需求。

注:分区字段为dt,代表日期。
在这里插入图片描述

在这里插入图片描述

在这里插入图片描述

在这里插入图片描述

1、某年度对用户满意度进行调研分析,找出目标人群。

参考实现:找出2019年购买商品后又退款的用户

select a.user_name
from
    (select distinct user_name
    from user_trade
    where year(dt)=2019) a
join
    (select distinct user_name
    from user_refund
    where year(dt)=2019) b
on a.user_name=b.user_name;

2、用户忠诚度类项目的调研分析,找出目标人群。

参考实现:选出在2017年和2018年都购买商品的用户

select a.user_name
from
    (select distinct user_name
    from user_trade
    where year(dt)=2017) a
join
    (select distinct user_name
    from user_trade
    where year(dt)=2018) b
on a.user_name=b.user_name;

3、高忠诚度用户的匹配分析,用以生成心路历程类报表推送给用户,找出目标人群。

参考实现:选出在2017年、2018年和2019年都购买商品的用户

select a.user_name
from
    (select distinct user_name
    from trade_2017) a
join
    (select distinct user_name
    from trade_2018) b on a.user_name=b.user_name
join
    (select distinct user_name
    from trade_2019) c on b.user_name=c.user_name;

4、无退款服务类用户的定位分析,用以发送服务判断类用户调研。

参考实现:在2019年购买商品,但是没有退款的用户。

select a.user_name
from
    (select distinct user_name
    from user_trade
    where year(dt)=2019) a
left join
    (select distinct user_name
    from user_refund
    where year(dt)=2019) b
on a.user_name=b.user_name
where b.user_name is null;

5、对客户的学历进行调研分析,观察其分布情况。

参考实现:在2019年购买商品的用户的学历分布

select b.education,
		count(distinct a.user_name)
from
	 (select distinct user_name
    from user_trade
    where year(dt)=2019) a
left join
    (select distinct user_name,
    		extra2['education'] as education
    from user_info
    where year(dt)=2019) b
on a.user_name=b.user_name
group by b.education;

6、老客户召回计划,匹配到目标人群。

参考实现:在2017和2018年都购买,但是没有在2019年购买的用户

select a.user_name
from
    (select distinct user_name
    from trade_2017) a
join
    (select distinct user_name
    from trade_2018) b on a.user_name=b.user_name
left join
    (select distinct user_name
    from trade_2019) c on b.user_name=c.user_name
where c.user_name is null;

7、对近几年的交易进行分析,评估平台的价值。

参考实现:2017-2019年有交易的所有用户数

select count(distinct a.user_name),
		count(a.user_name)
from	
	(select user_name
	from trade_2017
	union all
	select user_name
	from trade_2018
	union all
	select user_name
	from trade_2019) a;

8、对某年度的客户交易价值进行分析。

参考实现:2019年每个用户的支付和退款金额汇总

select coalesce(a.user_name,b.user_name),
		if(a.pay_amount if null, 0, a.pay_amount),
		if(b.refund_amount if null, 0, b.refund_amount)
from 
	(select user_name,
			sum(pay_amount) as pay_amount
	from user_trade
	where year(dt)=2019
	group by user_name) a
full join
	(select user_name,
			sum(refund_amount) as refund_amount
	from user_refund
	where year(dt)=2019
	group by user_name) b
on a.user_name=b.user_name

9、对沉默用户的年龄段进行分析,用以部署活动来刺激其消费。

参考实现:首次激活时间在2017年,但是一直没有支付的用户的年龄段分布。

select a.age_type,
		count(a.user_name)
from
    (select	user_name,
            case when age<20 then '20岁以下'
                when age>=20 and age<30 then '20-30岁'
                when age>=30 and age<40 then '30-40岁'
                else '40岁以上' end as age_type
    from user_info
    where year(firstactivetime)=2017) a
left join
	(select distinct user_name
	from user_trade
	where dt>'0') b
where b.user_name is null
group by a.age_type;

10、对活跃用户的激活时间段进行分析,用以部署活动来刺激其消费。

参考实现:2018、2019年都有交易的用户,其激活时间段分布。

select hour(firstactivetime),
		count(a.user_name)
from
	(select user_name
	from trade_2018
	unoin
	select user_name
	from trade_2019)a
left join
	user_info b
on a.user_name=b.user_name
group by hour(firstactivetime);

11、在2019年购买后又退款的用户性别分布

select c.sex, 
		count(distinct c.user_name) as user_num
from
	(select distinct user_name from user_trade where year(dt)=2019) a
join 
	(select distinct user_name from user_refund where year(dt)=2019) b
on a.user_name=b.user_name
join
	(select user_name, sex from user_info ) c
on a.user_name=c.user_name
group by c.sex;

12、在2018年购买,但是没在2019年购买的用户城市分布。

select c.city, 
		count(distinct c.user_name) as user_num
from
	(select distinct user_name from trade_2018 ) a
left join 
	(select distinct user_name from trade_2019) b
on a.user_name=b.user_name
join
	(select user_name, city from user_info ) c
on a.user_name=c.user_name
where b.user_name is null
group by c.city;

13、2017-2019年,有交易但是没退款的用户的手机品牌分布。

select c.phonebrand , 
		count(distinct c.user_name) as user_num
from
	(select distinct user_name from user_trade where year(dt) in (2017, 2018, 2019)) a
left join 
	(select distinct user_name from user_refund where year(dt) in (2017, 2018, 2019)) b
on a.user_name=b.user_name
join
	(select user_name, extra2["phonebrand"] as phonebrand from user_info ) c
on a.user_name=c.user_name
where b.user_name is null
group by c.phonebrand;

版权声明:本文为weixin_42384784原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接和本声明。