hive系统函数collect_list和collect_set的应用

collect_list:收集数据,返回是一个集合,集合中元素不去重
collect_set:收集数据,返回是一个集合,集合中元素去重
结合concat_ws函数返回行转列的结果

案例1 collect_list:

spark-sql> with test1 as (
         > select '张三丰' as user_name,'向往的生活' as video_name,'2021-07-28' as dateline
         > union all 
         > select '张三丰' as user_name,'极限挑战' as video_name,'2021-07-28' as dateline
         > union all 
         > select '张三丰' as user_name,'今夜百乐门' as video_name,'2021-07-28' as dateline
         > union all 
         > select '张三丰' as user_name,'向往的生活' as video_name,'2021-07-28' as dateline
         > union all 
         > select '张三丰' as user_name,'极限挑战' as video_name,'2021-07-28' as dateline
         > union all 
         > select '张无忌' as user_name,'王牌对王牌' as video_name,'2021-07-28' as dateline
         > union all 
         > select '张无忌' as user_name,'觉醒年代' as video_name,'2021-07-28' as dateline
         > union all 
         > select '张无忌' as user_name,'欢乐喜剧人' as video_name,'2021-07-28' as dateline
         > union all 
         > select '张无忌' as user_name,'向往的生活' as video_name,'2021-07-28' as dateline)
         > 
         > select 
         >   user_name,
         >   dateline,
         >   concat_ws(',',collect_list(video_name)) as video_list
         > from test1
         > group by user_name,dateline;
张无忌  		2021-07-28      王牌对王牌,向往的生活,觉醒年代,欢乐喜剧人               
张三丰		2021-07-28		今夜百乐门,极限挑战,向往的生活,极限挑战,向往的生活
Time taken: 17.38 seconds, Fetched 2 row(s)

案例2 collect_set:

spark-sql> with test1 as (
         > select '张三丰' as user_name,'向往的生活' as video_name,'2021-07-28' as dateline
         > union all 
         > select '张三丰' as user_name,'极限挑战' as video_name,'2021-07-28' as dateline
         > union all 
         > select '张三丰' as user_name,'今夜百乐门' as video_name,'2021-07-28' as dateline
         > union all 
         > select '张三丰' as user_name,'向往的生活' as video_name,'2021-07-28' as dateline
         > union all 
         > select '张三丰' as user_name,'极限挑战' as video_name,'2021-07-28' as dateline
         > union all 
         > select '张无忌' as user_name,'王牌对王牌' as video_name,'2021-07-28' as dateline
         > union all 
         > select '张无忌' as user_name,'觉醒年代' as video_name,'2021-07-28' as dateline
         > union all 
         > select '张无忌' as user_name,'欢乐喜剧人' as video_name,'2021-07-28' as dateline
         > union all 
         > select '张无忌' as user_name,'向往的生活' as video_name,'2021-07-28' as dateline)
         > 
         > select 
         >   user_name,
         >   dateline,
         >   concat_ws(',',collect_set(video_name)) as video_list
         > from test1
         > group by user_name,dateline;
张三丰		2021-07-28		今夜百乐门,极限挑战,向往的生活
张无忌		2021-07-28		欢乐喜剧人,向往的生活,王牌对王牌,觉醒年代

版权声明:本文为weixin_40267121原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接和本声明。