lateral view 的使用
针对一个实际问题对 lateral view 的使用原理进行拆分
问题:购买每个商品品类的用户数
- 查询表格中的数据:
select * from user_goods_category limit 5;
- 处理的数据表格如下:
| user_name | category_detail |
|---|---|
| Abby | clothes,food,electronics |
| Ailsa | book,clothes,food |
| Albert | clothes,electronics,computer |
| Alexander | clothes,book,computer,electronics |
| Alice | clothes,electronics,shoes |
从表中可以看出,category_detail 列中包含多个项目,这种表格并不符合关系型数据库的第一范式(每个属性都不可再分)。
1、首先将 category_detail 拆分开
explode(a) - separates the elements of array a into multiple rows,
or the elements of a map into multiple rows and columns
将数组 a 的元素分隔为多个行,或将映射的元素分隔为多个行和列
select explode(split(category_detail,","))
from user_goods_category
limit 16;
结果如下:
| col |
|---|
| clothes |
| food |
| electronics |
| book |
| clothes |
| food |
| clothes |
| electronics |
| computer |
| clothes |
| book |
| computer |
| electronics |
| clothes |
| electronics |
| shoes |
explode 说明
explode()拆分字段后,只含有拆分的字段,不含原表中的字段explode()的处理对象是array和map对象,所以先用split()将字符串对象 分隔为arrayexplode(array)使得结果中将array列表里的每个元素生成一行;explode(map)使得结果中将map里的每一对元素作为一行,key为一列,value为一列。
2、将拆分后的字段与原表联系起来
select user_name,b.category
from user_goods_category
lateral view explode(split(category_detail,",")) b as category
limit 16;
结果如下:
| user_name | b.category |
|---|---|
| Abby | clothes |
| Abby | food |
| Abby | electronics |
| Ailsa | book |
| Ailsa | clothes |
| Ailsa | food |
| Albert | clothes |
| Albert | electronics |
| Albert | computer |
| Alexander | clothes |
| Alexander | book |
| Alexander | computer |
| Alexander | electronics |
| Alice | clothes |
| Alice | electronics |
| Alice | shoes |
lateral view 说明
lateral view为侧视图,意义是为了配合explode来使用,把某一行数据拆分成多行数据。不加lateral view的explode只能提取单个字段拆分,加上lateral view就可以将拆分的单个字段数据与原始表数据关联上。- 在使用
lateral view的时候需要指定视图别名和生成的新列别名,此时原表不必指定别名。 lateral view explode(split(category_detail,",")) b as category中的b为视图名,categary为分割后的列的别名
3、查询购买每个商品品类的用户数
基于 lateral view 的拆分和连接结果,可以很轻松的将购买每个商品品类的用户数查询出来:
select c.category,
count(distinct c.user_name) as user_num
from
(select a.user_name,
b.category
from user_goods_category a -- 原表可以不设置别名
lateral view explode(split(category_detail,",")) b as category
limit 16) as c
group by c.category;
查询结果:
| c.category | user_num |
|---|---|
| book | 2 |
| clothes | 5 |
| computer | 2 |
| electronics | 4 |
| food | 2 |
| shoes | 1 |
lateral view explode 说明
lateral view explode 相当于一个拆分字段的虚表,然后与原表进行关联。
版权声明:本文为gxw11111111原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接和本声明。