记两则需求SQL关联注意事项
需求一: 两个表分条件选择join
两个表字段如下
表ci:
| c_id | level 1 | level2 | level3 | level4 | 其他字段 |
|---|---|---|---|---|---|
表qq:
| level_rank | level_id | 其他字段 |
|---|---|---|
两个表的关联条件是: qq.level_rank = 1 时, 用qq.level_id =ci. level1 ;qq.level_rank = 2 时, 用qq.level_id =ci. level2; qq.level_rank = 3 时, 用qq.level_id =ci. level3; qq.level_rank = 4 时, 用qq.level_id =ci. level4;
1.在 mysql gp gp impala 数据库中 可以如下形式join, 在on 后面写case 条件.
select *
from
(select 111 as level_1 ,111 as level_2 ,111 as level_3,111 as level_4
union all select 122 as level_1 ,122 as level_2 ,122 as level_3,122 as level_4
union all select 333 as level_1 ,331 as level_2 ,333 as level_3,333 as level_4
union all select 111 as level_1 ,111 as level_2 ,111 as level_3,444 as level_4) ci
left join
(select 1 as level_rank,111 as level_id
union all select 2 as level_rank ,222 as level_id
union all select 3 as level_rank ,333 as level_id
union all select 4 as level_rank ,44 as level_id) qq
on case qq.level_rank when 1 then qq.level_id = ci.level_1
when 2 then qq.level_id = ci.level_2 when 3 then qq.level_id = ci.level_3 when 4 then qq.level_id = ci.level_4 end
- 但上述语句在hive中并不适用.
需求二: 两个表关联条件中有null的情况
当有字段为null时, 是关联不到任何数据的,比如:
select * from
(select 111 as level_1 ,111 as level_2 ,111 as level_3,111 as level_4
union all select 122 as level_1 ,122 as level_2 ,122 as level_3,122 as level_4
union all select 333 as level_1 ,331 as level_2 ,333 as level_3,333 as level_4
union all select 111 as level_1 ,111 as level_2 ,111 as level_3,444 as level_4) ci
join
(select 1 as level_rank,111 as level_id
union all select 2 as level_rank ,222 as level_id
union all select 3 as level_rank ,333 as level_id
union all select 4 as level_rank ,44 as level_id) qq
on null = null
结果图片如下图所示: :
多条件join , 其中有条件是null = null 时 , 也不会有结果数据
select * from
(select 111 as level_1 ,111 as level_2 ,111 as level_3,111 as level_4
union all select 122 as level_1 ,122 as level_2 ,122 as level_3,122 as level_4
union all select 333 as level_1 ,331 as level_2 ,333 as level_3,333 as level_4
union all select 111 as level_1 ,111 as level_2 ,111 as level_3,444 as level_4) ci
join
(select 1 as level_rank,111 as level_id
union all select 2 as level_rank ,222 as level_id
union all select 3 as level_rank ,333 as level_id
union all select 4 as level_rank ,44 as level_id) qq
on 1=1 and null = null
结果情况如下图所示:
基于上述情况,避免join情况中出现null的情况, 可以 用函数代替, 在hive中,可用 nvl() 函数.
对比下面两个写法及结果:
select * from
(select 111 as level_1 ,111 as level_2 ,111 as level_3,111 as level_4
union all select 122 as level_1 ,122 as level_2 ,122 as level_3,122 as level_4
union all select 333 as level_1 ,331 as level_2 ,null as level_3,null as level_4
union all select 111 as level_1 ,111 as level_2 ,111 as level_3,null as level_4) ci
join
(select 111 as level_1 ,111 as level_2 ,111 as level_3,111 as level_4
union all select 122 as level_1 ,122 as level_2 ,122 as level_3,122 as level_4
union all select 333 as level_1 ,331 as level_2 ,null as level_3,null as level_4
union all select 111 as level_1 ,111 as level_2 ,111 as level_3,null as level_4) qq
on qq.level_1 = ci.level_1 and qq.level_2 = ci.level_2 and nvl(qq.level_3,-1) = nvl(ci.level_3,-1) and nvl(qq.level_4,-1) = nvl(ci.level_4,-1)

添加了nvl条件过滤的数据条数是4
select * from
(select 111 as level_1 ,111 as level_2 ,111 as level_3,111 as level_4
union all select 122 as level_1 ,122 as level_2 ,122 as level_3,122 as level_4
union all select 333 as level_1 ,331 as level_2 ,null as level_3,null as level_4
union all select 111 as level_1 ,111 as level_2 ,111 as level_3,null as level_4) ci
join
(select 111 as level_1 ,111 as level_2 ,111 as level_3,111 as level_4
union all select 122 as level_1 ,122 as level_2 ,122 as level_3,122 as level_4
union all select 333 as level_1 ,331 as level_2 ,null as level_3,null as level_4
union all select 111 as level_1 ,111 as level_2 ,111 as level_3,null as level_4) qq
on qq.level_1 = ci.level_1 and qq.level_2 = ci.level_2 and qq.level_3 = ci.level_3 and qq.level_4 = ci.level_4

没有添加nvl函数过滤的 只有两条数据 .
版权声明:本文为weixin_42861725原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接和本声明。