Hive: How to Write EXISTS/IN and NOT EXISTS/NOT IN Clauses

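Older versions of Hive (before 0.13) do not support subqueries in the WHERE clause, so the EXISTS and IN clauses commonly used in SQL have to be rewritten. The rewrite is fairly simple. Consider the following SQL query: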

SELECT a.key, a.value
FROM a
WHERE a.key in
(SELECT b.key
FROM b);

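This can be rewritten as:

SELECT a.key, a.value
FROM a LEFT OUTER JOIN b ON (a.key = b.key)
-- NULL must be tested with IS NOT NULL; "b.key <> NULL" never evaluates to true
WHERE b.key IS NOT NULL;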
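To make the NULL handling in this rewrite concrete, here is a minimal sketch that runs the IN query and the outer-join rewrite against a small in-memory database. SQLite (through Python's `sqlite3` module) stands in for Hive purely as an assumption for illustration, and the sample rows are made up. It also shows that filtering with `<> NULL` returns nothing, because a comparison with NULL is never true.

```python
import sqlite3

# SQLite stands in for Hive here (an assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (key INTEGER, value TEXT);
    CREATE TABLE b (key INTEGER);
    INSERT INTO a VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO b VALUES (1), (3);
""")

# Original form: IN with a subquery.
in_rows = conn.execute(
    "SELECT a.key, a.value FROM a "
    "WHERE a.key IN (SELECT b.key FROM b) ORDER BY a.key"
).fetchall()

# Rewritten form: outer join, keeping only rows that found a match.
join_rows = conn.execute(
    "SELECT a.key, a.value FROM a "
    "LEFT OUTER JOIN b ON (a.key = b.key) "
    "WHERE b.key IS NOT NULL ORDER BY a.key"
).fetchall()

# Comparing with NULL using <> yields NULL (unknown), so no row passes.
bad_rows = conn.execute(
    "SELECT a.key, a.value FROM a "
    "LEFT OUTER JOIN b ON (a.key = b.key) "
    "WHERE b.key <> NULL"
).fetchall()

print(in_rows)    # [(1, 'one'), (3, 'three')]
print(join_rows)  # [(1, 'one'), (3, 'three')]
print(bad_rows)   # []
```

The two forms agree here only because the keys in `b` are unique. With duplicate keys in `b`, the join form would repeat the matching rows of `a`.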

A more efficient implementation, which also avoids duplicate rows when b.key is not unique, uses LEFT SEMI JOIN:

SELECT a.key, a.value
FROM a LEFT SEMI JOIN b ON (a.key = b.key);

LEFT SEMI JOIN is available in Hive 0.5.0 and later. For an explanation of Hive's LEFT SEMI JOIN, see https://blog.csdn.net/happyrocking/article/details/79885071
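The difference between the two rewrites shows up when `b` contains duplicate keys. The following plain-Python sketch models the semantics only; it is not how Hive executes the join, and the function names and sample data are hypothetical. A semi join emits each row of `a` at most once, which matches the behaviour of IN, whereas the filtered outer join emits one row per matching pair.

```python
# Hypothetical sample data: rows of a as (key, value); keys of b with a duplicate.
a_rows = [(1, "one"), (2, "two"), (3, "three")]
b_keys = [1, 1, 3]


def left_outer_join_filtered(a_rows, b_keys):
    """Model of LEFT OUTER JOIN ... WHERE b.key IS NOT NULL.

    Produces one output row for every matching (a row, b key) pair.
    """
    return [(k, v) for (k, v) in a_rows for bk in b_keys if k == bk]


def left_semi_join(a_rows, b_keys):
    """Model of LEFT SEMI JOIN.

    Emits a row of a once if at least one matching key exists in b.
    """
    b_set = set(b_keys)
    return [(k, v) for (k, v) in a_rows if k in b_set]


outer_result = left_outer_join_filtered(a_rows, b_keys)
semi_result = left_semi_join(a_rows, b_keys)

print(outer_result)  # [(1, 'one'), (1, 'one'), (3, 'three')]
print(semi_result)   # [(1, 'one'), (3, 'three')]
```

Note that in Hive the right-hand table of a LEFT SEMI JOIN may only be referenced in the ON condition, not in the SELECT list or the WHERE clause.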

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------

NOT EXISTS example


select a, b
  from table1 t1
 where not exists (select 1
          from table2 t2
         where t1.a = t2.a
           and t1.b = t2.b)

This can be rewritten as:



select t1.a, t1.b
  from table1 t1
  left join table2 t2
    on (t1.a = t2.a and t1.b = t2.b)
 where t2.a is null
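As a quick check of this anti-join pattern, the sketch below runs a NOT EXISTS query and a LEFT JOIN ... IS NULL query side by side. SQLite again stands in for Hive as an assumption, and the sample rows are invented. Both output columns are taken from `t1`, because every `t2` column is NULL in the rows that survive the filter.

```python
import sqlite3

# SQLite stands in for Hive here (an assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a INTEGER, b INTEGER);
    CREATE TABLE table2 (a INTEGER, b INTEGER);
    INSERT INTO table1 VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO table2 VALUES (1, 10), (2, 99);
""")

# Rows of table1 with no row in table2 matching on both a and b.
not_exists_rows = conn.execute("""
    SELECT a, b FROM table1 t1
    WHERE NOT EXISTS (SELECT 1 FROM table2 t2
                      WHERE t1.a = t2.a AND t1.b = t2.b)
    ORDER BY a
""").fetchall()

# Anti-join: keep only the table1 rows whose outer join found no partner.
anti_join_rows = conn.execute("""
    SELECT t1.a, t1.b FROM table1 t1
    LEFT JOIN table2 t2 ON (t1.a = t2.a AND t1.b = t2.b)
    WHERE t2.a IS NULL
    ORDER BY t1.a
""").fetchall()

print(not_exists_rows)  # [(2, 20), (3, 30)]
print(anti_join_rows)   # [(2, 20), (3, 30)]
```

The row (2, 20) is kept because `table2` has a row with `a = 2` but a different `b`, so the two-column join condition does not match.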

 


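Copyright notice: This is an original article by zhangge360, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.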