运维开发网

Hive Explode / Lateral查看多个数组

运维开发网 https://www.qedev.com 2020-05-18 14:56 出处:网络
我有一个具有以下架构的配置单元表: COOKIE | PRODUCT_ID | CAT_ID | QTY 1234123 [1,2,3] [r,t,null] [2,1,null] 我如何规范化数组,以便得到以下结果 COOKIE | PRODUCT_ID | CAT_ID | QTY 1234123 [1] [r] [2
我有一个具有以下架构的配置单元表:

COOKIE  | PRODUCT_ID | CAT_ID |    QTY    
1234123   [1,2,3]    [r,t,null]  [2,1,null]

我如何规范化数组,以便得到以下结果

COOKIE  | PRODUCT_ID | CAT_ID |    QTY

1234123   [1]          [r]         [2]

1234123   [2]          [t]         [1] 

1234123   [3]          null       null

我尝试过以下方法:

select concat_ws('|',visid_high,visid_low) as cookie
,pid
,catid 
,qty
from table
lateral view explode(productid) ptable as pid
lateral view explode(catalogId) ptable2 as catid 
lateral view explode(qty) ptable3 as qty

然而结果是笛卡尔积.

您可以使用Brickhouse( http://github.com/klout/brickhouse)中的numeric_range和array_index UDF来解决此问题.有一篇内容丰富的博客文章详细描述了 http://brickhouseconfessions.wordpress.com/2013/03/07/exploding-multiple-arrays-at-the-same-time-with-numeric_range/

使用这些UDF,查询将是这样的

select cookie,
   array_index( product_id_arr, n ) as product_id,
   array_index( catalog_id_arr, n ) as catalog_id,
   array_index( qty_id_arr, n ) as qty
from table
lateral view numeric_range( size( product_id_arr )) n1 as n;
0

精彩评论

暂无评论...
验证码 换一张
取 消