Hive version: 2.3.4
WITH behavior AS (
SELECT 'a' AS uid, '20230211' AS dt, 1 AS score
UNION ALL
SELECT 'a' AS uid, '20230212' AS dt, 2 AS score
UNION ALL
SELECT 'a' AS uid, '20230213' AS dt, 3 AS score
UNION ALL
SELECT 'a' AS uid, '20230214' AS dt, 4 AS score
UNION ALL
SELECT 'a' AS uid, '20230215' AS dt, 5 AS score
UNION ALL
SELECT 'a' AS uid, '20230216' AS dt, 6 AS score
)
SELECT
uid, dt, score,
COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS s1,
COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN 3 PRECEDING AND 1 PRECEDING) AS s2,
COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN 1 FOLLOWING AND 2 FOLLOWING) AS s3,
COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) AS s4
FROM behavior;
The output of the above code is as follows:
But my personal understanding is as follows:
Let's take the line dt = '20230214' and ORDER BY specifies that the value comparison field is score, that is, the current value is 4:
UNBOUNDED PRECEDING AND 1 PRECEDING
indicates the range
(-∞, 3]
. The result of s1 should be
[1, 2, 3]
. But the result is
[1,2]
is not expected
3 PRECEDING AND 1 PRECEDING
indicates the range is
[1, 3]
, s2 should be
[1, 2, 3]
,
1 FOLLOWING AND 2 FOLLOWING
indicates the range
UNBOUNDED PRECEDING AND 1 PRECEDING
0, s3 should be
UNBOUNDED PRECEDING AND 1 PRECEDING
1,
UNBOUNDED PRECEDING AND 1 PRECEDING
3 indicates the range
UNBOUNDED PRECEDING AND 1 PRECEDING
4, s4 should be
UNBOUNDED PRECEDING AND 1 PRECEDING
1, But the result is
UNBOUNDED PRECEDING AND 1 PRECEDING
2 does not meet expectations
Could you please explain the misunderstanding?
0 Answer
No answer yet
这家伙很懒,什么都没留下...