-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix panic when hashing empty FixedSizeList Array #13533
Fix panic when hashing empty FixedSizeList Array #13533
Conversation
Previously it would panic due to division by zero.
3ea432f
to
28e3b99
Compare
datafusion/common/src/hash_utils.rs
Outdated
let offset_size = if array.len() == 0 { | ||
0 | ||
} else { | ||
value_len as usize / array.len() | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this be equiv to array.value_length()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think value_length is different
Like a FixedSlizeLIst(3)
would have elements that are each 3 bytes, but the array can have any number of elements (rows)
So in that case value_length()
is 3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in this example, value_length() is 3
array.len() is something, say N
so i though that value_len is 3 * N
ie. i kind of suspect this division is calculating what is know from the array type (the 3 in the example)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 https://docs.rs/arrow/latest/arrow/array/struct.FixedSizeBinaryArray.html#method.value_length
I don't understand what this offset_size is really used for to be honest -- it doesn't make a lot of sense below either
It seems the intention is more like offset_size = array.value_length()
🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the intention is more like
offset_size = array.value_length()
🤔
that's my take too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me -- thank you @findepi
datafusion/common/src/hash_utils.rs
Outdated
let offset_size = if array.len() == 0 { | ||
0 | ||
} else { | ||
value_len as usize / array.len() | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think value_length is different
Like a FixedSlizeLIst(3)
would have elements that are each 3 bytes, but the array can have any number of elements (rows)
So in that case value_length()
is 3
69136a2
to
4b46b8a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks @findepi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously it would panic due to division by zero.