-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] Fix CAGRA filter #489
base: branch-24.12
Are you sure you want to change the base?
Conversation
Can you add a test that would prevent regression? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, @enp1s0!
I'm a little bit confused with the description. Do I understand it right that this PR contains two fixes: (1) make the bitonic sort array always a power-of-two, (2) move filtered elements to the end of the topk buffer?
The big chunk of the PR addresses (2), but that should be irrelevant for #472, because in that bug no elements are filtered out.
Therefore, I think, it would be really beneficial to construct a reproducer for #472 as a test case in this PR and make sure it's fixed with the introduced change.
Also, (1) did you have a chance to check if this affects the QPS? (2) do we need a similar fix for multi-cta and multi-kernel versions of CAGRA? |
@achirkin, thank you for your comment, and I'm sorry for the bad PR description. I updated it.
No, this PR changes the filtering process so that the bitonic sort is not used to move the invalid elements to the end of the buffer. In the current search implementation, the bitonic sort is used to move the invalid elements as: cuvs/cpp/src/neighbors/detail/cagra/search_single_cta_kernel-inl.cuh Lines 758 to 763 in 5062594
cuvs/cpp/src/neighbors/detail/cagra/search_single_cta_kernel-inl.cuh Lines 644 to 649 in 5062594
The problem is that the (max) array length (= Although, as you mentioned, making the bitonic sort array always a power-of-two is an alternative way to fix this issue, I didn't do it because 1) the array elements except the filtered-out nodes are already sorted, and 2) more registers are required that will not be used but required to make the bitonic sort array power-of-two. Also, this bug is the cause of a problem in the CAGRA filtering unit test: cuvs/cpp/test/neighbors/ann_cagra.cuh Line 762 in 5062594
When itop_k is not specified, the default value, 64, is used. The graph degree is also 64. Therefore, MAX_ITOPK (64) + MAX_CANDIDATES (64) equals 128, and the bitonic sort works correctly in this case. However, if itopk size is set to another value, the bitonic sort does not work.
Yes, so I reenabled the test in this PR by changing the following lines to set the itopk size correctly. cuvs/cpp/test/neighbors/ann_cagra.cuh Lines 762 to 765 in 5062594
I measured the performance of no filtering out search (the same situation as #472 )
No.
|
Ref : #472
The cause of the bug
The bitonic sort was used on an array that was not a power of 2 long. In the current search implementation, the bitonic sort is used to move the invalid elements to the end of the buffer as:
cuvs/cpp/src/neighbors/detail/cagra/search_single_cta_kernel-inl.cuh
Lines 758 to 763 in 5062594
cuvs/cpp/src/neighbors/detail/cagra/search_single_cta_kernel-inl.cuh
Lines 644 to 649 in 5062594
The problem is that the (max) array length (=
MAX_ITOPK + MAX_CANDIDATES
) is not always the power of two.These bitonic sorts are called even if no elements are filtered out unless
cuvs::neighbors::filtering::none_sample_filter
is specified as the filter, so #472 occurs.Fix
This PR changes the filtering process so that the bitonic sort is not used to move the invalid elements to the end of the buffer.