[BUG] Fix CAGRA filter #489

Open
wants to merge 11 commits into
base: branch-24.12

Conversation

enp1s0
Member

@enp1s0 enp1s0 commented Nov 23, 2024

Ref : #472

The cause of the bug

The bitonic sort was applied to an array whose length is not a power of two. In the current search implementation, the bitonic sort is used to move the invalid elements to the end of the buffer as follows:

topk_by_bitonic_sort_1st<MAX_ITOPK + MAX_CANDIDATES>(
result_distances_buffer,
result_indices_buffer,
internal_topk + search_width * graph_degree,
top_k,
false);

topk_by_bitonic_sort_1st<MAX_ITOPK + MAX_CANDIDATES>(
result_distances_buffer,
result_indices_buffer,
internal_topk + search_width * graph_degree,
internal_topk,
false);

The problem is that the (max) array length (= MAX_ITOPK + MAX_CANDIDATES) is not always a power of two.
Unless cuvs::neighbors::filtering::none_sample_filter is specified as the filter, these bitonic sorts are called even when no elements are filtered out, which is why #472 occurs.
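
To make the failure mode concrete, the sketch below runs a textbook bitonic sorting network on the host. It is not the cuVS kernel: `bitonic_sort_network` and `bitonic_sort_padded` are hypothetical names, and skipping out-of-range partners is just one naive way to apply the network to an arbitrary length. The point is only that the network sorts correctly when the length is a power of two and can leave the array unsorted otherwise.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Textbook iterative bitonic sorting network (ascending order).
// Correctness is only guaranteed when a.size() is a power of two.
// Compare partners that fall outside the array are skipped (naive handling).
inline void bitonic_sort_network(std::vector<float>& a)
{
  const std::size_t n = a.size();
  for (std::size_t k = 2; k <= n; k *= 2) {       // size of the sequences being merged
    for (std::size_t j = k / 2; j > 0; j /= 2) {  // distance between compared elements
      for (std::size_t i = 0; i < n; ++i) {
        const std::size_t partner = i ^ j;
        if (partner <= i || partner >= n) { continue; }
        const bool ascending    = (i & k) == 0;
        const bool out_of_order = ascending ? (a[i] > a[partner]) : (a[i] < a[partner]);
        if (out_of_order) { std::swap(a[i], a[partner]); }
      }
    }
  }
}

// Workaround sketch: pad to the next power of two with +infinity,
// run the network, then drop the padding again.
inline std::vector<float> bitonic_sort_padded(std::vector<float> a)
{
  const std::size_t n = a.size();
  std::size_t padded  = 1;
  while (padded < n) { padded *= 2; }
  assert((padded & (padded - 1)) == 0);  // padded length is a power of two
  a.resize(padded, std::numeric_limits<float>::infinity());
  bitonic_sort_network(a);
  a.resize(n);
  return a;
}
```

With the length-3 input {3, 2, 1}, `bitonic_sort_network` stops after a single stage and leaves {2, 3, 1}, whereas `bitonic_sort_padded` returns {1, 2, 3} at the cost of one padding slot that carries no data.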

Fix

This PR changes the filtering process so that the bitonic sort is not used to move the invalid elements to the end of the buffer.

@enp1s0 enp1s0 requested a review from a team as a code owner November 23, 2024 15:58
@enp1s0 enp1s0 self-assigned this Nov 23, 2024
@github-actions github-actions bot added the cpp label Nov 23, 2024
@enp1s0 enp1s0 added the bug (Something isn't working) and non-breaking (Introduces a non-breaking change) labels Nov 23, 2024
@enp1s0 enp1s0 changed the title from "Fix CAGRA filter" to "[BUG] Fix CAGRA filter" Nov 23, 2024
@lowener
Contributor

lowener commented Nov 24, 2024

Can you add a test that would prevent regression?

Contributor

@achirkin achirkin left a comment

Thanks for the PR, @enp1s0!
I'm a little bit confused by the description. Do I understand it right that this PR contains two fixes: (1) make the bitonic sort array always a power-of-two, (2) move filtered elements to the end of the topk buffer?
The big chunk of the PR addresses (2), but that should be irrelevant for #472, because in that bug no elements are filtered out.
Therefore, I think, it would be really beneficial to construct a reproducer for #472 as a test case in this PR and make sure it's fixed with the introduced change.

@achirkin
Contributor

achirkin commented Nov 26, 2024

Also, (1) did you have a chance to check if this affects the QPS? (2) do we need a similar fix for multi-cta and multi-kernel versions of CAGRA?

@enp1s0
Member Author

enp1s0 commented Nov 26, 2024

@achirkin, thank you for your comment, and I'm sorry for the bad PR description. I updated it.

> Do I understand it right that this PR contains two fixes: (1) make the bitonic sort array always a power-of-two, (2) move filtered elements to the end of the topk buffer?

No, this PR changes the filtering process so that the bitonic sort is not used to move the invalid elements to the end of the buffer. In the current search implementation, the bitonic sort is used to move the invalid elements as:

topk_by_bitonic_sort_1st<MAX_ITOPK + MAX_CANDIDATES>(
result_distances_buffer,
result_indices_buffer,
internal_topk + search_width * graph_degree,
top_k,
false);

topk_by_bitonic_sort_1st<MAX_ITOPK + MAX_CANDIDATES>(
result_distances_buffer,
result_indices_buffer,
internal_topk + search_width * graph_degree,
internal_topk,
false);

The problem is that the (max) array length (= MAX_ITOPK + MAX_CANDIDATES) is not always a power of two.
Unless cuvs::neighbors::filtering::none_sample_filter is specified as the filter, the second bitonic sort is called even when no elements are filtered out, which is why #472 occurs.

As you mentioned, making the bitonic sort array always a power of two long is an alternative way to fix this issue. I didn't do it because 1) the array elements other than the filtered-out nodes are already sorted, so a full sort is unnecessary, and 2) padding the array to a power-of-two length would require additional registers that hold no useful data.
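
Because the surviving entries are already in sorted order, a stable compaction is enough to push the filtered-out entries to the tail. The following is a sequential host-side sketch of that idea, not the code in this PR: `kInvalidIndex`, `move_invalid_to_end`, and the choice to mark filtered entries with a sentinel index are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical sentinel marking a filtered-out entry (assumption for this sketch).
constexpr std::uint32_t kInvalidIndex = std::numeric_limits<std::uint32_t>::max();

// Stable in-place compaction of a (distance, index) buffer.
// Valid entries move to the front and keep their relative order, so a buffer
// whose valid entries were sorted stays sorted. The tail is filled with the
// sentinel index and the largest float. Returns the number of valid entries.
inline std::size_t move_invalid_to_end(std::vector<float>& distances,
                                       std::vector<std::uint32_t>& indices)
{
  assert(distances.size() == indices.size());
  std::size_t write = 0;
  for (std::size_t read = 0; read < indices.size(); ++read) {
    if (indices[read] == kInvalidIndex) { continue; }  // skip filtered-out entry
    distances[write] = distances[read];
    indices[write]   = indices[read];
    ++write;
  }
  for (std::size_t i = write; i < indices.size(); ++i) {
    distances[i] = std::numeric_limits<float>::max();
    indices[i]   = kInvalidIndex;
  }
  return write;
}
```

This makes a single pass over the buffer and needs no padding, which is the trade-off argued for above; how the same effect is achieved inside the GPU kernel is not shown here.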

Also, this bug is the cause of a problem in the CAGRA filtering unit test:

// TODO: setting search_params.itopk_size here breaks the filter tests, but is required for

When itopk_size is not specified, the default value of 64 is used. The graph degree is also 64. Therefore, MAX_ITOPK (64) + MAX_CANDIDATES (64) equals 128, a power of two, and the bitonic sort works correctly in this case. However, if itopk_size is set to another value, the sum may no longer be a power of two, and the bitonic sort does not work correctly.
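
The arithmetic here can be checked with a small sketch. The helper names below are hypothetical, and the value 128 is only an example of a non-default setting, not a value taken from the test.

```cpp
#include <cassert>
#include <cstdint>

// True when n is a power of two (non-zero with exactly one bit set).
constexpr bool is_power_of_two(std::uint32_t n) { return n != 0 && (n & (n - 1)) == 0; }

// Length of the buffer handed to the bitonic sort: MAX_ITOPK + MAX_CANDIDATES.
constexpr std::uint32_t sort_length(std::uint32_t max_itopk, std::uint32_t max_candidates)
{
  return max_itopk + max_candidates;
}

// Number of extra slots needed to pad n up to the next power of two.
inline std::uint32_t padding_slots(std::uint32_t n)
{
  assert(n > 0 && n <= (1u << 31));  // keeps the loop below from overflowing
  std::uint32_t padded = 1;
  while (padded < n) { padded <<= 1; }
  return padded - n;
}

static_assert(is_power_of_two(sort_length(64, 64)), "64 + 64 = 128 is a power of two");
static_assert(!is_power_of_two(sort_length(128, 64)), "128 + 64 = 192 is not a power of two");
```

For a length of 192, `padding_slots` gives 64, so padding to 256 would add 64 slots that never hold useful data.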

> I think, it would be really beneficial to construct a reproducer for #472 as a test case in this PR and make sure it's fixed with the introduced change.

Yes. I re-enabled the test in this PR by changing the following lines so that the itopk size is set correctly.

// TODO: setting search_params.itopk_size here breaks the filter tests, but is required for
// k>1024 skip these tests until fixed
if (ps.k >= 1024) { GTEST_SKIP(); }
// search_params.itopk_size = ps.itopk_size;

> did you have a chance to check if this affects the QPS?

I measured the performance of a search in which no elements are filtered out (the same situation as #472):

[Image: filtering-bug]

> do we need a similar fix for multi-cta and multi-kernel versions of CAGRA?

No.

  • In the multi-CTA case, a bitonic sort over a power-of-two-length array is used to move the invalid elements, so no change is needed. (We use a bitonic sort here because the array is relatively small (32 + graph_degree), so it would not increase register pressure.)
  • In the multi-kernel case, the _find_topk routine is called, so this bug does not apply.
