Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CDLP optimizations #94

Open
4 tasks
szarnyasg opened this issue Jun 20, 2020 · 3 comments
Open
4 tasks

CDLP optimizations #94

szarnyasg opened this issue Jun 20, 2020 · 3 comments
Labels
optimization Optimization of existing algorithms

Comments

@szarnyasg
Copy link
Contributor

szarnyasg commented Jun 20, 2020

While doing 51fd40f, I observed a few opportunities for optimizing the CDLP algorithm:

  • Document that the matrix changes through the execution (but its pattern stays the same)
  • Use a more efficient approach to sanitize (GrB_transpose/GrB_assign)
  • Use a more efficient approach for computing l = diag^{-1} (L), e.g. GrB_extractTuples

Additionally:

  • Add non-deterministic mode. In practice, this works just as well as the deterministic variant (which selects the "min mode" from the labels) but could faster. Update: It might even work better, see Non-deterministic CDLP algorithm #101.
@szarnyasg szarnyasg self-assigned this Jun 20, 2020
@szarnyasg szarnyasg added the optimization Optimization of existing algorithms label Jun 21, 2020
@szarnyasg
Copy link
Contributor Author

szarnyasg commented Jun 29, 2020

I was thinking about we could parallelize the "min mode" selection in the CDLP algorithm. A possible approach would be:

  1. Run a row-wise sort on the matrix value. (This is probably not possible when only using the GrB API, we need extract, msort and build.)

  2. Convert the matrix to <leftModeVal, leftModeCount, midModeVal, MidModeCount, rightModeVal, rightModeCount> sextuples using an apply operator f(x) = <x, 1, NA, NA, NA, NA>

  3. Run a row-wise reduce which merges these:

    • x = <a, a_cnt, b, b_cnt1, c, c_cnt1>, y = <b, b_cnt2, c, c_cnt2, d, d_cnt> --> <a, a_cnt, [select b/c with the largest count from b_cnt1+b_cnt2 and c_cnt1+c_cnt2], d, d_cnt>
    • x = <a, a_cnt, b, b_cnt, c, c_cnt1>, y = <c, c_cnt2, d, d_cnt, e, e_cnt> --> <a, a_cnt, [select b/c/d with the largest count, b_cnt, c_cnt1+c_cnt2, and d_cnt], d, d_cnt
    • x = <a, a_cnt, b, b_cnt, c, c_cnt>, y = <d, d_cnt, e, e_cnt, f, f_cnt> --> <a, a_cnt, [select b/c/d/e with the largest count], f, f_cnt>
  4. The minimum mode value can be picked from <a, a_cnt, b, b_cnt, c, c_cnt>, based on the largest count and smallest value.

Alternatively, we could parallelize the current RLE loop. See e.g. https://erkaman.github.io/posts/cuda_rle.html.

@szarnyasg szarnyasg removed their assignment Mar 31, 2023
@aaronzo
Copy link

aaronzo commented Jun 7, 2024

Hi Gábor :) @szarnyasg
I appreciate it's been a while since this thread - I am looking to implement your algorithm above. I think the rowwise-reduction calculation of the most common element is slightly flawed - in the sextuple the right-most element seems never to become not "NA".

I think it's possible with 4 elems in a tuple - what do you think about the following?

def acc(a, b):
    a_left_value, a_left_count, a_right_value, a_right_count = a
    b_left_value, b_left_count, b_right_value, b_right_count = b
    
    right_value, right_count = b_right_value, b_right_count
    if a_right_value == b_left_count:
        left_value, left_count = a_right_value, a_right_count + b_left_count
    
        if left_value == b_right_value:
            right_value, right_count = left_value, left_count
        
        if a_left_count > left_count:
            left_value, left_count = a_left_value, a_left_count 
    
    else:
        if a_left_count > b_left_count:
            left_value, left_count = a_left_value, a_left_count
        else:
            left_value, left_count = b_left_value, b_left_count
    
    return left_value, left_count, right_value, right_count

@szarnyasg
Copy link
Contributor Author

Hi @aaronzo, unfortunately, I have stopped contributing to LAGraph due to my other obligations. Your algorithm looks good! I hope it will work both in terms of correctness / performance. Cheers, Gabor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
optimization Optimization of existing algorithms
Projects
None yet
Development

No branches or pull requests

2 participants