Speed up dot product of symmetric matrices #24
While we can safely skip the diagonal for the self dot-product (it contains only ones due to the L2-normalization of the rows, and the absolute similarity to self is not a real neighbour), reusing values from the upper-triangular part when calculating the lower-triangular part is, in my opinion, not that easy because of the top-N pruning of the neighbours. When calculating the similarity for (x, y), we can linearly scan the already-calculated neighbours to see whether they contain the similarity for (y, x) when x < y, but even if they do not, we cannot assume it is zero or below the threshold, and we need to calculate the similarity for (x, y) anyway. The reason is that the similarity may not have fit into the top N for row y but may still fit into the top N for row x. And the likelihood that we will have to calculate the similarity anyway is quite high, so in most cases we would just make things worse by scanning the symmetric row's neighbours.
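As a hedged illustration of why the lookup does not remove the need to recompute, here is a minimal sketch; the names (`Neighbour`, `find_cached`, `similarity_xy`) are hypothetical and not from sparse_dot_topn. A hit in row y's pruned top-N list can be reused, but a miss tells us nothing, so the fall-through still has to compute the value.

```cpp
// Hypothetical sketch: try to reuse the already-pruned neighbours of row y
// when computing row x. Names are illustrative, not from the library.
#include <optional>
#include <vector>

struct Neighbour { int index; double similarity; };

// Linear scan of the top-N neighbours already stored for row y.
std::optional<double> find_cached(const std::vector<Neighbour>& row_y, int x) {
    for (const auto& n : row_y) {
        if (n.index == x) return n.similarity;
    }
    return std::nullopt;  // pruned away or below threshold -- value unknown
}

double similarity_xy(int x, int y,
                     const std::vector<std::vector<Neighbour>>& topn,
                     double (*compute)(int, int)) {
    if (auto cached = find_cached(topn[y], x)) {
        return *cached;  // lucky: (y, x) survived top-N pruning of row y
    }
    // A miss does NOT mean zero: the value may simply not have fit into
    // row y's top N, yet still belong in row x's top N, so recompute.
    return compute(x, y);
}
```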
Update: We can do it differently - iterate over all rows in the outermost loop, iterate over a portion of the row in the inner loop (the slice gets shorter by one every row, i.e. an upper-triangular matrix), and store each calculated similarity to both symmetric positions (x, y) and (y, x). This way we don't have to deal with the top-N trimming of the neighbours and similarities. But it makes things more complicated when using multiple threads: while we could divide the work among N threads by splitting the outermost loop (each thread processes its portion of rows), the shortening of the "columns" in the inner loop means the work would not be distributed evenly among the threads. What's worse: originally we had an embarrassingly parallel algorithm with no data races among threads (all threads read the same matrix, but each thread writes to its own portion of the resulting neighbours and similarities); writing the symmetric similarity breaks this property, and the threads now need to lock the resulting neighbours/similarities to maintain thread safety. The additional complexity may still pay off, but maybe not that much.
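A minimal dense sketch of that upper-triangular scheme, assuming a plain row-major matrix rather than the library's CSR sparse representation (all names here are illustrative only). The comment on the symmetric write marks exactly where the thread-safety problem described above appears.

```cpp
// Dense sketch of the upper-triangular self dot-product with symmetric writes.
#include <cstddef>
#include <vector>

void self_dot_upper(const std::vector<std::vector<double>>& A,
                    std::vector<std::vector<double>>& S) {
    const int n = static_cast<int>(A.size());
    for (int x = 0; x < n; ++x) {              // outermost loop over rows
        for (int y = x + 1; y < n; ++y) {      // inner slice shrinks by one each row
            double s = 0.0;
            for (std::size_t k = 0; k < A[x].size(); ++k) s += A[x][k] * A[y][k];
            S[x][y] = s;   // upper-triangular position, owned by the current row
            S[y][x] = s;   // symmetric write: with threads, row y may belong to
                           // another thread, so this write needs synchronisation
        }
    }
}
```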
How about adding a set? Update: Hmm, I realize this is a bad idea for sparse computation... Maybe you can iterate over (x, y) diagonally. While iterating, save results to both (x, y) and (y, x).
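One possible reading of "iterate diagonally", sketched under the assumption that it means walking the upper triangle by offset d = y - x; the concrete example this comment referred to is not visible in the thread, so this is only a guess at the traversal order, not the library's code.

```cpp
// Sketch of a diagonal traversal of the upper triangle: for each offset d,
// every pair (x, x + d) is visited once and stored to both symmetric positions.
#include <cstdio>

void diagonal_order(int n) {
    for (int d = 1; d < n; ++d) {            // d-th diagonal above the main one
        for (int x = 0; x + d < n; ++x) {
            int y = x + d;
            // compute similarity(x, y) here, then store to (x, y) and (y, x)
            std::printf("visit (%d, %d), store to (%d, %d) and (%d, %d)\n",
                        x, y, x, y, y, x);
        }
    }
}
```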
I am no expert in matrix computation.
For a symmetric matrix output, skipping computation of entries in the lower-triangular area will speed up the computation by roughly a factor of two. This happens a lot when you are trying to find the best matches within the same set, i.e. a self dot-product.
For example, at line 100 of https://github.com/ing-bank/sparse_dot_topn/blob/master/sparse_dot_topn/sparse_dot_topn_source.cpp, can we add a condition to check if `head < i`, with a flag check on an additional argument `symmetric`?
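A hedged sketch of what such a guard could look like, reusing `head`, `i`, and `symmetric` from the wording above; the surrounding loop and the `candidate_cols`/`values` layout are simplified stand-ins, not the actual code in sparse_dot_topn_source.cpp.

```cpp
// Sketch: while collecting results for row i, skip lower-triangular entries
// when the caller has declared the product symmetric.
#include <cstddef>
#include <utility>
#include <vector>

void collect_row(int i,
                 const std::vector<int>& candidate_cols,   // columns with nonzero products
                 const std::vector<double>& values,        // corresponding values
                 bool symmetric,
                 std::vector<std::pair<int, double>>& out) {
    for (std::size_t k = 0; k < candidate_cols.size(); ++k) {
        int head = candidate_cols[k];
        if (symmetric && head < i) continue;  // skip the lower-triangular half
        out.emplace_back(head, values[k]);
    }
}
```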