Results of simhash.find_all() #44
Comments
They should, but the current implementation in I seem to recall the use of |
Oddly, this line seems to indicate that the input is a |
It also seems odd to me that the input hashes are passed as a non-const reference, especially since the code doesn't seem to modify that argument directly, but rather only copies from it.
Any news on that? I came across the same behavior - two identical hashes are not seen as duplicates by find_all
It seems to me that the easiest fix is to change it from |
Any update on this issue? It would be great if it were fixed.
@dlecocq I changed simhash.cpp
Getting error:
Can you show me how you made it?
I have the following code, where 'a' and 'b' have the same value. However, the simhash.find_all() method does not report them as a simhash pair.
import simhash

a = 8550830854347186281
b = 8550830854347186281
print("Inputs differ in " + str(simhash.num_differing_bits(a, b)) + " bits.")
# Arguments: hashes, number of blocks, maximum number of differing bits
all_simhash_pairs = simhash.find_all([a, b], 2, 1)
print("Simhash Pair counts = " + str(len(all_simhash_pairs)))
Shouldn't they be included in the results? Or maybe I didn't understand simhashing properly.
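For reference, the expectation described above can be checked with a brute-force comparison in plain Python. This is only a sketch of what the question assumes the result should be, not the library's implementation: `brute_force_pairs` is a hypothetical helper, and the local `num_differing_bits` is a stand-in that counts the set bits of the XOR, which is the usual definition of the bit difference between two hashes.

```python
from itertools import combinations


def num_differing_bits(x: int, y: int) -> int:
    """Hamming distance between two integer hashes (popcount of the XOR)."""
    return bin(x ^ y).count("1")


def brute_force_pairs(hashes, different_bits):
    """Return every pair of entries that differ in at most `different_bits` bits.

    Hypothetical reference helper: it compares entries by position, so two
    entries holding the same value still form a pair. It is O(n^2) and only
    meant for sanity-checking small inputs.
    """
    return [
        (x, y)
        for x, y in combinations(hashes, 2)
        if num_differing_bits(x, y) <= different_bits
    ]


a = 8550830854347186281
b = 8550830854347186281

print(num_differing_bits(a, b))            # 0
print(len(brute_force_pairs([a, b], 1)))   # 1
```

Under this reading, two identical hashes differ in 0 bits and therefore fall within any non-negative threshold, which is why the question expects one pair rather than none.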