On HDDs, sequential access is relatively fast, while random access is terribly slow. That's why duplicut, written back in 2014, was optimized with this in mind.
At that time, it made no sense to have multiple threads concurrently reading a massive wordlist's content, so sequential access with a single thread performed better when all lines could fit in the hashmap at once.
Now that we have entered the SSD era, concurrency could bring great performance gains, as random access is much faster.
My idea was to continue reading the input (or previously-written part of output when we're low on RAM) sequentially (tricky to do otherwise when the input is lines of varying lengths), but buffer it rather than process it against the hash table line by line. Once the buffer fills up, process it with multiple threads (mark for removal entries that are seen in the global hash table). Then repeat for the next buffer's worth of input. I guess a reasonable buffer size can be a few MB (maybe similar to L3 cache size). A complication is dealing with duplicates within a buffer - perhaps that needs to be taken care of separately, maybe using a separate smaller hash table, and maybe sequentially.
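To make the buffered idea concrete, here is a minimal sketch in C with OpenMP. It is not duplicut's actual code: the `line_set` type, `TABLE_SLOTS`, and `dedupe_batch` are hypothetical names, the hash table is a toy fixed-size one, and the buffer is assumed to be already split into NUL-terminated lines. Duplicates within a buffer are handled by a sequential insert pass, which is only one of the options mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy table size: a power of two, larger than the total line count. */
#define TABLE_SLOTS 1024

typedef struct {
    const char *slots[TABLE_SLOTS];   /* NULL marks an empty slot */
} line_set;

/* FNV-1a hash over a NUL-terminated line. */
static uint64_t hash_line(const char *s)
{
    uint64_t h = 14695981039346656037ULL;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 1099511628211ULL;
    }
    return h;
}

/* Read-only lookup with linear probing; safe to call from many threads
 * as long as nobody inserts at the same time. */
static bool set_contains(const line_set *set, const char *line)
{
    size_t i = hash_line(line) & (TABLE_SLOTS - 1);
    while (set->slots[i] != NULL) {
        if (strcmp(set->slots[i], line) == 0)
            return true;
        i = (i + 1) & (TABLE_SLOTS - 1);
    }
    return false;
}

/* Sequential insert; returns false if the line was already present. */
static bool set_insert(line_set *set, const char *line)
{
    size_t i = hash_line(line) & (TABLE_SLOTS - 1);
    while (set->slots[i] != NULL) {
        if (strcmp(set->slots[i], line) == 0)
            return false;
        i = (i + 1) & (TABLE_SLOTS - 1);
    }
    set->slots[i] = line;
    return true;
}

/* Process one buffer's worth of lines; returns how many lines are kept.
 * remove[i] is set to true for every line that should be dropped. */
static size_t dedupe_batch(line_set *seen, const char **lines, size_t n,
                           bool *remove)
{
    /* Phase 1 (parallel): mark lines already present in the global table.
     * The table is only read here, so no locking is needed. */
    #pragma omp parallel for schedule(static)
    for (size_t i = 0; i < n; i++)
        remove[i] = set_contains(seen, lines[i]);

    /* Phase 2 (sequential): insert survivors. A failed insert means the
     * line is a duplicate of an earlier line in this same buffer. */
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (!remove[i] && !set_insert(seen, lines[i]))
            remove[i] = true;
        if (!remove[i])
            kept++;
    }
    return kept;
}
```

A caller would zero-initialize a `line_set`, then call `dedupe_batch` once per filled buffer. Without `-fopenmp` the pragma is ignored and the code runs single-threaded with the same result. Whether phase 2 becomes the bottleneck depends on how many lines survive phase 1, so this would need measuring.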
Random reading of input is also possible, perhaps skipping until the start of a new line and somehow processing the partial lines on block boundaries separately.
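For the random-access variant, the boundary handling could look roughly like the sketch below. It assumes the whole input is addressable in memory (for example via `mmap`), and `align_to_line_start` and `chunk_bounds` are hypothetical helper names. Instead of processing partial lines separately, this variant assigns a line that straddles a block boundary to the block in which it starts, which is a simplification of what is described above.

```c
#include <stddef.h>
#include <string.h>

/* Move pos forward to the first byte of a line.
 * Offset 0 and any offset right after '\n' are already line starts. */
static size_t align_to_line_start(const char *buf, size_t size, size_t pos)
{
    if (pos >= size)
        return size;
    if (pos == 0 || buf[pos - 1] == '\n')
        return pos;
    const char *nl = memchr(buf + pos, '\n', size - pos);
    return nl ? (size_t)(nl - buf) + 1 : size;
}

/* Compute the half-open byte range [*start, *end) owned by chunk idx
 * out of nchunks. Every range begins and ends on a line boundary, and
 * the ranges together cover the whole buffer without overlap. */
static void chunk_bounds(const char *buf, size_t size, size_t nchunks,
                         size_t idx, size_t *start, size_t *end)
{
    size_t step = size / nchunks;
    size_t nominal_start = step * idx;
    /* The last chunk absorbs the remainder of the division. */
    size_t nominal_end = (idx + 1 == nchunks) ? size : step * (idx + 1);

    *start = align_to_line_start(buf, size, nominal_start);
    *end = align_to_line_start(buf, size, nominal_end);
}
```

Each thread can then call `chunk_bounds` with its own index and walk its range independently. A chunk may come out empty when a single line is longer than the nominal chunk size, so callers should tolerate `*start == *end`.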
I suggested OpenMP because that's what we already use in JtR, and because it's easy to use in this way. Since you already use explicit pthreads, you probably shouldn't mix different threading technologies. You can implement the above with either technology.
HDD vs SSD
@solardiz suggested OpenMP, which would probably increase perf a lot.
TODO
@solardiz I'd love your suggestions and opinion about duplicut and ways to optimize it 😄