Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Fix race condition with destroying tls range. (#398)
Running the thread creation loop from hell (you know the one), a double-free occurs (a double explicit free, not a collection free). This triggers the assert at https://github.com/snazzy-d/sdc/blob/d693c6d140be865e304c289502bdb8b0c0c3e0b5/sdlib/d/gc/tcache.d#L716 The stack trace I had was: ```gdb (gdb) bt #0 0x00005555555c38d4 in __sd_assert_fail () #1 0x00005555555edbfa in d.gc.tcache.ThreadCache.getPageDescriptor(ref d.gc.tcache.ThreadCache, void*) () #2 0x00005555555d92e1 in d.gc.tcache.ThreadCache.free(ref d.gc.tcache.ThreadCache, void*) () #3 0x00005555555f1a7f in __sd_gc_pre_suspend_hook () #4 0x00005555555eb335 in d.gc.signal.suspendThreadImpl(scope d.gc.tstate.ThreadState*) () #5 0x00005555555f12f2 in __sd_gc_push_registers () #6 0x00005555555eb4fd in d.gc.signal.suspendThreadDelayed(scope d.gc.tstate.ThreadState*) () #7 0x00005555555ef7f2 in d.gc.tstate.ThreadState.exitBusyStateSlow(ref d.gc.tstate.ThreadState, ulong) () #8 0x00005555555ee017 in d.gc.tstate.ThreadState.exitBusyState(ref d.gc.tstate.ThreadState) () #9 0x00005555555d989b in d.gc.tcache.ThreadCache.flushCache(ref d.gc.tcache.ThreadCache) () #10 0x00005555555ed9bf in d.gc.tcache.ThreadCache.destroyThread(ref d.gc.tcache.ThreadCache) () #11 0x00005555555eec51 in _D1d2gc6thread13destroyThreadFMZv () #12 0x00005555555c46d9 in d.rt.trampoline.runThread(scope d.rt.trampoline.ThreadRunner*) () #13 0x00007ffff7dfeac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442 #14 0x00007ffff7e90850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 ``` What happens is inside `destroyThread`, the `tlsSegments` array is freed. However, it is not set to `null`. Then it is interrupted by a signal telling it to stop for a scan. At this point, it is in a busy state flushing cache (but I don't think it really matters too much, as long as the signal is received after this free). Once it exits the busy state, it runs the suspend hook, we find out it is a druntime thread (this is with the integration from druntime), and the hook tries to free the same pointer. However, that pointer points to an unallocated block, and we have an assert. Entering the busy state early, and setting the pointer to null, means we can avoid the race condition. Note that I can't add a test directly for this, this *requires* druntime to return true for the `thread_preSuspend` hook.
- Loading branch information