Fix SIGCONT handling on threads blocked in syscalls
If a process receives a SIGSTOP, we emulate the group-stop by:

* Leaving the thread which happened to receive the SIGSTOP signal ptrace-stopped
* Refusing to schedule any other thread until the group-stop is over

The whole group-stop is therefore emulated by rr and not actually enforced by the kernel.

When a SIGCONT is received, we need to end the group-stop. However, we can't actually _know_ that a ptrace-stopped thread received a signal until we try and resume it. To work around this, we check /proc/tid/status's `SigPnd` and `ShdPnd` fields in the scheduler to detect when a thread that's in a group-stop has a pending SIGCONT, and so needs to be PTRACE_CONT'd so we can actually `wait` and receive that SIGCONT.

A problem however arises in the following case:

* A process has at least two threads,
* One thread "A" receives a SIGSTOP,
* And the other thread "B" is in a blocking system call,
* And then a process-directed SIGCONT is sent to the process,
* And the scheduler checks if "B" is runnable before checking if "A" is runnable.

In this case, the issue is that the process-directed SIGCONT will set the bit in `ShdPnd` for _both_ threads. So `t->is_signal_pending(SIGCONT)` will be true for both thread A and B. The scheduler then tries to PTRACE_CONT thread B, but it's not actually in a ptrace-stop, so it all goes pear shaped (actually you get an assertion failure in `t->resume_execution()`).

The fix is not to perform this `SigPnd`/`ShdPnd` checking at all for threads that are not actually in a ptrace-stop. They don't need this kind of special handling, because they're actually not ptrace-stopped; when we go to `try_wait` on them later on, we'll notice that they received a signal, and the handling in `RecordTask::signal_delivered` will actually run `emulate_SIGCONT` then.
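
The check described above can be sketched roughly as follows. This is an illustrative model only, not rr's actual code: `parse_sig_mask`, `is_signal_pending` (as a free function) and `should_resume_for_sigcont` are hypothetical helpers, and the status text is passed in as a string rather than read from /proc. It assumes the usual Linux layout in which `SigPnd` and `ShdPnd` are hexadecimal masks where signal N occupies bit N-1.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: extract a hex signal mask such as "SigPnd" or
// "ShdPnd" from the text of /proc/<tid>/status. Returns 0 if absent.
static uint64_t parse_sig_mask(const std::string& status,
                               const std::string& field) {
  std::istringstream in(status);
  std::string line;
  const std::string prefix = field + ":";
  while (std::getline(in, line)) {
    if (line.compare(0, prefix.size(), prefix) == 0) {
      // stoull skips the leading tab and parses the hex digits.
      return std::stoull(line.substr(prefix.size()), nullptr, 16);
    }
  }
  return 0;
}

// True if `sig` is pending either thread-privately (SigPnd) or
// process-wide (ShdPnd). Signal N is represented by bit N-1.
static bool is_signal_pending(const std::string& status, int sig) {
  assert(sig >= 1 && sig <= 64);
  const uint64_t bit = uint64_t{1} << (sig - 1);
  const uint64_t pending =
      parse_sig_mask(status, "SigPnd") | parse_sig_mask(status, "ShdPnd");
  return (pending & bit) != 0;
}

// Model of the fix: only consult the pending masks for a thread that is
// really in a ptrace-stop. A thread blocked in a syscall also sees the
// shared ShdPnd bit, but must not be resumed with PTRACE_CONT.
static bool should_resume_for_sigcont(bool in_ptrace_stop,
                                      const std::string& status,
                                      int sigcont) {
  return in_ptrace_stop && is_signal_pending(status, sigcont);
}
```

With a process-directed SIGCONT (signal 18 on x86-64 Linux), both thread A and thread B would report the same `ShdPnd` bit, so `is_signal_pending` alone cannot tell them apart; the `in_ptrace_stop` guard is what keeps thread B out of the resume path in this sketch.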