Fix SIGCONT handling on threads blocked in syscalls #3874
Merged
If a process receives a SIGSTOP, we emulate the group-stop by:
The whole group-stop is therefore emulated by rr and not actually enforced by the kernel.
When a SIGCONT is received, we need to end the group-stop. However, we can't actually know that a ptrace-stopped thread received a signal until we try and resume it. To work around this, we check /proc/tid/status's `SigPnd` and `ShdPnd` fields in the scheduler to detect when a thread that's in a group-stop has a pending SIGCONT, and so needs to be PTRACE_CONT'd so we can actually `wait` and receive that SIGCONT.

A problem however arises in the following case:
In this case, the issue is that the process-directed SIGCONT will set the bit in `ShdPnd` for both threads. So `t->is_signal_pending(SIGCONT)` will be true for both thread A and B. The scheduler then tries to PTRACE_CONT thread B, but it's not actually in a ptrace-stop, so it all goes pear shaped (actually you get an assertion failure in `t->resume_execution()`).

The fix is not to perform this `SigPnd`/`ShdPnd` checking at all for threads that are not actually in a ptrace-stop. They don't need this kind of special handling, because they're actually not ptrace-stopped; when we go to `try_wait` on them later on, we'll notice that they received a signal, and the handling in `RecordTask::signal_delivered` will actually run `emulate_SIGCONT` then.

Fixes #3871
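
For readers unfamiliar with the `SigPnd`/`ShdPnd` check, here is a rough, self-contained sketch of the idea. It is not rr's actual code: `parse_mask`, `signal_pending_in_status`, `ThreadView` and `should_resume_for_sigcont` are hypothetical names made up for illustration, and the status text is passed in as a string rather than read from /proc. It assumes the Linux convention that the two fields are hexadecimal bitmasks in which signal n occupies bit n-1.

```cpp
#include <signal.h>

#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: return the hex mask stored in a field such as
// "SigPnd" or "ShdPnd" of /proc/<tid>/status text, or 0 if it is absent.
uint64_t parse_mask(const std::string& status_text, const std::string& field) {
  std::istringstream in(status_text);
  std::string line;
  const std::string prefix = field + ":";
  while (std::getline(in, line)) {
    if (line.compare(0, prefix.size(), prefix) == 0) {
      // stoull skips the leading tab and parses the hex digits.
      return std::stoull(line.substr(prefix.size()), nullptr, 16);
    }
  }
  return 0;
}

// True if `sig` is pending either thread-directed (SigPnd) or
// process-directed (ShdPnd). Signal n is bit n-1 of each mask.
bool signal_pending_in_status(const std::string& status_text, int sig) {
  const uint64_t bit = uint64_t(1) << (sig - 1);
  return ((parse_mask(status_text, "SigPnd") |
           parse_mask(status_text, "ShdPnd")) & bit) != 0;
}

// Hypothetical stand-in for what the scheduler knows about a thread.
struct ThreadView {
  bool in_emulated_group_stop;
  bool in_ptrace_stop;
  std::string status_text;  // contents of /proc/<tid>/status
};

// Sketch of the fixed decision: only a thread that really is in a
// ptrace-stop gets the pending-SIGCONT check. A thread that is not
// ptrace-stopped is left alone; it reports the signal on a later wait.
bool should_resume_for_sigcont(const ThreadView& t) {
  if (!t.in_ptrace_stop) {
    return false;
  }
  return t.in_emulated_group_stop &&
         signal_pending_in_status(t.status_text, SIGCONT);
}
```

With a process-directed SIGCONT, both threads see the same `ShdPnd` bit, so the ptrace-stop test is the only thing that tells thread A (resume it) apart from thread B (leave it alone) in this sketch.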