recv wait can deadlock on an application thread #807

Open
HexKitchen opened this issue Jul 12, 2024 · 0 comments

Summary

There is a race condition that can cause recv().wait() on an application thread to deadlock.

Repro steps

For a fairly minimal repro case, see the attached files, which are adapted from the "Blocking recv" example at https://frida.re/docs/messages/. Running the repro should cause a deadlock after roughly 1 second.

Attachment: test1.zip

I have also seen the same deadlock occur when running the testcase /GumJS/Script/recv_wait_should_not_leak#V8.

See also the pull request I am creating to fix this race condition, which includes its own specialized testcase.

If you have any difficulty reproducing it, I'll be glad to assist further.

Some diagnosis

The implementation of wait() starts in message-dispatcher.js:

  this.wait = function wait() {
    while (!completed)                    // <=== [LINE 1]
      engine._waitForEvent();
  };

If at [LINE 1] the requested message has already been received, wait() will return without needing to call down into engine._waitForEvent. Everything is fine in this circumstance.

If the requested message has not yet been received, execution continues in gumjs_wait_for_event (code shown for QJS):

...
  g_mutex_lock (&self->event_mutex);     // <=== [LINE 2]

  start_count = self->event_count;       // <=== [LINE 3]
  while (self->event_count == start_count && self->event_source_available)
  {
    ...
      g_cond_wait (&self->event_cond, &self->event_mutex);
    ...
  }

At [LINE 2] the code obtains event_mutex, so the value of self->event_count read at [LINE 3] cannot change until the mutex is released. It then waits on event_cond, which releases the mutex while the thread is blocked. When an event subsequently posts, it wakes this thread. Everything is fine in this circumstance as well.
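To make the well-behaved path concrete, here is a minimal sketch of the same wait shape. It uses POSIX threads in place of the GLib primitives, and the `Core` struct, `poster_thread` and `wait_then_post` are hypothetical names invented for illustration, not Frida code. The `event_source_available` check is left out for brevity. Because the waiter already holds the mutex when the poster starts, the poster can only post once the waiter is inside the condition wait, so the wakeup is never missed.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

/* Hypothetical stand-in for the script core; field names follow the issue. */
typedef struct {
  pthread_mutex_t event_mutex;
  pthread_cond_t event_cond;
  unsigned event_count;
} Core;

/* Models an event being posted from another thread. */
static void *poster_thread(void *arg) {
  Core *c = arg;
  /* Blocks here until the waiter releases the mutex inside cond_wait. */
  pthread_mutex_lock(&c->event_mutex);
  c->event_count++;
  pthread_cond_broadcast(&c->event_cond);
  pthread_mutex_unlock(&c->event_mutex);
  return NULL;
}

/* Returns how many events were observed while waiting. */
static unsigned wait_then_post(void) {
  Core c;
  pthread_t poster;
  unsigned start_count, observed;

  pthread_mutex_init(&c.event_mutex, NULL);
  pthread_cond_init(&c.event_cond, NULL);
  c.event_count = 0;

  pthread_mutex_lock(&c.event_mutex);   /* corresponds to [LINE 2] */
  start_count = c.event_count;          /* corresponds to [LINE 3] */

  /* The event can only post after this point, while we are waiting. */
  pthread_create(&poster, NULL, poster_thread, &c);
  while (c.event_count == start_count)
    pthread_cond_wait(&c.event_cond, &c.event_mutex);

  observed = c.event_count - start_count;
  pthread_mutex_unlock(&c.event_mutex);

  pthread_join(poster, NULL);
  pthread_cond_destroy(&c.event_cond);
  pthread_mutex_destroy(&c.event_mutex);
  return observed;
}
```

Calling `wait_then_post()` returns once the single posted event has been seen. Build with `-pthread`.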

The trouble happens if the event posts after the check at [LINE 1] but before the mutex is obtained at [LINE 2]. In that case, the start_count read at [LINE 3] is already the value of event_count after the event posted, so the loop condition event_count == start_count holds and the wakeup has effectively been missed. g_cond_wait then hangs, potentially forever, unless by luck some additional event (one that should have no bearing on this flow) happens to post.
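The interleaving described above can be replayed deterministically in a small model. This is a sketch under stated assumptions: it uses POSIX threads rather than GLib, all names (`Core`, `post_event`, `replay`, and so on) are hypothetical, and `completed` is modeled as a C flag set under the mutex, whereas in Frida it lives in JavaScript. A timed wait stands in for the hang so the example terminates. The second wait function shows one conventional remedy, re-checking the predicate while holding the mutex. It is not necessarily the approach taken in the pull request.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the script core; field names follow the issue. */
typedef struct {
  pthread_mutex_t event_mutex;
  pthread_cond_t event_cond;
  unsigned event_count;
  bool completed; /* assumption: a C flag here, a JS variable in Frida */
} Core;

static void core_init(Core *c) {
  pthread_mutex_init(&c->event_mutex, NULL);
  pthread_cond_init(&c->event_cond, NULL);
  c->event_count = 0;
  c->completed = false;
}

/* Models the requested message arriving. */
static void post_event(Core *c) {
  pthread_mutex_lock(&c->event_mutex);
  c->completed = true;
  c->event_count++;
  pthread_cond_broadcast(&c->event_cond);
  pthread_mutex_unlock(&c->event_mutex);
}

static struct timespec deadline_after_ms(int ms) {
  struct timespec ts;
  clock_gettime(CLOCK_REALTIME, &ts);
  ts.tv_sec += ms / 1000;
  ts.tv_nsec += (long)(ms % 1000) * 1000000L;
  if (ts.tv_nsec >= 1000000000L) { ts.tv_sec++; ts.tv_nsec -= 1000000000L; }
  return ts;
}

/* Same shape as the wait in the issue; returns true if it would have hung. */
static bool wait_for_event_timed(Core *c, int timeout_ms) {
  struct timespec deadline = deadline_after_ms(timeout_ms);
  bool timed_out = false;
  unsigned start_count;

  pthread_mutex_lock(&c->event_mutex);  /* corresponds to [LINE 2] */
  start_count = c->event_count;         /* corresponds to [LINE 3] */
  while (c->event_count == start_count) {
    if (pthread_cond_timedwait(&c->event_cond, &c->event_mutex,
                               &deadline) == ETIMEDOUT) {
      timed_out = true;
      break;
    }
  }
  pthread_mutex_unlock(&c->event_mutex);
  return timed_out;
}

/* Variant that waits on the predicate itself while holding the mutex. */
static bool wait_until_completed_timed(Core *c, int timeout_ms) {
  struct timespec deadline = deadline_after_ms(timeout_ms);
  bool timed_out = false;

  pthread_mutex_lock(&c->event_mutex);
  while (!c->completed) {
    if (pthread_cond_timedwait(&c->event_cond, &c->event_mutex,
                               &deadline) == ETIMEDOUT) {
      timed_out = true;
      break;
    }
  }
  pthread_mutex_unlock(&c->event_mutex);
  return timed_out;
}

/* Replays: check at [LINE 1], event posts, then the wait begins. */
static bool replay(bool use_predicate, int timeout_ms) {
  Core c;
  bool seen_completed;

  core_init(&c);
  seen_completed = c.completed;  /* [LINE 1]: not completed yet */
  post_event(&c);                /* event posts inside the window */
  if (seen_completed)
    return false;
  return use_predicate ? wait_until_completed_timed(&c, timeout_ms)
                       : wait_for_event_timed(&c, timeout_ms);
}
```

In this model, `replay(false, 50)` times out because the only wakeup happened before the wait began, while `replay(true, 50)` returns immediately because the predicate is already satisfied when it is checked under the mutex. Build with `-pthread`.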
