From: Laszlo Ersek <lersek@redhat.com>
To: Max Reitz <mreitz@redhat.com>, qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [PATCH] coroutine-sigaltstack: Keep SIGUSR2 handler up
Date: Fri, 22 Jan 2021 19:29:14 +0100 [thread overview]
Message-ID: <1121a803-98e7-6d41-119c-3d82717976ec@redhat.com> (raw)
In-Reply-To: <eef8237e-293a-b6e6-20be-fa004509fa05@redhat.com>
On 01/22/21 19:05, Max Reitz wrote:
> On 22.01.21 18:09, Laszlo Ersek wrote:
>> On 01/22/21 11:20, Max Reitz wrote:
>>> Modifying signal handlers is a process-global operation. When two
>>> threads run coroutine-sigaltstack's qemu_coroutine_new() concurrently,
>>> they may interfere with each other: One of them may revert the SIGUSR2
>>> handler back to the default between the other thread setting up
>>> coroutine_trampoline() as the handler and raising SIGUSR2. That SIGUSR2
>>> will then lead to the process exiting.
>>>
>>> Outside of coroutine-sigaltstack, qemu does not use SIGUSR2. We can
>>> thus keep the signal handler installed all the time.
>>> CoroutineThreadState.tr_handler tells coroutine_trampoline() whether its
>>> stack is set up so a new coroutine is to be launched (i.e., it should
>>> invoke sigsetjmp()), or not (i.e., the signal came from an external
>>> source and we should just perform the default action, which is to exit
>>> the process).
>>>
>>> Note that in user-mode emulation, the guest can register signal handlers
>>> for any signal but SIGSEGV and SIGBUS, so if it registers a SIGUSR2
>>> handler, sigaltstack coroutines will break from then on. However, we do
>>> not use coroutines for user-mode emulation, so that is fine.
>>>
>>> Suggested-by: Laszlo Ersek <lersek@redhat.com>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>> util/coroutine-sigaltstack.c | 56 +++++++++++++++++++-----------------
>>> 1 file changed, 29 insertions(+), 27 deletions(-)
>>>
>>> diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
>>> index aade82afb8..2d32afc322 100644
>>> --- a/util/coroutine-sigaltstack.c
>>> +++ b/util/coroutine-sigaltstack.c
>>> @@ -59,6 +59,8 @@ typedef struct {
>>> static pthread_key_t thread_state_key;
>>> +static void coroutine_trampoline(int signal);
>>> +
>>> static CoroutineThreadState *coroutine_get_thread_state(void)
>>> {
>>> CoroutineThreadState *s = pthread_getspecific(thread_state_key);
>>> @@ -80,6 +82,7 @@ static void qemu_coroutine_thread_cleanup(void
>>> *opaque)
>>> static void __attribute__((constructor)) coroutine_init(void)
>>> {
>>> + struct sigaction sa;
>>> int ret;
>>> ret = pthread_key_create(&thread_state_key,
>>> qemu_coroutine_thread_cleanup);
>>> @@ -87,6 +90,20 @@ static void __attribute__((constructor))
>>> coroutine_init(void)
>>> fprintf(stderr, "unable to create leader key: %s\n",
>>> strerror(errno));
>>> abort();
>>> }
>>> +
>>> + /*
>>> + * Establish the SIGUSR2 signal handler. This is a process-wide
>>> + * operation, and so will apply to all threads from here on.
>>> + */
>>> + sa = (struct sigaction) {
>>> + .sa_handler = coroutine_trampoline,
>>> + .sa_flags = SA_ONSTACK,
>>> + };
>>> +
>>> + if (sigaction(SIGUSR2, &sa, NULL) != 0) {
>>> + perror("Unable to install SIGUSR2 handler");
>>> + abort();
>>> + }
>>> }
>>> /* "boot" function
>>> @@ -121,7 +138,17 @@ static void coroutine_trampoline(int signal)
>>> /* Get the thread specific information */
>>> coTS = coroutine_get_thread_state();
>>> self = coTS->tr_handler;
>>> +
>>> + if (!self) {
>>> + /*
>>> + * This SIGUSR2 came from an external source, not from
>>> + * qemu_coroutine_new(), so perform the default action.
>>> + */
>>> + exit(0);
>>> + }
>>> +
>>> coTS->tr_called = 1;
>>> + coTS->tr_handler = NULL;
>>> co = &self->base;
>>> /*
>>
>> (8) There's a further complication here, assuming we really want to
>> recognize the case when the handler is executing unexpectedly:
>>
>> - pthread_getspecific() is not necessarily async-signal-safe, according
>> to POSIX, so calling coroutine_get_thread_state() in the "unexpected"
>> case (e.g. in response to an asynchronously generated SIGUSR2) is
>> problematic in its own right,
>
> That’s a shame.
>
>> - if the SIGUSR2 is delivered to a thread that has never called
>> coroutine_get_thread_state() before, then we'll reach g_malloc0() inside
>> coroutine_get_thread_state(), in signal handler context, which is very
>> bad.
>
> Could be solved with a coroutine_try_get_thread_state() that will never
> malloc, but return NULL then.
>
>> You'd have to block SIGUSR2 for the entire process (all threads) at all
>> times, and only temporarily unblock it for a particular coroutine
>> thread, with the sigsuspend(). The above check would suffice, that way.
>
> Yes, that’s what I was originally afraid of. I feel like that may be
> the complexity drop that pushes this change too far out of my comfort
> zone. (And as evidenced by your review, it already was pretty much
> outside as it was.)
>
>> Such blocking is possible by calling pthread_sigmask() from the main
>> thread, before any other thread is created (the signal mask is inherited
>> across pthread_create()). I guess it could be done in coroutine_init()
>> too.
>>
>> And *then* the pthread_sigmask() calls should indeed be removed from
>> qemu_coroutine_new().
>
> OTOH, that does sound rather simple...
>
>> (Apologies if my feedback is difficult to understand, it's my fault. I
>> could propose a patch, if (and only if) you want that.)
>
> I can’t say I wouldn’t be happy with a patch for this code that doesn’t
> bear my S-o-b. ;)
>
> I feel conflicted. I can send a v2 that addresses this (probably
> consisting of multiple patches then, e.g. I’d split the SIGUSR2 blocking
> off the main patch), but to me, this bug is really more of a nuisance
> that just blocks me from sending a pull request for my block branch...
> So I’d rather not drag it out forever. OTOH, sending a quick and bad
> fix just because I can’t wait is just bad.
>
> I suppose I’ll have to decide over the weekend. Though if you’re
> itching to write a patch yourself, I’d definitely be grateful.
OK, I'll try my hand at it; I hope I won't be eating my words.
Laszlo
next prev parent reply other threads:[~2021-01-22 18:30 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-01-22 10:20 [PATCH] coroutine-sigaltstack: Keep SIGUSR2 handler up Max Reitz
2021-01-22 14:55 ` Eric Blake
2021-01-22 16:38 ` Laszlo Ersek
2021-01-22 17:58 ` Max Reitz
2021-01-22 18:19 ` Eric Blake
2021-01-22 18:28 ` Laszlo Ersek
2021-01-22 18:27 ` Laszlo Ersek
2021-01-22 19:02 ` Laszlo Ersek
2021-01-22 17:09 ` Laszlo Ersek
2021-01-22 18:05 ` Max Reitz
2021-01-22 18:29 ` Laszlo Ersek [this message]
2021-01-22 21:26 ` Laszlo Ersek
2021-01-23 0:41 ` Laszlo Ersek
2021-01-25 10:57 ` Max Reitz
2021-01-25 21:16 ` Laszlo Ersek
2021-01-23 22:13 ` Paolo Bonzini
2021-01-25 21:13 ` Laszlo Ersek
2021-01-25 22:08 ` Laszlo Ersek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1121a803-98e7-6d41-119c-3d82717976ec@redhat.com \
--to=lersek@redhat.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).