From: Laszlo Ersek <lersek@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>,
Max Reitz <mreitz@redhat.com>,
qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [PATCH] coroutine-sigaltstack: Keep SIGUSR2 handler up
Date: Mon, 25 Jan 2021 22:13:05 +0100 [thread overview]
Message-ID: <88cba59c-f644-7474-03b7-90208697bcb6@redhat.com> (raw)
In-Reply-To: <063dd2a6-a3ae-0d7c-3e29-96a08a6d9a3d@redhat.com>
On 01/23/21 23:13, Paolo Bonzini wrote:
> On 22/01/21 22:26, Laszlo Ersek wrote:
>> That seems bogus, per POSIX, given that all signals except SIGUSR2 are
>> included in the mask passed to sigsuspend().
>
> What happens if you get a SIGSTOP at exactly the wrong time? (Yeah I
> know how incredibly unlikely that would be).
Nothing; it is impossible to ignore or catch SIGSTOP:
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html
And therefore it is also impossible to block it (but an attempt to block
it will not cause issues); see e.g.
https://pubs.opengroup.org/onlinepubs/9699919799/functions/sigsuspend.html
("It is not possible to block signals that cannot be ignored. This is
enforced by the system without causing an error to be indicated.")
So the process will simply stop executing.
The more interesting question is SIGCONT. SIGCONT will always behave
like it should:
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html
The interesting part is that it's possible to install a handler for
SIGCONT, which will act *in addition* to the normal SIGCONT effect.
Such a handler is actually useful for (e.g.) restarting timers that may
have been thrown off while the process was not scheduled for a while
(due to SIGSTOP).
If there is no handler installed for SIGCONT, nothing particular happens
when it is delivered (the process doesn't notice).
QEMU does seem to install a SIGCONT handler, in "chardev/char-stdio.c".
But that's not a problem (at least with the patch I had attached)
because the mask that sigsuspend() puts in place, temporarily, for
unblocking SIGUSR2, also blocks SIGCONT (together with everything else).
If a STOP/CONT occurs exactly then, then SIGCONT will resume the
process, but the handler for SIGCONT will run precisely *after*
sigsuspend() -- including the SIGUSR2 handling -- returns, and SIGCONT
becomes unblocked by virtue of sigsuspend() restoring the original mask.
(Side comment: I'm not convinced the existent SIGCONT handling in
"chardev/char-stdio.c" is safe, regardless of the topic at hand --
SIGCONT does not seem to be masked for all threads except one, and so
any thread could execute term_stdio_handler(). While the only called
function in that is tcsetattr(), which is async signal safe, I don't
know whether accessing "stdio_echo_state" and "stdio_allow_signal" is
also safe. It may be.)
BTW, the signal(7) Linux manual documents spurious wakeups for
sigtimedwait(2) and sigwaitinfo(2), indeed in connection with SIGCONT.
However, sigsuspend() is not listed in that section (I had checked):
https://man7.org/linux/man-pages/man7/signal.7.html
Interruption of system calls and library functions by stop signals
On Linux, even in the absence of signal handlers, certain
blocking interfaces can fail with the error EINTR after the
process is stopped by one of the stop signals and then resumed
via SIGCONT. This behavior is not sanctioned by POSIX.1, and
doesn't occur on other systems.
The Linux interfaces that display this behavior are:
* "Input" socket interfaces, when a timeout (SO_RCVTIMEO) has
been set on the socket using setsockopt(2): accept(2), recv(2),
recvfrom(2), recvmmsg(2) (also with a non-NULL timeout
argument), and recvmsg(2).
* "Output" socket interfaces, when a timeout (SO_RCVTIMEO) has
been set on the socket using setsockopt(2): connect(2),
send(2), sendto(2), and sendmsg(2), if a send timeout
(SO_SNDTIMEO) has been set.
* epoll_wait(2), epoll_pwait(2).
* semop(2), semtimedop(2).
* sigtimedwait(2), sigwaitinfo(2).
* Linux 3.7 and earlier: read(2) from an inotify(7) file
descriptor
* Linux 2.6.21 and earlier: futex(2) FUTEX_WAIT,
sem_timedwait(3), sem_wait(3).
* Linux 2.6.8 and earlier: msgrcv(2), msgsnd(2).
* Linux 2.4 and earlier: nanosleep(2).
If other host OSes that QEMU supports have similar non-conformance
notes, and sigsuspend() is affected on them, then removing the loop
around sigsuspend() is unsafe.
I only know where to check POSIX and the Linux manuals.
> BTW if we are in a mood for cleanup, there's no reason to use
> pthread_key_t instead of __thread + qemu_thread_atexit_add (adding a
> Notifier to struct CoroutineThreadState). That would fix the issue with
> async-signal safety of pthread_getspecific.
>
> (It makes sense for the function not to be async-signal safe since it
> can in principle allocate memory for the data. In practice it's most
> likely okay if the function has been called before on this thread).
This is a good idea, but with the patch I had attached, it's not urgent
-- pthread_getspecific is only called from a safe context. Per POSIX,
there is a problem only when an unsafe function is executed while the
signal interrupts an unsafe function:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03_03
In the presence of signals, all functions defined by this volume of
POSIX.1-2017 shall behave as defined when called from or interrupted
by a signal-catching function, with the exception that when a signal
interrupts an unsafe function or equivalent (such as the processing
equivalent to exit() performed after a return from the initial call
to main()) and the signal-catching function calls an unsafe
function, the behavior is undefined.
And with the proposed patch, the only function that can be interrupted
by SIGUSR2 is sigsuspend(), and sigsuspend() is listed in the table of
safe functions.
Thanks
Laszlo
next prev parent reply other threads:[~2021-01-25 21:15 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-01-22 10:20 [PATCH] coroutine-sigaltstack: Keep SIGUSR2 handler up Max Reitz
2021-01-22 14:55 ` Eric Blake
2021-01-22 16:38 ` Laszlo Ersek
2021-01-22 17:58 ` Max Reitz
2021-01-22 18:19 ` Eric Blake
2021-01-22 18:28 ` Laszlo Ersek
2021-01-22 18:27 ` Laszlo Ersek
2021-01-22 19:02 ` Laszlo Ersek
2021-01-22 17:09 ` Laszlo Ersek
2021-01-22 18:05 ` Max Reitz
2021-01-22 18:29 ` Laszlo Ersek
2021-01-22 21:26 ` Laszlo Ersek
2021-01-23 0:41 ` Laszlo Ersek
2021-01-25 10:57 ` Max Reitz
2021-01-25 21:16 ` Laszlo Ersek
2021-01-23 22:13 ` Paolo Bonzini
2021-01-25 21:13 ` Laszlo Ersek [this message]
2021-01-25 22:08 ` Laszlo Ersek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=88cba59c-f644-7474-03b7-90208697bcb6@redhat.com \
--to=lersek@redhat.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).