qemu-devel.nongnu.org archive mirror
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Cc: Peter Maydell <peter.maydell@linaro.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	laurent@vivier.eu, david@gibson.dropbear.id.au
Subject: Re: [PATCH] user-exec: Do not filter the signal on si_code
Date: Mon, 30 Sep 2019 14:01:04 -0700
Message-ID: <ec1ace6c-49db-e769-e43e-6b0e059d6705@linaro.org>
In-Reply-To: <20190930192931.20509-1-richard.henderson@linaro.org>

On 9/30/19 12:29 PM, Richard Henderson wrote:
> This is a workaround for a ppc64le host kernel bug.
> 
> For the test case linux-test, we have an instruction trace
> 
> IN: sig_alarm
> ...
> 
> IN:
> 0x400080ed28:  380000ac  li       r0, 0xac
> 0x400080ed2c:  44000002  sc
> 
> IN: __libc_nanosleep
> 0x1003bb4c:  7c0802a6  mflr     r0
> 0x1003bb50:  f8010010  std      r0, 0x10(r1)
> 
> Our signal return trampoline has, rightly, made the guest
> stack page read-only.  That, rightly, faults on the store of
> a return address into a stack frame.
> 
> Checking the host /proc/pid/maps, we see the expected state:
> 
> 4000800000-4000810000 r--p 00000000 00:00 0
> 
> However, the host kernel has supplied si_code == SEGV_MAPERR,
> which is obviously incorrect.
> 
> By dropping this check, we may have an extra walk of the page
> tables, but this should be inexpensive.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> 
> FWIW, filed as
> 
>   https://bugzilla.redhat.com/show_bug.cgi?id=1757189
> 
> out of habit and then
> 
>   https://bugs.centos.org/view.php?id=16499
> 
> when I remembered that the system is running Centos not RHEL.
> 
> ---
>  accel/tcg/user-exec.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
> index 71c4bf6477..31ef091a70 100644
> --- a/accel/tcg/user-exec.c
> +++ b/accel/tcg/user-exec.c
> @@ -143,9 +143,12 @@ static inline int handle_cpu_signal(uintptr_t pc, siginfo_t *info,
>       * for some other kind of fault that should really be passed to the
>       * guest, we'd end up in an infinite loop of retrying the faulting
>       * access.
> +     *
> +     * XXX: At least one host kernel, ppc64le w/Centos 7 4.14.0-115.6.1,
> +     * incorrectly reports SEGV_MAPERR for a STDX write to a read-only page.
> +     * Therefore, do not test info->si_code.
>       */
> -    if (is_write && info->si_signo == SIGSEGV && info->si_code == SEGV_ACCERR &&
> -        h2g_valid(address)) {
> +    if (is_write && info->si_signo == SIGSEGV && h2g_valid(address)) {

Ho hum.  This change conflicts with Peter's long comment; I should have
read the context more thoroughly.  There is an even longer explanation in
the commit message of the patch that added the check:
9c4bbee9e3b83544257e82566342c29e15a88637

The SEGV_ACCERR check here is to prevent a loop by which page_unprotect races
with itself and, from Peter's analysis,

>      * ...but when B gets the mmap lock it finds that the page is already
>        PAGE_WRITE, and so it exits page_unprotect() via the "not due to
>        protected translation" code path, and wrongly delivers the signal
>        to the guest rather than just retrying the access

This bug was fixed in the referenced patch.  But the commit message then
continues:

>     Since this would cause an infinite loop if we ever called
>     page_unprotect() for some other kind of fault than "write failed due
>     to bad access permissions", tighten the condition in
>     handle_cpu_signal() to check the signal number and si_code, and add a
>     comment so that if somebody does ever find themselves debugging an
>     infinite loop of faults they have some clue about why.
>     
>     (The trick for identifying the correct setting for
>     current_tb_invalidated for thread B (needed to handle the precise-SMC
>     case) is due to Richard Henderson.  Paolo Bonzini suggested just
>     relying on si_code rather than trying anything more complicated.)
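
To make the loop concrete, here is a toy model.  It is not QEMU code: every
toy_* name and flag below is made up for illustration.  It assumes only what
the commit message says, namely that page_unprotect() now answers "handled,
retry" when it finds the page already writable, so a fault with some other
cause would be retried forever unless the caller filters it out first.

```c
#include <stdbool.h>

/* Toy page flags; hypothetical stand-ins, not QEMU's real definitions. */
enum {
    TOY_PAGE_WRITE     = 1, /* page is currently writable */
    TOY_PAGE_WRITE_ORG = 2  /* guest wants write access; may be protected */
};

typedef struct {
    bool mapped;
    int flags;
} ToyPage;

/* Returns 1 for "handled, retry the access", 0 for "pass to the guest". */
static int toy_page_unprotect(ToyPage *p)
{
    if (!p->mapped) {
        return 0;
    }
    if (p->flags & TOY_PAGE_WRITE) {
        /* Another thread already unprotected the page: just retry. */
        return 1;
    }
    if (p->flags & TOY_PAGE_WRITE_ORG) {
        p->flags |= TOY_PAGE_WRITE;
        return 1;
    }
    return 0;
}

/*
 * Simulate retrying a faulting write.  other_cause models a fault that is
 * not a permission problem; filter models the caller's si_code check.
 * Returns the number of retries before the write succeeds, -1 if the
 * signal is passed to the guest, or limit if we would spin forever.
 */
static int toy_retry_write(ToyPage *p, bool other_cause, bool filter,
                           int limit)
{
    for (int n = 0; n < limit; n++) {
        bool faults = other_cause || !(p->flags & TOY_PAGE_WRITE);
        if (!faults) {
            return n;
        }
        if (filter && other_cause) {
            return -1;
        }
        if (toy_page_unprotect(p) == 0) {
            return -1;
        }
    }
    return limit;
}
```

In this model a protected page succeeds after one retry, while a fault with
another cause on an already-writable page runs into the retry limit unless
the caller filters on the kind of fault.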

The kernel bug is disappointing.  But since it affects CentOS 7, which is
what *all* of the gcc compile farm ppc64 machines use, I think we need to
work around it somehow.

Should we simply add SEGV_MAPERR to the set of allowed si_code values, to
work around the bug directly?  If we got that code from a kernel without the
bug, then page_find should fail to find an entry, and we should then indicate
that the signal is to be passed to the guest.

Thoughts?


r~

