From: "Richard W.M. Jones" <rjones@redhat.com>
To: Richard Henderson <richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org, berrange@redhat.com, qemu-devel@nongnu.org
Subject: Re: [PATCH] tcg/arm: Reduce vector alignment requirement for NEON
Date: Mon, 13 Sep 2021 12:07:27 +0100 [thread overview]
Message-ID: <20210913110727.GF26415@redhat.com> (raw)
In-Reply-To: <20210912174925.200132-1-richard.henderson@linaro.org>
On Sun, Sep 12, 2021 at 10:49:25AM -0700, Richard Henderson wrote:
> With arm32, the ABI gives us 8-byte alignment for the stack.
> While it's possible to realign the stack to provide 16-byte alignment,
> it's far easier to simply not encode 16-byte alignment in the
> VLD1 and VST1 instructions that we emit.
>
> Remove the assertion in temp_allocate_frame, limit natural alignment
> to the provided stack alignment, and add a comment.
>
> Reported-by: Richard W.M. Jones <rjones@redhat.com>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>
> I haven't seen the assertion with the various arm kernels that I happen
> to have laying about. I have not taken the time to build the combo
> from the bug report:
>
> [ 0.000000] Linux version 5.14.0-60.fc36.armv7hl (mockbuild@buildvm-a32-12.iad2.fedoraproject.org) (gcc (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1), GNU ld version 2.37-9.fc36) #1 SMP Mon Aug 30 14:08:34 UTC 2021
>
> I thought about parameterizing this patch further, but I can't think of
> another ISA that would be affected. (i686 clumsily changed its abi 20
> years ago to avoid faulting on vector spills; other isas so far have
> allowed vectors to be unaligned.)
Is it possible this change could have caused a more serious
regression? Now when I try to boot the Fedora kernel using TCG on
armv7hl I can't even get to the point where it detects virtio-scsi
devices.
Full log is here (go down to the bottom and work backwards):
https://kojipkgs.fedoraproject.org//work/tasks/7337/75597337/build.log
This might have been caused by a coincidental change to the kernel.
The test environment I have makes it extremely difficult to test this
change in isolation.
However I do know that the same error does _not_ occur on x86-64
guest/host with this patch applied.
Rich.
>
> r~
> ---
> tcg/tcg.c | 8 +++++++-
> tcg/arm/tcg-target.c.inc | 13 +++++++++----
> 2 files changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index 4142d42d77..ca5bcc4635 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -3060,7 +3060,13 @@ static void temp_allocate_frame(TCGContext *s, TCGTemp *ts)
> g_assert_not_reached();
> }
>
> - assert(align <= TCG_TARGET_STACK_ALIGN);
> + /*
> + * Assume the stack is sufficiently aligned.
> + * This affects e.g. ARM NEON, where we have 8 byte stack alignment
> + * and do not require 16 byte vector alignment. This seems slightly
> + * easier than fully parameterizing the above switch statement.
> + */
> + align = MIN(TCG_TARGET_STACK_ALIGN, align);
> off = ROUND_UP(s->current_frame_offset, align);
>
> /* If we've exhausted the stack frame, restart with a smaller TB. */
> diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
> index e5b4f86841..8515717435 100644
> --- a/tcg/arm/tcg-target.c.inc
> +++ b/tcg/arm/tcg-target.c.inc
> @@ -2477,8 +2477,13 @@ static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg arg,
> tcg_out_vldst(s, INSN_VLD1 | 0x7d0, arg, arg1, arg2);
> return;
> case TCG_TYPE_V128:
> - /* regs 2; size 8; align 16 */
> - tcg_out_vldst(s, INSN_VLD1 | 0xae0, arg, arg1, arg2);
> + /*
> + * We have only 8-byte alignment for the stack per the ABI.
> + * Rather than dynamically re-align the stack, it's easier
> + * to simply not request alignment beyond that. So:
> + * regs 2; size 8; align 8
> + */
> + tcg_out_vldst(s, INSN_VLD1 | 0xad0, arg, arg1, arg2);
> return;
> default:
> g_assert_not_reached();
> @@ -2497,8 +2502,8 @@ static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg,
> tcg_out_vldst(s, INSN_VST1 | 0x7d0, arg, arg1, arg2);
> return;
> case TCG_TYPE_V128:
> - /* regs 2; size 8; align 16 */
> - tcg_out_vldst(s, INSN_VST1 | 0xae0, arg, arg1, arg2);
> + /* See tcg_out_ld re alignment: regs 2; size 8; align 8 */
> + tcg_out_vldst(s, INSN_VST1 | 0xad0, arg, arg1, arg2);
> return;
> default:
> g_assert_not_reached();
> --
> 2.25.1
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
next prev parent reply other threads:[~2021-09-13 11:09 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-12 17:49 [PATCH] tcg/arm: Reduce vector alignment requirement for NEON Richard Henderson
2021-09-13 8:07 ` Richard W.M. Jones
2021-09-13 11:07 ` Richard W.M. Jones [this message]
2021-09-13 16:19 ` Richard Henderson
2021-09-13 16:28 ` Richard W.M. Jones
2021-09-13 17:01 ` Peter Maydell
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210913110727.GF26415@redhat.com \
--to=rjones@redhat.com \
--cc=berrange@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=richard.henderson@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.