From: David Hildenbrand <david@redhat.com>
To: Richard Henderson <richard.henderson@linaro.org>, qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org
Subject: Re: [Qemu-devel] [PULL 16/16] tcg/i386: Use MOVDQA for TCG_TYPE_V128 load/store
Date: Tue, 28 May 2019 20:46:30 +0200 [thread overview]
Message-ID: <09cb162c-594c-4863-2dfd-ae30b218bc9c@redhat.com> (raw)
In-Reply-To: <7d00bc45-fbc5-9096-af44-253779063319@redhat.com>
On 28.05.19 20:33, David Hildenbrand wrote:
> On 28.05.19 19:28, David Hildenbrand wrote:
>> On 23.05.19 00:28, Richard Henderson wrote:
>>> This instruction raises #GP, aka SIGSEGV, if the effective address
>>> is not aligned to 16-bytes.
>>>
>>> We have assertions in tcg-op-gvec.c that the offset from ENV is
>>> aligned, for vector types <= V128. But the offset itself does not
>>> validate that the final pointer is aligned -- one must also remember
>>> to use the QEMU_ALIGNED() attribute on the vector member within ENV.
>>>
>>> PowerPC Altivec has vector load/store instructions that silently
>>> discard the low 4 bits of the address, making alignment mistakes
>>> difficult to discover. Aid that by making the most popular host
>>> visibly signal the error.
>>>
>>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>> ---
>>> tcg/i386/tcg-target.inc.c | 24 ++++++++++++++++++++++--
>>> 1 file changed, 22 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
>>> index 6ec5e60448..c0443da4af 100644
>>> --- a/tcg/i386/tcg-target.inc.c
>>> +++ b/tcg/i386/tcg-target.inc.c
>>> @@ -1082,14 +1082,24 @@ static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret,
>>> }
>>> /* FALLTHRU */
>>> case TCG_TYPE_V64:
>>> + /* There is no instruction that can validate 8-byte alignment. */
>>> tcg_debug_assert(ret >= 16);
>>> tcg_out_vex_modrm_offset(s, OPC_MOVQ_VqWq, ret, 0, arg1, arg2);
>>> break;
>>> case TCG_TYPE_V128:
>>> + /*
>>> + * The gvec infrastructure is asserts that v128 vector loads
>>> + * and stores use a 16-byte aligned offset. Validate that the
>>> + * final pointer is aligned by using an insn that will SIGSEGV.
>>> + */
>>> tcg_debug_assert(ret >= 16);
>>> - tcg_out_vex_modrm_offset(s, OPC_MOVDQU_VxWx, ret, 0, arg1, arg2);
>>> + tcg_out_vex_modrm_offset(s, OPC_MOVDQA_VxWx, ret, 0, arg1, arg2);
>>> break;
>>> case TCG_TYPE_V256:
>>> + /*
>>> + * The gvec infrastructure only requires 16-byte alignment,
>>> + * so here we must use an unaligned load.
>>> + */
>>> tcg_debug_assert(ret >= 16);
>>> tcg_out_vex_modrm_offset(s, OPC_MOVDQU_VxWx | P_VEXL,
>>> ret, 0, arg1, arg2);
>>> @@ -1117,14 +1127,24 @@ static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg,
>>> }
>>> /* FALLTHRU */
>>> case TCG_TYPE_V64:
>>> + /* There is no instruction that can validate 8-byte alignment. */
>>> tcg_debug_assert(arg >= 16);
>>> tcg_out_vex_modrm_offset(s, OPC_MOVQ_WqVq, arg, 0, arg1, arg2);
>>> break;
>>> case TCG_TYPE_V128:
>>> + /*
>>> + * The gvec infrastructure is asserts that v128 vector loads
>>> + * and stores use a 16-byte aligned offset. Validate that the
>>> + * final pointer is aligned by using an insn that will SIGSEGV.
>>> + */
>>> tcg_debug_assert(arg >= 16);
>>> - tcg_out_vex_modrm_offset(s, OPC_MOVDQU_WxVx, arg, 0, arg1, arg2);
>>> + tcg_out_vex_modrm_offset(s, OPC_MOVDQA_WxVx, arg, 0, arg1, arg2);
>>> break;
>>> case TCG_TYPE_V256:
>>> + /*
>>> + * The gvec infrastructure only requires 16-byte alignment,
>>> + * so here we must use an unaligned store.
>>> + */
>>> tcg_debug_assert(arg >= 16);
>>> tcg_out_vex_modrm_offset(s, OPC_MOVDQU_WxVx | P_VEXL,
>>> arg, 0, arg1, arg2);
>>>
>>
>> This is the problematic patch. Haven't looked into the details yet, so I
>> can't tell what's wrong. Maybe really an alignemnt issue?
>>
>
> Okay, looks like "vregs" in "struct CPUS390XState" is always aligned to
> 8, but not to 16 bytes.
>
> And that in return is the case, because "CPUS390XState env" is not
> aligned to 16 bytes in "struct S390CPU"
>
>
> !!!!!!!! CPU: 0x55a5e3046ef0
> !!!!!!!! ENV: 0x55a5e304f1a8
> !!!!!!!! VREGS: 0x55a5e304f228
> !!!!!!!! CPU: 0x55a5e3070bb0
> !!!!!!!! ENV: 0x55a5e3078e68
> !!!!!!!! VREGS: 0x55a5e3078ee8
> !!!!!!!! CPU: 0x55a5e3098310
> !!!!!!!! ENV: 0x55a5e30a05c8
> !!!!!!!! VREGS: 0x55a5e30a0648
> !!!!!!!! CPU: 0x55a5e30c0730
> !!!!!!!! ENV: 0x55a5e30c89e8
> !!!!!!!! VREGS: 0x55a5e30c8a68
> !!!!!!!! CPU: 0x55a5e30e7c90
> !!!!!!!! ENV: 0x55a5e30eff48
> !!!!!!!! VREGS: 0x55a5e30effc8
> !!!!!!!! CPU: 0x55a5e310eea0
> !!!!!!!! ENV: 0x55a5e3117158
> !!!!!!!! VREGS: 0x55a5e31171d8
> !!!!!!!! CPU: 0x55a5e31361e0
> !!!!!!!! ENV: 0x55a5e313e498
> !!!!!!!! VREGS: 0x55a5e313e518
> !!!!!!!! CPU: 0x55a5e315d520
> !!!!!!!! ENV: 0x55a5e31657d8
> !!!!!!!! VREGS: 0x55a5e3165858
>
> vregs is defined as:
>
> CPU_DoubleU vregs[32][2];
>
> We either have to switch to a type that has a natural alignment of 16
> bytes, or enforce alignment of "CPUS390XState env" to 16 bytes.
>
> What do you suggest?
FWIW, this seems to be the easiest way:
diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
index f0d9a6a36d..d363ae0fb3 100644
--- a/target/s390x/cpu.h
+++ b/target/s390x/cpu.h
@@ -66,7 +66,7 @@ struct CPUS390XState {
* The floating point registers are part of the vector registers.
* vregs[0][0] -> vregs[15][0] are 16 floating point registers
*/
- CPU_DoubleU vregs[32][2]; /* vector registers */
+ CPU_DoubleU vregs[32][2] QEMU_ALIGNED(16); /* vector registers */
uint32_t aregs[16]; /* access registers */
uint8_t riccb[64]; /* runtime instrumentation control */
uint64_t gscb[4]; /* guarded storage control */
Makes it work for me again.
--
Thanks,
David / dhildenb
next prev parent reply other threads:[~2019-05-28 18:47 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-05-22 22:28 [Qemu-devel] [PULL 00/16] tcg queued patches Richard Henderson
2019-05-22 22:28 ` [Qemu-devel] [PULL 01/16] tcg/i386: Fix dupi/dupm for avx1 and 32-bit hosts Richard Henderson
2019-05-22 22:28 ` [Qemu-devel] [PULL 02/16] tcg: Fix missing checks and clears in tcg_gen_gvec_dup_mem Richard Henderson
2019-05-22 22:28 ` [Qemu-devel] [PULL 03/16] tcg: Add support for vector bitwise select Richard Henderson
2019-05-22 22:28 ` [Qemu-devel] [PULL 04/16] tcg: Add support for vector compare select Richard Henderson
2019-05-22 22:28 ` [Qemu-devel] [PULL 05/16] tcg: Introduce do_op3_nofail for vector expansion Richard Henderson
2019-05-22 22:28 ` [Qemu-devel] [PULL 06/16] tcg: Expand vector minmax using cmp+cmpsel Richard Henderson
2019-05-22 22:28 ` [Qemu-devel] [PULL 07/16] tcg: Add TCG_OPF_NOT_PRESENT if TCG_TARGET_HAS_foo is negative Richard Henderson
2019-05-22 22:28 ` [Qemu-devel] [PULL 08/16] tcg/i386: Support vector comparison select value Richard Henderson
2019-05-30 11:26 ` Peter Maydell
2019-05-30 12:50 ` Richard Henderson
2019-05-30 14:54 ` Aleksandar Markovic
2019-05-30 17:45 ` Richard Henderson
2019-05-30 23:18 ` Aleksandar Markovic
2019-05-22 22:28 ` [Qemu-devel] [PULL 09/16] tcg/i386: Remove expansion for missing minmax Richard Henderson
2019-05-22 22:28 ` [Qemu-devel] [PULL 10/16] tcg/i386: Use umin/umax in expanding unsigned compare Richard Henderson
2019-05-22 22:28 ` [Qemu-devel] [PULL 11/16] tcg/aarch64: Support vector bitwise select value Richard Henderson
2019-05-22 22:28 ` [Qemu-devel] [PULL 12/16] tcg/aarch64: Split up is_fimm Richard Henderson
2019-05-22 22:28 ` [Qemu-devel] [PULL 13/16] tcg/aarch64: Use MVNI in tcg_out_dupi_vec Richard Henderson
2019-05-22 22:28 ` [Qemu-devel] [PULL 14/16] tcg/aarch64: Build vector immediates with two insns Richard Henderson
2019-05-22 22:28 ` [Qemu-devel] [PULL 15/16] tcg/aarch64: Allow immediates for vector ORR and BIC Richard Henderson
2019-05-22 22:28 ` [Qemu-devel] [PULL 16/16] tcg/i386: Use MOVDQA for TCG_TYPE_V128 load/store Richard Henderson
2019-05-28 17:28 ` David Hildenbrand
2019-05-28 18:33 ` David Hildenbrand
2019-05-28 18:46 ` David Hildenbrand [this message]
2019-05-28 21:34 ` Richard Henderson
2019-05-23 8:17 ` [Qemu-devel] [PULL 00/16] tcg queued patches Aleksandar Markovic
2019-05-23 12:42 ` Richard Henderson
2019-05-24 10:43 ` Peter Maydell
2019-05-28 16:58 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=09cb162c-594c-4863-2dfd-ae30b218bc9c@redhat.com \
--to=david@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=richard.henderson@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).