* [PATCH 0/1] target/i386: Raise #GP on unaligned m128 accesses when required.
@ 2022-08-29 14:23 Ricky Zhou
  2022-08-29 14:23 ` [PATCH 1/1] " Ricky Zhou
  0 siblings, 1 reply; 5+ messages in thread
From: Ricky Zhou @ 2022-08-29 14:23 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, richard.henderson, eduardo, Ricky Zhou

This is a change to raise #GP on unaligned m128 loads/stores when
required by the spec. Some notes on this change:

1. I considered making use of the existing support for enforcing memory
   alignment (setting MO_ALIGN_16 in the load/store's MemOp), but
   rejected this approach. There are at least two scenarios where we
   might want to do alignment checks in x86:
   a. Loads/stores when the AC flag is enabled (which should raise #AC
      on misalignment)
   b. SSE/AVX instructions which require memory alignment (which raise
      #GP on misalignment)
   The MemOp alignment-checking mechanism can only handle one of these
   scenarios, since they require different exceptions to be raised. I
   think it makes more sense to use the existing memory alignment
   support for implementing (a), since helper_unaligned_{ld,st} already
   triggers SIGBUS in qemu-user. This is why I ended up implementing
   (b) with a helper.

2. It is often the case that legacy SSE instructions require 16-byte
   alignment of 128-bit memory operands, but the AVX versions of the
   same instructions do not (e.g. movsldup requires alignment and
   vmovsldup does not). From what I can tell, QEMU currently doesn't
   appear to report AVX support in cpuid, but it still seems to emulate
   some of these instructions if you tell it to execute them. This
   change attempts to distinguish between legacy SSE instructions and
   AVX instructions by conditioning on !(s->prefix & PREFIX_VEX). Not
   sure this is very future-proof, though - for example, it may need to
   be updated if support for EVEX prefixes is added. LMK if there's a
   nicer way to do this.

3. I tested this by running a Linux VM in qemu-system-x86_64 and
   verifying that movaps on a misaligned address triggers a segfault.

Ricky Zhou (1):
  target/i386: Raise #GP on unaligned m128 accesses when required.

 target/i386/helper.h         |  1 +
 target/i386/tcg/mem_helper.c |  8 ++++++++
 target/i386/tcg/translate.c  | 38 +++++++++++++++++++++++++++++++++---
 3 files changed, 44 insertions(+), 3 deletions(-)

--
2.37.2

^ permalink raw reply	[flat|nested] 5+ messages in thread
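A minimal guest-side reproducer along the lines of note (3) might look
like the following sketch. It is not part of the original posting; the
buffer layout is illustrative, and the expectation is that under the
patched QEMU the movaps faults with SIGSEGV (via #GP) instead of
printing the message:

/* movaps-test.c: load through a deliberately misaligned pointer. */
#include <stdio.h>

int main(void)
{
    /* 16-byte-aligned buffer; buf + 8 is misaligned on purpose. */
    char buf[32] __attribute__((aligned(16)));
    char *p = buf + 8;

    /* movaps requires a 16-byte-aligned memory operand. */
    __asm__ volatile("movaps (%0), %%xmm0" : : "r"(p) : "xmm0", "memory");
    puts("no fault: movaps alignment was not enforced");
    return 0;
}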
* [PATCH 1/1] target/i386: Raise #GP on unaligned m128 accesses when required.
  2022-08-29 14:23 [PATCH 0/1] target/i386: Raise #GP on unaligned m128 accesses when required Ricky Zhou
@ 2022-08-29 14:23 ` Ricky Zhou
  2022-08-29 16:45   ` Richard Henderson
  0 siblings, 1 reply; 5+ messages in thread
From: Ricky Zhou @ 2022-08-29 14:23 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, richard.henderson, eduardo, Ricky Zhou

Many instructions which load/store 128-bit values are supposed to
raise #GP when the memory operand isn't 16-byte aligned. This includes:
- Instructions explicitly requiring memory alignment (Exceptions Type 1
  in the "AVX and SSE Instruction Exception Specification" section of
  the SDM)
- Legacy SSE instructions that load/store 128-bit values (Exceptions
  Types 2 and 4).

This change adds a raise_gp_if_unaligned helper which raises #GP if an
address is not properly aligned. This helper is called before 128-bit
loads/stores where appropriate.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/217
Signed-off-by: Ricky Zhou <ricky@rzhou.org>
---
 target/i386/helper.h         |  1 +
 target/i386/tcg/mem_helper.c |  8 ++++++++
 target/i386/tcg/translate.c  | 38 +++++++++++++++++++++++++++++++++---
 3 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/target/i386/helper.h b/target/i386/helper.h
index ac3b4d1ee3..17d78f2b0d 100644
--- a/target/i386/helper.h
+++ b/target/i386/helper.h
@@ -213,6 +213,7 @@ DEF_HELPER_1(update_mxcsr, void, env)
 DEF_HELPER_1(enter_mmx, void, env)
 DEF_HELPER_1(emms, void, env)
 DEF_HELPER_3(movq, void, env, ptr, ptr)
+DEF_HELPER_3(raise_gp_if_unaligned, void, env, tl, tl)

 #define SHIFT 0
 #include "ops_sse_header.h"
diff --git a/target/i386/tcg/mem_helper.c b/target/i386/tcg/mem_helper.c
index e3cdafd2d4..79259abef3 100644
--- a/target/i386/tcg/mem_helper.c
+++ b/target/i386/tcg/mem_helper.c
@@ -181,3 +181,11 @@ void helper_boundl(CPUX86State *env, target_ulong a0, int v)
         raise_exception_ra(env, EXCP05_BOUND, GETPC());
     }
 }
+
+void helper_raise_gp_if_unaligned(CPUX86State *env, target_ulong addr,
+                                  target_ulong align_mask)
+{
+    if (unlikely((addr & align_mask) != 0)) {
+        raise_exception_ra(env, EXCP0D_GPF, GETPC());
+    }
+}
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index b7972f0ff5..de13f483b6 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3054,7 +3054,7 @@ static const struct SSEOpHelper_epp sse_op_table6[256] = {
     [0x25] = SSE41_OP(pmovsxdq),
     [0x28] = SSE41_OP(pmuldq),
     [0x29] = SSE41_OP(pcmpeqq),
-    [0x2a] = SSE41_SPECIAL, /* movntqda */
+    [0x2a] = SSE41_SPECIAL, /* movntdqa */
     [0x2b] = SSE41_OP(packusdw),
     [0x30] = SSE41_OP(pmovzxbw),
     [0x31] = SSE41_OP(pmovzxbd),
@@ -3194,10 +3194,11 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         break;
     case 0x1e7: /* movntdq */
     case 0x02b: /* movntps */
-    case 0x12b: /* movntps */
+    case 0x12b: /* movntpd */
         if (mod == 3)
             goto illegal_op;
         gen_lea_modrm(env, s, modrm);
+        gen_helper_raise_gp_if_unaligned(cpu_env, s->A0, tcg_const_tl(0xf));
         gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
         break;
     case 0x3f0: /* lddqu */
@@ -3273,6 +3274,11 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
     case 0x26f: /* movdqu xmm, ea */
         if (mod != 3) {
             gen_lea_modrm(env, s, modrm);
+            /* movaps, movapd, movdqa */
+            if (b == 0x028 || b == 0x128 || b == 0x16f) {
+                gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                 tcg_const_tl(0xf));
+            }
             gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
         } else {
             rm = (modrm & 7) | REX_B(s);
@@ -3331,6 +3337,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
     case 0x212: /* movsldup */
         if (mod != 3) {
             gen_lea_modrm(env, s, modrm);
+            if (!(s->prefix & PREFIX_VEX)) {
+                gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                 tcg_const_tl(0xf));
+            }
             gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
         } else {
             rm = (modrm & 7) | REX_B(s);
@@ -3373,6 +3383,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
     case 0x216: /* movshdup */
         if (mod != 3) {
             gen_lea_modrm(env, s, modrm);
+            if (!(s->prefix & PREFIX_VEX)) {
+                gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                 tcg_const_tl(0xf));
+            }
             gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
         } else {
             rm = (modrm & 7) | REX_B(s);
@@ -3465,6 +3479,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
     case 0x27f: /* movdqu ea, xmm */
         if (mod != 3) {
             gen_lea_modrm(env, s, modrm);
+            if (b == 0x029 || b == 0x129 || b == 0x17f) {
+                gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                 tcg_const_tl(0xf));
+            }
             gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
         } else {
             rm = (modrm & 7) | REX_B(s);
@@ -3806,10 +3824,16 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 tcg_gen_st16_tl(s->tmp0, cpu_env, op2_offset +
                                 offsetof(ZMMReg, ZMM_W(0)));
                 break;
-            case 0x2a: /* movntqda */
+            case 0x2a: /* movntdqa */
+                gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                 tcg_const_tl(0xf));
                 gen_ldo_env_A0(s, op1_offset);
                 return;
             default:
+                if (!(s->prefix & PREFIX_VEX)) {
+                    gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                     tcg_const_tl(0xf));
+                }
                 gen_ldo_env_A0(s, op2_offset);
             }
         }
@@ -4351,6 +4375,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             } else {
                 op2_offset = offsetof(CPUX86State,xmm_t0);
                 gen_lea_modrm(env, s, modrm);
+                if (!(s->prefix & PREFIX_VEX)) {
+                    gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                     tcg_const_tl(0xf));
+                }
                 gen_ldo_env_A0(s, op2_offset);
             }
         } else {
@@ -4469,6 +4497,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 break;
             default:
                 /* 128 bit access */
+                if (!(s->prefix & PREFIX_VEX)) {
+                    gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                     tcg_const_tl(0xf));
+                }
                 gen_ldo_env_A0(s, op2_offset);
                 break;
             }
--
2.37.2

^ permalink raw reply related	[flat|nested] 5+ messages in thread
* Re: [PATCH 1/1] target/i386: Raise #GP on unaligned m128 accesses when required.
  2022-08-29 14:23 ` [PATCH 1/1] " Ricky Zhou
@ 2022-08-29 16:45   ` Richard Henderson
  2022-08-29 20:46     ` Ricky Zhou
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Henderson @ 2022-08-29 16:45 UTC (permalink / raw)
  To: Ricky Zhou, qemu-devel; +Cc: pbonzini, eduardo

On 8/29/22 07:23, Ricky Zhou wrote:
> Many instructions which load/store 128-bit values are supposed to
> raise #GP when the memory operand isn't 16-byte aligned. This includes:
> - Instructions explicitly requiring memory alignment (Exceptions Type 1
>   in the "AVX and SSE Instruction Exception Specification" section of
>   the SDM)
> - Legacy SSE instructions that load/store 128-bit values (Exceptions
>   Types 2 and 4).
>
> This change adds a raise_gp_if_unaligned helper which raises #GP if an
> address is not properly aligned. This helper is called before 128-bit
> loads/stores where appropriate.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/217
> Signed-off-by: Ricky Zhou <ricky@rzhou.org>
> ---
>  target/i386/helper.h         |  1 +
>  target/i386/tcg/mem_helper.c |  8 ++++++++
>  target/i386/tcg/translate.c  | 38 +++++++++++++++++++++++++++++++++---
>  3 files changed, 44 insertions(+), 3 deletions(-)

This trap should be raised via the memory operation:

-static inline void gen_ldo_env_A0(DisasContext *s, int offset)
+static inline void gen_ldo_env_A0(DisasContext *s, int offset, bool aligned)
 {
     int mem_index = s->mem_index;
-    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, mem_index, MO_LEUQ);
+    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, mem_index,
+                        MO_LEUQ | (aligned ? MO_ALIGN_16 : 0));
     tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(0)));
     tcg_gen_addi_tl(s->tmp0, s->A0, 8);
     tcg_gen_qemu_ld_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
     tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(1)));
 }

Only the first of the two loads/stores must be aligned, as the other is
known to be +8.

You then must fill in the x86_tcg_ops.do_unaligned_access hook to raise
#GP.

r~

^ permalink raw reply	[flat|nested] 5+ messages in thread
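A do_unaligned_access hook of the kind suggested above might look
roughly like the following sketch. This is not code from the thread; it
assumes raise_exception_ra() is usable from the hook (mirroring how the
posted helper raises #GP), and the real implementation may differ:

static void x86_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
                                        MMUAccessType access_type,
                                        int mmu_idx, uintptr_t retaddr)
{
    X86CPU *cpu = X86_CPU(cs);

    /* Unaligned access through an alignment-checking MemOp: raise #GP,
       unwinding guest state to the faulting instruction via retaddr. */
    raise_exception_ra(&cpu->env, EXCP0D_GPF, retaddr);
}

which would then be wired into the x86 TCGCPUOps, e.g.:

    .do_unaligned_access = x86_cpu_do_unaligned_access,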
* Re: [PATCH 1/1] target/i386: Raise #GP on unaligned m128 accesses when required.
  2022-08-29 16:45   ` Richard Henderson
@ 2022-08-29 20:46     ` Ricky Zhou
  2022-08-29 22:54       ` Richard Henderson
  0 siblings, 1 reply; 5+ messages in thread
From: Ricky Zhou @ 2022-08-29 20:46 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, pbonzini, eduardo

On Mon, Aug 29, 2022 at 9:45 AM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 8/29/22 07:23, Ricky Zhou wrote:
> This trap should be raised via the memory operation:
> ...
> Only the first of the two loads/stores must be aligned, as the other is
> known to be +8.
> You then must fill in the x86_tcg_ops.do_unaligned_access hook to raise
> #GP.

Thanks for taking a look at this - did you see the bit in the cover
letter where I discuss doing this via alignment requirements on the
memory operation? My logic was that the memop alignment checks seem to
be more oriented towards triggering #AC exceptions (even though this is
not currently implemented), since qemu-user's unaligned access handlers
(helper_unaligned_{ld,st}) already trigger SIGBUS as opposed to
SIGSEGV. I was concerned that implementing this via MO_ALIGN_16 would
get in the way of a hypothetical future implementation of the AC flag,
since do_unaligned_access would need to raise #AC instead of #GP for
that.

One slightly more involved way to use alignment on the MemOp could be
to arrange to pass the problematic MemOp to do_unaligned_access and
helper_unaligned_{ld,st}. Then we could allow CPUs to handle
misalignment of different MemOps differently (e.g. raise #GP/SIGSEGV
for certain ops and #AC/SIGBUS for others). For this change to x86, we
could maybe get away with making MO_ALIGN_16 and above trigger
#GP/SIGSEGV and everything else trigger #AC/SIGBUS. If that's a little
hacky, we could instead add some dedicated bits to MemOp that
distinguish different types of unaligned accesses.

What do you think? Happy to implement whichever approach is preferred!

Thanks,
Ricky

^ permalink raw reply	[flat|nested] 5+ messages in thread
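As an illustration of the idea in the second paragraph, if the
offending MemOp were plumbed through to the unaligned-access path, the
target could dispatch on the requested alignment. This is a
hypothetical sketch, not proposed code: QEMU does not pass the MemOp to
this path as of this thread, and #AC delivery is unimplemented:

static void x86_unaligned_access_memop(CPUX86State *env, MemOp mop,
                                       uintptr_t retaddr)
{
    if (get_alignment_bits(mop) >= 4) {
        /* MO_ALIGN_16 or stricter: treat as an SSE-style check -> #GP */
        raise_exception_ra(env, EXCP0D_GPF, retaddr);
    } else {
        /* Ordinary misalignment with EFLAGS.AC set -> #AC */
        raise_exception_ra(env, EXCP11_ALGN, retaddr);
    }
}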
* Re: [PATCH 1/1] target/i386: Raise #GP on unaligned m128 accesses when required.
  2022-08-29 20:46     ` Ricky Zhou
@ 2022-08-29 22:54       ` Richard Henderson
  0 siblings, 0 replies; 5+ messages in thread
From: Richard Henderson @ 2022-08-29 22:54 UTC (permalink / raw)
  To: Ricky Zhou; +Cc: qemu-devel, pbonzini, eduardo

On 8/29/22 13:46, Ricky Zhou wrote:
> Thanks for taking a look at this - did you see the bit in the cover
> letter where I discuss doing this via alignment requirements on the
> memory operation? My logic was that the memop alignment checks seem to
> be more oriented towards triggering #AC exceptions (even though this is
> not currently implemented),

I missed that in the cover. However... implementing #AC is pretty
hypothetical. It's not something that I've ever seen used, and not
something that anyone has asked for.

> One slightly more involved way to use alignment on the MemOp could be
> to arrange to pass the problematic MemOp to do_unaligned_access and
> helper_unaligned_{ld,st}. Then we could allow CPUs to handle
> misalignment of different MemOps differently (e.g. raise #GP/SIGSEGV
> for certain ops and #AC/SIGBUS for others). For this change to x86, we
> could maybe get away with making MO_ALIGN_16 and above trigger
> #GP/SIGSEGV and everything else trigger #AC/SIGBUS. If that's a little
> hacky, we could instead add some dedicated bits to MemOp that
> distinguish different types of unaligned accesses.

There's another related problem that actually has gotten a bug report
in the past: when the form of the address should raise #SS instead of
#GP in system mode.

My initial thought was to record information about "the" memory access
in the per-insn unwind info, until I realized that there are insns with
multiple memory operations requiring different treatment. E.g.
"push (%rax)", where the read might raise #GP and the write might raise
#SS. So I think we'd need to encode #GP vs #SS into the mmu_idx used
(e.g. in the lsb).

However, I don't think there are any similar situations of multiple
memory types affecting SSE, so #AC vs #GP could in fact be encoded into
the per-insn unwind info.

As for SIGBUS vs SIGSEGV for SSE and user-only, you only need to
implement the x86_cpu_ops.record_sigbus hook. C.f. the s390x version
which raises PGM_SPECIFICATION -> SIGILL for unaligned atomic
operations.

r~

^ permalink raw reply	[flat|nested] 5+ messages in thread
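For the user-only case, an x86 record_sigbus hook in the spirit of the
s390x one mentioned above might look like the sketch below. This is
hypothetical: unconditionally converting the SIGBUS into #GP (and hence
SIGSEGV) is only right if every alignment-checked x86 access wants #GP,
which is exactly the open question discussed in this thread:

static void x86_cpu_record_sigbus(CPUState *cs, vaddr addr,
                                  MMUAccessType access_type, uintptr_t ra)
{
    X86CPU *cpu = X86_CPU(cs);

    /* Turn the pending unaligned-access SIGBUS into #GP -> SIGSEGV. */
    raise_exception_ra(&cpu->env, EXCP0D_GPF, ra);
}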
end of thread, other threads:[~2022-08-29 22:55 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --
2022-08-29 14:23 [PATCH 0/1] target/i386: Raise #GP on unaligned m128 accesses when required Ricky Zhou
2022-08-29 14:23 ` [PATCH 1/1] " Ricky Zhou
2022-08-29 16:45   ` Richard Henderson
2022-08-29 20:46     ` Ricky Zhou
2022-08-29 22:54       ` Richard Henderson