Re: [PULL 47/53] target/i386: reimplement 0x0f 0x28-0x2f, add AVX

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Peter Maydell <peter.maydell@linaro.org>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org, Richard Henderson <richard.henderson@linaro.org>
Subject: Re: [PULL 47/53] target/i386: reimplement 0x0f 0x28-0x2f, add AVX
Date: Tue, 9 May 2023 11:52:55 +0100	[thread overview]
Message-ID: <CAFEAcA_oh9F7KegOPMaRGtkR_JBGDoXokLk4BsovXb56aPP=RQ@mail.gmail.com> (raw)
In-Reply-To: <20221018133042.856368-48-pbonzini@redhat.com>

On Tue, 18 Oct 2022 at 15:32, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> Here the code is a bit uglier due to the truncation and extension
> of registers to and from 32-bit.  There is also a mistake in the
> manual with respect to the size of the memory operand of CVTPS2PI
> and CVTTPS2PI, reported by Ricky Zhou.
>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---

Hi; in bug https://gitlab.com/qemu-project/qemu/-/issues/1637
it's been reported that there's an error in the handling
of the UCOMISS instruction that was added in this commit:

>  static void decode_sse_unary(DisasContext *s, CPUX86State *env, X86OpEntry *entry, uint8_t *b)
>  {
>      if (!(s->prefix & (PREFIX_REPZ | PREFIX_REPNZ))) {
> @@ -746,6 +793,15 @@ static const X86OpEntry opcodes_0F[256] = {
>      [0x76] = X86_OP_ENTRY3(PCMPEQD,    V,x, H,x, W,x,  vex4 mmx avx2_256 p_00_66),
>      [0x77] = X86_OP_GROUP0(0F77),
>
> +    [0x28] = X86_OP_ENTRY3(MOVDQ,      V,x,  None,None, W,x, vex1 p_00_66), /* MOVAPS */
> +    [0x29] = X86_OP_ENTRY3(MOVDQ,      W,x,  None,None, V,x, vex1 p_00_66), /* MOVAPS */
> +    [0x2A] = X86_OP_GROUP0(0F2A),
> +    [0x2B] = X86_OP_GROUP0(0F2B),
> +    [0x2C] = X86_OP_GROUP0(0F2C),
> +    [0x2D] = X86_OP_GROUP0(0F2D),
> +    [0x2E] = X86_OP_ENTRY3(VUCOMI,     None,None, V,x, W,x,  vex4 p_00_66),
> +    [0x2F] = X86_OP_ENTRY3(VCOMI,      None,None, V,x, W,x,  vex4 p_00_66),

I've figured out what's going wrong, but the tables here are
a bit opaque for my level of understanding of the instruction
set. The problem seems to be that we assume that UCOMISS takes
two 64 bit arguments. However, the documentation says that it
operates on the low 32-bits of a register operand and on a
32-bit memory location. (UCOMISD operates on 64-bit values.)
This only causes a problem when the guest code tries to do
a UCOMISS on a 32-bit memory operand that happens to be at
the very end of a page where there is nothing mapped after it.
It looks like QEMU always tries to load a full 128 bits from
memory, and so it causes a segfault that shouldn't happen:
you can see that we do two qemu_ld_i64 from rcx + 0xff4 and
rcx + 0xffc, when we should only be loading a single 32 bit value.

0x400000116f:  0f 2e 81 f4 0f 00 00     ucomiss  0xff4(%rcx), %xmm0

OP:
 ld_i32 loc9,env,$0xfffffffffffffff8
 brcond_i32 loc9,$0x0,lt,$L0

 ---- 000000400000116f 0000000000000000
 add_i64 loc2,rcx,$0xff4
 qemu_ld_i64 loc4,loc2,leq,0
 st_i64 loc4,env,$0xb50
 add_i64 loc3,loc2,$0x8
 qemu_ld_i64 loc4,loc3,leq,0
 st_i64 loc4,env,$0xb58
 add_i64 loc13,env,$0xb50
 add_i64 loc15,env,$0x350
 call ucomiss,$0x0,$0,env,loc15,loc13

(this is from the test case in the bug report)

Presumably the table entry should be marked up somehow to
indicate that the operand size is only 32 bits/64 bits,
but I'm not sure how that should be done. Any suggestions?

thanks
-- PMM

next prev parent reply	other threads:[~2023-05-09 10:53 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-18 13:29 [PULL 00/53] target/i386, scsi, build patches for 2022-10-18 Paolo Bonzini
2022-10-18 13:29 ` [PULL 01/53] virtio-scsi: Send "REPORTED LUNS CHANGED" sense data upon disk hotplug events Paolo Bonzini
2022-10-18 13:29 ` [PULL 02/53] configure: don't enable firmware for targets that are not built Paolo Bonzini
2022-10-18 13:29 ` [PULL 03/53] scsi: Use device_cold_reset() and bus_cold_reset() Paolo Bonzini
2022-10-18 13:29 ` [PULL 04/53] hw/scsi/vmw_pvscsi.c: Use device_cold_reset() to reset SCSI devices Paolo Bonzini
2022-10-18 13:29 ` [PULL 05/53] hyperv: fix SynIC SINT assertion failure on guest reset Paolo Bonzini
2022-10-18 13:29 ` [PULL 06/53] configure: Avoid using strings binary Paolo Bonzini
2022-10-18 13:29 ` [PULL 07/53] target/i386: Use device_cold_reset() to reset the APIC Paolo Bonzini
2022-10-18 13:29 ` [PULL 08/53] target/i386: Save and restore pc_save before tcg_remove_ops_after Paolo Bonzini
2022-10-18 13:29 ` [PULL 09/53] target/i386: Use MMUAccessType across excp_helper.c Paolo Bonzini
2022-10-18 13:29 ` [PULL 10/53] target/i386: Direct call get_hphys from mmu_translate Paolo Bonzini
2022-10-18 13:30 ` [PULL 11/53] target/i386: Introduce structures for mmu_translate Paolo Bonzini
2022-10-18 13:30 ` [PULL 12/53] target/i386: Reorg GET_HPHYS Paolo Bonzini
2022-10-18 13:30 ` [PULL 13/53] target/i386: Add MMU_PHYS_IDX and MMU_NESTED_IDX Paolo Bonzini
2022-10-18 13:30 ` [PULL 14/53] target/i386: Use MMU_NESTED_IDX for vmload/vmsave Paolo Bonzini
2022-10-18 13:30 ` [PULL 15/53] target/i386: Combine 5 sets of variables in mmu_translate Paolo Bonzini
2022-10-18 13:30 ` [PULL 16/53] target/i386: Use atomic operations for pte updates Paolo Bonzini
2022-10-18 13:30 ` [PULL 17/53] target/i386: Use probe_access_full for final stage2 translation Paolo Bonzini
2022-10-18 13:30 ` [PULL 18/53] target/i386: Define XMMReg and access macros, align ZMM registers Paolo Bonzini
2022-10-18 13:30 ` [PULL 19/53] target/i386: make ldo/sto operations consistent with ldq Paolo Bonzini
2022-10-18 13:30 ` [PULL 20/53] target/i386: make rex_w available even in 32-bit mode Paolo Bonzini
2022-10-18 13:30 ` [PULL 21/53] target/i386: add core of new i386 decoder Paolo Bonzini
2022-10-18 13:30 ` [PULL 22/53] target/i386: add ALU load/writeback core Paolo Bonzini
2022-10-18 13:30 ` [PULL 23/53] target/i386: add CPUID[EAX=7,ECX=0].ECX to DisasContext Paolo Bonzini
2022-10-18 13:30 ` [PULL 24/53] target/i386: add CPUID feature checks to new decoder Paolo Bonzini
2022-10-18 13:30 ` [PULL 25/53] target/i386: add AVX_EN hflag Paolo Bonzini
2022-10-18 13:30 ` [PULL 26/53] target/i386: validate VEX prefixes via the instructions' exception classes Paolo Bonzini
2022-10-18 13:30 ` [PULL 27/53] target/i386: validate SSE prefixes directly in the decoding table Paolo Bonzini
2022-10-18 13:30 ` [PULL 28/53] target/i386: move scalar 0F 38 and 0F 3A instruction to new decoder Paolo Bonzini
2022-10-18 13:30 ` [PULL 29/53] target/i386: Prepare ops_sse_header.h for 256 bit AVX Paolo Bonzini
2022-10-18 13:30 ` [PULL 30/53] target/i386: extend helpers to support VEX.V 3- and 4- operand encodings Paolo Bonzini
2022-10-18 13:30 ` [PULL 31/53] target/i386: support operand merging in binary scalar helpers Paolo Bonzini
2022-10-18 13:30 ` [PULL 32/53] target/i386: provide 3-operand versions of unary " Paolo Bonzini
2022-10-18 13:30 ` [PULL 33/53] target/i386: implement additional AVX comparison operators Paolo Bonzini
2022-10-18 13:30 ` [PULL 34/53] target/i386: Introduce 256-bit vector helpers Paolo Bonzini
2022-10-18 13:30 ` [PULL 35/53] target/i386: reimplement 0x0f 0x60-0x6f, add AVX Paolo Bonzini
2022-10-18 13:30 ` [PULL 36/53] target/i386: reimplement 0x0f 0xd8-0xdf, 0xe8-0xef, 0xf8-0xff, " Paolo Bonzini
2022-10-18 13:30 ` [PULL 37/53] target/i386: reimplement 0x0f 0x50-0x5f, " Paolo Bonzini
2022-10-18 13:30 ` [PULL 38/53] target/i386: reimplement 0x0f 0x78-0x7f, " Paolo Bonzini
2022-10-18 13:30 ` [PULL 39/53] target/i386: reimplement 0x0f 0x70-0x77, " Paolo Bonzini
2022-10-18 13:30 ` [PULL 40/53] target/i386: reimplement 0x0f 0xd0-0xd7, 0xe0-0xe7, 0xf0-0xf7, " Paolo Bonzini
2022-10-18 13:30 ` [PULL 41/53] target/i386: clarify (un)signedness of immediates from 0F3Ah opcodes Paolo Bonzini
2022-10-18 13:30 ` [PULL 42/53] target/i386: reimplement 0x0f 0x3a, add AVX Paolo Bonzini
2022-10-18 13:30 ` [PULL 43/53] target/i386: Use tcg gvec ops for pmovmskb Paolo Bonzini
2022-10-18 13:30 ` [PULL 44/53] target/i386: reimplement 0x0f 0x38, add AVX Paolo Bonzini
2022-10-18 13:30 ` [PULL 45/53] target/i386: reimplement 0x0f 0xc2, 0xc4-0xc6, " Paolo Bonzini
2022-10-18 13:30 ` [PULL 46/53] target/i386: reimplement 0x0f 0x10-0x17, " Paolo Bonzini
2022-10-18 13:30 ` [PULL 47/53] target/i386: reimplement 0x0f 0x28-0x2f, " Paolo Bonzini
2023-05-09 10:52   ` Peter Maydell [this message]
2022-10-18 13:30 ` [PULL 48/53] target/i386: implement XSAVE and XRSTOR of AVX registers Paolo Bonzini
2022-10-18 13:30 ` [PULL 49/53] target/i386: implement VLDMXCSR/VSTMXCSR Paolo Bonzini
2022-10-18 13:30 ` [PULL 50/53] target/i386: Enable AVX cpuid bits when using TCG Paolo Bonzini
2022-10-18 13:30 ` [PULL 51/53] tests/tcg: extend SSE tests to AVX Paolo Bonzini
2022-10-18 13:30 ` [PULL 52/53] target/i386: move 3DNow to the new decoder Paolo Bonzini
2022-10-18 13:30 ` [PULL 53/53] target/i386: remove old SSE decoder Paolo Bonzini
2022-10-18 20:01 ` [PULL 00/53] target/i386, scsi, build patches for 2022-10-18 Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFEAcA_oh9F7KegOPMaRGtkR_JBGDoXokLk4BsovXb56aPP=RQ@mail.gmail.com' \
    --to=peter.maydell@linaro.org \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).