qemu-devel.nongnu.org archive mirror
* [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0
@ 2025-12-10 13:16 Paolo Bonzini
  2025-12-10 13:16 ` [PATCH 01/18] target/i386/tcg: fix check for invalid VSIB instruction Paolo Bonzini
                   ` (17 more replies)
  0 siblings, 18 replies; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

This notably includes the last patches from the original set that implemented
the new decoder (cleaning up the x87 decoder a bit), further removal of
temporaries, and further size reduction for the CC computation helpers.  On top
of that there are a few simplifications, fixes and optimizations.

The diffstat is large but most of it is moving code around.

Paolo

Paolo Bonzini (18):
  target/i386/tcg: fix check for invalid VSIB instruction
  target/i386/tcg: ignore V3 in 32-bit mode
  target/i386/tcg: update cc_op after PUSHF
  target/i386/tcg: mark more instructions that are invalid in 64-bit mode
  target/i386/tcg: do not compute all flags for SAHF
  target/i386/tcg: remove do_decode_0F
  target/i386/tcg: move and expand misplaced comment
  target/i386/tcg: simplify effective address calculation
  target/i386/tcg: unnest switch statements in disas_insn_x87
  target/i386/tcg: move fcom/fcomp differentiation to gen_helper_fp_arith_ST0_FT0
  target/i386/tcg: reuse gen_helper_fp_arith_ST0_FT0 for fcom STn and fcomp STn
  target/i386/tcg: reuse gen_helper_fp_arith_ST0_FT0 for undocumented fcom/fcomp variants
  target/i386/tcg: unify more pop/no-pop x87 instructions
  target/i386/tcg: kill tmp1_i64
  target/i386/tcg: kill tmp2_i32
  target/i386/tcg: commonize code to compute SF/ZF/PF
  target/i386/tcg: add a CCOp for SBB x,x
  target/i386/tcg: move fetch code out of translate.c

 target/i386/cpu.h                        |  17 +-
 target/i386/tcg/decode-new.h             |   3 +
 target/i386/tcg/cc_helper_template.h.inc | 112 +--
 target/i386/cpu-dump.c                   |   2 +
 target/i386/tcg/cc_helper.c              | 280 +++++---
 target/i386/tcg/translate.c              | 824 ++++++++---------------
 target/i386/tcg/decode-new.c.inc         | 328 ++++++++-
 target/i386/tcg/emit.c.inc               | 109 ++-
 8 files changed, 845 insertions(+), 830 deletions(-)

-- 
2.52.0




* [PATCH 01/18] target/i386/tcg: fix check for invalid VSIB instruction
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 15:47   ` Richard Henderson
  2025-12-10 13:16 ` [PATCH 02/18] target/i386/tcg: ignore V3 in 32-bit mode Paolo Bonzini
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable

VSIB instructions (VEX class 12) must not have an address prefix.
Checking s->aflag == MO_16 is not enough because in 64-bit mode
the address prefix changes aflag to MO_32.  Add a specific check
bit instead.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/decode-new.h     |  3 +++
 target/i386/tcg/decode-new.c.inc | 27 +++++++++++++--------------
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index 7f23d373ea7..38882b5c6ab 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -181,6 +181,9 @@ typedef enum X86InsnCheck {
     /* Vendor-specific checks for Intel/AMD differences */
     X86_CHECK_i64_amd = 2048,
     X86_CHECK_o64_intel = 4096,
+
+    /* No 0x67 prefix allowed */
+    X86_CHECK_no_adr = 8192,
 } X86InsnCheck;
 
 typedef enum X86InsnSpecial {
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 0f8c5d16938..0b85b0f6513 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -623,10 +623,10 @@ static const X86OpEntry opcodes_0F38_00toEF[240] = {
     [0x46] = X86_OP_ENTRY3(VPSRAV,      V,x,  H,x,       W,x,  vex6 chk(W0) cpuid(AVX2) p_66),
     [0x47] = X86_OP_ENTRY3(VPSLLV,      V,x,  H,x,       W,x,  vex6 cpuid(AVX2) p_66),
 
-    [0x90] = X86_OP_ENTRY3(VPGATHERD, V,x,  H,x,  M,d,  vex12 cpuid(AVX2) p_66), /* vpgatherdd/q */
-    [0x91] = X86_OP_ENTRY3(VPGATHERQ, V,x,  H,x,  M,q,  vex12 cpuid(AVX2) p_66), /* vpgatherqd/q */
-    [0x92] = X86_OP_ENTRY3(VPGATHERD, V,x,  H,x,  M,d,  vex12 cpuid(AVX2) p_66), /* vgatherdps/d */
-    [0x93] = X86_OP_ENTRY3(VPGATHERQ, V,x,  H,x,  M,q,  vex12 cpuid(AVX2) p_66), /* vgatherqps/d */
+    [0x90] = X86_OP_ENTRY3(VPGATHERD, V,x,  H,x,  M,d,  vex12 chk(no_adr) cpuid(AVX2) p_66), /* vpgatherdd/q */
+    [0x91] = X86_OP_ENTRY3(VPGATHERQ, V,x,  H,x,  M,q,  vex12 chk(no_adr) cpuid(AVX2) p_66), /* vpgatherqd/q */
+    [0x92] = X86_OP_ENTRY3(VPGATHERD, V,x,  H,x,  M,d,  vex12 chk(no_adr) cpuid(AVX2) p_66), /* vgatherdps/d */
+    [0x93] = X86_OP_ENTRY3(VPGATHERQ, V,x,  H,x,  M,q,  vex12 chk(no_adr) cpuid(AVX2) p_66), /* vgatherqps/d */
 
     /* Should be exception type 2 but they do not have legacy SSE equivalents? */
     [0x96] = X86_OP_ENTRY3(VFMADDSUB132Px, V,x,  H,x, W,x,  vex6 cpuid(FMA) p_66),
@@ -2435,8 +2435,8 @@ static bool validate_vex(DisasContext *s, X86DecodedInsn *decode)
         break;
     case 12:
         /* Must have a VSIB byte and no address prefix.  */
-        assert(s->has_modrm);
-        if ((s->modrm & 7) != 4 || s->aflag == MO_16) {
+        assert(s->has_modrm && (decode->e.check & X86_CHECK_no_adr));
+        if ((s->modrm & 7) != 4) {
             goto illegal;
         }
 
@@ -2740,15 +2740,14 @@ static void disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
         }
-        if (decode.e.check & X86_CHECK_prot_or_vm86) {
-            if (!PE(s)) {
-                goto illegal_op;
-            }
+        if ((decode.e.check & X86_CHECK_prot_or_vm86) && !PE(s)) {
+            goto illegal_op;
         }
-        if (decode.e.check & X86_CHECK_no_vm86) {
-            if (VM86(s)) {
-                goto illegal_op;
-            }
+        if ((decode.e.check & X86_CHECK_no_vm86) && VM86(s)) {
+            goto illegal_op;
+        }
+        if ((decode.e.check & X86_CHECK_no_adr) && (s->prefix & PREFIX_ADR)) {
+            goto illegal_op;
         }
     }
 
-- 
2.52.0
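
The reasoning in this commit message can be illustrated with a small
standalone sketch.  It is not QEMU code: ADDR_16/ADDR_32/ADDR_64 are
hypothetical stand-ins for MO_16/MO_32/MO_64, and aflag_for() only models
32-bit and 64-bit code segments.  It shows why testing the effective address
size misses a 0x67 prefix in 64-bit mode, while testing the prefix itself
does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for MO_16, MO_32 and MO_64. */
enum addr_size { ADDR_16, ADDR_32, ADDR_64 };

/*
 * Effective address size for a 32-bit or 64-bit code segment.
 * A 0x67 prefix selects the next smaller size in both cases.
 */
static enum addr_size aflag_for(bool code64, bool has_adr_prefix)
{
    if (code64) {
        return has_adr_prefix ? ADDR_32 : ADDR_64;
    }
    return has_adr_prefix ? ADDR_16 : ADDR_32;
}

/* Old test: reject only when the effective address size is 16 bits. */
static bool old_check_rejects(enum addr_size aflag)
{
    assert(aflag == ADDR_16 || aflag == ADDR_32 || aflag == ADDR_64);
    return aflag == ADDR_16;
}

/* New test: reject whenever the address prefix is present. */
static bool new_check_rejects(bool has_adr_prefix)
{
    return has_adr_prefix;
}
```

In this model a prefixed VSIB instruction in 64-bit mode has an effective
address size of 32 bits, so the old test lets it through and the new test
rejects it.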




* [PATCH 02/18] target/i386/tcg: ignore V3 in 32-bit mode
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
  2025-12-10 13:16 ` [PATCH 01/18] target/i386/tcg: fix check for invalid VSIB instruction Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 15:52   ` Richard Henderson
  2025-12-10 13:16 ` [PATCH 03/18] target/i386/tcg: update cc_op after PUSHF Paolo Bonzini
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable

From the manual: "In 64-bit mode all 4 bits may be used. [...]
In 32-bit and 16-bit modes bit 6 must be 1 (if bit 6 is not 1, the
2-byte VEX version will generate LDS instruction and the 3-byte VEX
version will ignore this bit)."

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/decode-new.c.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 0b85b0f6513..c9b4d5ffa32 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -2665,7 +2665,7 @@ static void disas_insn(DisasContext *s, CPUState *cpu)
                     goto unknown_op;
                 }
             }
-            s->vex_v = (~vex3 >> 3) & 0xf;
+            s->vex_v = (~vex3 >> 3) & (CODE64(s) ? 15 : 7);
             s->vex_l = (vex3 >> 2) & 1;
             s->prefix |= pp_prefix[vex3 & 3] | PREFIX_VEX;
         }
-- 
2.52.0
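
As a rough standalone illustration of the one-line change above, the sketch
below extracts the inverted vvvv field from the last byte of a 3-byte VEX
prefix.  The function name decode_vex_v is made up for this example; only the
shift and the two masks mirror the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * vex3 is the last byte of a 3-byte VEX prefix: W vvvv L pp.
 * vvvv is stored inverted in bits 6..3.  Outside 64-bit mode the top
 * bit of vvvv (bit 6 of the byte) is ignored, so only 3 bits are kept.
 */
static unsigned decode_vex_v(uint8_t vex3, bool code64)
{
    unsigned mask = code64 ? 15u : 7u;
    unsigned v = (~(unsigned)vex3 >> 3) & mask;

    assert(v <= mask);
    return v;
}
```

With vex3 = 0x38 (vvvv = 0111, bit 6 clear) the sketch yields register 8 in
64-bit mode and register 0 otherwise, the same as vvvv = 1111.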




* [PATCH 03/18] target/i386/tcg: update cc_op after PUSHF
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
  2025-12-10 13:16 ` [PATCH 01/18] target/i386/tcg: fix check for invalid VSIB instruction Paolo Bonzini
  2025-12-10 13:16 ` [PATCH 02/18] target/i386/tcg: ignore V3 in 32-bit mode Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 15:55   ` Richard Henderson
  2025-12-10 13:16 ` [PATCH 04/18] target/i386/tcg: mark more instructions that are invalid in 64-bit mode Paolo Bonzini
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

PUSHF needs to compute the full eflags, so set the cc_op to
CC_OP_EFLAGS afterwards.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/emit.c.inc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 1a7fab9333a..22e53f5b000 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -3250,6 +3250,8 @@ static void gen_PUSHF(DisasContext *s, X86DecodedInsn *decode)
     gen_update_cc_op(s);
     gen_helper_read_eflags(s->T0, tcg_env);
     gen_push_v(s, s->T0);
+    decode->cc_src = s->T0;
+    decode->cc_op = CC_OP_EFLAGS;
 }
 
 static MemOp gen_shift_count(DisasContext *s, X86DecodedInsn *decode,
-- 
2.52.0
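
The idea of recording already-computed flags can be sketched with a toy
lazy-flags model.  Everything here is hypothetical and far simpler than the
TCG code: the struct, the two-entry cc_op enum and the evaluation counter
exist only for illustration, and only CF, ZF and SF are modelled.

```c
#include <assert.h>
#include <stdint.h>

enum { CC_C = 0x001, CC_Z = 0x040, CC_S = 0x080 };

/* Toy condition-code states: flags already materialized, or pending ADD. */
enum cc_op { OP_EFLAGS, OP_ADDL };

struct lazy_flags {
    enum cc_op op;
    uint32_t src;          /* OP_EFLAGS: the flags; OP_ADDL: second operand */
    uint32_t dst;          /* OP_ADDL: result of the addition */
    unsigned evaluations;  /* how often flags were recomputed from operands */
};

static uint32_t compute_flags(struct lazy_flags *f)
{
    uint32_t flags = 0;

    switch (f->op) {
    case OP_EFLAGS:
        return f->src;                 /* nothing to recompute */
    case OP_ADDL:
        f->evaluations++;
        if (f->dst < f->src) {
            flags |= CC_C;             /* the 32-bit addition wrapped */
        }
        if (f->dst == 0) {
            flags |= CC_Z;
        }
        if (f->dst & 0x80000000u) {
            flags |= CC_S;
        }
        return flags;
    }
    assert(0 && "unknown cc_op");
    return 0;
}

/* Materialize the flags once and remember the result for later users. */
static uint32_t pushf_and_cache(struct lazy_flags *f)
{
    uint32_t flags = compute_flags(f);

    f->src = flags;
    f->op = OP_EFLAGS;
    return flags;
}
```

In this model, a flag consumer that runs after pushf_and_cache() reads the
stored value instead of recomputing it from the operands.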




* [PATCH 04/18] target/i386/tcg: mark more instructions that are invalid in 64-bit mode
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
                   ` (2 preceding siblings ...)
  2025-12-10 13:16 ` [PATCH 03/18] target/i386/tcg: update cc_op after PUSHF Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 15:59   ` Richard Henderson
  2025-12-10 13:16 ` [PATCH 05/18] target/i386/tcg: do not compute all flags for SAHF Paolo Bonzini
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/decode-new.c.inc | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index c9b4d5ffa32..213dbb9637c 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -1698,9 +1698,9 @@ static const X86OpEntry opcodes_root[256] = {
     [0xD1] = X86_OP_GROUP1(group2, E,v),
     [0xD2] = X86_OP_GROUP2(group2, E,b, 1,b), /* CL */
     [0xD3] = X86_OP_GROUP2(group2, E,v, 1,b), /* CL */
-    [0xD4] = X86_OP_ENTRY2(AAM, 0,w, I,b),
-    [0xD5] = X86_OP_ENTRY2(AAD, 0,w, I,b),
-    [0xD6] = X86_OP_ENTRYw(SALC, 0,b),
+    [0xD4] = X86_OP_ENTRY2(AAM, 0,w, I,b, chk(i64)),
+    [0xD5] = X86_OP_ENTRY2(AAD, 0,w, I,b, chk(i64)),
+    [0xD6] = X86_OP_ENTRYw(SALC, 0,b, chk(i64)),
     [0xD7] = X86_OP_ENTRY1(XLAT, 0,b, zextT0), /* AL read/written */
 
     [0xE0] = X86_OP_ENTRYr(LOOPNE, J,b), /* implicit: CX with aflag size */
@@ -1834,7 +1834,7 @@ static const X86OpEntry opcodes_root[256] = {
     [0xCB] = X86_OP_ENTRY0(RETF),
     [0xCC] = X86_OP_ENTRY0(INT3),
     [0xCD] = X86_OP_ENTRYr(INT, I,b,  chk(vm86_iopl)),
-    [0xCE] = X86_OP_ENTRY0(INTO),
+    [0xCE] = X86_OP_ENTRY0(INTO, chk(i64)),
     [0xCF] = X86_OP_ENTRY0(IRET,      chk(vm86_iopl) svm(IRET)),
 
     /*
-- 
2.52.0




* [PATCH 05/18] target/i386/tcg: do not compute all flags for SAHF
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
                   ` (3 preceding siblings ...)
  2025-12-10 13:16 ` [PATCH 04/18] target/i386/tcg: mark more instructions that are invalid in 64-bit mode Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 16:03   ` Richard Henderson
  2025-12-10 13:16 ` [PATCH 06/18] target/i386/tcg: remove do_decode_0F Paolo Bonzini
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

Only OF is needed; the other flags are overwritten.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/emit.c.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 22e53f5b000..131aefce53c 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -3778,7 +3778,7 @@ static void gen_SAHF(DisasContext *s, X86DecodedInsn *decode)
         return gen_illegal_opcode(s);
     }
     tcg_gen_shri_tl(s->T0, cpu_regs[R_EAX], 8);
-    gen_compute_eflags(s);
+    gen_neg_setcc(s, JCC_O << 1, cpu_cc_src);
     tcg_gen_andi_tl(cpu_cc_src, cpu_cc_src, CC_O);
     tcg_gen_andi_tl(s->T0, s->T0, CC_S | CC_Z | CC_A | CC_P | CC_C);
     tcg_gen_or_tl(cpu_cc_src, cpu_cc_src, s->T0);
-- 
2.52.0
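
The flag merge that SAHF performs can be written out on plain integers.  This
is a minimal sketch and not the TCG code; the function name sahf_flags is
invented, while the CC_* values are the architectural EFLAGS bit positions.
Judging from the diff, the new code obtains OF alone and masks it with CC_O,
which matches the first operand of the OR below.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural EFLAGS bit positions of the arithmetic flags. */
enum {
    CC_C = 0x0001, CC_P = 0x0004, CC_A = 0x0010,
    CC_Z = 0x0040, CC_S = 0x0080, CC_O = 0x0800,
};

/* SAHF: keep OF from the old flags, take SF/ZF/AF/PF/CF from AH. */
static uint32_t sahf_flags(uint32_t old_flags, uint8_t ah)
{
    const uint32_t from_ah = CC_S | CC_Z | CC_A | CC_P | CC_C;
    uint32_t result = (old_flags & CC_O) | (ah & from_ah);

    /* No bit outside the six arithmetic flags can be set. */
    assert((result & ~(CC_O | from_ah)) == 0);
    return result;
}
```

Since every old flag other than OF is masked away, only OF has to be known
before the merge.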




* [PATCH 06/18] target/i386/tcg: remove do_decode_0F
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
                   ` (4 preceding siblings ...)
  2025-12-10 13:16 ` [PATCH 05/18] target/i386/tcg: do not compute all flags for SAHF Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 16:03   ` Richard Henderson
  2025-12-10 13:16 ` [PATCH 07/18] target/i386/tcg: move and expand misplaced comment Paolo Bonzini
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

It is not needed anymore since all prefixes are handled by the
new decoder.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/decode-new.c.inc | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 213dbb9637c..ea8e26f7f98 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -1430,15 +1430,10 @@ static const X86OpEntry opcodes_0F[256] = {
     [0xff] = X86_OP_ENTRYr(UD,     nop,v),                        /* UD0 */
 };
 
-static void do_decode_0F(DisasContext *s, CPUX86State *env, X86OpEntry *entry, uint8_t *b)
-{
-    *entry = opcodes_0F[*b];
-}
-
 static void decode_0F(DisasContext *s, CPUX86State *env, X86OpEntry *entry, uint8_t *b)
 {
     *b = x86_ldub_code(env, s);
-    do_decode_0F(s, env, entry, b);
+    *entry = opcodes_0F[*b];
 }
 
 static void decode_63(DisasContext *s, CPUX86State *env, X86OpEntry *entry, uint8_t *b)
-- 
2.52.0




* [PATCH 07/18] target/i386/tcg: move and expand misplaced comment
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
                   ` (5 preceding siblings ...)
  2025-12-10 13:16 ` [PATCH 06/18] target/i386/tcg: remove do_decode_0F Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 16:04   ` Richard Henderson
  2025-12-10 13:16 ` [PATCH 08/18] target/i386/tcg: simplify effective address calculation Paolo Bonzini
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/decode-new.c.inc | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index ea8e26f7f98..9d17bae7e75 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -1878,16 +1878,11 @@ static const X86OpEntry opcodes_root[256] = {
 #undef vex12
 #undef vex13
 
-/*
- * Decode the fixed part of the opcode and place the last
- * in b.
- */
 static void decode_root(DisasContext *s, CPUX86State *env, X86OpEntry *entry, uint8_t *b)
 {
     *entry = opcodes_root[*b];
 }
 
-
 static int decode_modrm(DisasContext *s, CPUX86State *env,
                         X86DecodedInsn *decode, X86DecodedOp *op)
 {
@@ -2222,6 +2217,10 @@ static bool decode_insn(DisasContext *s, CPUX86State *env, X86DecodeFunc decode_
 {
     X86OpEntry *e = &decode->e;
 
+    /*
+     * Each step decodes part of the opcode and place the last not-fully-decoded
+     * byte in decode->b.  If the modrm byte is read, it is placed in s->modrm.
+     */
     decode_func(s, env, e, &decode->b);
     while (e->is_decode) {
         e->is_decode = false;
-- 
2.52.0




* [PATCH 08/18] target/i386/tcg: simplify effective address calculation
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
                   ` (6 preceding siblings ...)
  2025-12-10 13:16 ` [PATCH 07/18] target/i386/tcg: move and expand misplaced comment Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 16:15   ` Richard Henderson
  2025-12-10 13:16 ` [PATCH 09/18] target/i386/tcg: unnest switch statements in disas_insn_x87 Paolo Bonzini
                   ` (9 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

Split gen_lea_v_seg_dest into three simple phases (extend from
16 bits, add, final extend), with an optimization for known-zero bases
to avoid back-to-back extensions.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/translate.c | 64 ++++++++++++-------------------------
 1 file changed, 20 insertions(+), 44 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 0cb87d02012..2ab3c2ac663 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -627,54 +627,30 @@ static TCGv eip_cur_tl(DisasContext *s)
 static void gen_lea_v_seg_dest(DisasContext *s, MemOp aflag, TCGv dest, TCGv a0,
                                int def_seg, int ovr_seg)
 {
-    switch (aflag) {
-#ifdef TARGET_X86_64
-    case MO_64:
-        if (ovr_seg < 0) {
-            tcg_gen_mov_tl(dest, a0);
-            return;
+    int easize;
+    bool has_base;
+
+    if (ovr_seg < 0) {
+        ovr_seg = def_seg;
+    }
+
+    has_base = ovr_seg >= 0 && (ADDSEG(s) || ovr_seg >= R_FS);
+    easize = CODE64(s) ? MO_64 : MO_32;
+
+    if (has_base) {
+        if (aflag < easize) {
+            /* Truncate before summing base.  */
+            tcg_gen_ext_tl(dest, a0, aflag);
+            a0 = dest;
         }
-        break;
-#endif
-    case MO_32:
-        /* 32 bit address */
-        if (ovr_seg < 0 && ADDSEG(s)) {
-            ovr_seg = def_seg;
-        }
-        if (ovr_seg < 0) {
-            tcg_gen_ext32u_tl(dest, a0);
-            return;
-        }
-        break;
-    case MO_16:
-        /* 16 bit address */
-        tcg_gen_ext16u_tl(dest, a0);
+        tcg_gen_add_tl(dest, a0, cpu_seg_base[ovr_seg]);
         a0 = dest;
-        if (ovr_seg < 0) {
-            if (ADDSEG(s)) {
-                ovr_seg = def_seg;
-            } else {
-                return;
-            }
-        }
-        break;
-    default:
-        g_assert_not_reached();
+    } else {
+        /* Possibly one extension, but that's it.  */
+        easize = aflag;
     }
 
-    if (ovr_seg >= 0) {
-        TCGv seg = cpu_seg_base[ovr_seg];
-
-        if (aflag == MO_64) {
-            tcg_gen_add_tl(dest, a0, seg);
-        } else if (CODE64(s)) {
-            tcg_gen_ext32u_tl(dest, a0);
-            tcg_gen_add_tl(dest, dest, seg);
-        } else {
-            tcg_gen_add_tl(dest, a0, seg);
-            tcg_gen_ext32u_tl(dest, dest);
-        }
-    }
+    tcg_gen_ext_tl(dest, a0, easize);
 }
 
 static void gen_lea_v_seg(DisasContext *s, TCGv a0,
-- 
2.52.0
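
The three phases can be modelled on plain 64-bit integers.  The sketch below
is only an approximation of the new gen_lea_v_seg_dest: linear_address and
zext are hypothetical names, sizes are given in bits rather than as MemOp
values, and has_base is passed in directly instead of being derived from the
segment override and ADDSEG.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Zero-extend the low `bits` bits of v; bits must be 16, 32 or 64. */
static uint64_t zext(uint64_t v, int bits)
{
    assert(bits == 16 || bits == 32 || bits == 64);
    return bits == 64 ? v : v & ((UINT64_C(1) << bits) - 1);
}

static uint64_t linear_address(uint64_t a0, int aflag_bits, bool code64,
                               bool has_base, uint64_t seg_base)
{
    int easize = code64 ? 64 : 32;

    if (has_base) {
        if (aflag_bits < easize) {
            /* Phase 1: truncate the offset before adding the base. */
            a0 = zext(a0, aflag_bits);
        }
        /* Phase 2: add the segment base. */
        a0 += seg_base;
    } else {
        /* Known-zero base: a single extension to the address size. */
        easize = aflag_bits;
    }
    /* Phase 3: final extension. */
    return zext(a0, easize);
}
```

With a 32-bit address size and a base, the sum wraps at 32 bits outside
64-bit mode, whereas in 64-bit mode the offset is truncated first and the sum
is kept in full.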




* [PATCH 09/18] target/i386/tcg: unnest switch statements in disas_insn_x87
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
                   ` (7 preceding siblings ...)
  2025-12-10 13:16 ` [PATCH 08/18] target/i386/tcg: simplify effective address calculation Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 16:20   ` Richard Henderson
  2025-12-10 13:16 ` [PATCH 10/18] target/i386/tcg: move fcom/fcomp differentiation to gen_helper_fp_arith_ST0_FT0 Paolo Bonzini
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/translate.c | 290 +++++++++++++++++-------------------
 1 file changed, 134 insertions(+), 156 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 2ab3c2ac663..c755329b3d9 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2457,36 +2457,32 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
 
         switch (op) {
         case 0x00 ... 0x07: /* fxxxs */
-        case 0x10 ... 0x17: /* fixxxl */
-        case 0x20 ... 0x27: /* fxxxl */
-        case 0x30 ... 0x37: /* fixxx */
-            {
-                int op1;
-                op1 = op & 7;
+            tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+                                s->mem_index, MO_LEUL);
+            gen_helper_flds_FT0(tcg_env, s->tmp2_i32);
+            goto fp_arith_ST0_FT0;
 
-                switch (op >> 4) {
-                case 0:
-                    tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
-                                        s->mem_index, MO_LEUL);
-                    gen_helper_flds_FT0(tcg_env, s->tmp2_i32);
-                    break;
-                case 1:
-                    tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
-                                        s->mem_index, MO_LEUL);
-                    gen_helper_fildl_FT0(tcg_env, s->tmp2_i32);
-                    break;
-                case 2:
-                    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0,
-                                        s->mem_index, MO_LEUQ);
-                    gen_helper_fldl_FT0(tcg_env, s->tmp1_i64);
-                    break;
-                case 3:
-                default:
-                    tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
-                                        s->mem_index, MO_LESW);
-                    gen_helper_fildl_FT0(tcg_env, s->tmp2_i32);
-                    break;
-                }
+        case 0x10 ... 0x17: /* fixxxl */
+            tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+                                s->mem_index, MO_LEUL);
+            gen_helper_fildl_FT0(tcg_env, s->tmp2_i32);
+            goto fp_arith_ST0_FT0;
+
+        case 0x20 ... 0x27: /* fxxxl */
+            tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0,
+                                s->mem_index, MO_LEUQ);
+            gen_helper_fldl_FT0(tcg_env, s->tmp1_i64);
+            goto fp_arith_ST0_FT0;
+
+        case 0x30 ... 0x37: /* fixxx */
+            tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+                                s->mem_index, MO_LESW);
+            gen_helper_fildl_FT0(tcg_env, s->tmp2_i32);
+            goto fp_arith_ST0_FT0;
+
+fp_arith_ST0_FT0:
+            {
+                int op1 = op & 7;
 
                 gen_helper_fp_arith_ST0_FT0(op1);
                 if (op1 == 3) {
@@ -2495,88 +2491,78 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
                 }
             }
             break;
+
         case 0x08: /* flds */
-        case 0x0a: /* fsts */
-        case 0x0b: /* fstps */
-        case 0x18 ... 0x1b: /* fildl, fisttpl, fistl, fistpl */
-        case 0x28 ... 0x2b: /* fldl, fisttpll, fstl, fstpl */
-        case 0x38 ... 0x3b: /* filds, fisttps, fists, fistps */
-            switch (op & 7) {
-            case 0:
-                switch (op >> 4) {
-                case 0:
-                    tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
-                                        s->mem_index, MO_LEUL);
-                    gen_helper_flds_ST0(tcg_env, s->tmp2_i32);
-                    break;
-                case 1:
-                    tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
-                                        s->mem_index, MO_LEUL);
-                    gen_helper_fildl_ST0(tcg_env, s->tmp2_i32);
-                    break;
-                case 2:
-                    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0,
-                                        s->mem_index, MO_LEUQ);
-                    gen_helper_fldl_ST0(tcg_env, s->tmp1_i64);
-                    break;
-                case 3:
-                default:
-                    tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
-                                        s->mem_index, MO_LESW);
-                    gen_helper_fildl_ST0(tcg_env, s->tmp2_i32);
-                    break;
-                }
-                break;
-            case 1:
-                /* XXX: the corresponding CPUID bit must be tested ! */
-                switch (op >> 4) {
-                case 1:
-                    gen_helper_fisttl_ST0(s->tmp2_i32, tcg_env);
-                    tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
-                                        s->mem_index, MO_LEUL);
-                    break;
-                case 2:
-                    gen_helper_fisttll_ST0(s->tmp1_i64, tcg_env);
-                    tcg_gen_qemu_st_i64(s->tmp1_i64, s->A0,
-                                        s->mem_index, MO_LEUQ);
-                    break;
-                case 3:
-                default:
-                    gen_helper_fistt_ST0(s->tmp2_i32, tcg_env);
-                    tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
-                                        s->mem_index, MO_LEUW);
-                    break;
-                }
+            tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+                                s->mem_index, MO_LEUL);
+            gen_helper_flds_ST0(tcg_env, s->tmp2_i32);
+            break;
+        case 0x18: /* fildl */
+            tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+                                s->mem_index, MO_LEUL);
+            gen_helper_fildl_ST0(tcg_env, s->tmp2_i32);
+            break;
+        case 0x28: /* fldl */
+            tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0,
+                                s->mem_index, MO_LEUQ);
+            gen_helper_fldl_ST0(tcg_env, s->tmp1_i64);
+            break;
+        case 0x38: /* filds */
+            tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+                                s->mem_index, MO_LESW);
+            gen_helper_fildl_ST0(tcg_env, s->tmp2_i32);
+            break;
+
+        case 0x19: /* fisttpl */
+            gen_helper_fisttl_ST0(s->tmp2_i32, tcg_env);
+            tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
+                                s->mem_index, MO_LEUL);
+            gen_helper_fpop(tcg_env);
+            break;
+        case 0x29: /* fisttpll */
+            gen_helper_fisttll_ST0(s->tmp1_i64, tcg_env);
+            tcg_gen_qemu_st_i64(s->tmp1_i64, s->A0,
+                                s->mem_index, MO_LEUQ);
+            gen_helper_fpop(tcg_env);
+            break;
+        case 0x39: /* fisttps */
+            gen_helper_fistt_ST0(s->tmp2_i32, tcg_env);
+            tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
+                                s->mem_index, MO_LEUW);
+            gen_helper_fpop(tcg_env);
+            break;
+
+        case 0x0a: case 0x0b: /* fsts, fstps */
+            gen_helper_fsts_ST0(s->tmp2_i32, tcg_env);
+            tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
+                                s->mem_index, MO_LEUL);
+            if ((op & 7) == 3) {
+                gen_helper_fpop(tcg_env);
+            }
+            break;
+        case 0x1a: case 0x1b: /* fistl, fistpl */
+            gen_helper_fistl_ST0(s->tmp2_i32, tcg_env);
+            tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
+                                s->mem_index, MO_LEUL);
+            if ((op & 7) == 3) {
+                gen_helper_fpop(tcg_env);
+            }
+            break;
+        case 0x2a: case 0x2b: /* fstl, fstpl */
+            gen_helper_fstl_ST0(s->tmp1_i64, tcg_env);
+            tcg_gen_qemu_st_i64(s->tmp1_i64, s->A0,
+                                s->mem_index, MO_LEUQ);
+            if ((op & 7) == 3) {
+                gen_helper_fpop(tcg_env);
+            }
+            break;
+
+        case 0x3a: case 0x3b: /* fists, fistps */
+            gen_helper_fist_ST0(s->tmp2_i32, tcg_env);
+            tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
+                                s->mem_index, MO_LEUW);
+            if ((op & 7) == 3) {
                 gen_helper_fpop(tcg_env);
-                break;
-            default:
-                switch (op >> 4) {
-                case 0:
-                    gen_helper_fsts_ST0(s->tmp2_i32, tcg_env);
-                    tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
-                                        s->mem_index, MO_LEUL);
-                    break;
-                case 1:
-                    gen_helper_fistl_ST0(s->tmp2_i32, tcg_env);
-                    tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
-                                        s->mem_index, MO_LEUL);
-                    break;
-                case 2:
-                    gen_helper_fstl_ST0(s->tmp1_i64, tcg_env);
-                    tcg_gen_qemu_st_i64(s->tmp1_i64, s->A0,
-                                        s->mem_index, MO_LEUQ);
-                    break;
-                case 3:
-                default:
-                    gen_helper_fist_ST0(s->tmp2_i32, tcg_env);
-                    tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
-                                        s->mem_index, MO_LEUW);
-                    break;
-                }
-                if ((op & 7) == 3) {
-                    gen_helper_fpop(tcg_env);
-                }
-                break;
             }
             break;
         case 0x0c: /* fldenv mem */
@@ -2707,39 +2693,37 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
             }
             break;
         case 0x0d: /* grp d9/5 */
-            {
-                switch (rm) {
-                case 0:
-                    gen_helper_fpush(tcg_env);
-                    gen_helper_fld1_ST0(tcg_env);
-                    break;
-                case 1:
-                    gen_helper_fpush(tcg_env);
-                    gen_helper_fldl2t_ST0(tcg_env);
-                    break;
-                case 2:
-                    gen_helper_fpush(tcg_env);
-                    gen_helper_fldl2e_ST0(tcg_env);
-                    break;
-                case 3:
-                    gen_helper_fpush(tcg_env);
-                    gen_helper_fldpi_ST0(tcg_env);
-                    break;
-                case 4:
-                    gen_helper_fpush(tcg_env);
-                    gen_helper_fldlg2_ST0(tcg_env);
-                    break;
-                case 5:
-                    gen_helper_fpush(tcg_env);
-                    gen_helper_fldln2_ST0(tcg_env);
-                    break;
-                case 6:
-                    gen_helper_fpush(tcg_env);
-                    gen_helper_fldz_ST0(tcg_env);
-                    break;
-                default:
-                    goto illegal_op;
-                }
+            switch (rm) {
+            case 0:
+                gen_helper_fpush(tcg_env);
+                gen_helper_fld1_ST0(tcg_env);
+                break;
+            case 1:
+                gen_helper_fpush(tcg_env);
+                gen_helper_fldl2t_ST0(tcg_env);
+                break;
+            case 2:
+                gen_helper_fpush(tcg_env);
+                gen_helper_fldl2e_ST0(tcg_env);
+                break;
+            case 3:
+                gen_helper_fpush(tcg_env);
+                gen_helper_fldpi_ST0(tcg_env);
+                break;
+            case 4:
+                gen_helper_fpush(tcg_env);
+                gen_helper_fldlg2_ST0(tcg_env);
+                break;
+            case 5:
+                gen_helper_fpush(tcg_env);
+                gen_helper_fldln2_ST0(tcg_env);
+                break;
+            case 6:
+                gen_helper_fpush(tcg_env);
+                gen_helper_fldz_ST0(tcg_env);
+                break;
+            default:
+                goto illegal_op;
             }
             break;
         case 0x0e: /* grp d9/6 */
@@ -2801,22 +2785,16 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
             }
             break;
         case 0x00: case 0x01: case 0x04 ... 0x07: /* fxxx st, sti */
+            gen_helper_fmov_FT0_STN(tcg_env,
+                                    tcg_constant_i32(opreg));
+            gen_helper_fp_arith_ST0_FT0(op & 7);
+            break;
+
         case 0x20: case 0x21: case 0x24 ... 0x27: /* fxxx sti, st */
         case 0x30: case 0x31: case 0x34 ... 0x37: /* fxxxp sti, st */
-            {
-                int op1;
-
-                op1 = op & 7;
-                if (op >= 0x20) {
-                    gen_helper_fp_arith_STN_ST0(op1, opreg);
-                    if (op >= 0x30) {
-                        gen_helper_fpop(tcg_env);
-                    }
-                } else {
-                    gen_helper_fmov_FT0_STN(tcg_env,
-                                            tcg_constant_i32(opreg));
-                    gen_helper_fp_arith_ST0_FT0(op1);
-                }
+            gen_helper_fp_arith_STN_ST0(op & 7, opreg);
+            if (op >= 0x30) {
+                gen_helper_fpop(tcg_env);
             }
             break;
         case 0x02: /* fcom */
-- 
2.52.0




* [PATCH 10/18] target/i386/tcg: move fcom/fcomp differentiation to gen_helper_fp_arith_ST0_FT0
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
                   ` (8 preceding siblings ...)
  2025-12-10 13:16 ` [PATCH 09/18] target/i386/tcg: unnest switch statements in disas_insn_x87 Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 16:21   ` Richard Henderson
  2025-12-10 13:16 ` [PATCH 11/18] target/i386/tcg: reuse gen_helper_fp_arith_ST0_FT0 for fcom STn and fcomp STn Paolo Bonzini
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

There is only one call site for gen_helper_fp_arith_ST0_FT0(), so the
op1 == 3 check (fcomp needs a pop) does not have to live in the caller
and can move into the function itself.  Once this is done, eliminate the
goto to that call site.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/translate.c | 23 ++++++++---------------
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index c755329b3d9..3c55b62bdec 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -1485,6 +1485,7 @@ static void gen_helper_fp_arith_ST0_FT0(int op)
         break;
     case 3:
         gen_helper_fcom_ST0_FT0(tcg_env);
+        gen_helper_fpop(tcg_env);
         break;
     case 4:
         gen_helper_fsub_ST0_FT0(tcg_env);
@@ -2460,36 +2461,28 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
             tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
                                 s->mem_index, MO_LEUL);
             gen_helper_flds_FT0(tcg_env, s->tmp2_i32);
-            goto fp_arith_ST0_FT0;
+            gen_helper_fp_arith_ST0_FT0(op & 7);
+            break;
 
         case 0x10 ... 0x17: /* fixxxl */
             tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
                                 s->mem_index, MO_LEUL);
             gen_helper_fildl_FT0(tcg_env, s->tmp2_i32);
-            goto fp_arith_ST0_FT0;
+            gen_helper_fp_arith_ST0_FT0(op & 7);
+            break;
 
         case 0x20 ... 0x27: /* fxxxl */
             tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0,
                                 s->mem_index, MO_LEUQ);
             gen_helper_fldl_FT0(tcg_env, s->tmp1_i64);
-            goto fp_arith_ST0_FT0;
+            gen_helper_fp_arith_ST0_FT0(op & 7);
+            break;
 
         case 0x30 ... 0x37: /* fixxx */
             tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
                                 s->mem_index, MO_LESW);
             gen_helper_fildl_FT0(tcg_env, s->tmp2_i32);
-            goto fp_arith_ST0_FT0;
-
-fp_arith_ST0_FT0:
-            {
-                int op1 = op & 7;
-
-                gen_helper_fp_arith_ST0_FT0(op1);
-                if (op1 == 3) {
-                    /* fcomp needs pop */
-                    gen_helper_fpop(tcg_env);
-                }
-            }
+            gen_helper_fp_arith_ST0_FT0(op & 7);
             break;
 
         case 0x08: /* flds */
-- 
2.52.0
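
A minimal sketch of the idea behind this patch, written as a standalone toy
model rather than QEMU code.  The names toy_fpu, toy_fpop and
toy_fp_arith_st0_ft0 are hypothetical; only the op numbering (2 = fcom,
3 = fcomp) is taken from the diff above.  It shows the pop being decided
inside the dispatcher instead of by the caller:

```c
#include <assert.h>

/* Toy stand-in for the x87 register stack: only the depth is tracked. */
struct toy_fpu {
    int depth;     /* number of live stack entries */
    int compares;  /* number of compares issued so far */
};

/* Hypothetical counterpart of gen_helper_fpop(): drop ST0. */
static void toy_fpop(struct toy_fpu *f)
{
    assert(f->depth > 0);
    f->depth--;
}

/*
 * Hypothetical counterpart of gen_helper_fp_arith_ST0_FT0(): the low
 * three bits of the opcode select the operation, and the fcomp case
 * pops by itself, so callers no longer test for op == 3.
 */
static void toy_fp_arith_st0_ft0(struct toy_fpu *f, int op)
{
    switch (op & 7) {
    case 2: /* fcom: compare, keep ST0 */
        f->compares++;
        break;
    case 3: /* fcomp: compare, then pop */
        f->compares++;
        toy_fpop(f);
        break;
    default: /* fadd, fmul, fsub, ...: result stays in ST0 */
        break;
    }
}
```

With this shape every caller reduces to "load FT0, call the dispatcher,
break", which is what lets the shared goto label go away.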




* [PATCH 11/18] target/i386/tcg: reuse gen_helper_fp_arith_ST0_FT0 for fcom STn and fcomp STn
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
                   ` (9 preceding siblings ...)
  2025-12-10 13:16 ` [PATCH 10/18] target/i386/tcg: move fcom/fcomp differentiation to gen_helper_fp_arith_ST0_FT0 Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 16:24   ` Richard Henderson
  2025-12-10 13:16 ` [PATCH 12/18] target/i386/tcg: reuse gen_helper_fp_arith_ST0_FT0 for undocumented fcom/fcomp variants Paolo Bonzini
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

Special-case the undocumented ops instead of the two d8/0 opcodes that
have undocumented variants: just call gen_helper_fp_arith_ST0_FT0 for
all opcodes in the d8/0 encoding.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/translate.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 3c55b62bdec..8f50071a4f4 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2777,7 +2777,7 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
                 break;
             }
             break;
-        case 0x00: case 0x01: case 0x04 ... 0x07: /* fxxx st, sti */
+        case 0x00 ... 0x07: /* fxxx st, sti */
             gen_helper_fmov_FT0_STN(tcg_env,
                                     tcg_constant_i32(opreg));
             gen_helper_fp_arith_ST0_FT0(op & 7);
@@ -2790,12 +2790,10 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
                 gen_helper_fpop(tcg_env);
             }
             break;
-        case 0x02: /* fcom */
         case 0x22: /* fcom2, undocumented op */
             gen_helper_fmov_FT0_STN(tcg_env, tcg_constant_i32(opreg));
             gen_helper_fcom_ST0_FT0(tcg_env);
             break;
-        case 0x03: /* fcomp */
         case 0x23: /* fcomp3, undocumented op */
         case 0x32: /* fcomp5, undocumented op */
             gen_helper_fmov_FT0_STN(tcg_env, tcg_constant_i32(opreg));
-- 
2.52.0




* [PATCH 12/18] target/i386/tcg: reuse gen_helper_fp_arith_ST0_FT0 for undocumented fcom/fcomp variants
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
                   ` (10 preceding siblings ...)
  2025-12-10 13:16 ` [PATCH 11/18] target/i386/tcg: reuse gen_helper_fp_arith_ST0_FT0 for fcom STn and fcomp STn Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 16:26   ` Richard Henderson
  2025-12-10 13:16 ` [PATCH 13/18] target/i386/tcg: unify more pop/no-pop x87 instructions Paolo Bonzini
                   ` (5 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

For 0x32, rewrite the op so that it is handled as fcomp; the other
variants need no special handling at all, because their low three bits
already select the right operation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/translate.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 8f50071a4f4..f47bb5de8b3 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2777,7 +2777,12 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
                 break;
             }
             break;
+        case 0x32: /* fcomp5, undocumented op */
+            /* map to fcomp; op & 7 == 2 would not pop  */
+            op = 0x03;
+            /* fallthrough */
         case 0x00 ... 0x07: /* fxxx st, sti */
+        case 0x22 ... 0x23: /* fcom2 and fcomp3, undocumented ops */
             gen_helper_fmov_FT0_STN(tcg_env,
                                     tcg_constant_i32(opreg));
             gen_helper_fp_arith_ST0_FT0(op & 7);
@@ -2790,16 +2795,6 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
                 gen_helper_fpop(tcg_env);
             }
             break;
-        case 0x22: /* fcom2, undocumented op */
-            gen_helper_fmov_FT0_STN(tcg_env, tcg_constant_i32(opreg));
-            gen_helper_fcom_ST0_FT0(tcg_env);
-            break;
-        case 0x23: /* fcomp3, undocumented op */
-        case 0x32: /* fcomp5, undocumented op */
-            gen_helper_fmov_FT0_STN(tcg_env, tcg_constant_i32(opreg));
-            gen_helper_fcom_ST0_FT0(tcg_env);
-            gen_helper_fpop(tcg_env);
-            break;
         case 0x15: /* da/5 */
             switch (rm) {
             case 1: /* fucompp */
-- 
2.52.0
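
To make the 0x32 remapping concrete, here is a small standalone sketch.  The
function toy_arith_selector is hypothetical and not part of QEMU; it assumes
the x87 op value is a 6-bit number and that the selector is op & 7, as in the
diff above:

```c
#include <assert.h>

/*
 * Hypothetical helper: return the 3-bit selector that the shared
 * arithmetic dispatcher would see for register-form opcode `op`.
 */
static int toy_arith_selector(int op)
{
    /* Assumption: op is built from 3 opcode bits and 3 ModRM bits. */
    assert(op >= 0 && op <= 0x3f);

    if (op == 0x32) {
        /*
         * fcomp5: op & 7 would be 2 (fcom, no pop), but the
         * instruction has to pop, so treat it as fcomp.
         */
        op = 0x03;
    }
    return op & 7;
}
```

For 0x22 (fcom2) and 0x23 (fcomp3) the low bits are already 2 and 3, which is
why those two can simply join the existing case.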




* [PATCH 13/18] target/i386/tcg: unify more pop/no-pop x87 instructions
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
                   ` (11 preceding siblings ...)
  2025-12-10 13:16 ` [PATCH 12/18] target/i386/tcg: reuse gen_helper_fp_arith_ST0_FT0 for undocumented fcom/fcomp variants Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-10 13:16 ` [PATCH 14/18] target/i386/tcg: kill tmp1_i64 Paolo Bonzini
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/translate.c | 49 ++++++++++++++-----------------------
 1 file changed, 18 insertions(+), 31 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index f47bb5de8b3..8cd70456a51 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2828,44 +2828,55 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
             }
             break;
         case 0x1d: /* fucomi */
+        case 0x3d: /* fucomip */
             if (!(s->cpuid_features & CPUID_CMOV)) {
                 goto illegal_op;
             }
             gen_update_cc_op(s);
             gen_helper_fmov_FT0_STN(tcg_env, tcg_constant_i32(opreg));
             gen_helper_fucomi_ST0_FT0(tcg_env);
+            if (op >= 0x30) {
+                gen_helper_fpop(tcg_env);
+            }
             assume_cc_op(s, CC_OP_EFLAGS);
             break;
         case 0x1e: /* fcomi */
+        case 0x3e: /* fcomip */
             if (!(s->cpuid_features & CPUID_CMOV)) {
                 goto illegal_op;
             }
             gen_update_cc_op(s);
             gen_helper_fmov_FT0_STN(tcg_env, tcg_constant_i32(opreg));
             gen_helper_fcomi_ST0_FT0(tcg_env);
+            if (op >= 0x30) {
+                gen_helper_fpop(tcg_env);
+            }
             assume_cc_op(s, CC_OP_EFLAGS);
             break;
         case 0x28: /* ffree sti */
+        case 0x38: /* ffreep sti, undocumented op */
             gen_helper_ffree_STN(tcg_env, tcg_constant_i32(opreg));
+            if (op >= 0x30) {
+                gen_helper_fpop(tcg_env);
+            }
             break;
         case 0x2a: /* fst sti */
-            gen_helper_fmov_STN_ST0(tcg_env, tcg_constant_i32(opreg));
-            break;
         case 0x2b: /* fstp sti */
         case 0x0b: /* fstp1 sti, undocumented op */
         case 0x3a: /* fstp8 sti, undocumented op */
         case 0x3b: /* fstp9 sti, undocumented op */
             gen_helper_fmov_STN_ST0(tcg_env, tcg_constant_i32(opreg));
-            gen_helper_fpop(tcg_env);
+            if (op != 0x2a) {
+                gen_helper_fpop(tcg_env);
+            }
             break;
         case 0x2c: /* fucom st(i) */
-            gen_helper_fmov_FT0_STN(tcg_env, tcg_constant_i32(opreg));
-            gen_helper_fucom_ST0_FT0(tcg_env);
-            break;
         case 0x2d: /* fucomp st(i) */
             gen_helper_fmov_FT0_STN(tcg_env, tcg_constant_i32(opreg));
             gen_helper_fucom_ST0_FT0(tcg_env);
-            gen_helper_fpop(tcg_env);
+            if (op == 0x2d) {
+                gen_helper_fpop(tcg_env);
+            }
             break;
         case 0x33: /* de/3 */
             switch (rm) {
@@ -2879,10 +2890,6 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
                 goto illegal_op;
             }
             break;
-        case 0x38: /* ffreep sti, undocumented op */
-            gen_helper_ffree_STN(tcg_env, tcg_constant_i32(opreg));
-            gen_helper_fpop(tcg_env);
-            break;
         case 0x3c: /* df/4 */
             switch (rm) {
             case 0:
@@ -2894,26 +2901,6 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
                 goto illegal_op;
             }
             break;
-        case 0x3d: /* fucomip */
-            if (!(s->cpuid_features & CPUID_CMOV)) {
-                goto illegal_op;
-            }
-            gen_update_cc_op(s);
-            gen_helper_fmov_FT0_STN(tcg_env, tcg_constant_i32(opreg));
-            gen_helper_fucomi_ST0_FT0(tcg_env);
-            gen_helper_fpop(tcg_env);
-            assume_cc_op(s, CC_OP_EFLAGS);
-            break;
-        case 0x3e: /* fcomip */
-            if (!(s->cpuid_features & CPUID_CMOV)) {
-                goto illegal_op;
-            }
-            gen_update_cc_op(s);
-            gen_helper_fmov_FT0_STN(tcg_env, tcg_constant_i32(opreg));
-            gen_helper_fcomi_ST0_FT0(tcg_env);
-            gen_helper_fpop(tcg_env);
-            assume_cc_op(s, CC_OP_EFLAGS);
-            break;
         case 0x10 ... 0x13: /* fcmovxx */
         case 0x18 ... 0x1b:
             {
-- 
2.52.0
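
This patch carries no commit message, so the following standalone sketch
summarizes the pop conditions that the merged cases use.  toy_pops_after is a
hypothetical name and the function is not QEMU code; the opcode values and
the three conditions are copied from the diff above:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: for the opcodes merged in this patch, report
 * whether the shared case body ends with a pop.
 */
static bool toy_pops_after(int op)
{
    /* Assumption: op is a 6-bit value. */
    assert(op >= 0 && op <= 0x3f);

    switch (op) {
    case 0x1d: case 0x3d: /* fucomi / fucomip */
    case 0x1e: case 0x3e: /* fcomi / fcomip */
    case 0x28: case 0x38: /* ffree / ffreep */
        return op >= 0x30;
    case 0x2a:            /* fst sti */
    case 0x2b: case 0x0b: /* fstp sti, fstp1 */
    case 0x3a: case 0x3b: /* fstp8, fstp9 */
        return op != 0x2a;
    case 0x2c: case 0x2d: /* fucom / fucomp */
        return op == 0x2d;
    default:
        return false;     /* not covered by this sketch */
    }
}
```

In each group the pop and no-pop forms share all other code, so a single
comparison on op replaces a duplicated case body.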




* [PATCH 14/18] target/i386/tcg: kill tmp1_i64
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
                   ` (12 preceding siblings ...)
  2025-12-10 13:16 ` [PATCH 13/18] target/i386/tcg: unify more pop/no-pop x87 instructions Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 16:28   ` Richard Henderson
  2025-12-10 13:16 ` [PATCH 15/18] target/i386/tcg: kill tmp2_i32 Paolo Bonzini
                   ` (3 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/translate.c | 66 ++++++++++++++++++++--------------
 target/i386/tcg/emit.c.inc  | 72 ++++++++++++++++++++++---------------
 2 files changed, 84 insertions(+), 54 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 8cd70456a51..108276f4008 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -136,7 +136,6 @@ typedef struct DisasContext {
 
     /* TCG local register indexes (only used inside old micro ops) */
     TCGv_i32 tmp2_i32;
-    TCGv_i64 tmp1_i64;
 
     sigjmp_buf jmpbuf;
     TCGOp *prev_insn_start;
@@ -2365,14 +2364,18 @@ static void gen_jmp_rel_csize(DisasContext *s, int diff, int tb_num)
 
 static inline void gen_ldq_env_A0(DisasContext *s, int offset)
 {
-    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, s->mem_index, MO_LEUQ);
-    tcg_gen_st_i64(s->tmp1_i64, tcg_env, offset);
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_qemu_ld_i64(t, s->A0, s->mem_index, MO_LEUQ);
+    tcg_gen_st_i64(t, tcg_env, offset);
 }
 
 static inline void gen_stq_env_A0(DisasContext *s, int offset)
 {
-    tcg_gen_ld_i64(s->tmp1_i64, tcg_env, offset);
-    tcg_gen_qemu_st_i64(s->tmp1_i64, s->A0, s->mem_index, MO_LEUQ);
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_ld_i64(t, tcg_env, offset);
+    tcg_gen_qemu_st_i64(t, s->A0, s->mem_index, MO_LEUQ);
 }
 
 static inline void gen_ldo_env_A0(DisasContext *s, int offset, bool align)
@@ -2452,6 +2455,7 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
         TCGv ea = gen_lea_modrm_1(s, decode->mem, false);
         TCGv last_addr = tcg_temp_new();
         bool update_fdp = true;
+        TCGv_i64 t64;
 
         tcg_gen_mov_tl(last_addr, ea);
         gen_lea_v_seg(s, ea, decode->mem.def_seg, s->override);
@@ -2472,9 +2476,10 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
             break;
 
         case 0x20 ... 0x27: /* fxxxl */
-            tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0,
+            t64 = tcg_temp_new_i64();
+            tcg_gen_qemu_ld_i64(t64, s->A0,
                                 s->mem_index, MO_LEUQ);
-            gen_helper_fldl_FT0(tcg_env, s->tmp1_i64);
+            gen_helper_fldl_FT0(tcg_env, t64);
             gen_helper_fp_arith_ST0_FT0(op & 7);
             break;
 
@@ -2496,9 +2501,10 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
             gen_helper_fildl_ST0(tcg_env, s->tmp2_i32);
             break;
         case 0x28: /* fldl */
-            tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0,
+            t64 = tcg_temp_new_i64();
+            tcg_gen_qemu_ld_i64(t64, s->A0,
                                 s->mem_index, MO_LEUQ);
-            gen_helper_fldl_ST0(tcg_env, s->tmp1_i64);
+            gen_helper_fldl_ST0(tcg_env, t64);
             break;
         case 0x38: /* filds */
             tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
@@ -2513,8 +2519,9 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
             gen_helper_fpop(tcg_env);
             break;
         case 0x29: /* fisttpll */
-            gen_helper_fisttll_ST0(s->tmp1_i64, tcg_env);
-            tcg_gen_qemu_st_i64(s->tmp1_i64, s->A0,
+            t64 = tcg_temp_new_i64();
+            gen_helper_fisttll_ST0(t64, tcg_env);
+            tcg_gen_qemu_st_i64(t64, s->A0,
                                 s->mem_index, MO_LEUQ);
             gen_helper_fpop(tcg_env);
             break;
@@ -2542,8 +2549,9 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
             }
             break;
         case 0x2a: case 0x2b: /* fstl, fstpl */
-            gen_helper_fstl_ST0(s->tmp1_i64, tcg_env);
-            tcg_gen_qemu_st_i64(s->tmp1_i64, s->A0,
+            t64 = tcg_temp_new_i64();
+            gen_helper_fstl_ST0(t64, tcg_env);
+            tcg_gen_qemu_st_i64(t64, s->A0,
                                 s->mem_index, MO_LEUQ);
             if ((op & 7) == 3) {
                 gen_helper_fpop(tcg_env);
@@ -2611,13 +2619,15 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
             gen_helper_fpop(tcg_env);
             break;
         case 0x3d: /* fildll */
-            tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0,
+            t64 = tcg_temp_new_i64();
+            tcg_gen_qemu_ld_i64(t64, s->A0,
                                 s->mem_index, MO_LEUQ);
-            gen_helper_fildll_ST0(tcg_env, s->tmp1_i64);
+            gen_helper_fildll_ST0(tcg_env, t64);
             break;
         case 0x3f: /* fistpll */
-            gen_helper_fistll_ST0(s->tmp1_i64, tcg_env);
-            tcg_gen_qemu_st_i64(s->tmp1_i64, s->A0,
+            t64 = tcg_temp_new_i64();
+            gen_helper_fistll_ST0(t64, tcg_env);
+            tcg_gen_qemu_st_i64(t64, s->A0,
                                 s->mem_index, MO_LEUQ);
             gen_helper_fpop(tcg_env);
             break;
@@ -2951,6 +2961,7 @@ static void gen_multi0F(DisasContext *s, X86DecodedInsn *decode)
     int modrm = s->modrm;
     MemOp ot;
     int reg, rm, mod, op;
+    TCGv_i64 t64;
 
     /* now check op code */
     switch (b) {
@@ -3142,9 +3153,10 @@ static void gen_multi0F(DisasContext *s, X86DecodedInsn *decode)
                 || (s->prefix & (PREFIX_DATA | PREFIX_REPZ | PREFIX_REPNZ))) {
                 goto illegal_op;
             }
+            t64 = tcg_temp_new_i64();
             tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[R_ECX]);
-            gen_helper_xgetbv(s->tmp1_i64, tcg_env, s->tmp2_i32);
-            tcg_gen_extr_i64_tl(cpu_regs[R_EAX], cpu_regs[R_EDX], s->tmp1_i64);
+            gen_helper_xgetbv(t64, tcg_env, s->tmp2_i32);
+            tcg_gen_extr_i64_tl(cpu_regs[R_EAX], cpu_regs[R_EDX], t64);
             break;
 
         case 0xd1: /* xsetbv */
@@ -3156,10 +3168,11 @@ static void gen_multi0F(DisasContext *s, X86DecodedInsn *decode)
             if (!check_cpl0(s)) {
                 break;
             }
-            tcg_gen_concat_tl_i64(s->tmp1_i64, cpu_regs[R_EAX],
+            t64 = tcg_temp_new_i64();
+            tcg_gen_concat_tl_i64(t64, cpu_regs[R_EAX],
                                   cpu_regs[R_EDX]);
             tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[R_ECX]);
-            gen_helper_xsetbv(tcg_env, s->tmp2_i32, s->tmp1_i64);
+            gen_helper_xsetbv(tcg_env, s->tmp2_i32, t64);
             /* End TB because translation flags may change.  */
             s->base.is_jmp = DISAS_EOB_NEXT;
             break;
@@ -3319,18 +3332,20 @@ static void gen_multi0F(DisasContext *s, X86DecodedInsn *decode)
             if (s->prefix & (PREFIX_DATA | PREFIX_REPZ | PREFIX_REPNZ)) {
                 goto illegal_op;
             }
+            t64 = tcg_temp_new_i64();
             tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[R_ECX]);
-            gen_helper_rdpkru(s->tmp1_i64, tcg_env, s->tmp2_i32);
-            tcg_gen_extr_i64_tl(cpu_regs[R_EAX], cpu_regs[R_EDX], s->tmp1_i64);
+            gen_helper_rdpkru(t64, tcg_env, s->tmp2_i32);
+            tcg_gen_extr_i64_tl(cpu_regs[R_EAX], cpu_regs[R_EDX], t64);
             break;
         case 0xef: /* wrpkru */
             if (s->prefix & (PREFIX_DATA | PREFIX_REPZ | PREFIX_REPNZ)) {
                 goto illegal_op;
             }
-            tcg_gen_concat_tl_i64(s->tmp1_i64, cpu_regs[R_EAX],
+            t64 = tcg_temp_new_i64();
+            tcg_gen_concat_tl_i64(t64, cpu_regs[R_EAX],
                                   cpu_regs[R_EDX]);
             tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[R_ECX]);
-            gen_helper_wrpkru(tcg_env, s->tmp2_i32, s->tmp1_i64);
+            gen_helper_wrpkru(tcg_env, s->tmp2_i32, t64);
             break;
 
         CASE_MODRM_OP(6): /* lmsw */
@@ -3722,7 +3737,6 @@ static void i386_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cpu)
     dc->T1 = tcg_temp_new();
     dc->A0 = tcg_temp_new();
 
-    dc->tmp1_i64 = tcg_temp_new_i64();
     dc->tmp2_i32 = tcg_temp_new_i32();
     dc->cc_srcT = tcg_temp_new();
 }
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 131aefce53c..8dac4d09da1 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -521,10 +521,12 @@ static void gen_3dnow(DisasContext *s, X86DecodedInsn *decode)
 
     gen_helper_enter_mmx(tcg_env);
     if (fn == FN_3DNOW_MOVE) {
-       tcg_gen_ld_i64(s->tmp1_i64, tcg_env, decode->op[1].offset);
-       tcg_gen_st_i64(s->tmp1_i64, tcg_env, decode->op[0].offset);
+        TCGv_i64 t = tcg_temp_new_i64();
+
+        tcg_gen_ld_i64(t, tcg_env, decode->op[1].offset);
+        tcg_gen_st_i64(t, tcg_env, decode->op[0].offset);
     } else {
-       fn(tcg_env, OP_PTR0, OP_PTR1);
+        fn(tcg_env, OP_PTR0, OP_PTR1);
     }
 }
 
@@ -2596,10 +2598,11 @@ static void gen_MOVQ(DisasContext *s, X86DecodedInsn *decode)
 {
     int vec_len = vector_len(s, decode);
     int lo_ofs = vector_elem_offset(&decode->op[0], MO_64, 0);
+    TCGv_i64 t = tcg_temp_new_i64();
 
-    tcg_gen_ld_i64(s->tmp1_i64, tcg_env, decode->op[2].offset);
+    tcg_gen_ld_i64(t, tcg_env, decode->op[2].offset);
     if (decode->op[0].has_ea) {
-        tcg_gen_qemu_st_i64(s->tmp1_i64, s->A0, s->mem_index, MO_LEUQ);
+        tcg_gen_qemu_st_i64(t, s->A0, s->mem_index, MO_LEUQ);
     } else {
         /*
          * tcg_gen_gvec_dup_i64(MO_64, op0.offset, 8, vec_len, s->tmp1_64) would
@@ -2610,7 +2613,7 @@ static void gen_MOVQ(DisasContext *s, X86DecodedInsn *decode)
          * it disqualifies using oprsz < maxsz to emulate VEX128.
          */
         tcg_gen_gvec_dup_imm(MO_64, decode->op[0].offset, vec_len, vec_len, 0);
-        tcg_gen_st_i64(s->tmp1_i64, tcg_env, lo_ofs);
+        tcg_gen_st_i64(t, tcg_env, lo_ofs);
     }
 }
 
@@ -4505,10 +4508,12 @@ static void gen_VMASKMOVPS_st(DisasContext *s, X86DecodedInsn *decode)
 
 static void gen_VMOVHPx_ld(DisasContext *s, X86DecodedInsn *decode)
 {
+    TCGv_i64 t = tcg_temp_new_i64();
+
     gen_ldq_env_A0(s, decode->op[0].offset + offsetof(XMMReg, XMM_Q(1)));
     if (decode->op[0].offset != decode->op[1].offset) {
-        tcg_gen_ld_i64(s->tmp1_i64, tcg_env, decode->op[1].offset + offsetof(XMMReg, XMM_Q(0)));
-        tcg_gen_st_i64(s->tmp1_i64, tcg_env, decode->op[0].offset + offsetof(XMMReg, XMM_Q(0)));
+        tcg_gen_ld_i64(t, tcg_env, decode->op[1].offset + offsetof(XMMReg, XMM_Q(0)));
+        tcg_gen_st_i64(t, tcg_env, decode->op[0].offset + offsetof(XMMReg, XMM_Q(0)));
     }
 }
 
@@ -4519,33 +4524,39 @@ static void gen_VMOVHPx_st(DisasContext *s, X86DecodedInsn *decode)
 
 static void gen_VMOVHPx(DisasContext *s, X86DecodedInsn *decode)
 {
+    TCGv_i64 t = tcg_temp_new_i64();
+
     if (decode->op[0].offset != decode->op[2].offset) {
-        tcg_gen_ld_i64(s->tmp1_i64, tcg_env, decode->op[2].offset + offsetof(XMMReg, XMM_Q(1)));
-        tcg_gen_st_i64(s->tmp1_i64, tcg_env, decode->op[0].offset + offsetof(XMMReg, XMM_Q(1)));
+        tcg_gen_ld_i64(t, tcg_env, decode->op[2].offset + offsetof(XMMReg, XMM_Q(1)));
+        tcg_gen_st_i64(t, tcg_env, decode->op[0].offset + offsetof(XMMReg, XMM_Q(1)));
     }
     if (decode->op[0].offset != decode->op[1].offset) {
-        tcg_gen_ld_i64(s->tmp1_i64, tcg_env, decode->op[1].offset + offsetof(XMMReg, XMM_Q(0)));
-        tcg_gen_st_i64(s->tmp1_i64, tcg_env, decode->op[0].offset + offsetof(XMMReg, XMM_Q(0)));
+        tcg_gen_ld_i64(t, tcg_env, decode->op[1].offset + offsetof(XMMReg, XMM_Q(0)));
+        tcg_gen_st_i64(t, tcg_env, decode->op[0].offset + offsetof(XMMReg, XMM_Q(0)));
     }
 }
 
 static void gen_VMOVHLPS(DisasContext *s, X86DecodedInsn *decode)
 {
-    tcg_gen_ld_i64(s->tmp1_i64, tcg_env, decode->op[2].offset + offsetof(XMMReg, XMM_Q(1)));
-    tcg_gen_st_i64(s->tmp1_i64, tcg_env, decode->op[0].offset + offsetof(XMMReg, XMM_Q(0)));
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_ld_i64(t, tcg_env, decode->op[2].offset + offsetof(XMMReg, XMM_Q(1)));
+    tcg_gen_st_i64(t, tcg_env, decode->op[0].offset + offsetof(XMMReg, XMM_Q(0)));
     if (decode->op[0].offset != decode->op[1].offset) {
-        tcg_gen_ld_i64(s->tmp1_i64, tcg_env, decode->op[1].offset + offsetof(XMMReg, XMM_Q(1)));
-        tcg_gen_st_i64(s->tmp1_i64, tcg_env, decode->op[0].offset + offsetof(XMMReg, XMM_Q(1)));
+        tcg_gen_ld_i64(t, tcg_env, decode->op[1].offset + offsetof(XMMReg, XMM_Q(1)));
+        tcg_gen_st_i64(t, tcg_env, decode->op[0].offset + offsetof(XMMReg, XMM_Q(1)));
     }
 }
 
 static void gen_VMOVLHPS(DisasContext *s, X86DecodedInsn *decode)
 {
-    tcg_gen_ld_i64(s->tmp1_i64, tcg_env, decode->op[2].offset);
-    tcg_gen_st_i64(s->tmp1_i64, tcg_env, decode->op[0].offset + offsetof(XMMReg, XMM_Q(1)));
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_ld_i64(t, tcg_env, decode->op[2].offset);
+    tcg_gen_st_i64(t, tcg_env, decode->op[0].offset + offsetof(XMMReg, XMM_Q(1)));
     if (decode->op[0].offset != decode->op[1].offset) {
-        tcg_gen_ld_i64(s->tmp1_i64, tcg_env, decode->op[1].offset + offsetof(XMMReg, XMM_Q(0)));
-        tcg_gen_st_i64(s->tmp1_i64, tcg_env, decode->op[0].offset + offsetof(XMMReg, XMM_Q(0)));
+        tcg_gen_ld_i64(t, tcg_env, decode->op[1].offset + offsetof(XMMReg, XMM_Q(0)));
+        tcg_gen_st_i64(t, tcg_env, decode->op[0].offset + offsetof(XMMReg, XMM_Q(0)));
     }
 }
 
@@ -4557,34 +4568,39 @@ static void gen_VMOVLHPS(DisasContext *s, X86DecodedInsn *decode)
 static void gen_VMOVLPx(DisasContext *s, X86DecodedInsn *decode)
 {
     int vec_len = vector_len(s, decode);
+    TCGv_i64 t = tcg_temp_new_i64();
 
-    tcg_gen_ld_i64(s->tmp1_i64, tcg_env, decode->op[2].offset + offsetof(XMMReg, XMM_Q(0)));
+    tcg_gen_ld_i64(t, tcg_env, decode->op[2].offset + offsetof(XMMReg, XMM_Q(0)));
     tcg_gen_gvec_mov(MO_64, decode->op[0].offset, decode->op[1].offset, vec_len, vec_len);
-    tcg_gen_st_i64(s->tmp1_i64, tcg_env, decode->op[0].offset + offsetof(XMMReg, XMM_Q(0)));
+    tcg_gen_st_i64(t, tcg_env, decode->op[0].offset + offsetof(XMMReg, XMM_Q(0)));
 }
 
 static void gen_VMOVLPx_ld(DisasContext *s, X86DecodedInsn *decode)
 {
     int vec_len = vector_len(s, decode);
+    TCGv_i64 t = tcg_temp_new_i64();
 
-    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, s->mem_index, MO_LEUQ);
+    tcg_gen_qemu_ld_i64(t, s->A0, s->mem_index, MO_LEUQ);
     tcg_gen_gvec_mov(MO_64, decode->op[0].offset, decode->op[1].offset, vec_len, vec_len);
-    tcg_gen_st_i64(s->tmp1_i64, OP_PTR0, offsetof(ZMMReg, ZMM_Q(0)));
+    tcg_gen_st_i64(t, OP_PTR0, offsetof(ZMMReg, ZMM_Q(0)));
 }
 
 static void gen_VMOVLPx_st(DisasContext *s, X86DecodedInsn *decode)
 {
-    tcg_gen_ld_i64(s->tmp1_i64, OP_PTR2, offsetof(ZMMReg, ZMM_Q(0)));
-    tcg_gen_qemu_st_i64(s->tmp1_i64, s->A0, s->mem_index, MO_LEUQ);
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_ld_i64(t, OP_PTR2, offsetof(ZMMReg, ZMM_Q(0)));
+    tcg_gen_qemu_st_i64(t, s->A0, s->mem_index, MO_LEUQ);
 }
 
 static void gen_VMOVSD_ld(DisasContext *s, X86DecodedInsn *decode)
 {
     TCGv_i64 zero = tcg_constant_i64(0);
+    TCGv_i64 t = tcg_temp_new_i64();
 
-    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, s->mem_index, MO_LEUQ);
+    tcg_gen_qemu_ld_i64(t, s->A0, s->mem_index, MO_LEUQ);
     tcg_gen_st_i64(zero, OP_PTR0, offsetof(ZMMReg, ZMM_Q(1)));
-    tcg_gen_st_i64(s->tmp1_i64, OP_PTR0, offsetof(ZMMReg, ZMM_Q(0)));
+    tcg_gen_st_i64(t, OP_PTR0, offsetof(ZMMReg, ZMM_Q(0)));
 }
 
 static void gen_VMOVSS(DisasContext *s, X86DecodedInsn *decode)
-- 
2.52.0




* [PATCH 15/18] target/i386/tcg: kill tmp2_i32
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
                   ` (13 preceding siblings ...)
  2025-12-10 13:16 ` [PATCH 14/18] target/i386/tcg: kill tmp1_i64 Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 16:29   ` Richard Henderson
  2025-12-10 13:16 ` [PATCH 16/18] target/i386/tcg: commonize code to compute SF/ZF/PF Paolo Bonzini
                   ` (2 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/translate.c | 121 +++++++++++++++++++++---------------
 1 file changed, 71 insertions(+), 50 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 108276f4008..e91715af817 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -134,9 +134,6 @@ typedef struct DisasContext {
     TCGv T0;
     TCGv T1;
 
-    /* TCG local register indexes (only used inside old micro ops) */
-    TCGv_i32 tmp2_i32;
-
     sigjmp_buf jmpbuf;
     TCGOp *prev_insn_start;
     TCGOp *prev_insn_end;
@@ -2455,6 +2452,7 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
         TCGv ea = gen_lea_modrm_1(s, decode->mem, false);
         TCGv last_addr = tcg_temp_new();
         bool update_fdp = true;
+        TCGv_i32 t32;
         TCGv_i64 t64;
 
         tcg_gen_mov_tl(last_addr, ea);
@@ -2462,16 +2460,18 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
 
         switch (op) {
         case 0x00 ... 0x07: /* fxxxs */
-            tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+            t32 = tcg_temp_new_i32();
+            tcg_gen_qemu_ld_i32(t32, s->A0,
                                 s->mem_index, MO_LEUL);
-            gen_helper_flds_FT0(tcg_env, s->tmp2_i32);
+            gen_helper_flds_FT0(tcg_env, t32);
             gen_helper_fp_arith_ST0_FT0(op & 7);
             break;
 
         case 0x10 ... 0x17: /* fixxxl */
-            tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+            t32 = tcg_temp_new_i32();
+            tcg_gen_qemu_ld_i32(t32, s->A0,
                                 s->mem_index, MO_LEUL);
-            gen_helper_fildl_FT0(tcg_env, s->tmp2_i32);
+            gen_helper_fildl_FT0(tcg_env, t32);
             gen_helper_fp_arith_ST0_FT0(op & 7);
             break;
 
@@ -2484,21 +2484,24 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
             break;
 
         case 0x30 ... 0x37: /* fixxx */
-            tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+            t32 = tcg_temp_new_i32();
+            tcg_gen_qemu_ld_i32(t32, s->A0,
                                 s->mem_index, MO_LESW);
-            gen_helper_fildl_FT0(tcg_env, s->tmp2_i32);
+            gen_helper_fildl_FT0(tcg_env, t32);
             gen_helper_fp_arith_ST0_FT0(op & 7);
             break;
 
         case 0x08: /* flds */
-            tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+            t32 = tcg_temp_new_i32();
+            tcg_gen_qemu_ld_i32(t32, s->A0,
                                 s->mem_index, MO_LEUL);
-            gen_helper_flds_ST0(tcg_env, s->tmp2_i32);
+            gen_helper_flds_ST0(tcg_env, t32);
             break;
         case 0x18: /* fildl */
-            tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+            t32 = tcg_temp_new_i32();
+            tcg_gen_qemu_ld_i32(t32, s->A0,
                                 s->mem_index, MO_LEUL);
-            gen_helper_fildl_ST0(tcg_env, s->tmp2_i32);
+            gen_helper_fildl_ST0(tcg_env, t32);
             break;
         case 0x28: /* fldl */
             t64 = tcg_temp_new_i64();
@@ -2507,14 +2510,16 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
             gen_helper_fldl_ST0(tcg_env, t64);
             break;
         case 0x38: /* filds */
-            tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+            t32 = tcg_temp_new_i32();
+            tcg_gen_qemu_ld_i32(t32, s->A0,
                                 s->mem_index, MO_LESW);
-            gen_helper_fildl_ST0(tcg_env, s->tmp2_i32);
+            gen_helper_fildl_ST0(tcg_env, t32);
             break;
 
         case 0x19: /* fisttpl */
-            gen_helper_fisttl_ST0(s->tmp2_i32, tcg_env);
-            tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
+            t32 = tcg_temp_new_i32();
+            gen_helper_fisttl_ST0(t32, tcg_env);
+            tcg_gen_qemu_st_i32(t32, s->A0,
                                 s->mem_index, MO_LEUL);
             gen_helper_fpop(tcg_env);
             break;
@@ -2526,23 +2531,26 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
             gen_helper_fpop(tcg_env);
             break;
         case 0x39: /* fisttps */
-            gen_helper_fistt_ST0(s->tmp2_i32, tcg_env);
-            tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
+            t32 = tcg_temp_new_i32();
+            gen_helper_fistt_ST0(t32, tcg_env);
+            tcg_gen_qemu_st_i32(t32, s->A0,
                                 s->mem_index, MO_LEUW);
             gen_helper_fpop(tcg_env);
             break;
 
         case 0x0a: case 0x0b: /* fsts, fstps */
-            gen_helper_fsts_ST0(s->tmp2_i32, tcg_env);
-            tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
+            t32 = tcg_temp_new_i32();
+            gen_helper_fsts_ST0(t32, tcg_env);
+            tcg_gen_qemu_st_i32(t32, s->A0,
                                 s->mem_index, MO_LEUL);
             if ((op & 7) == 3) {
                 gen_helper_fpop(tcg_env);
             }
             break;
         case 0x1a: case 0x1b: /* fistl, fistpl */
-            gen_helper_fistl_ST0(s->tmp2_i32, tcg_env);
-            tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
+            t32 = tcg_temp_new_i32();
+            gen_helper_fistl_ST0(t32, tcg_env);
+            tcg_gen_qemu_st_i32(t32, s->A0,
                                 s->mem_index, MO_LEUL);
             if ((op & 7) == 3) {
                 gen_helper_fpop(tcg_env);
@@ -2559,8 +2567,9 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
             break;
 
         case 0x3a: case 0x3b: /* fists, fistps */
-            gen_helper_fist_ST0(s->tmp2_i32, tcg_env);
-            tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
+            t32 = tcg_temp_new_i32();
+            gen_helper_fist_ST0(t32, tcg_env);
+            tcg_gen_qemu_st_i32(t32, s->A0,
                                 s->mem_index, MO_LEUW);
             if ((op & 7) == 3) {
                 gen_helper_fpop(tcg_env);
@@ -2572,9 +2581,10 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
             update_fip = update_fdp = false;
             break;
         case 0x0d: /* fldcw mem */
-            tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+            t32 = tcg_temp_new_i32();
+            tcg_gen_qemu_ld_i32(t32, s->A0,
                                 s->mem_index, MO_LEUW);
-            gen_helper_fldcw(tcg_env, s->tmp2_i32);
+            gen_helper_fldcw(tcg_env, t32);
             update_fip = update_fdp = false;
             break;
         case 0x0e: /* fnstenv mem */
@@ -2583,8 +2593,9 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
             update_fip = update_fdp = false;
             break;
         case 0x0f: /* fnstcw mem */
-            gen_helper_fnstcw(s->tmp2_i32, tcg_env);
-            tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
+            t32 = tcg_temp_new_i32();
+            gen_helper_fnstcw(t32, tcg_env);
+            tcg_gen_qemu_st_i32(t32, s->A0,
                                 s->mem_index, MO_LEUW);
             update_fip = update_fdp = false;
             break;
@@ -2606,8 +2617,9 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
             update_fip = update_fdp = false;
             break;
         case 0x2f: /* fnstsw mem */
-            gen_helper_fnstsw(s->tmp2_i32, tcg_env);
-            tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0,
+            t32 = tcg_temp_new_i32();
+            gen_helper_fnstsw(t32, tcg_env);
+            tcg_gen_qemu_st_i32(t32, s->A0,
                                 s->mem_index, MO_LEUW);
             update_fip = update_fdp = false;
             break;
@@ -2638,10 +2650,11 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
         if (update_fdp) {
             int last_seg = s->override >= 0 ? s->override : decode->mem.def_seg;
 
-            tcg_gen_ld_i32(s->tmp2_i32, tcg_env,
+            t32 = tcg_temp_new_i32();
+            tcg_gen_ld_i32(t32, tcg_env,
                            offsetof(CPUX86State,
                                     segs[last_seg].selector));
-            tcg_gen_st16_i32(s->tmp2_i32, tcg_env,
+            tcg_gen_st16_i32(t32, tcg_env,
                              offsetof(CPUX86State, fpds));
             tcg_gen_st_tl(last_addr, tcg_env,
                           offsetof(CPUX86State, fpdp));
@@ -2903,8 +2916,9 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
         case 0x3c: /* df/4 */
             switch (rm) {
             case 0:
-                gen_helper_fnstsw(s->tmp2_i32, tcg_env);
-                tcg_gen_extu_i32_tl(s->T0, s->tmp2_i32);
+                TCGv_i32 t32 = tcg_temp_new_i32();
+                gen_helper_fnstsw(t32, tcg_env);
+                tcg_gen_extu_i32_tl(s->T0, t32);
                 gen_op_mov_reg_v(s, MO_16, R_EAX, s->T0);
                 break;
             default:
@@ -2940,9 +2954,10 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
     }
 
     if (update_fip) {
-        tcg_gen_ld_i32(s->tmp2_i32, tcg_env,
+        TCGv_i32 t32 = tcg_temp_new_i32();
+        tcg_gen_ld_i32(t32, tcg_env,
                        offsetof(CPUX86State, segs[R_CS].selector));
-        tcg_gen_st16_i32(s->tmp2_i32, tcg_env,
+        tcg_gen_st16_i32(t32, tcg_env,
                          offsetof(CPUX86State, fpcs));
         tcg_gen_st_tl(eip_cur_tl(s),
                       tcg_env, offsetof(CPUX86State, fpip));
@@ -2961,6 +2976,7 @@ static void gen_multi0F(DisasContext *s, X86DecodedInsn *decode)
     int modrm = s->modrm;
     MemOp ot;
     int reg, rm, mod, op;
+    TCGv_i32 t32;
     TCGv_i64 t64;
 
     /* now check op code */
@@ -3027,10 +3043,11 @@ static void gen_multi0F(DisasContext *s, X86DecodedInsn *decode)
             if (!PE(s) || VM86(s))
                 goto illegal_op;
             if (check_cpl0(s)) {
+                t32 = tcg_temp_new_i32();
                 gen_svm_check_intercept(s, SVM_EXIT_LDTR_WRITE);
                 gen_ld_modrm(s, decode, MO_16);
-                tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
-                gen_helper_lldt(tcg_env, s->tmp2_i32);
+                tcg_gen_trunc_tl_i32(t32, s->T0);
+                gen_helper_lldt(tcg_env, t32);
             }
             break;
         case 1: /* str */
@@ -3049,10 +3066,11 @@ static void gen_multi0F(DisasContext *s, X86DecodedInsn *decode)
             if (!PE(s) || VM86(s))
                 goto illegal_op;
             if (check_cpl0(s)) {
+                t32 = tcg_temp_new_i32();
                 gen_svm_check_intercept(s, SVM_EXIT_TR_WRITE);
                 gen_ld_modrm(s, decode, MO_16);
-                tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
-                gen_helper_ltr(tcg_env, s->tmp2_i32);
+                tcg_gen_trunc_tl_i32(t32, s->T0);
+                gen_helper_ltr(tcg_env, t32);
             }
             break;
         case 4: /* verr */
@@ -3153,9 +3171,10 @@ static void gen_multi0F(DisasContext *s, X86DecodedInsn *decode)
                 || (s->prefix & (PREFIX_DATA | PREFIX_REPZ | PREFIX_REPNZ))) {
                 goto illegal_op;
             }
+            t32 = tcg_temp_new_i32();
             t64 = tcg_temp_new_i64();
-            tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[R_ECX]);
-            gen_helper_xgetbv(t64, tcg_env, s->tmp2_i32);
+            tcg_gen_trunc_tl_i32(t32, cpu_regs[R_ECX]);
+            gen_helper_xgetbv(t64, tcg_env, t32);
             tcg_gen_extr_i64_tl(cpu_regs[R_EAX], cpu_regs[R_EDX], t64);
             break;
 
@@ -3168,11 +3187,12 @@ static void gen_multi0F(DisasContext *s, X86DecodedInsn *decode)
             if (!check_cpl0(s)) {
                 break;
             }
+            t32 = tcg_temp_new_i32();
             t64 = tcg_temp_new_i64();
             tcg_gen_concat_tl_i64(t64, cpu_regs[R_EAX],
                                   cpu_regs[R_EDX]);
-            tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[R_ECX]);
-            gen_helper_xsetbv(tcg_env, s->tmp2_i32, t64);
+            tcg_gen_trunc_tl_i32(t32, cpu_regs[R_ECX]);
+            gen_helper_xsetbv(tcg_env, t32, t64);
             /* End TB because translation flags may change.  */
             s->base.is_jmp = DISAS_EOB_NEXT;
             break;
@@ -3332,20 +3352,22 @@ static void gen_multi0F(DisasContext *s, X86DecodedInsn *decode)
             if (s->prefix & (PREFIX_DATA | PREFIX_REPZ | PREFIX_REPNZ)) {
                 goto illegal_op;
             }
+            t32 = tcg_temp_new_i32();
             t64 = tcg_temp_new_i64();
-            tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[R_ECX]);
-            gen_helper_rdpkru(t64, tcg_env, s->tmp2_i32);
+            tcg_gen_trunc_tl_i32(t32, cpu_regs[R_ECX]);
+            gen_helper_rdpkru(t64, tcg_env, t32);
             tcg_gen_extr_i64_tl(cpu_regs[R_EAX], cpu_regs[R_EDX], t64);
             break;
         case 0xef: /* wrpkru */
             if (s->prefix & (PREFIX_DATA | PREFIX_REPZ | PREFIX_REPNZ)) {
                 goto illegal_op;
             }
+            t32 = tcg_temp_new_i32();
             t64 = tcg_temp_new_i64();
             tcg_gen_concat_tl_i64(t64, cpu_regs[R_EAX],
                                   cpu_regs[R_EDX]);
-            tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[R_ECX]);
-            gen_helper_wrpkru(tcg_env, s->tmp2_i32, t64);
+            tcg_gen_trunc_tl_i32(t32, cpu_regs[R_ECX]);
+            gen_helper_wrpkru(tcg_env, t32, t64);
             break;
 
         CASE_MODRM_OP(6): /* lmsw */
@@ -3737,7 +3759,6 @@ static void i386_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cpu)
     dc->T1 = tcg_temp_new();
     dc->A0 = tcg_temp_new();
 
-    dc->tmp2_i32 = tcg_temp_new_i32();
     dc->cc_srcT = tcg_temp_new();
 }
 
-- 
2.52.0




* [PATCH 16/18] target/i386/tcg: commonize code to compute SF/ZF/PF
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
                   ` (14 preceding siblings ...)
  2025-12-10 13:16 ` [PATCH 15/18] target/i386/tcg: kill tmp2_i32 Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 18:46   ` Richard Henderson
  2025-12-10 13:16 ` [PATCH 17/18] target/i386/tcg: add a CCOp for SBB x,x Paolo Bonzini
  2025-12-10 13:16 ` [PATCH 18/18] target/i386/tcg: move fetch code out of translate.c Paolo Bonzini
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

PF, ZF and SF are computed the same way for almost all CC_OP values; ZF
and SF depend only on the operand size.  The only exception is PF for
CC_OP_BLSI* and CC_OP_BMILG*, which was left at zero as "undefined".
AMD documents that PF is computed normally for these instructions, so
computing it from the result is effectively a bug fix.

Put the common code at the end of helper_cc_compute_all, shaving
another kB from its text.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
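Note for readers, not part of the commit: the common tail described above
can be sketched in isolation.  This is a minimal standalone approximation,
assuming a 64-bit target_ulong.  pf_of() and psz_flags() are hypothetical
names used only in this note, and pf_of() merely stands in for QEMU's
compute_pf().  The CC_P/CC_Z/CC_S values are the architectural EFLAGS bits.

```c
#include <stdint.h>

/* Architectural EFLAGS bit values for PF, ZF and SF. */
#define CC_P 0x04
#define CC_Z 0x40
#define CC_S 0x80

/*
 * Hypothetical stand-in for compute_pf(): PF is set when the low byte
 * of the result contains an even number of 1 bits.
 */
static uint32_t pf_of(uint64_t dst)
{
    uint8_t b = (uint8_t)dst;

    /* Fold the byte so that bit 0 ends up as the XOR of all eight bits. */
    b ^= b >> 4;
    b ^= b >> 2;
    b ^= b >> 1;
    return (b & 1) ? 0 : CC_P;
}

/*
 * Hypothetical helper mirroring the shared tail: size_bits is the
 * operand size (8, 16, 32 or 64).
 */
static uint32_t psz_flags(uint64_t dst, int size_bits)
{
    uint32_t flags = pf_of(dst);

    /*
     * Shift the operand's top bit up to bit 63.  Bits above the operand
     * size fall off, so one zero test and one sign test serve every size.
     */
    dst <<= 64 - size_bits;
    flags += dst == 0 ? CC_Z : 0;
    flags += (int64_t)dst < 0 ? CC_S : 0;
    return flags;
}
```

With these definitions psz_flags(0x80, 8) yields CC_S, while
psz_flags(0x80, 16) yields 0 because bit 7 is no longer the sign bit.
The goto chain in the patch builds the same shift amount by accumulating
8, 16 and 32 as it falls through the psz_* labels.
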
 target/i386/cpu.h                        |   4 +-
 target/i386/tcg/cc_helper_template.h.inc | 112 +++------
 target/i386/tcg/cc_helper.c              | 274 +++++++++++++++--------
 3 files changed, 209 insertions(+), 181 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index cee1f692a1c..ecca38ed0b5 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1495,12 +1495,12 @@ typedef enum {
     CC_OP_SARL,
     CC_OP_SARQ,
 
-    CC_OP_BMILGB, /* Z,S via CC_DST, C = SRC==0; O=0; P,A undefined */
+    CC_OP_BMILGB, /* P,Z,S via CC_DST, C = SRC==0; A=O=0 */
     CC_OP_BMILGW,
     CC_OP_BMILGL,
     CC_OP_BMILGQ,
 
-    CC_OP_BLSIB, /* Z,S via CC_DST, C = SRC!=0; O=0; P,A undefined */
+    CC_OP_BLSIB, /* P,Z,S via CC_DST, C = SRC!=0; A=O=0 */
     CC_OP_BLSIW,
     CC_OP_BLSIL,
     CC_OP_BLSIQ,
diff --git a/target/i386/tcg/cc_helper_template.h.inc b/target/i386/tcg/cc_helper_template.h.inc
index d8fd976ca15..af58c2409f7 100644
--- a/target/i386/tcg/cc_helper_template.h.inc
+++ b/target/i386/tcg/cc_helper_template.h.inc
@@ -1,5 +1,5 @@
 /*
- *  x86 condition code helpers
+ *  x86 condition code helpers for AF/CF/OF
  *
  *  Copyright (c) 2008 Fabrice Bellard
  *
@@ -44,14 +44,9 @@
 
 /* dynamic flags computation */
 
-static uint32_t glue(compute_all_cout, SUFFIX)(DATA_TYPE dst, DATA_TYPE carries)
+static uint32_t glue(compute_aco_cout, SUFFIX)(DATA_TYPE carries)
 {
-    uint32_t af_cf, pf, zf, sf, of;
-
-    /* PF, ZF, SF computed from result.  */
-    pf = compute_pf(dst);
-    zf = (dst == 0) * CC_Z;
-    sf = lshift(dst, 8 - DATA_BITS) & CC_S;
+    uint32_t af_cf, of;
 
     /*
      * AF, CF, OF computed from carry out vector.  To compute AF and CF, rotate it
@@ -62,14 +57,14 @@ static uint32_t glue(compute_all_cout, SUFFIX)(DATA_TYPE dst, DATA_TYPE carries)
      */
     af_cf = ((carries << 1) | (carries >> (DATA_BITS - 1))) & (CC_A | CC_C);
     of = (lshift(carries, 12 - DATA_BITS) + CC_O / 2) & CC_O;
-    return pf + zf + sf + af_cf + of;
+    return af_cf + of;
 }
 
-static uint32_t glue(compute_all_add, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
+static uint32_t glue(compute_aco_add, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
 {
     DATA_TYPE src2 = dst - src1;
     DATA_TYPE carries = ADD_COUT_VEC(src1, src2, dst);
-    return glue(compute_all_cout, SUFFIX)(dst, carries);
+    return glue(compute_aco_cout, SUFFIX)(carries);
 }
 
 static int glue(compute_c_add, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
@@ -77,12 +72,12 @@ static int glue(compute_c_add, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
     return dst < src1;
 }
 
-static uint32_t glue(compute_all_adc, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1,
+static uint32_t glue(compute_aco_adc, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1,
                                          DATA_TYPE src3)
 {
     DATA_TYPE src2 = dst - src1 - src3;
     DATA_TYPE carries = ADD_COUT_VEC(src1, src2, dst);
-    return glue(compute_all_cout, SUFFIX)(dst, carries);
+    return glue(compute_aco_cout, SUFFIX)(carries);
 }
 
 static int glue(compute_c_adc, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1,
@@ -97,11 +92,11 @@ static int glue(compute_c_adc, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1,
 #endif
 }
 
-static uint32_t glue(compute_all_sub, SUFFIX)(DATA_TYPE dst, DATA_TYPE src2)
+static uint32_t glue(compute_aco_sub, SUFFIX)(DATA_TYPE dst, DATA_TYPE src2)
 {
     DATA_TYPE src1 = dst + src2;
     DATA_TYPE carries = SUB_COUT_VEC(src1, src2, dst);
-    return glue(compute_all_cout, SUFFIX)(dst, carries);
+    return glue(compute_aco_cout, SUFFIX)(carries);
 }
 
 static int glue(compute_c_sub, SUFFIX)(DATA_TYPE dst, DATA_TYPE src2)
@@ -111,12 +106,12 @@ static int glue(compute_c_sub, SUFFIX)(DATA_TYPE dst, DATA_TYPE src2)
     return src1 < src2;
 }
 
-static uint32_t glue(compute_all_sbb, SUFFIX)(DATA_TYPE dst, DATA_TYPE src2,
+static uint32_t glue(compute_aco_sbb, SUFFIX)(DATA_TYPE dst, DATA_TYPE src2,
                                          DATA_TYPE src3)
 {
     DATA_TYPE src1 = dst + src2 + src3;
     DATA_TYPE carries = SUB_COUT_VEC(src1, src2, dst);
-    return glue(compute_all_cout, SUFFIX)(dst, carries);
+    return glue(compute_aco_cout, SUFFIX)(carries);
 }
 
 static int glue(compute_c_sbb, SUFFIX)(DATA_TYPE dst, DATA_TYPE src2,
@@ -134,57 +129,35 @@ static int glue(compute_c_sbb, SUFFIX)(DATA_TYPE dst, DATA_TYPE src2,
 #endif
 }
 
-static uint32_t glue(compute_all_logic, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
+static uint32_t glue(compute_aco_inc, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
 {
-    uint32_t cf, pf, af, zf, sf, of;
-
-    cf = 0;
-    pf = compute_pf(dst);
-    af = 0;
-    zf = (dst == 0) * CC_Z;
-    sf = lshift(dst, 8 - DATA_BITS) & CC_S;
-    of = 0;
-    return cf + pf + af + zf + sf + of;
-}
-
-static uint32_t glue(compute_all_inc, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
-{
-    uint32_t cf, pf, af, zf, sf, of;
+    uint32_t cf, af, of;
 
     cf = src1;
-    pf = compute_pf(dst);
     af = (dst ^ (dst - 1)) & CC_A; /* bits 0..3 are all clear */
-    zf = (dst == 0) * CC_Z;
-    sf = lshift(dst, 8 - DATA_BITS) & CC_S;
     of = (dst == SIGN_MASK) * CC_O;
-    return cf + pf + af + zf + sf + of;
+    return cf + af + of;
 }
 
-static uint32_t glue(compute_all_dec, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
+static uint32_t glue(compute_aco_dec, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
 {
-    uint32_t cf, pf, af, zf, sf, of;
+    uint32_t cf, af, of;
 
     cf = src1;
-    pf = compute_pf(dst);
     af = (dst ^ (dst + 1)) & CC_A; /* bits 0..3 are all set */
-    zf = (dst == 0) * CC_Z;
-    sf = lshift(dst, 8 - DATA_BITS) & CC_S;
     of = (dst == SIGN_MASK - 1) * CC_O;
-    return cf + pf + af + zf + sf + of;
+    return cf + af + of;
 }
 
-static uint32_t glue(compute_all_shl, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
+static uint32_t glue(compute_aco_shl, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
 {
-    uint32_t cf, pf, af, zf, sf, of;
+    uint32_t cf, af, of;
 
     cf = (src1 >> (DATA_BITS - 1)) & CC_C;
-    pf = compute_pf(dst);
     af = 0; /* undefined */
-    zf = (dst == 0) * CC_Z;
-    sf = lshift(dst, 8 - DATA_BITS) & CC_S;
     /* of is defined iff shift count == 1 */
     of = lshift(src1 ^ dst, 12 - DATA_BITS) & CC_O;
-    return cf + pf + af + zf + sf + of;
+    return cf + af + of;
 }
 
 static int glue(compute_c_shl, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
@@ -192,47 +165,25 @@ static int glue(compute_c_shl, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
     return (src1 >> (DATA_BITS - 1)) & CC_C;
 }
 
-static uint32_t glue(compute_all_sar, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
+static uint32_t glue(compute_aco_sar, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
 {
-    uint32_t cf, pf, af, zf, sf, of;
+    uint32_t cf, af, of;
 
     cf = src1 & 1;
-    pf = compute_pf(dst);
     af = 0; /* undefined */
-    zf = (dst == 0) * CC_Z;
-    sf = lshift(dst, 8 - DATA_BITS) & CC_S;
     /* of is defined iff shift count == 1 */
     of = lshift(src1 ^ dst, 12 - DATA_BITS) & CC_O;
-    return cf + pf + af + zf + sf + of;
+    return cf + af + of;
 }
 
-/* NOTE: we compute the flags like the P4. On olders CPUs, only OF and
-   CF are modified and it is slower to do that.  Note as well that we
-   don't truncate SRC1 for computing carry to DATA_TYPE.  */
-static uint32_t glue(compute_all_mul, SUFFIX)(DATA_TYPE dst, target_long src1)
+static uint32_t glue(compute_aco_bmilg, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
 {
-    uint32_t cf, pf, af, zf, sf, of;
-
-    cf = (src1 != 0);
-    pf = compute_pf(dst);
-    af = 0; /* undefined */
-    zf = (dst == 0) * CC_Z;
-    sf = lshift(dst, 8 - DATA_BITS) & CC_S;
-    of = cf * CC_O;
-    return cf + pf + af + zf + sf + of;
-}
-
-static uint32_t glue(compute_all_bmilg, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
-{
-    uint32_t cf, pf, af, zf, sf, of;
+    uint32_t cf, af, of;
 
     cf = (src1 == 0);
-    pf = 0; /* undefined */
     af = 0; /* undefined */
-    zf = (dst == 0) * CC_Z;
-    sf = lshift(dst, 8 - DATA_BITS) & CC_S;
     of = 0;
-    return cf + pf + af + zf + sf + of;
+    return cf + af + of;
 }
 
 static int glue(compute_c_bmilg, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
@@ -240,17 +191,14 @@ static int glue(compute_c_bmilg, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
     return src1 == 0;
 }
 
-static int glue(compute_all_blsi, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
+static int glue(compute_aco_blsi, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
 {
-    uint32_t cf, pf, af, zf, sf, of;
+    uint32_t cf, af, of;
 
     cf = (src1 != 0);
-    pf = 0; /* undefined */
     af = 0; /* undefined */
-    zf = (dst == 0) * CC_Z;
-    sf = lshift(dst, 8 - DATA_BITS) & CC_S;
     of = 0;
-    return cf + pf + af + zf + sf + of;
+    return cf + af + of;
 }
 
 static int glue(compute_c_blsi, SUFFIX)(DATA_TYPE dst, DATA_TYPE src1)
diff --git a/target/i386/tcg/cc_helper.c b/target/i386/tcg/cc_helper.c
index f1940b40927..2c4170b5b77 100644
--- a/target/i386/tcg/cc_helper.c
+++ b/target/i386/tcg/cc_helper.c
@@ -73,9 +73,25 @@ target_ulong helper_cc_compute_nz(target_ulong dst, target_ulong src1,
     }
 }
 
+/* NOTE: we compute the flags like the P4. On older CPUs, only OF and
+   CF are modified and it is slower to do that.  Note as well that we
+   don't truncate SRC1 for computing carry to DATA_TYPE.  */
+static inline uint32_t compute_aco_mul(target_long src1)
+{
+    uint32_t cf, af, of;
+
+    cf = (src1 != 0);
+    af = 0; /* undefined */
+    of = cf * CC_O;
+    return cf + af + of;
+}
+
 target_ulong helper_cc_compute_all(target_ulong dst, target_ulong src1,
                                    target_ulong src2, int op)
 {
+    uint32_t flags = 0;
+    int shift = 0;
+
     switch (op) {
     default: /* should never happen */
         return 0;
@@ -85,90 +101,6 @@ target_ulong helper_cc_compute_all(target_ulong dst, target_ulong src1,
     case CC_OP_POPCNT:
         return dst ? 0 : CC_Z;
 
-    case CC_OP_MULB:
-        return compute_all_mulb(dst, src1);
-    case CC_OP_MULW:
-        return compute_all_mulw(dst, src1);
-    case CC_OP_MULL:
-        return compute_all_mull(dst, src1);
-
-    case CC_OP_ADDB:
-        return compute_all_addb(dst, src1);
-    case CC_OP_ADDW:
-        return compute_all_addw(dst, src1);
-    case CC_OP_ADDL:
-        return compute_all_addl(dst, src1);
-
-    case CC_OP_ADCB:
-        return compute_all_adcb(dst, src1, src2);
-    case CC_OP_ADCW:
-        return compute_all_adcw(dst, src1, src2);
-    case CC_OP_ADCL:
-        return compute_all_adcl(dst, src1, src2);
-
-    case CC_OP_SUBB:
-        return compute_all_subb(dst, src1);
-    case CC_OP_SUBW:
-        return compute_all_subw(dst, src1);
-    case CC_OP_SUBL:
-        return compute_all_subl(dst, src1);
-
-    case CC_OP_SBBB:
-        return compute_all_sbbb(dst, src1, src2);
-    case CC_OP_SBBW:
-        return compute_all_sbbw(dst, src1, src2);
-    case CC_OP_SBBL:
-        return compute_all_sbbl(dst, src1, src2);
-
-    case CC_OP_LOGICB:
-        return compute_all_logicb(dst, src1);
-    case CC_OP_LOGICW:
-        return compute_all_logicw(dst, src1);
-    case CC_OP_LOGICL:
-        return compute_all_logicl(dst, src1);
-
-    case CC_OP_INCB:
-        return compute_all_incb(dst, src1);
-    case CC_OP_INCW:
-        return compute_all_incw(dst, src1);
-    case CC_OP_INCL:
-        return compute_all_incl(dst, src1);
-
-    case CC_OP_DECB:
-        return compute_all_decb(dst, src1);
-    case CC_OP_DECW:
-        return compute_all_decw(dst, src1);
-    case CC_OP_DECL:
-        return compute_all_decl(dst, src1);
-
-    case CC_OP_SHLB:
-        return compute_all_shlb(dst, src1);
-    case CC_OP_SHLW:
-        return compute_all_shlw(dst, src1);
-    case CC_OP_SHLL:
-        return compute_all_shll(dst, src1);
-
-    case CC_OP_SARB:
-        return compute_all_sarb(dst, src1);
-    case CC_OP_SARW:
-        return compute_all_sarw(dst, src1);
-    case CC_OP_SARL:
-        return compute_all_sarl(dst, src1);
-
-    case CC_OP_BMILGB:
-        return compute_all_bmilgb(dst, src1);
-    case CC_OP_BMILGW:
-        return compute_all_bmilgw(dst, src1);
-    case CC_OP_BMILGL:
-        return compute_all_bmilgl(dst, src1);
-
-    case CC_OP_BLSIB:
-        return compute_all_blsib(dst, src1);
-    case CC_OP_BLSIW:
-        return compute_all_blsiw(dst, src1);
-    case CC_OP_BLSIL:
-        return compute_all_blsil(dst, src1);
-
     case CC_OP_ADCX:
         return compute_all_adcx(dst, src1, src2);
     case CC_OP_ADOX:
@@ -176,33 +108,181 @@ target_ulong helper_cc_compute_all(target_ulong dst, target_ulong src1,
     case CC_OP_ADCOX:
         return compute_all_adcox(dst, src1, src2);
 
+    case CC_OP_MULB:
+        flags = compute_aco_mul(src1);
+        goto psz_b;
+    case CC_OP_MULW:
+        flags = compute_aco_mul(src1);
+        goto psz_w;
+    case CC_OP_MULL:
+        flags = compute_aco_mul(src1);
+        goto psz_l;
+
+    case CC_OP_ADDB:
+        flags = compute_aco_addb(dst, src1);
+        goto psz_b;
+    case CC_OP_ADDW:
+        flags = compute_aco_addw(dst, src1);
+        goto psz_w;
+    case CC_OP_ADDL:
+        flags = compute_aco_addl(dst, src1);
+        goto psz_l;
+
+    case CC_OP_ADCB:
+        flags = compute_aco_adcb(dst, src1, src2);
+        goto psz_b;
+    case CC_OP_ADCW:
+        flags = compute_aco_adcw(dst, src1, src2);
+        goto psz_w;
+    case CC_OP_ADCL:
+        flags = compute_aco_adcl(dst, src1, src2);
+        goto psz_l;
+
+    case CC_OP_SUBB:
+        flags = compute_aco_subb(dst, src1);
+        goto psz_b;
+    case CC_OP_SUBW:
+        flags = compute_aco_subw(dst, src1);
+        goto psz_w;
+    case CC_OP_SUBL:
+        flags = compute_aco_subl(dst, src1);
+        goto psz_l;
+
+    case CC_OP_SBBB:
+        flags = compute_aco_sbbb(dst, src1, src2);
+        goto psz_b;
+    case CC_OP_SBBW:
+        flags = compute_aco_sbbw(dst, src1, src2);
+        goto psz_w;
+    case CC_OP_SBBL:
+        flags = compute_aco_sbbl(dst, src1, src2);
+        goto psz_l;
+
+    case CC_OP_LOGICB:
+        flags = 0;
+        goto psz_b;
+    case CC_OP_LOGICW:
+        flags = 0;
+        goto psz_w;
+    case CC_OP_LOGICL:
+        flags = 0;
+        goto psz_l;
+
+    case CC_OP_INCB:
+        flags = compute_aco_incb(dst, src1);
+        goto psz_b;
+    case CC_OP_INCW:
+        flags = compute_aco_incw(dst, src1);
+        goto psz_w;
+    case CC_OP_INCL:
+        flags = compute_aco_incl(dst, src1);
+        goto psz_l;
+
+    case CC_OP_DECB:
+        flags = compute_aco_decb(dst, src1);
+        goto psz_b;
+    case CC_OP_DECW:
+        flags = compute_aco_decw(dst, src1);
+        goto psz_w;
+    case CC_OP_DECL:
+        flags = compute_aco_decl(dst, src1);
+        goto psz_l;
+
+    case CC_OP_SHLB:
+        flags = compute_aco_shlb(dst, src1);
+        goto psz_b;
+    case CC_OP_SHLW:
+        flags = compute_aco_shlw(dst, src1);
+        goto psz_w;
+    case CC_OP_SHLL:
+        flags = compute_aco_shll(dst, src1);
+        goto psz_l;
+
+    case CC_OP_SARB:
+        flags = compute_aco_sarb(dst, src1);
+        goto psz_b;
+    case CC_OP_SARW:
+        flags = compute_aco_sarw(dst, src1);
+        goto psz_w;
+    case CC_OP_SARL:
+        flags = compute_aco_sarl(dst, src1);
+        goto psz_l;
+
+    case CC_OP_BMILGB:
+        flags = compute_aco_bmilgb(dst, src1);
+        goto psz_b;
+    case CC_OP_BMILGW:
+        flags = compute_aco_bmilgw(dst, src1);
+        goto psz_w;
+    case CC_OP_BMILGL:
+        flags = compute_aco_bmilgl(dst, src1);
+        goto psz_l;
+
+    case CC_OP_BLSIB:
+        flags = compute_aco_blsib(dst, src1);
+        goto psz_b;
+    case CC_OP_BLSIW:
+        flags = compute_aco_blsiw(dst, src1);
+        goto psz_w;
+    case CC_OP_BLSIL:
+        flags = compute_aco_blsil(dst, src1);
+        goto psz_l;
+
 #ifdef TARGET_X86_64
     case CC_OP_MULQ:
-        return compute_all_mulq(dst, src1);
+        flags = compute_aco_mul(src1);
+        goto psz_q;
     case CC_OP_ADDQ:
-        return compute_all_addq(dst, src1);
+        flags = compute_aco_addq(dst, src1);
+        goto psz_q;
     case CC_OP_ADCQ:
-        return compute_all_adcq(dst, src1, src2);
+        flags = compute_aco_adcq(dst, src1, src2);
+        goto psz_q;
     case CC_OP_SUBQ:
-        return compute_all_subq(dst, src1);
+        flags = compute_aco_subq(dst, src1);
+        goto psz_q;
     case CC_OP_SBBQ:
-        return compute_all_sbbq(dst, src1, src2);
-    case CC_OP_LOGICQ:
-        return compute_all_logicq(dst, src1);
+        flags = compute_aco_sbbq(dst, src1, src2);
+        goto psz_q;
     case CC_OP_INCQ:
-        return compute_all_incq(dst, src1);
+        flags = compute_aco_incq(dst, src1);
+        goto psz_q;
     case CC_OP_DECQ:
-        return compute_all_decq(dst, src1);
+        flags = compute_aco_decq(dst, src1);
+        goto psz_q;
+    case CC_OP_LOGICQ:
+        flags = 0;
+        goto psz_q;
     case CC_OP_SHLQ:
-        return compute_all_shlq(dst, src1);
+        flags = compute_aco_shlq(dst, src1);
+        goto psz_q;
     case CC_OP_SARQ:
-        return compute_all_sarq(dst, src1);
+        flags = compute_aco_sarq(dst, src1);
+        goto psz_q;
     case CC_OP_BMILGQ:
-        return compute_all_bmilgq(dst, src1);
+        flags = compute_aco_bmilgq(dst, src1);
+        goto psz_q;
     case CC_OP_BLSIQ:
-        return compute_all_blsiq(dst, src1);
+        flags = compute_aco_blsiq(dst, src1);
+        goto psz_q;
 #endif
     }
+
+psz_b:
+    shift += 8;
+psz_w:
+    shift += 16;
+psz_l:
+#ifdef TARGET_X86_64
+    shift += 32;
+psz_q:
+#endif
+
+    flags += compute_pf(dst);
+    dst <<= shift;
+    flags += dst == 0 ? CC_Z : 0;
+    flags += (target_long)dst < 0 ? CC_S : 0;
+    return flags;
 }
 
 uint32_t cpu_cc_compute_all(CPUX86State *env)
-- 
2.52.0
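
For readers following the label cascade at the end of the patch above:
falling through psz_b, psz_w and psz_l accumulates the left shift that
moves the operand's sign bit into the top bit of target_ulong, so a single
ZF/SF computation serves every operand width.  A minimal standalone sketch,
assuming a 64-bit target; sketch_pf() and sketch_psz() are hypothetical
names, not QEMU functions, and the real compute_pf() may be implemented
differently:

```c
#include <assert.h>
#include <stdint.h>

/* EFLAGS bit values at their architectural positions. */
#define CC_P 0x0004
#define CC_Z 0x0040
#define CC_S 0x0080

/* Hypothetical stand-in for compute_pf(): PF is set when the low byte
 * of the result has an even number of set bits. */
static inline uint32_t sketch_pf(uint64_t dst)
{
    uint8_t b = (uint8_t)dst;
    b ^= b >> 4;
    b ^= b >> 2;
    b ^= b >> 1;
    return (b & 1) ? 0 : CC_P;
}

/* log2_size: 0 = byte, 1 = word, 2 = long, 3 = quad.
 * The shift equals what the fall-through labels accumulate:
 * 8 + 16 + 32 = 56, 16 + 32 = 48, 32, and 0 respectively. */
static inline uint32_t sketch_psz(uint64_t dst, int log2_size)
{
    unsigned shift = 64 - (8u << log2_size);
    uint32_t flags = sketch_pf(dst);   /* PF looks at the unshifted value */

    dst <<= shift;                     /* discard bits above the operand */
    flags += dst == 0 ? CC_Z : 0;
    flags += (int64_t)dst < 0 ? CC_S : 0;
    return flags;
}

/* Usage example with hand-checked values. */
static inline void sketch_psz_selfcheck(void)
{
    assert(sketch_psz(0x100, 0) == (CC_P | CC_Z));  /* byte result is 0 */
    assert(sketch_psz(0x80, 0) == CC_S);            /* byte sign bit set */
    assert(sketch_psz(0x80, 1) == 0);               /* positive as a word */
}
```

Shifting left rather than masking means garbage above the operand size
in dst never has to be cleared separately.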




* [PATCH 17/18] target/i386/tcg: add a CCOp for SBB x,x
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
                   ` (15 preceding siblings ...)
  2025-12-10 13:16 ` [PATCH 16/18] target/i386/tcg: commonize code to compute SF/ZF/PF Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 19:11   ` Richard Henderson
  2025-12-10 13:16 ` [PATCH 18/18] target/i386/tcg: move fetch code out of translate.c Paolo Bonzini
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

Giving SBB x,x its own CCOp is more efficient both when generating code
and when testing flags.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.h           | 13 ++++++++++++-
 target/i386/cpu-dump.c      |  2 ++
 target/i386/tcg/cc_helper.c |  6 ++++++
 target/i386/tcg/translate.c | 13 +++++++++++++
 target/i386/tcg/emit.c.inc  | 33 ++++++---------------------------
 5 files changed, 39 insertions(+), 28 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index ecca38ed0b5..314e773a5d4 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1515,7 +1515,18 @@ typedef enum {
     CC_OP_POPCNTL__,
     CC_OP_POPCNTQ__,
     CC_OP_POPCNT = sizeof(target_ulong) == 8 ? CC_OP_POPCNTQ__ : CC_OP_POPCNTL__,
-#define CC_OP_LAST_BWLQ CC_OP_POPCNTQ__
+
+    /*
+     * Note that only CC_OP_SBB_SELF (i.e. the one with MO_TL size)
+     * is used or implemented, because the translation produces a
+     * sign-extended CC_DST.
+     */
+    CC_OP_SBB_SELFB__, /* S/Z/C/A via CC_DST, O clear, P set.  */
+    CC_OP_SBB_SELFW__,
+    CC_OP_SBB_SELFL__,
+    CC_OP_SBB_SELFQ__,
+    CC_OP_SBB_SELF = sizeof(target_ulong) == 8 ? CC_OP_SBB_SELFQ__ : CC_OP_SBB_SELFL__,
+#define CC_OP_LAST_BWLQ CC_OP_SBB_SELFQ__
 
     CC_OP_DYNAMIC, /* must use dynamic code to get cc_op */
 } CCOp;
diff --git a/target/i386/cpu-dump.c b/target/i386/cpu-dump.c
index 67bf31e0caa..20a3002f013 100644
--- a/target/i386/cpu-dump.c
+++ b/target/i386/cpu-dump.c
@@ -91,6 +91,8 @@ static const char * const cc_op_str[] = {
     [CC_OP_BMILGQ] = "BMILGQ",
 
     [CC_OP_POPCNT] = "POPCNT",
+
+    [CC_OP_SBB_SELF] = "SBBx,x",
 };
 
 static void
diff --git a/target/i386/tcg/cc_helper.c b/target/i386/tcg/cc_helper.c
index 2c4170b5b77..91e492196af 100644
--- a/target/i386/tcg/cc_helper.c
+++ b/target/i386/tcg/cc_helper.c
@@ -100,6 +100,9 @@ target_ulong helper_cc_compute_all(target_ulong dst, target_ulong src1,
         return src1;
     case CC_OP_POPCNT:
         return dst ? 0 : CC_Z;
+    case CC_OP_SBB_SELF:
+        /* dst is either all zeros (--Z-P-) or all ones (-S-APC) */
+        return (dst & (CC_Z|CC_A|CC_C|CC_S)) ^ (CC_P | CC_Z);
 
     case CC_OP_ADCX:
         return compute_all_adcx(dst, src1, src2);
@@ -326,6 +329,9 @@ target_ulong helper_cc_compute_c(target_ulong dst, target_ulong src1,
     case CC_OP_MULQ:
         return src1 != 0;
 
+    case CC_OP_SBB_SELF:
+        return dst & 1;
+
     case CC_OP_ADCX:
     case CC_OP_ADCOX:
         return dst;
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index e91715af817..17ad4ccacaf 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -304,6 +304,7 @@ static const uint8_t cc_op_live_[] = {
     [CC_OP_ADOX] = USES_CC_SRC | USES_CC_SRC2,
     [CC_OP_ADCOX] = USES_CC_DST | USES_CC_SRC | USES_CC_SRC2,
     [CC_OP_POPCNT] = USES_CC_DST,
+    [CC_OP_SBB_SELF] = USES_CC_DST,
 };
 
 static uint8_t cc_op_live(CCOp op)
@@ -938,6 +939,9 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, TCGv reg)
         size = cc_op_size(s->cc_op);
         return gen_prepare_val_nz(cpu_cc_src, size, false);
 
+    case CC_OP_SBB_SELF:
+        return (CCPrepare) { .cond = TCG_COND_NE, .reg = cpu_cc_dst };
+
     case CC_OP_ADCX:
     case CC_OP_ADCOX:
         return (CCPrepare) { .cond = TCG_COND_NE, .reg = cpu_cc_dst,
@@ -999,6 +1003,7 @@ static CCPrepare gen_prepare_eflags_o(DisasContext *s, TCGv reg)
     case CC_OP_ADCOX:
         return (CCPrepare) { .cond = TCG_COND_NE, .reg = cpu_cc_src2,
                              .no_setcond = true };
+    case CC_OP_SBB_SELF:
     case CC_OP_LOGICB ... CC_OP_LOGICQ:
     case CC_OP_POPCNT:
         return (CCPrepare) { .cond = TCG_COND_NEVER };
@@ -1078,6 +1083,14 @@ static CCPrepare gen_prepare_cc(DisasContext *s, int b, TCGv reg)
         }
         break;
 
+    case CC_OP_SBB_SELF:
+        /* checking for nonzero is usually the most efficient */
+        if (jcc_op == JCC_L || jcc_op == JCC_B || jcc_op == JCC_S) {
+            jcc_op = JCC_Z;
+            inv = !inv;
+        }
+        goto slow_jcc;
+
     case CC_OP_LOGICB ... CC_OP_LOGICQ:
         /* Mostly used for test+jump */
         size = s->cc_op - CC_OP_LOGICB;
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 8dac4d09da1..0fde3d669d9 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -3876,37 +3876,16 @@ static void gen_SBB(DisasContext *s, X86DecodedInsn *decode)
         return;
     }
 
-    c_in = tcg_temp_new();
-    gen_compute_eflags_c(s, c_in);
-
-    /*
-     * Here the change is as follows:
-     * CC_SBB: src1 = T0, src2 = T0, src3 = c_in
-     * CC_SUB: src1 = 0, src2 = c_in (no src3)
-     *
-     * The difference also does not matter:
-     * - AF is bit 4 of dst^src1^src2, but bit 4 of src1^src2 is zero in both cases
-     *   therefore AF comes straight from dst (in fact it is c_in)
-     * - for OF, src1 and src2 have the same sign in both cases, meaning there
-     *   can be no overflow
-     */
+    /* SBB x,x has its own CCOp so that's even easier.  */
     if (decode->e.op2 != X86_TYPE_I && !decode->op[0].has_ea && decode->op[0].n == decode->op[2].n) {
-        if (s->cc_op == CC_OP_DYNAMIC) {
-            tcg_gen_neg_tl(s->T0, c_in);
-        } else {
-            /*
-             * Do not negate c_in because it will often be dead and only the
-             * instruction generated by negsetcond will survive.
-             */
-            gen_neg_setcc(s, JCC_B << 1, s->T0);
-        }
-        tcg_gen_movi_tl(s->cc_srcT, 0);
-        decode->cc_src = c_in;
-        decode->cc_dst = s->T0;
-        decode->cc_op = CC_OP_SUBB + ot;
+        gen_neg_setcc(s, JCC_B << 1, s->T0);
+        prepare_update1_cc(decode, s, CC_OP_SBB_SELF);
         return;
     }
 
+    c_in = tcg_temp_new();
+    gen_compute_eflags_c(s, c_in);
+
     if (s->prefix & PREFIX_LOCK) {
         tcg_gen_add_tl(s->T0, s->T1, c_in);
         tcg_gen_neg_tl(s->T0, s->T0);
-- 
2.52.0
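
The CC_OP_SBB_SELF case in helper_cc_compute_all() relies on CC_DST being
either 0 or all ones.  A minimal standalone check of that expression,
assuming the usual EFLAGS bit positions; sbb_self_result() and
sbb_self_flags() are hypothetical names used only for this sketch:

```c
#include <assert.h>
#include <stdint.h>

/* EFLAGS bit values at their architectural positions. */
#define CC_C 0x0001
#define CC_P 0x0004
#define CC_A 0x0010
#define CC_Z 0x0040
#define CC_S 0x0080
#define CC_O 0x0800

/* SBB x,x computes x - x - CF, i.e. 0 or -1 regardless of x. */
static inline uint64_t sbb_self_result(int carry_in)
{
    return carry_in ? ~(uint64_t)0 : 0;
}

/* Same expression as in the patch: pick S/Z/A/C out of dst, then flip Z
 * (set only for dst == 0) and set P (both 0x00 and 0xff have even parity). */
static inline uint32_t sbb_self_flags(uint64_t dst)
{
    return (uint32_t)(dst & (CC_Z | CC_A | CC_C | CC_S)) ^ (CC_P | CC_Z);
}

/* Usage example with hand-checked values. */
static inline void sbb_self_selfcheck(void)
{
    /* CF clear on input: result 0, flags --Z-P- */
    assert(sbb_self_flags(sbb_self_result(0)) == (CC_Z | CC_P));
    /* CF set on input: result -1, flags -S-APC */
    assert(sbb_self_flags(sbb_self_result(1)) == (CC_S | CC_A | CC_P | CC_C));
}
```

OF is never set in either case.  CF, SF and the "less than" condition all
reduce to dst != 0, which is why gen_prepare_cc() can rewrite JCC_L, JCC_B
and JCC_S as an inverted JCC_Z.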




* [PATCH 18/18] target/i386/tcg: move fetch code out of translate.c
  2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
                   ` (16 preceding siblings ...)
  2025-12-10 13:16 ` [PATCH 17/18] target/i386/tcg: add a CCOp for SBB x,x Paolo Bonzini
@ 2025-12-10 13:16 ` Paolo Bonzini
  2025-12-11 19:29   ` Richard Henderson
  17 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-10 13:16 UTC (permalink / raw)
  To: qemu-devel

Let translate.c only concern itself with TCG code generation.  Move everything
that uses CPUX86State*, as well as gen_lea_modrm_0 now that it is only used
to fill decode->mem, to decode-new.c.inc.

While at it also rename gen_lea_modrm_0.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/translate.c      | 271 ------------------------------
 target/i386/tcg/decode-new.c.inc | 277 ++++++++++++++++++++++++++++++-
 2 files changed, 274 insertions(+), 274 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 17ad4ccacaf..a905efdfbbd 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -1644,182 +1644,6 @@ static TCGv gen_shiftd_rm_T1(DisasContext *s, MemOp ot,
     return cc_src;
 }
 
-#define X86_MAX_INSN_LENGTH 15
-
-static uint64_t advance_pc(CPUX86State *env, DisasContext *s, int num_bytes)
-{
-    uint64_t pc = s->pc;
-
-    /* This is a subsequent insn that crosses a page boundary.  */
-    if (s->base.num_insns > 1 &&
-        !translator_is_same_page(&s->base, s->pc + num_bytes - 1)) {
-        siglongjmp(s->jmpbuf, 2);
-    }
-
-    s->pc += num_bytes;
-    if (unlikely(cur_insn_len(s) > X86_MAX_INSN_LENGTH)) {
-        /* If the instruction's 16th byte is on a different page than the 1st, a
-         * page fault on the second page wins over the general protection fault
-         * caused by the instruction being too long.
-         * This can happen even if the operand is only one byte long!
-         */
-        if (((s->pc - 1) ^ (pc - 1)) & TARGET_PAGE_MASK) {
-            (void)translator_ldub(env, &s->base,
-                                  (s->pc - 1) & TARGET_PAGE_MASK);
-        }
-        siglongjmp(s->jmpbuf, 1);
-    }
-
-    return pc;
-}
-
-static inline uint8_t x86_ldub_code(CPUX86State *env, DisasContext *s)
-{
-    return translator_ldub(env, &s->base, advance_pc(env, s, 1));
-}
-
-static inline uint16_t x86_lduw_code(CPUX86State *env, DisasContext *s)
-{
-    return translator_lduw(env, &s->base, advance_pc(env, s, 2));
-}
-
-static inline uint32_t x86_ldl_code(CPUX86State *env, DisasContext *s)
-{
-    return translator_ldl(env, &s->base, advance_pc(env, s, 4));
-}
-
-#ifdef TARGET_X86_64
-static inline uint64_t x86_ldq_code(CPUX86State *env, DisasContext *s)
-{
-    return translator_ldq(env, &s->base, advance_pc(env, s, 8));
-}
-#endif
-
-/* Decompose an address.  */
-
-static AddressParts gen_lea_modrm_0(CPUX86State *env, DisasContext *s,
-                                    int modrm, bool is_vsib)
-{
-    int def_seg, base, index, scale, mod, rm;
-    target_long disp;
-    bool havesib;
-
-    def_seg = R_DS;
-    index = -1;
-    scale = 0;
-    disp = 0;
-
-    mod = (modrm >> 6) & 3;
-    rm = modrm & 7;
-    base = rm | REX_B(s);
-
-    if (mod == 3) {
-        /* Normally filtered out earlier, but including this path
-           simplifies multi-byte nop, as well as bndcl, bndcu, bndcn.  */
-        goto done;
-    }
-
-    switch (s->aflag) {
-    case MO_64:
-    case MO_32:
-        havesib = 0;
-        if (rm == 4) {
-            int code = x86_ldub_code(env, s);
-            scale = (code >> 6) & 3;
-            index = ((code >> 3) & 7) | REX_X(s);
-            if (index == 4 && !is_vsib) {
-                index = -1;  /* no index */
-            }
-            base = (code & 7) | REX_B(s);
-            havesib = 1;
-        }
-
-        switch (mod) {
-        case 0:
-            if ((base & 7) == 5) {
-                base = -1;
-                disp = (int32_t)x86_ldl_code(env, s);
-                if (CODE64(s) && !havesib) {
-                    base = -2;
-                    disp += s->pc + s->rip_offset;
-                }
-            }
-            break;
-        case 1:
-            disp = (int8_t)x86_ldub_code(env, s);
-            break;
-        default:
-        case 2:
-            disp = (int32_t)x86_ldl_code(env, s);
-            break;
-        }
-
-        /* For correct popl handling with esp.  */
-        if (base == R_ESP && s->popl_esp_hack) {
-            disp += s->popl_esp_hack;
-        }
-        if (base == R_EBP || base == R_ESP) {
-            def_seg = R_SS;
-        }
-        break;
-
-    case MO_16:
-        if (mod == 0) {
-            if (rm == 6) {
-                base = -1;
-                disp = x86_lduw_code(env, s);
-                break;
-            }
-        } else if (mod == 1) {
-            disp = (int8_t)x86_ldub_code(env, s);
-        } else {
-            disp = (int16_t)x86_lduw_code(env, s);
-        }
-
-        switch (rm) {
-        case 0:
-            base = R_EBX;
-            index = R_ESI;
-            break;
-        case 1:
-            base = R_EBX;
-            index = R_EDI;
-            break;
-        case 2:
-            base = R_EBP;
-            index = R_ESI;
-            def_seg = R_SS;
-            break;
-        case 3:
-            base = R_EBP;
-            index = R_EDI;
-            def_seg = R_SS;
-            break;
-        case 4:
-            base = R_ESI;
-            break;
-        case 5:
-            base = R_EDI;
-            break;
-        case 6:
-            base = R_EBP;
-            def_seg = R_SS;
-            break;
-        default:
-        case 7:
-            base = R_EBX;
-            break;
-        }
-        break;
-
-    default:
-        g_assert_not_reached();
-    }
-
- done:
-    return (AddressParts){ def_seg, base, index, scale, disp };
-}
-
 /* Compute the address, with a minimum number of TCG ops.  */
 static TCGv gen_lea_modrm_1(DisasContext *s, AddressParts a, bool is_vsib)
 {
@@ -1904,79 +1728,6 @@ static void gen_st_modrm(DisasContext *s, X86DecodedInsn *decode, MemOp ot)
     }
 }
 
-static target_ulong insn_get_addr(CPUX86State *env, DisasContext *s, MemOp ot)
-{
-    target_ulong ret;
-
-    switch (ot) {
-    case MO_8:
-        ret = x86_ldub_code(env, s);
-        break;
-    case MO_16:
-        ret = x86_lduw_code(env, s);
-        break;
-    case MO_32:
-        ret = x86_ldl_code(env, s);
-        break;
-#ifdef TARGET_X86_64
-    case MO_64:
-        ret = x86_ldq_code(env, s);
-        break;
-#endif
-    default:
-        g_assert_not_reached();
-    }
-    return ret;
-}
-
-static inline uint32_t insn_get(CPUX86State *env, DisasContext *s, MemOp ot)
-{
-    uint32_t ret;
-
-    switch (ot) {
-    case MO_8:
-        ret = x86_ldub_code(env, s);
-        break;
-    case MO_16:
-        ret = x86_lduw_code(env, s);
-        break;
-    case MO_32:
-#ifdef TARGET_X86_64
-    case MO_64:
-#endif
-        ret = x86_ldl_code(env, s);
-        break;
-    default:
-        g_assert_not_reached();
-    }
-    return ret;
-}
-
-static target_long insn_get_signed(CPUX86State *env, DisasContext *s, MemOp ot)
-{
-    target_long ret;
-
-    switch (ot) {
-    case MO_8:
-        ret = (int8_t) x86_ldub_code(env, s);
-        break;
-    case MO_16:
-        ret = (int16_t) x86_lduw_code(env, s);
-        break;
-    case MO_32:
-        ret = (int32_t) x86_ldl_code(env, s);
-        break;
-#ifdef TARGET_X86_64
-    case MO_64:
-        ret = x86_ldq_code(env, s);
-        break;
-#endif
-    default:
-        g_assert_not_reached();
-    }
-    return ret;
-}
-
 static void gen_conditional_jump_labels(DisasContext *s, target_long diff,
                                         TCGLabel *not_taken, TCGLabel *taken)
 {
@@ -2221,28 +1972,6 @@ static void gen_leave(DisasContext *s)
     gen_op_mov_reg_v(s, a_ot, R_ESP, s->T1);
 }
 
-/* Similarly, except that the assumption here is that we don't decode
-   the instruction at all -- either a missing opcode, an unimplemented
-   feature, or just a bogus instruction stream.  */
-static void gen_unknown_opcode(CPUX86State *env, DisasContext *s)
-{
-    gen_illegal_opcode(s);
-
-    if (qemu_loglevel_mask(LOG_UNIMP)) {
-        FILE *logfile = qemu_log_trylock();
-        if (logfile) {
-            target_ulong pc = s->base.pc_next, end = s->pc;
-
-            fprintf(logfile, "ILLOPC: " TARGET_FMT_lx ":", pc);
-            for (; pc < end; ++pc) {
-                fprintf(logfile, " %02x", translator_ldub(env, &s->base, pc));
-            }
-            fprintf(logfile, "\n");
-            qemu_log_unlock(logfile);
-        }
-    }
-}
-
 /* an interrupt is different from an exception because of the
    privilege checks */
 static void gen_interrupt(DisasContext *s, uint8_t intno)
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 9d17bae7e75..b4aa300ab47 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -279,6 +279,130 @@
 
 #define UNKNOWN_OPCODE ((X86OpEntry) {})
 
+#define X86_MAX_INSN_LENGTH 15
+
+static uint64_t advance_pc(CPUX86State *env, DisasContext *s, int num_bytes)
+{
+    uint64_t pc = s->pc;
+
+    /* This is a subsequent insn that crosses a page boundary.  */
+    if (s->base.num_insns > 1 &&
+        !translator_is_same_page(&s->base, s->pc + num_bytes - 1)) {
+        siglongjmp(s->jmpbuf, 2);
+    }
+
+    s->pc += num_bytes;
+    if (unlikely(cur_insn_len(s) > X86_MAX_INSN_LENGTH)) {
+        /* If the instruction's 16th byte is on a different page than the 1st, a
+         * page fault on the second page wins over the general protection fault
+         * caused by the instruction being too long.
+         * This can happen even if the operand is only one byte long!
+         */
+        if (((s->pc - 1) ^ (pc - 1)) & TARGET_PAGE_MASK) {
+            (void)translator_ldub(env, &s->base,
+                                  (s->pc - 1) & TARGET_PAGE_MASK);
+        }
+        siglongjmp(s->jmpbuf, 1);
+    }
+
+    return pc;
+}
+
+static inline uint8_t x86_ldub_code(CPUX86State *env, DisasContext *s)
+{
+    return translator_ldub(env, &s->base, advance_pc(env, s, 1));
+}
+
+static inline uint16_t x86_lduw_code(CPUX86State *env, DisasContext *s)
+{
+    return translator_lduw(env, &s->base, advance_pc(env, s, 2));
+}
+
+static inline uint32_t x86_ldl_code(CPUX86State *env, DisasContext *s)
+{
+    return translator_ldl(env, &s->base, advance_pc(env, s, 4));
+}
+
+#ifdef TARGET_X86_64
+static inline uint64_t x86_ldq_code(CPUX86State *env, DisasContext *s)
+{
+    return translator_ldq(env, &s->base, advance_pc(env, s, 8));
+}
+#endif
+
+static target_ulong insn_get_addr(CPUX86State *env, DisasContext *s, MemOp ot)
+{
+    target_ulong ret;
+
+    switch (ot) {
+    case MO_8:
+        ret = x86_ldub_code(env, s);
+        break;
+    case MO_16:
+        ret = x86_lduw_code(env, s);
+        break;
+    case MO_32:
+        ret = x86_ldl_code(env, s);
+        break;
+#ifdef TARGET_X86_64
+    case MO_64:
+        ret = x86_ldq_code(env, s);
+        break;
+#endif
+    default:
+        g_assert_not_reached();
+    }
+    return ret;
+}
+
+static inline uint32_t insn_get(CPUX86State *env, DisasContext *s, MemOp ot)
+{
+    uint32_t ret;
+
+    switch (ot) {
+    case MO_8:
+        ret = x86_ldub_code(env, s);
+        break;
+    case MO_16:
+        ret = x86_lduw_code(env, s);
+        break;
+    case MO_32:
+#ifdef TARGET_X86_64
+    case MO_64:
+#endif
+        ret = x86_ldl_code(env, s);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    return ret;
+}
+
+static target_long insn_get_signed(CPUX86State *env, DisasContext *s, MemOp ot)
+{
+    target_long ret;
+
+    switch (ot) {
+    case MO_8:
+        ret = (int8_t) x86_ldub_code(env, s);
+        break;
+    case MO_16:
+        ret = (int16_t) x86_lduw_code(env, s);
+        break;
+    case MO_32:
+        ret = (int32_t) x86_ldl_code(env, s);
+        break;
+#ifdef TARGET_X86_64
+    case MO_64:
+        ret = x86_ldq_code(env, s);
+        break;
+#endif
+    default:
+        g_assert_not_reached();
+    }
+    return ret;
+}
+
 static uint8_t get_modrm(DisasContext *s, CPUX86State *env)
 {
     if (!s->has_modrm) {
@@ -1883,6 +2007,130 @@ static void decode_root(DisasContext *s, CPUX86State *env, X86OpEntry *entry, ui
     *entry = opcodes_root[*b];
 }
 
+/* Decompose an address.  */
+static AddressParts decode_modrm_address(CPUX86State *env, DisasContext *s,
+                                         int modrm, bool is_vsib)
+{
+    int def_seg, base, index, scale, mod, rm;
+    target_long disp;
+    bool havesib;
+
+    def_seg = R_DS;
+    index = -1;
+    scale = 0;
+    disp = 0;
+
+    mod = (modrm >> 6) & 3;
+    rm = modrm & 7;
+    base = rm | REX_B(s);
+
+    if (mod == 3) {
+        /* Normally filtered out earlier, but including this path
+           simplifies multi-byte nop, as well as bndcl, bndcu, bndcn.  */
+        goto done;
+    }
+
+    switch (s->aflag) {
+    case MO_64:
+    case MO_32:
+        havesib = 0;
+        if (rm == 4) {
+            int code = x86_ldub_code(env, s);
+            scale = (code >> 6) & 3;
+            index = ((code >> 3) & 7) | REX_X(s);
+            if (index == 4 && !is_vsib) {
+                index = -1;  /* no index */
+            }
+            base = (code & 7) | REX_B(s);
+            havesib = 1;
+        }
+
+        switch (mod) {
+        case 0:
+            if ((base & 7) == 5) {
+                base = -1;
+                disp = (int32_t)x86_ldl_code(env, s);
+                if (CODE64(s) && !havesib) {
+                    base = -2;
+                    disp += s->pc + s->rip_offset;
+                }
+            }
+            break;
+        case 1:
+            disp = (int8_t)x86_ldub_code(env, s);
+            break;
+        default:
+        case 2:
+            disp = (int32_t)x86_ldl_code(env, s);
+            break;
+        }
+
+        /* For correct popl handling with esp.  */
+        if (base == R_ESP && s->popl_esp_hack) {
+            disp += s->popl_esp_hack;
+        }
+        if (base == R_EBP || base == R_ESP) {
+            def_seg = R_SS;
+        }
+        break;
+
+    case MO_16:
+        if (mod == 0) {
+            if (rm == 6) {
+                base = -1;
+                disp = x86_lduw_code(env, s);
+                break;
+            }
+        } else if (mod == 1) {
+            disp = (int8_t)x86_ldub_code(env, s);
+        } else {
+            disp = (int16_t)x86_lduw_code(env, s);
+        }
+
+        switch (rm) {
+        case 0:
+            base = R_EBX;
+            index = R_ESI;
+            break;
+        case 1:
+            base = R_EBX;
+            index = R_EDI;
+            break;
+        case 2:
+            base = R_EBP;
+            index = R_ESI;
+            def_seg = R_SS;
+            break;
+        case 3:
+            base = R_EBP;
+            index = R_EDI;
+            def_seg = R_SS;
+            break;
+        case 4:
+            base = R_ESI;
+            break;
+        case 5:
+            base = R_EDI;
+            break;
+        case 6:
+            base = R_EBP;
+            def_seg = R_SS;
+            break;
+        default:
+        case 7:
+            base = R_EBX;
+            break;
+        }
+        break;
+
+    default:
+        g_assert_not_reached();
+    }
+
+ done:
+    return (AddressParts){ def_seg, base, index, scale, disp };
+}
+
 static int decode_modrm(DisasContext *s, CPUX86State *env,
                         X86DecodedInsn *decode, X86DecodedOp *op)
 {
@@ -1895,8 +2143,8 @@ static int decode_modrm(DisasContext *s, CPUX86State *env,
     } else {
         op->has_ea = true;
         op->n = -1;
-        decode->mem = gen_lea_modrm_0(env, s, modrm,
-                                      decode->e.vex_class == 12);
+        decode->mem = decode_modrm_address(env, s, get_modrm(s, env),
+                                           decode->e.vex_class == 12);
     }
     return modrm;
 }
@@ -2516,6 +2764,23 @@ illegal:
     return false;
 }
 
+static void dump_unknown_opcode(CPUX86State *env, DisasContext *s)
+{
+    if (qemu_loglevel_mask(LOG_UNIMP)) {
+        FILE *logfile = qemu_log_trylock();
+        if (logfile) {
+            target_ulong pc = s->base.pc_next, end = s->pc;
+
+            fprintf(logfile, "ILLOPC: " TARGET_FMT_lx ":", pc);
+            for (; pc < end; ++pc) {
+                fprintf(logfile, " %02x", translator_ldub(env, &s->base, pc));
+            }
+            fprintf(logfile, "\n");
+            qemu_log_unlock(logfile);
+        }
+    }
+}
+
 /*
  * Convert one instruction. s->base.is_jmp is set if the translation must
  * be stopped.
@@ -2902,5 +3167,11 @@ static void disas_insn(DisasContext *s, CPUState *cpu)
     gen_illegal_opcode(s);
     return;
  unknown_op:
-    gen_unknown_opcode(env, s);
+    /*
+     * Similarly, except that the assumption here is that we don't decode
+     * the instruction at all -- either a missing opcode, an unimplemented
+     * feature, or just a bogus instruction stream.
+     */
+    gen_illegal_opcode(s);
+    dump_unknown_opcode(env, s);
 }
-- 
2.52.0




* Re: [PATCH 01/18] target/i386/tcg: fix check for invalid VSIB instruction
  2025-12-10 13:16 ` [PATCH 01/18] target/i386/tcg: fix check for invalid VSIB instruction Paolo Bonzini
@ 2025-12-11 15:47   ` Richard Henderson
  2025-12-11 20:28     ` Paolo Bonzini
  0 siblings, 1 reply; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 15:47 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: qemu-stable

On 12/10/25 07:16, Paolo Bonzini wrote:
> VSIB instructions (VEX class 12) must not have an address prefix.
> Checking s->aflag == MO_16 is not enough because in 64-bit mode
> the address prefix changes aflag to MO_32.  Add a specific check
> bit instead.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   target/i386/tcg/decode-new.h     |  3 +++
>   target/i386/tcg/decode-new.c.inc | 27 +++++++++++++--------------
>   2 files changed, 16 insertions(+), 14 deletions(-)

Where do you see this?  I think this is wrong.

In particular,

Table 2-27. Type 12 Class Exception Conditions
- If address size attribute is 16 bit.

and

2.3.12 Vector SIB (VSIB) Memory Addressing
In 16-bit protected mode, VSIB memory addressing is permitted if address size attribute is 
overridden to 32 bits.

Therefore, in 16-bit mode, one *must* use the address prefix.
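
To make the argument concrete, a minimal sketch of the address-size rule
as read from the two SDM passages quoted above.  The enum and function
names are hypothetical and this is not QEMU code; it only models the
effective address size and the "16-bit address size faults" condition:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the current code segment's default size. */
typedef enum { MODE_16, MODE_32, MODE_64 } CodeMode;

/* The address-size prefix toggles between the two sizes each mode offers. */
static inline int effective_addr_bits(CodeMode mode, bool has_addr_prefix)
{
    switch (mode) {
    case MODE_64:
        return has_addr_prefix ? 32 : 64;
    case MODE_32:
        return has_addr_prefix ? 16 : 32;
    default:
        return has_addr_prefix ? 32 : 16;
    }
}

/* Under this reading, VSIB is rejected only when the resulting address
 * size is 16 bits, not whenever the prefix is present. */
static inline bool vsib_addr_size_ok(CodeMode mode, bool has_addr_prefix)
{
    return effective_addr_bits(mode, has_addr_prefix) != 16;
}

/* Usage example covering the two 16-bit-mode cases. */
static inline void vsib_selfcheck(void)
{
    assert(!vsib_addr_size_ok(MODE_16, false));  /* 16-bit addressing */
    assert(vsib_addr_size_ok(MODE_16, true));    /* overridden to 32 bits */
}
```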



r~



* Re: [PATCH 02/18] target/i386/tcg: ignore V3 in 32-bit mode
  2025-12-10 13:16 ` [PATCH 02/18] target/i386/tcg: ignore V3 in 32-bit mode Paolo Bonzini
@ 2025-12-11 15:52   ` Richard Henderson
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 15:52 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: qemu-stable

On 12/10/25 07:16, Paolo Bonzini wrote:
> From the manual: "In 64-bit mode all 4 bits may be used. [...]
> In 32-bit and 16-bit modes bit 6 must be 1 (if bit 6 is not 1, the
> 2-byte VEX version will generate LDS instruction and the 3-byte VEX
> version will ignore this bit)."
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   target/i386/tcg/decode-new.c.inc | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~
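
For reference, a minimal sketch of the rule quoted from the manual, as it
applies to the 3-byte VEX form.  vex3_vvvv() is a hypothetical helper, not
the code in decode-new.c.inc; it assumes the standard layout of the last
VEX byte (W, inverted vvvv in bits 6:3, L, pp):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract the register number encoded in VEX.vvvv.  The field is stored
 * inverted; outside 64-bit mode only 8 registers exist, so the top bit
 * (bit 6 of the byte) is ignored. */
static inline unsigned vex3_vvvv(uint8_t vex_last_byte, bool code64)
{
    unsigned v = ((vex_last_byte ^ 0xffu) >> 3) & 0xf;

    if (!code64) {
        v &= 7;
    }
    return v;
}

/* Usage example: vvvv = 0111b encodes register 8 in 64-bit mode only. */
static inline void vex3_selfcheck(void)
{
    assert(vex3_vvvv(0x38, true) == 8);
    assert(vex3_vvvv(0x38, false) == 0);
}
```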



* Re: [PATCH 03/18] target/i386/tcg: update cc_op after PUSHF
  2025-12-10 13:16 ` [PATCH 03/18] target/i386/tcg: update cc_op after PUSHF Paolo Bonzini
@ 2025-12-11 15:55   ` Richard Henderson
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 15:55 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

On 12/10/25 07:16, Paolo Bonzini wrote:
> PUSHF needs to compute the full eflags, set the cc_op to
> CC_OP_EFLAGS.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   target/i386/tcg/emit.c.inc | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
> index 1a7fab9333a..22e53f5b000 100644
> --- a/target/i386/tcg/emit.c.inc
> +++ b/target/i386/tcg/emit.c.inc
> @@ -3250,6 +3250,8 @@ static void gen_PUSHF(DisasContext *s, X86DecodedInsn *decode)
>       gen_update_cc_op(s);
>       gen_helper_read_eflags(s->T0, tcg_env);
>       gen_push_v(s, s->T0);
> +    decode->cc_src = s->T0;
> +    decode->cc_op = CC_OP_EFLAGS;
>   }
>   
>   static MemOp gen_shift_count(DisasContext *s, X86DecodedInsn *decode,

Ah, as an optimization to not duplicate computation of these flags, not a bug fix.  You 
might expand the commit message by a few words there.  Anyway,

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH 04/18] target/i386/tcg: mark more instructions that are invalid in 64-bit mode
  2025-12-10 13:16 ` [PATCH 04/18] target/i386/tcg: mark more instructions that are invalid in 64-bit mode Paolo Bonzini
@ 2025-12-11 15:59   ` Richard Henderson
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 15:59 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

On 12/10/25 07:16, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   target/i386/tcg/decode-new.c.inc | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~

> 
> diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
> index c9b4d5ffa32..213dbb9637c 100644
> --- a/target/i386/tcg/decode-new.c.inc
> +++ b/target/i386/tcg/decode-new.c.inc
> @@ -1698,9 +1698,9 @@ static const X86OpEntry opcodes_root[256] = {
>       [0xD1] = X86_OP_GROUP1(group2, E,v),
>       [0xD2] = X86_OP_GROUP2(group2, E,b, 1,b), /* CL */
>       [0xD3] = X86_OP_GROUP2(group2, E,v, 1,b), /* CL */
> -    [0xD4] = X86_OP_ENTRY2(AAM, 0,w, I,b),
> -    [0xD5] = X86_OP_ENTRY2(AAD, 0,w, I,b),
> -    [0xD6] = X86_OP_ENTRYw(SALC, 0,b),
> +    [0xD4] = X86_OP_ENTRY2(AAM, 0,w, I,b, chk(i64)),
> +    [0xD5] = X86_OP_ENTRY2(AAD, 0,w, I,b, chk(i64)),
> +    [0xD6] = X86_OP_ENTRYw(SALC, 0,b, chk(i64)),
>       [0xD7] = X86_OP_ENTRY1(XLAT, 0,b, zextT0), /* AL read/written */
>   
>       [0xE0] = X86_OP_ENTRYr(LOOPNE, J,b), /* implicit: CX with aflag size */
> @@ -1834,7 +1834,7 @@ static const X86OpEntry opcodes_root[256] = {
>       [0xCB] = X86_OP_ENTRY0(RETF),
>       [0xCC] = X86_OP_ENTRY0(INT3),
>       [0xCD] = X86_OP_ENTRYr(INT, I,b,  chk(vm86_iopl)),
> -    [0xCE] = X86_OP_ENTRY0(INTO),
> +    [0xCE] = X86_OP_ENTRY0(INTO, chk(i64)),
>       [0xCF] = X86_OP_ENTRY0(IRET,      chk(vm86_iopl) svm(IRET)),
>   
>       /*




* Re: [PATCH 05/18] target/i386/tcg: do not compute all flags for SAHF
  2025-12-10 13:16 ` [PATCH 05/18] target/i386/tcg: do not compute all flags for SAHF Paolo Bonzini
@ 2025-12-11 16:03   ` Richard Henderson
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 16:03 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

On 12/10/25 07:16, Paolo Bonzini wrote:
> Only OF is needed, the others are overwritten.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   target/i386/tcg/emit.c.inc | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
> index 22e53f5b000..131aefce53c 100644
> --- a/target/i386/tcg/emit.c.inc
> +++ b/target/i386/tcg/emit.c.inc
> @@ -3778,7 +3778,7 @@ static void gen_SAHF(DisasContext *s, X86DecodedInsn *decode)
>           return gen_illegal_opcode(s);
>       }
>       tcg_gen_shri_tl(s->T0, cpu_regs[R_EAX], 8);
> -    gen_compute_eflags(s);
> +    gen_neg_setcc(s, JCC_O << 1, cpu_cc_src);
>       tcg_gen_andi_tl(cpu_cc_src, cpu_cc_src, CC_O);
>       tcg_gen_andi_tl(s->T0, s->T0, CC_S | CC_Z | CC_A | CC_P | CC_C);
>       tcg_gen_or_tl(cpu_cc_src, cpu_cc_src, s->T0);
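
For readers following along, a minimal sketch of why only OF has to be
preserved: SAHF replaces SF/ZF/AF/PF/CF with bits from AH, so the old values
of those five flags can never reach the result.  This is a plain C model of
the merge, not the TCG code; sahf_merge() and check_sahf_merge() are
hypothetical names, and the CC_* values are the architectural EFLAGS bit
positions.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural EFLAGS bits for the arithmetic flags. */
#define CC_C 0x0001u
#define CC_P 0x0004u
#define CC_A 0x0010u
#define CC_Z 0x0040u
#define CC_S 0x0080u
#define CC_O 0x0800u

/* Model of SAHF: keep OF from the old flags, take the rest from AH. */
static uint32_t sahf_merge(uint32_t old_flags, uint8_t ah)
{
    uint32_t kept = old_flags & CC_O;
    uint32_t loaded = ah & (CC_S | CC_Z | CC_A | CC_P | CC_C);

    return kept | loaded;
}

static void check_sahf_merge(void)
{
    /* Old S/Z/A/P/C never survive, whatever they were. */
    assert(sahf_merge(CC_S | CC_Z | CC_A | CC_P | CC_C, 0x00) == 0);
    /* OF passes through untouched; reserved AH bits are dropped. */
    assert(sahf_merge(CC_O, 0xff) == (CC_O | 0xd5u));
}
```

Since the result depends on the old flags only through OF, computing just
that one condition (instead of materializing all of EFLAGS) is enough.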

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH 06/18] target/i386/tcg: remove do_decode_0F
  2025-12-10 13:16 ` [PATCH 06/18] target/i386/tcg: remove do_decode_0F Paolo Bonzini
@ 2025-12-11 16:03   ` Richard Henderson
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 16:03 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

On 12/10/25 07:16, Paolo Bonzini wrote:
> It is not needed anymore since all prefixes are handled by the
> new decoder.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   target/i386/tcg/decode-new.c.inc | 7 +------
>   1 file changed, 1 insertion(+), 6 deletions(-)
> 
> diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
> index 213dbb9637c..ea8e26f7f98 100644
> --- a/target/i386/tcg/decode-new.c.inc
> +++ b/target/i386/tcg/decode-new.c.inc
> @@ -1430,15 +1430,10 @@ static const X86OpEntry opcodes_0F[256] = {
>       [0xff] = X86_OP_ENTRYr(UD,     nop,v),                        /* UD0 */
>   };
>   
> -static void do_decode_0F(DisasContext *s, CPUX86State *env, X86OpEntry *entry, uint8_t *b)
> -{
> -    *entry = opcodes_0F[*b];
> -}
> -
>   static void decode_0F(DisasContext *s, CPUX86State *env, X86OpEntry *entry, uint8_t *b)
>   {
>       *b = x86_ldub_code(env, s);
> -    do_decode_0F(s, env, entry, b);
> +    *entry = opcodes_0F[*b];
>   }
>   
>   static void decode_63(DisasContext *s, CPUX86State *env, X86OpEntry *entry, uint8_t *b)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH 07/18] target/i386/tcg: move and expand misplaced comment
  2025-12-10 13:16 ` [PATCH 07/18] target/i386/tcg: move and expand misplaced comment Paolo Bonzini
@ 2025-12-11 16:04   ` Richard Henderson
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 16:04 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

On 12/10/25 07:16, Paolo Bonzini wrote:
> @@ -2222,6 +2217,10 @@ static bool decode_insn(DisasContext *s, CPUX86State *env, X86DecodeFunc decode_
>   {
>       X86OpEntry *e = &decode->e;
>   
> +    /*
> +     * Each step decodes part of the opcode and place the last not-fully-decoded

places

> +     * byte in decode->b.  If the modrm byte is read, it is placed in s->modrm.
> +     */
>       decode_func(s, env, e, &decode->b);
>       while (e->is_decode) {
>           e->is_decode = false;

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH 08/18] target/i386/tcg: simplify effective address calculation
  2025-12-10 13:16 ` [PATCH 08/18] target/i386/tcg: simplify effective address calculation Paolo Bonzini
@ 2025-12-11 16:15   ` Richard Henderson
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 16:15 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

On 12/10/25 07:16, Paolo Bonzini wrote:
> Split gen_lea_v_seg_dest into three simple phases (extend from
> 16 bits, add, final extend), with optimization for known-zero bases
> to avoid back-to-back extensions.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   target/i386/tcg/translate.c | 64 ++++++++++++-------------------------
>   1 file changed, 20 insertions(+), 44 deletions(-)
> 
> diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
> index 0cb87d02012..2ab3c2ac663 100644
> --- a/target/i386/tcg/translate.c
> +++ b/target/i386/tcg/translate.c
> @@ -627,54 +627,30 @@ static TCGv eip_cur_tl(DisasContext *s)
>   static void gen_lea_v_seg_dest(DisasContext *s, MemOp aflag, TCGv dest, TCGv a0,
>                                  int def_seg, int ovr_seg)
>   {
> -    switch (aflag) {
> -#ifdef TARGET_X86_64
> -    case MO_64:
> -        if (ovr_seg < 0) {
> -            tcg_gen_mov_tl(dest, a0);
> -            return;
> +    int easize;
> +    bool has_base;
> +
> +    if (ovr_seg < 0) {
> +        ovr_seg = def_seg;
> +    }
> +
> +    has_base = ovr_seg >= 0 && (ADDSEG(s) || ovr_seg >= R_FS);

I guess def_seg is -1 for LEA, so ovr_seg can still be -1.
I wonder if it would be clearer to avoid this duplication of segment earlier in decode?

Anyway, for here, maybe clearer as

     has_base = ovr_seg >= R_FS || (ovr_seg >= 0 && ADDSEG(s));

even though the end result is the same.
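
As a quick sanity check of "the end result is the same", here is a small
standalone sketch that compares the two expressions exhaustively.  It assumes
the usual segment numbering (ES=0 ... GS=5) and models ADDSEG(s) as a plain
boolean; has_base_patch(), has_base_review() and check_has_base_forms() are
hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Segment register numbering, assumed to match target/i386/cpu.h. */
enum { R_ES, R_CS, R_SS, R_DS, R_FS, R_GS };

/* Form used in the patch. */
static bool has_base_patch(int seg, bool addseg)
{
    return seg >= 0 && (addseg || seg >= R_FS);
}

/* Form suggested in the review. */
static bool has_base_review(int seg, bool addseg)
{
    return seg >= R_FS || (seg >= 0 && addseg);
}

/* Compare both forms for every segment, including -1 (the LEA case). */
static void check_has_base_forms(void)
{
    for (int seg = -1; seg <= R_GS; seg++) {
        for (int addseg = 0; addseg <= 1; addseg++) {
            assert(has_base_patch(seg, addseg) ==
                   has_base_review(seg, addseg));
        }
    }
}
```

The two agree because seg >= R_FS already implies seg >= 0, so the review
form only reorders the tests to put the "FS/GS always have a base" case first.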

Nice cleanup.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~



* Re: [PATCH 09/18] target/i386/tcg: unnest switch statements in disas_insn_x87
  2025-12-10 13:16 ` [PATCH 09/18] target/i386/tcg: unnest switch statements in disas_insn_x87 Paolo Bonzini
@ 2025-12-11 16:20   ` Richard Henderson
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 16:20 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

On 12/10/25 07:16, Paolo Bonzini wrote:
> @@ -2801,22 +2785,16 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
>               }
>               break;
>           case 0x00: case 0x01: case 0x04 ... 0x07: /* fxxx st, sti */
> +            gen_helper_fmov_FT0_STN(tcg_env,
> +                                    tcg_constant_i32(opreg));
> +            gen_helper_fp_arith_ST0_FT0(op & 7);
> +            break;
> +
>           case 0x20: case 0x21: case 0x24 ... 0x27: /* fxxx sti, st */
>           case 0x30: case 0x31: case 0x34 ... 0x37: /* fxxxp sti, st */
> -            {
> -                int op1;
> -
> -                op1 = op & 7;
> -                if (op >= 0x20) {
> -                    gen_helper_fp_arith_STN_ST0(op1, opreg);
> -                    if (op >= 0x30) {
> -                        gen_helper_fpop(tcg_env);
> -                    }
> -                } else {
> -                    gen_helper_fmov_FT0_STN(tcg_env,
> -                                            tcg_constant_i32(opreg));
> -                    gen_helper_fp_arith_ST0_FT0(op1);
> -                }
> +            gen_helper_fp_arith_STN_ST0(op & 7, opreg);
> +            if (op >= 0x30) {
> +                gen_helper_fpop(tcg_env);
>               }
>               break;

Leaving the op >= 0x30 check here?
I'd have expected

case 0x20: case 0x21: case 0x24 ... 0x27: /* fxxx sti, st */
     gen_helper_fp_arith_STN_ST0(op & 7, opreg);
     break;
case 0x30: case 0x31: case 0x34 ... 0x37: /* fxxxp sti, st */
     gen_helper_fp_arith_STN_ST0(op & 7, opreg);
     gen_helper_fpop(tcg_env);
     break;

Anyway,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~



* Re: [PATCH 10/18] target/i386/tcg: move fcom/fcomp differentiation to gen_helper_fp_arith_ST0_FT0
  2025-12-10 13:16 ` [PATCH 10/18] target/i386/tcg: move fcom/fcomp differentiation to gen_helper_fp_arith_ST0_FT0 Paolo Bonzini
@ 2025-12-11 16:21   ` Richard Henderson
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 16:21 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

On 12/10/25 07:16, Paolo Bonzini wrote:
> There is only one call site for gen_helper_fp_arith_ST0_FT0(), therefore
> there is no need to check the op1 == 3 in the caller.  Once this is done,
> eliminate the goto to that call site.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   target/i386/tcg/translate.c | 23 ++++++++---------------
>   1 file changed, 8 insertions(+), 15 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

> 
> diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
> index c755329b3d9..3c55b62bdec 100644
> --- a/target/i386/tcg/translate.c
> +++ b/target/i386/tcg/translate.c
> @@ -1485,6 +1485,7 @@ static void gen_helper_fp_arith_ST0_FT0(int op)
>           break;
>       case 3:
>           gen_helper_fcom_ST0_FT0(tcg_env);
> +        gen_helper_fpop(tcg_env);
>           break;
>       case 4:
>           gen_helper_fsub_ST0_FT0(tcg_env);
> @@ -2460,36 +2461,28 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
>               tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
>                                   s->mem_index, MO_LEUL);
>               gen_helper_flds_FT0(tcg_env, s->tmp2_i32);
> -            goto fp_arith_ST0_FT0;
> +            gen_helper_fp_arith_ST0_FT0(op & 7);
> +            break;
>   
>           case 0x10 ... 0x17: /* fixxxl */
>               tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
>                                   s->mem_index, MO_LEUL);
>               gen_helper_fildl_FT0(tcg_env, s->tmp2_i32);
> -            goto fp_arith_ST0_FT0;
> +            gen_helper_fp_arith_ST0_FT0(op & 7);
> +            break;
>   
>           case 0x20 ... 0x27: /* fxxxl */
>               tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0,
>                                   s->mem_index, MO_LEUQ);
>               gen_helper_fldl_FT0(tcg_env, s->tmp1_i64);
> -            goto fp_arith_ST0_FT0;
> +            gen_helper_fp_arith_ST0_FT0(op & 7);
> +            break;
>   
>           case 0x30 ... 0x37: /* fixxx */
>               tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
>                                   s->mem_index, MO_LESW);
>               gen_helper_fildl_FT0(tcg_env, s->tmp2_i32);
> -            goto fp_arith_ST0_FT0;
> -
> -fp_arith_ST0_FT0:
> -            {
> -                int op1 = op & 7;
> -
> -                gen_helper_fp_arith_ST0_FT0(op1);
> -                if (op1 == 3) {
> -                    /* fcomp needs pop */
> -                    gen_helper_fpop(tcg_env);
> -                }
> -            }
> +            gen_helper_fp_arith_ST0_FT0(op & 7);
>               break;
>   
>           case 0x08: /* flds */




* Re: [PATCH 11/18] target/i386/tcg: reuse gen_helper_fp_arith_ST0_FT0 for fcom STn and fcomp STn
  2025-12-10 13:16 ` [PATCH 11/18] target/i386/tcg: reuse gen_helper_fp_arith_ST0_FT0 for fcom STn and fcomp STn Paolo Bonzini
@ 2025-12-11 16:24   ` Richard Henderson
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 16:24 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

On 12/10/25 07:16, Paolo Bonzini wrote:
> Treat specially the undocumented ops, instead of treating specially the
> two d8/0 opcodes that have undocumented variants: just call
> gen_helper_fp_arith_ST0_FT0 for all opcodes in the d8/0 encoding.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   target/i386/tcg/translate.c | 4 +---
>   1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
> index 3c55b62bdec..8f50071a4f4 100644
> --- a/target/i386/tcg/translate.c
> +++ b/target/i386/tcg/translate.c
> @@ -2777,7 +2777,7 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
>                   break;
>               }
>               break;
> -        case 0x00: case 0x01: case 0x04 ... 0x07: /* fxxx st, sti */
> +        case 0x00 ... 0x07: /* fxxx st, sti */
>               gen_helper_fmov_FT0_STN(tcg_env,
>                                       tcg_constant_i32(opreg));
>               gen_helper_fp_arith_ST0_FT0(op & 7);
> @@ -2790,12 +2790,10 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
>                   gen_helper_fpop(tcg_env);
>               }
>               break;
> -        case 0x02: /* fcom */
>           case 0x22: /* fcom2, undocumented op */
>               gen_helper_fmov_FT0_STN(tcg_env, tcg_constant_i32(opreg));
>               gen_helper_fcom_ST0_FT0(tcg_env);
>               break;
> -        case 0x03: /* fcomp */
>           case 0x23: /* fcomp3, undocumented op */
>           case 0x32: /* fcomp5, undocumented op */
>               gen_helper_fmov_FT0_STN(tcg_env, tcg_constant_i32(opreg));

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH 12/18] target/i386/tcg: reuse gen_helper_fp_arith_ST0_FT0 for undocumented fcom/fcomp variants
  2025-12-10 13:16 ` [PATCH 12/18] target/i386/tcg: reuse gen_helper_fp_arith_ST0_FT0 for undocumented fcom/fcomp variants Paolo Bonzini
@ 2025-12-11 16:26   ` Richard Henderson
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 16:26 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

On 12/10/25 07:16, Paolo Bonzini wrote:
> For 0x32 hack the op to be fcomp; for the others there isn't even anything special
> to do.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   target/i386/tcg/translate.c | 15 +++++----------
>   1 file changed, 5 insertions(+), 10 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~

> 
> diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
> index 8f50071a4f4..f47bb5de8b3 100644
> --- a/target/i386/tcg/translate.c
> +++ b/target/i386/tcg/translate.c
> @@ -2777,7 +2777,12 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
>                   break;
>               }
>               break;
> +        case 0x32: /* fcomp5, undocumented op */
> +            /* map to fcomp; op & 7 == 2 would not pop  */
> +            op = 0x03;
> +            /* fallthrough */
>           case 0x00 ... 0x07: /* fxxx st, sti */
> +        case 0x22 ... 0x23: /* fcom2 and fcomp3, undocumented ops */
>               gen_helper_fmov_FT0_STN(tcg_env,
>                                       tcg_constant_i32(opreg));
>               gen_helper_fp_arith_ST0_FT0(op & 7);
> @@ -2790,16 +2795,6 @@ static void gen_x87(DisasContext *s, X86DecodedInsn *decode)
>                   gen_helper_fpop(tcg_env);
>               }
>               break;
> -        case 0x22: /* fcom2, undocumented op */
> -            gen_helper_fmov_FT0_STN(tcg_env, tcg_constant_i32(opreg));
> -            gen_helper_fcom_ST0_FT0(tcg_env);
> -            break;
> -        case 0x23: /* fcomp3, undocumented op */
> -        case 0x32: /* fcomp5, undocumented op */
> -            gen_helper_fmov_FT0_STN(tcg_env, tcg_constant_i32(opreg));
> -            gen_helper_fcom_ST0_FT0(tcg_env);
> -            gen_helper_fpop(tcg_env);
> -            break;
>           case 0x15: /* da/5 */
>               switch (rm) {
>               case 1: /* fucompp */




* Re: [PATCH 14/18] target/i386/tcg: kill tmp1_i64
  2025-12-10 13:16 ` [PATCH 14/18] target/i386/tcg: kill tmp1_i64 Paolo Bonzini
@ 2025-12-11 16:28   ` Richard Henderson
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 16:28 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

On 12/10/25 07:16, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   target/i386/tcg/translate.c | 66 ++++++++++++++++++++--------------
>   target/i386/tcg/emit.c.inc  | 72 ++++++++++++++++++++++---------------
>   2 files changed, 84 insertions(+), 54 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH 15/18] target/i386/tcg: kill tmp2_i32
  2025-12-10 13:16 ` [PATCH 15/18] target/i386/tcg: kill tmp2_i32 Paolo Bonzini
@ 2025-12-11 16:29   ` Richard Henderson
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 16:29 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

On 12/10/25 07:16, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   target/i386/tcg/translate.c | 121 +++++++++++++++++++++---------------
>   1 file changed, 71 insertions(+), 50 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH 16/18] target/i386/tcg: commonize code to compute SF/ZF/PF
  2025-12-10 13:16 ` [PATCH 16/18] target/i386/tcg: commonize code to compute SF/ZF/PF Paolo Bonzini
@ 2025-12-11 18:46   ` Richard Henderson
  2025-12-12 15:45     ` Paolo Bonzini
  0 siblings, 1 reply; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 18:46 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

On 12/10/25 07:16, Paolo Bonzini wrote:
> +psz_b:
> +    shift += 8;
> +psz_w:
> +    shift += 16;
> +psz_l:
> +#ifdef TARGET_X86_64
> +    shift += 32;
> +psz_q:
> +#endif

Oof.  Use cc_op_size instead of a set of gotos.
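
To illustrate what the suggestion amounts to, a rough sketch: the quoted
labels fall through and accumulate a shift, and the same number can be
computed directly from the operand size.  The size argument (0..3 for
b/w/l/q) stands in for whatever cc_op_size returns; shift_by_fallthrough(),
shift_by_size() and check_shift_forms() are hypothetical names, and what the
shift is used for afterwards is not visible in the quoted fragment.

```c
#include <assert.h>

/* Shift accumulated by entering the fall-through chain at a given label. */
static int shift_by_fallthrough(int size, int target_long_bits)
{
    int shift = 0;

    switch (size) {
    case 0:                     /* psz_b */
        shift += 8;
        /* fall through */
    case 1:                     /* psz_w */
        shift += 16;
        /* fall through */
    case 2:                     /* psz_l */
        if (target_long_bits == 64) {
            shift += 32;        /* only under TARGET_X86_64 */
        }
        /* fall through */
    default:                    /* psz_q */
        break;
    }
    return shift;
}

/* Same value, computed from the operand size without any labels. */
static int shift_by_size(int size, int target_long_bits)
{
    return target_long_bits - (8 << size);
}

static void check_shift_forms(void)
{
    for (int size = 0; size <= 3; size++) {
        assert(shift_by_fallthrough(size, 64) == shift_by_size(size, 64));
    }
    for (int size = 0; size <= 2; size++) {
        assert(shift_by_fallthrough(size, 32) == shift_by_size(size, 32));
    }
}
```

In both target widths the shift is simply the number of bits above the
operand, which is why a size lookup can replace the chain of gotos.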


r~



* Re: [PATCH 17/18] target/i386/tcg: add a CCOp for SBB x,x
  2025-12-10 13:16 ` [PATCH 17/18] target/i386/tcg: add a CCOp for SBB x,x Paolo Bonzini
@ 2025-12-11 19:11   ` Richard Henderson
  2025-12-12 17:49     ` Paolo Bonzini
  0 siblings, 1 reply; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 19:11 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

On 12/10/25 07:16, Paolo Bonzini wrote:
> This is more efficient both when generating code and when testing
> flags.

I guess sbb x,x appears quite frequently in x86 setcc computation, and the testing of the 
flags is much less important than the straight line code generation?


> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index ecca38ed0b5..314e773a5d4 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1515,7 +1515,18 @@ typedef enum {
>       CC_OP_POPCNTL__,
>       CC_OP_POPCNTQ__,
>       CC_OP_POPCNT = sizeof(target_ulong) == 8 ? CC_OP_POPCNTQ__ : CC_OP_POPCNTL__,
> -#define CC_OP_LAST_BWLQ CC_OP_POPCNTQ__
> +
> +    /*
> +     * Note that only CC_OP_SBB_SELF (i.e. the one with MO_TL size)
> +     * is used or implemented, because the translation produces a
> +     * sign-extended CC_DST.
> +     */
> +    CC_OP_SBB_SELFB__, /* S/Z/C/A via CC_DST, O clear, P set.  */
> +    CC_OP_SBB_SELFW__,
> +    CC_OP_SBB_SELFL__,
> +    CC_OP_SBB_SELFQ__,
> +    CC_OP_SBB_SELF = sizeof(target_ulong) == 8 ? CC_OP_SBB_SELFQ__ : CC_OP_SBB_SELFL__,
> +#define CC_OP_LAST_BWLQ CC_OP_SBB_SELFQ__

The documentation here could be improved to note that CC_DST is always in {-1, 0}.  The 
fact that you can derive all other flags via masking is less immediately relevant.
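
A small sketch of the {-1, 0} property and the masking, for illustration
only: it is a plain C model, not the helper from the patch.  The CC_* values
are the architectural EFLAGS bits; sbb_self_flags(), sbb_flags_reference()
and check_sbb_self_flags() are hypothetical names.  The reference function
computes the flags of a byte-sized "sbb x,x" the long way so the shortcut can
be checked against it.

```c
#include <assert.h>
#include <stdint.h>

#define CC_C 0x0001u
#define CC_P 0x0004u
#define CC_A 0x0010u
#define CC_Z 0x0040u
#define CC_S 0x0080u
#define CC_O 0x0800u

/* Flags after "sbb x,x" given dst = -CF, i.e. dst is 0 or -1. */
static uint32_t sbb_self_flags(int64_t dst)
{
    uint32_t d = (uint32_t)dst;

    /* S/C/A copy the all-ones pattern, Z is its inverse, P is always set. */
    return (d & (CC_S | CC_A | CC_C)) | (~d & CC_Z) | CC_P;
}

/* Reference: flags of the 8-bit subtraction x - x - cf, bit by bit. */
static uint32_t sbb_flags_reference(uint8_t x, unsigned cf)
{
    uint8_t res = (uint8_t)(x - x - cf);
    uint32_t flags = 0;
    int ones = 0;

    for (int bit = 0; bit < 8; bit++) {
        ones += (res >> bit) & 1;
    }
    flags |= cf ? CC_C : 0;                         /* borrow out */
    flags |= (ones % 2 == 0) ? CC_P : 0;            /* even parity */
    flags |= ((x ^ x ^ res) & 0x10) ? CC_A : 0;     /* borrow from bit 4 */
    flags |= (res == 0) ? CC_Z : 0;
    flags |= (res & 0x80) ? CC_S : 0;
    flags |= ((x ^ x) & (x ^ res) & 0x80) ? CC_O : 0;
    return flags;
}

static void check_sbb_self_flags(void)
{
    for (unsigned x = 0; x < 256; x++) {
        for (unsigned cf = 0; cf <= 1; cf++) {
            int64_t dst = -(int64_t)cf;
            assert(sbb_self_flags(dst) ==
                   sbb_flags_reference((uint8_t)x, cf));
        }
    }
}
```

The result never depends on x, only on the incoming carry, which matches the
"O clear, P set" note in the quoted comment.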

Otherwise,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~



* Re: [PATCH 18/18] target/i386/tcg: move fetch code out of translate.c
  2025-12-10 13:16 ` [PATCH 18/18] target/i386/tcg: move fetch code out of translate.c Paolo Bonzini
@ 2025-12-11 19:29   ` Richard Henderson
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 19:29 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

On 12/10/25 07:16, Paolo Bonzini wrote:
> Let translate.c only concern itself with TCG code generation.  Move everything
> that uses CPUX86State*, as well as gen_lea_modrm_0 now that it is only used
> to fill decode->mem, to decode-new.c.inc.
> 
> While at it also rename gen_lea_modrm_0.
> 
> Signed-off-by: Paolo Bonzini<pbonzini@redhat.com>
> ---
>   target/i386/tcg/translate.c      | 271 ------------------------------
>   target/i386/tcg/decode-new.c.inc | 277 ++++++++++++++++++++++++++++++-
>   2 files changed, 274 insertions(+), 274 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH 01/18] target/i386/tcg: fix check for invalid VSIB instruction
  2025-12-11 15:47   ` Richard Henderson
@ 2025-12-11 20:28     ` Paolo Bonzini
  2025-12-11 22:22       ` Richard Henderson
  0 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-11 20:28 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-stable

On Thu, Dec 11, 2025 at 4:47 PM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 12/10/25 07:16, Paolo Bonzini wrote:
> > VSIB instructions (VEX class 12) must not have an address prefix.
> > Checking s->aflag == MO_16 is not enough because in 64-bit mode
> > the address prefix changes aflag to MO_32.  Add a specific check
> > bit instead.
> >
> > Cc: qemu-stable@nongnu.org
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >   target/i386/tcg/decode-new.h     |  3 +++
> >   target/i386/tcg/decode-new.c.inc | 27 +++++++++++++--------------
> >   2 files changed, 16 insertions(+), 14 deletions(-)
>
> Where do you see this?  I think this is wrong.

Yes, I was confused by the comment and by QEMU's incorrect decoding logic:

        if (CODE32(s) && !VM86(s)) {

which should be changed to

       if (PE(s) && !VM86(s)) {

And by the way, this also means that we need either separate helpers
for 32- and 64-bit addresses, or a mask argument.

Paolo

> In particular,
>
> Table 2-27. Type 12 Class Exception Conditions
> - If address size attribute is 16 bit.
>
> and
>
> 2.3.12 Vector SIB (VSIB) Memory Addressing
> In 16-bit protected mode, VSIB memory addressing is permitted if address size attribute is
> overridden to 32 bits.
>
> Therefore, in 16-bit mode, one *must* use the address prefix.
>
>
>
> r~
>




* Re: [PATCH 01/18] target/i386/tcg: fix check for invalid VSIB instruction
  2025-12-11 20:28     ` Paolo Bonzini
@ 2025-12-11 22:22       ` Richard Henderson
  2025-12-12  2:06         ` Paolo Bonzini
  0 siblings, 1 reply; 42+ messages in thread
From: Richard Henderson @ 2025-12-11 22:22 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, qemu-stable

On 12/11/25 14:28, Paolo Bonzini wrote:
> On Thu, Dec 11, 2025 at 4:47 PM Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> On 12/10/25 07:16, Paolo Bonzini wrote:
>>> VSIB instructions (VEX class 12) must not have an address prefix.
>>> Checking s->aflag == MO_16 is not enough because in 64-bit mode
>>> the address prefix changes aflag to MO_32.  Add a specific check
>>> bit instead.
>>>
>>> Cc: qemu-stable@nongnu.org
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>> ---
>>>    target/i386/tcg/decode-new.h     |  3 +++
>>>    target/i386/tcg/decode-new.c.inc | 27 +++++++++++++--------------
>>>    2 files changed, 16 insertions(+), 14 deletions(-)
>>
>> Where do you see this?  I think this is wrong.
> 
> Yes, I was confused by the comment and by QEMU's incorrect decoding logic:
> 
>          if (CODE32(s) && !VM86(s)) {
> 
> which should be changed to
> 
>         if (PE(s) && !VM86(s)) {

I can't find the language for that.  Can you point me at it?

> And by the way, this also means that we need either separate helpers
> for 32- and 64-bit addresses, or a mask argument.

Of course.


r~



* Re: [PATCH 01/18] target/i386/tcg: fix check for invalid VSIB instruction
  2025-12-11 22:22       ` Richard Henderson
@ 2025-12-12  2:06         ` Paolo Bonzini
  2025-12-12 14:37           ` Richard Henderson
  0 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-12  2:06 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-stable


On Thu, Dec 11, 2025, 23:22 Richard Henderson <richard.henderson@linaro.org>
wrote:

> > Yes, I was confused by the comment and by QEMU's incorrect decoding logic:
> >
> >          if (CODE32(s) && !VM86(s)) {
> >
> > which should be changed to
> >
> >         if (PE(s) && !VM86(s)) {
>
> I can't find the language for that.  Can you point me at it?
>

It's the exception condition tables. They all mention that you get #UD for
the VEX prefix in real or vm86 mode.

Several BMI instructions also have language like "This instruction is not
supported in real mode and virtual-8086 mode".
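
As an illustration of the difference between the two conditions, a minimal
sketch with the mode bits modelled as plain booleans (pe, vm86 and code32
stand in for PE(s), VM86(s) and CODE32(s); struct cpu_mode and the function
names are hypothetical).  It only models the mode predicate, not the rest of
VEX decoding.

```c
#include <assert.h>
#include <stdbool.h>

struct cpu_mode {
    bool pe;        /* protected mode enabled */
    bool vm86;      /* virtual-8086 mode */
    bool code32;    /* 32-bit code segment */
};

/* Existing check: rejects VEX whenever the code segment is 16-bit. */
static bool vex_allowed_code32(struct cpu_mode m)
{
    return m.code32 && !m.vm86;
}

/* Proposed check: rejects VEX only in real mode and vm86 mode. */
static bool vex_allowed_pe(struct cpu_mode m)
{
    return m.pe && !m.vm86;
}

static void check_vex_mode_predicates(void)
{
    struct cpu_mode real = { .pe = false, .vm86 = false, .code32 = false };
    struct cpu_mode vm86 = { .pe = true, .vm86 = true, .code32 = false };
    struct cpu_mode prot16 = { .pe = true, .vm86 = false, .code32 = false };
    struct cpu_mode prot32 = { .pe = true, .vm86 = false, .code32 = true };

    /* Both checks agree on real mode, vm86 and 32-bit protected mode. */
    assert(!vex_allowed_code32(real) && !vex_allowed_pe(real));
    assert(!vex_allowed_code32(vm86) && !vex_allowed_pe(vm86));
    assert(vex_allowed_code32(prot32) && vex_allowed_pe(prot32));

    /* They differ for 16-bit protected mode. */
    assert(!vex_allowed_code32(prot16) && vex_allowed_pe(prot16));
}
```

The only case where the two disagree in this model is 16-bit protected mode,
which is exactly the case where VSIB needs the address-size prefix.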

Paolo


> > And by the way, this also means that we need either separate helpers
> > for 32- and 64-bit addresses, or a mask argument.
>
> Of course.
>
>
> r~
>
>



* Re: [PATCH 01/18] target/i386/tcg: fix check for invalid VSIB instruction
  2025-12-12  2:06         ` Paolo Bonzini
@ 2025-12-12 14:37           ` Richard Henderson
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Henderson @ 2025-12-12 14:37 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, qemu-stable

On 12/11/25 20:06, Paolo Bonzini wrote:
> 
> 
> On Thu, Dec 11, 2025, 23:22 Richard Henderson <richard.henderson@linaro.org> wrote:
> 
>      > Yes, I was confused by the comment and by QEMU's incorrect decoding logic:
>      >
>      >          if (CODE32(s) && !VM86(s)) {
>      >
>      > which should be changed to
>      >
>      >         if (PE(s) && !VM86(s)) {
> 
>     I can't find the language for that.  Can you point me at it?
> 
> 
> It's the exception condition tables. They all mention that you get #UD for the VEX prefix 
> in real or vm86 mode.

Ah right, found it.  Thanks.

> Several BMI instructions also have language like "This instruction is not supported in 
> real mode and virtual-8086 mode".

Amusingly, some of them dropped the "not" in that sentence -- see ADCX.


r~



* Re: [PATCH 16/18] target/i386/tcg: commonize code to compute SF/ZF/PF
  2025-12-11 18:46   ` Richard Henderson
@ 2025-12-12 15:45     ` Paolo Bonzini
  0 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-12 15:45 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel


On Thu, Dec 11, 2025, 19:46 Richard Henderson <richard.henderson@linaro.org>
wrote:

> On 12/10/25 07:16, Paolo Bonzini wrote:
> > +psz_b:
> > +    shift += 8;
> > +psz_w:
> > +    shift += 16;
> > +psz_l:
> > +#ifdef TARGET_X86_64
> > +    shift += 32;
> > +psz_q:
> > +#endif
>
> Oof.  Use cc_op_size instead of a set of gotos.
>

I was so proud :) I will check what the code produced with cc_op_size looks
like.

Paolo



* Re: [PATCH 17/18] target/i386/tcg: add a CCOp for SBB x,x
  2025-12-11 19:11   ` Richard Henderson
@ 2025-12-12 17:49     ` Paolo Bonzini
  0 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2025-12-12 17:49 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 12/11/25 20:11, Richard Henderson wrote:
> On 12/10/25 07:16, Paolo Bonzini wrote:
>> This is more efficient both when generating code and when testing
>> flags.
> 
> I guess sbb x,x appears quite frequently in x86 setcc computation, and 
> the testing of the flags is much less important than the straight line 
> code generation?

Yes.  And to be honest, in the most common idioms generated for a modern 
processor the whole computation ends up being dead, so it doesn't really 
matter to have this vs. CC_OP_SBB or CC_OP_SUB.  For example memcmp uses 
it for "(x < y) ? -1 : 1":

                  subq     %rcx, %rax
                  sbbl     %eax, %eax
                  orl      $1, %eax

and this is also common, for "(x < y) ? VALUE : 0":

                  subq     %rcx, %rax
                  sbbq     %rax, %rax     ; could be sbbl :)
                  andl     $0x1234, %eax

In old hand-written assembly it is used more creatively, and having 
simpler generated code can matter if there are memory operations after 
the sbb.  I did this just because it's silly to compute both negsetcond 
and setcond...
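
For anyone not used to the idiom, a rough C model of the two sequences
above (hypothetical function names; the carry is modelled as the unsigned
comparison x < y that the preceding sub would produce):

```c
#include <assert.h>
#include <stdint.h>

/* "sub y, x; sbb r, r": r = r - r - CF = -CF, so r is -1 or 0. */
static int32_t borrow_mask(uint64_t x, uint64_t y)
{
    int32_t cf = x < y;     /* carry (borrow) left behind by the sub */

    return -cf;
}

/* "... ; or $1, r": (x < y) ? -1 : 1 */
static int32_t minus_one_or_one(uint64_t x, uint64_t y)
{
    return borrow_mask(x, y) | 1;
}

/* "... ; and $value, r": (x < y) ? value : 0 */
static int32_t value_or_zero(uint64_t x, uint64_t y, int32_t value)
{
    return borrow_mask(x, y) & value;
}

static void check_sbb_idioms(void)
{
    assert(minus_one_or_one(1, 2) == -1);
    assert(minus_one_or_one(2, 1) == 1);
    assert(minus_one_or_one(2, 2) == 1);
    assert(value_or_zero(1, 2, 0x1234) == 0x1234);
    assert(value_or_zero(2, 1, 0x1234) == 0);
}
```

In both sequences the or/and overwrites the flags right after the sbb, which
is why the flag computation for the sbb itself usually ends up dead.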

Paolo




end of thread, other threads:[~2025-12-12 17:50 UTC | newest]

Thread overview: 42+ messages
2025-12-10 13:16 [PATCH 00/18] First round of target/i386/tcg patches for QEMU 11.0 Paolo Bonzini
2025-12-10 13:16 ` [PATCH 01/18] target/i386/tcg: fix check for invalid VSIB instruction Paolo Bonzini
2025-12-11 15:47   ` Richard Henderson
2025-12-11 20:28     ` Paolo Bonzini
2025-12-11 22:22       ` Richard Henderson
2025-12-12  2:06         ` Paolo Bonzini
2025-12-12 14:37           ` Richard Henderson
2025-12-10 13:16 ` [PATCH 02/18] target/i386/tcg: ignore V3 in 32-bit mode Paolo Bonzini
2025-12-11 15:52   ` Richard Henderson
2025-12-10 13:16 ` [PATCH 03/18] target/i386/tcg: update cc_op after PUSHF Paolo Bonzini
2025-12-11 15:55   ` Richard Henderson
2025-12-10 13:16 ` [PATCH 04/18] target/i386/tcg: mark more instructions that are invalid in 64-bit mode Paolo Bonzini
2025-12-11 15:59   ` Richard Henderson
2025-12-10 13:16 ` [PATCH 05/18] target/i386/tcg: do not compute all flags for SAHF Paolo Bonzini
2025-12-11 16:03   ` Richard Henderson
2025-12-10 13:16 ` [PATCH 06/18] target/i386/tcg: remove do_decode_0F Paolo Bonzini
2025-12-11 16:03   ` Richard Henderson
2025-12-10 13:16 ` [PATCH 07/18] target/i386/tcg: move and expand misplaced comment Paolo Bonzini
2025-12-11 16:04   ` Richard Henderson
2025-12-10 13:16 ` [PATCH 08/18] target/i386/tcg: simplify effective address calculation Paolo Bonzini
2025-12-11 16:15   ` Richard Henderson
2025-12-10 13:16 ` [PATCH 09/18] target/i386/tcg: unnest switch statements in disas_insn_x87 Paolo Bonzini
2025-12-11 16:20   ` Richard Henderson
2025-12-10 13:16 ` [PATCH 10/18] target/i386/tcg: move fcom/fcomp differentiation to gen_helper_fp_arith_ST0_FT0 Paolo Bonzini
2025-12-11 16:21   ` Richard Henderson
2025-12-10 13:16 ` [PATCH 11/18] target/i386/tcg: reuse gen_helper_fp_arith_ST0_FT0 for fcom STn and fcomp STn Paolo Bonzini
2025-12-11 16:24   ` Richard Henderson
2025-12-10 13:16 ` [PATCH 12/18] target/i386/tcg: reuse gen_helper_fp_arith_ST0_FT0 for undocumented fcom/fcomp variants Paolo Bonzini
2025-12-11 16:26   ` Richard Henderson
2025-12-10 13:16 ` [PATCH 13/18] target/i386/tcg: unify more pop/no-pop x87 instructions Paolo Bonzini
2025-12-10 13:16 ` [PATCH 14/18] target/i386/tcg: kill tmp1_i64 Paolo Bonzini
2025-12-11 16:28   ` Richard Henderson
2025-12-10 13:16 ` [PATCH 15/18] target/i386/tcg: kill tmp2_i32 Paolo Bonzini
2025-12-11 16:29   ` Richard Henderson
2025-12-10 13:16 ` [PATCH 16/18] target/i386/tcg: commonize code to compute SF/ZF/PF Paolo Bonzini
2025-12-11 18:46   ` Richard Henderson
2025-12-12 15:45     ` Paolo Bonzini
2025-12-10 13:16 ` [PATCH 17/18] target/i386/tcg: add a CCOp for SBB x,x Paolo Bonzini
2025-12-11 19:11   ` Richard Henderson
2025-12-12 17:49     ` Paolo Bonzini
2025-12-10 13:16 ` [PATCH 18/18] target/i386/tcg: move fetch code out of translate.c Paolo Bonzini
2025-12-11 19:29   ` Richard Henderson
