qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [CFT PATCH 0/4] target/i386/emulate: cleanups
@ 2025-05-02 21:48 Paolo Bonzini
  2025-05-02 21:48 ` [PATCH 1/4] target/i386/emulate: fix target_ulong format strings Paolo Bonzini
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Paolo Bonzini @ 2025-05-02 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: wei.liu

These are some improvements to the x86 emulator that I wrote but have no
way of testing (right now).

I tried to place them in order of importance so that, if something breaks,
it is possible to commit a subset.  I tried to compile the resulting code
on Linux but I have not run it.

Patch 1 is just to fix warnings on Linux.

Patch 2 is the most important, as it fixes some real horrors in the code.

Patch 3 makes flags handling use algorithms somewhat similar to TCG.
It should fix issues with 64-bit ALU operations, but it's also the one
where it's more likely to have a mistake.

Patch 4 is comparatively trivial, though I cannot exclude any screwups.

It should be possible to test this with both HVF and Hyper-V.

Paolo

Paolo Bonzini (4):
  target/i386/emulate: fix target_ulong format strings
  target/i386/emulate: stop overloading decode->op[N].ptr
  target/i386/emulate: mostly rewrite flags handling
  target/i386: remove lflags

 target/i386/cpu.h                |   6 -
 target/i386/emulate/x86_decode.h |   9 +-
 target/i386/emulate/x86_emu.h    |   8 +-
 target/i386/emulate/x86_flags.h  |  12 +-
 target/i386/emulate/x86_decode.c |  76 ++++++------
 target/i386/emulate/x86_emu.c    | 125 +++++++++----------
 target/i386/emulate/x86_flags.c  | 198 +++++++++++++------------------
 7 files changed, 197 insertions(+), 237 deletions(-)

-- 
2.49.0



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/4] target/i386/emulate: fix target_ulong format strings
  2025-05-02 21:48 [CFT PATCH 0/4] target/i386/emulate: cleanups Paolo Bonzini
@ 2025-05-02 21:48 ` Paolo Bonzini
  2025-05-02 21:48 ` [PATCH 2/4] target/i386/emulate: stop overloading decode->op[N].ptr Paolo Bonzini
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2025-05-02 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: wei.liu

Do not assume that TARGET_FMT_lx is %llx.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/emulate/x86_decode.c | 2 +-
 target/i386/emulate/x86_emu.c    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/i386/emulate/x86_decode.c b/target/i386/emulate/x86_decode.c
index 7efa2f570ea..88be9479a82 100644
--- a/target/i386/emulate/x86_decode.c
+++ b/target/i386/emulate/x86_decode.c
@@ -26,7 +26,7 @@
 
 static void decode_invalid(CPUX86State *env, struct x86_decode *decode)
 {
-    printf("%llx: failed to decode instruction ", env->eip);
+    printf(TARGET_FMT_lx ": failed to decode instruction ", env->eip);
     for (int i = 0; i < decode->opcode_len; i++) {
         printf("%x ", decode->opcode[i]);
     }
diff --git a/target/i386/emulate/x86_emu.c b/target/i386/emulate/x86_emu.c
index 26a4876aac0..7773b51b95e 100644
--- a/target/i386/emulate/x86_emu.c
+++ b/target/i386/emulate/x86_emu.c
@@ -1241,7 +1241,7 @@ static void init_cmd_handler(void)
 bool exec_instruction(CPUX86State *env, struct x86_decode *ins)
 {
     if (!_cmd_handler[ins->cmd].handler) {
-        printf("Unimplemented handler (%llx) for %d (%x %x) \n", env->eip,
+        printf("Unimplemented handler (" TARGET_FMT_lx ") for %d (%x %x) \n", env->eip,
                 ins->cmd, ins->opcode[0],
                 ins->opcode_len > 1 ? ins->opcode[1] : 0);
         env->eip += ins->len;
-- 
2.49.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/4] target/i386/emulate: stop overloading decode->op[N].ptr
  2025-05-02 21:48 [CFT PATCH 0/4] target/i386/emulate: cleanups Paolo Bonzini
  2025-05-02 21:48 ` [PATCH 1/4] target/i386/emulate: fix target_ulong format strings Paolo Bonzini
@ 2025-05-02 21:48 ` Paolo Bonzini
  2025-05-03 10:46   ` BALATON Zoltan
  2025-05-05  9:43   ` Philippe Mathieu-Daudé
  2025-05-02 21:48 ` [PATCH 3/4] target/i386/emulate: mostly rewrite flags handling Paolo Bonzini
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 17+ messages in thread
From: Paolo Bonzini @ 2025-05-02 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: wei.liu

decode->op[N].ptr can contain either a host pointer (!) in CPUState
or a guest virtual address.  Pass the whole struct to read_val_ext
and write_val_ext, so that it can decide the contents based on the
operand type.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/emulate/x86_decode.h |   9 ++-
 target/i386/emulate/x86_emu.h    |   8 +--
 target/i386/emulate/x86_decode.c |  74 +++++++++----------
 target/i386/emulate/x86_emu.c    | 119 ++++++++++++++++---------------
 4 files changed, 109 insertions(+), 101 deletions(-)

diff --git a/target/i386/emulate/x86_decode.h b/target/i386/emulate/x86_decode.h
index 87cc728598d..497cbdef9c7 100644
--- a/target/i386/emulate/x86_decode.h
+++ b/target/i386/emulate/x86_decode.h
@@ -266,7 +266,10 @@ typedef struct x86_decode_op {
     int reg;
     target_ulong val;
 
-    target_ulong ptr;
+    union {
+        target_ulong addr;
+        void *regptr;
+    };
 } x86_decode_op;
 
 typedef struct x86_decode {
@@ -301,8 +304,8 @@ uint64_t sign(uint64_t val, int size);
 
 uint32_t decode_instruction(CPUX86State *env, struct x86_decode *decode);
 
-target_ulong get_reg_ref(CPUX86State *env, int reg, int rex_present,
-                         int is_extended, int size);
+void * get_reg_ref(CPUX86State *env, int reg, int rex_present,
+                    int is_extended, int size);
 target_ulong get_reg_val(CPUX86State *env, int reg, int rex_present,
                          int is_extended, int size);
 void calc_modrm_operand(CPUX86State *env, struct x86_decode *decode,
diff --git a/target/i386/emulate/x86_emu.h b/target/i386/emulate/x86_emu.h
index 555b567e2c7..a1a961284b2 100644
--- a/target/i386/emulate/x86_emu.h
+++ b/target/i386/emulate/x86_emu.h
@@ -42,11 +42,11 @@ void x86_emul_raise_exception(CPUX86State *env, int exception_index, int error_c
 
 target_ulong read_reg(CPUX86State *env, int reg, int size);
 void write_reg(CPUX86State *env, int reg, target_ulong val, int size);
-target_ulong read_val_from_reg(target_ulong reg_ptr, int size);
-void write_val_to_reg(target_ulong reg_ptr, target_ulong val, int size);
-void write_val_ext(CPUX86State *env, target_ulong ptr, target_ulong val, int size);
+target_ulong read_val_from_reg(void *reg_ptr, int size);
+void write_val_to_reg(void *reg_ptr, target_ulong val, int size);
+void write_val_ext(CPUX86State *env, struct x86_decode_op *decode, target_ulong val, int size);
 uint8_t *read_mmio(CPUX86State *env, target_ulong ptr, int bytes);
-target_ulong read_val_ext(CPUX86State *env, target_ulong ptr, int size);
+target_ulong read_val_ext(CPUX86State *env, struct x86_decode_op *decode, int size);
 
 void exec_movzx(CPUX86State *env, struct x86_decode *decode);
 void exec_shl(CPUX86State *env, struct x86_decode *decode);
diff --git a/target/i386/emulate/x86_decode.c b/target/i386/emulate/x86_decode.c
index 88be9479a82..2eca39802e3 100644
--- a/target/i386/emulate/x86_decode.c
+++ b/target/i386/emulate/x86_decode.c
@@ -109,8 +109,8 @@ static void decode_modrm_reg(CPUX86State *env, struct x86_decode *decode,
 {
     op->type = X86_VAR_REG;
     op->reg = decode->modrm.reg;
-    op->ptr = get_reg_ref(env, op->reg, decode->rex.rex, decode->rex.r,
-                          decode->operand_size);
+    op->regptr = get_reg_ref(env, op->reg, decode->rex.rex, decode->rex.r,
+                             decode->operand_size);
 }
 
 static void decode_rax(CPUX86State *env, struct x86_decode *decode,
@@ -119,8 +119,8 @@ static void decode_rax(CPUX86State *env, struct x86_decode *decode,
     op->type = X86_VAR_REG;
     op->reg = R_EAX;
     /* Since reg is always AX, REX prefix has no impact. */
-    op->ptr = get_reg_ref(env, op->reg, false, 0,
-                          decode->operand_size);
+    op->regptr = get_reg_ref(env, op->reg, false, 0,
+                             decode->operand_size);
 }
 
 static inline void decode_immediate(CPUX86State *env, struct x86_decode *decode,
@@ -262,16 +262,16 @@ static void decode_incgroup(CPUX86State *env, struct x86_decode *decode)
 {
     decode->op[0].type = X86_VAR_REG;
     decode->op[0].reg = decode->opcode[0] - 0x40;
-    decode->op[0].ptr = get_reg_ref(env, decode->op[0].reg, decode->rex.rex,
-                                    decode->rex.b, decode->operand_size);
+    decode->op[0].regptr = get_reg_ref(env, decode->op[0].reg, decode->rex.rex,
+                                       decode->rex.b, decode->operand_size);
 }
 
 static void decode_decgroup(CPUX86State *env, struct x86_decode *decode)
 {
     decode->op[0].type = X86_VAR_REG;
     decode->op[0].reg = decode->opcode[0] - 0x48;
-    decode->op[0].ptr = get_reg_ref(env, decode->op[0].reg, decode->rex.rex,
-                                    decode->rex.b, decode->operand_size);
+    decode->op[0].regptr = get_reg_ref(env, decode->op[0].reg, decode->rex.rex,
+                                       decode->rex.b, decode->operand_size);
 }
 
 static void decode_incgroup2(CPUX86State *env, struct x86_decode *decode)
@@ -287,16 +287,16 @@ static void decode_pushgroup(CPUX86State *env, struct x86_decode *decode)
 {
     decode->op[0].type = X86_VAR_REG;
     decode->op[0].reg = decode->opcode[0] - 0x50;
-    decode->op[0].ptr = get_reg_ref(env, decode->op[0].reg, decode->rex.rex,
-                                    decode->rex.b, decode->operand_size);
+    decode->op[0].regptr = get_reg_ref(env, decode->op[0].reg, decode->rex.rex,
+                                       decode->rex.b, decode->operand_size);
 }
 
 static void decode_popgroup(CPUX86State *env, struct x86_decode *decode)
 {
     decode->op[0].type = X86_VAR_REG;
     decode->op[0].reg = decode->opcode[0] - 0x58;
-    decode->op[0].ptr = get_reg_ref(env, decode->op[0].reg, decode->rex.rex,
-                                    decode->rex.b, decode->operand_size);
+    decode->op[0].regptr = get_reg_ref(env, decode->op[0].reg, decode->rex.rex,
+                                       decode->rex.b, decode->operand_size);
 }
 
 static void decode_jxx(CPUX86State *env, struct x86_decode *decode)
@@ -377,16 +377,16 @@ static void decode_xchgroup(CPUX86State *env, struct x86_decode *decode)
 {
     decode->op[0].type = X86_VAR_REG;
     decode->op[0].reg = decode->opcode[0] - 0x90;
-    decode->op[0].ptr = get_reg_ref(env, decode->op[0].reg, decode->rex.rex,
-                                    decode->rex.b, decode->operand_size);
+    decode->op[0].regptr = get_reg_ref(env, decode->op[0].reg, decode->rex.rex,
+                                       decode->rex.b, decode->operand_size);
 }
 
 static void decode_movgroup(CPUX86State *env, struct x86_decode *decode)
 {
     decode->op[0].type = X86_VAR_REG;
     decode->op[0].reg = decode->opcode[0] - 0xb8;
-    decode->op[0].ptr = get_reg_ref(env, decode->op[0].reg, decode->rex.rex,
-                                    decode->rex.b, decode->operand_size);
+    decode->op[0].regptr = get_reg_ref(env, decode->op[0].reg, decode->rex.rex,
+                                       decode->rex.b, decode->operand_size);
     decode_immediate(env, decode, &decode->op[1], decode->operand_size);
 }
 
@@ -394,15 +394,15 @@ static void fetch_moffs(CPUX86State *env, struct x86_decode *decode,
                         struct x86_decode_op *op)
 {
     op->type = X86_VAR_OFFSET;
-    op->ptr = decode_bytes(env, decode, decode->addressing_size);
+    op->addr = decode_bytes(env, decode, decode->addressing_size);
 }
 
 static void decode_movgroup8(CPUX86State *env, struct x86_decode *decode)
 {
     decode->op[0].type = X86_VAR_REG;
     decode->op[0].reg = decode->opcode[0] - 0xb0;
-    decode->op[0].ptr = get_reg_ref(env, decode->op[0].reg, decode->rex.rex,
-                                    decode->rex.b, decode->operand_size);
+    decode->op[0].regptr = get_reg_ref(env, decode->op[0].reg, decode->rex.rex,
+                                       decode->rex.b, decode->operand_size);
     decode_immediate(env, decode, &decode->op[1], decode->operand_size);
 }
 
@@ -411,8 +411,8 @@ static void decode_rcx(CPUX86State *env, struct x86_decode *decode,
 {
     op->type = X86_VAR_REG;
     op->reg = R_ECX;
-    op->ptr = get_reg_ref(env, op->reg, decode->rex.rex, decode->rex.b,
-                          decode->operand_size);
+    op->regptr = get_reg_ref(env, op->reg, decode->rex.rex, decode->rex.b,
+                             decode->operand_size);
 }
 
 struct decode_tbl {
@@ -631,8 +631,8 @@ static void decode_bswap(CPUX86State *env, struct x86_decode *decode)
 {
     decode->op[0].type = X86_VAR_REG;
     decode->op[0].reg = decode->opcode[1] - 0xc8;
-    decode->op[0].ptr = get_reg_ref(env, decode->op[0].reg, decode->rex.rex,
-                                    decode->rex.b, decode->operand_size);
+    decode->op[0].regptr = get_reg_ref(env, decode->op[0].reg, decode->rex.rex,
+                                       decode->rex.b, decode->operand_size);
 }
 
 static void decode_d9_4(CPUX86State *env, struct x86_decode *decode)
@@ -1656,16 +1656,16 @@ void calc_modrm_operand16(CPUX86State *env, struct x86_decode *decode,
     }
 calc_addr:
     if (X86_DECODE_CMD_LEA == decode->cmd) {
-        op->ptr = (uint16_t)ptr;
+        op->addr = (uint16_t)ptr;
     } else {
-        op->ptr = decode_linear_addr(env, decode, (uint16_t)ptr, seg);
+        op->addr = decode_linear_addr(env, decode, (uint16_t)ptr, seg);
     }
 }
 
-target_ulong get_reg_ref(CPUX86State *env, int reg, int rex_present,
+void *get_reg_ref(CPUX86State *env, int reg, int rex_present,
                          int is_extended, int size)
 {
-    target_ulong ptr = 0;
+    void *ptr = NULL;
 
     if (is_extended) {
         reg |= R_R8;
@@ -1674,13 +1674,13 @@ target_ulong get_reg_ref(CPUX86State *env, int reg, int rex_present,
     switch (size) {
     case 1:
         if (is_extended || reg < 4 || rex_present) {
-            ptr = (target_ulong)&RL(env, reg);
+            ptr = &RL(env, reg);
         } else {
-            ptr = (target_ulong)&RH(env, reg - 4);
+            ptr = &RH(env, reg - 4);
         }
         break;
     default:
-        ptr = (target_ulong)&RRX(env, reg);
+        ptr = &RRX(env, reg);
         break;
     }
     return ptr;
@@ -1691,7 +1691,7 @@ target_ulong get_reg_val(CPUX86State *env, int reg, int rex_present,
 {
     target_ulong val = 0;
     memcpy(&val,
-           (void *)get_reg_ref(env, reg, rex_present, is_extended, size),
+           get_reg_ref(env, reg, rex_present, is_extended, size),
            size);
     return val;
 }
@@ -1758,9 +1758,9 @@ void calc_modrm_operand32(CPUX86State *env, struct x86_decode *decode,
     }
 
     if (X86_DECODE_CMD_LEA == decode->cmd) {
-        op->ptr = (uint32_t)ptr;
+        op->addr = (uint32_t)ptr;
     } else {
-        op->ptr = decode_linear_addr(env, decode, (uint32_t)ptr, seg);
+        op->addr = decode_linear_addr(env, decode, (uint32_t)ptr, seg);
     }
 }
 
@@ -1788,9 +1788,9 @@ void calc_modrm_operand64(CPUX86State *env, struct x86_decode *decode,
     }
 
     if (X86_DECODE_CMD_LEA == decode->cmd) {
-        op->ptr = ptr;
+        op->addr = ptr;
     } else {
-        op->ptr = decode_linear_addr(env, decode, ptr, seg);
+        op->addr = decode_linear_addr(env, decode, ptr, seg);
     }
 }
 
@@ -1801,8 +1801,8 @@ void calc_modrm_operand(CPUX86State *env, struct x86_decode *decode,
     if (3 == decode->modrm.mod) {
         op->reg = decode->modrm.reg;
         op->type = X86_VAR_REG;
-        op->ptr = get_reg_ref(env, decode->modrm.rm, decode->rex.rex,
-                              decode->rex.b, decode->operand_size);
+        op->regptr = get_reg_ref(env, decode->modrm.rm, decode->rex.rex,
+                                 decode->rex.b, decode->operand_size);
         return;
     }
 
diff --git a/target/i386/emulate/x86_emu.c b/target/i386/emulate/x86_emu.c
index 7773b51b95e..4c07f08942e 100644
--- a/target/i386/emulate/x86_emu.c
+++ b/target/i386/emulate/x86_emu.c
@@ -52,7 +52,7 @@
         uint8_t v2 = (uint8_t)decode->op[1].val;    \
         uint8_t diff = v1 cmd v2;                   \
         if (save_res) {                              \
-            write_val_ext(env, decode->op[0].ptr, diff, 1);  \
+            write_val_ext(env, &decode->op[0], diff, 1);  \
         } \
         FLAGS_FUNC##8(env, v1, v2, diff);           \
         break;                                      \
@@ -63,7 +63,7 @@
         uint16_t v2 = (uint16_t)decode->op[1].val;  \
         uint16_t diff = v1 cmd v2;                  \
         if (save_res) {                              \
-            write_val_ext(env, decode->op[0].ptr, diff, 2); \
+            write_val_ext(env, &decode->op[0], diff, 2); \
         } \
         FLAGS_FUNC##16(env, v1, v2, diff);          \
         break;                                      \
@@ -74,7 +74,7 @@
         uint32_t v2 = (uint32_t)decode->op[1].val;  \
         uint32_t diff = v1 cmd v2;                  \
         if (save_res) {                              \
-            write_val_ext(env, decode->op[0].ptr, diff, 4); \
+            write_val_ext(env, &decode->op[0], diff, 4); \
         } \
         FLAGS_FUNC##32(env, v1, v2, diff);          \
         break;                                      \
@@ -121,7 +121,7 @@ void write_reg(CPUX86State *env, int reg, target_ulong val, int size)
     }
 }
 
-target_ulong read_val_from_reg(target_ulong reg_ptr, int size)
+target_ulong read_val_from_reg(void *reg_ptr, int size)
 {
     target_ulong val;
     
@@ -144,7 +144,7 @@ target_ulong read_val_from_reg(target_ulong reg_ptr, int size)
     return val;
 }
 
-void write_val_to_reg(target_ulong reg_ptr, target_ulong val, int size)
+void write_val_to_reg(void *reg_ptr, target_ulong val, int size)
 {
     switch (size) {
     case 1:
@@ -164,18 +164,18 @@ void write_val_to_reg(target_ulong reg_ptr, target_ulong val, int size)
     }
 }
 
-static bool is_host_reg(CPUX86State *env, target_ulong ptr)
+static void write_val_to_mem(CPUX86State *env, target_ulong ptr, target_ulong val, int size)
 {
-    return (ptr - (target_ulong)&env->regs[0]) < sizeof(env->regs);
+    emul_ops->write_mem(env_cpu(env), &val, ptr, size);
 }
 
-void write_val_ext(CPUX86State *env, target_ulong ptr, target_ulong val, int size)
+void write_val_ext(CPUX86State *env, struct x86_decode_op *decode, target_ulong val, int size)
 {
-    if (is_host_reg(env, ptr)) {
-        write_val_to_reg(ptr, val, size);
-        return;
+    if (decode->type == X86_VAR_REG) {
+        write_val_to_reg(decode->regptr, val, size);
+    } else {
+        write_val_to_mem(env, decode->addr, val, size);
     }
-    emul_ops->write_mem(env_cpu(env), &val, ptr, size);
 }
 
 uint8_t *read_mmio(CPUX86State *env, target_ulong ptr, int bytes)
@@ -185,15 +185,11 @@ uint8_t *read_mmio(CPUX86State *env, target_ulong ptr, int bytes)
 }
 
 
-target_ulong read_val_ext(CPUX86State *env, target_ulong ptr, int size)
+static target_ulong read_val_from_mem(CPUX86State *env, target_long ptr, int size)
 {
     target_ulong val;
     uint8_t *mmio_ptr;
 
-    if (is_host_reg(env, ptr)) {
-        return read_val_from_reg(ptr, size);
-    }
-
     mmio_ptr = read_mmio(env, ptr, size);
     switch (size) {
     case 1:
@@ -215,6 +211,15 @@ target_ulong read_val_ext(CPUX86State *env, target_ulong ptr, int size)
     return val;
 }
 
+target_ulong read_val_ext(CPUX86State *env, struct x86_decode_op *decode, int size)
+{
+    if (decode->type == X86_VAR_REG) {
+        return read_val_from_reg(decode->regptr, size);
+    } else {
+        return read_val_from_mem(env, decode->addr, size);
+    }
+}
+
 static void fetch_operands(CPUX86State *env, struct x86_decode *decode,
                            int n, bool val_op0, bool val_op1, bool val_op2)
 {
@@ -226,25 +231,25 @@ static void fetch_operands(CPUX86State *env, struct x86_decode *decode,
         case X86_VAR_IMMEDIATE:
             break;
         case X86_VAR_REG:
-            VM_PANIC_ON(!decode->op[i].ptr);
+            VM_PANIC_ON(!decode->op[i].regptr);
             if (calc_val[i]) {
-                decode->op[i].val = read_val_from_reg(decode->op[i].ptr,
+                decode->op[i].val = read_val_from_reg(decode->op[i].regptr,
                                                       decode->operand_size);
             }
             break;
         case X86_VAR_RM:
             calc_modrm_operand(env, decode, &decode->op[i]);
             if (calc_val[i]) {
-                decode->op[i].val = read_val_ext(env, decode->op[i].ptr,
+                decode->op[i].val = read_val_ext(env, &decode->op[i],
                                                  decode->operand_size);
             }
             break;
         case X86_VAR_OFFSET:
-            decode->op[i].ptr = decode_linear_addr(env, decode,
-                                                   decode->op[i].ptr,
-                                                   R_DS);
+            decode->op[i].addr = decode_linear_addr(env, decode,
+                                                    decode->op[i].addr,
+                                                    R_DS);
             if (calc_val[i]) {
-                decode->op[i].val = read_val_ext(env, decode->op[i].ptr,
+                decode->op[i].val = read_val_ext(env, &decode->op[i],
                                                  decode->operand_size);
             }
             break;
@@ -257,7 +262,7 @@ static void fetch_operands(CPUX86State *env, struct x86_decode *decode,
 static void exec_mov(CPUX86State *env, struct x86_decode *decode)
 {
     fetch_operands(env, decode, 2, false, true, false);
-    write_val_ext(env, decode->op[0].ptr, decode->op[1].val,
+    write_val_ext(env, &decode->op[0], decode->op[1].val,
                   decode->operand_size);
 
     env->eip += decode->len;
@@ -312,7 +317,7 @@ static void exec_neg(CPUX86State *env, struct x86_decode *decode)
     fetch_operands(env, decode, 2, true, true, false);
 
     val = 0 - sign(decode->op[1].val, decode->operand_size);
-    write_val_ext(env, decode->op[1].ptr, val, decode->operand_size);
+    write_val_ext(env, &decode->op[1], val, decode->operand_size);
 
     if (4 == decode->operand_size) {
         SET_FLAGS_OSZAPC_SUB32(env, 0, 0 - val, val);
@@ -363,7 +368,7 @@ static void exec_not(CPUX86State *env, struct x86_decode *decode)
 {
     fetch_operands(env, decode, 1, true, false, false);
 
-    write_val_ext(env, decode->op[0].ptr, ~decode->op[0].val,
+    write_val_ext(env, &decode->op[0], ~decode->op[0].val,
                   decode->operand_size);
     env->eip += decode->len;
 }
@@ -382,8 +387,8 @@ void exec_movzx(CPUX86State *env, struct x86_decode *decode)
     }
     decode->operand_size = src_op_size;
     calc_modrm_operand(env, decode, &decode->op[1]);
-    decode->op[1].val = read_val_ext(env, decode->op[1].ptr, src_op_size);
-    write_val_ext(env, decode->op[0].ptr, decode->op[1].val, op_size);
+    decode->op[1].val = read_val_ext(env, &decode->op[1], src_op_size);
+    write_val_ext(env, &decode->op[0], decode->op[1].val, op_size);
 
     env->eip += decode->len;
 }
@@ -535,8 +540,8 @@ static void exec_movs_single(CPUX86State *env, struct x86_decode *decode)
     dst_addr = linear_addr_size(env_cpu(env), RDI(env),
                                 decode->addressing_size, R_ES);
 
-    val = read_val_ext(env, src_addr, decode->operand_size);
-    write_val_ext(env, dst_addr, val, decode->operand_size);
+    val = read_val_from_mem(env, src_addr, decode->operand_size);
+    write_val_to_mem(env, dst_addr, val, decode->operand_size);
 
     string_increment_reg(env, R_ESI, decode);
     string_increment_reg(env, R_EDI, decode);
@@ -563,9 +568,9 @@ static void exec_cmps_single(CPUX86State *env, struct x86_decode *decode)
                                 decode->addressing_size, R_ES);
 
     decode->op[0].type = X86_VAR_IMMEDIATE;
-    decode->op[0].val = read_val_ext(env, src_addr, decode->operand_size);
+    decode->op[0].val = read_val_from_mem(env, src_addr, decode->operand_size);
     decode->op[1].type = X86_VAR_IMMEDIATE;
-    decode->op[1].val = read_val_ext(env, dst_addr, decode->operand_size);
+    decode->op[1].val = read_val_from_mem(env, dst_addr, decode->operand_size);
 
     EXEC_2OP_FLAGS_CMD(env, decode, -, SET_FLAGS_OSZAPC_SUB, false);
 
@@ -697,15 +702,15 @@ static void do_bt(CPUX86State *env, struct x86_decode *decode, int flag)
     if (decode->op[0].type != X86_VAR_REG) {
         if (4 == decode->operand_size) {
             displacement = ((int32_t) (decode->op[1].val & 0xffffffe0)) / 32;
-            decode->op[0].ptr += 4 * displacement;
+            decode->op[0].addr += 4 * displacement;
         } else if (2 == decode->operand_size) {
             displacement = ((int16_t) (decode->op[1].val & 0xfff0)) / 16;
-            decode->op[0].ptr += 2 * displacement;
+            decode->op[0].addr += 2 * displacement;
         } else {
             VM_PANIC("bt 64bit\n");
         }
     }
-    decode->op[0].val = read_val_ext(env, decode->op[0].ptr,
+    decode->op[0].val = read_val_ext(env, &decode->op[0],
                                      decode->operand_size);
     cf = (decode->op[0].val >> index) & 0x01;
 
@@ -723,7 +728,7 @@ static void do_bt(CPUX86State *env, struct x86_decode *decode, int flag)
         decode->op[0].val &= ~(1u << index);
         break;
     }
-    write_val_ext(env, decode->op[0].ptr, decode->op[0].val,
+    write_val_ext(env, &decode->op[0], decode->op[0].val,
                   decode->operand_size);
     set_CF(env, cf);
 }
@@ -775,7 +780,7 @@ void exec_shl(CPUX86State *env, struct x86_decode *decode)
             of = cf ^ (res >> 7);
         }
 
-        write_val_ext(env, decode->op[0].ptr, res, 1);
+        write_val_ext(env, &decode->op[0], res, 1);
         SET_FLAGS_OSZAPC_LOGIC8(env, 0, 0, res);
         SET_FLAGS_OxxxxC(env, of, cf);
         break;
@@ -791,7 +796,7 @@ void exec_shl(CPUX86State *env, struct x86_decode *decode)
             of = cf ^ (res >> 15); /* of = cf ^ result15 */
         }
 
-        write_val_ext(env, decode->op[0].ptr, res, 2);
+        write_val_ext(env, &decode->op[0], res, 2);
         SET_FLAGS_OSZAPC_LOGIC16(env, 0, 0, res);
         SET_FLAGS_OxxxxC(env, of, cf);
         break;
@@ -800,7 +805,7 @@ void exec_shl(CPUX86State *env, struct x86_decode *decode)
     {
         uint32_t res = decode->op[0].val << count;
 
-        write_val_ext(env, decode->op[0].ptr, res, 4);
+        write_val_ext(env, &decode->op[0], res, 4);
         SET_FLAGS_OSZAPC_LOGIC32(env, 0, 0, res);
         cf = (decode->op[0].val >> (32 - count)) & 0x1;
         of = cf ^ (res >> 31); /* of = cf ^ result31 */
@@ -831,10 +836,10 @@ void exec_movsx(CPUX86State *env, struct x86_decode *decode)
 
     decode->operand_size = src_op_size;
     calc_modrm_operand(env, decode, &decode->op[1]);
-    decode->op[1].val = sign(read_val_ext(env, decode->op[1].ptr, src_op_size),
+    decode->op[1].val = sign(read_val_ext(env, &decode->op[1], src_op_size),
                              src_op_size);
 
-    write_val_ext(env, decode->op[0].ptr, decode->op[1].val, op_size);
+    write_val_ext(env, &decode->op[0], decode->op[1].val, op_size);
 
     env->eip += decode->len;
 }
@@ -862,7 +867,7 @@ void exec_ror(CPUX86State *env, struct x86_decode *decode)
             count &= 0x7; /* use only bottom 3 bits */
             res = ((uint8_t)decode->op[0].val >> count) |
                    ((uint8_t)decode->op[0].val << (8 - count));
-            write_val_ext(env, decode->op[0].ptr, res, 1);
+            write_val_ext(env, &decode->op[0], res, 1);
             bit6 = (res >> 6) & 1;
             bit7 = (res >> 7) & 1;
             /* set eflags: ROR count affects the following flags: C, O */
@@ -886,7 +891,7 @@ void exec_ror(CPUX86State *env, struct x86_decode *decode)
             count &= 0x0f;  /* use only 4 LSB's */
             res = ((uint16_t)decode->op[0].val >> count) |
                    ((uint16_t)decode->op[0].val << (16 - count));
-            write_val_ext(env, decode->op[0].ptr, res, 2);
+            write_val_ext(env, &decode->op[0], res, 2);
 
             bit14 = (res >> 14) & 1;
             bit15 = (res >> 15) & 1;
@@ -904,7 +909,7 @@ void exec_ror(CPUX86State *env, struct x86_decode *decode)
         if (count) {
             res = ((uint32_t)decode->op[0].val >> count) |
                    ((uint32_t)decode->op[0].val << (32 - count));
-            write_val_ext(env, decode->op[0].ptr, res, 4);
+            write_val_ext(env, &decode->op[0], res, 4);
 
             bit31 = (res >> 31) & 1;
             bit30 = (res >> 30) & 1;
@@ -941,7 +946,7 @@ void exec_rol(CPUX86State *env, struct x86_decode *decode)
             res = ((uint8_t)decode->op[0].val << count) |
                    ((uint8_t)decode->op[0].val >> (8 - count));
 
-            write_val_ext(env, decode->op[0].ptr, res, 1);
+            write_val_ext(env, &decode->op[0], res, 1);
             /* set eflags:
              * ROL count affects the following flags: C, O
              */
@@ -968,7 +973,7 @@ void exec_rol(CPUX86State *env, struct x86_decode *decode)
             res = ((uint16_t)decode->op[0].val << count) |
                    ((uint16_t)decode->op[0].val >> (16 - count));
 
-            write_val_ext(env, decode->op[0].ptr, res, 2);
+            write_val_ext(env, &decode->op[0], res, 2);
             bit0  = (res & 0x1);
             bit15 = (res >> 15);
             /* of = cf ^ result15 */
@@ -986,7 +991,7 @@ void exec_rol(CPUX86State *env, struct x86_decode *decode)
             res = ((uint32_t)decode->op[0].val << count) |
                    ((uint32_t)decode->op[0].val >> (32 - count));
 
-            write_val_ext(env, decode->op[0].ptr, res, 4);
+            write_val_ext(env, &decode->op[0], res, 4);
             bit0  = (res & 0x1);
             bit31 = (res >> 31);
             /* of = cf ^ result31 */
@@ -1024,7 +1029,7 @@ void exec_rcl(CPUX86State *env, struct x86_decode *decode)
                    (op1_8 >> (9 - count));
         }
 
-        write_val_ext(env, decode->op[0].ptr, res, 1);
+        write_val_ext(env, &decode->op[0], res, 1);
 
         cf = (op1_8 >> (8 - count)) & 0x01;
         of = cf ^ (res >> 7); /* of = cf ^ result7 */
@@ -1050,7 +1055,7 @@ void exec_rcl(CPUX86State *env, struct x86_decode *decode)
                    (op1_16 >> (17 - count));
         }
 
-        write_val_ext(env, decode->op[0].ptr, res, 2);
+        write_val_ext(env, &decode->op[0], res, 2);
 
         cf = (op1_16 >> (16 - count)) & 0x1;
         of = cf ^ (res >> 15); /* of = cf ^ result15 */
@@ -1073,7 +1078,7 @@ void exec_rcl(CPUX86State *env, struct x86_decode *decode)
                    (op1_32 >> (33 - count));
         }
 
-        write_val_ext(env, decode->op[0].ptr, res, 4);
+        write_val_ext(env, &decode->op[0], res, 4);
 
         cf = (op1_32 >> (32 - count)) & 0x1;
         of = cf ^ (res >> 31); /* of = cf ^ result31 */
@@ -1105,7 +1110,7 @@ void exec_rcr(CPUX86State *env, struct x86_decode *decode)
         res = (op1_8 >> count) | (get_CF(env) << (8 - count)) |
                (op1_8 << (9 - count));
 
-        write_val_ext(env, decode->op[0].ptr, res, 1);
+        write_val_ext(env, &decode->op[0], res, 1);
 
         cf = (op1_8 >> (count - 1)) & 0x1;
         of = (((res << 1) ^ res) >> 7) & 0x1; /* of = result6 ^ result7 */
@@ -1124,7 +1129,7 @@ void exec_rcr(CPUX86State *env, struct x86_decode *decode)
         res = (op1_16 >> count) | (get_CF(env) << (16 - count)) |
                (op1_16 << (17 - count));
 
-        write_val_ext(env, decode->op[0].ptr, res, 2);
+        write_val_ext(env, &decode->op[0], res, 2);
 
         cf = (op1_16 >> (count - 1)) & 0x1;
         of = ((uint16_t)((res << 1) ^ res) >> 15) & 0x1; /* of = result15 ^
@@ -1148,7 +1153,7 @@ void exec_rcr(CPUX86State *env, struct x86_decode *decode)
                    (op1_32 << (33 - count));
         }
 
-        write_val_ext(env, decode->op[0].ptr, res, 4);
+        write_val_ext(env, &decode->op[0], res, 4);
 
         cf = (op1_32 >> (count - 1)) & 0x1;
         of = ((res << 1) ^ res) >> 31; /* of = result30 ^ result31 */
@@ -1163,9 +1168,9 @@ static void exec_xchg(CPUX86State *env, struct x86_decode *decode)
 {
     fetch_operands(env, decode, 2, true, true, false);
 
-    write_val_ext(env, decode->op[0].ptr, decode->op[1].val,
+    write_val_ext(env, &decode->op[0], decode->op[1].val,
                   decode->operand_size);
-    write_val_ext(env, decode->op[1].ptr, decode->op[0].val,
+    write_val_ext(env, &decode->op[1], decode->op[0].val,
                   decode->operand_size);
 
     env->eip += decode->len;
@@ -1174,7 +1179,7 @@ static void exec_xchg(CPUX86State *env, struct x86_decode *decode)
 static void exec_xadd(CPUX86State *env, struct x86_decode *decode)
 {
     EXEC_2OP_FLAGS_CMD(env, decode, +, SET_FLAGS_OSZAPC_ADD, true);
-    write_val_ext(env, decode->op[1].ptr, decode->op[0].val,
+    write_val_ext(env, &decode->op[1], decode->op[0].val,
                   decode->operand_size);
 
     env->eip += decode->len;
-- 
2.49.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/4] target/i386/emulate: mostly rewrite flags handling
  2025-05-02 21:48 [CFT PATCH 0/4] target/i386/emulate: cleanups Paolo Bonzini
  2025-05-02 21:48 ` [PATCH 1/4] target/i386/emulate: fix target_ulong format strings Paolo Bonzini
  2025-05-02 21:48 ` [PATCH 2/4] target/i386/emulate: stop overloading decode->op[N].ptr Paolo Bonzini
@ 2025-05-02 21:48 ` Paolo Bonzini
  2025-05-02 21:48 ` [PATCH 4/4] target/i386: remove lflags Paolo Bonzini
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2025-05-02 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: wei.liu

While Bochs's algorithms are pretty solid, there are small opportunities
to improve them or to make their logic more similar to TCG's handling
of condition codes.

- use a single bit for the difference between bits 0..7 of result and PF.
This is useful because "set only ZF" is not a common case.

- place SD in the same place as SF

- move CF and PO at bits 62 and 63 when target_ulong is 64-bits wide,
  so that 64-bit ALU operations need fewer shifts

- use rotates to move CF and AF from auxbits to their eflags position

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/emulate/x86_flags.h |  12 +-
 target/i386/emulate/x86_emu.c   |   4 +-
 target/i386/emulate/x86_flags.c | 197 ++++++++++++++------------------
 3 files changed, 86 insertions(+), 127 deletions(-)

diff --git a/target/i386/emulate/x86_flags.h b/target/i386/emulate/x86_flags.h
index 6c175007b57..28b008e5771 100644
--- a/target/i386/emulate/x86_flags.h
+++ b/target/i386/emulate/x86_flags.h
@@ -28,20 +28,10 @@
 void lflags_to_rflags(CPUX86State *env);
 void rflags_to_lflags(CPUX86State *env);
 
-bool get_PF(CPUX86State *env);
-void set_PF(CPUX86State *env, bool val);
 bool get_CF(CPUX86State *env);
 void set_CF(CPUX86State *env, bool val);
-bool get_AF(CPUX86State *env);
-void set_AF(CPUX86State *env, bool val);
-bool get_ZF(CPUX86State *env);
-void set_ZF(CPUX86State *env, bool val);
-bool get_SF(CPUX86State *env);
-void set_SF(CPUX86State *env, bool val);
-bool get_OF(CPUX86State *env);
-void set_OF(CPUX86State *env, bool val);
 
-void SET_FLAGS_OxxxxC(CPUX86State *env, uint32_t new_of, uint32_t new_cf);
+void SET_FLAGS_OxxxxC(CPUX86State *env, bool new_of, bool new_cf);
 
 void SET_FLAGS_OSZAPC_SUB32(CPUX86State *env, uint32_t v1, uint32_t v2,
                             uint32_t diff);
diff --git a/target/i386/emulate/x86_emu.c b/target/i386/emulate/x86_emu.c
index 4c07f08942e..61bd5af5bb1 100644
--- a/target/i386/emulate/x86_emu.c
+++ b/target/i386/emulate/x86_emu.c
@@ -474,10 +474,10 @@ static inline void string_rep(CPUX86State *env, struct x86_decode *decode,
     while (rcx--) {
         func(env, decode);
         write_reg(env, R_ECX, rcx, decode->addressing_size);
-        if ((PREFIX_REP == rep) && !get_ZF(env)) {
+        if ((PREFIX_REP == rep) && !env->lflags.result) {
             break;
         }
-        if ((PREFIX_REPN == rep) && get_ZF(env)) {
+        if ((PREFIX_REPN == rep) && env->lflags.result) {
             break;
         }
     }
diff --git a/target/i386/emulate/x86_flags.c b/target/i386/emulate/x86_flags.c
index 84e27364a03..c347a951889 100644
--- a/target/i386/emulate/x86_flags.c
+++ b/target/i386/emulate/x86_flags.c
@@ -29,41 +29,50 @@
 #include "x86.h"
 
 
-/* this is basically bocsh code */
+/*
+ * The algorithms here are similar to those in Bochs.  After an ALU
+ * operation, RESULT can be used to compute ZF, SF and PF, whereas
+ * AUXBITS is used to compute AF, CF and OF.  In reality, SF and PF are the
+ * XOR of the value computed from RESULT and the value found in bits 7 and 2
+ * of AUXBITS; this way the same logic can be used to compute the flags
+ * both before and after an ALU operation.
+ *
+ * Compared to the TCG CC_OP codes, this avoids conditionals when converting
+ * to and from the RFLAGS representation.
+ */
 
-#define LF_SIGN_BIT     31
+#define LF_SIGN_BIT    (TARGET_LONG_BITS - 1)
 
-#define LF_BIT_SD      (0)          /* lazy Sign Flag Delta            */
-#define LF_BIT_AF      (3)          /* lazy Adjust flag                */
-#define LF_BIT_PDB     (8)          /* lazy Parity Delta Byte (8 bits) */
-#define LF_BIT_CF      (31)         /* lazy Carry Flag                 */
-#define LF_BIT_PO      (30)         /* lazy Partial Overflow = CF ^ OF */
+#define LF_BIT_PD      (2)          /* lazy Parity Delta, same bit as PF */
+#define LF_BIT_AF      (3)          /* lazy Adjust flag */
+#define LF_BIT_SD      (7)          /* lazy Sign Flag Delta, same bit as SF */
+#define LF_BIT_CF      (TARGET_LONG_BITS - 1) /* lazy Carry Flag */
+#define LF_BIT_PO      (TARGET_LONG_BITS - 2) /* lazy Partial Overflow = CF ^ OF */
 
-#define LF_MASK_SD     (0x01 << LF_BIT_SD)
-#define LF_MASK_AF     (0x01 << LF_BIT_AF)
-#define LF_MASK_PDB    (0xFF << LF_BIT_PDB)
-#define LF_MASK_CF     (0x01 << LF_BIT_CF)
-#define LF_MASK_PO     (0x01 << LF_BIT_PO)
+#define LF_MASK_PD     ((target_ulong)0x01 << LF_BIT_PD)
+#define LF_MASK_AF     ((target_ulong)0x01 << LF_BIT_AF)
+#define LF_MASK_SD     ((target_ulong)0x01 << LF_BIT_SD)
+#define LF_MASK_CF     ((target_ulong)0x01 << LF_BIT_CF)
+#define LF_MASK_PO     ((target_ulong)0x01 << LF_BIT_PO)
 
 /* ******************* */
 /* OSZAPC */
 /* ******************* */
 
-/* size, carries, result */
+/* use carries to fill in AF, PO and CF, while ensuring PD and SD are clear.
+ * for full-word operations just clear PD and SD; for smaller operand
+ * sizes only keep AF in the low byte and shift the carries left to
+ * place PO and CF in the top two bits.
+ */
 #define SET_FLAGS_OSZAPC_SIZE(size, lf_carries, lf_result) { \
-    target_ulong temp = ((lf_carries) & (LF_MASK_AF)) | \
-    (((lf_carries) >> (size - 2)) << LF_BIT_PO); \
     env->lflags.result = (target_ulong)(int##size##_t)(lf_result); \
-    if ((size) == 32) { \
-        temp = ((lf_carries) & ~(LF_MASK_PDB | LF_MASK_SD)); \
-    } else if ((size) == 16) { \
-        temp = ((lf_carries) & (LF_MASK_AF)) | ((lf_carries) << 16); \
-    } else if ((size) == 8)  { \
-        temp = ((lf_carries) & (LF_MASK_AF)) | ((lf_carries) << 24); \
+    target_ulong temp = (lf_carries); \
+    if ((size) == TARGET_LONG_BITS) { \
+        temp = temp & ~(LF_MASK_PD | LF_MASK_SD); \
     } else { \
-        VM_PANIC("unimplemented");  \
+        temp = (temp & LF_MASK_AF) | (temp << (TARGET_LONG_BITS - (size))); \
     } \
-    env->lflags.auxbits = (target_ulong)(uint32_t)temp; \
+    env->lflags.auxbits = temp; \
 }
 
 /* carries, result */
@@ -77,23 +86,18 @@
 /* ******************* */
 /* OSZAP */
 /* ******************* */
-/* size, carries, result */
+/* same as setting OSZAPC, but preserve CF and flip PO if the old value of CF
+ * did not match the high bit of lf_carries. */
 #define SET_FLAGS_OSZAP_SIZE(size, lf_carries, lf_result) { \
-    target_ulong temp = ((lf_carries) & (LF_MASK_AF)) | \
-    (((lf_carries) >> (size - 2)) << LF_BIT_PO); \
-    if ((size) == 32) { \
-        temp = ((lf_carries) & ~(LF_MASK_PDB | LF_MASK_SD)); \
-    } else if ((size) == 16) { \
-        temp = ((lf_carries) & (LF_MASK_AF)) | ((lf_carries) << 16); \
-    } else if ((size) == 8) { \
-        temp = ((lf_carries) & (LF_MASK_AF)) | ((lf_carries) << 24); \
-    } else { \
-        VM_PANIC("unimplemented");      \
-    } \
     env->lflags.result = (target_ulong)(int##size##_t)(lf_result); \
-    target_ulong delta_c = (env->lflags.auxbits ^ temp) & LF_MASK_CF; \
-    delta_c ^= (delta_c >> 1); \
-    env->lflags.auxbits = (target_ulong)(uint32_t)(temp ^ delta_c); \
+    target_ulong temp = (lf_carries); \
+    if ((size) == TARGET_LONG_BITS) { \
+        temp = (temp & ~(LF_MASK_PD | LF_MASK_SD)); \
+    } else { \
+        temp = (temp & LF_MASK_AF) | (temp << (TARGET_LONG_BITS - (size))); \
+    } \
+    target_ulong cf_changed = ((target_long)(env->lflags.auxbits ^ temp)) < 0; \
+    env->lflags.auxbits = temp ^ (cf_changed * (LF_MASK_PO | LF_MASK_CF)); \
 }
 
 /* carries, result */
@@ -104,11 +108,11 @@
 #define SET_FLAGS_OSZAP_32(carries, result) \
     SET_FLAGS_OSZAP_SIZE(32, carries, result)
 
-void SET_FLAGS_OxxxxC(CPUX86State *env, uint32_t new_of, uint32_t new_cf)
+void SET_FLAGS_OxxxxC(CPUX86State *env, bool new_of, bool new_cf)
 {
-    uint32_t temp_po = new_of ^ new_cf;
     env->lflags.auxbits &= ~(LF_MASK_PO | LF_MASK_CF);
-    env->lflags.auxbits |= (temp_po << LF_BIT_PO) | (new_cf << LF_BIT_CF);
+    env->lflags.auxbits |= (-(target_ulong)new_cf << LF_BIT_PO);
+    env->lflags.auxbits ^= ((target_ulong)new_of << LF_BIT_PO);
 }
 
 void SET_FLAGS_OSZAPC_SUB32(CPUX86State *env, uint32_t v1, uint32_t v2,
@@ -202,104 +206,69 @@ void SET_FLAGS_OSZAPC_LOGIC8(CPUX86State *env, uint8_t v1, uint8_t v2,
     SET_FLAGS_OSZAPC_8(0, diff);
 }
 
-bool get_PF(CPUX86State *env)
+static inline uint32_t get_PF(CPUX86State *env)
 {
-    uint32_t temp = (255 & env->lflags.result);
-    temp = temp ^ (255 & (env->lflags.auxbits >> LF_BIT_PDB));
-    temp = (temp ^ (temp >> 4)) & 0x0F;
-    return (0x9669U >> temp) & 1;
+    uint8_t temp = env->lflags.result;
+    return ((parity8(temp) - 1) ^ env->lflags.auxbits) & CC_P;
 }
 
-void set_PF(CPUX86State *env, bool val)
+static inline uint32_t get_OF(CPUX86State *env)
 {
-    uint32_t temp = (255 & env->lflags.result) ^ (!val);
-    env->lflags.auxbits &= ~(LF_MASK_PDB);
-    env->lflags.auxbits |= (temp << LF_BIT_PDB);
-}
-
-bool get_OF(CPUX86State *env)
-{
-    return ((env->lflags.auxbits + (1U << LF_BIT_PO)) >> LF_BIT_CF) & 1;
+    return ((env->lflags.auxbits >> (LF_BIT_CF - 11)) + CC_O / 2) & CC_O;
 }
 
 bool get_CF(CPUX86State *env)
 {
-    return (env->lflags.auxbits >> LF_BIT_CF) & 1;
-}
-
-void set_OF(CPUX86State *env, bool val)
-{
-    bool old_cf = get_CF(env);
-    SET_FLAGS_OxxxxC(env, val, old_cf);
+    return ((target_long)env->lflags.auxbits) < 0;
 }
 
 void set_CF(CPUX86State *env, bool val)
 {
-    bool old_of = get_OF(env);
-    SET_FLAGS_OxxxxC(env, old_of, val);
+    /* If CF changes, flip PO and CF */
+    target_ulong temp = -(target_ulong)val;
+    target_ulong cf_changed = ((target_long)(env->lflags.auxbits ^ temp)) < 0;
+    env->lflags.auxbits ^= cf_changed * (LF_MASK_PO | LF_MASK_CF);
 }
 
-bool get_AF(CPUX86State *env)
+static inline uint32_t get_ZF(CPUX86State *env)
 {
-    return (env->lflags.auxbits >> LF_BIT_AF) & 1;
+    return env->lflags.result ? 0 : CC_Z;
 }
 
-void set_AF(CPUX86State *env, bool val)
+static inline uint32_t get_SF(CPUX86State *env)
 {
-    env->lflags.auxbits &= ~(LF_MASK_AF);
-    env->lflags.auxbits |= val << LF_BIT_AF;
-}
-
-bool get_ZF(CPUX86State *env)
-{
-    return !env->lflags.result;
-}
-
-void set_ZF(CPUX86State *env, bool val)
-{
-    if (val) {
-        env->lflags.auxbits ^=
-         (((env->lflags.result >> LF_SIGN_BIT) & 1) << LF_BIT_SD);
-        /* merge the parity bits into the Parity Delta Byte */
-        uint32_t temp_pdb = (255 & env->lflags.result);
-        env->lflags.auxbits ^= (temp_pdb << LF_BIT_PDB);
-        /* now zero the .result value */
-        env->lflags.result = 0;
-    } else {
-        env->lflags.result |= (1 << 8);
-    }
-}
-
-bool get_SF(CPUX86State *env)
-{
-    return ((env->lflags.result >> LF_SIGN_BIT) ^
-            (env->lflags.auxbits >> LF_BIT_SD)) & 1;
-}
-
-void set_SF(CPUX86State *env, bool val)
-{
-    bool temp_sf = get_SF(env);
-    env->lflags.auxbits ^= (temp_sf ^ val) << LF_BIT_SD;
+    return ((env->lflags.result >> (LF_SIGN_BIT - LF_BIT_SD)) ^
+            env->lflags.auxbits) & CC_S;
 }
 
 void lflags_to_rflags(CPUX86State *env)
 {
     env->eflags &= ~(CC_C|CC_P|CC_A|CC_Z|CC_S|CC_O);
-    env->eflags |= get_CF(env) ? CC_C : 0;
-    env->eflags |= get_PF(env) ? CC_P : 0;
-    env->eflags |= get_AF(env) ? CC_A : 0;
-    env->eflags |= get_ZF(env) ? CC_Z : 0;
-    env->eflags |= get_SF(env) ? CC_S : 0;
-    env->eflags |= get_OF(env) ? CC_O : 0;
+    /* rotate left by one to move carry-out bits into CF and AF */
+    env->eflags |= (
+        (env->lflags.auxbits << 1) |
+        (env->lflags.auxbits >> (TARGET_LONG_BITS - 1))) & (CC_C | CC_A);
+    env->eflags |= get_SF(env);
+    env->eflags |= get_PF(env);
+    env->eflags |= get_ZF(env);
+    env->eflags |= get_OF(env);
 }
 
 void rflags_to_lflags(CPUX86State *env)
 {
-    env->lflags.auxbits = env->lflags.result = 0;
-    set_OF(env, env->eflags & CC_O);
-    set_SF(env, env->eflags & CC_S);
-    set_ZF(env, env->eflags & CC_Z);
-    set_AF(env, env->eflags & CC_A);
-    set_PF(env, env->eflags & CC_P);
-    set_CF(env, env->eflags & CC_C);
+    target_ulong cf_xor_of;
+
+    env->lflags.auxbits = CC_P;
+    env->lflags.auxbits ^= env->eflags & (CC_S | CC_P);
+
+    /* rotate right by one to move CF and AF into the carry-out positions */
+    env->lflags.auxbits |= (
+        (env->eflags >> 1) |
+        (env->eflags << (TARGET_LONG_BITS - 1))) & (CC_C | CC_A);
+
+    cf_xor_of = (env->eflags & (CC_C | CC_O)) + (CC_O - CC_C);
+    env->lflags.auxbits |= -cf_xor_of & LF_MASK_PO;
+
+    /* Leave the low byte zero so that parity is not affected.  */
+    env->lflags.result = !(env->eflags & CC_Z) << 8;
 }
-- 
2.49.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/4] target/i386: remove lflags
  2025-05-02 21:48 [CFT PATCH 0/4] target/i386/emulate: cleanups Paolo Bonzini
                   ` (2 preceding siblings ...)
  2025-05-02 21:48 ` [PATCH 3/4] target/i386/emulate: mostly rewrite flags handling Paolo Bonzini
@ 2025-05-02 21:48 ` Paolo Bonzini
  2025-05-03  5:38 ` [CFT PATCH 0/4] target/i386/emulate: cleanups Wei Liu
  2025-05-05 18:34 ` Wei Liu
  5 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2025-05-02 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: wei.liu

Just use cc_dst and cc_src for the same purpose.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.h               |  6 ----
 target/i386/emulate/x86_emu.c   |  4 +--
 target/i386/emulate/x86_flags.c | 55 ++++++++++++++++-----------------
 3 files changed, 29 insertions(+), 36 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 54bf9639f19..8e3323f96f8 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1809,11 +1809,6 @@ typedef struct CPUCaches {
         CPUCacheInfo *l3_cache;
 } CPUCaches;
 
-typedef struct X86LazyFlags {
-    target_ulong result;
-    target_ulong auxbits;
-} X86LazyFlags;
-
 typedef struct CPUArchState {
     /* standard registers */
     target_ulong regs[CPU_NB_REGS];
@@ -2106,7 +2101,6 @@ typedef struct CPUArchState {
     QemuMutex xen_timers_lock;
 #endif
 #if defined(CONFIG_HVF)
-    X86LazyFlags lflags;
     void *emu_mmio_buf;
 #endif
 
diff --git a/target/i386/emulate/x86_emu.c b/target/i386/emulate/x86_emu.c
index 61bd5af5bb1..4890e0a4e5e 100644
--- a/target/i386/emulate/x86_emu.c
+++ b/target/i386/emulate/x86_emu.c
@@ -474,10 +474,10 @@ static inline void string_rep(CPUX86State *env, struct x86_decode *decode,
     while (rcx--) {
         func(env, decode);
         write_reg(env, R_ECX, rcx, decode->addressing_size);
-        if ((PREFIX_REP == rep) && !env->lflags.result) {
+        if ((PREFIX_REP == rep) && !env->cc_dst) {
             break;
         }
-        if ((PREFIX_REPN == rep) && env->lflags.result) {
+        if ((PREFIX_REPN == rep) && env->cc_dst) {
             break;
         }
     }
diff --git a/target/i386/emulate/x86_flags.c b/target/i386/emulate/x86_flags.c
index c347a951889..a4f7af8aacd 100644
--- a/target/i386/emulate/x86_flags.c
+++ b/target/i386/emulate/x86_flags.c
@@ -31,10 +31,10 @@
 
 /*
  * The algorithms here are similar to those in Bochs.  After an ALU
- * operation, RESULT can be used to compute ZF, SF and PF, whereas
- * AUXBITS is used to compute AF, CF and OF.  In reality, SF and PF are the
- * XOR of the value computed from RESULT and the value found in bits 7 and 2
- * of AUXBITS; this way the same logic can be used to compute the flags
+ * operation, CC_DST can be used to compute ZF, SF and PF, whereas
+ * CC_SRC is used to compute AF, CF and OF.  In reality, SF and PF are the
+ * XOR of the value computed from CC_DST and the value found in bits 7 and 2
+ * of CC_SRC; this way the same logic can be used to compute the flags
  * both before and after an ALU operation.
  *
  * Compared to the TCG CC_OP codes, this avoids conditionals when converting
@@ -65,14 +65,14 @@
  * place PO and CF in the top two bits.
  */
 #define SET_FLAGS_OSZAPC_SIZE(size, lf_carries, lf_result) { \
-    env->lflags.result = (target_ulong)(int##size##_t)(lf_result); \
+    env->cc_dst = (target_ulong)(int##size##_t)(lf_result); \
     target_ulong temp = (lf_carries); \
     if ((size) == TARGET_LONG_BITS) { \
         temp = temp & ~(LF_MASK_PD | LF_MASK_SD); \
     } else { \
         temp = (temp & LF_MASK_AF) | (temp << (TARGET_LONG_BITS - (size))); \
     } \
-    env->lflags.auxbits = temp; \
+    env->cc_src = temp; \
 }
 
 /* carries, result */
@@ -89,15 +89,15 @@
 /* same as setting OSZAPC, but preserve CF and flip PO if the old value of CF
  * did not match the high bit of lf_carries. */
 #define SET_FLAGS_OSZAP_SIZE(size, lf_carries, lf_result) { \
-    env->lflags.result = (target_ulong)(int##size##_t)(lf_result); \
+    env->cc_dst = (target_ulong)(int##size##_t)(lf_result); \
     target_ulong temp = (lf_carries); \
     if ((size) == TARGET_LONG_BITS) { \
         temp = (temp & ~(LF_MASK_PD | LF_MASK_SD)); \
     } else { \
         temp = (temp & LF_MASK_AF) | (temp << (TARGET_LONG_BITS - (size))); \
     } \
-    target_ulong cf_changed = ((target_long)(env->lflags.auxbits ^ temp)) < 0; \
-    env->lflags.auxbits = temp ^ (cf_changed * (LF_MASK_PO | LF_MASK_CF)); \
+    target_ulong cf_changed = ((target_long)(env->cc_src ^ temp)) < 0; \
+    env->cc_src = temp ^ (cf_changed * (LF_MASK_PO | LF_MASK_CF)); \
 }
 
 /* carries, result */
@@ -110,9 +110,9 @@
 
 void SET_FLAGS_OxxxxC(CPUX86State *env, bool new_of, bool new_cf)
 {
-    env->lflags.auxbits &= ~(LF_MASK_PO | LF_MASK_CF);
-    env->lflags.auxbits |= (-(target_ulong)new_cf << LF_BIT_PO);
-    env->lflags.auxbits ^= ((target_ulong)new_of << LF_BIT_PO);
+    env->cc_src &= ~(LF_MASK_PO | LF_MASK_CF);
+    env->cc_src |= (-(target_ulong)new_cf << LF_BIT_PO);
+    env->cc_src ^= ((target_ulong)new_of << LF_BIT_PO);
 }
 
 void SET_FLAGS_OSZAPC_SUB32(CPUX86State *env, uint32_t v1, uint32_t v2,
@@ -208,37 +208,36 @@ void SET_FLAGS_OSZAPC_LOGIC8(CPUX86State *env, uint8_t v1, uint8_t v2,
 
 static inline uint32_t get_PF(CPUX86State *env)
 {
-    uint8_t temp = env->lflags.result;
-    return ((parity8(temp) - 1) ^ env->lflags.auxbits) & CC_P;
+    return ((parity8(env->cc_dst) - 1) ^ env->cc_src) & CC_P;
 }
 
 static inline uint32_t get_OF(CPUX86State *env)
 {
-    return ((env->lflags.auxbits >> (LF_BIT_CF - 11)) + CC_O / 2) & CC_O;
+    return ((env->cc_src >> (LF_BIT_CF - 11)) + CC_O / 2) & CC_O;
 }
 
 bool get_CF(CPUX86State *env)
 {
-    return ((target_long)env->lflags.auxbits) < 0;
+    return ((target_long)env->cc_src) < 0;
 }
 
 void set_CF(CPUX86State *env, bool val)
 {
     /* If CF changes, flip PO and CF */
     target_ulong temp = -(target_ulong)val;
-    target_ulong cf_changed = ((target_long)(env->lflags.auxbits ^ temp)) < 0;
-    env->lflags.auxbits ^= cf_changed * (LF_MASK_PO | LF_MASK_CF);
+    target_ulong cf_changed = ((target_long)(env->cc_src ^ temp)) < 0;
+    env->cc_src ^= cf_changed * (LF_MASK_PO | LF_MASK_CF);
 }
 
 static inline uint32_t get_ZF(CPUX86State *env)
 {
-    return env->lflags.result ? 0 : CC_Z;
+    return env->cc_dst ? 0 : CC_Z;
 }
 
 static inline uint32_t get_SF(CPUX86State *env)
 {
-    return ((env->lflags.result >> (LF_SIGN_BIT - LF_BIT_SD)) ^
-            env->lflags.auxbits) & CC_S;
+    return ((env->cc_dst >> (LF_SIGN_BIT - LF_BIT_SD)) ^
+            env->cc_src) & CC_S;
 }
 
 void lflags_to_rflags(CPUX86State *env)
@@ -246,8 +245,8 @@ void lflags_to_rflags(CPUX86State *env)
     env->eflags &= ~(CC_C|CC_P|CC_A|CC_Z|CC_S|CC_O);
     /* rotate left by one to move carry-out bits into CF and AF */
     env->eflags |= (
-        (env->lflags.auxbits << 1) |
-        (env->lflags.auxbits >> (TARGET_LONG_BITS - 1))) & (CC_C | CC_A);
+        (env->cc_src << 1) |
+        (env->cc_src >> (TARGET_LONG_BITS - 1))) & (CC_C | CC_A);
     env->eflags |= get_SF(env);
     env->eflags |= get_PF(env);
     env->eflags |= get_ZF(env);
@@ -258,17 +257,17 @@ void rflags_to_lflags(CPUX86State *env)
 {
     target_ulong cf_xor_of;
 
-    env->lflags.auxbits = CC_P;
-    env->lflags.auxbits ^= env->eflags & (CC_S | CC_P);
+    env->cc_src = CC_P;
+    env->cc_src ^= env->eflags & (CC_S | CC_P);
 
     /* rotate right by one to move CF and AF into the carry-out positions */
-    env->lflags.auxbits |= (
+    env->cc_src |= (
         (env->eflags >> 1) |
         (env->eflags << (TARGET_LONG_BITS - 1))) & (CC_C | CC_A);
 
     cf_xor_of = (env->eflags & (CC_C | CC_O)) + (CC_O - CC_C);
-    env->lflags.auxbits |= -cf_xor_of & LF_MASK_PO;
+    env->cc_src |= -cf_xor_of & LF_MASK_PO;
 
     /* Leave the low byte zero so that parity is not affected.  */
-    env->lflags.result = !(env->eflags & CC_Z) << 8;
+    env->cc_dst = !(env->eflags & CC_Z) << 8;
 }
-- 
2.49.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [CFT PATCH 0/4] target/i386/emulate: cleanups
  2025-05-02 21:48 [CFT PATCH 0/4] target/i386/emulate: cleanups Paolo Bonzini
                   ` (3 preceding siblings ...)
  2025-05-02 21:48 ` [PATCH 4/4] target/i386: remove lflags Paolo Bonzini
@ 2025-05-03  5:38 ` Wei Liu
  2025-05-03  7:01   ` Paolo Bonzini
  2025-05-05 18:34 ` Wei Liu
  5 siblings, 1 reply; 17+ messages in thread
From: Wei Liu @ 2025-05-03  5:38 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, wei.liu

On Fri, May 02, 2025 at 11:48:37PM +0200, Paolo Bonzini wrote:
> These are some improvements to the x86 emulator that I wrote but have no
> way of testing (right now).
> 
> I tried to place them in order of importance so that, if something breaks,
> it is possible to commit a subset.  I tried to compile the resulting code
> on Linux but I have not run it.
> 
> Patch 1 is just to fix warnings on Linux.
> 
> Patch 2 is the most important, as it fixes some real horrors in the code.
> 
> Patch 3 makes flags handling use algorithms somewhat similar to TCG.
> It should fix issues with 64-bit ALU operations, but it's also the one
> where it's more likely to have a mistake.
> 
> Patch 4 is comparatively trivial, though I cannot exclude any screwups.
> 
> It should be possible to test this with both HVF and Hyper-V.
> 
> Paolo

FWIW this series builds fine on for x86 HVF.

Thanks,
Wei.

> 
> Paolo Bonzini (4):
>   target/i386/emulate: fix target_ulong format strings
>   target/i386/emulate: stop overloading decode->op[N].ptr
>   target/i386/emulate: mostly rewrite flags handling
>   target/i386: remove lflags
> 
>  target/i386/cpu.h                |   6 -
>  target/i386/emulate/x86_decode.h |   9 +-
>  target/i386/emulate/x86_emu.h    |   8 +-
>  target/i386/emulate/x86_flags.h  |  12 +-
>  target/i386/emulate/x86_decode.c |  76 ++++++------
>  target/i386/emulate/x86_emu.c    | 125 +++++++++----------
>  target/i386/emulate/x86_flags.c  | 198 +++++++++++++------------------
>  7 files changed, 197 insertions(+), 237 deletions(-)
> 
> -- 
> 2.49.0
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [CFT PATCH 0/4] target/i386/emulate: cleanups
  2025-05-03  5:38 ` [CFT PATCH 0/4] target/i386/emulate: cleanups Wei Liu
@ 2025-05-03  7:01   ` Paolo Bonzini
  2025-05-05 18:33     ` Wei Liu
  0 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2025-05-03  7:01 UTC (permalink / raw)
  To: Wei Liu; +Cc: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 911 bytes --]

Il sab 3 mag 2025, 07:39 Wei Liu <wei.liu@kernel.org> ha scritto:

> FWIW this series builds fine on for x86 HVF.
>

Thanks, can you also test it on either HVF or Hyper-V?

Paolo

Thanks,
> Wei.
>
> >
> > Paolo Bonzini (4):
> >   target/i386/emulate: fix target_ulong format strings
> >   target/i386/emulate: stop overloading decode->op[N].ptr
> >   target/i386/emulate: mostly rewrite flags handling
> >   target/i386: remove lflags
> >
> >  target/i386/cpu.h                |   6 -
> >  target/i386/emulate/x86_decode.h |   9 +-
> >  target/i386/emulate/x86_emu.h    |   8 +-
> >  target/i386/emulate/x86_flags.h  |  12 +-
> >  target/i386/emulate/x86_decode.c |  76 ++++++------
> >  target/i386/emulate/x86_emu.c    | 125 +++++++++----------
> >  target/i386/emulate/x86_flags.c  | 198 +++++++++++++------------------
> >  7 files changed, 197 insertions(+), 237 deletions(-)
> >
> > --
> > 2.49.0
> >
>
>

[-- Attachment #2: Type: text/html, Size: 1730 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] target/i386/emulate: stop overloading decode->op[N].ptr
  2025-05-02 21:48 ` [PATCH 2/4] target/i386/emulate: stop overloading decode->op[N].ptr Paolo Bonzini
@ 2025-05-03 10:46   ` BALATON Zoltan
  2025-05-05  9:43   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 17+ messages in thread
From: BALATON Zoltan @ 2025-05-03 10:46 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, wei.liu

On Fri, 2 May 2025, Paolo Bonzini wrote:
> decode->op[N].ptr can contain either a host pointer (!) in CPUState
> or a guest virtual address.  Pass the whole struct to read_val_ext
> and write_val_ext, so that it can decide the contents based on the
> operand type.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> target/i386/emulate/x86_decode.h |   9 ++-
> target/i386/emulate/x86_emu.h    |   8 +--
> target/i386/emulate/x86_decode.c |  74 +++++++++----------
> target/i386/emulate/x86_emu.c    | 119 ++++++++++++++++---------------
> 4 files changed, 109 insertions(+), 101 deletions(-)
>
> diff --git a/target/i386/emulate/x86_decode.h b/target/i386/emulate/x86_decode.h
> index 87cc728598d..497cbdef9c7 100644
> --- a/target/i386/emulate/x86_decode.h
> +++ b/target/i386/emulate/x86_decode.h
> @@ -266,7 +266,10 @@ typedef struct x86_decode_op {
>     int reg;
>     target_ulong val;
>
> -    target_ulong ptr;
> +    union {
> +        target_ulong addr;
> +        void *regptr;
> +    };
> } x86_decode_op;
>
> typedef struct x86_decode {
> @@ -301,8 +304,8 @@ uint64_t sign(uint64_t val, int size);
>
> uint32_t decode_instruction(CPUX86State *env, struct x86_decode *decode);
>
> -target_ulong get_reg_ref(CPUX86State *env, int reg, int rex_present,
> -                         int is_extended, int size);
> +void * get_reg_ref(CPUX86State *env, int reg, int rex_present,
> +                    int is_extended, int size);

Stray space after *, checkpatch should have cought it.

Regards,
BALATON Zoltan


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] target/i386/emulate: stop overloading decode->op[N].ptr
  2025-05-02 21:48 ` [PATCH 2/4] target/i386/emulate: stop overloading decode->op[N].ptr Paolo Bonzini
  2025-05-03 10:46   ` BALATON Zoltan
@ 2025-05-05  9:43   ` Philippe Mathieu-Daudé
  2025-05-05 10:00     ` Paolo Bonzini
  1 sibling, 1 reply; 17+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-05-05  9:43 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: wei.liu

Hi Paolo,

On 2/5/25 23:48, Paolo Bonzini wrote:
> decode->op[N].ptr can contain either a host pointer (!) in CPUState
> or a guest virtual address.  Pass the whole struct to read_val_ext
> and write_val_ext, so that it can decide the contents based on the
> operand type.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   target/i386/emulate/x86_decode.h |   9 ++-
>   target/i386/emulate/x86_emu.h    |   8 +--
>   target/i386/emulate/x86_decode.c |  74 +++++++++----------
>   target/i386/emulate/x86_emu.c    | 119 ++++++++++++++++---------------
>   4 files changed, 109 insertions(+), 101 deletions(-)
> 
> diff --git a/target/i386/emulate/x86_decode.h b/target/i386/emulate/x86_decode.h
> index 87cc728598d..497cbdef9c7 100644
> --- a/target/i386/emulate/x86_decode.h
> +++ b/target/i386/emulate/x86_decode.h
> @@ -266,7 +266,10 @@ typedef struct x86_decode_op {
>       int reg;
>       target_ulong val;
>   
> -    target_ulong ptr;
> +    union {
> +        target_ulong addr;

Prefer 'vaddr' type for "guest virtual address".

> +        void *regptr;
> +    };
>   } x86_decode_op;



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] target/i386/emulate: stop overloading decode->op[N].ptr
  2025-05-05  9:43   ` Philippe Mathieu-Daudé
@ 2025-05-05 10:00     ` Paolo Bonzini
  2025-05-05 10:54       ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2025-05-05 10:00 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, qemu-devel; +Cc: wei.liu

On 5/5/25 11:43, Philippe Mathieu-Daudé wrote:
> Hi Paolo,
> 
> On 2/5/25 23:48, Paolo Bonzini wrote:
>> decode->op[N].ptr can contain either a host pointer (!) in CPUState
>> or a guest virtual address.  Pass the whole struct to read_val_ext
>> and write_val_ext, so that it can decide the contents based on the
>> operand type.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>   target/i386/emulate/x86_decode.h |   9 ++-
>>   target/i386/emulate/x86_emu.h    |   8 +--
>>   target/i386/emulate/x86_decode.c |  74 +++++++++----------
>>   target/i386/emulate/x86_emu.c    | 119 ++++++++++++++++---------------
>>   4 files changed, 109 insertions(+), 101 deletions(-)
>>
>> diff --git a/target/i386/emulate/x86_decode.h b/target/i386/emulate/ 
>> x86_decode.h
>> index 87cc728598d..497cbdef9c7 100644
>> --- a/target/i386/emulate/x86_decode.h
>> +++ b/target/i386/emulate/x86_decode.h
>> @@ -266,7 +266,10 @@ typedef struct x86_decode_op {
>>       int reg;
>>       target_ulong val;
>> -    target_ulong ptr;
>> +    union {
>> +        target_ulong addr;
> 
> Prefer 'vaddr' type for "guest virtual address".
That would be a semantic change which I really want to avoid in this series.

I don't think target_long/target_ulong is a big blocker towards 
single-binary anyway.  The trick is to confine it to target/, making it 
essentially a #define.  That is, let target/* include one of two new 
headers target_long_32.h and target_long_64.h.  See 
lore.kernel.org/r/68b6c799-6407-43cc-aebc-a0ef6b8b64fa@redhat.com as well.

For the same reason I think target_long_bits() is a step in the wrong 
direction.

Paolo



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] target/i386/emulate: stop overloading decode->op[N].ptr
  2025-05-05 10:00     ` Paolo Bonzini
@ 2025-05-05 10:54       ` Philippe Mathieu-Daudé
  2025-05-05 12:59         ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-05-05 10:54 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: wei.liu, Pierrick Bouvier, Richard Henderson

On 5/5/25 12:00, Paolo Bonzini wrote:
> On 5/5/25 11:43, Philippe Mathieu-Daudé wrote:
>> Hi Paolo,
>>
>> On 2/5/25 23:48, Paolo Bonzini wrote:
>>> decode->op[N].ptr can contain either a host pointer (!) in CPUState
>>> or a guest virtual address.  Pass the whole struct to read_val_ext
>>> and write_val_ext, so that it can decide the contents based on the
>>> operand type.
>>>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>> ---
>>>   target/i386/emulate/x86_decode.h |   9 ++-
>>>   target/i386/emulate/x86_emu.h    |   8 +--
>>>   target/i386/emulate/x86_decode.c |  74 +++++++++----------
>>>   target/i386/emulate/x86_emu.c    | 119 ++++++++++++++++---------------
>>>   4 files changed, 109 insertions(+), 101 deletions(-)
>>>
>>> diff --git a/target/i386/emulate/x86_decode.h b/target/i386/emulate/ 
>>> x86_decode.h
>>> index 87cc728598d..497cbdef9c7 100644
>>> --- a/target/i386/emulate/x86_decode.h
>>> +++ b/target/i386/emulate/x86_decode.h
>>> @@ -266,7 +266,10 @@ typedef struct x86_decode_op {
>>>       int reg;
>>>       target_ulong val;
>>> -    target_ulong ptr;
>>> +    union {
>>> +        target_ulong addr;
>>
>> Prefer 'vaddr' type for "guest virtual address".
> That would be a semantic change which I really want to avoid in this 
> series.
> 
> I don't think target_long/target_ulong is a big blocker towards single- 
> binary anyway.  The trick is to confine it to target/, making it 
> essentially a #define.  That is, let target/* include one of two new 
> headers target_long_32.h and target_long_64.h.  See lore.kernel.org/ 
> r/68b6c799-6407-43cc-aebc-a0ef6b8b64fa@redhat.com as well.

Yes, I have this tagged to understand and address. Maybe Pierrick
already understood the issue (similar mention from Richard? [1]) and
is addressing it, see [2].

[1] 
https://lore.kernel.org/qemu-devel/5b152664-a752-4be8-aa15-8c71c040b026@linaro.org/
[2] 
https://lore.kernel.org/qemu-devel/20250505015223.3895275-15-pierrick.bouvier@linaro.org/

> 
> For the same reason I think target_long_bits() is a step in the wrong 
> direction.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] target/i386/emulate: stop overloading decode->op[N].ptr
  2025-05-05 10:54       ` Philippe Mathieu-Daudé
@ 2025-05-05 12:59         ` Paolo Bonzini
  2025-05-05 18:11           ` Pierrick Bouvier
  2025-05-05 19:33           ` Richard Henderson
  0 siblings, 2 replies; 17+ messages in thread
From: Paolo Bonzini @ 2025-05-05 12:59 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: qemu-devel, wei.liu, Pierrick Bouvier, Richard Henderson

On Mon, May 5, 2025 at 12:54 PM Philippe Mathieu-Daudé
<philmd@linaro.org> wrote:
> > I don't think target_long/target_ulong is a big blocker towards single-
> > binary anyway.  The trick is to confine it to target/, making it
> > essentially a #define.  That is, let target/* include one of two new
> > headers target_long_32.h and target_long_64.h.  See lore.kernel.org/
> > r/68b6c799-6407-43cc-aebc-a0ef6b8b64fa@redhat.com as well.
>
> Yes, I have this tagged to understand and address. Maybe Pierrick
> already understood the issue (similar mention from Richard? [1]) and
> is addressing it, see [2].

Those are different.  TCGv_vaddr is able to store a *host* pointer,
i.e.  a host uintptr_t.  But target_long/target_ulong are already
completely absent from tcg/ (there are a couple appearances in
include/tcg), so I'm proposing to remove them completely and leave
them as just macros.

Please give me a shout once the target-arm/ series lands, I can look
into this further.

Paolo



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] target/i386/emulate: stop overloading decode->op[N].ptr
  2025-05-05 12:59         ` Paolo Bonzini
@ 2025-05-05 18:11           ` Pierrick Bouvier
  2025-05-05 19:33           ` Richard Henderson
  1 sibling, 0 replies; 17+ messages in thread
From: Pierrick Bouvier @ 2025-05-05 18:11 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé
  Cc: qemu-devel, wei.liu, Richard Henderson

On 5/5/25 5:59 AM, Paolo Bonzini wrote:
> On Mon, May 5, 2025 at 12:54 PM Philippe Mathieu-Daudé
> <philmd@linaro.org> wrote:
>>> I don't think target_long/target_ulong is a big blocker towards single-
>>> binary anyway.  The trick is to confine it to target/, making it
>>> essentially a #define.  That is, let target/* include one of two new
>>> headers target_long_32.h and target_long_64.h.  See lore.kernel.org/
>>> r/68b6c799-6407-43cc-aebc-a0ef6b8b64fa@redhat.com as well.
>>

In an ideal world, it should be eliminated completely.
The root problem is that it creates variation of symbols/types, which 
obviously can't be unified later when we mix targets together.
Of course, it's a paradigm shift, as QEMU traditionally was written with 
the "per target" approach in mind.

 From this perspective, the only place where it could eventually survive 
is within static functions or types private to a compilation unit, only 
if this compilation unit is private to a unique target architecture 
(TARGET_AARCH64 only for instance).

But in this case, it's much more easy to get rid of it and replace with 
it the real type directly. So I don't see any good reason to keep it 
anywhere.

Our solution here is to "widen" the concerned definitions, using vaddr 
or uint64_t for target_ulong (vaddr is possible only for addresses, 
because we removed 64 bits guests support on 32 bits hosts - I hope it 
won't bite us later).
For target_long, it's more tricky, as sign extension matters.

The problem is that it will necessarily break some stable interfaces, 
which were expecting a uint32_t before, but I think it's a necessary 
change to do at some point. But it can be done slowly and on a per 
target basis.

>> Yes, I have this tagged to understand and address. Maybe Pierrick
>> already understood the issue (similar mention from Richard? [1]) and
>> is addressing it, see [2].
> 
> Those are different.  TCGv_vaddr is able to store a *host* pointer,
> i.e.  a host uintptr_t.  But target_long/target_ulong are already
> completely absent from tcg/ (there are a couple appearances in
> include/tcg), so I'm proposing to remove them completely and leave
> them as just macros.
>

 From what I understood, the original meaning is that vaddr can contain 
a *guest* pointer, and not a *host* pointer. Since 64 bits targets have 
been disabled for 32 bits hosts, vaddr definition was changed to 
uintptr_t, where it was uint64_t before. [1]

[1] 
https://gitlab.com/qemu-project/qemu/-/commit/a70af12addd9060fdf8f3dbd42b42e3072c3914f

> Please give me a shout once the target-arm/ series lands, I can look
> into this further.
> 

The current series is mostly complete, and focused on low hanging 
fruits. The one remaining are a bit harder, and/or have external 
dependencies, but I didn't see anything impossible so far.

Luckily, we didn't have a lot of target_ulong in Arm related structs, so 
it's pretty easy to change. Some other architectures (especially the 
ones using them in cpu.h) will be harder.

> Paolo
> 



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [CFT PATCH 0/4] target/i386/emulate: cleanups
  2025-05-03  7:01   ` Paolo Bonzini
@ 2025-05-05 18:33     ` Wei Liu
  0 siblings, 0 replies; 17+ messages in thread
From: Wei Liu @ 2025-05-05 18:33 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Wei Liu, qemu-devel

On Sat, May 03, 2025 at 09:01:46AM +0200, Paolo Bonzini wrote:
> Il sab 3 mag 2025, 07:39 Wei Liu <wei.liu@kernel.org> ha scritto:
> 
> > FWIW this series builds fine on for x86 HVF.
> >
> 
> Thanks, can you also test it on either HVF or Hyper-V?
> 

I will leave it to Magnus to test your changes on MSHV.

Wei.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [CFT PATCH 0/4] target/i386/emulate: cleanups
  2025-05-02 21:48 [CFT PATCH 0/4] target/i386/emulate: cleanups Paolo Bonzini
                   ` (4 preceding siblings ...)
  2025-05-03  5:38 ` [CFT PATCH 0/4] target/i386/emulate: cleanups Wei Liu
@ 2025-05-05 18:34 ` Wei Liu
  2025-05-09 12:56   ` Magnus Kulke
  5 siblings, 1 reply; 17+ messages in thread
From: Wei Liu @ 2025-05-05 18:34 UTC (permalink / raw)
  To: Paolo Bonzini, Magnus Kulke; +Cc: qemu-devel, wei.liu

Magnus, can you test this series on MSHV?

On Fri, May 02, 2025 at 11:48:37PM +0200, Paolo Bonzini wrote:
> These are some improvements to the x86 emulator that I wrote but have no
> way of testing (right now).
> 
> I tried to place them in order of importance so that, if something breaks,
> it is possible to commit a subset.  I tried to compile the resulting code
> on Linux but I have not run it.
> 
> Patch 1 is just to fix warnings on Linux.
> 
> Patch 2 is the most important, as it fixes some real horrors in the code.
> 
> Patch 3 makes flags handling use algorithms somewhat similar to TCG.
> It should fix issues with 64-bit ALU operations, but it's also the one
> where it's more likely to have a mistake.
> 
> Patch 4 is comparatively trivial, though I cannot exclude any screwups.
> 
> It should be possible to test this with both HVF and Hyper-V.
> 
> Paolo
> 
> Paolo Bonzini (4):
>   target/i386/emulate: fix target_ulong format strings
>   target/i386/emulate: stop overloading decode->op[N].ptr
>   target/i386/emulate: mostly rewrite flags handling
>   target/i386: remove lflags
> 
>  target/i386/cpu.h                |   6 -
>  target/i386/emulate/x86_decode.h |   9 +-
>  target/i386/emulate/x86_emu.h    |   8 +-
>  target/i386/emulate/x86_flags.h  |  12 +-
>  target/i386/emulate/x86_decode.c |  76 ++++++------
>  target/i386/emulate/x86_emu.c    | 125 +++++++++----------
>  target/i386/emulate/x86_flags.c  | 198 +++++++++++++------------------
>  7 files changed, 197 insertions(+), 237 deletions(-)
> 
> -- 
> 2.49.0
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] target/i386/emulate: stop overloading decode->op[N].ptr
  2025-05-05 12:59         ` Paolo Bonzini
  2025-05-05 18:11           ` Pierrick Bouvier
@ 2025-05-05 19:33           ` Richard Henderson
  1 sibling, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2025-05-05 19:33 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé
  Cc: qemu-devel, wei.liu, Pierrick Bouvier

On 5/5/25 05:59, Paolo Bonzini wrote:
> On Mon, May 5, 2025 at 12:54 PM Philippe Mathieu-Daudé
> <philmd@linaro.org> wrote:
>>> I don't think target_long/target_ulong is a big blocker towards single-
>>> binary anyway.  The trick is to confine it to target/, making it
>>> essentially a #define.  That is, let target/* include one of two new
>>> headers target_long_32.h and target_long_64.h.  See lore.kernel.org/
>>> r/68b6c799-6407-43cc-aebc-a0ef6b8b64fa@redhat.com as well.
>>
>> Yes, I have this tagged to understand and address. Maybe Pierrick
>> already understood the issue (similar mention from Richard? [1]) and
>> is addressing it, see [2].
> 
> Those are different.  TCGv_vaddr is able to store a *host* pointer,
> i.e.  a host uintptr_t.

No, vaddr is about a guest pointer.

The state of affairs is slightly confusing because once we dropped 64-bit guests from 
32-bit hosts, we subsequently made the simplifying assumption that a host pointer is never 
smaller than a guest pointer.  Thus exec/vaddr.h has typedef uintptr_t vaddr.


>  But target_long/target_ulong are already
> completely absent from tcg/ (there are a couple appearances in
> include/tcg), so I'm proposing to remove them completely and leave
> them as just macros.

The "problem" with target_long in target/arch/ occurs when target/arch/ is built twice, 
e.g. i386 vs x86_64, arm vs aarch64, etc.

We want to build these once.  For the most part this already works, e.g. 
qemu-system-x86_64 running 32-bit cpus just fine.  That relies on the wider value of 
target_long, so we can avoid ambiguity by using a different type.  In this case, if we 
really do want to represent a guest virtual address, then "vaddr" is the best type to use.


r~


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [CFT PATCH 0/4] target/i386/emulate: cleanups
  2025-05-05 18:34 ` Wei Liu
@ 2025-05-09 12:56   ` Magnus Kulke
  0 siblings, 0 replies; 17+ messages in thread
From: Magnus Kulke @ 2025-05-09 12:56 UTC (permalink / raw)
  To: Wei Liu; +Cc: Paolo Bonzini, qemu-devel

Hey Paolo,

I applied your patches to the x86 emulator in our MSHV branch. They
compile cleanly (some off this we had changed on our branch already). I
also performed some manual testing and didn't spot any regressions with
the changes in the emulator.

magnus

already). I also performed some manual testing and didn't spot any regressions with
the emulator.

On Mon, May 05, 2025 at 06:34:39PM +0000, Wei Liu wrote:
> Magnus, can you test this series on MSHV?
> 
> On Fri, May 02, 2025 at 11:48:37PM +0200, Paolo Bonzini wrote:
> > These are some improvements to the x86 emulator that I wrote but have no
> > way of testing (right now).
> > 
> > I tried to place them in order of importance so that, if something breaks,
> > it is possible to commit a subset.  I tried to compile the resulting code
> > on Linux but I have not run it.
> > 
> > Patch 1 is just to fix warnings on Linux.
> > 
> > Patch 2 is the most important, as it fixes some real horrors in the code.
> > 
> > Patch 3 makes flags handling use algorithms somewhat similar to TCG.
> > It should fix issues with 64-bit ALU operations, but it's also the one
> > where it's more likely to have a mistake.
> > 
> > Patch 4 is comparatively trivial, though I cannot exclude any screwups.
> > 
> > It should be possible to test this with both HVF and Hyper-V.
> > 
> > Paolo
> > 
> > Paolo Bonzini (4):
> >   target/i386/emulate: fix target_ulong format strings
> >   target/i386/emulate: stop overloading decode->op[N].ptr
> >   target/i386/emulate: mostly rewrite flags handling
> >   target/i386: remove lflags
> > 
> >  target/i386/cpu.h                |   6 -
> >  target/i386/emulate/x86_decode.h |   9 +-
> >  target/i386/emulate/x86_emu.h    |   8 +-
> >  target/i386/emulate/x86_flags.h  |  12 +-
> >  target/i386/emulate/x86_decode.c |  76 ++++++------
> >  target/i386/emulate/x86_emu.c    | 125 +++++++++----------
> >  target/i386/emulate/x86_flags.c  | 198 +++++++++++++------------------
> >  7 files changed, 197 insertions(+), 237 deletions(-)
> > 
> > -- 
> > 2.49.0
> > 


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2025-05-09 12:56 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-02 21:48 [CFT PATCH 0/4] target/i386/emulate: cleanups Paolo Bonzini
2025-05-02 21:48 ` [PATCH 1/4] target/i386/emulate: fix target_ulong format strings Paolo Bonzini
2025-05-02 21:48 ` [PATCH 2/4] target/i386/emulate: stop overloading decode->op[N].ptr Paolo Bonzini
2025-05-03 10:46   ` BALATON Zoltan
2025-05-05  9:43   ` Philippe Mathieu-Daudé
2025-05-05 10:00     ` Paolo Bonzini
2025-05-05 10:54       ` Philippe Mathieu-Daudé
2025-05-05 12:59         ` Paolo Bonzini
2025-05-05 18:11           ` Pierrick Bouvier
2025-05-05 19:33           ` Richard Henderson
2025-05-02 21:48 ` [PATCH 3/4] target/i386/emulate: mostly rewrite flags handling Paolo Bonzini
2025-05-02 21:48 ` [PATCH 4/4] target/i386: remove lflags Paolo Bonzini
2025-05-03  5:38 ` [CFT PATCH 0/4] target/i386/emulate: cleanups Wei Liu
2025-05-03  7:01   ` Paolo Bonzini
2025-05-05 18:33     ` Wei Liu
2025-05-05 18:34 ` Wei Liu
2025-05-09 12:56   ` Magnus Kulke

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).