[PULL 0/6] tcg patch queue


* [PULL 0/6] tcg patch queue
@ 2025-09-05  7:50 Richard Henderson
  2025-09-05  7:50 ` [PATCH 1/2] target/sparc: Loosen decode of STBAR for v8 Richard Henderson
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Richard Henderson @ 2025-09-05  7:50 UTC
  To: qemu-devel

The following changes since commit baa79455fa92984ff0f4b9ae94bed66823177a27:

  Merge tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu into staging (2025-09-03 11:39:16 +0200)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20250905

for you to fetch changes up to cb2540979264c8d3984e26c5dd90a840e47ec5dd:

  tcg/i386: Use vgf2p8affineqb for MO_8 vector shifts (2025-09-04 09:49:30 +0200)

----------------------------------------------------------------
tcg/arm: Fix tgen_deposit
tcg/i386: Use vgf2p8affineqb for MO_8 vector shifts

----------------------------------------------------------------
Richard Henderson (6):
      tcg/arm: Fix tgen_deposit
      cpuinfo/i386: Detect GFNI as an AVX extension
      tcg/i386: Expand sari of bits-1 as pcmpgt
      tcg/i386: Use canonical operand ordering in expand_vec_sari
      tcg/i386: Add INDEX_op_x86_vgf2p8affineqb_vec
      tcg/i386: Use vgf2p8affineqb for MO_8 vector shifts

 host/include/i386/host/cpuinfo.h |  1 +
 include/qemu/cpuid.h             |  3 ++
 util/cpuinfo-i386.c              |  1 +
 tcg/arm/tcg-target.c.inc         |  3 +-
 tcg/i386/tcg-target-opc.h.inc    |  1 +
 tcg/i386/tcg-target.c.inc        | 91 +++++++++++++++++++++++++++++++++++++---
 6 files changed, 93 insertions(+), 7 deletions(-)



* [PATCH 1/2] target/sparc: Loosen decode of STBAR for v8
  2025-09-05  7:50 [PULL 0/6] tcg patch queue Richard Henderson
@ 2025-09-05  7:50 ` Richard Henderson
  2025-09-05  7:50 ` [PULL 1/6] tcg/arm: Fix tgen_deposit Richard Henderson
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Richard Henderson @ 2025-09-05  7:50 UTC
  To: qemu-devel

Solaris 8 appears to have a bug whereby it executes v9 MEMBAR
instructions when booting a freshly installed image. According
to the SPARC v8 architecture manual, whilst bit 13 and bits 12-0
of the "Read State Register Instructions" are notionally zero,
they are marked as unused (i.e. ignored).

Fixes: af25071c1d ("target/sparc: Move RDASR, STBAR, MEMBAR to decodetree")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3097
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/sparc/translate.c  | 12 +++++++++++-
 target/sparc/insns.decode | 13 ++++++++++++-
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index b922e53bf1..c2ffd965d8 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -2823,12 +2823,22 @@ static bool trans_Tcc_i_v9(DisasContext *dc, arg_Tcc_i_v9 *a)
     return do_tcc(dc, a->cond, a->cc, a->rs1, true, a->i);
 }
 
-static bool trans_STBAR(DisasContext *dc, arg_STBAR *a)
+static bool do_stbar(DisasContext *dc)
 {
     tcg_gen_mb(TCG_MO_ST_ST | TCG_BAR_SC);
     return advance_pc(dc);
 }
 
+static bool trans_STBAR_v8(DisasContext *dc, arg_STBAR_v8 *a)
+{
+    return avail_32(dc) && do_stbar(dc);
+}
+
+static bool trans_STBAR_v9(DisasContext *dc, arg_STBAR_v9 *a)
+{
+    return avail_64(dc) && do_stbar(dc);
+}
+
 static bool trans_MEMBAR(DisasContext *dc, arg_MEMBAR *a)
 {
     if (avail_32(dc)) {
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 9e39d23273..1b1b85e9c2 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -88,7 +88,7 @@ CALL    01 i:s30
 
 {
   [
-    STBAR           10 00000 101000 01111 0 0000000000000
+    STBAR_v9        10 00000 101000 01111 0 0000000000000
     MEMBAR          10 00000 101000 01111 1 000000 cmask:3 mmask:4
 
     RDCCR           10 rd:5  101000 00010 0 0000000000000
@@ -107,6 +107,17 @@ CALL    01 i:s30
     RDSTICK_CMPR    10 rd:5  101000 11001 0 0000000000000
     RDSTRAND_STATUS 10 rd:5  101000 11010 0 0000000000000
   ]
+
+  # The v8 manual, section B.30 STBAR instruction, says
+  # bits [12:0] are ignored, but bit 13 must be 0.
+  # However, section B.28 Read State Register Instruction has a
+  # comment that RDASR with rs1 = 15, rd = 0 is STBAR.  Here,
+  # bit 13 is also ignored and rd != 0 is merely reserved.
+  #
+  # Solaris 8 executes v9 MEMBAR instruction 0x8143e008 during boot.
+  # This confirms that bit 13 is ignored, as 0x8143c000 is STBAR.
+  STBAR_v8          10 ----- 101000 01111 - -------------
+
   # Before v8, all rs1 accepted; otherwise rs1==0.
   RDY               10 rd:5  101000 rs1:5 0 0000000000000
 }
-- 
2.43.0
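
To make the encoding argument in the patch above concrete, here is a minimal standalone sketch (not QEMU code) that checks the two instruction words quoted in the decodetree comment against the loosened v8 pattern. The helper names and the mask/value constants are hypothetical, derived by hand from the pattern lines `10 ----- 101000 01111 - -------------` and `10 00000 101000 01111 0 0000000000000`; the real decoder is generated from insns.decode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fixed bits of the v8 pattern: op [31:30], op3 [24:19], rs1 [18:14]. */
#define STBAR_V8_MASK  0xC1FFC000u
#define STBAR_V8_VALUE 0x8143C000u   /* op = 2, op3 = 0x28, rs1 = 15 */

/* v8: rd, bit 13 and bits [12:0] are all ignored. */
static bool is_stbar_v8(uint32_t insn)
{
    return (insn & STBAR_V8_MASK) == STBAR_V8_VALUE;
}

/* v9: every bit of the pattern is fixed, so only one word matches. */
static bool is_stbar_v9(uint32_t insn)
{
    return insn == 0x8143C000u;
}

/* Bit 13 distinguishes MEMBAR (1) from STBAR (0) in the v9 encoding. */
static unsigned insn_bit13(uint32_t insn)
{
    return (insn >> 13) & 1;
}
```

With these helpers, 0x8143c000 matches both patterns, while 0x8143e008 (the word Solaris 8 executes) has bit 13 set and a non-zero low field, so it matches only the v8 pattern.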




* [PULL 1/6] tcg/arm: Fix tgen_deposit
  2025-09-05  7:50 [PULL 0/6] tcg patch queue Richard Henderson
  2025-09-05  7:50 ` [PATCH 1/2] target/sparc: Loosen decode of STBAR for v8 Richard Henderson
@ 2025-09-05  7:50 ` Richard Henderson
  2025-09-05  7:50 ` [PULL 2/6] cpuinfo/i386: Detect GFNI as an AVX extension Richard Henderson
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Richard Henderson @ 2025-09-05  7:50 UTC
  To: qemu-devel; +Cc: qemu-stable, Michael Tokarev, Philippe Mathieu-Daudé

When converting from tcg_out_deposit, the arguments were not
shuffled properly.

Cc: qemu-stable@nongnu.org
Fixes: cf4905c03135f1181e8 ("tcg: Convert deposit to TCGOutOpDeposit")
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.c.inc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 836894b16a..338c57b061 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -975,7 +975,8 @@ static void tgen_deposit(TCGContext *s, TCGType type, TCGReg a0, TCGReg a1,
                          TCGReg a2, unsigned ofs, unsigned len)
 {
     /* bfi/bfc */
-    tcg_out32(s, 0x07c00010 | (COND_AL << 28) | (a0 << 12) | a1
+    tcg_debug_assert(a0 == a1);
+    tcg_out32(s, 0x07c00010 | (COND_AL << 28) | (a0 << 12) | a2
               | (ofs << 7) | ((ofs + len - 1) << 16));
 }
 
-- 
2.43.0
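
As an illustration of why the source register has to be a2, the following standalone sketch (not the QEMU implementation) pairs a plain-C reference for deposit with an encoder that mirrors the fixed expression in the hunk above. The names `deposit32_ref`, `encode_bfi` and `SKETCH_COND_AL` are hypothetical. BFI inserts Rn into a field of Rd and leaves the rest of Rd alone, so Rd must already hold the base value (hence the a0 == a1 assertion in the patch) and Rn is the value being inserted.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_COND_AL 0xEu   /* A32 "always" condition code */

/* Reference semantics: replace len bits of base, starting at ofs, with field. */
static uint32_t deposit32_ref(uint32_t base, uint32_t field,
                              unsigned ofs, unsigned len)
{
    assert(len >= 1 && ofs + len <= 32);
    uint32_t mask = (0xFFFFFFFFu >> (32 - len)) << ofs;
    return (base & ~mask) | ((field << ofs) & mask);
}

/*
 * Encode "BFI rd, rn, #ofs, #len" (A32).  rd is both the base input and
 * the output; rn is the value to insert.  rn == 15 encodes BFC instead.
 */
static uint32_t encode_bfi(unsigned rd, unsigned rn,
                           unsigned ofs, unsigned len)
{
    assert(rd < 16 && rn < 16 && len >= 1 && ofs + len <= 32);
    return 0x07c00010u | (SKETCH_COND_AL << 28) | (rd << 12) | rn
           | (ofs << 7) | ((ofs + len - 1) << 16);
}
```

For example, `encode_bfi(0, 1, 8, 8)` gives the word for `bfi r0, r1, #8, #8`. Passing the base register as rn, as the pre-fix code effectively did with a1, would insert the base value into itself rather than the new field.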




* [PULL 2/6] cpuinfo/i386: Detect GFNI as an AVX extension
  2025-09-05  7:50 [PULL 0/6] tcg patch queue Richard Henderson
  2025-09-05  7:50 ` [PATCH 1/2] target/sparc: Loosen decode of STBAR for v8 Richard Henderson
  2025-09-05  7:50 ` [PULL 1/6] tcg/arm: Fix tgen_deposit Richard Henderson
@ 2025-09-05  7:50 ` Richard Henderson
  2025-09-05  7:50 ` [PATCH 2/2] target/sparc: Loosen decode of RDY for v7 Richard Henderson
  2025-09-05 12:36 ` [PULL 0/6] tcg patch queue Richard Henderson
  4 siblings, 0 replies; 10+ messages in thread
From: Richard Henderson @ 2025-09-05  7:50 UTC
  To: qemu-devel

We won't use the SSE GFNI instructions, so delay
detection until we know AVX is present.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 host/include/i386/host/cpuinfo.h | 1 +
 include/qemu/cpuid.h             | 3 +++
 util/cpuinfo-i386.c              | 1 +
 3 files changed, 5 insertions(+)

diff --git a/host/include/i386/host/cpuinfo.h b/host/include/i386/host/cpuinfo.h
index 9541a64da6..93d029d499 100644
--- a/host/include/i386/host/cpuinfo.h
+++ b/host/include/i386/host/cpuinfo.h
@@ -27,6 +27,7 @@
 #define CPUINFO_ATOMIC_VMOVDQU  (1u << 17)
 #define CPUINFO_AES             (1u << 18)
 #define CPUINFO_PCLMUL          (1u << 19)
+#define CPUINFO_GFNI            (1u << 20)
 
 /* Initialized with a constructor. */
 extern unsigned cpuinfo;
diff --git a/include/qemu/cpuid.h b/include/qemu/cpuid.h
index b11161555b..de7a900509 100644
--- a/include/qemu/cpuid.h
+++ b/include/qemu/cpuid.h
@@ -68,6 +68,9 @@
 #ifndef bit_AVX512VBMI2
 #define bit_AVX512VBMI2 (1 << 6)
 #endif
+#ifndef bit_GFNI
+#define bit_GFNI        (1 << 8)
+#endif
 
 /* Leaf 0x80000001, %ecx */
 #ifndef bit_LZCNT
diff --git a/util/cpuinfo-i386.c b/util/cpuinfo-i386.c
index c8c8a1b370..f4c5b6ff40 100644
--- a/util/cpuinfo-i386.c
+++ b/util/cpuinfo-i386.c
@@ -50,6 +50,7 @@ unsigned __attribute__((constructor)) cpuinfo_init(void)
             if ((bv & 6) == 6) {
                 info |= CPUINFO_AVX1;
                 info |= (b7 & bit_AVX2 ? CPUINFO_AVX2 : 0);
+                info |= (c7 & bit_GFNI ? CPUINFO_GFNI : 0);
 
                 if ((bv & 0xe0) == 0xe0) {
                     info |= (b7 & bit_AVX512F ? CPUINFO_AVX512F : 0);
-- 
2.43.0
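
The gating described in the commit message can be sketched as a pure function over already-fetched register values. This is a simplified illustration, not the code in util/cpuinfo-i386.c: the `SK_*` flag values and the function name are hypothetical, and the sketch assumes that whatever checks precede the hunk above (outside the visible context) have already passed. Only the `(bv & 6) == 6` test and the bit_GFNI position are taken from the patch.

```c
#include <stdint.h>

#define XCR0_SSE_AVX   0x6u        /* XCR0 bits 1 and 2: XMM and YMM state enabled */
#define LEAF7_EBX_AVX2 (1u << 5)   /* CPUID leaf 7, %ebx */
#define LEAF7_ECX_GFNI (1u << 8)   /* CPUID leaf 7, %ecx; same bit as bit_GFNI */

/* Hypothetical flag values, for this sketch only. */
#define SK_AVX1 (1u << 0)
#define SK_AVX2 (1u << 1)
#define SK_GFNI (1u << 2)

/* Report GFNI only when the OS has enabled AVX state. */
static unsigned sketch_detect(uint32_t xcr0, uint32_t b7, uint32_t c7)
{
    unsigned info = 0;

    if ((xcr0 & XCR0_SSE_AVX) == XCR0_SSE_AVX) {
        info |= SK_AVX1;
        info |= (b7 & LEAF7_EBX_AVX2) ? SK_AVX2 : 0;
        info |= (c7 & LEAF7_ECX_GFNI) ? SK_GFNI : 0;
    }
    return info;
}
```

A host that advertises GFNI in CPUID but has no AVX state enabled therefore reports no GFNI flag, which is the intended behavior given that only the VEX-encoded forms will be used.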




* [PATCH 2/2] target/sparc: Loosen decode of RDY for v7
  2025-09-05  7:50 [PULL 0/6] tcg patch queue Richard Henderson
                   ` (2 preceding siblings ...)
  2025-09-05  7:50 ` [PULL 2/6] cpuinfo/i386: Detect GFNI as an AVX extension Richard Henderson
@ 2025-09-05  7:50 ` Richard Henderson
  2025-09-05 12:36 ` [PULL 0/6] tcg patch queue Richard Henderson
  4 siblings, 0 replies; 10+ messages in thread
From: Richard Henderson @ 2025-09-05  7:50 UTC
  To: qemu-devel

Bits [18:0] are not decoded with v7, and for v8 unused values
of rs1 simply produce undefined results.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/sparc/translate.c  | 24 +++++++++++++-----------
 target/sparc/insns.decode | 12 ++++++++++--
 2 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index c2ffd965d8..69d5883dec 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -2865,22 +2865,24 @@ static bool do_rd_special(DisasContext *dc, bool priv, int rd,
     return advance_pc(dc);
 }
 
-static TCGv do_rdy(DisasContext *dc, TCGv dst)
+static TCGv do_rdy_1(DisasContext *dc, TCGv dst)
 {
     return cpu_y;
 }
 
-static bool trans_RDY(DisasContext *dc, arg_RDY *a)
+static bool do_rdy(DisasContext *dc, int rd)
 {
-    /*
-     * TODO: Need a feature bit for sparcv8.  In the meantime, treat all
-     * 32-bit cpus like sparcv7, which ignores the rs1 field.
-     * This matches after all other ASR, so Leon3 Asr17 is handled first.
-     */
-    if (avail_64(dc) && a->rs1 != 0) {
-        return false;
-    }
-    return do_rd_special(dc, true, a->rd, do_rdy);
+    return do_rd_special(dc, true, rd, do_rdy_1);
+}
+
+static bool trans_RDY_v7(DisasContext *dc, arg_RDY_v7 *a)
+{
+    return avail_32(dc) && do_rdy(dc, a->rd);
+}
+
+static bool trans_RDY_v9(DisasContext *dc, arg_RDY_v9 *a)
+{
+    return avail_64(dc) && do_rdy(dc, a->rd);
 }
 
 static TCGv do_rd_leon3_config(DisasContext *dc, TCGv dst)
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 1b1b85e9c2..74848996ae 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -91,6 +91,7 @@ CALL    01 i:s30
     STBAR_v9        10 00000 101000 01111 0 0000000000000
     MEMBAR          10 00000 101000 01111 1 000000 cmask:3 mmask:4
 
+    RDY_v9          10 rd:5  101000 00000 0 0000000000000
     RDCCR           10 rd:5  101000 00010 0 0000000000000
     RDASI           10 rd:5  101000 00011 0 0000000000000
     RDTICK          10 rd:5  101000 00100 0 0000000000000
@@ -118,8 +119,15 @@ CALL    01 i:s30
   # This confirms that bit 13 is ignored, as 0x8143c000 is STBAR.
   STBAR_v8          10 ----- 101000 01111 - -------------
 
-  # Before v8, all rs1 accepted; otherwise rs1==0.
-  RDY               10 rd:5  101000 rs1:5 0 0000000000000
+  # For v7, bits [18:0] are ignored.
+  # For v8, bits [18:14], aka rs1, are repurposed and rs1 = 0 is RDY,
+  # and other values are RDASR.  However, the v8 manual explicitly
+  # says that rs1 in 1..14 yield undefined results and do not cause
+  # an illegal instruction trap, and rs1 in 16..31 are available for
+  # implementation specific usage.
+  # Implement not causing an illegal instruction trap for v8 by
+  # continuing to interpret unused values per v7, i.e. as RDY.
+  RDY_v7            10 rd:5  101000 ----- - -------------
 }
 
 {
-- 
2.43.0
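
The difference between the two RDY patterns in the patch above can be shown with a small standalone sketch. The helper names and constants are hypothetical, computed by hand from the pattern lines `10 rd:5 101000 00000 0 0000000000000` (v9) and `10 rd:5 101000 ----- - -------------` (v7); the real matching is done by the generated decoder, where more specific patterns in the same group are expected to be tried before the catch-all.

```c
#include <stdbool.h>
#include <stdint.h>

/* RDY_v9: op and op3 fixed, bits [18:0] must be zero; only rd is free. */
static bool is_rdy_v9(uint32_t insn)
{
    return (insn & 0xC1FFFFFFu) == 0x81400000u;
}

/* RDY_v7: only op [31:30] and op3 [24:19] are fixed; bits [18:0] ignored. */
static bool is_rdy_v7(uint32_t insn)
{
    return (insn & 0xC1F80000u) == 0x81400000u;
}

/* Destination register field, bits [29:25]. */
static unsigned rdy_rd(uint32_t insn)
{
    return (insn >> 25) & 0x1f;
}
```

The word 0x83400000 (rd = 1, rs1 = 0) matches both patterns, whereas 0x83444000 (rs1 = 17) matches only the v7 one. The v7 mask also accepts the STBAR word 0x8143c000, which is why this sketch would only be correct as a last-resort match after the more specific patterns.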




* Re: [PULL 0/6] tcg patch queue
  2025-09-05  7:50 [PULL 0/6] tcg patch queue Richard Henderson
                   ` (3 preceding siblings ...)
  2025-09-05  7:50 ` [PATCH 2/2] target/sparc: Loosen decode of RDY for v7 Richard Henderson
@ 2025-09-05 12:36 ` Richard Henderson
  4 siblings, 0 replies; 10+ messages in thread
From: Richard Henderson @ 2025-09-05 12:36 UTC
  To: qemu-devel

On 9/5/25 09:50, Richard Henderson wrote:
> The following changes since commit baa79455fa92984ff0f4b9ae94bed66823177a27:
> 
>    Merge tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu into staging (2025-09-03 11:39:16 +0200)
> 
> are available in the Git repository at:
> 
>    https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20250905
> 
> for you to fetch changes up to cb2540979264c8d3984e26c5dd90a840e47ec5dd:
> 
>    tcg/i386: Use vgf2p8affineqb for MO_8 vector shifts (2025-09-04 09:49:30 +0200)
> 
> ----------------------------------------------------------------
> tcg/arm: Fix tgen_deposit
> tcg/i386: Use vgf2p8affineqb for MO_8 vector shifts


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/10.2 as appropriate.

r~



* [PULL 0/6] tcg patch queue
@ 2023-09-28 19:41 Richard Henderson
  2023-10-02 21:56 ` Stefan Hajnoczi
  2023-10-02 22:46 ` Michael Tokarev
  0 siblings, 2 replies; 10+ messages in thread
From: Richard Henderson @ 2023-09-28 19:41 UTC
  To: qemu-devel

Mini PR, aimed at fixing the mips and ovmf regressions.


r~

The following changes since commit 36e9aab3c569d4c9ad780473596e18479838d1aa:

  migration: Move return path cleanup to main migration thread (2023-09-27 13:58:02 -0400)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230928

for you to fetch changes up to 18a536f1f8d6222e562f59179e837fdfd8b92718:

  accel/tcg: Always require can_do_io (2023-09-28 10:08:13 -0700)

----------------------------------------------------------------
accel/tcg: Always require can_do_io, for #1866

----------------------------------------------------------------
Richard Henderson (6):
      accel/tcg: Avoid load of icount_decr if unused
      accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop
      accel/tcg: Track current value of can_do_io in the TB
      accel/tcg: Improve setting of can_do_io at start of TB
      accel/tcg: Always set CF_LAST_IO with CF_NOIRQ
      accel/tcg: Always require can_do_io

 include/exec/translator.h   |  2 ++
 accel/tcg/cpu-exec.c        |  2 +-
 accel/tcg/tb-maint.c        |  6 ++--
 accel/tcg/translator.c      | 72 +++++++++++++++++++++------------------------
 target/mips/tcg/translate.c |  1 -
 5 files changed, 41 insertions(+), 42 deletions(-)



* Re: [PULL 0/6] tcg patch queue
  2023-09-28 19:41 Richard Henderson
@ 2023-10-02 21:56 ` Stefan Hajnoczi
  2023-10-02 22:46 ` Michael Tokarev
  1 sibling, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2023-10-02 21:56 UTC
  To: Richard Henderson; +Cc: qemu-devel


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/8.2 for any user-visible changes.



* Re: [PULL 0/6] tcg patch queue
  2023-09-28 19:41 Richard Henderson
  2023-10-02 21:56 ` Stefan Hajnoczi
@ 2023-10-02 22:46 ` Michael Tokarev
  2023-10-03  0:18   ` Richard Henderson
  1 sibling, 1 reply; 10+ messages in thread
From: Michael Tokarev @ 2023-10-02 22:46 UTC
  To: Richard Henderson, qemu-devel

28.09.2023 22:41, Richard Henderson wrote:
> Mini PR, aimed at fixing the mips and ovmf regressions.
> r~
> ----------------------------------------------------------------
> accel/tcg: Always require can_do_io, for #1866
> 
> ----------------------------------------------------------------
> Richard Henderson (6):
>        accel/tcg: Avoid load of icount_decr if unused
>        accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop
>        accel/tcg: Track current value of can_do_io in the TB
>        accel/tcg: Improve setting of can_do_io at start of TB
>        accel/tcg: Always set CF_LAST_IO with CF_NOIRQ
>        accel/tcg: Always require can_do_io

What's the set required for the regression fix for -stable?
Is it the whole thing?
(yes, I tested the complete set in debian).

Thank you!

/mjt



* Re: [PULL 0/6] tcg patch queue
  2023-10-02 22:46 ` Michael Tokarev
@ 2023-10-03  0:18   ` Richard Henderson
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Henderson @ 2023-10-03  0:18 UTC
  To: Michael Tokarev, qemu-devel

On 10/2/23 15:46, Michael Tokarev wrote:
> 28.09.2023 22:41, Richard Henderson wrote
>> Mini PR, aimed at fixing the mips and ovmf regressions.
>> r~
>> ----------------------------------------------------------------
>> accel/tcg: Always require can_do_io, for #1866
>>
>> ----------------------------------------------------------------
>> Richard Henderson (6):
>>        accel/tcg: Avoid load of icount_decr if unused
>>        accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop
>>        accel/tcg: Track current value of can_do_io in the TB
>>        accel/tcg: Improve setting of can_do_io at start of TB
>>        accel/tcg: Always set CF_LAST_IO with CF_NOIRQ
>>        accel/tcg: Always require can_do_io
> 
> What's the set required for the regression fix for -stable ?
> Is it the whole thing?
> (yes, I tested the complete set in debian).

While it would be possible to take fewer to just fix the regression, it's probably best to 
take the whole set.


r~





