[Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen)

* [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen)
@ 2026-05-22 21:48 Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 100/149] linux-user: Flush errors by using exit() instead of _exit() in error path Michael Tokarev
                   ` (50 more replies)
  0 siblings, 51 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Michael Tokarev

The following patches are queued for QEMU stable v10.2.3:

  https://gitlab.com/qemu-project/qemu/-/commits/staging-10.2

Patch freeze is 2026-05-22, and the release is planned for 2026-05-24:

  https://wiki.qemu.org/Planning/10.2

Please respond here or CC qemu-stable@nongnu.org on any additional patches
you think should (or shouldn't) be included in the release.

The changes staged for inclusion, with their original commit hashes
from the master branch, are listed below the separator line.

Thanks!

/mjt

--------------------------------------
01* b83a42dc779a Peter Maydell:
   hw/net/rtl8139: Work around GCC sanitizer / -Wstringop-overflow bug
02* 2ff529c6f64b Razvan Ghiorghe:
   linux-user: Fix zero_bss for RX PT_LOAD segments
03* 5e5b278d2b1b Razvan Ghiorghe:
   linux-user: fix mremap with old_size=0 for shared mappings
04* 37c9f6fce5c5 Peter Maydell:
   hw/dma/pl080: Handle bogus swidth and dwidth in transfers
05* b6e61d1cc3bf Tao Ding:
   hw/dma/pl080: Update interrupts after pl080_run()
06* f9b16f791502 Tao Ding:
   hw/dma/pl080: Ignore bottom 2 bits of LLI register
07* 2741d2cc3903 Sergei Heifetz:
   target/i386: fix NULL pointer dereference in legacy-cache=off handling
08* 48221e371686 Pierrick Bouvier:
   contrib/plugins/uftrace.c: fix depth for exit events
09* 9c8430f5d651 Alberto Garcia:
   throttle-group: Fix race condition in throttle_group_restart_queue()
10* 9ac85f4cc799 Fiona Ebner:
   block/mirror: fix assertion failure upon duplicate complete for job using 
   'replaces'
11* a16d4c2f162a Shivang Upadhyay:
   ppc/pnv: fix dumpdtb option
12* ba48bff09fa1 Shivang Upadhyay:
   ppc/pnv: generate dtb after machine initialization is complete
13* c20f143cc9fb Fabiano Rosas:
   io: Fix TLS bye task leak
14* 6f23dde620ef Fiona Ebner:
   ui/vdagent: add migration blocker when machine version < 10.1
15* c035d5eadf40 Marc-André Lureau:
   virtio-gpu: fix overflow check when allocating 2d image
16* 556817773849 Max Chou:
   target/riscv: rvv: Fix missing flags merge in probe_pages for cross-page 
   accesses
17* 0e8ad6a8460f Max Chou:
   target/riscv: rvv: Fix page probe issues in vext_ldff
18* 6257754bb9b0 Paolo Bonzini:
   rust: suggest passing --locked to "cargo install"
19* 129922c2bc39 Jenny Guanni Qu:
   hw/usb/hcd-ohci: check for MPS=0 to avoid infinite loop
20* bc72b2996c0b Davidlohr Bueso:
   hw/cxl: Respect Media Operation max ops discovery semantics
21* 20beec283b95 Davidlohr Bueso:
   hw/cxl: Exclude Discovery from Media Operation Discovery output
22* fa4a759fc1e1 Cédric Le Goater:
   hw/net/ftgmac100: Improve DMA error handling
23* 80c5be945877 Cédric Le Goater:
   hw/ssi/aspeed_smc: Convert mem ops to read/write_with_attrs for error 
   handling
24* 32ebd6c09c18 Jose Martins:
   target/arm: fix s2prot not set for two-stage PMSA translations
25* 0376e9c2dd1f Peter Maydell:
   linux-user/i386/signal.c: Correct definition of target_fpstate_32
26* 5a2fa06b0957 Tao Ding:
   hw/dma/pl080: Fix transfer logic in PL080
27* cc03b62df47a Hanna Czenczek:
   linux-aio: Put all parameters into qemu_laiocb
28* 7eca3d4883be Hanna Czenczek:
   linux-aio: Resubmit tails of short reads/writes
29* 51fc8443c122 GuoHan Zhao:
   block/curl: free s->password in cleanup paths
30* f093ee7ac3af Paolo Bonzini:
   tdx: fix use-after-free in tdx_fetch_cpuid
31* cb1e8c18df62 Jenny Guanni Qu:
   hw/audio/sb16: validate VMState fields in post_load
32* 539421a428fd Richard Henderson:
   tcg: Pass host-endian values to plugin_gen_mem_callbacks_*
33* 55720ba97d21 Pankaj Raghav:
   hw/nvme: re-enable wzds bit in namespace dlfeat
34* eb5cc99aff17 Kaixuan Li:
   hw/nvme: fix heap-buffer-overflow in nvme_abort
35* b5abb655fab6 Peter Maydell:
   scripts/qemu-guest-agent/fsfreeze-hook: Avoid bash-isms
36* 65b9f4791c24 Peter Maydell:
   scripts/qemu-guest-agent/fsfreeze-hook: Avoid use of PIPESTATUS
37* 08497afcb2a7 Peter Maydell:
   scripts/qemu-guest-agent/fsfreeze-hook: Fix syslog-fallback logic
38* 4862d2c95104 Paolo Bonzini:
   lsi53c895a: keep a reference to the device while SCRIPTS execute
39* 64807c84e83f Paolo Bonzini:
   lsi53c895a: do not do anything else if a reset is requested by writing 
   ISTAT0
40* 1ca38f84e194 Paolo Bonzini:
   lsi53c895a: keep lsi_request and SCSIRequest in local variables
41* 7c7aaaa342b5 Paolo Bonzini:
   lsi53c895a: keep lsi_request alive as long as the SCSIRequest
42* d459131ff590 Paolo Bonzini:
   lsi53c895a: keep SCSIRequest alive during DMA
43* 31b8d287b7fe Zenghui Yu:
   target/arm: Don't skip access flag fault for AccessType_AT
44* a0721c099b71 Peter Maydell:
   hw/net/rocker: Avoid double-free of l2_flood.group_ids
45* 3cae0b46be54 Marc-André Lureau:
   ui/vnc-jobs: fix VncRectEntry leak on job cleanup
46* 59c1d3113668 Kevin Wolf:
   ide: Fix potential assertion failure on VM stop for PIO read error
47* ccc613f96c66 Kevin Wolf:
   scsi: Don't consider LOGICAL UNIT NOT SUPPORTED guest recoverable
48* fc1a2ec7da53 hongmianquan:
   monitor: Fix deadlock in monitor_cleanup
49* 17fbf3e18c3d Daniel P. Berrangé:
   util: fix missing aio_wait sym in qemu guest agent only build
50* 813dbe869f4f Richard Henderson:
   accel/tcg: Don't pass NULL to get_page_addr_code_hostp
51* 0039e5fd2234 Richard Henderson:
   accel/tcg: Fix uninitialized hostp in get_page_addr_code_hostp
52* ad7a005d672a Peter Maydell:
   include: Don't include guest-host.h in cpu-ldst.h
53* 8330da591ef6 Peter Maydell:
   include/user/guest-host.h: Provide g2h etc for both abi_ptr and vaddr
54* 22966937f413 Clayton Craft:
   linux-user: fix name_to_handle_at when AT_HANDLE_MNT_ID_UNIQUE flag is set
55* 9b7d64686b82 Sun Haoyu:
   linux-user: update select timeout writeback
56* fa6dfcc373c2 Sun Haoyu:
   linux-user: Make openat2() use -L for absolute paths
57* 7e966ef38f58 Nicholas Piggin:
   bsd-user, linux-user: signal: recursive signal delivery fix
58* 84771c64a5ae Peter Maydell:
   target/arm: do_ats_write(): avoid assertion when ptw failed
59* 566594f10873 Alex Bennée:
   target/arm: fix fault_s1ns for stage 2 faults
60* 4e4832dd72db Nguyen Dinh Phi:
   util/readline: Fix out-of-bounds access in readline_insert_char().
61* 34f66fdfd285 Paolo Bonzini:
   rust: hide panicking default associated constants from rustdoc
62* 799713029354 Paolo Bonzini:
   virtio-scsi: pass the same cdb_size to virtio_scsi_pop_req and 
   virtio_scsi_handle_cmd_req_prepare
63* af74c9e46bb5 Gerd Hoffmann:
   hw/uefi: fix heap overflow (CVE-2026-5744)
64* 4e6fb62fb0f3 Dietmar Maurer:
   qemu-keymap: fix altgr modifier lookup for newer xkeyboard-config
65* 4913ae36f979 Stefan Hajnoczi:
   virtio-blk: fix zone report buffer out-of-memory (CVE-2026-5761)
66* f1b1db98cc3b Bernhard Beschow:
   util/cutils: Fix heap corruption under Windows
67* 7437b3eab6af Werner de Carne:
   serial COM: windows serial COM PollingFunc don't sleep
68* 52cf667ed228 GuoHan Zhao:
   ui/spice-app: detect runtime directory creation failures
69* 181fdf8a7e13 Marc-André Lureau:
   ui/console-vc: fix off-by-one in CSI J 2 (clear entire screen)
70* 027ad866bd29 Pierrick Bouvier:
   target/arm/tcg/translate.c: remove MO_TE usage
71* 87e1226e6f68 Marc-André Lureau:
   target/i386: fix strList leak in x86_cpu_get_unavailable_features
72* 3eae91a8b93a Simon Scherer:
   target/i386: fix missing PF_INSTR in SIGSEGV context
73* 76ad26dd172d Paolo Bonzini:
   target/i386/tcg: fix decoding of MOVBE and CRC32 in 16-bit mode
74* 79bc17718677 Stepan Popov:
   meson: add missing semicolon in pthread_condattr_setclock test
75* 30fad722ce68 Alex Bennée:
   hw/display: don't accidentally autofree existing virgl resources
76* d41ce10d0f5a Vladimir Sementsov-Ogievskiy:
   migration: vmstate_save_state_v: fix double error_setg
77* c0306d2b8f45 Thomas Huth:
   hw/misc: Fix the valid access size to the avr-power device
78* 3ab47a47d716 Thomas Huth:
   hw/sh4/sh7750: Remove forgotten abort() in the MM_ITLB_DATA handler
79* 654dce6c5236 Matt Turner:
   linux-user/ppc: Fix ppc64 rt_sigframe stack offset
80* 029f10e85278 Yixin Wei:
   linux-user: fix off-by-one in host_to_target_for_each_rtattr()
81* 93484c768f2b Gyorgy Tamasi:
   linux-user: Don't define target_stat64 struct for loongarch64
82* c8ea1759009a Richard Henderson:
   linux-user/arm/nwfpe: Replace user_registers with current_cpu
83* 784f1dde90df Richard Henderson:
   linux-user/arm/nwfpe: Use thread-local storage for qemufpa
84* 1730e6f33f97 Alistair Francis:
   linux-user/strace: Use pointer type for read and write values
85* 4c681ba3b82d James Hilliard:
   linux-user/mips: sync k0 TLS for EF_MIPS_MACH_OCTEON userlands
86* 8b60ed835478 Helge Deller:
   linux-user: Define SO_TIMESTAMP*_NEW and SO_RCVTIMEO_NEW
87* edb4588309a7 Helge Deller:
   linux-user: Add setsockopt() for SO_RCVTIMEO_NEW and SO_SNDTIMEO_NEW
88* 07c7decaa54a Helge Deller:
   linux-user: Add getsockopt() for SO_RCVTIMEO_NEW and SO_SNDTIMEO_NEW
89* b03a6ac6fa5d Helge Deller:
   linux-user: Fix CLONE_PARENT_SETTID when using fork-like clone
90* e2af3eadc09b Helge Deller:
   linux-user: Use abi_int for imr_ifindex in ip_mreqn struct
91* 9e7734ead149 Helge Deller:
   linux-user: Flush errors by using exit() instead of _exit() in error path
92* 4cb2f91773e8 Yicong Yang:
   hw/riscv/virt-acpi-build.c: Use kvm timer frequency when kvm enabled
93* b2e874bfec59 Sebastián Alba Vives:
   target/riscv: fix stale ptshift and base on page walk restart
94* d5b33fc180f5 Sebastián Alba Vives:
   hw/intc: fix heap OOB in ACLINT MTIMER multi-socket
95* 14808578ccbc Munkhbaatar Enkhbaatar:
   riscv_htif: reject invalid signature ranges (end <= begin)
96* d107b748072c Alistair Francis:
   target/riscv: Generate access fault if sc comparison fails
97* 175afdb0d155 Alistair Francis:
   target/riscv: Don't OR mip.SEIP when mvien is one
98* 5dcc64828dc7 Alistair Francis:
   target/riscv: Use ELEN for Fractional LMUL check
99* dcb6e96257ee Helge Deller:
   linux-user: Add missing CDROM ioctls
100 9fb681792d65 Helge Deller:
   linux-user: Flush errors by using exit() instead of _exit() in error path
101 08dc3e240fc0 Helge Deller:
   linux-user: Allow getsockopt() with NULL optval address
102 9667bf324925 Helge Deller:
   linux-user: Translate errno in IP_RECVERR and IPV6_RECVERR
103 1aee8067fce9 kiki:
   hw/intc/xics: Add a check for an invalid server id
104 7a05be8c70bb Cédric Le Goater:
   tests/rcutorture: Fix build error
105 774e6f5c1533 Vivien LEGER:
   hw/ppc/e500: fix bus-frequency property hardcoded to zero in CPU FDT node
106 a7f27d6903b3 宋文武:
   hw/net/allwinner-sun8i-emac: Flush queued packets when rx is enabled
107 f35f0f1ca121 liugan1:
   hw/intc/arm_gicv3: Fix NS write to ICC_AP1Rn_EL1 when prebits < 7
108 455a6167f254 Peter Xu:
   migration: Fix low possibility downtime violation
109 41c417290df9 Philippe Mathieu-Daudé:
   target/microblaze: Fix endianness used to disassemble
110 f443b6876362 Peter Maydell:
   target/arm: Report IL=0 for Thumb 16-bit BKPT insn
111 18b664c90085 Peter Maydell:
   hw/misc/bcm2835_rng: Specify valid memory access sizes
112 f252769a23e6 Gerd Hoffmann:
   hw/uefi: fix buffer overruns
113 94d9a8b2c9e6 Gerd Hoffmann:
   hw/uefi: verify pio_xfer_offset before calculating buffer checksum
114 5247b3034c23 Gerd Hoffmann:
   hw/uefi: fix ucs2 string helper functions
115 c45b460d16f9 Gerd Hoffmann:
   hw/uefi: add name_size check to uefi_vars_mm_lock_variable()
116 22b7b222d8f5 Gerd Hoffmann:
   hw/uefi: verify data size before accessing it in wrap_pkcs7
117 b4680c02b8e8 Gerd Hoffmann:
   hw/uefi: avoid possibly unaligned variable_auth_2 struct field access
118 b33fd8ab1caa Gerd Hoffmann:
   hw/uefi: check auth.hdr_length minimum size
119 332ea2978780 Jeuk Kim:
   hw/ufs: Validate MCQ SQ references before use
120 283d921e771e Jeuk Kim:
   hw/ufs: Guard MCQ CQ accesses against missing queues
121 4a909c00b9e1 Jeuk Kim:
   hw/ufs: Reject zero-depth MCQ queues
122 619c2da19a05 Jeuk Kim:
   hw/ufs: Keep MCQ SQs alive while requests are outstanding
123 042dbcff8382 Jeuk Kim:
   hw/ufs: Zero reserved bytes in REPORT LUNS response header
124 aefeecb413a8 Peter Maydell:
   hw/display/cirrus_vga: Fix packed-24 color-expansion transparent pattern 
   fills
125 27d14251b904 Peter Maydell:
   hw/display/cirrus_vga: Fix packed-24 color-expansion transparent copies
126 ff36712da5ae Kane Chen:
   hw/misc/aspeed_sbc: Add bounds checking for OTP write operations
127 534a52755bef Cédric Le Goater:
   aspeed/hace: Fix out-of-bounds read in has_padding()
128 c6aa2d0ac161 Cédric Le Goater:
   aspeed/hace: Prevent total_req_len overflow
129 a824f3531a44 Peter Maydell:
   hw/i2c/microbit_i2c: Don't index off end of twi_read_sequence[]
130 a163fc1f864b Peter Maydell:
   meson.build: Add -fzero-init-padding-bits=all
131 039b057c09c6 Peter Maydell:
   tests/functional/qemu_test/asset.py: Don't use setxattr when it doesn't 
   exist
132 2293d8b4bd88 Klaus Jensen:
   hw/nvme: fix admin cq msix setup
133 6b5aef7cac9d Helge Deller:
   linux-user: Fix AT_EXECFN in AUXV for symlinked programs
134 c3176e645774 Matt Turner:
   linux-user/sh4: Fix target_ucontext tuc_link field type
135 9ac5aa722721 Matt Turner:
   linux-user/sh4: Fix setup_sigtramp to match Linux kernel trampoline 
   pattern
136 d5e4090177ad Kevin Wolf:
   blkdebug: Add 'delay-ns' option
137 34a67637767d Kevin Wolf:
   block: Add blk_co_start/end_request() and BDRV_REQ_NO_QUEUE
138 53074ba0330a Kevin Wolf:
   block: Add flags parameter to blk_*_pdiscard()
139 095c08a7ba68 Kevin Wolf:
   ide: Minimal fix for deadlock between TRIM and drain
140 c1c71a7e167f Kevin Wolf:
   ide: Clean up ide_trim_co_entry() to be idiomatic coroutine code
141 92854c9c7539 Kevin Wolf:
   ide-test: Factor out wait_dma_completion()
142 2fa24e975599 Kevin Wolf:
   ide-test: Test reset during TRIM
143 a1310cc6281d Kevin Wolf:
   block: Create DEFAULT_BLOCK_CONF macro
144 f27aea189633 Kevin Wolf:
   block: Add more defaults to DEFAULT_BLOCK_CONF
145 f0d9ccd46cf8 Kevin Wolf:
   commit: Drain nodes across all of bdrv_commit()
146 7f8466e2ce62 Kevin Wolf:
   qemu-io: Add 'aio_discard' command
147 b8bfb1478d61 Kevin Wolf:
   qcow2: Fix corruption on discard during write with COW
148 389f5bcc744d Kevin Wolf:
   iotests/046: Test that discard/write_zeroes wait for dependencies
149 e3082ab3b385 Denis V. Lunev:
   block/graph-lock: fix missed wakeup in bdrv_graph_co_rdunlock()

(commit(s) marked with * were in previous series and are not resent)


* [Stable-10.2.3 100/149] linux-user: Flush errors by using exit() instead of _exit() in error path
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 101/149] linux-user: Allow getsockopt() with NULL optval address Michael Tokarev
                   ` (49 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Helge Deller, Warner Losh, Michael Tokarev

From: Helge Deller <deller@gmx.de>

Similar to the previous patch: ensure that we always flush I/O by using
exit() instead of _exit().

Reported-by: Tobias Bergkvist <tobias@bergkv.ist>
Reviewed-by: Warner Losh <imp@bsdimp.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/2544
Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit 9fb681792d65fa570cb3e1a769945c10bf276d25)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/linux-user/main.c b/linux-user/main.c
index 84e110dfe9..86d04cca3c 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -975,7 +975,7 @@ int main(int argc, char **argv, char **envp)
                       info, &bprm);
     if (ret != 0) {
         printf("Error while loading %s: %s\n", exec_path, strerror(-ret));
-        _exit(EXIT_FAILURE);
+        exit(EXIT_FAILURE);
     }
 
     for (wrk = target_environ; *wrk; wrk++) {
-- 
2.47.3
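
The flush behaviour this patch relies on can be reproduced outside QEMU.
The following is a minimal sketch, assuming a POSIX host; the helper name
bytes_seen_after_child_exit() is hypothetical and not part of QEMU. A child
process writes a message through stdio into a pipe and then terminates with
either exit() or _exit(); the parent counts how many bytes actually arrive.
The message deliberately has no trailing newline, so it stays in the stdio
buffer whether the stream is line-buffered or fully buffered.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of QEMU): fork a child that printf()s `msg`
 * into a pipe and then ends with exit() (use_exit != 0) or _exit()
 * (use_exit == 0). Returns how many bytes reached the parent, or -1 if the
 * pipe or fork could not be set up.
 */
long bytes_seen_after_child_exit(const char *msg, int use_exit)
{
    int fds[2];

    assert(msg != NULL);
    if (pipe(fds) != 0) {
        return -1;
    }
    fflush(NULL);               /* keep pending parent output out of the child */

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) < 0) {
            _exit(2);
        }
        close(fds[1]);
        printf("%s", msg);      /* no newline: stays in the stdio buffer */
        if (use_exit) {
            exit(EXIT_FAILURE); /* flushes stdio buffers, then terminates */
        }
        _exit(EXIT_FAILURE);    /* terminates at once; the buffer is lost */
    }

    close(fds[1]);
    char buf[256];
    long total = 0;
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof(buf))) > 0) {
        total += n;
    }
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

Calling the helper with use_exit set to 1 should return the full message
length, while use_exit set to 0 should return 0, which is the lost error
message the patch addresses.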



* [Stable-10.2.3 101/149] linux-user: Allow getsockopt() with NULL optval address
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 100/149] linux-user: Flush errors by using exit() instead of _exit() in error path Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 102/149] linux-user: Translate errno in IP_RECVERR and IPV6_RECVERR Michael Tokarev
                   ` (48 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Helge Deller, Pierrick Bouvier, Michael Tokarev

From: Helge Deller <deller@gmx.de>

Some programs test the availability of socket options by asking for the
value with a NULL optval address, which currently always triggers an
EFAULT in QEMU.  Fix it by allowing a NULL address, in the same manner
as the Linux kernel on physical machines.

Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/2390
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@oss.qualcomm.com>
(cherry picked from commit 08dc3e240fc00213c0eb29b71569dc0ca9301337)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 47270eb15e..8934aa9514 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -2647,6 +2647,10 @@ get_timeout:
             if (ret < 0) {
                 return ret;
             }
+            /* special case: destination address is NULL, return 0 */
+            if (!optval_addr) {
+                len = 0;
+            }
             if (len == sizeof(struct target__kernel_sock_timeval)) {
                 if (copy_to_user_timeval64(optval_addr, &tv)) {
                     return -TARGET_EFAULT;
@@ -2847,7 +2851,10 @@ get_timeout:
         }
         if (len > lv)
             len = lv;
-        if (len == 4) {
+        if (!optval_addr) {
+            /* writing to NULL does not give error */
+            len = 0;
+        } else if (len == 4) {
             if (put_user_u32(val, optval_addr))
                 return -TARGET_EFAULT;
         } else {
@@ -2880,18 +2887,24 @@ get_timeout:
                 return -TARGET_EINVAL;
             lv = sizeof(lv);
             ret = get_errno(getsockopt(sockfd, level, optname, &val, &lv));
+write_ret:
             if (ret < 0)
                 return ret;
-            if (len < sizeof(int) && len > 0 && val >= 0 && val < 255) {
+            if (!optval_addr) {
+                len = 0;
+            } else if (len < sizeof(int) && len > 0 && val >= 0 && val < 255) {
                 len = 1;
-                if (put_user_u32(len, optlen)
-                    || put_user_u8(val, optval_addr))
+                if (put_user_u8(val, optval_addr)) {
                     return -TARGET_EFAULT;
+                }
             } else {
                 if (len > sizeof(int))
                     len = sizeof(int);
-                if (put_user_u32(len, optlen)
-                    || put_user_u32(val, optval_addr))
+                if (put_user_u32(val, optval_addr)) {
+                    return -TARGET_EFAULT;
+                }
+            }
+            if (put_user_u32(len, optlen)) {
                     return -TARGET_EFAULT;
             }
             break;
@@ -2942,20 +2955,7 @@ get_timeout:
                 return -TARGET_EINVAL;
             lv = sizeof(lv);
             ret = get_errno(getsockopt(sockfd, level, optname, &val, &lv));
-            if (ret < 0)
-                return ret;
-            if (len < sizeof(int) && len > 0 && val >= 0 && val < 255) {
-                len = 1;
-                if (put_user_u32(len, optlen)
-                    || put_user_u8(val, optval_addr))
-                    return -TARGET_EFAULT;
-            } else {
-                if (len > sizeof(int))
-                    len = sizeof(int);
-                if (put_user_u32(len, optlen)
-                    || put_user_u32(val, optval_addr))
-                    return -TARGET_EFAULT;
-            }
+            goto write_ret;
             break;
         default:
             ret = -TARGET_ENOPROTOOPT;
@@ -2989,8 +2989,14 @@ get_timeout:
             if (ret < 0) {
                 return ret;
             }
-            if (put_user_u32(lv, optlen)
-                || put_user_u32(val, optval_addr)) {
+            if (optval_addr) {
+                if (put_user_u32(val, optval_addr)) {
+                    return -TARGET_EFAULT;
+                }
+            } else {
+                lv = 0;
+            }
+            if (put_user_u32(lv, optlen)) {
                 return -TARGET_EFAULT;
             }
             break;
-- 
2.47.3
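
The consolidated write-back step that this patch introduces at the
write_ret label can be sketched on plain host pointers. This is only an
illustration: write_back_int_option() is a hypothetical name, and memcpy()
stands in for QEMU's put_user_u8()/put_user_u32(). Unlike the patch, which
stores a full 32-bit value in the last branch, the sketch copies at most
`len` bytes so that it stays memory-safe on small host buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the write-back step of an integer socket option.
 * `optval` may be NULL; `*optlen` holds the caller's buffer length on entry
 * and the number of bytes reported back on return.
 */
void write_back_int_option(int val, void *optval, uint32_t *optlen)
{
    uint32_t len;

    assert(optlen != NULL);
    len = *optlen;

    if (optval == NULL) {
        /* Nothing to store: report a zero length instead of failing. */
        len = 0;
    } else if (len < sizeof(int) && len > 0 && val >= 0 && val < 255) {
        /* Small buffer and small value: return a single byte. */
        uint8_t b = (uint8_t)val;
        len = 1;
        memcpy(optval, &b, 1);
    } else {
        if (len > sizeof(int)) {
            len = sizeof(int);
        }
        /* Simplification: copy only `len` bytes of the value. */
        memcpy(optval, &val, len);
    }
    *optlen = len;
}
```

With a NULL optval the function leaves memory untouched and reports a length
of 0; with a valid buffer it behaves like the pre-existing code path.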



* [Stable-10.2.3 102/149] linux-user: Translate errno in IP_RECVERR and IPV6_RECVERR
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 100/149] linux-user: Flush errors by using exit() instead of _exit() in error path Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 101/149] linux-user: Allow getsockopt() with NULL optval address Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 103/149] hw/intc/xics: Add a check for an invalid server id Michael Tokarev
                   ` (47 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Helge Deller, Michael Tokarev

From: Helge Deller <deller@gmx.de>

Translate host error codes of IP_RECVERR and IPV6_RECVERR control messages to
target error codes before returning to the caller.
This matters, for example, on architectures (e.g. hppa, alpha, sparc, mips)
where the value of ECONNREFUSED differs from the value on an x86_64 host.

Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/602
Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit 9667bf3249256788245c6ca07bc12106f3e4fa22)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 8934aa9514..bb818f35d9 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -2011,7 +2011,8 @@ static inline abi_long host_to_target_cmsg(struct target_msghdr *target_msgh,
                     tgt_len != sizeof(struct errhdr_t)) {
                     goto unimplemented;
                 }
-                __put_user(errh->ee.ee_errno, &target_errh->ee.ee_errno);
+                __put_user(host_to_target_errno(errh->ee.ee_errno),
+                           &target_errh->ee.ee_errno);
                 __put_user(errh->ee.ee_origin, &target_errh->ee.ee_origin);
                 __put_user(errh->ee.ee_type,  &target_errh->ee.ee_type);
                 __put_user(errh->ee.ee_code, &target_errh->ee.ee_code);
@@ -2065,7 +2066,8 @@ static inline abi_long host_to_target_cmsg(struct target_msghdr *target_msgh,
                     tgt_len != sizeof(struct errhdr6_t)) {
                     goto unimplemented;
                 }
-                __put_user(errh->ee.ee_errno, &target_errh->ee.ee_errno);
+                __put_user(host_to_target_errno(errh->ee.ee_errno),
+                           &target_errh->ee.ee_errno);
                 __put_user(errh->ee.ee_origin, &target_errh->ee.ee_origin);
                 __put_user(errh->ee.ee_type,  &target_errh->ee.ee_type);
                 __put_user(errh->ee.ee_code, &target_errh->ee.ee_code);
-- 
2.47.3
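
Why the translation matters can be shown with a tiny lookup table. This is a
sketch only: host_to_target_errno_sketch() is a hypothetical name and is not
QEMU's host_to_target_errno(), and the numeric values are the ones I recall
from the Linux uapi errno headers for an x86_64 host and a MIPS target, so
treat them as assumptions to be checked against the headers.

```c
#include <assert.h>
#include <stddef.h>

struct errno_pair {
    int host;    /* value on the (assumed) x86_64 host */
    int target;  /* value on the (assumed) MIPS target */
};

/* Assumed values; only codes that differ between the two ABIs are listed. */
static const struct errno_pair x86_64_to_mips[] = {
    { 110, 145 },  /* ETIMEDOUT */
    { 111, 146 },  /* ECONNREFUSED */
};

/* Hypothetical illustration of a host-to-target errno translation. */
int host_to_target_errno_sketch(int host_err)
{
    size_t i;

    assert(host_err >= 0);
    for (i = 0; i < sizeof(x86_64_to_mips) / sizeof(x86_64_to_mips[0]); i++) {
        if (x86_64_to_mips[i].host == host_err) {
            return x86_64_to_mips[i].target;
        }
    }
    /* Codes not in the table, such as ENOENT (2), are passed through. */
    return host_err;
}
```

Without such a mapping, a guest reading ee_errno from the control message
would see the host's number and could interpret it as a different error.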



* [Stable-10.2.3 103/149] hw/intc/xics: Add a check for an invalid server id
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (2 preceding siblings ...)
  2026-05-22 21:48 ` [Stable-10.2.3 102/149] linux-user: Translate errno in IP_RECVERR and IPV6_RECVERR Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 104/149] tests/rcutorture: Fix build error Michael Tokarev
                   ` (46 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, kiki, Zexiang Zhang, Gautam Menghani,
	Philippe Mathieu-Daudé, Harsh Prateek Bora, Michael Tokarev

From: kiki <Chan9Yan9@gmail.com>

A malformed IVE value can result in an invalid server field being
passed to icp_irq(). The function assumes the server id is valid and
may access invalid state otherwise, potentially leading to a crash.

Fix this by validating the server id before using it and ignoring
invalid values.

Reported-by: Zexiang Zhang <chan9yan9@gmail.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/3324
Signed-off-by: Zexiang Zhang <chan9yan9@gmail.com>
Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/qemu-devel/20260428103645.50617-1-Gautam.Menghani@ibm.com
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
(cherry picked from commit 1aee8067fce95d15061eca8fbb6772d8a90ea699)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 200710eb6c..c7312d166f 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -26,6 +26,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/log.h"
 #include "qapi/error.h"
 #include "trace.h"
 #include "qemu/timer.h"
@@ -222,6 +223,13 @@ void icp_irq(ICSState *ics, int server, int nr, uint8_t priority)
 
     trace_xics_icp_irq(server, nr, priority);
 
+    if (!icp) {
+        qemu_log_mask(LOG_GUEST_ERROR, "XICS: invalid server %d for IRQ 0x%x\n",
+                      server, nr);
+        ics_reject(ics, nr);
+        return;
+    }
+
     if ((priority >= CPPR(icp))
         || (XISR(icp) && (icp->pending_priority <= priority))) {
         ics_reject(ics, nr);
-- 
2.47.3
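
The validate-before-use pattern of this fix can be sketched in isolation.
All names here (Presenter, lookup_presenter, deliver_irq, NR_SERVERS) are
hypothetical and only loosely modelled on the patch; the real code works on
ICPState and calls ics_reject().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_SERVERS 4

/* Hypothetical, much reduced stand-in for a presentation controller. */
typedef struct {
    uint8_t cppr;  /* current processor priority */
} Presenter;

Presenter presenters[NR_SERVERS] = {
    { 0xff }, { 0xff }, { 0xff }, { 0xff },
};

/* Returns NULL for a server id that does not name a presenter. */
static Presenter *lookup_presenter(int server)
{
    if (server < 0 || server >= NR_SERVERS) {
        return NULL;
    }
    return &presenters[server];
}

/* Returns 1 if the interrupt would be delivered, 0 if it is rejected. */
int deliver_irq(int server, uint8_t priority)
{
    Presenter *p = lookup_presenter(server);

    if (p == NULL) {
        /* Invalid server id from the guest: reject instead of crashing. */
        return 0;
    }
    if (priority >= p->cppr) {
        return 0;
    }
    return 1;
}
```

The important part is that the NULL check happens before any field of the
presenter is read, so a malformed server id only causes a rejection.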



* [Stable-10.2.3 104/149] tests/rcutorture: Fix build error
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (3 preceding siblings ...)
  2026-05-22 21:48 ` [Stable-10.2.3 103/149] hw/intc/xics: Add a check for an invalid server id Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 105/149] hw/ppc/e500: fix bus-frequency property hardcoded to zero in CPU FDT node Michael Tokarev
                   ` (45 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Cédric Le Goater, Richard Henderson,
	Michael Tokarev

From: Cédric Le Goater <clg@redhat.com>

Newer GCC (version 16.0.0 20260103 (Red Hat 16.0.0-0) (GCC))
reports a set-but-unused variable as an error:

  ../tests/unit/rcutorture.c: In function ‘rcu_read_stress_test’:
  ../tests/unit/rcutorture.c:251:18: error: variable ‘garbage’ set but not used [-Werror=unused-but-set-variable=]
    251 |     volatile int garbage = 0;
        |                  ^~~~~~~

Since the 'garbage' variable is used to generate memory reads from the
CPU while holding the RCU lock, it cannot be removed. Tag it with
__attribute__((unused)) instead to silence the compiler warnings/errors.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/qemu-devel/20260112163350.1251114-1-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit 7a05be8c70bb789c23076b1ca2563ed7d87c6fb8)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/tests/unit/rcutorture.c b/tests/unit/rcutorture.c
index 7662081683..2f19d479a3 100644
--- a/tests/unit/rcutorture.c
+++ b/tests/unit/rcutorture.c
@@ -248,7 +248,7 @@ static void *rcu_read_stress_test(void *arg)
     int pc;
     long long n_reads_local = 0;
     long long rcu_stress_local[RCU_STRESS_PIPE_LEN + 1] = { 0 };
-    volatile int garbage = 0;
+    volatile int garbage __attribute__ ((unused)) = 0;
 
     rcu_register_thread();
 
-- 
2.47.3
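
The same idiom can be tried in a standalone function. This is a minimal
sketch, assuming GCC or Clang (the attribute is a compiler extension);
stress_reads() is a hypothetical name and not part of rcutorture.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: read every element of `data` into a volatile sink so
 * the loads are not optimized away, and return the number of reads.
 */
long stress_reads(const int *data, size_t n)
{
    /* Written but never read back: mark it unused to keep -Werror quiet. */
    volatile int garbage __attribute__ ((unused)) = 0;
    long reads = 0;
    size_t i;

    assert(data != NULL || n == 0);
    for (i = 0; i < n; i++) {
        garbage = data[i];  /* volatile store forces the load of data[i] */
        reads++;
    }
    return reads;
}
```

The variable stays in place, so the generated memory traffic is unchanged;
only the diagnostic is suppressed.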



* [Stable-10.2.3 105/149] hw/ppc/e500: fix bus-frequency property hardcoded to zero in CPU FDT node
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (4 preceding siblings ...)
  2026-05-22 21:48 ` [Stable-10.2.3 104/149] tests/rcutorture: Fix build error Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 106/149] hw/net/allwinner-sun8i-emac: Flush queued packets when rx is enabled Michael Tokarev
                   ` (44 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Vivien LEGER, Bernhard Beschow,
	Philippe Mathieu-Daudé, Michael Tokarev

From: Vivien LEGER <vivien.leger@gmail.com>

The bus-frequency property in the CPU FDT node was hardcoded to 0.
This is incorrect - it should reflect the actual platform bus clock
frequency, as firmware and RTOSes use it to derive peripheral clock
rates.

Notably, the RTEMS QorIQ BSP uses bus-frequency to program the MPIC
global timer interval. With bus-frequency=0, the timer interval
overflows to ~85 seconds, preventing any clock interrupts from firing.

Fix by adding a bus_freq field to PPCE500MachineClass and using it in
the FDT generator. Set bus_freq = PLATFORM_CLK_FREQ_HZ (400MHz) for
existing machines, matching the existing clock_freq value.

Signed-off-by: Vivien LEGER <vivien.leger@gmail.com>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Message-ID: <20260411154535.1451361-1-vivien.leger@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 774e6f5c1533aba9e04f95cb8cfba64d8329fcb0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
index 8842f7f6b8..bde8a928f9 100644
--- a/hw/ppc/e500.c
+++ b/hw/ppc/e500.c
@@ -517,7 +517,7 @@ static int ppce500_load_device_tree(PPCE500MachineState *pms,
                               env->icache_line_size);
         qemu_fdt_setprop_cell(fdt, cpu_name, "d-cache-size", 0x8000);
         qemu_fdt_setprop_cell(fdt, cpu_name, "i-cache-size", 0x8000);
-        qemu_fdt_setprop_cell(fdt, cpu_name, "bus-frequency", 0);
+        qemu_fdt_setprop_cell(fdt, cpu_name, "bus-frequency", pmc->bus_freq);
         if (cpu->cpu_index) {
             qemu_fdt_setprop_string(fdt, cpu_name, "status", "disabled");
             qemu_fdt_setprop_string(fdt, cpu_name, "enable-method",
diff --git a/hw/ppc/e500.h b/hw/ppc/e500.h
index 00f490519c..858684d569 100644
--- a/hw/ppc/e500.h
+++ b/hw/ppc/e500.h
@@ -40,6 +40,7 @@ struct PPCE500MachineClass {
     hwaddr pci_mmio_bus_base;
     hwaddr spin_base;
     uint32_t clock_freq;
+    uint32_t bus_freq;
     uint32_t tb_freq;
 };
 
diff --git a/hw/ppc/e500plat.c b/hw/ppc/e500plat.c
index 4f1d659e72..dab9e32b96 100644
--- a/hw/ppc/e500plat.c
+++ b/hw/ppc/e500plat.c
@@ -94,6 +94,7 @@ static void e500plat_machine_class_init(ObjectClass *oc, const void *data)
     pmc->pci_mmio_bus_base = 0xE0000000ULL;
     pmc->spin_base = 0xFEF000000ULL;
     pmc->clock_freq = PLATFORM_CLK_FREQ_HZ;
+    pmc->bus_freq = PLATFORM_CLK_FREQ_HZ;
     pmc->tb_freq = PLATFORM_CLK_FREQ_HZ;
 
     mc->desc = "generic paravirt e500 platform";
diff --git a/hw/ppc/mpc8544ds.c b/hw/ppc/mpc8544ds.c
index 582698559d..d022761cb6 100644
--- a/hw/ppc/mpc8544ds.c
+++ b/hw/ppc/mpc8544ds.c
@@ -56,6 +56,7 @@ static void mpc8544ds_machine_class_init(ObjectClass *oc, const void *data)
     pmc->pci_pio_base = 0xE1000000ULL;
     pmc->spin_base = 0xEF000000ULL;
     pmc->clock_freq = PLATFORM_CLK_FREQ_HZ;
+    pmc->bus_freq = PLATFORM_CLK_FREQ_HZ;
     pmc->tb_freq = PLATFORM_CLK_FREQ_HZ;
 
     mc->desc = "mpc8544ds";
-- 
2.47.3




* [Stable-10.2.3 106/149] hw/net/allwinner-sun8i-emac: Flush queued packets when rx is enabled
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (5 preceding siblings ...)
  2026-05-22 21:48 ` [Stable-10.2.3 105/149] hw/ppc/e500: fix bus-frequency property hardcoded to zero in CPU FDT node Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 107/149] hw/intc/arm_gicv3: Fix NS write to ICC_AP1Rn_EL1 when prebits < 7 Michael Tokarev
                   ` (43 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, 宋文武, Peter Maydell,
	Michael Tokarev

From: 宋文武 <iyzsong@member.fsf.org>

The RX_CTL_0 register includes the RX_EN receive-enable bit,
which allwinner_sun8i_emac_can_receive() checks. That means that
if the guest sets it we need to call qemu_flush_queued_packets()
as we might now be able to handle them.

This fixes a bug where networking didn't work in u-boot on the
orangepi-pc machine.

Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/3459
Signed-off-by: 宋文武 <iyzsong@member.fsf.org>
Message-id: 20260430040753.3337-1-iyzsong@envs.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: expanded commit message, removed unneeded RX_EN test]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit a7f27d6903b30bcea21c46986cb7507edcbc970c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/net/allwinner-sun8i-emac.c b/hw/net/allwinner-sun8i-emac.c
index 30a81576b4..9d73a99f54 100644
--- a/hw/net/allwinner-sun8i-emac.c
+++ b/hw/net/allwinner-sun8i-emac.c
@@ -727,6 +727,9 @@ static void allwinner_sun8i_emac_write(void *opaque, hwaddr offset,
         break;
     case REG_RX_CTL_0:          /* Receive Control 0 */
         s->rx_ctl0 = value;
+        if (allwinner_sun8i_emac_can_receive(nc)) {
+            qemu_flush_queued_packets(nc);
+        }
         break;
     case REG_RX_CTL_1:          /* Receive Control 1 */
         s->rx_ctl1 = value | RX_CTL1_RX_MD;
-- 
2.47.3




* [Stable-10.2.3 107/149] hw/intc/arm_gicv3: Fix NS write to ICC_AP1Rn_EL1 when prebits < 7
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (6 preceding siblings ...)
  2026-05-22 21:48 ` [Stable-10.2.3 106/149] hw/net/allwinner-sun8i-emac: Flush queued packets when rx is enabled Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 108/149] migration: Fix low possibility downtime violation Michael Tokarev
                   ` (42 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, liugan1, Peter Maydell, Michael Tokarev

From: liugan1 <liugan1@lixiang.com>

The existing code uses a blanket `regno < 2` check to make
ICC_AP1R0_EL1 and ICC_AP1R1_EL1 writes from Non-secure code WI
(Write Ignore) when EL3 is present. This is intended to prevent
NS code from claiming active interrupts in the Secure priority
range, which could block Secure interrupt delivery.

However, that check assumes prebits=7 (4 APR registers), where the
NS priority range (128..255) maps entirely to AP1R2/AP1R3. Since
commit 39f29e599355 ("hw/intc/arm_gicv3: Use correct number of
priority bits for the CPU", first in 7.1), all QEMU AArch64 CPUs
are initialised with gic_pribits=5 (one APR register), so NS
priorities map to AP1R0 bits [16:31]. Blanket WI of the entire
AP1R0 register prevents NS code from clearing its own NS active
priority bits. Machines using hw_compat_7_0 (e.g. virt-7.0) still
force pribits=8 via force-8-bit-prio and are therefore unaffected.

A concrete consequence observed in virtualisation scenarios: when
a guest VM acknowledges an SPI interrupt but does not perform EOI,
is force-killed and restarted, the new guest's attempt to clear
the residual active state by writing ICC_AP1R0_EL1=0 is silently
ignored. The running priority (RPR) remains stuck at the old
interrupt's priority, preventing all equal-or-lower priority
interrupts (including timer interrupts) from being delivered, and
hanging the guest.

Fix this by computing the exact Secure/NS boundary within the APR
bank based on prebits. For registers entirely in the Secure range,
keep the WI behaviour. For the register that straddles the
boundary, preserve only the Secure bits while allowing NS bits to
be modified. For registers entirely in the NS range, allow full
write access.

The new logic produces identical behaviour to the old code when
prebits=7, preserving existing behaviour for machines that use
force-8-bit-prio.
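
As an illustration only (a standalone sketch, not code from the patch),
the boundary computation described above can be modelled outside QEMU.
The helper name ns_apr_write() and its parameters are hypothetical; the
arithmetic follows the description in this message and assumes prebits
is in the range 4..7:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): returns the value that
 * AP1R<regno> would hold after a Non-secure write when EL3 is present.
 * 'current' is the register's present value, 'written' the guest value.
 */
static uint32_t ns_apr_write(int prebits, int regno,
                             uint32_t current, uint32_t written)
{
    assert(prebits >= 4 && prebits <= 7);

    /* First APR bit that belongs to the NS priority range (128..255) */
    int ns_start_bit = 0x80 >> (8 - prebits);
    int ns_start_regno = ns_start_bit / 32;
    int ns_start_regbit = ns_start_bit % 32;

    if (regno < ns_start_regno) {
        /* Entire register is in the Secure range: write ignored */
        return current;
    }
    if (regno == ns_start_regno && ns_start_regbit > 0) {
        /* Split register: keep the Secure low bits, take the NS high bits */
        uint32_t secure_mask = (UINT32_C(1) << ns_start_regbit) - 1;

        return (current & secure_mask) | (written & ~secure_mask);
    }
    /* Entire register is in the NS range: full write */
    return written;
}
```

In this model, with prebits=5 a Non-secure write of 0 to AP1R0 clears
bits [16:31] and leaves bits [0:15] untouched, while with prebits=7
writes to AP1R0 and AP1R1 are still ignored, as with the old check.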

Fixes: 39f29e599355 ("hw/intc/arm_gicv3: Use correct number of priority bits for the CPU")
Cc: qemu-stable@nongnu.org
Signed-off-by: liugan1 <liugan1@lixiang.com>
Message-id: 20260428083119.1400110-1-gs_liugan@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit f35f0f1ca121fb4931fe98570cda3aeb06b7a87f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index 2e6c1f778a..5b15e4ff1e 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -1869,9 +1869,40 @@ static void icc_ap_write(CPUARMState *env, const ARMCPRegInfo *ri,
      * at a priority outside the Non-secure range (128..255), since this
      * would otherwise allow malicious NS code to block delivery of S interrupts
      * by writing a bad value to these registers.
+     *
+     * The NS priority range (128..255) maps to APR bits starting at
+     * aprbit = 0x80 >> (8 - prebits). Depending on prebits, this boundary
+     * may fall within AP1R0 or AP1R1, so we cannot simply WI the entire
+     * register. Instead we calculate which bits within each register
+     * correspond to the Secure range and preserve those, while allowing
+     * NS code to modify only the NS range bits.
+     *
+     *   prebits=4: num_aprs=1, NS starts at AP1R0[8]
+     *   prebits=5: num_aprs=1, NS starts at AP1R0[16]
+     *   prebits=6: num_aprs=2, NS starts at AP1R1[0]
+     *   prebits=7: num_aprs=4, NS starts at AP1R2[0]
      */
-    if (grp == GICV3_G1NS && regno < 2 && arm_feature(env, ARM_FEATURE_EL3)) {
-        return;
+    if (grp == GICV3_G1NS && arm_feature(env, ARM_FEATURE_EL3)) {
+        int ns_start_bit = 0x80 >> (8 - cs->prebits);
+        int ns_start_regno = ns_start_bit / 32;
+        int ns_start_regbit = ns_start_bit % 32;
+
+        if (regno < ns_start_regno) {
+            /* This entire register is in the Secure range: WI */
+            return;
+        } else if (regno == ns_start_regno && ns_start_regbit > 0) {
+            /*
+             * This register is split: low bits are Secure, high bits are NS.
+             * Preserve the Secure bits (below ns_start_regbit) from the
+             * current value, and take the NS bits (at and above
+             * ns_start_regbit) from the written value.
+             */
+            uint32_t secure_mask = MAKE_64BIT_MASK(0, ns_start_regbit);
+
+            value = (cs->icc_apr[grp][regno] & secure_mask) |
+                    (value & ~secure_mask);
+        }
+        /* else: regno > ns_start_regno, entire register is NS: allow write */
     }

     if (cs->nmi_support) {
-- 
2.47.3


* [Stable-10.2.3 108/149] migration: Fix low possibility downtime violation
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (7 preceding siblings ...)
  2026-05-22 21:48 ` [Stable-10.2.3 107/149] hw/intc/arm_gicv3: Fix NS write to ICC_AP1Rn_EL1 when prebits < 7 Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 109/149] target/microblaze: Fix endianness used to disassemble Michael Tokarev
                   ` (41 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Peter Xu, Juraj Marcin, Michael Tokarev

From: Peter Xu <peterx@redhat.com>

When QEMU queries the estimated amount of pending data and thinks it is
ready to converge, it sends another, exact query to confirm that.
This is needed to make sure we collect the latest reports and that the
condition still holds true.

However, we missed one small difference here between "<" and "<=" when
comparing pending_size (A) to threshold_size (B).

The source QEMU only re-queries if A<B, but will kick off switchover if
A<=B.

I think this means that if A (only an estimate so far) happens to equal
B, the re-query won't happen and switchover will proceed without
considering newly dirtied data.

It turns out it was an accident in my commit 7aaa1fc072 when refactoring
the code around.  Fix this by using the same equation in both places.
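
A minimal model of the mismatch, for illustration only. The function
names below are hypothetical, and the predicates are taken from the
description above rather than copied from migration.c:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Re-query condition before this fix, per the description: A < B */
static bool requery_before_fix(uint64_t pending, uint64_t threshold)
{
    return pending < threshold;
}

/* Re-query condition after this fix: A <= B */
static bool requery_after_fix(uint64_t pending, uint64_t threshold)
{
    return pending <= threshold;
}

/* Switchover condition, per the description: A <= B */
static bool would_switch_over(uint64_t pending, uint64_t threshold)
{
    return pending <= threshold;
}

/*
 * Returns true when switchover could proceed on an estimate that was
 * never confirmed by an exact query.
 */
static bool switchover_without_exact(uint64_t pending, uint64_t threshold,
                                     bool fixed)
{
    bool requery = fixed ? requery_after_fix(pending, threshold)
                         : requery_before_fix(pending, threshold);
    bool gap = would_switch_over(pending, threshold) && !requery;

    /* With matching predicates the gap cannot occur */
    assert(!fixed || !gap);
    return gap;
}
```

In this model the only input that exposes the gap is A == B with the
old re-query condition.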

Fixes: 7aaa1fc072 ("migration: Rewrite the migration complete detect logic")
Cc: qemu-stable@nongnu.org
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20260421202110.306051-3-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 455a6167f25416ce97ea966d6e8301df9fda9a47)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/migration/migration.c b/migration/migration.c
index b316ee01ab..5daf0d84e4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3510,7 +3510,7 @@ static MigIterateState migration_iteration_run(MigrationState *s)
          * postcopy started, so ESTIMATE should always match with EXACT
          * during postcopy phase.
          */
-        if (pending_size < s->threshold_size) {
+        if (pending_size <= s->threshold_size) {
             qemu_savevm_state_pending_exact(&must_precopy, &can_postcopy);
             pending_size = must_precopy + can_postcopy;
             trace_migrate_pending_exact(pending_size, must_precopy,
-- 
2.47.3




* [Stable-10.2.3 109/149] target/microblaze: Fix endianness used to disassemble
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (8 preceding siblings ...)
  2026-05-22 21:48 ` [Stable-10.2.3 108/149] migration: Fix low possibility downtime violation Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 110/149] target/arm: Report IL=0 for Thumb 16-bit BKPT insn Michael Tokarev
                   ` (40 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Philippe Mathieu-Daudé, Richard Henderson,
	Pierrick Bouvier, Michael Tokarev

From: Philippe Mathieu-Daudé <philmd@linaro.org>

MicroBlaze CPU model has a "little-endian" property, pointing to
the @endi internal field. Commit c36ec3a9655 ("hw/microblaze:
Explicit CPU endianness") took care of having all MicroBlaze
boards with an explicit default endianness (similarly with
commit 91fc6d8101d for linux-user binaries), so later commit
415aae543ed ("target/microblaze: Consider endianness while
translating code") could infer the endianness at runtime from
the @endi field, and not at compile time via the TARGET_BIG_ENDIAN
definition. Doing so, we forgot to propagate that runtime change
to the disassemble_info structure. Do it now to display the
opcodes in correct endianness order.

Cc: qemu-stable@nongnu.org
Fixes: 415aae543ed ("target/microblaze: Consider endianness while translating code")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@oss.qualcomm.com>
Message-Id: <20260423100612.27278-3-philmd@linaro.org>
(cherry picked from commit 41c417290df91c31a70adeb8f5271896a8c5f802)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c
index 22231f09e6..965eedbfaf 100644
--- a/target/microblaze/cpu.c
+++ b/target/microblaze/cpu.c
@@ -237,8 +237,8 @@ static void mb_disas_set_info(CPUState *cpu, disassemble_info *info)
 {
     info->mach = bfd_arch_microblaze;
     info->print_insn = print_insn_microblaze;
-    info->endian = TARGET_BIG_ENDIAN ? BFD_ENDIAN_BIG
-                                     : BFD_ENDIAN_LITTLE;
+    info->endian = MICROBLAZE_CPU(cpu)->cfg.endi ? BFD_ENDIAN_LITTLE
+                                                 : BFD_ENDIAN_BIG;
 }
 
 static void mb_cpu_realizefn(DeviceState *dev, Error **errp)
-- 
2.47.3




* [Stable-10.2.3 110/149] target/arm: Report IL=0 for Thumb 16-bit BKPT insn
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (9 preceding siblings ...)
  2026-05-22 21:48 ` [Stable-10.2.3 109/149] target/microblaze: Fix endianness used to disassemble Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 111/149] hw/misc/bcm2835_rng: Specify valid memory access sizes Michael Tokarev
                   ` (39 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Peter Maydell, Philippe Mathieu-Daudé,
	Alex Bennée, Richard Henderson, Michael Tokarev

From: Peter Maydell <peter.maydell@linaro.org>

The Thumb BKPT insn is 16-bit, and the ESR_ELx syndrome register
definition requires that we set the IL bit to 0 for this, and 1 for
the 32-bit A32 and A64 BKPT/BRK.

We used to do this correctly, but accidentally lost it in the
conversion to decodetree, because we converted the A32 BKPT first,
and then when we converted the T16 BKPT we forgot that trans_BKPT()
was unconditionally setting IL=1.

Pass the right value for syn_aa32_bkpt()'s is_16bit argument.
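
For illustration, a standalone sketch of how the IL bit enters the
syndrome value. The function name bkpt_syndrome() is hypothetical, and
the field layout used here (EC in bits [31:26] with value 0x38 for an
AArch32 BKPT, IL at bit 25, the immediate in bits [15:0]) is my reading
of the ESR_ELx definition rather than something copied from QEMU:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_EC_SHIFT   26            /* EC field: bits [31:26] */
#define SKETCH_EC_BKPT    0x38u         /* BKPT executed in AArch32 state */
#define SKETCH_IL         (1u << 25)    /* set for 32-bit instructions */

/* Hypothetical helper: build a BKPT syndrome for a 16-bit or 32-bit insn */
static uint32_t bkpt_syndrome(uint32_t imm, bool is_16bit)
{
    assert(imm <= 0xffff);

    return (SKETCH_EC_BKPT << SKETCH_EC_SHIFT)
           | (is_16bit ? 0 : SKETCH_IL)
           | imm;
}
```

Under these assumptions a T16 BKPT and an A32 BKPT with the same
immediate differ only in bit 25.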

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/3474
Fixes: 43f7e42c7d515f ("target/arm: Convert T16, Miscellaneous 16-bit instructions")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20260505103726.419195-1-peter.maydell@linaro.org
(cherry picked from commit f443b687636205b7f70029692b244f1f90532cf2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c
index 0a92300f9b..bca278daf0 100644
--- a/target/arm/tcg/translate.c
+++ b/target/arm/tcg/translate.c
@@ -3562,7 +3562,7 @@ static bool trans_BKPT(DisasContext *s, arg_BKPT *a)
         (a->imm == 0xab)) {
         gen_exception_internal_insn(s, EXCP_SEMIHOST);
     } else {
-        gen_exception_bkpt_insn(s, syn_aa32_bkpt(a->imm, false));
+        gen_exception_bkpt_insn(s, syn_aa32_bkpt(a->imm, curr_insn_len(s) == 2));
     }
     return true;
 }
-- 
2.47.3




* [Stable-10.2.3 111/149] hw/misc/bcm2835_rng: Specify valid memory access sizes
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (10 preceding siblings ...)
  2026-05-22 21:48 ` [Stable-10.2.3 110/149] target/arm: Report IL=0 for Thumb 16-bit BKPT insn Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 112/149] hw/uefi: fix buffer overruns Michael Tokarev
                   ` (38 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Peter Maydell, Philippe Mathieu-Daudé,
	Michael Tokarev

From: Peter Maydell <peter.maydell@linaro.org>

The BCM2835 RNG has 32-bit registers only; specify this in
the MemoryRegionOps so wrong-sized accesses are rejected rather
than getting to the assertions in the read and write functions,
and for clarity add the matching .impl constraints.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/3394
Fixes: 54a5ba13a9f ("target-arm: Implement BCM2835 hardware RNG")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20260501162700.4092512-1-peter.maydell@linaro.org
(cherry picked from commit 18b664c90085b0d2be9c2ad8c747e00a7a733402)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/misc/bcm2835_rng.c b/hw/misc/bcm2835_rng.c
index e4d2c224c8..4492e325b4 100644
--- a/hw/misc/bcm2835_rng.c
+++ b/hw/misc/bcm2835_rng.c
@@ -93,6 +93,10 @@ static const MemoryRegionOps bcm2835_rng_ops = {
     .read = bcm2835_rng_read,
     .write = bcm2835_rng_write,
     .endianness = DEVICE_NATIVE_ENDIAN,
+    .impl.min_access_size = 4,
+    .impl.max_access_size = 4,
+    .valid.min_access_size = 4,
+    .valid.max_access_size = 4,
 };
 
 static const VMStateDescription vmstate_bcm2835_rng = {
-- 
2.47.3




* [Stable-10.2.3 112/149] hw/uefi: fix buffer overruns
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (11 preceding siblings ...)
  2026-05-22 21:48 ` [Stable-10.2.3 111/149] hw/misc/bcm2835_rng: Specify valid memory access sizes Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 113/149] hw/uefi: verify pio_xfer_offset before calculating buffer checksum Michael Tokarev
                   ` (37 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Gerd Hoffmann, Katherine Leaver, Michael Tokarev

From: Gerd Hoffmann <kraxel@redhat.com>

The buffer size checks do not consider the mm_header size, similar to
CVE-2026-5744.  Factor out the repeated size check to a small helper
function, fix the check, update all places to use the new helper.
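
A standalone sketch of the difference between the two checks, for
illustration only. SKETCH_HDR_SIZE is a placeholder standing in for
sizeof(mm_header), and both function names are hypothetical; the model
assumes the payload is stored in the buffer after the header:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder for sizeof(mm_header); the real value is not assumed here */
#define SKETCH_HDR_SIZE 24

/* Old-style check: compares the payload length to the whole buffer */
static bool fits_ignoring_header(size_t buf_size, uint64_t length)
{
    return buf_size >= length;
}

/* New-style check: only the space after the header is available */
static bool fits_after_header(size_t buf_size, uint64_t length)
{
    /* The caller is expected to have verified this already */
    assert(buf_size >= SKETCH_HDR_SIZE);

    return buf_size - SKETCH_HDR_SIZE >= length;
}
```

A payload of 4090 bytes passes the first check against a 4096-byte
buffer but fails the second, because header plus payload would run
past the end of the buffer.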

Fixes: CVE-2026-41435
Fixes: db1ecfb473ac ("hw/uefi: add var-service-vars.c")
Reported-by: Katherine Leaver <katherine.j.leaver@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20260422092910.444997-2-kraxel@redhat.com>
(cherry picked from commit f252769a23e67765f9b95d8944ca3da6c9edf58b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/uefi/var-service-vars.c b/hw/uefi/var-service-vars.c
index 5607763525..922f6dd963 100644
--- a/hw/uefi/var-service-vars.c
+++ b/hw/uefi/var-service-vars.c
@@ -260,6 +260,17 @@ static size_t uefi_vars_mm_error(mm_header *mhdr, mm_variable *mvar,
     return sizeof(*mvar);
 }
 
+static bool check_buffer_size(uefi_vars_state *uv, uint64_t length)
+{
+    /* uefi_vars_cmd_mm() checks that */
+    g_assert(uv->buf_size >= sizeof(mm_header));
+
+    if (uv->buf_size - sizeof(mm_header) < length) {
+        return false;
+    }
+    return true;
+}
+
 static size_t uefi_vars_mm_get_variable(uefi_vars_state *uv, mm_header *mhdr,
                                         mm_variable *mvar, void *func)
 {
@@ -307,7 +318,7 @@ static size_t uefi_vars_mm_get_variable(uefi_vars_state *uv, mm_header *mhdr,
     if (uadd64_overflow(length, va->data_size, &length)) {
         return uefi_vars_mm_error(mhdr, mvar, EFI_BAD_BUFFER_SIZE);
     }
-    if (uv->buf_size < length) {
+    if (!check_buffer_size(uv, length)) {
         return uefi_vars_mm_error(mhdr, mvar, EFI_BAD_BUFFER_SIZE);
     }
 
@@ -377,7 +388,7 @@ uefi_vars_mm_get_next_variable(uefi_vars_state *uv, mm_header *mhdr,
     }
 
     length = sizeof(*mvar) + sizeof(*nv) + var->name_size;
-    if (uv->buf_size < length) {
+    if (!check_buffer_size(uv, length)) {
         return uefi_vars_mm_error(mhdr, mvar, EFI_BAD_BUFFER_SIZE);
     }
 
@@ -567,7 +578,7 @@ static size_t uefi_vars_mm_variable_info(uefi_vars_state *uv, mm_header *mhdr,
     uint64_t length;
 
     length = sizeof(*mvar) + sizeof(*vi);
-    if (uv->buf_size < length) {
+    if (!check_buffer_size(uv, length)) {
         return uefi_vars_mm_error(mhdr, mvar, EFI_BAD_BUFFER_SIZE);
     }
 
@@ -588,7 +599,7 @@ uefi_vars_mm_get_payload_size(uefi_vars_state *uv, mm_header *mhdr,
     uint64_t length;
 
     length = sizeof(*mvar) + sizeof(*ps);
-    if (uv->buf_size < length) {
+    if (!check_buffer_size(uv, length)) {
         return uefi_vars_mm_error(mhdr, mvar, EFI_BAD_BUFFER_SIZE);
     }
 
-- 
2.47.3




* [Stable-10.2.3 113/149] hw/uefi: verify pio_xfer_offset before calculating buffer checksum
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (12 preceding siblings ...)
  2026-05-22 21:48 ` [Stable-10.2.3 112/149] hw/uefi: fix buffer overruns Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 114/149] hw/uefi: fix ucs2 string helper functions Michael Tokarev
                   ` (36 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Gerd Hoffmann, Katherine Leaver, Michael Tokarev

From: Gerd Hoffmann <kraxel@redhat.com>

Without that it is possible to trigger OOB reads by first
advancing offset, then making the buffer smaller, finally
asking for a checksum.

Fixes: CVE-2026-41436
Fixes: 90ca4e03c27d ("hw/uefi: add var-service-core.c")
Reported-by: Katherine Leaver <katherine.j.leaver@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20260422092910.444997-3-kraxel@redhat.com>
(cherry picked from commit 94d9a8b2c9e6962aa7f7673229d2db7b110cfac6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/uefi/var-service-core.c b/hw/uefi/var-service-core.c
index 91548e2f39..660ca2f9f8 100644
--- a/hw/uefi/var-service-core.c
+++ b/hw/uefi/var-service-core.c
@@ -229,6 +229,10 @@ static uint64_t uefi_vars_read(void *opaque, hwaddr addr, unsigned size)
         uv->pio_xfer_offset += size;
         break;
     case UEFI_VARS_REG_PIO_BUFFER_CRC32C:
+        if (uv->pio_xfer_offset > uv->buf_size) {
+            retval = 0;
+            break;
+        }
         retval = crc32c(0xffffffff, uv->pio_xfer_buffer, uv->pio_xfer_offset);
         break;
     case UEFI_VARS_REG_FLAGS:
-- 
2.47.3




* [Stable-10.2.3 114/149] hw/uefi: fix ucs2 string helper functions
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (13 preceding siblings ...)
  2026-05-22 21:48 ` [Stable-10.2.3 113/149] hw/uefi: verify pio_xfer_offset before calculating buffer checksum Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 115/149] hw/uefi: add name_size check to uefi_vars_mm_lock_variable() Michael Tokarev
                   ` (35 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Gerd Hoffmann, Katherine Leaver, Michael Tokarev

From: Gerd Hoffmann <kraxel@redhat.com>

The length passed in is in bytes not characters.  Rename the
parameters to make that clear.  Calculate the number of chars
if needed.  Fix length checks to use the number of chars not
bytes to avoid OOB reads.
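
A minimal standalone sketch of the bytes-versus-characters distinction,
for illustration only. The function name ucs2_strlen() is hypothetical;
it mirrors the approach described above of converting the byte count to
a character count before using it as a loop bound:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: length in characters of a UCS-2 string stored in
 * a buffer of 'bytes' bytes. The string need not be NUL-terminated.
 */
static size_t ucs2_strlen(const uint16_t *str, size_t bytes)
{
    size_t chars = bytes / 2;   /* each UCS-2 character is two bytes */
    size_t pos = 0;

    assert(str != NULL);

    while (pos < chars && str[pos] != 0) {
        pos++;
    }
    return pos;
}
```

Using the byte count directly as the bound would let the loop index up
to twice as many uint16_t elements as the buffer holds when no
terminator is present.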

Fixes: CVE-2026-41437
Fixes: 1ebc319c8ca7 ("hw/uefi: add var-service-utils.c")
Reported-by: Katherine Leaver <katherine.j.leaver@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20260422092910.444997-4-kraxel@redhat.com>
(cherry picked from commit 5247b3034c23bdfd91a7f78587c3b3e37f90568c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/uefi/var-service-utils.c b/hw/uefi/var-service-utils.c
index 258013f436..489321a26c 100644
--- a/hw/uefi/var-service-utils.c
+++ b/hw/uefi/var-service-utils.c
@@ -19,13 +19,18 @@
  * sometimes when they are not (for example in variable policies).
  */
 
-gboolean uefi_str_is_valid(const uint16_t *str, size_t len,
+gboolean uefi_str_is_valid(const uint16_t *str, size_t bytes,
                            gboolean must_be_null_terminated)
 {
+    size_t chars = bytes / 2;
     size_t pos = 0;
 
+    if ((bytes % 2) != 0) {
+        return false;
+    }
+
     for (;;) {
-        if (pos == len) {
+        if (pos == chars) {
             if (must_be_null_terminated) {
                 return false;
             } else {
@@ -47,12 +52,13 @@ gboolean uefi_str_is_valid(const uint16_t *str, size_t len,
     }
 }
 
-size_t uefi_strlen(const uint16_t *str, size_t len)
+size_t uefi_strlen(const uint16_t *str, size_t bytes)
 {
+    size_t chars = bytes / 2;
     size_t pos = 0;
 
     for (;;) {
-        if (pos == len) {
+        if (pos == chars) {
             return pos;
         }
         if (str[pos] == 0) {
@@ -62,25 +68,25 @@ size_t uefi_strlen(const uint16_t *str, size_t len)
     }
 }
 
-gboolean uefi_str_equal_ex(const uint16_t *a, size_t alen,
-                           const uint16_t *b, size_t blen,
+gboolean uefi_str_equal_ex(const uint16_t *a, size_t a_bytes,
+                           const uint16_t *b, size_t b_bytes,
                            gboolean wildcards_in_a)
 {
+    size_t a_chars = a_bytes / 2;
+    size_t b_chars = b_bytes / 2;
     size_t pos = 0;
 
-    alen = alen / 2;
-    blen = blen / 2;
     for (;;) {
-        if (pos == alen && pos == blen) {
+        if (pos == a_chars && pos == b_chars) {
             return true;
         }
-        if (pos == alen && b[pos] == 0) {
+        if (pos == a_chars && b[pos] == 0) {
             return true;
         }
-        if (pos == blen && a[pos] == 0) {
+        if (pos == b_chars && a[pos] == 0) {
             return true;
         }
-        if (pos == alen || pos == blen) {
+        if (pos == a_chars || pos == b_chars) {
             return false;
         }
         if (a[pos] == 0 && b[pos] == 0) {
@@ -100,18 +106,18 @@ gboolean uefi_str_equal_ex(const uint16_t *a, size_t alen,
     }
 }
 
-gboolean uefi_str_equal(const uint16_t *a, size_t alen,
-                        const uint16_t *b, size_t blen)
+gboolean uefi_str_equal(const uint16_t *a, size_t a_bytes,
+                        const uint16_t *b, size_t b_bytes)
 {
-    return uefi_str_equal_ex(a, alen, b, blen, false);
+    return uefi_str_equal_ex(a, a_bytes, b, b_bytes, false);
 }
 
-char *uefi_ucs2_to_ascii(const uint16_t *ucs2, uint64_t ucs2_size)
+char *uefi_ucs2_to_ascii(const uint16_t *ucs2, uint64_t ucs2_bytes)
 {
-    char *str = g_malloc0(ucs2_size / 2 + 1);
+    char *str = g_malloc0(ucs2_bytes / 2 + 1);
     int i;
 
-    for (i = 0; i * 2 < ucs2_size; i++) {
+    for (i = 0; i * 2 < ucs2_bytes; i++) {
         if (ucs2[i] == 0) {
             break;
         }
-- 
2.47.3




* [Stable-10.2.3 115/149] hw/uefi: add name_size check to uefi_vars_mm_lock_variable()
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (14 preceding siblings ...)
  2026-05-22 21:48 ` [Stable-10.2.3 114/149] hw/uefi: fix ucs2 string helper functions Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 116/149] hw/uefi: verify data size before accessing it in wrap_pkcs7 Michael Tokarev
                   ` (34 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Gerd Hoffmann, Katherine Leaver, Michael Tokarev

From: Gerd Hoffmann <kraxel@redhat.com>

Make sure the total variable_policy_entry size stays below
64k so the (16-bit) size field can not wrap.
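
To illustrate the wrap this guards against, here is a standalone
sketch. SKETCH_ENTRY_HDR_SIZE is a placeholder standing in for
sizeof(variable_policy_entry), and the function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder for the fixed part of the entry; not the real size */
#define SKETCH_ENTRY_HDR_SIZE 44u

/* True when header plus name fits in a 16-bit size field */
static bool entry_size_fits(uint32_t name_size)
{
    return (uint64_t)SKETCH_ENTRY_HDR_SIZE + name_size <= UINT16_MAX;
}

/* What a 16-bit size field would hold if no check were made */
static uint16_t unchecked_entry_size(uint32_t name_size)
{
    return (uint16_t)(SKETCH_ENTRY_HDR_SIZE + name_size);
}

/* Size computation that is only valid after the check has passed */
static uint16_t checked_entry_size(uint32_t name_size)
{
    assert(entry_size_fits(name_size));

    return (uint16_t)(SKETCH_ENTRY_HDR_SIZE + name_size);
}
```

With the placeholder header size of 44, a name_size of 65492 makes the
total exactly 65536, which truncates to 0 in a 16-bit field; the check
rejects that input before the size is stored.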

Fixes: CVE-2026-41438
Fixes: db1ecfb473ac ("hw/uefi: add var-service-vars.c")
Reported-by: Katherine Leaver <katherine.j.leaver@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20260422092910.444997-5-kraxel@redhat.com>
(cherry picked from commit c45b460d16f991ff3f753623f3423e1adc4077a2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/uefi/var-service-vars.c b/hw/uefi/var-service-vars.c
index 922f6dd963..d7187e006d 100644
--- a/hw/uefi/var-service-vars.c
+++ b/hw/uefi/var-service-vars.c
@@ -629,6 +629,9 @@ uefi_vars_mm_lock_variable(uefi_vars_state *uv, mm_header *mhdr,
     if (mhdr->length < length) {
         return uefi_vars_mm_error(mhdr, mvar, EFI_BAD_BUFFER_SIZE);
     }
+    if (sizeof(*pe) + lv->name_size > UINT16_MAX) {
+        return uefi_vars_mm_error(mhdr, mvar, EFI_BAD_BUFFER_SIZE);
+    }
 
     uefi_trace_variable(__func__, lv->guid, name, lv->name_size);
 
-- 
2.47.3




* [Stable-10.2.3 116/149] hw/uefi: verify data size before accessing it in wrap_pkcs7
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (15 preceding siblings ...)
  2026-05-22 21:48 ` [Stable-10.2.3 115/149] hw/uefi: add name_size check to uefi_vars_mm_lock_variable() Michael Tokarev
@ 2026-05-22 21:48 ` Michael Tokarev
  2026-05-22 21:48 ` [Stable-10.2.3 117/149] hw/uefi: avoid possibly unaligned variable_auth_2 struct field access Michael Tokarev
                   ` (33 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Gerd Hoffmann, Katherine Leaver, Michael Tokarev

From: Gerd Hoffmann <kraxel@redhat.com>

Fixes: CVE-2026-41439
Fixes: 3e33af2cb306 ("hw/uefi: add var-service-pkcs7.c")
Reported-by: Katherine Leaver <katherine.j.leaver@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20260422092910.444997-6-kraxel@redhat.com>
(cherry picked from commit 22b7b222d8f5428be8b5d4787f36efd0a0b75292)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/uefi/var-service-pkcs7.c b/hw/uefi/var-service-pkcs7.c
index 32accf4e44..f17ad6872f 100644
--- a/hw/uefi/var-service-pkcs7.c
+++ b/hw/uefi/var-service-pkcs7.c
@@ -73,7 +73,8 @@ static void wrap_pkcs7(gnutls_datum_t *pkcs7)
     };
     gnutls_datum_t wrap;
 
-    if (pkcs7->data[4] == 0x06 &&
+    if (pkcs7->size > 16 &&
+        pkcs7->data[4] == 0x06 &&
         pkcs7->data[5] == 0x09 &&
         memcmp(pkcs7->data + 6, signed_data_oid, sizeof(signed_data_oid)) == 0 &&
         pkcs7->data[15] == 0x0a &&
-- 
2.47.3
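
The ordering in the patch above matters: the size test has to be the
first operand of the && chain so that a short buffer is rejected before
any byte is read.  A minimal sketch of that pattern follows, assuming a
hypothetical helper has_oid_header() that mirrors only the first two
byte comparisons; it is not the QEMU function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.  Returns true when the
 * buffer is long enough to be indexed up to offset 16 and carries the
 * two marker bytes at offsets 4 and 5.  Because && short-circuits,
 * buf[4] and buf[5] are only read after the size test has passed.
 */
static bool has_oid_header(const uint8_t *buf, size_t size)
{
    assert(buf != NULL || size == 0);

    return size > 16 &&
           buf[4] == 0x06 &&
           buf[5] == 0x09;
}
```

With a 5-byte buffer the function returns false without touching the
data, even if the byte at offset 4 happens to be 0x06.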




* [Stable-10.2.3 117/149] hw/uefi: avoid possibly unaligned variable_auth_2 struct field access
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Gerd Hoffmann, Katherine Leaver, Michael Tokarev

From: Gerd Hoffmann <kraxel@redhat.com>

Copy the data to a stack-allocated struct before accessing it, so that
the struct fields are read from properly aligned storage.

Fixes: CVE-2026-41440
Fixes: f1488fac0584 ("hw/uefi: add var-service-auth.c")
Reported-by: Katherine Leaver <katherine.j.leaver@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20260422092910.444997-7-kraxel@redhat.com>
(cherry picked from commit b4680c02b8e838c75691656ee2c4450b454d1ca7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/uefi/var-service-auth.c b/hw/uefi/var-service-auth.c
index fba5a0956a..795f2f54e4 100644
--- a/hw/uefi/var-service-auth.c
+++ b/hw/uefi/var-service-auth.c
@@ -180,9 +180,10 @@ static efi_status uefi_vars_check_auth_2_sb(uefi_vars_state *uv,
                                             void *data,
                                             uint64_t data_offset)
 {
-    variable_auth_2 *auth = data;
+    variable_auth_2 auth;
     uefi_variable *siglist;
 
+    memcpy(&auth, data, sizeof(auth));
     if (custom_mode_is_active(uv)) {
         /* no authentication in custom mode */
         return EFI_SUCCESS;
@@ -193,7 +194,7 @@ static efi_status uefi_vars_check_auth_2_sb(uefi_vars_state *uv,
         return EFI_SUCCESS;
     }
 
-    if (auth->hdr_length == 24) {
+    if (auth.hdr_length == 24) {
         /* no signature (auth->cert_data is empty) */
         return EFI_SECURITY_VIOLATION;
     }
@@ -218,23 +219,25 @@ static efi_status uefi_vars_check_auth_2_sb(uefi_vars_state *uv,
 efi_status uefi_vars_check_auth_2(uefi_vars_state *uv, uefi_variable *var,
                                   mm_variable_access *va, void *data)
 {
-    variable_auth_2 *auth = data;
+    variable_auth_2 auth;
     uint64_t data_offset;
     efi_status status;
 
-    if (va->data_size < sizeof(*auth)) {
+    if (va->data_size < sizeof(auth)) {
         return EFI_SECURITY_VIOLATION;
     }
-    if (uadd64_overflow(sizeof(efi_time), auth->hdr_length, &data_offset)) {
+    memcpy(&auth, data, sizeof(auth));
+
+    if (uadd64_overflow(sizeof(efi_time), auth.hdr_length, &data_offset)) {
         return EFI_SECURITY_VIOLATION;
     }
     if (va->data_size < data_offset) {
         return EFI_SECURITY_VIOLATION;
     }
 
-    if (auth->hdr_revision != 0x0200 ||
-        auth->hdr_cert_type != WIN_CERT_TYPE_EFI_GUID ||
-        !qemu_uuid_is_equal(&auth->guid_cert_type, &EfiCertTypePkcs7Guid)) {
+    if (auth.hdr_revision != 0x0200 ||
+        auth.hdr_cert_type != WIN_CERT_TYPE_EFI_GUID ||
+        !qemu_uuid_is_equal(&auth.guid_cert_type, &EfiCertTypePkcs7Guid)) {
         return EFI_UNSUPPORTED;
     }
 
@@ -255,7 +258,7 @@ efi_status uefi_vars_check_auth_2(uefi_vars_state *uv, uefi_variable *var,
     }
 
     /* checks passed, set variable data */
-    var->time = auth->timestamp;
+    var->time = auth.timestamp;
     if (va->data_size - data_offset > 0) {
         var->data = g_malloc(va->data_size - data_offset);
         memcpy(var->data, data + data_offset, va->data_size - data_offset);
diff --git a/hw/uefi/var-service-pkcs7.c b/hw/uefi/var-service-pkcs7.c
index f17ad6872f..c859743e86 100644
--- a/hw/uefi/var-service-pkcs7.c
+++ b/hw/uefi/var-service-pkcs7.c
@@ -21,17 +21,20 @@
  */
 static gnutls_datum_t *build_signed_data(mm_variable_access *va, void *data)
 {
-    variable_auth_2 *auth = data;
-    uint64_t data_offset = sizeof(efi_time) + auth->hdr_length;
+    variable_auth_2 auth;
+    uint64_t data_offset;
     uint16_t *name = (void *)va + sizeof(mm_variable_access);
     gnutls_datum_t *sdata;
     uint64_t pos = 0;
 
+    memcpy(&auth, data, sizeof(auth));
+    data_offset = sizeof(efi_time) + auth.hdr_length;
+
     sdata = g_new(gnutls_datum_t, 1);
     sdata->size = (va->name_size - 2
                    + sizeof(QemuUUID)
                    + sizeof(va->attributes)
-                   + sizeof(auth->timestamp)
+                   + sizeof(auth.timestamp)
                    + va->data_size - data_offset);
     sdata->data = g_malloc(sdata->size);
 
@@ -48,8 +51,8 @@ static gnutls_datum_t *build_signed_data(mm_variable_access *va, void *data)
     pos += sizeof(va->attributes);
 
     /* TimeStamp */
-    memcpy(sdata->data + pos, &auth->timestamp, sizeof(auth->timestamp));
-    pos += sizeof(auth->timestamp);
+    memcpy(sdata->data + pos, &auth.timestamp, sizeof(auth.timestamp));
+    pos += sizeof(auth.timestamp);
 
     /* Variable Content */
     memcpy(sdata->data + pos, data + data_offset, va->data_size - data_offset);
@@ -105,11 +108,12 @@ static void wrap_pkcs7(gnutls_datum_t *pkcs7)
 
 static gnutls_datum_t *build_pkcs7(void *data)
 {
-    variable_auth_2 *auth = data;
+    variable_auth_2 auth;
     gnutls_datum_t *pkcs7;
 
+    memcpy(&auth, data, sizeof(auth));
     pkcs7 = g_new(gnutls_datum_t, 1);
-    pkcs7->size = auth->hdr_length - 24;
+    pkcs7->size = auth.hdr_length - 24;
     pkcs7->data = g_malloc(pkcs7->size);
     memcpy(pkcs7->data, data + 16 + 24, pkcs7->size);
 
-- 
2.47.3
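
The technique used in the patch above can be sketched in isolation.
The struct demo_hdr layout and the read_hdr() name below are
assumptions made for this example, not the QEMU definitions.  The point
is that memcpy() has no alignment requirement on its source, while
dereferencing a cast pointer does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout, for illustration only. */
struct demo_hdr {
    uint32_t length;
    uint16_t revision;
    uint16_t type;
};

/*
 * Copy the header out of a guest-supplied buffer into a local struct.
 * The local copy is aligned by the compiler, so field accesses are
 * safe even when 'data' points at an odd address.
 */
static struct demo_hdr read_hdr(const void *data)
{
    struct demo_hdr hdr;

    assert(data != NULL);
    memcpy(&hdr, data, sizeof(hdr));
    return hdr;
}
```

The caller still has to make sure that at least sizeof(struct demo_hdr)
bytes are available at 'data', which is why the patch keeps the
data_size check in front of the memcpy() in uefi_vars_check_auth_2().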




* [Stable-10.2.3 118/149] hw/uefi: check auth.hdr_length minimum size
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Gerd Hoffmann, Feifan Qian, Daniel P. Berrangé,
	Michael Tokarev

From: Gerd Hoffmann <kraxel@redhat.com>

The auth.hdr_length maximum is already checked (against the buffer size).
The header has some fixed fields which are included in the header length,
so there is also a minimum size which must be verified.  Add a check for
that.  This fixes a possible integer underflow.

While at it, replace the magic number '24' with sizeof calculations so
that the code documents itself.

Fixes: CVE-2026-8341
Fixes: f1488fac0584 ("hw/uefi: add var-service-auth.c")
Reported-by: Feifan Qian <bea1e@proton.me>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20260512060523.17493-1-kraxel@redhat.com>
(cherry picked from commit b33fd8ab1caa07aeb290ef5dac44a4e7fd4be02b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/uefi/var-service-auth.c b/hw/uefi/var-service-auth.c
index 795f2f54e4..f3dc9c6ca6 100644
--- a/hw/uefi/var-service-auth.c
+++ b/hw/uefi/var-service-auth.c
@@ -194,7 +194,7 @@ static efi_status uefi_vars_check_auth_2_sb(uefi_vars_state *uv,
         return EFI_SUCCESS;
     }
 
-    if (auth.hdr_length == 24) {
+    if (auth.hdr_length == (sizeof(auth) - sizeof(auth.timestamp))) {
         /* no signature (auth->cert_data is empty) */
         return EFI_SECURITY_VIOLATION;
     }
@@ -228,6 +228,9 @@ efi_status uefi_vars_check_auth_2(uefi_vars_state *uv, uefi_variable *var,
     }
     memcpy(&auth, data, sizeof(auth));
 
+    if (auth.hdr_length < (sizeof(auth) - sizeof(auth.timestamp))) {
+        return EFI_SECURITY_VIOLATION;
+    }
     if (uadd64_overflow(sizeof(efi_time), auth.hdr_length, &data_offset)) {
         return EFI_SECURITY_VIOLATION;
     }
diff --git a/hw/uefi/var-service-pkcs7.c b/hw/uefi/var-service-pkcs7.c
index c859743e86..8a1f1395a2 100644
--- a/hw/uefi/var-service-pkcs7.c
+++ b/hw/uefi/var-service-pkcs7.c
@@ -113,9 +113,9 @@ static gnutls_datum_t *build_pkcs7(void *data)
 
     memcpy(&auth, data, sizeof(auth));
     pkcs7 = g_new(gnutls_datum_t, 1);
-    pkcs7->size = auth.hdr_length - 24;
+    pkcs7->size = auth.hdr_length - (sizeof(auth) - sizeof(auth.timestamp));
     pkcs7->data = g_malloc(pkcs7->size);
-    memcpy(pkcs7->data, data + 16 + 24, pkcs7->size);
+    memcpy(pkcs7->data, data + sizeof(auth), pkcs7->size);
 
     wrap_pkcs7(pkcs7);
 
-- 
2.47.3




* [Stable-10.2.3 119/149] hw/ufs: Validate MCQ SQ references before use
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Jeuk Kim, Rayhan Ramdhany Hanaputra, Michael Tokarev

From: Jeuk Kim <jeuk20.kim@samsung.com>

A guest can program an out-of-range SQATTR.CQID value, or ring an
MCQ SQ doorbell before the submission queue exists.

Reject SQ creation when the referenced CQ is invalid, and ignore SQ
doorbells for queues that have not been created. This prevents a
guest-triggerable out-of-bounds read and NULL pointer dereference.

Fixes: 5c079578d2e ("hw/ufs: Add support MCQ of UFSHCI 4.0")
Reported-by: Rayhan Ramdhany Hanaputra <hanaputrarayhan@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jeuk Kim <jeuk20.kim@samsung.com>
(cherry picked from commit 332ea29787800fff2b49e9b89ec93bd370a11965)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/ufs/ufs.c b/hw/ufs/ufs.c
index 8fdc0854eb..5fe63b82ab 100644
--- a/hw/ufs/ufs.c
+++ b/hw/ufs/ufs.c
@@ -517,8 +517,13 @@ static bool ufs_mcq_create_sq(UfsHc *u, uint8_t qid, uint32_t attr)
         return false;
     }
 
+    if (cqid >= u->params.mcq_maxq) {
+        trace_ufs_err_mcq_create_sq_invalid_cqid(cqid);
+        return false;
+    }
+
     if (!u->cq[cqid]) {
-        trace_ufs_err_mcq_create_sq_invalid_cqid(qid);
+        trace_ufs_err_mcq_create_sq_invalid_cqid(cqid);
         return false;
     }
 
@@ -775,6 +780,11 @@ static void ufs_mcq_process_db(UfsHc *u, uint8_t qid, uint32_t db)
     }
 
     sq = u->sq[qid];
+    if (!sq) {
+        trace_ufs_err_mcq_db_wr_invalid_sqid(qid);
+        return;
+    }
+
     if (sq->size * sizeof(UfsSqEntry) <= db) {
         trace_ufs_err_mcq_db_wr_invalid_db(qid, db);
         return;
@@ -788,7 +798,14 @@ static void ufs_write_mcq_op_reg(UfsHc *u, hwaddr offset, uint32_t data,
                                  unsigned size)
 {
     int qid = offset / sizeof(UfsMcqOpReg);
-    UfsMcqOpReg *opr = &u->mcq_op_reg[qid];
+    UfsMcqOpReg *opr;
+
+    if (qid >= u->params.mcq_maxq) {
+        trace_ufs_err_invalid_register_offset(offset);
+        return;
+    }
+
+    opr = &u->mcq_op_reg[qid];
 
     switch (offset % sizeof(UfsMcqOpReg)) {
     case offsetof(UfsMcqOpReg, sq.tp):
-- 
2.47.3
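
The patch above applies two checks in a fixed order: first the index is
compared against the array bound, then the slot is tested for NULL.  A
minimal sketch of that order, assuming hypothetical types demo_ctrl and
demo_queue and a made-up limit DEMO_MAXQ:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAXQ 4 /* hypothetical queue limit */

struct demo_queue {
    uint16_t size;
};

struct demo_ctrl {
    struct demo_queue *cq[DEMO_MAXQ];
};

/*
 * Hypothetical lookup: returns the queue for a guest-supplied id, or
 * NULL when the id is out of range or the queue was never created.
 * The range check must come first, otherwise the array read itself is
 * already out of bounds.
 */
static struct demo_queue *lookup_cq(struct demo_ctrl *c, uint8_t cqid)
{
    assert(c != NULL);

    if (cqid >= DEMO_MAXQ) {
        return NULL;
    }
    return c->cq[cqid];
}
```

A caller then only needs a single NULL test to cover both the invalid
id and the missing queue.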




* [Stable-10.2.3 120/149] hw/ufs: Guard MCQ CQ accesses against missing queues
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Jeuk Kim, Rayhan Ramdhany Hanaputra, Michael Tokarev

From: Jeuk Kim <jeuk20.kim@samsung.com>

A guest can ring an MCQ CQ doorbell before the completion queue exists.
The CQ head write path then dereferences a NULL CQ through
ufs_mcq_cq_full().

Ignore CQ head updates for missing CQs, and make ufs_mcq_cq_full()
handle a missing CQ defensively.

Fixes: f78762a3cc8 ("hw/ufs: Fix mcq completion queue wraparound")
Reported-by: Rayhan Ramdhany Hanaputra <hanaputrarayhan@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jeuk Kim <jeuk20.kim@samsung.com>
(cherry picked from commit 283d921e771e8a98a5c3d1eed1ed791b89ba47a8)
Fixes: ad5f6ffcd04 ("hw/ufs: Fix mcq completion queue wraparound") in 10.2.x
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/ufs/ufs.c b/hw/ufs/ufs.c
index 5fe63b82ab..e01fc730fd 100644
--- a/hw/ufs/ufs.c
+++ b/hw/ufs/ufs.c
@@ -817,6 +817,10 @@ static void ufs_write_mcq_op_reg(UfsHc *u, hwaddr offset, uint32_t data,
     case offsetof(UfsMcqOpReg, cq.hp): {
         UfsCq *cq = u->cq[qid];
 
+        if (!cq) {
+            break;
+        }
+
         if (ufs_mcq_cq_full(u, qid) && !QTAILQ_EMPTY(&cq->req_list)) {
             /* Enqueueing to CQ was blocked because it was full */
             qemu_bh_schedule(cq->bh);
diff --git a/hw/ufs/ufs.h b/hw/ufs/ufs.h
index 13d964c5ae..9e800cafac 100644
--- a/hw/ufs/ufs.h
+++ b/hw/ufs/ufs.h
@@ -203,7 +203,14 @@ static inline bool ufs_mcq_cq_empty(UfsHc *u, uint32_t qid)
 static inline bool ufs_mcq_cq_full(UfsHc *u, uint32_t qid)
 {
     uint32_t tail = ufs_mcq_cq_tail(u, qid);
-    uint16_t cq_size = u->cq[qid]->size;
+    UfsCq *cq = u->cq[qid];
+    uint16_t cq_size;
+
+    if (!cq) {
+        return false;
+    }
+
+    cq_size = cq->size;
 
     tail = (tail + sizeof(UfsCqEntry)) % (sizeof(UfsCqEntry) * cq_size);
     return tail == ufs_mcq_cq_head(u, qid);
-- 
2.47.3




* [Stable-10.2.3 121/149] hw/ufs: Reject zero-depth MCQ queues
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Jeuk Kim, Michael Tokarev

From: Jeuk Kim <jeuk20.kim@samsung.com>

Reject SQATTR.SIZE and CQATTR.SIZE values that produce zero-entry MCQ
queues. Such queues can later trigger a divide-by-zero while advancing
queue pointers.

Fixes: 5c079578d2e ("hw/ufs: Add support MCQ of UFSHCI 4.0")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jeuk Kim <jeuk20.kim@samsung.com>
(cherry picked from commit 4a909c00b9e18478e67a792c7f7cfae62cb6c865)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/ufs/trace-events b/hw/ufs/trace-events
index 531dcfc686..7734b35f08 100644
--- a/hw/ufs/trace-events
+++ b/hw/ufs/trace-events
@@ -40,10 +40,12 @@ ufs_err_mcq_db_wr_invalid_sqid(uint8_t qid) "invalid mcq sqid %"PRIu8""
 ufs_err_mcq_db_wr_invalid_db(uint8_t qid, uint32_t db) "invalid mcq doorbell sqid %"PRIu8", db %"PRIu32""
 ufs_err_mcq_create_sq_invalid_sqid(uint8_t qid) "invalid mcq sqid %"PRIu8""
 ufs_err_mcq_create_sq_invalid_cqid(uint8_t qid) "invalid mcq cqid %"PRIu8""
+ufs_err_mcq_create_sq_invalid_size(uint8_t qid) "invalid mcq sq size for sqid %"PRIu8""
 ufs_err_mcq_create_sq_already_exists(uint8_t qid) "mcq sqid %"PRIu8 "already exists"
 ufs_err_mcq_delete_sq_invalid_sqid(uint8_t qid) "invalid mcq sqid %"PRIu8""
 ufs_err_mcq_delete_sq_not_exists(uint8_t qid) "mcq sqid %"PRIu8 "not exists"
 ufs_err_mcq_create_cq_invalid_cqid(uint8_t qid) "invalid mcq cqid %"PRIu8""
+ufs_err_mcq_create_cq_invalid_size(uint8_t qid) "invalid mcq cq size for cqid %"PRIu8""
 ufs_err_mcq_create_cq_already_exists(uint8_t qid) "mcq cqid %"PRIu8 "already exists"
 ufs_err_mcq_delete_cq_invalid_cqid(uint8_t qid) "invalid mcq cqid %"PRIu8""
 ufs_err_mcq_delete_cq_not_exists(uint8_t qid) "mcq cqid %"PRIu8 "not exists"
diff --git a/hw/ufs/ufs.c b/hw/ufs/ufs.c
index e01fc730fd..66f4031852 100644
--- a/hw/ufs/ufs.c
+++ b/hw/ufs/ufs.c
@@ -506,6 +506,8 @@ static bool ufs_mcq_create_sq(UfsHc *u, uint8_t qid, uint32_t attr)
     UfsMcqReg *reg = &u->mcq_reg[qid];
     UfsSq *sq;
     uint8_t cqid = FIELD_EX32(attr, SQATTR, CQID);
+    uint16_t qsize =
+        ((FIELD_EX32(attr, SQATTR, SIZE) + 1) << 2) / sizeof(UfsSqEntry);
 
     if (qid >= u->params.mcq_maxq) {
         trace_ufs_err_mcq_create_sq_invalid_sqid(qid);
@@ -527,12 +529,17 @@ static bool ufs_mcq_create_sq(UfsHc *u, uint8_t qid, uint32_t attr)
         return false;
     }
 
+    if (!qsize) {
+        trace_ufs_err_mcq_create_sq_invalid_size(qid);
+        return false;
+    }
+
     sq = g_malloc0(sizeof(*sq));
     sq->u = u;
     sq->sqid = qid;
     sq->cq = u->cq[cqid];
     sq->addr = ((uint64_t)reg->squba << 32) | reg->sqlba;
-    sq->size = ((FIELD_EX32(attr, SQATTR, SIZE) + 1) << 2) / sizeof(UfsSqEntry);
+    sq->size = qsize;
 
     sq->bh = qemu_bh_new_guarded(ufs_mcq_process_sq, sq,
                                  &DEVICE(u)->mem_reentrancy_guard);
@@ -576,6 +583,8 @@ static bool ufs_mcq_create_cq(UfsHc *u, uint8_t qid, uint32_t attr)
 {
     UfsMcqReg *reg = &u->mcq_reg[qid];
     UfsCq *cq;
+    uint16_t qsize =
+        ((FIELD_EX32(attr, CQATTR, SIZE) + 1) << 2) / sizeof(UfsCqEntry);
 
     if (qid >= u->params.mcq_maxq) {
         trace_ufs_err_mcq_create_cq_invalid_cqid(qid);
@@ -587,11 +596,16 @@ static bool ufs_mcq_create_cq(UfsHc *u, uint8_t qid, uint32_t attr)
         return false;
     }
 
+    if (!qsize) {
+        trace_ufs_err_mcq_create_cq_invalid_size(qid);
+        return false;
+    }
+
     cq = g_malloc0(sizeof(*cq));
     cq->u = u;
     cq->cqid = qid;
     cq->addr = ((uint64_t)reg->cquba << 32) | reg->cqlba;
-    cq->size = ((FIELD_EX32(attr, CQATTR, SIZE) + 1) << 2) / sizeof(UfsCqEntry);
+    cq->size = qsize;
 
     cq->bh = qemu_bh_new_guarded(ufs_mcq_process_cq, cq,
                                  &DEVICE(u)->mem_reentrancy_guard);
-- 
2.47.3




* [Stable-10.2.3 122/149] hw/ufs: Keep MCQ SQs alive while requests are outstanding
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Jeuk Kim, Michael Tokarev

From: Jeuk Kim <jeuk20.kim@samsung.com>

MCQ requests are allocated with their SQ, but can remain in flight on the
CQ list or in the SCSI layer after leaving the SQ free list.

Reject runtime SQ deletion while any request is still outstanding, and
use separate teardown helpers so device exit can still release MCQ
queues after child devices have been unrealized.

Fixes: 5c079578d2e ("hw/ufs: Add support MCQ of UFSHCI 4.0")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jeuk Kim <jeuk20.kim@samsung.com>
(cherry picked from commit 619c2da19a05668dabe7912afb789e50b8635c4d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/ufs/trace-events b/hw/ufs/trace-events
index 7734b35f08..6f7ea9c95f 100644
--- a/hw/ufs/trace-events
+++ b/hw/ufs/trace-events
@@ -44,6 +44,7 @@ ufs_err_mcq_create_sq_invalid_size(uint8_t qid) "invalid mcq sq size for sqid %"
 ufs_err_mcq_create_sq_already_exists(uint8_t qid) "mcq sqid %"PRIu8 "already exists"
 ufs_err_mcq_delete_sq_invalid_sqid(uint8_t qid) "invalid mcq sqid %"PRIu8""
 ufs_err_mcq_delete_sq_not_exists(uint8_t qid) "mcq sqid %"PRIu8 "not exists"
+ufs_err_mcq_delete_sq_busy(uint8_t qid) "mcq sqid %"PRIu8" has outstanding requests"
 ufs_err_mcq_create_cq_invalid_cqid(uint8_t qid) "invalid mcq cqid %"PRIu8""
 ufs_err_mcq_create_cq_invalid_size(uint8_t qid) "invalid mcq cq size for cqid %"PRIu8""
 ufs_err_mcq_create_cq_already_exists(uint8_t qid) "mcq cqid %"PRIu8 "already exists"
diff --git a/hw/ufs/ufs.c b/hw/ufs/ufs.c
index 66f4031852..d63a8f9c9c 100644
--- a/hw/ufs/ufs.c
+++ b/hw/ufs/ufs.c
@@ -556,6 +556,31 @@ static bool ufs_mcq_create_sq(UfsHc *u, uint8_t qid, uint32_t attr)
     return true;
 }
 
+static bool ufs_mcq_sq_has_outstanding_req(UfsSq *sq)
+{
+    UfsRequest *req;
+    uint16_t free_reqs = 0;
+
+    QTAILQ_FOREACH(req, &sq->req_list, entry)
+    {
+        free_reqs++;
+    }
+
+    return free_reqs != sq->size;
+}
+
+static void ufs_mcq_free_sq(UfsSq *sq)
+{
+    qemu_bh_delete(sq->bh);
+
+    for (int i = 0; i < sq->size; i++) {
+        ufs_clear_req(&sq->req[i]);
+    }
+
+    g_free(sq->req);
+    g_free(sq);
+}
+
 static bool ufs_mcq_delete_sq(UfsHc *u, uint8_t qid)
 {
     UfsSq *sq;
@@ -572,9 +597,12 @@ static bool ufs_mcq_delete_sq(UfsHc *u, uint8_t qid)
 
     sq = u->sq[qid];
 
-    qemu_bh_delete(sq->bh);
-    g_free(sq->req);
-    g_free(sq);
+    if (ufs_mcq_sq_has_outstanding_req(sq)) {
+        trace_ufs_err_mcq_delete_sq_busy(qid);
+        return false;
+    }
+
+    ufs_mcq_free_sq(sq);
     u->sq[qid] = NULL;
     return true;
 }
@@ -617,6 +645,12 @@ static bool ufs_mcq_create_cq(UfsHc *u, uint8_t qid, uint32_t attr)
     return true;
 }
 
+static void ufs_mcq_free_cq(UfsCq *cq)
+{
+    qemu_bh_delete(cq->bh);
+    g_free(cq);
+}
+
 static bool ufs_mcq_delete_cq(UfsHc *u, uint8_t qid)
 {
     UfsCq *cq;
@@ -640,8 +674,7 @@ static bool ufs_mcq_delete_cq(UfsHc *u, uint8_t qid)
 
     cq = u->cq[qid];
 
-    qemu_bh_delete(cq->bh);
-    g_free(cq);
+    ufs_mcq_free_cq(cq);
     u->cq[qid] = NULL;
     return true;
 }
@@ -1884,12 +1917,14 @@ static void ufs_exit(PCIDevice *pci_dev)
 
     for (int i = 0; i < ARRAY_SIZE(u->sq); i++) {
         if (u->sq[i]) {
-            ufs_mcq_delete_sq(u, i);
+            ufs_mcq_free_sq(u->sq[i]);
+            u->sq[i] = NULL;
         }
     }
     for (int i = 0; i < ARRAY_SIZE(u->cq); i++) {
         if (u->cq[i]) {
-            ufs_mcq_delete_cq(u, i);
+            ufs_mcq_free_cq(u->cq[i]);
+            u->cq[i] = NULL;
         }
     }
 }
-- 
2.47.3
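
The busy test in the patch above works by counting: every request that
is not on the free list is assumed to be in flight somewhere.  A
reduced sketch of that idea, using a plain singly linked list instead
of QTAILQ, with hypothetical types demo_req and demo_sq:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request and queue types, for illustration only. */
struct demo_req {
    struct demo_req *next_free;
};

struct demo_sq {
    uint16_t size;              /* number of requests owned by the SQ */
    struct demo_req *free_list; /* requests currently idle */
};

/*
 * The queue is busy when fewer requests sit on the free list than the
 * queue owns; the missing ones are still referenced elsewhere and must
 * not be freed yet.
 */
static bool sq_busy(const struct demo_sq *sq)
{
    uint16_t free_reqs = 0;

    assert(sq != NULL);
    for (const struct demo_req *r = sq->free_list; r; r = r->next_free) {
        free_reqs++;
    }
    return free_reqs != sq->size;
}
```

The count is linear in the queue depth, which is acceptable here since
queue deletion is a rare control-path operation.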




* [Stable-10.2.3 123/149] hw/ufs: Zero reserved bytes in REPORT LUNS response header
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Jeuk Kim, Michael Tokarev

From: Jeuk Kim <jeuk20.kim@samsung.com>

ufs_emulate_report_luns() writes the 4-byte LUN list length into
outbuf[0..3] via stl_be_p() but leaves outbuf[4..7], the reserved
field, uninitialized. Those bytes are then DMA'd to guest memory,
leaking uninitialized QEMU stack data.

Fixes: 7708e298180 ("hw/ufs/lu: skip automatic zero-init of large array")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jeuk Kim <jeuk20.kim@samsung.com>
(cherry picked from commit 042dbcff8382393b20b716294a6c4b1a4af6b3f1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/ufs/lu.c b/hw/ufs/lu.c
index 3f3c9589ce..709d6adcf6 100644
--- a/hw/ufs/lu.c
+++ b/hw/ufs/lu.c
@@ -101,6 +101,10 @@ static int ufs_emulate_report_luns(UfsRequest *req, uint8_t *outbuf,
         return SCSI_COMMAND_FAIL;
     }
 
+    if (outbuf_len < 8) {
+        return SCSI_COMMAND_FAIL;
+    }
+    memset(outbuf, 0, 8);
     len += 8;
 
     for (uint8_t lun = 0; lun < UFS_MAX_LUS; ++lun) {
-- 
2.47.3
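
The header layout described above (a 4-byte big-endian length followed
by 4 reserved bytes) can be sketched as below.  build_header() is a
hypothetical stand-in that stores the length by hand rather than via
stl_be_p(); clearing all 8 bytes first means the reserved field never
carries stale stack contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: write an 8-byte response header into outbuf.
 * Bytes 0..3 hold list_len in big-endian order, bytes 4..7 are
 * reserved and left as zero.  Returns -1 if the buffer is too small.
 */
static int build_header(uint8_t *outbuf, size_t outbuf_len,
                        uint32_t list_len)
{
    assert(outbuf != NULL);

    if (outbuf_len < 8) {
        return -1;
    }
    memset(outbuf, 0, 8);
    outbuf[0] = (uint8_t)(list_len >> 24);
    outbuf[1] = (uint8_t)(list_len >> 16);
    outbuf[2] = (uint8_t)(list_len >> 8);
    outbuf[3] = (uint8_t)list_len;
    return 0;
}
```

Zeroing the whole header rather than only bytes 4..7 costs nothing and
keeps the code correct if the length store is later moved or removed.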




* [Stable-10.2.3 124/149] hw/display/cirrus_vga: Fix packed-24 color-expansion transparent pattern fills
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Peter Maydell, Junjie Cao,
	Philippe Mathieu-Daudé, Michael Tokarev

From: Peter Maydell <peter.maydell@linaro.org>

The Cirrus Logic VGA card has "pattern fill" blit modes where it
repeatedly copies an 8x8 source pattern to the display.  For the
"color expansion" subtype of these, the source pixel format is an 8x8
monochrome bitmap, and the destination can be any of 8, 16, 24 or
32bpp.  We implemented these wrong for the 24bpp case, in a way that
results in a complaint from the undefined-behavior sanitizer about a
shift by a negative value.

For these pattern fills, the GR2F register includes a field which
specifies how much to skip at the start of each scanline.  In the 8,
16 and 32 bit cases, this field is 3 bits and is a count of pixels to
skip.  We get this case right.  However, for the 24 bit case, the
field is 5 bits and is a count of destination bytes to skip.  We
tried to add support for 24-bit modes in commit ad81218e40e27 ("depth=24
write mask fix (Volker Ruppert)") in 2005.  However, we got this
wrong: when we need to skip, for example, 30 bytes in the
destination, this is 10 input pixels but the whole pattern is only 8
pixels wide, so we ended up with a negative bitpos for the first bit
to use in the pattern.

Fix the bug by masking srcskipleft in the 24-bit case so that it
correctly gives the first pixel to use in the pattern even if we skip
so many pixels that we have wrapped around to what would have been
the second copy of the pattern to the destination.

This patch was produced based on the information in the CL-GD5446
Technical Reference Manual, specifically sections 5.8 "GR2F: BLT
Destination Left-Side Clipping" and 9.4.8 "Pattern Fills".

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/3377
Fixes: ad81218e40e27 ("depth=24 write mask fix (Volker Ruppert)")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Junjie Cao <junjie.cao@intel.com>
Tested-by: Junjie Cao <junjie.cao@intel.com>
Message-ID: <20260410183249.4046456-2-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit aefeecb413a8e404ecb6d210cc32d60da176a336)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/display/cirrus_vga_rop2.h b/hw/display/cirrus_vga_rop2.h
index b208b7348a..8be35ec6e2 100644
--- a/hw/display/cirrus_vga_rop2.h
+++ b/hw/display/cirrus_vga_rop2.h
@@ -191,10 +191,29 @@ glue(glue(glue(cirrus_colorexpand_pattern_transp_, ROP_NAME), _),DEPTH)
     int x, y, bitpos, pattern_y;
     unsigned int bits, bits_xor;
     unsigned int col;
+
+    /*
+     * Copy from an 8x8 monochrome pattern with color expansion.
+     */
+
 #if DEPTH == 24
+    /*
+     * For packed-24 modes, GR2F bits [4:0] are a count of destination
+     * bytes to be suppressed for each scanline, which we keep in
+     * dstskipleft. Our srcskipleft is the number of pixels to skip
+     * within the 8x8 source pattern to match up with that number
+     * of suppressed bytes. As the pattern repeats every 8 bits we
+     * take the number of pixels mod 8.
+     */
     int dstskipleft = s->vga.gr[0x2f] & 0x1f;
-    int srcskipleft = dstskipleft / 3;
+    int srcskipleft = (dstskipleft / 3) & 0x7;
 #else
+    /*
+     * In all other modes, GR2F bits [2:0] are a count of how many
+     * destination pixels to suppress for each scanline, which is our
+     * srcskipleft. We get dstskipleft, the number of bytes to skip,
+     * by multiplying this by the bytes-per-pixel.
+     */
     int srcskipleft = s->vga.gr[0x2f] & 0x07;
     int dstskipleft = srcskipleft * (DEPTH / 8);
 #endif
-- 
2.47.3
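
The arithmetic from the commit message can be checked with a few lines
of C.  This sketch assumes that the first bit position in the pattern
is computed as 7 minus the pixel skip; that line is not part of the
hunk shown above, and first_bitpos() is a hypothetical name.

```c
#include <assert.h>

/*
 * Hypothetical model of the packed-24 case.  gr2f is the raw register
 * value; 'masked' selects the fixed behaviour (pixel skip taken mod 8)
 * or the old behaviour (no mask).
 */
static int first_bitpos(int gr2f, int masked)
{
    int dstskipleft = gr2f & 0x1f;      /* destination bytes to skip */
    int srcskipleft = dstskipleft / 3;  /* 3 bytes per 24bpp pixel */

    assert(gr2f >= 0);
    if (masked) {
        srcskipleft &= 0x7;             /* pattern repeats every 8 pixels */
    }
    return 7 - srcskipleft;
}
```

For a skip of 30 bytes the unmasked version gives 7 - 10 = -3, the
negative shift count the sanitizer complained about, while the masked
version gives 7 - 2 = 5.  For skips of less than 24 bytes both versions
agree.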


* [Stable-10.2.3 125/149] hw/display/cirrus_vga: Fix packed-24 color-expansion transparent copies
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Peter Maydell, Junjie Cao,
	Philippe Mathieu-Daudé, Michael Tokarev

From: Peter Maydell <peter.maydell@linaro.org>

For the "color expansion" subtype of raster operations, the source
pixel format is a monochrome bitmap, and the destination can be any
of 8, 16, 24 or 32bpp.

For these pattern operations, the GR2F register includes a field
which specifies how much to skip at the start of each scanline.  In
the 8, 16 and 32 bit cases, this field is 3 bits and is a count of
pixels to skip.  We get this case right.  However, for the 24 bit
case, the field is 5 bits and is a count of destination bytes to
skip.

In commit ad81218e40e27 ("depth=24 write mask fix (Volker Ruppert)")
in 2005, we updated the code to (attempt to) handle the 5-bit mask
case.  However, we don't do the right thing when the 5-bit mask
indicates that we need to skip more than 8 bits of the input bitmap:
we will right-shift the 0x80 constant completely off the right hand
side, and will be off-by-one for all the source bitmap loads.

Fix this by calculating the whole number of input bytes we need to
skip and the residual number of bits.  In the 8/16/32bpp case the
bytes to skip is always zero.

Cc: qemu-stable@nongnu.org
Fixes: ad81218e40e27 ("depth=24 write mask fix (Volker Ruppert)")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Junjie Cao <junjie.cao@intel.com>
Tested-by: Junjie Cao <junjie.cao@intel.com>
Message-ID: <20260410183249.4046456-3-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 27d14251b904e6dd60c1053a893b52e085f48a3a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/display/cirrus_vga_rop2.h b/hw/display/cirrus_vga_rop2.h
index 8be35ec6e2..33f9b3b613 100644
--- a/hw/display/cirrus_vga_rop2.h
+++ b/hw/display/cirrus_vga_rop2.h
@@ -108,12 +108,34 @@ glue(glue(glue(cirrus_colorexpand_transp_, ROP_NAME), _),DEPTH)
     unsigned int col;
     unsigned bitmask;
     unsigned index;
+
+    /*
+     * Raster ops where the source is a monochrome bitmap with
+     * color expansion to 8/16/24/32bpp destination.
+     */
+
 #if DEPTH == 24
+    /*
+     * For packed-24 modes, GR2F bits [4:0] are a count of destination
+     * bytes to be suppressed for each scanline, which we keep in
+     * dstskipleft. We want to track the number of whole bytes
+     * to skip in the source (always either 0 or 1) and the number
+     * of bits within the byte to skip.
+     */
     int dstskipleft = s->vga.gr[0x2f] & 0x1f;
-    int srcskipleft = dstskipleft / 3;
+    int srcskipleftbits = (dstskipleft / 3) & 0x7;
+    int srcskipleftbytes = (dstskipleft / 3) >> 3;
 #else
-    int srcskipleft = s->vga.gr[0x2f] & 0x07;
-    int dstskipleft = srcskipleft * (DEPTH / 8);
+    /*
+     * In all other modes, GR2F bits [2:0] are a count of how many
+     * destination pixels to suppress for each scanline, which is our
+     * srcskipleftbits. We get dstskipleft, the number of bytes to
+     * skip, by multiplying this by the bytes-per-pixel. In these
+     * modes we never need to skip an entire source byte.
+     */
+    int srcskipleftbits = s->vga.gr[0x2f] & 0x07;
+    int srcskipleftbytes = 0;
+    int dstskipleft = srcskipleftbits * (DEPTH / 8);
 #endif
 
     if (s->cirrus_blt_modeext & CIRRUS_BLTMODEEXT_COLOREXPINV) {
@@ -125,7 +147,8 @@ glue(glue(glue(cirrus_colorexpand_transp_, ROP_NAME), _),DEPTH)
     }
 
     for(y = 0; y < bltheight; y++) {
-        bitmask = 0x80 >> srcskipleft;
+        bitmask = 0x80 >> srcskipleftbits;
+        srcaddr += srcskipleftbytes;
         bits = cirrus_src(s, srcaddr++) ^ bits_xor;
         addr = dstaddr + dstskipleft;
         for (x = dstskipleft; x < bltwidth; x += (DEPTH / 8)) {
-- 
2.47.3
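
For readers following the arithmetic, here is a minimal standalone sketch
of the byte/bit split described in the commit message above.  It is not
QEMU code: struct skip24, split_skip24() and first_bitmask() are
hypothetical names, and only the expressions mirror the patch.

```c
#include <stdint.h>

/* Hypothetical helper type, not part of QEMU. */
struct skip24 {
    int dstskipleft;       /* destination bytes to skip, GR2F bits [4:0] */
    int srcskipleftbits;   /* bits to skip within the first source byte */
    int srcskipleftbytes;  /* whole source bytes to skip (0 or 1) */
};

/* Split the packed-24 skip count into source bytes and residual bits. */
static struct skip24 split_skip24(uint8_t gr2f)
{
    struct skip24 r;
    int pixels;

    r.dstskipleft = gr2f & 0x1f;
    /* 3 destination bytes per pixel, 1 source bit per pixel */
    pixels = r.dstskipleft / 3;
    r.srcskipleftbits = pixels & 0x7;
    r.srcskipleftbytes = pixels >> 3;
    return r;
}

/* Initial per-scanline bitmask; the shift count is always below 8. */
static unsigned first_bitmask(struct skip24 s)
{
    return 0x80u >> s.srcskipleftbits;
}
```

With a GR2F value of 27 the count is 27 destination bytes, that is 9
pixels, which becomes one whole source byte plus one bit.  The old single
shift, 0x80 >> 9, would have produced a zero bitmask.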




* [Stable-10.2.3 126/149] hw/misc/aspeed_sbc: Add bounds checking for OTP write operations
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Kane Chen, Peter Maydell, Cédric Le Goater,
	Michael Tokarev

From: Kane Chen <kane_chen@aspeedtech.com>

There is a mismatch between the Aspeed OTP model and the Aspeed SBC
model in how the guest-provided address is handled.
aspeed_sbc_otp_prog() passes a word-indexed address directly
to address_space_write() without converting it to a byte offset,
whereas aspeed_otp_write() expects a byte offset and applies an
additional shift (otp_addr << 2). This double-shift confusion means
that an out-of-range word address can lead to a write beyond the
allocated storage.

Fix this by adding bounds checking on the word offset before
converting to byte offset and passing to address_space_write().
This matches the existing bounds check in aspeed_sbc_otp_read().

Cc: Kane-Chen-AS <kane_chen@aspeedtech.com>
Cc: qemu-stable@nongnu.org
Fixes: 1a00754ccf15 ("hw/misc: Add Aspeed Secure Boot Controller model")
Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/3436
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Kane-Chen-AS <kane_chen@aspeedtech.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Link: https://lore.kernel.org/qemu-devel/20260428055254.76581-2-kane_chen@aspeedtech.com
[ clg: Kept otp_addr in event logged in aspeed_sbc_otp_prog() ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit ff36712da5ae73aca5a044fe5e61c585d427013a)
(Mjt: actual Fixes: tag is this one)
Fixes: 9f58dd0a8c30 ("hw/misc/aspeed_sbc: Connect ASPEED OTP memory device to SBC")
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/misc/aspeed_sbc.c b/hw/misc/aspeed_sbc.c
index 2fc5db749d..db92364fb8 100644
--- a/hw/misc/aspeed_sbc.c
+++ b/hw/misc/aspeed_sbc.c
@@ -159,9 +159,17 @@ static bool aspeed_sbc_otp_prog(AspeedSBCState *s,
     MemTxResult ret;
     AspeedOTPState *otp = &s->otp;
     uint32_t value = s->regs[R_CAMP1];
+    uint32_t otp_offset = otp_addr << 2;
 
-    ret = address_space_write(&otp->as, otp_addr, MEMTXATTRS_UNSPECIFIED,
-                        &value, sizeof(value));
+    if (otp_addr >= OTP_TOTAL_DWORD_COUNT) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "Invalid OTP addr 0x%x\n",
+                      otp_addr);
+        return false;
+    }
+
+    ret = address_space_write(&otp->as, otp_offset, MEMTXATTRS_UNSPECIFIED,
+                              &value, sizeof(value));
     if (ret != MEMTX_OK) {
         qemu_log_mask(LOG_GUEST_ERROR,
                       "Failed to write OTP memory, addr = %x\n",
diff --git a/hw/nvram/aspeed_otp.c b/hw/nvram/aspeed_otp.c
index dcf8ed3917..605e96139a 100644
--- a/hw/nvram/aspeed_otp.c
+++ b/hw/nvram/aspeed_otp.c
@@ -57,12 +57,12 @@ static bool valid_program_data(uint32_t otp_addr,
     return has_programmable_bits != 0;
 }
 
-static bool program_otpmem_data(void *opaque, uint32_t otp_addr,
+static bool program_otpmem_data(void *opaque, hwaddr otp_offset,
                              uint32_t prog_bit, uint32_t *value)
 {
     AspeedOTPState *s = opaque;
+    uint32_t otp_addr = otp_offset >> 2;
     bool is_odd = otp_addr & 1;
-    uint32_t otp_offset = otp_addr << 2;
 
     memcpy(value, s->storage + otp_offset, sizeof(uint32_t));
 
@@ -79,26 +79,25 @@ static bool program_otpmem_data(void *opaque, uint32_t otp_addr,
     return true;
 }
 
-static void aspeed_otp_write(void *opaque, hwaddr otp_addr,
+static void aspeed_otp_write(void *opaque, hwaddr otp_offset,
                                 uint64_t val, unsigned size)
 {
     AspeedOTPState *s = opaque;
-    uint32_t otp_offset, value;
+    uint32_t value;
 
-    if (!program_otpmem_data(s, otp_addr, val, &value)) {
+    if (!program_otpmem_data(s, otp_offset, val, &value)) {
         qemu_log_mask(LOG_GUEST_ERROR,
                       "%s: Failed to program data, value = %x, bit = %"PRIx64"\n",
                       __func__, value, val);
         return;
     }
 
-    otp_offset = otp_addr << 2;
     memcpy(s->storage + otp_offset, &value, size);
 
     if (s->blk) {
         if (blk_pwrite(s->blk, otp_offset, size, &value, 0) < 0) {
             qemu_log_mask(LOG_GUEST_ERROR,
-                          "%s: Failed to write %x to %x\n",
+                          "%s: Failed to write %x to %"HWADDR_PRIx"\n",
                           __func__, value, otp_offset);
 
             return;
-- 
2.47.3
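
The "check the word index, then convert to a byte offset exactly once"
pattern from the commit message above can be sketched in isolation as
follows.  This is an illustration under assumptions: DEMO_DWORD_COUNT,
demo_storage and demo_otp_prog() are made-up names, and the size is
arbitrary rather than the real OTP_TOTAL_DWORD_COUNT.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Illustrative size only; the real OTP_TOTAL_DWORD_COUNT lives in QEMU. */
#define DEMO_DWORD_COUNT 16u

static uint8_t demo_storage[DEMO_DWORD_COUNT * 4];

/* Takes a word index, validates it, and only then makes a byte offset. */
static bool demo_otp_prog(uint32_t word_addr, uint32_t value)
{
    uint32_t byte_offset;

    if (word_addr >= DEMO_DWORD_COUNT) {
        return false;               /* out-of-range word index rejected */
    }
    byte_offset = word_addr << 2;   /* the only shift on this path */
    memcpy(demo_storage + byte_offset, &value, sizeof(value));
    return true;
}
```

Keeping the shift in the caller and passing a byte offset downwards means
every later layer can treat the address uniformly as an offset into the
storage.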




* [Stable-10.2.3 127/149] aspeed/hace: Fix out-of-bounds read in has_padding()
From: Michael Tokarev @ 2026-05-22 21:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Cédric Le Goater, Katherine Leaver,
	Michael Tokarev

From: Cédric Le Goater <clg@redhat.com>

The has_padding() function reads the last 8 bytes of a DMA buffer
without validating req_len. req_len is guest-controlled (via
R_HASH_SRC_LEN register or scatter-gather entries) and values less
than 8 cause integer underflow. This can result in an out-of-bounds
read of QEMU process memory.

Add a check to ensure req_len >= 8 before accessing the buffer.

Reported-by: Katherine Leaver <katherine.j.leaver@gmail.com>
Cc: qemu-stable@nongnu.org
Fixes: 5cd7d8564a8b ("aspeed/hace: Support AST2600 HACE")
Link: https://lore.kernel.org/qemu-devel/20260504213421.710035-2-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit 534a52755befa7b8d49c921f8dc964185903efae)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/misc/aspeed_hace.c b/hw/misc/aspeed_hace.c
index 726368fbbc..c7a2731e38 100644
--- a/hw/misc/aspeed_hace.c
+++ b/hw/misc/aspeed_hace.c
@@ -154,6 +154,14 @@ static bool has_padding(AspeedHACEState *s, struct iovec *iov,
                         hwaddr req_len, uint32_t *total_msg_len,
                         uint32_t *pad_offset)
 {
+    /* Need at least 8 bytes to read the total message length field */
+    if (req_len < 8) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: invalid request length=0x%" HWADDR_PRIx "\n",
+                      __func__, req_len);
+        return false;
+    }
+
     *total_msg_len = (uint32_t)(ldq_be_p(iov->iov_base + req_len - 8) / 8);
     /*
      * SG_LIST_LEN_LAST asserted in the request length doesn't mean it is the
-- 
2.47.3
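
To make the underflow described above concrete, here is a small
standalone sketch of a guarded "read the last 8 bytes" helper.
read_trailing_bit_length() is a hypothetical stand-in for the tail read
in has_padding(), and the big-endian load is open-coded instead of using
ldq_be_p().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the tail read in has_padding(). */
static bool read_trailing_bit_length(const uint8_t *buf, size_t req_len,
                                     uint64_t *bit_len)
{
    uint64_t v = 0;
    size_t i;

    if (req_len < 8) {
        /* req_len - 8 would wrap around for an unsigned type */
        return false;
    }
    for (i = 0; i < 8; i++) {
        v = (v << 8) | buf[req_len - 8 + i];   /* big-endian 64-bit load */
    }
    *bit_len = v;
    return true;
}
```

On rejection the output parameter is left untouched, so a caller cannot
mistake stale data for a freshly read length.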




* [Stable-10.2.3 128/149] aspeed/hace: Prevent total_req_len overflow
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Cédric Le Goater, Katherine Leaver,
	Michael Tokarev

From: Cédric Le Goater <clg@redhat.com>

In accumulate mode, total_req_len is incremented with plen (hwaddr)
for each hash request. Repeated additions can overflow total_req_len
(uint32_t) and potentially bypass validation checks in has_padding().

Add a helper function to detect overflow before incrementing
total_req_len and reject the request if overflow would occur.

Reported-by: Katherine Leaver <katherine.j.leaver@gmail.com>
Cc: qemu-stable@nongnu.org
Fixes: 5cd7d8564a8b ("aspeed/hace: Support AST2600 HACE")
Link: https://lore.kernel.org/qemu-devel/20260504213421.710035-3-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit c6aa2d0ac161f2a58a8fbab9a15e846278661158)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/misc/aspeed_hace.c b/hw/misc/aspeed_hace.c
index c7a2731e38..f08b7ae376 100644
--- a/hw/misc/aspeed_hace.c
+++ b/hw/misc/aspeed_hace.c
@@ -205,6 +205,19 @@ static uint64_t hash_get_source_addr(AspeedHACEState *s)
     return src_addr;
 }
 
+static bool hash_accumulate_len(AspeedHACEState *s, hwaddr plen)
+{
+    if (plen > UINT32_MAX - s->total_req_len) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: total_req_len overflow, current=0x%x, adding=0x%"
+                      HWADDR_PRIx "\n", __func__, s->total_req_len, plen);
+        return false;
+    }
+
+    s->total_req_len += plen;
+    return true;
+}
+
 static int hash_prepare_direct_iov(AspeedHACEState *s, struct iovec *iov,
                                    bool acc_mode, bool *acc_final_request)
 {
@@ -232,7 +245,9 @@ static int hash_prepare_direct_iov(AspeedHACEState *s, struct iovec *iov,
     iov_idx = 1;
 
     if (acc_mode) {
-        s->total_req_len += plen;
+        if (!hash_accumulate_len(s, plen)) {
+            return -1;
+        }
 
         if (has_padding(s, &iov[0], plen, &total_msg_len,
                         &pad_offset)) {
@@ -299,7 +314,9 @@ static int hash_prepare_sg_iov(AspeedHACEState *s, struct iovec *iov,
 
         iov[iov_idx].iov_base = haddr;
         if (acc_mode) {
-            s->total_req_len += plen;
+            if (!hash_accumulate_len(s, plen)) {
+                return -1;
+            }
 
             if (has_padding(s, &iov[iov_idx], plen, &total_msg_len,
                             &pad_offset)) {
-- 
2.47.3
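
The overflow test used by the helper above can be tried out on its own.
The sketch below is a standalone rewrite under assumptions:
accumulate_len() is a hypothetical name and uint64_t stands in for
hwaddr.

```c
#include <stdbool.h>
#include <stdint.h>

/* Adds plen to *total only if the sum still fits in uint32_t. */
static bool accumulate_len(uint32_t *total, uint64_t plen)
{
    /* UINT32_MAX - *total cannot wrap, so this comparison is safe. */
    if (plen > UINT32_MAX - *total) {
        return false;
    }
    *total += (uint32_t)plen;
    return true;
}
```

Comparing against the remaining headroom, rather than adding first and
checking afterwards, avoids ever computing the wrapped value.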




* [Stable-10.2.3 129/149] hw/i2c/microbit_i2c: Don't index off end of twi_read_sequence[]
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Peter Maydell, Philippe Mathieu-Daudé,
	Michael Tokarev

From: Peter Maydell <peter.maydell@linaro.org>

If the guest tries to read more bytes from our fake stub I2C device
than we have provided, we incorrectly read one byte beyond the end of
this array. Avoid this, and instead keep reporting the RXD register
as containing the last byte of the "data transfer".

Cc: qemu-stable@nongnu.org
Fixes: 9d68bf564ec ("arm: Stub out NRF51 TWI magnetometer/accelerometer detection")
Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/3408
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20260501162634.4092394-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit a824f3531a44cbd19bcd9dd0ca48e5805c781e02)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/i2c/microbit_i2c.c b/hw/i2c/microbit_i2c.c
index 2291d6370e..d9689b6f1a 100644
--- a/hw/i2c/microbit_i2c.c
+++ b/hw/i2c/microbit_i2c.c
@@ -41,8 +41,13 @@ static uint64_t microbit_i2c_read(void *opaque, hwaddr addr, unsigned int size)
         data = 0x01;
         break;
     case NRF51_TWI_REG_RXD:
+        /*
+         * Return the next byte from our fake data sequence. If
+         * the guest keeps reading the register after that, keep
+         * returning the same last byte value.
+         */
         data = twi_read_sequence[s->read_idx];
-        if (s->read_idx < G_N_ELEMENTS(twi_read_sequence)) {
+        if (s->read_idx + 1 < G_N_ELEMENTS(twi_read_sequence)) {
             s->read_idx++;
         }
         break;
-- 
2.47.3
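
The corrected index handling above can be exercised with a tiny
standalone model.  The array contents and the demo_* names are made up
for illustration; the real twi_read_sequence[] data differs.

```c
#include <stddef.h>
#include <stdint.h>

/* Made-up data; the real twi_read_sequence[] contents differ. */
static const uint8_t demo_sequence[] = { 0x11, 0x22, 0x33 };
#define DEMO_LEN (sizeof(demo_sequence) / sizeof(demo_sequence[0]))

static size_t demo_read_idx;

static uint8_t demo_read_rxd(void)
{
    uint8_t data = demo_sequence[demo_read_idx];

    /* Advance only while a next element exists, so the index stays valid. */
    if (demo_read_idx + 1 < DEMO_LEN) {
        demo_read_idx++;
    }
    return data;
}
```

With the old condition (idx < length) the index could reach the array
length itself, and the following read would land one element past the
end.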




* [Stable-10.2.3 130/149] meson.build: Add -fzero-init-padding-bits=all
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Peter Maydell, Daniel P. Berrangé,
	Richard Henderson, Pierrick Bouvier, Michael Tokarev

From: Peter Maydell <peter.maydell@linaro.org>

The C standard doesn't always guarantee that struct and union padding
bits are zero initialized, even if the code initializes a struct.
For QEMU, this is potentially problematic, because we often have
structs that match data structures in guest memory, where we
initialize them and then bulk copy them into the guest.  If the
compiler didn't zero init the whole of the memory containing the
struct, we could potentially leak random data from the host into the
guest via the padding bytes.

We already use -ftrivial-auto-var-init=zero, which will zero out
padding in many of these cases, but -fzero-init-padding-bits=all
closes some gaps, for example cases where we initialize a
variable with a struct initializer, and cases involving unions.

Follow the Linux kernel in using both options. Compare kernel
commit dce4aab8441 ("kbuild: Use -fzero-init-padding-bits=all").

This option exists in gcc-15 and above; it's not supported
by clang, but clang documents that it guarantees zero init
of these cases always:
https://clang.llvm.org/docs/LanguageExtensions.html#union-and-aggregate-initialization-in-c
Older gcc versions, which don't have the option, behave as if it were set.

(These options are passed through the cc.get_supported_arguments()
filter, so we don't need to do anything extra to avoid passing it to
a compiler that doesn't recognize it.)

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@oss.qualcomm.com>
Message-id: 20260508104723.2144051-1-peter.maydell@linaro.org
(cherry picked from commit a163fc1f864bef27f6e527cbad9defba7af9e60a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/meson.build b/meson.build
index 5ba29bc07d..7ae0d109ab 100644
--- a/meson.build
+++ b/meson.build
@@ -703,6 +703,12 @@ hardening_flags = [
     # it harder to take advantage of uninitialized stack
     # data to drive exploits
     '-ftrivial-auto-var-init=zero',
+    # Ensure GCC zero-initializes padding bits and trailing fields in
+    # unions. This avoids potentially leaking host data into the guest
+    # when we init a struct and copy it into guest memory.  GCC prior
+    # to GCC 15 and clang don't have this, but they zero the padding
+    # and trailing portions of a union by default.
+    '-fzero-init-padding-bits=all',
 ]

 # Zero out registers used during a function call
-- 
2.47.3
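
As an illustration of where such padding lives, the sketch below measures
it for a small struct.  struct demo_desc is a hypothetical guest-visible
descriptor, not a real QEMU structure, and the exact amount of padding
depends on the ABI (typically 3 bytes here).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest-visible descriptor, not a real QEMU structure. */
struct demo_desc {
    uint8_t  type;    /* 1 byte */
    uint32_t length;  /* usually 4-byte aligned, so padding precedes it */
};

/* Number of bytes in the struct that belong to no member. */
static size_t demo_padding_bytes(void)
{
    return sizeof(struct demo_desc) - (sizeof(uint8_t) + sizeof(uint32_t));
}

/* Offset of the first byte after 'type', where any padding begins. */
static size_t demo_padding_start(void)
{
    return offsetof(struct demo_desc, type) + sizeof(uint8_t);
}
```

If such a struct is initialized field by field and then copied into
guest memory as a whole, the padding bytes go along with it, which is
the leak the commit message is concerned about.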


* [Stable-10.2.3 131/149] tests/functional/qemu_test/asset.py: Don't use setxattr when it doesn't exist
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Peter Maydell, Daniel P. Berrangé,
	Alex Bennée, Thomas Huth, Michael Tokarev

From: Peter Maydell <peter.maydell@linaro.org>

The Python os.setxattr() API is Linux-specific, so trying to use
it on other OSes triggers a failure:

  File "/Users/pm215/src/qemu/tests/functional/qemu_test/asset.py", line 227, in fetch
    os.setxattr(str(tmp_cache_file), "user.qemu-asset-url",
    ^^^^^^^^^^^
AttributeError: module 'os' has no attribute 'setxattr'

Since we only set the attributes here for informational
purposes, skip them when os.setxattr() isn't available.

Cc: qemu-stable@nongnu.org
Fixes: 9903217a4ed013 ("tests/functional: add a module for handling asset download & caching")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <th.huth+qemu@posteo.eu>
Message-id: 20260501115506.3792110-1-peter.maydell@linaro.org
(cherry picked from commit 039b057c09c6a9742b91b8f29604651ba3fdb558)
(Mjt: context fixup for 10.2.x)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/tests/functional/qemu_test/asset.py b/tests/functional/qemu_test/asset.py
index bae40765ce..ef0dff921d 100644
--- a/tests/functional/qemu_test/asset.py
+++ b/tests/functional/qemu_test/asset.py
@@ -207,11 +207,14 @@ def fetch(self):
             raise AssetError(self, "Download retries exceeded", transient=True)
 
         try:
-            # Set these just for informational purposes
-            os.setxattr(str(tmp_cache_file), "user.qemu-asset-url",
-                        self.url.encode('utf8'))
-            os.setxattr(str(tmp_cache_file), "user.qemu-asset-hash",
-                        self.hash.encode('utf8'))
+            # Set these just for informational purposes. Note that
+            # setxattr is Linux-only; as this is only informational
+            # we can simply skip it on other platforms.
+            if hasattr(os, "setxattr"):
+                os.setxattr(str(tmp_cache_file), "user.qemu-asset-url",
+                            self.url.encode('utf8'))
+                os.setxattr(str(tmp_cache_file), "user.qemu-asset-hash",
+                            self.hash.encode('utf8'))
         except Exception as e:
             self.log.debug("Unable to set xattr on %s: %s", tmp_cache_file, e)
 
-- 
2.47.3




* [Stable-10.2.3 132/149] hw/nvme: fix admin cq msix setup
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Klaus Jensen, Andreas Hindborg, Michael Tokarev

From: Klaus Jensen <k.jensen@samsung.com>

If MSI-X is not enabled when the admin completion queue is created,
msix_vector_use() is not called. But, if MSI-X is subsequently enabled,
msix_notify() will fail to fire the interrupt because the use count for
the vector remains at 0.

msix_vector_use/unuse should be called if MSI-X is *present*, not
*enabled*. Fix this.

Cc: qemu-stable@nongnu.org
Reported-by: Andreas Hindborg <a.hindborg@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
(cherry picked from commit 2293d8b4bd88d3f29730cfd608935f77247919b6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index be6c7028cb..8463fd3e9a 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -5517,7 +5517,7 @@ static void nvme_free_cq(NvmeCQueue *cq, NvmeCtrl *n)
         event_notifier_set_handler(&cq->notifier, NULL);
         event_notifier_cleanup(&cq->notifier);
     }
-    if (msix_enabled(pci) && cq->irq_enabled) {
+    if (msix_present(pci) && cq->irq_enabled) {
         msix_vector_unuse(pci, cq->vector);
     }
     if (cq->cqid) {
@@ -5558,7 +5558,7 @@ static void nvme_init_cq(NvmeCQueue *cq, NvmeCtrl *n, uint64_t dma_addr,
 {
     PCIDevice *pci = PCI_DEVICE(n);
 
-    if (msix_enabled(pci) && irq_enabled) {
+    if (msix_present(pci) && irq_enabled) {
         msix_vector_use(pci, vector);
     }
 
-- 
2.47.3
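
The present-versus-enabled distinction from the commit message above can
be pictured with a toy model.  This is only a sketch of the described
behaviour: struct toy_msix and the toy_* functions are invented here and
do not reflect QEMU's actual MSI-X data structures.

```c
#include <stdbool.h>

/* Toy model only; these are not QEMU's MSI-X data structures. */
struct toy_msix {
    bool present;     /* device has an MSI-X capability */
    bool enabled;     /* guest has set the MSI-X enable bit */
    int  use_count;   /* use count of the queue's vector */
};

/* Queue creation: take a vector reference whenever MSI-X is present. */
static void toy_init_cq(struct toy_msix *m)
{
    if (m->present) {
        m->use_count++;
    }
}

/* In this model, delivery needs MSI-X enabled and a non-zero use count. */
static bool toy_notify(const struct toy_msix *m)
{
    return m->enabled && m->use_count > 0;
}
```

Because the reference is taken at queue creation regardless of the enable
bit, a guest that enables MSI-X later still gets its interrupt.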




* [Stable-10.2.3 133/149] linux-user: Fix AT_EXECFN in AUXV for symlinked programs
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Helge Deller, Michael Tokarev

From: Helge Deller <deller@gmx.de>

The AT_EXECFN entry in AUXV needs to keep the value which was used when
the program was started.  In particular, for symlinked programs QEMU
should not replace it with the resolved realpath.

Here is a reproducer:
(arm64-chroot)root@p100:/# cd /usr/bin
(arm64-chroot)root@p100:/usr/bin# ln -s echo testprog
(arm64-chroot)root@p100:/usr/bin# LD_SHOW_AUXV=1 ./testprog | grep AT_EXECFN
AT_EXECFN:            ./testprog

In this example, "./testprog" is the correct output, and not "/usr/bin/echo".

This patch fixes parts of commit 258bec39 ("linux-user: Fix access to
/proc/self/exe").

Fixes: 258bec39 ("linux-user: Fix access to /proc/self/exe")
Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/3379
Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit 6b5aef7cac9dab7c16451588cff6615eb2048293)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/linux-user/main.c b/linux-user/main.c
index 86d04cca3c..c08c73fd80 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -772,8 +772,10 @@ int main(int argc, char **argv, char **envp)
     }
 
     /* Resolve executable file name to full path name */
-    if (realpath(exec_path, real_exec_path)) {
-        exec_path = real_exec_path;
+    /* Keep how we started the program in exec_path, e.g. "./my_program" */
+    /* Store real path in real_exec_path, e.g. "/usr/local/bin/my_program" */
+    if (!realpath(exec_path, real_exec_path)) {
+        printf("Could not resolve %s\n", exec_path);
     }
 
     /*
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index bb818f35d9..5c101c19e4 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8799,9 +8799,9 @@ static int maybe_do_fake_open(CPUArchState *cpu_env, int dirfd,
             return -1;
         }
         if (safe) {
-            return safe_openat(dirfd, exec_path, flags, mode);
+            return safe_openat(dirfd, real_exec_path, flags, mode);
         } else {
-            return openat(dirfd, exec_path, flags, mode);
+            return openat(dirfd, real_exec_path, flags, mode);
         }
     }
 
@@ -8934,9 +8934,9 @@ ssize_t do_guest_readlink(const char *pathname, char *buf, size_t bufsiz)
          * Don't worry about sign mismatch as earlier mapping
          * logic would have thrown a bad address error.
          */
-        ret = MIN(strlen(exec_path), bufsiz);
+        ret = MIN(strlen(real_exec_path), bufsiz);
         /* We cannot NUL terminate the string. */
-        memcpy(buf, exec_path, ret);
+        memcpy(buf, real_exec_path, ret);
     } else {
         ret = readlink(path(pathname), buf, bufsiz);
     }
@@ -9027,7 +9027,7 @@ static int do_execv(CPUArchState *cpu_env, int dirfd,
 
     const char *exe = p;
     if (is_proc_myself(p, "exe")) {
-        exe = exec_path;
+        exe = real_exec_path;
     }
     ret = is_execveat
         ? safe_execveat(dirfd, exe, argp, envp, flags)
@@ -11038,9 +11038,9 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
                  * Don't worry about sign mismatch as earlier mapping
                  * logic would have thrown a bad address error.
                  */
-                ret = MIN(strlen(exec_path), arg4);
+                ret = MIN(strlen(real_exec_path), arg4);
                 /* We cannot NUL terminate the string. */
-                memcpy(p2, exec_path, ret);
+                memcpy(p2, real_exec_path, ret);
             } else {
                 ret = get_errno(readlinkat(arg1, path(p), p2, arg4));
             }
diff --git a/linux-user/user-internals.h b/linux-user/user-internals.h
index 24d35998f0..7730444aa5 100644
--- a/linux-user/user-internals.h
+++ b/linux-user/user-internals.h
@@ -24,6 +24,7 @@
 #include "exec/translation-block.h"
 
 extern char *exec_path;
+extern char real_exec_path[PATH_MAX];
 void init_task_state(TaskState *ts);
 void task_settid(TaskState *);
 void stop_all_tasks(void);
-- 
2.47.3
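
The idea of keeping both the invocation path and the resolved path, as
the patch above does with exec_path and real_exec_path, can be sketched
standalone like this.  The demo_* names are hypothetical, and the sketch
assumes a POSIX system that provides realpath().

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for realpath() under strict -std=c11 */
#endif
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

static const char *demo_exec_path;          /* as given at startup */
static char demo_real_exec_path[PATH_MAX];  /* symlinks resolved */

/* Returns 0 on success, -1 if the path could not be resolved. */
static int demo_set_exec_path(const char *path)
{
    assert(path != NULL);
    demo_exec_path = path;   /* never overwritten with the resolved path */
    return realpath(path, demo_real_exec_path) ? 0 : -1;
}
```

The first variable is what would be reported back to the program as its
own name, while the second is what gets used when the file actually has
to be opened or executed.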




* [Stable-10.2.3 134/149] linux-user/sh4: Fix target_ucontext tuc_link field type
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Matt Turner, Richard Henderson, Helge Deller,
	Michael Tokarev

From: Matt Turner <mattst88@gmail.com>

tuc_link is declared as 'struct target_ucontext *', which is a HOST
pointer.  On a 64-bit host running a 32-bit SH4 target, this is 8 bytes
instead of the 4 bytes the target expects; together with alignment
padding, this pushes tuc_mcontext 8 bytes past its correct offset.

When a signal handler receives ucontext_t *, every field accessed through
uc_mcontext (gregs[], pc, pr, ...) is read from the wrong address.  In
particular the saved PC comes back as a garbage stack value, which breaks
any code that initialises a libunwind cursor from the signal context.

Fix it by using abi_ulong, which is always sized to the target ABI (4
bytes for SH4), matching the layout the kernel and glibc agree on.  This
is the same pattern used by arm/signal.c.

Also remove the (unsigned long *) cast from the __put_user that zeros
tuc_link.  The cast was harmless when tuc_link was pointer-sized (8
bytes matching unsigned long on a 64-bit host), but after the type
change __put_user's sizeof dispatch would select stq_le_p (8-byte write)
for a now-4-byte field, silently overwriting the start of tuc_stack.

Neither this fix nor the companion setup_sigtramp fix is independently
sufficient: this fix corrects register values read from the signal context
but libunwind still cannot detect the frame without the correct trampoline
pattern; that fix makes the frame detectable but register reads remain
garbage without the correct ucontext layout.  Together they fix the
following libunwind tests on a 64-bit host:
  Gtest-sig-context, Gtest-trace, Ltest-init-local-signal,
  Ltest-sig-context, Ltest-trace

Signed-off-by: Matt Turner <mattst88@gmail.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit c3176e645774fcb795bf99c1c6c40c67432232db)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/linux-user/sh4/signal.c b/linux-user/sh4/signal.c
index 9ecc026fae..20d2bc8b2c 100644
--- a/linux-user/sh4/signal.c
+++ b/linux-user/sh4/signal.c
@@ -57,7 +57,7 @@ struct target_sigframe

 struct target_ucontext {
     target_ulong tuc_flags;
-    struct target_ucontext *tuc_link;
+    abi_ulong tuc_link;
     target_stack_t tuc_stack;
     struct target_sigcontext tuc_mcontext;
     target_sigset_t tuc_sigmask;        /* mask last for extensibility */
@@ -237,7 +237,7 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,

     /* Create the ucontext.  */
     __put_user(0, &frame->uc.tuc_flags);
-    __put_user(0, (unsigned long *)&frame->uc.tuc_link);
+    __put_user(0, &frame->uc.tuc_link);
     target_save_altstack(&frame->uc.tuc_stack, regs);
     setup_sigcontext(&frame->uc.tuc_mcontext,
                      regs, set->sig[0]);
-- 
2.47.3


* [Stable-10.2.3 135/149] linux-user/sh4: Fix setup_sigtramp to match Linux kernel trampoline pattern
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (34 preceding siblings ...)
  2026-05-22 21:49 ` [Stable-10.2.3 134/149] linux-user/sh4: Fix target_ucontext tuc_link field type Michael Tokarev
@ 2026-05-22 21:49 ` Michael Tokarev
  2026-05-22 21:49 ` [Stable-10.2.3 136/149] blkdebug: Add 'delay-ns' option Michael Tokarev
                   ` (14 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Matt Turner, Richard Henderson, Helge Deller,
	Michael Tokarev

From: Matt Turner <mattst88@gmail.com>

QEMU used MOVW(2) (0x9300), which loads the syscall number from PC+4,
instead of the kernel's MOVW(7) (0x9305), which loads from PC+14.  The
kernel uses five "or r0,r0" nop pads between TRAP_NOARG and the syscall
number word to reach that offset.  libunwind's unw_is_signal_frame checks
for the exact kernel byte pattern 0xc3109305 at the frame PC, so QEMU's
compact layout was not detected, breaking unwinding through signal frames.

Expand each trampoline from 6 to 16 bytes matching the kernel layout
defined in arch/sh/kernel/signal_32.c:

  #define MOVW(n)    (0x9300|((n)-2))  /* Move mem word at PC+n to R3 */
  #define TRAP_NOARG 0xc310            /* Syscall w/no args (NR in R3) */
  #define OR_R0_R0   0x200b            /* or r0,r0 (insert to avoid hardware bug) */

  __put_user(MOVW(7),          &frame->retcode[0]);  /* 0x9305 */
  __put_user(TRAP_NOARG,       &frame->retcode[1]);  /* 0xc310 */
  __put_user(OR_R0_R0,         &frame->retcode[2]);  /* 0x200b */
  __put_user(OR_R0_R0,         &frame->retcode[3]);  /* 0x200b */
  __put_user(OR_R0_R0,         &frame->retcode[4]);  /* 0x200b */
  __put_user(OR_R0_R0,         &frame->retcode[5]);  /* 0x200b */
  __put_user(OR_R0_R0,         &frame->retcode[6]);  /* 0x200b */
  __put_user((__NR_sigreturn), &frame->retcode[7]);

The first two halfwords (MOVW(7) || TRAP_NOARG = 0xc3109305) form the
32-bit value libunwind checks at the frame PC, followed by two
OR_R0_R0 halfwords (0x200b200b) at PC+4.  The same layout applies to
the rt_sigreturn trampoline (lines 366-373 of signal_32.c).
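
The layout can be checked with a few lines of standalone C.  This is a
sketch under stated assumptions: the demo_* helpers are hypothetical, the
32-bit values are formed by reading two halfwords as a little-endian
word (as the quoted 0xc3109305 implies), and the load offset is taken as
4 + 2 * disp, which agrees with the PC+4 and PC+14 figures given above.

```c
#include <assert.h>
#include <stdint.h>

#define MOVW(n)    (0x9300 | ((n) - 2))  /* as quoted from signal_32.c */
#define TRAP_NOARG 0xc310
#define OR_R0_R0   0x200b

/* Fill one 16-byte trampoline; syscall_nr is a caller-supplied placeholder. */
void demo_build_tramp(uint16_t tramp[8], uint16_t syscall_nr)
{
    int i;

    tramp[0] = MOVW(7);
    tramp[1] = TRAP_NOARG;
    for (i = 2; i < 7; i++) {
        tramp[i] = OR_R0_R0;            /* five nop pads */
    }
    tramp[7] = syscall_nr;              /* word loaded by MOVW(7) */
}

/* 32-bit little-endian word formed by halfwords first and first + 1. */
uint32_t demo_word_le(const uint16_t *tramp, int first)
{
    return (uint32_t)tramp[first] | ((uint32_t)tramp[first + 1] << 16);
}

/* Byte offset from the instruction to the word it loads: 4 + 2 * disp. */
unsigned demo_movw_offset(uint16_t insn)
{
    return 4u + 2u * (insn & 0xffu);
}
```

With these helpers, the word at halfword 0 is the value libunwind
matches, and the MOVW(7) load offset divided by two gives index 7, the
slot holding the syscall number.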

Neither this fix nor the companion tuc_link fix is independently
sufficient: this fix makes signal frames detectable but register reads
remain garbage without the correct ucontext layout; that fix corrects the
ucontext layout but libunwind still cannot detect the frame without the
correct trampoline pattern.  Together they fix the following libunwind
tests on a 64-bit host:
  Gtest-sig-context, Gtest-trace, Ltest-init-local-signal,
  Ltest-sig-context, Ltest-trace

Signed-off-by: Matt Turner <mattst88@gmail.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit 9ac5aa72272117608482cad2430a75477263fe09)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/linux-user/sh4/signal.c b/linux-user/sh4/signal.c
index 20d2bc8b2c..d70be24c38 100644
--- a/linux-user/sh4/signal.c
+++ b/linux-user/sh4/signal.c
@@ -329,20 +329,42 @@ badframe:
     return -QEMU_ESIGRETURN;
 }
 
+/*
+ * "or r0,r0" nop used by the Linux kernel inline sigreturn trampolines to
+ * avoid a hardware bug (OR_R0_R0 in arch/sh/kernel/signal_32.c).  Five of
+ * these nops follow TRAP_NOARG, placing the syscall number word 14 bytes
+ * past the MOVW(7) instruction (at MOVW(7)'s load offset).  This yields the
+ * fixed 16-byte layout that libunwind's unw_is_signal_frame detects:
+ *   [MOVW(7), TRAP_NOARG, 5x NOP_OR, .word syscall_nr]
+ */
+#define NOP_OR 0x200b
+
 void setup_sigtramp(abi_ulong sigtramp_page)
 {
-    uint16_t *tramp = lock_user(VERIFY_WRITE, sigtramp_page, 2 * 6, 0);
+    uint16_t *tramp = lock_user(VERIFY_WRITE, sigtramp_page, 2 * 16, 0);
     assert(tramp != NULL);
 
+    /* sigreturn trampoline (non-RT) at offset 0 */
     default_sigreturn = sigtramp_page;
-    __put_user(MOVW(2), &tramp[0]);
+    __put_user(MOVW(7), &tramp[0]);
     __put_user(TRAP_NOARG, &tramp[1]);
-    __put_user(TARGET_NR_sigreturn, &tramp[2]);
-
-    default_rt_sigreturn = sigtramp_page + 6;
-    __put_user(MOVW(2), &tramp[3]);
-    __put_user(TRAP_NOARG, &tramp[4]);
-    __put_user(TARGET_NR_rt_sigreturn, &tramp[5]);
-
-    unlock_user(tramp, sigtramp_page, 2 * 6);
+    __put_user(NOP_OR, &tramp[2]);
+    __put_user(NOP_OR, &tramp[3]);
+    __put_user(NOP_OR, &tramp[4]);
+    __put_user(NOP_OR, &tramp[5]);
+    __put_user(NOP_OR, &tramp[6]);
+    __put_user(TARGET_NR_sigreturn, &tramp[7]);
+
+    /* rt_sigreturn trampoline at offset 16 */
+    default_rt_sigreturn = sigtramp_page + 16;
+    __put_user(MOVW(7), &tramp[8]);
+    __put_user(TRAP_NOARG, &tramp[9]);
+    __put_user(NOP_OR, &tramp[10]);
+    __put_user(NOP_OR, &tramp[11]);
+    __put_user(NOP_OR, &tramp[12]);
+    __put_user(NOP_OR, &tramp[13]);
+    __put_user(NOP_OR, &tramp[14]);
+    __put_user(TARGET_NR_rt_sigreturn, &tramp[15]);
+
+    unlock_user(tramp, sigtramp_page, 2 * 16);
 }
-- 
2.47.3




* [Stable-10.2.3 136/149] blkdebug: Add 'delay-ns' option
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (35 preceding siblings ...)
  2026-05-22 21:49 ` [Stable-10.2.3 135/149] linux-user/sh4: Fix setup_sigtramp to match Linux kernel trampoline pattern Michael Tokarev
@ 2026-05-22 21:49 ` Michael Tokarev
  2026-05-22 21:49 ` [Stable-10.2.3 137/149] block: Add blk_co_start/end_request() and BDRV_REQ_NO_QUEUE Michael Tokarev
                   ` (13 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Kevin Wolf, Michael Tokarev

From: Kevin Wolf <kwolf@redhat.com>

Sometimes reproducing a problem for debugging involves slow I/O, so
let's add something to blkdebug to make I/O slow when we need it. This
can be used either together with an error so that the request fails
after the delay, or with errno=0, which allows the request to succeed
after the delay.
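
The two modes can be modelled in a few lines.  This is a simplified
sketch rather than the blkdebug code: struct demo_rule and
demo_rule_check are hypothetical, and the sleep is replaced by an
accumulator so the example does not actually block.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced form of an inject-error rule. */
struct demo_rule {
    int error;          /* errno value to inject, 0 to let the request succeed */
    int64_t delay_ns;   /* delay applied before the request completes */
};

/*
 * Returns the request result for a matching rule (or 0 if rule is NULL).
 * Instead of sleeping, the delay is added to *slept_ns.
 */
int demo_rule_check(const struct demo_rule *rule, int64_t *slept_ns)
{
    if (!rule) {
        return 0;                   /* no matching rule: nothing injected */
    }
    if (rule->delay_ns) {
        *slept_ns += rule->delay_ns;
    }
    return -rule->error;            /* 0 when error == 0, i.e. success */
}
```

A rule with error = EIO and a non-zero delay yields -EIO after the delay,
while a rule with error = 0 yields 0 after the same delay.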

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20260421161132.99878-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit d5e4090177ad382e01084a1594a1a60a69f4c1cd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/block/blkdebug.c b/block/blkdebug.c
index c54aee0c84..8954fc2977 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -95,6 +95,7 @@ typedef struct BlkdebugRule {
             int immediately;
             int once;
             int64_t offset;
+            int64_t delay_ns;
         } inject;
         struct {
             int new_state;
@@ -144,6 +145,10 @@ static QemuOptsList inject_error_opts = {
             .name = "immediately",
             .type = QEMU_OPT_BOOL,
         },
+        {
+            .name = "delay-ns",
+            .type = QEMU_OPT_NUMBER,
+        },
         { /* end of list */ }
     },
 };
@@ -216,6 +221,8 @@ static int add_rule(void *opaque, QemuOpts *opts, Error **errp)
         rule->options.inject.once  = qemu_opt_get_bool(opts, "once", 0);
         rule->options.inject.immediately =
             qemu_opt_get_bool(opts, "immediately", 0);
+        rule->options.inject.delay_ns =
+            qemu_opt_get_number(opts, "delay-ns", 0);
         sector = qemu_opt_get_number(opts, "sector", -1);
         rule->options.inject.offset =
             sector == -1 ? -1 : sector * BDRV_SECTOR_SIZE;
@@ -594,6 +601,7 @@ static int coroutine_fn rule_check(BlockDriverState *bs, uint64_t offset,
     BlkdebugRule *rule = NULL;
     int error;
     bool immediately;
+    int64_t delay_ns;
 
     qemu_mutex_lock(&s->lock);
     QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
@@ -608,13 +616,14 @@ static int coroutine_fn rule_check(BlockDriverState *bs, uint64_t offset,
         }
     }
 
-    if (!rule || !rule->options.inject.error) {
+    if (!rule) {
         qemu_mutex_unlock(&s->lock);
         return 0;
     }
 
     immediately = rule->options.inject.immediately;
     error = rule->options.inject.error;
+    delay_ns  = rule->options.inject.delay_ns;
 
     if (rule->options.inject.once) {
         QSIMPLEQ_REMOVE(&s->active_rules, rule, BlkdebugRule, active_next);
@@ -622,6 +631,10 @@ static int coroutine_fn rule_check(BlockDriverState *bs, uint64_t offset,
     }
 
     qemu_mutex_unlock(&s->lock);
+
+    if (delay_ns) {
+        qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, delay_ns);
+    }
     if (!immediately) {
         aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
         qemu_coroutine_yield();
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 4118d884f4..b7b5e8bad6 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3914,6 +3914,9 @@
 #
 # @errno: error identifier (errno) to be returned; defaults to EIO
 #
+# @delay-ns: request delay before completion in nanoseconds
+#            (default: 0, since: 11.1)
+#
 # @sector: specifies the sector index which has to be affected in
 #     order to actually trigger the event; defaults to "any sector"
 #
@@ -3929,6 +3932,7 @@
             '*state': 'int',
             '*iotype': 'BlkdebugIOType',
             '*errno': 'int',
+            '*delay-ns': 'int',
             '*sector': 'int',
             '*once': 'bool',
             '*immediately': 'bool' } }
-- 
2.47.3




* [Stable-10.2.3 137/149] block: Add blk_co_start/end_request() and BDRV_REQ_NO_QUEUE
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (36 preceding siblings ...)
  2026-05-22 21:49 ` [Stable-10.2.3 136/149] blkdebug: Add 'delay-ns' option Michael Tokarev
@ 2026-05-22 21:49 ` Michael Tokarev
  2026-05-22 21:49 ` [Stable-10.2.3 138/149] block: Add flags parameter to blk_*_pdiscard() Michael Tokarev
                   ` (12 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Kevin Wolf, Michael Tokarev

From: Kevin Wolf <kwolf@redhat.com>

If a device uses blk_inc/dec_in_flight() in order to build macro
operations that involve multiple requests for the block layer and that
need to be completed as a unit before the BlockBackend can be considered
drained, it sets the stage for a deadlock: When a drain is requested,
the inner request at the BlockBackend level will be queued in
blk_wait_while_drained() and wait until the drained section ends, but at
the same time, drain_begin can only return if the whole macro operation
at the device level has completed.

Introduce a new interface to allow implementing the logic correctly:
Instead of queueing individual requests, blk_co_start_request() calls
blk_wait_while_drained() once at the beginning. The individual requests
must then set BDRV_REQ_NO_QUEUE to avoid being queued and running into
the deadlock; being wrapped in blk_co_start/end_request() makes sure
that drain_begin waits for them and they don't sneak in when the
BlockBackend is supposed to already be quiescent.
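
The interaction of the counters can be followed in a single-threaded toy
model.  Everything named demo_* is hypothetical; the model leaves out
disable_request_queuing, coroutines and atomics, and replaces the wait in
blk_co_start_request() with an assertion that no drain is active yet.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_REQ_NO_QUEUE 0x800  /* same value as BDRV_REQ_NO_QUEUE */

struct demo_blk {
    int in_flight;
    int quiesce_counter;
    int start_request_count;
};

/* True if a request with these flags would be parked until the drain ends. */
bool demo_would_queue(const struct demo_blk *blk, int flags)
{
    if (flags & DEMO_REQ_NO_QUEUE) {
        /* Only valid inside a start/end request pair. */
        assert(blk->start_request_count > 0);
        return false;
    }
    return blk->quiesce_counter > 0;
}

/* Model of starting a macro operation; the real code would wait here. */
void demo_start_request(struct demo_blk *blk)
{
    blk->in_flight++;
    assert(!demo_would_queue(blk, 0));
    blk->start_request_count++;
}

/* Model of ending a macro operation. */
void demo_end_request(struct demo_blk *blk)
{
    blk->start_request_count--;
    blk->in_flight--;
}

/* In this model, drain_begin can return only when nothing is in flight. */
bool demo_drain_can_finish(const struct demo_blk *blk)
{
    return blk->in_flight == 0;
}
```

If a drain starts after demo_start_request(), an inner request with
flags 0 would queue while the drain cannot finish, which is the deadlock;
with DEMO_REQ_NO_QUEUE the inner request proceeds and the drain finishes
after demo_end_request().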

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20260421161132.99878-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 34a67637767d3ed1ac813c44effe827bbfba5996)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/block/block-backend.c b/block/block-backend.c
index 98315d4470..b8d877c4e7 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -82,6 +82,7 @@ struct BlockBackend {
     QemuMutex queued_requests_lock; /* protects queued_requests */
     CoQueue queued_requests;
     bool disable_request_queuing; /* atomic */
+    int start_request_count; /* atomic */
 
     VMChangeStateEntry *vmsh;
     bool force_allow_inactivate;
@@ -1306,10 +1307,16 @@ bool blk_in_drain(BlockBackend *blk)
 }
 
 /* To be called between exactly one pair of blk_inc/dec_in_flight() */
-static void coroutine_fn blk_wait_while_drained(BlockBackend *blk)
+static void coroutine_fn blk_wait_while_drained(BlockBackend *blk,
+                                                BdrvRequestFlags flags)
 {
     assert(blk->in_flight > 0);
 
+    if (flags & BDRV_REQ_NO_QUEUE) {
+        assert(qatomic_read(&blk->start_request_count));
+        return;
+    }
+
     if (qatomic_read(&blk->quiesce_counter) &&
         !qatomic_read(&blk->disable_request_queuing)) {
         /*
@@ -1335,7 +1342,7 @@ blk_co_do_preadv_part(BlockBackend *blk, int64_t offset, int64_t bytes,
     BlockDriverState *bs;
     IO_CODE();
 
-    blk_wait_while_drained(blk);
+    blk_wait_while_drained(blk, flags);
     GRAPH_RDLOCK_GUARD();
 
     /* Call blk_bs() only after waiting, the graph may have changed */
@@ -1410,7 +1417,7 @@ blk_co_do_pwritev_part(BlockBackend *blk, int64_t offset, int64_t bytes,
     BlockDriverState *bs;
     IO_CODE();
 
-    blk_wait_while_drained(blk);
+    blk_wait_while_drained(blk, flags);
     GRAPH_RDLOCK_GUARD();
 
     /* Call blk_bs() only after waiting, the graph may have changed */
@@ -1523,6 +1530,19 @@ void blk_dec_in_flight(BlockBackend *blk)
     aio_wait_kick();
 }
 
+void coroutine_fn blk_co_start_request(BlockBackend *blk)
+{
+    blk_inc_in_flight(blk);
+    blk_wait_while_drained(blk, 0);
+    qatomic_inc(&blk->start_request_count);
+}
+
+void blk_end_request(BlockBackend *blk)
+{
+    qatomic_dec(&blk->start_request_count);
+    blk_dec_in_flight(blk);
+}
+
 static void error_callback_bh(void *opaque)
 {
     struct BlockBackendAIOCB *acb = opaque;
@@ -1741,7 +1761,7 @@ blk_co_do_ioctl(BlockBackend *blk, unsigned long int req, void *buf)
 {
     IO_CODE();
 
-    blk_wait_while_drained(blk);
+    blk_wait_while_drained(blk, 0);
     GRAPH_RDLOCK_GUARD();
 
     if (!blk_co_is_available(blk)) {
@@ -1788,7 +1808,7 @@ blk_co_do_pdiscard(BlockBackend *blk, int64_t offset, int64_t bytes)
     int ret;
     IO_CODE();
 
-    blk_wait_while_drained(blk);
+    blk_wait_while_drained(blk, 0);
     GRAPH_RDLOCK_GUARD();
 
     ret = blk_check_byte_request(blk, offset, bytes);
@@ -1834,7 +1854,7 @@ int coroutine_fn blk_co_pdiscard(BlockBackend *blk, int64_t offset,
 static int coroutine_fn blk_co_do_flush(BlockBackend *blk)
 {
     IO_CODE();
-    blk_wait_while_drained(blk);
+    blk_wait_while_drained(blk, 0);
     GRAPH_RDLOCK_GUARD();
 
     if (!blk_co_is_available(blk)) {
@@ -2009,7 +2029,7 @@ int coroutine_fn blk_co_zone_report(BlockBackend *blk, int64_t offset,
     IO_CODE();
 
     blk_inc_in_flight(blk); /* increase before waiting */
-    blk_wait_while_drained(blk);
+    blk_wait_while_drained(blk, 0);
     GRAPH_RDLOCK_GUARD();
     if (!blk_is_available(blk)) {
         blk_dec_in_flight(blk);
@@ -2034,7 +2054,7 @@ int coroutine_fn blk_co_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
     IO_CODE();
 
     blk_inc_in_flight(blk);
-    blk_wait_while_drained(blk);
+    blk_wait_while_drained(blk, 0);
     GRAPH_RDLOCK_GUARD();
 
     ret = blk_check_byte_request(blk, offset, len);
@@ -2058,7 +2078,7 @@ int coroutine_fn blk_co_zone_append(BlockBackend *blk, int64_t *offset,
     IO_CODE();
 
     blk_inc_in_flight(blk);
-    blk_wait_while_drained(blk);
+    blk_wait_while_drained(blk, flags);
     GRAPH_RDLOCK_GUARD();
     if (!blk_is_available(blk)) {
         blk_dec_in_flight(blk);
diff --git a/include/block/block-common.h b/include/block/block-common.h
index c8c626daea..895ea17541 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -215,8 +215,17 @@ typedef enum {
      */
     BDRV_REQ_NO_WAIT = 0x400,
 
+    /*
+     * Used between blk_co_start_request() and blk_end_request() to avoid
+     * that the request waits in a drained BlockBackend until the drained
+     * section ends. Waiting would cause a deadlock because drain waits for
+     * blk_end_request() to be called, but the request never completes
+     * because it waits for the drain to end.
+     */
+    BDRV_REQ_NO_QUEUE = 0x800,
+
     /* Mask of valid flags */
-    BDRV_REQ_MASK               = 0x7ff,
+    BDRV_REQ_MASK               = 0xfff,
 } BdrvRequestFlags;
 
 #define BDRV_O_NO_SHARE    0x0001 /* don't share permissions */
diff --git a/include/system/block-backend-io.h b/include/system/block-backend-io.h
index 6d5ac476fc..0248c1c36e 100644
--- a/include/system/block-backend-io.h
+++ b/include/system/block-backend-io.h
@@ -71,6 +71,8 @@ BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
 
 void blk_inc_in_flight(BlockBackend *blk);
 void blk_dec_in_flight(BlockBackend *blk);
+void coroutine_fn blk_co_start_request(BlockBackend *blk);
+void blk_end_request(BlockBackend *blk);
 
 bool coroutine_fn GRAPH_RDLOCK blk_co_is_inserted(BlockBackend *blk);
 bool co_wrapper_mixed_bdrv_rdlock blk_is_inserted(BlockBackend *blk);
-- 
2.47.3




* [Stable-10.2.3 138/149] block: Add flags parameter to blk_*_pdiscard()
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (37 preceding siblings ...)
  2026-05-22 21:49 ` [Stable-10.2.3 137/149] block: Add blk_co_start/end_request() and BDRV_REQ_NO_QUEUE Michael Tokarev
@ 2026-05-22 21:49 ` Michael Tokarev
  2026-05-22 21:49 ` [Stable-10.2.3 139/149] ide: Minimal fix for deadlock between TRIM and drain Michael Tokarev
                   ` (11 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Kevin Wolf, Michael Tokarev

From: Kevin Wolf <kwolf@redhat.com>

All existing callers pass 0, but we need a way to pass BDRV_REQ_NO_QUEUE
for discard requests.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20260421161132.99878-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 53074ba0330ae8831abbae2521c012e1d9072ed3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/block/block-backend.c b/block/block-backend.c
index b8d877c4e7..850f2ecec2 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1803,12 +1803,13 @@ BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
 
 /* To be called between exactly one pair of blk_inc/dec_in_flight() */
 static int coroutine_fn
-blk_co_do_pdiscard(BlockBackend *blk, int64_t offset, int64_t bytes)
+blk_co_do_pdiscard(BlockBackend *blk, int64_t offset, int64_t bytes,
+                   BdrvRequestFlags flags)
 {
     int ret;
     IO_CODE();
 
-    blk_wait_while_drained(blk, 0);
+    blk_wait_while_drained(blk, flags);
     GRAPH_RDLOCK_GUARD();
 
     ret = blk_check_byte_request(blk, offset, bytes);
@@ -1824,7 +1825,7 @@ static void coroutine_fn blk_aio_pdiscard_entry(void *opaque)
     BlkAioEmAIOCB *acb = opaque;
     BlkRwCo *rwco = &acb->rwco;
 
-    rwco->ret = blk_co_do_pdiscard(rwco->blk, rwco->offset, acb->bytes);
+    rwco->ret = blk_co_do_pdiscard(rwco->blk, rwco->offset, acb->bytes, 0);
     blk_aio_complete(acb);
 }
 
@@ -1838,13 +1839,13 @@ BlockAIOCB *blk_aio_pdiscard(BlockBackend *blk,
 }
 
 int coroutine_fn blk_co_pdiscard(BlockBackend *blk, int64_t offset,
-                                 int64_t bytes)
+                                 int64_t bytes, BdrvRequestFlags flags)
 {
     int ret;
     IO_OR_GS_CODE();
 
     blk_inc_in_flight(blk);
-    ret = blk_co_do_pdiscard(blk, offset, bytes);
+    ret = blk_co_do_pdiscard(blk, offset, bytes, flags);
     blk_dec_in_flight(blk);
 
     return ret;
diff --git a/block/export/virtio-blk-handler.c b/block/export/virtio-blk-handler.c
index bc1cec6757..b82baae553 100644
--- a/block/export/virtio-blk-handler.c
+++ b/block/export/virtio-blk-handler.c
@@ -121,7 +121,7 @@ virtio_blk_discard_write_zeroes(VirtioBlkHandler *handler, struct iovec *iov,
         }
 
         if (blk_co_pdiscard(blk, sector << VIRTIO_BLK_SECTOR_BITS,
-                            bytes) == 0) {
+                            bytes, 0) == 0) {
             return VIRTIO_BLK_S_OK;
         }
     }
diff --git a/block/mirror.c b/block/mirror.c
index 2fcded9e93..089856f4a8 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -454,7 +454,7 @@ static void coroutine_fn mirror_co_discard(void *opaque)
     *op->bytes_handled = op->bytes;
     op->is_in_flight = true;
 
-    ret = blk_co_pdiscard(op->s->target, op->offset, op->bytes);
+    ret = blk_co_pdiscard(op->s->target, op->offset, op->bytes, 0);
     mirror_write_complete(op, ret);
 }
 
@@ -1532,7 +1532,7 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
                          zero_bitmap_end - zero_bitmap_offset);
         }
         assert(!qiov);
-        ret = blk_co_pdiscard(job->target, offset, bytes);
+        ret = blk_co_pdiscard(job->target, offset, bytes, 0);
         break;
 
     default:
diff --git a/include/system/block-backend-io.h b/include/system/block-backend-io.h
index 0248c1c36e..fd84723d9d 100644
--- a/include/system/block-backend-io.h
+++ b/include/system/block-backend-io.h
@@ -218,9 +218,9 @@ int co_wrapper_mixed blk_zone_append(BlockBackend *blk, int64_t *offset,
                                          BdrvRequestFlags flags);
 
 int co_wrapper_mixed blk_pdiscard(BlockBackend *blk, int64_t offset,
-                                  int64_t bytes);
+                                  int64_t bytes, BdrvRequestFlags flags);
 int coroutine_fn blk_co_pdiscard(BlockBackend *blk, int64_t offset,
-                                 int64_t bytes);
+                                 int64_t bytes, BdrvRequestFlags flags);
 
 int co_wrapper_mixed blk_flush(BlockBackend *blk);
 int coroutine_fn blk_co_flush(BlockBackend *blk);
diff --git a/nbd/server.c b/nbd/server.c
index acec0487a8..bd103a8840 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2984,7 +2984,7 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
                                       "flush failed", errp);
 
     case NBD_CMD_TRIM:
-        ret = blk_co_pdiscard(exp->common.blk, request->from, request->len);
+        ret = blk_co_pdiscard(exp->common.blk, request->from, request->len, 0);
         if (ret >= 0 && request->flags & NBD_CMD_FLAG_FUA) {
             ret = blk_co_flush(exp->common.blk);
         }
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 13e0330162..f6d077908f 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -2201,7 +2201,7 @@ static int discard_f(BlockBackend *blk, int argc, char **argv)
     }
 
     clock_gettime(CLOCK_MONOTONIC, &t1);
-    ret = blk_pdiscard(blk, offset, bytes);
+    ret = blk_pdiscard(blk, offset, bytes, 0);
     clock_gettime(CLOCK_MONOTONIC, &t2);
 
     if (ret < 0) {
diff --git a/tests/unit/test-block-iothread.c b/tests/unit/test-block-iothread.c
index e26b3be593..5273ff235a 100644
--- a/tests/unit/test-block-iothread.c
+++ b/tests/unit/test-block-iothread.c
@@ -270,11 +270,11 @@ static void test_sync_op_blk_pdiscard(BlockBackend *blk)
     int ret;
 
     /* Early success: UNMAP not supported */
-    ret = blk_pdiscard(blk, 0, 512);
+    ret = blk_pdiscard(blk, 0, 512, 0);
     g_assert_cmpint(ret, ==, 0);
 
     /* Early error: Negative offset */
-    ret = blk_pdiscard(blk, -2, 512);
+    ret = blk_pdiscard(blk, -2, 512, 0);
     g_assert_cmpint(ret, ==, -EIO);
 }
 
-- 
2.47.3




* [Stable-10.2.3 139/149] ide: Minimal fix for deadlock between TRIM and drain
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (38 preceding siblings ...)
  2026-05-22 21:49 ` [Stable-10.2.3 138/149] block: Add flags parameter to blk_*_pdiscard() Michael Tokarev
@ 2026-05-22 21:49 ` Michael Tokarev
  2026-05-22 21:49 ` [Stable-10.2.3 140/149] ide: Clean up ide_trim_co_entry() to be idiomatic coroutine code Michael Tokarev
                   ` (10 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Kevin Wolf, Michael Tokarev

From: Kevin Wolf <kwolf@redhat.com>

The implementation of TRIM in IDE can chain multiple discard requests
and uses blk_inc/dec_in_flight() to make sure that the whole TRIM
operation has completed when the device needs to be quiescent (e.g. for
the drain when performing an IDE reset, it would be bad if an IDE
request like TRIM were still in flight).

The problem is that each discard request calls blk_wait_while_drained()
and, while a drain is in progress, waits there until the drained section
ends. At the same time, drain_begin can only return once the whole TRIM
operation has completed. This is a classic deadlock.

Use blk_co_start/end_request() and BDRV_REQ_NO_QUEUE to avoid the
problem. This requires moving the TRIM state machine to a coroutine.
This commit does the minimal conversion so that we do have a coroutine
that works for the fix, but it still looks much like a callback-based
implementation. This will be cleaned up in the next patch.

Cc: qemu-stable@nongnu.org
Fixes: 7e5cdb345f77 ('ide: Increment BB in-flight counter for TRIM BH')
Buglink: https://redhat.atlassian.net/browse/RHEL-121686
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20260421161132.99878-5-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 095c08a7ba68cabaa6e0ce7a8a0804a949542c4c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/ide/core.c b/hw/ide/core.c
index c66a9d8df0..a2430f3a8e 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -420,7 +420,6 @@ typedef struct TrimAIOCB {
     QEMUBH *bh;
     int ret;
     QEMUIOVector *qiov;
-    BlockAIOCB *aiocb;
     int i, j;
 } TrimAIOCB;
 
@@ -433,11 +432,6 @@ static void trim_aio_cancel(BlockAIOCB *acb)
     iocb->i = (iocb->qiov->iov[iocb->j].iov_len / 8) - 1;
 
     iocb->ret = -ECANCELED;
-
-    if (iocb->aiocb) {
-        blk_aio_cancel_async(iocb->aiocb);
-        iocb->aiocb = NULL;
-    }
 }
 
 static const AIOCBInfo trim_aiocb_info = {
@@ -456,15 +450,20 @@ static void ide_trim_bh_cb(void *opaque)
     iocb->bh = NULL;
     qemu_aio_unref(iocb);
 
-    /* Paired with an increment in ide_issue_trim() */
-    blk_dec_in_flight(blk);
+    /* Paired with blk_co_start_request in ide_trim_co_entry() */
+    blk_end_request(blk);
 }
 
-static void ide_issue_trim_cb(void *opaque, int ret)
+static void coroutine_fn ide_trim_co_entry(void *opaque)
 {
     TrimAIOCB *iocb = opaque;
     IDEState *s = iocb->s;
+    int ret = 0;
+
+    /* Paired with blk_end_request in ide_trim_bh_cb() */
+    blk_co_start_request(s->blk);
 
+loop:
     if (iocb->i >= 0) {
         if (ret >= 0) {
             block_acct_done(blk_get_stats(s->blk), &s->acct);
@@ -499,11 +498,11 @@ static void ide_issue_trim_cb(void *opaque, int ret)
                                  count << BDRV_SECTOR_BITS, BLOCK_ACCT_UNMAP);
 
                 /* Got an entry! Submit and exit.  */
-                iocb->aiocb = blk_aio_pdiscard(s->blk,
-                                               sector << BDRV_SECTOR_BITS,
-                                               count << BDRV_SECTOR_BITS,
-                                               ide_issue_trim_cb, opaque);
-                return;
+                ret = blk_co_pdiscard(s->blk,
+                                      sector << BDRV_SECTOR_BITS,
+                                      count << BDRV_SECTOR_BITS,
+                                      BDRV_REQ_NO_QUEUE);
+                goto loop;
             }
 
             iocb->j++;
@@ -514,7 +513,6 @@ static void ide_issue_trim_cb(void *opaque, int ret)
     }
 
 done:
-    iocb->aiocb = NULL;
     if (iocb->bh) {
         replay_bh_schedule_event(iocb->bh);
     }
@@ -527,9 +525,7 @@ BlockAIOCB *ide_issue_trim(
     IDEState *s = opaque;
     IDEDevice *dev = s->unit ? s->bus->slave : s->bus->master;
     TrimAIOCB *iocb;
-
-    /* Paired with a decrement in ide_trim_bh_cb() */
-    blk_inc_in_flight(s->blk);
+    Coroutine *co;
 
     iocb = blk_aio_get(&trim_aiocb_info, s->blk, cb, cb_opaque);
     iocb->s = s;
@@ -539,7 +535,10 @@ BlockAIOCB *ide_issue_trim(
     iocb->qiov = qiov;
     iocb->i = -1;
     iocb->j = 0;
-    ide_issue_trim_cb(iocb, 0);
+
+    co = qemu_coroutine_create(ide_trim_co_entry, iocb);
+    aio_co_enter(qemu_get_current_aio_context(), co);
+
     return &iocb->common;
 }
 
-- 
2.47.3




* [Stable-10.2.3 140/149] ide: Clean up ide_trim_co_entry() to be idiomatic coroutine code
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (39 preceding siblings ...)
  2026-05-22 21:49 ` [Stable-10.2.3 139/149] ide: Minimal fix for deadlock between TRIM and drain Michael Tokarev
@ 2026-05-22 21:49 ` Michael Tokarev
  2026-05-22 21:49 ` [Stable-10.2.3 141/149] ide-test: Factor out wait_dma_completion() Michael Tokarev
                   ` (9 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Kevin Wolf, Michael Tokarev

From: Kevin Wolf <kwolf@redhat.com>

The previous commit did a minimal conversion of the callback-based state
machine for TRIM to a coroutine in order to fix a bug. Refactor it to
actually look like normal coroutine-based code, which improves its
readability.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20260421161132.99878-6-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit c1c71a7e167fdabaa9827d00c0be3aeafebdd921)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/ide/core.c b/hw/ide/core.c
index a2430f3a8e..c11d52d834 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -420,18 +420,15 @@ typedef struct TrimAIOCB {
     QEMUBH *bh;
     int ret;
     QEMUIOVector *qiov;
-    int i, j;
+    bool canceled;
 } TrimAIOCB;
 
 static void trim_aio_cancel(BlockAIOCB *acb)
 {
     TrimAIOCB *iocb = container_of(acb, TrimAIOCB, common);
 
-    /* Exit the loop so ide_issue_trim_cb will not continue  */
-    iocb->j = iocb->qiov->niov - 1;
-    iocb->i = (iocb->qiov->iov[iocb->j].iov_len / 8) - 1;
-
-    iocb->ret = -ECANCELED;
+    /* Exit the loop so ide_trim_co_entry will not continue */
+    iocb->canceled = true;
 }
 
 static const AIOCBInfo trim_aiocb_info = {
@@ -458,60 +455,55 @@ static void coroutine_fn ide_trim_co_entry(void *opaque)
 {
     TrimAIOCB *iocb = opaque;
     IDEState *s = iocb->s;
-    int ret = 0;
+    int i, j;
+    int ret;
 
     /* Paired with blk_end_request in ide_trim_bh_cb() */
     blk_co_start_request(s->blk);
 
-loop:
-    if (iocb->i >= 0) {
-        if (ret >= 0) {
-            block_acct_done(blk_get_stats(s->blk), &s->acct);
-        } else {
-            block_acct_failed(blk_get_stats(s->blk), &s->acct);
-        }
-    }
+    for (j = 0; j < iocb->qiov->niov; j++) {
+        for (i = 0; i < iocb->qiov->iov[j].iov_len / 8; i++) {
+            uint64_t *buffer = iocb->qiov->iov[j].iov_base;
 
-    if (ret >= 0) {
-        while (iocb->j < iocb->qiov->niov) {
-            int j = iocb->j;
-            while (++iocb->i < iocb->qiov->iov[j].iov_len / 8) {
-                int i = iocb->i;
-                uint64_t *buffer = iocb->qiov->iov[j].iov_base;
+            /* 6-byte LBA + 2-byte range per entry */
+            uint64_t entry = le64_to_cpu(buffer[i]);
+            uint64_t sector = entry & 0x0000ffffffffffffULL;
+            uint16_t count = entry >> 48;
 
-                /* 6-byte LBA + 2-byte range per entry */
-                uint64_t entry = le64_to_cpu(buffer[i]);
-                uint64_t sector = entry & 0x0000ffffffffffffULL;
-                uint16_t count = entry >> 48;
+            if (count == 0) {
+                continue;
+            }
 
-                if (count == 0) {
-                    continue;
-                }
+            if (iocb->canceled) {
+                iocb->ret = -ECANCELED;
+                goto done;
+            }
 
-                if (!ide_sect_range_ok(s, sector, count)) {
-                    block_acct_invalid(blk_get_stats(s->blk), BLOCK_ACCT_UNMAP);
-                    iocb->ret = -EINVAL;
-                    goto done;
-                }
+            if (!ide_sect_range_ok(s, sector, count)) {
+                block_acct_invalid(blk_get_stats(s->blk), BLOCK_ACCT_UNMAP);
+                iocb->ret = -EINVAL;
+                goto done;
+            }
 
-                block_acct_start(blk_get_stats(s->blk), &s->acct,
-                                 count << BDRV_SECTOR_BITS, BLOCK_ACCT_UNMAP);
+            block_acct_start(blk_get_stats(s->blk), &s->acct,
+                             count << BDRV_SECTOR_BITS, BLOCK_ACCT_UNMAP);
 
-                /* Got an entry! Submit and exit.  */
-                ret = blk_co_pdiscard(s->blk,
-                                      sector << BDRV_SECTOR_BITS,
-                                      count << BDRV_SECTOR_BITS,
-                                      BDRV_REQ_NO_QUEUE);
-                goto loop;
+            /* Got an entry! Submit and exit.  */
+            ret = blk_co_pdiscard(s->blk,
+                                  sector << BDRV_SECTOR_BITS,
+                                  count << BDRV_SECTOR_BITS,
+                                  BDRV_REQ_NO_QUEUE);
+            if (ret >= 0) {
+                block_acct_done(blk_get_stats(s->blk), &s->acct);
+            } else {
+                iocb->ret = ret;
+                block_acct_failed(blk_get_stats(s->blk), &s->acct);
+                goto done;
             }
-
-            iocb->j++;
-            iocb->i = -1;
         }
-    } else {
-        iocb->ret = ret;
     }
 
+    iocb->ret = 0;
 done:
     if (iocb->bh) {
         replay_bh_schedule_event(iocb->bh);
@@ -533,8 +525,7 @@ BlockAIOCB *ide_issue_trim(
                                    &DEVICE(dev)->mem_reentrancy_guard);
     iocb->ret = 0;
     iocb->qiov = qiov;
-    iocb->i = -1;
-    iocb->j = 0;
+    iocb->canceled = false;
 
     co = qemu_coroutine_create(ide_trim_co_entry, iocb);
     aio_co_enter(qemu_get_current_aio_context(), co);
-- 
2.47.3
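
Editor's note on the entry format handled in ide_trim_co_entry(): each
64-bit TRIM range entry carries a 6-byte starting LBA in the low 48 bits
and a 2-byte sector count in the high 16 bits, and entries with a count
of 0 are skipped. The following is a minimal standalone sketch of that
packing, not code from the patch. The names trim_entry_pack(),
trim_entry_sector(), trim_entry_count() and TRIM_LBA_MASK are
hypothetical, and the sketch works on host-order values, whereas the
patch converts with le64_to_cpu() first.

```c
#include <assert.h>
#include <stdint.h>

/* Low 48 bits (6 bytes) of an entry hold the starting LBA. */
#define TRIM_LBA_MASK 0x0000ffffffffffffULL

/* Hypothetical helper: pack one range into a host-order 64-bit entry. */
static inline uint64_t trim_entry_pack(uint64_t sector, uint16_t count)
{
    /* The LBA must fit into 48 bits, otherwise it would clobber the count. */
    assert((sector & ~TRIM_LBA_MASK) == 0);
    return ((uint64_t)count << 48) | sector;
}

/* Hypothetical helper: extract the 6-byte starting LBA. */
static inline uint64_t trim_entry_sector(uint64_t entry)
{
    return entry & TRIM_LBA_MASK;
}

/* Hypothetical helper: extract the 2-byte sector count (0 means "skip"). */
static inline uint16_t trim_entry_count(uint64_t entry)
{
    return (uint16_t)(entry >> 48);
}
```

With these helpers, trim_entry_pack(6, 8) yields an entry whose sector
is 6 and whose count is 8, mirroring the second range used by the
regression test later in this series.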




* [Stable-10.2.3 141/149] ide-test: Factor out wait_dma_completion()
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (40 preceding siblings ...)
  2026-05-22 21:49 ` [Stable-10.2.3 140/149] ide: Clean up ide_trim_co_entry() to be idiomatic coroutine code Michael Tokarev
@ 2026-05-22 21:49 ` Michael Tokarev
  2026-05-22 21:49 ` [Stable-10.2.3 142/149] ide-test: Test reset during TRIM Michael Tokarev
                   ` (8 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Kevin Wolf, Michael Tokarev

From: Kevin Wolf <kwolf@redhat.com>

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20260421161132.99878-7-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 92854c9c7539bdbf4f9c1abb33dd3ba59ff91e58)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/tests/qtest/ide-test.c b/tests/qtest/ide-test.c
index ceee444a9e..c6dcb2c074 100644
--- a/tests/qtest/ide-test.c
+++ b/tests/qtest/ide-test.c
@@ -200,6 +200,34 @@ static uint64_t trim_range_le(uint64_t sector, uint16_t count)
     return cpu_to_le64(((uint64_t)count << 48) + sector);
 }
 
+static uint8_t wait_dma_completion(QTestState *qts, QPCIDevice *dev,
+                                   QPCIBar bmdma_bar, QPCIBar ide_bar)
+{
+    uint8_t status;
+
+    /* Wait for the DMA transfer to complete */
+    do {
+        status = qpci_io_readb(dev, bmdma_bar, bmreg_status);
+    } while ((status & (BM_STS_ACTIVE | BM_STS_INTR)) == BM_STS_ACTIVE);
+
+    g_assert_cmpint(qtest_get_irq(qts, IDE_PRIMARY_IRQ), ==,
+                    !!(status & BM_STS_INTR));
+
+    /* Check IDE status code */
+    assert_bit_set(qpci_io_readb(dev, ide_bar, reg_status), DRDY);
+    assert_bit_clear(qpci_io_readb(dev, ide_bar, reg_status), BSY | DRQ);
+
+    /* Reading the status register clears the IRQ */
+    g_assert(!qtest_get_irq(qts, IDE_PRIMARY_IRQ));
+
+    /* Stop DMA transfer if still active */
+    if (status & BM_STS_ACTIVE) {
+        qpci_io_writeb(dev, bmdma_bar, bmreg_cmd, 0);
+    }
+
+    return status;
+}
+
 static int send_dma_request(QTestState *qts, int cmd, uint64_t sector,
                             int nb_sectors, PrdtEntry *prdt, int prdt_entries,
                             void(*post_exec)(QPCIDevice *dev, QPCIBar ide_bar,
@@ -280,25 +308,7 @@ static int send_dma_request(QTestState *qts, int cmd, uint64_t sector,
         qpci_io_writeb(dev, bmdma_bar, bmreg_cmd, 0);
     }
 
-    /* Wait for the DMA transfer to complete */
-    do {
-        status = qpci_io_readb(dev, bmdma_bar, bmreg_status);
-    } while ((status & (BM_STS_ACTIVE | BM_STS_INTR)) == BM_STS_ACTIVE);
-
-    g_assert_cmpint(qtest_get_irq(qts, IDE_PRIMARY_IRQ), ==,
-                    !!(status & BM_STS_INTR));
-
-    /* Check IDE status code */
-    assert_bit_set(qpci_io_readb(dev, ide_bar, reg_status), DRDY);
-    assert_bit_clear(qpci_io_readb(dev, ide_bar, reg_status), BSY | DRQ);
-
-    /* Reading the status register clears the IRQ */
-    g_assert(!qtest_get_irq(qts, IDE_PRIMARY_IRQ));
-
-    /* Stop DMA transfer if still active */
-    if (status & BM_STS_ACTIVE) {
-        qpci_io_writeb(dev, bmdma_bar, bmreg_cmd, 0);
-    }
+    status = wait_dma_completion(qts, dev, bmdma_bar, ide_bar);
 
     free_pci_device(dev);
 
-- 
2.47.3
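
Editor's note on the polling condition in wait_dma_completion(): the
loop keeps reading the bus master status register only while the ACTIVE
bit is set and the INTR bit is still clear. A small sketch of that
predicate follows. The bit values 0x01 and 0x04 are an assumption taken
from the usual Bus Master IDE status layout, and the DEMO_* names and
dma_keep_polling() are hypothetical, not identifiers from ide-test.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions in the bus master status register. */
#define DEMO_BM_STS_ACTIVE 0x01
#define DEMO_BM_STS_INTR   0x04

/*
 * Hypothetical predicate: true while the transfer is still active and no
 * interrupt has been raised yet, i.e. while the caller should poll again.
 * Other bits in the status byte are ignored by the mask.
 */
static inline bool dma_keep_polling(uint8_t status)
{
    uint8_t mask = DEMO_BM_STS_ACTIVE | DEMO_BM_STS_INTR;

    return (status & mask) == DEMO_BM_STS_ACTIVE;
}
```

Polling therefore stops as soon as either the interrupt bit appears or
the active bit disappears, which is why the helper afterwards still
checks whether the transfer has to be stopped explicitly.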




* [Stable-10.2.3 142/149] ide-test: Test reset during TRIM
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (41 preceding siblings ...)
  2026-05-22 21:49 ` [Stable-10.2.3 141/149] ide-test: Factor out wait_dma_completion() Michael Tokarev
@ 2026-05-22 21:49 ` Michael Tokarev
  2026-05-22 21:49 ` [Stable-10.2.3 143/149] block: Create DEFAULT_BLOCK_CONF macro Michael Tokarev
                   ` (7 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Kevin Wolf, Michael Tokarev

From: Kevin Wolf <kwolf@redhat.com>

This is a regression test for the bug fixed in the previous commits, a
deadlock between the drain issued by an IDE reset and the TRIM state
machine.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20260421161132.99878-8-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 2fa24e9755994f76f08ea2452215eb50f26f4c21)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/tests/qtest/ide-test.c b/tests/qtest/ide-test.c
index c6dcb2c074..721e78170b 100644
--- a/tests/qtest/ide-test.c
+++ b/tests/qtest/ide-test.c
@@ -41,8 +41,11 @@
 #define IDE_PCI_FUNC    1
 
 #define IDE_BASE 0x1f0
+#define IDE_BASE2 0x3f6
 #define IDE_PRIMARY_IRQ 14
 
+#define IDE_CTRL_RESET 0x04
+
 #define ATAPI_BLOCK_SIZE 2048
 
 /* How many bytes to receive via ATAPI PIO at one time.
@@ -99,6 +102,7 @@ enum {
 
     CMDF_ABORT      = 0x100,
     CMDF_NO_BM      = 0x200,
+    CMDF_NO_WAIT    = 0x400,
 };
 
 enum {
@@ -228,21 +232,21 @@ static uint8_t wait_dma_completion(QTestState *qts, QPCIDevice *dev,
     return status;
 }
 
-static int send_dma_request(QTestState *qts, int cmd, uint64_t sector,
-                            int nb_sectors, PrdtEntry *prdt, int prdt_entries,
-                            void(*post_exec)(QPCIDevice *dev, QPCIBar ide_bar,
-                                             uint64_t sector, int nb_sectors))
+static int send_dma_request_dev(QTestState *qts, QPCIDevice *dev,
+                                QPCIBar bmdma_bar, QPCIBar ide_bar, int cmd,
+                                uint64_t sector, int nb_sectors,
+                                PrdtEntry *prdt, int prdt_entries,
+                                void(*post_exec)(QPCIDevice *dev,
+                                                 QPCIBar ide_bar,
+                                                 uint64_t sector,
+                                                 int nb_sectors))
 {
-    QPCIDevice *dev;
-    QPCIBar bmdma_bar, ide_bar;
     uintptr_t guest_prdt;
     size_t len;
     bool from_dev;
     uint8_t status;
     int flags;
 
-    dev = get_pci_device(qts, &bmdma_bar, &ide_bar);
-
     flags = cmd & ~0xff;
     cmd &= 0xff;
 
@@ -308,8 +312,28 @@ static int send_dma_request(QTestState *qts, int cmd, uint64_t sector,
         qpci_io_writeb(dev, bmdma_bar, bmreg_cmd, 0);
     }
 
+    if (flags & CMDF_NO_WAIT) {
+        return 0;
+    }
+
     status = wait_dma_completion(qts, dev, bmdma_bar, ide_bar);
 
+    return status;
+}
+
+static int send_dma_request(QTestState *qts, int cmd, uint64_t sector,
+                            int nb_sectors, PrdtEntry *prdt, int prdt_entries,
+                            void(*post_exec)(QPCIDevice *dev, QPCIBar ide_bar,
+                                             uint64_t sector, int nb_sectors))
+{
+    QPCIDevice *dev;
+    QPCIBar bmdma_bar, ide_bar;
+    uint8_t status;
+
+    dev = get_pci_device(qts, &bmdma_bar, &ide_bar);
+    status = send_dma_request_dev(qts, dev, bmdma_bar, ide_bar,
+                                  cmd, sector, nb_sectors, prdt, prdt_entries,
+                                  post_exec);
     free_pci_device(dev);
 
     return status;
@@ -457,6 +481,60 @@ static void test_bmdma_trim(void)
     test_bmdma_teardown(qts);
 }
 
+static void test_bmdma_trim_reset(void)
+{
+    QTestState *qts;
+    QPCIDevice *dev;
+    QPCIBar bmdma_bar, ide_bar, ide_bar2;
+    uint8_t status;
+    const uint64_t trim_range[] = {
+        trim_range_le(0, 2),
+        trim_range_le(6, 8),
+    };
+    size_t len = 512;
+    uint8_t *buf;
+    uintptr_t guest_buf;
+    PrdtEntry prdt[1];
+
+    qts = ide_test_start(
+        "-blockdev file,filename=%s,node-name=img "
+        "-blockdev blkdebug,image=img,node-name=dbg,discard=unmap,"
+        "inject-error.0.event=none,inject-error.0.iotype=discard,"
+        "inject-error.0.errno=0,inject-error.0.delay-ns=1000000 "
+        "-device ide-hd,drive=dbg,bus=ide.0",
+        tmp_path[0]);
+    qtest_irq_intercept_in(qts, "ioapic");
+
+    guest_buf = guest_alloc(&guest_malloc, len);
+    prdt[0].addr = cpu_to_le32(guest_buf),
+    prdt[0].size = cpu_to_le32(len | PRDT_EOT),
+
+    dev = get_pci_device(qts, &bmdma_bar, &ide_bar);
+    ide_bar2 = qpci_legacy_iomap(dev, IDE_BASE2);
+
+    buf = g_malloc(len);
+
+    /* TRIM request with two segments */
+    *((uint64_t *)buf) = trim_range[0];
+    *((uint64_t *)buf + 1) = trim_range[1];
+
+    qtest_memwrite(qts, guest_buf, buf, 2 * sizeof(uint64_t));
+
+    send_dma_request_dev(qts, dev, bmdma_bar, ide_bar, CMD_DSM | CMDF_NO_WAIT, 0, 1, prdt,
+                     ARRAY_SIZE(prdt), NULL);
+
+    /* Reset the device while the first segment is in flight */
+    qpci_io_writeb(dev, ide_bar2, 0, IDE_CTRL_RESET);
+
+    status = wait_dma_completion(qts, dev, bmdma_bar, ide_bar);
+    g_assert_cmphex(status, ==, BM_STS_INTR);
+    assert_bit_clear(qpci_io_readb(dev, ide_bar, reg_status), DF | ERR);
+
+    free_pci_device(dev);
+    g_free(buf);
+    test_bmdma_teardown(qts);
+}
+
 /*
  * This test is developed according to the Programming Interface for
  * Bus Master IDE Controller (Revision 1.0 5/16/94)
@@ -1138,6 +1216,7 @@ int main(int argc, char **argv)
 
     qtest_add_func("/ide/bmdma/simple_rw", test_bmdma_simple_rw);
     qtest_add_func("/ide/bmdma/trim", test_bmdma_trim);
+    qtest_add_func("/ide/bmdma/trim_reset", test_bmdma_trim_reset);
     qtest_add_func("/ide/bmdma/various_prdts", test_bmdma_various_prdts);
     qtest_add_func("/ide/bmdma/no_busmaster", test_bmdma_no_busmaster);
 
-- 
2.47.3




* [Stable-10.2.3 143/149] block: Create DEFAULT_BLOCK_CONF macro
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (42 preceding siblings ...)
  2026-05-22 21:49 ` [Stable-10.2.3 142/149] ide-test: Test reset during TRIM Michael Tokarev
@ 2026-05-22 21:49 ` Michael Tokarev
  2026-05-22 21:49 ` [Stable-10.2.3 144/149] block: Add more defaults to DEFAULT_BLOCK_CONF Michael Tokarev
                   ` (6 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Kevin Wolf, Michael Tokarev

From: Kevin Wolf <kwolf@redhat.com>

The property default values from include/hw/block/block.h were
duplicated in scsi_bus_legacy_handle_cmdline(), allowing them to go out
of sync easily. There doesn't seem to be a good way to avoid the duplication,
but moving them next to each other in the header file should help to
avoid this problem in the future.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20260410152314.86412-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit a1310cc6281d22ac948f4aa198dcc55d58fc039d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index b9b115deed..982287191e 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -482,12 +482,7 @@ void scsi_bus_legacy_handle_cmdline(SCSIBus *bus)
     Location loc;
     DriveInfo *dinfo;
     int unit;
-    BlockConf conf = {
-        .bootindex = -1,
-        .share_rw = false,
-        .rerror = BLOCKDEV_ON_ERROR_AUTO,
-        .werror = BLOCKDEV_ON_ERROR_AUTO,
-    };
+    BlockConf conf = DEFAULT_BLOCK_CONF;
 
     loc_push_none(&loc);
     for (unit = 0; unit <= bus->info->max_target; unit++) {
diff --git a/include/hw/block/block.h b/include/hw/block/block.h
index b4d914624e..7da643faff 100644
--- a/include/hw/block/block.h
+++ b/include/hw/block/block.h
@@ -51,6 +51,13 @@ static inline unsigned int get_physical_block_exp(BlockConf *conf)
     return exp;
 }
 
+#define DEFAULT_BLOCK_CONF (BlockConf) {                                \
+    .bootindex = -1,                                                    \
+    .share_rw = false,                                                  \
+    .rerror = BLOCKDEV_ON_ERROR_AUTO,                                   \
+    .werror = BLOCKDEV_ON_ERROR_AUTO,                                   \
+}
+
 #define DEFINE_BLOCK_PROPERTIES_BASE(_state, _conf)                     \
     DEFINE_PROP_ON_OFF_AUTO("backend_defaults", _state,                 \
                             _conf.backend_defaults, ON_OFF_AUTO_AUTO),  \
-- 
2.47.3




* [Stable-10.2.3 144/149] block: Add more defaults to DEFAULT_BLOCK_CONF
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (43 preceding siblings ...)
  2026-05-22 21:49 ` [Stable-10.2.3 143/149] block: Create DEFAULT_BLOCK_CONF macro Michael Tokarev
@ 2026-05-22 21:49 ` Michael Tokarev
  2026-05-22 21:49 ` [Stable-10.2.3 145/149] commit: Drain nodes across all of bdrv_commit() Michael Tokarev
                   ` (5 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Kevin Wolf, Lexi Winter, Michael Tokarev

From: Kevin Wolf <kwolf@redhat.com>

discard_granularity was missing from this, which means that SCSI disks
created with -drive if=scsi would default to 0 (i.e. disabling discards)
instead of -1, which makes scsi-hd automatically pick a granularity and
is the default of the corresponding qdev property for -device scsi-hd.

This was broken in QEMU 9.0 with commit 3089637.

Also set other fields whose default isn't an obvious 0. These are not
actual bug fixes because ON_OFF_AUTO_AUTO in fact happens to be 0, but
it's better not to rely on the order of enums.

Cc: qemu-stable@nongnu.org
Fixes: 308963746169 ('scsi: Don't ignore most usb-storage properties')
Reported-by: Lexi Winter <ivy@FreeBSD.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20260410152314.86412-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit f27aea1896338f4dd085a0e2cb2ab3797c5fe3e9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/include/hw/block/block.h b/include/hw/block/block.h
index 7da643faff..9225e75925 100644
--- a/include/hw/block/block.h
+++ b/include/hw/block/block.h
@@ -53,7 +53,12 @@ static inline unsigned int get_physical_block_exp(BlockConf *conf)
 
 #define DEFAULT_BLOCK_CONF (BlockConf) {                                \
     .bootindex = -1,                                                    \
+    .backend_defaults = ON_OFF_AUTO_AUTO,                               \
+    .discard_granularity = -1,                                          \
+    .wce = ON_OFF_AUTO_AUTO,                                            \
     .share_rw = false,                                                  \
+    .account_invalid = ON_OFF_AUTO_AUTO,                                \
+    .account_failed = ON_OFF_AUTO_AUTO,                                 \
     .rerror = BLOCKDEV_ON_ERROR_AUTO,                                   \
     .werror = BLOCKDEV_ON_ERROR_AUTO,                                   \
 }
-- 
2.47.3




* [Stable-10.2.3 145/149] commit: Drain nodes across all of bdrv_commit()
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (44 preceding siblings ...)
  2026-05-22 21:49 ` [Stable-10.2.3 144/149] block: Add more defaults to DEFAULT_BLOCK_CONF Michael Tokarev
@ 2026-05-22 21:49 ` Michael Tokarev
  2026-05-22 21:49 ` [Stable-10.2.3 146/149] qemu-io: Add 'aio_discard' command Michael Tokarev
                   ` (4 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Kevin Wolf, Denis V. Lunev, Michael Tokarev

From: Kevin Wolf <kwolf@redhat.com>

The whole implementation of bdrv_commit() is only correct if no new
writes come in while it's running: It has only a single loop checking
the allocation status for each block and finally calls bdrv_make_empty()
without checking if that throws away any new changes.

We already have to drain while taking the graph write lock. Just extend
the drained section to all of bdrv_commit() to make sure that we don't
get any inconsistencies.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20260427170520.101242-2-kwolf@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Tested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit f0d9ccd46cf8fc576ab7d514f10f766546cdbc14)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/block/commit.c b/block/commit.c
index 0d9e1a16d7..c5e3ef03a2 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -518,6 +518,7 @@ int bdrv_commit(BlockDriverState *bs)
     if (!drv)
         return -ENOMEDIUM;
 
+    bdrv_drain_all_begin();
     bdrv_graph_rdlock_main_loop();
 
     backing_file_bs = bdrv_cow_bs(bs);
@@ -549,6 +550,10 @@ int bdrv_commit(BlockDriverState *bs)
                   BLK_PERM_ALL);
     backing = blk_new(ctx, BLK_PERM_WRITE | BLK_PERM_RESIZE, BLK_PERM_ALL);
 
+    /* We drained all nodes, but still make requests through BlockBackends */
+    blk_set_disable_request_queuing(src, true);
+    blk_set_disable_request_queuing(backing, true);
+
     ret = blk_insert_bs(src, bs, &local_err);
     if (ret < 0) {
         error_report_err(local_err);
@@ -565,7 +570,7 @@ int bdrv_commit(BlockDriverState *bs)
 
     bdrv_graph_rdunlock_main_loop();
 
-    bdrv_graph_wrlock_drained();
+    bdrv_graph_wrlock();
     bdrv_set_backing_hd(commit_top_bs, backing_file_bs, &error_abort);
     bdrv_set_backing_hd(bs, commit_top_bs, &error_abort);
     bdrv_graph_wrunlock();
@@ -647,7 +652,7 @@ ro_cleanup:
     blk_unref(backing);
 
     bdrv_graph_rdunlock_main_loop();
-    bdrv_graph_wrlock_drained();
+    bdrv_graph_wrlock();
     if (bdrv_cow_bs(bs) != backing_file_bs) {
         bdrv_set_backing_hd(bs, backing_file_bs, &error_abort);
     }
@@ -663,6 +668,7 @@ ro_cleanup:
 
 out:
     bdrv_graph_rdunlock_main_loop();
+    bdrv_drain_all_end();
 
     return ret;
 }
-- 
2.47.3




* [Stable-10.2.3 146/149] qemu-io: Add 'aio_discard' command
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (45 preceding siblings ...)
  2026-05-22 21:49 ` [Stable-10.2.3 145/149] commit: Drain nodes across all of bdrv_commit() Michael Tokarev
@ 2026-05-22 21:49 ` Michael Tokarev
  2026-05-22 21:49 ` [Stable-10.2.3 147/149] qcow2: Fix corruption on discard during write with COW Michael Tokarev
                   ` (3 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Kevin Wolf, Denis V. Lunev, Michael Tokarev

From: Kevin Wolf <kwolf@redhat.com>

Testing interactions between multiple requests that include discard
requests requires that qemu-io can do the discard asynchronously, like it
already does for reads and writes. To this effect, add an 'aio_discard'
command.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20260427170520.101242-3-kwolf@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Tested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 7f8466e2ce620e3c6a6e2f32d616367174d4dbe9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index f6d077908f..de4c1966fe 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -2218,6 +2218,120 @@ static int discard_f(BlockBackend *blk, int argc, char **argv)
     return 0;
 }
 
+static void aio_discard_help(void)
+{
+    printf(
+"\n"
+" asynchronously discards a range of bytes from the given offset\n"
+"\n"
+" Example:\n"
+" 'aio_discard 512 1k' - discards 1 kilobyte from 512 bytes into the file\n"
+"\n"
+" Discards a segment of the currently open file.\n"
+" -C, -- report statistics in a machine parsable format\n"
+" -q, -- quiet mode, do not show I/O statistics\n"
+" The discard is performed asynchronously and the aio_flush command must be\n"
+" used to ensure all outstanding aio requests have been completed.\n"
+" Note that due to its asynchronous nature, this command will be\n"
+" considered successful once the request is submitted, independently\n"
+" of potential I/O errors.\n"
+"\n");
+}
+
+static int aio_discard_f(BlockBackend *blk, int argc, char **argv);
+
+static const cmdinfo_t aio_discard_cmd = {
+    .name       = "aio_discard",
+    .cfunc      = aio_discard_f,
+    .perm       = BLK_PERM_WRITE,
+    .argmin     = 2,
+    .argmax     = -1,
+    .args       = "[-Cq] off len",
+    .oneline    = "asynchronously discards a number of bytes",
+    .help       = aio_discard_help,
+};
+
+static void aio_discard_done(void *opaque, int ret)
+{
+    struct aio_ctx *ctx = opaque;
+    struct timespec t2;
+
+    clock_gettime(CLOCK_MONOTONIC, &t2);
+
+    if (ret < 0) {
+        printf("aio_discard failed: %s\n", strerror(-ret));
+        block_acct_failed(blk_get_stats(ctx->blk), &ctx->acct);
+        goto out;
+    }
+
+    block_acct_done(blk_get_stats(ctx->blk), &ctx->acct);
+
+    if (ctx->qflag) {
+        goto out;
+    }
+
+    /* Finally, report back -- -C gives a parsable format */
+    t2 = tsub(t2, ctx->t1);
+    print_report("discarded ", &t2, ctx->offset, ctx->qiov.size,
+                 ctx->qiov.size, 1, ctx->Cflag);
+out:
+    g_free(ctx);
+}
+
+static int aio_discard_f(BlockBackend *blk, int argc, char **argv)
+{
+    int c, ret;
+    int64_t count;
+    struct aio_ctx *ctx = g_new0(struct aio_ctx, 1);
+
+    ctx->blk = blk;
+
+    while ((c = getopt(argc, argv, "Cq")) != -1) {
+        switch (c) {
+        case 'C':
+            ctx->Cflag = true;
+            break;
+        case 'q':
+            ctx->qflag = true;
+            break;
+        default:
+            g_free(ctx);
+            qemuio_command_usage(&aio_discard_cmd);
+            return -EINVAL;
+        }
+    }
+
+    if (optind != argc - 2) {
+        g_free(ctx);
+        qemuio_command_usage(&aio_discard_cmd);
+        return -EINVAL;
+    }
+
+    ctx->offset = cvtnum(argv[optind]);
+    if (ctx->offset < 0) {
+        ret = ctx->offset;
+        print_cvtnum_err(ret, argv[optind]);
+        g_free(ctx);
+        return ret;
+    }
+    optind++;
+
+    count = cvtnum(argv[optind]);
+    if (count < 0) {
+        print_cvtnum_err(count, argv[optind]);
+        g_free(ctx);
+        return count;
+    }
+
+    clock_gettime(CLOCK_MONOTONIC, &ctx->t1);
+    ctx->qiov.size = count;
+    block_acct_start(blk_get_stats(blk), &ctx->acct, ctx->qiov.size,
+                     BLOCK_ACCT_UNMAP);
+    blk_aio_pdiscard(blk, ctx->offset, count, aio_discard_done, ctx);
+
+    return 0;
+}
+
 static int alloc_f(BlockBackend *blk, int argc, char **argv)
 {
     BlockDriverState *bs = blk_bs(blk);
@@ -2800,6 +2914,7 @@ static void __attribute((constructor)) init_qemuio_commands(void)
     qemuio_add_command(&length_cmd);
     qemuio_add_command(&info_cmd);
     qemuio_add_command(&discard_cmd);
+    qemuio_add_command(&aio_discard_cmd);
     qemuio_add_command(&alloc_cmd);
     qemuio_add_command(&map_cmd);
     qemuio_add_command(&reopen_cmd);
-- 
2.47.3




* [Stable-10.2.3 147/149] qcow2: Fix corruption on discard during write with COW
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (46 preceding siblings ...)
  2026-05-22 21:49 ` [Stable-10.2.3 146/149] qemu-io: Add 'aio_discard' command Michael Tokarev
@ 2026-05-22 21:49 ` Michael Tokarev
  2026-05-22 21:49 ` [Stable-10.2.3 148/149] iotests/046: Test that discard/write_zeroes wait for dependencies Michael Tokarev
                   ` (2 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Kevin Wolf, Denis V. Lunev, Michael Tokarev

From: Kevin Wolf <kwolf@redhat.com>

Most code in qcow2 that accesses (and potentially modifies) L2 tables
does so while holding s->lock.

There is one exception, which is allocating writes. They hold the lock
initially while allocating clusters, but drop it for writing the guest
payload before taking the lock again for updating the L2 tables. This
allows concurrent requests that touch other parts of the image file to
continue in parallel and is an important performance optimisation.

However, this means that other requests that run while the lock is
dropped for writing guest data must synchronise with the list of
allocating requests in s->cluster_allocs and wait if they would overlap.
For writes, this is done in handle_dependencies(), but discard and write
zeros operations neglect to synchronise with s->cluster_allocs.

This means that discard can free a cluster whose L2 entry will already
be modified in qcow2_alloc_cluster_link_l2() by a previously started
write. In the case of a pre-allocated zero cluster that is in the
process of being overwritten, this means that discard can lead to a
situation where the cluster is still mapped (because the write will
restore the L2 entry just without the zero flag), but its refcount has
been decreased, resulting in a corrupted image.

Add the missing synchronisation to qcow2_cluster_discard() and
qcow2_subcluster_zeroize() to fix the problem.

Cc: qemu-stable@nongnu.org
Reported-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20260427170520.101242-4-kwolf@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Tested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b8bfb1478d61512f851badd0d912c6661a2efee7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
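
Editor's note on the overlap rule described above: a new request either
does not intersect an in-flight allocation, may be shortened so that it
stops where the allocation begins, or has to wait. The following is a
simplified sketch of that decision only, assuming half-open byte ranges.
bytes_before_conflict() is a hypothetical name, and the real
handle_dependencies() additionally rounds to cluster boundaries, walks
s->cluster_allocs and yields while waiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many bytes starting at `start` may proceed
 * without touching the in-flight allocation [old_start, old_end).
 * A result of 0 means the caller has to wait for that allocation.
 */
static inline uint64_t bytes_before_conflict(uint64_t start, uint64_t bytes,
                                             uint64_t old_start,
                                             uint64_t old_end,
                                             bool allow_shortening)
{
    uint64_t end = start + bytes;

    if (end <= old_start || start >= old_end) {
        /* No intersection: the whole request can go ahead. */
        return bytes;
    }

    if (start < old_start && allow_shortening) {
        /* Stop at the start of the running allocation. */
        return old_start - start;
    }

    /* Overlap that cannot (or must not) be shortened: wait. */
    return 0;
}
```

Allocating writes pass allow_shortening == true and can continue with
the shortened range, while discard and write zeroes must cover their
whole range and so always wait on any overlap.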

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index c655bf6df4..8b1e80bd0b 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1392,6 +1392,9 @@ count_single_write_clusters(BlockDriverState *bs, int nb_clusters,
  * the same cluster. In this case we need to wait until the previous
  * request has completed and updated the L2 table accordingly.
  *
+ * If allow_shortening == true, instead of waiting for a dependency, *cur_bytes
+ * can be shortened so that the cluster allocations don't overlap.
+ *
  * Returns:
  *   0       if there was no dependency. *cur_bytes indicates the number of
  *           bytes from guest_offset that can be read before the next
@@ -1403,7 +1406,9 @@ count_single_write_clusters(BlockDriverState *bs, int nb_clusters,
  */
 static int coroutine_fn handle_dependencies(BlockDriverState *bs,
                                             uint64_t guest_offset,
-                                            uint64_t *cur_bytes, QCowL2Meta **m)
+                                            uint64_t *cur_bytes,
+                                            bool allow_shortening,
+                                            QCowL2Meta **m)
 {
     BDRVQcow2State *s = bs->opaque;
     QCowL2Meta *old_alloc;
@@ -1434,7 +1439,7 @@ static int coroutine_fn handle_dependencies(BlockDriverState *bs,
 
         /* Conflict */
 
-        if (start < old_start) {
+        if (start < old_start && allow_shortening) {
             /* Stop at the start of a running allocation */
             bytes = old_start - start;
         } else {
@@ -1469,6 +1474,29 @@ static int coroutine_fn handle_dependencies(BlockDriverState *bs,
     return 0;
 }
 
+static void coroutine_mixed_fn wait_for_dependencies(BlockDriverState *bs,
+                                                     uint64_t guest_offset,
+                                                     uint64_t bytes)
+{
+    BDRVQcow2State *s = bs->opaque;
+    QCowL2Meta *m = NULL;
+    int ret;
+
+    /*
+     * Discard has some non-coroutine callers (creating internal snapshots and
+     * make empty). They are calling from qemu-img or in a drained section, so
+     * we know that no writes can be in progress.
+     */
+    if (!qemu_in_coroutine()) {
+        assert(QLIST_EMPTY(&s->cluster_allocs));
+        return;
+    }
+
+    do {
+        ret = handle_dependencies(bs, guest_offset, &bytes, false, &m);
+    } while (ret == -EAGAIN);
+}
+
 /*
  * Checks how many already allocated clusters that don't require a new
  * allocation there are at the given guest_offset (up to *bytes).
@@ -1840,7 +1868,7 @@ again:
          *         the right synchronisation between the in-flight request and
          *         the new one.
          */
-        ret = handle_dependencies(bs, start, &cur_bytes, m);
+        ret = handle_dependencies(bs, start, &cur_bytes, true, m);
         if (ret == -EAGAIN) {
             /* Currently handle_dependencies() doesn't yield if we already had
              * an allocation. If it did, we would have to clean up the L2Meta
@@ -2000,6 +2028,15 @@ int qcow2_cluster_discard(BlockDriverState *bs, uint64_t offset,
     int64_t cleared;
     int ret;
 
+    /*
+     * If we're touching a cluster for which allocating writes are in flight,
+     * wait for them to complete to avoid conflicting metadata updates.
+     *
+     * We don't need to allocate a QCowL2Meta for the discard operation because
+     * s->lock is held for the duration of the whole operation.
+     */
+    wait_for_dependencies(bs, offset, bytes);
+
     /* Caller must pass aligned values, except at image end */
     assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
     assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) ||
@@ -2160,6 +2197,15 @@ int coroutine_fn qcow2_subcluster_zeroize(BlockDriverState *bs, uint64_t offset,
     int64_t cleared;
     int ret;
 
+    /*
+     * If we're touching a cluster for which allocating writes are in flight,
+     * wait for them to complete to avoid conflicting metadata updates.
+     *
+     * We don't need to allocate a QCowL2Meta for the zeroize operation because
+     * s->lock is held for the duration of the whole operation.
+     */
+    wait_for_dependencies(bs, offset, bytes);
+
     /* If we have to stay in sync with an external data file, zero out
      * s->data_file first. */
     if (data_file_is_raw(bs)) {
-- 
2.47.3




* [Stable-10.2.3 148/149] iotests/046: Test that discard/write_zeroes wait for dependencies
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (47 preceding siblings ...)
  2026-05-22 21:49 ` [Stable-10.2.3 147/149] qcow2: Fix corruption on discard during write with COW Michael Tokarev
@ 2026-05-22 21:49 ` Michael Tokarev
  2026-05-22 21:49 ` [Stable-10.2.3 149/149] block/graph-lock: fix missed wakeup in bdrv_graph_co_rdunlock() Michael Tokarev
  2026-05-23  9:00 ` [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Alex Bennée
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Kevin Wolf, Denis V. Lunev, Michael Tokarev

From: Kevin Wolf <kwolf@redhat.com>

This is a regression test for the bug fixed in the previous commit where
discard and write_zeroes operations wouldn't consider their dependencies
in s->cluster_allocs. Without the fix, this results in a corrupted
image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20260427170520.101242-5-kwolf@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Tested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 389f5bcc744d3ddc127d550a57261aed9bbba1f3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/tests/qemu-iotests/046 b/tests/qemu-iotests/046
index 4c9ed4d26e..e03dd40147 100755
--- a/tests/qemu-iotests/046
+++ b/tests/qemu-iotests/046
@@ -184,6 +184,48 @@ aio_write -P 160 0x104000 0x18000
 resume A
 aio_flush
 EOF
+
+# Create a pre-allocated zero cluster, then start a write on it and discard it
+# before the L2 update is made
+cat  <<EOF
+write -P 181 0x120000 0x10000
+write -z 0x120000 0x10000
+
+break write_aio A
+aio_write -P 180 0x120000 0x10000
+wait_break A
+aio_discard 0x120000 0x10000
+resume A
+aio_flush
+EOF
+
+# Create a pre-allocated zero cluster, then start a write on it and a
+# concurrent zero write with MAY_UNMAP before the L2 update is made
+cat  <<EOF
+write -P 181 0x130000 0x10000
+write -z 0x130000 0x10000
+
+break write_aio A
+aio_write -P 180 0x130000 0x10000
+wait_break A
+aio_write -z -u 0x130000 0x10000
+resume A
+aio_flush
+EOF
+
+# Create a pre-allocated zero cluster, then start a write on it and a
+# concurrent zero write without MAY_UNMAP before the L2 update is made
+cat  <<EOF
+write -P 181 0x140000 0x10000
+write -z 0x140000 0x10000
+
+break write_aio A
+aio_write -P 180 0x140000 0x10000
+wait_break A
+aio_write -z 0x140000 0x10000
+resume A
+aio_flush
+EOF
 }
 
 overlay_io | $QEMU_IO blkdebug::"$TEST_IMG" | _filter_qemu_io |\
@@ -264,6 +306,10 @@ verify_io()
     # Undefined content for 0x10c000 0x8000
     echo read -P 160 0x114000 0x8000
     echo read -P 17  0x11c000 0x4000
+
+    echo read -P 0   0x120000 0x10000
+    echo read -P 0   0x130000 0x10000
+    echo read -P 0   0x140000 0x10000
 }
 
 verify_io | $QEMU_IO "$TEST_IMG" | _filter_qemu_io
diff --git a/tests/qemu-iotests/046.out b/tests/qemu-iotests/046.out
index b1a03f4041..6341df335c 100644
--- a/tests/qemu-iotests/046.out
+++ b/tests/qemu-iotests/046.out
@@ -139,6 +139,36 @@ wrote XXX/XXX bytes at offset XXX
 XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 wrote XXX/XXX bytes at offset XXX
 XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+blkdebug: Suspended request 'A'
+blkdebug: Resuming request 'A'
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+discarded  XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+blkdebug: Suspended request 'A'
+blkdebug: Resuming request 'A'
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+blkdebug: Suspended request 'A'
+blkdebug: Resuming request 'A'
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
 == Verify image content ==
 read 65536/65536 bytes at offset 0
@@ -239,5 +269,11 @@ read 32768/32768 bytes at offset 1130496
 32 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 read 16384/16384 bytes at offset 1163264
 16 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 1179648
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 1245184
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 1310720
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 No errors were found on the image.
 *** done
-- 
2.47.3




* [Stable-10.2.3 149/149] block/graph-lock: fix missed wakeup in bdrv_graph_co_rdunlock()
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (48 preceding siblings ...)
  2026-05-22 21:49 ` [Stable-10.2.3 148/149] iotests/046: Test that discard/write_zeroes wait for dependencies Michael Tokarev
@ 2026-05-22 21:49 ` Michael Tokarev
  2026-05-23  9:00 ` [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Alex Bennée
  50 siblings, 0 replies; 52+ messages in thread
From: Michael Tokarev @ 2026-05-22 21:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-stable, Denis V. Lunev, Kevin Wolf, Hanna Reitz,
	Stefan Hajnoczi, Fiona Ebner, Michael Tokarev

From: "Denis V. Lunev" <den@openvz.org>

tests/qemu-iotests/tests/iothreads-create reproduces the hang on
master under `stress-ng --cpu $(nproc) --timeout 0`.  The iotest's
vm.run_job() times out and qemu stays permanently stuck in
ppoll(timeout=-1) inside bdrv_graph_wrlock_drained -> blk_remove_bs
during qemu_cleanup().  The timing window is narrow on modern
bare-metal hardware and much wider in a VM guest; downstream trees
that still use plain bdrv_graph_wrlock() in blk_remove_bs() hit it
on the first iteration under the same stress.

bdrv_graph_wrlock() zeroes has_writer around its AIO_WAIT_WHILE loop
so that callbacks dispatched by aio_poll() can still take the read
lock on the fast path.  The rdunlock side, however, only kicks a
waiting writer when has_writer is observed set; a reader that drops
its lock inside the polling window silently returns and nothing ever
wakes the writer:

  main thread                         iothread0 coroutine
  -----------                         -------------------
  bdrv_graph_wrlock:                  rdlock held, reader_count=1
    bdrv_drain_all_begin_nopoll
    has_writer = 0
    AIO_WAIT_WHILE_UNLOCKED(
        NULL, reader_count >= 1):
      num_waiters++
      smp_mb
      aio_poll(main_ctx, true)   -->  bdrv_graph_co_rdunlock:
        (ppoll, blocked)                reader_count-- -> 0
                                        smp_mb
                                        read has_writer = 0
                                        skip aio_wait_kick()
                                      return

reader_count is now 0 and num_waiters is still 1, but no BH, fd or
timer on the main AioContext will fire -- the only entity that could
kick just decided it did not have to.  Main stays in ppoll() holding
BQL, so RCU, VCPUs and any iothread path that needs BQL stall behind
it.  The hang is final; no timeout, no forward progress, no recovery
as there is no other source of wake up inside qemu_cleanup().

bdrv_drain_all_begin() does not close the race on its own: it
quiesces in-flight I/O, but graph readers also include non-I/O
coroutines (block-job cleanup, virtio-scsi polling) that drain does
not evict.  The bdrv_graph_wrlock_drained() wrapper narrows the
window but does not eliminate it; every plain bdrv_graph_wrlock()
site is exposed on the same basis.

Drop the has_writer check in bdrv_graph_co_rdunlock() and call
aio_wait_kick() unconditionally.  The helper itself loads num_waiters
atomically and only schedules a dummy BH when a waiter exists, so the
change is a no-op on the no-writer path and closes the missed-wakeup
on the writer path.
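
The race and the fix can be replayed in a deliberately simplified,
single-threaded model. This is a sketch under stated assumptions, not QEMU
code: struct graph_lock_model and every model_* function are hypothetical
stand-ins, plain ints replace the atomics and memory barriers, and
model_wait_kick() only mimics the "kick if a waiter exists" behaviour
described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state mirroring the variables named in the commit message. */
struct graph_lock_model {
    int has_writer;      /* zeroed by the writer while it polls */
    int reader_count;    /* readers currently holding the lock */
    int num_waiters;     /* writers blocked in the wait loop */
    bool writer_kicked;  /* set once a wakeup has been scheduled */
};

/* Stand-in for aio_wait_kick(): a no-op unless somebody is waiting. */
static void model_wait_kick(struct graph_lock_model *m)
{
    if (m->num_waiters > 0) {
        m->writer_kicked = true;
    }
}

/* The writer enters its polling window, as in the diagram. */
static void model_writer_starts_polling(struct graph_lock_model *m)
{
    m->has_writer = 0;
    m->num_waiters++;
}

/* Old behaviour: kick only when has_writer is observed set. */
static void model_rdunlock_old(struct graph_lock_model *m)
{
    assert(m->reader_count > 0);
    m->reader_count--;
    if (m->has_writer) {
        model_wait_kick(m);
    }
}

/* Fixed behaviour: always kick. */
static void model_rdunlock_new(struct graph_lock_model *m)
{
    assert(m->reader_count > 0);
    m->reader_count--;
    model_wait_kick(m);
}

/* Replays the interleaving; returns whether the polling writer is woken. */
static bool model_writer_woken(bool use_fix)
{
    struct graph_lock_model m = { .has_writer = 1, .reader_count = 1 };

    model_writer_starts_polling(&m);
    if (use_fix) {
        model_rdunlock_new(&m);
    } else {
        model_rdunlock_old(&m);
    }
    return m.writer_kicked;
}
```

With use_fix == false the model ends with reader_count == 0, num_waiters == 1
and no wakeup scheduled, which is the hang described above. With
use_fix == true the waiter is kicked.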

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20260424103917.248668-2-den@openvz.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit e3082ab3b38538ebdbc5cd62b4c476b673c5e515)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

diff --git a/block/graph-lock.c b/block/graph-lock.c
index b7319473a1..f2501d75fb 100644
--- a/block/graph-lock.c
+++ b/block/graph-lock.c
@@ -278,14 +278,12 @@ void coroutine_fn bdrv_graph_co_rdunlock(void)
     smp_mb();

     /*
-     * has_writer == 0: this means reader will read reader_count decreased
-     * has_writer == 1: we don't know if writer read reader_count old or
-     *                  new. Therefore, kick again so on next iteration
-     *                  writer will for sure read the updated value.
+     * Always kick: bdrv_graph_wrlock() zeroes has_writer while polling (to
+     * let callbacks take the reader lock via the fast path), so we cannot
+     * rely on has_writer to detect a waiting writer. aio_wait_kick() is a
+     * no-op when no one is waiting, so it is cheap in the common case.
      */
-    if (qatomic_read(&has_writer)) {
-        aio_wait_kick();
-    }
+    aio_wait_kick();
 }

 void bdrv_graph_rdlock_main_loop(void)
-- 
2.47.3


* Re: [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen)
  2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
                   ` (49 preceding siblings ...)
  2026-05-22 21:49 ` [Stable-10.2.3 149/149] block/graph-lock: fix missed wakeup in bdrv_graph_co_rdunlock() Michael Tokarev
@ 2026-05-23  9:00 ` Alex Bennée
  50 siblings, 0 replies; 52+ messages in thread
From: Alex Bennée @ 2026-05-23  9:00 UTC (permalink / raw)
  To: Michael Tokarev; +Cc: qemu-devel, qemu-stable

Michael Tokarev <mjt@tls.msk.ru> writes:

> The following patches are queued for QEMU stable v10.2.3:
>
>   https://gitlab.com/qemu-project/qemu/-/commits/staging-10.2
>
> Patch freeze is 2026-05-22, and the release is planned for 2026-05-24:
>
>   https://wiki.qemu.org/Planning/10.2
>
> Please respond here or CC qemu-stable@nongnu.org on any additional patches
> you think should (or shouldn't) be included in the release.
>
> The changes which are staging for inclusion, with the original commit hash
> from master branch, are given below the bottom line.

I just wanted to make my appreciation known for the work you do keeping
the stable trees going. I suspect the AI-pocalypse is going to result in
quite a number of stable series patches over the next year.

>
> Thanks!
>
> /mjt
>
<snip>

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



end of thread, other threads:[~2026-05-23  9:00 UTC | newest]

Thread overview: 52+ messages
2026-05-22 21:48 [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 100/149] linux-user: Flush errors by using exit() instead of _exit() in error path Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 101/149] linux-user: Allow getsockopt() with NULL optval address Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 102/149] linux-user: Translate errno in IP_RECVERR and IPV6_RECVERR Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 103/149] hw/intc/xics: Add a check for an invalid server id Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 104/149] tests/rcutorture: Fix build error Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 105/149] hw/ppc/e500: fix bus-frequency property hardcoded to zero in CPU FDT node Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 106/149] hw/net/allwinner-sun8i-emac: Flush queued packets when rx is enabled Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 107/149] hw/intc/arm_gicv3: Fix NS write to ICC_AP1Rn_EL1 when prebits < 7 Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 108/149] migration: Fix low possibility downtime violation Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 109/149] target/microblaze: Fix endianness used to disassemble Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 110/149] target/arm: Report IL=0 for Thumb 16-bit BKPT insn Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 111/149] hw/misc/bcm2835_rng: Specify valid memory access sizes Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 112/149] hw/uefi: fix buffer overruns Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 113/149] hw/uefi: verify pio_xfer_offset before calculating buffer checksum Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 114/149] hw/uefi: fix ucs2 string helper functions Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 115/149] hw/uefi: add name_size check to uefi_vars_mm_lock_variable() Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 116/149] hw/uefi: verify data size before accessing it in wrap_pkcs7 Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 117/149] hw/uefi: avoid possibly unaligned variable_auth_2 struct field access Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 118/149] hw/uefi: check auth.hdr_length minimum size Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 119/149] hw/ufs: Validate MCQ SQ references before use Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 120/149] hw/ufs: Guard MCQ CQ accesses against missing queues Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 121/149] hw/ufs: Reject zero-depth MCQ queues Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 122/149] hw/ufs: Keep MCQ SQs alive while requests are outstanding Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 123/149] hw/ufs: Zero reserved bytes in REPORT LUNS response header Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 124/149] hw/display/cirrus_vga: Fix packed-24 color-expansion transparent pattern fills Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 125/149] hw/display/cirrus_vga: Fix packed-24 color-expansion transparent copies Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 126/149] hw/misc/aspeed_sbc: Add bounds checking for OTP write operations Michael Tokarev
2026-05-22 21:48 ` [Stable-10.2.3 127/149] aspeed/hace: Fix out-of-bounds read in has_padding() Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 128/149] aspeed/hace: Prevent total_req_len overflow Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 129/149] hw/i2c/microbit_i2c: Don't index off end of twi_read_sequence[] Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 130/149] meson.build: Add -fzero-init-padding-bits=all Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 131/149] tests/functional/qemu_test/asset.py: Don't use setxattr when it doesn't exist Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 132/149] hw/nvme: fix admin cq msix setup Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 133/149] linux-user: Fix AT_EXECFN in AUXV for symlinked programs Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 134/149] linux-user/sh4: Fix target_ucontext tuc_link field type Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 135/149] linux-user/sh4: Fix setup_sigtramp to match Linux kernel trampoline pattern Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 136/149] blkdebug: Add 'delay-ns' option Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 137/149] block: Add blk_co_start/end_request() and BDRV_REQ_NO_QUEUE Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 138/149] block: Add flags parameter to blk_*_pdiscard() Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 139/149] ide: Minimal fix for deadlock between TRIM and drain Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 140/149] ide: Clean up ide_trim_co_entry() to be idiomatic coroutine code Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 141/149] ide-test: Factor out wait_dma_completion() Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 142/149] ide-test: Test reset during TRIM Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 143/149] block: Create DEFAULT_BLOCK_CONF macro Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 144/149] block: Add more defaults to DEFAULT_BLOCK_CONF Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 145/149] commit: Drain nodes across all of bdrv_commit() Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 146/149] qemu-io: Add 'aio_discard' command Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 147/149] qcow2: Fix corruption on discard during write with COW Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 148/149] iotests/046: Test that discard/write_zeroes wait for dependencies Michael Tokarev
2026-05-22 21:49 ` [Stable-10.2.3 149/149] block/graph-lock: fix missed wakeup in bdrv_graph_co_rdunlock() Michael Tokarev
2026-05-23  9:00 ` [Stable-10.2.3 v2 000/149] Patch Round-up for stable 10.2.3, freeze on 2026-05-22 (frozen) Alex Bennée
