public inbox for qemu-devel@nongnu.org
 help / color / mirror / Atom feed
* [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze
@ 2026-03-02  8:41 Paolo Bonzini
  2026-03-02  8:41 ` [PULL 001/102] hw/i386/vmmouse: Fix hypercall clobbers Paolo Bonzini
                   ` (101 more replies)
  0 siblings, 102 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:41 UTC (permalink / raw)
  To: qemu-devel

The following changes since commit d8a9d97317d03190b34498741f98f22e2a9afe3e:

  Merge tag 'pull-target-arm-20260226' of https://gitlab.com/pm215/qemu into staging (2026-02-26 16:00:07 +0000)

are available in the Git repository at:

  https://gitlab.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 5a0f9481b0cf344c4437515b596e4ecf57ccc30f:

  target/i386: emulate: fix scas (2026-03-01 16:02:54 +0100)

----------------------------------------------------------------
* target/alpha: Fix for record/replay issue
* accel/nitro: New Nitro Enclaves accelerator
* generic + kvm: add support for rebuilding VMs on reset
* audio requirements cleanup
* vmmouse: Fix hypercall clobbers
* rust: use checked_div to make clippy happy
* kvm: Don't clear pending #SMI in kvm_get_vcpu_events
* target/i386/emulate: rework MMU code, many fixes
* target/i386/whpx: replace winhvemulation with target/i386/emulate
* target/i386/whpx: x2apic support
* target/i386/whpx: vapic support
* kvm: support for the "ignore guest PAT" quirk
* target/i386: add ITS_NO bit for the arch-capabilities MSR
* target/i386: add MBEC bit for nested VMX

----------------------------------------------------------------
Akihiko Odaki (2):
      target/alpha: Reset CPU
      Reapply "rcu: Unify force quiescent state"

Alexander Graf (11):
      scripts/update-linux-headers: Add Nitro Enclaves header
      linux-headers: Add nitro_enclaves.h
      hw/nitro: Add Nitro Vsock Bus
      accel: Add Nitro Enclaves accelerator
      hw/nitro/nitro-serial-vsock: Nitro Enclaves vsock console
      hw/nitro: Introduce Nitro Enclave Heartbeat device
      target/arm/cpu64: Allow -host for nitro
      hw/nitro: Add nitro machine
      hw/core/eif: Move definitions to header
      hw/nitro: Enable direct kernel boot
      docs: Add Nitro Enclaves documentation

Ani Sinha (34):
      i386/kvm: avoid installing duplicate msr entries in msr_handlers
      accel/kvm: add confidential class member to indicate guest rebuild capability
      hw/accel: add a per-accelerator callback to change VM accelerator handle
      system/physmem: add helper to reattach existing memory after KVM VM fd change
      accel/kvm: add changes required to support KVM VM file descriptor change
      accel/kvm: mark guest state as unprotected after vm file descriptor change
      accel/kvm: add a notifier to indicate KVM VM file descriptor has changed
      accel/kvm: notify when KVM VM file fd is about to be changed
      i386/kvm: unregister smram listeners prior to vm file descriptor change
      kvm/i386: implement architecture support for kvm file descriptor change
      i386/kvm: refactor xen init into a new function
      hw/i386: refactor x86_bios_rom_init for reuse in confidential guest reset
      hw/i386: export a new function x86_bios_rom_reload
      kvm/i386: reload firmware for confidential guest reset
      accel/kvm: rebind current VCPUs to the new KVM VM file descriptor upon reset
      i386/tdx: refactor TDX firmware memory initialization code into a new function
      i386/tdx: finalize TDX guest state upon reset
      i386/tdx: add a pre-vmfd change notifier to reset tdx state
      i386/sev: add migration blockers only once
      i386/sev: add notifiers only once
      i386/sev: free existing launch update data and kernel hashes data on init
      i386/sev: add support for confidential guest reset
      hw/vfio: generate new file fd for pseudo device and rebind existing descriptors
      kvm/i8254: refactor pit initialization into a helper
      kvm/i8254: add support for confidential guest reset
      kvm/hyperv: add synic feature to CPU only if its not enabled
      hw/hyperv/vmbus: add support for confidential guest reset
      kvm/xen-emu: re-initialize capabilities during confidential guest reset
      ppc/openpic: create a new openpic device and reattach mem region on coco reset
      kvm/vcpu: add notifiers to inform vcpu file descriptor change
      kvm/clock: add support for confidential guest reset
      hw/machine: introduce machine specific option 'x-change-vmfd-on-reset'
      tests/functional/x86_64: add functional test to exercise vm fd change on reset
      qom: add 'confidential-guest-reset' property for x86 confidential vms

Bernhard Beschow (3):
      target/i386/emulate/x86_decode: Fix compiler warning
      target/i386/hvf/x86_mmu: Fix compiler warning
      target/i386/emulate/x86_decode: Actually use stream in decode_instruction_stream()

John Snow (1):
      rust: use checked_div to make clippy happy

Jon Kohler (6):
      target/i386: Add VMX_SECONDARY_EXEC_MODE_BASED_EPT_EXEC
      target/i386: Add MSR_IA32_ARCH_CAPABILITIES ITS_NO
      target/i386: introduce SapphireRapids-v6 to expose ITS_NO
      target/i386: introduce GraniteRapids-v5 to expose ITS_NO
      target/i386: introduce SierraForest-v5 to expose ITS_NO
      target/i386: introduce ClearwaterForest-v3 to expose ITS_NO

Josh Poimboeuf (1):
      hw/i386/vmmouse: Fix hypercall clobbers

Marc-André Lureau (6):
      audio: fix nominal volume channel (cosmetic)
      python/wheel: remove meson-1.9.0
      scripts/vendor.py: add pycotap
      audio: require pulse >= 0.9.13
      audio: require spice >= 0.15
      ui: drop spice-protocol < 0.14.3 support

Maxim Levitsky (1):
      accel/kvm: Don't clear pending #SMI in kvm_get_vcpu_events

Mohamed Mediouni (36):
      target/i386/emulate: rework string_rep emulation
      target/i386: emulate, hvf: move x86_mmu to common code
      whpx: i386: re-enable guest debug support
      whpx: preparatory changes before switching over from winhvemulation
      whpx: refactor whpx_destroy_vcpu to arch-specific function
      whpx: move whpx_get_reg/whpx_set_reg to generic code
      whpx: i386: switch over from winhvemulation to target/i386/emulate
      whpx: i386: flags conversion for target/i386/emulate internal state
      whpx: i386: remove remaining winhvemulation support code
      whpx: i386: remove messages
      whpx: i386: remove CPUID trapping
      whpx: common, i386, arm: rework state levels
      whpx: i386: saving/restoring less state for WHPX_LEVEL_FAST_RUNTIME_STATE
      target/i386: mshv, emulate: move the generic x86 helpers to target/i386/emulate
      target/i386: emulate: 5-level paging for the page table walker
      target/i386: emulate, hvf, mshv: rework MMU code
      hvf: i386: save/restore CR0/2/3
      target/i386: emulate: get rid of write_val_to_mem() helper
      target/i386: emulate: raise an exception on translation fault
      target/i386: emulate: remove fetch_instruction helper too
      target/i386: emulate: propagate memory errors on most reads/writes
      whpx: i386: inject exceptions
      whpx: i386: bump to x2apic
      whpx: i386: ignore send_msi to interrupt vector 0
      target/i386: emulate: propagate errors all the way and stop early
      whpx: x86: remove inaccurate comment
      whpx: x86: kick out of HLT manually when using the kernel-irqchip
      hw: i386: vapic: enable on WHPX with user-mode irqchip
      whpx: i386: move whpx_vcpu_kick_out_of_hlt() invocation to interrupt raise time
      whpx: i386: enable all supported host features
      whpx: i386: enable synthetic processor features
      whpx: i386: warn on unsupported MSR access instead of failing silently
      target/i386: emulate: more 64-bit register handling
      whpx: i386: enable PMU
      whpx: i386: expose HV_X64_MSR_APIC_FREQUENCY when kernel-irqchip=off
      target/i386: emulate: fix scas

myrslint (1):
      KVM: i386: Default disable ignore guest PAT quirk

 MAINTAINERS                                       |  20 +
 docs/system/confidential-guest-support.rst        |   1 +
 docs/system/index.rst                             |   1 +
 docs/system/nitro.rst                             | 133 ++++
 meson.build                                       |  20 +-
 qapi/qom.json                                     |  16 +-
 accel/nitro/trace.h                               |   2 +
 hw/core/eif.h                                     |  41 ++
 hw/nitro/trace.h                                  |   4 +
 include/accel/accel-ops.h                         |   2 +
 include/hw/core/boards.h                          |   6 +
 include/hw/i386/x86.h                             |   1 +
 include/hw/nitro/heartbeat.h                      |  24 +
 include/hw/nitro/machine.h                        |  20 +
 include/hw/nitro/nitro-vsock-bus.h                |  71 +++
 include/hw/nitro/serial-vsock.h                   |  24 +
 include/standard-headers/linux/nitro_enclaves.h   | 359 +++++++++++
 include/system/confidential-guest-support.h       |  20 +
 include/system/hw_accel.h                         |   1 +
 include/system/kvm.h                              |  43 ++
 include/system/kvm_int.h                          |   1 +
 include/system/nitro-accel.h                      |  25 +
 include/system/physmem.h                          |   1 +
 include/system/whpx-accel-ops.h                   |  16 +-
 include/system/whpx-all.h                         |  11 +-
 include/system/whpx-common.h                      |   6 +-
 include/system/whpx-internal.h                    |  16 -
 target/i386/cpu.h                                 |   4 +-
 target/i386/emulate/x86.h                         |   1 +
 target/i386/emulate/x86_emu.h                     |  24 +-
 target/i386/emulate/x86_flags.h                   |  20 +
 target/i386/{hvf => emulate}/x86_mmu.h            |  31 +-
 target/i386/kvm/tdx.h                             |   1 +
 accel/kvm/kvm-all.c                               | 372 +++++++++--
 accel/nitro/nitro-accel.c                         | 284 +++++++++
 accel/stubs/kvm-stub.c                            |  18 +
 accel/stubs/nitro-stub.c                          |  11 +
 accel/whpx/whpx-accel-ops.c                       |   8 +
 accel/whpx/whpx-common.c                          |  68 +-
 audio/audio-mixeng-be.c                           |   2 +-
 audio/paaudio.c                                   |  28 +-
 audio/spiceaudio.c                                |  30 -
 hw/core/eif.c                                     |  38 --
 hw/core/machine.c                                 |  22 +
 hw/hyperv/vmbus.c                                 |  37 ++
 hw/i386/kvm/clock.c                               |  59 ++
 hw/i386/kvm/i8254.c                               |  91 ++-
 hw/i386/vapic.c                                   |  24 +-
 hw/i386/vmmouse.c                                 |  10 +-
 hw/i386/x86-common.c                              |  71 ++-
 hw/intc/openpic_kvm.c                             | 112 +++-
 hw/nitro/heartbeat.c                              | 115 ++++
 hw/nitro/machine.c                                | 277 ++++++++
 hw/nitro/nitro-vsock-bus.c                        |  98 +++
 hw/nitro/serial-vsock.c                           | 123 ++++
 hw/vfio/helpers.c                                 |  91 +++
 stubs/kvm.c                                       |  22 +
 system/physmem.c                                  |  28 +
 system/runstate.c                                 |  44 +-
 target/alpha/cpu.c                                |   1 +
 target/arm/cpu64.c                                |   8 +
 target/arm/whpx/whpx-all.c                        |  43 +-
 target/i386/cpu.c                                 |  41 +-
 target/i386/emulate/x86_decode.c                  |  12 +-
 target/i386/emulate/x86_emu.c                     | 391 ++++++++----
 target/i386/emulate/x86_flags.c                   |  47 ++
 target/i386/{mshv/x86.c => emulate/x86_helpers.c} |  13 +-
 target/i386/emulate/x86_mmu.c                     | 354 ++++++++++
 target/i386/hvf/hvf.c                             |  40 +-
 target/i386/hvf/x86.c                             |  13 +-
 target/i386/hvf/x86_mmu.c                         | 277 --------
 target/i386/hvf/x86_task.c                        |  10 +-
 target/i386/kvm/kvm.c                             | 190 ++++--
 target/i386/kvm/tdx.c                             | 141 ++--
 target/i386/kvm/xen-emu.c                         |  38 +-
 target/i386/mshv/mshv-cpu.c                       |  71 ---
 target/i386/sev.c                                 | 127 +++-
 target/i386/whpx/whpx-all.c                       | 745 +++++++++++-----------
 target/i386/whpx/whpx-apic.c                      |   5 +
 tests/qtest/libqtest.c                            |   1 +
 ui/vdagent.c                                      |  18 -
 util/rcu.c                                        |  81 ++-
 accel/Kconfig                                     |   3 +
 accel/kvm/trace-events                            |   2 +
 accel/meson.build                                 |   1 +
 accel/nitro/meson.build                           |   3 +
 accel/nitro/trace-events                          |   6 +
 accel/stubs/meson.build                           |   1 +
 hw/Kconfig                                        |   1 +
 hw/hyperv/trace-events                            |   1 +
 hw/i386/kvm/trace-events                          |   1 +
 hw/meson.build                                    |   1 +
 hw/nitro/Kconfig                                  |  18 +
 hw/nitro/meson.build                              |   4 +
 hw/nitro/trace-events                             |   8 +
 meson_options.txt                                 |   2 +
 python/scripts/vendor.py                          |   2 +
 python/wheels/meson-1.9.0-py3-none-any.whl        | Bin 1029634 -> 0 bytes
 qemu-options.hx                                   |   8 +-
 rust/Cargo.toml                                   |   1 +
 rust/hw/core/src/qdev.rs                          |  14 +-
 scripts/meson-buildoptions.sh                     |   3 +
 scripts/update-linux-headers.sh                   |   1 +
 stubs/meson.build                                 |   1 +
 target/i386/emulate/meson.build                   |   9 +
 target/i386/hvf/meson.build                       |   1 -
 target/i386/kvm/trace-events                      |   4 +
 target/i386/mshv/meson.build                      |   2 +-
 target/i386/trace-events                          |   1 +
 tests/functional/x86_64/meson.build               |   1 +
 tests/functional/x86_64/test_rebuild_vmfd.py      | 136 ++++
 111 files changed, 4590 insertions(+), 1362 deletions(-)
 create mode 100644 docs/system/nitro.rst
 create mode 100644 accel/nitro/trace.h
 create mode 100644 hw/nitro/trace.h
 create mode 100644 include/hw/nitro/heartbeat.h
 create mode 100644 include/hw/nitro/machine.h
 create mode 100644 include/hw/nitro/nitro-vsock-bus.h
 create mode 100644 include/hw/nitro/serial-vsock.h
 create mode 100644 include/standard-headers/linux/nitro_enclaves.h
 create mode 100644 include/system/nitro-accel.h
 rename target/i386/{hvf => emulate}/x86_mmu.h (51%)
 create mode 100644 accel/nitro/nitro-accel.c
 create mode 100644 accel/stubs/nitro-stub.c
 create mode 100644 hw/nitro/heartbeat.c
 create mode 100644 hw/nitro/machine.c
 create mode 100644 hw/nitro/nitro-vsock-bus.c
 create mode 100644 hw/nitro/serial-vsock.c
 create mode 100644 stubs/kvm.c
 rename target/i386/{mshv/x86.c => emulate/x86_helpers.c} (95%)
 create mode 100644 target/i386/emulate/x86_mmu.c
 delete mode 100644 target/i386/hvf/x86_mmu.c
 create mode 100644 accel/nitro/meson.build
 create mode 100644 accel/nitro/trace-events
 create mode 100644 hw/nitro/Kconfig
 create mode 100644 hw/nitro/meson.build
 create mode 100644 hw/nitro/trace-events
 delete mode 100644 python/wheels/meson-1.9.0-py3-none-any.whl
 create mode 100755 tests/functional/x86_64/test_rebuild_vmfd.py
-- 
2.53.0



^ permalink raw reply	[flat|nested] 105+ messages in thread

* [PULL 001/102] hw/i386/vmmouse: Fix hypercall clobbers
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
@ 2026-03-02  8:41 ` Paolo Bonzini
  2026-03-02  8:41 ` [PULL 002/102] target/i386/emulate/x86_decode: Fix compiler warning Paolo Bonzini
                   ` (100 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:41 UTC (permalink / raw)
  To: qemu-devel
  Cc: Josh Poimboeuf, Justin Forbes, Alexey Makhalov,
	Philippe Mathieu-Daudé, qemu-stable

From: Josh Poimboeuf <jpoimboe@kernel.org>

Fedora QA reported the following kernel panic:

  BUG: unable to handle page fault for address: 0000000040003e54
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD 1082ec067 P4D 0
  Oops: Oops: 0002 [#1] SMP NOPTI
  CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.19.0-0.rc4.260108gf0b9d8eb98df.34.fc43.x86_64 #1 PREEMPT(lazy)
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20251119-3.fc43 11/19/2025
  RIP: 0010:vmware_hypercall4.constprop.0+0x52/0x90
  Code: 48 83 c4 20 5b e9 69 f0 fc fe 8b 05 a0 c1 b2 01 85 c0 74 23 b8 68 58 4d 56 b9 27 00 00 00 31 d2 bb 04 00 00 00 66 ba 58 56 ed <89> 1f 89 0e 41 89 10 5b e9 3c f0 fc fe 6a 00 49 89 f9 45 31 c0 31
  RSP: 0018:ff5eeb3240003e40 EFLAGS: 00010046
  RAX: 0000000000000000 RBX: 000000000000ffca RCX: 000000000000ffac
  RDX: 0000000000000000 RSI: 0000000040003e58 RDI: 0000000040003e54
  RBP: ff1e05f3c1204800 R08: ff5eeb3240003e5c R09: 000000009d899c41
  R10: 000000000000003d R11: ff5eeb3240003ff8 R12: 0000000000000000
  R13: 00000000000000ff R14: ff1e05f3c02f9e00 R15: 000000000000000c
  FS:  0000000000000000(0000) GS:ff1e05f489e40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000040003e54 CR3: 000000010841d002 CR4: 0000000000771ef0
  PKRU: 55555554
  Call Trace:
   <IRQ>
   vmmouse_report_events+0x13e/0x1b0
   psmouse_handle_byte+0x15/0x60
   ps2_interrupt+0x8a/0xd0
   ...

It was triggered by dereferencing a bad pointer (RDI) immediately after
a VMware hypercall for VMWARE_CMD_ABSPOINTER_DATA in the vmmouse driver:

  ffffffff82135070 <vmware_hypercall4.constprop.0>:
  ...
  ffffffff821350ac:       b8 68 58 4d 56          mov    $0x564d5868,%eax
  ffffffff821350b1:       b9 27 00 00 00          mov    $0x27,%ecx
  ffffffff821350b6:       31 d2                   xor    %edx,%edx
  ffffffff821350b8:       bb 04 00 00 00          mov    $0x4,%ebx
  ffffffff821350bd:       66 ba 58 56             mov    $0x5658,%dx
  ffffffff821350c1:       ed                      in     (%dx),%eax	<-- hypercall
  ffffffff821350c2:       89 1f                   mov    %ebx,(%rdi)	<-- crash

Reading the kernel disassembly shows that RDI should contain the value
of a valid kernel stack address here (0xff5eeb3240003e54).  Instead it
contains 0x40003e54, suggesting the hypervisor cleared the upper 32
bits.

And indeed, Alexey discovered that QEMU's vmmouse_get_data() and
vmmouse_set_data() are only saving/restoring the lower 32 bits, while
clearing the upper 32.  Fix that by changing the type of the saved data
array from uint32_t to uint64_t.

Fixes: 548df2acc6fc ("VMMouse Emulation, by Anthony Liguori.")
Reported-by: Justin Forbes <jforbes@fedoraproject.org>
Debugged-by: Alexey Makhalov <alexey.makhalov@broadcom.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/c508fc1d4a4ccd8c9fb1e51b71df089e31115a53.1770309998.git.jpoimboe@kernel.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3293
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/i386/vmmouse.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/i386/vmmouse.c b/hw/i386/vmmouse.c
index 2ae7f3a242e..c1aeeca0c9a 100644
--- a/hw/i386/vmmouse.c
+++ b/hw/i386/vmmouse.c
@@ -72,7 +72,7 @@ struct VMMouseState {
     ISAKBDState *i8042;
 };
 
-static void vmmouse_get_data(uint32_t *data)
+static void vmmouse_get_data(uint64_t *data)
 {
     X86CPU *cpu = X86_CPU(current_cpu);
     CPUX86State *env = &cpu->env;
@@ -82,7 +82,7 @@ static void vmmouse_get_data(uint32_t *data)
     data[4] = env->regs[R_ESI]; data[5] = env->regs[R_EDI];
 }
 
-static void vmmouse_set_data(const uint32_t *data)
+static void vmmouse_set_data(const uint64_t *data)
 {
     X86CPU *cpu = X86_CPU(current_cpu);
     CPUX86State *env = &cpu->env;
@@ -197,7 +197,7 @@ static void vmmouse_disable(VMMouseState *s)
     vmmouse_remove_handler(s);
 }
 
-static void vmmouse_data(VMMouseState *s, uint32_t *data, uint32_t size)
+static void vmmouse_data(VMMouseState *s, uint64_t *data, uint32_t size)
 {
     int i;
 
@@ -221,7 +221,7 @@ static void vmmouse_data(VMMouseState *s, uint32_t *data, uint32_t size)
 static uint32_t vmmouse_ioport_read(void *opaque, uint32_t addr)
 {
     VMMouseState *s = opaque;
-    uint32_t data[6];
+    uint64_t data[6];
     uint16_t command;
 
     vmmouse_get_data(data);
@@ -247,7 +247,7 @@ static uint32_t vmmouse_ioport_read(void *opaque, uint32_t addr)
             vmmouse_request_absolute(s);
             break;
         default:
-            printf("vmmouse: unknown command %x\n", data[1]);
+            printf("vmmouse: unknown command %" PRIx64 "\n", data[1]);
             break;
         }
         break;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 002/102] target/i386/emulate/x86_decode: Fix compiler warning
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
  2026-03-02  8:41 ` [PULL 001/102] hw/i386/vmmouse: Fix hypercall clobbers Paolo Bonzini
@ 2026-03-02  8:41 ` Paolo Bonzini
  2026-03-02  8:41 ` [PULL 003/102] target/i386/hvf/x86_mmu: " Paolo Bonzini
                   ` (99 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: Bernhard Beschow, Mohamed Mediouni, Wei Liu (Microsoft)

From: Bernhard Beschow <shentey@gmail.com>

When compiling for i386-softmmu under MSYS2, GCC emits the following warning:

  In function 'get_reg_val',
      inlined from 'calc_modrm_operand64' at ../src/target/i386/emulate/x86_decode.c:1796:15:
  ../src/target/i386/emulate/x86_decode.c:1703:5: error: 'memcpy' forming offset [4, 7] is out of the bounds [0, 4] of object 'val' with type 'target_ulong' {aka 'unsigned int'} [-Werror=array-bounds=]
   1703 |     memcpy(&val,
        |     ^~~~~~~~~~~~
   1704 |            get_reg_ref(env, reg, rex_present, is_extended, size),
        |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   1705 |            size);
        |            ~~~~~
  ../src/target/i386/emulate/x86_decode.c: In function 'calc_modrm_operand64':
  ../src/target/i386/emulate/x86_decode.c:1702:18: note: 'val' declared here
   1702 |     target_ulong val = 0;
        |                  ^~~

In the calc_modrm_operand64() case the compiler sees size == 8 to be mem-copied
to a target_ulong variable which is only 4 bytes wide in case of i386-softmmu.
Note that when size != 1, get_reg_ref() always returns a pointer to an 8 byte
register, regardless of the target_ulong size. Fix the compiler warning by
always providing 8 bytes of storage by means of uint64_t.

Fixes: 77a2dba45cc9 ("target/i386/emulate: stop overloading decode->op[N].ptr")
cc: qemu-stable
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Wei Liu (Microsoft) <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20260223233950.96076-2-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/emulate/x86_decode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/emulate/x86_decode.c b/target/i386/emulate/x86_decode.c
index d037ed11420..6ad03b71b07 100644
--- a/target/i386/emulate/x86_decode.c
+++ b/target/i386/emulate/x86_decode.c
@@ -1699,7 +1699,7 @@ void *get_reg_ref(CPUX86State *env, int reg, int rex_present,
 target_ulong get_reg_val(CPUX86State *env, int reg, int rex_present,
                          int is_extended, int size)
 {
-    target_ulong val = 0;
+    uint64_t val = 0;
     memcpy(&val,
            get_reg_ref(env, reg, rex_present, is_extended, size),
            size);
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 003/102] target/i386/hvf/x86_mmu: Fix compiler warning
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
  2026-03-02  8:41 ` [PULL 001/102] hw/i386/vmmouse: Fix hypercall clobbers Paolo Bonzini
  2026-03-02  8:41 ` [PULL 002/102] target/i386/emulate/x86_decode: Fix compiler warning Paolo Bonzini
@ 2026-03-02  8:41 ` Paolo Bonzini
  2026-03-02  8:41 ` [PULL 004/102] target/i386/emulate/x86_decode: Actually use stream in decode_instruction_stream() Paolo Bonzini
                   ` (98 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:41 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bernhard Beschow, Mohamed Mediouni, Philippe Mathieu-Daudé,
	Wei Liu (Microsoft)

From: Bernhard Beschow <shentey@gmail.com>

When reusing the code in WHPX, GCC emits the following warning when compiling
for i386-softmmu under MSYS2:

  In file included from ../src/target/i386/emulate/x86_mmu.c:20:
  ../src/target/i386/emulate/x86_mmu.c: In function 'vmx_write_mem':
  ../src/target/i386/emulate/x86_mmu.c:251:25: error: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'target_ulong' {aka 'unsigned int'} [-Werror=format=]
    251 |             VM_PANIC_EX("%s: mmu_gva_to_gpa %llx failed\n", __func__, gva);
        |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~            ~~~
        |                                                                       |
        |                                                                       target_ulong {aka unsigned int}
  ../src/target/i386/emulate/panic.h:34:12: note: in definition of macro 'VM_PANIC_EX'
     34 |     printf(__VA_ARGS__); \
        |            ^~~~~~~~~~~
  ../src/target/i386/emulate/x86_mmu.c:251:48: note: format string is defined here
    251 |             VM_PANIC_EX("%s: mmu_gva_to_gpa %llx failed\n", __func__, gva);
        |                                             ~~~^
        |                                                |
        |                                                long long unsigned int
        |                                             %x

Fix the warning by reusing the target-specific macro TARGET_FMT_lx which exists
for this exact purpose.

Fixes: c97d6d2cdf97 ("i386: hvf: add code base from Google's QEMU repository")
cc: qemu-stable
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Wei Liu (Microsoft) <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20260223233950.96076-3-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/hvf/x86_mmu.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target/i386/hvf/x86_mmu.c b/target/i386/hvf/x86_mmu.c
index afc5c17d5d5..fe44d2edf4a 100644
--- a/target/i386/hvf/x86_mmu.c
+++ b/target/i386/hvf/x86_mmu.c
@@ -244,7 +244,8 @@ void vmx_write_mem(CPUState *cpu, target_ulong gva, void *data, int bytes)
         int copy = MIN(bytes, 0x1000 - (gva & 0xfff));
 
         if (!mmu_gva_to_gpa(cpu, gva, &gpa)) {
-            VM_PANIC_EX("%s: mmu_gva_to_gpa %llx failed\n", __func__, gva);
+            VM_PANIC_EX("%s: mmu_gva_to_gpa " TARGET_FMT_lx " failed\n",
+                        __func__, gva);
         } else {
             address_space_write(&address_space_memory, gpa,
                                 MEMTXATTRS_UNSPECIFIED, data, copy);
@@ -265,7 +266,8 @@ void vmx_read_mem(CPUState *cpu, void *data, target_ulong gva, int bytes)
         int copy = MIN(bytes, 0x1000 - (gva & 0xfff));
 
         if (!mmu_gva_to_gpa(cpu, gva, &gpa)) {
-            VM_PANIC_EX("%s: mmu_gva_to_gpa %llx failed\n", __func__, gva);
+            VM_PANIC_EX("%s: mmu_gva_to_gpa " TARGET_FMT_lx " failed\n",
+                        __func__, gva);
         }
         address_space_read(&address_space_memory, gpa, MEMTXATTRS_UNSPECIFIED,
                            data, copy);
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 004/102] target/i386/emulate/x86_decode: Actually use stream in decode_instruction_stream()
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (2 preceding siblings ...)
  2026-03-02  8:41 ` [PULL 003/102] target/i386/hvf/x86_mmu: " Paolo Bonzini
@ 2026-03-02  8:41 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 005/102] target/i386/emulate: rework string_rep emulation Paolo Bonzini
                   ` (97 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:41 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bernhard Beschow, Mohamed Mediouni, Wei Liu (Microsoft),
	Magnus Kulke

From: Bernhard Beschow <shentey@gmail.com>

Compared to decode_instruction(), decode_instruction_stream() has an additional
stream parameter which avoids some guest memory accesses during instruction
decoding. Both functions defer the actual work to decode_opcode() which would
set the stream pointer to zero such that decode_instruction_stream() essentially
behaved like decode_instruction(). Given that all callers of
decode_instruction_stream() properly zero-initialize the decode parameter, the
memset() call can be moved into decode_instruction() which is the only other
user of decode_opcode(). This preserves the non-zero stream pointer which
avoids extra guest memory accesses.

Fixes: 1e25327b244a ("target/i386/emulate: Allow instruction decoding from stream")
cc: qemu-stable
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Wei Liu (Microsoft) <wei.liu@kernel.org>
Tested-by: Magnus Kulke <magnuskulke@linux.microsoft.com>
Link: https://lore.kernel.org/r/20260223233950.96076-4-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/emulate/x86_decode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/i386/emulate/x86_decode.c b/target/i386/emulate/x86_decode.c
index 6ad03b71b07..7bbcd2a9a2a 100644
--- a/target/i386/emulate/x86_decode.c
+++ b/target/i386/emulate/x86_decode.c
@@ -2088,8 +2088,6 @@ static void decode_opcodes(CPUX86State *env, struct x86_decode *decode)
 
 static uint32_t decode_opcode(CPUX86State *env, struct x86_decode *decode)
 {
-    memset(decode, 0, sizeof(*decode));
-
     decode_prefix(env, decode);
     set_addressing_size(env, decode);
     set_operand_size(env, decode);
@@ -2101,6 +2099,8 @@ static uint32_t decode_opcode(CPUX86State *env, struct x86_decode *decode)
 
 uint32_t decode_instruction(CPUX86State *env, struct x86_decode *decode)
 {
+    memset(decode, 0, sizeof(*decode));
+
     return decode_opcode(env, decode);
 }
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 005/102] target/i386/emulate: rework string_rep emulation
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (3 preceding siblings ...)
  2026-03-02  8:41 ` [PULL 004/102] target/i386/emulate/x86_decode: Actually use stream in decode_instruction_stream() Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 006/102] target/i386: emulate, hvf: move x86_mmu to common code Paolo Bonzini
                   ` (96 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-5-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/emulate/x86_emu.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/target/i386/emulate/x86_emu.c b/target/i386/emulate/x86_emu.c
index 4409f7bc134..bf96fe06b45 100644
--- a/target/i386/emulate/x86_emu.c
+++ b/target/i386/emulate/x86_emu.c
@@ -466,18 +466,25 @@ static inline void string_increment_reg(CPUX86State *env, int reg,
     write_reg(env, reg, val, decode->addressing_size);
 }
 
+static inline int get_ZF(CPUX86State *env) {
+    return env->cc_dst ? 0 : CC_Z;
+}
+
 static inline void string_rep(CPUX86State *env, struct x86_decode *decode,
                               void (*func)(CPUX86State *env,
                                            struct x86_decode *ins), int rep)
 {
     target_ulong rcx = read_reg(env, R_ECX, decode->addressing_size);
-    while (rcx--) {
+
+    while (rcx != 0) {
+        bool is_cmps_or_scas = decode->cmd == X86_DECODE_CMD_CMPS || decode->cmd == X86_DECODE_CMD_SCAS;
         func(env, decode);
+        rcx--;
         write_reg(env, R_ECX, rcx, decode->addressing_size);
-        if ((PREFIX_REP == rep) && !env->cc_dst) {
+        if ((PREFIX_REP == rep) && !get_ZF(env) && is_cmps_or_scas) {
             break;
         }
-        if ((PREFIX_REPN == rep) && env->cc_dst) {
+        if ((PREFIX_REPN == rep) && get_ZF(env)&& is_cmps_or_scas) {
             break;
         }
     }
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 006/102] target/i386: emulate, hvf: move x86_mmu to common code
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (4 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 005/102] target/i386/emulate: rework string_rep emulation Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 007/102] whpx: i386: re-enable guest debug support Paolo Bonzini
                   ` (95 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni, Philippe Mathieu-Daudé

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20260223233950.96076-6-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/{hvf => emulate}/x86_mmu.h |  0
 target/i386/{hvf => emulate}/x86_mmu.c | 14 +++++++++-----
 target/i386/hvf/hvf.c                  | 10 +++++++++-
 target/i386/hvf/x86.c                  |  2 +-
 target/i386/hvf/x86_task.c             |  2 +-
 target/i386/emulate/meson.build        |  1 +
 target/i386/hvf/meson.build            |  1 -
 7 files changed, 21 insertions(+), 9 deletions(-)
 rename target/i386/{hvf => emulate}/x86_mmu.h (100%)
 rename target/i386/{hvf => emulate}/x86_mmu.c (95%)

diff --git a/target/i386/hvf/x86_mmu.h b/target/i386/emulate/x86_mmu.h
similarity index 100%
rename from target/i386/hvf/x86_mmu.h
rename to target/i386/emulate/x86_mmu.h
diff --git a/target/i386/hvf/x86_mmu.c b/target/i386/emulate/x86_mmu.c
similarity index 95%
rename from target/i386/hvf/x86_mmu.c
rename to target/i386/emulate/x86_mmu.c
index fe44d2edf4a..b82a55a3da7 100644
--- a/target/i386/hvf/x86_mmu.c
+++ b/target/i386/emulate/x86_mmu.c
@@ -19,10 +19,10 @@
 #include "qemu/osdep.h"
 #include "panic.h"
 #include "cpu.h"
+#include "system/address-spaces.h"
+#include "system/memory.h"
 #include "emulate/x86.h"
-#include "x86_mmu.h"
-#include "vmcs.h"
-#include "vmx.h"
+#include "emulate/x86_mmu.h"
 
 #define pte_present(pte) (pte & PT_PRESENT)
 #define pte_write_access(pte) (pte & PT_WRITE)
@@ -99,6 +99,8 @@ static bool get_pt_entry(CPUState *cpu, struct gpt_translation *pt,
 static bool test_pt_entry(CPUState *cpu, struct gpt_translation *pt,
                           int level, int *largeness, bool pae)
 {
+    X86CPU *x86_cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86_cpu->env;
     uint64_t pte = pt->pte[level];
 
     if (pt->write_access) {
@@ -127,7 +129,7 @@ static bool test_pt_entry(CPUState *cpu, struct gpt_translation *pt,
         pt->err_code |= MMU_PAGE_PT;
     }
 
-    uint32_t cr0 = rvmcs(cpu->accel->fd, VMCS_GUEST_CR0);
+    uint32_t cr0 = env->cr[0];
     /* check protection */
     if (cr0 & CR0_WP_MASK) {
         if (pt->write_access && !pte_write_access(pte)) {
@@ -179,9 +181,11 @@ static inline uint64_t large_page_gpa(struct gpt_translation *pt, bool pae,
 static bool walk_gpt(CPUState *cpu, target_ulong addr, int err_code,
                      struct gpt_translation *pt, bool pae)
 {
+    X86CPU *x86_cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86_cpu->env;
     int top_level, level;
     int largeness = 0;
-    target_ulong cr3 = rvmcs(cpu->accel->fd, VMCS_GUEST_CR3);
+    target_ulong cr3 = env->cr[3];
     uint64_t page_mask = pae ? PAE_PTE_PAGE_MASK : LEGACY_PTE_PAGE_MASK;
     
     memset(pt, 0, sizeof(*pt));
diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index ce54020f003..0b3674ad33d 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -62,7 +62,7 @@
 #include "emulate/x86.h"
 #include "x86_descr.h"
 #include "emulate/x86_flags.h"
-#include "x86_mmu.h"
+#include "emulate/x86_mmu.h"
 #include "emulate/x86_decode.h"
 #include "emulate/x86_emu.h"
 #include "x86_task.h"
@@ -254,11 +254,19 @@ static void hvf_read_segment_descriptor(CPUState *s, struct x86_segment_descript
 
 static void hvf_read_mem(CPUState *cpu, void *data, target_ulong gva, int bytes)
 {
+    X86CPU *x86_cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86_cpu->env;
+    env->cr[0] = rvmcs(cpu->accel->fd, VMCS_GUEST_CR0);
+    env->cr[3] = rvmcs(cpu->accel->fd, VMCS_GUEST_CR3);
     vmx_read_mem(cpu, data, gva, bytes);
 }
 
 static void hvf_write_mem(CPUState *cpu, void *data, target_ulong gva, int bytes)
 {
+    X86CPU *x86_cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86_cpu->env;
+    env->cr[0] = rvmcs(cpu->accel->fd, VMCS_GUEST_CR0);
+    env->cr[3] = rvmcs(cpu->accel->fd, VMCS_GUEST_CR3);
     vmx_write_mem(cpu, gva, data, bytes);
 }
 
diff --git a/target/i386/hvf/x86.c b/target/i386/hvf/x86.c
index 5c75ec9a007..2fa210ff601 100644
--- a/target/i386/hvf/x86.c
+++ b/target/i386/hvf/x86.c
@@ -23,7 +23,7 @@
 #include "emulate/x86_emu.h"
 #include "vmcs.h"
 #include "vmx.h"
-#include "x86_mmu.h"
+#include "emulate/x86_mmu.h"
 #include "x86_descr.h"
 
 /* static uint32_t x86_segment_access_rights(struct x86_segment_descriptor *var)
diff --git a/target/i386/hvf/x86_task.c b/target/i386/hvf/x86_task.c
index bdf8b51ae67..b1e541a6420 100644
--- a/target/i386/hvf/x86_task.c
+++ b/target/i386/hvf/x86_task.c
@@ -16,7 +16,7 @@
 #include "vmx.h"
 #include "emulate/x86.h"
 #include "x86_descr.h"
-#include "x86_mmu.h"
+#include "emulate/x86_mmu.h"
 #include "emulate/x86_decode.h"
 #include "emulate/x86_emu.h"
 #include "x86_task.h"
diff --git a/target/i386/emulate/meson.build b/target/i386/emulate/meson.build
index b6dafb6a5be..dd047c424a1 100644
--- a/target/i386/emulate/meson.build
+++ b/target/i386/emulate/meson.build
@@ -2,6 +2,7 @@ emulator_files = files(
   'x86_decode.c',
   'x86_emu.c',
   'x86_flags.c',
+  'x86_mmu.c'
 )
 
 i386_system_ss.add(when: [hvf, 'CONFIG_HVF'], if_true: emulator_files)
diff --git a/target/i386/hvf/meson.build b/target/i386/hvf/meson.build
index 519d190f0e6..22bf886978f 100644
--- a/target/i386/hvf/meson.build
+++ b/target/i386/hvf/meson.build
@@ -3,7 +3,6 @@ i386_system_ss.add(when: [hvf, 'CONFIG_HVF'], if_true: files(
   'x86.c',
   'x86_cpuid.c',
   'x86_descr.c',
-  'x86_mmu.c',
   'x86_task.c',
   'x86hvf.c',
   'hvf-cpu.c',
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 007/102] whpx: i386: re-enable guest debug support
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (5 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 006/102] target/i386: emulate, hvf: move x86_mmu to common code Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 008/102] whpx: preparatory changes before switching over from winhvemulation Paolo Bonzini
                   ` (94 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni, Philippe Mathieu-Daudé

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Fix what got broken several years ago by adding ops->supports_guest_debug
support as an architecture-specific function.

arm64 WHP doesn't currently provide support needed for this.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20260223233950.96076-7-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/system/whpx-all.h   | 4 ++++
 accel/whpx/whpx-accel-ops.c | 8 ++++++++
 target/arm/whpx/whpx-all.c  | 5 +++++
 target/i386/whpx/whpx-all.c | 5 +++++
 4 files changed, 22 insertions(+)

diff --git a/include/system/whpx-all.h b/include/system/whpx-all.h
index f13cdf7f660..3db074c38c5 100644
--- a/include/system/whpx-all.h
+++ b/include/system/whpx-all.h
@@ -17,4 +17,8 @@ void whpx_translate_cpu_breakpoints(
     struct whpx_breakpoints *breakpoints,
     CPUState *cpu,
     int cpu_breakpoint_count);
+
+/* called by whpx-accel-ops */
+bool whpx_arch_supports_guest_debug(void);
+
 #endif
diff --git a/accel/whpx/whpx-accel-ops.c b/accel/whpx/whpx-accel-ops.c
index 50fadea0fd6..b8f41544cbe 100644
--- a/accel/whpx/whpx-accel-ops.c
+++ b/accel/whpx/whpx-accel-ops.c
@@ -17,6 +17,7 @@
 
 #include "system/whpx.h"
 #include "system/whpx-internal.h"
+#include "system/whpx-all.h"
 #include "system/whpx-accel-ops.h"
 
 static void *whpx_cpu_thread_fn(void *arg)
@@ -81,6 +82,12 @@ static bool whpx_vcpu_thread_is_idle(CPUState *cpu)
     return !whpx_irqchip_in_kernel();
 }
 
+static bool whpx_supports_guest_debug(void)
+{
+    return whpx_arch_supports_guest_debug();
+}
+
+
 static void whpx_accel_ops_class_init(ObjectClass *oc, const void *data)
 {
     AccelOpsClass *ops = ACCEL_OPS_CLASS(oc);
@@ -89,6 +96,7 @@ static void whpx_accel_ops_class_init(ObjectClass *oc, const void *data)
     ops->kick_vcpu_thread = whpx_kick_vcpu_thread;
     ops->cpu_thread_is_idle = whpx_vcpu_thread_is_idle;
     ops->handle_interrupt = generic_handle_interrupt;
+    ops->supports_guest_debug = whpx_supports_guest_debug;
 
     ops->synchronize_post_reset = whpx_cpu_synchronize_post_reset;
     ops->synchronize_post_init = whpx_cpu_synchronize_post_init;
diff --git a/target/arm/whpx/whpx-all.c b/target/arm/whpx/whpx-all.c
index 40ada2d5b65..e01e6499ba4 100644
--- a/target/arm/whpx/whpx-all.c
+++ b/target/arm/whpx/whpx-all.c
@@ -303,6 +303,11 @@ void whpx_translate_cpu_breakpoints(
     /* Breakpoints aren’t supported on this platform */
 }
 
+bool whpx_arch_supports_guest_debug(void) 
+{
+    return false;
+}
+
 static void whpx_get_reg(CPUState *cpu, WHV_REGISTER_NAME reg, WHV_REGISTER_VALUE* val)
 {
     struct whpx_state *whpx = &whpx_global;
diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index 8210250dc3b..e1f0fa5e770 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -1272,6 +1272,11 @@ void whpx_apply_breakpoints(
     }
 }
 
+bool whpx_arch_supports_guest_debug(void) 
+{
+    return true;
+}
+
 /* Returns the address of the next instruction that is about to be executed. */
 static vaddr whpx_vcpu_get_pc(CPUState *cpu, bool exit_context_valid)
 {
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 008/102] whpx: preparatory changes before switching over from winhvemulation
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (6 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 007/102] whpx: i386: re-enable guest debug support Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 009/102] whpx: refactor whpx_destroy_vcpu to arch-specific function Paolo Bonzini
                   ` (93 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-8-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.h               | 2 +-
 target/i386/emulate/meson.build | 1 +
 target/i386/mshv/meson.build    | 4 ++++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 9f222a0c9fe..065613722f1 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2286,7 +2286,7 @@ typedef struct CPUArchState {
     QEMUTimer *xen_periodic_timer;
     QemuMutex xen_timers_lock;
 #endif
-#if defined(CONFIG_HVF) || defined(CONFIG_MSHV)
+#if defined(CONFIG_HVF) || defined(CONFIG_MSHV) || defined(CONFIG_WHPX)
     void *emu_mmio_buf;
 #endif
 
diff --git a/target/i386/emulate/meson.build b/target/i386/emulate/meson.build
index dd047c424a1..1bb35162498 100644
--- a/target/i386/emulate/meson.build
+++ b/target/i386/emulate/meson.build
@@ -7,3 +7,4 @@ emulator_files = files(
 
 i386_system_ss.add(when: [hvf, 'CONFIG_HVF'], if_true: emulator_files)
 i386_system_ss.add(when: 'CONFIG_MSHV', if_true: emulator_files)
+i386_system_ss.add(when: 'CONFIG_WHPX', if_true: emulator_files)
diff --git a/target/i386/mshv/meson.build b/target/i386/mshv/meson.build
index 647e5dafb77..3fadd4598a5 100644
--- a/target/i386/mshv/meson.build
+++ b/target/i386/mshv/meson.build
@@ -6,3 +6,7 @@ i386_mshv_ss.add(files(
 ))
 
 i386_system_ss.add_all(when: 'CONFIG_MSHV', if_true: i386_mshv_ss)
+
+i386_system_ss.add(when: 'CONFIG_WHPX', if_true: files(
+  'x86.c',
+))
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 009/102] whpx: refactor whpx_destroy_vcpu to arch-specific function
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (7 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 008/102] whpx: preparatory changes before switching over from winhvemulation Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 010/102] whpx: move whpx_get_reg/whpx_set_reg to generic code Paolo Bonzini
                   ` (92 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni, Philippe Mathieu-Daudé

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Avoid a TARGET_X86_64 define by moving platform-specific code
away from generic WHPX support.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20260223233950.96076-9-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/system/whpx-all.h   | 1 +
 accel/whpx/whpx-common.c    | 5 +----
 target/arm/whpx/whpx-all.c  | 5 +++++
 target/i386/whpx/whpx-all.c | 6 ++++++
 4 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/include/system/whpx-all.h b/include/system/whpx-all.h
index 3db074c38c5..b831c463b0b 100644
--- a/include/system/whpx-all.h
+++ b/include/system/whpx-all.h
@@ -17,6 +17,7 @@ void whpx_translate_cpu_breakpoints(
     struct whpx_breakpoints *breakpoints,
     CPUState *cpu,
     int cpu_breakpoint_count);
+void whpx_arch_destroy_vcpu(CPUState *cpu);
 
 /* called by whpx-accel-ops */
 bool whpx_arch_supports_guest_debug(void);
diff --git a/accel/whpx/whpx-common.c b/accel/whpx/whpx-common.c
index f018a8f5c7d..c57a0d3f0f9 100644
--- a/accel/whpx/whpx-common.c
+++ b/accel/whpx/whpx-common.c
@@ -236,10 +236,7 @@ void whpx_destroy_vcpu(CPUState *cpu)
     struct whpx_state *whpx = &whpx_global;
 
     whp_dispatch.WHvDeleteVirtualProcessor(whpx->partition, cpu->cpu_index);
-#ifdef HOST_X86_64
-    AccelCPUState *vcpu = cpu->accel;
-    whp_dispatch.WHvEmulatorDestroyEmulator(vcpu->emulator);
-#endif
+    whpx_arch_destroy_vcpu(cpu);
     g_free(cpu->accel);
 }
 
diff --git a/target/arm/whpx/whpx-all.c b/target/arm/whpx/whpx-all.c
index e01e6499ba4..0a31c7b9464 100644
--- a/target/arm/whpx/whpx-all.c
+++ b/target/arm/whpx/whpx-all.c
@@ -308,6 +308,11 @@ bool whpx_arch_supports_guest_debug(void)
     return false;
 }
 
+void whpx_arch_destroy_vcpu(CPUState *cpu)
+{
+    /* currently empty on Arm */
+}
+
 static void whpx_get_reg(CPUState *cpu, WHV_REGISTER_NAME reg, WHV_REGISTER_VALUE* val)
 {
     struct whpx_state *whpx = &whpx_global;
diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index e1f0fa5e770..cdcaebbe167 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -1277,6 +1277,12 @@ bool whpx_arch_supports_guest_debug(void)
     return true;
 }
 
+void whpx_arch_destroy_vcpu(CPUState *cpu)
+{
+    AccelCPUState *vcpu = cpu->accel;
+    whp_dispatch.WHvEmulatorDestroyEmulator(vcpu->emulator);
+}
+
 /* Returns the address of the next instruction that is about to be executed. */
 static vaddr whpx_vcpu_get_pc(CPUState *cpu, bool exit_context_valid)
 {
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 010/102] whpx: move whpx_get_reg/whpx_set_reg to generic code
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (8 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 009/102] whpx: refactor whpx_destroy_vcpu to arch-specific function Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 011/102] whpx: i386: switch over from winhvemulation to target/i386/emulate Paolo Bonzini
                   ` (91 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni, Bernhard Beschow

From: Mohamed Mediouni <mohamed@unpredictable.fr>

These will be used in the next commit on the x86_64 backend too.
Also move flush_cpu_state as it's used by get_reg/set_reg and the arm64 code.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Link: https://lore.kernel.org/r/20260223233950.96076-10-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/system/whpx-common.h |  3 +++
 accel/whpx/whpx-common.c     | 35 ++++++++++++++++++++++++++++++++++
 target/arm/whpx/whpx-all.c   | 37 +-----------------------------------
 3 files changed, 39 insertions(+), 36 deletions(-)

diff --git a/include/system/whpx-common.h b/include/system/whpx-common.h
index b86fe9db6eb..a4e16e13099 100644
--- a/include/system/whpx-common.h
+++ b/include/system/whpx-common.h
@@ -20,6 +20,9 @@ int whpx_first_vcpu_starting(CPUState *cpu);
 int whpx_last_vcpu_stopping(CPUState *cpu);
 void whpx_memory_init(void);
 struct whpx_breakpoint *whpx_lookup_breakpoint_by_addr(uint64_t address);
+void whpx_flush_cpu_state(CPUState *cpu);
+void whpx_get_reg(CPUState *cpu, WHV_REGISTER_NAME reg, WHV_REGISTER_VALUE* val);
+void whpx_set_reg(CPUState *cpu, WHV_REGISTER_NAME reg, WHV_REGISTER_VALUE val);
 
 /* On x64: same as WHvX64ExceptionTypeDebugTrapOrFault */
 #define WHPX_INTERCEPT_DEBUG_TRAPS 1
diff --git a/accel/whpx/whpx-common.c b/accel/whpx/whpx-common.c
index c57a0d3f0f9..21e9f1a1781 100644
--- a/accel/whpx/whpx-common.c
+++ b/accel/whpx/whpx-common.c
@@ -46,6 +46,41 @@ static HMODULE hWinHvEmulation;
 struct whpx_state whpx_global;
 struct WHPDispatch whp_dispatch;
 
+void whpx_flush_cpu_state(CPUState *cpu)
+{
+    if (cpu->vcpu_dirty) {
+        whpx_set_registers(cpu, WHPX_SET_RUNTIME_STATE);
+        cpu->vcpu_dirty = false;
+    }
+}
+
+void whpx_get_reg(CPUState *cpu, WHV_REGISTER_NAME reg, WHV_REGISTER_VALUE* val)
+{
+    struct whpx_state *whpx = &whpx_global;
+    HRESULT hr;
+
+    whpx_flush_cpu_state(cpu);
+
+    hr = whp_dispatch.WHvGetVirtualProcessorRegisters(whpx->partition, cpu->cpu_index,
+         &reg, 1, val);
+
+    if (FAILED(hr)) {
+        error_report("WHPX: Failed to get register %08x, hr=%08lx", reg, hr);
+    }
+}
+
+void whpx_set_reg(CPUState *cpu, WHV_REGISTER_NAME reg, WHV_REGISTER_VALUE val)
+{
+    struct whpx_state *whpx = &whpx_global;
+    HRESULT hr;
+    hr = whp_dispatch.WHvSetVirtualProcessorRegisters(whpx->partition, cpu->cpu_index,
+         &reg, 1, &val);
+
+    if (FAILED(hr)) {
+        error_report("WHPX: Failed to set register %08x, hr=%08lx", reg, hr);
+    }
+}
+
 /* Tries to find a breakpoint at the specified address. */
 struct whpx_breakpoint *whpx_lookup_breakpoint_by_addr(uint64_t address)
 {
diff --git a/target/arm/whpx/whpx-all.c b/target/arm/whpx/whpx-all.c
index 0a31c7b9464..0d56e468bdf 100644
--- a/target/arm/whpx/whpx-all.c
+++ b/target/arm/whpx/whpx-all.c
@@ -273,14 +273,6 @@ static struct whpx_sreg_match whpx_sreg_match[] = {
     { WHvArm64RegisterSpEl1, ENCODE_AA64_CP_REG(4, 1, 3, 4, 0) },
 };
 
-static void flush_cpu_state(CPUState *cpu)
-{
-    if (cpu->vcpu_dirty) {
-        whpx_set_registers(cpu, WHPX_SET_RUNTIME_STATE);
-        cpu->vcpu_dirty = false;
-    }
-}
-
 HRESULT whpx_set_exception_exit_bitmap(UINT64 exceptions)
 {
     if (exceptions != 0) {
@@ -313,33 +305,6 @@ void whpx_arch_destroy_vcpu(CPUState *cpu)
     /* currently empty on Arm */
 }
 
-static void whpx_get_reg(CPUState *cpu, WHV_REGISTER_NAME reg, WHV_REGISTER_VALUE* val)
-{
-    struct whpx_state *whpx = &whpx_global;
-    HRESULT hr;
-
-    flush_cpu_state(cpu);
-
-    hr = whp_dispatch.WHvGetVirtualProcessorRegisters(whpx->partition, cpu->cpu_index,
-         &reg, 1, val);
-
-    if (FAILED(hr)) {
-        error_report("WHPX: Failed to get register %08x, hr=%08lx", reg, hr);
-    }
-}
-
-static void whpx_set_reg(CPUState *cpu, WHV_REGISTER_NAME reg, WHV_REGISTER_VALUE val)
-{
-    struct whpx_state *whpx = &whpx_global;
-    HRESULT hr;
-    hr = whp_dispatch.WHvSetVirtualProcessorRegisters(whpx->partition, cpu->cpu_index,
-         &reg, 1, &val);
-
-    if (FAILED(hr)) {
-        error_report("WHPX: Failed to set register %08x, hr=%08lx", reg, hr);
-    }
-}
-
 static void whpx_get_global_reg(WHV_REGISTER_NAME reg, WHV_REGISTER_VALUE *val)
 {
     struct whpx_state *whpx = &whpx_global;
@@ -526,7 +491,7 @@ int whpx_vcpu_run(CPUState *cpu)
         if (advance_pc) {
             WHV_REGISTER_VALUE pc;
 
-            flush_cpu_state(cpu);
+            whpx_flush_cpu_state(cpu);
             pc.Reg64 = vcpu->exit_ctx.MemoryAccess.Header.Pc + 4;
             whpx_set_reg(cpu, WHvArm64RegisterPc, pc);
         }
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 011/102] whpx: i386: switch over from winhvemulation to target/i386/emulate
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (9 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 010/102] whpx: move whpx_get_reg/whpx_set_reg to generic code Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 012/102] whpx: i386: flags conversion for target/i386/emulate internal state Paolo Bonzini
                   ` (90 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Using the mshv backend as a base, move away from winhvemulation
to using common QEMU code used by the HVF and mshv backends.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-11-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/whpx/whpx-all.c | 274 +++++++++++++++++-------------------
 1 file changed, 126 insertions(+), 148 deletions(-)

diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index cdcaebbe167..eb6076d2f49 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -15,6 +15,7 @@
 #include "gdbstub/helpers.h"
 #include "qemu/accel.h"
 #include "accel/accel-ops.h"
+#include "system/memory.h"
 #include "system/whpx.h"
 #include "system/cpus.h"
 #include "system/runstate.h"
@@ -36,8 +37,12 @@
 #include "system/whpx-all.h"
 #include "system/whpx-common.h"
 
+#include "emulate/x86_decode.h"
+#include "emulate/x86_emu.h"
+#include "emulate/x86_flags.h"
+#include "emulate/x86_mmu.h"
+
 #include <winhvplatform.h>
-#include <winhvemulation.h>
 
 #define HYPERV_APIC_BUS_FREQUENCY      (200000000ULL)
 
@@ -756,160 +761,140 @@ void whpx_get_registers(CPUState *cpu)
     x86_update_hflags(env);
 }
 
-static HRESULT CALLBACK whpx_emu_ioport_callback(
-    void *ctx,
-    WHV_EMULATOR_IO_ACCESS_INFO *IoAccess)
+static int emulate_instruction(CPUState *cpu, const uint8_t *insn_bytes, size_t insn_len)
 {
-    MemTxAttrs attrs = { 0 };
-    address_space_rw(&address_space_io, IoAccess->Port, attrs,
-                     &IoAccess->Data, IoAccess->AccessSize,
-                     IoAccess->Direction);
-    return S_OK;
+    X86CPU *x86_cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86_cpu->env;
+    struct x86_decode decode = { 0 };
+    x86_insn_stream stream = { .bytes = insn_bytes, .len = insn_len };
+
+    whpx_get_registers(cpu);
+    decode_instruction_stream(env, &decode, &stream);
+    exec_instruction(env, &decode);
+    whpx_set_registers(cpu, WHPX_SET_RUNTIME_STATE);
+
+    return 0;
 }
 
-static HRESULT CALLBACK whpx_emu_mmio_callback(
-    void *ctx,
-    WHV_EMULATOR_MEMORY_ACCESS_INFO *ma)
+static int whpx_handle_mmio(CPUState *cpu, WHV_RUN_VP_EXIT_CONTEXT *exit_ctx)
 {
-    CPUState *cs = (CPUState *)ctx;
-    AddressSpace *as = cpu_addressspace(cs, MEMTXATTRS_UNSPECIFIED);
+    WHV_MEMORY_ACCESS_CONTEXT *ctx = &exit_ctx->MemoryAccess;
+    int ret;
 
-    address_space_rw(as, ma->GpaAddress, MEMTXATTRS_UNSPECIFIED,
-                     ma->Data, ma->AccessSize, ma->Direction);
-    return S_OK;
-}
-
-static HRESULT CALLBACK whpx_emu_getreg_callback(
-    void *ctx,
-    const WHV_REGISTER_NAME *RegisterNames,
-    UINT32 RegisterCount,
-    WHV_REGISTER_VALUE *RegisterValues)
-{
-    HRESULT hr;
-    struct whpx_state *whpx = &whpx_global;
-    CPUState *cpu = (CPUState *)ctx;
-
-    hr = whp_dispatch.WHvGetVirtualProcessorRegisters(
-        whpx->partition, cpu->cpu_index,
-        RegisterNames, RegisterCount,
-        RegisterValues);
-    if (FAILED(hr)) {
-        error_report("WHPX: Failed to get virtual processor registers,"
-                     " hr=%08lx", hr);
-    }
-
-    return hr;
-}
-
-static HRESULT CALLBACK whpx_emu_setreg_callback(
-    void *ctx,
-    const WHV_REGISTER_NAME *RegisterNames,
-    UINT32 RegisterCount,
-    const WHV_REGISTER_VALUE *RegisterValues)
-{
-    HRESULT hr;
-    struct whpx_state *whpx = &whpx_global;
-    CPUState *cpu = (CPUState *)ctx;
-
-    hr = whp_dispatch.WHvSetVirtualProcessorRegisters(
-        whpx->partition, cpu->cpu_index,
-        RegisterNames, RegisterCount,
-        RegisterValues);
-    if (FAILED(hr)) {
-        error_report("WHPX: Failed to set virtual processor registers,"
-                     " hr=%08lx", hr);
-    }
-
-    /*
-     * The emulator just successfully wrote the register state. We clear the
-     * dirty state so we avoid the double write on resume of the VP.
-     */
-    cpu->vcpu_dirty = false;
-
-    return hr;
-}
-
-static HRESULT CALLBACK whpx_emu_translate_callback(
-    void *ctx,
-    WHV_GUEST_VIRTUAL_ADDRESS Gva,
-    WHV_TRANSLATE_GVA_FLAGS TranslateFlags,
-    WHV_TRANSLATE_GVA_RESULT_CODE *TranslationResult,
-    WHV_GUEST_PHYSICAL_ADDRESS *Gpa)
-{
-    HRESULT hr;
-    struct whpx_state *whpx = &whpx_global;
-    CPUState *cpu = (CPUState *)ctx;
-    WHV_TRANSLATE_GVA_RESULT res;
-
-    hr = whp_dispatch.WHvTranslateGva(whpx->partition, cpu->cpu_index,
-                                      Gva, TranslateFlags, &res, Gpa);
-    if (FAILED(hr)) {
-        error_report("WHPX: Failed to translate GVA, hr=%08lx", hr);
-    } else {
-        *TranslationResult = res.ResultCode;
-    }
-
-    return hr;
-}
-
-static const WHV_EMULATOR_CALLBACKS whpx_emu_callbacks = {
-    .Size = sizeof(WHV_EMULATOR_CALLBACKS),
-    .WHvEmulatorIoPortCallback = whpx_emu_ioport_callback,
-    .WHvEmulatorMemoryCallback = whpx_emu_mmio_callback,
-    .WHvEmulatorGetVirtualProcessorRegisters = whpx_emu_getreg_callback,
-    .WHvEmulatorSetVirtualProcessorRegisters = whpx_emu_setreg_callback,
-    .WHvEmulatorTranslateGvaPage = whpx_emu_translate_callback,
-};
-
-static int whpx_handle_mmio(CPUState *cpu, WHV_MEMORY_ACCESS_CONTEXT *ctx)
-{
-    HRESULT hr;
-    AccelCPUState *vcpu = cpu->accel;
-    WHV_EMULATOR_STATUS emu_status;
-
-    hr = whp_dispatch.WHvEmulatorTryMmioEmulation(
-        vcpu->emulator, cpu,
-        &vcpu->exit_ctx.VpContext, ctx,
-        &emu_status);
-    if (FAILED(hr)) {
-        error_report("WHPX: Failed to parse MMIO access, hr=%08lx", hr);
-        return -1;
-    }
-
-    if (!emu_status.EmulationSuccessful) {
-        error_report("WHPX: Failed to emulate MMIO access with"
-                     " EmulatorReturnStatus: %u", emu_status.AsUINT32);
+    ret = emulate_instruction(cpu, ctx->InstructionBytes, ctx->InstructionByteCount);
+    if (ret < 0) {
+        error_report("failed to emulate mmio");
         return -1;
     }
 
     return 0;
 }
 
+static void handle_io(CPUState *env, uint16_t port, void *buffer,
+                  int direction, int size, int count)
+{
+    int i;
+    uint8_t *ptr = buffer;
+
+    for (i = 0; i < count; i++) {
+        address_space_rw(&address_space_io, port, MEMTXATTRS_UNSPECIFIED,
+                         ptr, size,
+                         direction);
+        ptr += size;
+    }
+}
+
+static void whpx_bump_rip(CPUState *cpu, WHV_RUN_VP_EXIT_CONTEXT *exit_ctx)
+{
+    WHV_REGISTER_VALUE reg;
+    whpx_get_reg(cpu, WHvX64RegisterRip, &reg);
+    reg.Reg64 = exit_ctx->VpContext.Rip + exit_ctx->VpContext.InstructionLength;
+    whpx_set_reg(cpu, WHvX64RegisterRip, reg);
+}
+
 static int whpx_handle_portio(CPUState *cpu,
-                              WHV_X64_IO_PORT_ACCESS_CONTEXT *ctx)
+                              WHV_RUN_VP_EXIT_CONTEXT *exit_ctx)
 {
-    HRESULT hr;
-    AccelCPUState *vcpu = cpu->accel;
-    WHV_EMULATOR_STATUS emu_status;
+    WHV_X64_IO_PORT_ACCESS_CONTEXT *ctx = &exit_ctx->IoPortAccess;
+    X86CPU *x86_cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86_cpu->env;
+    int ret;
 
-    hr = whp_dispatch.WHvEmulatorTryIoEmulation(
-        vcpu->emulator, cpu,
-        &vcpu->exit_ctx.VpContext, ctx,
-        &emu_status);
-    if (FAILED(hr)) {
-        error_report("WHPX: Failed to parse PortIO access, hr=%08lx", hr);
-        return -1;
+    if (!ctx->AccessInfo.StringOp && !ctx->AccessInfo.IsWrite) {
+        uint64_t val = 0;
+        WHV_REGISTER_VALUE reg;
+
+        whpx_get_reg(cpu, WHvX64RegisterRax, &reg);
+        handle_io(cpu, ctx->PortNumber, &val, 0, ctx->AccessInfo.AccessSize, 1);
+        if (ctx->AccessInfo.AccessSize == 1) {
+            reg.Reg8 = val;
+        } else if (ctx->AccessInfo.AccessSize == 2) {
+            reg.Reg16 = val;
+        } else if (ctx->AccessInfo.AccessSize == 4) {
+            reg.Reg64 = (uint32_t)val;
+        } else {
+            reg.Reg64 = (uint64_t)val;
+        }
+        whpx_bump_rip(cpu, exit_ctx);
+        whpx_set_reg(cpu, WHvX64RegisterRax, reg);
+        return 0;
+    } else if (!ctx->AccessInfo.StringOp && ctx->AccessInfo.IsWrite) {
+        RAX(env) = ctx->Rax;
+        handle_io(cpu, ctx->PortNumber, &RAX(env), 1, ctx->AccessInfo.AccessSize, 1);
+        whpx_bump_rip(cpu, exit_ctx);
+        return 0;
     }
 
-    if (!emu_status.EmulationSuccessful) {
-        error_report("WHPX: Failed to emulate PortIO access with"
-                     " EmulatorReturnStatus: %u", emu_status.AsUINT32);
+    ret = emulate_instruction(cpu, ctx->InstructionBytes, exit_ctx->VpContext.InstructionLength);
+    if (ret < 0) {
+        error_report("failed to emulate I/O port access");
         return -1;
     }
 
     return 0;
 }
 
+static void write_mem(CPUState *cpu, void *data, target_ulong addr, int bytes)
+{
+    vmx_write_mem(cpu, addr, data, bytes);
+}
+
+static void read_mem(CPUState *cpu, void *data, target_ulong addr, int bytes)
+{
+    vmx_read_mem(cpu, data, addr, bytes);
+}
+
+static void read_segment_descriptor(CPUState *cpu,
+                                    struct x86_segment_descriptor *desc,
+                                    enum X86Seg seg_idx)
+{
+    bool ret;
+    X86CPU *x86_cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86_cpu->env;
+    SegmentCache *seg = &env->segs[seg_idx];
+    x86_segment_selector sel = { .sel = seg->selector & 0xFFFF };
+
+    ret = x86_read_segment_descriptor(cpu, desc, sel);
+    if (ret == false) {
+        error_report("failed to read segment descriptor");
+        abort();
+    }
+}
+
+
+static const struct x86_emul_ops whpx_x86_emul_ops = {
+    .read_mem = read_mem,
+    .write_mem = write_mem,
+    .read_segment_descriptor = read_segment_descriptor,
+    .handle_io = handle_io
+};
+
+static void whpx_init_emu(void)
+{
+    init_decoder();
+    init_emu(&whpx_x86_emul_ops);
+}
+
 /*
  * Controls whether we should intercept various exceptions on the guest,
  * namely breakpoint/single-step events.
@@ -1279,8 +1264,9 @@ bool whpx_arch_supports_guest_debug(void)
 
 void whpx_arch_destroy_vcpu(CPUState *cpu)
 {
-    AccelCPUState *vcpu = cpu->accel;
-    whp_dispatch.WHvEmulatorDestroyEmulator(vcpu->emulator);
+    X86CPU *x86cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86cpu->env;
+    g_free(env->emu_mmio_buf);
 }
 
 /* Returns the address of the next instruction that is about to be executed. */
@@ -1639,11 +1625,11 @@ int whpx_vcpu_run(CPUState *cpu)
 
         switch (vcpu->exit_ctx.ExitReason) {
         case WHvRunVpExitReasonMemoryAccess:
-            ret = whpx_handle_mmio(cpu, &vcpu->exit_ctx.MemoryAccess);
+            ret = whpx_handle_mmio(cpu, &vcpu->exit_ctx);
             break;
 
         case WHvRunVpExitReasonX64IoPortAccess:
-            ret = whpx_handle_portio(cpu, &vcpu->exit_ctx.IoPortAccess);
+            ret = whpx_handle_portio(cpu, &vcpu->exit_ctx);
             break;
 
         case WHvRunVpExitReasonX64InterruptWindow:
@@ -1990,22 +1976,11 @@ int whpx_init_vcpu(CPUState *cpu)
 
     vcpu = g_new0(AccelCPUState, 1);
 
-    hr = whp_dispatch.WHvEmulatorCreateEmulator(
-        &whpx_emu_callbacks,
-        &vcpu->emulator);
-    if (FAILED(hr)) {
-        error_report("WHPX: Failed to setup instruction completion support,"
-                     " hr=%08lx", hr);
-        ret = -EINVAL;
-        goto error;
-    }
-
     hr = whp_dispatch.WHvCreateVirtualProcessor(
         whpx->partition, cpu->cpu_index, 0);
     if (FAILED(hr)) {
         error_report("WHPX: Failed to create a virtual processor,"
                      " hr=%08lx", hr);
-        whp_dispatch.WHvEmulatorDestroyEmulator(vcpu->emulator);
         ret = -EINVAL;
         goto error;
     }
@@ -2067,6 +2042,8 @@ int whpx_init_vcpu(CPUState *cpu)
     max_vcpu_index = max(max_vcpu_index, cpu->cpu_index);
     qemu_add_vm_change_state_handler(whpx_cpu_update_state, env);
 
+    env->emu_mmio_buf = g_new(char, 4096);
+
     return 0;
 
 error:
@@ -2256,6 +2233,7 @@ int whpx_accel_init(AccelState *as, MachineState *ms)
     }
 
     whpx_memory_init();
+    whpx_init_emu();
 
     printf("Windows Hypervisor Platform accelerator is operational\n");
     return 0;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 012/102] whpx: i386: flags conversion for target/i386/emulate internal state
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (10 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 011/102] whpx: i386: switch over from winhvemulation to target/i386/emulate Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 013/102] whpx: i386: remove remaining winhvemulation support code Paolo Bonzini
                   ` (89 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-12-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/whpx/whpx-all.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index eb6076d2f49..05248850530 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -412,6 +412,7 @@ void whpx_set_registers(CPUState *cpu, int level)
     vcxt.values[idx++].Reg64 = env->eip;
 
     assert(whpx_register_names[idx] == WHvX64RegisterRflags);
+    lflags_to_rflags(env);
     vcxt.values[idx++].Reg64 = env->eflags;
 
     /* Translate 6+4 segment registers. HV and QEMU order matches  */
@@ -637,6 +638,7 @@ void whpx_get_registers(CPUState *cpu)
     env->eip = vcxt.values[idx++].Reg64;
     assert(whpx_register_names[idx] == WHvX64RegisterRflags);
     env->eflags = vcxt.values[idx++].Reg64;
+    rflags_to_lflags(env);
 
     /* Translate 6+4 segment registers. HV and QEMU order matches  */
     assert(idx == WHvX64RegisterEs);
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 013/102] whpx: i386: remove remaining winhvemulation support code
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (11 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 012/102] whpx: i386: flags conversion for target/i386/emulate internal state Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 014/102] whpx: i386: remove messages Paolo Bonzini
                   ` (88 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

After moving away to target/i386/emulate, this is no longer necessary.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-13-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 meson.build                    |  3 +--
 include/system/whpx-common.h   |  3 ---
 include/system/whpx-internal.h | 16 ----------------
 accel/whpx/whpx-common.c       | 22 ----------------------
 4 files changed, 1 insertion(+), 43 deletions(-)

diff --git a/meson.build b/meson.build
index 3cd1d8dbc66..2bae618d848 100644
--- a/meson.build
+++ b/meson.build
@@ -865,8 +865,7 @@ if get_option('whpx').allowed() and host_os == 'windows'
     endif
    # Leave CONFIG_WHPX disabled
   else
-    if cc.has_header('winhvplatform.h', required: get_option('whpx')) and \
-      cc.has_header('winhvemulation.h', required: get_option('whpx'))
+    if cc.has_header('winhvplatform.h', required: get_option('whpx'))
       accelerators += 'CONFIG_WHPX'
     endif
   endif
diff --git a/include/system/whpx-common.h b/include/system/whpx-common.h
index a4e16e13099..04289afd973 100644
--- a/include/system/whpx-common.h
+++ b/include/system/whpx-common.h
@@ -3,9 +3,6 @@
 #define SYSTEM_WHPX_COMMON_H
 
 struct AccelCPUState {
-#ifdef HOST_X86_64
-    WHV_EMULATOR_HANDLE emulator;
-#endif
     bool window_registered;
     bool interruptable;
     bool ready_for_pic_interrupt;
diff --git a/include/system/whpx-internal.h b/include/system/whpx-internal.h
index ad6ade223ee..7a1c9871f18 100644
--- a/include/system/whpx-internal.h
+++ b/include/system/whpx-internal.h
@@ -4,9 +4,6 @@
 
 #include <windows.h>
 #include <winhvplatform.h>
-#ifdef HOST_X86_64
-#include <winhvemulation.h>
-#endif
 #include "hw/i386/apic.h"
 #include "exec/vaddr.h"
 
@@ -89,12 +86,6 @@ void whpx_apic_get(APICCommonState *s);
   X(HRESULT, WHvResetPartition, \
         (WHV_PARTITION_HANDLE Partition)) \
 
-#define LIST_WINHVEMULATION_FUNCTIONS(X) \
-  X(HRESULT, WHvEmulatorCreateEmulator, (const WHV_EMULATOR_CALLBACKS* Callbacks, WHV_EMULATOR_HANDLE* Emulator)) \
-  X(HRESULT, WHvEmulatorDestroyEmulator, (WHV_EMULATOR_HANDLE Emulator)) \
-  X(HRESULT, WHvEmulatorTryIoEmulation, (WHV_EMULATOR_HANDLE Emulator, VOID* Context, const WHV_VP_EXIT_CONTEXT* VpContext, const WHV_X64_IO_PORT_ACCESS_CONTEXT* IoInstructionContext, WHV_EMULATOR_STATUS* EmulatorReturnStatus)) \
-  X(HRESULT, WHvEmulatorTryMmioEmulation, (WHV_EMULATOR_HANDLE Emulator, VOID* Context, const WHV_VP_EXIT_CONTEXT* VpContext, const WHV_MEMORY_ACCESS_CONTEXT* MmioInstructionContext, WHV_EMULATOR_STATUS* EmulatorReturnStatus)) \
-
 #define WHP_DEFINE_TYPE(return_type, function_name, signature) \
     typedef return_type (WINAPI *function_name ## _t) signature;
 
@@ -103,16 +94,10 @@ void whpx_apic_get(APICCommonState *s);
 
 /* Define function typedef */
 LIST_WINHVPLATFORM_FUNCTIONS(WHP_DEFINE_TYPE)
-#ifdef HOST_X86_64
-LIST_WINHVEMULATION_FUNCTIONS(WHP_DEFINE_TYPE)
-#endif
 LIST_WINHVPLATFORM_FUNCTIONS_SUPPLEMENTAL(WHP_DEFINE_TYPE)
 
 struct WHPDispatch {
     LIST_WINHVPLATFORM_FUNCTIONS(WHP_DECLARE_MEMBER)
-#ifdef HOST_X86_64
-    LIST_WINHVEMULATION_FUNCTIONS(WHP_DECLARE_MEMBER)
-#endif
     LIST_WINHVPLATFORM_FUNCTIONS_SUPPLEMENTAL(WHP_DECLARE_MEMBER)
 };
 
@@ -122,7 +107,6 @@ bool init_whp_dispatch(void);
 
 typedef enum WHPFunctionList {
     WINHV_PLATFORM_FNS_DEFAULT,
-    WINHV_EMULATION_FNS_DEFAULT,
     WINHV_PLATFORM_FNS_SUPPLEMENTAL
 } WHPFunctionList;
 
diff --git a/accel/whpx/whpx-common.c b/accel/whpx/whpx-common.c
index 21e9f1a1781..88eef557998 100644
--- a/accel/whpx/whpx-common.c
+++ b/accel/whpx/whpx-common.c
@@ -39,9 +39,6 @@ bool whpx_allowed;
 bool whpx_irqchip_in_kernel;
 static bool whp_dispatch_initialized;
 static HMODULE hWinHvPlatform;
-#ifdef HOST_X86_64
-static HMODULE hWinHvEmulation;
-#endif
 
 struct whpx_state whpx_global;
 struct WHPDispatch whp_dispatch;
@@ -393,7 +390,6 @@ static bool load_whp_dispatch_fns(HMODULE *handle,
     HMODULE hLib = *handle;
 
     #define WINHV_PLATFORM_DLL "WinHvPlatform.dll"
-    #define WINHV_EMULATION_DLL "WinHvEmulation.dll"
     #define WHP_LOAD_FIELD_OPTIONAL(return_type, function_name, signature) \
         whp_dispatch.function_name = \
             (function_name ## _t)GetProcAddress(hLib, #function_name); \
@@ -420,14 +416,6 @@ static bool load_whp_dispatch_fns(HMODULE *handle,
         WHP_LOAD_LIB(WINHV_PLATFORM_DLL, hLib)
         LIST_WINHVPLATFORM_FUNCTIONS(WHP_LOAD_FIELD)
         break;
-    case WINHV_EMULATION_FNS_DEFAULT:
-#ifdef HOST_X86_64
-        WHP_LOAD_LIB(WINHV_EMULATION_DLL, hLib)
-        LIST_WINHVEMULATION_FUNCTIONS(WHP_LOAD_FIELD)
-#else
-        g_assert_not_reached();
-#endif
-        break;
     case WINHV_PLATFORM_FNS_SUPPLEMENTAL:
         WHP_LOAD_LIB(WINHV_PLATFORM_DLL, hLib)
         LIST_WINHVPLATFORM_FUNCTIONS_SUPPLEMENTAL(WHP_LOAD_FIELD_OPTIONAL)
@@ -543,11 +531,6 @@ bool init_whp_dispatch(void)
     if (!load_whp_dispatch_fns(&hWinHvPlatform, WINHV_PLATFORM_FNS_DEFAULT)) {
         goto error;
     }
-#ifdef HOST_X86_64
-    if (!load_whp_dispatch_fns(&hWinHvEmulation, WINHV_EMULATION_FNS_DEFAULT)) {
-        goto error;
-    }
-#endif
     assert(load_whp_dispatch_fns(&hWinHvPlatform,
         WINHV_PLATFORM_FNS_SUPPLEMENTAL));
     whp_dispatch_initialized = true;
@@ -557,11 +540,6 @@ error:
     if (hWinHvPlatform) {
         FreeLibrary(hWinHvPlatform);
     }
-#ifdef HOST_X86_64
-    if (hWinHvEmulation) {
-        FreeLibrary(hWinHvEmulation);
-    }
-#endif
     return false;
 }
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 014/102] whpx: i386: remove messages
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (12 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 013/102] whpx: i386: remove remaining winhvemulation support code Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 015/102] whpx: i386: remove CPUID trapping Paolo Bonzini
                   ` (87 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni, Philippe Mathieu-Daudé

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Remove some messages printed by the WHPX backend that don't
have an equivalent elsewhere and don't convey an error.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20260223233950.96076-14-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/whpx/whpx-all.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index 05248850530..2f2c613eda0 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -2163,7 +2163,6 @@ int whpx_accel_init(AccelState *as, MachineState *ms)
         whp_dispatch.WHvSetVirtualProcessorInterruptControllerState2) {
         WHV_X64_LOCAL_APIC_EMULATION_MODE mode =
             WHvX64LocalApicEmulationModeXApic;
-        printf("WHPX: setting APIC emulation mode in the hypervisor\n");
         hr = whp_dispatch.WHvSetPartitionProperty(
             whpx->partition,
             WHvPartitionPropertyCodeLocalApicEmulationMode,
@@ -2237,7 +2236,6 @@ int whpx_accel_init(AccelState *as, MachineState *ms)
     whpx_memory_init();
     whpx_init_emu();
 
-    printf("Windows Hypervisor Platform accelerator is operational\n");
     return 0;
 
 error:
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 015/102] whpx: i386: remove CPUID trapping
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (13 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 014/102] whpx: i386: remove messages Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 016/102] whpx: common, i386, arm: rework state levels Paolo Bonzini
                   ` (86 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Very partial in its current state and results in significantly inconsistent
CPUID data. Remove it until it's reimplemented later.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-15-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/whpx/whpx-all.c | 104 ------------------------------------
 1 file changed, 104 deletions(-)

diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index 2f2c613eda0..baa3169c55c 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -1795,75 +1795,6 @@ int whpx_vcpu_run(CPUState *cpu)
             ret = 0;
             break;
         }
-        case WHvRunVpExitReasonX64Cpuid: {
-            WHV_REGISTER_VALUE reg_values[5];
-            WHV_REGISTER_NAME reg_names[5];
-            UINT32 reg_count = 5;
-            UINT64 cpuid_fn, rip = 0, rax = 0, rcx = 0, rdx = 0, rbx = 0;
-            X86CPU *x86_cpu = X86_CPU(cpu);
-            CPUX86State *env = &x86_cpu->env;
-
-            memset(reg_values, 0, sizeof(reg_values));
-
-            rip = vcpu->exit_ctx.VpContext.Rip +
-                  vcpu->exit_ctx.VpContext.InstructionLength;
-            cpuid_fn = vcpu->exit_ctx.CpuidAccess.Rax;
-
-            /*
-             * Ideally, these should be supplied to the hypervisor during VCPU
-             * initialization and it should be able to satisfy this request.
-             * But, currently, WHPX doesn't support setting CPUID values in the
-             * hypervisor once the partition has been setup, which is too late
-             * since VCPUs are realized later. For now, use the values from
-             * QEMU to satisfy these requests, until WHPX adds support for
-             * being able to set these values in the hypervisor at runtime.
-             */
-            cpu_x86_cpuid(env, cpuid_fn, 0, (UINT32 *)&rax, (UINT32 *)&rbx,
-                (UINT32 *)&rcx, (UINT32 *)&rdx);
-            switch (cpuid_fn) {
-            case 0x40000000:
-                /* Expose the vmware cpu frequency cpuid leaf */
-                rax = 0x40000010;
-                rbx = rcx = rdx = 0;
-                break;
-
-            case 0x40000010:
-                rax = env->tsc_khz;
-                rbx = env->apic_bus_freq / 1000; /* Hz to KHz */
-                rcx = rdx = 0;
-                break;
-
-            case 0x80000001:
-                /* Remove any support of OSVW */
-                rcx &= ~CPUID_EXT3_OSVW;
-                break;
-            }
-
-            reg_names[0] = WHvX64RegisterRip;
-            reg_names[1] = WHvX64RegisterRax;
-            reg_names[2] = WHvX64RegisterRcx;
-            reg_names[3] = WHvX64RegisterRdx;
-            reg_names[4] = WHvX64RegisterRbx;
-
-            reg_values[0].Reg64 = rip;
-            reg_values[1].Reg64 = rax;
-            reg_values[2].Reg64 = rcx;
-            reg_values[3].Reg64 = rdx;
-            reg_values[4].Reg64 = rbx;
-
-            hr = whp_dispatch.WHvSetVirtualProcessorRegisters(
-                whpx->partition, cpu->cpu_index,
-                reg_names,
-                reg_count,
-                reg_values);
-
-            if (FAILED(hr)) {
-                error_report("WHPX: Failed to set CpuidAccess state registers,"
-                             " hr=%08lx", hr);
-            }
-            ret = 0;
-            break;
-        }
         case WHvRunVpExitReasonException:
             whpx_get_registers(cpu);
 
@@ -2017,26 +1948,6 @@ int whpx_init_vcpu(CPUState *cpu)
         }
     }
 
-    /*
-     * If the vmware cpuid frequency leaf option is set, and we have a valid
-     * tsc value, trap the corresponding cpuid's.
-     */
-    if (x86_cpu->vmware_cpuid_freq && env->tsc_khz) {
-        UINT32 cpuidExitList[] = {1, 0x80000001, 0x40000000, 0x40000010};
-
-        hr = whp_dispatch.WHvSetPartitionProperty(
-                whpx->partition,
-                WHvPartitionPropertyCodeCpuidExitList,
-                cpuidExitList,
-                RTL_NUMBER_OF(cpuidExitList) * sizeof(UINT32));
-
-        if (FAILED(hr)) {
-            error_report("WHPX: Failed to set partition CpuidExitList hr=%08lx",
-                        hr);
-            ret = -EINVAL;
-            goto error;
-        }
-    }
 
     vcpu->interruptable = true;
     cpu->vcpu_dirty = true;
@@ -2073,7 +1984,6 @@ int whpx_accel_init(AccelState *as, MachineState *ms)
     WHV_CAPABILITY whpx_cap;
     UINT32 whpx_cap_size;
     WHV_PARTITION_PROPERTY prop;
-    UINT32 cpuidExitList[] = {1, 0x80000001};
     WHV_CAPABILITY_FEATURES features = {0};
 
     whpx = &whpx_global;
@@ -2183,7 +2093,6 @@ int whpx_accel_init(AccelState *as, MachineState *ms)
     /* Register for MSR and CPUID exits */
     memset(&prop, 0, sizeof(WHV_PARTITION_PROPERTY));
     prop.ExtendedVmExits.X64MsrExit = 1;
-    prop.ExtendedVmExits.X64CpuidExit = 1;
     prop.ExtendedVmExits.ExceptionExit = 1;
     if (whpx_irqchip_in_kernel()) {
         prop.ExtendedVmExits.X64ApicInitSipiExitTrap = 1;
@@ -2200,19 +2109,6 @@ int whpx_accel_init(AccelState *as, MachineState *ms)
         goto error;
     }
 
-    hr = whp_dispatch.WHvSetPartitionProperty(
-        whpx->partition,
-        WHvPartitionPropertyCodeCpuidExitList,
-        cpuidExitList,
-        RTL_NUMBER_OF(cpuidExitList) * sizeof(UINT32));
-
-    if (FAILED(hr)) {
-        error_report("WHPX: Failed to set partition CpuidExitList hr=%08lx",
-                     hr);
-        ret = -EINVAL;
-        goto error;
-    }
-
     /*
      * We do not want to intercept any exceptions from the guest,
      * until we actually start debugging with gdb.
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 016/102] whpx: common, i386, arm: rework state levels
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (14 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 015/102] whpx: i386: remove CPUID trapping Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 017/102] whpx: i386: saving/restoring less state for WHPX_LEVEL_FAST_RUNTIME_STATE Paolo Bonzini
                   ` (85 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni, Philippe Mathieu-Daudé

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Change state levels from a set of ifdefs to an enum.
Make register state loads use state levels too.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20260223233950.96076-16-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/system/whpx-accel-ops.h | 16 ++++++++++------
 include/system/whpx-all.h       |  6 ++++--
 accel/whpx/whpx-common.c        |  8 ++++----
 target/arm/whpx/whpx-all.c      |  8 ++++----
 target/i386/whpx/whpx-all.c     | 16 ++++++++--------
 5 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/include/system/whpx-accel-ops.h b/include/system/whpx-accel-ops.h
index ed9d4c49f4d..4b2a7326548 100644
--- a/include/system/whpx-accel-ops.h
+++ b/include/system/whpx-accel-ops.h
@@ -22,11 +22,15 @@ void whpx_cpu_synchronize_post_reset(CPUState *cpu);
 void whpx_cpu_synchronize_post_init(CPUState *cpu);
 void whpx_cpu_synchronize_pre_loadvm(CPUState *cpu);
 
-/* state subset only touched by the VCPU itself during runtime */
-#define WHPX_SET_RUNTIME_STATE   1
-/* state subset modified during VCPU reset */
-#define WHPX_SET_RESET_STATE     2
-/* full state set, modified during initialization or on vmload */
-#define WHPX_SET_FULL_STATE      3
+typedef enum WHPXStateLevel {
+    /* subset of runtime state for faster returns from vmexit */
+    WHPX_LEVEL_FAST_RUNTIME_STATE,
+    /* state subset only touched by the VCPU itself during runtime */
+    WHPX_LEVEL_RUNTIME_STATE,
+    /* state subset modified during VCPU reset */
+    WHPX_LEVEL_RESET_STATE,
+    /* full state set, modified during initialization or on vmload */
+    WHPX_LEVEL_FULL_STATE
+} WHPXStateLevel;
 
 #endif /* TARGET_I386_WHPX_ACCEL_OPS_H */
diff --git a/include/system/whpx-all.h b/include/system/whpx-all.h
index b831c463b0b..2cbea71b149 100644
--- a/include/system/whpx-all.h
+++ b/include/system/whpx-all.h
@@ -2,10 +2,12 @@
 #ifndef SYSTEM_WHPX_ALL_H
 #define SYSTEM_WHPX_ALL_H
 
+#include "system/whpx-accel-ops.h"
+
 /* Called by whpx-common */
 int whpx_vcpu_run(CPUState *cpu);
-void whpx_get_registers(CPUState *cpu);
-void whpx_set_registers(CPUState *cpu, int level);
+void whpx_get_registers(CPUState *cpu, WHPXStateLevel level);
+void whpx_set_registers(CPUState *cpu, WHPXStateLevel level);
 int whpx_accel_init(AccelState *as, MachineState *ms);
 void whpx_cpu_instance_init(CPUState *cs);
 HRESULT whpx_set_exception_exit_bitmap(UINT64 exceptions);
diff --git a/accel/whpx/whpx-common.c b/accel/whpx/whpx-common.c
index 88eef557998..4863fc86631 100644
--- a/accel/whpx/whpx-common.c
+++ b/accel/whpx/whpx-common.c
@@ -46,7 +46,7 @@ struct WHPDispatch whp_dispatch;
 void whpx_flush_cpu_state(CPUState *cpu)
 {
     if (cpu->vcpu_dirty) {
-        whpx_set_registers(cpu, WHPX_SET_RUNTIME_STATE);
+        whpx_set_registers(cpu, WHPX_LEVEL_RUNTIME_STATE);
         cpu->vcpu_dirty = false;
     }
 }
@@ -180,7 +180,7 @@ int whpx_last_vcpu_stopping(CPUState *cpu)
 static void do_whpx_cpu_synchronize_state(CPUState *cpu, run_on_cpu_data arg)
 {
     if (!cpu->vcpu_dirty) {
-        whpx_get_registers(cpu);
+        whpx_get_registers(cpu, WHPX_LEVEL_FULL_STATE);
         cpu->vcpu_dirty = true;
     }
 }
@@ -188,14 +188,14 @@ static void do_whpx_cpu_synchronize_state(CPUState *cpu, run_on_cpu_data arg)
 static void do_whpx_cpu_synchronize_post_reset(CPUState *cpu,
                                                run_on_cpu_data arg)
 {
-    whpx_set_registers(cpu, WHPX_SET_RESET_STATE);
+    whpx_set_registers(cpu, WHPX_LEVEL_RESET_STATE);
     cpu->vcpu_dirty = false;
 }
 
 static void do_whpx_cpu_synchronize_post_init(CPUState *cpu,
                                               run_on_cpu_data arg)
 {
-    whpx_set_registers(cpu, WHPX_SET_FULL_STATE);
+    whpx_set_registers(cpu, WHPX_LEVEL_FULL_STATE);
     cpu->vcpu_dirty = false;
 }
 
diff --git a/target/arm/whpx/whpx-all.c b/target/arm/whpx/whpx-all.c
index 0d56e468bdf..bb94eac7bf8 100644
--- a/target/arm/whpx/whpx-all.c
+++ b/target/arm/whpx/whpx-all.c
@@ -417,7 +417,7 @@ int whpx_vcpu_run(CPUState *cpu)
     do {
         bool advance_pc = false;
         if (cpu->vcpu_dirty) {
-            whpx_set_registers(cpu, WHPX_SET_RUNTIME_STATE);
+            whpx_set_registers(cpu, WHPX_LEVEL_RUNTIME_STATE);
             cpu->vcpu_dirty = false;
         }
 
@@ -482,7 +482,7 @@ int whpx_vcpu_run(CPUState *cpu)
         default:
             error_report("WHPX: Unexpected VP exit code 0x%08x",
                          vcpu->exit_ctx.ExitReason);
-            whpx_get_registers(cpu);
+            whpx_get_registers(cpu, WHPX_LEVEL_FULL_STATE);
             bql_lock();
             qemu_system_guest_panicked(cpu_get_crash_info(cpu));
             bql_unlock();
@@ -516,7 +516,7 @@ static void clean_whv_register_value(WHV_REGISTER_VALUE *val)
     memset(val, 0, sizeof(WHV_REGISTER_VALUE));
 }
 
-void whpx_get_registers(CPUState *cpu)
+void whpx_get_registers(CPUState *cpu, WHPXStateLevel level)
 {
     ARMCPU *arm_cpu = ARM_CPU(cpu);
     CPUARMState *env = &arm_cpu->env;
@@ -563,7 +563,7 @@ void whpx_get_registers(CPUState *cpu)
     aarch64_restore_sp(env, arm_current_el(env));
 }
 
-void whpx_set_registers(CPUState *cpu, int level)
+void whpx_set_registers(CPUState *cpu, WHPXStateLevel level)
 {
     ARMCPU *arm_cpu = ARM_CPU(cpu);
     CPUARMState *env = &arm_cpu->env;
diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index baa3169c55c..c09d9affefa 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -367,7 +367,7 @@ static uint64_t whpx_cr8_to_apic_tpr(uint64_t cr8)
     return cr8 << 4;
 }
 
-void whpx_set_registers(CPUState *cpu, int level)
+void whpx_set_registers(CPUState *cpu, WHPXStateLevel level)
 {
     struct whpx_state *whpx = &whpx_global;
     AccelCPUState *vcpu = cpu->accel;
@@ -386,7 +386,7 @@ void whpx_set_registers(CPUState *cpu, int level)
      * Following MSRs have side effects on the guest or are too heavy for
      * runtime. Limit them to full state update.
      */
-    if (level >= WHPX_SET_RESET_STATE) {
+    if (level >= WHPX_LEVEL_RESET_STATE) {
         whpx_set_tsc(cpu);
     }
 
@@ -583,7 +583,7 @@ static void whpx_get_xcrs(CPUState *cpu)
     cpu_env(cpu)->xcr0 = xcr0.Reg64;
 }
 
-void whpx_get_registers(CPUState *cpu)
+void whpx_get_registers(CPUState *cpu, WHPXStateLevel level)
 {
     struct whpx_state *whpx = &whpx_global;
     AccelCPUState *vcpu = cpu->accel;
@@ -770,10 +770,10 @@ static int emulate_instruction(CPUState *cpu, const uint8_t *insn_bytes, size_t
     struct x86_decode decode = { 0 };
     x86_insn_stream stream = { .bytes = insn_bytes, .len = insn_len };
 
-    whpx_get_registers(cpu);
+    whpx_get_registers(cpu, WHPX_LEVEL_FAST_RUNTIME_STATE);
     decode_instruction_stream(env, &decode, &stream);
     exec_instruction(env, &decode);
-    whpx_set_registers(cpu, WHPX_SET_RUNTIME_STATE);
+    whpx_set_registers(cpu, WHPX_LEVEL_FAST_RUNTIME_STATE);
 
     return 0;
 }
@@ -1589,7 +1589,7 @@ int whpx_vcpu_run(CPUState *cpu)
 
     do {
         if (cpu->vcpu_dirty) {
-            whpx_set_registers(cpu, WHPX_SET_RUNTIME_STATE);
+            whpx_set_registers(cpu, WHPX_LEVEL_RUNTIME_STATE);
             cpu->vcpu_dirty = false;
         }
 
@@ -1796,7 +1796,7 @@ int whpx_vcpu_run(CPUState *cpu)
             break;
         }
         case WHvRunVpExitReasonException:
-            whpx_get_registers(cpu);
+            whpx_get_registers(cpu, WHPX_LEVEL_FULL_STATE);
 
             if ((vcpu->exit_ctx.VpException.ExceptionType ==
                  WHvX64ExceptionTypeDebugTrapOrFault) &&
@@ -1828,7 +1828,7 @@ int whpx_vcpu_run(CPUState *cpu)
         default:
             error_report("WHPX: Unexpected VP exit code %d",
                          vcpu->exit_ctx.ExitReason);
-            whpx_get_registers(cpu);
+            whpx_get_registers(cpu, WHPX_LEVEL_FULL_STATE);
             bql_lock();
             qemu_system_guest_panicked(cpu_get_crash_info(cpu));
             bql_unlock();
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 017/102] whpx: i386: saving/restoring less state for WHPX_LEVEL_FAST_RUNTIME_STATE
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (15 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 016/102] whpx: common, i386, arm: rework state levels Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 018/102] target/i386: mshv, emulate: move the generic x86 helpers to target/i386/emulate Paolo Bonzini
                   ` (84 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni, Bernhard Beschow

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Optimise vmexits by save/restoring less state in those cases instead of the full state.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Link: https://lore.kernel.org/r/20260223233950.96076-17-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/whpx/whpx-all.c | 196 +++++++++++++++++++-----------------
 1 file changed, 101 insertions(+), 95 deletions(-)

diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index c09d9affefa..ab583e922d4 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -422,118 +422,124 @@ void whpx_set_registers(CPUState *cpu, WHPXStateLevel level)
     }
 
     assert(idx == WHvX64RegisterLdtr);
-    vcxt.values[idx++].Segment = whpx_seg_q2h(&env->ldt, 0, 0);
-
-    assert(idx == WHvX64RegisterTr);
-    vcxt.values[idx++].Segment = whpx_seg_q2h(&env->tr, 0, 0);
-
-    assert(idx == WHvX64RegisterIdtr);
-    vcxt.values[idx].Table.Base = env->idt.base;
-    vcxt.values[idx].Table.Limit = env->idt.limit;
-    idx += 1;
-
-    assert(idx == WHvX64RegisterGdtr);
-    vcxt.values[idx].Table.Base = env->gdt.base;
-    vcxt.values[idx].Table.Limit = env->gdt.limit;
-    idx += 1;
-
-    /* CR0, 2, 3, 4, 8 */
-    assert(whpx_register_names[idx] == WHvX64RegisterCr0);
-    vcxt.values[idx++].Reg64 = env->cr[0];
-    assert(whpx_register_names[idx] == WHvX64RegisterCr2);
-    vcxt.values[idx++].Reg64 = env->cr[2];
-    assert(whpx_register_names[idx] == WHvX64RegisterCr3);
-    vcxt.values[idx++].Reg64 = env->cr[3];
-    assert(whpx_register_names[idx] == WHvX64RegisterCr4);
-    vcxt.values[idx++].Reg64 = env->cr[4];
-    assert(whpx_register_names[idx] == WHvX64RegisterCr8);
-    vcxt.values[idx++].Reg64 = vcpu->tpr;
-
-    /* 8 Debug Registers - Skipped */
-
     /*
-     * Extended control registers needs to be handled separately depending
-     * on whether xsave is supported/enabled or not.
+     * Skip those registers for synchronisation after MMIO accesses
+     * as they're not going to be modified in that case.
      */
-    whpx_set_xcrs(cpu);
+    if (level > WHPX_LEVEL_FAST_RUNTIME_STATE) {
+        vcxt.values[idx++].Segment = whpx_seg_q2h(&env->ldt, 0, 0);
 
-    /* 16 XMM registers */
-    assert(whpx_register_names[idx] == WHvX64RegisterXmm0);
-    idx_next = idx + 16;
-    for (i = 0; i < sizeof(env->xmm_regs) / sizeof(ZMMReg); i += 1, idx += 1) {
-        vcxt.values[idx].Reg128.Low64 = env->xmm_regs[i].ZMM_Q(0);
-        vcxt.values[idx].Reg128.High64 = env->xmm_regs[i].ZMM_Q(1);
-    }
-    idx = idx_next;
+        assert(idx == WHvX64RegisterTr);
+        vcxt.values[idx++].Segment = whpx_seg_q2h(&env->tr, 0, 0);
 
-    /* 8 FP registers */
-    assert(whpx_register_names[idx] == WHvX64RegisterFpMmx0);
-    for (i = 0; i < 8; i += 1, idx += 1) {
-        vcxt.values[idx].Fp.AsUINT128.Low64 = env->fpregs[i].mmx.MMX_Q(0);
-        /* vcxt.values[idx].Fp.AsUINT128.High64 =
-               env->fpregs[i].mmx.MMX_Q(1);
-        */
-    }
+        assert(idx == WHvX64RegisterIdtr);
+        vcxt.values[idx].Table.Base = env->idt.base;
+        vcxt.values[idx].Table.Limit = env->idt.limit;
+        idx += 1;
 
-    /* FP control status register */
-    assert(whpx_register_names[idx] == WHvX64RegisterFpControlStatus);
-    vcxt.values[idx].FpControlStatus.FpControl = env->fpuc;
-    vcxt.values[idx].FpControlStatus.FpStatus =
-        (env->fpus & ~0x3800) | (env->fpstt & 0x7) << 11;
-    vcxt.values[idx].FpControlStatus.FpTag = 0;
-    for (i = 0; i < 8; ++i) {
-        vcxt.values[idx].FpControlStatus.FpTag |= (!env->fptags[i]) << i;
-    }
-    vcxt.values[idx].FpControlStatus.Reserved = 0;
-    vcxt.values[idx].FpControlStatus.LastFpOp = env->fpop;
-    vcxt.values[idx].FpControlStatus.LastFpRip = env->fpip;
-    idx += 1;
+        assert(idx == WHvX64RegisterGdtr);
+        vcxt.values[idx].Table.Base = env->gdt.base;
+        vcxt.values[idx].Table.Limit = env->gdt.limit;
+        idx += 1;
 
-    /* XMM control status register */
-    assert(whpx_register_names[idx] == WHvX64RegisterXmmControlStatus);
-    vcxt.values[idx].XmmControlStatus.LastFpRdp = 0;
-    vcxt.values[idx].XmmControlStatus.XmmStatusControl = env->mxcsr;
-    vcxt.values[idx].XmmControlStatus.XmmStatusControlMask = 0x0000ffff;
-    idx += 1;
+        /* CR0, 2, 3, 4, 8 */
+        assert(whpx_register_names[idx] == WHvX64RegisterCr0);
+        vcxt.values[idx++].Reg64 = env->cr[0];
+        assert(whpx_register_names[idx] == WHvX64RegisterCr2);
+        vcxt.values[idx++].Reg64 = env->cr[2];
+        assert(whpx_register_names[idx] == WHvX64RegisterCr3);
+        vcxt.values[idx++].Reg64 = env->cr[3];
+        assert(whpx_register_names[idx] == WHvX64RegisterCr4);
+        vcxt.values[idx++].Reg64 = env->cr[4];
+        assert(whpx_register_names[idx] == WHvX64RegisterCr8);
+        vcxt.values[idx++].Reg64 = vcpu->tpr;
 
-    /* MSRs */
-    assert(whpx_register_names[idx] == WHvX64RegisterEfer);
-    vcxt.values[idx++].Reg64 = env->efer;
+        /* 8 Debug Registers - Skipped */
+
+        /*
+         * Extended control registers needs to be handled separately depending
+         * on whether xsave is supported/enabled or not.
+         */
+        whpx_set_xcrs(cpu);
+
+        /* 16 XMM registers */
+        assert(whpx_register_names[idx] == WHvX64RegisterXmm0);
+        idx_next = idx + 16;
+        for (i = 0; i < sizeof(env->xmm_regs) / sizeof(ZMMReg); i += 1, idx += 1) {
+            vcxt.values[idx].Reg128.Low64 = env->xmm_regs[i].ZMM_Q(0);
+            vcxt.values[idx].Reg128.High64 = env->xmm_regs[i].ZMM_Q(1);
+        }
+        idx = idx_next;
+
+        /* 8 FP registers */
+        assert(whpx_register_names[idx] == WHvX64RegisterFpMmx0);
+        for (i = 0; i < 8; i += 1, idx += 1) {
+            vcxt.values[idx].Fp.AsUINT128.Low64 = env->fpregs[i].mmx.MMX_Q(0);
+            /* vcxt.values[idx].Fp.AsUINT128.High64 =
+                       env->fpregs[i].mmx.MMX_Q(1);
+            */
+        }
+
+        /* FP control status register */
+        assert(whpx_register_names[idx] == WHvX64RegisterFpControlStatus);
+        vcxt.values[idx].FpControlStatus.FpControl = env->fpuc;
+        vcxt.values[idx].FpControlStatus.FpStatus =
+            (env->fpus & ~0x3800) | (env->fpstt & 0x7) << 11;
+        vcxt.values[idx].FpControlStatus.FpTag = 0;
+        for (i = 0; i < 8; ++i) {
+            vcxt.values[idx].FpControlStatus.FpTag |= (!env->fptags[i]) << i;
+        }
+        vcxt.values[idx].FpControlStatus.Reserved = 0;
+        vcxt.values[idx].FpControlStatus.LastFpOp = env->fpop;
+        vcxt.values[idx].FpControlStatus.LastFpRip = env->fpip;
+        idx += 1;
+
+        /* XMM control status register */
+        assert(whpx_register_names[idx] == WHvX64RegisterXmmControlStatus);
+        vcxt.values[idx].XmmControlStatus.LastFpRdp = 0;
+        vcxt.values[idx].XmmControlStatus.XmmStatusControl = env->mxcsr;
+        vcxt.values[idx].XmmControlStatus.XmmStatusControlMask = 0x0000ffff;
+        idx += 1;
+
+        /* MSRs */
+        assert(whpx_register_names[idx] == WHvX64RegisterEfer);
+        vcxt.values[idx++].Reg64 = env->efer;
 #ifdef TARGET_X86_64
-    assert(whpx_register_names[idx] == WHvX64RegisterKernelGsBase);
-    vcxt.values[idx++].Reg64 = env->kernelgsbase;
+        assert(whpx_register_names[idx] == WHvX64RegisterKernelGsBase);
+        vcxt.values[idx++].Reg64 = env->kernelgsbase;
 #endif
 
-    assert(whpx_register_names[idx] == WHvX64RegisterApicBase);
-    vcxt.values[idx++].Reg64 = vcpu->apic_base;
+        assert(whpx_register_names[idx] == WHvX64RegisterApicBase);
+        vcxt.values[idx++].Reg64 = vcpu->apic_base;
 
-    /* WHvX64RegisterPat - Skipped */
+        /* WHvX64RegisterPat - Skipped */
 
-    assert(whpx_register_names[idx] == WHvX64RegisterSysenterCs);
-    vcxt.values[idx++].Reg64 = env->sysenter_cs;
-    assert(whpx_register_names[idx] == WHvX64RegisterSysenterEip);
-    vcxt.values[idx++].Reg64 = env->sysenter_eip;
-    assert(whpx_register_names[idx] == WHvX64RegisterSysenterEsp);
-    vcxt.values[idx++].Reg64 = env->sysenter_esp;
-    assert(whpx_register_names[idx] == WHvX64RegisterStar);
-    vcxt.values[idx++].Reg64 = env->star;
+        assert(whpx_register_names[idx] == WHvX64RegisterSysenterCs);
+        vcxt.values[idx++].Reg64 = env->sysenter_cs;
+        assert(whpx_register_names[idx] == WHvX64RegisterSysenterEip);
+        vcxt.values[idx++].Reg64 = env->sysenter_eip;
+        assert(whpx_register_names[idx] == WHvX64RegisterSysenterEsp);
+        vcxt.values[idx++].Reg64 = env->sysenter_esp;
+        assert(whpx_register_names[idx] == WHvX64RegisterStar);
+        vcxt.values[idx++].Reg64 = env->star;
 #ifdef TARGET_X86_64
-    assert(whpx_register_names[idx] == WHvX64RegisterLstar);
-    vcxt.values[idx++].Reg64 = env->lstar;
-    assert(whpx_register_names[idx] == WHvX64RegisterCstar);
-    vcxt.values[idx++].Reg64 = env->cstar;
-    assert(whpx_register_names[idx] == WHvX64RegisterSfmask);
-    vcxt.values[idx++].Reg64 = env->fmask;
+        assert(whpx_register_names[idx] == WHvX64RegisterLstar);
+        vcxt.values[idx++].Reg64 = env->lstar;
+        assert(whpx_register_names[idx] == WHvX64RegisterCstar);
+        vcxt.values[idx++].Reg64 = env->cstar;
+        assert(whpx_register_names[idx] == WHvX64RegisterSfmask);
+        vcxt.values[idx++].Reg64 = env->fmask;
 #endif
 
-    /* Interrupt / Event Registers - Skipped */
+        /* Interrupt / Event Registers - Skipped */
 
-    assert(idx == RTL_NUMBER_OF(whpx_register_names));
+        assert(idx == RTL_NUMBER_OF(whpx_register_names));
+    }
 
     hr = whp_dispatch.WHvSetVirtualProcessorRegisters(
         whpx->partition, cpu->cpu_index,
         whpx_register_names,
-        RTL_NUMBER_OF(whpx_register_names),
+        idx,
         &vcxt.values[0]);
 
     if (FAILED(hr)) {
@@ -613,7 +619,7 @@ void whpx_get_registers(CPUState *cpu, WHPXStateLevel level)
                      hr);
     }
 
-    if (whpx_irqchip_in_kernel()) {
+    if (level > WHPX_LEVEL_FAST_RUNTIME_STATE && whpx_irqchip_in_kernel()) {
         /*
          * Fetch the TPR value from the emulated APIC. It may get overwritten
          * below with the value from CR8 returned by
@@ -670,7 +676,7 @@ void whpx_get_registers(CPUState *cpu, WHPXStateLevel level)
     env->cr[4] = vcxt.values[idx++].Reg64;
     assert(whpx_register_names[idx] == WHvX64RegisterCr8);
     tpr = vcxt.values[idx++].Reg64;
-    if (tpr != vcpu->tpr) {
+    if (level > WHPX_LEVEL_FAST_RUNTIME_STATE && tpr != vcpu->tpr) {
         vcpu->tpr = tpr;
         cpu_set_apic_tpr(x86_cpu->apic_state, whpx_cr8_to_apic_tpr(tpr));
     }
@@ -756,7 +762,7 @@ void whpx_get_registers(CPUState *cpu, WHPXStateLevel level)
 
     assert(idx == RTL_NUMBER_OF(whpx_register_names));
 
-    if (whpx_irqchip_in_kernel()) {
+    if (level > WHPX_LEVEL_FAST_RUNTIME_STATE && whpx_irqchip_in_kernel()) {
         whpx_apic_get(x86_cpu->apic_state);
     }
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 018/102] target/i386: mshv, emulate: move the generic x86 helpers to target/i386/emulate
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (16 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 017/102] whpx: i386: saving/restoring less state for WHPX_LEVEL_FAST_RUNTIME_STATE Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 019/102] target/i386: emulate: 5-level paging for the page table walker Paolo Bonzini
                   ` (83 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

HVF doesn't use them at this point, but move them to common code as that's what they are.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-18-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/{mshv/x86.c => emulate/x86_helpers.c} | 0
 target/i386/emulate/meson.build                   | 7 +++++++
 target/i386/mshv/meson.build                      | 4 ----
 3 files changed, 7 insertions(+), 4 deletions(-)
 rename target/i386/{mshv/x86.c => emulate/x86_helpers.c} (100%)

diff --git a/target/i386/mshv/x86.c b/target/i386/emulate/x86_helpers.c
similarity index 100%
rename from target/i386/mshv/x86.c
rename to target/i386/emulate/x86_helpers.c
diff --git a/target/i386/emulate/meson.build b/target/i386/emulate/meson.build
index 1bb35162498..1fa1a8e8ec8 100644
--- a/target/i386/emulate/meson.build
+++ b/target/i386/emulate/meson.build
@@ -5,6 +5,13 @@ emulator_files = files(
   'x86_mmu.c'
 )
 
+emulator_helper_files = files(
+  'x86_helpers.c'
+)
+
 i386_system_ss.add(when: [hvf, 'CONFIG_HVF'], if_true: emulator_files)
 i386_system_ss.add(when: 'CONFIG_MSHV', if_true: emulator_files)
 i386_system_ss.add(when: 'CONFIG_WHPX', if_true: emulator_files)
+
+i386_system_ss.add(when: 'CONFIG_MSHV', if_true: emulator_helper_files)
+i386_system_ss.add(when: 'CONFIG_WHPX', if_true: emulator_helper_files)
diff --git a/target/i386/mshv/meson.build b/target/i386/mshv/meson.build
index 3fadd4598a5..49f28d4a5b9 100644
--- a/target/i386/mshv/meson.build
+++ b/target/i386/mshv/meson.build
@@ -2,11 +2,7 @@ i386_mshv_ss = ss.source_set()
 
 i386_mshv_ss.add(files(
   'mshv-cpu.c',
-  'x86.c',
 ))
 
 i386_system_ss.add_all(when: 'CONFIG_MSHV', if_true: i386_mshv_ss)
 
-i386_system_ss.add(when: 'CONFIG_WHPX', if_true: files(
-  'x86.c',
-))
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 019/102] target/i386: emulate: 5-level paging for the page table walker
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (17 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 018/102] target/i386: mshv, emulate: move the generic x86 helpers to target/i386/emulate Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 020/102] target/i386: emulate, hvf, mshv: rework MMU code Paolo Bonzini
                   ` (82 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-19-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/emulate/x86.h         | 1 +
 target/i386/emulate/x86_helpers.c | 8 ++++++++
 target/i386/emulate/x86_mmu.c     | 3 +++
 target/i386/hvf/x86.c             | 5 +++++
 4 files changed, 17 insertions(+)

diff --git a/target/i386/emulate/x86.h b/target/i386/emulate/x86.h
index 73edccfba00..caf0e3be50e 100644
--- a/target/i386/emulate/x86.h
+++ b/target/i386/emulate/x86.h
@@ -263,6 +263,7 @@ bool x86_is_protected(CPUState *cpu);
 bool x86_is_real(CPUState *cpu);
 bool x86_is_v8086(CPUState *cpu);
 bool x86_is_long_mode(CPUState *cpu);
+bool x86_is_la57(CPUState *cpu);
 bool x86_is_long64_mode(CPUState *cpu);
 bool x86_is_paging_mode(CPUState *cpu);
 bool x86_is_pae_enabled(CPUState *cpu);
diff --git a/target/i386/emulate/x86_helpers.c b/target/i386/emulate/x86_helpers.c
index 0700cc05efb..7bdd7e4c2a1 100644
--- a/target/i386/emulate/x86_helpers.c
+++ b/target/i386/emulate/x86_helpers.c
@@ -236,6 +236,14 @@ bool x86_is_long_mode(CPUState *cpu)
     return ((efer & lme_lma) == lme_lma);
 }
 
+bool x86_is_la57(CPUState *cpu)
+{
+    X86CPU *x86_cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86_cpu->env;
+    uint64_t is_la57 = env->cr[4] & CR4_LA57_MASK;
+    return is_la57;
+}
+
 bool x86_is_long64_mode(CPUState *cpu)
 {
     error_report("unimplemented: is_long64_mode()");
diff --git a/target/i386/emulate/x86_mmu.c b/target/i386/emulate/x86_mmu.c
index b82a55a3da7..35987a897aa 100644
--- a/target/i386/emulate/x86_mmu.c
+++ b/target/i386/emulate/x86_mmu.c
@@ -56,6 +56,9 @@ static int gpt_top_level(CPUState *cpu, bool pae)
         return 2;
     }
     if (x86_is_long_mode(cpu)) {
+        if (x86_is_la57(cpu)) {
+            return 5;
+        }
         return 4;
     }
 
diff --git a/target/i386/hvf/x86.c b/target/i386/hvf/x86.c
index 2fa210ff601..e98f480f411 100644
--- a/target/i386/hvf/x86.c
+++ b/target/i386/hvf/x86.c
@@ -138,6 +138,11 @@ bool x86_is_long_mode(CPUState *cpu)
     return rvmcs(cpu->accel->fd, VMCS_GUEST_IA32_EFER) & MSR_EFER_LMA;
 }
 
+bool x86_is_la57(CPUState *cpu)
+{
+    return false;
+}
+
 bool x86_is_long64_mode(CPUState *cpu)
 {
     struct vmx_segment desc;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 020/102] target/i386: emulate, hvf, mshv: rework MMU code
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (18 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 019/102] target/i386: emulate: 5-level paging for the page table walker Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 021/102] hvf: i386: save/restore CR0/2/3 Paolo Bonzini
                   ` (81 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

target/i386/emulate doesn't currently properly emulate instructions
which might cause a page fault during their execution. Notably, REP STOS/MOVS
from MMIO to an address which is unmapped until a page fault exception is raised
causes an abort() in vmx_write_mem.

Change the interface between the HW accel backend and target/i386/emulate as a first step towards addressing that.

Adapt the page table walker code to give actionable errors,
while leaving a possibility for backends to provide their own walker.

This removes the usage of the Hyper-V page walker in the mshv backend.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-20-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/emulate/x86_emu.h     |   4 +-
 target/i386/emulate/x86_mmu.h     |  31 +++++--
 target/i386/emulate/x86_decode.c  |   2 +-
 target/i386/emulate/x86_emu.c     |  14 +--
 target/i386/emulate/x86_helpers.c |   5 +-
 target/i386/emulate/x86_mmu.c     | 146 +++++++++++++++++++-----------
 target/i386/hvf/hvf.c             |  31 +++----
 target/i386/hvf/x86.c             |   6 +-
 target/i386/hvf/x86_task.c        |   8 +-
 target/i386/mshv/mshv-cpu.c       |  71 ---------------
 target/i386/whpx/whpx-all.c       |  12 ---
 11 files changed, 146 insertions(+), 184 deletions(-)

diff --git a/target/i386/emulate/x86_emu.h b/target/i386/emulate/x86_emu.h
index 05686b162f6..3e485b8ca36 100644
--- a/target/i386/emulate/x86_emu.h
+++ b/target/i386/emulate/x86_emu.h
@@ -21,13 +21,13 @@
 
 #include "x86.h"
 #include "x86_decode.h"
+#include "x86_mmu.h"
 #include "cpu.h"
 
 struct x86_emul_ops {
     void (*fetch_instruction)(CPUState *cpu, void *data, target_ulong addr,
                               int bytes);
-    void (*read_mem)(CPUState *cpu, void *data, target_ulong addr, int bytes);
-    void (*write_mem)(CPUState *cpu, void *data, target_ulong addr, int bytes);
+    MMUTranslateResult (*mmu_gva_to_gpa) (CPUState *cpu, target_ulong gva, uint64_t *gpa, MMUTranslateFlags flags);
     void (*read_segment_descriptor)(CPUState *cpu, struct x86_segment_descriptor *desc,
                                     enum X86Seg seg);
     void (*handle_io)(CPUState *cpu, uint16_t port, void *data, int direction,
diff --git a/target/i386/emulate/x86_mmu.h b/target/i386/emulate/x86_mmu.h
index 9447ae072cd..190bd272a23 100644
--- a/target/i386/emulate/x86_mmu.h
+++ b/target/i386/emulate/x86_mmu.h
@@ -30,15 +30,30 @@
 #define PT_GLOBAL       (1 << 8)
 #define PT_NX           (1llu << 63)
 
-/* error codes */
-#define MMU_PAGE_PT             (1 << 0)
-#define MMU_PAGE_WT             (1 << 1)
-#define MMU_PAGE_US             (1 << 2)
-#define MMU_PAGE_NX             (1 << 3)
+typedef enum MMUTranslateFlags {
+    MMU_TRANSLATE_VALIDATE_WRITE = BIT(1),
+    MMU_TRANSLATE_VALIDATE_EXECUTE = BIT(2),
+    MMU_TRANSLATE_PRIV_CHECKS_EXEMPT = BIT(3)
+} MMUTranslateFlags;
 
-bool mmu_gva_to_gpa(CPUState *cpu, target_ulong gva, uint64_t *gpa);
+typedef enum MMUTranslateResult {
+    MMU_TRANSLATE_SUCCESS = 0,
+    MMU_TRANSLATE_PAGE_NOT_MAPPED = 1,
+    MMU_TRANSLATE_PRIV_VIOLATION = 2,
+    MMU_TRANSLATE_INVALID_PT_FLAGS = 3,
+    MMU_TRANSLATE_GPA_UNMAPPED = 4,
+    MMU_TRANSLATE_GPA_NO_READ_ACCESS = 5,
+    MMU_TRANSLATE_GPA_NO_WRITE_ACCESS = 6
+} MMUTranslateResult;
+
+MMUTranslateResult mmu_gva_to_gpa(CPUState *cpu, target_ulong gva, uint64_t *gpa, MMUTranslateFlags flags);
+
+/* Thin wrappers x86_write_mem_ex/x86_read_mem_ex for code readability */
+MMUTranslateResult x86_write_mem(CPUState *cpu, void *data, target_ulong gva, int bytes);
+MMUTranslateResult x86_read_mem(CPUState *cpu, void *data, target_ulong gva, int bytes);
+
+MMUTranslateResult x86_write_mem_priv(CPUState *cpu, void *data, target_ulong gva, int bytes);
+MMUTranslateResult x86_read_mem_priv(CPUState *cpu, void *data, target_ulong gva, int bytes);
 
-void vmx_write_mem(CPUState *cpu, target_ulong gva, void *data, int bytes);
-void vmx_read_mem(CPUState *cpu, void *data, target_ulong gva, int bytes);
 
 #endif /* X86_MMU_H */
diff --git a/target/i386/emulate/x86_decode.c b/target/i386/emulate/x86_decode.c
index 7bbcd2a9a2a..9faa65a5797 100644
--- a/target/i386/emulate/x86_decode.c
+++ b/target/i386/emulate/x86_decode.c
@@ -80,7 +80,7 @@ static inline uint64_t decode_bytes(CPUX86State *env, struct x86_decode *decode,
         if (emul_ops->fetch_instruction) {
             emul_ops->fetch_instruction(env_cpu(env), &val, va, size);
         } else {
-            emul_ops->read_mem(env_cpu(env), &val, va, size);
+            x86_read_mem(env_cpu(env), &val, va, size);
         }
     }
     decode->len += size;
diff --git a/target/i386/emulate/x86_emu.c b/target/i386/emulate/x86_emu.c
index bf96fe06b45..cfa35561dd5 100644
--- a/target/i386/emulate/x86_emu.c
+++ b/target/i386/emulate/x86_emu.c
@@ -166,7 +166,7 @@ void write_val_to_reg(void *reg_ptr, target_ulong val, int size)
 
 static void write_val_to_mem(CPUX86State *env, target_ulong ptr, target_ulong val, int size)
 {
-    emul_ops->write_mem(env_cpu(env), &val, ptr, size);
+    x86_write_mem(env_cpu(env), &val, ptr, size);
 }
 
 void write_val_ext(CPUX86State *env, struct x86_decode_op *decode, target_ulong val, int size)
@@ -180,7 +180,7 @@ void write_val_ext(CPUX86State *env, struct x86_decode_op *decode, target_ulong
 
 uint8_t *read_mmio(CPUX86State *env, target_ulong ptr, int bytes)
 {
-    emul_ops->read_mem(env_cpu(env), env->emu_mmio_buf, ptr, bytes);
+    x86_read_mem(env_cpu(env), env->emu_mmio_buf, ptr, bytes);
     return env->emu_mmio_buf;
 }
 
@@ -497,7 +497,7 @@ static void exec_ins_single(CPUX86State *env, struct x86_decode *decode)
 
     emul_ops->handle_io(env_cpu(env), DX(env), env->emu_mmio_buf, 0,
                         decode->operand_size, 1);
-    emul_ops->write_mem(env_cpu(env), env->emu_mmio_buf, addr,
+    x86_write_mem(env_cpu(env), env->emu_mmio_buf, addr,
                         decode->operand_size);
 
     string_increment_reg(env, R_EDI, decode);
@@ -518,7 +518,7 @@ static void exec_outs_single(CPUX86State *env, struct x86_decode *decode)
 {
     target_ulong addr = decode_linear_addr(env, decode, RSI(env), R_DS);
 
-    emul_ops->read_mem(env_cpu(env), env->emu_mmio_buf, addr,
+    x86_read_mem(env_cpu(env), env->emu_mmio_buf, addr,
                        decode->operand_size);
     emul_ops->handle_io(env_cpu(env), DX(env), env->emu_mmio_buf, 1,
                         decode->operand_size, 1);
@@ -604,7 +604,7 @@ static void exec_stos_single(CPUX86State *env, struct x86_decode *decode)
     addr = linear_addr_size(env_cpu(env), RDI(env),
                             decode->addressing_size, R_ES);
     val = read_reg(env, R_EAX, decode->operand_size);
-    emul_ops->write_mem(env_cpu(env), &val, addr, decode->operand_size);
+    x86_write_mem(env_cpu(env), &val, addr, decode->operand_size);
 
     string_increment_reg(env, R_EDI, decode);
 }
@@ -628,7 +628,7 @@ static void exec_scas_single(CPUX86State *env, struct x86_decode *decode)
     addr = linear_addr_size(env_cpu(env), RDI(env),
                             decode->addressing_size, R_ES);
     decode->op[1].type = X86_VAR_IMMEDIATE;
-    emul_ops->read_mem(env_cpu(env), &decode->op[1].val, addr, decode->operand_size);
+    x86_read_mem(env_cpu(env), &decode->op[1].val, addr, decode->operand_size);
 
     EXEC_2OP_FLAGS_CMD(env, decode, -, SET_FLAGS_OSZAPC_SUB, false);
     string_increment_reg(env, R_EDI, decode);
@@ -653,7 +653,7 @@ static void exec_lods_single(CPUX86State *env, struct x86_decode *decode)
     target_ulong val = 0;
 
     addr = decode_linear_addr(env, decode, RSI(env), R_DS);
-    emul_ops->read_mem(env_cpu(env), &val, addr,  decode->operand_size);
+    x86_read_mem(env_cpu(env), &val, addr,  decode->operand_size);
     write_reg(env, R_EAX, val, decode->operand_size);
 
     string_increment_reg(env, R_ESI, decode);
diff --git a/target/i386/emulate/x86_helpers.c b/target/i386/emulate/x86_helpers.c
index 7bdd7e4c2a1..024f9a2afcf 100644
--- a/target/i386/emulate/x86_helpers.c
+++ b/target/i386/emulate/x86_helpers.c
@@ -13,6 +13,7 @@
 #include "cpu.h"
 #include "emulate/x86_decode.h"
 #include "emulate/x86_emu.h"
+#include "emulate/x86_mmu.h"
 #include "qemu/error-report.h"
 #include "system/mshv.h"
 
@@ -176,7 +177,7 @@ bool x86_read_segment_descriptor(CPUState *cpu,
     }
 
     gva = base + sel.index * 8;
-    emul_ops->read_mem(cpu, desc, gva, sizeof(*desc));
+    x86_read_mem_priv(cpu, desc, gva, sizeof(*desc));
 
     return true;
 }
@@ -200,7 +201,7 @@ bool x86_read_call_gate(CPUState *cpu, struct x86_call_gate *idt_desc,
     }
 
     gva = base + gate * 8;
-    emul_ops->read_mem(cpu, idt_desc, gva, sizeof(*idt_desc));
+    x86_read_mem_priv(cpu, idt_desc, gva, sizeof(*idt_desc));
 
     return true;
 }
diff --git a/target/i386/emulate/x86_mmu.c b/target/i386/emulate/x86_mmu.c
index 35987a897aa..11e17c2db1d 100644
--- a/target/i386/emulate/x86_mmu.c
+++ b/target/i386/emulate/x86_mmu.c
@@ -21,7 +21,9 @@
 #include "cpu.h"
 #include "system/address-spaces.h"
 #include "system/memory.h"
+#include "qemu/error-report.h"
 #include "emulate/x86.h"
+#include "emulate/x86_emu.h"
 #include "emulate/x86_mmu.h"
 
 #define pte_present(pte) (pte & PT_PRESENT)
@@ -32,6 +34,11 @@
 #define pte_large_page(pte) (pte & PT_PS)
 #define pte_global_access(pte) (pte & PT_GLOBAL)
 
+#define mmu_validate_write(flags) (flags & MMU_TRANSLATE_VALIDATE_WRITE)
+#define mmu_validate_execute(flags) (flags & MMU_TRANSLATE_VALIDATE_EXECUTE)
+#define mmu_priv_checks_exempt(flags) (flags & MMU_TRANSLATE_PRIV_CHECKS_EXEMPT)
+
+
 #define PAE_CR3_MASK                (~0x1fllu)
 #define LEGACY_CR3_MASK             (0xffffffff)
 
@@ -40,14 +47,16 @@
 #define PAE_PTE_LARGE_PAGE_MASK     ((-1llu << (21)) & ((1llu << 52) - 1))
 #define PAE_PTE_SUPER_PAGE_MASK     ((-1llu << (30)) & ((1llu << 52) - 1))
 
+static bool is_user(CPUState *cpu)
+{
+    return false;
+}
+
+
 struct gpt_translation {
     target_ulong  gva;
     uint64_t gpa;
-    int    err_code;
     uint64_t pte[5];
-    bool write_access;
-    bool user_access;
-    bool exec_access;
 };
 
 static int gpt_top_level(CPUState *cpu, bool pae)
@@ -99,25 +108,15 @@ static bool get_pt_entry(CPUState *cpu, struct gpt_translation *pt,
 }
 
 /* test page table entry */
-static bool test_pt_entry(CPUState *cpu, struct gpt_translation *pt,
-                          int level, int *largeness, bool pae)
+static MMUTranslateResult test_pt_entry(CPUState *cpu, struct gpt_translation *pt,
+                          int level, int *largeness, bool pae, MMUTranslateFlags flags)
 {
     X86CPU *x86_cpu = X86_CPU(cpu);
     CPUX86State *env = &x86_cpu->env;
     uint64_t pte = pt->pte[level];
 
-    if (pt->write_access) {
-        pt->err_code |= MMU_PAGE_WT;
-    }
-    if (pt->user_access) {
-        pt->err_code |= MMU_PAGE_US;
-    }
-    if (pt->exec_access) {
-        pt->err_code |= MMU_PAGE_NX;
-    }
-
     if (!pte_present(pte)) {
-        return false;
+        return MMU_TRANSLATE_PAGE_NOT_MAPPED;
     }
 
     if (pae && !x86_is_long_mode(cpu) && 2 == level) {
@@ -125,32 +124,30 @@ static bool test_pt_entry(CPUState *cpu, struct gpt_translation *pt,
     }
 
     if (level && pte_large_page(pte)) {
-        pt->err_code |= MMU_PAGE_PT;
         *largeness = level;
     }
-    if (!level) {
-        pt->err_code |= MMU_PAGE_PT;
-    }
 
     uint32_t cr0 = env->cr[0];
     /* check protection */
     if (cr0 & CR0_WP_MASK) {
-        if (pt->write_access && !pte_write_access(pte)) {
-            return false;
+        if (mmu_validate_write(flags) && !pte_write_access(pte)) {
+            return MMU_TRANSLATE_PRIV_VIOLATION;
         }
     }
 
-    if (pt->user_access && !pte_user_access(pte)) {
-        return false;
+    if (!mmu_priv_checks_exempt(flags)) {
+        if (is_user(cpu) && !pte_user_access(pte)) {
+            return MMU_TRANSLATE_PRIV_VIOLATION;
+        }
     }
 
-    if (pae && pt->exec_access && !pte_exec_access(pte)) {
-        return false;
+    if (pae && mmu_validate_execute(flags) && !pte_exec_access(pte)) {
+        return MMU_TRANSLATE_PRIV_VIOLATION;
     }
     
 exit:
     /* TODO: check reserved bits */
-    return true;
+    return MMU_TRANSLATE_SUCCESS;
 }
 
 static inline uint64_t pse_pte_to_page(uint64_t pte)
@@ -181,7 +178,7 @@ static inline uint64_t large_page_gpa(struct gpt_translation *pt, bool pae,
 
 
 
-static bool walk_gpt(CPUState *cpu, target_ulong addr, int err_code,
+static MMUTranslateResult walk_gpt(CPUState *cpu, target_ulong addr, MMUTranslateFlags flags,
                      struct gpt_translation *pt, bool pae)
 {
     X86CPU *x86_cpu = X86_CPU(cpu);
@@ -190,21 +187,20 @@ static bool walk_gpt(CPUState *cpu, target_ulong addr, int err_code,
     int largeness = 0;
     target_ulong cr3 = env->cr[3];
     uint64_t page_mask = pae ? PAE_PTE_PAGE_MASK : LEGACY_PTE_PAGE_MASK;
+    MMUTranslateResult res;
     
     memset(pt, 0, sizeof(*pt));
     top_level = gpt_top_level(cpu, pae);
 
     pt->pte[top_level] = pae ? (cr3 & PAE_CR3_MASK) : (cr3 & LEGACY_CR3_MASK);
     pt->gva = addr;
-    pt->user_access = (err_code & MMU_PAGE_US);
-    pt->write_access = (err_code & MMU_PAGE_WT);
-    pt->exec_access = (err_code & MMU_PAGE_NX);
     
     for (level = top_level; level > 0; level--) {
         get_pt_entry(cpu, pt, level, pae);
+        res = test_pt_entry(cpu, pt, level - 1, &largeness, pae, flags);
 
-        if (!test_pt_entry(cpu, pt, level - 1, &largeness, pae)) {
-            return false;
+        if (res) {
+            return res;
         }
 
         if (largeness) {
@@ -218,69 +214,111 @@ static bool walk_gpt(CPUState *cpu, target_ulong addr, int err_code,
         pt->gpa = large_page_gpa(pt, pae, largeness);
     }
 
-    return true;
+    return res;
 }
 
 
-bool mmu_gva_to_gpa(CPUState *cpu, target_ulong gva, uint64_t *gpa)
+MMUTranslateResult mmu_gva_to_gpa(CPUState *cpu, target_ulong gva, uint64_t *gpa, MMUTranslateFlags flags)
 {
+    if (emul_ops->mmu_gva_to_gpa) {
+        return emul_ops->mmu_gva_to_gpa(cpu, gva, gpa, flags);
+    }
+
     bool res;
     struct gpt_translation pt;
-    int err_code = 0;
 
     if (!x86_is_paging_mode(cpu)) {
         *gpa = gva;
-        return true;
+        return MMU_TRANSLATE_SUCCESS;
     }
 
-    res = walk_gpt(cpu, gva, err_code, &pt, x86_is_pae_enabled(cpu));
-    if (res) {
+    res = walk_gpt(cpu, gva, flags, &pt, x86_is_pae_enabled(cpu));
+    if (res == MMU_TRANSLATE_SUCCESS) {
         *gpa = pt.gpa;
-        return true;
     }
 
-    return false;
+    return res;
 }
 
-void vmx_write_mem(CPUState *cpu, target_ulong gva, void *data, int bytes)
+static MMUTranslateResult x86_write_mem_ex(CPUState *cpu, void *data, target_ulong gva, int bytes, bool priv_check_exempt)
 {
+    MMUTranslateResult translate_res = MMU_TRANSLATE_SUCCESS;
+    MemTxResult mem_tx_res;
     uint64_t gpa;
 
     while (bytes > 0) {
         /* copy page */
         int copy = MIN(bytes, 0x1000 - (gva & 0xfff));
 
-        if (!mmu_gva_to_gpa(cpu, gva, &gpa)) {
-            VM_PANIC_EX("%s: mmu_gva_to_gpa " TARGET_FMT_lx " failed\n",
-                        __func__, gva);
-        } else {
-            address_space_write(&address_space_memory, gpa,
-                                MEMTXATTRS_UNSPECIFIED, data, copy);
+        translate_res = mmu_gva_to_gpa(cpu, gva, &gpa, MMU_TRANSLATE_VALIDATE_WRITE);
+        if (translate_res) {
+            return translate_res;
+        }
+
+        mem_tx_res = address_space_write(&address_space_memory, gpa,
+                            MEMTXATTRS_UNSPECIFIED, data, copy);
+
+        if (mem_tx_res == MEMTX_DECODE_ERROR) {
+            warn_report("write to unmapped mmio region gpa=0x%" PRIx64 " size=%i", gpa, bytes);
+            return MMU_TRANSLATE_GPA_UNMAPPED;
+        } else if (mem_tx_res == MEMTX_ACCESS_ERROR) {
+            return MMU_TRANSLATE_GPA_NO_WRITE_ACCESS;
         }
 
         bytes -= copy;
         gva += copy;
         data += copy;
     }
+    return translate_res;
 }
 
-void vmx_read_mem(CPUState *cpu, void *data, target_ulong gva, int bytes)
+MMUTranslateResult x86_write_mem(CPUState *cpu, void *data, target_ulong gva, int bytes)
 {
+    return x86_write_mem_ex(cpu, data, gva, bytes, false);
+}
+
+MMUTranslateResult x86_write_mem_priv(CPUState *cpu, void *data, target_ulong gva, int bytes)
+{
+    return x86_write_mem_ex(cpu, data, gva, bytes, true);
+}
+
+static MMUTranslateResult x86_read_mem_ex(CPUState *cpu, void *data, target_ulong gva, int bytes, bool priv_check_exempt)
+{
+    MMUTranslateResult translate_res = MMU_TRANSLATE_SUCCESS;
+    MemTxResult mem_tx_res;
     uint64_t gpa;
 
     while (bytes > 0) {
         /* copy page */
         int copy = MIN(bytes, 0x1000 - (gva & 0xfff));
 
-        if (!mmu_gva_to_gpa(cpu, gva, &gpa)) {
-            VM_PANIC_EX("%s: mmu_gva_to_gpa " TARGET_FMT_lx " failed\n",
-                        __func__, gva);
+        translate_res = mmu_gva_to_gpa(cpu, gva, &gpa, 0);
+        if (translate_res) {
+            return translate_res;
         }
-        address_space_read(&address_space_memory, gpa, MEMTXATTRS_UNSPECIFIED,
+        mem_tx_res = address_space_read(&address_space_memory, gpa, MEMTXATTRS_UNSPECIFIED,
                            data, copy);
 
+        if (mem_tx_res == MEMTX_DECODE_ERROR) {
+            warn_report("read from unmapped mmio region gpa=0x%" PRIx64 " size=%i", gpa, bytes);
+            return MMU_TRANSLATE_GPA_UNMAPPED;
+        } else if (mem_tx_res == MEMTX_ACCESS_ERROR) {
+            return MMU_TRANSLATE_GPA_NO_READ_ACCESS;
+        }
+
         bytes -= copy;
         gva += copy;
         data += copy;
     }
+    return translate_res;
+}
+
+MMUTranslateResult x86_read_mem(CPUState *cpu, void *data, target_ulong gva, int bytes)
+{
+    return x86_read_mem_ex(cpu, data, gva, bytes, false);
+}
+
+MMUTranslateResult x86_read_mem_priv(CPUState *cpu, void *data, target_ulong gva, int bytes)
+{
+    return x86_read_mem_ex(cpu, data, gva, bytes, true);
 }
diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 0b3674ad33d..fb039ff7bd5 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -252,27 +252,7 @@ static void hvf_read_segment_descriptor(CPUState *s, struct x86_segment_descript
     vmx_segment_to_x86_descriptor(s, &vmx_segment, desc);
 }
 
-static void hvf_read_mem(CPUState *cpu, void *data, target_ulong gva, int bytes)
-{
-    X86CPU *x86_cpu = X86_CPU(cpu);
-    CPUX86State *env = &x86_cpu->env;
-    env->cr[0] = rvmcs(cpu->accel->fd, VMCS_GUEST_CR0);
-    env->cr[3] = rvmcs(cpu->accel->fd, VMCS_GUEST_CR3);
-    vmx_read_mem(cpu, data, gva, bytes);
-}
-
-static void hvf_write_mem(CPUState *cpu, void *data, target_ulong gva, int bytes)
-{
-    X86CPU *x86_cpu = X86_CPU(cpu);
-    CPUX86State *env = &x86_cpu->env;
-    env->cr[0] = rvmcs(cpu->accel->fd, VMCS_GUEST_CR0);
-    env->cr[3] = rvmcs(cpu->accel->fd, VMCS_GUEST_CR3);
-    vmx_write_mem(cpu, gva, data, bytes);
-}
-
 static const struct x86_emul_ops hvf_x86_emul_ops = {
-    .read_mem = hvf_read_mem,
-    .write_mem = hvf_write_mem,
     .read_segment_descriptor = hvf_read_segment_descriptor,
     .handle_io = hvf_handle_io,
     .simulate_rdmsr = hvf_simulate_rdmsr,
@@ -490,6 +470,14 @@ static void hvf_cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     }
 }
 
+static void hvf_load_crs(CPUState *cs)
+{
+    X86CPU *x86_cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86_cpu->env;
+
+    env->cr[0] = rvmcs(cpu->accel->fd, VMCS_GUEST_CR0);
+    env->cr[3] = rvmcs(cpu->accel->fd, VMCS_GUEST_CR3);
+}
 void hvf_load_regs(CPUState *cs)
 {
     X86CPU *cpu = X86_CPU(cs);
@@ -802,6 +790,7 @@ static int hvf_handle_vmexit(CPUState *cpu)
             struct x86_decode decode;
 
             hvf_load_regs(cpu);
+            hvf_load_crs(cpu);
             decode_instruction(env, &decode);
             exec_instruction(env, &decode);
             hvf_store_regs(cpu);
@@ -843,6 +832,7 @@ static int hvf_handle_vmexit(CPUState *cpu)
         }
 
         hvf_load_regs(cpu);
+        hvf_load_crs(cpu);
         decode_instruction(env, &decode);
         assert(ins_len == decode.len);
         exec_instruction(env, &decode);
@@ -948,6 +938,7 @@ static int hvf_handle_vmexit(CPUState *cpu)
         struct x86_decode decode;
 
         hvf_load_regs(cpu);
+        hvf_load_crs(cpu);
         decode_instruction(env, &decode);
         exec_instruction(env, &decode);
         hvf_store_regs(cpu);
diff --git a/target/i386/hvf/x86.c b/target/i386/hvf/x86.c
index e98f480f411..7fe710aca3b 100644
--- a/target/i386/hvf/x86.c
+++ b/target/i386/hvf/x86.c
@@ -72,7 +72,7 @@ bool x86_read_segment_descriptor(CPUState *cpu,
         return false;
     }
 
-    vmx_read_mem(cpu, desc, base + sel.index * 8, sizeof(*desc));
+    x86_read_mem_priv(cpu, desc, base + sel.index * 8, sizeof(*desc));
     return true;
 }
 
@@ -95,7 +95,7 @@ bool x86_write_segment_descriptor(CPUState *cpu,
         printf("%s: gdt limit\n", __func__);
         return false;
     }
-    vmx_write_mem(cpu, base + sel.index * 8, desc, sizeof(*desc));
+    x86_write_mem_priv(cpu, desc, base + sel.index * 8, sizeof(*desc));
     return true;
 }
 
@@ -111,7 +111,7 @@ bool x86_read_call_gate(CPUState *cpu, struct x86_call_gate *idt_desc,
         return false;
     }
 
-    vmx_read_mem(cpu, idt_desc, base + gate * 8, sizeof(*idt_desc));
+    x86_read_mem_priv(cpu, idt_desc, base + gate * 8, sizeof(*idt_desc));
     return true;
 }
 
diff --git a/target/i386/hvf/x86_task.c b/target/i386/hvf/x86_task.c
index b1e541a6420..64e30e970d9 100644
--- a/target/i386/hvf/x86_task.c
+++ b/target/i386/hvf/x86_task.c
@@ -93,16 +93,16 @@ static int task_switch_32(CPUState *cpu, x86_segment_selector tss_sel, x86_segme
     uint32_t eip_offset = offsetof(struct x86_tss_segment32, eip);
     uint32_t ldt_sel_offset = offsetof(struct x86_tss_segment32, ldt);
 
-    vmx_read_mem(cpu, &tss_seg, old_tss_base, sizeof(tss_seg));
+    x86_read_mem_priv(cpu, &tss_seg, old_tss_base, sizeof(tss_seg));
     save_state_to_tss32(cpu, &tss_seg);
 
-    vmx_write_mem(cpu, old_tss_base + eip_offset, &tss_seg.eip, ldt_sel_offset - eip_offset);
-    vmx_read_mem(cpu, &tss_seg, new_tss_base, sizeof(tss_seg));
+    x86_write_mem_priv(cpu, &tss_seg.eip, old_tss_base + eip_offset, ldt_sel_offset - eip_offset);
+    x86_read_mem_priv(cpu, &tss_seg, new_tss_base, sizeof(tss_seg));
 
     if (old_tss_sel.sel != 0xffff) {
         tss_seg.prev_tss = old_tss_sel.sel;
 
-        vmx_write_mem(cpu, new_tss_base, &tss_seg.prev_tss, sizeof(tss_seg.prev_tss));
+        x86_write_mem_priv(cpu, &tss_seg.prev_tss, new_tss_base, sizeof(tss_seg.prev_tss));
     }
     load_state_from_tss32(cpu, &tss_seg);
     return 0;
diff --git a/target/i386/mshv/mshv-cpu.c b/target/i386/mshv/mshv-cpu.c
index f190e83bd15..2bc978deb25 100644
--- a/target/i386/mshv/mshv-cpu.c
+++ b/target/i386/mshv/mshv-cpu.c
@@ -1548,74 +1548,6 @@ int mshv_create_vcpu(int vm_fd, uint8_t vp_index, int *cpu_fd)
     return 0;
 }
 
-static int guest_mem_read_with_gva(const CPUState *cpu, uint64_t gva,
-                                   uint8_t *data, uintptr_t size,
-                                   bool fetch_instruction)
-{
-    int ret;
-    uint64_t gpa, flags;
-
-    flags = HV_TRANSLATE_GVA_VALIDATE_READ;
-    ret = translate_gva(cpu, gva, &gpa, flags);
-    if (ret < 0) {
-        error_report("failed to translate gva to gpa");
-        return -1;
-    }
-
-    ret = mshv_guest_mem_read(gpa, data, size, false, fetch_instruction);
-    if (ret < 0) {
-        error_report("failed to read from guest memory");
-        return -1;
-    }
-
-    return 0;
-}
-
-static int guest_mem_write_with_gva(const CPUState *cpu, uint64_t gva,
-                                    const uint8_t *data, uintptr_t size)
-{
-    int ret;
-    uint64_t gpa, flags;
-
-    flags = HV_TRANSLATE_GVA_VALIDATE_WRITE;
-    ret = translate_gva(cpu, gva, &gpa, flags);
-    if (ret < 0) {
-        error_report("failed to translate gva to gpa");
-        return -1;
-    }
-    ret = mshv_guest_mem_write(gpa, data, size, false);
-    if (ret < 0) {
-        error_report("failed to write to guest memory");
-        return -1;
-    }
-    return 0;
-}
-
-static void write_mem(CPUState *cpu, void *data, target_ulong addr, int bytes)
-{
-    if (guest_mem_write_with_gva(cpu, addr, data, bytes) < 0) {
-        error_report("failed to write memory");
-        abort();
-    }
-}
-
-static void fetch_instruction(CPUState *cpu, void *data,
-                              target_ulong addr, int bytes)
-{
-    if (guest_mem_read_with_gva(cpu, addr, data, bytes, true) < 0) {
-        error_report("failed to fetch instruction");
-        abort();
-    }
-}
-
-static void read_mem(CPUState *cpu, void *data, target_ulong addr, int bytes)
-{
-    if (guest_mem_read_with_gva(cpu, addr, data, bytes, false) < 0) {
-        error_report("failed to read memory");
-        abort();
-    }
-}
-
 static void read_segment_descriptor(CPUState *cpu,
                                     struct x86_segment_descriptor *desc,
                                     enum X86Seg seg_idx)
@@ -1634,9 +1566,6 @@ static void read_segment_descriptor(CPUState *cpu,
 }
 
 static const struct x86_emul_ops mshv_x86_emul_ops = {
-    .fetch_instruction = fetch_instruction,
-    .read_mem = read_mem,
-    .write_mem = write_mem,
     .read_segment_descriptor = read_segment_descriptor,
 };
 
diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index ab583e922d4..561a48206ca 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -862,16 +862,6 @@ static int whpx_handle_portio(CPUState *cpu,
     return 0;
 }
 
-static void write_mem(CPUState *cpu, void *data, target_ulong addr, int bytes)
-{
-    vmx_write_mem(cpu, addr, data, bytes);
-}
-
-static void read_mem(CPUState *cpu, void *data, target_ulong addr, int bytes)
-{
-    vmx_read_mem(cpu, data, addr, bytes);
-}
-
 static void read_segment_descriptor(CPUState *cpu,
                                     struct x86_segment_descriptor *desc,
                                     enum X86Seg seg_idx)
@@ -891,8 +881,6 @@ static void read_segment_descriptor(CPUState *cpu,
 
 
 static const struct x86_emul_ops whpx_x86_emul_ops = {
-    .read_mem = read_mem,
-    .write_mem = write_mem,
     .read_segment_descriptor = read_segment_descriptor,
     .handle_io = handle_io
 };
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 021/102] hvf: i386: save/restore CR0/2/3
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (19 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 020/102] target/i386: emulate, hvf, mshv: rework MMU code Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 022/102] target/i386: emulate: get rid of write_val_to_mem() helper Paolo Bonzini
                   ` (80 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

For symmetry, save/restore the same set of registers even when not needed.

CR2 save/restore needed as page faults injected to the guest imply modifying CR2.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-21-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/hvf/hvf.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index fb039ff7bd5..a70f8461b04 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -477,7 +477,19 @@ static void hvf_load_crs(CPUState *cs)
 
     env->cr[0] = rvmcs(cpu->accel->fd, VMCS_GUEST_CR0);
     env->cr[3] = rvmcs(cpu->accel->fd, VMCS_GUEST_CR3);
+    env->cr[2] = rreg(cpu->accel->fd, HV_X86_CR2);
 }
+
+static void hvf_save_crs(CPUState *cs)
+{
+    X86CPU *x86_cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86_cpu->env;
+
+    wvmcs(cpu->accel->fd, VMCS_GUEST_CR0, env->cr[0]);
+    wvmcs(cpu->accel->fd, VMCS_GUEST_CR3, env->cr[3]);
+    wreg(cs->accel->fd, HV_X86_CR2, env->cr[2]);
+}
+
 void hvf_load_regs(CPUState *cs)
 {
     X86CPU *cpu = X86_CPU(cs);
@@ -794,6 +806,7 @@ static int hvf_handle_vmexit(CPUState *cpu)
             decode_instruction(env, &decode);
             exec_instruction(env, &decode);
             hvf_store_regs(cpu);
+            hvf_save_crs(cpu);
             break;
         }
         break;
@@ -837,6 +850,7 @@ static int hvf_handle_vmexit(CPUState *cpu)
         assert(ins_len == decode.len);
         exec_instruction(env, &decode);
         hvf_store_regs(cpu);
+        hvf_save_crs(cpu);
 
         break;
     }
@@ -942,6 +956,7 @@ static int hvf_handle_vmexit(CPUState *cpu)
         decode_instruction(env, &decode);
         exec_instruction(env, &decode);
         hvf_store_regs(cpu);
+        hvf_save_crs(cpu);
         break;
     }
     case EXIT_REASON_TPR: {
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 022/102] target/i386: emulate: get rid of write_val_to_mem() helper
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (20 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 021/102] hvf: i386: save/restore CR0/2/3 Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 023/102] target/i386: emulate: raise an exception on translation fault Paolo Bonzini
                   ` (79 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-22-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/emulate/x86_emu.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/target/i386/emulate/x86_emu.c b/target/i386/emulate/x86_emu.c
index cfa35561dd5..3aedd638a10 100644
--- a/target/i386/emulate/x86_emu.c
+++ b/target/i386/emulate/x86_emu.c
@@ -164,17 +164,12 @@ void write_val_to_reg(void *reg_ptr, target_ulong val, int size)
     }
 }
 
-static void write_val_to_mem(CPUX86State *env, target_ulong ptr, target_ulong val, int size)
-{
-    x86_write_mem(env_cpu(env), &val, ptr, size);
-}
-
 void write_val_ext(CPUX86State *env, struct x86_decode_op *decode, target_ulong val, int size)
 {
     if (decode->type == X86_VAR_REG) {
         write_val_to_reg(decode->regptr, val, size);
     } else {
-        write_val_to_mem(env, decode->addr, val, size);
+        x86_write_mem(env_cpu(env), &val, decode->addr, size);
     }
 }
 
@@ -548,7 +543,7 @@ static void exec_movs_single(CPUX86State *env, struct x86_decode *decode)
                                 decode->addressing_size, R_ES);
 
     val = read_val_from_mem(env, src_addr, decode->operand_size);
-    write_val_to_mem(env, dst_addr, val, decode->operand_size);
+    x86_write_mem(env_cpu(env), &val, dst_addr, decode->operand_size);
 
     string_increment_reg(env, R_ESI, decode);
     string_increment_reg(env, R_EDI, decode);
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 023/102] target/i386: emulate: raise an exception on translation fault
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (21 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 022/102] target/i386: emulate: get rid of write_val_to_mem() helper Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 024/102] target/i386: emulate: remove fetch_instruction helper too Paolo Bonzini
                   ` (78 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-23-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/emulate/x86_mmu.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/target/i386/emulate/x86_mmu.c b/target/i386/emulate/x86_mmu.c
index 11e17c2db1d..8261ca16351 100644
--- a/target/i386/emulate/x86_mmu.c
+++ b/target/i386/emulate/x86_mmu.c
@@ -240,8 +240,29 @@ MMUTranslateResult mmu_gva_to_gpa(CPUState *cpu, target_ulong gva, uint64_t *gpa
     return res;
 }
 
+static int translate_res_to_error_code(MMUTranslateResult res, bool is_write, bool is_user)
+{
+    int error_code = 0;
+    if (is_user) {
+        error_code |= PG_ERROR_U_MASK;
+    }
+    if (!(res & MMU_TRANSLATE_PAGE_NOT_MAPPED)) {
+        error_code |= PG_ERROR_P_MASK;
+    }
+    if (is_write && (res & MMU_TRANSLATE_PRIV_VIOLATION)) {
+        error_code |= PG_ERROR_W_MASK;
+    }
+    if (res & MMU_TRANSLATE_INVALID_PT_FLAGS) {
+        error_code |= PG_ERROR_RSVD_MASK;
+    }
+    return error_code;
+}
+
 static MMUTranslateResult x86_write_mem_ex(CPUState *cpu, void *data, target_ulong gva, int bytes, bool priv_check_exempt)
 {
+    X86CPU *x86_cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86_cpu->env;
+
     MMUTranslateResult translate_res = MMU_TRANSLATE_SUCCESS;
     MemTxResult mem_tx_res;
     uint64_t gpa;
@@ -252,6 +273,9 @@ static MMUTranslateResult x86_write_mem_ex(CPUState *cpu, void *data, target_ulo
 
         translate_res = mmu_gva_to_gpa(cpu, gva, &gpa, MMU_TRANSLATE_VALIDATE_WRITE);
         if (translate_res) {
+            int error_code = translate_res_to_error_code(translate_res, true, is_user(cpu));
+            env->cr[2] = gva;
+            x86_emul_raise_exception(env, EXCP0E_PAGE, error_code);
             return translate_res;
         }
 
@@ -284,6 +308,9 @@ MMUTranslateResult x86_write_mem_priv(CPUState *cpu, void *data, target_ulong gv
 
 static MMUTranslateResult x86_read_mem_ex(CPUState *cpu, void *data, target_ulong gva, int bytes, bool priv_check_exempt)
 {
+    X86CPU *x86_cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86_cpu->env;
+
     MMUTranslateResult translate_res = MMU_TRANSLATE_SUCCESS;
     MemTxResult mem_tx_res;
     uint64_t gpa;
@@ -294,6 +321,9 @@ static MMUTranslateResult x86_read_mem_ex(CPUState *cpu, void *data, target_ulon
 
         translate_res = mmu_gva_to_gpa(cpu, gva, &gpa, 0);
         if (translate_res) {
+            int error_code = translate_res_to_error_code(translate_res, false, is_user(cpu));
+            env->cr[2] = gva;
+            x86_emul_raise_exception(env, EXCP0E_PAGE, error_code);
             return translate_res;
         }
         mem_tx_res = address_space_read(&address_space_memory, gpa, MEMTXATTRS_UNSPECIFIED,
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 024/102] target/i386: emulate: remove fetch_instruction helper too
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (22 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 023/102] target/i386: emulate: raise an exception on translation fault Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 025/102] target/i386: emulate: propagate memory errors on most reads/writes Paolo Bonzini
                   ` (77 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Not used anymore.
Link: https://lore.kernel.org/r/20260223233950.96076-24-mohamed@unpredictable.fr

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/emulate/x86_emu.h    | 2 --
 target/i386/emulate/x86_decode.c | 6 +-----
 2 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/target/i386/emulate/x86_emu.h b/target/i386/emulate/x86_emu.h
index 3e485b8ca36..6b691118221 100644
--- a/target/i386/emulate/x86_emu.h
+++ b/target/i386/emulate/x86_emu.h
@@ -25,8 +25,6 @@
 #include "cpu.h"
 
 struct x86_emul_ops {
-    void (*fetch_instruction)(CPUState *cpu, void *data, target_ulong addr,
-                              int bytes);
     MMUTranslateResult (*mmu_gva_to_gpa) (CPUState *cpu, target_ulong gva, uint64_t *gpa, MMUTranslateFlags flags);
     void (*read_segment_descriptor)(CPUState *cpu, struct x86_segment_descriptor *desc,
                                     enum X86Seg seg);
diff --git a/target/i386/emulate/x86_decode.c b/target/i386/emulate/x86_decode.c
index 9faa65a5797..bae1dd4d6f8 100644
--- a/target/i386/emulate/x86_decode.c
+++ b/target/i386/emulate/x86_decode.c
@@ -77,11 +77,7 @@ static inline uint64_t decode_bytes(CPUX86State *env, struct x86_decode *decode,
         memcpy(&val, decode->stream->bytes + decode->len, size);
     } else {
         target_ulong va = linear_rip(env_cpu(env), env->eip) + decode->len;
-        if (emul_ops->fetch_instruction) {
-            emul_ops->fetch_instruction(env_cpu(env), &val, va, size);
-        } else {
-            x86_read_mem(env_cpu(env), &val, va, size);
-        }
+        x86_read_mem(env_cpu(env), &val, va, size);
     }
     decode->len += size;
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 025/102] target/i386: emulate: propagate memory errors on most reads/writes
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (23 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 024/102] target/i386: emulate: remove fetch_instruction helper too Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 026/102] whpx: i386: inject exceptions Paolo Bonzini
                   ` (76 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Use that to not bump RIP for those cases.

Warn on read/write from/to unmapped MMIO, but not consider that as an exception.
For reads, return 0xFF(s) as the register value in that case.

Leaves a coverage gap for read_val_ext(), to be handled in a later commit.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-25-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/emulate/x86_emu.c | 119 +++++++++++++++++++++++++---------
 1 file changed, 88 insertions(+), 31 deletions(-)

diff --git a/target/i386/emulate/x86_emu.c b/target/i386/emulate/x86_emu.c
index 3aedd638a10..ec6bc798a42 100644
--- a/target/i386/emulate/x86_emu.c
+++ b/target/i386/emulate/x86_emu.c
@@ -36,11 +36,14 @@
 /////////////////////////////////////////////////////////////////////////
 
 #include "qemu/osdep.h"
+#include "qemu/error-report.h"
 #include "panic.h"
 #include "x86_decode.h"
 #include "x86.h"
 #include "x86_emu.h"
 #include "x86_flags.h"
+#include "x86_mmu.h"
+
 
 #define EXEC_2OP_FLAGS_CMD(env, decode, cmd, FLAGS_FUNC, save_res) \
 {                                                       \
@@ -175,43 +178,56 @@ void write_val_ext(CPUX86State *env, struct x86_decode_op *decode, target_ulong
 
 uint8_t *read_mmio(CPUX86State *env, target_ulong ptr, int bytes)
 {
-    x86_read_mem(env_cpu(env), env->emu_mmio_buf, ptr, bytes);
+    MMUTranslateResult res = x86_read_mem(env_cpu(env), env->emu_mmio_buf, ptr, bytes);
+    if (res) {
+        if (res == MMU_TRANSLATE_GPA_UNMAPPED) {
+            memset(env->emu_mmio_buf, 0xFF, bytes);
+            return env->emu_mmio_buf;
+        }
+        return NULL;
+    }
     return env->emu_mmio_buf;
 }
 
 
-static target_ulong read_val_from_mem(CPUX86State *env, target_long ptr, int size)
+static bool read_val_from_mem(CPUX86State *env, target_long ptr, int size, target_ulong* val)
 {
-    target_ulong val;
     uint8_t *mmio_ptr;
 
     mmio_ptr = read_mmio(env, ptr, size);
+    if (mmio_ptr == NULL) {
+        return 1;
+    }
     switch (size) {
     case 1:
-        val = *(uint8_t *)mmio_ptr;
+        *val = *(uint8_t *)mmio_ptr;
         break;
     case 2:
-        val = *(uint16_t *)mmio_ptr;
+        *val = *(uint16_t *)mmio_ptr;
         break;
     case 4:
-        val = *(uint32_t *)mmio_ptr;
+        *val = *(uint32_t *)mmio_ptr;
         break;
     case 8:
-        val = *(uint64_t *)mmio_ptr;
+        *val = *(uint64_t *)mmio_ptr;
         break;
     default:
         VM_PANIC("bad size\n");
         break;
     }
-    return val;
+    return 0;
 }
 
 target_ulong read_val_ext(CPUX86State *env, struct x86_decode_op *decode, int size)
 {
+    target_ulong val;
     if (decode->type == X86_VAR_REG) {
         return read_val_from_reg(decode->regptr, size);
     } else {
-        return read_val_from_mem(env, decode->addr, size);
+        if (read_val_from_mem(env, decode->addr, size, &val)) {
+            error_report("target/i386/emulate: read_val_ext: reading from unmapped address.");
+        }
+        return val;
     }
 }
 
@@ -465,15 +481,17 @@ static inline int get_ZF(CPUX86State *env) {
     return env->cc_dst ? 0 : CC_Z;
 }
 
-static inline void string_rep(CPUX86State *env, struct x86_decode *decode,
-                              void (*func)(CPUX86State *env,
+static inline bool string_rep(CPUX86State *env, struct x86_decode *decode,
+                              bool (*func)(CPUX86State *env,
                                            struct x86_decode *ins), int rep)
 {
     target_ulong rcx = read_reg(env, R_ECX, decode->addressing_size);
 
     while (rcx != 0) {
         bool is_cmps_or_scas = decode->cmd == X86_DECODE_CMD_CMPS || decode->cmd == X86_DECODE_CMD_SCAS;
-        func(env, decode);
+        if (func(env, decode)) {
+            return 1;
+        }
         rcx--;
         write_reg(env, R_ECX, rcx, decode->addressing_size);
         if ((PREFIX_REP == rep) && !get_ZF(env) && is_cmps_or_scas) {
@@ -483,33 +501,44 @@ static inline void string_rep(CPUX86State *env, struct x86_decode *decode,
             break;
         }
     }
+    return 0;
 }
 
-static void exec_ins_single(CPUX86State *env, struct x86_decode *decode)
+static bool exec_ins_single(CPUX86State *env, struct x86_decode *decode)
 {
+    MMUTranslateResult res;
+
     target_ulong addr = linear_addr_size(env_cpu(env), RDI(env),
                                          decode->addressing_size, R_ES);
 
     emul_ops->handle_io(env_cpu(env), DX(env), env->emu_mmio_buf, 0,
                         decode->operand_size, 1);
-    x86_write_mem(env_cpu(env), env->emu_mmio_buf, addr,
+    res = x86_write_mem(env_cpu(env), env->emu_mmio_buf, addr,
                         decode->operand_size);
+    if (res) {
+        return 1;
+    }
 
     string_increment_reg(env, R_EDI, decode);
+    return 0;
 }
 
 static void exec_ins(CPUX86State *env, struct x86_decode *decode)
 {
+    bool res;
     if (decode->rep) {
-        string_rep(env, decode, exec_ins_single, 0);
+        res = string_rep(env, decode, exec_ins_single, 0);
     } else {
-        exec_ins_single(env, decode);
+        res = exec_ins_single(env, decode);
     }
 
+    if (res) {
+        return;
+    }
     env->eip += decode->len;
 }
 
-static void exec_outs_single(CPUX86State *env, struct x86_decode *decode)
+static bool exec_outs_single(CPUX86State *env, struct x86_decode *decode)
 {
     target_ulong addr = decode_linear_addr(env, decode, RSI(env), R_DS);
 
@@ -519,48 +548,64 @@ static void exec_outs_single(CPUX86State *env, struct x86_decode *decode)
                         decode->operand_size, 1);
 
     string_increment_reg(env, R_ESI, decode);
+    return 0;
 }
 
 static void exec_outs(CPUX86State *env, struct x86_decode *decode)
 {
+    bool res;
     if (decode->rep) {
-        string_rep(env, decode, exec_outs_single, 0);
+        res = string_rep(env, decode, exec_outs_single, 0);
     } else {
-        exec_outs_single(env, decode);
+        res = exec_outs_single(env, decode);
     }
 
+    if (res) {
+        return;
+    }
     env->eip += decode->len;
 }
 
-static void exec_movs_single(CPUX86State *env, struct x86_decode *decode)
+static bool exec_movs_single(CPUX86State *env, struct x86_decode *decode)
 {
     target_ulong src_addr;
     target_ulong dst_addr;
     target_ulong val;
+    MMUTranslateResult res;
 
     src_addr = decode_linear_addr(env, decode, RSI(env), R_DS);
     dst_addr = linear_addr_size(env_cpu(env), RDI(env),
                                 decode->addressing_size, R_ES);
 
-    val = read_val_from_mem(env, src_addr, decode->operand_size);
-    x86_write_mem(env_cpu(env), &val, dst_addr, decode->operand_size);
+    if (read_val_from_mem(env, src_addr, decode->operand_size, &val)) {
+        return 1;
+    }
+    res = x86_write_mem(env_cpu(env), &val, dst_addr, decode->operand_size);
+    if (res) {
+        return 1;
+    }
 
     string_increment_reg(env, R_ESI, decode);
     string_increment_reg(env, R_EDI, decode);
+    return 0;
 }
 
 static void exec_movs(CPUX86State *env, struct x86_decode *decode)
 {
+    bool res;
     if (decode->rep) {
-        string_rep(env, decode, exec_movs_single, 0);
+        res = string_rep(env, decode, exec_movs_single, 0);
     } else {
-        exec_movs_single(env, decode);
+        res = exec_movs_single(env, decode);
     }
 
+    if (res) {
+        return;
+    }
     env->eip += decode->len;
 }
 
-static void exec_cmps_single(CPUX86State *env, struct x86_decode *decode)
+static bool exec_cmps_single(CPUX86State *env, struct x86_decode *decode)
 {
     target_ulong src_addr;
     target_ulong dst_addr;
@@ -570,14 +615,19 @@ static void exec_cmps_single(CPUX86State *env, struct x86_decode *decode)
                                 decode->addressing_size, R_ES);
 
     decode->op[0].type = X86_VAR_IMMEDIATE;
-    decode->op[0].val = read_val_from_mem(env, src_addr, decode->operand_size);
+    if (read_val_from_mem(env, src_addr, decode->operand_size, &decode->op[0].val)) {
+        return 1;
+    }
     decode->op[1].type = X86_VAR_IMMEDIATE;
-    decode->op[1].val = read_val_from_mem(env, dst_addr, decode->operand_size);
+    if (read_val_from_mem(env, dst_addr, decode->operand_size, &decode->op[1].val)) {
+        return 1;
+    }
 
     EXEC_2OP_FLAGS_CMD(env, decode, -, SET_FLAGS_OSZAPC_SUB, false);
 
     string_increment_reg(env, R_ESI, decode);
     string_increment_reg(env, R_EDI, decode);
+    return 0;
 }
 
 static void exec_cmps(CPUX86State *env, struct x86_decode *decode)
@@ -591,17 +641,22 @@ static void exec_cmps(CPUX86State *env, struct x86_decode *decode)
 }
 
 
-static void exec_stos_single(CPUX86State *env, struct x86_decode *decode)
+static bool exec_stos_single(CPUX86State *env, struct x86_decode *decode)
 {
     target_ulong addr;
     target_ulong val;
+    MMUTranslateResult res;
 
     addr = linear_addr_size(env_cpu(env), RDI(env),
                             decode->addressing_size, R_ES);
     val = read_reg(env, R_EAX, decode->operand_size);
-    x86_write_mem(env_cpu(env), &val, addr, decode->operand_size);
+    res = x86_write_mem(env_cpu(env), &val, addr, decode->operand_size);
+    if (res) {
+        return 1;
+    }
 
     string_increment_reg(env, R_EDI, decode);
+    return 0;
 }
 
 
@@ -616,7 +671,7 @@ static void exec_stos(CPUX86State *env, struct x86_decode *decode)
     env->eip += decode->len;
 }
 
-static void exec_scas_single(CPUX86State *env, struct x86_decode *decode)
+static bool exec_scas_single(CPUX86State *env, struct x86_decode *decode)
 {
     target_ulong addr;
 
@@ -627,6 +682,7 @@ static void exec_scas_single(CPUX86State *env, struct x86_decode *decode)
 
     EXEC_2OP_FLAGS_CMD(env, decode, -, SET_FLAGS_OSZAPC_SUB, false);
     string_increment_reg(env, R_EDI, decode);
+    return 0;
 }
 
 static void exec_scas(CPUX86State *env, struct x86_decode *decode)
@@ -642,7 +698,7 @@ static void exec_scas(CPUX86State *env, struct x86_decode *decode)
     env->eip += decode->len;
 }
 
-static void exec_lods_single(CPUX86State *env, struct x86_decode *decode)
+static bool exec_lods_single(CPUX86State *env, struct x86_decode *decode)
 {
     target_ulong addr;
     target_ulong val = 0;
@@ -652,6 +708,7 @@ static void exec_lods_single(CPUX86State *env, struct x86_decode *decode)
     write_reg(env, R_EAX, val, decode->operand_size);
 
     string_increment_reg(env, R_ESI, decode);
+    return 0;
 }
 
 static void exec_lods(CPUX86State *env, struct x86_decode *decode)
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 026/102] whpx: i386: inject exceptions
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (24 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 025/102] target/i386: emulate: propagate memory errors on most reads/writes Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 027/102] whpx: i386: bump to x2apic Paolo Bonzini
                   ` (75 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-26-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/whpx/whpx-all.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index 561a48206ca..0259782a822 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -1506,6 +1506,26 @@ static void whpx_vcpu_process_async_events(CPUState *cpu)
     }
 }
 
+static void whpx_inject_exceptions(CPUState* cpu)
+{
+    X86CPU *x86_cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86_cpu->env;
+
+    if (env->exception_injected) {
+        env->exception_injected = 0;
+        WHV_REGISTER_VALUE reg = {};
+        reg.ExceptionEvent.EventPending = 1;
+        reg.ExceptionEvent.EventType = WHvX64PendingEventException;
+        reg.ExceptionEvent.DeliverErrorCode = 1;
+        reg.ExceptionEvent.Vector = env->exception_nr;
+        reg.ExceptionEvent.ErrorCode = env->error_code;
+        if (env->exception_nr == EXCP0E_PAGE) {
+            reg.ExceptionEvent.ExceptionParameter = env->cr[2];
+        }
+        whpx_set_reg(cpu, WHvRegisterPendingEvent, reg);
+    }
+}
+
 int whpx_vcpu_run(CPUState *cpu)
 {
     HRESULT hr;
@@ -1600,6 +1620,8 @@ int whpx_vcpu_run(CPUState *cpu)
             whpx_vcpu_configure_single_stepping(cpu, true, NULL);
         }
 
+        whpx_inject_exceptions(cpu);
+
         hr = whp_dispatch.WHvRunVirtualProcessor(
             whpx->partition, cpu->cpu_index,
             &vcpu->exit_ctx, sizeof(vcpu->exit_ctx));
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 027/102] whpx: i386: bump to x2apic
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (25 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 026/102] whpx: i386: inject exceptions Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 028/102] whpx: i386: ignore send_msi to interrupt vector 0 Paolo Bonzini
                   ` (74 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-27-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/whpx/whpx-all.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index 0259782a822..d98619facee 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -2088,7 +2088,7 @@ int whpx_accel_init(AccelState *as, MachineState *ms)
     if (whpx->kernel_irqchip_allowed && features.LocalApicEmulation &&
         whp_dispatch.WHvSetVirtualProcessorInterruptControllerState2) {
         WHV_X64_LOCAL_APIC_EMULATION_MODE mode =
-            WHvX64LocalApicEmulationModeXApic;
+            WHvX64LocalApicEmulationModeX2Apic;
         hr = whp_dispatch.WHvSetPartitionProperty(
             whpx->partition,
             WHvPartitionPropertyCodeLocalApicEmulationMode,
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 028/102] whpx: i386: ignore send_msi to interrupt vector 0
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (26 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 027/102] whpx: i386: bump to x2apic Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 029/102] target/i386: emulate: propagate errors all the way and stop early Paolo Bonzini
                   ` (73 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-28-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/whpx/whpx-apic.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/target/i386/whpx/whpx-apic.c b/target/i386/whpx/whpx-apic.c
index b934fdcbe19..f26ecaf6e83 100644
--- a/target/i386/whpx/whpx-apic.c
+++ b/target/i386/whpx/whpx-apic.c
@@ -192,6 +192,11 @@ static void whpx_send_msi(MSIMessage *msg)
     uint8_t trigger_mode = (data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
     uint8_t delivery = (data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x7;
 
+    if (vector == 0) {
+        warn_report("Ignoring request for interrupt vector 0");
+        return;
+    }
+
     WHV_INTERRUPT_CONTROL interrupt = {
         /* Values correspond to delivery modes */
         .Type = delivery,
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 029/102] target/i386: emulate: propagate errors all the way and stop early
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (27 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 028/102] whpx: i386: ignore send_msi to interrupt vector 0 Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 030/102] accel/kvm: Don't clear pending #SMI in kvm_get_vcpu_events Paolo Bonzini
                   ` (72 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

This ended up being a bigger patch than I thought it'd be...

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-29-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/emulate/x86_emu.h |  18 +--
 target/i386/emulate/x86_emu.c | 227 ++++++++++++++++++++++------------
 2 files changed, 160 insertions(+), 85 deletions(-)

diff --git a/target/i386/emulate/x86_emu.h b/target/i386/emulate/x86_emu.h
index 6b691118221..0f284b0c3d1 100644
--- a/target/i386/emulate/x86_emu.h
+++ b/target/i386/emulate/x86_emu.h
@@ -44,15 +44,15 @@ target_ulong read_reg(CPUX86State *env, int reg, int size);
 void write_reg(CPUX86State *env, int reg, target_ulong val, int size);
 target_ulong read_val_from_reg(void *reg_ptr, int size);
 void write_val_to_reg(void *reg_ptr, target_ulong val, int size);
-void write_val_ext(CPUX86State *env, struct x86_decode_op *decode, target_ulong val, int size);
+bool write_val_ext(CPUX86State *env, struct x86_decode_op *decode, target_ulong val, int size);
 uint8_t *read_mmio(CPUX86State *env, target_ulong ptr, int bytes);
-target_ulong read_val_ext(CPUX86State *env, struct x86_decode_op *decode, int size);
+bool read_val_ext(CPUX86State *env, struct x86_decode_op *decode, int size, target_ulong* val);
 
-void exec_movzx(CPUX86State *env, struct x86_decode *decode);
-void exec_shl(CPUX86State *env, struct x86_decode *decode);
-void exec_movsx(CPUX86State *env, struct x86_decode *decode);
-void exec_ror(CPUX86State *env, struct x86_decode *decode);
-void exec_rol(CPUX86State *env, struct x86_decode *decode);
-void exec_rcl(CPUX86State *env, struct x86_decode *decode);
-void exec_rcr(CPUX86State *env, struct x86_decode *decode);
+bool exec_movzx(CPUX86State *env, struct x86_decode *decode);
+bool exec_shl(CPUX86State *env, struct x86_decode *decode);
+bool exec_movsx(CPUX86State *env, struct x86_decode *decode);
+bool exec_ror(CPUX86State *env, struct x86_decode *decode);
+bool exec_rol(CPUX86State *env, struct x86_decode *decode);
+bool exec_rcl(CPUX86State *env, struct x86_decode *decode);
+bool exec_rcr(CPUX86State *env, struct x86_decode *decode);
 #endif
diff --git a/target/i386/emulate/x86_emu.c b/target/i386/emulate/x86_emu.c
index ec6bc798a42..8d35f3338c1 100644
--- a/target/i386/emulate/x86_emu.c
+++ b/target/i386/emulate/x86_emu.c
@@ -47,7 +47,9 @@
 
 #define EXEC_2OP_FLAGS_CMD(env, decode, cmd, FLAGS_FUNC, save_res) \
 {                                                       \
-    fetch_operands(env, decode, 2, true, true, false);  \
+    if (fetch_operands(env, decode, 2, true, true, false))  {\
+        return 1; \
+    }\
     switch (decode->operand_size) {                     \
     case 1:                                         \
     {                                               \
@@ -55,7 +57,7 @@
         uint8_t v2 = (uint8_t)decode->op[1].val;    \
         uint8_t diff = v1 cmd v2;                   \
         if (save_res) {                              \
-            write_val_ext(env, &decode->op[0], diff, 1);  \
+            if (write_val_ext(env, &decode->op[0], diff, 1)) { return 1; }  \
         } \
         FLAGS_FUNC##8(env, v1, v2, diff);           \
         break;                                      \
@@ -66,7 +68,7 @@
         uint16_t v2 = (uint16_t)decode->op[1].val;  \
         uint16_t diff = v1 cmd v2;                  \
         if (save_res) {                              \
-            write_val_ext(env, &decode->op[0], diff, 2); \
+            if (write_val_ext(env, &decode->op[0], diff, 2)) { return 1; } \
         } \
         FLAGS_FUNC##16(env, v1, v2, diff);          \
         break;                                      \
@@ -77,7 +79,7 @@
         uint32_t v2 = (uint32_t)decode->op[1].val;  \
         uint32_t diff = v1 cmd v2;                  \
         if (save_res) {                              \
-            write_val_ext(env, &decode->op[0], diff, 4); \
+            if (write_val_ext(env, &decode->op[0], diff, 4)) { return 1; } \
         } \
         FLAGS_FUNC##32(env, v1, v2, diff);          \
         break;                                      \
@@ -167,13 +169,20 @@ void write_val_to_reg(void *reg_ptr, target_ulong val, int size)
     }
 }
 
-void write_val_ext(CPUX86State *env, struct x86_decode_op *decode, target_ulong val, int size)
+bool write_val_ext(CPUX86State *env, struct x86_decode_op *decode, target_ulong val, int size)
 {
     if (decode->type == X86_VAR_REG) {
         write_val_to_reg(decode->regptr, val, size);
     } else {
-        x86_write_mem(env_cpu(env), &val, decode->addr, size);
+        MMUTranslateResult res = x86_write_mem(env_cpu(env), &val, decode->addr, size);
+        if (res) {
+            if (res == MMU_TRANSLATE_GPA_UNMAPPED) {
+                return 0;
+            }
+            return 1;
+        }
     }
+    return 0;
 }
 
 uint8_t *read_mmio(CPUX86State *env, target_ulong ptr, int bytes)
@@ -218,20 +227,19 @@ static bool read_val_from_mem(CPUX86State *env, target_long ptr, int size, targe
     return 0;
 }
 
-target_ulong read_val_ext(CPUX86State *env, struct x86_decode_op *decode, int size)
+bool read_val_ext(CPUX86State *env, struct x86_decode_op *decode, int size, target_ulong* val)
 {
-    target_ulong val;
     if (decode->type == X86_VAR_REG) {
-        return read_val_from_reg(decode->regptr, size);
+        *val = read_val_from_reg(decode->regptr, size);
     } else {
-        if (read_val_from_mem(env, decode->addr, size, &val)) {
-            error_report("target/i386/emulate: read_val_ext: reading from unmapped address.");
+        if (read_val_from_mem(env, decode->addr, size, val)) {
+            return 1;
         }
-        return val;
     }
+    return 0;
 }
 
-static void fetch_operands(CPUX86State *env, struct x86_decode *decode,
+static bool fetch_operands(CPUX86State *env, struct x86_decode *decode,
                            int n, bool val_op0, bool val_op1, bool val_op2)
 {
     int i;
@@ -251,8 +259,10 @@ static void fetch_operands(CPUX86State *env, struct x86_decode *decode,
         case X86_VAR_RM:
             calc_modrm_operand(env, decode, &decode->op[i]);
             if (calc_val[i]) {
-                decode->op[i].val = read_val_ext(env, &decode->op[i],
-                                                 decode->operand_size);
+                if (read_val_ext(env, &decode->op[i],decode->operand_size,
+                                                            &decode->op[i].val)) {
+                    return 1;
+                }
             }
             break;
         case X86_VAR_OFFSET:
@@ -260,68 +270,81 @@ static void fetch_operands(CPUX86State *env, struct x86_decode *decode,
                                                     decode->op[i].addr,
                                                     R_DS);
             if (calc_val[i]) {
-                decode->op[i].val = read_val_ext(env, &decode->op[i],
-                                                 decode->operand_size);
+                if (read_val_ext(env, &decode->op[i], decode->operand_size,
+                                                 &decode->op[i].val)) {
+                    return 1;
+                }
             }
             break;
         default:
             break;
         }
     }
+    return 0;
 }
 
-static void exec_mov(CPUX86State *env, struct x86_decode *decode)
+static bool exec_mov(CPUX86State *env, struct x86_decode *decode)
 {
     fetch_operands(env, decode, 2, false, true, false);
-    write_val_ext(env, &decode->op[0], decode->op[1].val,
-                  decode->operand_size);
+    if (write_val_ext(env, &decode->op[0], decode->op[1].val,
+                  decode->operand_size)) {
+        return 1;
+    }
 
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_add(CPUX86State *env, struct x86_decode *decode)
+static bool exec_add(CPUX86State *env, struct x86_decode *decode)
 {
     EXEC_2OP_FLAGS_CMD(env, decode, +, SET_FLAGS_OSZAPC_ADD, true);
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_or(CPUX86State *env, struct x86_decode *decode)
+static bool exec_or(CPUX86State *env, struct x86_decode *decode)
 {
     EXEC_2OP_FLAGS_CMD(env, decode, |, SET_FLAGS_OSZAPC_LOGIC, true);
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_adc(CPUX86State *env, struct x86_decode *decode)
+static bool exec_adc(CPUX86State *env, struct x86_decode *decode)
 {
     EXEC_2OP_FLAGS_CMD(env, decode, +get_CF(env)+, SET_FLAGS_OSZAPC_ADD, true);
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_sbb(CPUX86State *env, struct x86_decode *decode)
+static bool exec_sbb(CPUX86State *env, struct x86_decode *decode)
 {
     EXEC_2OP_FLAGS_CMD(env, decode, -get_CF(env)-, SET_FLAGS_OSZAPC_SUB, true);
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_and(CPUX86State *env, struct x86_decode *decode)
+static bool exec_and(CPUX86State *env, struct x86_decode *decode)
 {
     EXEC_2OP_FLAGS_CMD(env, decode, &, SET_FLAGS_OSZAPC_LOGIC, true);
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_sub(CPUX86State *env, struct x86_decode *decode)
+static bool exec_sub(CPUX86State *env, struct x86_decode *decode)
 {
     EXEC_2OP_FLAGS_CMD(env, decode, -, SET_FLAGS_OSZAPC_SUB, true);
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_xor(CPUX86State *env, struct x86_decode *decode)
+static bool exec_xor(CPUX86State *env, struct x86_decode *decode)
 {
     EXEC_2OP_FLAGS_CMD(env, decode, ^, SET_FLAGS_OSZAPC_LOGIC, true);
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_neg(CPUX86State *env, struct x86_decode *decode)
+static bool exec_neg(CPUX86State *env, struct x86_decode *decode)
 {
     /*EXEC_2OP_FLAGS_CMD(env, decode, -, SET_FLAGS_OSZAPC_SUB, false);*/
     int32_t val;
@@ -342,15 +365,17 @@ static void exec_neg(CPUX86State *env, struct x86_decode *decode)
 
     /*lflags_to_rflags(env);*/
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_cmp(CPUX86State *env, struct x86_decode *decode)
+static bool exec_cmp(CPUX86State *env, struct x86_decode *decode)
 {
     EXEC_2OP_FLAGS_CMD(env, decode, -, SET_FLAGS_OSZAPC_SUB, false);
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_inc(CPUX86State *env, struct x86_decode *decode)
+static bool exec_inc(CPUX86State *env, struct x86_decode *decode)
 {
     decode->op[1].type = X86_VAR_IMMEDIATE;
     decode->op[1].val = 0;
@@ -358,33 +383,37 @@ static void exec_inc(CPUX86State *env, struct x86_decode *decode)
     EXEC_2OP_FLAGS_CMD(env, decode, +1+, SET_FLAGS_OSZAP_ADD, true);
 
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_dec(CPUX86State *env, struct x86_decode *decode)
+static bool exec_dec(CPUX86State *env, struct x86_decode *decode)
 {
     decode->op[1].type = X86_VAR_IMMEDIATE;
     decode->op[1].val = 0;
 
     EXEC_2OP_FLAGS_CMD(env, decode, -1-, SET_FLAGS_OSZAP_SUB, true);
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_tst(CPUX86State *env, struct x86_decode *decode)
+static bool exec_tst(CPUX86State *env, struct x86_decode *decode)
 {
     EXEC_2OP_FLAGS_CMD(env, decode, &, SET_FLAGS_OSZAPC_LOGIC, false);
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_not(CPUX86State *env, struct x86_decode *decode)
+static bool exec_not(CPUX86State *env, struct x86_decode *decode)
 {
     fetch_operands(env, decode, 1, true, false, false);
 
     write_val_ext(env, &decode->op[0], ~decode->op[0].val,
                   decode->operand_size);
     env->eip += decode->len;
+    return 0;
 }
 
-void exec_movzx(CPUX86State *env, struct x86_decode *decode)
+bool exec_movzx(CPUX86State *env, struct x86_decode *decode)
 {
     int src_op_size;
     int op_size = decode->operand_size;
@@ -398,13 +427,16 @@ void exec_movzx(CPUX86State *env, struct x86_decode *decode)
     }
     decode->operand_size = src_op_size;
     calc_modrm_operand(env, decode, &decode->op[1]);
-    decode->op[1].val = read_val_ext(env, &decode->op[1], src_op_size);
+    if (read_val_ext(env, &decode->op[1], src_op_size, &decode->op[1].val)) {
+        return 1;
+    }
     write_val_ext(env, &decode->op[0], decode->op[1].val, op_size);
 
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_out(CPUX86State *env, struct x86_decode *decode)
+static bool exec_out(CPUX86State *env, struct x86_decode *decode)
 {
     switch (decode->opcode[0]) {
     case 0xe6:
@@ -426,9 +458,10 @@ static void exec_out(CPUX86State *env, struct x86_decode *decode)
         break;
     }
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_in(CPUX86State *env, struct x86_decode *decode)
+static bool exec_in(CPUX86State *env, struct x86_decode *decode)
 {
     target_ulong val = 0;
     switch (decode->opcode[0]) {
@@ -463,6 +496,7 @@ static void exec_in(CPUX86State *env, struct x86_decode *decode)
     }
 
     env->eip += decode->len;
+    return 0;
 }
 
 static inline void string_increment_reg(CPUX86State *env, int reg,
@@ -523,7 +557,7 @@ static bool exec_ins_single(CPUX86State *env, struct x86_decode *decode)
     return 0;
 }
 
-static void exec_ins(CPUX86State *env, struct x86_decode *decode)
+static bool exec_ins(CPUX86State *env, struct x86_decode *decode)
 {
     bool res;
     if (decode->rep) {
@@ -533,9 +567,10 @@ static void exec_ins(CPUX86State *env, struct x86_decode *decode)
     }
 
     if (res) {
-        return;
+        return 1;
     }
     env->eip += decode->len;
+    return 0;
 }
 
 static bool exec_outs_single(CPUX86State *env, struct x86_decode *decode)
@@ -551,7 +586,7 @@ static bool exec_outs_single(CPUX86State *env, struct x86_decode *decode)
     return 0;
 }
 
-static void exec_outs(CPUX86State *env, struct x86_decode *decode)
+static bool exec_outs(CPUX86State *env, struct x86_decode *decode)
 {
     bool res;
     if (decode->rep) {
@@ -561,9 +596,10 @@ static void exec_outs(CPUX86State *env, struct x86_decode *decode)
     }
 
     if (res) {
-        return;
+        return 1;
     }
     env->eip += decode->len;
+    return 0;
 }
 
 static bool exec_movs_single(CPUX86State *env, struct x86_decode *decode)
@@ -590,7 +626,7 @@ static bool exec_movs_single(CPUX86State *env, struct x86_decode *decode)
     return 0;
 }
 
-static void exec_movs(CPUX86State *env, struct x86_decode *decode)
+static bool exec_movs(CPUX86State *env, struct x86_decode *decode)
 {
     bool res;
     if (decode->rep) {
@@ -600,9 +636,10 @@ static void exec_movs(CPUX86State *env, struct x86_decode *decode)
     }
 
     if (res) {
-        return;
+        return 1;
     }
     env->eip += decode->len;
+    return 0;
 }
 
 static bool exec_cmps_single(CPUX86State *env, struct x86_decode *decode)
@@ -630,7 +667,7 @@ static bool exec_cmps_single(CPUX86State *env, struct x86_decode *decode)
     return 0;
 }
 
-static void exec_cmps(CPUX86State *env, struct x86_decode *decode)
+static bool exec_cmps(CPUX86State *env, struct x86_decode *decode)
 {
     if (decode->rep) {
         string_rep(env, decode, exec_cmps_single, decode->rep);
@@ -638,6 +675,7 @@ static void exec_cmps(CPUX86State *env, struct x86_decode *decode)
         exec_cmps_single(env, decode);
     }
     env->eip += decode->len;
+    return 0;
 }
 
 
@@ -660,7 +698,7 @@ static bool exec_stos_single(CPUX86State *env, struct x86_decode *decode)
 }
 
 
-static void exec_stos(CPUX86State *env, struct x86_decode *decode)
+static bool exec_stos(CPUX86State *env, struct x86_decode *decode)
 {
     if (decode->rep) {
         string_rep(env, decode, exec_stos_single, 0);
@@ -669,6 +707,7 @@ static void exec_stos(CPUX86State *env, struct x86_decode *decode)
     }
 
     env->eip += decode->len;
+    return 0;
 }
 
 static bool exec_scas_single(CPUX86State *env, struct x86_decode *decode)
@@ -685,7 +724,7 @@ static bool exec_scas_single(CPUX86State *env, struct x86_decode *decode)
     return 0;
 }
 
-static void exec_scas(CPUX86State *env, struct x86_decode *decode)
+static bool exec_scas(CPUX86State *env, struct x86_decode *decode)
 {
     decode->op[0].type = X86_VAR_REG;
     decode->op[0].reg = R_EAX;
@@ -696,6 +735,7 @@ static void exec_scas(CPUX86State *env, struct x86_decode *decode)
     }
 
     env->eip += decode->len;
+    return 0;
 }
 
 static bool exec_lods_single(CPUX86State *env, struct x86_decode *decode)
@@ -711,7 +751,7 @@ static bool exec_lods_single(CPUX86State *env, struct x86_decode *decode)
     return 0;
 }
 
-static void exec_lods(CPUX86State *env, struct x86_decode *decode)
+static bool exec_lods(CPUX86State *env, struct x86_decode *decode)
 {
     if (decode->rep) {
         string_rep(env, decode, exec_lods_single, 0);
@@ -720,6 +760,7 @@ static void exec_lods(CPUX86State *env, struct x86_decode *decode)
     }
 
     env->eip += decode->len;
+    return 0;
 }
 
 void x86_emul_raise_exception(CPUX86State *env, int exception_index, int error_code)
@@ -730,23 +771,25 @@ void x86_emul_raise_exception(CPUX86State *env, int exception_index, int error_c
     env->exception_injected = 1;
 }
 
-static void exec_rdmsr(CPUX86State *env, struct x86_decode *decode)
+static bool exec_rdmsr(CPUX86State *env, struct x86_decode *decode)
 {
     emul_ops->simulate_rdmsr(env_cpu(env));
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_wrmsr(CPUX86State *env, struct x86_decode *decode)
+static bool exec_wrmsr(CPUX86State *env, struct x86_decode *decode)
 {
     emul_ops->simulate_wrmsr(env_cpu(env));
     env->eip += decode->len;
+    return 0;
 }
 
 /*
  * flag:
  * 0 - bt, 1 - btc, 2 - bts, 3 - btr
  */
-static void do_bt(CPUX86State *env, struct x86_decode *decode, int flag)
+static bool do_bt(CPUX86State *env, struct x86_decode *decode, int flag)
 {
     int32_t displacement;
     uint8_t index;
@@ -755,7 +798,9 @@ static void do_bt(CPUX86State *env, struct x86_decode *decode, int flag)
 
     VM_PANIC_ON(decode->rex.rex);
 
-    fetch_operands(env, decode, 2, false, true, false);
+    if (fetch_operands(env, decode, 2, false, true, false)) {
+        return 1;
+    }
     index = decode->op[1].val & mask;
 
     if (decode->op[0].type != X86_VAR_REG) {
@@ -769,14 +814,16 @@ static void do_bt(CPUX86State *env, struct x86_decode *decode, int flag)
             VM_PANIC("bt 64bit\n");
         }
     }
-    decode->op[0].val = read_val_ext(env, &decode->op[0],
-                                     decode->operand_size);
+    if (read_val_ext(env, &decode->op[0],
+                                     decode->operand_size, &decode->op[0].val)) {
+        return 1;
+    }
     cf = (decode->op[0].val >> index) & 0x01;
 
     switch (flag) {
     case 0:
         set_CF(env, cf);
-        return;
+        return 0;
     case 1:
         decode->op[0].val ^= (1u << index);
         break;
@@ -787,41 +834,58 @@ static void do_bt(CPUX86State *env, struct x86_decode *decode, int flag)
         decode->op[0].val &= ~(1u << index);
         break;
     }
-    write_val_ext(env, &decode->op[0], decode->op[0].val,
-                  decode->operand_size);
+    if (write_val_ext(env, &decode->op[0], decode->op[0].val,
+                  decode->operand_size)) {
+        return 1;
+    }
     set_CF(env, cf);
+    return 0;
 }
 
-static void exec_bt(CPUX86State *env, struct x86_decode *decode)
+static bool exec_bt(CPUX86State *env, struct x86_decode *decode)
 {
-    do_bt(env, decode, 0);
+    if (do_bt(env, decode, 0)) {
+        return 1;
+    }
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_btc(CPUX86State *env, struct x86_decode *decode)
+static bool exec_btc(CPUX86State *env, struct x86_decode *decode)
 {
-    do_bt(env, decode, 1);
+    if (do_bt(env, decode, 1)) {
+        return 1;
+    }
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_btr(CPUX86State *env, struct x86_decode *decode)
+static bool exec_btr(CPUX86State *env, struct x86_decode *decode)
 {
-    do_bt(env, decode, 3);
+    if (do_bt(env, decode, 3)) {
+        return 1;
+    }
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_bts(CPUX86State *env, struct x86_decode *decode)
+static bool exec_bts(CPUX86State *env, struct x86_decode *decode)
 {
-    do_bt(env, decode, 2);
+    if (do_bt(env, decode, 2)) {
+        return 1;
+    }
     env->eip += decode->len;
+    return 0;
 }
 
-void exec_shl(CPUX86State *env, struct x86_decode *decode)
+bool exec_shl(CPUX86State *env, struct x86_decode *decode)
 {
     uint8_t count;
     int of = 0, cf = 0;
 
-    fetch_operands(env, decode, 2, true, true, false);
+    if (fetch_operands(env, decode, 2, true, true, false)) {
+        return 1;
+    }
 
     count = decode->op[1].val;
     count &= 0x1f;      /* count is masked to 5 bits*/
@@ -878,12 +942,14 @@ void exec_shl(CPUX86State *env, struct x86_decode *decode)
 exit:
     /* lflags_to_rflags(env); */
     env->eip += decode->len;
+    return 0;
 }
 
-void exec_movsx(CPUX86State *env, struct x86_decode *decode)
+bool exec_movsx(CPUX86State *env, struct x86_decode *decode)
 {
     int src_op_size;
     int op_size = decode->operand_size;
+    target_ulong val;
 
     fetch_operands(env, decode, 2, false, false, false);
 
@@ -895,15 +961,18 @@ void exec_movsx(CPUX86State *env, struct x86_decode *decode)
 
     decode->operand_size = src_op_size;
     calc_modrm_operand(env, decode, &decode->op[1]);
-    decode->op[1].val = sign(read_val_ext(env, &decode->op[1], src_op_size),
-                             src_op_size);
+    if (read_val_ext(env, &decode->op[1], src_op_size, &val)) {
+        return 1;
+    }
+    decode->op[1].val = sign(val, src_op_size);
 
     write_val_ext(env, &decode->op[0], decode->op[1].val, op_size);
 
     env->eip += decode->len;
+    return 0;
 }
 
-void exec_ror(CPUX86State *env, struct x86_decode *decode)
+bool exec_ror(CPUX86State *env, struct x86_decode *decode)
 {
     uint8_t count;
 
@@ -979,9 +1048,10 @@ void exec_ror(CPUX86State *env, struct x86_decode *decode)
         }
     }
     env->eip += decode->len;
+    return 0;
 }
 
-void exec_rol(CPUX86State *env, struct x86_decode *decode)
+bool exec_rol(CPUX86State *env, struct x86_decode *decode)
 {
     uint8_t count;
 
@@ -1060,10 +1130,11 @@ void exec_rol(CPUX86State *env, struct x86_decode *decode)
         }
     }
     env->eip += decode->len;
+    return 0;
 }
 
 
-void exec_rcl(CPUX86State *env, struct x86_decode *decode)
+bool exec_rcl(CPUX86State *env, struct x86_decode *decode)
 {
     uint8_t count;
     int of = 0, cf = 0;
@@ -1146,9 +1217,10 @@ void exec_rcl(CPUX86State *env, struct x86_decode *decode)
         }
     }
     env->eip += decode->len;
+    return 0;
 }
 
-void exec_rcr(CPUX86State *env, struct x86_decode *decode)
+bool exec_rcr(CPUX86State *env, struct x86_decode *decode)
 {
     uint8_t count;
     int of = 0, cf = 0;
@@ -1221,9 +1293,10 @@ void exec_rcr(CPUX86State *env, struct x86_decode *decode)
         }
     }
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_xchg(CPUX86State *env, struct x86_decode *decode)
+static bool exec_xchg(CPUX86State *env, struct x86_decode *decode)
 {
     fetch_operands(env, decode, 2, true, true, false);
 
@@ -1233,20 +1306,22 @@ static void exec_xchg(CPUX86State *env, struct x86_decode *decode)
                   decode->operand_size);
 
     env->eip += decode->len;
+    return 0;
 }
 
-static void exec_xadd(CPUX86State *env, struct x86_decode *decode)
+static bool exec_xadd(CPUX86State *env, struct x86_decode *decode)
 {
     EXEC_2OP_FLAGS_CMD(env, decode, +, SET_FLAGS_OSZAPC_ADD, true);
     write_val_ext(env, &decode->op[1], decode->op[0].val,
                   decode->operand_size);
 
     env->eip += decode->len;
+    return 0;
 }
 
 static struct cmd_handler {
     enum x86_decode_cmd cmd;
-    void (*handler)(CPUX86State *env, struct x86_decode *ins);
+    bool (*handler)(CPUX86State *env, struct x86_decode *ins);
 } handlers[] = {
     {X86_DECODE_CMD_INVL, NULL,},
     {X86_DECODE_CMD_MOV, exec_mov},
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 030/102] accel/kvm: Don't clear pending #SMI in kvm_get_vcpu_events
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (28 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 029/102] target/i386: emulate: propagate errors all the way and stop early Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 031/102] scripts/update-linux-headers: Add Nitro Enclaves header Paolo Bonzini
                   ` (71 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Maxim Levitsky

From: Maxim Levitsky <mlevitsk@redhat.com>

The kvm_get_vcpu_events propogates the state of the pending smi
from the kernel to the cpu->interrupt_request, with the intention
of having un up to date migration state.

Later the opposite is done, the kvm_put_vcpu_events restores the state
of the pending #SMI from the 'cs->interrupt_request'

The only problem is that kvm_get_vcpu_events also resets the SMI
in cpu->interrupt_request when there is no pending #SMI indicated by the kernel,
and that is wrong as the SMI might be still raised by qemu.

While at it, also fix a similar but more theoretical bug with regard to a
latched #INIT while in SMM.

A simple reproducer for this bug is to read an EFI variable in a loop
from within a guest, while at the same time run 'info registers' on
the qemu HMP monitor.

The reads will, once in a while, fail with an 'Invalid argument' error.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20260223221908.361456-1-mlevitsk@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/kvm/kvm.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 3b66ec8c42b..bb8303c39fe 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -5501,8 +5501,6 @@ static int kvm_get_vcpu_events(X86CPU *cpu)
         }
         if (events.smi.pending) {
             cpu_interrupt(CPU(cpu), CPU_INTERRUPT_SMI);
-        } else {
-            cpu_reset_interrupt(CPU(cpu), CPU_INTERRUPT_SMI);
         }
         if (events.smi.smm_inside_nmi) {
             env->hflags2 |= HF2_SMM_INSIDE_NMI_MASK;
@@ -5511,8 +5509,6 @@ static int kvm_get_vcpu_events(X86CPU *cpu)
         }
         if (events.smi.latched_init) {
             cpu_interrupt(CPU(cpu), CPU_INTERRUPT_INIT);
-        } else {
-            cpu_reset_interrupt(CPU(cpu), CPU_INTERRUPT_INIT);
         }
     }
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 031/102] scripts/update-linux-headers: Add Nitro Enclaves header
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (29 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 030/102] accel/kvm: Don't clear pending #SMI in kvm_get_vcpu_events Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 032/102] linux-headers: Add nitro_enclaves.h Paolo Bonzini
                   ` (70 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexander Graf

From: Alexander Graf <graf@amazon.com>

We want to enable QEMU to drive the /dev/nitro_enclaves device node. Add
its UAPI header into our kernel sync so we have all defines we need to
drive it.

Signed-off-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20260225220807.33092-2-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 scripts/update-linux-headers.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index d09d8cf4c6f..386d7a38e7a 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -254,6 +254,7 @@ for i in "$hdrdir"/include/linux/*virtio*.h \
          "$hdrdir/include/linux/kvm_para.h" \
          "$hdrdir/include/linux/vhost_types.h" \
          "$hdrdir/include/linux/vmclock-abi.h" \
+         "$hdrdir/include/linux/nitro_enclaves.h" \
          "$hdrdir/include/linux/sysinfo.h"; do
     cp_portable "$i" "$output/include/standard-headers/linux"
 done
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 032/102] linux-headers: Add nitro_enclaves.h
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (30 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 031/102] scripts/update-linux-headers: Add Nitro Enclaves header Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 033/102] hw/nitro: Add Nitro Vsock Bus Paolo Bonzini
                   ` (69 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexander Graf

From: Alexander Graf <graf@amazon.com>

QEMU is learning to drive the /dev/nitro_enclaves device node. Include
its UAPI header into our local copy of kernel headers so it has all
defines we need to drive it.

Signed-off-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20260225220807.33092-3-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 .../standard-headers/linux/nitro_enclaves.h   | 359 ++++++++++++++++++
 1 file changed, 359 insertions(+)
 create mode 100644 include/standard-headers/linux/nitro_enclaves.h

diff --git a/include/standard-headers/linux/nitro_enclaves.h b/include/standard-headers/linux/nitro_enclaves.h
new file mode 100644
index 00000000000..5545267dd95
--- /dev/null
+++ b/include/standard-headers/linux/nitro_enclaves.h
@@ -0,0 +1,359 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Copyright 2020-2021 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+ */
+
+#ifndef _LINUX_NITRO_ENCLAVES_H_
+#define _LINUX_NITRO_ENCLAVES_H_
+
+#include "standard-headers/linux/types.h"
+
+/**
+ * DOC: Nitro Enclaves (NE) Kernel Driver Interface
+ */
+
+/**
+ * NE_CREATE_VM - The command is used to create a slot that is associated with
+ *		  an enclave VM.
+ *		  The generated unique slot id is an output parameter.
+ *		  The ioctl can be invoked on the /dev/nitro_enclaves fd, before
+ *		  setting any resources, such as memory and vCPUs, for an
+ *		  enclave. Memory and vCPUs are set for the slot mapped to an enclave.
+ *		  A NE CPU pool has to be set before calling this function. The
+ *		  pool can be set after the NE driver load, using
+ *		  /sys/module/nitro_enclaves/parameters/ne_cpus.
+ *		  Its format is the detailed in the cpu-lists section:
+ *		  https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html
+ *		  CPU 0 and its siblings have to remain available for the
+ *		  primary / parent VM, so they cannot be set for enclaves. Full
+ *		  CPU core(s), from the same NUMA node, need(s) to be included
+ *		  in the CPU pool.
+ *
+ * Context: Process context.
+ * Return:
+ * * Enclave file descriptor		- Enclave file descriptor used with
+ *					  ioctl calls to set vCPUs and memory
+ *					  regions, then start the enclave.
+ * *  -1				- There was a failure in the ioctl logic.
+ * On failure, errno is set to:
+ * * EFAULT				- copy_to_user() failure.
+ * * ENOMEM				- Memory allocation failure for internal
+ *					  bookkeeping variables.
+ * * NE_ERR_NO_CPUS_AVAIL_IN_POOL	- No NE CPU pool set / no CPUs available
+ *					  in the pool.
+ * * Error codes from get_unused_fd_flags() and anon_inode_getfile().
+ * * Error codes from the NE PCI device request.
+ */
+#define NE_CREATE_VM			_IOR(0xAE, 0x20, uint64_t)
+
+/**
+ * NE_ADD_VCPU - The command is used to set a vCPU for an enclave. The vCPU can
+ *		 be auto-chosen from the NE CPU pool or it can be set by the
+ *		 caller, with the note that it needs to be available in the NE
+ *		 CPU pool. Full CPU core(s), from the same NUMA node, need(s) to
+ *		 be associated with an enclave.
+ *		 The vCPU id is an input / output parameter. If its value is 0,
+ *		 then a CPU is chosen from the enclave CPU pool and returned via
+ *		 this parameter.
+ *		 The ioctl can be invoked on the enclave fd, before an enclave
+ *		 is started.
+ *
+ * Context: Process context.
+ * Return:
+ * * 0					- Logic successfully completed.
+ * *  -1				- There was a failure in the ioctl logic.
+ * On failure, errno is set to:
+ * * EFAULT				- copy_from_user() / copy_to_user() failure.
+ * * ENOMEM				- Memory allocation failure for internal
+ *					  bookkeeping variables.
+ * * EIO				- Current task mm is not the same as the one
+ *					  that created the enclave.
+ * * NE_ERR_NO_CPUS_AVAIL_IN_POOL	- No CPUs available in the NE CPU pool.
+ * * NE_ERR_VCPU_ALREADY_USED		- The provided vCPU is already used.
+ * * NE_ERR_VCPU_NOT_IN_CPU_POOL	- The provided vCPU is not available in the
+ *					  NE CPU pool.
+ * * NE_ERR_VCPU_INVALID_CPU_CORE	- The core id of the provided vCPU is invalid
+ *					  or out of range.
+ * * NE_ERR_NOT_IN_INIT_STATE		- The enclave is not in init state
+ *					  (init = before being started).
+ * * NE_ERR_INVALID_VCPU		- The provided vCPU is not in the available
+ *					  CPUs range.
+ * * Error codes from the NE PCI device request.
+ */
+#define NE_ADD_VCPU			_IOWR(0xAE, 0x21, uint32_t)
+
+/**
+ * NE_GET_IMAGE_LOAD_INFO - The command is used to get information needed for
+ *			    in-memory enclave image loading e.g. offset in
+ *			    enclave memory to start placing the enclave image.
+ *			    The image load info is an input / output parameter.
+ *			    It includes info provided by the caller - flags -
+ *			    and returns the offset in enclave memory where to
+ *			    start placing the enclave image.
+ *			    The ioctl can be invoked on the enclave fd, before
+ *			    an enclave is started.
+ *
+ * Context: Process context.
+ * Return:
+ * * 0				- Logic successfully completed.
+ * *  -1			- There was a failure in the ioctl logic.
+ * On failure, errno is set to:
+ * * EFAULT			- copy_from_user() / copy_to_user() failure.
+ * * NE_ERR_NOT_IN_INIT_STATE	- The enclave is not in init state (init =
+ *				  before being started).
+ * * NE_ERR_INVALID_FLAG_VALUE	- The value of the provided flag is invalid.
+ */
+#define NE_GET_IMAGE_LOAD_INFO		_IOWR(0xAE, 0x22, struct ne_image_load_info)
+
+/**
+ * NE_SET_USER_MEMORY_REGION - The command is used to set a memory region for an
+ *			       enclave, given the allocated memory from the
+ *			       userspace. Enclave memory needs to be from the
+ *			       same NUMA node as the enclave CPUs.
+ *			       The user memory region is an input parameter. It
+ *			       includes info provided by the caller - flags,
+ *			       memory size and userspace address.
+ *			       The ioctl can be invoked on the enclave fd,
+ *			       before an enclave is started.
+ *
+ * Context: Process context.
+ * Return:
+ * * 0					- Logic successfully completed.
+ * *  -1				- There was a failure in the ioctl logic.
+ * On failure, errno is set to:
+ * * EFAULT				- copy_from_user() failure.
+ * * EINVAL				- Invalid physical memory region(s) e.g.
+ *					  unaligned address.
+ * * EIO				- Current task mm is not the same as
+ *					  the one that created the enclave.
+ * * ENOMEM				- Memory allocation failure for internal
+ *					  bookkeeping variables.
+ * * NE_ERR_NOT_IN_INIT_STATE		- The enclave is not in init state
+ *					  (init = before being started).
+ * * NE_ERR_INVALID_MEM_REGION_SIZE	- The memory size of the region is not
+ *					  multiple of 2 MiB.
+ * * NE_ERR_INVALID_MEM_REGION_ADDR	- Invalid user space address given.
+ * * NE_ERR_UNALIGNED_MEM_REGION_ADDR	- Unaligned user space address given.
+ * * NE_ERR_MEM_REGION_ALREADY_USED	- The memory region is already used.
+ * * NE_ERR_MEM_NOT_HUGE_PAGE		- The memory region is not backed by
+ *					  huge pages.
+ * * NE_ERR_MEM_DIFFERENT_NUMA_NODE	- The memory region is not from the same
+ *					  NUMA node as the CPUs.
+ * * NE_ERR_MEM_MAX_REGIONS		- The number of memory regions set for
+ *					  the enclave reached maximum.
+ * * NE_ERR_INVALID_PAGE_SIZE		- The memory region is not backed by
+ *					  pages multiple of 2 MiB.
+ * * NE_ERR_INVALID_FLAG_VALUE		- The value of the provided flag is invalid.
+ * * Error codes from get_user_pages().
+ * * Error codes from the NE PCI device request.
+ */
+#define NE_SET_USER_MEMORY_REGION	_IOW(0xAE, 0x23, struct ne_user_memory_region)
+
+/**
+ * NE_START_ENCLAVE - The command is used to trigger enclave start after the
+ *		      enclave resources, such as memory and CPU, have been set.
+ *		      The enclave start info is an input / output parameter. It
+ *		      includes info provided by the caller - enclave cid and
+ *		      flags - and returns the cid (if input cid is 0).
+ *		      The ioctl can be invoked on the enclave fd, after an
+ *		      enclave slot is created and resources, such as memory and
+ *		      vCPUs are set for an enclave.
+ *
+ * Context: Process context.
+ * Return:
+ * * 0					- Logic successfully completed.
+ * *  -1				- There was a failure in the ioctl logic.
+ * On failure, errno is set to:
+ * * EFAULT				- copy_from_user() / copy_to_user() failure.
+ * * NE_ERR_NOT_IN_INIT_STATE		- The enclave is not in init state
+ *					  (init = before being started).
+ * * NE_ERR_NO_MEM_REGIONS_ADDED	- No memory regions are set.
+ * * NE_ERR_NO_VCPUS_ADDED		- No vCPUs are set.
+ * *  NE_ERR_FULL_CORES_NOT_USED	- Full core(s) not set for the enclave.
+ * * NE_ERR_ENCLAVE_MEM_MIN_SIZE	- Enclave memory is less than minimum
+ *					  memory size (64 MiB).
+ * * NE_ERR_INVALID_FLAG_VALUE		- The value of the provided flag is invalid.
+ * *  NE_ERR_INVALID_ENCLAVE_CID	- The provided enclave CID is invalid.
+ * * Error codes from the NE PCI device request.
+ */
+#define NE_START_ENCLAVE		_IOWR(0xAE, 0x24, struct ne_enclave_start_info)
+
+/**
+ * DOC: NE specific error codes
+ */
+
+/**
+ * NE_ERR_VCPU_ALREADY_USED - The provided vCPU is already used.
+ */
+#define NE_ERR_VCPU_ALREADY_USED		(256)
+/**
+ * NE_ERR_VCPU_NOT_IN_CPU_POOL - The provided vCPU is not available in the
+ *				 NE CPU pool.
+ */
+#define NE_ERR_VCPU_NOT_IN_CPU_POOL		(257)
+/**
+ * NE_ERR_VCPU_INVALID_CPU_CORE - The core id of the provided vCPU is invalid
+ *				  or out of range of the NE CPU pool.
+ */
+#define NE_ERR_VCPU_INVALID_CPU_CORE		(258)
+/**
+ * NE_ERR_INVALID_MEM_REGION_SIZE - The user space memory region size is not
+ *				    multiple of 2 MiB.
+ */
+#define NE_ERR_INVALID_MEM_REGION_SIZE		(259)
+/**
+ * NE_ERR_INVALID_MEM_REGION_ADDR - The user space memory region address range
+ *				    is invalid.
+ */
+#define NE_ERR_INVALID_MEM_REGION_ADDR		(260)
+/**
+ * NE_ERR_UNALIGNED_MEM_REGION_ADDR - The user space memory region address is
+ *				      not aligned.
+ */
+#define NE_ERR_UNALIGNED_MEM_REGION_ADDR	(261)
+/**
+ * NE_ERR_MEM_REGION_ALREADY_USED - The user space memory region is already used.
+ */
+#define NE_ERR_MEM_REGION_ALREADY_USED		(262)
+/**
+ * NE_ERR_MEM_NOT_HUGE_PAGE - The user space memory region is not backed by
+ *			      contiguous physical huge page(s).
+ */
+#define NE_ERR_MEM_NOT_HUGE_PAGE		(263)
+/**
+ * NE_ERR_MEM_DIFFERENT_NUMA_NODE - The user space memory region is backed by
+ *				    pages from different NUMA nodes than the CPUs.
+ */
+#define NE_ERR_MEM_DIFFERENT_NUMA_NODE		(264)
+/**
+ * NE_ERR_MEM_MAX_REGIONS - The supported max memory regions per enclaves has
+ *			    been reached.
+ */
+#define NE_ERR_MEM_MAX_REGIONS			(265)
+/**
+ * NE_ERR_NO_MEM_REGIONS_ADDED - The command to start an enclave is triggered
+ *				 and no memory regions are added.
+ */
+#define NE_ERR_NO_MEM_REGIONS_ADDED		(266)
+/**
+ * NE_ERR_NO_VCPUS_ADDED - The command to start an enclave is triggered and no
+ *			   vCPUs are added.
+ */
+#define NE_ERR_NO_VCPUS_ADDED			(267)
+/**
+ * NE_ERR_ENCLAVE_MEM_MIN_SIZE - The enclave memory size is lower than the
+ *				 minimum supported.
+ */
+#define NE_ERR_ENCLAVE_MEM_MIN_SIZE		(268)
+/**
+ * NE_ERR_FULL_CORES_NOT_USED - The command to start an enclave is triggered and
+ *				full CPU cores are not set.
+ */
+#define NE_ERR_FULL_CORES_NOT_USED		(269)
+/**
+ * NE_ERR_NOT_IN_INIT_STATE - The enclave is not in init state when setting
+ *			      resources or triggering start.
+ */
+#define NE_ERR_NOT_IN_INIT_STATE		(270)
+/**
+ * NE_ERR_INVALID_VCPU - The provided vCPU is out of range of the available CPUs.
+ */
+#define NE_ERR_INVALID_VCPU			(271)
+/**
+ * NE_ERR_NO_CPUS_AVAIL_IN_POOL - The command to create an enclave is triggered
+ *				  and no CPUs are available in the pool.
+ */
+#define NE_ERR_NO_CPUS_AVAIL_IN_POOL		(272)
+/**
+ * NE_ERR_INVALID_PAGE_SIZE - The user space memory region is not backed by pages
+ *			      multiple of 2 MiB.
+ */
+#define NE_ERR_INVALID_PAGE_SIZE		(273)
+/**
+ * NE_ERR_INVALID_FLAG_VALUE - The provided flag value is invalid.
+ */
+#define NE_ERR_INVALID_FLAG_VALUE		(274)
+/**
+ * NE_ERR_INVALID_ENCLAVE_CID - The provided enclave CID is invalid, either
+ *				being a well-known value or the CID of the
+ *				parent / primary VM.
+ */
+#define NE_ERR_INVALID_ENCLAVE_CID		(275)
+
+/**
+ * DOC: Image load info flags
+ */
+
+/**
+ * NE_EIF_IMAGE - Enclave Image Format (EIF)
+ */
+#define NE_EIF_IMAGE			(0x01)
+
+#define NE_IMAGE_LOAD_MAX_FLAG_VAL	(0x02)
+
+/**
+ * struct ne_image_load_info - Info necessary for in-memory enclave image
+ *			       loading (in / out).
+ * @flags:		Flags to determine the enclave image type
+ *			(e.g. Enclave Image Format - EIF) (in).
+ * @memory_offset:	Offset in enclave memory where to start placing the
+ *			enclave image (out).
+ */
+struct ne_image_load_info {
+	uint64_t	flags;
+	uint64_t	memory_offset;
+};
+
+/**
+ * DOC: User memory region flags
+ */
+
+/**
+ * NE_DEFAULT_MEMORY_REGION - Memory region for enclave general usage.
+ */
+#define NE_DEFAULT_MEMORY_REGION	(0x00)
+
+#define NE_MEMORY_REGION_MAX_FLAG_VAL	(0x01)
+
+/**
+ * struct ne_user_memory_region - Memory region to be set for an enclave (in).
+ * @flags:		Flags to determine the usage for the memory region (in).
+ * @memory_size:	The size, in bytes, of the memory region to be set for
+ *			an enclave (in).
+ * @userspace_addr:	The start address of the userspace allocated memory of
+ *			the memory region to set for an enclave (in).
+ */
+struct ne_user_memory_region {
+	uint64_t	flags;
+	uint64_t	memory_size;
+	uint64_t	userspace_addr;
+};
+
+/**
+ * DOC: Enclave start info flags
+ */
+
+/**
+ * NE_ENCLAVE_PRODUCTION_MODE - Start enclave in production mode.
+ */
+#define NE_ENCLAVE_PRODUCTION_MODE	(0x00)
+/**
+ * NE_ENCLAVE_DEBUG_MODE - Start enclave in debug mode.
+ */
+#define NE_ENCLAVE_DEBUG_MODE		(0x01)
+
+#define NE_ENCLAVE_START_MAX_FLAG_VAL	(0x02)
+
+/**
+ * struct ne_enclave_start_info - Setup info necessary for enclave start (in / out).
+ * @flags:		Flags for the enclave to start with (e.g. debug mode) (in).
+ * @enclave_cid:	Context ID (CID) for the enclave vsock device. If 0 as
+ *			input, the CID is autogenerated by the hypervisor and
+ *			returned back as output by the driver (in / out).
+ */
+struct ne_enclave_start_info {
+	uint64_t	flags;
+	uint64_t	enclave_cid;
+};
+
+#endif /* _LINUX_NITRO_ENCLAVES_H_ */
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 033/102] hw/nitro: Add Nitro Vsock Bus
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (31 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 032/102] linux-headers: Add nitro_enclaves.h Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 034/102] accel: Add Nitro Enclaves accelerator Paolo Bonzini
                   ` (68 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexander Graf

From: Alexander Graf <graf@amazon.com>

Add a dedicated bus for Nitro Enclave vsock devices. In Nitro Enclaves,
communication between parent and enclave/hypervisor happens almost
exclusively through vsock. The nitro-vsock-bus models this dependency
in QEMU, which allows devices in this bus to implement individual services
on top of vsock.

The nitro machine spawns this bus by creating the included
nitro-vsock-bridge sysbus device.

The nitro accel then advertises the Enclave's CID to the bus by calling
nitro_vsock_bridge_start_enclave() on the bridge device as soon as it
knows the CID.

Nitro vsock devices can listen to that event and learn the Enclave's CID
when it is available to perform actions, such as connect to the debug
serial vsock port.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20260225220807.33092-4-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 MAINTAINERS                        |  6 ++
 meson.build                        |  1 +
 hw/nitro/trace.h                   |  4 ++
 include/hw/nitro/nitro-vsock-bus.h | 71 ++++++++++++++++++++++
 hw/nitro/nitro-vsock-bus.c         | 98 ++++++++++++++++++++++++++++++
 hw/Kconfig                         |  1 +
 hw/meson.build                     |  1 +
 hw/nitro/Kconfig                   |  2 +
 hw/nitro/meson.build               |  1 +
 hw/nitro/trace-events              |  2 +
 10 files changed, 187 insertions(+)
 create mode 100644 hw/nitro/trace.h
 create mode 100644 include/hw/nitro/nitro-vsock-bus.h
 create mode 100644 hw/nitro/nitro-vsock-bus.c
 create mode 100644 hw/nitro/Kconfig
 create mode 100644 hw/nitro/meson.build
 create mode 100644 hw/nitro/trace-events

diff --git a/MAINTAINERS b/MAINTAINERS
index 606b16762cf..d781fe59bb1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3020,6 +3020,12 @@ F: hw/vmapple/*
 F: include/hw/vmapple/*
 F: docs/system/arm/vmapple.rst
 
+Nitro Enclaves (native)
+M: Alexander Graf <graf@amazon.com>
+S: Maintained
+F: hw/nitro/
+F: include/hw/nitro/
+
 Subsystems
 ----------
 Overall Audio backends
diff --git a/meson.build b/meson.build
index 2bae618d848..f3ee08772d4 100644
--- a/meson.build
+++ b/meson.build
@@ -3620,6 +3620,7 @@ if have_system
     'hw/misc/macio',
     'hw/net',
     'hw/net/can',
+    'hw/nitro',
     'hw/nubus',
     'hw/nvme',
     'hw/nvram',
diff --git a/hw/nitro/trace.h b/hw/nitro/trace.h
new file mode 100644
index 00000000000..b455d6c17b3
--- /dev/null
+++ b/hw/nitro/trace.h
@@ -0,0 +1,4 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include "trace/trace-hw_nitro.h"
diff --git a/include/hw/nitro/nitro-vsock-bus.h b/include/hw/nitro/nitro-vsock-bus.h
new file mode 100644
index 00000000000..064260aa410
--- /dev/null
+++ b/include/hw/nitro/nitro-vsock-bus.h
@@ -0,0 +1,71 @@
+/*
+ * Nitro Enclave Vsock Bus
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef HW_NITRO_VSOCK_BUS_H
+#define HW_NITRO_VSOCK_BUS_H
+
+#include "hw/core/qdev.h"
+#include "hw/core/sysbus.h"
+#include "qom/object.h"
+
+#define TYPE_NITRO_VSOCK_BUS "nitro-vsock-bus"
+OBJECT_DECLARE_SIMPLE_TYPE(NitroVsockBus, NITRO_VSOCK_BUS)
+
+#define TYPE_NITRO_VSOCK_BRIDGE "nitro-vsock-bridge"
+OBJECT_DECLARE_SIMPLE_TYPE(NitroVsockBridge, NITRO_VSOCK_BRIDGE)
+
+#define TYPE_NITRO_VSOCK_DEVICE "nitro-vsock-device"
+OBJECT_DECLARE_TYPE(NitroVsockDevice, NitroVsockDeviceClass,
+                    NITRO_VSOCK_DEVICE)
+
+struct NitroVsockBus {
+    BusState parent_obj;
+};
+
+struct NitroVsockBridge {
+    SysBusDevice parent_obj;
+
+    NitroVsockBus bus;
+    uint32_t enclave_cid;
+};
+
+struct NitroVsockDevice {
+    DeviceState parent_obj;
+};
+
+struct NitroVsockDeviceClass {
+    DeviceClass parent_class;
+
+    /*
+     * Called after the enclave has been started and the CID is known.
+     * Devices use this to establish vsock connections to the enclave.
+     */
+    void (*enclave_started)(NitroVsockDevice *dev, uint32_t enclave_cid,
+                            Error **errp);
+};
+
+/*
+ * Machine helper to create the Nitro vsock bridge sysbus device.
+ */
+NitroVsockBridge *nitro_vsock_bridge_create(void);
+
+/*
+ * Find the Nitro vsock bridge on the sysbus.
+ */
+static inline NitroVsockBridge *nitro_vsock_bridge_find(void)
+{
+    return NITRO_VSOCK_BRIDGE(
+        object_resolve_path_type("", TYPE_NITRO_VSOCK_BRIDGE, NULL));
+}
+
+/*
+ * Notify the bridge that the enclave has started. Dispatches
+ * enclave_started() to all devices on the bus.
+ */
+void nitro_vsock_bridge_start_enclave(NitroVsockBridge *bridge,
+                                      uint32_t enclave_cid, Error **errp);
+
+#endif /* HW_NITRO_VSOCK_BUS_H */
diff --git a/hw/nitro/nitro-vsock-bus.c b/hw/nitro/nitro-vsock-bus.c
new file mode 100644
index 00000000000..eed29df512e
--- /dev/null
+++ b/hw/nitro/nitro-vsock-bus.c
@@ -0,0 +1,98 @@
+/*
+ * Nitro Enclave Vsock Bus
+ *
+ * Copyright © 2026 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+ *
+ * Authors:
+ *   Alexander Graf <graf@amazon.com>
+ *
+ * A bus for Nitro Enclave vsock devices. In Nitro Enclaves, communication
+ * between parent and enclave/hypervisor happens almost exclusively through
+ * vsock. The nitro-vsock-bus models this dependency in QEMU, which allows
+ * devices in this bus to implement individual services on top of vsock.
+ *
+ * The nitro accel advertises the Enclave's CID to the bus by calling
+ * nitro_vsock_bridge_start_enclave() on the bridge device as soon as it
+ * knows the CID.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "monitor/qdev.h"
+#include "hw/core/sysbus.h"
+#include "hw/nitro/nitro-vsock-bus.h"
+
+void nitro_vsock_bridge_start_enclave(NitroVsockBridge *bridge,
+                                      uint32_t enclave_cid, Error **errp)
+{
+    ERRP_GUARD();
+    BusState *qbus = BUS(&bridge->bus);
+    BusChild *kid;
+
+    bridge->enclave_cid = enclave_cid;
+
+    QTAILQ_FOREACH(kid, &qbus->children, sibling) {
+        NitroVsockDevice *ndev = NITRO_VSOCK_DEVICE(kid->child);
+        NitroVsockDeviceClass *ndc = NITRO_VSOCK_DEVICE_GET_CLASS(ndev);
+
+        if (ndc->enclave_started) {
+            ndc->enclave_started(ndev, enclave_cid, errp);
+            if (*errp) {
+                return;
+            }
+        }
+    }
+}
+
+NitroVsockBridge *nitro_vsock_bridge_create(void)
+{
+    DeviceState *dev = qdev_new(TYPE_NITRO_VSOCK_BRIDGE);
+
+    qdev_set_id(dev, g_strdup("nitro-vsock"), &error_fatal);
+    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
+
+    return NITRO_VSOCK_BRIDGE(dev);
+}
+
+static void nitro_vsock_bridge_init(Object *obj)
+{
+    NitroVsockBridge *s = NITRO_VSOCK_BRIDGE(obj);
+
+    qbus_init(&s->bus, sizeof(s->bus), TYPE_NITRO_VSOCK_BUS,
+              DEVICE(s), "nitro-vsock");
+    object_property_add_uint32_ptr(obj, "enclave-cid",
+                                   &s->enclave_cid, OBJ_PROP_FLAG_READ);
+}
+
+static void nitro_vsock_device_class_init(ObjectClass *oc, const void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+
+    dc->bus_type = TYPE_NITRO_VSOCK_BUS;
+}
+
+static const TypeInfo nitro_vsock_bus_types[] = {
+    {
+        .name = TYPE_NITRO_VSOCK_BUS,
+        .parent = TYPE_BUS,
+        .instance_size = sizeof(NitroVsockBus),
+    },
+    {
+        .name = TYPE_NITRO_VSOCK_BRIDGE,
+        .parent = TYPE_SYS_BUS_DEVICE,
+        .instance_size = sizeof(NitroVsockBridge),
+        .instance_init = nitro_vsock_bridge_init,
+    },
+    {
+        .name = TYPE_NITRO_VSOCK_DEVICE,
+        .parent = TYPE_DEVICE,
+        .instance_size = sizeof(NitroVsockDevice),
+        .class_size = sizeof(NitroVsockDeviceClass),
+        .class_init = nitro_vsock_device_class_init,
+        .abstract = true,
+    },
+};
+
+DEFINE_TYPES(nitro_vsock_bus_types);
diff --git a/hw/Kconfig b/hw/Kconfig
index f8f92b5d03d..b3ce1520a6b 100644
--- a/hw/Kconfig
+++ b/hw/Kconfig
@@ -22,6 +22,7 @@ source isa/Kconfig
 source mem/Kconfig
 source misc/Kconfig
 source net/Kconfig
+source nitro/Kconfig
 source nubus/Kconfig
 source nvme/Kconfig
 source nvram/Kconfig
diff --git a/hw/meson.build b/hw/meson.build
index 66e46b8090d..36da5322f7e 100644
--- a/hw/meson.build
+++ b/hw/meson.build
@@ -44,6 +44,7 @@ subdir('isa')
 subdir('mem')
 subdir('misc')
 subdir('net')
+subdir('nitro')
 subdir('nubus')
 subdir('nvme')
 subdir('nvram')
diff --git a/hw/nitro/Kconfig b/hw/nitro/Kconfig
new file mode 100644
index 00000000000..767472cb2c6
--- /dev/null
+++ b/hw/nitro/Kconfig
@@ -0,0 +1,2 @@
+config NITRO_VSOCK_BUS
+    bool
diff --git a/hw/nitro/meson.build b/hw/nitro/meson.build
new file mode 100644
index 00000000000..7e2807f1379
--- /dev/null
+++ b/hw/nitro/meson.build
@@ -0,0 +1 @@
+system_ss.add(when: 'CONFIG_NITRO_VSOCK_BUS', if_true: files('nitro-vsock-bus.c'))
diff --git a/hw/nitro/trace-events b/hw/nitro/trace-events
new file mode 100644
index 00000000000..9ccc5790487
--- /dev/null
+++ b/hw/nitro/trace-events
@@ -0,0 +1,2 @@
+# See docs/devel/tracing.rst for syntax documentation.
+
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 034/102] accel: Add Nitro Enclaves accelerator
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (32 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 033/102] hw/nitro: Add Nitro Vsock Bus Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 035/102] hw/nitro/nitro-serial-vsock: Nitro Enclaves vsock console Paolo Bonzini
                   ` (67 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexander Graf

From: Alexander Graf <graf@amazon.com>

Nitro Enclaves are a confidential compute technology which
allows a parent instance to carve out resources from itself
and spawn a confidential sibling VM next to itself. Similar
to other confidential compute solutions, this sibling is
controlled by an underlying vmm, but still has a higher level
vmm (QEMU) to implement some of its I/O functionality and
lifecycle.

Add an accelerator to drive this interface. In combination with
follow-on patches to enhance the Nitro Enclaves machine model, this
will allow users to run a Nitro Enclave using QEMU.

Signed-off-by: Alexander Graf <graf@amazon.com>

Link: https://lore.kernel.org/r/20260225220807.33092-5-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 MAINTAINERS                   |   6 +
 meson.build                   |  12 ++
 accel/nitro/trace.h           |   2 +
 include/system/hw_accel.h     |   1 +
 include/system/nitro-accel.h  |  25 +++
 accel/nitro/nitro-accel.c     | 284 ++++++++++++++++++++++++++++++++++
 accel/stubs/nitro-stub.c      |  11 ++
 accel/Kconfig                 |   3 +
 accel/meson.build             |   1 +
 accel/nitro/meson.build       |   3 +
 accel/nitro/trace-events      |   6 +
 accel/stubs/meson.build       |   1 +
 meson_options.txt             |   2 +
 qemu-options.hx               |   8 +-
 scripts/meson-buildoptions.sh |   3 +
 15 files changed, 364 insertions(+), 4 deletions(-)
 create mode 100644 accel/nitro/trace.h
 create mode 100644 include/system/nitro-accel.h
 create mode 100644 accel/nitro/nitro-accel.c
 create mode 100644 accel/stubs/nitro-stub.c
 create mode 100644 accel/nitro/meson.build
 create mode 100644 accel/nitro/trace-events

diff --git a/MAINTAINERS b/MAINTAINERS
index d781fe59bb1..0458980d434 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -586,6 +586,12 @@ F: include/system/mshv.h
 F: include/hw/hyperv/hvgdk*.h
 F: include/hw/hyperv/hvhdk*.h
 
+Nitro Enclaves (native)
+M: Alexander Graf <graf@amazon.com>
+S: Maintained
+F: accel/nitro/
+F: include/system/nitro-accel.h
+
 X86 MSHV CPUs
 M: Magnus Kulke <magnus.kulke@linux.microsoft.com>
 R: Wei Liu <wei.liu@kernel.org>
diff --git a/meson.build b/meson.build
index f3ee08772d4..cbd6d90ce64 100644
--- a/meson.build
+++ b/meson.build
@@ -302,11 +302,13 @@ accelerator_targets += { 'CONFIG_XEN': xen_targets }
 if cpu == 'aarch64'
   accelerator_targets += {
     'CONFIG_HVF': ['aarch64-softmmu'],
+    'CONFIG_NITRO': ['aarch64-softmmu'],
     'CONFIG_WHPX': ['aarch64-softmmu']
   }
 elif cpu == 'x86_64'
   accelerator_targets += {
     'CONFIG_HVF': ['x86_64-softmmu'],
+    'CONFIG_NITRO': ['x86_64-softmmu'],
     'CONFIG_NVMM': ['i386-softmmu', 'x86_64-softmmu'],
     'CONFIG_WHPX': ['i386-softmmu', 'x86_64-softmmu'],
     'CONFIG_MSHV': ['x86_64-softmmu'],
@@ -880,6 +882,11 @@ if get_option('hvf').allowed()
   endif
 endif
 
+nitro = not_found
+if get_option('nitro').allowed() and host_os == 'linux'
+  accelerators += 'CONFIG_NITRO'
+endif
+
 nvmm = not_found
 if host_os == 'netbsd'
   nvmm = cc.find_library('nvmm', required: get_option('nvmm'))
@@ -921,6 +928,9 @@ endif
 if 'CONFIG_HVF' not in accelerators and get_option('hvf').enabled()
   error('HVF not available on this platform')
 endif
+if 'CONFIG_NITRO' not in accelerators and get_option('nitro').enabled()
+  error('NITRO not available on this platform')
+endif
 if 'CONFIG_NVMM' not in accelerators and get_option('nvmm').enabled()
   error('NVMM not available on this platform')
 endif
@@ -3590,6 +3600,7 @@ if have_system
     'accel/hvf',
     'accel/kvm',
     'accel/mshv',
+    'accel/nitro',
     'audio',
     'backends',
     'backends/tpm',
@@ -4789,6 +4800,7 @@ endif
 summary_info = {}
 if have_system
   summary_info += {'KVM support':       config_all_accel.has_key('CONFIG_KVM')}
+  summary_info += {'Nitro support':     config_all_accel.has_key('CONFIG_NITRO')}
   summary_info += {'HVF support':       config_all_accel.has_key('CONFIG_HVF')}
   summary_info += {'WHPX support':      config_all_accel.has_key('CONFIG_WHPX')}
   summary_info += {'NVMM support':      config_all_accel.has_key('CONFIG_NVMM')}
diff --git a/accel/nitro/trace.h b/accel/nitro/trace.h
new file mode 100644
index 00000000000..8c5564725dc
--- /dev/null
+++ b/accel/nitro/trace.h
@@ -0,0 +1,2 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#include "trace/trace-accel_nitro.h"
diff --git a/include/system/hw_accel.h b/include/system/hw_accel.h
index 628a50e066e..f0c10b6d805 100644
--- a/include/system/hw_accel.h
+++ b/include/system/hw_accel.h
@@ -17,6 +17,7 @@
 #include "system/mshv.h"
 #include "system/whpx.h"
 #include "system/nvmm.h"
+#include "system/nitro-accel.h"
 
 /**
  * cpu_synchronize_state:
diff --git a/include/system/nitro-accel.h b/include/system/nitro-accel.h
new file mode 100644
index 00000000000..a93aa6fb00d
--- /dev/null
+++ b/include/system/nitro-accel.h
@@ -0,0 +1,25 @@
+/*
+ * Nitro Enclaves accelerator - public interface
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef SYSTEM_NITRO_ACCEL_H
+#define SYSTEM_NITRO_ACCEL_H
+
+#include "qemu/accel.h"
+
+extern bool nitro_allowed;
+
+static inline bool nitro_enabled(void)
+{
+    return nitro_allowed;
+}
+
+#define TYPE_NITRO_ACCEL ACCEL_CLASS_NAME("nitro")
+
+typedef struct NitroAccelState NitroAccelState;
+DECLARE_INSTANCE_CHECKER(NitroAccelState, NITRO_ACCEL,
+                         TYPE_NITRO_ACCEL)
+
+#endif /* SYSTEM_NITRO_ACCEL_H */
diff --git a/accel/nitro/nitro-accel.c b/accel/nitro/nitro-accel.c
new file mode 100644
index 00000000000..a1e97a9162e
--- /dev/null
+++ b/accel/nitro/nitro-accel.c
@@ -0,0 +1,284 @@
+/*
+ * Nitro Enclaves accelerator
+ *
+ * Copyright © 2026 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+ *
+ * Authors:
+ *   Alexander Graf <graf@amazon.com>
+ *
+ * Nitro Enclaves are a confidential compute technology which
+ * allows a parent instance to carve out resources from itself
+ * and spawn a confidential sibling VM next to itself. Similar
+ * to other confidential compute solutions, this sibling is
+ * controlled by an underlying vmm, but still has a higher level
+ * vmm (QEMU) to implement some of its I/O functionality and
+ * lifecycle.
+ *
+ * This accelerator drives /dev/nitro_enclaves to spawn a Nitro
+ * Enclave. It works in tandem with the nitro_enclaves machine
+ * which ensures the correct backend devices are available and
+ * that the initial seed (an EIF file) is loaded at the correct
+ * offset in memory.
+ *
+ * The accel starts the enclave when the machine starts, after
+ * all device setup is finished.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "qapi/visitor.h"
+#include "qemu/module.h"
+#include "qemu/rcu.h"
+#include "qemu/accel.h"
+#include "qemu/guest-random.h"
+#include "qemu/main-loop.h"
+#include "accel/accel-ops.h"
+#include "accel/accel-cpu-ops.h"
+#include "accel/dummy-cpus.h"
+#include "system/cpus.h"
+#include "hw/core/cpu.h"
+#include "hw/core/boards.h"
+#include "hw/nitro/nitro-vsock-bus.h"
+#include "system/ramblock.h"
+#include "system/nitro-accel.h"
+#include "trace.h"
+
+#include <sys/ioctl.h>
+#include "standard-headers/linux/nitro_enclaves.h"
+
+bool nitro_allowed;
+
+typedef struct NitroAccelState {
+    AccelState parent_obj;
+
+    int ne_fd;
+    int enclave_fd;
+    uint64_t slot_uid;
+    uint64_t enclave_cid;
+    bool debug_mode;
+} NitroAccelState;
+
+static int nitro_init_machine(AccelState *as, MachineState *ms)
+{
+    NitroAccelState *s = NITRO_ACCEL(as);
+    uint64_t slot_uid = 0;
+    int ret;
+
+    s->ne_fd = open("/dev/nitro_enclaves", O_RDWR | O_CLOEXEC);
+    if (s->ne_fd < 0) {
+        error_report("nitro: failed to open /dev/nitro_enclaves: %s",
+                     strerror(errno));
+        return -errno;
+    }
+
+    ret = ioctl(s->ne_fd, NE_CREATE_VM, &slot_uid);
+    if (ret < 0) {
+        error_report("nitro: NE_CREATE_VM failed: %s", strerror(errno));
+        close(s->ne_fd);
+        return -errno;
+    }
+    s->enclave_fd = ret;
+    s->slot_uid = slot_uid;
+
+    return 0;
+}
+
+static int nitro_donate_ram_block(RAMBlock *rb, void *opaque)
+{
+    NitroAccelState *s = opaque;
+    struct ne_user_memory_region region = {
+        .flags = 0,
+        .memory_size = rb->used_length,
+        .userspace_addr = (uint64_t)(uintptr_t)rb->host,
+    };
+
+    if (!rb->used_length) {
+        return 0;
+    }
+
+    if (ioctl(s->enclave_fd, NE_SET_USER_MEMORY_REGION, &region) < 0) {
+        error_report("nitro: NE_SET_USER_MEMORY_REGION failed for %s "
+                     "(%" PRIu64 " bytes): %s", rb->idstr, rb->used_length,
+                     strerror(errno));
+        return -errno;
+    }
+    return 0;
+}
+
+/*
+ * Start the Enclave. At this point memory is set up and the EIF is loaded.
+ * This function donates memory, adds vCPUs, and starts the enclave.
+ */
+static void nitro_setup_post(AccelState *as)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+    NitroAccelState *s = NITRO_ACCEL(as);
+    int nr_cpus = ms->smp.cpus;
+    int i, ret;
+    struct ne_enclave_start_info start_info = {
+        .flags = s->debug_mode ? NE_ENCLAVE_DEBUG_MODE : 0,
+        .enclave_cid = s->enclave_cid,
+    };
+
+    ret = qemu_ram_foreach_block(nitro_donate_ram_block, s);
+    if (ret < 0) {
+        error_report("nitro: failed to donate memory");
+        exit(1);
+    }
+
+    for (i = 0; i < nr_cpus; i++) {
+        uint32_t cpu_id = 0;
+        if (ioctl(s->enclave_fd, NE_ADD_VCPU, &cpu_id) < 0) {
+            error_report("nitro: NE_ADD_VCPU failed: %s", strerror(errno));
+            exit(1);
+        }
+    }
+
+    ret = ioctl(s->enclave_fd, NE_START_ENCLAVE, &start_info);
+    if (ret < 0) {
+        switch (errno) {
+        case NE_ERR_NO_MEM_REGIONS_ADDED:
+            error_report("nitro: no memory regions added");
+            break;
+        case NE_ERR_NO_VCPUS_ADDED:
+            error_report("nitro: no vCPUs added");
+            break;
+        case NE_ERR_ENCLAVE_MEM_MIN_SIZE:
+            error_report("nitro: memory is below the minimum "
+                         "required size. Try increasing -m");
+            break;
+        case NE_ERR_FULL_CORES_NOT_USED:
+            error_report("nitro: requires full CPU cores. "
+                         "Try increasing -smp to a multiple of threads "
+                         "per core on this host (e.g. -smp 2)");
+            break;
+        case NE_ERR_NOT_IN_INIT_STATE:
+            error_report("nitro: not in init state");
+            break;
+        case NE_ERR_INVALID_FLAG_VALUE:
+            error_report("nitro: invalid flag value for NE_START_ENCLAVE");
+            break;
+        case NE_ERR_INVALID_ENCLAVE_CID:
+            error_report("nitro: invalid enclave CID");
+            break;
+        default:
+            error_report("nitro: NE_START_ENCLAVE failed: %s (errno %d)",
+                         strerror(errno), errno);
+            break;
+        }
+        exit(1);
+    }
+
+    s->enclave_cid = start_info.enclave_cid;
+    trace_nitro_enclave_started(s->enclave_cid);
+
+    /*
+     * Notify all Nitro vsock bus devices that the enclave has started
+     * and provide them with the CID for vsock connections.
+     */
+    {
+        NitroVsockBridge *bridge = nitro_vsock_bridge_find();
+        Error *err = NULL;
+
+        if (bridge) {
+            nitro_vsock_bridge_start_enclave(bridge,
+                                             (uint32_t)s->enclave_cid, &err);
+            if (err) {
+                error_report_err(err);
+                exit(1);
+            }
+        }
+    }
+}
+
+/* QOM properties */
+
+static bool nitro_get_debug_mode(Object *obj, Error **errp)
+{
+    return NITRO_ACCEL(obj)->debug_mode;
+}
+
+static void nitro_set_debug_mode(Object *obj, bool value, Error **errp)
+{
+    NITRO_ACCEL(obj)->debug_mode = value;
+}
+
+static void nitro_get_enclave_cid(Object *obj, Visitor *v,
+                                  const char *name, void *opaque,
+                                  Error **errp)
+{
+    uint64_t val = NITRO_ACCEL(obj)->enclave_cid;
+    visit_type_uint64(v, name, &val, errp);
+}
+
+static void nitro_set_enclave_cid(Object *obj, Visitor *v,
+                                  const char *name, void *opaque,
+                                  Error **errp)
+{
+    uint64_t val;
+    if (visit_type_uint64(v, name, &val, errp)) {
+        NITRO_ACCEL(obj)->enclave_cid = val;
+    }
+}
+
+static void nitro_accel_class_init(ObjectClass *oc, const void *data)
+{
+    AccelClass *ac = ACCEL_CLASS(oc);
+    ac->name = "Nitro";
+    ac->init_machine = nitro_init_machine;
+    ac->setup_post = nitro_setup_post;
+    ac->allowed = &nitro_allowed;
+
+    object_class_property_add_bool(oc, "debug-mode",
+                                   nitro_get_debug_mode,
+                                   nitro_set_debug_mode);
+    object_class_property_set_description(oc, "debug-mode",
+        "Start enclave in debug mode (enables console output)");
+
+    object_class_property_add(oc, "enclave-cid", "uint64",
+                              nitro_get_enclave_cid,
+                              nitro_set_enclave_cid,
+                              NULL, NULL);
+    object_class_property_set_description(oc, "enclave-cid",
+        "Enclave CID (0 = auto-assigned by Nitro)");
+}
+
+static const TypeInfo nitro_accel_type = {
+    .name = TYPE_NITRO_ACCEL,
+    .parent = TYPE_ACCEL,
+    .instance_size = sizeof(NitroAccelState),
+    .class_init = nitro_accel_class_init,
+};
+module_obj(TYPE_NITRO_ACCEL);
+
+static bool nitro_cpus_are_resettable(void)
+{
+    return false;
+}
+
+static void nitro_accel_ops_class_init(ObjectClass *oc, const void *data)
+{
+    AccelOpsClass *ops = ACCEL_OPS_CLASS(oc);
+    ops->create_vcpu_thread = dummy_start_vcpu_thread;
+    ops->handle_interrupt = generic_handle_interrupt;
+    ops->cpus_are_resettable = nitro_cpus_are_resettable;
+}
+
+static const TypeInfo nitro_accel_ops_type = {
+    .name = ACCEL_OPS_NAME("nitro"),
+    .parent = TYPE_ACCEL_OPS,
+    .class_init = nitro_accel_ops_class_init,
+    .abstract = true,
+};
+module_obj(ACCEL_OPS_NAME("nitro"));
+
+static void nitro_type_init(void)
+{
+    type_register_static(&nitro_accel_type);
+    type_register_static(&nitro_accel_ops_type);
+}
+
+type_init(nitro_type_init);
diff --git a/accel/stubs/nitro-stub.c b/accel/stubs/nitro-stub.c
new file mode 100644
index 00000000000..186c8444f86
--- /dev/null
+++ b/accel/stubs/nitro-stub.c
@@ -0,0 +1,11 @@
+/*
+ * Nitro accel stubs for QEMU
+ *
+ * Copyright © 2026 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+
+bool nitro_allowed;
diff --git a/accel/Kconfig b/accel/Kconfig
index a60f1149238..6d052875ee2 100644
--- a/accel/Kconfig
+++ b/accel/Kconfig
@@ -16,6 +16,9 @@ config KVM
 config MSHV
     bool
 
+config NITRO
+    bool
+
 config XEN
     bool
     select FSDEV_9P if VIRTFS
diff --git a/accel/meson.build b/accel/meson.build
index 289b7420ffa..7da12b9741f 100644
--- a/accel/meson.build
+++ b/accel/meson.build
@@ -12,6 +12,7 @@ if have_system
   subdir('xen')
   subdir('stubs')
   subdir('mshv')
+  subdir('nitro')
 endif
 
 # qtest
diff --git a/accel/nitro/meson.build b/accel/nitro/meson.build
new file mode 100644
index 00000000000..e01c1bab96d
--- /dev/null
+++ b/accel/nitro/meson.build
@@ -0,0 +1,3 @@
+nitro_ss = ss.source_set()
+nitro_ss.add(files('nitro-accel.c'))
+system_ss.add_all(when: 'CONFIG_NITRO', if_true: nitro_ss)
diff --git a/accel/nitro/trace-events b/accel/nitro/trace-events
new file mode 100644
index 00000000000..9673eb5aa22
--- /dev/null
+++ b/accel/nitro/trace-events
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# See docs/devel/tracing.rst for syntax documentation.
+
+# nitro-accel.c
+nitro_enclave_started(uint64_t cid) "nitro: enclave started, CID=%"PRIu64
diff --git a/accel/stubs/meson.build b/accel/stubs/meson.build
index 48eccd1b861..5de4a279ff9 100644
--- a/accel/stubs/meson.build
+++ b/accel/stubs/meson.build
@@ -3,6 +3,7 @@ system_stubs_ss.add(when: 'CONFIG_XEN', if_false: files('xen-stub.c'))
 system_stubs_ss.add(when: 'CONFIG_KVM', if_false: files('kvm-stub.c'))
 system_stubs_ss.add(when: 'CONFIG_TCG', if_false: files('tcg-stub.c'))
 system_stubs_ss.add(when: 'CONFIG_HVF', if_false: files('hvf-stub.c'))
+system_stubs_ss.add(when: 'CONFIG_NITRO', if_false: files('nitro-stub.c'))
 system_stubs_ss.add(when: 'CONFIG_NVMM', if_false: files('nvmm-stub.c'))
 system_stubs_ss.add(when: 'CONFIG_WHPX', if_false: files('whpx-stub.c'))
 system_stubs_ss.add(when: 'CONFIG_MSHV', if_false: files('mshv-stub.c'))
diff --git a/meson_options.txt b/meson_options.txt
index 2836156257a..31d5916cfce 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -79,6 +79,8 @@ option('whpx', type: 'feature', value: 'auto',
        description: 'WHPX acceleration support')
 option('hvf', type: 'feature', value: 'auto',
        description: 'HVF acceleration support')
+option('nitro', type: 'feature', value: 'auto',
+       description: 'Nitro acceleration support')
 option('nvmm', type: 'feature', value: 'auto',
        description: 'NVMM acceleration support')
 option('xen', type: 'feature', value: 'auto',
diff --git a/qemu-options.hx b/qemu-options.hx
index 4043e8ca22b..0da2b4d0348 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -28,7 +28,7 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
     "-machine [type=]name[,prop[=value][,...]]\n"
     "                selects emulated machine ('-machine help' for list)\n"
     "                property accel=accel1[:accel2[:...]] selects accelerator\n"
-    "                supported accelerators are kvm, xen, hvf, nvmm, whpx, mshv or tcg (default: tcg)\n"
+    "                supported accelerators are kvm, xen, hvf, nitro, nvmm, whpx, mshv or tcg (default: tcg)\n"
     "                vmport=on|off|auto controls emulation of vmport (default: auto)\n"
     "                dump-guest-core=on|off include guest memory in a core dump (default=on)\n"
     "                mem-merge=on|off controls memory merge support (default: on)\n"
@@ -67,7 +67,7 @@ SRST
 
     ``accel=accels1[:accels2[:...]]``
         This is used to enable an accelerator. Depending on the target
-        architecture, kvm, xen, hvf, nvmm, whpx, mshv or tcg can be
+        architecture, kvm, xen, hvf, nitro, nvmm, whpx, mshv or tcg can be
         available. By default, tcg is used. If there is more than one
         accelerator specified, the next one is used if the previous one
         fails to initialize.
@@ -228,7 +228,7 @@ ERST
 
 DEF("accel", HAS_ARG, QEMU_OPTION_accel,
     "-accel [accel=]accelerator[,prop[=value][,...]]\n"
-    "                select accelerator (kvm, xen, hvf, nvmm, whpx, mshv or tcg; use 'help' for a list)\n"
+    "                select accelerator (kvm, xen, hvf, nitro, nvmm, whpx, mshv or tcg; use 'help' for a list)\n"
     "                igd-passthru=on|off (enable Xen integrated Intel graphics passthrough, default=off)\n"
     "                kernel-irqchip=on|off|split controls accelerated irqchip support (default=on)\n"
     "                kvm-shadow-mem=size of KVM shadow MMU in bytes\n"
@@ -243,7 +243,7 @@ DEF("accel", HAS_ARG, QEMU_OPTION_accel,
 SRST
 ``-accel name[,prop=value[,...]]``
     This is used to enable an accelerator. Depending on the target
-    architecture, kvm, xen, hvf, nvmm, whpx, mshv or tcg can be available.
+    architecture, kvm, xen, hvf, nitro, nvmm, whpx, mshv or tcg can be available.
     By default, tcg is used. If there is more than one accelerator
     specified, the next one is used if the previous one fails to
     initialize.
diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
index e8edc5252a3..ca5b113119a 100644
--- a/scripts/meson-buildoptions.sh
+++ b/scripts/meson-buildoptions.sh
@@ -158,6 +158,7 @@ meson_options_help() {
   printf "%s\n" '  multiprocess    Out of process device emulation support'
   printf "%s\n" '  netmap          netmap network backend support'
   printf "%s\n" '  nettle          nettle cryptography support'
+  printf "%s\n" '  nitro           Nitro acceleration support'
   printf "%s\n" '  numa            libnuma support'
   printf "%s\n" '  nvmm            NVMM acceleration support'
   printf "%s\n" '  opengl          OpenGL support'
@@ -418,6 +419,8 @@ _meson_option_parse() {
     --disable-netmap) printf "%s" -Dnetmap=disabled ;;
     --enable-nettle) printf "%s" -Dnettle=enabled ;;
     --disable-nettle) printf "%s" -Dnettle=disabled ;;
+    --enable-nitro) printf "%s" -Dnitro=enabled ;;
+    --disable-nitro) printf "%s" -Dnitro=disabled ;;
     --enable-numa) printf "%s" -Dnuma=enabled ;;
     --disable-numa) printf "%s" -Dnuma=disabled ;;
     --enable-nvmm) printf "%s" -Dnvmm=enabled ;;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 035/102] hw/nitro/nitro-serial-vsock: Nitro Enclaves vsock console
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (33 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 034/102] accel: Add Nitro Enclaves accelerator Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 036/102] hw/nitro: Introduce Nitro Enclave Heartbeat device Paolo Bonzini
                   ` (66 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexander Graf

From: Alexander Graf <graf@amazon.com>

Nitro Enclaves support a special "debug" mode. When in debug mode, the
Nitro Hypervisor provides a vsock port that the parent can connect to to
receive serial console output of the Enclave. Add a new nitro-serial-vsock
driver that implements short-circuit logic to establish the vsock
connection to that port and feed its data into a chardev, so that a machine
model can use it as serial device.

Signed-off-by: Alexander Graf <graf@amazon.com>

Link: https://lore.kernel.org/r/20260225220807.33092-6-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/hw/nitro/serial-vsock.h |  24 +++++++
 hw/nitro/serial-vsock.c         | 123 ++++++++++++++++++++++++++++++++
 hw/nitro/Kconfig                |   4 ++
 hw/nitro/meson.build            |   1 +
 hw/nitro/trace-events           |   2 +
 5 files changed, 154 insertions(+)
 create mode 100644 include/hw/nitro/serial-vsock.h
 create mode 100644 hw/nitro/serial-vsock.c

diff --git a/include/hw/nitro/serial-vsock.h b/include/hw/nitro/serial-vsock.h
new file mode 100644
index 00000000000..c365880e110
--- /dev/null
+++ b/include/hw/nitro/serial-vsock.h
@@ -0,0 +1,24 @@
+/*
+ * Nitro Enclave Serial (vsock)
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef HW_CHAR_NITRO_SERIAL_VSOCK_H
+#define HW_CHAR_NITRO_SERIAL_VSOCK_H
+
+#include "hw/nitro/nitro-vsock-bus.h"
+#include "chardev/char-fe.h"
+#include "qom/object.h"
+
+#define TYPE_NITRO_SERIAL_VSOCK "nitro-serial-vsock"
+OBJECT_DECLARE_SIMPLE_TYPE(NitroSerialVsockState, NITRO_SERIAL_VSOCK)
+
+struct NitroSerialVsockState {
+    NitroVsockDevice parent_obj;
+
+    CharFrontend output;    /* chardev to write console output to */
+    CharFrontend vsock;     /* vsock chardev to enclave console */
+};
+
+#endif /* HW_CHAR_NITRO_SERIAL_VSOCK_H */
diff --git a/hw/nitro/serial-vsock.c b/hw/nitro/serial-vsock.c
new file mode 100644
index 00000000000..1d56c338049
--- /dev/null
+++ b/hw/nitro/serial-vsock.c
@@ -0,0 +1,123 @@
+/*
+ * Nitro Enclave Vsock Serial
+ *
+ * Copyright © 2026 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+ *
+ * Authors:
+ *   Alexander Graf <graf@amazon.com>
+ *
+ * With Nitro Enclaves in debug mode, the Nitro Hypervisor provides a vsock
+ * port that the parent can connect to to receive serial console output of
+ * the Enclave. This driver implements short-circuit logic to establish the
+ * vsock connection to that port and feed its data into a chardev, so that
+ * a machine model can use it as serial device.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "chardev/char.h"
+#include "chardev/char-fe.h"
+#include "hw/core/qdev-properties.h"
+#include "hw/core/qdev-properties-system.h"
+#include "hw/nitro/serial-vsock.h"
+#include "trace.h"
+
+#define CONSOLE_PORT_START 10000
+#define VMADDR_CID_HYPERVISOR_STR "0"
+
+static int nitro_serial_vsock_can_read(void *opaque)
+{
+    NitroSerialVsockState *s = opaque;
+
+    /* Refuse vsock input until the output backend is ready */
+    return qemu_chr_fe_backend_open(&s->output) ? 4096 : 0;
+}
+
+static void nitro_serial_vsock_read(void *opaque, const uint8_t *buf, int size)
+{
+    NitroSerialVsockState *s = opaque;
+
+    /* Forward all vsock data to the output chardev */
+    qemu_chr_fe_write_all(&s->output, buf, size);
+}
+
+static void nitro_serial_vsock_event(void *opaque, QEMUChrEvent event)
+{
+    /* No need to action on connect/disconnect events, but trace for debug */
+    trace_nitro_serial_vsock_event(event);
+}
+
+static void nitro_serial_vsock_enclave_started(NitroVsockDevice *dev,
+                                               uint32_t enclave_cid,
+                                               Error **errp)
+{
+    NitroSerialVsockState *s = NITRO_SERIAL_VSOCK(dev);
+    uint32_t port = enclave_cid + CONSOLE_PORT_START;
+    g_autofree char *chardev_id = NULL;
+    Chardev *chr;
+    ChardevBackend *backend;
+    ChardevSocket *sock;
+
+    /*
+     * We know the Enclave CID to connect to now. Create a vsock
+     * client chardev that connects to the Enclave's console.
+     */
+    chardev_id = g_strdup_printf("nitro-console-%u", enclave_cid);
+
+    backend = g_new0(ChardevBackend, 1);
+    backend->type = CHARDEV_BACKEND_KIND_SOCKET;
+    sock = backend->u.socket.data = g_new0(ChardevSocket, 1);
+    sock->addr = g_new0(SocketAddressLegacy, 1);
+    sock->addr->type = SOCKET_ADDRESS_TYPE_VSOCK;
+    sock->addr->u.vsock.data = g_new0(VsockSocketAddress, 1);
+    sock->addr->u.vsock.data->cid = g_strdup(VMADDR_CID_HYPERVISOR_STR);
+    sock->addr->u.vsock.data->port = g_strdup_printf("%u", port);
+    sock->server = false;
+    sock->has_server = true;
+
+    chr = qemu_chardev_new(chardev_id, TYPE_CHARDEV_SOCKET,
+                           backend, NULL, errp);
+    if (!chr) {
+        return;
+    }
+
+    if (!qemu_chr_fe_init(&s->vsock, chr, errp)) {
+        return;
+    }
+
+    qemu_chr_fe_set_handlers(&s->vsock,
+                             nitro_serial_vsock_can_read,
+                             nitro_serial_vsock_read,
+                             nitro_serial_vsock_event,
+                             NULL, s, NULL, true);
+}
+
+static const Property nitro_serial_vsock_props[] = {
+    DEFINE_PROP_CHR("chardev", NitroSerialVsockState, output),
+};
+
+static void nitro_serial_vsock_class_init(ObjectClass *oc, const void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+    NitroVsockDeviceClass *ndc = NITRO_VSOCK_DEVICE_CLASS(oc);
+
+    device_class_set_props(dc, nitro_serial_vsock_props);
+    ndc->enclave_started = nitro_serial_vsock_enclave_started;
+}
+
+static const TypeInfo nitro_serial_vsock_info = {
+    .name = TYPE_NITRO_SERIAL_VSOCK,
+    .parent = TYPE_NITRO_VSOCK_DEVICE,
+    .instance_size = sizeof(NitroSerialVsockState),
+    .class_init = nitro_serial_vsock_class_init,
+};
+
+static void nitro_serial_vsock_register(void)
+{
+    type_register_static(&nitro_serial_vsock_info);
+}
+
+type_init(nitro_serial_vsock_register);
diff --git a/hw/nitro/Kconfig b/hw/nitro/Kconfig
index 767472cb2c6..ce24c09c218 100644
--- a/hw/nitro/Kconfig
+++ b/hw/nitro/Kconfig
@@ -1,2 +1,6 @@
 config NITRO_VSOCK_BUS
     bool
+
+config NITRO_SERIAL_VSOCK
+    bool
+    depends on NITRO_VSOCK_BUS
diff --git a/hw/nitro/meson.build b/hw/nitro/meson.build
index 7e2807f1379..76399d4265d 100644
--- a/hw/nitro/meson.build
+++ b/hw/nitro/meson.build
@@ -1 +1,2 @@
 system_ss.add(when: 'CONFIG_NITRO_VSOCK_BUS', if_true: files('nitro-vsock-bus.c'))
+system_ss.add(when: 'CONFIG_NITRO_SERIAL_VSOCK', if_true: files('serial-vsock.c'))
diff --git a/hw/nitro/trace-events b/hw/nitro/trace-events
index 9ccc5790487..20617a024a9 100644
--- a/hw/nitro/trace-events
+++ b/hw/nitro/trace-events
@@ -1,2 +1,4 @@
 # See docs/devel/tracing.rst for syntax documentation.
 
+# serial-vsock.c
+nitro_serial_vsock_event(int event) "event %d"
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 036/102] hw/nitro: Introduce Nitro Enclave Heartbeat device
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (34 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 035/102] hw/nitro/nitro-serial-vsock: Nitro Enclaves vsock console Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 037/102] target/arm/cpu64: Allow -host for nitro Paolo Bonzini
                   ` (65 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexander Graf

From: Alexander Graf <graf@amazon.com>

Nitro Enclaves expect the parent instance to host a vsock heartbeat listener
at port 9000. To host a Nitro Enclave with the nitro accel in QEMU, add
such a heartbeat listener as device model, so that the machine can
easily instantiate it.

Signed-off-by: Alexander Graf <graf@amazon.com>

Link: https://lore.kernel.org/r/20260225220807.33092-7-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/hw/nitro/heartbeat.h |  24 ++++++++
 hw/nitro/heartbeat.c         | 115 +++++++++++++++++++++++++++++++++++
 hw/nitro/Kconfig             |   4 ++
 hw/nitro/meson.build         |   1 +
 hw/nitro/trace-events        |   4 ++
 5 files changed, 148 insertions(+)
 create mode 100644 include/hw/nitro/heartbeat.h
 create mode 100644 hw/nitro/heartbeat.c

diff --git a/include/hw/nitro/heartbeat.h b/include/hw/nitro/heartbeat.h
new file mode 100644
index 00000000000..6b9271a47df
--- /dev/null
+++ b/include/hw/nitro/heartbeat.h
@@ -0,0 +1,24 @@
+/*
+ * Nitro Heartbeat device
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef HW_MISC_NITRO_HEARTBEAT_H
+#define HW_MISC_NITRO_HEARTBEAT_H
+
+#include "hw/nitro/nitro-vsock-bus.h"
+#include "chardev/char-fe.h"
+#include "qom/object.h"
+
+#define TYPE_NITRO_HEARTBEAT "nitro-heartbeat"
+OBJECT_DECLARE_SIMPLE_TYPE(NitroHeartbeatState, NITRO_HEARTBEAT)
+
+struct NitroHeartbeatState {
+    NitroVsockDevice parent_obj;
+
+    CharFrontend vsock;     /* vsock server chardev for heartbeat */
+    bool done;
+};
+
+#endif /* HW_MISC_NITRO_HEARTBEAT_H */
diff --git a/hw/nitro/heartbeat.c b/hw/nitro/heartbeat.c
new file mode 100644
index 00000000000..dc413232667
--- /dev/null
+++ b/hw/nitro/heartbeat.c
@@ -0,0 +1,115 @@
+/*
+ * Nitro Enclave Heartbeat device
+ *
+ * Copyright © 2026 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+ *
+ * Authors:
+ *   Alexander Graf <graf@amazon.com>
+ *
+ * The Nitro Enclave init process sends a heartbeat byte (0xB7) to
+ * CID 3 (parent) port 9000 on boot to signal it reached initramfs.
+ * The parent must accept the connection, read the byte, and echo it
+ * back. If the enclave init cannot reach the listener, it exits.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "chardev/char.h"
+#include "chardev/char-fe.h"
+#include "hw/nitro/heartbeat.h"
+#include "trace.h"
+
+#define HEARTBEAT_PORT      9000
+#define VMADDR_CID_ANY_STR  "4294967295"
+
+static int nitro_heartbeat_can_read(void *opaque)
+{
+    NitroHeartbeatState *s = opaque;
+
+    /* One-shot protocol: stop reading after the first heartbeat */
+    return s->done ? 0 : 1;
+}
+
+static void nitro_heartbeat_read(void *opaque, const uint8_t *buf, int size)
+{
+    NitroHeartbeatState *s = opaque;
+
+    if (s->done || size < 1) {
+        return;
+    }
+
+    /* Echo the heartbeat byte back and disconnect */
+    qemu_chr_fe_write_all(&s->vsock, buf, 1);
+    s->done = true;
+    qemu_chr_fe_deinit(&s->vsock, true);
+
+    trace_nitro_heartbeat_done();
+}
+
+static void nitro_heartbeat_event(void *opaque, QEMUChrEvent event)
+{
+    trace_nitro_heartbeat_event(event);
+}
+
+static void nitro_heartbeat_realize(DeviceState *dev, Error **errp)
+{
+    NitroHeartbeatState *s = NITRO_HEARTBEAT(dev);
+    g_autofree char *chardev_id = NULL;
+    Chardev *chr;
+    ChardevBackend *backend;
+    ChardevSocket *sock;
+
+    chardev_id = g_strdup_printf("nitro-heartbeat");
+
+    backend = g_new0(ChardevBackend, 1);
+    backend->type = CHARDEV_BACKEND_KIND_SOCKET;
+    sock = backend->u.socket.data = g_new0(ChardevSocket, 1);
+    sock->addr = g_new0(SocketAddressLegacy, 1);
+    sock->addr->type = SOCKET_ADDRESS_TYPE_VSOCK;
+    sock->addr->u.vsock.data = g_new0(VsockSocketAddress, 1);
+    sock->addr->u.vsock.data->cid = g_strdup(VMADDR_CID_ANY_STR);
+    sock->addr->u.vsock.data->port = g_strdup_printf("%u", HEARTBEAT_PORT);
+    sock->server = true;
+    sock->has_server = true;
+    sock->wait = false;
+    sock->has_wait = true;
+
+    chr = qemu_chardev_new(chardev_id, TYPE_CHARDEV_SOCKET,
+                           backend, NULL, errp);
+    if (!chr) {
+        return;
+    }
+
+    if (!qemu_chr_fe_init(&s->vsock, chr, errp)) {
+        return;
+    }
+
+    qemu_chr_fe_set_handlers(&s->vsock,
+                             nitro_heartbeat_can_read,
+                             nitro_heartbeat_read,
+                             nitro_heartbeat_event,
+                             NULL, s, NULL, true);
+}
+
+static void nitro_heartbeat_class_init(ObjectClass *oc, const void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+
+    dc->realize = nitro_heartbeat_realize;
+}
+
+static const TypeInfo nitro_heartbeat_info = {
+    .name = TYPE_NITRO_HEARTBEAT,
+    .parent = TYPE_NITRO_VSOCK_DEVICE,
+    .instance_size = sizeof(NitroHeartbeatState),
+    .class_init = nitro_heartbeat_class_init,
+};
+
+static void nitro_heartbeat_register(void)
+{
+    type_register_static(&nitro_heartbeat_info);
+}
+
+type_init(nitro_heartbeat_register);
diff --git a/hw/nitro/Kconfig b/hw/nitro/Kconfig
index ce24c09c218..d3fbc7b683c 100644
--- a/hw/nitro/Kconfig
+++ b/hw/nitro/Kconfig
@@ -4,3 +4,7 @@ config NITRO_VSOCK_BUS
 config NITRO_SERIAL_VSOCK
     bool
     depends on NITRO_VSOCK_BUS
+
+config NITRO_HEARTBEAT
+    bool
+    depends on NITRO_VSOCK_BUS
diff --git a/hw/nitro/meson.build b/hw/nitro/meson.build
index 76399d4265d..381c1ee6c15 100644
--- a/hw/nitro/meson.build
+++ b/hw/nitro/meson.build
@@ -1,2 +1,3 @@
 system_ss.add(when: 'CONFIG_NITRO_VSOCK_BUS', if_true: files('nitro-vsock-bus.c'))
 system_ss.add(when: 'CONFIG_NITRO_SERIAL_VSOCK', if_true: files('serial-vsock.c'))
+system_ss.add(when: 'CONFIG_NITRO_HEARTBEAT', if_true: files('heartbeat.c'))
diff --git a/hw/nitro/trace-events b/hw/nitro/trace-events
index 20617a024a9..311ab78e699 100644
--- a/hw/nitro/trace-events
+++ b/hw/nitro/trace-events
@@ -2,3 +2,7 @@
 
 # serial-vsock.c
 nitro_serial_vsock_event(int event) "event %d"
+
+# heartbeat.c
+nitro_heartbeat_event(int event) "event %d"
+nitro_heartbeat_done(void) "enclave heartbeat received"
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 037/102] target/arm/cpu64: Allow -host for nitro
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (35 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 036/102] hw/nitro: Introduce Nitro Enclave Heartbeat device Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 038/102] hw/nitro: Add nitro machine Paolo Bonzini
                   ` (64 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexander Graf

From: Alexander Graf <graf@amazon.com>

The nitro accel does not actually make use of CPU emulation or details:
It always uses the host CPU regardless of configuration. Machines for
the nitro accel select the host CPU type as default to have a clear
statement of the above and to have a unified cpu type across all
supported architectures.

The arm64 logic on Linux currently only allows -cpu host for KVM based
virtual machines. Add a special case for nitro so that when the nitro
accel is active, it allows use of the host cpu type.

Signed-off-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20260225220807.33092-8-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/arm/cpu64.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 58215216c55..d6feba220e8 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -813,6 +813,14 @@ static void aarch64_a53_initfn(Object *obj)
 static void aarch64_host_initfn(Object *obj)
 {
     ARMCPU *cpu = ARM_CPU(obj);
+
+#if defined(CONFIG_NITRO)
+    if (nitro_enabled()) {
+        /* The nitro accel uses -cpu host, but does not actually consume it */
+        return;
+    }
+#endif
+
 #if defined(CONFIG_KVM)
     kvm_arm_set_cpu_features_from_host(cpu);
     aarch64_add_sve_properties(obj);
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 038/102] hw/nitro: Add nitro machine
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (36 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 037/102] target/arm/cpu64: Allow -host for nitro Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 039/102] hw/core/eif: Move definitions to header Paolo Bonzini
                   ` (63 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexander Graf

From: Alexander Graf <graf@amazon.com>

Add a machine model to spawn a Nitro Enclave. Unlike the existing -M
nitro-enclave, this machine model works exclusively with the -accel
nitro accelerator to drive real Nitro Enclave creation. It supports
memory allocation, number of CPU selection, both x86_64 as well as
aarch64, implements the Enclave heartbeat logic and debug serial
console.

To use it, create an EIF file and run

  $ qemu-system-x86_64 -accel nitro,debug-mode=on -M nitro -nographic \
                       -kernel test.eif

or

  $ qemu-system-aarch64 -accel nitro,debug-mode=on -M nitro -nographic \
                        -kernel test.eif

Signed-off-by: Alexander Graf <graf@amazon.com>

Link: https://lore.kernel.org/r/20260225220807.33092-9-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/hw/nitro/machine.h |  20 +++++
 hw/nitro/machine.c         | 161 +++++++++++++++++++++++++++++++++++++
 tests/qtest/libqtest.c     |   1 +
 hw/nitro/Kconfig           |   8 ++
 hw/nitro/meson.build       |   1 +
 5 files changed, 191 insertions(+)
 create mode 100644 include/hw/nitro/machine.h
 create mode 100644 hw/nitro/machine.c

diff --git a/include/hw/nitro/machine.h b/include/hw/nitro/machine.h
new file mode 100644
index 00000000000..d78ba7d6dc3
--- /dev/null
+++ b/include/hw/nitro/machine.h
@@ -0,0 +1,20 @@
+/*
+ * Nitro Enclaves (accel) machine
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef HW_NITRO_MACHINE_H
+#define HW_NITRO_MACHINE_H
+
+#include "hw/core/boards.h"
+#include "qom/object.h"
+
+#define TYPE_NITRO_MACHINE MACHINE_TYPE_NAME("nitro")
+OBJECT_DECLARE_SIMPLE_TYPE(NitroMachineState, NITRO_MACHINE)
+
+struct NitroMachineState {
+    MachineState parent;
+};
+
+#endif /* HW_NITRO_MACHINE_H */
diff --git a/hw/nitro/machine.c b/hw/nitro/machine.c
new file mode 100644
index 00000000000..e28c8e9bf50
--- /dev/null
+++ b/hw/nitro/machine.c
@@ -0,0 +1,161 @@
+/*
+ * Nitro Enclaves (accel) machine
+ *
+ * Copyright © 2026 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+ *
+ * Authors:
+ *   Alexander Graf <graf@amazon.com>
+ *
+ * Nitro Enclaves machine model for -accel nitro. This machine behaves
+ * like the nitro-enclave machine, but uses the real Nitro Enclaves
+ * backend to launch the virtual machine. It requires use of the -accel
+ * nitro.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "qom/object_interfaces.h"
+#include "chardev/char.h"
+#include "hw/core/boards.h"
+#include "hw/core/cpu.h"
+#include "hw/core/qdev-properties-system.h"
+#include "hw/nitro/heartbeat.h"
+#include "hw/nitro/machine.h"
+#include "hw/nitro/nitro-vsock-bus.h"
+#include "hw/nitro/serial-vsock.h"
+#include "system/address-spaces.h"
+#include "system/hostmem.h"
+#include "system/system.h"
+#include "system/nitro-accel.h"
+#include "qemu/accel.h"
+#include "hw/arm/machines-qom.h"
+
+#define EIF_LOAD_ADDR   (8 * 1024 * 1024)
+
+static void nitro_machine_init(MachineState *machine)
+{
+    const char *eif_path = machine->kernel_filename;
+    const char *cpu_type = machine->cpu_type;
+    g_autofree char *eif_data = NULL;
+    gsize eif_size;
+
+    if (!nitro_enabled()) {
+        error_report("The 'nitro' machine requires -accel nitro");
+        exit(1);
+    }
+
+    if (!cpu_type) {
+        ObjectClass *oc = cpu_class_by_name(target_cpu_type(), "host");
+
+        if (!oc) {
+            error_report("nitro: no 'host' CPU available");
+            exit(1);
+        }
+        cpu_type = object_class_get_name(oc);
+    }
+
+    if (!eif_path) {
+        error_report("nitro: -kernel <eif-file> is required");
+        exit(1);
+    }
+
+    /* Expose memory as normal QEMU RAM. Needs to be huge page backed. */
+    memory_region_add_subregion(get_system_memory(), 0, machine->ram);
+
+    /*
+     * Load EIF (-kernel) as raw blob at the EIF_LOAD_ADDR into guest RAM.
+     * The Nitro Hypervisor will extract its contents and bootstrap the
+     * Enclave from it.
+     */
+    if (!g_file_get_contents(eif_path, &eif_data, &eif_size, NULL)) {
+        error_report("nitro: failed to read EIF '%s'", eif_path);
+        exit(1);
+    }
+    address_space_write(&address_space_memory, EIF_LOAD_ADDR,
+                        MEMTXATTRS_UNSPECIFIED, eif_data, eif_size);
+
+    if (defaults_enabled()) {
+        NitroVsockBridge *bridge = nitro_vsock_bridge_create();
+
+        /* Nitro Enclaves require a heartbeat device. Provide one. */
+        qdev_realize(qdev_new(TYPE_NITRO_HEARTBEAT),
+                     BUS(&bridge->bus), &error_fatal);
+
+        /*
+         * In debug mode, Nitro Enclaves expose the guest's serial output via
+         * vsock. When the accel is in debug mode, wire the vsock serial to
+         * the machine's serial port so that -nographic automatically works
+         */
+        if (object_property_get_bool(OBJECT(current_accel()), "debug-mode", NULL)) {
+            Chardev *chr = serial_hd(0);
+
+            if (chr) {
+                DeviceState *dev = qdev_new(TYPE_NITRO_SERIAL_VSOCK);
+
+                qdev_prop_set_chr(dev, "chardev", chr);
+                qdev_realize(dev, BUS(&bridge->bus), &error_fatal);
+            }
+        }
+    }
+}
+
+static bool nitro_create_memfd_backend(MachineState *ms, const char *path,
+                                       Error **errp)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
+    Object *root = object_get_objects_root();
+    Object *obj;
+    bool r = false;
+
+    obj = object_new(TYPE_MEMORY_BACKEND_MEMFD);
+
+    /* Nitro Enclaves require huge page backing */
+    if (!object_property_set_int(obj, "size", ms->ram_size, errp) ||
+        !object_property_set_bool(obj, "hugetlb", true, errp)) {
+        goto out;
+    }
+
+    object_property_add_child(root, mc->default_ram_id, obj);
+
+    if (!user_creatable_complete(USER_CREATABLE(obj), errp)) {
+        goto out;
+    }
+    r = object_property_set_link(OBJECT(ms), "memory-backend", obj, errp);
+
+out:
+    object_unref(obj);
+    return r;
+}
+
+static void nitro_machine_class_init(ObjectClass *oc, const void *data)
+{
+    MachineClass *mc = MACHINE_CLASS(oc);
+
+    mc->desc = "Nitro Enclave";
+    mc->init = nitro_machine_init;
+    mc->create_default_memdev = nitro_create_memfd_backend;
+    mc->default_ram_id = "ram";
+    mc->max_cpus = 4096;
+}
+
+static const TypeInfo nitro_machine_info = {
+    .name = TYPE_NITRO_MACHINE,
+    .parent = TYPE_MACHINE,
+    .instance_size = sizeof(NitroMachineState),
+    .class_init = nitro_machine_class_init,
+    .interfaces = (const InterfaceInfo[]) {
+        /* x86_64 and aarch64 only */
+        { TYPE_TARGET_AARCH64_MACHINE },
+        { }
+    },
+};
+
+static void nitro_machine_register(void)
+{
+    type_register_static(&nitro_machine_info);
+}
+
+type_init(nitro_machine_register);
diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
index 794d8700857..051faf31e14 100644
--- a/tests/qtest/libqtest.c
+++ b/tests/qtest/libqtest.c
@@ -1815,6 +1815,7 @@ void qtest_cb_for_every_machine(void (*cb)(const char *machine),
             g_str_equal("xenpv", machines[i].name) ||
             g_str_equal("xenpvh", machines[i].name) ||
             g_str_equal("vmapple", machines[i].name) ||
+            g_str_equal("nitro", machines[i].name) ||
             g_str_equal("nitro-enclave", machines[i].name)) {
             continue;
         }
diff --git a/hw/nitro/Kconfig b/hw/nitro/Kconfig
index d3fbc7b683c..cfae85920a0 100644
--- a/hw/nitro/Kconfig
+++ b/hw/nitro/Kconfig
@@ -8,3 +8,11 @@ config NITRO_SERIAL_VSOCK
 config NITRO_HEARTBEAT
     bool
     depends on NITRO_VSOCK_BUS
+
+config NITRO_MACHINE
+    bool
+    default y
+    depends on NITRO
+    select NITRO_VSOCK_BUS
+    select NITRO_HEARTBEAT
+    select NITRO_SERIAL_VSOCK
diff --git a/hw/nitro/meson.build b/hw/nitro/meson.build
index 381c1ee6c15..e3f18958906 100644
--- a/hw/nitro/meson.build
+++ b/hw/nitro/meson.build
@@ -1,3 +1,4 @@
 system_ss.add(when: 'CONFIG_NITRO_VSOCK_BUS', if_true: files('nitro-vsock-bus.c'))
 system_ss.add(when: 'CONFIG_NITRO_SERIAL_VSOCK', if_true: files('serial-vsock.c'))
 system_ss.add(when: 'CONFIG_NITRO_HEARTBEAT', if_true: files('heartbeat.c'))
+system_ss.add(when: 'CONFIG_NITRO_MACHINE', if_true: files('machine.c'))
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 039/102] hw/core/eif: Move definitions to header
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (37 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 038/102] hw/nitro: Add nitro machine Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 040/102] hw/nitro: Enable direct kernel boot Paolo Bonzini
                   ` (62 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexander Graf, Dorjoy Chowdhury

From: Alexander Graf <graf@amazon.com>

In follow-up patches we need some EIF file definitions that are
currently in the eif.c file, but want to access them from a separate
device. Move them into the header instead.

Signed-off-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Link: https://lore.kernel.org/r/20260225220807.33092-10-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/core/eif.h | 38 ++++++++++++++++++++++++++++++++++++++
 hw/core/eif.c | 38 --------------------------------------
 2 files changed, 38 insertions(+), 38 deletions(-)

diff --git a/hw/core/eif.h b/hw/core/eif.h
index fed3cb55140..a3412377a90 100644
--- a/hw/core/eif.h
+++ b/hw/core/eif.h
@@ -11,6 +11,44 @@
 #ifndef HW_CORE_EIF_H
 #define HW_CORE_EIF_H
 
+#define MAX_SECTIONS 32
+
+/* members are ordered according to field order in .eif file */
+typedef struct EifHeader {
+    uint8_t  magic[4]; /* must be .eif in ascii i.e., [46, 101, 105, 102] */
+    uint16_t version;
+    uint16_t flags;
+    uint64_t default_memory;
+    uint64_t default_cpus;
+    uint16_t reserved;
+    uint16_t section_cnt;
+    uint64_t section_offsets[MAX_SECTIONS];
+    uint64_t section_sizes[MAX_SECTIONS];
+    uint32_t unused;
+    uint32_t eif_crc32;
+} QEMU_PACKED EifHeader;
+
+/* members are ordered according to field order in .eif file */
+typedef struct EifSectionHeader {
+    /*
+     * 0 = invalid, 1 = kernel, 2 = cmdline, 3 = ramdisk, 4 = signature,
+     * 5 = metadata
+     */
+    uint16_t section_type;
+    uint16_t flags;
+    uint64_t section_size;
+} QEMU_PACKED EifSectionHeader;
+
+enum EifSectionTypes {
+    EIF_SECTION_INVALID = 0,
+    EIF_SECTION_KERNEL = 1,
+    EIF_SECTION_CMDLINE = 2,
+    EIF_SECTION_RAMDISK = 3,
+    EIF_SECTION_SIGNATURE = 4,
+    EIF_SECTION_METADATA = 5,
+    EIF_SECTION_MAX = 6,
+};
+
 bool read_eif_file(const char *eif_path, const char *machine_initrd,
                    char **kernel_path, char **initrd_path,
                    char **kernel_cmdline, uint8_t *image_sha384,
diff --git a/hw/core/eif.c b/hw/core/eif.c
index 513caec6b49..96f1d765785 100644
--- a/hw/core/eif.c
+++ b/hw/core/eif.c
@@ -18,44 +18,6 @@
 
 #include "hw/core/eif.h"
 
-#define MAX_SECTIONS 32
-
-/* members are ordered according to field order in .eif file */
-typedef struct EifHeader {
-    uint8_t  magic[4]; /* must be .eif in ascii i.e., [46, 101, 105, 102] */
-    uint16_t version;
-    uint16_t flags;
-    uint64_t default_memory;
-    uint64_t default_cpus;
-    uint16_t reserved;
-    uint16_t section_cnt;
-    uint64_t section_offsets[MAX_SECTIONS];
-    uint64_t section_sizes[MAX_SECTIONS];
-    uint32_t unused;
-    uint32_t eif_crc32;
-} QEMU_PACKED EifHeader;
-
-/* members are ordered according to field order in .eif file */
-typedef struct EifSectionHeader {
-    /*
-     * 0 = invalid, 1 = kernel, 2 = cmdline, 3 = ramdisk, 4 = signature,
-     * 5 = metadata
-     */
-    uint16_t section_type;
-    uint16_t flags;
-    uint64_t section_size;
-} QEMU_PACKED EifSectionHeader;
-
-enum EifSectionTypes {
-    EIF_SECTION_INVALID = 0,
-    EIF_SECTION_KERNEL = 1,
-    EIF_SECTION_CMDLINE = 2,
-    EIF_SECTION_RAMDISK = 3,
-    EIF_SECTION_SIGNATURE = 4,
-    EIF_SECTION_METADATA = 5,
-    EIF_SECTION_MAX = 6,
-};
-
 static const char *section_type_to_string(uint16_t type)
 {
     const char *str;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 040/102] hw/nitro: Enable direct kernel boot
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (38 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 039/102] hw/core/eif: Move definitions to header Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 041/102] docs: Add Nitro Enclaves documentation Paolo Bonzini
                   ` (61 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexander Graf, Dorjoy Chowdhury

From: Alexander Graf <graf@amazon.com>

Nitro Enclaves can only boot EIF files which are a combination of
kernel, initramfs and cmdline in a single file. When the kernel image is
not an EIF, treat it like a kernel image and assemble an EIF image on
the fly. This way, users can call QEMU with a direct
kernel/initrd/cmdline combination and everything "just works".

Signed-off-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Link: https://lore.kernel.org/r/20260225220807.33092-11-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/core/eif.h        |   3 ++
 hw/nitro/machine.c   | 116 +++++++++++++++++++++++++++++++++++++++++++
 hw/nitro/meson.build |   2 +-
 3 files changed, 120 insertions(+), 1 deletion(-)

diff --git a/hw/core/eif.h b/hw/core/eif.h
index a3412377a90..0c432dbc2dc 100644
--- a/hw/core/eif.h
+++ b/hw/core/eif.h
@@ -12,6 +12,7 @@
 #define HW_CORE_EIF_H
 
 #define MAX_SECTIONS 32
+#define EIF_HDR_ARCH_ARM64 0x1
 
 /* members are ordered according to field order in .eif file */
 typedef struct EifHeader {
@@ -49,6 +50,8 @@ enum EifSectionTypes {
     EIF_SECTION_MAX = 6,
 };
 
+#define EIF_MAGIC { '.', 'e', 'i', 'f' }
+
 bool read_eif_file(const char *eif_path, const char *machine_initrd,
                    char **kernel_path, char **initrd_path,
                    char **kernel_cmdline, uint8_t *image_sha384,
diff --git a/hw/nitro/machine.c b/hw/nitro/machine.c
index e28c8e9bf50..8849959359c 100644
--- a/hw/nitro/machine.c
+++ b/hw/nitro/machine.c
@@ -32,9 +32,104 @@
 #include "system/nitro-accel.h"
 #include "qemu/accel.h"
 #include "hw/arm/machines-qom.h"
+#include "hw/core/eif.h"
+#include <zlib.h> /* for crc32 */
 
 #define EIF_LOAD_ADDR   (8 * 1024 * 1024)
 
+static bool is_eif(char *eif, gsize len)
+{
+    const char eif_magic[] = EIF_MAGIC;
+
+    return len >= sizeof(eif_magic) &&
+           !memcmp(eif, eif_magic, sizeof(eif_magic));
+}
+
+static void build_eif_section(EifHeader *hdr, GByteArray *buf, uint16_t type,
+                              const char *data, uint64_t size)
+{
+    uint16_t section = be16_to_cpu(hdr->section_cnt);
+    EifSectionHeader shdr = {
+        .section_type = cpu_to_be16(type),
+        .flags = 0,
+        .section_size = cpu_to_be64(size),
+    };
+
+    hdr->section_offsets[section] = cpu_to_be64(buf->len);
+    hdr->section_sizes[section] = cpu_to_be64(size);
+
+    g_byte_array_append(buf, (const uint8_t *)&shdr, sizeof(shdr));
+    if (size) {
+        g_byte_array_append(buf, (const uint8_t *)data, size);
+    }
+
+    hdr->section_cnt = cpu_to_be16(section + 1);
+}
+
+/*
+ * Nitro Enclaves only support loading EIF files. When the user provides
+ * a Linux kernel, initrd and cmdline, convert them into EIF format.
+ */
+static char *build_eif(const char *kernel_data, gsize kernel_size,
+                       const char *initrd_path, const char *cmdline,
+                       gsize *out_size, Error **errp)
+{
+    g_autofree char *initrd_data = NULL;
+    static const char metadata[] = "{}";
+    size_t metadata_len = sizeof(metadata) - 1;
+    gsize initrd_size = 0;
+    GByteArray *buf;
+    EifHeader hdr;
+    uint32_t crc = 0;
+    size_t cmdline_len;
+
+    if (initrd_path) {
+        if (!g_file_get_contents(initrd_path, &initrd_data,
+                                 &initrd_size, NULL)) {
+            error_setg(errp, "Failed to read initrd '%s'", initrd_path);
+            return NULL;
+        }
+    }
+
+    buf = g_byte_array_new();
+
+    cmdline_len = cmdline ? strlen(cmdline) : 0;
+
+    hdr = (EifHeader) {
+        .magic = EIF_MAGIC,
+        .version = cpu_to_be16(4),
+        .flags = cpu_to_be16(target_aarch64() ? EIF_HDR_ARCH_ARM64 : 0),
+    };
+
+    g_byte_array_append(buf, (const uint8_t *)&hdr, sizeof(hdr));
+
+    /* Kernel */
+    build_eif_section(&hdr, buf, EIF_SECTION_KERNEL, kernel_data, kernel_size);
+
+    /* Command line */
+    build_eif_section(&hdr, buf, EIF_SECTION_CMDLINE, cmdline, cmdline_len);
+
+    /* Initramfs */
+    build_eif_section(&hdr, buf, EIF_SECTION_RAMDISK, initrd_data, initrd_size);
+
+    /* Metadata */
+    build_eif_section(&hdr, buf, EIF_SECTION_METADATA, metadata, metadata_len);
+
+    /*
+     * Patch the header into the buffer first (with real section offsets
+     * and sizes), then compute CRC over everything except the CRC field.
+     */
+    memcpy(buf->data, &hdr, sizeof(hdr));
+    crc = crc32(crc, buf->data, offsetof(EifHeader, eif_crc32));
+    crc = crc32(crc, &buf->data[sizeof(hdr)], buf->len - sizeof(hdr));
+
+    /* Finally write the CRC into the in-buffer header */
+    ((EifHeader *)buf->data)->eif_crc32 = cpu_to_be32(crc);
+
+    *out_size = buf->len;
+    return (char *)g_byte_array_free(buf, false);
+}
+
 static void nitro_machine_init(MachineState *machine)
 {
     const char *eif_path = machine->kernel_filename;
@@ -74,6 +169,27 @@ static void nitro_machine_init(MachineState *machine)
         error_report("nitro: failed to read EIF '%s'", eif_path);
         exit(1);
     }
+
+    if (!is_eif(eif_data, eif_size)) {
+        char *kernel_data = eif_data;
+        gsize kernel_size = eif_size;
+        Error *err = NULL;
+
+        /*
+         * The user gave us a non-EIF kernel, likely a Linux kernel image.
+         * Assemble an EIF file from it, the -initrd and the -append arguments,
+         * so that users can perform a natural direct kernel boot.
+         */
+        eif_data = build_eif(kernel_data, kernel_size, machine->initrd_filename,
+                             machine->kernel_cmdline, &eif_size, &err);
+        if (!eif_data) {
+            error_report_err(err);
+            exit(1);
+        }
+
+        g_free(kernel_data);
+    }
+
     address_space_write(&address_space_memory, EIF_LOAD_ADDR,
                         MEMTXATTRS_UNSPECIFIED, eif_data, eif_size);
 
diff --git a/hw/nitro/meson.build b/hw/nitro/meson.build
index e3f18958906..b9bd0d43002 100644
--- a/hw/nitro/meson.build
+++ b/hw/nitro/meson.build
@@ -1,4 +1,4 @@
 system_ss.add(when: 'CONFIG_NITRO_VSOCK_BUS', if_true: files('nitro-vsock-bus.c'))
 system_ss.add(when: 'CONFIG_NITRO_SERIAL_VSOCK', if_true: files('serial-vsock.c'))
 system_ss.add(when: 'CONFIG_NITRO_HEARTBEAT', if_true: files('heartbeat.c'))
-system_ss.add(when: 'CONFIG_NITRO_MACHINE', if_true: files('machine.c'))
+system_ss.add(when: 'CONFIG_NITRO_MACHINE', if_true: [files('machine.c'), zlib])
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 041/102] docs: Add Nitro Enclaves documentation
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (39 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 040/102] hw/nitro: Enable direct kernel boot Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 042/102] i386/kvm: avoid installing duplicate msr entries in msr_handlers Paolo Bonzini
                   ` (60 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexander Graf

From: Alexander Graf <graf@amazon.com>

Now that all pieces are in place to spawn Nitro Enclaves using
a special purpose accelerator and machine model, document how
to use it.

Signed-off-by: Alexander Graf <graf@amazon.com>

Link: https://lore.kernel.org/r/20260225220807.33092-12-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 MAINTAINERS                                |   1 +
 docs/system/confidential-guest-support.rst |   1 +
 docs/system/index.rst                      |   1 +
 docs/system/nitro.rst                      | 133 +++++++++++++++++++++
 4 files changed, 136 insertions(+)
 create mode 100644 docs/system/nitro.rst

diff --git a/MAINTAINERS b/MAINTAINERS
index 0458980d434..72153288c53 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3031,6 +3031,7 @@ M: Alexander Graf <graf@amazon.com>
 S: Maintained
 F: hw/nitro/
 F: include/hw/nitro/
+F: docs/system/nitro.rst
 
 Subsystems
 ----------
diff --git a/docs/system/confidential-guest-support.rst b/docs/system/confidential-guest-support.rst
index 66129fbab64..562a7c3c285 100644
--- a/docs/system/confidential-guest-support.rst
+++ b/docs/system/confidential-guest-support.rst
@@ -41,5 +41,6 @@ Currently supported confidential guest mechanisms are:
 * Intel Trust Domain Extension (TDX) (see :doc:`i386/tdx`)
 * POWER Protected Execution Facility (PEF) (see :ref:`power-papr-protected-execution-facility-pef`)
 * s390x Protected Virtualization (PV) (see :doc:`s390x/protvirt`)
+* AWS Nitro Enclaves (see :doc:`nitro`)
 
 Other mechanisms may be supported in future.
diff --git a/docs/system/index.rst b/docs/system/index.rst
index 427b0204831..d297a952823 100644
--- a/docs/system/index.rst
+++ b/docs/system/index.rst
@@ -39,5 +39,6 @@ or Hypervisor.Framework.
    multi-process
    confidential-guest-support
    igvm
+   nitro
    vm-templating
    sriov
diff --git a/docs/system/nitro.rst b/docs/system/nitro.rst
new file mode 100644
index 00000000000..5907d6153eb
--- /dev/null
+++ b/docs/system/nitro.rst
@@ -0,0 +1,133 @@
+AWS Nitro Enclaves
+==================
+
+`AWS Nitro Enclaves <https://aws.amazon.com/ec2/nitro/nitro-enclaves/>`_
+are isolated compute environments that run alongside EC2 instances.
+They are created by partitioning CPU and memory resources from a parent
+instance and launching a signed Enclave Image Format (EIF) file inside
+a confidential VM managed by the Nitro Hypervisor.
+
+QEMU supports launching Nitro Enclaves on EC2 instances that have
+enclave support enabled, using the ``nitro`` accelerator and the
+``nitro`` machine type.
+
+Prerequisites
+-------------
+
+* An EC2 instance with Nitro Enclaves enabled
+* The ``nitro_enclaves`` kernel module loaded (provides ``/dev/nitro_enclaves``)
+* CPU cores allocated to the Nitro Enclaves pool via ``nitro-enclaves-allocator``
+* Huge pages allocated for Nitro Enclaves via ``nitro-enclaves-allocator``
+
+Quick Start
+-----------
+
+Launch a Nitro Enclave from a pre-built EIF file::
+
+    $ qemu-system-x86_64 -accel nitro,debug-mode=on -M nitro -nographic \
+        -smp 2 -m 512M -kernel enclave.eif
+
+Launch an enclave from individual kernel and initrd files::
+
+    $ qemu-system-x86_64 -accel nitro,debug-mode=on -M nitro -nographic \
+        -smp 2 -m 512M -kernel vmlinuz -initrd initrd.cpio \
+        -append "console=ttyS0"
+
+The same commands work with ``qemu-system-aarch64`` on Graviton based EC2
+instances.
+
+Accelerator
+-----------
+
+The ``nitro`` accelerator (``-accel nitro``) drives the
+``/dev/nitro_enclaves`` device to create and manage a Nitro Enclave.
+It handles:
+
+* Creating the enclave VM slot
+* Donating memory regions (must be huge page backed)
+* Adding vCPUs (must be full physical cores)
+* Starting the enclave
+* Notifying vsock bus devices of the enclave CID
+
+Accelerator options:
+
+``debug-mode=on|off``
+    Enable debug mode. When enabled, the Nitro Hypervisor exposes the
+    enclave's serial console output via a vsock port that the machine
+    model automatically connects to. In debug mode, PCR values are zero.
+    Default is ``off``.
+
+Machine
+-------
+
+The ``nitro`` machine (``-M nitro``) is a minimal, architecture-independent
+machine that provides only what a Nitro Enclave needs:
+
+* RAM (huge page backed via memfd)
+* vCPUs (defaults to ``host`` CPU type)
+* A Nitro vsock bus with:
+
+  - A heartbeat device (vsock server on port 9000)
+  - A serial console bridge (vsock client, debug mode only)
+
+Communication to the Nitro Enclave is limited to virtio-vsock. The Enclave
+is allocated a CID at launch at which it is reachable. A specific CID can
+be requested with ``-accel nitro,enclave-cid=<N>`` (0 lets the hypervisor
+choose). The assigned CID is readable from the vsock bridge device::
+
+    (qemu) qom-get /machine/peripheral/nitro-vsock enclave-cid
+
+EIF Image Format
+^^^^^^^^^^^^^^^^
+
+Nitro Enclaves boot from EIF (Enclave Image Format) files. When
+``-kernel`` points to an EIF file (detected by the ``.eif`` magic
+bytes), it is loaded directly into guest memory.
+
+When ``-kernel`` points to a regular kernel image (e.g. a bzImage or
+Image), the machine automatically assembles a minimal EIF on the fly
+from ``-kernel``, ``-initrd``, and ``-append``. This allows standard
+direct kernel boot without external EIF tooling.
+
+CPU Requirements
+^^^^^^^^^^^^^^^^
+
+Nitro Enclaves require full physical CPU cores. On hyperthreaded
+systems, this means ``-smp`` must be a multiple of the threads per
+core (typically 2).
+
+Nitro Enclaves can only consume cores that are donated to the Nitro Enclave
+CPU pool. You can configure the CPU pool using the ``nitro-enclaves-allocator``
+tool or manually by writing to the nitro_enclaves cpu pool parameter. To
+allocate vCPUs 1, 2 and 3, you can call::
+
+  $ echo 1,2,3 | sudo tee /sys/module/nitro_enclaves/parameters/ne_cpus
+
+Beware that on x86-64 systems, hyperthread siblings are not consecutive
+and must be added in pairs to the pool. Consult tools like ``lstopo``
+or ``lscpu`` for details about your instance's CPU topology.
+
+Memory Requirements
+^^^^^^^^^^^^^^^^^^^
+
+Enclave memory must be huge page backed. The machine automatically
+creates a memfd memory backend with huge pages enabled. To make the
+huge page allocation work, ensure that huge pages are reserved in
+the system. To reserve 1 GiB of memory on a 4 KiB PAGE_SIZE system,
+you can call::
+
+    $ echo 512 | sudo tee /proc/sys/vm/nr_hugepages
+
+Emulated Nitro Enclaves
+-----------------------
+
+In addition to the native Nitro Enclaves invocation, you can also use
+the emulated nitro-enclave machine target (see :doc:`i386/nitro-enclave`)
+which implements the x86 Nitro Enclave device model. While -M nitro
+delegates virtual machine device emulation to the Nitro Hypervisor, -M
+nitro-enclave implements all devices itself, which means it also works
+on non-EC2 instances.
+
+If you require NSM based attestation backed by valid AWS certificates,
+you must use -M nitro. The -M nitro-enclave model does not provide
+you with an AWS signed attestation document.
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 042/102] i386/kvm: avoid installing duplicate msr entries in msr_handlers
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (40 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 041/102] docs: Add Nitro Enclaves documentation Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 043/102] accel/kvm: add confidential class member to indicate guest rebuild capability Paolo Bonzini
                   ` (59 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

kvm_filter_msr() does not check if an msr entry is already present in the
msr_handlers table and installs a new handler unconditionally. If the function
is called again with the same MSR, it will result in duplicate entries in the
table and multiple such calls will fill up the table needlessly. Fix that.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-2-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/kvm/kvm.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index bb8303c39fe..c0be2e932da 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -6273,27 +6273,33 @@ static int kvm_install_msr_filters(KVMState *s)
 static int kvm_filter_msr(KVMState *s, uint32_t msr, QEMURDMSRHandler *rdmsr,
                           QEMUWRMSRHandler *wrmsr)
 {
-    int i, ret;
+    int i, ret = 0;
 
     for (i = 0; i < ARRAY_SIZE(msr_handlers); i++) {
-        if (!msr_handlers[i].msr) {
+        if (msr_handlers[i].msr == msr) {
+            break;
+        } else if (!msr_handlers[i].msr) {
             msr_handlers[i] = (KVMMSRHandlers) {
                 .msr = msr,
                 .rdmsr = rdmsr,
                 .wrmsr = wrmsr,
             };
-
-            ret = kvm_install_msr_filters(s);
-            if (ret) {
-                msr_handlers[i] = (KVMMSRHandlers) { };
-                return ret;
-            }
-
-            return 0;
+            break;
         }
     }
 
-    return -EINVAL;
+    if (i == ARRAY_SIZE(msr_handlers)) {
+        ret = -EINVAL;
+        goto end;
+    }
+
+    ret = kvm_install_msr_filters(s);
+    if (ret) {
+        msr_handlers[i] = (KVMMSRHandlers) { };
+    }
+
+ end:
+    return ret;
 }
 
 static int kvm_handle_rdmsr(X86CPU *cpu, struct kvm_run *run)
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 043/102] accel/kvm: add confidential class member to indicate guest rebuild capability
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (41 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 042/102] i386/kvm: avoid installing duplicate msr entries in msr_handlers Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 044/102] hw/accel: add a per-accelerator callback to change VM accelerator handle Paolo Bonzini
                   ` (58 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

As a part of the confidential guest reset process, the existing encrypted guest
state must be made mutable since it would be discarded after reset. A new
encrypted and locked guest state must be established after the reset. To this
end, a new boolean member per confidential guest support class
(eg, tdx or sev-snp) is added that will indicate whether its possible to
rebuild guest state:

bool can_rebuild_guest_state;

This is true if rebuilding guest state is possible, false otherwise.
A KVM based confidential guest reset is only possible when
the existing state is locked but its possible to rebuild guest state.
Otherwise, the guest is not resettable.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-3-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/system/confidential-guest-support.h | 20 ++++++++++++++++++++
 system/runstate.c                           |  6 +++---
 target/i386/kvm/tdx.c                       |  1 +
 target/i386/sev.c                           |  1 +
 4 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/include/system/confidential-guest-support.h b/include/system/confidential-guest-support.h
index 0cc8b26e644..5dca7173088 100644
--- a/include/system/confidential-guest-support.h
+++ b/include/system/confidential-guest-support.h
@@ -152,6 +152,11 @@ typedef struct ConfidentialGuestSupportClass {
      */
     int (*get_mem_map_entry)(int index, ConfidentialGuestMemoryMapEntry *entry,
                              Error **errp);
+
+    /*
+     * is it possible to rebuild the guest state?
+     */
+    bool can_rebuild_guest_state;
 } ConfidentialGuestSupportClass;
 
 static inline int confidential_guest_kvm_init(ConfidentialGuestSupport *cgs,
@@ -167,6 +172,21 @@ static inline int confidential_guest_kvm_init(ConfidentialGuestSupport *cgs,
     return 0;
 }
 
+static inline bool
+confidential_guest_can_rebuild_state(ConfidentialGuestSupport *cgs)
+{
+    ConfidentialGuestSupportClass *klass;
+
+    if (!cgs) {
+        /* non-confidential guests */
+        return true;
+    }
+
+    klass = CONFIDENTIAL_GUEST_SUPPORT_GET_CLASS(cgs);
+    return klass->can_rebuild_guest_state;
+
+}
+
 static inline int confidential_guest_kvm_reset(ConfidentialGuestSupport *cgs,
                                                Error **errp)
 {
diff --git a/system/runstate.c b/system/runstate.c
index d091a2bdddb..13f32bed8cb 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -57,6 +57,7 @@
 #include "system/reset.h"
 #include "system/runstate.h"
 #include "system/runstate-action.h"
+#include "system/confidential-guest-support.h"
 #include "system/system.h"
 #include "system/tpm.h"
 #include "trace.h"
@@ -543,8 +544,6 @@ void qemu_system_reset(ShutdownCause reason)
      */
     if (cpus_are_resettable()) {
         cpu_synchronize_all_post_reset();
-    } else {
-        assert(runstate_check(RUN_STATE_PRELAUNCH));
     }
 
     vm_set_suspended(false);
@@ -697,7 +696,8 @@ void qemu_system_reset_request(ShutdownCause reason)
     if (reboot_action == REBOOT_ACTION_SHUTDOWN &&
         reason != SHUTDOWN_CAUSE_SUBSYSTEM_RESET) {
         shutdown_requested = reason;
-    } else if (!cpus_are_resettable()) {
+    } else if (!cpus_are_resettable() &&
+               !confidential_guest_can_rebuild_state(current_machine->cgs)) {
         error_report("cpus are not resettable, terminating");
         shutdown_requested = reason;
     } else {
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 01619857685..a3e81e1c0cc 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -1543,6 +1543,7 @@ static void tdx_guest_class_init(ObjectClass *oc, const void *data)
     X86ConfidentialGuestClass *x86_klass = X86_CONFIDENTIAL_GUEST_CLASS(oc);
 
     klass->kvm_init = tdx_kvm_init;
+    klass->can_rebuild_guest_state = true;
     x86_klass->kvm_type = tdx_kvm_type;
     x86_klass->cpu_instance_init = tdx_cpu_instance_init;
     x86_klass->adjust_cpuid_features = tdx_adjust_cpuid_features;
diff --git a/target/i386/sev.c b/target/i386/sev.c
index acdcb9c4e68..66e38ca32e1 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -2760,6 +2760,7 @@ sev_common_instance_init(Object *obj)
     cgs->set_guest_state = cgs_set_guest_state;
     cgs->get_mem_map_entry = cgs_get_mem_map_entry;
     cgs->set_guest_policy = cgs_set_guest_policy;
+    cgs->can_rebuild_guest_state = true;
 
     QTAILQ_INIT(&sev_common->launch_vmsa);
 }
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 044/102] hw/accel: add a per-accelerator callback to change VM accelerator handle
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (42 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 043/102] accel/kvm: add confidential class member to indicate guest rebuild capability Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 045/102] system/physmem: add helper to reattach existing memory after KVM VM fd change Paolo Bonzini
                   ` (57 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

When a confidential virtual machine is reset, a new guest context in the
accelerator must be generated post reset. Therefore, the old accelerator guest
file handle must be closed and a new one created. To this end, a per-accelerator
callback, "rebuild_guest" is introduced that would get called when a confidential
guest is reset. Subsequent patches will introduce specific implementation of
this callback for KVM accelerator.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-4-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/accel/accel-ops.h |  2 ++
 system/runstate.c         | 38 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/include/accel/accel-ops.h b/include/accel/accel-ops.h
index 23a8c246e15..f46492e3fe1 100644
--- a/include/accel/accel-ops.h
+++ b/include/accel/accel-ops.h
@@ -23,6 +23,8 @@ struct AccelClass {
     AccelOpsClass *ops;
 
     int (*init_machine)(AccelState *as, MachineState *ms);
+    /* used mainly by confidential guests to rebuild guest state upon reset */
+    int (*rebuild_guest)(MachineState *ms);
     bool (*cpu_common_realize)(CPUState *cpu, Error **errp);
     void (*cpu_common_unrealize)(CPUState *cpu);
     /* get_stats: Append statistics to @buf */
diff --git a/system/runstate.c b/system/runstate.c
index 13f32bed8cb..e7b50e6a3b1 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -42,6 +42,7 @@
 #include "qapi/qapi-commands-run-state.h"
 #include "qapi/qapi-events-run-state.h"
 #include "qemu/accel.h"
+#include "accel/accel-ops.h"
 #include "qemu/error-report.h"
 #include "qemu/job.h"
 #include "qemu/log.h"
@@ -509,6 +510,9 @@ void qemu_system_reset(ShutdownCause reason)
 {
     MachineClass *mc;
     ResetType type;
+    AccelClass *ac = ACCEL_GET_CLASS(current_accel());
+    bool guest_state_rebuilt = false;
+    int ret;
 
     mc = current_machine ? MACHINE_GET_CLASS(current_machine) : NULL;
 
@@ -521,6 +525,29 @@ void qemu_system_reset(ShutdownCause reason)
     default:
         type = RESET_TYPE_COLD;
     }
+
+    if (!cpus_are_resettable() &&
+        (reason == SHUTDOWN_CAUSE_GUEST_RESET ||
+         reason == SHUTDOWN_CAUSE_HOST_QMP_SYSTEM_RESET)) {
+        if (ac->rebuild_guest) {
+            ret = ac->rebuild_guest(current_machine);
+            if (ret < 0) {
+                error_report("unable to rebuild guest: %s(%d)",
+                             strerror(-ret), ret);
+                vm_stop(RUN_STATE_INTERNAL_ERROR);
+            } else {
+                info_report("virtual machine state has been rebuilt with new "
+                            "guest file handle.");
+                guest_state_rebuilt = true;
+            }
+        } else if (!cpus_are_resettable())  {
+            error_report("accelerator does not support reset!");
+        } else {
+            error_report("accelerator does not support rebuilding guest state,"
+                         " proceeding with normal reset!");
+        }
+    }
+
     if (mc && mc->reset) {
         mc->reset(current_machine, type);
     } else {
@@ -543,7 +570,16 @@ void qemu_system_reset(ShutdownCause reason)
      * it does _more_  than cpu_synchronize_all_post_reset().
      */
     if (cpus_are_resettable()) {
-        cpu_synchronize_all_post_reset();
+        if (guest_state_rebuilt) {
+            /*
+             * If guest state has been rebuilt, then we
+             * need to sync full cpu state for non confidential guests post
+             * reset.
+             */
+            cpu_synchronize_all_post_init();
+        } else {
+            cpu_synchronize_all_post_reset();
+        }
     }
 
     vm_set_suspended(false);
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 045/102] system/physmem: add helper to reattach existing memory after KVM VM fd change
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (43 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 044/102] hw/accel: add a per-accelerator callback to change VM accelerator handle Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 046/102] accel/kvm: add changes required to support KVM VM file descriptor change Paolo Bonzini
                   ` (56 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

After the guest KVM file descriptor has changed as a part of the process of
confidential guest reset mechanism, existing memory needs to be reattached to
the new file descriptor. This change adds a helper function ram_block_rebind()
for this purpose. The next patch will make use of this function.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-5-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/system/physmem.h |  1 +
 system/physmem.c         | 28 ++++++++++++++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/include/system/physmem.h b/include/system/physmem.h
index 7bb7d3e1545..da91b77bd9b 100644
--- a/include/system/physmem.h
+++ b/include/system/physmem.h
@@ -51,5 +51,6 @@ physical_memory_snapshot_and_clear_dirty(MemoryRegion *mr, hwaddr offset,
 bool physical_memory_snapshot_get_dirty(DirtyBitmapSnapshot *snap,
                                         ram_addr_t start,
                                         ram_addr_t length);
+int ram_block_rebind(Error **errp);
 
 #endif
diff --git a/system/physmem.c b/system/physmem.c
index 2fb0c25c93b..e5ff26acecd 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2826,6 +2826,34 @@ found:
     return block;
 }
 
+/*
+ * Creates new guest memfd for the ramblocks and closes the
+ * existing memfd.
+ */
+int ram_block_rebind(Error **errp)
+{
+    RAMBlock *block;
+
+    qemu_mutex_lock_ramlist();
+
+    RAMBLOCK_FOREACH(block) {
+        if (block->flags & RAM_GUEST_MEMFD) {
+            if (block->guest_memfd >= 0) {
+                close(block->guest_memfd);
+            }
+            block->guest_memfd = kvm_create_guest_memfd(block->max_length,
+                                                        0, errp);
+            if (block->guest_memfd < 0) {
+                qemu_mutex_unlock_ramlist();
+                return -1;
+            }
+
+        }
+    }
+    qemu_mutex_unlock_ramlist();
+    return 0;
+}
+
 /*
  * Finds the named RAMBlock
  *
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 046/102] accel/kvm: add changes required to support KVM VM file descriptor change
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (44 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 045/102] system/physmem: add helper to reattach existing memory after KVM VM fd change Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 047/102] accel/kvm: mark guest state as unprotected after vm " Paolo Bonzini
                   ` (55 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

This change adds common kvm specific support to handle KVM VM file descriptor
change. KVM VM file descriptor can change as a part of confidential guest reset
mechanism. A new function api kvm_arch_on_vmfd_change() per
architecture platform is added in order to implement architecture specific
changes required to support it. A subsequent patch will add x86 specific
implementation for kvm_arch_on_vmfd_change() as currently only x86 supports
confidential guest reset.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-6-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 MAINTAINERS            |  6 +++
 include/system/kvm.h   |  3 ++
 accel/kvm/kvm-all.c    | 88 ++++++++++++++++++++++++++++++++++++++++--
 stubs/kvm.c            | 22 +++++++++++
 target/i386/kvm/kvm.c  | 10 +++++
 accel/kvm/trace-events |  1 +
 stubs/meson.build      |  1 +
 7 files changed, 128 insertions(+), 3 deletions(-)
 create mode 100644 stubs/kvm.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 72153288c53..a8e1546de1e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -152,6 +152,12 @@ F: tools/i386/
 F: tests/functional/i386/
 F: tests/functional/x86_64/
 
+X86 VM file descriptor change on reset test
+M: Ani Sinha <anisinha@redhat.com>
+M: Paolo Bonzini <pbonzini@redhat.com>
+S: Maintained
+F: stubs/kvm.c
+
 Guest CPU cores (TCG)
 ---------------------
 Overall TCG CPUs
diff --git a/include/system/kvm.h b/include/system/kvm.h
index 8f9eecf044c..5fc7251fd94 100644
--- a/include/system/kvm.h
+++ b/include/system/kvm.h
@@ -456,6 +456,9 @@ int kvm_physical_memory_addr_from_host(KVMState *s, void *ram_addr,
 
 #endif /* COMPILING_PER_TARGET */
 
+bool kvm_arch_supports_vmfd_change(void);
+int kvm_arch_on_vmfd_change(MachineState *ms, KVMState *s);
+
 void kvm_cpu_synchronize_state(CPUState *cpu);
 
 void kvm_init_cpu_signals(CPUState *cpu);
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 0d8b0c43470..cc5c42ce4de 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2415,11 +2415,9 @@ void kvm_irqchip_set_qemuirq_gsi(KVMState *s, qemu_irq irq, int gsi)
     g_hash_table_insert(s->gsimap, irq, GINT_TO_POINTER(gsi));
 }
 
-static void kvm_irqchip_create(KVMState *s)
+static void do_kvm_irqchip_create(KVMState *s)
 {
     int ret;
-
-    assert(s->kernel_irqchip_split != ON_OFF_AUTO_AUTO);
     if (kvm_check_extension(s, KVM_CAP_IRQCHIP)) {
         ;
     } else if (kvm_check_extension(s, KVM_CAP_S390_IRQCHIP)) {
@@ -2452,7 +2450,13 @@ static void kvm_irqchip_create(KVMState *s)
         fprintf(stderr, "Create kernel irqchip failed: %s\n", strerror(-ret));
         exit(1);
     }
+}
 
+static void kvm_irqchip_create(KVMState *s)
+{
+    assert(s->kernel_irqchip_split != ON_OFF_AUTO_AUTO);
+
+    do_kvm_irqchip_create(s);
     kvm_kernel_irqchip = true;
     /* If we have an in-kernel IRQ chip then we must have asynchronous
      * interrupt delivery (though the reverse is not necessarily true)
@@ -2607,6 +2611,83 @@ static int kvm_setup_dirty_ring(KVMState *s)
     return 0;
 }
 
+static int kvm_reset_vmfd(MachineState *ms)
+{
+    KVMState *s;
+    KVMMemoryListener *kml;
+    int ret = 0, type;
+    Error *err = NULL;
+
+    /*
+     * bail if the current architecture does not support VM file
+     * descriptor change.
+     */
+    if (!kvm_arch_supports_vmfd_change()) {
+        error_report("This target architecture does not support KVM VM "
+                     "file descriptor change.");
+        return -EOPNOTSUPP;
+    }
+
+    s = KVM_STATE(ms->accelerator);
+    kml = &s->memory_listener;
+
+    memory_listener_unregister(&kml->listener);
+    memory_listener_unregister(&kvm_io_listener);
+
+    if (s->vmfd >= 0) {
+        close(s->vmfd);
+    }
+
+    type = find_kvm_machine_type(ms);
+    if (type < 0) {
+        return -EINVAL;
+    }
+
+    ret = do_kvm_create_vm(s, type);
+    if (ret < 0) {
+        return ret;
+    }
+
+    s->vmfd = ret;
+
+    kvm_setup_dirty_ring(s);
+
+    /* rebind memory to new vm fd */
+    ret = ram_block_rebind(&err);
+    if (ret < 0) {
+        return ret;
+    }
+    assert(!err);
+
+    ret = kvm_arch_on_vmfd_change(ms, s);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (s->kernel_irqchip_allowed) {
+        do_kvm_irqchip_create(s);
+    }
+
+    /* these can be only called after ram_block_rebind() */
+    memory_listener_register(&kml->listener, &address_space_memory);
+    memory_listener_register(&kvm_io_listener, &address_space_io);
+
+    /*
+     * kvm fd has changed. Commit the irq routes to KVM once more.
+     */
+    kvm_irqchip_commit_routes(s);
+    /*
+     * for confidential guest, this is the last possible place where we
+     * can call synchronize_all_post_init() to sync all vcpu states to
+     * kvm.
+     */
+    if (ms->cgs) {
+        cpu_synchronize_all_post_init();
+    }
+    trace_kvm_reset_vmfd();
+    return ret;
+}
+
 static int kvm_init(AccelState *as, MachineState *ms)
 {
     MachineClass *mc = MACHINE_GET_CLASS(ms);
@@ -4015,6 +4096,7 @@ static void kvm_accel_class_init(ObjectClass *oc, const void *data)
     AccelClass *ac = ACCEL_CLASS(oc);
     ac->name = "KVM";
     ac->init_machine = kvm_init;
+    ac->rebuild_guest = kvm_reset_vmfd;
     ac->has_memory = kvm_accel_has_memory;
     ac->allowed = &kvm_allowed;
     ac->gdbstub_supported_sstep_flags = kvm_gdbstub_sstep_flags;
diff --git a/stubs/kvm.c b/stubs/kvm.c
new file mode 100644
index 00000000000..2db61d89a73
--- /dev/null
+++ b/stubs/kvm.c
@@ -0,0 +1,22 @@
+/*
+ * kvm target arch specific stubs
+ *
+ * Copyright (c) 2026 Red Hat, Inc.
+ *
+ * Author:
+ *   Ani Sinha <anisinha@redhat.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include "qemu/osdep.h"
+#include "system/kvm.h"
+
+int kvm_arch_on_vmfd_change(MachineState *ms, KVMState *s)
+{
+    abort();
+}
+
+bool kvm_arch_supports_vmfd_change(void)
+{
+    return false;
+}
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index c0be2e932da..524b5276a68 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -3389,6 +3389,16 @@ static int kvm_vm_enable_energy_msrs(KVMState *s)
     return 0;
 }
 
+int kvm_arch_on_vmfd_change(MachineState *ms, KVMState *s)
+{
+    abort();
+}
+
+bool kvm_arch_supports_vmfd_change(void)
+{
+    return false;
+}
+
 int kvm_arch_init(MachineState *ms, KVMState *s)
 {
     int ret;
diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events
index e43d18a8692..e4beda01488 100644
--- a/accel/kvm/trace-events
+++ b/accel/kvm/trace-events
@@ -14,6 +14,7 @@ kvm_destroy_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
 kvm_park_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
 kvm_unpark_vcpu(unsigned long arch_cpu_id, const char *msg) "id: %lu %s"
 kvm_irqchip_commit_routes(void) ""
+kvm_reset_vmfd(void) ""
 kvm_irqchip_add_msi_route(char *name, int vector, int virq) "dev %s vector %d virq %d"
 kvm_irqchip_update_msi_route(int virq) "Updating MSI route virq=%d"
 kvm_irqchip_release_virq(int virq) "virq %d"
diff --git a/stubs/meson.build b/stubs/meson.build
index 8a07059500d..6ae478bacc1 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -74,6 +74,7 @@ if have_system
   if igvm.found()
     stub_ss.add(files('igvm.c'))
   endif
+  stub_ss.add(files('kvm.c'))
   stub_ss.add(files('target-get-monitor-def.c'))
   stub_ss.add(files('target-monitor-defs.c'))
   stub_ss.add(files('win32-kbd-hook.c'))
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 047/102] accel/kvm: mark guest state as unprotected after vm file descriptor change
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (45 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 046/102] accel/kvm: add changes required to support KVM VM file descriptor change Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 048/102] accel/kvm: add a notifier to indicate KVM VM file descriptor has changed Paolo Bonzini
                   ` (54 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

When the KVM VM file descriptor has changed and a new one created, the guest
state is no longer in protected state. Mark it as such.
The guest state becomes protected again when TDX and SEV-ES and SEV-SNP mark
it as such.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-7-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 accel/kvm/kvm-all.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index cc5c42ce4de..096edb5e198 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2650,6 +2650,9 @@ static int kvm_reset_vmfd(MachineState *ms)
 
     s->vmfd = ret;
 
+    /* guest state is now unprotected again */
+    kvm_state->guest_state_protected = false;
+
     kvm_setup_dirty_ring(s);
 
     /* rebind memory to new vm fd */
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 048/102] accel/kvm: add a notifier to indicate KVM VM file descriptor has changed
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (46 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 047/102] accel/kvm: mark guest state as unprotected after vm " Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 049/102] accel/kvm: notify when KVM VM file fd is about to be changed Paolo Bonzini
                   ` (53 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

A notifier callback can be used by various subsystems to perform actions when
KVM file descriptor for a virtual machine changes as a part of confidential
guest reset process. This change adds this notifier mechanism. Subsequent
patches will add specific implementations for various notifier callbacks
corresponding to various subsystems that need to take action when KVM VM file
descriptor changed.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-8-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/system/kvm.h   | 21 +++++++++++++++++++++
 accel/kvm/kvm-all.c    | 30 ++++++++++++++++++++++++++++++
 accel/stubs/kvm-stub.c |  8 ++++++++
 3 files changed, 59 insertions(+)

diff --git a/include/system/kvm.h b/include/system/kvm.h
index 5fc7251fd94..f11729f432c 100644
--- a/include/system/kvm.h
+++ b/include/system/kvm.h
@@ -181,6 +181,7 @@ DECLARE_INSTANCE_CHECKER(KVMState, KVM_STATE,
 
 extern KVMState *kvm_state;
 typedef struct Notifier Notifier;
+typedef struct NotifierWithReturn NotifierWithReturn;
 
 typedef struct KVMRouteChange {
      KVMState *s;
@@ -567,4 +568,24 @@ int kvm_set_memory_attributes_shared(hwaddr start, uint64_t size);
 
 int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private);
 
+/* argument to vmfd change notifier */
+typedef struct VmfdChangeNotifier {
+    int vmfd;
+} VmfdChangeNotifier;
+
+/**
+ * kvm_vmfd_add_change_notifier - register a notifier to get notified when
+ * a KVM vm file descriptor changes as a part of the confidential guest "reset"
+ * process. Various subsystems should use this mechanism to take actions such
+ * as creating new fds against this new vm file descriptor.
+ * @n: notifier with return value.
+ */
+void kvm_vmfd_add_change_notifier(NotifierWithReturn *n);
+/**
+ * kvm_vmfd_remove_change_notifier - de-register a notifer previously
+ * registered with kvm_vmfd_add_change_notifier call.
+ * @n: notifier that was previously registered.
+ */
+void kvm_vmfd_remove_change_notifier(NotifierWithReturn *n);
+
 #endif
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 096edb5e198..3b57d2f9769 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -90,6 +90,7 @@ struct KVMParkedVcpu {
 };
 
 KVMState *kvm_state;
+VmfdChangeNotifier vmfd_notifier;
 bool kvm_kernel_irqchip;
 bool kvm_split_irqchip;
 bool kvm_async_interrupts_allowed;
@@ -123,6 +124,9 @@ static const KVMCapabilityInfo kvm_required_capabilites[] = {
 static NotifierList kvm_irqchip_change_notifiers =
     NOTIFIER_LIST_INITIALIZER(kvm_irqchip_change_notifiers);
 
+static NotifierWithReturnList register_vmfd_changed_notifiers =
+    NOTIFIER_WITH_RETURN_LIST_INITIALIZER(register_vmfd_changed_notifiers);
+
 struct KVMResampleFd {
     int gsi;
     EventNotifier *resample_event;
@@ -2173,6 +2177,22 @@ void kvm_irqchip_change_notify(void)
     notifier_list_notify(&kvm_irqchip_change_notifiers, NULL);
 }
 
+void kvm_vmfd_add_change_notifier(NotifierWithReturn *n)
+{
+    notifier_with_return_list_add(&register_vmfd_changed_notifiers, n);
+}
+
+void kvm_vmfd_remove_change_notifier(NotifierWithReturn *n)
+{
+    notifier_with_return_remove(n);
+}
+
+static int kvm_vmfd_change_notify(Error **errp)
+{
+    return notifier_with_return_list_notify(&register_vmfd_changed_notifiers,
+                                            &vmfd_notifier, errp);
+}
+
 int kvm_irqchip_get_virq(KVMState *s)
 {
     int next_virq;
@@ -2671,6 +2691,16 @@ static int kvm_reset_vmfd(MachineState *ms)
         do_kvm_irqchip_create(s);
     }
 
+    /*
+     * notify everyone that vmfd has changed.
+     */
+    vmfd_notifier.vmfd = s->vmfd;
+    ret = kvm_vmfd_change_notify(&err);
+    if (ret < 0) {
+        return ret;
+    }
+    assert(!err);
+
     /* these can be only called after ram_block_rebind() */
     memory_listener_register(&kml->listener, &address_space_memory);
     memory_listener_register(&kvm_io_listener, &address_space_io);
diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
index 68cd33ba973..a6e8a6e16cf 100644
--- a/accel/stubs/kvm-stub.c
+++ b/accel/stubs/kvm-stub.c
@@ -79,6 +79,14 @@ void kvm_irqchip_change_notify(void)
 {
 }
 
+void kvm_vmfd_add_change_notifier(NotifierWithReturn *n)
+{
+}
+
+void kvm_vmfd_remove_change_notifier(NotifierWithReturn *n)
+{
+}
+
 int kvm_irqchip_add_irqfd_notifier_gsi(KVMState *s, EventNotifier *n,
                                        EventNotifier *rn, int virq)
 {
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 049/102] accel/kvm: notify when KVM VM file fd is about to be changed
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (47 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 048/102] accel/kvm: add a notifier to indicate KVM VM file descriptor has changed Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 050/102] i386/kvm: unregister smram listeners prior to vm file descriptor change Paolo Bonzini
                   ` (52 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

Various subsystems might need to take some steps before the KVM file descriptor
for a virtual machine is changed. So a new boolean attribute is added to the
vmfd_notifier structure which is passed to the notifier callbacks.
vmfd_notifer.pre is true for pre-notification of vmfd change and false for
post notification. Notifier callback implementations can simply check
the boolean value for (vmfd_notifer*)->pre and can take actions for pre or
post vmfd change based on the value.

Subsequent patches will add callback implementations for specific components
that need this pre-notification.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-9-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/system/kvm.h | 6 ++++--
 accel/kvm/kvm-all.c  | 9 +++++++++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/include/system/kvm.h b/include/system/kvm.h
index f11729f432c..fbe23608a16 100644
--- a/include/system/kvm.h
+++ b/include/system/kvm.h
@@ -571,12 +571,14 @@ int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private);
 /* argument to vmfd change notifier */
 typedef struct VmfdChangeNotifier {
     int vmfd;
+    bool pre;
 } VmfdChangeNotifier;
 
 /**
  * kvm_vmfd_add_change_notifier - register a notifier to get notified when
- * a KVM vm file descriptor changes as a part of the confidential guest "reset"
- * process. Various subsystems should use this mechanism to take actions such
+ * a KVM vm file descriptor changes or about to be changed as a part of the
+ * confidential guest "reset" process.
+ * Various subsystems should use this mechanism to take actions such
  * as creating new fds against this new vm file descriptor.
  * @n: notifier with return value.
  */
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 3b57d2f9769..d244156f6f4 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2654,6 +2654,13 @@ static int kvm_reset_vmfd(MachineState *ms)
     memory_listener_unregister(&kml->listener);
     memory_listener_unregister(&kvm_io_listener);
 
+    vmfd_notifier.pre = true;
+    ret = kvm_vmfd_change_notify(&err);
+    if (ret < 0) {
+        return ret;
+    }
+    assert(!err);
+
     if (s->vmfd >= 0) {
         close(s->vmfd);
     }
@@ -2695,6 +2702,8 @@ static int kvm_reset_vmfd(MachineState *ms)
      * notify everyone that vmfd has changed.
      */
     vmfd_notifier.vmfd = s->vmfd;
+    vmfd_notifier.pre = false;
+
     ret = kvm_vmfd_change_notify(&err);
     if (ret < 0) {
         return ret;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 050/102] i386/kvm: unregister smram listeners prior to vm file descriptor change
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (48 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 049/102] accel/kvm: notify when KVM VM file fd is about to be changed Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 051/102] kvm/i386: implement architecture support for kvm " Paolo Bonzini
                   ` (51 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

We will re-register smram listeners after the VM file descriptors has changed.
We need to unregister them first to make sure addresses and reference counters
work properly.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-10-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/kvm/kvm.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 524b5276a68..8edfbb834c6 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -112,6 +112,11 @@ typedef struct {
 static void kvm_init_msrs(X86CPU *cpu);
 static int kvm_filter_msr(KVMState *s, uint32_t msr, QEMURDMSRHandler *rdmsr,
                           QEMUWRMSRHandler *wrmsr);
+static int unregister_smram_listener(NotifierWithReturn *notifier,
+                                     void *data, Error** errp);
+NotifierWithReturn kvm_vmfd_change_notifier = {
+    .notify = unregister_smram_listener,
+};
 
 const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
     KVM_CAP_INFO(SET_TSS_ADDR),
@@ -2885,6 +2890,17 @@ static void register_smram_listener(Notifier *n, void *unused)
     }
 }
 
+static int unregister_smram_listener(NotifierWithReturn *notifier,
+                                     void *data, Error** errp)
+{
+    if (!((VmfdChangeNotifier *)data)->pre) {
+        return 0;
+    }
+
+    memory_listener_unregister(&smram_listener.listener);
+    return 0;
+}
+
 /* It should only be called in cpu's hotplug callback */
 void kvm_smm_cpu_address_space_init(X86CPU *cpu)
 {
@@ -3538,6 +3554,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
     }
 
     pmu_cap = kvm_check_extension(s, KVM_CAP_PMU_CAPABILITY);
+    kvm_vmfd_add_change_notifier(&kvm_vmfd_change_notifier);
 
     return 0;
 }
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 051/102] kvm/i386: implement architecture support for kvm file descriptor change
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (49 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 050/102] i386/kvm: unregister smram listeners prior to vm file descriptor change Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 052/102] i386/kvm: refactor xen init into a new function Paolo Bonzini
                   ` (50 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

When the kvm file descriptor changes as a part of confidential guest reset,
some architecture specific setups including SEV/SEV-SNP/TDX specific setups
needs to be redone. These changes are implemented as a part of the
kvm_arch_on_vmfd_change() callback which was introduced previously.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-11-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/kvm/kvm.c        | 49 ++++++++++++++++++++++++++++--------
 target/i386/kvm/trace-events |  1 +
 2 files changed, 39 insertions(+), 11 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 8edfbb834c6..40e4b3f1283 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -3407,12 +3407,30 @@ static int kvm_vm_enable_energy_msrs(KVMState *s)
 
 int kvm_arch_on_vmfd_change(MachineState *ms, KVMState *s)
 {
-    abort();
+    int ret;
+
+    ret = kvm_arch_init(ms, s);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (object_dynamic_cast(OBJECT(ms), TYPE_X86_MACHINE)) {
+        X86MachineState *x86ms = X86_MACHINE(ms);
+
+        if (x86_machine_is_smm_enabled(x86ms)) {
+            memory_listener_register(&smram_listener.listener,
+                                     &smram_address_space);
+        }
+        kvm_set_max_apic_id(x86ms->apic_id_limit);
+    }
+
+    trace_kvm_arch_on_vmfd_change();
+    return 0;
 }
 
 bool kvm_arch_supports_vmfd_change(void)
 {
-    return false;
+    return true;
 }
 
 int kvm_arch_init(MachineState *ms, KVMState *s)
@@ -3420,6 +3438,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
     int ret;
     struct utsname utsname;
     Error *local_err = NULL;
+    static bool first = true;
 
     /*
      * Initialize confidential guest (SEV/TDX) context, if required
@@ -3489,16 +3508,17 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
         return ret;
     }
 
-    /* Tell fw_cfg to notify the BIOS to reserve the range. */
-    e820_add_entry(KVM_IDENTITY_BASE, 0x4000, E820_RESERVED);
-
+    if (first) {
+        /* Tell fw_cfg to notify the BIOS to reserve the range. */
+        e820_add_entry(KVM_IDENTITY_BASE, 0x4000, E820_RESERVED);
+    }
     ret = kvm_vm_set_nr_mmu_pages(s);
     if (ret < 0) {
         return ret;
     }
 
     if (object_dynamic_cast(OBJECT(ms), TYPE_X86_MACHINE) &&
-        x86_machine_is_smm_enabled(X86_MACHINE(ms))) {
+        x86_machine_is_smm_enabled(X86_MACHINE(ms)) && first) {
         smram_machine_done.notify = register_smram_listener;
         qemu_add_machine_init_done_notifier(&smram_machine_done);
     }
@@ -3545,16 +3565,23 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
                 return ret;
             }
 
-            ret = kvm_msr_energy_thread_init(s, ms);
-            if (ret < 0) {
-                error_report("kvm : error RAPL feature requirement not met");
-                return ret;
+            if (first) {
+                ret = kvm_msr_energy_thread_init(s, ms);
+                if (ret < 0) {
+                    error_report("kvm : "
+                                 "error RAPL feature requirement not met");
+                    return ret;
+                }
             }
         }
     }
 
     pmu_cap = kvm_check_extension(s, KVM_CAP_PMU_CAPABILITY);
-    kvm_vmfd_add_change_notifier(&kvm_vmfd_change_notifier);
+
+    if (first) {
+        kvm_vmfd_add_change_notifier(&kvm_vmfd_change_notifier);
+    }
+    first = false;
 
     return 0;
 }
diff --git a/target/i386/kvm/trace-events b/target/i386/kvm/trace-events
index 74a6234ff7f..2d213c9f9b6 100644
--- a/target/i386/kvm/trace-events
+++ b/target/i386/kvm/trace-events
@@ -6,6 +6,7 @@ kvm_x86_add_msi_route(int virq) "Adding route entry for virq %d"
 kvm_x86_remove_msi_route(int virq) "Removing route entry for virq %d"
 kvm_x86_update_msi_routes(int num) "Updated %d MSI routes"
 kvm_hc_map_gpa_range(uint64_t gpa, uint64_t size, uint64_t attributes, uint64_t flags) "gpa 0x%" PRIx64 " size 0x%" PRIx64 " attributes 0x%" PRIx64 " flags 0x%" PRIx64
+kvm_arch_on_vmfd_change(void) ""
 
 # xen-emu.c
 kvm_xen_hypercall(int cpu, uint8_t cpl, uint64_t input, uint64_t a0, uint64_t a1, uint64_t a2, uint64_t ret) "xen_hypercall: cpu %d cpl %d input %" PRIu64 " a0 0x%" PRIx64 " a1 0x%" PRIx64 " a2 0x%" PRIx64" ret 0x%" PRIx64
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 052/102] i386/kvm: refactor xen init into a new function
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (50 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 051/102] kvm/i386: implement architecture support for kvm " Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 053/102] hw/i386: refactor x86_bios_rom_init for reuse in confidential guest reset Paolo Bonzini
                   ` (49 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

Cosmetic - no new functionality added. Xen initialisation code is refactored
into its own function.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-12-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/kvm/kvm.c | 31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 40e4b3f1283..cc98cc961b7 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -3433,6 +3433,24 @@ bool kvm_arch_supports_vmfd_change(void)
     return true;
 }
 
+static int xen_init(MachineState *ms, KVMState *s)
+{
+#ifdef CONFIG_XEN_EMU
+    int ret = 0;
+    if (!object_dynamic_cast(OBJECT(ms), TYPE_PC_MACHINE)) {
+        error_report("kvm: Xen support only available in PC machine");
+        return -ENOTSUP;
+    }
+    /* hyperv_enabled() doesn't work yet. */
+    uint32_t msr = XEN_HYPERCALL_MSR;
+    ret = kvm_xen_init(s, msr);
+    return ret;
+#else
+    error_report("kvm: Xen support not enabled in qemu");
+    return -ENOTSUP;
+#endif
+}
+
 int kvm_arch_init(MachineState *ms, KVMState *s)
 {
     int ret;
@@ -3467,21 +3485,10 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
     }
 
     if (s->xen_version) {
-#ifdef CONFIG_XEN_EMU
-        if (!object_dynamic_cast(OBJECT(ms), TYPE_PC_MACHINE)) {
-            error_report("kvm: Xen support only available in PC machine");
-            return -ENOTSUP;
-        }
-        /* hyperv_enabled() doesn't work yet. */
-        uint32_t msr = XEN_HYPERCALL_MSR;
-        ret = kvm_xen_init(s, msr);
+        ret = xen_init(ms, s);
         if (ret < 0) {
             return ret;
         }
-#else
-        error_report("kvm: Xen support not enabled in qemu");
-        return -ENOTSUP;
-#endif
     }
 
     ret = kvm_get_supported_msrs(s);
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 053/102] hw/i386: refactor x86_bios_rom_init for reuse in confidential guest reset
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (51 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 052/102] i386/kvm: refactor xen init into a new function Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 054/102] hw/i386: export a new function x86_bios_rom_reload Paolo Bonzini
                   ` (48 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Michael S. Tsirkin

From: Ani Sinha <anisinha@redhat.com>

For confidential guests, bios image must be reinitialized upon reset. This
is because bios memory is encrypted and hence once the old confidential
kvm context is destroyed, it cannot be decrypted. It needs to be reinitilized.
Towards that, this change refactors x86_bios_rom_init() code so that
parts of it can be called during confidential guest reset.
No functional chnage.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-13-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/i386/x86-common.c | 50 ++++++++++++++++++++++++++++++++------------
 1 file changed, 37 insertions(+), 13 deletions(-)

diff --git a/hw/i386/x86-common.c b/hw/i386/x86-common.c
index de4cd7650a4..c98abaf3689 100644
--- a/hw/i386/x86-common.c
+++ b/hw/i386/x86-common.c
@@ -1020,17 +1020,11 @@ void x86_isa_bios_init(MemoryRegion *isa_bios, MemoryRegion *isa_memory,
     memory_region_set_readonly(isa_bios, read_only);
 }
 
-void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
-                       MemoryRegion *rom_memory, bool isapc_ram_fw)
+static int get_bios_size(X86MachineState *x86ms,
+                         const char *bios_name, char *filename)
 {
-    const char *bios_name;
-    char *filename;
     int bios_size;
-    ssize_t ret;
 
-    /* BIOS load */
-    bios_name = MACHINE(x86ms)->firmware ?: default_firmware;
-    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
     if (filename) {
         bios_size = get_image_size(filename, NULL);
     } else {
@@ -1040,6 +1034,21 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
         (bios_size % 65536) != 0) {
         goto bios_error;
     }
+
+    return bios_size;
+
+ bios_error:
+    fprintf(stderr, "qemu: could not load PC BIOS '%s'\n", bios_name);
+    exit(1);
+}
+
+static void load_bios_from_file(X86MachineState *x86ms, const char *bios_name,
+                                char *filename, int bios_size,
+                                bool isapc_ram_fw)
+{
+    ssize_t ret;
+
+    /* BIOS load */
     if (machine_require_guest_memfd(MACHINE(x86ms))) {
         memory_region_init_ram_guest_memfd(&x86ms->bios, NULL, "pc.bios",
                                            bios_size, &error_fatal);
@@ -1068,7 +1077,26 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
             goto bios_error;
         }
     }
-    g_free(filename);
+
+    return;
+
+ bios_error:
+    fprintf(stderr, "qemu: could not load PC BIOS '%s'\n", bios_name);
+    exit(1);
+}
+
+void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
+                       MemoryRegion *rom_memory, bool isapc_ram_fw)
+{
+    int bios_size;
+    const char *bios_name;
+    g_autofree char *filename;
+
+    bios_name = MACHINE(x86ms)->firmware ?: default_firmware;
+    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
+
+    bios_size = get_bios_size(x86ms, bios_name, filename);
+    load_bios_from_file(x86ms, bios_name, filename, bios_size, isapc_ram_fw);
 
     if (!machine_require_guest_memfd(MACHINE(x86ms))) {
         /* map the last 128KB of the BIOS in ISA space */
@@ -1081,8 +1109,4 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
                                 (uint32_t)(-bios_size),
                                 &x86ms->bios);
     return;
-
-bios_error:
-    fprintf(stderr, "qemu: could not load PC BIOS '%s'\n", bios_name);
-    exit(1);
 }
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 054/102] hw/i386: export a new function x86_bios_rom_reload
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (52 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 053/102] hw/i386: refactor x86_bios_rom_init for reuse in confidential guest reset Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 055/102] kvm/i386: reload firmware for confidential guest reset Paolo Bonzini
                   ` (47 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Bernhard Beschow

From: Ani Sinha <anisinha@redhat.com>

Confidential guest smust reload their bios rom upon reset. This is because
bios memory is encrypted and upon reset, the contents of the old bios memory
is lost and cannot be re-used. To this end, export a new x86 function
x86_bios_rom_reload() to reload the bios again. This function will be used in
the subsequent patches.

Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-14-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/hw/i386/x86.h |  1 +
 hw/i386/x86-common.c  | 21 +++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
index 23be6274377..a85a5600ce9 100644
--- a/include/hw/i386/x86.h
+++ b/include/hw/i386/x86.h
@@ -125,6 +125,7 @@ void x86_isa_bios_init(MemoryRegion *isa_bios, MemoryRegion *isa_memory,
                        MemoryRegion *bios, bool read_only);
 void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
                        MemoryRegion *rom_memory, bool isapc_ram_fw);
+void x86_bios_rom_reload(X86MachineState *x86ms);
 
 void x86_load_linux(X86MachineState *x86ms,
                     FWCfgState *fw_cfg,
diff --git a/hw/i386/x86-common.c b/hw/i386/x86-common.c
index c98abaf3689..a420112666a 100644
--- a/hw/i386/x86-common.c
+++ b/hw/i386/x86-common.c
@@ -1085,6 +1085,27 @@ static void load_bios_from_file(X86MachineState *x86ms, const char *bios_name,
     exit(1);
 }
 
+void x86_bios_rom_reload(X86MachineState *x86ms)
+{
+    int bios_size;
+    const char *bios_name;
+    char *filename;
+
+    if (memory_region_size(&x86ms->bios) == 0) {
+        /* if -bios is not used */
+        return;
+    }
+
+    bios_name = MACHINE(x86ms)->firmware ?: "bios.bin";
+    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
+
+    bios_size = get_bios_size(x86ms, bios_name, filename);
+
+    void *ptr = memory_region_get_ram_ptr(&x86ms->bios);
+    load_image_size(filename, ptr, bios_size);
+    x86_firmware_configure(0x100000000ULL - bios_size, ptr, bios_size);
+}
+
 void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
                        MemoryRegion *rom_memory, bool isapc_ram_fw)
 {
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 055/102] kvm/i386: reload firmware for confidential guest reset
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (53 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 054/102] hw/i386: export a new function x86_bios_rom_reload Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 056/102] accel/kvm: rebind current VCPUs to the new KVM VM file descriptor upon reset Paolo Bonzini
                   ` (46 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

When IGVM is not being used by the confidential guest, the guest firmware has
to be reloaded explicitly again into memory. This is because, the memory into
which the firmware was loaded before reset was encrypted and is thus lost
upon reset. When IGVM is used, it is expected that the IGVM will contain the
guest firmware and the execution of the IGVM directives will set up the guest
firmware memory.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-15-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/kvm/kvm.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index cc98cc961b7..9d7a9ffceb8 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -3416,7 +3416,14 @@ int kvm_arch_on_vmfd_change(MachineState *ms, KVMState *s)
 
     if (object_dynamic_cast(OBJECT(ms), TYPE_X86_MACHINE)) {
         X86MachineState *x86ms = X86_MACHINE(ms);
-
+        /*
+         * For confidential guests, reload bios ROM if IGVM is not specified.
+         * If an IGVM file is specified then the firmware must be provided
+         * in the IGVM file.
+         */
+        if (ms->cgs && !x86ms->igvm) {
+                x86_bios_rom_reload(x86ms);
+        }
         if (x86_machine_is_smm_enabled(x86ms)) {
             memory_listener_register(&smram_listener.listener,
                                      &smram_address_space);
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 056/102] accel/kvm: rebind current VCPUs to the new KVM VM file descriptor upon reset
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (54 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 055/102] kvm/i386: reload firmware for confidential guest reset Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 057/102] i386/tdx: refactor TDX firmware memory initialization code into a new function Paolo Bonzini
                   ` (45 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

Confidential guests needs to generate a new KVM file descriptor upon virtual
machine reset. Existing VCPUs needs to be reattached to this new
KVM VM file descriptor. As a part of this, new VCPU file descriptors against
this new KVM VM file descriptor needs to be created and re-initialized.
Resources allocated against the old VCPU fds needs to be released. This change
makes this happen.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-16-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 accel/kvm/kvm-all.c    | 215 +++++++++++++++++++++++++++++++++--------
 accel/kvm/trace-events |   1 +
 2 files changed, 174 insertions(+), 42 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index d244156f6f4..a347a71a2ee 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -127,6 +127,10 @@ static NotifierList kvm_irqchip_change_notifiers =
 static NotifierWithReturnList register_vmfd_changed_notifiers =
     NOTIFIER_WITH_RETURN_LIST_INITIALIZER(register_vmfd_changed_notifiers);
 
+static int map_kvm_run(KVMState *s, CPUState *cpu, Error **errp);
+static int map_kvm_dirty_gfns(KVMState *s, CPUState *cpu, Error **errp);
+static int vcpu_unmap_regions(KVMState *s, CPUState *cpu);
+
 struct KVMResampleFd {
     int gsi;
     EventNotifier *resample_event;
@@ -420,6 +424,90 @@ err:
     return ret;
 }
 
+static void kvm_create_vcpu_internal(CPUState *cpu, KVMState *s, int kvm_fd)
+{
+    cpu->kvm_fd = kvm_fd;
+    cpu->kvm_state = s;
+    if (!s->guest_state_protected) {
+        cpu->vcpu_dirty = true;
+    }
+    cpu->dirty_pages = 0;
+    cpu->throttle_us_per_full = 0;
+
+    return;
+}
+
+static int kvm_rebind_vcpus(Error **errp)
+{
+    CPUState *cpu;
+    unsigned long vcpu_id;
+    KVMState *s = kvm_state;
+    int kvm_fd, ret = 0;
+
+    CPU_FOREACH(cpu) {
+        vcpu_id = kvm_arch_vcpu_id(cpu);
+
+        if (cpu->kvm_fd) {
+            close(cpu->kvm_fd);
+        }
+
+        ret = kvm_arch_destroy_vcpu(cpu);
+        if (ret < 0) {
+            goto err;
+        }
+
+        if (s->coalesced_mmio_ring == (void *)cpu->kvm_run + PAGE_SIZE) {
+            s->coalesced_mmio_ring = NULL;
+        }
+
+        ret = vcpu_unmap_regions(s, cpu);
+        if (ret < 0) {
+            goto err;
+        }
+
+        ret = kvm_arch_pre_create_vcpu(cpu, errp);
+        if (ret < 0) {
+            goto err;
+        }
+
+        kvm_fd = kvm_vm_ioctl(s, KVM_CREATE_VCPU, vcpu_id);
+        if (kvm_fd < 0) {
+            error_report("KVM_CREATE_VCPU IOCTL failed for vCPU %lu (%s)",
+                         vcpu_id, strerror(kvm_fd));
+            return kvm_fd;
+        }
+
+        kvm_create_vcpu_internal(cpu, s, kvm_fd);
+
+        ret = map_kvm_run(s, cpu, errp);
+        if (ret < 0) {
+            goto err;
+        }
+
+        if (s->kvm_dirty_ring_size) {
+            ret = map_kvm_dirty_gfns(s, cpu, errp);
+            if (ret < 0) {
+                goto err;
+            }
+        }
+
+        ret = kvm_arch_init_vcpu(cpu);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret,
+                             "kvm_init_vcpu: kvm_arch_init_vcpu failed (%lu)",
+                             vcpu_id);
+        }
+
+        close(cpu->kvm_vcpu_stats_fd);
+        cpu->kvm_vcpu_stats_fd = kvm_vcpu_ioctl(cpu, KVM_GET_STATS_FD, NULL);
+        kvm_init_cpu_signals(cpu);
+    }
+    trace_kvm_rebind_vcpus();
+
+ err:
+    return ret;
+}
+
 static void kvm_park_vcpu(CPUState *cpu)
 {
     struct KVMParkedVcpu *vcpu;
@@ -483,13 +571,7 @@ static int kvm_create_vcpu(CPUState *cpu)
         }
     }
 
-    cpu->kvm_fd = kvm_fd;
-    cpu->kvm_state = s;
-    if (!s->guest_state_protected) {
-        cpu->vcpu_dirty = true;
-    }
-    cpu->dirty_pages = 0;
-    cpu->throttle_us_per_full = 0;
+    kvm_create_vcpu_internal(cpu, s, kvm_fd);
 
     trace_kvm_create_vcpu(cpu->cpu_index, vcpu_id, kvm_fd);
 
@@ -508,19 +590,11 @@ int kvm_create_and_park_vcpu(CPUState *cpu)
     return ret;
 }
 
-static int do_kvm_destroy_vcpu(CPUState *cpu)
+static int vcpu_unmap_regions(KVMState *s, CPUState *cpu)
 {
-    KVMState *s = kvm_state;
     int mmap_size;
     int ret = 0;
 
-    trace_kvm_destroy_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
-
-    ret = kvm_arch_destroy_vcpu(cpu);
-    if (ret < 0) {
-        goto err;
-    }
-
     mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
     if (mmap_size < 0) {
         ret = mmap_size;
@@ -548,6 +622,31 @@ static int do_kvm_destroy_vcpu(CPUState *cpu)
         cpu->kvm_dirty_gfns = NULL;
     }
 
+ err:
+    return ret;
+}
+
+static int do_kvm_destroy_vcpu(CPUState *cpu)
+{
+    KVMState *s = kvm_state;
+    int ret = 0;
+
+    trace_kvm_destroy_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
+
+    ret = kvm_arch_destroy_vcpu(cpu);
+    if (ret < 0) {
+        goto err;
+    }
+
+    /* If I am the CPU that created coalesced_mmio_ring, then discard it */
+    if (s->coalesced_mmio_ring == (void *)cpu->kvm_run + PAGE_SIZE) {
+        s->coalesced_mmio_ring = NULL;
+    }
+
+    ret = vcpu_unmap_regions(s, cpu);
+    if (ret < 0) {
+        goto err;
+    }
     kvm_park_vcpu(cpu);
 err:
     return ret;
@@ -561,26 +660,9 @@ void kvm_destroy_vcpu(CPUState *cpu)
     }
 }
 
-int kvm_init_vcpu(CPUState *cpu, Error **errp)
+static int map_kvm_run(KVMState *s, CPUState *cpu, Error **errp)
 {
-    KVMState *s = kvm_state;
-    int mmap_size;
-    int ret;
-
-    trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
-
-    ret = kvm_arch_pre_create_vcpu(cpu, errp);
-    if (ret < 0) {
-        goto err;
-    }
-
-    ret = kvm_create_vcpu(cpu);
-    if (ret < 0) {
-        error_setg_errno(errp, -ret,
-                         "kvm_init_vcpu: kvm_create_vcpu failed (%lu)",
-                         kvm_arch_vcpu_id(cpu));
-        goto err;
-    }
+    int mmap_size, ret = 0;
 
     mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
     if (mmap_size < 0) {
@@ -605,14 +687,53 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
             (void *)cpu->kvm_run + s->coalesced_mmio * PAGE_SIZE;
     }
 
+ err:
+    return ret;
+}
+
+static int map_kvm_dirty_gfns(KVMState *s, CPUState *cpu, Error **errp)
+{
+    int ret = 0;
+    /* Use MAP_SHARED to share pages with the kernel */
+    cpu->kvm_dirty_gfns = mmap(NULL, s->kvm_dirty_ring_bytes,
+                               PROT_READ | PROT_WRITE, MAP_SHARED,
+                               cpu->kvm_fd,
+                               PAGE_SIZE * KVM_DIRTY_LOG_PAGE_OFFSET);
+    if (cpu->kvm_dirty_gfns == MAP_FAILED) {
+        ret = -errno;
+    }
+
+    return ret;
+}
+
+int kvm_init_vcpu(CPUState *cpu, Error **errp)
+{
+    KVMState *s = kvm_state;
+    int ret;
+
+    trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
+
+    ret = kvm_arch_pre_create_vcpu(cpu, errp);
+    if (ret < 0) {
+        goto err;
+    }
+
+    ret = kvm_create_vcpu(cpu);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret,
+                         "kvm_init_vcpu: kvm_create_vcpu failed (%lu)",
+                         kvm_arch_vcpu_id(cpu));
+        goto err;
+    }
+
+    ret = map_kvm_run(s, cpu, errp);
+    if (ret < 0) {
+        goto err;
+    }
+
     if (s->kvm_dirty_ring_size) {
-        /* Use MAP_SHARED to share pages with the kernel */
-        cpu->kvm_dirty_gfns = mmap(NULL, s->kvm_dirty_ring_bytes,
-                                   PROT_READ | PROT_WRITE, MAP_SHARED,
-                                   cpu->kvm_fd,
-                                   PAGE_SIZE * KVM_DIRTY_LOG_PAGE_OFFSET);
-        if (cpu->kvm_dirty_gfns == MAP_FAILED) {
-            ret = -errno;
+        ret = map_kvm_dirty_gfns(s, cpu, errp);
+        if (ret < 0) {
             goto err;
         }
     }
@@ -2710,6 +2831,16 @@ static int kvm_reset_vmfd(MachineState *ms)
     }
     assert(!err);
 
+    /*
+     * rebind new vcpu fds with the new kvm fds
+     * These can only be called after kvm_arch_on_vmfd_change()
+     */
+    ret = kvm_rebind_vcpus(&err);
+    if (ret < 0) {
+        return ret;
+    }
+    assert(!err);
+
     /* these can be only called after ram_block_rebind() */
     memory_listener_register(&kml->listener, &address_space_memory);
     memory_listener_register(&kvm_io_listener, &address_space_io);
diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events
index e4beda01488..4a8921c632b 100644
--- a/accel/kvm/trace-events
+++ b/accel/kvm/trace-events
@@ -15,6 +15,7 @@ kvm_park_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
 kvm_unpark_vcpu(unsigned long arch_cpu_id, const char *msg) "id: %lu %s"
 kvm_irqchip_commit_routes(void) ""
 kvm_reset_vmfd(void) ""
+kvm_rebind_vcpus(void) ""
 kvm_irqchip_add_msi_route(char *name, int vector, int virq) "dev %s vector %d virq %d"
 kvm_irqchip_update_msi_route(int virq) "Updating MSI route virq=%d"
 kvm_irqchip_release_virq(int virq) "virq %d"
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 057/102] i386/tdx: refactor TDX firmware memory initialization code into a new function
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (55 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 056/102] accel/kvm: rebind current VCPUs to the new KVM VM file descriptor upon reset Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 058/102] i386/tdx: finalize TDX guest state upon reset Paolo Bonzini
                   ` (44 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

A new helper function is introduced that refactors all firmware memory
initialization code into a separate function. No functional change.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-17-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/kvm/tdx.c | 73 ++++++++++++++++++++++++-------------------
 1 file changed, 40 insertions(+), 33 deletions(-)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index a3e81e1c0cc..fd8e3de9693 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -295,13 +295,50 @@ static void tdx_post_init_vcpus(void)
     }
 }
 
+static void tdx_init_fw_mem_region(void)
+{
+    TdxFirmware *tdvf = &tdx_guest->tdvf;
+    TdxFirmwareEntry *entry;
+    Error *local_err = NULL;
+    int r;
+
+    for_each_tdx_fw_entry(tdvf, entry) {
+        struct kvm_tdx_init_mem_region region;
+        uint32_t flags;
+
+        region = (struct kvm_tdx_init_mem_region) {
+            .source_addr = (uintptr_t)entry->mem_ptr,
+            .gpa = entry->address,
+            .nr_pages = entry->size >> 12,
+        };
+
+        flags = entry->attributes & TDVF_SECTION_ATTRIBUTES_MR_EXTEND ?
+                KVM_TDX_MEASURE_MEMORY_REGION : 0;
+
+        do {
+            error_free(local_err);
+            local_err = NULL;
+            r = tdx_vcpu_ioctl(first_cpu, KVM_TDX_INIT_MEM_REGION, flags,
+                               &region, &local_err);
+        } while (r == -EAGAIN || r == -EINTR);
+        if (r < 0) {
+            error_report_err(local_err);
+            exit(1);
+        }
+
+        if (entry->type == TDVF_SECTION_TYPE_TD_HOB ||
+            entry->type == TDVF_SECTION_TYPE_TEMP_MEM) {
+            qemu_ram_munmap(-1, entry->mem_ptr, entry->size);
+            entry->mem_ptr = NULL;
+        }
+    }
+}
+
 static void tdx_finalize_vm(Notifier *notifier, void *unused)
 {
     TdxFirmware *tdvf = &tdx_guest->tdvf;
     TdxFirmwareEntry *entry;
     RAMBlock *ram_block;
-    Error *local_err = NULL;
-    int r;
 
     tdx_init_ram_entries();
 
@@ -339,37 +376,7 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
     tdvf_hob_create(tdx_guest, tdx_get_hob_entry(tdx_guest));
 
     tdx_post_init_vcpus();
-
-    for_each_tdx_fw_entry(tdvf, entry) {
-        struct kvm_tdx_init_mem_region region;
-        uint32_t flags;
-
-        region = (struct kvm_tdx_init_mem_region) {
-            .source_addr = (uintptr_t)entry->mem_ptr,
-            .gpa = entry->address,
-            .nr_pages = entry->size >> 12,
-        };
-
-        flags = entry->attributes & TDVF_SECTION_ATTRIBUTES_MR_EXTEND ?
-                KVM_TDX_MEASURE_MEMORY_REGION : 0;
-
-        do {
-            error_free(local_err);
-            local_err = NULL;
-            r = tdx_vcpu_ioctl(first_cpu, KVM_TDX_INIT_MEM_REGION, flags,
-                               &region, &local_err);
-        } while (r == -EAGAIN || r == -EINTR);
-        if (r < 0) {
-            error_report_err(local_err);
-            exit(1);
-        }
-
-        if (entry->type == TDVF_SECTION_TYPE_TD_HOB ||
-            entry->type == TDVF_SECTION_TYPE_TEMP_MEM) {
-            qemu_ram_munmap(-1, entry->mem_ptr, entry->size);
-            entry->mem_ptr = NULL;
-        }
-    }
+    tdx_init_fw_mem_region();
 
     /*
      * TDVF image has been copied into private region above via
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 058/102] i386/tdx: finalize TDX guest state upon reset
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (56 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 057/102] i386/tdx: refactor TDX firmware memory initialization code into a new function Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 059/102] i386/tdx: add a pre-vmfd change notifier to reset tdx state Paolo Bonzini
                   ` (43 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

When the confidential virtual machine KVM file descriptor changes due to the
guest reset, some TDX specific setup steps needs to be done again. This
includes finalizing the initial guest launch state again. This change
re-executes some parts of the TDX setup during the device reset phaze using a
resettable interface. This finalizes the guest launch state again and locks
it in. Machine done notifier which was previously used is no longer needed as
the same code is now executed as a part of VM reset.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-18-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/kvm/tdx.h        |  1 +
 target/i386/kvm/tdx.c        | 38 +++++++++++++++++++++++++++++++-----
 target/i386/kvm/trace-events |  3 +++
 3 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index 1c38faf9834..264fbe530cc 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -70,6 +70,7 @@ typedef struct TdxGuest {
 
     uint32_t event_notify_vector;
     uint32_t event_notify_apicid;
+    ResettableState reset_state;
 } TdxGuest;
 
 #ifdef CONFIG_TDX
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index fd8e3de9693..37e91d95e1e 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -19,6 +19,7 @@
 #include "crypto/hash.h"
 #include "system/kvm_int.h"
 #include "system/runstate.h"
+#include "system/reset.h"
 #include "system/system.h"
 #include "system/ramblock.h"
 #include "system/address-spaces.h"
@@ -38,6 +39,7 @@
 #include "kvm_i386.h"
 #include "tdx.h"
 #include "tdx-quote-generator.h"
+#include "trace.h"
 
 #include "standard-headers/asm-x86/kvm_para.h"
 
@@ -389,9 +391,19 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
     CONFIDENTIAL_GUEST_SUPPORT(tdx_guest)->ready = true;
 }
 
-static Notifier tdx_machine_done_notify = {
-    .notify = tdx_finalize_vm,
-};
+static void tdx_handle_reset(Object *obj, ResetType type)
+{
+    if (!runstate_is_running() && !phase_check(PHASE_MACHINE_READY)) {
+        return;
+    }
+
+    if (!kvm_enable_hypercall(BIT_ULL(KVM_HC_MAP_GPA_RANGE))) {
+        error_setg(&error_fatal, "KVM_HC_MAP_GPA_RANGE not enabled for guest");
+    }
+
+    tdx_finalize_vm(NULL, NULL);
+    trace_tdx_handle_reset();
+}
 
 /*
  * Some CPUID bits change from fixed1 to configurable bits when TDX module
@@ -738,8 +750,6 @@ static int tdx_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
      */
     kvm_readonly_mem_allowed = false;
 
-    qemu_add_machine_init_done_notifier(&tdx_machine_done_notify);
-
     tdx_guest = tdx;
     return 0;
 }
@@ -1505,6 +1515,7 @@ OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
                                    TDX_GUEST,
                                    X86_CONFIDENTIAL_GUEST,
                                    { TYPE_USER_CREATABLE },
+                                   { TYPE_RESETTABLE_INTERFACE },
                                    { NULL })
 
 static void tdx_guest_init(Object *obj)
@@ -1538,16 +1549,24 @@ static void tdx_guest_init(Object *obj)
 
     tdx->event_notify_vector = -1;
     tdx->event_notify_apicid = -1;
+    qemu_register_resettable(obj);
 }
 
 static void tdx_guest_finalize(Object *obj)
 {
 }
 
+static ResettableState *tdx_reset_state(Object *obj)
+{
+    TdxGuest *tdx = TDX_GUEST(obj);
+    return &tdx->reset_state;
+}
+
 static void tdx_guest_class_init(ObjectClass *oc, const void *data)
 {
     ConfidentialGuestSupportClass *klass = CONFIDENTIAL_GUEST_SUPPORT_CLASS(oc);
     X86ConfidentialGuestClass *x86_klass = X86_CONFIDENTIAL_GUEST_CLASS(oc);
+    ResettableClass *rc = RESETTABLE_CLASS(oc);
 
     klass->kvm_init = tdx_kvm_init;
     klass->can_rebuild_guest_state = true;
@@ -1555,4 +1574,13 @@ static void tdx_guest_class_init(ObjectClass *oc, const void *data)
     x86_klass->cpu_instance_init = tdx_cpu_instance_init;
     x86_klass->adjust_cpuid_features = tdx_adjust_cpuid_features;
     x86_klass->check_features = tdx_check_features;
+
+    /*
+     * the exit phase makes sure sev handles reset after all legacy resets
+     * have taken place (in the hold phase) and IGVM has also properly
+     * set up the boot state.
+     */
+    rc->phases.exit = tdx_handle_reset;
+    rc->get_state = tdx_reset_state;
+
 }
diff --git a/target/i386/kvm/trace-events b/target/i386/kvm/trace-events
index 2d213c9f9b6..a3862345714 100644
--- a/target/i386/kvm/trace-events
+++ b/target/i386/kvm/trace-events
@@ -14,3 +14,6 @@ kvm_xen_soft_reset(void) ""
 kvm_xen_set_shared_info(uint64_t gfn) "shared info at gfn 0x%" PRIx64
 kvm_xen_set_vcpu_attr(int cpu, int type, uint64_t gpa) "vcpu attr cpu %d type %d gpa 0x%" PRIx64
 kvm_xen_set_vcpu_callback(int cpu, int vector) "callback vcpu %d vector %d"
+
+# tdx.c
+tdx_handle_reset(void) ""
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 059/102] i386/tdx: add a pre-vmfd change notifier to reset tdx state
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (57 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 058/102] i386/tdx: finalize TDX guest state upon reset Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 060/102] i386/sev: add migration blockers only once Paolo Bonzini
                   ` (42 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

During reset, when the VM file descriptor is changed, the TDX state needs to be
re-initialized. A notifier callback is implemented to reset the old
state and free memory before the new state is initialized post VM file
descriptor change.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-19-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/kvm/tdx.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 37e91d95e1e..4cae99c281a 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -405,6 +405,36 @@ static void tdx_handle_reset(Object *obj, ResetType type)
     trace_tdx_handle_reset();
 }
 
+/* TDX guest reset will require us to reinitialize some of tdx guest state. */
+static int set_tdx_vm_uninitialized(NotifierWithReturn *notifier,
+                                    void *data, Error** errp)
+{
+    TdxFirmware *fw = &tdx_guest->tdvf;
+
+    if (!((VmfdChangeNotifier *)data)->pre) {
+        return 0;
+    }
+
+    if (tdx_guest->initialized) {
+        tdx_guest->initialized = false;
+    }
+
+    g_free(tdx_guest->ram_entries);
+
+    /*
+     * the firmware entries will be parsed again, see
+     * x86_firmware_configure() -> tdx_parse_tdvf()
+     */
+    fw->entries = 0;
+    g_free(fw->entries);
+
+    return 0;
+}
+
+static NotifierWithReturn tdx_vmfd_change_notifier = {
+    .notify = set_tdx_vm_uninitialized,
+};
+
 /*
  * Some CPUID bits change from fixed1 to configurable bits when TDX module
  * supports TDX_FEATURES0.VE_REDUCTION. e.g., MCA/MCE/MTRR/CORE_CAPABILITY.
@@ -1549,6 +1579,7 @@ static void tdx_guest_init(Object *obj)
 
     tdx->event_notify_vector = -1;
     tdx->event_notify_apicid = -1;
+    kvm_vmfd_add_change_notifier(&tdx_vmfd_change_notifier);
     qemu_register_resettable(obj);
 }
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 060/102] i386/sev: add migration blockers only once
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (58 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 059/102] i386/tdx: add a pre-vmfd change notifier to reset tdx state Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 061/102] i386/sev: add notifiers " Paolo Bonzini
                   ` (41 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Prasad Pandit

From: Ani Sinha <anisinha@redhat.com>

sev_launch_finish() and sev_snp_launch_finish() could be called multiple times
when the confidential guest is being reset/rebooted. The migration
blockers should not be added multiple times, once per invocation. This change
makes sure that the migration blockers are added only one time by adding the
migration blockers to the vm state change handler when the vm transitions to
the running state. Subsequent reboots do not change the state of the vm.

Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-20-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/sev.c | 20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/target/i386/sev.c b/target/i386/sev.c
index 66e38ca32e1..260d8ef88bf 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -1421,11 +1421,6 @@ sev_launch_finish(SevCommonState *sev_common)
     }
 
     sev_set_guest_state(sev_common, SEV_STATE_RUNNING);
-
-    /* add migration blocker */
-    error_setg(&sev_mig_blocker,
-               "SEV: Migration is not implemented");
-    migrate_add_blocker(&sev_mig_blocker, &error_fatal);
 }
 
 static int snp_launch_update_data(uint64_t gpa, void *hva, size_t len,
@@ -1608,7 +1603,6 @@ static void
 sev_snp_launch_finish(SevCommonState *sev_common)
 {
     int ret, error;
-    Error *local_err = NULL;
     OvmfSevMetadata *metadata;
     SevLaunchUpdateData *data;
     SevSnpGuestState *sev_snp = SEV_SNP_GUEST(sev_common);
@@ -1655,15 +1649,6 @@ sev_snp_launch_finish(SevCommonState *sev_common)
 
     kvm_mark_guest_state_protected();
     sev_set_guest_state(sev_common, SEV_STATE_RUNNING);
-
-    /* add migration blocker */
-    error_setg(&sev_mig_blocker,
-               "SEV-SNP: Migration is not implemented");
-    ret = migrate_add_blocker(&sev_mig_blocker, &local_err);
-    if (local_err) {
-        error_report_err(local_err);
-        exit(1);
-    }
 }
 
 
@@ -1676,6 +1661,11 @@ sev_vm_state_change(void *opaque, bool running, RunState state)
     if (running) {
         if (!sev_check_state(sev_common, SEV_STATE_RUNNING)) {
             klass->launch_finish(sev_common);
+
+            /* add migration blocker */
+            error_setg(&sev_mig_blocker,
+                       "SEV: Migration is not implemented");
+            migrate_add_blocker(&sev_mig_blocker, &error_fatal);
         }
     }
 }
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 061/102] i386/sev: add notifiers only once
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (59 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 060/102] i386/sev: add migration blockers only once Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 062/102] i386/sev: free existing launch update data and kernel hashes data on init Paolo Bonzini
                   ` (40 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

The various notifiers that are used needs to be installed only once not on
every initialization. This includes the vm state change notifier and others.
This change uses 'cgs->ready' flag to install the notifiers only one time,
the first time.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-21-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/sev.c | 36 +++++++++++++++++++-----------------
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/target/i386/sev.c b/target/i386/sev.c
index 260d8ef88bf..647f4bf63d5 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -1920,8 +1920,9 @@ static int sev_common_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
         return -1;
     }
 
-    qemu_add_vm_change_state_handler(sev_vm_state_change, sev_common);
-
+    if (!cgs->ready) {
+        qemu_add_vm_change_state_handler(sev_vm_state_change, sev_common);
+    }
     cgs->ready = true;
 
     return 0;
@@ -1943,22 +1944,23 @@ static int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
         return -1;
     }
 
-    /*
-     * SEV uses these notifiers to register/pin pages prior to guest use,
-     * but SNP relies on guest_memfd for private pages, which has its
-     * own internal mechanisms for registering/pinning private memory.
-     */
-    ram_block_notifier_add(&sev_ram_notifier);
-
-    /*
-     * The machine done notify event is used for SEV guests to get the
-     * measurement of the encrypted images. When SEV-SNP is enabled, the
-     * measurement is part of the guest attestation process where it can
-     * be collected without any reliance on the VMM. So skip registering
-     * the notifier for SNP in favor of using guest attestation instead.
-     */
-    qemu_add_machine_init_done_notifier(&sev_machine_done_notify);
+    if (!cgs->ready) {
+        /*
+         * SEV uses these notifiers to register/pin pages prior to guest use,
+         * but SNP relies on guest_memfd for private pages, which has its
+         * own internal mechanisms for registering/pinning private memory.
+         */
+        ram_block_notifier_add(&sev_ram_notifier);
 
+        /*
+         * The machine done notify event is used for SEV guests to get the
+         * measurement of the encrypted images. When SEV-SNP is enabled, the
+         * measurement is part of the guest attestation process where it can
+         * be collected without any reliance on the VMM. So skip registering
+         * the notifier for SNP in favor of using guest attestation instead.
+         */
+        qemu_add_machine_init_done_notifier(&sev_machine_done_notify);
+    }
     return 0;
 }
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 062/102] i386/sev: free existing launch update data and kernel hashes data on init
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (60 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 061/102] i386/sev: add notifiers " Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:42 ` [PULL 063/102] i386/sev: add support for confidential guest reset Paolo Bonzini
                   ` (39 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

If there is existing launch update data and kernel hashes data, they need to be
freed when initialization code is executed. This is important for resettable
confidential guests where the initialization happens once every reset.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-22-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/sev.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/target/i386/sev.c b/target/i386/sev.c
index 647f4bf63d5..b3893e431c4 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -1773,6 +1773,7 @@ static int sev_common_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
     uint32_t ebx;
     uint32_t host_cbitpos;
     struct sev_user_data_status status = {};
+    SevLaunchUpdateData *data, *next_elm;
     SevCommonState *sev_common = SEV_COMMON(cgs);
     SevCommonStateClass *klass = SEV_COMMON_GET_CLASS(cgs);
     X86ConfidentialGuestClass *x86_klass =
@@ -1780,6 +1781,11 @@ static int sev_common_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
 
     sev_common->state = SEV_STATE_UNINIT;
 
+    /* free existing launch update data if any */
+    QTAILQ_FOREACH_SAFE(data, &launch_update, next, next_elm) {
+        g_free(data);
+    }
+
     host_cpuid(0x8000001F, 0, NULL, &ebx, NULL, NULL);
     host_cbitpos = ebx & 0x3f;
 
@@ -1968,6 +1974,8 @@ static int sev_snp_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
 {
     MachineState *ms = MACHINE(qdev_get_machine());
     X86MachineState *x86ms = X86_MACHINE(ms);
+    SevCommonState *sev_common = SEV_COMMON(cgs);
+    SevSnpGuestState *sev_snp_guest = SEV_SNP_GUEST(sev_common);
 
     if (x86ms->smm == ON_OFF_AUTO_AUTO) {
         x86ms->smm = ON_OFF_AUTO_OFF;
@@ -1976,6 +1984,10 @@ static int sev_snp_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
         return -1;
     }
 
+    /* free existing kernel hashes data if any */
+    g_free(sev_snp_guest->kernel_hashes_data);
+    sev_snp_guest->kernel_hashes_data = NULL;
+
     return 0;
 }
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 063/102] i386/sev: add support for confidential guest reset
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (61 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 062/102] i386/sev: free existing launch update data and kernel hashes data on init Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-10 16:59   ` Peter Maydell
  2026-03-02  8:42 ` [PULL 064/102] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors Paolo Bonzini
                   ` (38 subsequent siblings)
  101 siblings, 1 reply; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

When the KVM VM file descriptor changes as a part of the confidential guest
reset mechanism, it necessary to create a new confidential guest context and
re-encrypt the VM memory. This happens for SEV-ES and SEV-SNP virtual machines
as a part of SEV_LAUNCH_FINISH, SEV_SNP_LAUNCH_FINISH operations.

A new resettable interface for SEV module has been added. A new reset callback
for the reset 'exit' state has been implemented to perform the above operations
when the VM file descriptor has changed during VM reset.

Tracepoints has been added also for tracing purpose.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-23-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/sev.c        | 58 ++++++++++++++++++++++++++++++++++++++++
 target/i386/trace-events |  1 +
 2 files changed, 59 insertions(+)

diff --git a/target/i386/sev.c b/target/i386/sev.c
index b3893e431c4..549e6241769 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -30,8 +30,10 @@
 #include "system/kvm.h"
 #include "kvm/kvm_i386.h"
 #include "sev.h"
+#include "system/cpus.h"
 #include "system/system.h"
 #include "system/runstate.h"
+#include "system/reset.h"
 #include "trace.h"
 #include "migration/blocker.h"
 #include "qom/object.h"
@@ -86,6 +88,10 @@ typedef struct QEMU_PACKED PaddedSevHashTable {
     uint8_t padding[ROUND_UP(sizeof(SevHashTable), 16) - sizeof(SevHashTable)];
 } PaddedSevHashTable;
 
+static void sev_handle_reset(Object *obj, ResetType type);
+
+SevKernelLoaderContext sev_load_ctx = {};
+
 QEMU_BUILD_BUG_ON(sizeof(PaddedSevHashTable) % 16 != 0);
 
 #define SEV_INFO_BLOCK_GUID     "00f771de-1a7e-4fcb-890e-68c77e2fb44e"
@@ -129,6 +135,7 @@ struct SevCommonState {
     uint8_t build_id;
     int sev_fd;
     SevState state;
+    ResettableState reset_state;
 
     QTAILQ_HEAD(, SevLaunchVmsa) launch_vmsa;
 };
@@ -1666,6 +1673,11 @@ sev_vm_state_change(void *opaque, bool running, RunState state)
             error_setg(&sev_mig_blocker,
                        "SEV: Migration is not implemented");
             migrate_add_blocker(&sev_mig_blocker, &error_fatal);
+            /*
+             * mark SEV guest as resettable so that we can reinitialize
+             * SEV upon reset.
+             */
+            qemu_register_resettable(OBJECT(sev_common));
         }
     }
 }
@@ -1991,6 +2003,41 @@ static int sev_snp_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
     return 0;
 }
 
+/*
+ * handle sev vm reset
+ */
+static void sev_handle_reset(Object *obj, ResetType type)
+{
+    SevCommonState *sev_common = SEV_COMMON(MACHINE(qdev_get_machine())->cgs);
+    SevCommonStateClass *klass = SEV_COMMON_GET_CLASS(sev_common);
+
+    if (!sev_common) {
+        return;
+    }
+
+    if (!runstate_is_running()) {
+        return;
+    }
+
+    sev_add_kernel_loader_hashes(&sev_load_ctx, &error_fatal);
+    if (sev_es_enabled() && !sev_snp_enabled()) {
+        sev_launch_get_measure(NULL, NULL);
+    }
+    if (!sev_check_state(sev_common, SEV_STATE_RUNNING)) {
+        /* this calls sev_snp_launch_finish() etc */
+        klass->launch_finish(sev_common);
+    }
+
+    trace_sev_handle_reset();
+    return;
+}
+
+static ResettableState *sev_reset_state(Object *obj)
+{
+    SevCommonState *sev_common = SEV_COMMON(obj);
+    return &sev_common->reset_state;
+}
+
 int
 sev_encrypt_flash(hwaddr gpa, uint8_t *ptr, uint64_t len, Error **errp)
 {
@@ -2469,6 +2516,8 @@ bool sev_add_kernel_loader_hashes(SevKernelLoaderContext *ctx, Error **errp)
         return false;
     }
 
+    /* save the context here so that it can be re-used when vm is reset */
+    memcpy(&sev_load_ctx, ctx, sizeof(*ctx));
     return klass->build_kernel_loader_hashes(sev_common, area, ctx, errp);
 }
 
@@ -2729,8 +2778,16 @@ static void
 sev_common_class_init(ObjectClass *oc, const void *data)
 {
     ConfidentialGuestSupportClass *klass = CONFIDENTIAL_GUEST_SUPPORT_CLASS(oc);
+    ResettableClass *rc = RESETTABLE_CLASS(oc);
 
     klass->kvm_init = sev_common_kvm_init;
+    /*
+     * the exit phase makes sure sev handles reset after all legacy resets
+     * have taken place (in the hold phase) and IGVM has also properly
+     * set up the boot state.
+     */
+    rc->phases.exit = sev_handle_reset;
+    rc->get_state = sev_reset_state;
 
     object_class_property_add_str(oc, "sev-device",
                                   sev_common_get_sev_device,
@@ -2780,6 +2837,7 @@ static const TypeInfo sev_common_info = {
     .abstract = true,
     .interfaces = (const InterfaceInfo[]) {
         { TYPE_USER_CREATABLE },
+        { TYPE_RESETTABLE_INTERFACE },
         { }
     }
 };
diff --git a/target/i386/trace-events b/target/i386/trace-events
index 51301673f0c..b320f655eeb 100644
--- a/target/i386/trace-events
+++ b/target/i386/trace-events
@@ -14,3 +14,4 @@ kvm_sev_attestation_report(const char *mnonce, const char *data) "mnonce %s data
 kvm_sev_snp_launch_start(uint64_t policy, char *gosvw) "policy 0x%" PRIx64 " gosvw %s"
 kvm_sev_snp_launch_update(uint64_t src, uint64_t gpa, uint64_t len, const char *type) "src 0x%" PRIx64 " gpa 0x%" PRIx64 " len 0x%" PRIx64 " (%s page)"
 kvm_sev_snp_launch_finish(char *id_block, char *id_auth, char *host_data) "id_block %s id_auth %s host_data %s"
+sev_handle_reset(void) ""
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 064/102] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (62 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 063/102] i386/sev: add support for confidential guest reset Paolo Bonzini
@ 2026-03-02  8:42 ` Paolo Bonzini
  2026-03-02  8:43 ` [PULL 065/102] kvm/i8254: refactor pit initialization into a helper Paolo Bonzini
                   ` (37 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Cédric Le Goater

From: Ani Sinha <anisinha@redhat.com>

Normally the vfio pseudo device file descriptor lives for the life of the VM.
However, when the kvm VM file descriptor changes, a new file descriptor
for the pseudo device needs to be generated against the new kvm VM descriptor.
Other existing vfio descriptors needs to be reattached to the new pseudo device
descriptor. This change performs the above steps.

Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260227072445.406907-1-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/vfio/helpers.c | 91 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 91 insertions(+)

diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c
index f68f8165d09..00d42d3b98e 100644
--- a/hw/vfio/helpers.c
+++ b/hw/vfio/helpers.c
@@ -116,6 +116,88 @@ bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
  * we'll re-use it should another vfio device be attached before then.
  */
 int vfio_kvm_device_fd = -1;
+
+/*
+ * Confidential virtual machines:
+ * During reset of confidential vms, the kvm vm file descriptor changes.
+ * In this case, the old vfio kvm file descriptor is
+ * closed and a new descriptor is created against the new kvm vm file
+ * descriptor.
+ */
+
+typedef struct VFIODeviceFd {
+    int fd;
+    QLIST_ENTRY(VFIODeviceFd) node;
+} VFIODeviceFd;
+
+static QLIST_HEAD(, VFIODeviceFd) vfio_device_fds =
+    QLIST_HEAD_INITIALIZER(vfio_device_fds);
+
+static void vfio_device_fd_list_add(int fd)
+{
+    VFIODeviceFd *file_fd;
+    file_fd = g_malloc0(sizeof(*file_fd));
+    file_fd->fd = fd;
+    QLIST_INSERT_HEAD(&vfio_device_fds, file_fd, node);
+}
+
+static void vfio_device_fd_list_remove(int fd)
+{
+    VFIODeviceFd *file_fd, *next;
+
+    QLIST_FOREACH_SAFE(file_fd, &vfio_device_fds, node, next) {
+        if (file_fd->fd == fd) {
+            QLIST_REMOVE(file_fd, node);
+            g_free(file_fd);
+            break;
+        }
+    }
+}
+
+static int vfio_device_fd_rebind(NotifierWithReturn *notifier, void *data,
+                                  Error **errp)
+{
+    VFIODeviceFd *file_fd;
+    struct kvm_device_attr attr = {
+        .group = KVM_DEV_VFIO_FILE,
+        .attr = KVM_DEV_VFIO_FILE_ADD,
+    };
+    struct kvm_create_device cd = {
+        .type = KVM_DEV_TYPE_VFIO,
+    };
+
+    /* we are not interested in pre vmfd change notification */
+    if (((VmfdChangeNotifier *)data)->pre) {
+        return 0;
+    }
+
+    if (kvm_vm_ioctl(kvm_state, KVM_CREATE_DEVICE, &cd)) {
+        error_setg_errno(errp, errno, "Failed to create KVM VFIO device");
+        return -errno;
+    }
+
+    if (vfio_kvm_device_fd != -1) {
+        close(vfio_kvm_device_fd);
+    }
+
+    vfio_kvm_device_fd = cd.fd;
+
+    QLIST_FOREACH(file_fd, &vfio_device_fds, node) {
+        attr.addr = (uint64_t)(unsigned long)&file_fd->fd;
+        if (ioctl(vfio_kvm_device_fd, KVM_SET_DEVICE_ATTR, &attr)) {
+            error_setg_errno(errp, errno,
+                             "Failed to add fd %d to KVM VFIO device",
+                             file_fd->fd);
+            return -errno;
+        }
+    }
+    return 0;
+}
+
+static struct NotifierWithReturn vfio_vmfd_change_notifier = {
+    .notify = vfio_device_fd_rebind,
+};
+
 #endif
 
 void vfio_kvm_device_close(void)
@@ -153,6 +235,11 @@ int vfio_kvm_device_add_fd(int fd, Error **errp)
         }
 
         vfio_kvm_device_fd = cd.fd;
+        /*
+         * If the vm file descriptor changes, add a notifier so that we can
+         * re-create the vfio_kvm_device_fd.
+         */
+        kvm_vmfd_add_change_notifier(&vfio_vmfd_change_notifier);
     }
 
     if (ioctl(vfio_kvm_device_fd, KVM_SET_DEVICE_ATTR, &attr)) {
@@ -160,6 +247,8 @@ int vfio_kvm_device_add_fd(int fd, Error **errp)
                          fd);
         return -errno;
     }
+
+    vfio_device_fd_list_add(fd);
 #endif
     return 0;
 }
@@ -183,6 +272,8 @@ int vfio_kvm_device_del_fd(int fd, Error **errp)
                          "Failed to remove fd %d from KVM VFIO device", fd);
         return -errno;
     }
+
+    vfio_device_fd_list_remove(fd);
 #endif
     return 0;
 }
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 065/102] kvm/i8254: refactor pit initialization into a helper
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (63 preceding siblings ...)
  2026-03-02  8:42 ` [PULL 064/102] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors Paolo Bonzini
@ 2026-03-02  8:43 ` Paolo Bonzini
  2026-03-02  8:43 ` [PULL 066/102] kvm/i8254: add support for confidential guest reset Paolo Bonzini
                   ` (36 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

The initialization code will be used again by VM file descriptor change
notifier callback in a subsequent change. So refactor common code into a new
helper function.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-25-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/i386/kvm/i8254.c | 68 +++++++++++++++++++++++++--------------------
 1 file changed, 38 insertions(+), 30 deletions(-)

diff --git a/hw/i386/kvm/i8254.c b/hw/i386/kvm/i8254.c
index 81e742f8667..255047458a8 100644
--- a/hw/i386/kvm/i8254.c
+++ b/hw/i386/kvm/i8254.c
@@ -60,6 +60,43 @@ struct KVMPITClass {
     DeviceRealize parent_realize;
 };
 
+static void do_pit_initialize(KVMPITState *s, Error **errp)
+{
+    struct kvm_pit_config config = {
+        .flags = 0,
+    };
+    int ret;
+
+    ret = kvm_vm_ioctl(kvm_state, KVM_CREATE_PIT2, &config);
+    if (ret < 0) {
+        error_setg(errp, "Create kernel PIC irqchip failed: %s",
+                   strerror(-ret));
+        return;
+    }
+    switch (s->lost_tick_policy) {
+    case LOST_TICK_POLICY_DELAY:
+        break; /* enabled by default */
+    case LOST_TICK_POLICY_DISCARD:
+        if (kvm_check_extension(kvm_state, KVM_CAP_REINJECT_CONTROL)) {
+            struct kvm_reinject_control control = { .pit_reinject = 0 };
+
+            ret = kvm_vm_ioctl(kvm_state, KVM_REINJECT_CONTROL, &control);
+            if (ret < 0) {
+                error_setg(errp,
+                           "Can't disable in-kernel PIT reinjection: %s",
+                           strerror(-ret));
+                return;
+            }
+        }
+        break;
+    default:
+        error_setg(errp, "Lost tick policy not supported.");
+        return;
+    }
+
+    return;
+}
+
 static void kvm_pit_update_clock_offset(KVMPITState *s)
 {
     int64_t offset, clock_offset;
@@ -241,42 +278,13 @@ static void kvm_pit_realizefn(DeviceState *dev, Error **errp)
     PITCommonState *pit = PIT_COMMON(dev);
     KVMPITClass *kpc = KVM_PIT_GET_CLASS(dev);
     KVMPITState *s = KVM_PIT(pit);
-    struct kvm_pit_config config = {
-        .flags = 0,
-    };
-    int ret;
 
     if (!kvm_check_extension(kvm_state, KVM_CAP_PIT_STATE2) ||
         !kvm_check_extension(kvm_state, KVM_CAP_PIT2)) {
         error_setg(errp, "In-kernel PIT not available");
     }
 
-    ret = kvm_vm_ioctl(kvm_state, KVM_CREATE_PIT2, &config);
-    if (ret < 0) {
-        error_setg(errp, "Create kernel PIC irqchip failed: %s",
-                   strerror(-ret));
-        return;
-    }
-    switch (s->lost_tick_policy) {
-    case LOST_TICK_POLICY_DELAY:
-        break; /* enabled by default */
-    case LOST_TICK_POLICY_DISCARD:
-        if (kvm_check_extension(kvm_state, KVM_CAP_REINJECT_CONTROL)) {
-            struct kvm_reinject_control control = { .pit_reinject = 0 };
-
-            ret = kvm_vm_ioctl(kvm_state, KVM_REINJECT_CONTROL, &control);
-            if (ret < 0) {
-                error_setg(errp,
-                           "Can't disable in-kernel PIT reinjection: %s",
-                           strerror(-ret));
-                return;
-            }
-        }
-        break;
-    default:
-        error_setg(errp, "Lost tick policy not supported.");
-        return;
-    }
+    do_pit_initialize(s, errp);
 
     memory_region_init_io(&pit->ioports, OBJECT(dev), NULL, NULL, "kvm-pit", 4);
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 066/102] kvm/i8254: add support for confidential guest reset
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (64 preceding siblings ...)
  2026-03-02  8:43 ` [PULL 065/102] kvm/i8254: refactor pit initialization into a helper Paolo Bonzini
@ 2026-03-02  8:43 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 067/102] kvm/hyperv: add synic feature to CPU only if its not enabled Paolo Bonzini
                   ` (35 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

A confidential guest reset involves closing the old virtual machine KVM file
descriptor and opening a new one. Since its a new KVM fd, PIT needs to be
re-initialized again. This is done with the help of a notifier which is invoked
upon KVM vm file descriptor change during the confidential guest reset process.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-26-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/i386/kvm/i8254.c      | 23 +++++++++++++++++++++++
 hw/i386/kvm/trace-events |  1 +
 2 files changed, 24 insertions(+)

diff --git a/hw/i386/kvm/i8254.c b/hw/i386/kvm/i8254.c
index 255047458a8..70e8fd83cd0 100644
--- a/hw/i386/kvm/i8254.c
+++ b/hw/i386/kvm/i8254.c
@@ -35,6 +35,7 @@
 #include "hw/core/qdev-properties-system.h"
 #include "system/kvm.h"
 #include "target/i386/kvm/kvm_i386.h"
+#include "trace.h"
 #include "qom/object.h"
 
 #define KVM_PIT_REINJECT_BIT 0
@@ -52,6 +53,8 @@ struct KVMPITState {
     LostTickPolicy lost_tick_policy;
     bool vm_stopped;
     int64_t kernel_clock_offset;
+
+    NotifierWithReturn kvmpit_vmfd_change_notifier;
 };
 
 struct KVMPITClass {
@@ -203,6 +206,23 @@ static void kvm_pit_put(PITCommonState *pit)
     }
 }
 
+static int kvmpit_post_vmfd_change(NotifierWithReturn *notifier,
+                                   void *data, Error** errp)
+{
+    KVMPITState *s = container_of(notifier, KVMPITState,
+                                  kvmpit_vmfd_change_notifier);
+
+    /* we are not interested in pre vmfd change notification */
+    if (((VmfdChangeNotifier *)data)->pre) {
+        return 0;
+    }
+
+    do_pit_initialize(s, errp);
+
+    trace_kvmpit_post_vmfd_change();
+    return 0;
+}
+
 static void kvm_pit_set_gate(PITCommonState *s, PITChannelState *sc, int val)
 {
     kvm_pit_get(s);
@@ -292,6 +312,9 @@ static void kvm_pit_realizefn(DeviceState *dev, Error **errp)
 
     qemu_add_vm_change_state_handler(kvm_pit_vm_state_change, s);
 
+    s->kvmpit_vmfd_change_notifier.notify = kvmpit_post_vmfd_change;
+    kvm_vmfd_add_change_notifier(&s->kvmpit_vmfd_change_notifier);
+
     kpc->parent_realize(dev, errp);
 }
 
diff --git a/hw/i386/kvm/trace-events b/hw/i386/kvm/trace-events
index 67bf7f174ed..33680ff82bd 100644
--- a/hw/i386/kvm/trace-events
+++ b/hw/i386/kvm/trace-events
@@ -20,3 +20,4 @@ xenstore_reset_watches(void) ""
 xenstore_watch_event(const char *path, const char *token) "path %s token %s"
 xen_primary_console_create(void) ""
 xen_primary_console_reset(int port) "port %u"
+kvmpit_post_vmfd_change(void) ""
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 067/102] kvm/hyperv: add synic feature to CPU only if its not enabled
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (65 preceding siblings ...)
  2026-03-02  8:43 ` [PULL 066/102] kvm/i8254: add support for confidential guest reset Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 068/102] hw/hyperv/vmbus: add support for confidential guest reset Paolo Bonzini
                   ` (34 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

We need to make sure that synic CPU feature is not already enabled. If it is,
trying to enable it again will result in the following assertion:

Unexpected error in object_property_try_add() at ../qom/object.c:1268:
qemu-system-x86_64: attempt to add duplicate property 'synic' to object (type 'host-x86_64-cpu')

So enable synic only if its not enabled already.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-27-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/kvm/kvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 9d7a9ffceb8..7cfbc7832de 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1761,7 +1761,7 @@ static int hyperv_init_vcpu(X86CPU *cpu)
             return ret;
         }
 
-        if (!cpu->hyperv_synic_kvm_only) {
+        if (!cpu->hyperv_synic_kvm_only && !hyperv_is_synic_enabled()) {
             ret = hyperv_x86_synic_add(cpu);
             if (ret < 0) {
                 error_report("failed to create HyperV SynIC: %s",
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 068/102] hw/hyperv/vmbus: add support for confidential guest reset
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (66 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 067/102] kvm/hyperv: add synic feature to CPU only if its not enabled Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 069/102] kvm/xen-emu: re-initialize capabilities during " Paolo Bonzini
                   ` (33 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Maciej S. Szmigiero

From: Ani Sinha <anisinha@redhat.com>

On confidential guests when the KVM virtual machine file descriptor changes as
a part of the reset process, event file descriptors needs to be reassociated
with the new KVM VM file descriptor. This is achieved with the help of a
callback handler that gets called when KVM VM file descriptor changes during
the confidential guest reset process.

This patch is tested on non-confidential platform only.

Acked-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-28-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/hyperv/vmbus.c      | 37 +++++++++++++++++++++++++++++++++++++
 hw/hyperv/trace-events |  1 +
 2 files changed, 38 insertions(+)

diff --git a/hw/hyperv/vmbus.c b/hw/hyperv/vmbus.c
index c5bab5d2452..64abe4c4c16 100644
--- a/hw/hyperv/vmbus.c
+++ b/hw/hyperv/vmbus.c
@@ -20,6 +20,7 @@
 #include "hw/hyperv/vmbus-bridge.h"
 #include "hw/core/sysbus.h"
 #include "exec/cpu-common.h"
+#include "system/kvm.h"
 #include "exec/target_page.h"
 #include "trace.h"
 
@@ -248,6 +249,12 @@ struct VMBus {
      * interrupt page
      */
     EventNotifier notifier;
+
+    /*
+     * Notifier to inform when vmfd is changed as a part of confidential guest
+     * reset mechanism.
+     */
+    NotifierWithReturn vmbus_vmfd_change_notifier;
 };
 
 static bool gpadl_full(VMBusGpadl *gpadl)
@@ -2347,6 +2354,33 @@ static void vmbus_dev_unrealize(DeviceState *dev)
     free_channels(vdev);
 }
 
+/*
+ * If the KVM fd changes because of VM reset in confidential guests,
+ * reassociate event fd with the new KVM fd.
+ */
+static int vmbus_handle_vmfd_change(NotifierWithReturn *notifier,
+                                    void *data, Error** errp)
+{
+    VMBus *vmbus = container_of(notifier, VMBus,
+                                vmbus_vmfd_change_notifier);
+    int ret = 0;
+
+    /* we are not interested in pre vmfd change notification */
+    if (((VmfdChangeNotifier *)data)->pre) {
+        return 0;
+    }
+
+    ret = hyperv_set_event_flag_handler(VMBUS_EVENT_CONNECTION_ID,
+                                            &vmbus->notifier);
+    /* if we are only using userland event handler, it may already exist */
+    if (ret != 0 && ret != -EEXIST) {
+        error_setg(errp, "hyperv set event handler failed with %d", ret);
+    }
+
+    trace_vmbus_handle_vmfd_change();
+    return ret;
+}
+
 static const Property vmbus_dev_props[] = {
     DEFINE_PROP_UUID("instanceid", VMBusDevice, instanceid),
 };
@@ -2429,6 +2463,9 @@ static void vmbus_realize(BusState *bus, Error **errp)
         goto clear_event_notifier;
     }
 
+    vmbus->vmbus_vmfd_change_notifier.notify = vmbus_handle_vmfd_change;
+    kvm_vmfd_add_change_notifier(&vmbus->vmbus_vmfd_change_notifier);
+
     return;
 
 clear_event_notifier:
diff --git a/hw/hyperv/trace-events b/hw/hyperv/trace-events
index 7963c215b1c..d8c96f18e98 100644
--- a/hw/hyperv/trace-events
+++ b/hw/hyperv/trace-events
@@ -16,6 +16,7 @@ vmbus_gpadl_torndown(uint32_t gpadl_id) "gpadl #%d"
 vmbus_open_channel(uint32_t chan_id, uint32_t gpadl_id, uint32_t target_vp) "channel #%d gpadl #%d target vp %d"
 vmbus_channel_open(uint32_t chan_id, uint32_t status) "channel #%d status %d"
 vmbus_close_channel(uint32_t chan_id) "channel #%d"
+vmbus_handle_vmfd_change(void) ""
 
 # hv-balloon
 hv_balloon_state_change(const char *tostr) "-> %s"
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 069/102] kvm/xen-emu: re-initialize capabilities during confidential guest reset
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (67 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 068/102] hw/hyperv/vmbus: add support for confidential guest reset Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 070/102] ppc/openpic: create a new openpic device and reattach mem region on coco reset Paolo Bonzini
                   ` (32 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

On confidential guests KVM virtual machine file descriptor changes as a
part of the guest reset process. Xen capabilities needs to be re-initialized in
KVM against the new file descriptor.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-29-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/kvm/xen-emu.c | 38 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index 52de0198343..29364a92797 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -44,9 +44,12 @@
 
 #include "xen-compat.h"
 
+NotifierWithReturn xen_vmfd_change_notifier;
+static uint32_t xen_msr;
 static void xen_vcpu_singleshot_timer_event(void *opaque);
 static void xen_vcpu_periodic_timer_event(void *opaque);
 static int vcpuop_stop_singleshot_timer(CPUState *cs);
+static int do_initialize_xen_caps(KVMState *s, uint32_t hypercall_msr);
 
 #ifdef TARGET_X86_64
 #define hypercall_compat32(longmode) (!(longmode))
@@ -54,6 +57,23 @@ static int vcpuop_stop_singleshot_timer(CPUState *cs);
 #define hypercall_compat32(longmode) (false)
 #endif
 
+static int xen_handle_vmfd_change(NotifierWithReturn *n,
+                                  void *data, Error** errp)
+{
+    int ret;
+
+    /* we are not interested in pre vmfd change notification */
+    if (((VmfdChangeNotifier *)data)->pre) {
+        return 0;
+    }
+
+    ret = do_initialize_xen_caps(kvm_state, xen_msr);
+    if (ret < 0) {
+        return ret;
+    }
+    return 0;
+}
+
 static bool kvm_gva_to_gpa(CPUState *cs, uint64_t gva, uint64_t *gpa,
                            size_t *len, bool is_write)
 {
@@ -111,7 +131,7 @@ static inline int kvm_copy_to_gva(CPUState *cs, uint64_t gva, void *buf,
     return kvm_gva_rw(cs, gva, buf, sz, true);
 }
 
-int kvm_xen_init(KVMState *s, uint32_t hypercall_msr)
+static int do_initialize_xen_caps(KVMState *s, uint32_t hypercall_msr)
 {
     const int required_caps = KVM_XEN_HVM_CONFIG_HYPERCALL_MSR |
         KVM_XEN_HVM_CONFIG_INTERCEPT_HCALL | KVM_XEN_HVM_CONFIG_SHARED_INFO;
@@ -143,6 +163,19 @@ int kvm_xen_init(KVMState *s, uint32_t hypercall_msr)
                      strerror(-ret));
         return ret;
     }
+    return xen_caps;
+}
+
+int kvm_xen_init(KVMState *s, uint32_t hypercall_msr)
+{
+    int xen_caps;
+
+    xen_caps = do_initialize_xen_caps(s, hypercall_msr);
+    if (xen_caps < 0) {
+        return xen_caps;
+    }
+
+    xen_msr = hypercall_msr;
 
     /* If called a second time, don't repeat the rest of the setup. */
     if (s->xen_caps) {
@@ -185,6 +218,9 @@ int kvm_xen_init(KVMState *s, uint32_t hypercall_msr)
     xen_primary_console_reset();
     xen_xenstore_reset();
 
+    xen_vmfd_change_notifier.notify = xen_handle_vmfd_change;
+    kvm_vmfd_add_change_notifier(&xen_vmfd_change_notifier);
+
     return 0;
 }
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 070/102] ppc/openpic: create a new openpic device and reattach mem region on coco reset
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (68 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 069/102] kvm/xen-emu: re-initialize capabilities during " Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 071/102] kvm/vcpu: add notifiers to inform vcpu file descriptor change Paolo Bonzini
                   ` (31 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Bernhard Beschow

From: Ani Sinha <anisinha@redhat.com>

For confidential guests during the reset process, the old KVM VM file
descriptor is closed and a new one is created. When a new file descriptor is
created, a new openpic device needs to be created against this new KVM VM file
descriptor as well. Additionally, existing memory region needs to be reattached
to this new openpic device and proper CPU attributes set associating new file
descriptor. This change makes this happen with the help of a callback handler
that gets called when the KVM VM file descriptor changes as a part of the
confidential guest reset process.

Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-30-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/intc/openpic_kvm.c | 112 +++++++++++++++++++++++++++++++++---------
 1 file changed, 88 insertions(+), 24 deletions(-)

diff --git a/hw/intc/openpic_kvm.c b/hw/intc/openpic_kvm.c
index fbf0bdbe071..b099da20eb9 100644
--- a/hw/intc/openpic_kvm.c
+++ b/hw/intc/openpic_kvm.c
@@ -49,6 +49,7 @@ struct KVMOpenPICState {
     uint32_t fd;
     uint32_t model;
     hwaddr mapped;
+    NotifierWithReturn vmfd_change_notifier;
 };
 
 static void kvm_openpic_set_irq(void *opaque, int n_IRQ, int level)
@@ -114,6 +115,88 @@ static const MemoryRegionOps kvm_openpic_mem_ops = {
     },
 };
 
+static int kvm_openpic_setup(KVMOpenPICState *opp, Error **errp)
+{
+    int kvm_openpic_model;
+    struct kvm_create_device cd = {0};
+    KVMState *s = kvm_state;
+    int ret;
+
+    switch (opp->model) {
+    case OPENPIC_MODEL_FSL_MPIC_20:
+        kvm_openpic_model = KVM_DEV_TYPE_FSL_MPIC_20;
+        break;
+
+    case OPENPIC_MODEL_FSL_MPIC_42:
+        kvm_openpic_model = KVM_DEV_TYPE_FSL_MPIC_42;
+        break;
+
+    default:
+        error_setg(errp, "Unsupported OpenPIC model %" PRIu32, opp->model);
+        return -1;
+    }
+
+    cd.type = kvm_openpic_model;
+    ret = kvm_vm_ioctl(s, KVM_CREATE_DEVICE, &cd);
+    if (ret < 0) {
+        error_setg(errp, "Can't create device %d: %s",
+                   cd.type, strerror(errno));
+        return -1;
+    }
+    opp->fd = cd.fd;
+
+    return 0;
+}
+
+static int kvm_openpic_handle_vmfd_change(NotifierWithReturn *notifier,
+                                          void *data, Error **errp)
+{
+    KVMOpenPICState *opp = container_of(notifier, KVMOpenPICState,
+                                        vmfd_change_notifier);
+    uint64_t reg_base;
+    struct kvm_device_attr attr;
+    CPUState *cs;
+    int ret;
+
+    /* we are not interested in pre vmfd change notification */
+    if (((VmfdChangeNotifier *)data)->pre) {
+        return 0;
+    }
+
+    /* close the old descriptor */
+    close(opp->fd);
+
+    if (kvm_openpic_setup(opp, errp) < 0) {
+        return -1;
+    }
+
+    if (!opp->mapped) {
+        return 0;
+    }
+
+    reg_base = opp->mapped;
+    attr.group = KVM_DEV_MPIC_GRP_MISC;
+    attr.attr = KVM_DEV_MPIC_BASE_ADDR;
+    attr.addr = (uint64_t)(unsigned long)&reg_base;
+
+    ret = ioctl(opp->fd, KVM_SET_DEVICE_ATTR, &attr);
+    if (ret < 0) {
+        error_setg(errp, "%s: %s %" PRIx64, __func__,
+                   strerror(errno), reg_base);
+        return -1;
+    }
+
+    CPU_FOREACH(cs) {
+        ret = kvm_vcpu_enable_cap(cs, KVM_CAP_IRQ_MPIC, 0, opp->fd,
+                                   kvm_arch_vcpu_id(cs));
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
 static void kvm_openpic_region_add(MemoryListener *listener,
                                    MemoryRegionSection *section)
 {
@@ -197,36 +280,14 @@ static void kvm_openpic_realize(DeviceState *dev, Error **errp)
     SysBusDevice *d = SYS_BUS_DEVICE(dev);
     KVMOpenPICState *opp = KVM_OPENPIC(dev);
     KVMState *s = kvm_state;
-    int kvm_openpic_model;
-    struct kvm_create_device cd = {0};
-    int ret, i;
+    int i;
 
     if (!kvm_check_extension(s, KVM_CAP_DEVICE_CTRL)) {
         error_setg(errp, "Kernel is lacking Device Control API");
         return;
     }
 
-    switch (opp->model) {
-    case OPENPIC_MODEL_FSL_MPIC_20:
-        kvm_openpic_model = KVM_DEV_TYPE_FSL_MPIC_20;
-        break;
-
-    case OPENPIC_MODEL_FSL_MPIC_42:
-        kvm_openpic_model = KVM_DEV_TYPE_FSL_MPIC_42;
-        break;
-
-    default:
-        error_setg(errp, "Unsupported OpenPIC model %" PRIu32, opp->model);
-        return;
-    }
-
-    cd.type = kvm_openpic_model;
-    ret = kvm_vm_ioctl(s, KVM_CREATE_DEVICE, &cd);
-    if (ret < 0) {
-        error_setg_errno(errp, errno, "Can't create device %d", cd.type);
-        return;
-    }
-    opp->fd = cd.fd;
+    kvm_openpic_setup(opp, errp);
 
     sysbus_init_mmio(d, &opp->mem);
     qdev_init_gpio_in(dev, kvm_openpic_set_irq, OPENPIC_MAX_IRQ);
@@ -235,6 +296,9 @@ static void kvm_openpic_realize(DeviceState *dev, Error **errp)
     opp->mem_listener.region_del = kvm_openpic_region_del;
     opp->mem_listener.name = "openpic-kvm";
     memory_listener_register(&opp->mem_listener, &address_space_memory);
+    opp->vmfd_change_notifier.notify =
+        kvm_openpic_handle_vmfd_change;
+    kvm_vmfd_add_change_notifier(&opp->vmfd_change_notifier);
 
     /* indicate pic capabilities */
     msi_nonbroken = true;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 071/102] kvm/vcpu: add notifiers to inform vcpu file descriptor change
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (69 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 070/102] ppc/openpic: create a new openpic device and reattach mem region on coco reset Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 072/102] kvm/clock: add support for confidential guest reset Paolo Bonzini
                   ` (30 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

When new vcpu file descriptors are created and bound to the new kvm file
descriptor as a part of the confidential guest reset mechanism, various
subsystems needs to know about it. This change adds notifiers so that various
subsystems can take appropriate actions when vcpu fds change by registering
their handlers to this notifier.
Subsequent changes will register specific handlers to this notifier.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-31-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/system/kvm.h   | 17 +++++++++++++++++
 accel/kvm/kvm-all.c    | 26 ++++++++++++++++++++++++++
 accel/stubs/kvm-stub.c | 10 ++++++++++
 3 files changed, 53 insertions(+)

diff --git a/include/system/kvm.h b/include/system/kvm.h
index fbe23608a16..4b0e1b4ab14 100644
--- a/include/system/kvm.h
+++ b/include/system/kvm.h
@@ -590,4 +590,21 @@ void kvm_vmfd_add_change_notifier(NotifierWithReturn *n);
  */
 void kvm_vmfd_remove_change_notifier(NotifierWithReturn *n);
 
+/**
+ * kvm_vcpufd_add_change_notifier - register a notifier to get notified when
+ * a KVM vcpu file descriptors changes as a part of the confidential guest
+ * "reset" process. Various subsystems should use this mechanism to take
+ * actions such as re-issuing vcpu ioctls as a part of setting up vcpu
+ * features.
+ * @n: notifier with return value.
+ */
+void kvm_vcpufd_add_change_notifier(NotifierWithReturn *n);
+
+/**
+ * kvm_vcpufd_remove_change_notifier - de-register a notifer previously
+ * registered with kvm_vcpufd_add_change_notifier call.
+ * @n: notifier that was previously registered.
+ */
+void kvm_vcpufd_remove_change_notifier(NotifierWithReturn *n);
+
 #endif
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index a347a71a2ee..a1f910e9dff 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -127,6 +127,9 @@ static NotifierList kvm_irqchip_change_notifiers =
 static NotifierWithReturnList register_vmfd_changed_notifiers =
     NOTIFIER_WITH_RETURN_LIST_INITIALIZER(register_vmfd_changed_notifiers);
 
+static NotifierWithReturnList register_vcpufd_changed_notifiers =
+    NOTIFIER_WITH_RETURN_LIST_INITIALIZER(register_vcpufd_changed_notifiers);
+
 static int map_kvm_run(KVMState *s, CPUState *cpu, Error **errp);
 static int map_kvm_dirty_gfns(KVMState *s, CPUState *cpu, Error **errp);
 static int vcpu_unmap_regions(KVMState *s, CPUState *cpu);
@@ -2314,6 +2317,22 @@ static int kvm_vmfd_change_notify(Error **errp)
                                             &vmfd_notifier, errp);
 }
 
+void kvm_vcpufd_add_change_notifier(NotifierWithReturn *n)
+{
+    notifier_with_return_list_add(&register_vcpufd_changed_notifiers, n);
+}
+
+void kvm_vcpufd_remove_change_notifier(NotifierWithReturn *n)
+{
+    notifier_with_return_remove(n);
+}
+
+static int kvm_vcpufd_change_notify(Error **errp)
+{
+    return notifier_with_return_list_notify(&register_vcpufd_changed_notifiers,
+                                            &vmfd_notifier, errp);
+}
+
 int kvm_irqchip_get_virq(KVMState *s)
 {
     int next_virq;
@@ -2841,6 +2860,13 @@ static int kvm_reset_vmfd(MachineState *ms)
     }
     assert(!err);
 
+    /* notify everyone that vcpu fd has changed. */
+    ret = kvm_vcpufd_change_notify(&err);
+    if (ret < 0) {
+        return ret;
+    }
+    assert(!err);
+
     /* these can be only called after ram_block_rebind() */
     memory_listener_register(&kml->listener, &address_space_memory);
     memory_listener_register(&kvm_io_listener, &address_space_io);
diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
index a6e8a6e16cf..c4617caac6b 100644
--- a/accel/stubs/kvm-stub.c
+++ b/accel/stubs/kvm-stub.c
@@ -87,6 +87,16 @@ void kvm_vmfd_remove_change_notifier(NotifierWithReturn *n)
 {
 }
 
+void kvm_vcpufd_add_change_notifier(NotifierWithReturn *n)
+{
+    return;
+}
+
+void kvm_vcpufd_remove_change_notifier(NotifierWithReturn *n)
+{
+    return;
+}
+
 int kvm_irqchip_add_irqfd_notifier_gsi(KVMState *s, EventNotifier *n,
                                        EventNotifier *rn, int virq)
 {
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 072/102] kvm/clock: add support for confidential guest reset
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (70 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 071/102] kvm/vcpu: add notifiers to inform vcpu file descriptor change Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 073/102] hw/machine: introduce machine specific option 'x-change-vmfd-on-reset' Paolo Bonzini
                   ` (29 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

Confidential guests change the KVM VM file descriptor upon reset and also create
new VCPU file descriptors against the new KVM VM file descriptor. We need to
save the clock state from kvm before KVM VM file descriptor changes and restore
it after. Also after VCPU file descriptors changed, we must call
KVM_KVMCLOCK_CTRL on the VCPU file descriptor to inform KVM that the VCPU is
in paused state.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-32-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/i386/kvm/clock.c | 59 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c
index aba6842a22c..10d34254f02 100644
--- a/hw/i386/kvm/clock.c
+++ b/hw/i386/kvm/clock.c
@@ -50,6 +50,9 @@ struct KVMClockState {
     /* whether the 'clock' value was obtained in a host with
      * reliable KVM_GET_CLOCK */
     bool clock_is_reliable;
+
+    NotifierWithReturn kvmclock_vcpufd_change_notifier;
+    NotifierWithReturn kvmclock_vmfd_change_notifier;
 };
 
 struct pvclock_vcpu_time_info {
@@ -63,6 +66,9 @@ struct pvclock_vcpu_time_info {
     uint8_t    pad[2];
 } __attribute__((__packed__)); /* 32 bytes */
 
+static int kvmclock_set_clock(NotifierWithReturn *notifier,
+                              void *data, Error** errp);
+
 static uint64_t kvmclock_current_nsec(KVMClockState *s)
 {
     CPUState *cpu = first_cpu;
@@ -219,6 +225,54 @@ static void kvmclock_vm_state_change(void *opaque, bool running,
     }
 }
 
+static int kvmclock_save_clock(NotifierWithReturn *notifier,
+                               void *data, Error** errp)
+{
+    if (!((VmfdChangeNotifier *)data)->pre) {
+        return 0;
+    }
+    KVMClockState *s = container_of(notifier, KVMClockState,
+                                    kvmclock_vmfd_change_notifier);
+    kvm_update_clock(s);
+    return 0;
+}
+
+static int kvmclock_set_clock(NotifierWithReturn *notifier,
+                              void *data, Error** errp)
+{
+    struct kvm_clock_data clock_data = {};
+    CPUState *cpu;
+    int ret;
+    KVMClockState *s = container_of(notifier, KVMClockState,
+                                    kvmclock_vcpufd_change_notifier);
+    int cap_clock_ctrl = kvm_check_extension(kvm_state, KVM_CAP_KVMCLOCK_CTRL);
+
+    if (!s->clock_is_reliable) {
+        uint64_t pvclock_via_mem = kvmclock_current_nsec(s);
+        /* saved clock value before vmfd change is not reliable */
+        if (pvclock_via_mem) {
+            s->clock = pvclock_via_mem;
+        }
+    }
+
+    clock_data.clock = s->clock;
+    ret = kvm_vm_ioctl(kvm_state, KVM_SET_CLOCK, &clock_data);
+    if (ret < 0) {
+        fprintf(stderr, "KVM_SET_CLOCK failed: %s\n", strerror(-ret));
+        abort();
+    }
+
+    if (!cap_clock_ctrl) {
+        return 0;
+    }
+    CPU_FOREACH(cpu) {
+        run_on_cpu(cpu, do_kvmclock_ctrl, RUN_ON_CPU_NULL);
+    }
+
+    return 0;
+}
+
+
 static void kvmclock_realize(DeviceState *dev, Error **errp)
 {
     KVMClockState *s = KVM_CLOCK(dev);
@@ -230,7 +284,12 @@ static void kvmclock_realize(DeviceState *dev, Error **errp)
 
     kvm_update_clock(s);
 
+    s->kvmclock_vcpufd_change_notifier.notify = kvmclock_set_clock;
+    s->kvmclock_vmfd_change_notifier.notify = kvmclock_save_clock;
+
     qemu_add_vm_change_state_handler(kvmclock_vm_state_change, s);
+    kvm_vcpufd_add_change_notifier(&s->kvmclock_vcpufd_change_notifier);
+    kvm_vmfd_add_change_notifier(&s->kvmclock_vmfd_change_notifier);
 }
 
 static bool kvmclock_clock_is_reliable_needed(void *opaque)
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 073/102] hw/machine: introduce machine specific option 'x-change-vmfd-on-reset'
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (71 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 072/102] kvm/clock: add support for confidential guest reset Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-10 16:49   ` Peter Maydell
  2026-03-02  8:47 ` [PULL 074/102] tests/functional/x86_64: add functional test to exercise vm fd change on reset Paolo Bonzini
                   ` (28 subsequent siblings)
  101 siblings, 1 reply; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

A new machine specific option 'x-change-vmfd-on-reset' is introduced for
debugging and testing only (hence the 'x-' prefix). This option when enabled
will force KVM VM file descriptor to be changed upon guest reset like
in the case of confidential guests. This can be used to exercise the code
changes that are specific for confidential guests on non-confidential
guests as well (except changes that require hardware support for
confidential guests).
A new functional test has been added in the next patch that uses this new
parameter to test the VM file descriptor changes.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-33-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/hw/core/boards.h |  6 ++++++
 hw/core/machine.c        | 22 ++++++++++++++++++++++
 system/runstate.c        |  6 +++---
 3 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/include/hw/core/boards.h b/include/hw/core/boards.h
index edbe8d03e56..12b21493789 100644
--- a/include/hw/core/boards.h
+++ b/include/hw/core/boards.h
@@ -448,6 +448,12 @@ struct MachineState {
     struct NVDIMMState *nvdimms_state;
     struct NumaState *numa_state;
     bool acpi_spcr_enabled;
+    /*
+     * Whether to change virtual machine accelerator handle upon
+     * reset or not. Used only for debugging and testing purpose.
+     * Set to false by default for all regular use.
+     */
+    bool new_accel_vmfd_on_reset;
 };
 
 /*
diff --git a/hw/core/machine.c b/hw/core/machine.c
index d4ef620c178..eae1f6be8d5 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -435,6 +435,21 @@ static void machine_set_dump_guest_core(Object *obj, bool value, Error **errp)
     ms->dump_guest_core = value;
 }
 
+static bool machine_get_new_accel_vmfd_on_reset(Object *obj, Error **errp)
+{
+    MachineState *ms = MACHINE(obj);
+
+    return ms->new_accel_vmfd_on_reset;
+}
+
+static void machine_set_new_accel_vmfd_on_reset(Object *obj,
+                                                bool value, Error **errp)
+{
+    MachineState *ms = MACHINE(obj);
+
+    ms->new_accel_vmfd_on_reset = value;
+}
+
 static bool machine_get_mem_merge(Object *obj, Error **errp)
 {
     MachineState *ms = MACHINE(obj);
@@ -1183,6 +1198,13 @@ static void machine_class_init(ObjectClass *oc, const void *data)
     object_class_property_set_description(oc, "dump-guest-core",
         "Include guest memory in a core dump");
 
+    object_class_property_add_bool(oc, "x-change-vmfd-on-reset",
+        machine_get_new_accel_vmfd_on_reset,
+        machine_set_new_accel_vmfd_on_reset);
+    object_class_property_set_description(oc, "x-change-vmfd-on-reset",
+        "Set on/off to enable/disable generating new accelerator guest handle "
+         "on guest reset. Default: off (used only for testing/debugging).");
+
     object_class_property_add_bool(oc, "mem-merge",
         machine_get_mem_merge, machine_set_mem_merge);
     object_class_property_set_description(oc, "mem-merge",
diff --git a/system/runstate.c b/system/runstate.c
index e7b50e6a3b1..eca722b43c6 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -526,9 +526,9 @@ void qemu_system_reset(ShutdownCause reason)
         type = RESET_TYPE_COLD;
     }
 
-    if (!cpus_are_resettable() &&
-        (reason == SHUTDOWN_CAUSE_GUEST_RESET ||
-         reason == SHUTDOWN_CAUSE_HOST_QMP_SYSTEM_RESET)) {
+    if ((reason == SHUTDOWN_CAUSE_GUEST_RESET ||
+         reason == SHUTDOWN_CAUSE_HOST_QMP_SYSTEM_RESET) &&
+        (current_machine->new_accel_vmfd_on_reset || !cpus_are_resettable())) {
         if (ac->rebuild_guest) {
             ret = ac->rebuild_guest(current_machine);
             if (ret < 0) {
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 074/102] tests/functional/x86_64: add functional test to exercise vm fd change on reset
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (72 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 073/102] hw/machine: introduce machine specific option 'x-change-vmfd-on-reset' Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 075/102] qom: add 'confidential-guest-reset' property for x86 confidential vms Paolo Bonzini
                   ` (27 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha

From: Ani Sinha <anisinha@redhat.com>

A new functional test is added that exercises the code changes related to
closing of the old KVM VM file descriptor and opening a new one upon VM reset.
This normally happens when confidential guests are reset but for
non-confidential guests, we use a special machine specific debug/test parameter
'x-change-vmfd-on-reset' to enable this behavior.
Only specific code changes related to re-initialisation of SEV-ES, SEV-SNP and
TDX platforms are not exercised in this test as they require hardware that
supports running confidential guests.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-34-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 MAINTAINERS                                  |   1 +
 tests/functional/x86_64/meson.build          |   1 +
 tests/functional/x86_64/test_rebuild_vmfd.py | 136 +++++++++++++++++++
 3 files changed, 138 insertions(+)
 create mode 100755 tests/functional/x86_64/test_rebuild_vmfd.py

diff --git a/MAINTAINERS b/MAINTAINERS
index a8e1546de1e..83c05112a33 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -157,6 +157,7 @@ M: Ani Sinha <anisinha@redhat.com>
 M: Paolo Bonzini <pbonzini@redhat.com>
 S: Maintained
 F: stubs/kvm.c
+F: tests/functional/x86_64/test_rebuild_vmfd.py
 
 Guest CPU cores (TCG)
 ---------------------
diff --git a/tests/functional/x86_64/meson.build b/tests/functional/x86_64/meson.build
index beab4f304ba..05e4914c772 100644
--- a/tests/functional/x86_64/meson.build
+++ b/tests/functional/x86_64/meson.build
@@ -37,4 +37,5 @@ tests_x86_64_system_thorough = [
   'vhost_user_bridge',
   'virtio_balloon',
   'virtio_gpu',
+  'rebuild_vmfd',
 ]
diff --git a/tests/functional/x86_64/test_rebuild_vmfd.py b/tests/functional/x86_64/test_rebuild_vmfd.py
new file mode 100755
index 00000000000..5a8e5fd89b4
--- /dev/null
+++ b/tests/functional/x86_64/test_rebuild_vmfd.py
@@ -0,0 +1,136 @@
+#!/usr/bin/env python3
+#
+# Functional tests exercising guest KVM file descriptor change on reset.
+#
+# Copyright © 2026 Red Hat, Inc.
+#
+# Author:
+#  Ani Sinha <anisinha@redhat.com>
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+import os
+from qemu.machine import machine
+
+from qemu_test import QemuSystemTest, Asset, exec_command_and_wait_for_pattern
+from qemu_test import wait_for_console_pattern
+
+class KVMGuest(QemuSystemTest):
+
+    # ASSET UKI was generated using
+    # https://gitlab.com/kraxel/edk2-tests/-/blob/unittest/tools/make-supermin.sh
+    ASSET_UKI = Asset('https://gitlab.com/anisinha/misc-artifacts/'
+                      '-/raw/main/uki.x86-64.efi?ref_type=heads',
+                      'e0f806bd1fa24111312e1fe849d2ee69808d4343930a5'
+                      'dc8c1688da17c65f576')
+    # ASSET_OVMF comes from /usr/share/edk2/ovmf/OVMF.stateless.fd of a
+    # fedora core 43 distribution which in turn comes from the
+    # edk2-ovmf-20251119-3.fc43.noarch rpm of that distribution.
+    ASSET_OVMF = Asset('https://gitlab.com/anisinha/misc-artifacts/'
+                       '-/raw/main/OVMF.stateless.fd?ref_type=heads',
+                       '58a4275aafa8774bd6b1540adceae4ea434b8db75b476'
+                       '11839ff47be88cfcf22')
+
+    def common_vm_setup(self, kvm_args=None, cpu_args=None):
+        self.set_machine('q35')
+        self.require_accelerator("kvm")
+
+        self.vm.set_console()
+        if kvm_args:
+            self.vm.add_args("-accel", "kvm,%s" %kvm_args)
+        else:
+            self.vm.add_args("-accel", "kvm")
+        self.vm.add_args("-smp", "2")
+        if cpu_args:
+            self.vm.add_args("-cpu", "host,%s" %cpu_args)
+        else:
+            self.vm.add_args("-cpu", "host")
+        self.vm.add_args("-m", "2G")
+        self.vm.add_args("-nographic", "-nodefaults")
+
+
+        self.uki_path = self.ASSET_UKI.fetch()
+        self.ovmf_path = self.ASSET_OVMF.fetch()
+
+        self.vm.add_args('-kernel', self.uki_path)
+        self.vm.add_args("-bios", self.ovmf_path)
+        # enable KVM VMFD change on reset for a non-coco VM
+        self.vm.add_args("-machine", "q35,x-change-vmfd-on-reset=on")
+
+        # enable tracing of basic vmfd change function
+        self.vm.add_args("--trace", "kvm_reset_vmfd")
+
+    def launch_vm(self):
+        try:
+            self.vm.launch()
+        except machine.VMLaunchFailure as e:
+            if "Xen HVM guest support not present" in e.output:
+                self.skipTest("KVM Xen support is not present "
+                              "(need v5.12+ kernel with CONFIG_KVM_XEN)")
+            elif "Property 'kvm-accel.xen-version' not found" in e.output:
+                self.skipTest("QEMU not built with CONFIG_XEN_EMU support")
+            else:
+                raise e
+
+        self.log.info('VM launched')
+        console_pattern = 'bash-5.1#'
+        wait_for_console_pattern(self, console_pattern)
+        self.log.info('VM ready with a bash prompt')
+
+    def vm_console_reset(self):
+        exec_command_and_wait_for_pattern(self, '/usr/sbin/reboot -f',
+                                          'reboot: machine restart')
+        console_pattern = '# --- Hello world ---'
+        wait_for_console_pattern(self, console_pattern)
+        self.vm.shutdown()
+
+    def vm_qmp_reset(self):
+        self.vm.qmp('system_reset')
+        console_pattern = '# --- Hello world ---'
+        wait_for_console_pattern(self, console_pattern)
+        self.vm.shutdown()
+
+    def check_logs(self):
+        self.assertRegex(self.vm.get_log(),
+                         r'kvm_reset_vmfd')
+        self.assertRegex(self.vm.get_log(),
+                         r'virtual machine state has been rebuilt')
+
+    def test_reset_console(self):
+        self.common_vm_setup()
+        self.launch_vm()
+        self.vm_console_reset()
+        self.check_logs()
+
+    def test_reset_qmp(self):
+        self.common_vm_setup()
+        self.launch_vm()
+        self.vm_qmp_reset()
+        self.check_logs()
+
+    def test_reset_kvmpit(self):
+        self.common_vm_setup()
+        self.vm.add_args("--trace", "kvmpit_post_vmfd_change")
+        self.launch_vm()
+        self.vm_console_reset()
+        self.assertRegex(self.vm.get_log(),
+                         r'kvmpit_post_vmfd_change')
+
+    def test_reset_xen_emulation(self):
+        self.common_vm_setup("xen-version=0x4000a,kernel-irqchip=split")
+        self.launch_vm()
+        self.vm_console_reset()
+        self.check_logs()
+
+    def test_reset_hyperv_vmbus(self):
+        self.common_vm_setup(None, "hv-syndbg,hv-relaxed,hv_time,hv-synic,"
+                             "hv-vpindex,hv-runtime,hv-stimer")
+        self.vm.add_args("-device", "vmbus-bridge,irq=15")
+        self.vm.add_args("-trace", "vmbus_handle_vmfd_change")
+        self.launch_vm()
+        self.vm_console_reset()
+        self.assertRegex(self.vm.get_log(),
+                         r'vmbus_handle_vmfd_change')
+
+if __name__ == '__main__':
+    QemuSystemTest.main()
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 075/102] qom: add 'confidential-guest-reset' property for x86 confidential vms
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (73 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 074/102] tests/functional/x86_64: add functional test to exercise vm fd change on reset Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 076/102] audio: fix nominal volume channel (cosmetic) Paolo Bonzini
                   ` (26 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Daniel P. Berrangé, Markus Armbruster

From: Ani Sinha <anisinha@redhat.com>

Through the new 'confidential-guest-reset' property, control plane should be
able to detect if the hypervisor supports x86 confidential guest resets. Older
hypervisors that do not support resets will not have this property populated.

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-35-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 qapi/qom.json | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/qapi/qom.json b/qapi/qom.json
index 6f5c9de0f0b..c653248f85d 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -1009,13 +1009,19 @@
 #     designated guest firmware page for measured boot with -kernel
 #     (default: false) (since 6.2)
 #
+# Features:
+#
+# @confidential-guest-reset: If present, the hypervisor supports
+#     confidential guest resets (since 11.0).
+#
 # Since: 9.1
 ##
 { 'struct': 'SevCommonProperties',
   'data': { '*sev-device': 'str',
             '*cbitpos': 'uint32',
             'reduced-phys-bits': 'uint32',
-            '*kernel-hashes': 'bool' } }
+            '*kernel-hashes': 'bool' },
+  'features': ['confidential-guest-reset']}
 
 ##
 # @SevGuestProperties:
@@ -1136,6 +1142,11 @@
 #     it, the guest will not be able to get a TD quote for
 #     attestation.
 #
+# Features:
+#
+# @confidential-guest-reset: If present, the hypervisor supports
+#     confidential guest resets (since 11.0).
+#
 # Since: 10.1
 ##
 { 'struct': 'TdxGuestProperties',
@@ -1144,7 +1155,8 @@
             '*mrconfigid': 'str',
             '*mrowner': 'str',
             '*mrownerconfig': 'str',
-            '*quote-generation-socket': 'SocketAddress' } }
+            '*quote-generation-socket': 'SocketAddress' },
+   'features': ['confidential-guest-reset']}
 
 ##
 # @ThreadContextProperties:
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 076/102] audio: fix nominal volume channel (cosmetic)
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (74 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 075/102] qom: add 'confidential-guest-reset' property for x86 confidential vms Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 078/102] scripts/vendor.py: add pycotap Paolo Bonzini
                   ` (25 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Marc-André Lureau

From: Marc-André Lureau <marcandre.lureau@redhat.com>

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20260211-cleanups-v1-1-e63c96572389@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 audio/audio-mixeng-be.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/audio/audio-mixeng-be.c b/audio/audio-mixeng-be.c
index 37040450511..5878b23e04e 100644
--- a/audio/audio-mixeng-be.c
+++ b/audio/audio-mixeng-be.c
@@ -1649,7 +1649,7 @@ static void audio_mixeng_backend_set_volume_out(AudioBackend *be, SWVoiceOut *sw
 
         sw->vol.mute = vol->mute;
         sw->vol.l = nominal_volume.l * vol->vol[0] / 255;
-        sw->vol.r = nominal_volume.l * vol->vol[vol->channels > 1 ? 1 : 0] /
+        sw->vol.r = nominal_volume.r * vol->vol[vol->channels > 1 ? 1 : 0] /
             255;
 
         if (k->volume_out) {
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 078/102] scripts/vendor.py: add pycotap
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (75 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 076/102] audio: fix nominal volume channel (cosmetic) Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 079/102] audio: require pulse >= 0.9.13 Paolo Bonzini
                   ` (24 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Marc-André Lureau, Thomas Huth

From: Marc-André Lureau <marcandre.lureau@redhat.com>

Related to commit 5ec1eec11000ef118b2a87c330245ffaa475f5ee ("python:
Install pycotap in our venv if necessary")

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20260211-cleanups-v1-3-e63c96572389@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 python/scripts/vendor.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/python/scripts/vendor.py b/python/scripts/vendor.py
index 46ce2980d5d..78058183e4c 100755
--- a/python/scripts/vendor.py
+++ b/python/scripts/vendor.py
@@ -45,6 +45,8 @@ def main() -> int:
         "4b27aafce281e652dcb437b28007457411245d975c48b5db3a797d3e93ae1585",
         "qemu.qmp==0.0.5":
         "e05782d6df5844b34e0d2f7c68693525da074deef7b641c1401dda6e4e3d6303",
+        "pycotap==1.3.1":
+        "1c3a25b3ff89e48f4e00f1f71dbbc1642b4f65c65d416524d07e73492fff25ea",
     }
 
     vendor_dir = Path(__file__, "..", "..", "wheels").resolve()
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 079/102] audio: require pulse >= 0.9.13
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (76 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 078/102] scripts/vendor.py: add pycotap Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 080/102] audio: require spice >= 0.15 Paolo Bonzini
                   ` (23 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Marc-André Lureau

From: Marc-André Lureau <marcandre.lureau@redhat.com>

pulseaudio 0.9.13 was released on 2009-09-10. All our supported
distros have it.

PA_*_IS_GOOD are from 0.9.11.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20260211-cleanups-v1-4-e63c96572389@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 meson.build     |  2 +-
 audio/paaudio.c | 28 ++--------------------------
 2 files changed, 3 insertions(+), 27 deletions(-)

diff --git a/meson.build b/meson.build
index cbd6d90ce64..11f83cf05c4 100644
--- a/meson.build
+++ b/meson.build
@@ -1298,7 +1298,7 @@ endif
 
 pulse = not_found
 if not get_option('pa').auto() or (host_os == 'linux' and have_system)
-  pulse = dependency('libpulse', required: get_option('pa'),
+  pulse = dependency('libpulse', version: '>=0.9.13', required: get_option('pa'),
                      method: 'pkg-config')
 endif
 alsa = not_found
diff --git a/audio/paaudio.c b/audio/paaudio.c
index 23e8767a46b..24327ecbf45 100644
--- a/audio/paaudio.c
+++ b/audio/paaudio.c
@@ -62,26 +62,6 @@ static void G_GNUC_PRINTF(2, 3) qpa_logerr(int err, const char *fmt, ...)
     error_printf(" Reason: %s\n", pa_strerror(err));
 }
 
-#ifndef PA_CONTEXT_IS_GOOD
-static inline int PA_CONTEXT_IS_GOOD(pa_context_state_t x)
-{
-    return
-        x == PA_CONTEXT_CONNECTING ||
-        x == PA_CONTEXT_AUTHORIZING ||
-        x == PA_CONTEXT_SETTING_NAME ||
-        x == PA_CONTEXT_READY;
-}
-#endif
-
-#ifndef PA_STREAM_IS_GOOD
-static inline int PA_STREAM_IS_GOOD(pa_stream_state_t x)
-{
-    return
-        x == PA_STREAM_CREATING ||
-        x == PA_STREAM_READY;
-}
-#endif
-
 #define CHECK_SUCCESS_GOTO(c, expression, label, msg)           \
     do {                                                        \
         if (!(expression)) {                                    \
@@ -682,9 +662,7 @@ static void qpa_volume_out(HWVoiceOut *hw, Volume *vol)
     PAConnection *c = pa->g->conn;
     int i;
 
-#ifdef PA_CHECK_VERSION    /* macro is present in 0.9.16+ */
-    pa_cvolume_init (&v);  /* function is present in 0.9.13+ */
-#endif
+    pa_cvolume_init(&v);
 
     v.channels = vol->channels;
     for (i = 0; i < vol->channels; ++i) {
@@ -724,9 +702,7 @@ static void qpa_volume_in(HWVoiceIn *hw, Volume *vol)
     PAConnection *c = pa->g->conn;
     int i;
 
-#ifdef PA_CHECK_VERSION
-    pa_cvolume_init (&v);
-#endif
+    pa_cvolume_init(&v);
 
     v.channels = vol->channels;
     for (i = 0; i < vol->channels; ++i) {
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 080/102] audio: require spice >= 0.15
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (77 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 079/102] audio: require pulse >= 0.9.13 Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 081/102] ui: drop spice-protocol < 0.14.3 support Paolo Bonzini
                   ` (22 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Marc-André Lureau

From: Marc-André Lureau <marcandre.lureau@redhat.com>

Spice server 0.15.0 was released on 2021-04-16. It is part of all our
supported distro (except CentOS 9, which doesn't include it).

It has all the new required audio APIs/interfaces.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20260211-cleanups-v1-5-e63c96572389@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 audio/spiceaudio.c | 30 ------------------------------
 1 file changed, 30 deletions(-)

diff --git a/audio/spiceaudio.c b/audio/spiceaudio.c
index 70a0b60dd83..5a97eb80a67 100644
--- a/audio/spiceaudio.c
+++ b/audio/spiceaudio.c
@@ -49,17 +49,8 @@ static bool spice_audio_realize(AudioBackend *abe, Audiodev *dev, Error **errp)
     return audio_spice_parent_class->realize(abe, dev, errp);
 }
 
-#if SPICE_INTERFACE_PLAYBACK_MAJOR > 1 || SPICE_INTERFACE_PLAYBACK_MINOR >= 3
 #define LINE_OUT_SAMPLES (480 * 4)
-#else
-#define LINE_OUT_SAMPLES (256 * 4)
-#endif
-
-#if SPICE_INTERFACE_RECORD_MAJOR > 2 || SPICE_INTERFACE_RECORD_MINOR >= 3
 #define LINE_IN_SAMPLES (480 * 4)
-#else
-#define LINE_IN_SAMPLES (256 * 4)
-#endif
 
 typedef struct SpiceVoiceOut {
     HWVoiceOut            hw;
@@ -99,11 +90,7 @@ static int line_out_init(HWVoiceOut *hw, struct audsettings *as)
     SpiceVoiceOut *out = container_of (hw, SpiceVoiceOut, hw);
     struct audsettings settings;
 
-#if SPICE_INTERFACE_PLAYBACK_MAJOR > 1 || SPICE_INTERFACE_PLAYBACK_MINOR >= 3
     settings.freq       = spice_server_get_best_playback_rate(NULL);
-#else
-    settings.freq       = SPICE_INTERFACE_PLAYBACK_FREQ;
-#endif
     settings.nchannels  = SPICE_INTERFACE_PLAYBACK_CHAN;
     settings.fmt        = AUDIO_FORMAT_S16;
     settings.big_endian = HOST_BIG_ENDIAN;
@@ -114,9 +101,7 @@ static int line_out_init(HWVoiceOut *hw, struct audsettings *as)
 
     out->sin.base.sif = &playback_sif.base;
     qemu_spice.add_interface(&out->sin.base);
-#if SPICE_INTERFACE_PLAYBACK_MAJOR > 1 || SPICE_INTERFACE_PLAYBACK_MINOR >= 3
     spice_server_set_playback_rate(&out->sin, settings.freq);
-#endif
     return 0;
 }
 
@@ -194,7 +179,6 @@ static void line_out_enable(HWVoiceOut *hw, bool enable)
     }
 }
 
-#if ((SPICE_INTERFACE_PLAYBACK_MAJOR >= 1) && (SPICE_INTERFACE_PLAYBACK_MINOR >= 2))
 static void line_out_volume(HWVoiceOut *hw, Volume *vol)
 {
     SpiceVoiceOut *out = container_of(hw, SpiceVoiceOut, hw);
@@ -206,7 +190,6 @@ static void line_out_volume(HWVoiceOut *hw, Volume *vol)
     spice_server_playback_set_volume(&out->sin, 2, svol);
     spice_server_playback_set_mute(&out->sin, vol->mute);
 }
-#endif
 
 /* record */
 
@@ -215,11 +198,7 @@ static int line_in_init(HWVoiceIn *hw, struct audsettings *as)
     SpiceVoiceIn *in = container_of (hw, SpiceVoiceIn, hw);
     struct audsettings settings;
 
-#if SPICE_INTERFACE_RECORD_MAJOR > 2 || SPICE_INTERFACE_RECORD_MINOR >= 3
     settings.freq       = spice_server_get_best_record_rate(NULL);
-#else
-    settings.freq       = SPICE_INTERFACE_RECORD_FREQ;
-#endif
     settings.nchannels  = SPICE_INTERFACE_RECORD_CHAN;
     settings.fmt        = AUDIO_FORMAT_S16;
     settings.big_endian = HOST_BIG_ENDIAN;
@@ -230,9 +209,7 @@ static int line_in_init(HWVoiceIn *hw, struct audsettings *as)
 
     in->sin.base.sif = &record_sif.base;
     qemu_spice.add_interface(&in->sin.base);
-#if SPICE_INTERFACE_RECORD_MAJOR > 2 || SPICE_INTERFACE_RECORD_MINOR >= 3
     spice_server_set_record_rate(&in->sin, settings.freq);
-#endif
     return 0;
 }
 
@@ -281,7 +258,6 @@ static void line_in_enable(HWVoiceIn *hw, bool enable)
     }
 }
 
-#if ((SPICE_INTERFACE_RECORD_MAJOR >= 2) && (SPICE_INTERFACE_RECORD_MINOR >= 2))
 static void line_in_volume(HWVoiceIn *hw, Volume *vol)
 {
     SpiceVoiceIn *in = container_of(hw, SpiceVoiceIn, hw);
@@ -293,7 +269,6 @@ static void line_in_volume(HWVoiceIn *hw, Volume *vol)
     spice_server_record_set_volume(&in->sin, 2, svol);
     spice_server_record_set_mute(&in->sin, vol->mute);
 }
-#endif
 
 static void audio_spice_class_init(ObjectClass *klass, const void *data)
 {
@@ -315,19 +290,14 @@ static void audio_spice_class_init(ObjectClass *klass, const void *data)
     k->get_buffer_out = line_out_get_buffer;
     k->put_buffer_out = line_out_put_buffer;
     k->enable_out = line_out_enable;
-#if (SPICE_INTERFACE_PLAYBACK_MAJOR >= 1) && \
-        (SPICE_INTERFACE_PLAYBACK_MINOR >= 2)
     k->volume_out = line_out_volume;
-#endif
 
     k->init_in = line_in_init;
     k->fini_in = line_in_fini;
     k->read = line_in_read;
     k->run_buffer_in = audio_generic_run_buffer_in;
     k->enable_in = line_in_enable;
-#if ((SPICE_INTERFACE_RECORD_MAJOR >= 2) && (SPICE_INTERFACE_RECORD_MINOR >= 2))
     k->volume_in = line_in_volume;
-#endif
 }
 
 static const TypeInfo audio_types[] = {
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 081/102] ui: drop spice-protocol < 0.14.3 support
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (78 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 080/102] audio: require spice >= 0.15 Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 082/102] rust: use checked_div to make clippy happy Paolo Bonzini
                   ` (21 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Marc-André Lureau

From: Marc-André Lureau <marcandre.lureau@redhat.com>

According to repology, all our supported distributions have 0.14.3.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20260211-cleanups-v1-7-e63c96572389@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 meson.build  |  2 +-
 ui/vdagent.c | 18 ------------------
 2 files changed, 1 insertion(+), 19 deletions(-)

diff --git a/meson.build b/meson.build
index 11f83cf05c4..f67f8668bd8 100644
--- a/meson.build
+++ b/meson.build
@@ -1325,7 +1325,7 @@ endif
 
 spice_protocol = not_found
 if not get_option('spice_protocol').auto() or have_system
-  spice_protocol = dependency('spice-protocol', version: '>=0.14.0',
+  spice_protocol = dependency('spice-protocol', version: '>=0.14.3',
                               required: get_option('spice_protocol'),
                               method: 'pkg-config')
 endif
diff --git a/ui/vdagent.c b/ui/vdagent.c
index 7ff0861f3e9..5a5e4bf6818 100644
--- a/ui/vdagent.c
+++ b/ui/vdagent.c
@@ -17,14 +17,6 @@
 
 #include "spice/vd_agent.h"
 
-#define CHECK_SPICE_PROTOCOL_VERSION(major, minor, micro) \
-    (CONFIG_SPICE_PROTOCOL_MAJOR > (major) ||             \
-     (CONFIG_SPICE_PROTOCOL_MAJOR == (major) &&           \
-      CONFIG_SPICE_PROTOCOL_MINOR > (minor)) ||           \
-     (CONFIG_SPICE_PROTOCOL_MAJOR == (major) &&           \
-      CONFIG_SPICE_PROTOCOL_MINOR == (minor) &&           \
-      CONFIG_SPICE_PROTOCOL_MICRO >= (micro)))
-
 #define VDAGENT_BUFFER_LIMIT (1 * MiB)
 #define VDAGENT_MOUSE_DEFAULT true
 #define VDAGENT_CLIPBOARD_DEFAULT false
@@ -87,10 +79,8 @@ static const char *cap_name[] = {
     [VD_AGENT_CAP_FILE_XFER_DISABLED]             = "file-xfer-disabled",
     [VD_AGENT_CAP_FILE_XFER_DETAILED_ERRORS]      = "file-xfer-detailed-errors",
     [VD_AGENT_CAP_GRAPHICS_DEVICE_INFO]           = "graphics-device-info",
-#if CHECK_SPICE_PROTOCOL_VERSION(0, 14, 1)
     [VD_AGENT_CAP_CLIPBOARD_NO_RELEASE_ON_REGRAB] = "clipboard-no-release-on-regrab",
     [VD_AGENT_CAP_CLIPBOARD_GRAB_SERIAL]          = "clipboard-grab-serial",
-#endif
 };
 
 static const char *msg_name[] = {
@@ -125,9 +115,7 @@ static const char *type_name[] = {
     [VD_AGENT_CLIPBOARD_IMAGE_BMP]  = "bmp",
     [VD_AGENT_CLIPBOARD_IMAGE_TIFF] = "tiff",
     [VD_AGENT_CLIPBOARD_IMAGE_JPG]  = "jpg",
-#if CHECK_SPICE_PROTOCOL_VERSION(0, 14, 3)
     [VD_AGENT_CLIPBOARD_FILE_LIST]  = "files",
-#endif
 };
 
 #define GET_NAME(_m, _v) \
@@ -197,9 +185,7 @@ static void vdagent_send_caps(VDAgentChardev *vd, bool request)
     if (vd->clipboard) {
         caps->caps[0] |= (1 << VD_AGENT_CAP_CLIPBOARD_BY_DEMAND);
         caps->caps[0] |= (1 << VD_AGENT_CAP_CLIPBOARD_SELECTION);
-#if CHECK_SPICE_PROTOCOL_VERSION(0, 14, 1)
         caps->caps[0] |= (1 << VD_AGENT_CAP_CLIPBOARD_GRAB_SERIAL);
-#endif
     }
 
     caps->request = request;
@@ -318,11 +304,7 @@ static bool have_selection(VDAgentChardev *vd)
 
 static bool have_clipboard_serial(VDAgentChardev *vd)
 {
-#if CHECK_SPICE_PROTOCOL_VERSION(0, 14, 1)
     return vd->caps & (1 << VD_AGENT_CAP_CLIPBOARD_GRAB_SERIAL);
-#else
-    return false;
-#endif
 }
 
 static uint32_t type_qemu_to_vdagent(enum QemuClipboardType type)
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 082/102] rust: use checked_div to make clippy happy
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (79 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 081/102] ui: drop spice-protocol < 0.14.3 support Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 083/102] KVM: i386: Default disable ignore guest PAT quirk Paolo Bonzini
                   ` (20 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, John Snow

From: John Snow <jsnow@redhat.com>

When upgrading from Fedora 41 to Fedora 43 for CI tests, clippy begins
complaining about not using checked_div instead of manually checking
divisors. Make clippy happy and use checked_div() instead.

Signed-off-by: John Snow <jsnow@redhat.com>
Link: https://lore.kernel.org/r/20260219185409.708130-2-jsnow@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 rust/Cargo.toml          |  1 +
 rust/hw/core/src/qdev.rs | 14 ++++++--------
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/rust/Cargo.toml b/rust/Cargo.toml
index ace0baf9bd7..0d24eb84e1c 100644
--- a/rust/Cargo.toml
+++ b/rust/Cargo.toml
@@ -46,6 +46,7 @@ redundant_explicit_links = "deny"
 [workspace.lints.clippy]
 # default-warn lints
 result_unit_err = "allow"
+manual_checked_ops = "deny"
 should_implement_trait = "deny"
 # can be for a reason, e.g. in callbacks
 unused_self = "allow"
diff --git a/rust/hw/core/src/qdev.rs b/rust/hw/core/src/qdev.rs
index 145e20a984f..b2e5441079d 100644
--- a/rust/hw/core/src/qdev.rs
+++ b/rust/hw/core/src/qdev.rs
@@ -425,18 +425,16 @@ pub const fn period_from_ns(ns: u64) -> u64 {
     }
 
     pub const fn period_from_hz(hz: u64) -> u64 {
-        if hz == 0 {
-            0
-        } else {
-            Self::PERIOD_1SEC / hz
+        match Self::PERIOD_1SEC.checked_div(hz) {
+            Some(value) => value,
+            None => 0,
         }
     }
 
     pub const fn period_to_hz(period: u64) -> u64 {
-        if period == 0 {
-            0
-        } else {
-            Self::PERIOD_1SEC / period
+        match Self::PERIOD_1SEC.checked_div(period) {
+            Some(value) => value,
+            None => 0,
         }
     }
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 083/102] KVM: i386: Default disable ignore guest PAT quirk
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (80 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 082/102] rust: use checked_div to make clippy happy Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 084/102] whpx: x86: remove inaccurate comment Paolo Bonzini
                   ` (19 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, myrslint

From: myrslint <qemu.haziness801@passinbox.com>

Add a new accelerator option that allows the guest to adjust the PAT.
This is already the case for TDX guests and allows using virtio-gpu
Venus with RADV or NVIDIA drivers.

The quirk is disabled by default.  Since this caused problems with
Linux's Bochs video device driver, add a knob to leave it enabled,
and for now do ont enable it by default.

Signed-off-by: Myrsky Lintu <qemu.haziness801@passinbox.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2943
Link: https://lore.kernel.org/r/175527721636.15451.4393515241478547957-1@git.sr.ht
[Add property; for now leave it off by default. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/system/kvm_int.h |  1 +
 accel/kvm/kvm-all.c      |  1 +
 target/i386/kvm/kvm.c    | 50 +++++++++++++++++++++++++++++++++++++++-
 3 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/include/system/kvm_int.h b/include/system/kvm_int.h
index baeb166d393..0876aac938d 100644
--- a/include/system/kvm_int.h
+++ b/include/system/kvm_int.h
@@ -167,6 +167,7 @@ struct KVMState
     uint16_t xen_gnttab_max_frames;
     uint16_t xen_evtchn_max_pirq;
     char *device;
+    OnOffAuto honor_guest_pat;
 };
 
 void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index a1f910e9dff..ebd721c3d66 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -4277,6 +4277,7 @@ static void kvm_accel_instance_init(Object *obj)
     s->xen_evtchn_max_pirq = 256;
     s->device = NULL;
     s->msr_energy.enable = false;
+    s->honor_guest_pat = ON_OFF_AUTO_OFF;
 }
 
 /**
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 7cfbc7832de..27b1b848d6a 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -3595,8 +3595,30 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
     if (first) {
         kvm_vmfd_add_change_notifier(&kvm_vmfd_change_notifier);
     }
-    first = false;
 
+    /*
+     * Most x86 CPUs in current use have self-snoop, so honoring guest PAT is
+     * preferable.  As well, the bochs video driver bug which motivated making
+     * this a default-enabled quirk in KVM was fixed long ago.
+     */
+    if (s->honor_guest_pat != ON_OFF_AUTO_OFF) {
+        ret = kvm_check_extension(s, KVM_CAP_DISABLE_QUIRKS2);
+        if (ret & KVM_X86_QUIRK_IGNORE_GUEST_PAT) {
+            ret = kvm_vm_enable_cap(s, KVM_CAP_DISABLE_QUIRKS2, 0,
+                                    KVM_X86_QUIRK_IGNORE_GUEST_PAT);
+            if (ret < 0) {
+                error_report("failed to disable KVM_X86_QUIRK_IGNORE_GUEST_PAT");
+                return ret;
+            }
+        } else {
+            if (s->honor_guest_pat == ON_OFF_AUTO_ON) {
+                error_report("KVM does not support disabling ignore-guest-PAT quirk");
+                return -EINVAL;
+            }
+        }
+    }
+
+    first = false;
     return 0;
 }
 
@@ -7053,6 +7075,24 @@ static void kvm_arch_set_xen_evtchn_max_pirq(Object *obj, Visitor *v,
     s->xen_evtchn_max_pirq = value;
 }
 
+static int kvm_arch_get_honor_guest_pat(Object *obj, Error **errp)
+{
+    KVMState *s = KVM_STATE(obj);
+    return s->honor_guest_pat;
+}
+
+static void kvm_arch_set_honor_guest_pat(Object *obj, int value, Error **errp)
+{
+    KVMState *s = KVM_STATE(obj);
+
+    if (s->fd != -1) {
+        error_setg(errp, "Cannot set properties after the accelerator has been initialized");
+        return;
+    }
+
+    s->honor_guest_pat = value;
+}
+
 void kvm_arch_accel_class_init(ObjectClass *oc)
 {
     object_class_property_add_enum(oc, "notify-vmexit", "NotifyVMexitOption",
@@ -7092,6 +7132,14 @@ void kvm_arch_accel_class_init(ObjectClass *oc)
                               NULL, NULL);
     object_class_property_set_description(oc, "xen-evtchn-max-pirq",
                                           "Maximum number of Xen PIRQs");
+
+    object_class_property_add_enum(oc, "honor-guest-pat", "OnOffAuto",
+                                   &OnOffAuto_lookup,
+                                   kvm_arch_get_honor_guest_pat,
+                                   kvm_arch_set_honor_guest_pat);
+    object_class_property_set_description(oc, "honor-guest-pat",
+                                          "Disable KVM quirk that ignores guest PAT "
+                                          "memory type settings (default: auto)");
 }
 
 void kvm_set_max_apic_id(uint32_t max_apic_id)
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 084/102] whpx: x86: remove inaccurate comment
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (81 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 083/102] KVM: i386: Default disable ignore guest PAT quirk Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 085/102] whpx: x86: kick out of HLT manually when using the kernel-irqchip Paolo Bonzini
                   ` (18 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

WHvRunVpExitReasonX64Halt _is_ triggered on halt with kernel-irqchip=off as of Windows 11 version 25H2.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260226181930.53170-2-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/whpx/whpx-all.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index d98619facee..cfc63065807 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -1663,8 +1663,7 @@ int whpx_vcpu_run(CPUState *cpu)
 
         case WHvRunVpExitReasonX64Halt:
             /*
-             * WARNING: as of build 19043.1526 (21H1), this exit reason is no
-             * longer used.
+             * Used for kernel-irqchip=off
              */
             ret = whpx_handle_halt(cpu);
             break;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 085/102] whpx: x86: kick out of HLT manually when using the kernel-irqchip
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (82 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 084/102] whpx: x86: remove inaccurate comment Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 086/102] hw: i386: vapic: enable on WHPX with user-mode irqchip Paolo Bonzini
                   ` (17 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Otherwise, interrupts processed through the cancel vCPU and inject path will not cause the vCPU to go out of its halt state.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260226181930.53170-3-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/whpx/whpx-all.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index cfc63065807..51ecc9531fe 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -1468,6 +1468,16 @@ static void whpx_vcpu_post_run(CPUState *cpu)
         !vcpu->exit_ctx.VpContext.ExecutionState.InterruptShadow;
 }
 
+static void whpx_vcpu_kick_out_of_hlt(CPUState *cpu) 
+{
+    WHV_REGISTER_VALUE reg;
+    whpx_get_reg(cpu, WHvRegisterInternalActivityState, &reg);
+    if (reg.InternalActivity.HaltSuspend) {
+        reg.InternalActivity.HaltSuspend = 0;
+        whpx_set_reg(cpu, WHvRegisterInternalActivityState, reg);
+    }
+}
+
 static void whpx_vcpu_process_async_events(CPUState *cpu)
 {
     X86CPU *x86_cpu = X86_CPU(cpu);
@@ -1775,6 +1785,25 @@ int whpx_vcpu_run(CPUState *cpu)
                 cpu->exception_index = EXCP_INTERRUPT;
                 ret = 1;
             }
+            /* 
+             * When the Hyper-V APIC is enabled, to get out of HLT we
+             * either have to request an interrupt or manually get it away
+             * from HLT.
+             *
+             * We also manually do inject some interrupts via WHvRegisterPendingEvent
+             * instead of WHVRequestInterrupt, which does not reset the HLT state.
+             *
+             * However, even with this done, if the guest does an HLT without
+             * interrupts enabled (which the test_sti_inhibit KVM unit test does)
+             * then the guest will stay in HLT forever.
+             *
+             * Keep it this way for now, with perhaps adding a heartbeat later
+             * so that we get the CPU time savings from having Hyper-V handle HLT
+             * instead of going away from it as soon as possible.
+             */
+            if (whpx_irqchip_in_kernel()) {
+                whpx_vcpu_kick_out_of_hlt(cpu);
+            }
             break;
         case WHvRunVpExitReasonX64MsrAccess: {
             WHV_REGISTER_VALUE reg_values[3] = {0};
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 086/102] hw: i386: vapic: enable on WHPX with user-mode irqchip
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (83 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 085/102] whpx: x86: kick out of HLT manually when using the kernel-irqchip Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 087/102] target/alpha: Reset CPU Paolo Bonzini
                   ` (16 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Alleviate a performance bottleneck on legacy Windows guests.

In my test setup, this makes Windows XP boot times be 20x faster
than they're otherwise.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260226181930.53170-4-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/i386/vapic.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/hw/i386/vapic.c b/hw/i386/vapic.c
index 670a50524d6..41e5ca26dfd 100644
--- a/hw/i386/vapic.c
+++ b/hw/i386/vapic.c
@@ -16,6 +16,7 @@
 #include "system/cpus.h"
 #include "system/hw_accel.h"
 #include "system/kvm.h"
+#include "system/whpx.h"
 #include "system/runstate.h"
 #include "system/address-spaces.h"
 #include "hw/i386/apic_internal.h"
@@ -229,7 +230,8 @@ static int evaluate_tpr_instruction(VAPICROMState *s, X86CPU *cpu,
         return -1;
     }
 
-    if (kvm_enabled() && !kvm_irqchip_in_kernel()) {
+    if ((kvm_enabled() && !kvm_irqchip_in_kernel())
+        || (whpx_enabled() && !whpx_irqchip_in_kernel())) {
         /*
          * KVM without kernel-based TPR access reporting will pass an IP that
          * points after the accessing instruction. So we need to look backward
@@ -549,7 +551,7 @@ static int patch_hypercalls(VAPICROMState *s)
     cpu_physical_memory_read(rom_paddr, rom, s->rom_size);
 
     for (pos = 0; pos < s->rom_size - sizeof(vmcall_pattern); pos++) {
-        if (kvm_irqchip_in_kernel()) {
+        if (kvm_enabled() && kvm_irqchip_in_kernel()) {
             pattern = outl_pattern;
             alternates[0] = outl_pattern[7];
             alternates[1] = outl_pattern[7];
@@ -679,16 +681,25 @@ static void vapic_write(void *opaque, hwaddr addr, uint64_t data,
         }
         break;
     case 1:
-        if (kvm_enabled()) {
+        if (kvm_enabled() || (whpx_enabled() && !whpx_irqchip_in_kernel())) {
             /*
              * Disable triggering instruction in ROM by writing a NOP.
              *
              * We cannot do this in TCG mode as the reported IP is not
              * accurate.
+             *
+             * Oddly enough, KVM increments EIP _before_ the execution
+             * of the instruction is finished.
              */
             pause_all_vcpus();
-            patch_byte(cpu, env->eip - 2, 0x66);
-            patch_byte(cpu, env->eip - 1, 0x90);
+            if (!kvm_enabled()) {
+                patch_byte(cpu, env->eip, 0x66);
+                patch_byte(cpu, env->eip + 1, 0x90);
+            }
+            else {
+                patch_byte(cpu, env->eip - 2, 0x66);
+                patch_byte(cpu, env->eip - 1, 0x90);
+            }
             resume_all_vcpus();
         }
 
@@ -705,7 +716,8 @@ static void vapic_write(void *opaque, hwaddr addr, uint64_t data,
         break;
     default:
     case 4:
-        if (!kvm_irqchip_in_kernel()) {
+        if ((kvm_enabled() && !kvm_irqchip_in_kernel())
+          || (whpx_enabled() && !whpx_irqchip_in_kernel())) {
             apic_poll_irq(cpu->apic_state);
         }
         break;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 087/102] target/alpha: Reset CPU
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (84 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 086/102] hw: i386: vapic: enable on WHPX with user-mode irqchip Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 088/102] Reapply "rcu: Unify force quiescent state" Paolo Bonzini
                   ` (15 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Akihiko Odaki, Thomas Huth

From: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>

alpha_cpu_realizefn() did not properly call cpu_reset(), which
corrupted icount. Add the missing function call to fix icount.

Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Tested-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20260217-alpha-v1-1-0dcc708c9db3@rsg.ci.i.u-tokyo.ac.jp
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/alpha/cpu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/alpha/cpu.c b/target/alpha/cpu.c
index e0e13d31e55..ff053043a38 100644
--- a/target/alpha/cpu.c
+++ b/target/alpha/cpu.c
@@ -124,6 +124,7 @@ static void alpha_cpu_realizefn(DeviceState *dev, Error **errp)
     }
 
     qemu_init_vcpu(cs);
+    cpu_reset(cs);
 
     acc->parent_realize(dev, errp);
 }
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 088/102] Reapply "rcu: Unify force quiescent state"
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (85 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 087/102] target/alpha: Reset CPU Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 089/102] target/i386: Add VMX_SECONDARY_EXEC_MODE_BASED_EPT_EXEC Paolo Bonzini
                   ` (14 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Akihiko Odaki

From: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>

This reverts commit ddb4d9d1748681cfde824d765af6cda4334fcce3.

The commit says:
> This reverts commit 55d98e3edeeb17dd8445db27605d2b34f4c3ba85.
>
> The commit introduced a regression in the replay functional test
> on alpha (tests/functional/alpha/test_replay.py), that causes CI
> failures regularly. Thus revert this change until someone has
> figured out what is going wrong here.

Reapply the change as alpha is fixed.

Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20260217-alpha-v1-2-0dcc708c9db3@rsg.ci.i.u-tokyo.ac.jp
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/rcu.c | 81 +++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 52 insertions(+), 29 deletions(-)

diff --git a/util/rcu.c b/util/rcu.c
index b703c86f15a..acac9446ea9 100644
--- a/util/rcu.c
+++ b/util/rcu.c
@@ -43,10 +43,14 @@
 #define RCU_GP_LOCKED           (1UL << 0)
 #define RCU_GP_CTR              (1UL << 1)
 
+
+#define RCU_CALL_MIN_SIZE        30
+
 unsigned long rcu_gp_ctr = RCU_GP_LOCKED;
 
 QemuEvent rcu_gp_event;
 static int in_drain_call_rcu;
+static int rcu_call_count;
 static QemuMutex rcu_registry_lock;
 static QemuMutex rcu_sync_lock;
 
@@ -76,15 +80,29 @@ static void wait_for_readers(void)
 {
     ThreadList qsreaders = QLIST_HEAD_INITIALIZER(qsreaders);
     struct rcu_reader_data *index, *tmp;
+    int sleeps = 0;
+    bool forced = false;
 
     for (;;) {
-        /* We want to be notified of changes made to rcu_gp_ongoing
-         * while we walk the list.
+        /*
+         * Force the grace period to end and wait for it if any of the
+         * following heuristical conditions are satisfied:
+         * - A decent number of callbacks piled up.
+         * - It timed out.
+         * - It is in a drain_call_rcu() call.
+         *
+         * Otherwise, periodically poll the grace period, hoping it ends
+         * promptly.
          */
-        qemu_event_reset(&rcu_gp_event);
+        if (!forced &&
+            (qatomic_read(&rcu_call_count) >= RCU_CALL_MIN_SIZE ||
+             sleeps >= 5 || qatomic_read(&in_drain_call_rcu))) {
+            forced = true;
 
-        QLIST_FOREACH(index, &registry, node) {
-            qatomic_set(&index->waiting, true);
+            QLIST_FOREACH(index, &registry, node) {
+                notifier_list_notify(&index->force_rcu, NULL);
+                qatomic_set(&index->waiting, true);
+            }
         }
 
         /* Here, order the stores to index->waiting before the loads of
@@ -106,8 +124,6 @@ static void wait_for_readers(void)
                  * get some extra futex wakeups.
                  */
                 qatomic_set(&index->waiting, false);
-            } else if (qatomic_read(&in_drain_call_rcu)) {
-                notifier_list_notify(&index->force_rcu, NULL);
             }
         }
 
@@ -115,7 +131,8 @@ static void wait_for_readers(void)
             break;
         }
 
-        /* Wait for one thread to report a quiescent state and try again.
+        /*
+         * Sleep for a while and try again.
          * Release rcu_registry_lock, so rcu_(un)register_thread() doesn't
          * wait too much time.
          *
@@ -133,7 +150,20 @@ static void wait_for_readers(void)
          * rcu_registry_lock is released.
          */
         qemu_mutex_unlock(&rcu_registry_lock);
-        qemu_event_wait(&rcu_gp_event);
+
+        if (forced) {
+            qemu_event_wait(&rcu_gp_event);
+
+            /*
+             * We want to be notified of changes made to rcu_gp_ongoing
+             * while we walk the list.
+             */
+            qemu_event_reset(&rcu_gp_event);
+        } else {
+            g_usleep(10000);
+            sleeps++;
+        }
+
         qemu_mutex_lock(&rcu_registry_lock);
     }
 
@@ -173,15 +203,11 @@ void synchronize_rcu(void)
     }
 }
 
-
-#define RCU_CALL_MIN_SIZE        30
-
 /* Multi-producer, single-consumer queue based on urcu/static/wfqueue.h
  * from liburcu.  Note that head is only used by the consumer.
  */
 static struct rcu_head dummy;
 static struct rcu_head *head = &dummy, **tail = &dummy.next;
-static int rcu_call_count;
 static QemuEvent rcu_call_ready_event;
 
 static void enqueue(struct rcu_head *node)
@@ -259,30 +285,27 @@ static void *call_rcu_thread(void *opaque)
     rcu_register_thread();
 
     for (;;) {
-        int tries = 0;
-        int n = qatomic_read(&rcu_call_count);
+        int n;
 
-        /* Heuristically wait for a decent number of callbacks to pile up.
+        /*
          * Fetch rcu_call_count now, we only must process elements that were
          * added before synchronize_rcu() starts.
          */
-        while (n == 0 || (n < RCU_CALL_MIN_SIZE && ++tries <= 5)) {
-            g_usleep(10000);
-            if (n == 0) {
-                qemu_event_reset(&rcu_call_ready_event);
-                n = qatomic_read(&rcu_call_count);
-                if (n == 0) {
-#if defined(CONFIG_MALLOC_TRIM)
-                    malloc_trim(4 * 1024 * 1024);
-#endif
-                    qemu_event_wait(&rcu_call_ready_event);
-                }
-            }
+        for (;;) {
+            qemu_event_reset(&rcu_call_ready_event);
             n = qatomic_read(&rcu_call_count);
+            if (n) {
+                break;
+            }
+
+#if defined(CONFIG_MALLOC_TRIM)
+            malloc_trim(4 * 1024 * 1024);
+#endif
+            qemu_event_wait(&rcu_call_ready_event);
         }
 
-        qatomic_sub(&rcu_call_count, n);
         synchronize_rcu();
+        qatomic_sub(&rcu_call_count, n);
         bql_lock();
         while (n > 0) {
             node = try_dequeue();
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 089/102] target/i386: Add VMX_SECONDARY_EXEC_MODE_BASED_EPT_EXEC
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (86 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 088/102] Reapply "rcu: Unify force quiescent state" Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 090/102] target/i386: Add MSR_IA32_ARCH_CAPABILITIES ITS_NO Paolo Bonzini
                   ` (13 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Jon Kohler, Xiaoyao Li, Zhao Liu, Aditya Desai

From: Jon Kohler <jon@nutanix.com>

Enumerate ability to enable Intel Mode-Based Execute Control (MBEC)
on secondary execution control bit 22.

Intel MBEC is a hardware feature, introduced in the Kabylake
generation, that allows for more granular control over execution
permissions. MBEC enables the separation and tracking of execution
permissions for supervisor (kernel) and user-mode code. It is used as
an accelerator for Microsoft's Memory Integrity [1] (also known as
hypervisor-protected code integrity or HVCI).

[1] https://learn.microsoft.com/en-us/windows/security/hardware-security/enable-virtualization-based-protection-of-code-integrity

Code is mirrored here:
https://github.com/JonKohler/linux/tree/mbec-v1-6.18
https://github.com/JonKohler/kvm-unit-tests/tree/mbec-v1

LKML thread(s) are here:
Original RFC: https://lore.kernel.org/all/20250313203702.575156-1-jon@nutanix.com/
V1 code: https://lore.kernel.org/all/20251223054806.1611168-1-jon@nutanix.com/
KVM unit test changes: https://lore.kernel.org/all/20251223054850.1611618-1-jon@nutanix.com/

Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: Zhao Liu <zhao1.liu@intel.com>
Co-authored-by: Jon Kohler <jon@nutanix.com>
Co-authored-by: Aditya Desai <aditya.desai@nutanix.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
Link: https://lore.kernel.org/r/20251223060834.1618428-1-jon@nutanix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.h | 1 +
 target/i386/cpu.c | 6 +++++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 065613722f1..c384302de32 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1414,6 +1414,7 @@ uint64_t x86_cpu_get_supported_feature_word(X86CPU *cpu, FeatureWord w);
 #define VMX_SECONDARY_EXEC_RDSEED_EXITING           0x00010000
 #define VMX_SECONDARY_EXEC_ENABLE_PML               0x00020000
 #define VMX_SECONDARY_EXEC_XSAVES                   0x00100000
+#define VMX_SECONDARY_EXEC_MODE_BASED_EPT_EXEC      0x00400000
 #define VMX_SECONDARY_EXEC_TSC_SCALING              0x02000000
 #define VMX_SECONDARY_EXEC_ENABLE_USER_WAIT_PAUSE   0x04000000
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 9b9ed2d1e38..619ed0de322 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1656,7 +1656,7 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
             "vmx-apicv-register", "vmx-apicv-vid", "vmx-ple", "vmx-rdrand-exit",
             "vmx-invpcid-exit", "vmx-vmfunc", "vmx-shadow-vmcs", "vmx-encls-exit",
             "vmx-rdseed-exit", "vmx-pml", NULL, NULL,
-            "vmx-xsaves", NULL, NULL, NULL,
+            "vmx-xsaves", NULL, "vmx-mbec", NULL,
             NULL, "vmx-tsc-scaling", "vmx-enable-user-wait-pause", NULL,
             NULL, NULL, NULL, NULL,
         },
@@ -1971,6 +1971,10 @@ static FeatureDep feature_dependencies[] = {
         .from = { FEAT_VMX_SECONDARY_CTLS,  VMX_SECONDARY_EXEC_ENABLE_EPT },
         .to = { FEAT_VMX_SECONDARY_CTLS,    VMX_SECONDARY_EXEC_UNRESTRICTED_GUEST },
     },
+    {
+        .from = { FEAT_VMX_SECONDARY_CTLS,  VMX_SECONDARY_EXEC_ENABLE_EPT },
+        .to = { FEAT_VMX_SECONDARY_CTLS,    VMX_SECONDARY_EXEC_MODE_BASED_EPT_EXEC },
+    },
     {
         .from = { FEAT_VMX_SECONDARY_CTLS,  VMX_SECONDARY_EXEC_ENABLE_VPID },
         .to = { FEAT_VMX_EPT_VPID_CAPS,     0xffffffffull << 32 },
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 090/102] target/i386: Add MSR_IA32_ARCH_CAPABILITIES ITS_NO
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (87 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 089/102] target/i386: Add VMX_SECONDARY_EXEC_MODE_BASED_EPT_EXEC Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 091/102] target/i386: introduce SapphireRapids-v6 to expose ITS_NO Paolo Bonzini
                   ` (12 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Jon Kohler, Pawan Gupta

From: Jon Kohler <jon@nutanix.com>

Add bit definition for Indirect Target Selection (ITS_NO) bit 62, to
allow ITS_NO to be added directly to a CPU model in the future.

Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
Link: https://lore.kernel.org/r/20251106174626.49930-2-jon@nutanix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index c384302de32..f2679cc5b72 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1329,6 +1329,7 @@ uint64_t x86_cpu_get_supported_feature_word(X86CPU *cpu, FeatureWord w);
 #define MSR_ARCH_CAP_PBRSB_NO           (1U << 24)
 #define MSR_ARCH_CAP_GDS_NO             (1U << 26)
 #define MSR_ARCH_CAP_RFDS_NO            (1U << 27)
+#define MSR_ARCH_CAP_ITS_NO             (1U << 62)
 
 #define MSR_CORE_CAP_SPLIT_LOCK_DETECT  (1U << 5)
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 091/102] target/i386: introduce SapphireRapids-v6 to expose ITS_NO
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (88 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 090/102] target/i386: Add MSR_IA32_ARCH_CAPABILITIES ITS_NO Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 092/102] target/i386: introduce GraniteRapids-v5 " Paolo Bonzini
                   ` (11 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Jon Kohler, Pawan Gupta

From: Jon Kohler <jon@nutanix.com>

Expose ITS_NO by default, as users using Sapphire Rapids and higher
CPU models would not be able to live migrate to lower CPU hosts due to
missing features. In that case, they would not be vulnerable to ITS.

its-no was originally added on [1], but needs to be exposed on the
individual CPU models for the guests to see by default.

[1] 74978391b2da ("target/i386: Make ITS_NO available to guests")

Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
Link: https://lore.kernel.org/r/20251106174626.49930-3-jon@nutanix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 619ed0de322..81779483d31 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5261,6 +5261,15 @@ static const X86CPUDefinition builtin_x86_defs[] = {
                     { /* end of list */ },
                 }
             },
+            {
+                .version = 6,
+                .note = "with cet-ss, cet-ibt, its-no",
+                .cache_info = &xeon_spr_cache_info,
+                .props = (PropValue[]) {
+                    { "its-no", "on" },
+                    { /* end of list */ },
+                }
+            },
             { /* end of list */ }
         }
     },
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 092/102] target/i386: introduce GraniteRapids-v5 to expose ITS_NO
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (89 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 091/102] target/i386: introduce SapphireRapids-v6 to expose ITS_NO Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 093/102] target/i386: introduce SierraForest-v5 " Paolo Bonzini
                   ` (10 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Jon Kohler, Pawan Gupta

From: Jon Kohler <jon@nutanix.com>

Expose ITS_NO by default, as users using Granite Rapids and higher
CPU models would not be able to live migrate to lower CPU hosts due to
missing features. In that case, they would not be vulnerable to ITS.

its-no was originally added on [1], but needs to be exposed on the
individual CPU models for the guests to see by default.

[1] 74978391b2da ("target/i386: Make ITS_NO available to guests")

Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
Link: https://lore.kernel.org/r/20251106174626.49930-4-jon@nutanix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 81779483d31..987f64c5af3 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5443,6 +5443,15 @@ static const X86CPUDefinition builtin_x86_defs[] = {
                     { /* end of list */ },
                 }
             },
+            {
+                .version = 5,
+                .note = "with cet-ss, cet-ibt, its-no",
+                .cache_info = &xeon_gnr_cache_info,
+                .props = (PropValue[]) {
+                    { "its-no", "on" },
+                    { /* end of list */ },
+                }
+            },
             { /* end of list */ },
         },
     },
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 093/102] target/i386: introduce SierraForest-v5 to expose ITS_NO
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (90 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 092/102] target/i386: introduce GraniteRapids-v5 " Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 094/102] target/i386: introduce ClearwaterForest-v3 " Paolo Bonzini
                   ` (9 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Jon Kohler, Pawan Gupta

From: Jon Kohler <jon@nutanix.com>

Expose ITS_NO by default, as users using Sierra Forest and higher
CPU models would not be able to live migrate to lower CPU hosts due to
missing features. In that case, they would not be vulnerable to ITS.

its-no was originally added on [1], but needs to be exposed on the
individual CPU models for the guests to see by default.

Note: For SRF, version 2 already exposed BHI_CTRL, which would already
mark the CPU as invulnerable to ITS (at least in Linux); however,
expose ITS_NO for completeness.

[1] 74978391b2da ("target/i386: Make ITS_NO available to guests")

Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
Link: https://lore.kernel.org/r/20251106174626.49930-5-jon@nutanix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 987f64c5af3..2a869f5b739 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5809,6 +5809,15 @@ static const X86CPUDefinition builtin_x86_defs[] = {
                     { /* end of list */ },
                 }
             },
+            {
+                .version = 5,
+                .note = "with ITS_NO",
+                .cache_info = &xeon_srf_cache_info,
+                .props = (PropValue[]) {
+                    { "its-no", "on" },
+                    { /* end of list */ },
+                }
+            },
             { /* end of list */ },
         },
     },
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 094/102] target/i386: introduce ClearwaterForest-v3 to expose ITS_NO
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (91 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 093/102] target/i386: introduce SierraForest-v5 " Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 095/102] whpx: i386: move whpx_vcpu_kick_out_of_hlt() invocation to interrupt raise time Paolo Bonzini
                   ` (8 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Jon Kohler, Pawan Gupta

From: Jon Kohler <jon@nutanix.com>

Expose ITS_NO by default, as users using Clearwater Forest and higher
CPU models would not be able to live migrate to lower CPU hosts due to
missing features. In that case, they would not be vulnerable to ITS.

its-no was originally added on [1], but needs to be exposed on the
individual CPU models for the guests to see by default.

Note: Version 1 already exposes ARCH_CAP_BHI_NO, which would already
mark the CPU as invulnerable to ITS (at least in Linux); however,
expose ITS_NO for completeness.

[1] 74978391b2da ("target/i386: Make ITS_NO available to guests")

Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
Link: https://lore.kernel.org/r/20251106174626.49930-6-jon@nutanix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 2a869f5b739..01b64940b17 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5964,6 +5964,14 @@ static const X86CPUDefinition builtin_x86_defs[] = {
                     { /* end of list */ },
                 }
             },
+            {
+                .version = 3,
+                .note = "with cet-ss, cet-ibt, ITS_NO",
+                .props = (PropValue[]) {
+                    { "its-no", "on" },
+                    { /* end of list */ },
+                }
+            },
             { /* end of list */ },
         },
     },
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 095/102] whpx: i386: move whpx_vcpu_kick_out_of_hlt() invocation to interrupt raise time
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (92 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 094/102] target/i386: introduce ClearwaterForest-v3 " Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 096/102] whpx: i386: enable all supported host features Paolo Bonzini
                   ` (7 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

This fixes the sti followed by hlt kvm_unit_tests.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260228214704.19048-2-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/whpx/whpx-all.c | 49 ++++++++++++++++---------------------
 1 file changed, 21 insertions(+), 28 deletions(-)

diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index 51ecc9531fe..f12e621a412 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -1323,6 +1323,16 @@ static int whpx_handle_halt(CPUState *cpu)
     return ret;
 }
 
+static void whpx_vcpu_kick_out_of_hlt(CPUState *cpu) 
+{
+    WHV_REGISTER_VALUE reg;
+    whpx_get_reg(cpu, WHvRegisterInternalActivityState, &reg);
+    if (reg.InternalActivity.HaltSuspend) {
+        reg.InternalActivity.HaltSuspend = 0;
+        whpx_set_reg(cpu, WHvRegisterInternalActivityState, reg);
+    }
+}
+
 static void whpx_vcpu_pre_run(CPUState *cpu)
 {
     HRESULT hr;
@@ -1406,6 +1416,17 @@ static void whpx_vcpu_pre_run(CPUState *cpu)
                 .Vector = irq,
             };
             reg_count += 1;
+            /* 
+             * When the Hyper-V APIC is enabled, to get out of HLT we
+             * either have to request an interrupt or manually get it away
+             * from HLT.
+             *
+             * We also manually do inject some interrupts via WHvRegisterPendingEvent
+             * instead of WHVRequestInterrupt, which does not reset the HLT state.
+             */
+            if (whpx_irqchip_in_kernel()) {
+                whpx_vcpu_kick_out_of_hlt(cpu);
+            }
         }
      }
 
@@ -1468,15 +1489,6 @@ static void whpx_vcpu_post_run(CPUState *cpu)
         !vcpu->exit_ctx.VpContext.ExecutionState.InterruptShadow;
 }
 
-static void whpx_vcpu_kick_out_of_hlt(CPUState *cpu) 
-{
-    WHV_REGISTER_VALUE reg;
-    whpx_get_reg(cpu, WHvRegisterInternalActivityState, &reg);
-    if (reg.InternalActivity.HaltSuspend) {
-        reg.InternalActivity.HaltSuspend = 0;
-        whpx_set_reg(cpu, WHvRegisterInternalActivityState, reg);
-    }
-}
 
 static void whpx_vcpu_process_async_events(CPUState *cpu)
 {
@@ -1785,25 +1797,6 @@ int whpx_vcpu_run(CPUState *cpu)
                 cpu->exception_index = EXCP_INTERRUPT;
                 ret = 1;
             }
-            /* 
-             * When the Hyper-V APIC is enabled, to get out of HLT we
-             * either have to request an interrupt or manually get it away
-             * from HLT.
-             *
-             * We also manually do inject some interrupts via WHvRegisterPendingEvent
-             * instead of WHVRequestInterrupt, which does not reset the HLT state.
-             *
-             * However, even with this done, if the guest does an HLT without
-             * interrupts enabled (which the test_sti_inhibit KVM unit test does)
-             * then the guest will stay in HLT forever.
-             *
-             * Keep it this way for now, with perhaps adding a heartbeat later
-             * so that we get the CPU time savings from having Hyper-V handle HLT
-             * instead of going away from it as soon as possible.
-             */
-            if (whpx_irqchip_in_kernel()) {
-                whpx_vcpu_kick_out_of_hlt(cpu);
-            }
             break;
         case WHvRunVpExitReasonX64MsrAccess: {
             WHV_REGISTER_VALUE reg_values[3] = {0};
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 096/102] whpx: i386: enable all supported host features
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (93 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 095/102] whpx: i386: move whpx_vcpu_kick_out_of_hlt() invocation to interrupt raise time Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 097/102] whpx: i386: enable synthetic processor features Paolo Bonzini
                   ` (6 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260228214704.19048-3-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/whpx/whpx-all.c | 40 +++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index f12e621a412..285740bae87 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -2022,6 +2022,7 @@ int whpx_accel_init(AccelState *as, MachineState *ms)
     UINT32 whpx_cap_size;
     WHV_PARTITION_PROPERTY prop;
     WHV_CAPABILITY_FEATURES features = {0};
+    WHV_PROCESSOR_FEATURES_BANKS processor_features;
 
     whpx = &whpx_global;
 
@@ -2127,6 +2128,45 @@ int whpx_accel_init(AccelState *as, MachineState *ms)
         }
     }
 
+    /* Set all the supported features, to follow the MSHV example */
+    memset(&processor_features, 0, sizeof(WHV_PROCESSOR_FEATURES_BANKS));
+    processor_features.BanksCount = 2;
+
+    hr = whp_dispatch.WHvGetCapability(
+        WHvCapabilityCodeProcessorFeaturesBanks, &processor_features,
+        sizeof(WHV_PROCESSOR_FEATURES_BANKS), &whpx_cap_size);
+    if (FAILED(hr)) {
+        error_report("WHPX: Failed to get processor features, hr=%08lx", hr);
+        ret = -ENOSPC;
+        goto error;
+    }
+
+    if (processor_features.Bank1.NestedVirtSupport) {
+        memset(&prop, 0, sizeof(WHV_PARTITION_PROPERTY));
+        prop.NestedVirtualization = 1;
+        hr = whp_dispatch.WHvSetPartitionProperty(
+            whpx->partition,
+            WHvPartitionPropertyCodeNestedVirtualization,
+            &prop,
+            sizeof(WHV_PARTITION_PROPERTY));
+            if (FAILED(hr)) {
+                error_report("WHPX: Failed to enable nested virtualization, hr=%08lx", hr);
+                ret = -EINVAL;
+                goto error;
+        }
+    }
+
+    hr = whp_dispatch.WHvSetPartitionProperty(
+            whpx->partition,
+            WHvPartitionPropertyCodeProcessorFeaturesBanks,
+            &processor_features,
+            sizeof(WHV_PROCESSOR_FEATURES_BANKS));
+    if (FAILED(hr)) {
+        error_report("WHPX: Failed to set processor features, hr=%08lx", hr);
+        ret = -EINVAL;
+        goto error;
+    }
+
     /* Register for MSR and CPUID exits */
     memset(&prop, 0, sizeof(WHV_PARTITION_PROPERTY));
     prop.ExtendedVmExits.X64MsrExit = 1;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 097/102] whpx: i386: enable synthetic processor features
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (94 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 096/102] whpx: i386: enable all supported host features Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 098/102] whpx: i386: warn on unsupported MSR access instead of failing silently Paolo Bonzini
                   ` (5 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

At the point in time in which we setup the partition, the vCPUs
aren't available yet.

So enable them by default for now like what the MSHV backend does.

AccessFrequencyRegs is shared for both the LAPIC frequency reporting and the TSC frequency.

To still benefit from the fixed TSC frequency reporting when kernel-irqchip=off, still enable AccessFrequencyRegs anyway.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260228214704.19048-4-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/whpx/whpx-all.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index 285740bae87..2863224cd2e 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -2167,6 +2167,40 @@ int whpx_accel_init(AccelState *as, MachineState *ms)
         goto error;
     }
 
+    /* Enable synthetic processor features */
+    WHV_SYNTHETIC_PROCESSOR_FEATURES_BANKS synthetic_features;
+    memset(&synthetic_features, 0, sizeof(WHV_SYNTHETIC_PROCESSOR_FEATURES_BANKS));
+    synthetic_features.BanksCount = 1;
+
+    synthetic_features.Bank0.HypervisorPresent = 1;
+    synthetic_features.Bank0.Hv1 = 1;
+    synthetic_features.Bank0.AccessPartitionReferenceCounter = 1;
+    synthetic_features.Bank0.AccessPartitionReferenceTsc = 1;
+    /* if kernel-irqchip=off, HV_X64_MSR_APIC_FREQUENCY = 0. */
+    synthetic_features.Bank0.AccessFrequencyRegs = 1;
+    synthetic_features.Bank0.AccessVpIndex = 1;
+    synthetic_features.Bank0.AccessHypercallRegs = 1;
+    synthetic_features.Bank0.TbFlushHypercalls = 1;
+
+    if (whpx_irqchip_in_kernel()) {
+        synthetic_features.Bank0.AccessSynicRegs = 1;
+        synthetic_features.Bank0.AccessSyntheticTimerRegs = 1;
+        synthetic_features.Bank0.AccessIntrCtrlRegs = 1;
+        synthetic_features.Bank0.SyntheticClusterIpi = 1;
+        synthetic_features.Bank0.DirectSyntheticTimers = 1;
+    }
+
+    hr = whp_dispatch.WHvSetPartitionProperty(
+            whpx->partition,
+            WHvPartitionPropertyCodeSyntheticProcessorFeaturesBanks,
+            &synthetic_features,
+            sizeof(WHV_SYNTHETIC_PROCESSOR_FEATURES_BANKS));
+    if (FAILED(hr)) {
+        error_report("WHPX: Failed to set synthetic features, hr=%08lx", hr);
+        ret = -EINVAL;
+        goto error;
+    }
+
     /* Register for MSR and CPUID exits */
     memset(&prop, 0, sizeof(WHV_PARTITION_PROPERTY));
     prop.ExtendedVmExits.X64MsrExit = 1;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 098/102] whpx: i386: warn on unsupported MSR access instead of failing silently
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (95 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 097/102] whpx: i386: enable synthetic processor features Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 099/102] target/i386: emulate: more 64-bit register handling Paolo Bonzini
                   ` (4 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260228214704.19048-5-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/whpx/whpx-all.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index 2863224cd2e..4186be62ada 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -1819,6 +1819,9 @@ int whpx_vcpu_run(CPUState *cpu)
             reg_count = vcpu->exit_ctx.MsrAccess.AccessInfo.IsWrite ?
                         1 : 3;
 
+            warn_report("WHPX: Unsupported MSR access (0x%x), IsWrite=%i", 
+                vcpu->exit_ctx.MsrAccess.MsrNumber, vcpu->exit_ctx.MsrAccess.AccessInfo.IsWrite);
+
             hr = whp_dispatch.WHvSetVirtualProcessorRegisters(
                 whpx->partition,
                 cpu->cpu_index,
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 099/102] target/i386: emulate: more 64-bit register handling
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (96 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 098/102] whpx: i386: warn on unsupported MSR access instead of failing silently Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 100/102] whpx: i386: enable PMU Paolo Bonzini
                   ` (3 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260228214704.19048-6-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/emulate/x86_flags.h | 20 ++++++++++++++
 target/i386/emulate/x86_emu.c   | 17 ++++++++++++
 target/i386/emulate/x86_flags.c | 47 +++++++++++++++++++++++++++++++++
 3 files changed, 84 insertions(+)

diff --git a/target/i386/emulate/x86_flags.h b/target/i386/emulate/x86_flags.h
index a395c837a0e..7ffbbe5c122 100644
--- a/target/i386/emulate/x86_flags.h
+++ b/target/i386/emulate/x86_flags.h
@@ -33,6 +33,10 @@ void set_CF(CPUX86State *env, bool val);
 
 void SET_FLAGS_OxxxxC(CPUX86State *env, bool new_of, bool new_cf);
 
+#ifdef TARGET_X86_64
+void SET_FLAGS_OSZAPC_SUB64(CPUX86State *env, uint64_t v1, uint64_t v2,
+                            uint64_t diff);
+#endif
 void SET_FLAGS_OSZAPC_SUB32(CPUX86State *env, uint32_t v1, uint32_t v2,
                             uint32_t diff);
 void SET_FLAGS_OSZAPC_SUB16(CPUX86State *env, uint16_t v1, uint16_t v2,
@@ -40,6 +44,10 @@ void SET_FLAGS_OSZAPC_SUB16(CPUX86State *env, uint16_t v1, uint16_t v2,
 void SET_FLAGS_OSZAPC_SUB8(CPUX86State *env, uint8_t v1, uint8_t v2,
                            uint8_t diff);
 
+#ifdef TARGET_X86_64
+void SET_FLAGS_OSZAPC_ADD64(CPUX86State *env, uint64_t v1, uint64_t v2,
+                            uint64_t diff);
+#endif
 void SET_FLAGS_OSZAPC_ADD32(CPUX86State *env, uint32_t v1, uint32_t v2,
                             uint32_t diff);
 void SET_FLAGS_OSZAPC_ADD16(CPUX86State *env, uint16_t v1, uint16_t v2,
@@ -47,6 +55,10 @@ void SET_FLAGS_OSZAPC_ADD16(CPUX86State *env, uint16_t v1, uint16_t v2,
 void SET_FLAGS_OSZAPC_ADD8(CPUX86State *env, uint8_t v1, uint8_t v2,
                            uint8_t diff);
 
+#ifdef TARGET_X86_64
+void SET_FLAGS_OSZAP_SUB64(CPUX86State *env, uint64_t v1, uint64_t v2,
+                           uint64_t diff);
+#endif
 void SET_FLAGS_OSZAP_SUB32(CPUX86State *env, uint32_t v1, uint32_t v2,
                            uint32_t diff);
 void SET_FLAGS_OSZAP_SUB16(CPUX86State *env, uint16_t v1, uint16_t v2,
@@ -54,6 +66,10 @@ void SET_FLAGS_OSZAP_SUB16(CPUX86State *env, uint16_t v1, uint16_t v2,
 void SET_FLAGS_OSZAP_SUB8(CPUX86State *env, uint8_t v1, uint8_t v2,
                           uint8_t diff);
 
+#ifdef TARGET_X86_64
+void SET_FLAGS_OSZAP_ADD64(CPUX86State *env, uint64_t v1, uint64_t v2,
+                           uint64_t diff);
+#endif
 void SET_FLAGS_OSZAP_ADD32(CPUX86State *env, uint32_t v1, uint32_t v2,
                            uint32_t diff);
 void SET_FLAGS_OSZAP_ADD16(CPUX86State *env, uint16_t v1, uint16_t v2,
@@ -61,6 +77,10 @@ void SET_FLAGS_OSZAP_ADD16(CPUX86State *env, uint16_t v1, uint16_t v2,
 void SET_FLAGS_OSZAP_ADD8(CPUX86State *env, uint8_t v1, uint8_t v2,
                           uint8_t diff);
 
+#ifdef TARGET_X86_64
+void SET_FLAGS_OSZAPC_LOGIC64(CPUX86State *env, uint64_t v1, uint64_t v2,
+                              uint64_t diff);
+#endif
 void SET_FLAGS_OSZAPC_LOGIC32(CPUX86State *env, uint32_t v1, uint32_t v2,
                               uint32_t diff);
 void SET_FLAGS_OSZAPC_LOGIC16(CPUX86State *env, uint16_t v1, uint16_t v2,
diff --git a/target/i386/emulate/x86_emu.c b/target/i386/emulate/x86_emu.c
index 8d35f3338c1..6c4ccc45383 100644
--- a/target/i386/emulate/x86_emu.c
+++ b/target/i386/emulate/x86_emu.c
@@ -45,6 +45,22 @@
 #include "x86_mmu.h"
 
 
+#ifdef TARGET_X86_64
+#define EXEC_2OP_FLAGS_CMD_64(env, decode, cmd, FLAGS_FUNC, save_res) \
+    case 8:                                        \
+    {                                               \
+        uint64_t v1 = (uint64_t)decode->op[0].val;  \
+        uint64_t v2 = (uint64_t)decode->op[1].val;  \
+        uint64_t diff = v1 cmd v2;                  \
+        if (save_res) {                              \
+            if (write_val_ext(env, &decode->op[0], diff, 8)) { return 1; } \
+        } \
+        FLAGS_FUNC##64(env, v1, v2, diff);          \
+        break;                                      \
+    }
+#else
+#define EXEC_2OP_FLAGS_CMD_64(env, decode, cmd, FLAGS_FUNC, save_res)
+#endif
 #define EXEC_2OP_FLAGS_CMD(env, decode, cmd, FLAGS_FUNC, save_res) \
 {                                                       \
     if (fetch_operands(env, decode, 2, true, true, false))  {\
@@ -84,6 +100,7 @@
         FLAGS_FUNC##32(env, v1, v2, diff);          \
         break;                                      \
     }                                               \
+    EXEC_2OP_FLAGS_CMD_64(env, decode, cmd, FLAGS_FUNC, save_res) \
     default:                                        \
         VM_PANIC("bad size\n");                    \
     }                                                   \
diff --git a/target/i386/emulate/x86_flags.c b/target/i386/emulate/x86_flags.c
index 6592193b5e0..3c4270a14c1 100644
--- a/target/i386/emulate/x86_flags.c
+++ b/target/i386/emulate/x86_flags.c
@@ -82,6 +82,10 @@
     SET_FLAGS_OSZAPC_SIZE(16, carries, result)
 #define SET_FLAGS_OSZAPC_32(carries, result) \
     SET_FLAGS_OSZAPC_SIZE(32, carries, result)
+#ifdef TARGET_X86_64
+#define SET_FLAGS_OSZAPC_64(carries, result) \
+    SET_FLAGS_OSZAPC_SIZE(64, carries, result)
+#endif
 
 /* ******************* */
 /* OSZAP */
@@ -107,6 +111,10 @@
     SET_FLAGS_OSZAP_SIZE(16, carries, result)
 #define SET_FLAGS_OSZAP_32(carries, result) \
     SET_FLAGS_OSZAP_SIZE(32, carries, result)
+#ifdef TARGET_X86_64
+#define SET_FLAGS_OSZAP_64(carries, result) \
+    SET_FLAGS_OSZAP_SIZE(64, carries, result)
+#endif
 
 void SET_FLAGS_OxxxxC(CPUX86State *env, bool new_of, bool new_cf)
 {
@@ -115,6 +123,14 @@ void SET_FLAGS_OxxxxC(CPUX86State *env, bool new_of, bool new_cf)
     env->cc_src ^= ((target_ulong)new_of << LF_BIT_PO);
 }
 
+#ifdef TARGET_X86_64
+void SET_FLAGS_OSZAPC_SUB64(CPUX86State *env, uint64_t v1, uint64_t v2,
+                            uint64_t diff)
+{
+    SET_FLAGS_OSZAPC_64(SUB_COUT_VEC(v1, v2, diff), diff);
+}
+#endif
+
 void SET_FLAGS_OSZAPC_SUB32(CPUX86State *env, uint32_t v1, uint32_t v2,
                             uint32_t diff)
 {
@@ -133,6 +149,14 @@ void SET_FLAGS_OSZAPC_SUB8(CPUX86State *env, uint8_t v1, uint8_t v2,
     SET_FLAGS_OSZAPC_8(SUB_COUT_VEC(v1, v2, diff), diff);
 }
 
+#ifdef TARGET_X86_64
+void SET_FLAGS_OSZAPC_ADD64(CPUX86State *env, uint64_t v1, uint64_t v2,
+                            uint64_t diff)
+{
+    SET_FLAGS_OSZAPC_64(ADD_COUT_VEC(v1, v2, diff), diff);
+}
+#endif
+
 void SET_FLAGS_OSZAPC_ADD32(CPUX86State *env, uint32_t v1, uint32_t v2,
                             uint32_t diff)
 {
@@ -151,6 +175,14 @@ void SET_FLAGS_OSZAPC_ADD8(CPUX86State *env, uint8_t v1, uint8_t v2,
     SET_FLAGS_OSZAPC_8(ADD_COUT_VEC(v1, v2, diff), diff);
 }
 
+#ifdef TARGET_X86_64
+void SET_FLAGS_OSZAP_SUB64(CPUX86State *env, uint64_t v1, uint64_t v2,
+                            uint64_t diff)
+{
+    SET_FLAGS_OSZAP_64(SUB_COUT_VEC(v1, v2, diff), diff);
+}
+#endif
+
 void SET_FLAGS_OSZAP_SUB32(CPUX86State *env, uint32_t v1, uint32_t v2,
                             uint32_t diff)
 {
@@ -169,6 +201,14 @@ void SET_FLAGS_OSZAP_SUB8(CPUX86State *env, uint8_t v1, uint8_t v2,
     SET_FLAGS_OSZAP_8(SUB_COUT_VEC(v1, v2, diff), diff);
 }
 
+#ifdef TARGET_X86_64
+void SET_FLAGS_OSZAP_ADD64(CPUX86State *env, uint64_t v1, uint64_t v2,
+                            uint64_t diff)
+{
+    SET_FLAGS_OSZAP_64(ADD_COUT_VEC(v1, v2, diff), diff);
+}
+#endif
+
 void SET_FLAGS_OSZAP_ADD32(CPUX86State *env, uint32_t v1, uint32_t v2,
                             uint32_t diff)
 {
@@ -187,6 +227,13 @@ void SET_FLAGS_OSZAP_ADD8(CPUX86State *env, uint8_t v1, uint8_t v2,
     SET_FLAGS_OSZAP_8(ADD_COUT_VEC(v1, v2, diff), diff);
 }
 
+#ifdef TARGET_X86_64
+void SET_FLAGS_OSZAPC_LOGIC64(CPUX86State *env, uint64_t v1, uint64_t v2,
+                            uint64_t diff)
+{
+    SET_FLAGS_OSZAPC_64(0, diff);
+}
+#endif
 
 void SET_FLAGS_OSZAPC_LOGIC32(CPUX86State *env, uint32_t v1, uint32_t v2,
                               uint32_t diff)
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 100/102] whpx: i386: enable PMU
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (97 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 099/102] target/i386: emulate: more 64-bit register handling Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 101/102] whpx: i386: expose HV_X64_MSR_APIC_FREQUENCY when kernel-irqchip=off Paolo Bonzini
                   ` (2 subsequent siblings)
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Also a partition property instead of a CPU one...

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260228214704.19048-7-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/whpx/whpx-all.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index 4186be62ada..7ccf92e4d11 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -2026,6 +2026,7 @@ int whpx_accel_init(AccelState *as, MachineState *ms)
     WHV_PARTITION_PROPERTY prop;
     WHV_CAPABILITY_FEATURES features = {0};
     WHV_PROCESSOR_FEATURES_BANKS processor_features;
+    WHV_PROCESSOR_PERFMON_FEATURES perfmon_features;
 
     whpx = &whpx_global;
 
@@ -2170,6 +2171,27 @@ int whpx_accel_init(AccelState *as, MachineState *ms)
         goto error;
     }
 
+    /* Enable supported performance monitoring capabilities */
+    hr = whp_dispatch.WHvGetCapability(
+        WHvCapabilityCodeProcessorPerfmonFeatures, &perfmon_features,
+        sizeof(WHV_PROCESSOR_PERFMON_FEATURES), &whpx_cap_size);
+    if (FAILED(hr)) {
+        error_report("WHPX: Failed to get performance monitoring features, hr=%08lx", hr);
+        ret = -ENOSPC;
+        goto error;
+    }
+
+    hr = whp_dispatch.WHvSetPartitionProperty(
+            whpx->partition,
+            WHvPartitionPropertyCodeProcessorPerfmonFeatures,
+            &perfmon_features,
+            sizeof(WHV_PROCESSOR_PERFMON_FEATURES));
+    if (FAILED(hr)) {
+        error_report("WHPX: Failed to set performance monitoring features, hr=%08lx", hr);
+        ret = -EINVAL;
+        goto error;
+    }
+
     /* Enable synthetic processor features */
     WHV_SYNTHETIC_PROCESSOR_FEATURES_BANKS synthetic_features;
     memset(&synthetic_features, 0, sizeof(WHV_SYNTHETIC_PROCESSOR_FEATURES_BANKS));
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 101/102] whpx: i386: expose HV_X64_MSR_APIC_FREQUENCY when kernel-irqchip=off
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (98 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 100/102] whpx: i386: enable PMU Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02  8:47 ` [PULL 102/102] target/i386: emulate: fix scas Paolo Bonzini
  2026-03-02 14:01 ` [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Peter Maydell
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Now that we expose AccessFrequencyRegs, expose HV_X64_MSR_APIC_FREQUENCY as well for the case when the Hyper-V LAPIC is not used.

If the Hyper-V LAPIC is used, this will be handled by the hypervisor instead of the VMM, hence gating it on !whpx_irqchip_in_kernel().

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260228214704.19048-8-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/whpx/whpx-all.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/target/i386/whpx/whpx-all.c b/target/i386/whpx/whpx-all.c
index 7ccf92e4d11..c172e86886b 100644
--- a/target/i386/whpx/whpx-all.c
+++ b/target/i386/whpx/whpx-all.c
@@ -45,6 +45,8 @@
 #include <winhvplatform.h>
 
 #define HYPERV_APIC_BUS_FREQUENCY      (200000000ULL)
+/* for kernel-irqchip=off */
+#define HV_X64_MSR_APIC_FREQUENCY       0x40000023
 
 static const WHV_REGISTER_NAME whpx_register_names[] = {
 
@@ -1802,6 +1804,7 @@ int whpx_vcpu_run(CPUState *cpu)
             WHV_REGISTER_VALUE reg_values[3] = {0};
             WHV_REGISTER_NAME reg_names[3];
             UINT32 reg_count;
+            bool is_known_msr = 0; 
 
             reg_names[0] = WHvX64RegisterRip;
             reg_names[1] = WHvX64RegisterRax;
@@ -1811,6 +1814,12 @@ int whpx_vcpu_run(CPUState *cpu)
                 vcpu->exit_ctx.VpContext.Rip +
                 vcpu->exit_ctx.VpContext.InstructionLength;
 
+            if (vcpu->exit_ctx.MsrAccess.MsrNumber == HV_X64_MSR_APIC_FREQUENCY
+                && !vcpu->exit_ctx.MsrAccess.AccessInfo.IsWrite
+                && !whpx_irqchip_in_kernel()) {
+                is_known_msr = 1;
+                reg_values[1].Reg32 = (uint32_t)X86_CPU(cpu)->env.apic_bus_freq;
+            }
             /*
              * For all unsupported MSR access we:
              *     ignore writes
@@ -1819,8 +1828,10 @@ int whpx_vcpu_run(CPUState *cpu)
             reg_count = vcpu->exit_ctx.MsrAccess.AccessInfo.IsWrite ?
                         1 : 3;
 
-            warn_report("WHPX: Unsupported MSR access (0x%x), IsWrite=%i", 
+            if (!is_known_msr) {
+                warn_report("WHPX: Unsupported MSR access (0x%x), IsWrite=%i", 
                 vcpu->exit_ctx.MsrAccess.MsrNumber, vcpu->exit_ctx.MsrAccess.AccessInfo.IsWrite);
+            }
 
             hr = whp_dispatch.WHvSetVirtualProcessorRegisters(
                 whpx->partition,
@@ -1988,6 +1999,10 @@ int whpx_init_vcpu(CPUState *cpu)
         }
     }
 
+    /* When not using the Hyper-V APIC, the frequency is 1 GHz */
+    if (!whpx_irqchip_in_kernel()) {
+        env->apic_bus_freq = 1000000000;
+    }
 
     vcpu->interruptable = true;
     cpu->vcpu_dirty = true;
@@ -2201,7 +2216,6 @@ int whpx_accel_init(AccelState *as, MachineState *ms)
     synthetic_features.Bank0.Hv1 = 1;
     synthetic_features.Bank0.AccessPartitionReferenceCounter = 1;
     synthetic_features.Bank0.AccessPartitionReferenceTsc = 1;
-    /* if kernel-irqchip=off, HV_X64_MSR_APIC_FREQUENCY = 0. */
     synthetic_features.Bank0.AccessFrequencyRegs = 1;
     synthetic_features.Bank0.AccessVpIndex = 1;
     synthetic_features.Bank0.AccessHypercallRegs = 1;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PULL 102/102] target/i386: emulate: fix scas
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (99 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 101/102] whpx: i386: expose HV_X64_MSR_APIC_FREQUENCY when kernel-irqchip=off Paolo Bonzini
@ 2026-03-02  8:47 ` Paolo Bonzini
  2026-03-02 14:01 ` [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Peter Maydell
  101 siblings, 0 replies; 105+ messages in thread
From: Paolo Bonzini @ 2026-03-02  8:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ani Sinha, Mohamed Mediouni

From: Mohamed Mediouni <mohamed@unpredictable.fr>

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260228214704.19048-9-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/emulate/x86_emu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/i386/emulate/x86_emu.c b/target/i386/emulate/x86_emu.c
index 6c4ccc45383..55b1a68eb6c 100644
--- a/target/i386/emulate/x86_emu.c
+++ b/target/i386/emulate/x86_emu.c
@@ -745,6 +745,8 @@ static bool exec_scas(CPUX86State *env, struct x86_decode *decode)
 {
     decode->op[0].type = X86_VAR_REG;
     decode->op[0].reg = R_EAX;
+    decode->op[0].regptr = x86_reg(env, R_EAX);
+
     if (decode->rep) {
         string_rep(env, decode, exec_scas_single, decode->rep);
     } else {
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 105+ messages in thread

* Re: [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze
  2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
                   ` (100 preceding siblings ...)
  2026-03-02  8:47 ` [PULL 102/102] target/i386: emulate: fix scas Paolo Bonzini
@ 2026-03-02 14:01 ` Peter Maydell
  101 siblings, 0 replies; 105+ messages in thread
From: Peter Maydell @ 2026-03-02 14:01 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On Mon, 2 Mar 2026 at 08:46, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> The following changes since commit d8a9d97317d03190b34498741f98f22e2a9afe3e:
>
>   Merge tag 'pull-target-arm-20260226' of https://gitlab.com/pm215/qemu into staging (2026-02-26 16:00:07 +0000)
>
> are available in the Git repository at:
>
>   https://gitlab.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to 5a0f9481b0cf344c4437515b596e4ecf57ccc30f:
>
>   target/i386: emulate: fix scas (2026-03-01 16:02:54 +0100)
>
> ----------------------------------------------------------------
> * target/alpha: Fix for record/replay issue
> * accel/nitro: New Nitro Enclaves accelerator
> * generic + kvm: add support for rebuilding VMs on reset
> * audio requirements cleanup
> * vmmouse: Fix hypercall clobbers
> * rust: use checked_div to make clippy happy
> * kvm: Don't clear pending #SMI in kvm_get_vcpu_events
> * target/i386/emulate: rework MMU code, many fixes
> * target/i386/whpx: replace winhvemulation with target/i386/emulate
> * target/i386/whpx: x2apic support
> * target/i386/whpx: vapic support
> * kvm: support for the "ignore guest PAT" quirk
> * target/i386: add ITS_NO bit for the arch-capabilities MSR
> * target/i386: add MBEC bit for nested VMX
>
> ----------------------------------------------------------------



Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/11.0
for any user-visible changes.

-- PMM


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PULL 073/102] hw/machine: introduce machine specific option 'x-change-vmfd-on-reset'
  2026-03-02  8:47 ` [PULL 073/102] hw/machine: introduce machine specific option 'x-change-vmfd-on-reset' Paolo Bonzini
@ 2026-03-10 16:49   ` Peter Maydell
  0 siblings, 0 replies; 105+ messages in thread
From: Peter Maydell @ 2026-03-10 16:49 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, Ani Sinha

On Mon, 2 Mar 2026 at 08:57, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> From: Ani Sinha <anisinha@redhat.com>
>
> A new machine specific option 'x-change-vmfd-on-reset' is introduced for
> debugging and testing only (hence the 'x-' prefix). This option when enabled
> will force KVM VM file descriptor to be changed upon guest reset like
> in the case of confidential guests. This can be used to exercise the code
> changes that are specific for confidential guests on non-confidential
> guests as well (except changes that require hardware support for
> confidential guests).
> A new functional test has been added in the next patch that uses this new
> parameter to test the VM file descriptor changes.
>
> Signed-off-by: Ani Sinha <anisinha@redhat.com>
> Link: https://lore.kernel.org/r/20260225035000.385950-33-anisinha@redhat.com
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Hi; Coverity points out an issue in this commit (CID 1644565):

> --- a/system/runstate.c
> +++ b/system/runstate.c
> @@ -526,9 +526,9 @@ void qemu_system_reset(ShutdownCause reason)
>          type = RESET_TYPE_COLD;
>      }
>
> -    if (!cpus_are_resettable() &&
> -        (reason == SHUTDOWN_CAUSE_GUEST_RESET ||
> -         reason == SHUTDOWN_CAUSE_HOST_QMP_SYSTEM_RESET)) {
> +    if ((reason == SHUTDOWN_CAUSE_GUEST_RESET ||
> +         reason == SHUTDOWN_CAUSE_HOST_QMP_SYSTEM_RESET) &&
> +        (current_machine->new_accel_vmfd_on_reset || !cpus_are_resettable())) {

This change adds a dereference of current_machine, but earlier
in the file we have

    mc = current_machine ? MACHINE_GET_CLASS(current_machine) : NULL;

which assumes that current_machine can be NULL.

Presumably here we should be handling the current_machine == NULL
possibility?

>          if (ac->rebuild_guest) {
>              ret = ac->rebuild_guest(current_machine);
>              if (ret < 0) {
> --
> 2.53.0

thanks
-- PMM


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PULL 063/102] i386/sev: add support for confidential guest reset
  2026-03-02  8:42 ` [PULL 063/102] i386/sev: add support for confidential guest reset Paolo Bonzini
@ 2026-03-10 16:59   ` Peter Maydell
  0 siblings, 0 replies; 105+ messages in thread
From: Peter Maydell @ 2026-03-10 16:59 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, Ani Sinha

On Mon, 2 Mar 2026 at 08:57, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> From: Ani Sinha <anisinha@redhat.com>
>
> When the KVM VM file descriptor changes as a part of the confidential guest
> reset mechanism, it necessary to create a new confidential guest context and
> re-encrypt the VM memory. This happens for SEV-ES and SEV-SNP virtual machines
> as a part of SEV_LAUNCH_FINISH, SEV_SNP_LAUNCH_FINISH operations.
>
> A new resettable interface for SEV module has been added. A new reset callback
> for the reset 'exit' state has been implemented to perform the above operations
> when the VM file descriptor has changed during VM reset.
>
> Tracepoints has been added also for tracing purpose.
>
> Signed-off-by: Ani Sinha <anisinha@redhat.com>
> Link: https://lore.kernel.org/r/20260225035000.385950-23-anisinha@redhat.com
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Hi; Coverity points out an issue with this commit:




> +static void sev_handle_reset(Object *obj, ResetType type)
> +{
> +    SevCommonState *sev_common = SEV_COMMON(MACHINE(qdev_get_machine())->cgs);
> +    SevCommonStateClass *klass = SEV_COMMON_GET_CLASS(sev_common);

Getting the class pointer assumes that sev_common is not NULL...

> +
> +    if (!sev_common) {

...but then we check for this afterwards.

Since this is a reset method you can assume that the object
is not NULL, as usual for methods on objects.

> +        return;
> +    }
> +
> +    if (!runstate_is_running()) {
> +        return;
> +    }
> +
> +    sev_add_kernel_loader_hashes(&sev_load_ctx, &error_fatal);
> +    if (sev_es_enabled() && !sev_snp_enabled()) {
> +        sev_launch_get_measure(NULL, NULL);
> +    }
> +    if (!sev_check_state(sev_common, SEV_STATE_RUNNING)) {
> +        /* this calls sev_snp_launch_finish() etc */
> +        klass->launch_finish(sev_common);
> +    }
> +
> +    trace_sev_handle_reset();
> +    return;
> +}

thanks
-- PMM


^ permalink raw reply	[flat|nested] 105+ messages in thread

end of thread, other threads:[~2026-03-10 17:00 UTC | newest]

Thread overview: 105+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-02  8:41 [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Paolo Bonzini
2026-03-02  8:41 ` [PULL 001/102] hw/i386/vmmouse: Fix hypercall clobbers Paolo Bonzini
2026-03-02  8:41 ` [PULL 002/102] target/i386/emulate/x86_decode: Fix compiler warning Paolo Bonzini
2026-03-02  8:41 ` [PULL 003/102] target/i386/hvf/x86_mmu: " Paolo Bonzini
2026-03-02  8:41 ` [PULL 004/102] target/i386/emulate/x86_decode: Actually use stream in decode_instruction_stream() Paolo Bonzini
2026-03-02  8:42 ` [PULL 005/102] target/i386/emulate: rework string_rep emulation Paolo Bonzini
2026-03-02  8:42 ` [PULL 006/102] target/i386: emulate, hvf: move x86_mmu to common code Paolo Bonzini
2026-03-02  8:42 ` [PULL 007/102] whpx: i386: re-enable guest debug support Paolo Bonzini
2026-03-02  8:42 ` [PULL 008/102] whpx: preparatory changes before switching over from winhvemulation Paolo Bonzini
2026-03-02  8:42 ` [PULL 009/102] whpx: refactor whpx_destroy_vcpu to arch-specific function Paolo Bonzini
2026-03-02  8:42 ` [PULL 010/102] whpx: move whpx_get_reg/whpx_set_reg to generic code Paolo Bonzini
2026-03-02  8:42 ` [PULL 011/102] whpx: i386: switch over from winhvemulation to target/i386/emulate Paolo Bonzini
2026-03-02  8:42 ` [PULL 012/102] whpx: i386: flags conversion for target/i386/emulate internal state Paolo Bonzini
2026-03-02  8:42 ` [PULL 013/102] whpx: i386: remove remaining winhvemulation support code Paolo Bonzini
2026-03-02  8:42 ` [PULL 014/102] whpx: i386: remove messages Paolo Bonzini
2026-03-02  8:42 ` [PULL 015/102] whpx: i386: remove CPUID trapping Paolo Bonzini
2026-03-02  8:42 ` [PULL 016/102] whpx: common, i386, arm: rework state levels Paolo Bonzini
2026-03-02  8:42 ` [PULL 017/102] whpx: i386: saving/restoring less state for WHPX_LEVEL_FAST_RUNTIME_STATE Paolo Bonzini
2026-03-02  8:42 ` [PULL 018/102] target/i386: mshv, emulate: move the generic x86 helpers to target/i386/emulate Paolo Bonzini
2026-03-02  8:42 ` [PULL 019/102] target/i386: emulate: 5-level paging for the page table walker Paolo Bonzini
2026-03-02  8:42 ` [PULL 020/102] target/i386: emulate, hvf, mshv: rework MMU code Paolo Bonzini
2026-03-02  8:42 ` [PULL 021/102] hvf: i386: save/restore CR0/2/3 Paolo Bonzini
2026-03-02  8:42 ` [PULL 022/102] target/i386: emulate: get rid of write_val_to_mem() helper Paolo Bonzini
2026-03-02  8:42 ` [PULL 023/102] target/i386: emulate: raise an exception on translation fault Paolo Bonzini
2026-03-02  8:42 ` [PULL 024/102] target/i386: emulate: remove fetch_instruction helper too Paolo Bonzini
2026-03-02  8:42 ` [PULL 025/102] target/i386: emulate: propagate memory errors on most reads/writes Paolo Bonzini
2026-03-02  8:42 ` [PULL 026/102] whpx: i386: inject exceptions Paolo Bonzini
2026-03-02  8:42 ` [PULL 027/102] whpx: i386: bump to x2apic Paolo Bonzini
2026-03-02  8:42 ` [PULL 028/102] whpx: i386: ignore send_msi to interrupt vector 0 Paolo Bonzini
2026-03-02  8:42 ` [PULL 029/102] target/i386: emulate: propagate errors all the way and stop early Paolo Bonzini
2026-03-02  8:42 ` [PULL 030/102] accel/kvm: Don't clear pending #SMI in kvm_get_vcpu_events Paolo Bonzini
2026-03-02  8:42 ` [PULL 031/102] scripts/update-linux-headers: Add Nitro Enclaves header Paolo Bonzini
2026-03-02  8:42 ` [PULL 032/102] linux-headers: Add nitro_enclaves.h Paolo Bonzini
2026-03-02  8:42 ` [PULL 033/102] hw/nitro: Add Nitro Vsock Bus Paolo Bonzini
2026-03-02  8:42 ` [PULL 034/102] accel: Add Nitro Enclaves accelerator Paolo Bonzini
2026-03-02  8:42 ` [PULL 035/102] hw/nitro/nitro-serial-vsock: Nitro Enclaves vsock console Paolo Bonzini
2026-03-02  8:42 ` [PULL 036/102] hw/nitro: Introduce Nitro Enclave Heartbeat device Paolo Bonzini
2026-03-02  8:42 ` [PULL 037/102] target/arm/cpu64: Allow -host for nitro Paolo Bonzini
2026-03-02  8:42 ` [PULL 038/102] hw/nitro: Add nitro machine Paolo Bonzini
2026-03-02  8:42 ` [PULL 039/102] hw/core/eif: Move definitions to header Paolo Bonzini
2026-03-02  8:42 ` [PULL 040/102] hw/nitro: Enable direct kernel boot Paolo Bonzini
2026-03-02  8:42 ` [PULL 041/102] docs: Add Nitro Enclaves documentation Paolo Bonzini
2026-03-02  8:42 ` [PULL 042/102] i386/kvm: avoid installing duplicate msr entries in msr_handlers Paolo Bonzini
2026-03-02  8:42 ` [PULL 043/102] accel/kvm: add confidential class member to indicate guest rebuild capability Paolo Bonzini
2026-03-02  8:42 ` [PULL 044/102] hw/accel: add a per-accelerator callback to change VM accelerator handle Paolo Bonzini
2026-03-02  8:42 ` [PULL 045/102] system/physmem: add helper to reattach existing memory after KVM VM fd change Paolo Bonzini
2026-03-02  8:42 ` [PULL 046/102] accel/kvm: add changes required to support KVM VM file descriptor change Paolo Bonzini
2026-03-02  8:42 ` [PULL 047/102] accel/kvm: mark guest state as unprotected after vm " Paolo Bonzini
2026-03-02  8:42 ` [PULL 048/102] accel/kvm: add a notifier to indicate KVM VM file descriptor has changed Paolo Bonzini
2026-03-02  8:42 ` [PULL 049/102] accel/kvm: notify when KVM VM file fd is about to be changed Paolo Bonzini
2026-03-02  8:42 ` [PULL 050/102] i386/kvm: unregister smram listeners prior to vm file descriptor change Paolo Bonzini
2026-03-02  8:42 ` [PULL 051/102] kvm/i386: implement architecture support for kvm " Paolo Bonzini
2026-03-02  8:42 ` [PULL 052/102] i386/kvm: refactor xen init into a new function Paolo Bonzini
2026-03-02  8:42 ` [PULL 053/102] hw/i386: refactor x86_bios_rom_init for reuse in confidential guest reset Paolo Bonzini
2026-03-02  8:42 ` [PULL 054/102] hw/i386: export a new function x86_bios_rom_reload Paolo Bonzini
2026-03-02  8:42 ` [PULL 055/102] kvm/i386: reload firmware for confidential guest reset Paolo Bonzini
2026-03-02  8:42 ` [PULL 056/102] accel/kvm: rebind current VCPUs to the new KVM VM file descriptor upon reset Paolo Bonzini
2026-03-02  8:42 ` [PULL 057/102] i386/tdx: refactor TDX firmware memory initialization code into a new function Paolo Bonzini
2026-03-02  8:42 ` [PULL 058/102] i386/tdx: finalize TDX guest state upon reset Paolo Bonzini
2026-03-02  8:42 ` [PULL 059/102] i386/tdx: add a pre-vmfd change notifier to reset tdx state Paolo Bonzini
2026-03-02  8:42 ` [PULL 060/102] i386/sev: add migration blockers only once Paolo Bonzini
2026-03-02  8:42 ` [PULL 061/102] i386/sev: add notifiers " Paolo Bonzini
2026-03-02  8:42 ` [PULL 062/102] i386/sev: free existing launch update data and kernel hashes data on init Paolo Bonzini
2026-03-02  8:42 ` [PULL 063/102] i386/sev: add support for confidential guest reset Paolo Bonzini
2026-03-10 16:59   ` Peter Maydell
2026-03-02  8:42 ` [PULL 064/102] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors Paolo Bonzini
2026-03-02  8:43 ` [PULL 065/102] kvm/i8254: refactor pit initialization into a helper Paolo Bonzini
2026-03-02  8:43 ` [PULL 066/102] kvm/i8254: add support for confidential guest reset Paolo Bonzini
2026-03-02  8:47 ` [PULL 067/102] kvm/hyperv: add synic feature to CPU only if its not enabled Paolo Bonzini
2026-03-02  8:47 ` [PULL 068/102] hw/hyperv/vmbus: add support for confidential guest reset Paolo Bonzini
2026-03-02  8:47 ` [PULL 069/102] kvm/xen-emu: re-initialize capabilities during " Paolo Bonzini
2026-03-02  8:47 ` [PULL 070/102] ppc/openpic: create a new openpic device and reattach mem region on coco reset Paolo Bonzini
2026-03-02  8:47 ` [PULL 071/102] kvm/vcpu: add notifiers to inform vcpu file descriptor change Paolo Bonzini
2026-03-02  8:47 ` [PULL 072/102] kvm/clock: add support for confidential guest reset Paolo Bonzini
2026-03-02  8:47 ` [PULL 073/102] hw/machine: introduce machine specific option 'x-change-vmfd-on-reset' Paolo Bonzini
2026-03-10 16:49   ` Peter Maydell
2026-03-02  8:47 ` [PULL 074/102] tests/functional/x86_64: add functional test to exercise vm fd change on reset Paolo Bonzini
2026-03-02  8:47 ` [PULL 075/102] qom: add 'confidential-guest-reset' property for x86 confidential vms Paolo Bonzini
2026-03-02  8:47 ` [PULL 076/102] audio: fix nominal volume channel (cosmetic) Paolo Bonzini
2026-03-02  8:47 ` [PULL 078/102] scripts/vendor.py: add pycotap Paolo Bonzini
2026-03-02  8:47 ` [PULL 079/102] audio: require pulse >= 0.9.13 Paolo Bonzini
2026-03-02  8:47 ` [PULL 080/102] audio: require spice >= 0.15 Paolo Bonzini
2026-03-02  8:47 ` [PULL 081/102] ui: drop spice-protocol < 0.14.3 support Paolo Bonzini
2026-03-02  8:47 ` [PULL 082/102] rust: use checked_div to make clippy happy Paolo Bonzini
2026-03-02  8:47 ` [PULL 083/102] KVM: i386: Default disable ignore guest PAT quirk Paolo Bonzini
2026-03-02  8:47 ` [PULL 084/102] whpx: x86: remove inaccurate comment Paolo Bonzini
2026-03-02  8:47 ` [PULL 085/102] whpx: x86: kick out of HLT manually when using the kernel-irqchip Paolo Bonzini
2026-03-02  8:47 ` [PULL 086/102] hw: i386: vapic: enable on WHPX with user-mode irqchip Paolo Bonzini
2026-03-02  8:47 ` [PULL 087/102] target/alpha: Reset CPU Paolo Bonzini
2026-03-02  8:47 ` [PULL 088/102] Reapply "rcu: Unify force quiescent state" Paolo Bonzini
2026-03-02  8:47 ` [PULL 089/102] target/i386: Add VMX_SECONDARY_EXEC_MODE_BASED_EPT_EXEC Paolo Bonzini
2026-03-02  8:47 ` [PULL 090/102] target/i386: Add MSR_IA32_ARCH_CAPABILITIES ITS_NO Paolo Bonzini
2026-03-02  8:47 ` [PULL 091/102] target/i386: introduce SapphireRapids-v6 to expose ITS_NO Paolo Bonzini
2026-03-02  8:47 ` [PULL 092/102] target/i386: introduce GraniteRapids-v5 " Paolo Bonzini
2026-03-02  8:47 ` [PULL 093/102] target/i386: introduce SierraForest-v5 " Paolo Bonzini
2026-03-02  8:47 ` [PULL 094/102] target/i386: introduce ClearwaterForest-v3 " Paolo Bonzini
2026-03-02  8:47 ` [PULL 095/102] whpx: i386: move whpx_vcpu_kick_out_of_hlt() invocation to interrupt raise time Paolo Bonzini
2026-03-02  8:47 ` [PULL 096/102] whpx: i386: enable all supported host features Paolo Bonzini
2026-03-02  8:47 ` [PULL 097/102] whpx: i386: enable synthetic processor features Paolo Bonzini
2026-03-02  8:47 ` [PULL 098/102] whpx: i386: warn on unsupported MSR access instead of failing silently Paolo Bonzini
2026-03-02  8:47 ` [PULL 099/102] target/i386: emulate: more 64-bit register handling Paolo Bonzini
2026-03-02  8:47 ` [PULL 100/102] whpx: i386: enable PMU Paolo Bonzini
2026-03-02  8:47 ` [PULL 101/102] whpx: i386: expose HV_X64_MSR_APIC_FREQUENCY when kernel-irqchip=off Paolo Bonzini
2026-03-02  8:47 ` [PULL 102/102] target/i386: emulate: fix scas Paolo Bonzini
2026-03-02 14:01 ` [PULL 000/102] Mostly i386 patches for QEMU 11.0 soft freeze Peter Maydell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox