[PATCH 00/24] KVM: x86: fix various GHCB issues

Kernel KVM virtualization development
 help / color / mirror / Atom feed

* [PATCH 00/24] KVM: x86: fix various GHCB issues
@ 2026-05-29 18:35 Paolo Bonzini
  2026-05-29 18:35 ` [PATCH 01/24] KVM: SEV: Require in-GHCB scratch area if GHCB v2+ is in use Paolo Bonzini
                   ` (24 more replies)
  0 siblings, 25 replies; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth

The PSC code has a variety of bugs, several of which have to do with
not following the spec.  It took a while to disentagle and root cause
everything (the GHCB ain't exactly the easiest spec to decipher), but
here it is.  The first 10 patches have just been sent to Linus.

For the rest we decided to handle it on-list for 7.2, but it is possible
that a clever guest could abuse the bugs to cause the host to leak a
pile of pages (limited by how fast the VM can get recreated).

This can also be found in a sev-for-7.2 branch of kvm.git

Thanks,

Paolo, Sean, Mike, Tom

Michael Roth (1):
  KVM: SEV: Require in-GHCB scratch area if GHCB v2+ is in use

Sean Christopherson (23):
  KVM: SEV: Ignore MMIO requests of length '0'
  KVM: SEV: Reject MMIO requests larger than 8 bytes with GHCB v2+
  KVM: SEV: Ignore Port I/O requests of length '0'
  KVM: SEV: Use the size of the PSC header as the minimum size for PSC
    requests
  KVM: SEV: Compute the correct max length of the in-GHCB scratch area
  KVM: SEV: WARN if KVM attempts to setup scratch area with min_len==0
  KVM: SEV: Don't explicitly pass PSC buffer to snp_begin_psc()
  KVM: SEV: Check PSC request indices against the actual size of the
    buffer
  KVM: SEV: Use READ_ONCE() when reading entries/indices from PSC buffer
  KVM: SEV: Make it more obvious when KVM is writing back the current
    PSC index
  KVM: SEV: Add an anonymous "psc" struct to track current PSC metadata
  KVM: SEV: Read start/end indices of PSC requests exactly once per
    #VMGEXIT
  KVM: Don't WARN if memory is dirtied without a vCPU when the VM is
    dying
  KVM: SEV: Move sev_free_vcpu() down below sev_es_unmap_ghcb()
  KVM: SEV: Decouple the need to sync the GHCB SA from the need to free
    the SA
  KVM: SEV: Unmap and unpin the GHCB as needed on vCPU free
  KVM: SEV: Don't terminate SNP VMs on #VMGEXIT without a registered
    GHCB
  KVM: SEV: Move GHCB "usage" check out of sev_es_validate_vmgexit()
  KVM: SEV: Return INVALID_EVENT for SNP-only #VMGEXIT from non-SNP
    guest
  KVM: SEV: Return INVALID_INPUT, not MISSING_INPUT, for bad
    GUEST_REQUEST input(s)
  KVM: SEV: Handle unknown #VMGEXIT reasons in sev_handle_vmgexit()
  KVM: SEV: Turn sev_es_validate_vmgexit() into a dedicated predicate
  KVM: SEV: Remove sometimes-used function-scoped "ret" from #VMGEXIT
    handler

 arch/x86/kvm/svm/sev.c | 554 ++++++++++++++++++++++-------------------
 arch/x86/kvm/svm/svm.h |   9 +-
 virt/kvm/kvm_main.c    |   3 +-
 3 files changed, 302 insertions(+), 264 deletions(-)

-- 
2.54.0


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 01/24] KVM: SEV: Require in-GHCB scratch area if GHCB v2+ is in use
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 18:35 ` [PATCH 02/24] KVM: SEV: Ignore MMIO requests of length '0' Paolo Bonzini
                   ` (23 subsequent siblings)
  24 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: seanjc, Tom Lendacky, Michael Roth, Stan Shaw, Peter Gonda,
	Jacky Li, stable

From: Michael Roth <michael.roth@amd.com>

As per the GHCB spec, when using GHCB v2+ require the software scratch area
to reside in the GHCB's shared buffer.  Note, things like Page State Change
(PSC) requests _rely_ on this behavior, as the guest can't provide a length
when making the request, i.e. the size of the guest payload is bounded by
the size of the shared buffer.

Failure to force usage of the GHCB, and a slew of other flaws, lets a
malicious SNP guest corrupt host kernel heap memory, and leak host heap
layout information.

setup_vmgexit_scratch() allocates a buffer via kvzalloc(exit_info_2),
where exit_info_2 is guest-controlled. With exit_info_2=24, this yields
a 24-byte allocation in kmalloc-cg-32 (32-byte slab objects). The buffer
holds an 8-byte psc_hdr followed by 8-byte psc_entry structs, so only
entries[0] and entries[1] are in-bounds.

snp_begin_psc() validates end_entry against VMGEXIT_PSC_MAX_COUNT (253)
but NOT against the actual buffer size:

      idx_end = hdr->end_entry;

      if (idx_end >= VMGEXIT_PSC_MAX_COUNT) {   // checks 253, not buffer
          snp_complete_psc(svm, ...);
          return 1;
      }

      for (idx = idx_start; idx <= idx_end; idx++) {
          entry_start = entries[idx];           // OOB when idx >= 2

The guest sets end_entry=10+, causing the host to iterate entries[2+]
which are OOB into adjacent slab objects. For each OOB entry:

  - The host reads 8 bytes (OOB READ / info leak oracle)
  - If the data passes PSC validation, __snp_complete_one_psc() writes
    cur_page = 1 or 512 into the entry (OOB WRITE, sev.c:3806)
  - If validation fails, the error response reveals whether adjacent
    memory is zero vs non-zero (information disclosure to guest)

The guest controls allocation size (exit_info_2), entry range
(cur_entry/end_entry), and can fire unlimited VMGEXITs to repeatedly
hit different slab positions.

By exploiting the variety of bugs, a malicious SEV-SNP guest can:
    - OOB read adjacent kmalloc-cg-32 objects (heap layout disclosure)
    - OOB write cur_page bits into adjacent objects (heap corruption)
    - Trigger use-after-free conditions across VMGEXITs

E.g. with KASAN enabled, a single insmod of the PoC guest module
produces 73 KASAN reports:

    BUG: KASAN: slab-out-of-bounds in snp_begin_psc+0x126/0x890
    Read of size 8 at addr ffff888219ffb5e0 by task qemu-system-x86/2199

    BUG: KASAN: slab-out-of-bounds in snp_begin_psc+0x468/0x890
    Write of size 8 at addr ffff888351566648 by task qemu-system-x86/2199

    The buggy address belongs to the object at ffff888XXXXXXXXX
     which belongs to the cache kmalloc-cg-32 of size 32
    The buggy address is located N bytes to the right of
     allocated 32-byte region [ffff888XXXXXXXXX, ffff888XXXXXXXXX)

  Breakdown:
    62 slab-out-of-bounds (reads + writes past allocation)
     7 slab-use-after-free
     4 use-after-free

All credit to Stan for the wonderful description and reproducer!

Reported-by: Stan Shaw <shawstan96@gmail.com>
Cc: Michael Roth <michael.roth@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Peter Gonda <pgonda@google.com>
Cc: Jacky Li <jackyli@google.com>
Fixes: 4af663c2f64a ("KVM: SEV: Allow per-guest configuration of GHCB protocol version")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Roth <michael.roth@amd.com>
[sean: write changelog]
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index c2126b3c3072..23170b64f4a3 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3703,6 +3703,10 @@ static int setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 len)
 		scratch_va = (void *)svm->sev_es.ghcb;
 		scratch_va += (scratch_gpa_beg - control->ghcb_gpa);
 	} else {
+		/* GHCB v2 requires the scratch area to be within the GHCB. */
+		if (to_kvm_sev_info(svm->vcpu.kvm)->ghcb_version >= 2)
+			goto e_scratch;
+
 		/*
 		 * The guest memory must be read into a kernel buffer, so
 		 * limit the size
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 02/24] KVM: SEV: Ignore MMIO requests of length '0'
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
  2026-05-29 18:35 ` [PATCH 01/24] KVM: SEV: Require in-GHCB scratch area if GHCB v2+ is in use Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 19:29   ` sashiko-bot
  2026-05-29 18:35 ` [PATCH 03/24] KVM: SEV: Reject MMIO requests larger than 8 bytes with GHCB v2+ Paolo Bonzini
                   ` (22 subsequent siblings)
  24 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth, stable

From: Sean Christopherson <seanjc@google.com>

Explicitly ignore MMIO requests of length '0', so that setting up the
software scratch area (and other code) doesn't have to worry about
underflowing the length, and to allow for special casing '0' in the
future.

Fixes: 8f423a80d299 ("KVM: SVM: Support MMIO for an SEV-ES guest")
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 23170b64f4a3..fb2174b6d1ba 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4497,13 +4497,17 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 	case SVM_VMGEXIT_MMIO_READ:
 	case SVM_VMGEXIT_MMIO_WRITE: {
 		bool is_write = control->exit_code == SVM_VMGEXIT_MMIO_WRITE;
+		u64 len = control->exit_info_2;
 
-		ret = setup_vmgexit_scratch(svm, !is_write, control->exit_info_2);
+		if (!len)
+			return 1;
+
+		ret = setup_vmgexit_scratch(svm, !is_write, len);
 		if (ret)
 			break;
 
-		ret = kvm_sev_es_mmio(vcpu, is_write, control->exit_info_1,
-				      control->exit_info_2, svm->sev_es.ghcb_sa);
+		ret = kvm_sev_es_mmio(vcpu, is_write, control->exit_info_1, len,
+				      svm->sev_es.ghcb_sa);
 		break;
 	}
 	case SVM_VMGEXIT_NMI_COMPLETE:
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 03/24] KVM: SEV: Reject MMIO requests larger than 8 bytes with GHCB v2+
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
  2026-05-29 18:35 ` [PATCH 01/24] KVM: SEV: Require in-GHCB scratch area if GHCB v2+ is in use Paolo Bonzini
  2026-05-29 18:35 ` [PATCH 02/24] KVM: SEV: Ignore MMIO requests of length '0' Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 19:54   ` sashiko-bot
  2026-05-29 18:35 ` [PATCH 04/24] KVM: SEV: Ignore Port I/O requests of length '0' Paolo Bonzini
                   ` (21 subsequent siblings)
  24 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth, stable

From: Sean Christopherson <seanjc@google.com>

When using GHCB v2+, reject MMIO requests that are larger than 8 bytes.
Per the GHCB spec:

  SW_EXITINFO2 must be less than or equal to 0x7fffffff for version 1 and
  less than or equal to 0x8 for all other versions.

Fixes: 4af663c2f64a ("KVM: SEV: Allow per-guest configuration of GHCB protocol version")
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index fb2174b6d1ba..e6579ca9f364 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4502,6 +4502,11 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		if (!len)
 			return 1;
 
+		if (to_kvm_sev_info(vcpu->kvm)->ghcb_version >= 2 && len > 8) {
+			svm_vmgexit_bad_input(svm, GHCB_ERR_INVALID_INPUT);
+			return 1;
+		}
+
 		ret = setup_vmgexit_scratch(svm, !is_write, len);
 		if (ret)
 			break;
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 04/24] KVM: SEV: Ignore Port I/O requests of length '0'
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (2 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 03/24] KVM: SEV: Reject MMIO requests larger than 8 bytes with GHCB v2+ Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 18:35 ` [PATCH 05/24] KVM: SEV: Use the size of the PSC header as the minimum size for PSC requests Paolo Bonzini
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth, stable

From: Sean Christopherson <seanjc@google.com>

Explicitly ignore Port I/O requests of length '0' (or count '0'), so that
setting up the software scratch area (and other code) doesn't have to
worry about underflowing the length, and to allow for WARNing on trying
to configure the scratch area with len==0.

Fixes: 291bd20d5d88 ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT")
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index e6579ca9f364..52703c954856 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4585,6 +4585,11 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 			    control->exit_info_1, control->exit_info_2);
 		ret = -EINVAL;
 		break;
+	case SVM_EXIT_IOIO:
+		if (!((control->exit_info_1 & SVM_IOIO_SIZE_MASK) >> SVM_IOIO_SIZE_SHIFT))
+			return 1;
+
+		fallthrough;
 	default:
 		ret = svm_invoke_exit_handler(vcpu, control->exit_code);
 	}
@@ -4605,6 +4610,9 @@ int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in)
 	if (unlikely(check_mul_overflow(count, size, &bytes)))
 		return -EINVAL;
 
+	if (!bytes)
+		return 1;
+
 	r = setup_vmgexit_scratch(svm, in, bytes);
 	if (r)
 		return r;
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 05/24] KVM: SEV: Use the size of the PSC header as the minimum size for PSC requests
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (3 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 04/24] KVM: SEV: Ignore Port I/O requests of length '0' Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 20:49   ` sashiko-bot
  2026-05-29 18:35 ` [PATCH 06/24] KVM: SEV: Compute the correct max length of the in-GHCB scratch area Paolo Bonzini
                   ` (19 subsequent siblings)
  24 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth, stable

From: Sean Christopherson <seanjc@google.com>

When handling a Page State Change (PSC) #VMGEXIT use the size of the PSC
header as the minimum size for the scratch area.  Per the GHCB spec, PSC
requests do NOT provide the length, i.e. using control->exit_info_2 for the
length is completely made up behavior.  The existing code "works", e.g.
even though Linux-as-a-guest always passes '0', because KVM doesn't do
anything with the length when the request is in the GHCB's shared buffer.

Use the header as the min length.  Once the header is retrieved, KVM can
use the specified indices to compute the full size of the request.

Fixes: 9b54e248d264 ("KVM: SEV: Add support to handle Page State Change VMGEXIT")
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-6-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 52703c954856..cbb3040e0778 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4559,7 +4559,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		vcpu->run->system_event.data[0] = control->ghcb_gpa;
 		break;
 	case SVM_VMGEXIT_PSC:
-		ret = setup_vmgexit_scratch(svm, true, control->exit_info_2);
+		ret = setup_vmgexit_scratch(svm, true, sizeof(struct psc_hdr));
 		if (ret)
 			break;
 
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 06/24] KVM: SEV: Compute the correct max length of the in-GHCB scratch area
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (4 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 05/24] KVM: SEV: Use the size of the PSC header as the minimum size for PSC requests Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 18:35 ` [PATCH 07/24] KVM: SEV: WARN if KVM attempts to setup scratch area with min_len==0 Paolo Bonzini
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth, stable

From: Sean Christopherson <seanjc@google.com>

When setting the length of the GHCB scratch area, and the area is in the
GHCB shared buffer, set the effective length of the scratch area to the max
possible size given the start of the guest-provided pointer, and the end of
the shared buffer.

The code was "fine" when first introduced, as KVM doesn't consult the
length of the buffer when emulating MMIO, because the passed in @len always
specifies the *max* size required.  But for PSC requests, the incoming @len
is just the minimum length (to process the header), and KVM needs to know
the full size of the scratch area to avoid buffer overflows (spoiler alert).

Opportunistically rename @len => @min_len to better reflect its role.

Fixes: 9b54e248d264 ("KVM: SEV: Add support to handle Page State Change VMGEXIT")
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-7-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index cbb3040e0778..6072fecfe994 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3662,7 +3662,7 @@ int pre_sev_run(struct vcpu_svm *svm, int cpu)
 }
 
 #define GHCB_SCRATCH_AREA_LIMIT		(16ULL * PAGE_SIZE)
-static int setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 len)
+static int setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 min_len)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
 	u64 ghcb_scratch_beg, ghcb_scratch_end;
@@ -3675,10 +3675,10 @@ static int setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 len)
 		goto e_scratch;
 	}
 
-	scratch_gpa_end = scratch_gpa_beg + len;
+	scratch_gpa_end = scratch_gpa_beg + min_len;
 	if (scratch_gpa_end < scratch_gpa_beg) {
 		pr_err("vmgexit: scratch length (%#llx) not valid for scratch address (%#llx)\n",
-		       len, scratch_gpa_beg);
+		       min_len, scratch_gpa_beg);
 		goto e_scratch;
 	}
 
@@ -3702,6 +3702,8 @@ static int setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 len)
 
 		scratch_va = (void *)svm->sev_es.ghcb;
 		scratch_va += (scratch_gpa_beg - control->ghcb_gpa);
+
+		svm->sev_es.ghcb_sa_len = ghcb_scratch_end - scratch_gpa_beg;
 	} else {
 		/* GHCB v2 requires the scratch area to be within the GHCB. */
 		if (to_kvm_sev_info(svm->vcpu.kvm)->ghcb_version >= 2)
@@ -3711,16 +3713,16 @@ static int setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 len)
 		 * The guest memory must be read into a kernel buffer, so
 		 * limit the size
 		 */
-		if (len > GHCB_SCRATCH_AREA_LIMIT) {
+		if (min_len > GHCB_SCRATCH_AREA_LIMIT) {
 			pr_err("vmgexit: scratch area exceeds KVM limits (%#llx requested, %#llx limit)\n",
-			       len, GHCB_SCRATCH_AREA_LIMIT);
+			       min_len, GHCB_SCRATCH_AREA_LIMIT);
 			goto e_scratch;
 		}
-		scratch_va = kvzalloc(len, GFP_KERNEL_ACCOUNT);
+		scratch_va = kvzalloc(min_len, GFP_KERNEL_ACCOUNT);
 		if (!scratch_va)
 			return -ENOMEM;
 
-		if (kvm_read_guest(svm->vcpu.kvm, scratch_gpa_beg, scratch_va, len)) {
+		if (kvm_read_guest(svm->vcpu.kvm, scratch_gpa_beg, scratch_va, min_len)) {
 			/* Unable to copy scratch area from guest */
 			pr_err("vmgexit: kvm_read_guest for scratch area failed\n");
 
@@ -3736,11 +3738,10 @@ static int setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 len)
 		 */
 		svm->sev_es.ghcb_sa_sync = sync;
 		svm->sev_es.ghcb_sa_free = true;
+		svm->sev_es.ghcb_sa_len = min_len;
 	}
 
 	svm->sev_es.ghcb_sa = scratch_va;
-	svm->sev_es.ghcb_sa_len = len;
-
 	return 0;
 
 e_scratch:
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 07/24] KVM: SEV: WARN if KVM attempts to setup scratch area with min_len==0
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (5 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 06/24] KVM: SEV: Compute the correct max length of the in-GHCB scratch area Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 21:32   ` sashiko-bot
  2026-05-29 18:35 ` [PATCH 08/24] KVM: SEV: Don't explicitly pass PSC buffer to snp_begin_psc() Paolo Bonzini
                   ` (17 subsequent siblings)
  24 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth, stable

From: Sean Christopherson <seanjc@google.com>

Now that all paths in KVM properly validate the length needed for the
scratch area, and are guaranteed to pass in a non-zero length, WARN if KVM
attempts to configured the scratch area with min_len==0 to guard against
future bugs.

Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-8-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 6072fecfe994..a3e85348ace9 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3669,6 +3669,9 @@ static int setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 min_len)
 	u64 scratch_gpa_beg, scratch_gpa_end;
 	void *scratch_va;
 
+	if (WARN_ON_ONCE(!min_len))
+		goto e_scratch;
+
 	scratch_gpa_beg = svm->sev_es.sw_scratch;
 	if (!scratch_gpa_beg) {
 		pr_err("vmgexit: scratch gpa not provided\n");
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 08/24] KVM: SEV: Don't explicitly pass PSC buffer to snp_begin_psc()
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (6 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 07/24] KVM: SEV: WARN if KVM attempts to setup scratch area with min_len==0 Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 18:35 ` [PATCH 09/24] KVM: SEV: Check PSC request indices against the actual size of the buffer Paolo Bonzini
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth, stable

From: Sean Christopherson <seanjc@google.com>

Stop explicitly passing the PSC buffer to snp_begin_psc(): it *must*
be the scratch area.  This will allow fixing a variety of bugs without
further complicating the code.

No functional change intended.

Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-9-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index a3e85348ace9..8577451b82b2 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3841,7 +3841,7 @@ struct psc_buffer {
 	struct psc_entry entries[];
 } __packed;
 
-static int snp_begin_psc(struct vcpu_svm *svm, struct psc_buffer *psc);
+static int snp_begin_psc(struct vcpu_svm *svm);
 
 static void snp_complete_psc(struct vcpu_svm *svm, u64 psc_ret)
 {
@@ -3883,7 +3883,6 @@ static void __snp_complete_one_psc(struct vcpu_svm *svm)
 static int snp_complete_one_psc(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
-	struct psc_buffer *psc = svm->sev_es.ghcb_sa;
 
 	if (vcpu->run->hypercall.ret) {
 		snp_complete_psc(svm, VMGEXIT_PSC_ERROR_GENERIC);
@@ -3893,11 +3892,13 @@ static int snp_complete_one_psc(struct kvm_vcpu *vcpu)
 	__snp_complete_one_psc(svm);
 
 	/* Handle the next range (if any). */
-	return snp_begin_psc(svm, psc);
+	return snp_begin_psc(svm);
 }
 
-static int snp_begin_psc(struct vcpu_svm *svm, struct psc_buffer *psc)
+static int snp_begin_psc(struct vcpu_svm *svm)
 {
+	struct vcpu_sev_es_state *sev_es = &svm->sev_es;
+	struct psc_buffer *psc = sev_es->ghcb_sa;
 	struct psc_entry *entries = psc->entries;
 	struct kvm_vcpu *vcpu = &svm->vcpu;
 	struct psc_hdr *hdr = &psc->hdr;
@@ -4567,7 +4568,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		if (ret)
 			break;
 
-		ret = snp_begin_psc(svm, svm->sev_es.ghcb_sa);
+		ret = snp_begin_psc(svm);
 		break;
 	case SVM_VMGEXIT_AP_CREATION:
 		ret = sev_snp_ap_creation(svm);
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 09/24] KVM: SEV: Check PSC request indices against the actual size of the buffer
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (7 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 08/24] KVM: SEV: Don't explicitly pass PSC buffer to snp_begin_psc() Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 18:35 ` [PATCH 10/24] KVM: SEV: Use READ_ONCE() when reading entries/indices from PSC buffer Paolo Bonzini
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth, stable

From: Sean Christopherson <seanjc@google.com>

When processing Page State Change (PSC) requests, validate the PSC buffer
against the effective size of the scratch area, which could be less than
the maximum size if the guest provided a pointer that isn't exactly at the
start of the GHCB shared buffer.

Fixes: 9b54e248d264 ("KVM: SEV: Add support to handle Page State Change VMGEXIT")
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-10-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 8577451b82b2..6e8cbae2135a 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3903,7 +3903,7 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 	struct kvm_vcpu *vcpu = &svm->vcpu;
 	struct psc_hdr *hdr = &psc->hdr;
 	struct psc_entry entry_start;
-	u16 idx, idx_start, idx_end;
+	u16 idx, idx_start, idx_end, max_nr_entries;
 	int npages;
 	bool huge;
 	u64 gfn;
@@ -3913,6 +3913,19 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 		return 1;
 	}
 
+	/*
+	 * GHCB v2 requires the scratch area to reside within the GHCB itself,
+	 * and PSC requests are only supported for GHCB v2+.  Thus it should be
+	 * impossible to exceed the max PSC entry count (which is derived from
+	 * the size of the shared GHCB buffer).
+	 */
+	max_nr_entries = (sev_es->ghcb_sa_len - sizeof(struct psc_hdr)) /
+			 sizeof(struct psc_entry);
+	if (WARN_ON_ONCE(max_nr_entries > VMGEXIT_PSC_MAX_COUNT)) {
+		snp_complete_psc(svm, VMGEXIT_PSC_ERROR_GENERIC);
+		return 1;
+	}
+
 next_range:
 	/* There should be no other PSCs in-flight at this point. */
 	if (WARN_ON_ONCE(svm->sev_es.psc_inflight)) {
@@ -3928,7 +3941,7 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 	idx_start = hdr->cur_entry;
 	idx_end = hdr->end_entry;
 
-	if (idx_end >= VMGEXIT_PSC_MAX_COUNT) {
+	if (idx_end >= max_nr_entries) {
 		snp_complete_psc(svm, VMGEXIT_PSC_ERROR_INVALID_HDR);
 		return 1;
 	}
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 10/24] KVM: SEV: Use READ_ONCE() when reading entries/indices from PSC buffer
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (8 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 09/24] KVM: SEV: Check PSC request indices against the actual size of the buffer Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 22:28   ` sashiko-bot
  2026-05-29 18:35 ` [PATCH 11/24] KVM: SEV: Make it more obvious when KVM is writing back the current PSC index Paolo Bonzini
                   ` (14 subsequent siblings)
  24 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth, stable

From: Sean Christopherson <seanjc@google.com>

Use READ_ONCE() when reading entries/indices from the guest-accessible
Page State Change buffer to defend against TOCTOU bugs.

Don't bother with READ_ONCE()/WRITE_ONCE() for cases where KVM is writing
(and not consuming the result!), as the guest isn't supposed to touch the
buffer while it's being processed.  I.e. using READ_ONCE() is all about
protecting against misbehaving guests.

Fixes: 9b54e248d264 ("KVM: SEV: Add support to handle Page State Change VMGEXIT")
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-11-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 6e8cbae2135a..62b5befe0eed 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3872,9 +3872,9 @@ static void __snp_complete_one_psc(struct vcpu_svm *svm)
 	 */
 	for (idx = svm->sev_es.psc_idx; svm->sev_es.psc_inflight;
 	     svm->sev_es.psc_inflight--, idx++) {
-		struct psc_entry *entry = &entries[idx];
+		struct psc_entry entry = READ_ONCE(entries[idx]);
 
-		entry->cur_page = entry->pagesize ? 512 : 1;
+		entries[idx].cur_page = entry.pagesize ? 512 : 1;
 	}
 
 	hdr->cur_entry = idx;
@@ -3938,8 +3938,8 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 	 * validation, so take care to only use validated copies of values used
 	 * for things like array indexing.
 	 */
-	idx_start = hdr->cur_entry;
-	idx_end = hdr->end_entry;
+	idx_start = READ_ONCE(hdr->cur_entry);
+	idx_end = READ_ONCE(hdr->end_entry);
 
 	if (idx_end >= max_nr_entries) {
 		snp_complete_psc(svm, VMGEXIT_PSC_ERROR_INVALID_HDR);
@@ -3948,7 +3948,7 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 
 	/* Find the start of the next range which needs processing. */
 	for (idx = idx_start; idx <= idx_end; idx++, hdr->cur_entry++) {
-		entry_start = entries[idx];
+		entry_start = READ_ONCE(entries[idx]);
 
 		gfn = entry_start.gfn;
 		huge = entry_start.pagesize;
@@ -3992,7 +3992,7 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 	 * KVM_HC_MAP_GPA_RANGE exit.
 	 */
 	while (++idx <= idx_end) {
-		struct psc_entry entry = entries[idx];
+		struct psc_entry entry = READ_ONCE(entries[idx]);
 
 		if (entry.operation != entry_start.operation ||
 		    entry.gfn != entry_start.gfn + npages ||
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 11/24] KVM: SEV: Make it more obvious when KVM is writing back the current PSC index
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (9 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 10/24] KVM: SEV: Use READ_ONCE() when reading entries/indices from PSC buffer Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 23:21   ` sashiko-bot
  2026-05-29 18:35 ` [PATCH 12/24] KVM: SEV: Add an anonymous "psc" struct to track current PSC metadata Paolo Bonzini
                   ` (13 subsequent siblings)
  24 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth

From: Sean Christopherson <seanjc@google.com>

Increment the guest-visible "cur_entry" index outside of the for-loop
when processing Page State Change entries, and add a comment to make it
more obvious which code is operating on trusted data, and which code is
touching guest-accessible data.

No functional change intended.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-12-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 62b5befe0eed..1982d13e71d9 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3947,7 +3947,7 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 	}
 
 	/* Find the start of the next range which needs processing. */
-	for (idx = idx_start; idx <= idx_end; idx++, hdr->cur_entry++) {
+	for (idx = idx_start; idx <= idx_end; idx++) {
 		entry_start = READ_ONCE(entries[idx]);
 
 		gfn = entry_start.gfn;
@@ -3974,6 +3974,14 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 
 		if (npages)
 			break;
+
+		/*
+		 * Increment the guest-visible index to communicate the current
+		 * entry back to the guest, e.g. in case of failure.  No need
+		 * for READ_ONCE() as KVM doesn't consume the field, i.e. a
+		 * misbehaving guest can only break itself.
+		 */
+		hdr->cur_entry++;
 	}
 
 	if (idx > idx_end) {
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 12/24] KVM: SEV: Add an anonymous "psc" struct to track current PSC metadata
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (10 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 11/24] KVM: SEV: Make it more obvious when KVM is writing back the current PSC index Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-30  8:07   ` sashiko-bot
  2026-05-29 18:35 ` [PATCH 13/24] KVM: SEV: Read start/end indices of PSC requests exactly once per #VMGEXIT Paolo Bonzini
                   ` (12 subsequent siblings)
  24 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth

From: Sean Christopherson <seanjc@google.com>

Add a "psc" struct to vcpu_sev_es_state to avoid having to prefix all of
the fields with "psc_".

Take advantage of the code churn to opportunistically rename local
variables to "guest_psc" to make it more obvious that the buffer is guest
data, and more importantly, guest accessible!

Opportunistically rename inflight => batch_size as well, because there can
really only be one operation in-flight (per-vCPU), i.e. "inflight" _looks_
like a boolean, but in actuality is an integer tracking how many pages are
being handled by the current operation.

No functional change intended.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-13-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 43 +++++++++++++++++++-----------------------
 arch/x86/kvm/svm/svm.h |  8 +++++---
 2 files changed, 24 insertions(+), 27 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 1982d13e71d9..9f6543cebedf 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3845,9 +3845,7 @@ static int snp_begin_psc(struct vcpu_svm *svm);
 
 static void snp_complete_psc(struct vcpu_svm *svm, u64 psc_ret)
 {
-	svm->sev_es.psc_inflight = 0;
-	svm->sev_es.psc_idx = 0;
-	svm->sev_es.psc_2m = false;
+	memset(&svm->sev_es.psc, 0, sizeof(svm->sev_es.psc));
 
 	/*
 	 * PSC requests always get a "no action" response in SW_EXITINFO1, with
@@ -3860,9 +3858,8 @@ static void snp_complete_psc(struct vcpu_svm *svm, u64 psc_ret)
 
 static void __snp_complete_one_psc(struct vcpu_svm *svm)
 {
-	struct psc_buffer *psc = svm->sev_es.ghcb_sa;
-	struct psc_entry *entries = psc->entries;
-	struct psc_hdr *hdr = &psc->hdr;
+	struct vcpu_sev_es_state *sev_es = &svm->sev_es;
+	struct psc_buffer *guest_psc = sev_es->ghcb_sa;
 	__u16 idx;
 
 	/*
@@ -3870,14 +3867,14 @@ static void __snp_complete_one_psc(struct vcpu_svm *svm)
 	 * corresponding entries in the guest's PSC buffer and zero out the
 	 * count of in-flight PSC entries.
 	 */
-	for (idx = svm->sev_es.psc_idx; svm->sev_es.psc_inflight;
-	     svm->sev_es.psc_inflight--, idx++) {
-		struct psc_entry entry = READ_ONCE(entries[idx]);
+	for (idx = sev_es->psc.cur_idx; sev_es->psc.batch_size;
+	     sev_es->psc.batch_size--, idx++) {
+		struct psc_entry entry = READ_ONCE(guest_psc->entries[idx]);
 
-		entries[idx].cur_page = entry.pagesize ? 512 : 1;
+		guest_psc->entries[idx].cur_page = entry.pagesize ? 512 : 1;
 	}
 
-	hdr->cur_entry = idx;
+	guest_psc->hdr.cur_entry = idx;
 }
 
 static int snp_complete_one_psc(struct kvm_vcpu *vcpu)
@@ -3898,10 +3895,8 @@ static int snp_complete_one_psc(struct kvm_vcpu *vcpu)
 static int snp_begin_psc(struct vcpu_svm *svm)
 {
 	struct vcpu_sev_es_state *sev_es = &svm->sev_es;
-	struct psc_buffer *psc = sev_es->ghcb_sa;
-	struct psc_entry *entries = psc->entries;
+	struct psc_buffer *guest_psc = sev_es->ghcb_sa;
 	struct kvm_vcpu *vcpu = &svm->vcpu;
-	struct psc_hdr *hdr = &psc->hdr;
 	struct psc_entry entry_start;
 	u16 idx, idx_start, idx_end, max_nr_entries;
 	int npages;
@@ -3928,7 +3923,7 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 
 next_range:
 	/* There should be no other PSCs in-flight at this point. */
-	if (WARN_ON_ONCE(svm->sev_es.psc_inflight)) {
+	if (WARN_ON_ONCE(svm->sev_es.psc.batch_size)) {
 		snp_complete_psc(svm, VMGEXIT_PSC_ERROR_GENERIC);
 		return 1;
 	}
@@ -3938,8 +3933,8 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 	 * validation, so take care to only use validated copies of values used
 	 * for things like array indexing.
 	 */
-	idx_start = READ_ONCE(hdr->cur_entry);
-	idx_end = READ_ONCE(hdr->end_entry);
+	idx_start = READ_ONCE(guest_psc->hdr.cur_entry);
+	idx_end = READ_ONCE(guest_psc->hdr.end_entry);
 
 	if (idx_end >= max_nr_entries) {
 		snp_complete_psc(svm, VMGEXIT_PSC_ERROR_INVALID_HDR);
@@ -3948,7 +3943,7 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 
 	/* Find the start of the next range which needs processing. */
 	for (idx = idx_start; idx <= idx_end; idx++) {
-		entry_start = READ_ONCE(entries[idx]);
+		entry_start = READ_ONCE(guest_psc->entries[idx]);
 
 		gfn = entry_start.gfn;
 		huge = entry_start.pagesize;
@@ -3981,7 +3976,7 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 		 * for READ_ONCE() as KVM doesn't consume the field, i.e. a
 		 * misbehaving guest can only break itself.
 		 */
-		hdr->cur_entry++;
+		guest_psc->hdr.cur_entry++;
 	}
 
 	if (idx > idx_end) {
@@ -3990,9 +3985,9 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 		return 1;
 	}
 
-	svm->sev_es.psc_2m = huge;
-	svm->sev_es.psc_idx = idx;
-	svm->sev_es.psc_inflight = 1;
+	sev_es->psc.is_2m = huge;
+	sev_es->psc.cur_idx = idx;
+	sev_es->psc.batch_size = 1;
 
 	/*
 	 * Find all subsequent PSC entries that contain adjacent GPA
@@ -4000,14 +3995,14 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 	 * KVM_HC_MAP_GPA_RANGE exit.
 	 */
 	while (++idx <= idx_end) {
-		struct psc_entry entry = READ_ONCE(entries[idx]);
+		struct psc_entry entry = READ_ONCE(guest_psc->entries[idx]);
 
 		if (entry.operation != entry_start.operation ||
 		    entry.gfn != entry_start.gfn + npages ||
 		    entry.cur_page || !!entry.pagesize != huge)
 			break;
 
-		svm->sev_es.psc_inflight++;
+		sev_es->psc.batch_size++;
 		npages += huge ? 512 : 1;
 	}
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index a10668d17a16..06192bc9c107 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -257,9 +257,11 @@ struct vcpu_sev_es_state {
 	bool ghcb_sa_free;
 
 	/* SNP Page-State-Change buffer entries currently being processed */
-	u16 psc_idx;
-	u16 psc_inflight;
-	bool psc_2m;
+	struct {
+		u16 cur_idx;
+		u16 batch_size;
+		bool is_2m;
+	} psc;
 
 	u64 ghcb_registered_gpa;
 
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 13/24] KVM: SEV: Read start/end indices of PSC requests exactly once per #VMGEXIT
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (11 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 12/24] KVM: SEV: Add an anonymous "psc" struct to track current PSC metadata Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 18:35 ` [PATCH 14/24] KVM: Don't WARN if memory is dirtied without a vCPU when the VM is dying Paolo Bonzini
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth

From: Sean Christopherson <seanjc@google.com>

Rework Page State Change (PSC) handling to read the guest-provided start
and end indices exactly once, at the beginning of the request.  Re-reading
the indices is "fine", _if_ the guest is well-behaved.  KVM _should_ be
safe against concurrent guest modification of the indices, but there is
zero reason to introduce unnecessary risk.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-14-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 86 +++++++++++++++++++++++-------------------
 arch/x86/kvm/svm/svm.h |  1 +
 2 files changed, 49 insertions(+), 38 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 9f6543cebedf..4ebe0d449789 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3841,7 +3841,7 @@ struct psc_buffer {
 	struct psc_entry entries[];
 } __packed;
 
-static int snp_begin_psc(struct vcpu_svm *svm);
+static int snp_do_psc(struct vcpu_svm *svm);
 
 static void snp_complete_psc(struct vcpu_svm *svm, u64 psc_ret)
 {
@@ -3874,6 +3874,7 @@ static void __snp_complete_one_psc(struct vcpu_svm *svm)
 		guest_psc->entries[idx].cur_page = entry.pagesize ? 512 : 1;
 	}
 
+	sev_es->psc.cur_idx = idx;
 	guest_psc->hdr.cur_entry = idx;
 }
 
@@ -3889,37 +3890,19 @@ static int snp_complete_one_psc(struct kvm_vcpu *vcpu)
 	__snp_complete_one_psc(svm);
 
 	/* Handle the next range (if any). */
-	return snp_begin_psc(svm);
+	return snp_do_psc(svm);
 }
 
-static int snp_begin_psc(struct vcpu_svm *svm)
+static int snp_do_psc(struct vcpu_svm *svm)
 {
 	struct vcpu_sev_es_state *sev_es = &svm->sev_es;
 	struct psc_buffer *guest_psc = sev_es->ghcb_sa;
 	struct kvm_vcpu *vcpu = &svm->vcpu;
 	struct psc_entry entry_start;
-	u16 idx, idx_start, idx_end, max_nr_entries;
 	int npages;
 	bool huge;
 	u64 gfn;
-
-	if (!user_exit_on_hypercall(vcpu->kvm, KVM_HC_MAP_GPA_RANGE)) {
-		snp_complete_psc(svm, VMGEXIT_PSC_ERROR_GENERIC);
-		return 1;
-	}
-
-	/*
-	 * GHCB v2 requires the scratch area to reside within the GHCB itself,
-	 * and PSC requests are only supported for GHCB v2+.  Thus it should be
-	 * impossible to exceed the max PSC entry count (which is derived from
-	 * the size of the shared GHCB buffer).
-	 */
-	max_nr_entries = (sev_es->ghcb_sa_len - sizeof(struct psc_hdr)) /
-			 sizeof(struct psc_entry);
-	if (WARN_ON_ONCE(max_nr_entries > VMGEXIT_PSC_MAX_COUNT)) {
-		snp_complete_psc(svm, VMGEXIT_PSC_ERROR_GENERIC);
-		return 1;
-	}
+	u16 idx;
 
 next_range:
 	/* There should be no other PSCs in-flight at this point. */
@@ -3928,21 +3911,8 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 		return 1;
 	}
 
-	/*
-	 * The PSC descriptor buffer can be modified by a misbehaved guest after
-	 * validation, so take care to only use validated copies of values used
-	 * for things like array indexing.
-	 */
-	idx_start = READ_ONCE(guest_psc->hdr.cur_entry);
-	idx_end = READ_ONCE(guest_psc->hdr.end_entry);
-
-	if (idx_end >= max_nr_entries) {
-		snp_complete_psc(svm, VMGEXIT_PSC_ERROR_INVALID_HDR);
-		return 1;
-	}
-
 	/* Find the start of the next range which needs processing. */
-	for (idx = idx_start; idx <= idx_end; idx++) {
+	for (idx = sev_es->psc.cur_idx; idx <= sev_es->psc.end_idx; idx++) {
 		entry_start = READ_ONCE(guest_psc->entries[idx]);
 
 		gfn = entry_start.gfn;
@@ -3979,7 +3949,7 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 		guest_psc->hdr.cur_entry++;
 	}
 
-	if (idx > idx_end) {
+	if (idx > sev_es->psc.end_idx) {
 		/* Nothing more to process. */
 		snp_complete_psc(svm, 0);
 		return 1;
@@ -3994,7 +3964,7 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 	 * ranges/operations and can be combined into a single
 	 * KVM_HC_MAP_GPA_RANGE exit.
 	 */
-	while (++idx <= idx_end) {
+	while (++idx <= sev_es->psc.end_idx) {
 		struct psc_entry entry = READ_ONCE(guest_psc->entries[idx]);
 
 		if (entry.operation != entry_start.operation ||
@@ -4044,6 +4014,46 @@ static int snp_begin_psc(struct vcpu_svm *svm)
 	BUG();
 }
 
+static int snp_begin_psc(struct vcpu_svm *svm)
+{
+	struct vcpu_sev_es_state *sev_es = &svm->sev_es;
+	struct psc_buffer *guest_psc = sev_es->ghcb_sa;
+	u16 max_nr_entries;
+
+	if (!user_exit_on_hypercall(svm->vcpu.kvm, KVM_HC_MAP_GPA_RANGE)) {
+		snp_complete_psc(svm, VMGEXIT_PSC_ERROR_GENERIC);
+		return 1;
+	}
+
+	/*
+	 * GHCB v2 requires the scratch area to reside within the GHCB itself,
+	 * and PSC requests are only supported for GHCB v2+.  Thus it should be
+	 * impossible to exceed the max PSC entry count (which is derived from
+	 * the size of the shared GHCB buffer).
+	 */
+	max_nr_entries = (sev_es->ghcb_sa_len - sizeof(struct psc_hdr)) /
+			 sizeof(struct psc_entry);
+	if (WARN_ON_ONCE(max_nr_entries > VMGEXIT_PSC_MAX_COUNT)) {
+		snp_complete_psc(svm, VMGEXIT_PSC_ERROR_GENERIC);
+		return 1;
+	}
+
+	/*
+	 * The PSC descriptor buffer can be modified by a misbehaved guest after
+	 * validation, so take care to only use validated copies of values used
+	 * for things like array indexing.
+	 */
+	sev_es->psc.cur_idx = READ_ONCE(guest_psc->hdr.cur_entry);
+	sev_es->psc.end_idx = READ_ONCE(guest_psc->hdr.end_entry);
+
+	if (sev_es->psc.end_idx >= max_nr_entries) {
+		snp_complete_psc(svm, VMGEXIT_PSC_ERROR_INVALID_HDR);
+		return 1;
+	}
+
+	return snp_do_psc(svm);
+}
+
 /*
  * Invoked as part of svm_vcpu_reset() processing of an init event.
  */
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 06192bc9c107..5137416be593 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -259,6 +259,7 @@ struct vcpu_sev_es_state {
 	/* SNP Page-State-Change buffer entries currently being processed */
 	struct {
 		u16 cur_idx;
+		u16 end_idx;
 		u16 batch_size;
 		bool is_2m;
 	} psc;
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 14/24] KVM: Don't WARN if memory is dirtied without a vCPU when the VM is dying
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (12 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 13/24] KVM: SEV: Read start/end indices of PSC requests exactly once per #VMGEXIT Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 18:35 ` [PATCH 15/24] KVM: SEV: Move sev_free_vcpu() down below sev_es_unmap_ghcb() Paolo Bonzini
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth, stable

From: Sean Christopherson <seanjc@google.com>

When marking a page dirty, complain about not having a running/loaded vCPU
if and only if the VM is still alive, i.e. its refcount is non-zero.  This
will allow fixing a memory leak for x86 SEV-ES guests without hitting what
is effectively a false positive on the WARN.

For some SEV-ES VM-Exits, KVM keeps a writable mapping of a guest page
across an exit to userspace, and typically unmaps the page on the next
KVM_RUN.  But if userspace never calls KVM_RUN after such an exit, then KVM
needs to unmap the page when the vCPU is destroyed, which in turn triggers
the WARN about not having a running vCPU.

Alternatively, SEV-ES could temporarily load the vCPU to suppress the WARN,
as is done in nested_vmx_free_vcpu() (but for completely unrelated reasons;
suppressing WARN from nested_put_vmcs12_pages() is pure happenstance).  But
loading a vCPU during destruction is gross (ideally nVMX code would be
cleaned up), risks complicating the SEV-ES code (KVM would need to ensure
the temporarily load()+put() only runs when the vCPU isn't already loaded),
and is ultimately pointless.

The motivation for the WARN is to guard against KVM dirtying guest memory
without pushing the corresponding GFN to the active vCPU's dirty ring, e.g.
to ensure userspace doesn't miss a dirty page.  But for the VM's refcount
to reach zero, there can't be _any_ userspace mappings to the dirty ring,
as mapping the dirty ring requires doing mmap() on the vCPU FD.  I.e. if
userspace had a valid mapping for the dirty ring, then the vCPU file and
thus the owning VM would still be alive.  And so since userspace can't
possibly reach the dirty ring, whether or not KVM technically "misses" a
push to the dirty ring is irrelevant.

Reported-by: Michael Roth <michael.roth@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-15-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 virt/kvm/kvm_main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 89489996fbc1..881f92d7a469 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3520,7 +3520,8 @@ void mark_page_dirty_in_slot(struct kvm *kvm,
 	if (WARN_ON_ONCE(vcpu && vcpu->kvm != kvm))
 		return;

-	WARN_ON_ONCE(!vcpu && !kvm_arch_allow_write_without_running_vcpu(kvm));
+	WARN_ON_ONCE(!vcpu && refcount_read(&kvm->users_count) &&
+		     !kvm_arch_allow_write_without_running_vcpu(kvm));
 #endif

 	if (memslot && kvm_slot_dirty_track_enabled(memslot)) {
-- 
2.54.0

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 15/24] KVM: SEV: Move sev_free_vcpu() down below sev_es_unmap_ghcb()
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (13 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 14/24] KVM: Don't WARN if memory is dirtied without a vCPU when the VM is dying Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-30  8:36   ` sashiko-bot
  2026-05-29 18:35 ` [PATCH 16/24] KVM: SEV: Decouple the need to sync the GHCB SA from the need to free the SA Paolo Bonzini
                   ` (9 subsequent siblings)
  24 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth, stable

From: Sean Christopherson <seanjc@google.com>

Relocate sev_free_vcpu() down in sev.c so that it's definition comes after
sev_es_unmap_ghcb().  This will allow sharing unmap functionality between
the two functions without needing a forward declaration (or weird placement
of the common code).

No functional change intended.

Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-16-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 62 +++++++++++++++++++++---------------------
 1 file changed, 31 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 4ebe0d449789..437282f0ea94 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3313,37 +3313,6 @@ void sev_guest_memory_reclaimed(struct kvm *kvm)
 	sev_writeback_caches(kvm);
 }
 
-void sev_free_vcpu(struct kvm_vcpu *vcpu)
-{
-	struct vcpu_svm *svm;
-
-	if (!is_sev_es_guest(vcpu))
-		return;
-
-	svm = to_svm(vcpu);
-
-	/*
-	 * If it's an SNP guest, then the VMSA was marked in the RMP table as
-	 * a guest-owned page. Transition the page to hypervisor state before
-	 * releasing it back to the system.
-	 */
-	if (is_sev_snp_guest(vcpu)) {
-		u64 pfn = __pa(svm->sev_es.vmsa) >> PAGE_SHIFT;
-
-		if (kvm_rmp_make_shared(vcpu->kvm, pfn, PG_LEVEL_4K))
-			goto skip_vmsa_free;
-	}
-
-	if (vcpu->arch.guest_state_protected)
-		sev_flush_encrypted_page(vcpu, svm->sev_es.vmsa);
-
-	__free_page(virt_to_page(svm->sev_es.vmsa));
-
-skip_vmsa_free:
-	if (svm->sev_es.ghcb_sa_free)
-		kvfree(svm->sev_es.ghcb_sa);
-}
-
 static void dump_ghcb(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
@@ -3618,6 +3587,37 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm)
 	svm->sev_es.ghcb = NULL;
 }
 
+void sev_free_vcpu(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm;
+
+	if (!is_sev_es_guest(vcpu))
+		return;
+
+	svm = to_svm(vcpu);
+
+	/*
+	 * If it's an SNP guest, then the VMSA was marked in the RMP table as
+	 * a guest-owned page. Transition the page to hypervisor state before
+	 * releasing it back to the system.
+	 */
+	if (is_sev_snp_guest(vcpu)) {
+		u64 pfn = __pa(svm->sev_es.vmsa) >> PAGE_SHIFT;
+
+		if (kvm_rmp_make_shared(vcpu->kvm, pfn, PG_LEVEL_4K))
+			goto skip_vmsa_free;
+	}
+
+	if (vcpu->arch.guest_state_protected)
+		sev_flush_encrypted_page(vcpu, svm->sev_es.vmsa);
+
+	__free_page(virt_to_page(svm->sev_es.vmsa));
+
+skip_vmsa_free:
+	if (svm->sev_es.ghcb_sa_free)
+		kvfree(svm->sev_es.ghcb_sa);
+}
+
 int pre_sev_run(struct vcpu_svm *svm, int cpu)
 {
 	struct svm_cpu_data *sd = per_cpu_ptr(&svm_data, cpu);
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 16/24] KVM: SEV: Decouple the need to sync the GHCB SA from the need to free the SA
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (14 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 15/24] KVM: SEV: Move sev_free_vcpu() down below sev_es_unmap_ghcb() Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-30  8:50   ` sashiko-bot
  2026-05-29 18:35 ` [PATCH 17/24] KVM: SEV: Unmap and unpin the GHCB as needed on vCPU free Paolo Bonzini
                   ` (8 subsequent siblings)
  24 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth, stable

From: Sean Christopherson <seanjc@google.com>

Decouple synchronizing the GHCB SA from freeing/unpinning the SA, so that
the free/unpin path can be reused when freeing a vCPU.

Opportunistically add a WARN to harden KVM against stomping over (and thus
leaking) an already-allocated scratch area.

Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-17-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 437282f0ea94..11d46600cbdc 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3560,20 +3560,17 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm)
 	if (!svm->sev_es.ghcb)
 		return;
 
-	if (svm->sev_es.ghcb_sa_free) {
-		/*
-		 * The scratch area lives outside the GHCB, so there is a
-		 * buffer that, depending on the operation performed, may
-		 * need to be synced, then freed.
-		 */
-		if (svm->sev_es.ghcb_sa_sync) {
-			kvm_write_guest(svm->vcpu.kvm,
-					svm->sev_es.sw_scratch,
-					svm->sev_es.ghcb_sa,
-					svm->sev_es.ghcb_sa_len);
-			svm->sev_es.ghcb_sa_sync = false;
-		}
+	/*
+	 * If the scratch area lives outside the GHCB, there's a buffer that,
+	 * depending on the operation performed, may need to be synced.
+	 */
+	if (svm->sev_es.ghcb_sa_sync) {
+		kvm_write_guest(svm->vcpu.kvm, svm->sev_es.sw_scratch,
+				svm->sev_es.ghcb_sa, svm->sev_es.ghcb_sa_len);
+		svm->sev_es.ghcb_sa_sync = false;
+	}
 
+	if (svm->sev_es.ghcb_sa_free) {
 		kvfree(svm->sev_es.ghcb_sa);
 		svm->sev_es.ghcb_sa = NULL;
 		svm->sev_es.ghcb_sa_free = false;
@@ -3685,6 +3682,8 @@ static int setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 min_len)
 		goto e_scratch;
 	}
 
+	WARN_ON_ONCE(svm->sev_es.ghcb_sa_sync || svm->sev_es.ghcb_sa_free);
+
 	if ((scratch_gpa_beg & PAGE_MASK) == control->ghcb_gpa) {
 		/* Scratch area begins within GHCB */
 		ghcb_scratch_beg = control->ghcb_gpa +
@@ -3706,6 +3705,8 @@ static int setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 min_len)
 		scratch_va = (void *)svm->sev_es.ghcb;
 		scratch_va += (scratch_gpa_beg - control->ghcb_gpa);
 
+		svm->sev_es.ghcb_sa_sync = false;
+		svm->sev_es.ghcb_sa_free = false;
 		svm->sev_es.ghcb_sa_len = ghcb_scratch_end - scratch_gpa_beg;
 	} else {
 		/* GHCB v2 requires the scratch area to be within the GHCB. */
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 17/24] KVM: SEV: Unmap and unpin the GHCB as needed on vCPU free
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (15 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 16/24] KVM: SEV: Decouple the need to sync the GHCB SA from the need to free the SA Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-30  9:06   ` sashiko-bot
  2026-05-29 18:35 ` [PATCH 18/24] KVM: SEV: Don't terminate SNP VMs on #VMGEXIT without a registered GHCB Paolo Bonzini
                   ` (7 subsequent siblings)
  24 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth, stable

From: Sean Christopherson <seanjc@google.com>

Unmap and unpin the GHCB as needed when freeing a vCPU.  If the VM is
destroyed after mapping+pinning the GHCB on #VMGEXIT, without re-running
the vCPU, KVM will effectively leak the GHCB and any mappings created for
the GHCB.

Fixes: 291bd20d5d88 ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT")
Cc: stable@vger.kernel.org
Tested-by: Michael Roth <michael.roth@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-18-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 11d46600cbdc..6c6a6d663e29 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3552,6 +3552,20 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	return 1;
 }
 
+static void __sev_es_unmap_ghcb(struct vcpu_svm *svm)
+{
+	if (svm->sev_es.ghcb_sa_free) {
+		kvfree(svm->sev_es.ghcb_sa);
+		svm->sev_es.ghcb_sa = NULL;
+		svm->sev_es.ghcb_sa_free = false;
+	}
+
+	if (svm->sev_es.ghcb) {
+		kvm_vcpu_unmap(&svm->vcpu, &svm->sev_es.ghcb_map);
+		svm->sev_es.ghcb = NULL;
+	}
+}
+
 void sev_es_unmap_ghcb(struct vcpu_svm *svm)
 {
 	/* Clear any indication that the vCPU is in a type of AP Reset Hold */
@@ -3570,18 +3584,11 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm)
 		svm->sev_es.ghcb_sa_sync = false;
 	}
 
-	if (svm->sev_es.ghcb_sa_free) {
-		kvfree(svm->sev_es.ghcb_sa);
-		svm->sev_es.ghcb_sa = NULL;
-		svm->sev_es.ghcb_sa_free = false;
-	}
-
 	trace_kvm_vmgexit_exit(svm->vcpu.vcpu_id, svm->sev_es.ghcb);
 
 	sev_es_sync_to_ghcb(svm);
 
-	kvm_vcpu_unmap(&svm->vcpu, &svm->sev_es.ghcb_map);
-	svm->sev_es.ghcb = NULL;
+	__sev_es_unmap_ghcb(svm);
 }
 
 void sev_free_vcpu(struct kvm_vcpu *vcpu)
@@ -3611,8 +3618,7 @@ void sev_free_vcpu(struct kvm_vcpu *vcpu)
 	__free_page(virt_to_page(svm->sev_es.vmsa));
 
 skip_vmsa_free:
-	if (svm->sev_es.ghcb_sa_free)
-		kvfree(svm->sev_es.ghcb_sa);
+	__sev_es_unmap_ghcb(svm);
 }
 
 int pre_sev_run(struct vcpu_svm *svm, int cpu)
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 18/24] KVM: SEV: Don't terminate SNP VMs on #VMGEXIT without a registered GHCB
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (16 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 17/24] KVM: SEV: Unmap and unpin the GHCB as needed on vCPU free Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 18:35 ` [PATCH 19/24] KVM: SEV: Move GHCB "usage" check out of sev_es_validate_vmgexit() Paolo Bonzini
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth, stable

From: Sean Christopherson <seanjc@google.com>

If the guest attempts a non-MSR #VMGEXIT without the registered GHCB,
return a GHCB_HV_RESP_MALFORMED_INPUT+GHCB_ERR_NOT_REGISTERED error to the
guest instead of exiting KVM_RUN with -EINVAL (and in likelihood killing
the VM).  KVM has already mapped the requested GHCB, i.e. can cleanly
report an error, and so exiting with -EINVAL is completely unjustified.

Fixes: 0c76b1d08280 ("KVM: SEV: Add support to handle GHCB GPA register VMGEXIT")
Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-19-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 6c6a6d663e29..7c2ebc81306f 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4520,9 +4520,12 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 	sev_es_sync_from_ghcb(svm);
 
 	/* SEV-SNP guest requires that the GHCB GPA must be registered */
-	if (is_sev_snp_guest(vcpu) && !ghcb_gpa_is_registered(svm, ghcb_gpa)) {
-		vcpu_unimpl(&svm->vcpu, "vmgexit: GHCB GPA [%#llx] is not registered.\n", ghcb_gpa);
-		return -EINVAL;
+	if (is_sev_snp_guest(vcpu) &&
+	    !ghcb_gpa_is_registered(svm, control->ghcb_gpa)) {
+		vcpu_unimpl(vcpu, "vmgexit: GHCB GPA [%#llx] is not registered.\n",
+			    control->ghcb_gpa);
+		svm_vmgexit_bad_input(svm, GHCB_ERR_NOT_REGISTERED);
+		return 1;
 	}
 
 	ret = sev_es_validate_vmgexit(svm);
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 19/24] KVM: SEV: Move GHCB "usage" check out of sev_es_validate_vmgexit()
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (17 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 18/24] KVM: SEV: Don't terminate SNP VMs on #VMGEXIT without a registered GHCB Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 18:35 ` [PATCH 20/24] KVM: SEV: Return INVALID_EVENT for SNP-only #VMGEXIT from non-SNP guest Paolo Bonzini
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth

From: Sean Christopherson <seanjc@google.com>

Move the check to verify the guest's requested GHCB out of
sev_es_validate_vmgexit() as the first step towards making said helper a
predicate whose sole purpose is to verify the guest has marked required
GHCB fields as valid.

Using a single "validate" helper sounds good on paper, but in practice it's
difficult to verify that KVM is performing the necessary sanity checks (the
usage of state is far removed from the relevant checks), makes it difficult
to understand that "legacy" exits are simply routed to KVM's existing exit
handlers, and most importantly, has directly contributed to a number of
bugs as adding case-statements to the validation subtly removes them from
the default path that rejects unknown exit codes with INVALID_EVENT.

Deliberately extract the usage code check first so as to preserve the order
of KVM's checks, even though future code extraction will technically fix
bugs.

No functional change intended.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-20-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 7c2ebc81306f..880a2acd77bf 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3416,12 +3416,6 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	struct kvm_vcpu *vcpu = &svm->vcpu;
 	u64 reason;
 
-	/* Only GHCB Usage code 0 is supported */
-	if (svm->sev_es.ghcb->ghcb_usage) {
-		reason = GHCB_ERR_INVALID_USAGE;
-		goto vmgexit_err;
-	}
-
 	reason = GHCB_ERR_MISSING_INPUT;
 
 	if (!kvm_ghcb_sw_exit_code_is_valid(svm) ||
@@ -3534,10 +3528,7 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	 * Print the exit code even though it may not be marked valid as it
 	 * could help with debugging.
 	 */
-	if (reason == GHCB_ERR_INVALID_USAGE) {
-		vcpu_unimpl(vcpu, "vmgexit: ghcb usage %#x is not valid\n",
-			    svm->sev_es.ghcb->ghcb_usage);
-	} else if (reason == GHCB_ERR_INVALID_EVENT) {
+	if (reason == GHCB_ERR_INVALID_EVENT) {
 		vcpu_unimpl(vcpu, "vmgexit: exit code %#llx is not valid\n",
 			    control->exit_code);
 	} else {
@@ -4528,6 +4519,14 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		return 1;
 	}
 
+	/* Only GHCB Usage code 0 is supported */
+	if (svm->sev_es.ghcb->ghcb_usage) {
+		vcpu_unimpl(vcpu, "vmgexit: ghcb usage %#x is not valid\n",
+			    svm->sev_es.ghcb->ghcb_usage);
+		svm_vmgexit_bad_input(svm, GHCB_ERR_INVALID_USAGE);
+		return 1;
+	}
+
 	ret = sev_es_validate_vmgexit(svm);
 	if (ret)
 		return ret;
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 20/24] KVM: SEV: Return INVALID_EVENT for SNP-only #VMGEXIT from non-SNP guest
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (18 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 19/24] KVM: SEV: Move GHCB "usage" check out of sev_es_validate_vmgexit() Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-30  9:29   ` sashiko-bot
  2026-05-29 18:35 ` [PATCH 21/24] KVM: SEV: Return INVALID_INPUT, not MISSING_INPUT, for bad GUEST_REQUEST input(s) Paolo Bonzini
                   ` (4 subsequent siblings)
  24 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth

From: Sean Christopherson <seanjc@google.com>

Signal INVALID_EVENT, not MISSING_INPUT, if a non-SNP guest attempts to
invoke an SNP-only #VMGEXIT.  Opportunistically move the checks out of
sev_es_validate_vmgexit() to continue the march towards making said helper
a predicate whose sole purpose is to verify the guest has marked required
GHCB fields as valid.

Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event")
Fixes: 9b54e248d264 ("KVM: SEV: Add support to handle Page State Change VMGEXIT")
Fixes: 88caf544c930 ("KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event")
Fixes: 74458e4859d8 ("KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event")
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-21-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 880a2acd77bf..b59adddfdbcc 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3491,8 +3491,6 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 			goto vmgexit_err;
 		break;
 	case SVM_VMGEXIT_AP_CREATION:
-		if (!is_sev_snp_guest(vcpu))
-			goto vmgexit_err;
 		if (lower_32_bits(control->exit_info_1) != SVM_VMGEXIT_AP_DESTROY)
 			if (!kvm_ghcb_rax_is_valid(svm))
 				goto vmgexit_err;
@@ -3505,13 +3503,12 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	case SVM_VMGEXIT_TERM_REQUEST:
 		break;
 	case SVM_VMGEXIT_PSC:
-		if (!is_sev_snp_guest(vcpu) || !kvm_ghcb_sw_scratch_is_valid(svm))
+		if (!kvm_ghcb_sw_scratch_is_valid(svm))
 			goto vmgexit_err;
 		break;
 	case SVM_VMGEXIT_GUEST_REQUEST:
 	case SVM_VMGEXIT_EXT_GUEST_REQUEST:
-		if (!is_sev_snp_guest(vcpu) ||
-		    !PAGE_ALIGNED(control->exit_info_1) ||
+		if (!PAGE_ALIGNED(control->exit_info_1) ||
 		    !PAGE_ALIGNED(control->exit_info_2) ||
 		    control->exit_info_1 == control->exit_info_2)
 			goto vmgexit_err;
@@ -4476,6 +4473,19 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 	return 0;
 }
 
+static bool is_snp_only_vmgexit(u64 exit_code)
+{
+	switch (exit_code) {
+	case SVM_VMGEXIT_AP_CREATION:
+	case SVM_VMGEXIT_GUEST_REQUEST:
+	case SVM_VMGEXIT_EXT_GUEST_REQUEST:
+	case SVM_VMGEXIT_PSC:
+		return true;
+	default:
+		return false;
+	}
+}
+
 int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -4527,6 +4537,13 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		return 1;
 	}
 
+	if (is_snp_only_vmgexit(control->exit_code) && !is_sev_snp_guest(vcpu)) {
+		vcpu_unimpl(vcpu, "vmgexit: exit code %#llx is SNP-only\n",
+			    control->exit_code);
+		svm_vmgexit_bad_input(svm, GHCB_ERR_INVALID_EVENT);
+		return 1;
+	}
+
 	ret = sev_es_validate_vmgexit(svm);
 	if (ret)
 		return ret;
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 21/24] KVM: SEV: Return INVALID_INPUT, not MISSING_INPUT, for bad GUEST_REQUEST input(s)
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (19 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 20/24] KVM: SEV: Return INVALID_EVENT for SNP-only #VMGEXIT from non-SNP guest Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 18:35 ` [PATCH 22/24] KVM: SEV: Handle unknown #VMGEXIT reasons in sev_handle_vmgexit() Paolo Bonzini
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth

From: Sean Christopherson <seanjc@google.com>

Return INVALID_INPUT, not MISSING_INPUT, if the guest provides an unaligned
address for a GUEST_REQUEST, and/or attempts to use the same page for the
source and destination.  The inputs are obviously invalid, not missing.

Opportunistically move the checks out of sev_es_validate_vmgexit(), to
continue the march towards reducing the scope of the helper, and to help
guide future changes into correctly handling bad input.

Fixes: 88caf544c930 ("KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event")
Fixes: 74458e4859d8 ("KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event")
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-22-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index b59adddfdbcc..84421d9a116b 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3508,10 +3508,6 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 		break;
 	case SVM_VMGEXIT_GUEST_REQUEST:
 	case SVM_VMGEXIT_EXT_GUEST_REQUEST:
-		if (!PAGE_ALIGNED(control->exit_info_1) ||
-		    !PAGE_ALIGNED(control->exit_info_2) ||
-		    control->exit_info_1 == control->exit_info_2)
-			goto vmgexit_err;
 		break;
 	default:
 		reason = GHCB_ERR_INVALID_EVENT;
@@ -4631,10 +4627,20 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		ret = 1;
 		break;
 	case SVM_VMGEXIT_GUEST_REQUEST:
-		ret = snp_handle_guest_req(svm, control->exit_info_1, control->exit_info_2);
-		break;
 	case SVM_VMGEXIT_EXT_GUEST_REQUEST:
-		ret = snp_handle_ext_guest_req(svm, control->exit_info_1, control->exit_info_2);
+		if (!PAGE_ALIGNED(control->exit_info_1) ||
+		    !PAGE_ALIGNED(control->exit_info_2) ||
+		    control->exit_info_1 == control->exit_info_2) {
+			svm_vmgexit_bad_input(svm, GHCB_ERR_INVALID_INPUT);
+			return 1;
+		}
+
+		if (control->exit_code == SVM_VMGEXIT_GUEST_REQUEST)
+			ret = snp_handle_guest_req(svm, control->exit_info_1,
+						   control->exit_info_2);
+		else
+			ret = snp_handle_ext_guest_req(svm, control->exit_info_1,
+						       control->exit_info_2);
 		break;
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 		vcpu_unimpl(vcpu,
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 22/24] KVM: SEV: Handle unknown #VMGEXIT reasons in sev_handle_vmgexit()
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (20 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 21/24] KVM: SEV: Return INVALID_INPUT, not MISSING_INPUT, for bad GUEST_REQUEST input(s) Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 18:35 ` [PATCH 23/24] KVM: SEV: Turn sev_es_validate_vmgexit() into a dedicated predicate Paolo Bonzini
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth

From: Sean Christopherson <seanjc@google.com>

Handle unknown #VMGEXIT reasons in sev_handle_vmgexit(), not in
sev_es_validate_vmgexit().  This makes it _much_ more obvious that KVM
simply funnels "legacy" exits to the standard SVM interception handlers,
and is the final preparatory change needed to reduce the scope of
sev_es_validate_vmgexit().

No functional change intended.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-23-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 74 +++++++++++++++++++-----------------------
 1 file changed, 33 insertions(+), 41 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 84421d9a116b..864d6aea544b 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3414,9 +3414,6 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
 	struct kvm_vcpu *vcpu = &svm->vcpu;
-	u64 reason;
-
-	reason = GHCB_ERR_MISSING_INPUT;
 
 	if (!kvm_ghcb_sw_exit_code_is_valid(svm) ||
 	    !kvm_ghcb_sw_exit_info_1_is_valid(svm) ||
@@ -3424,14 +3421,10 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 		goto vmgexit_err;
 
 	switch (control->exit_code) {
-	case SVM_EXIT_READ_DR7:
-		break;
 	case SVM_EXIT_WRITE_DR7:
 		if (!kvm_ghcb_rax_is_valid(svm))
 			goto vmgexit_err;
 		break;
-	case SVM_EXIT_RDTSC:
-		break;
 	case SVM_EXIT_RDPMC:
 		if (!kvm_ghcb_rcx_is_valid(svm))
 			goto vmgexit_err;
@@ -3444,8 +3437,6 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 			if (!kvm_ghcb_xcr0_is_valid(svm))
 				goto vmgexit_err;
 		break;
-	case SVM_EXIT_INVD:
-		break;
 	case SVM_EXIT_IOIO:
 		if (control->exit_info_1 & SVM_IOIO_STR_MASK) {
 			if (!kvm_ghcb_sw_scratch_is_valid(svm))
@@ -3470,10 +3461,6 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 		    !kvm_ghcb_cpl_is_valid(svm))
 			goto vmgexit_err;
 		break;
-	case SVM_EXIT_RDTSCP:
-		break;
-	case SVM_EXIT_WBINVD:
-		break;
 	case SVM_EXIT_MONITOR:
 		if (!kvm_ghcb_rax_is_valid(svm) ||
 		    !kvm_ghcb_rcx_is_valid(svm) ||
@@ -3495,23 +3482,12 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 			if (!kvm_ghcb_rax_is_valid(svm))
 				goto vmgexit_err;
 		break;
-	case SVM_VMGEXIT_NMI_COMPLETE:
-	case SVM_VMGEXIT_AP_HLT_LOOP:
-	case SVM_VMGEXIT_AP_JUMP_TABLE:
-	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
-	case SVM_VMGEXIT_HV_FEATURES:
-	case SVM_VMGEXIT_TERM_REQUEST:
-		break;
 	case SVM_VMGEXIT_PSC:
 		if (!kvm_ghcb_sw_scratch_is_valid(svm))
 			goto vmgexit_err;
 		break;
-	case SVM_VMGEXIT_GUEST_REQUEST:
-	case SVM_VMGEXIT_EXT_GUEST_REQUEST:
-		break;
 	default:
-		reason = GHCB_ERR_INVALID_EVENT;
-		goto vmgexit_err;
+		break;
 	}
 
 	return 0;
@@ -3521,16 +3497,10 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	 * Print the exit code even though it may not be marked valid as it
 	 * could help with debugging.
 	 */
-	if (reason == GHCB_ERR_INVALID_EVENT) {
-		vcpu_unimpl(vcpu, "vmgexit: exit code %#llx is not valid\n",
-			    control->exit_code);
-	} else {
-		vcpu_unimpl(vcpu, "vmgexit: exit code %#llx input is not valid\n",
-			    control->exit_code);
-		dump_ghcb(svm);
-	}
-
-	svm_vmgexit_bad_input(svm, reason);
+	vcpu_unimpl(vcpu, "vmgexit: exit code %#llx input is not valid\n",
+		    control->exit_code);
+	dump_ghcb(svm);
+	svm_vmgexit_bad_input(svm, GHCB_ERR_MISSING_INPUT);
 
 	/* Resume the guest to "return" the error code. */
 	return 1;
@@ -4547,6 +4517,25 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 	svm_vmgexit_success(svm, 0);
 
 	switch (control->exit_code) {
+	case SVM_EXIT_IOIO:
+		if (!((control->exit_info_1 & SVM_IOIO_SIZE_MASK) >> SVM_IOIO_SIZE_SHIFT))
+			return 1;
+
+		fallthrough;
+	case SVM_EXIT_READ_DR7:
+	case SVM_EXIT_WRITE_DR7:
+	case SVM_EXIT_RDTSC:
+	case SVM_EXIT_RDTSCP:
+	case SVM_EXIT_RDPMC:
+	case SVM_EXIT_CPUID:
+	case SVM_EXIT_INVD:
+	case SVM_EXIT_MSR:
+	case SVM_EXIT_VMMCALL:
+	case SVM_EXIT_WBINVD:
+	case SVM_EXIT_MONITOR:
+	case SVM_EXIT_MWAIT:
+		ret = svm_invoke_exit_handler(vcpu, control->exit_code);
+		break;
 	case SVM_VMGEXIT_MMIO_READ:
 	case SVM_VMGEXIT_MMIO_WRITE: {
 		bool is_write = control->exit_code == SVM_VMGEXIT_MMIO_WRITE;
@@ -4643,18 +4632,21 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 						       control->exit_info_2);
 		break;
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
+		/*
+		 * Note, the _guest_ is reporting an unsupported #VC, i.e. this
+		 * isn't the same thing as KVM getting an unsupported #VMGEXIT.
+		 */
 		vcpu_unimpl(vcpu,
 			    "vmgexit: unsupported event - exit_info_1=%#llx, exit_info_2=%#llx\n",
 			    control->exit_info_1, control->exit_info_2);
 		ret = -EINVAL;
 		break;
-	case SVM_EXIT_IOIO:
-		if (!((control->exit_info_1 & SVM_IOIO_SIZE_MASK) >> SVM_IOIO_SIZE_SHIFT))
-			return 1;
-
-		fallthrough;
 	default:
-		ret = svm_invoke_exit_handler(vcpu, control->exit_code);
+		vcpu_unimpl(vcpu, "vmgexit: exit code %#llx is not valid\n",
+			    control->exit_code);
+		svm_vmgexit_bad_input(svm, GHCB_ERR_INVALID_EVENT);
+		ret = 1;
+		break;
 	}
 
 	return ret;
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 23/24] KVM: SEV: Turn sev_es_validate_vmgexit() into a dedicated predicate
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (21 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 22/24] KVM: SEV: Handle unknown #VMGEXIT reasons in sev_handle_vmgexit() Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-29 18:35 ` [PATCH 24/24] KVM: SEV: Remove sometimes-used function-scoped "ret" from #VMGEXIT handler Paolo Bonzini
  2026-05-30 16:27 ` [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
  24 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth

From: Sean Christopherson <seanjc@google.com>

Now that sev_es_validate_vmgexit() is only responsible for checking that
all required GHCB fields are marked valid, turn it into a predicate whose
name reflects exactly that.

No functional change intended.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-24-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 112 +++++++++++++++--------------------------
 1 file changed, 41 insertions(+), 71 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 864d6aea544b..bb70df2bf1a4 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3410,7 +3410,7 @@ static void sev_es_sync_from_ghcb(struct vcpu_svm *svm)
 	memset(ghcb->save.valid_bitmap, 0, sizeof(ghcb->save.valid_bitmap));
 }
 
-static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
+static bool sev_es_are_required_ghcb_fields_valid(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
 	struct kvm_vcpu *vcpu = &svm->vcpu;
@@ -3418,92 +3418,53 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	if (!kvm_ghcb_sw_exit_code_is_valid(svm) ||
 	    !kvm_ghcb_sw_exit_info_1_is_valid(svm) ||
 	    !kvm_ghcb_sw_exit_info_2_is_valid(svm))
-		goto vmgexit_err;
+		return false;
 
 	switch (control->exit_code) {
 	case SVM_EXIT_WRITE_DR7:
-		if (!kvm_ghcb_rax_is_valid(svm))
-			goto vmgexit_err;
-		break;
+		return kvm_ghcb_rax_is_valid(svm);
 	case SVM_EXIT_RDPMC:
-		if (!kvm_ghcb_rcx_is_valid(svm))
-			goto vmgexit_err;
-		break;
+		return kvm_ghcb_rcx_is_valid(svm);
 	case SVM_EXIT_CPUID:
 		if (!kvm_ghcb_rax_is_valid(svm) ||
 		    !kvm_ghcb_rcx_is_valid(svm))
-			goto vmgexit_err;
-		if (vcpu->arch.regs[VCPU_REGS_RAX] == 0xd)
-			if (!kvm_ghcb_xcr0_is_valid(svm))
-				goto vmgexit_err;
-		break;
+			return false;
+
+		return vcpu->arch.regs[VCPU_REGS_RAX] != 0xd ||
+		       kvm_ghcb_xcr0_is_valid(svm);
 	case SVM_EXIT_IOIO:
-		if (control->exit_info_1 & SVM_IOIO_STR_MASK) {
-			if (!kvm_ghcb_sw_scratch_is_valid(svm))
-				goto vmgexit_err;
-		} else {
-			if (!(control->exit_info_1 & SVM_IOIO_TYPE_MASK))
-				if (!kvm_ghcb_rax_is_valid(svm))
-					goto vmgexit_err;
-		}
-		break;
+		if (control->exit_info_1 & SVM_IOIO_STR_MASK)
+			return kvm_ghcb_sw_scratch_is_valid(svm);
+
+		if (!(control->exit_info_1 & SVM_IOIO_TYPE_MASK))
+			return kvm_ghcb_rax_is_valid(svm);
+
+		return true;
 	case SVM_EXIT_MSR:
 		if (!kvm_ghcb_rcx_is_valid(svm))
-			goto vmgexit_err;
-		if (control->exit_info_1) {
-			if (!kvm_ghcb_rax_is_valid(svm) ||
-			    !kvm_ghcb_rdx_is_valid(svm))
-				goto vmgexit_err;
-		}
-		break;
+			return false;
+
+		return !control->exit_info_1 ||
+		       (kvm_ghcb_rax_is_valid(svm) && kvm_ghcb_rdx_is_valid(svm));
 	case SVM_EXIT_VMMCALL:
-		if (!kvm_ghcb_rax_is_valid(svm) ||
-		    !kvm_ghcb_cpl_is_valid(svm))
-			goto vmgexit_err;
-		break;
+		return kvm_ghcb_rax_is_valid(svm) && kvm_ghcb_cpl_is_valid(svm);
 	case SVM_EXIT_MONITOR:
-		if (!kvm_ghcb_rax_is_valid(svm) ||
-		    !kvm_ghcb_rcx_is_valid(svm) ||
-		    !kvm_ghcb_rdx_is_valid(svm))
-			goto vmgexit_err;
-		break;
+		return kvm_ghcb_rax_is_valid(svm) &&
+		       kvm_ghcb_rcx_is_valid(svm) &&
+		       kvm_ghcb_rdx_is_valid(svm);
 	case SVM_EXIT_MWAIT:
-		if (!kvm_ghcb_rax_is_valid(svm) ||
-		    !kvm_ghcb_rcx_is_valid(svm))
-			goto vmgexit_err;
+		return kvm_ghcb_rax_is_valid(svm) && kvm_ghcb_rcx_is_valid(svm);
+	case SVM_VMGEXIT_AP_CREATION:
+		return kvm_ghcb_rax_is_valid(svm) ||
+		       lower_32_bits(control->exit_info_1) == SVM_VMGEXIT_AP_DESTROY;
 		break;
 	case SVM_VMGEXIT_MMIO_READ:
 	case SVM_VMGEXIT_MMIO_WRITE:
-		if (!kvm_ghcb_sw_scratch_is_valid(svm))
-			goto vmgexit_err;
-		break;
-	case SVM_VMGEXIT_AP_CREATION:
-		if (lower_32_bits(control->exit_info_1) != SVM_VMGEXIT_AP_DESTROY)
-			if (!kvm_ghcb_rax_is_valid(svm))
-				goto vmgexit_err;
-		break;
 	case SVM_VMGEXIT_PSC:
-		if (!kvm_ghcb_sw_scratch_is_valid(svm))
-			goto vmgexit_err;
-		break;
+		return kvm_ghcb_sw_scratch_is_valid(svm);
 	default:
-		break;
+		return true;
 	}
-
-	return 0;
-
-vmgexit_err:
-	/*
-	 * Print the exit code even though it may not be marked valid as it
-	 * could help with debugging.
-	 */
-	vcpu_unimpl(vcpu, "vmgexit: exit code %#llx input is not valid\n",
-		    control->exit_code);
-	dump_ghcb(svm);
-	svm_vmgexit_bad_input(svm, GHCB_ERR_MISSING_INPUT);
-
-	/* Resume the guest to "return" the error code. */
-	return 1;
 }
 
 static void __sev_es_unmap_ghcb(struct vcpu_svm *svm)
@@ -4510,9 +4471,17 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		return 1;
 	}
 
-	ret = sev_es_validate_vmgexit(svm);
-	if (ret)
-		return ret;
+	if (!sev_es_are_required_ghcb_fields_valid(svm)) {
+		/*
+		 * Print the exit code even though it may not be marked valid
+		 * as it could help with debugging.
+		 */
+		vcpu_unimpl(vcpu, "vmgexit: exit code %#llx input is not valid\n",
+			    control->exit_code);
+		dump_ghcb(svm);
+		svm_vmgexit_bad_input(svm, GHCB_ERR_MISSING_INPUT);
+		return 1;
+	}
 
 	svm_vmgexit_success(svm, 0);
 
@@ -4599,6 +4568,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SEV_TERM;
 		vcpu->run->system_event.ndata = 1;
 		vcpu->run->system_event.data[0] = control->ghcb_gpa;
+		ret = 0;
 		break;
 	case SVM_VMGEXIT_PSC:
 		ret = setup_vmgexit_scratch(svm, true, sizeof(struct psc_hdr));
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 24/24] KVM: SEV: Remove sometimes-used function-scoped "ret" from #VMGEXIT handler
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (22 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 23/24] KVM: SEV: Turn sev_es_validate_vmgexit() into a dedicated predicate Paolo Bonzini
@ 2026-05-29 18:35 ` Paolo Bonzini
  2026-05-30 16:27 ` [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
  24 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-29 18:35 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth

From: Sean Christopherson <seanjc@google.com>

Now that only two case-statements actually need a local "ret" variable,
refactor sev_handle_vmgexit() to have all flows return directly when
possible, and bury "ret" as "r" in the two paths that need to propagate a
return value from a helper.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-25-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c | 74 ++++++++++++++++++------------------------
 1 file changed, 31 insertions(+), 43 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index bb70df2bf1a4..bc9dd39778a1 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4418,7 +4418,6 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 	struct vcpu_svm *svm = to_svm(vcpu);
 	struct vmcb_control_area *control = &svm->vmcb->control;
 	u64 ghcb_gpa;
-	int ret;
 
 	/* Validate the GHCB */
 	ghcb_gpa = control->ghcb_gpa;
@@ -4503,12 +4502,12 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 	case SVM_EXIT_WBINVD:
 	case SVM_EXIT_MONITOR:
 	case SVM_EXIT_MWAIT:
-		ret = svm_invoke_exit_handler(vcpu, control->exit_code);
-		break;
+		return svm_invoke_exit_handler(vcpu, control->exit_code);
 	case SVM_VMGEXIT_MMIO_READ:
 	case SVM_VMGEXIT_MMIO_WRITE: {
 		bool is_write = control->exit_code == SVM_VMGEXIT_MMIO_WRITE;
 		u64 len = control->exit_info_2;
+		int r;
 
 		if (!len)
 			return 1;
@@ -4518,24 +4517,21 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 			return 1;
 		}
 
-		ret = setup_vmgexit_scratch(svm, !is_write, len);
-		if (ret)
-			break;
+		r = setup_vmgexit_scratch(svm, !is_write, len);
+		if (r)
+			return r;
 
-		ret = kvm_sev_es_mmio(vcpu, is_write, control->exit_info_1, len,
-				      svm->sev_es.ghcb_sa);
-		break;
+		return kvm_sev_es_mmio(vcpu, is_write, control->exit_info_1, len,
+				       svm->sev_es.ghcb_sa);
 	}
 	case SVM_VMGEXIT_NMI_COMPLETE:
 		++vcpu->stat.nmi_window_exits;
 		svm->nmi_masked = false;
 		kvm_make_request(KVM_REQ_EVENT, vcpu);
-		ret = 1;
-		break;
+		return 1;
 	case SVM_VMGEXIT_AP_HLT_LOOP:
 		svm->sev_es.ap_reset_hold_type = AP_RESET_HOLD_NAE_EVENT;
-		ret = kvm_emulate_ap_reset_hold(vcpu);
-		break;
+		return kvm_emulate_ap_reset_hold(vcpu);
 	case SVM_VMGEXIT_AP_JUMP_TABLE: {
 		struct kvm_sev_info *sev = to_kvm_sev_info(vcpu->kvm);
 
@@ -4553,14 +4549,11 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 			       control->exit_info_1);
 			svm_vmgexit_bad_input(svm, GHCB_ERR_INVALID_INPUT);
 		}
-
-		ret = 1;
-		break;
+		return 1;
 	}
 	case SVM_VMGEXIT_HV_FEATURES:
 		svm_vmgexit_success(svm, GHCB_HV_FT_SUPPORTED);
-		ret = 1;
-		break;
+		return 1;
 	case SVM_VMGEXIT_TERM_REQUEST:
 		pr_info("SEV-ES guest requested termination: reason %#llx info %#llx\n",
 			control->exit_info_1, control->exit_info_2);
@@ -4568,23 +4561,20 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SEV_TERM;
 		vcpu->run->system_event.ndata = 1;
 		vcpu->run->system_event.data[0] = control->ghcb_gpa;
-		ret = 0;
-		break;
-	case SVM_VMGEXIT_PSC:
-		ret = setup_vmgexit_scratch(svm, true, sizeof(struct psc_hdr));
-		if (ret)
-			break;
+		return 0;
+	case SVM_VMGEXIT_PSC: {
+		int r;
 
-		ret = snp_begin_psc(svm);
-		break;
+		r = setup_vmgexit_scratch(svm, true, sizeof(struct psc_hdr));
+		if (r)
+			return r;
+
+		return snp_begin_psc(svm);
+	}
 	case SVM_VMGEXIT_AP_CREATION:
-		ret = sev_snp_ap_creation(svm);
-		if (ret) {
+		if (sev_snp_ap_creation(svm))
 			svm_vmgexit_bad_input(svm, GHCB_ERR_INVALID_INPUT);
-		}
-
-		ret = 1;
-		break;
+		return 1;
 	case SVM_VMGEXIT_GUEST_REQUEST:
 	case SVM_VMGEXIT_EXT_GUEST_REQUEST:
 		if (!PAGE_ALIGNED(control->exit_info_1) ||
@@ -4595,12 +4585,11 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		}
 
 		if (control->exit_code == SVM_VMGEXIT_GUEST_REQUEST)
-			ret = snp_handle_guest_req(svm, control->exit_info_1,
-						   control->exit_info_2);
-		else
-			ret = snp_handle_ext_guest_req(svm, control->exit_info_1,
-						       control->exit_info_2);
-		break;
+			return snp_handle_guest_req(svm, control->exit_info_1,
+						    control->exit_info_2);
+
+		return snp_handle_ext_guest_req(svm, control->exit_info_1,
+						control->exit_info_2);
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 		/*
 		 * Note, the _guest_ is reporting an unsupported #VC, i.e. this
@@ -4609,17 +4598,16 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		vcpu_unimpl(vcpu,
 			    "vmgexit: unsupported event - exit_info_1=%#llx, exit_info_2=%#llx\n",
 			    control->exit_info_1, control->exit_info_2);
-		ret = -EINVAL;
-		break;
+		return -EINVAL;
 	default:
 		vcpu_unimpl(vcpu, "vmgexit: exit code %#llx is not valid\n",
 			    control->exit_code);
 		svm_vmgexit_bad_input(svm, GHCB_ERR_INVALID_EVENT);
-		ret = 1;
-		break;
+		return 1;
 	}
 
-	return ret;
+	KVM_BUG_ON(1, vcpu->kvm);
+	return -EIO;
 }
 
 int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in)
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH 02/24] KVM: SEV: Ignore MMIO requests of length '0'
  2026-05-29 18:35 ` [PATCH 02/24] KVM: SEV: Ignore MMIO requests of length '0' Paolo Bonzini
@ 2026-05-29 19:29   ` sashiko-bot
  0 siblings, 0 replies; 42+ messages in thread
From: sashiko-bot @ 2026-05-29 19:29 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] A malicious SEV-ES guest can flood the host kernel log and cause a Denial of Service by repeatedly triggering un-ratelimited `pr_err()` prints via invalid GHCB scratch configurations.
--

commit 294af95cb303ec47ddd0734af250cb6201c6779d
Author: Sean Christopherson <seanjc@google.com>

KVM: SEV: Ignore MMIO requests of length '0'

This commit explicitly ignores MMIO requests of length 0. This prevents
underflowing the length when setting up the software scratch area and
allows for special casing 0 in the future.

> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 23170b64f4a33..fb2174b6d1ba4 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -4497,13 +4497,17 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
>  	case SVM_VMGEXIT_MMIO_READ:
>  	case SVM_VMGEXIT_MMIO_WRITE: {
>  		bool is_write = control->exit_code == SVM_VMGEXIT_MMIO_WRITE;
> +		u64 len = control->exit_info_2;
>  
> -		ret = setup_vmgexit_scratch(svm, !is_write, control->exit_info_2);
> +		if (!len)
> +			return 1;
> +
> +		ret = setup_vmgexit_scratch(svm, !is_write, len);

[Severity: High]
This isn't a bug introduced by this patch, but could this lead to a host
denial of service through printk flooding?

Looking at setup_vmgexit_scratch() in arch/x86/kvm/svm/sev.c, there are
several validation checks on guest-controlled values that use pr_err()
without ratelimiting:

arch/x86/kvm/svm/sev.c:setup_vmgexit_scratch() {
    ...
	scratch_gpa_beg = svm->sev_es.sw_scratch;
	if (!scratch_gpa_beg) {
		pr_err("vmgexit: scratch gpa not provided\n");
		goto e_scratch;
	}

	scratch_gpa_end = scratch_gpa_beg + len;
	if (scratch_gpa_end < scratch_gpa_beg) {
		pr_err("vmgexit: scratch length (%#llx) not valid for scratch address (%#llx)\n",
		       len, scratch_gpa_beg);
		goto e_scratch;
	}
    ...
}

Since these failure paths gracefully return to the guest with an error code
rather than terminating the VM, an unprivileged guest can trigger these
paths in a tight loop.

For instance, by repeatedly setting sw_scratch to 0 and issuing an MMIO
write request, the guest can flood the host kernel log.

Should these messages be converted to pr_err_ratelimited() or removed
entirely to prevent guest-triggered spam?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529183549.1104619-1-pbonzini@redhat.com?part=2

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 03/24] KVM: SEV: Reject MMIO requests larger than 8 bytes with GHCB v2+
  2026-05-29 18:35 ` [PATCH 03/24] KVM: SEV: Reject MMIO requests larger than 8 bytes with GHCB v2+ Paolo Bonzini
@ 2026-05-29 19:54   ` sashiko-bot
  0 siblings, 0 replies; 42+ messages in thread
From: sashiko-bot @ 2026-05-29 19:54 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] Guest-reachable `pr_err` calls without ratelimiting in `setup_vmgexit_scratch` allow a malicious SEV-ES guest to flood the host kernel log.
--

commit e862165e67c650ff667c57363d24f101d03cf5e9
Author: Sean Christopherson <seanjc@google.com>

KVM: SEV: Reject MMIO requests larger than 8 bytes with GHCB v2+

This commit updates KVM to reject MMIO requests larger than 8 bytes when
using GHCB version 2 or later, as required by the GHCB specification.

> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -4502,6 +4502,11 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
>  		if (!len)
>  			return 1;
>  
> +		if (to_kvm_sev_info(vcpu->kvm)->ghcb_version >= 2 && len > 8) {
> +			svm_vmgexit_bad_input(svm, GHCB_ERR_INVALID_INPUT);
> +			return 1;
> +		}
> +
>  		ret = setup_vmgexit_scratch(svm, !is_write, len);

[Severity: High]
This isn't a bug introduced by this patch, but looking at
setup_vmgexit_scratch() called here:

arch/x86/kvm/svm/sev.c:setup_vmgexit_scratch() {
	scratch_gpa_beg = svm->sev_es.sw_scratch;
	if (!scratch_gpa_beg) {
		pr_err("vmgexit: scratch gpa not provided\n");
		goto e_scratch;
	}
...

Could a malicious SEV-ES guest flood the host kernel log by repeatedly
triggering this error path?

If the guest continuously passes invalid scratch parameters, the function
immediately returns 1 and resumes the guest without exiting to userspace.
Does this allow an infinite loop of un-ratelimited pr_err() calls that
could stall the host CPU and cause console lockups?

>  		if (ret)
>  			break;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529183549.1104619-1-pbonzini@redhat.com?part=3

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 05/24] KVM: SEV: Use the size of the PSC header as the minimum size for PSC requests
  2026-05-29 18:35 ` [PATCH 05/24] KVM: SEV: Use the size of the PSC header as the minimum size for PSC requests Paolo Bonzini
@ 2026-05-29 20:49   ` sashiko-bot
  0 siblings, 0 replies; 42+ messages in thread
From: sashiko-bot @ 2026-05-29 20:49 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

Pre-existing issues:
- [Critical] Insufficient bounds checking for Page State Change (PSC) requests allows out-of-bounds host memory reads and writes past the mapped 4KB GHCB page.
- [High] The GHCB mapping pinned via kvm_vcpu_map() on #VMGEXIT is permanently leaked if the vCPU is destroyed before the next VMRUN.
--

commit 2f8a93115345b3dcacdee9bdd3b3bd2f9154bfc6
Author: Sean Christopherson <seanjc@google.com>

KVM: SEV: Use the size of the PSC header as the minimum size for PSC requests

This commit updates the Page State Change VMGEXIT handler to use the size
of the PSC header as the minimum length for the scratch area. It stops
using exit_info_2 for the length, since the GHCB specification does not
provide length for PSC requests.

Cc: stable@vger.kernel.org

> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 52703c9548568..cbb3040e07782 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -4559,7 +4559,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
>  		vcpu->run->system_event.data[0] = control->ghcb_gpa;
>  		break;
>  	case SVM_VMGEXIT_PSC:
> -		ret = setup_vmgexit_scratch(svm, true, control->exit_info_2);
> +		ret = setup_vmgexit_scratch(svm, true, sizeof(struct psc_hdr));

[Severity: Critical]
This isn't a bug introduced by this patch, but by hardcoding the bounds check
to just the header size, does this allow out-of-bounds host memory reads and
writes past the mapped GHCB page?

If a malicious guest places the scratch area near the end of the GHCB shared
buffer, it can satisfy the bounds check for the 8-byte header in
setup_vmgexit_scratch():

setup_vmgexit_scratch() {
        ...
        scratch_gpa_end = scratch_gpa_beg + len;
        ...
        if (scratch_gpa_beg < ghcb_scratch_beg ||
            scratch_gpa_end > ghcb_scratch_end) {
            ...
}

However, later processing in snp_begin_psc() and __snp_complete_one_psc()
iterates up to VMGEXIT_PSC_MAX_COUNT, directly accessing and writing to
entries[idx]. Since the GHCB is a single page mapped via kvm_vcpu_map(),
accessing entries beyond the validated 8 bytes could result in accessing
host physical memory beyond the page boundary.

I see this is fixed later in the series by commit
89266ec65feabeba5d0f747da7115a19492fab4e and commit
4dd1fe3ae23c326b447d893657435197aa346458, which compute and check the correct
max length of the scratch area.

[Severity: High]
This is a pre-existing issue, but could the GHCB mapping pinned via
kvm_vcpu_map() earlier in sev_handle_vmgexit() be permanently leaked?

sev_handle_vmgexit() pins the GHCB page:

sev_handle_vmgexit() {
        ...
        if (kvm_vcpu_map(vcpu, ghcb_gpa >> PAGE_SHIFT, &svm->sev_es.ghcb_map)) {
        ...
}

Normally, this is unmapped on the next VMRUN in sev_es_unmap_ghcb(). However,
if the vCPU exits to userspace and the VM is subsequently destroyed without
running the vCPU again, the VCPU teardown path in sev_free_vcpu() does not
appear to unmap the GHCB. Would this result in a leaked pinned page and memory
mapping for every such destroyed vCPU?

I note this is addressed later in the series by commit
b2c42b3d82becdc801cc7f669caa3dbec0464c89.

>  		if (ret)
>  			break;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529183549.1104619-1-pbonzini@redhat.com?part=5

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 07/24] KVM: SEV: WARN if KVM attempts to setup scratch area with min_len==0
  2026-05-29 18:35 ` [PATCH 07/24] KVM: SEV: WARN if KVM attempts to setup scratch area with min_len==0 Paolo Bonzini
@ 2026-05-29 21:32   ` sashiko-bot
  0 siblings, 0 replies; 42+ messages in thread
From: sashiko-bot @ 2026-05-29 21:32 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] Guest-triggerable non-ratelimited `pr_err` messages can be spammed to flood the host logs, leading to a host-side DoS.
--

commit f2dbe3638336199969535efb957ca83d946dd1f5
Author: Sean Christopherson <seanjc@google.com>

KVM: SEV: WARN if KVM attempts to setup scratch area with min_len==0

This commit adds a warning when KVM attempts to configure the scratch area 
with a minimum length of zero. This guards against future bugs, as all
paths are now expected to provide a valid, non-zero length.

Link: https://lkml.kernel.org/r/20260501202250.2115252-8-seanjc@google.com

> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3669,6 +3669,9 @@ static int setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 min_len)
>  	u64 scratch_gpa_beg, scratch_gpa_end;
>  	void *scratch_va;
>  
> +	if (WARN_ON_ONCE(!min_len))
> +		goto e_scratch;
> +
>  	scratch_gpa_beg = svm->sev_es.sw_scratch;
>  	if (!scratch_gpa_beg) {
>  		pr_err("vmgexit: scratch gpa not provided\n");

[Severity: High]
This is a pre-existing issue, but could a malicious SEV-ES guest intentionally
provide a malformed scratch GPA and repeatedly issue VMGEXIT requests in a
tight loop?

Since this pr_err in setup_vmgexit_scratch is not rate-limited, the continuous
VM exits might flood the host kernel log. Would it be safer to convert
guest-triggerable log messages like this one to use vcpu_unimpl or
pr_err_ratelimited to prevent host-side log spam?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529183549.1104619-1-pbonzini@redhat.com?part=7

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 10/24] KVM: SEV: Use READ_ONCE() when reading entries/indices from PSC buffer
  2026-05-29 18:35 ` [PATCH 10/24] KVM: SEV: Use READ_ONCE() when reading entries/indices from PSC buffer Paolo Bonzini
@ 2026-05-29 22:28   ` sashiko-bot
  0 siblings, 0 replies; 42+ messages in thread
From: sashiko-bot @ 2026-05-29 22:28 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] KVM passes KVM_MAP_GPA_RANGE_PAGE_SZ_2M to userspace for partially completed 2M pages that are being forced to 4K handling.
--

commit d195b85e8b040d5bc29811a11645551c672d8a47
Author: Sean Christopherson <seanjc@google.com>

KVM: SEV: Use READ_ONCE() when reading entries/indices from PSC buffer

This patch uses READ_ONCE() when accessing guest-accessible Page State Change
buffer entries to protect against TOCTOU vulnerabilities. It skips these
macros for code paths where KVM exclusively writes without consuming the
results.

> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
[ ... ]
> @@ -3948,7 +3948,7 @@ static int snp_begin_psc(struct vcpu_svm *svm)
>  
>  	/* Find the start of the next range which needs processing. */
>  	for (idx = idx_start; idx <= idx_end; idx++, hdr->cur_entry++) {
> -		entry_start = entries[idx];
> +		entry_start = READ_ONCE(entries[idx]);
>  
>  		gfn = entry_start.gfn;
>  		huge = entry_start.pagesize;
>  		npages = huge ? 512 : 1;
>  
>  		if (entry_start.cur_page > npages || !IS_ALIGNED(gfn, npages)) {
>  			snp_complete_psc(svm, VMGEXIT_PSC_ERROR_INVALID_ENTRY);
>  			return 1;
>  		}
>  
>  		if (entry_start.cur_page) {
>  			/*
>  			 * If this is a partially-completed 2M range, force 4K handling
>  			 * for the remaining pages since they're effectively split at
>  			 * this point. Subsequent code should ensure this doesn't get
>  			 * combined with adjacent PSC entries where 2M handling is still
>  			 * possible.
>  			 */
>  			npages -= entry_start.cur_page;
>  			gfn += entry_start.cur_page;
>  			huge = false;
>  		}

[Severity: High]
This is a pre-existing issue, but does this code violate the hypercall ABI by
using the original entry_start.pagesize when populating the hypercall
arguments later in the function?

When a 2M Page State Change request is partially completed, the code above
correctly forces it to be handled as 4K pages by setting huge = false and
adjusting npages and gfn.

However, further down in snp_begin_psc(), the hypercall arguments are
populated using the unmodified entry_start.pagesize:

arch/x86/kvm/svm/sev.c:snp_begin_psc() {
    ...
    vcpu->run->hypercall.args[2] |= entry_start.pagesize
                                    ? KVM_MAP_GPA_RANGE_PAGE_SZ_2M
                                    : KVM_MAP_GPA_RANGE_PAGE_SZ_4K;
    ...
}

Will this cause KVM to pass KVM_MAP_GPA_RANGE_PAGE_SZ_2M to userspace for a
range that is no longer 2M-aligned and whose size is no longer a multiple of
512 pages?

Can a guest exploit this by passing a partially-completed 2M page request
to force userspace into an error state?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529183549.1104619-1-pbonzini@redhat.com?part=10

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 11/24] KVM: SEV: Make it more obvious when KVM is writing back the current PSC index
  2026-05-29 18:35 ` [PATCH 11/24] KVM: SEV: Make it more obvious when KVM is writing back the current PSC index Paolo Bonzini
@ 2026-05-29 23:21   ` sashiko-bot
  2026-06-01 16:20     ` Sean Christopherson
  0 siblings, 1 reply; 42+ messages in thread
From: sashiko-bot @ 2026-05-29 23:21 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] A malicious guest can cause an infinite kernel loop and a soft lockup by concurrently modifying the GHCB scratch area during unsupported PSC operations.
--

commit 7f37f8948c706d884d14a7c69279c900524daced
Author: Sean Christopherson <seanjc@google.com>

KVM: SEV: Make it more obvious when KVM is writing back the current PSC index

This commit moves the guest-visible "cur_entry" index increment outside the
inner for-loop. It also adds a comment to clarify which code modifies trusted
versus guest-accessible data.

Message-ID: <20260501202250.2115252-12-seanjc@google.com>

> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3947,7 +3947,7 @@ static int snp_begin_psc(struct vcpu_svm *svm)
>  	}
>  
>  	/* Find the start of the next range which needs processing. */
> -	for (idx = idx_start; idx <= idx_end; idx++, hdr->cur_entry++) {
> +	for (idx = idx_start; idx <= idx_end; idx++) {
>  		entry_start = READ_ONCE(entries[idx]);
>  
[ ... ]
> @@ -3974,6 +3974,14 @@ static int snp_begin_psc(struct vcpu_svm *svm)
>  
>  		if (npages)
>  			break;
> +
> +		/*
> +		 * Increment the guest-visible index to communicate the current
> +		 * entry back to the guest, e.g. in case of failure.  No need
> +		 * for READ_ONCE() as KVM doesn't consume the field, i.e. a
> +		 * misbehaving guest can only break itself.
> +		 */
> +		hdr->cur_entry++;
>  	}
>  
>  	if (idx > idx_end) {

[Severity: High]
This is a pre-existing issue, but since we're examining how a misbehaving 
guest affects the PSC processing loop, is it possible for a guest to trigger 
a soft lockup here?

Looking at snp_begin_psc(), if the guest submits an unsupported PSC operation,
we hit the default case and loop back:

next_range:
    idx_start = READ_ONCE(hdr->cur_entry);
    ...
    switch (entry_start.operation) {
    ...
    default:
        __snp_complete_one_psc(svm);
        goto next_range;
    }

If a malicious guest concurrently resets hdr->cur_entry back to 0 in the
GHCB shared memory, could the host get trapped continuously reading an
index of 0 and re-processing the same unsupported entry? 

Since there is no yield or exit to userspace in this default case, it seems 
this might trap the host thread in the kernel while holding vcpu->mutex,
leading to a persistent soft lockup.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529183549.1104619-1-pbonzini@redhat.com?part=11

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 12/24] KVM: SEV: Add an anonymous "psc" struct to track current PSC metadata
  2026-05-29 18:35 ` [PATCH 12/24] KVM: SEV: Add an anonymous "psc" struct to track current PSC metadata Paolo Bonzini
@ 2026-05-30  8:07   ` sashiko-bot
  0 siblings, 0 replies; 42+ messages in thread
From: sashiko-bot @ 2026-05-30  8:07 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] KVM incorrectly passes the 2M page size flag to userspace for partially completed 2M PSC entries, despite forcing 4K handling.
--

commit c91cb923b18a5adb2ee4beca4e478b8c40f1fbc2
Author: Sean Christopherson <seanjc@google.com>

KVM: SEV: Add an anonymous "psc" struct to track current PSC metadata

This commit groups Page State Change fields into an anonymous psc struct 
within vcpu_sev_es_state to improve code readability. It also renames 
several local variables to clarify that they point to guest-accessible data.
Lastly, it clarifies that inflight represents the batch size of pages.

Message-ID: <20260501202250.2115252-13-seanjc@google.com>

> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 1982d13e71d9e..9f6543cebedf5 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c

[ ... ]

> @@ -4000,14 +3995,14 @@ static int snp_begin_psc(struct vcpu_svm *svm)
>  	 * KVM_HC_MAP_GPA_RANGE exit.
>  	 */
>  	while (++idx <= idx_end) {
> -		struct psc_entry entry = READ_ONCE(entries[idx]);
> +		struct psc_entry entry = READ_ONCE(guest_psc->entries[idx]);
>  
>  		if (entry.operation != entry_start.operation ||
>  		    entry.gfn != entry_start.gfn + npages ||
>  		    entry.cur_page || !!entry.pagesize != huge)
>  			break;
>  
> -		svm->sev_es.psc_inflight++;
> +		sev_es->psc.batch_size++;
>  		npages += huge ? 512 : 1;
>  	}

[Severity: High]
This isn't a bug introduced by this patch, but while reviewing the Page State
Change updates in arch/x86/kvm/svm/sev.c:snp_begin_psc(), I noticed a
potential problem when handling partially completed 2M requests.

When a 2M request is interrupted, the guest resubmits it with
entry_start.pagesize set to true and entry_start.cur_page > 0. KVM handles
this by forcing 4K handling for the remaining pages, setting huge = false,
adjusting npages, and advancing the gfn.

However, when KVM forwards this request to userspace via the
KVM_HC_MAP_GPA_RANGE hypercall right below this loop, it populates the page
size argument using the original guest request:

switch (entry_start.operation) {
case VMGEXIT_PSC_OP_PRIVATE:
case VMGEXIT_PSC_OP_SHARED:
    ...
    vcpu->run->hypercall.args[2] |= entry_start.pagesize
                                    ? KVM_MAP_GPA_RANGE_PAGE_SZ_2M
                                    : KVM_MAP_GPA_RANGE_PAGE_SZ_4K;

Could this result in passing KVM_MAP_GPA_RANGE_PAGE_SZ_2M along with a
4K-aligned (but not 2M-aligned) GPA and a non-512-multiple page count?

If host userspace performs alignment checks on the mapping request, would
this cause it to reject the malformed hypercall and fatally terminate the VM?

Would it be safer to use the adjusted huge boolean instead of
entry_start.pagesize when populating these arguments?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529183549.1104619-1-pbonzini@redhat.com?part=12

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 15/24] KVM: SEV: Move sev_free_vcpu() down below sev_es_unmap_ghcb()
  2026-05-29 18:35 ` [PATCH 15/24] KVM: SEV: Move sev_free_vcpu() down below sev_es_unmap_ghcb() Paolo Bonzini
@ 2026-05-30  8:36   ` sashiko-bot
  0 siblings, 0 replies; 42+ messages in thread
From: sashiko-bot @ 2026-05-30  8:36 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [Critical] `sev_es_unmap_ghcb` executes a sleepable operation (`kvm_write_guest`) while preemption is disabled, causing a sleeping-while-atomic violation.
--

commit ed02a1ac42db279c6f38774405471c3163d2450b
Author: Sean Christopherson <seanjc@google.com>

KVM: SEV: Move sev_free_vcpu() down below sev_es_unmap_ghcb()

Relocate sev_free_vcpu() down in sev.c so that it's definition comes after
sev_es_unmap_ghcb().  This will allow sharing unmap functionality between
the two functions without needing a forward declaration (or weird placement
of the common code).

> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 4ebe0d4497898..437282f0ea943 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
[ ... ]
> @@ -3618,6 +3587,37 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm)
>  	svm->sev_es.ghcb = NULL;
>  }

[Severity: Critical]
This is a pre-existing issue, but since we are looking at sev_es_unmap_ghcb(),
does this unmap routine execute a sleepable operation while preemption is
disabled?

When handling a guest exit where the SEV-ES scratch area is outside the GHCB,
ghcb_sa_sync can be set to true. If the IO requires exiting to userspace,
sev_es_unmap_ghcb() is delayed until the next vcpu_enter_guest() execution.

During guest reentry, vcpu_enter_guest() disables preemption before calling
the architecture hook:

arch/x86/kvm/x86.c:vcpu_enter_guest() {
    ...
	preempt_disable();

	kvm_x86_call(prepare_switch_to_guest)(vcpu);
    ...
}

For SVM, this hook resolves to svm_prepare_switch_to_guest(), which calls
sev_es_unmap_ghcb() while preemption is still disabled:

arch/x86/kvm/svm/svm.c:svm_prepare_switch_to_guest() {
    ...
	if (is_sev_es_guest(vcpu))
		sev_es_unmap_ghcb(svm);
    ...
}

Inside sev_es_unmap_ghcb(), if ghcb_sa_sync is true, it calls kvm_write_guest()
which is a sleepable operation:

arch/x86/kvm/svm/sev.c:sev_es_unmap_ghcb() {
    ...
		if (svm->sev_es.ghcb_sa_sync) {
			kvm_write_guest(svm->vcpu.kvm,
					svm->sev_es.sw_scratch,
					svm->sev_es.ghcb_sa,
					svm->sev_es.ghcb_sa_len);
    ...
}

Calling a sleepable function with preemption disabled leads to a "sleeping
function called from invalid context" bug. If the host virtual address
corresponding to the scratch area is not resident in memory, the kernel will
attempt to sleep on the page fault, leading to a "scheduling while atomic"
panic.

Could this be triggered by an unprivileged guest providing an out-of-bounds
scratch area, leading to a host denial of service?

> +void sev_free_vcpu(struct kvm_vcpu *vcpu)
> +{
> +	struct vcpu_svm *svm;
> +
> +	if (!is_sev_es_guest(vcpu))
> +		return;
[ ... ]

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529183549.1104619-1-pbonzini@redhat.com?part=15

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 16/24] KVM: SEV: Decouple the need to sync the GHCB SA from the need to free the SA
  2026-05-29 18:35 ` [PATCH 16/24] KVM: SEV: Decouple the need to sync the GHCB SA from the need to free the SA Paolo Bonzini
@ 2026-05-30  8:50   ` sashiko-bot
  2026-06-01 16:02     ` Sean Christopherson
  0 siblings, 1 reply; 42+ messages in thread
From: sashiko-bot @ 2026-05-30  8:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] kvm_write_guest() is called with preemption disabled, which can trigger a kernel warning or silent guest memory corruption.
--

commit 5732decf12690466d8d24cbb274489189ae7d7b2
Author: Sean Christopherson <seanjc@google.com>

KVM: SEV: Decouple the need to sync the GHCB SA from the need to free the SA

This commit decouples synchronizing the GHCB scratch area from freeing and
unpinning it. This change allows the free and unpin path to be reused when
freeing a vCPU, and opportunistically adds a warning to prevent stomping over
an already-allocated scratch area.

> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 437282f0ea943..11d46600cbdc0 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3560,20 +3560,17 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm)
>  	if (!svm->sev_es.ghcb)
>  		return;
>  
> -	if (svm->sev_es.ghcb_sa_free) {
> -		/*
> -		 * The scratch area lives outside the GHCB, so there is a
> -		 * buffer that, depending on the operation performed, may
> -		 * need to be synced, then freed.
> -		 */
> -		if (svm->sev_es.ghcb_sa_sync) {
> -			kvm_write_guest(svm->vcpu.kvm,
> -					svm->sev_es.sw_scratch,
> -					svm->sev_es.ghcb_sa,
> -					svm->sev_es.ghcb_sa_len);
> -			svm->sev_es.ghcb_sa_sync = false;
> -		}
> +	/*
> +	 * If the scratch area lives outside the GHCB, there's a buffer that,
> +	 * depending on the operation performed, may need to be synced.
> +	 */
> +	if (svm->sev_es.ghcb_sa_sync) {
> +		kvm_write_guest(svm->vcpu.kvm, svm->sev_es.sw_scratch,
> +				svm->sev_es.ghcb_sa, svm->sev_es.ghcb_sa_len);
> +		svm->sev_es.ghcb_sa_sync = false;
> +	}

[Severity: High]
This isn't a bug introduced by this patch, but can this call to
kvm_write_guest() trigger a warning or cause silent guest memory corruption?

In arch/x86/kvm/x86.c:vcpu_enter_guest(), KVM disables preemption before
calling prepare_switch_to_guest():

	preempt_disable();

	kvm_x86_call(prepare_switch_to_guest)(vcpu);

For SEV-ES guests, this invokes svm_prepare_switch_to_guest() ->
sev_es_unmap_ghcb(). Within sev_es_unmap_ghcb(), if svm->sev_es.ghcb_sa_sync is
true, KVM calls kvm_write_guest().

Since kvm_write_guest() is a sleepable function and preemption is disabled,
this could trigger a warning in might_fault(), potentially leading to a host
denial of service if panic_on_warn is set.

Additionally, if the user address faults, the page fault handler might detect
the atomic context and fail the fault, causing the copy to fail silently and
corrupting the guest's execution.

Could a guest using GHCB version 1 force ghcb_sa_sync to true by supplying a
scratch area outside the GHCB, thereby triggering this path?

[ ... ]

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529183549.1104619-1-pbonzini@redhat.com?part=16

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 17/24] KVM: SEV: Unmap and unpin the GHCB as needed on vCPU free
  2026-05-29 18:35 ` [PATCH 17/24] KVM: SEV: Unmap and unpin the GHCB as needed on vCPU free Paolo Bonzini
@ 2026-05-30  9:06   ` sashiko-bot
  0 siblings, 0 replies; 42+ messages in thread
From: sashiko-bot @ 2026-05-30  9:06 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] `sev_es_unmap_ghcb` calls sleepable functions (`memunmap`) and performs unsafe guest memory writes while executing in an atomic context.
--

commit b4a31af8d74ea9fb855e6bc78164835ceed7d4bd
Author: Sean Christopherson <seanjc@google.com>

KVM: SEV: Unmap and unpin the GHCB as needed on vCPU free

Unmap and unpin the GHCB as needed when freeing a vCPU.  If the VM is
destroyed after mapping+pinning the GHCB on #VMGEXIT, without re-running
the vCPU, KVM will effectively leak the GHCB and any mappings created for
the GHCB.

> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 11d46600cbdc0..6c6a6d663e296 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
[ ... ]
> @@ -3552,6 +3552,20 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
>  	return 1;
>  }
>  
> +static void __sev_es_unmap_ghcb(struct vcpu_svm *svm)
> +{
> +	if (svm->sev_es.ghcb_sa_free) {
> +		kvfree(svm->sev_es.ghcb_sa);
> +		svm->sev_es.ghcb_sa = NULL;
> +		svm->sev_es.ghcb_sa_free = false;
> +	}
> +
> +	if (svm->sev_es.ghcb) {
> +		kvm_vcpu_unmap(&svm->vcpu, &svm->sev_es.ghcb_map);

[Severity: High]
This is a pre-existing issue, but does this execution path end up calling
sleepable functions and performing unsafe memory writes while preemption is
disabled?

vcpu_enter_guest() disables preemption before invoking
svm_prepare_switch_to_guest(), establishing an atomic execution context.
That function then calls sev_es_unmap_ghcb() for SEV-ES guests.

If a guest sets its GHCB GPA to an IOMEM region (e.g., an assigned device BAR),
kvm_vcpu_map() maps it using memremap(). When __sev_es_unmap_ghcb() later calls
kvm_vcpu_unmap() here, it delegates to memunmap(). Since memunmap() can sleep,
could this trigger a scheduling while atomic bug and allow a guest-triggered
host DoS?

Additionally, just before __sev_es_unmap_ghcb() is called, sev_es_unmap_ghcb()
calls kvm_write_guest():

void sev_es_unmap_ghcb(struct vcpu_svm *svm)
{
        ...
        if (svm->sev_es.ghcb_sa_sync) {
                kvm_write_guest(svm->vcpu.kvm, ...);
                ...
        }
        ...
}

If the guest scratch area is swapped out, the page fault handler will abort the
fault due to the atomic context, causing kvm_write_guest() to fail with -EFAULT.
Since KVM ignores this return value, could this lead to silent data corruption
in the guest, such as missing MMIO read data?

> +		svm->sev_es.ghcb = NULL;
> +	}
> +}

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529183549.1104619-1-pbonzini@redhat.com?part=17

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 20/24] KVM: SEV: Return INVALID_EVENT for SNP-only #VMGEXIT from non-SNP guest
  2026-05-29 18:35 ` [PATCH 20/24] KVM: SEV: Return INVALID_EVENT for SNP-only #VMGEXIT from non-SNP guest Paolo Bonzini
@ 2026-05-30  9:29   ` sashiko-bot
  2026-06-01 16:18     ` Sean Christopherson
  0 siblings, 1 reply; 42+ messages in thread
From: sashiko-bot @ 2026-05-30  9:29 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Low] KVM consumes the GHCB `sw_exit_code` field before verifying that the guest marked it as valid.
--

commit 96be1ee0d4433ce7c711162daac605e0cd3278ae
Author: Sean Christopherson <seanjc@google.com>

KVM: SEV: Return INVALID_EVENT for SNP-only #VMGEXIT from non-SNP guest

This commit updates the error signaled when a non-SNP guest attempts to
invoke an SNP-only #VMGEXIT to INVALID_EVENT instead of MISSING_INPUT.
It also moves the SNP checks out of sev_es_validate_vmgexit() to make it
solely a predicate for verifying GHCB fields.

> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 880a2acd77bff..b59adddfdbccb 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
[ ... ]
> @@ -4527,6 +4537,13 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
>  		return 1;
>  	}
>  
> +	if (is_snp_only_vmgexit(control->exit_code) && !is_sev_snp_guest(vcpu)) {
> +		vcpu_unimpl(vcpu, "vmgexit: exit code %#llx is SNP-only\n",
> +			    control->exit_code);
> +		svm_vmgexit_bad_input(svm, GHCB_ERR_INVALID_EVENT);
> +		return 1;
> +	}
> +
>  	ret = sev_es_validate_vmgexit(svm);
>  	if (ret)
>  		return ret;

[Severity: Low]
Does this consume the GHCB sw_exit_code field before verifying that the
guest actually marked it as valid?

By moving the is_snp_only_vmgexit(control->exit_code) check before the
call to sev_es_validate_vmgexit(), KVM evaluates the exit code before
kvm_ghcb_sw_exit_code_is_valid() is checked.

If a non-SNP guest provides a GHCB without setting the valid bit for the
exit code, but the uninitialized memory coincidentally matches an SNP-only
exit code, will KVM incorrectly return GHCB_ERR_INVALID_EVENT instead of
the required GHCB_ERR_MISSING_INPUT?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529183549.1104619-1-pbonzini@redhat.com?part=20

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 00/24] KVM: x86: fix various GHCB issues
  2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
                   ` (23 preceding siblings ...)
  2026-05-29 18:35 ` [PATCH 24/24] KVM: SEV: Remove sometimes-used function-scoped "ret" from #VMGEXIT handler Paolo Bonzini
@ 2026-05-30 16:27 ` Paolo Bonzini
  2026-06-03 12:52   ` Sean Christopherson
  24 siblings, 1 reply; 42+ messages in thread
From: Paolo Bonzini @ 2026-05-30 16:27 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, Tom Lendacky, Michael Roth

I think patches 11-13 should also be included in 7.1.  Any other opinions?

Paolo

On Fri, May 29, 2026 at 8:35 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> The PSC code has a variety of bugs, several of which have to do with
> not following the spec.  It took a while to disentagle and root cause
> everything (the GHCB ain't exactly the easiest spec to decipher), but
> here it is.  The first 10 patches have just been sent to Linus.
>
> For the rest we decided to handle it on-list for 7.2, but it is possible
> that a clever guest could abuse the bugs to cause the host to leak a
> pile of pages (limited by how fast the VM can get recreated).
>
> This can also be found in a sev-for-7.2 branch of kvm.git
>
> Thanks,
>
> Paolo, Sean, Mike, Tom
>
> Michael Roth (1):
>   KVM: SEV: Require in-GHCB scratch area if GHCB v2+ is in use
>
> Sean Christopherson (23):
>   KVM: SEV: Ignore MMIO requests of length '0'
>   KVM: SEV: Reject MMIO requests larger than 8 bytes with GHCB v2+
>   KVM: SEV: Ignore Port I/O requests of length '0'
>   KVM: SEV: Use the size of the PSC header as the minimum size for PSC
>     requests
>   KVM: SEV: Compute the correct max length of the in-GHCB scratch area
>   KVM: SEV: WARN if KVM attempts to setup scratch area with min_len==0
>   KVM: SEV: Don't explicitly pass PSC buffer to snp_begin_psc()
>   KVM: SEV: Check PSC request indices against the actual size of the
>     buffer
>   KVM: SEV: Use READ_ONCE() when reading entries/indices from PSC buffer
>   KVM: SEV: Make it more obvious when KVM is writing back the current
>     PSC index
>   KVM: SEV: Add an anonymous "psc" struct to track current PSC metadata
>   KVM: SEV: Read start/end indices of PSC requests exactly once per
>     #VMGEXIT
>   KVM: Don't WARN if memory is dirtied without a vCPU when the VM is
>     dying
>   KVM: SEV: Move sev_free_vcpu() down below sev_es_unmap_ghcb()
>   KVM: SEV: Decouple the need to sync the GHCB SA from the need to free
>     the SA
>   KVM: SEV: Unmap and unpin the GHCB as needed on vCPU free
>   KVM: SEV: Don't terminate SNP VMs on #VMGEXIT without a registered
>     GHCB
>   KVM: SEV: Move GHCB "usage" check out of sev_es_validate_vmgexit()
>   KVM: SEV: Return INVALID_EVENT for SNP-only #VMGEXIT from non-SNP
>     guest
>   KVM: SEV: Return INVALID_INPUT, not MISSING_INPUT, for bad
>     GUEST_REQUEST input(s)
>   KVM: SEV: Handle unknown #VMGEXIT reasons in sev_handle_vmgexit()
>   KVM: SEV: Turn sev_es_validate_vmgexit() into a dedicated predicate
>   KVM: SEV: Remove sometimes-used function-scoped "ret" from #VMGEXIT
>     handler
>
>  arch/x86/kvm/svm/sev.c | 554 ++++++++++++++++++++++-------------------
>  arch/x86/kvm/svm/svm.h |   9 +-
>  virt/kvm/kvm_main.c    |   3 +-
>  3 files changed, 302 insertions(+), 264 deletions(-)
>
> --
> 2.54.0


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 16/24] KVM: SEV: Decouple the need to sync the GHCB SA from the need to free the SA
  2026-05-30  8:50   ` sashiko-bot
@ 2026-06-01 16:02     ` Sean Christopherson
  0 siblings, 0 replies; 42+ messages in thread
From: Sean Christopherson @ 2026-06-01 16:02 UTC (permalink / raw)
  To: sashiko-reviews; +Cc: Paolo Bonzini, kvm

On Sat, May 30, 2026, sashiko-bot@kernel.org wrote:
> > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > index 437282f0ea943..11d46600cbdc0 100644
> > --- a/arch/x86/kvm/svm/sev.c
> > +++ b/arch/x86/kvm/svm/sev.c
> > @@ -3560,20 +3560,17 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm)
> >  	if (!svm->sev_es.ghcb)
> >  		return;
> >  
> > -	if (svm->sev_es.ghcb_sa_free) {
> > -		/*
> > -		 * The scratch area lives outside the GHCB, so there is a
> > -		 * buffer that, depending on the operation performed, may
> > -		 * need to be synced, then freed.
> > -		 */
> > -		if (svm->sev_es.ghcb_sa_sync) {
> > -			kvm_write_guest(svm->vcpu.kvm,
> > -					svm->sev_es.sw_scratch,
> > -					svm->sev_es.ghcb_sa,
> > -					svm->sev_es.ghcb_sa_len);
> > -			svm->sev_es.ghcb_sa_sync = false;
> > -		}
> > +	/*
> > +	 * If the scratch area lives outside the GHCB, there's a buffer that,
> > +	 * depending on the operation performed, may need to be synced.
> > +	 */
> > +	if (svm->sev_es.ghcb_sa_sync) {
> > +		kvm_write_guest(svm->vcpu.kvm, svm->sev_es.sw_scratch,
> > +				svm->sev_es.ghcb_sa, svm->sev_es.ghcb_sa_len);
> > +		svm->sev_es.ghcb_sa_sync = false;
> > +	}
> 
> [Severity: High]
> This isn't a bug introduced by this patch, but can this call to
> kvm_write_guest() trigger a warning

Yes.

> or cause silent guest memory corruption?

I suppose in theory, yes?

> In arch/x86/kvm/x86.c:vcpu_enter_guest(), KVM disables preemption before
> calling prepare_switch_to_guest():
> 
> 	preempt_disable();
> 
> 	kvm_x86_call(prepare_switch_to_guest)(vcpu);
> 
> For SEV-ES guests, this invokes svm_prepare_switch_to_guest() ->
> sev_es_unmap_ghcb(). Within sev_es_unmap_ghcb(), if svm->sev_es.ghcb_sa_sync is
> true, KVM calls kvm_write_guest().
> 
> Since kvm_write_guest() is a sleepable function and preemption is disabled,
> this could trigger a warning in might_fault(), potentially leading to a host
> denial of service if panic_on_warn is set.

LOL, yes, for all of those production environments that run with panic_on_warn=1,
CONFIG_DEBUG_ATOMIC_SLEEP=y, and a preemptible kernel model.

We (Google) have an internal patch that we've been carrying for a while, but it's
a hack.  It's basically:

	kvm_x86_call(prepare_switch_to_guest_early)(vcpu);

	preempt_disable();

	kvm_x86_call(prepare_switch_to_guest)(vcpu);

Given that the external scratch area is deprecated in v2, i.e. now that we know
it's not going to be a super common operation, I think the proper way to fix this
would be to have kvm_sev_es_mmio() and kvm_sev_es_{ins,outs}() sync and free the
buffer.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 20/24] KVM: SEV: Return INVALID_EVENT for SNP-only #VMGEXIT from non-SNP guest
  2026-05-30  9:29   ` sashiko-bot
@ 2026-06-01 16:18     ` Sean Christopherson
  0 siblings, 0 replies; 42+ messages in thread
From: Sean Christopherson @ 2026-06-01 16:18 UTC (permalink / raw)
  To: sashiko-reviews; +Cc: Paolo Bonzini, kvm

On Sat, May 30, 2026, sashiko-bot@kernel.org wrote:
> > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > index 880a2acd77bff..b59adddfdbccb 100644
> > --- a/arch/x86/kvm/svm/sev.c
> > +++ b/arch/x86/kvm/svm/sev.c
> [ ... ]
> > @@ -4527,6 +4537,13 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
> >  		return 1;
> >  	}
> >  
> > +	if (is_snp_only_vmgexit(control->exit_code) && !is_sev_snp_guest(vcpu)) {
> > +		vcpu_unimpl(vcpu, "vmgexit: exit code %#llx is SNP-only\n",
> > +			    control->exit_code);
> > +		svm_vmgexit_bad_input(svm, GHCB_ERR_INVALID_EVENT);
> > +		return 1;
> > +	}
> > +
> >  	ret = sev_es_validate_vmgexit(svm);
> >  	if (ret)
> >  		return ret;
> 
> [Severity: Low]
> Does this consume the GHCB sw_exit_code field before verifying that the
> guest actually marked it as valid?
> 
> By moving the is_snp_only_vmgexit(control->exit_code) check before the
> call to sev_es_validate_vmgexit(), KVM evaluates the exit code before
> kvm_ghcb_sw_exit_code_is_valid() is checked.
> 
> If a non-SNP guest provides a GHCB without setting the valid bit for the
> exit code, but the uninitialized memory coincidentally matches an SNP-only
> exit code, will KVM incorrectly return GHCB_ERR_INVALID_EVENT instead of
> the required GHCB_ERR_MISSING_INPUT?

Super duper technically, yes.  We can't simply swap the order, because exits
like SVM_VMGEXIT_PSC shouldn't exist for non-SNP guests.  But I guess it would
be easy and safe enough to do:

	if (kvm_ghcb_sw_exit_code_is_valid(svm) &&
	    is_snp_only_vmgexit(control->exit_code) && !is_sev_snp_guest(vcpu)) {
		vcpu_unimpl(vcpu, "vmgexit: exit code %#llx is SNP-only\n",
			    control->exit_code);
		svm_vmgexit_bad_input(svm, GHCB_ERR_INVALID_EVENT);
		return 1;
	}

Hmm, I guess the other option would be to split out the "always mandatory" checks


	if (!kvm_ghcb_sw_exit_code_is_valid(svm) ||
	    !kvm_ghcb_sw_exit_info_1_is_valid(svm) ||
	    !kvm_ghcb_sw_exit_info_2_is_valid(svm))
		return false;

e.g. end up with:

	if (!kvm_ghcb_sw_exit_code_is_valid(svm) ||
	    !kvm_ghcb_sw_exit_info_1_is_valid(svm) ||
	    !kvm_ghcb_sw_exit_info_2_is_valid(svm)) {
		/*
		 * Print the exit code even though it may not be marked valid
		 * as it could help with debugging.
		 */
		vcpu_unimpl(vcpu, "vmgexit: exit code %#llx input is not valid\n",
			    control->exit_code);
		dump_ghcb(svm);
		svm_vmgexit_bad_input(svm, GHCB_ERR_MISSING_INPUT);
		return 1;
	}

	if (is_snp_only_vmgexit(control->exit_code) && !is_sev_snp_guest(vcpu)) {
		vcpu_unimpl(vcpu, "vmgexit: exit code %#llx is SNP-only\n",
			    control->exit_code);
		svm_vmgexit_bad_input(svm, GHCB_ERR_INVALID_EVENT);
		return 1;
	}

	if (!sev_es_are_required_ghcb_fields_valid(svm)) {
		vcpu_unimpl(vcpu, "vmgexit: exit code %#llx input is not valid\n",
			    control->exit_code);
		dump_ghcb(svm);
		svm_vmgexit_bad_input(svm, GHCB_ERR_MISSING_INPUT);
		return 1;
	}

Hmm, and if we go with the second option, maybe rename sev_es_are_required_ghcb_fields_valid()
to sev_es_are_ancillary_ghcb_fields_valid()?

I don't particularly love either option, but I think I have a slight (very slight)
preference for this?

	if (!kvm_ghcb_sw_exit_code_is_valid(svm) ||
	    !kvm_ghcb_sw_exit_info_1_is_valid(svm) ||
	    !kvm_ghcb_sw_exit_info_2_is_valid(svm)) {
		/*
		 * Print the exit code even though it may not be marked valid
		 * as it could help with debugging.
		 */
		vcpu_unimpl(vcpu, "vmgexit: exit code %#llx input is not valid\n",
			    control->exit_code);
		dump_ghcb(svm);
		svm_vmgexit_bad_input(svm, GHCB_ERR_MISSING_INPUT);
		return 1;
	}

	if (is_snp_only_vmgexit(control->exit_code) && !is_sev_snp_guest(vcpu)) {
		vcpu_unimpl(vcpu, "vmgexit: exit code %#llx is SNP-only\n",
			    control->exit_code);
		svm_vmgexit_bad_input(svm, GHCB_ERR_INVALID_EVENT);
		return 1;
	}

	if (!sev_es_are_ancillary_ghcb_fields_valid(svm)) {
		vcpu_unimpl(vcpu, "vmgexit: exit code %#llx input is not valid\n",
			    control->exit_code);
		dump_ghcb(svm);
		svm_vmgexit_bad_input(svm, GHCB_ERR_MISSING_INPUT);
		return 1;
	}

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 11/24] KVM: SEV: Make it more obvious when KVM is writing back the current PSC index
  2026-05-29 23:21   ` sashiko-bot
@ 2026-06-01 16:20     ` Sean Christopherson
  0 siblings, 0 replies; 42+ messages in thread
From: Sean Christopherson @ 2026-06-01 16:20 UTC (permalink / raw)
  To: sashiko-reviews; +Cc: Paolo Bonzini, kvm

On Fri, May 29, 2026, sashiko-bot@kernel.org wrote:
> > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > --- a/arch/x86/kvm/svm/sev.c
> > +++ b/arch/x86/kvm/svm/sev.c
> > @@ -3947,7 +3947,7 @@ static int snp_begin_psc(struct vcpu_svm *svm)
> >  	}
> >  
> >  	/* Find the start of the next range which needs processing. */
> > -	for (idx = idx_start; idx <= idx_end; idx++, hdr->cur_entry++) {
> > +	for (idx = idx_start; idx <= idx_end; idx++) {
> >  		entry_start = READ_ONCE(entries[idx]);
> >  
> [ ... ]
> > @@ -3974,6 +3974,14 @@ static int snp_begin_psc(struct vcpu_svm *svm)
> >  
> >  		if (npages)
> >  			break;
> > +
> > +		/*
> > +		 * Increment the guest-visible index to communicate the current
> > +		 * entry back to the guest, e.g. in case of failure.  No need
> > +		 * for READ_ONCE() as KVM doesn't consume the field, i.e. a
> > +		 * misbehaving guest can only break itself.
> > +		 */
> > +		hdr->cur_entry++;
> >  	}
> >  
> >  	if (idx > idx_end) {
> 
> [Severity: High]
> This is a pre-existing issue, but since we're examining how a misbehaving 
> guest affects the PSC processing loop, is it possible for a guest to trigger 
> a soft lockup here?
> 
> Looking at snp_begin_psc(), if the guest submits an unsupported PSC operation,
> we hit the default case and loop back:
> 
> next_range:
>     idx_start = READ_ONCE(hdr->cur_entry);
>     ...
>     switch (entry_start.operation) {
>     ...
>     default:
>         __snp_complete_one_psc(svm);
>         goto next_range;
>     }
> 
> If a malicious guest concurrently resets hdr->cur_entry back to 0 in the
> GHCB shared memory, could the host get trapped continuously reading an
> index of 0 and re-processing the same unsupported entry? 

Already fixed in patch 16, "KVM: SEV: Read start/end indices of PSC requests
exactly once per #VMGEXIT".

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 00/24] KVM: x86: fix various GHCB issues
  2026-05-30 16:27 ` [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
@ 2026-06-03 12:52   ` Sean Christopherson
  2026-06-03 15:06     ` Paolo Bonzini
  0 siblings, 1 reply; 42+ messages in thread
From: Sean Christopherson @ 2026-06-03 12:52 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, Tom Lendacky, Michael Roth

On Sat, May 30, 2026, Paolo Bonzini wrote:
> I think patches 11-13 should also be included in 7.1.

+1.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 00/24] KVM: x86: fix various GHCB issues
  2026-06-03 12:52   ` Sean Christopherson
@ 2026-06-03 15:06     ` Paolo Bonzini
  0 siblings, 0 replies; 42+ messages in thread
From: Paolo Bonzini @ 2026-06-03 15:06 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: linux-kernel, kvm, Tom Lendacky, Michael Roth

On 6/3/26 14:52, Sean Christopherson wrote:
> On Sat, May 30, 2026, Paolo Bonzini wrote:
>> I think patches 11-13 should also be included in 7.1.
> 
> +1.

Thanks for confirming; pushed 11-17 to master and the whole series to next.

Paolo


^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2026-06-03 15:07 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-29 18:35 [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
2026-05-29 18:35 ` [PATCH 01/24] KVM: SEV: Require in-GHCB scratch area if GHCB v2+ is in use Paolo Bonzini
2026-05-29 18:35 ` [PATCH 02/24] KVM: SEV: Ignore MMIO requests of length '0' Paolo Bonzini
2026-05-29 19:29   ` sashiko-bot
2026-05-29 18:35 ` [PATCH 03/24] KVM: SEV: Reject MMIO requests larger than 8 bytes with GHCB v2+ Paolo Bonzini
2026-05-29 19:54   ` sashiko-bot
2026-05-29 18:35 ` [PATCH 04/24] KVM: SEV: Ignore Port I/O requests of length '0' Paolo Bonzini
2026-05-29 18:35 ` [PATCH 05/24] KVM: SEV: Use the size of the PSC header as the minimum size for PSC requests Paolo Bonzini
2026-05-29 20:49   ` sashiko-bot
2026-05-29 18:35 ` [PATCH 06/24] KVM: SEV: Compute the correct max length of the in-GHCB scratch area Paolo Bonzini
2026-05-29 18:35 ` [PATCH 07/24] KVM: SEV: WARN if KVM attempts to setup scratch area with min_len==0 Paolo Bonzini
2026-05-29 21:32   ` sashiko-bot
2026-05-29 18:35 ` [PATCH 08/24] KVM: SEV: Don't explicitly pass PSC buffer to snp_begin_psc() Paolo Bonzini
2026-05-29 18:35 ` [PATCH 09/24] KVM: SEV: Check PSC request indices against the actual size of the buffer Paolo Bonzini
2026-05-29 18:35 ` [PATCH 10/24] KVM: SEV: Use READ_ONCE() when reading entries/indices from PSC buffer Paolo Bonzini
2026-05-29 22:28   ` sashiko-bot
2026-05-29 18:35 ` [PATCH 11/24] KVM: SEV: Make it more obvious when KVM is writing back the current PSC index Paolo Bonzini
2026-05-29 23:21   ` sashiko-bot
2026-06-01 16:20     ` Sean Christopherson
2026-05-29 18:35 ` [PATCH 12/24] KVM: SEV: Add an anonymous "psc" struct to track current PSC metadata Paolo Bonzini
2026-05-30  8:07   ` sashiko-bot
2026-05-29 18:35 ` [PATCH 13/24] KVM: SEV: Read start/end indices of PSC requests exactly once per #VMGEXIT Paolo Bonzini
2026-05-29 18:35 ` [PATCH 14/24] KVM: Don't WARN if memory is dirtied without a vCPU when the VM is dying Paolo Bonzini
2026-05-29 18:35 ` [PATCH 15/24] KVM: SEV: Move sev_free_vcpu() down below sev_es_unmap_ghcb() Paolo Bonzini
2026-05-30  8:36   ` sashiko-bot
2026-05-29 18:35 ` [PATCH 16/24] KVM: SEV: Decouple the need to sync the GHCB SA from the need to free the SA Paolo Bonzini
2026-05-30  8:50   ` sashiko-bot
2026-06-01 16:02     ` Sean Christopherson
2026-05-29 18:35 ` [PATCH 17/24] KVM: SEV: Unmap and unpin the GHCB as needed on vCPU free Paolo Bonzini
2026-05-30  9:06   ` sashiko-bot
2026-05-29 18:35 ` [PATCH 18/24] KVM: SEV: Don't terminate SNP VMs on #VMGEXIT without a registered GHCB Paolo Bonzini
2026-05-29 18:35 ` [PATCH 19/24] KVM: SEV: Move GHCB "usage" check out of sev_es_validate_vmgexit() Paolo Bonzini
2026-05-29 18:35 ` [PATCH 20/24] KVM: SEV: Return INVALID_EVENT for SNP-only #VMGEXIT from non-SNP guest Paolo Bonzini
2026-05-30  9:29   ` sashiko-bot
2026-06-01 16:18     ` Sean Christopherson
2026-05-29 18:35 ` [PATCH 21/24] KVM: SEV: Return INVALID_INPUT, not MISSING_INPUT, for bad GUEST_REQUEST input(s) Paolo Bonzini
2026-05-29 18:35 ` [PATCH 22/24] KVM: SEV: Handle unknown #VMGEXIT reasons in sev_handle_vmgexit() Paolo Bonzini
2026-05-29 18:35 ` [PATCH 23/24] KVM: SEV: Turn sev_es_validate_vmgexit() into a dedicated predicate Paolo Bonzini
2026-05-29 18:35 ` [PATCH 24/24] KVM: SEV: Remove sometimes-used function-scoped "ret" from #VMGEXIT handler Paolo Bonzini
2026-05-30 16:27 ` [PATCH 00/24] KVM: x86: fix various GHCB issues Paolo Bonzini
2026-06-03 12:52   ` Sean Christopherson
2026-06-03 15:06     ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox