public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Srikanth Aithal <sraithal@amd.com>
Cc: "Paolo Bonzini" <pbonzini@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Jethro Beekman" <jethro@fortanix.com>,
	"Alexander Potapenko" <glider@google.com>,
	"Carlos López" <clopez@suse.de>,
	"Thomas Lendacky" <Thomas.Lendacky@amd.com>
Subject: Re: [PATCH 05/21] KVM: SEV: Lock all vCPUs when synchronzing VMSAs for SNP launch finish
Date: Wed, 8 Apr 2026 11:42:50 -0700	[thread overview]
Message-ID: <adahqilc9IrIpIvr@google.com> (raw)
In-Reply-To: <177b405e-87f7-400f-a783-adc7c20bef89@amd.com>

On Wed, Apr 08, 2026, Srikanth Aithal wrote:
> On 3/11/2026 5:18 AM, Sean Christopherson wrote:
> > Lock all vCPUs when synchronizing and encrypting VMSAs for SNP guests, as
> > allowing userspace to manipulate and/or run a vCPU while its state is being
> > synchronized would at best corrupt vCPU state, and at worst crash the host
> > kernel.
> > 
> > Opportunistically assert that vcpu->mutex is held when synchronizing its
> > VMSA (the SEV-ES path already locks vCPUs).
> > 
> > Fixes: ad27ce155566 ("KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >   arch/x86/kvm/svm/sev.c | 16 +++++++++++++---
> >   1 file changed, 13 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > index 5de36bbc4c53..c10c71608208 100644
> > --- a/arch/x86/kvm/svm/sev.c
> > +++ b/arch/x86/kvm/svm/sev.c
> > @@ -882,6 +882,8 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
> >   	u8 *d;
> >   	int i;
> > +	lockdep_assert_held(&vcpu->mutex);
> > +
> >   	if (vcpu->arch.guest_state_protected)
> >   		return -EINVAL;
> > @@ -2456,6 +2458,10 @@ static int snp_launch_update_vmsa(struct kvm *kvm, struct kvm_sev_cmd *argp)
> >   	if (kvm_is_vcpu_creation_in_progress(kvm))
> >   		return -EBUSY;
> > +	ret = kvm_lock_all_vcpus(kvm);
> > +	if (ret)
> > +		return ret;
> > +
> >   	data.gctx_paddr = __psp_pa(sev->snp_context);
> >   	data.page_type = SNP_PAGE_TYPE_VMSA;
> > @@ -2465,12 +2471,12 @@ static int snp_launch_update_vmsa(struct kvm *kvm, struct kvm_sev_cmd *argp)
> >   		ret = sev_es_sync_vmsa(svm);
> >   		if (ret)
> > -			return ret;
> > +			goto err;
> >   		/* Transition the VMSA page to a firmware state. */
> >   		ret = rmp_make_private(pfn, INITIAL_VMSA_GPA, PG_LEVEL_4K, sev->asid, true);
> >   		if (ret)
> > -			return ret;
> > +			goto err;
> >   		/* Issue the SNP command to encrypt the VMSA */
> >   		data.address = __sme_pa(svm->sev_es.vmsa);
> > @@ -2479,7 +2485,7 @@ static int snp_launch_update_vmsa(struct kvm *kvm, struct kvm_sev_cmd *argp)
> >   		if (ret) {
> >   			snp_page_reclaim(kvm, pfn);
> > -			return ret;
> > +			goto err;
> >   		}
> >   		svm->vcpu.arch.guest_state_protected = true;
> > @@ -2494,6 +2500,10 @@ static int snp_launch_update_vmsa(struct kvm *kvm, struct kvm_sev_cmd *argp)
> >   	}
> >   	return 0;
> > +
> > +err:
> > +	kvm_unlock_all_vcpus(kvm);
> > +	return ret;

/facepalm

With an assist from lockdep (see below), I forgot to actually unlock in the
*success* path.  The failure manifested as a "guest" hang instead of a deadlock
by pure dumb luck: kvm_vcpu_ioctl() uses mutex_lock_killable() so killing QEMU
still works.

I'll squash this (assuming it fixes the problem you're seeing), and omit this
entire series from the initial 7.1 pull requests.  If everything looks good, I'll
plan on sending a second pull request for this series after it's had more time to
soak in -next.

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 2010b157e288..770f7dfc0e5c 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2512,12 +2512,12 @@ static int snp_launch_update_vmsa(struct kvm *kvm, struct kvm_sev_cmd *argp)
 
                ret = sev_es_sync_vmsa(svm);
                if (ret)
-                       goto err;
+                       goto out;
 
                /* Transition the VMSA page to a firmware state. */
                ret = rmp_make_private(pfn, INITIAL_VMSA_GPA, PG_LEVEL_4K, sev->asid, true);
                if (ret)
-                       goto err;
+                       goto out;
 
                /* Issue the SNP command to encrypt the VMSA */
                data.address = __sme_pa(svm->sev_es.vmsa);
@@ -2526,7 +2526,7 @@ static int snp_launch_update_vmsa(struct kvm *kvm, struct kvm_sev_cmd *argp)
                if (ret) {
                        snp_page_reclaim(kvm, pfn);
 
-                       goto err;
+                       goto out;
                }
 
                svm->vcpu.arch.guest_state_protected = true;
@@ -2540,9 +2540,7 @@ static int snp_launch_update_vmsa(struct kvm *kvm, struct kvm_sev_cmd *argp)
                svm_enable_lbrv(vcpu);
        }
 
-       return 0;
-
-err:
+out:
        kvm_unlock_all_vcpus(kvm);
        return ret;
 }

> >   }
> >   static int snp_launch_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
> 
> 
> I am seeing an SNP guest boot failure starting with linux-next tag
> next-20260406 [1].
>
> The SNP guest hangs during boot.

...

> There are no error messages on either the host or guest serial console when
> this happens.

 WARNING: Nested lock was not taken
 7.0.0-smp--36ad607330fb-snp #112 Tainted: G     U  W  O       
 ----------------------------------
 qemu/39235 is trying to lock:
 ffff8d0e590c00b0 (&vcpu->mutex){+.+.}-{4:4}, at: kvm_lock_all_vcpus+0xab/0x180 [kvm]
 
              but this task is not holding:
 &kvm->lock
 
              stack backtrace:
 CPU: 123 UID: 0 PID: 39235 Comm: qemu Tainted: G     U  W  O        7.0.0-smp--36ad607330fb-snp #112 PREEMPTLAZY 
 Tainted: [U]=USER, [W]=WARN, [O]=OOT_MODULE
 Hardware name: Google, Inc.                                                       Arcadia_IT_80/Arcadia_IT_80, BIOS 34.86.0-102 01/25/2026
 Call Trace:
  <TASK>
  dump_stack_lvl+0x54/0x70
  __lock_acquire+0x7b9/0x2900
  reacquire_held_locks+0x107/0x160
  lock_release+0x177/0x360
  __mutex_unlock_slowpath+0x3c/0x2b0
  sev_mem_enc_ioctl+0x3c9/0x400 [kvm_amd]
  kvm_vm_ioctl+0x57c/0x5d0 [kvm]
  __se_sys_ioctl+0x6d/0xb0
  do_syscall_64+0xe8/0x920
  entry_SYSCALL_64_after_hwframe+0x4b/0x53
  </TASK>
 
              other info that might help us debug this:
 no locks held by qemu/39235.
 
              stack backtrace:
 CPU: 123 UID: 0 PID: 39235 Comm: qemu Tainted: G     U  W  O        7.0.0-smp--36ad607330fb-snp #112 PREEMPTLAZY 
 Tainted: [U]=USER, [W]=WARN, [O]=OOT_MODULE
 Hardware name: Google, Inc.                                                       Arcadia_IT_80/Arcadia_IT_80, BIOS 34.86.0-102 01/25/2026
 Call Trace:
  <TASK>
  dump_stack_lvl+0x54/0x70
  __lock_acquire+0x7de/0x2900
  reacquire_held_locks+0x107/0x160
  lock_release+0x177/0x360
  __mutex_unlock_slowpath+0x3c/0x2b0
  sev_mem_enc_ioctl+0x3c9/0x400 [kvm_amd]
  kvm_vm_ioctl+0x57c/0x5d0 [kvm]
  __se_sys_ioctl+0x6d/0xb0
  do_syscall_64+0xe8/0x920
  entry_SYSCALL_64_after_hwframe+0x4b/0x53
  </TASK>

  reply	other threads:[~2026-04-08 18:42 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-10 23:48 [PATCH 00/21] Fixes and lock cleanup+hardening Sean Christopherson
2026-03-10 23:48 ` [PATCH 01/21] KVM: selftests: Remove duplicate LAUNCH_UPDATE_VMSA call in SEV-ES migrate test Sean Christopherson
2026-03-10 23:48 ` [PATCH 02/21] KVM: SEV: Reject attempts to sync VMSA of an already-launched/encrypted vCPU Sean Christopherson
2026-03-10 23:48 ` [PATCH 03/21] KVM: SEV: Protect *all* of sev_mem_enc_register_region() with kvm->lock Sean Christopherson
2026-03-10 23:48 ` [PATCH 04/21] KVM: SEV: Disallow LAUNCH_FINISH if vCPUs are actively being created Sean Christopherson
2026-03-10 23:48 ` [PATCH 05/21] KVM: SEV: Lock all vCPUs when synchronzing VMSAs for SNP launch finish Sean Christopherson
2026-04-08 14:07   ` Aithal, Srikanth
2026-04-08 18:42     ` Sean Christopherson [this message]
2026-03-10 23:48 ` [PATCH 06/21] KVM: SEV: Lock all vCPUs for the duration of SEV-ES VMSA synchronization Sean Christopherson
2026-03-10 23:48 ` [PATCH 07/21] KVM: SEV: Provide vCPU-scoped accessors for detecting SEV+ guests Sean Christopherson
2026-03-10 23:48 ` [PATCH 08/21] KVM: SEV: Add quad-underscore version of VM-scoped APIs to detect " Sean Christopherson
2026-03-10 23:48 ` [PATCH 09/21] KVM: SEV: Document the SEV-ES check when querying SMM support as "safe" Sean Christopherson
2026-03-10 23:48 ` [PATCH 10/21] KVM: SEV: Move standard VM-scoped helpers to detect SEV+ guests to sev.c Sean Christopherson
2026-03-17 10:33   ` Alexander Potapenko
2026-03-31 18:42     ` Sean Christopherson
2026-03-10 23:48 ` [PATCH 11/21] KVM: SEV: Move SEV-specific VM initialization " Sean Christopherson
2026-03-10 23:48 ` [PATCH 12/21] KVM: SEV: WARN on unhandled VM type when initializing VM Sean Christopherson
2026-03-10 23:48 ` [PATCH 13/21] KVM: SEV: Hide "struct kvm_sev_info" behind CONFIG_KVM_AMD_SEV=y Sean Christopherson
2026-03-10 23:48 ` [PATCH 14/21] KVM: SEV: Document that checking for SEV+ guests when reclaiming memory is "safe" Sean Christopherson
2026-03-10 23:48 ` [PATCH 15/21] KVM: SEV: Assert that kvm->lock is held when querying SEV+ support Sean Christopherson
2026-03-10 23:48 ` [PATCH 16/21] KVM: SEV: use mutex guard in snp_launch_update() Sean Christopherson
2026-03-10 23:48 ` [PATCH 17/21] KVM: SEV: use mutex guard in sev_mem_enc_ioctl() Sean Christopherson
2026-03-10 23:48 ` [PATCH 18/21] KVM: SEV: use mutex guard in sev_mem_enc_unregister_region() Sean Christopherson
2026-03-10 23:48 ` [PATCH 19/21] KVM: SEV: use mutex guard in snp_handle_guest_req() Sean Christopherson
2026-03-10 23:48 ` [PATCH 20/21] KVM: SVM: Move lock-protected allocation of SEV ASID into a separate helper Sean Christopherson
2026-03-10 23:48 ` [PATCH 21/21] KVM: SEV: Goto an existing error label if charging misc_cg for an ASID fails Sean Christopherson
2026-03-11 14:29 ` [PATCH 00/21] Fixes and lock cleanup+hardening Jethro Beekman
2026-03-12 16:03   ` Sean Christopherson
2026-04-08  0:14 ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=adahqilc9IrIpIvr@google.com \
    --to=seanjc@google.com \
    --cc=Thomas.Lendacky@amd.com \
    --cc=clopez@suse.de \
    --cc=glider@google.com \
    --cc=jethro@fortanix.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=sraithal@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox