public inbox for linux-crypto@vger.kernel.org
 help / color / mirror / Atom feed
From: Tycho Andersen <tycho@kernel.org>
To: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	Ashish Kalra <ashish.kalra@amd.com>,
	Tom Lendacky <thomas.lendacky@amd.com>,
	John Allen <john.allen@amd.com>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	"David S. Miller" <davem@davemloft.net>,
	Ard Biesheuvel <ardb@kernel.org>,
	Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>,
	Kishon Vijay Abraham I <kvijayab@amd.com>,
	Alexey Kardashevskiy <aik@amd.com>,
	Nikunj A Dadhania <nikunj@amd.com>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Kim Phillips <kim.phillips@amd.com>,
	Sean Christopherson <seanjc@google.com>,
	linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org
Subject: Re: [PATCH v4 0/7] Move SNP initialization to the CCP driver
Date: Mon, 30 Mar 2026 09:38:49 -0600	[thread overview]
Message-ID: <acqZCYwNdepPnNMB@tycho.pizza> (raw)
In-Reply-To: <20260328113814.GSace9ppUByOyRrTFI@fat_crate.local>

On Sat, Mar 28, 2026 at 12:38:14PM +0100, Borislav Petkov wrote:
> > > Is there a race condition with CPU hotplug here?
> > >
> > > Since snp_prepare() lacks cpus_read_lock() protection, a CPU could
> > > come online exactly between the two passes, missing the mfd_enable step
> > > but receiving snp_enable.
> > 
> > I think it makes sense to do the operations on the same set of CPUs
> > even if we don't support hotplug. I will resend with
> > cpus_read_lock().
> 
> Right, especially if that function would run now at arbitrary points in time
> - as this is the main reason we're doing this whole dance.
> 
> BUT!
> 
> If you grab the hotplug lock and you realize that you don't have all CPUs
> online and since we zapped the hotplug notifier and since SNP enable would
> fail anyway, we should simply check if all CPUs are online and return error
> if not instead of running the IPIs.

Sure, I will send a patch on top with that.

> > > This isn't a bug introduced by this commit, but if SEV initialization
> > > fails and KVM is actively running normal VMs, could a userspace process
> > > trigger this code path via /dev/sev ioctls (e.g., SEV_PDH_GEN) and zero out
> > > MSR_VM_HSAVE_PA globally? Would the next VMRUN execution for an active VM
> > > trigger a general protection fault and crash the host?
> > 
> > Oof, yes. I wonder if we shouldn't set psp_dead = true if
> > sev_platform_init() sees an error. After this series, if
> > the error is e.g. init_ex_path failure, you can unload, fix the
> > failure, and try again.
> 
> Let's slow down here.
> 
> So the LLM is talking about a use case where you have unencrypted VMs running
> and then userspace can go and poke /dev/sev, zero out that MSR_VM_HSAVE_PA in
> the process but that's the MSR which contains the physical address where host
> state is saved on a VMRUN and if that MSR is cleared because SNP init needs
> it, the machine would explode.
> 
> Ok, so far so good, I don't see anything wrong with the use case - nothing's
> stopping the admin from modprobing ccp and then launching SNP guests.
> 
> Now, you're talking about some psp_dead - yet another silly boolean folks love
> to introduce in the SEV code - and then something about that init_ex_path
> hack. I don't know what that means, frankly.
> 
> What my simple intuition says is, *if* snp_prepare() runs,  then no guests
> should do VMRUN anymore until they're ready to do that again.
> 
> Which begs the question: if snp_prepare() clears MSR_VM_HSAVE_PA, how can you
> even run normal VMs after that?

IIUC, the normal flow is:

1. amd_iommu_init() -> snp_rmptable_init() // previously snp_prepare()
1. kvm load
1. ccp load, /dev/sev created  // now snp_prepare()
1. kvm_amd load, sev_init()
1. kvm_x86_vendor_init() -> sev_hardware_setup()
1. kvm_init() -> kvm_arch_enable_virtualization_cpu() ->
       svm_enable_virtualization_cpu() -> HSAVE_PA = $real_pa

So the happy path works fine. The problem is if e.g. the first
snp_prepare() fails, userspace can do ioctl(/dev/sev, SEV_PDH_GEN, ...)
or try to start an SNP VM, which will unconditionally do
sev_platform_init(). Both of those trigger the snp_prepare() again,
and you can't run normal VMs.

IMO if you fail SNP init once, you probably will again. I was
suggesting setting psp_dead to just kill everything.

But the real issue is that late-SNP initialization is problematic.
I'll see if there's some way we can figure out if we're in that path
and forbid it.

> > >  	if (sev_version_greater_or_equal(SNP_MIN_API_MAJOR, 52)) {
> > [ ... ]
> > >  		memset(&data, 0, sizeof(data));
> > [ ... ]
> > >  		data.tio_en = tio_supp && sev_tio_enabled && amd_iommu_sev_tio_supported();
> > [ ... ]
> > >  	} else {
> > >  		cmd = SEV_CMD_SNP_INIT;
> > >  		arg = NULL;
> > >  	}
> > > This isn't a bug introduced by this commit, but is the stack variable
> > > data left uninitialized when taking the else branch? Since data.tio_en is
> > > later evaluated unconditionally, could stack garbage cause it to evaluate
> > > to true, leading to erroneous attempts to allocate pages and initialize
> > > SEV-TIO on unsupported hardware?
> > 
> > No, arg is the actual pointer passed, and it is set to NULL. non-EX
> > init doesn't support TIO anyway...
> 
> This code is a total mess.
> 
> struct sev_data_snp_init_ex data;
> ...
> 
> ... the else branch executes so you do
> 
> 	arg = NULL;
> 
> ...
> 
> and now *after* it, you're testing data:
> 
>         dev_dbg(sev->dev, "SEV-SNP firmware initialized, SEV-TIO is %s\n",
>                 data.tio_en ? "enabled" : "disabled");
> 
> Which *is* uninitialized stack data.
> 
> So the AI is right AFAICT.

do'h, yes, I glossed over the printk(), thanks.

> If I were the AI, I'd say, what a total mess this code is. This
> __sev_snp_init_locked() thing needs serious cleanup because it is too
> convoluted to exist. And silly bugs like that creep in, as a result.
> 
> If I were maintaining this, I'd enforce a mandatory driver cleanup before any
> new features come in. For example, __sev_snp_init_locked() needs proper
> splitting and streamlining instead of doing gazillion things and with
> a conditional in it which has consequences about the code after it. :-\
> 
> > > Also, regarding the bounds check in snp_filter_reserved_mem_regions()
> > > called via walk_iomem_res_desc(): does the check
> > > if ((range_list->num_elements * 16 + 8) > PAGE_SIZE)
> > > allow an off-by-one heap buffer overflow?
> > >
> > > If range_list->num_elements is 255, 255 * 16 + 8 = 4088, which is <= 4096.
> > > Writing range->base (8 bytes) fills 4088-4095, but writing range->page_count
> > > (4 bytes) would write to 4096-4099, overflowing the kzalloc-allocated
> > > PAGE_SIZE buffer.
> 
> That's a good catch.
> 
> > Yes, this also looks real, and needs:
> > 
> > diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
> > index 939fa8aa155c..3642226c0fc0 100644
> > --- a/drivers/crypto/ccp/sev-dev.c
> > +++ b/drivers/crypto/ccp/sev-dev.c
> > @@ -1328,10 +1328,11 @@ static int snp_filter_reserved_mem_regions(struct resource *rs, void *arg)
> >  	size_t size;
> >  
> >  	/*
> > -	 * Ensure the list of HV_FIXED pages that will be passed to firmware
> > -	 * do not exceed the page-sized argument buffer.
> > +	 * Ensure the list of HV_FIXED pages including the one we are about to
> 
> No "we" - use passive voice pls.

Thanks, will fix.

> > +	 * use that will be passed to firmware do not exceed the page-sized
> > +	 * argument buffer.
> >  	 */
> > -	if ((range_list->num_elements * sizeof(struct sev_data_range) +
> > +	if (((range_list->num_elements + 1) * sizeof(struct sev_data_range) +
> >  	     sizeof(struct sev_data_range_list)) > PAGE_SIZE)
> >  		return -E2BIG;
> 
> Yes, this is a short-term fix for stable.
> 
> But that "handling" in there is just nuts. You have this:
> 
> 	snp_range_list = kzalloc(PAGE_SIZE, GFP_KERNEL);
> 
> 	...
> 
>         rc = walk_iomem_res_desc(IORES_DESC_NONE, IORESOURCE_MEM, 0, ~0,
> 				  snp_range_list, snp_filter_reserved_mem_regions);
> 
> That function calls
> 
> 		snp_filter_reserved_mem_regions(resource *, snp_range_list);
> 
> and that resource walking BIOS-like yuck code is iterating over the resources
> and calling ->func each time.
> 
> So we pass in a *page* but then that range list *array* we turn it into, is
> not a multiple of the element size of 24 AFAICT.
> 
> So that last element can trail and overflow heap. Lovely.
> 
> So this thing needs complete change: *actually* pass in an array instead of
> a page so that you're not trailing, check the current element index against
> the array size instead of doing obscure struct size calculations which are
> visible only to very motivated reviewers like an AI and then just get rid of
> the overflow possibility in the first place.
> 
> > I have another bug that review-prompts found unrelated to this series.
> > I can put the two fixes above with that or include them here, let me
> > know what you prefer. Either way I'll resend one more with
> > cpus_read_lock().
> 
> So, your set is kinda ready to go and I'll take it but if I were you, right
> after this, I'll sit down and fix all that crap in sev-dev.c. Do a nice
> patchset, simple backportable fixes first and proper refactoring and cleanup
> ontop.

Thanks for applying it. I have a stack of patches from this and my own
review-prompts use that I'm testing now, but they are just the simple
fixes. I'll work on trying to make things a little for the future
too...

Tycho

      reply	other threads:[~2026-03-30 15:38 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-24 16:12 [PATCH v4 0/7] Move SNP initialization to the CCP driver Tycho Andersen
2026-03-24 16:12 ` [PATCH v4 1/7] x86/sev: Create a function to clear/zero the RMP Tycho Andersen
2026-03-24 16:12 ` [PATCH v4 2/7] x86/sev: Create snp_prepare() Tycho Andersen
2026-03-24 16:12 ` [PATCH v4 3/7] x86/sev: Create snp_shutdown() Tycho Andersen
2026-03-24 16:12 ` [PATCH v4 4/7] x86/sev, crypto/ccp: Move SNP init to ccp driver Tycho Andersen
2026-03-24 16:12 ` [PATCH v4 5/7] x86/sev, crypto/ccp: Move HSAVE_PA setup to arch/x86/ Tycho Andersen
2026-03-24 16:13 ` [PATCH v4 6/7] crypto/ccp: Implement SNP x86 shutdown Tycho Andersen
2026-03-24 16:13 ` [PATCH v4 7/7] crypto/ccp: Update HV_FIXED page states to allow freeing of memory Tycho Andersen
2026-03-25  9:07 ` [PATCH v4 0/7] Move SNP initialization to the CCP driver Borislav Petkov
2026-03-25 15:25   ` Tycho Andersen
2026-03-28 11:38     ` Borislav Petkov
2026-03-30 15:38       ` Tycho Andersen [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=acqZCYwNdepPnNMB@tycho.pizza \
    --to=tycho@kernel.org \
    --cc=Neeraj.Upadhyay@amd.com \
    --cc=aik@amd.com \
    --cc=ardb@kernel.org \
    --cc=ashish.kalra@amd.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=davem@davemloft.net \
    --cc=herbert@gondor.apana.org.au \
    --cc=hpa@zytor.com \
    --cc=john.allen@amd.com \
    --cc=kim.phillips@amd.com \
    --cc=kvijayab@amd.com \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=nikunj@amd.com \
    --cc=peterz@infradead.org \
    --cc=seanjc@google.com \
    --cc=tglx@kernel.org \
    --cc=thomas.lendacky@amd.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox