All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tycho Andersen <tycho@kernel.org>
To: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	Ashish Kalra <ashish.kalra@amd.com>,
	Tom Lendacky <thomas.lendacky@amd.com>,
	John Allen <john.allen@amd.com>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	"David S. Miller" <davem@davemloft.net>,
	Ard Biesheuvel <ardb@kernel.org>,
	Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>,
	Kishon Vijay Abraham I <kvijayab@amd.com>,
	Alexey Kardashevskiy <aik@amd.com>,
	Nikunj A Dadhania <nikunj@amd.com>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Kim Phillips <kim.phillips@amd.com>,
	Sean Christopherson <seanjc@google.com>,
	linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org
Subject: Re: [PATCH v4 0/7] Move SNP initialization to the CCP driver
Date: Mon, 30 Mar 2026 09:38:49 -0600	[thread overview]
Message-ID: <acqZCYwNdepPnNMB@tycho.pizza> (raw)
In-Reply-To: <20260328113814.GSace9ppUByOyRrTFI@fat_crate.local>

On Sat, Mar 28, 2026 at 12:38:14PM +0100, Borislav Petkov wrote:
> > > Is there a race condition with CPU hotplug here?
> > >
> > > Since snp_prepare() lacks cpus_read_lock() protection, a CPU could
> > > come online exactly between the two passes, missing the mfd_enable step
> > > but receiving snp_enable.
> > 
> > I think it makes sense to do the operations on the same set of CPUs
> > even if we don't support hotplug. I will resend with
> > cpus_read_lock().
> 
> Right, especially if that function would run now at arbitrary points in time
> - as this is the main reason we're doing this whole dance.
> 
> BUT!
> 
> If you grab the hotplug lock and you realize that you don't have all CPUs
> online and since we zapped the hotplug notifier and since SNP enable would
> fail anyway, we should simply check if all CPUs are online and return error
> if not instead of running the IPIs.

Sure, I will send a patch on top with that.

> > > This isn't a bug introduced by this commit, but if SEV initialization
> > > fails and KVM is actively running normal VMs, could a userspace process
> > > trigger this code path via /dev/sev ioctls (e.g., SEV_PDH_GEN) and zero out
> > > MSR_VM_HSAVE_PA globally? Would the next VMRUN execution for an active VM
> > > trigger a general protection fault and crash the host?
> > 
> > Oof, yes. I wonder if we shouldn't set psp_dead = true if
> > sev_platform_init() sees an error. After this series, if
> > the error is e.g. init_ex_path failure, you can unload, fix the
> > failure, and try again.
> 
> Let's slow down here.
> 
> So the LLM is talking about a use case where you have unencrypted VMs running
> and then userspace can go and poke /dev/sev, zero out that MSR_VM_HSAVE_PA in
> the process but that's the MSR which contains the physical address where host
> state is saved on a VMRUN and if that MSR is cleared because SNP init needs
> it, the machine would explode.
> 
> Ok, so far so good, I don't see anything wrong with the use case - nothing's
> stopping the admin from modprobing ccp and then launching SNP guests.
> 
> Now, you're talking about some psp_dead - yet another silly boolean folks love
> to introduce in the SEV code - and then something about that init_ex_path
> hack. I don't know what that means, frankly.
> 
> What my simple intuition says is, *if* snp_prepare() runs,  then no guests
> should do VMRUN anymore until they're ready to do that again.
> 
> Which begs the question: if snp_prepare() clears MSR_VM_HSAVE_PA, how can you
> even run normal VMs after that?

IIUC, the normal flow is:

1. amd_iommu_init() -> snp_rmptable_init() // previously snp_prepare()
1. kvm load
1. ccp load, /dev/sev created  // now snp_prepare()
1. kvm_amd load, sev_init()
1. kvm_x86_vendor_init() -> sev_hardware_setup()
1. kvm_init() -> kvm_arch_enable_virtualization_cpu() ->
       svm_enable_virtualization_cpu() -> HSAVE_PA = $real_pa

So the happy path works fine. The problem is if e.g. the first
snp_prepare() fails, userspace can do ioctl(/dev/sev, SEV_PDH_GEN, ...)
or try to start an SNP VM, which will unconditionally do
sev_platform_init(). Both of those trigger the snp_prepare() again,
and you can't run normal VMs.

IMO if you fail SNP init once, you probably will again. I was
suggesting setting psp_dead to just kill everything.

But the real issue is that late-SNP initialization is problematic.
I'll see if there's some way we can figure out if we're in that path
and forbid it.

> > >  	if (sev_version_greater_or_equal(SNP_MIN_API_MAJOR, 52)) {
> > [ ... ]
> > >  		memset(&data, 0, sizeof(data));
> > [ ... ]
> > >  		data.tio_en = tio_supp && sev_tio_enabled && amd_iommu_sev_tio_supported();
> > [ ... ]
> > >  	} else {
> > >  		cmd = SEV_CMD_SNP_INIT;
> > >  		arg = NULL;
> > >  	}
> > > This isn't a bug introduced by this commit, but is the stack variable
> > > data left uninitialized when taking the else branch? Since data.tio_en is
> > > later evaluated unconditionally, could stack garbage cause it to evaluate
> > > to true, leading to erroneous attempts to allocate pages and initialize
> > > SEV-TIO on unsupported hardware?
> > 
> > No, arg is the actual pointer passed, and it is set to NULL. non-EX
> > init doesn't support TIO anyway...
> 
> This code is a total mess.
> 
> struct sev_data_snp_init_ex data;
> ...
> 
> ... the else branch executes so you do
> 
> 	arg = NULL;
> 
> ...
> 
> and now *after* it, you're testing data:
> 
>         dev_dbg(sev->dev, "SEV-SNP firmware initialized, SEV-TIO is %s\n",
>                 data.tio_en ? "enabled" : "disabled");
> 
> Which *is* uninitialized stack data.
> 
> So the AI is right AFAICT.

do'h, yes, I glossed over the printk(), thanks.

> If I were the AI, I'd say, what a total mess this code is. This
> __sev_snp_init_locked() thing needs serious cleanup because it is too
> convoluted to exist. And silly bugs like that creep in, as a result.
> 
> If I were maintaining this, I'd enforce a mandatory driver cleanup before any
> new features come in. For example, __sev_snp_init_locked() needs proper
> splitting and streamlining instead of doing gazillion things and with
> a conditional in it which has consequences about the code after it. :-\
> 
> > > Also, regarding the bounds check in snp_filter_reserved_mem_regions()
> > > called via walk_iomem_res_desc(): does the check
> > > if ((range_list->num_elements * 16 + 8) > PAGE_SIZE)
> > > allow an off-by-one heap buffer overflow?
> > >
> > > If range_list->num_elements is 255, 255 * 16 + 8 = 4088, which is <= 4096.
> > > Writing range->base (8 bytes) fills 4088-4095, but writing range->page_count
> > > (4 bytes) would write to 4096-4099, overflowing the kzalloc-allocated
> > > PAGE_SIZE buffer.
> 
> That's a good catch.
> 
> > Yes, this also looks real, and needs:
> > 
> > diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
> > index 939fa8aa155c..3642226c0fc0 100644
> > --- a/drivers/crypto/ccp/sev-dev.c
> > +++ b/drivers/crypto/ccp/sev-dev.c
> > @@ -1328,10 +1328,11 @@ static int snp_filter_reserved_mem_regions(struct resource *rs, void *arg)
> >  	size_t size;
> >  
> >  	/*
> > -	 * Ensure the list of HV_FIXED pages that will be passed to firmware
> > -	 * do not exceed the page-sized argument buffer.
> > +	 * Ensure the list of HV_FIXED pages including the one we are about to
> 
> No "we" - use passive voice pls.

Thanks, will fix.

> > +	 * use that will be passed to firmware do not exceed the page-sized
> > +	 * argument buffer.
> >  	 */
> > -	if ((range_list->num_elements * sizeof(struct sev_data_range) +
> > +	if (((range_list->num_elements + 1) * sizeof(struct sev_data_range) +
> >  	     sizeof(struct sev_data_range_list)) > PAGE_SIZE)
> >  		return -E2BIG;
> 
> Yes, this is a short-term fix for stable.
> 
> But that "handling" in there is just nuts. You have this:
> 
> 	snp_range_list = kzalloc(PAGE_SIZE, GFP_KERNEL);
> 
> 	...
> 
>         rc = walk_iomem_res_desc(IORES_DESC_NONE, IORESOURCE_MEM, 0, ~0,
> 				  snp_range_list, snp_filter_reserved_mem_regions);
> 
> That function calls
> 
> 		snp_filter_reserved_mem_regions(resource *, snp_range_list);
> 
> and that resource walking BIOS-like yuck code is iterating over the resources
> and calling ->func each time.
> 
> So we pass in a *page* but then that range list *array* we turn it into, is
> not a multiple of the element size of 24 AFAICT.
> 
> So that last element can trail and overflow heap. Lovely.
> 
> So this thing needs complete change: *actually* pass in an array instead of
> a page so that you're not trailing, check the current element index against
> the array size instead of doing obscure struct size calculations which are
> visible only to very motivated reviewers like an AI and then just get rid of
> the overflow possibility in the first place.
> 
> > I have another bug that review-prompts found unrelated to this series.
> > I can put the two fixes above with that or include them here, let me
> > know what you prefer. Either way I'll resend one more with
> > cpus_read_lock().
> 
> So, your set is kinda ready to go and I'll take it but if I were you, right
> after this, I'll sit down and fix all that crap in sev-dev.c. Do a nice
> patchset, simple backportable fixes first and proper refactoring and cleanup
> ontop.

Thanks for applying it. I have a stack of patches from this and my own
review-prompts use that I'm testing now, but they are just the simple
fixes. I'll work on trying to make things a little for the future
too...

Tycho

      reply	other threads:[~2026-03-30 15:38 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-24 16:12 [PATCH v4 0/7] Move SNP initialization to the CCP driver Tycho Andersen
2026-03-24 16:12 ` [PATCH v4 1/7] x86/sev: Create a function to clear/zero the RMP Tycho Andersen
2026-03-30 10:46   ` [tip: x86/sev] " tip-bot2 for Tom Lendacky
2026-03-24 16:12 ` [PATCH v4 2/7] x86/sev: Create snp_prepare() Tycho Andersen
2026-03-24 16:12 ` [PATCH v4 3/7] x86/sev: Create snp_shutdown() Tycho Andersen
2026-03-30 10:46   ` [tip: x86/sev] " tip-bot2 for Tycho Andersen (AMD)
2026-03-24 16:12 ` [PATCH v4 4/7] x86/sev, crypto/ccp: Move SNP init to ccp driver Tycho Andersen
2026-03-30 10:45   ` [tip: x86/sev] " tip-bot2 for Tycho Andersen (AMD)
2026-03-24 16:12 ` [PATCH v4 5/7] x86/sev, crypto/ccp: Move HSAVE_PA setup to arch/x86/ Tycho Andersen
2026-03-30 10:45   ` [tip: x86/sev] " tip-bot2 for Tycho Andersen (AMD)
2026-03-24 16:13 ` [PATCH v4 6/7] crypto/ccp: Implement SNP x86 shutdown Tycho Andersen
2026-03-30 10:45   ` [tip: x86/sev] " tip-bot2 for Tycho Andersen (AMD)
2026-03-24 16:13 ` [PATCH v4 7/7] crypto/ccp: Update HV_FIXED page states to allow freeing of memory Tycho Andersen
2026-03-30 10:45   ` [tip: x86/sev] " tip-bot2 for Tom Lendacky
2026-03-25  9:07 ` [PATCH v4 0/7] Move SNP initialization to the CCP driver Borislav Petkov
2026-03-25 15:25   ` Tycho Andersen
2026-03-28 11:38     ` Borislav Petkov
2026-03-30 15:38       ` Tycho Andersen [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=acqZCYwNdepPnNMB@tycho.pizza \
    --to=tycho@kernel.org \
    --cc=Neeraj.Upadhyay@amd.com \
    --cc=aik@amd.com \
    --cc=ardb@kernel.org \
    --cc=ashish.kalra@amd.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=davem@davemloft.net \
    --cc=herbert@gondor.apana.org.au \
    --cc=hpa@zytor.com \
    --cc=john.allen@amd.com \
    --cc=kim.phillips@amd.com \
    --cc=kvijayab@amd.com \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=nikunj@amd.com \
    --cc=peterz@infradead.org \
    --cc=seanjc@google.com \
    --cc=tglx@kernel.org \
    --cc=thomas.lendacky@amd.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.