* [PATCH] crypto/ccp: Fix locking on alloc failure handling @ 2025-06-17 9:43 Alexey Kardashevskiy 2025-06-20 19:20 ` Tom Lendacky ` (2 more replies) 0 siblings, 3 replies; 6+ messages in thread From: Alexey Kardashevskiy @ 2025-06-17 9:43 UTC (permalink / raw) To: linux-crypto Cc: linux-kernel, Ashish Kalra, Tom Lendacky, John Allen, Herbert Xu, Borislav Petkov (AMD), Michael Roth, Alexey Kardashevskiy The __snp_alloc_firmware_pages() helper allocates pages in the firmware state (alloc + rmpupdate). In case of failed rmpupdate, it tries reclaiming pages with already changed state. This requires calling the PSP firmware and since there is sev_cmd_mutex to guard such calls, the helper takes a "locked" parameter so specify if the lock needs to be held. Most calls happen from snp_alloc_firmware_page() which executes without the lock. However commit 24512afa4336 ("crypto: ccp: Handle the legacy TMR allocation when SNP is enabled") switched sev_fw_alloc() from alloc_pages() (which does not call the PSP) to __snp_alloc_firmware_pages() (which does) but did not account for the fact that sev_fw_alloc() is called from __sev_platform_init_locked() (via __sev_platform_init_handle_tmr()) and executes with the lock held. Add a "locked" parameter to __snp_alloc_firmware_pages(). Make sev_fw_alloc() use the new parameter to prevent potential deadlock in rmp_mark_pages_firmware() if rmpupdate() failed. Fixes: 24512afa4336 ("crypto: ccp: Handle the legacy TMR allocation when SNP is enabled") Signed-off-by: Alexey Kardashevskiy <aik@amd.com> --- drivers/crypto/ccp/sev-dev.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c index 3451bada884e..16a11d5efe46 100644 --- a/drivers/crypto/ccp/sev-dev.c +++ b/drivers/crypto/ccp/sev-dev.c @@ -434,7 +434,7 @@ static int rmp_mark_pages_firmware(unsigned long paddr, unsigned int npages, boo return rc; } -static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order) +static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order, bool locked) { unsigned long npages = 1ul << order, paddr; struct sev_device *sev; @@ -453,7 +453,7 @@ static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order) return page; paddr = __pa((unsigned long)page_address(page)); - if (rmp_mark_pages_firmware(paddr, npages, false)) + if (rmp_mark_pages_firmware(paddr, npages, locked)) return NULL; return page; @@ -463,7 +463,7 @@ void *snp_alloc_firmware_page(gfp_t gfp_mask) { struct page *page; - page = __snp_alloc_firmware_pages(gfp_mask, 0); + page = __snp_alloc_firmware_pages(gfp_mask, 0, false); return page ? page_address(page) : NULL; } @@ -498,7 +498,7 @@ static void *sev_fw_alloc(unsigned long len) { struct page *page; - page = __snp_alloc_firmware_pages(GFP_KERNEL, get_order(len)); + page = __snp_alloc_firmware_pages(GFP_KERNEL, get_order(len), true); if (!page) return NULL; -- 2.49.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] crypto/ccp: Fix locking on alloc failure handling 2025-06-17 9:43 [PATCH] crypto/ccp: Fix locking on alloc failure handling Alexey Kardashevskiy @ 2025-06-20 19:20 ` Tom Lendacky 2025-06-24 1:41 ` Alexey Kardashevskiy 2025-06-26 14:50 ` Pratik R. Sampat 2025-07-07 3:29 ` Herbert Xu 2 siblings, 1 reply; 6+ messages in thread From: Tom Lendacky @ 2025-06-20 19:20 UTC (permalink / raw) To: Alexey Kardashevskiy, linux-crypto Cc: linux-kernel, Ashish Kalra, John Allen, Herbert Xu, Borislav Petkov (AMD), Michael Roth On 6/17/25 04:43, Alexey Kardashevskiy wrote: > The __snp_alloc_firmware_pages() helper allocates pages in the firmware > state (alloc + rmpupdate). In case of failed rmpupdate, it tries > reclaiming pages with already changed state. This requires calling > the PSP firmware and since there is sev_cmd_mutex to guard such calls, > the helper takes a "locked" parameter so specify if the lock needs to > be held. > > Most calls happen from snp_alloc_firmware_page() which executes without > the lock. However > > commit 24512afa4336 ("crypto: ccp: Handle the legacy TMR allocation when SNP is enabled") > > switched sev_fw_alloc() from alloc_pages() (which does not call the PSP) to > __snp_alloc_firmware_pages() (which does) but did not account for the fact > that sev_fw_alloc() is called from __sev_platform_init_locked() > (via __sev_platform_init_handle_tmr()) and executes with the lock held. > > Add a "locked" parameter to __snp_alloc_firmware_pages(). > Make sev_fw_alloc() use the new parameter to prevent potential deadlock in > rmp_mark_pages_firmware() if rmpupdate() failed. Would it make sense to add the locked parameter to sev_fw_alloc(), too? Right now there is only one caller of sev_fw_alloc(), but in the future, if some other path should call sev_fw_alloc() and that path doesn't have the lock, then we'll miss taking it. Thanks, Tom > > Fixes: 24512afa4336 ("crypto: ccp: Handle the legacy TMR allocation when SNP is enabled") > Signed-off-by: Alexey Kardashevskiy <aik@amd.com> > --- > drivers/crypto/ccp/sev-dev.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c > index 3451bada884e..16a11d5efe46 100644 > --- a/drivers/crypto/ccp/sev-dev.c > +++ b/drivers/crypto/ccp/sev-dev.c > @@ -434,7 +434,7 @@ static int rmp_mark_pages_firmware(unsigned long paddr, unsigned int npages, boo > return rc; > } > > -static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order) > +static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order, bool locked) > { > unsigned long npages = 1ul << order, paddr; > struct sev_device *sev; > @@ -453,7 +453,7 @@ static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order) > return page; > > paddr = __pa((unsigned long)page_address(page)); > - if (rmp_mark_pages_firmware(paddr, npages, false)) > + if (rmp_mark_pages_firmware(paddr, npages, locked)) > return NULL; > > return page; > @@ -463,7 +463,7 @@ void *snp_alloc_firmware_page(gfp_t gfp_mask) > { > struct page *page; > > - page = __snp_alloc_firmware_pages(gfp_mask, 0); > + page = __snp_alloc_firmware_pages(gfp_mask, 0, false); > > return page ? page_address(page) : NULL; > } > @@ -498,7 +498,7 @@ static void *sev_fw_alloc(unsigned long len) > { > struct page *page; > > - page = __snp_alloc_firmware_pages(GFP_KERNEL, get_order(len)); > + page = __snp_alloc_firmware_pages(GFP_KERNEL, get_order(len), true); > if (!page) > return NULL; > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] crypto/ccp: Fix locking on alloc failure handling 2025-06-20 19:20 ` Tom Lendacky @ 2025-06-24 1:41 ` Alexey Kardashevskiy 2025-06-24 15:20 ` Tom Lendacky 0 siblings, 1 reply; 6+ messages in thread From: Alexey Kardashevskiy @ 2025-06-24 1:41 UTC (permalink / raw) To: Tom Lendacky, linux-crypto Cc: linux-kernel, Ashish Kalra, John Allen, Herbert Xu, Borislav Petkov (AMD), Michael Roth On 21/6/25 05:20, Tom Lendacky wrote: > On 6/17/25 04:43, Alexey Kardashevskiy wrote: >> The __snp_alloc_firmware_pages() helper allocates pages in the firmware >> state (alloc + rmpupdate). In case of failed rmpupdate, it tries >> reclaiming pages with already changed state. This requires calling >> the PSP firmware and since there is sev_cmd_mutex to guard such calls, >> the helper takes a "locked" parameter so specify if the lock needs to >> be held. >> >> Most calls happen from snp_alloc_firmware_page() which executes without >> the lock. However >> >> commit 24512afa4336 ("crypto: ccp: Handle the legacy TMR allocation when SNP is enabled") >> >> switched sev_fw_alloc() from alloc_pages() (which does not call the PSP) to >> __snp_alloc_firmware_pages() (which does) but did not account for the fact >> that sev_fw_alloc() is called from __sev_platform_init_locked() >> (via __sev_platform_init_handle_tmr()) and executes with the lock held. >> >> Add a "locked" parameter to __snp_alloc_firmware_pages(). >> Make sev_fw_alloc() use the new parameter to prevent potential deadlock in >> rmp_mark_pages_firmware() if rmpupdate() failed. > > Would it make sense to add the locked parameter to sev_fw_alloc(), too? That would be another patch then, this one is a fix ;) and I'd probably just ditch both snp_alloc_firmware_page() and sev_fw_alloc(), rename __snp_alloc_firmware_pages() to snp_alloc_firmware_page() and just use this one everywhere. Nobody needs page struct anyway, and the locking will be clear everywhere. Also do the same for snp_free_firmware_page(). It is just that snp_alloc_firmware_page() and snp_free_firmware_page() are EXPORT_SYMBOL_GPL, > Right now there is only one caller of sev_fw_alloc(), but in the future, > if some other path should call sev_fw_alloc() and that path doesn't have > the lock, then we'll miss taking it. I'd rather just ditch sev_fw_alloc(), does not look very useful. Thanks, > Thanks, > Tom > >> >> Fixes: 24512afa4336 ("crypto: ccp: Handle the legacy TMR allocation when SNP is enabled") >> Signed-off-by: Alexey Kardashevskiy <aik@amd.com> >> --- >> drivers/crypto/ccp/sev-dev.c | 8 ++++---- >> 1 file changed, 4 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c >> index 3451bada884e..16a11d5efe46 100644 >> --- a/drivers/crypto/ccp/sev-dev.c >> +++ b/drivers/crypto/ccp/sev-dev.c >> @@ -434,7 +434,7 @@ static int rmp_mark_pages_firmware(unsigned long paddr, unsigned int npages, boo >> return rc; >> } >> >> -static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order) >> +static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order, bool locked) >> { >> unsigned long npages = 1ul << order, paddr; >> struct sev_device *sev; >> @@ -453,7 +453,7 @@ static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order) >> return page; >> >> paddr = __pa((unsigned long)page_address(page)); >> - if (rmp_mark_pages_firmware(paddr, npages, false)) >> + if (rmp_mark_pages_firmware(paddr, npages, locked)) >> return NULL; >> >> return page; >> @@ -463,7 +463,7 @@ void *snp_alloc_firmware_page(gfp_t gfp_mask) >> { >> struct page *page; >> >> - page = __snp_alloc_firmware_pages(gfp_mask, 0); >> + page = __snp_alloc_firmware_pages(gfp_mask, 0, false); >> >> return page ? page_address(page) : NULL; >> } >> @@ -498,7 +498,7 @@ static void *sev_fw_alloc(unsigned long len) >> { >> struct page *page; >> >> - page = __snp_alloc_firmware_pages(GFP_KERNEL, get_order(len)); >> + page = __snp_alloc_firmware_pages(GFP_KERNEL, get_order(len), true); >> if (!page) >> return NULL; >> -- Alexey ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] crypto/ccp: Fix locking on alloc failure handling 2025-06-24 1:41 ` Alexey Kardashevskiy @ 2025-06-24 15:20 ` Tom Lendacky 0 siblings, 0 replies; 6+ messages in thread From: Tom Lendacky @ 2025-06-24 15:20 UTC (permalink / raw) To: Alexey Kardashevskiy, linux-crypto Cc: linux-kernel, Ashish Kalra, John Allen, Herbert Xu, Borislav Petkov (AMD), Michael Roth On 6/23/25 20:41, Alexey Kardashevskiy wrote: > On 21/6/25 05:20, Tom Lendacky wrote: >> On 6/17/25 04:43, Alexey Kardashevskiy wrote: >>> The __snp_alloc_firmware_pages() helper allocates pages in the firmware >>> state (alloc + rmpupdate). In case of failed rmpupdate, it tries >>> reclaiming pages with already changed state. This requires calling >>> the PSP firmware and since there is sev_cmd_mutex to guard such calls, >>> the helper takes a "locked" parameter so specify if the lock needs to >>> be held. >>> >>> Most calls happen from snp_alloc_firmware_page() which executes without >>> the lock. However >>> >>> commit 24512afa4336 ("crypto: ccp: Handle the legacy TMR allocation >>> when SNP is enabled") >>> >>> switched sev_fw_alloc() from alloc_pages() (which does not call the >>> PSP) to >>> __snp_alloc_firmware_pages() (which does) but did not account for the fact >>> that sev_fw_alloc() is called from __sev_platform_init_locked() >>> (via __sev_platform_init_handle_tmr()) and executes with the lock held. >>> >>> Add a "locked" parameter to __snp_alloc_firmware_pages(). >>> Make sev_fw_alloc() use the new parameter to prevent potential deadlock in >>> rmp_mark_pages_firmware() if rmpupdate() failed. >> >> Would it make sense to add the locked parameter to sev_fw_alloc(), too? > > That would be another patch then, this one is a fix ;) > > and I'd probably just ditch both snp_alloc_firmware_page() and > sev_fw_alloc(), rename __snp_alloc_firmware_pages() to > snp_alloc_firmware_page() and just use this one everywhere. Nobody needs > page struct anyway, and the locking will be clear everywhere. Also do the > same for snp_free_firmware_page(). > > It is just that snp_alloc_firmware_page() and snp_free_firmware_page() are > EXPORT_SYMBOL_GPL, > >> Right now there is only one caller of sev_fw_alloc(), but in the future, >> if some other path should call sev_fw_alloc() and that path doesn't have >> the lock, then we'll miss taking it. > > I'd rather just ditch sev_fw_alloc(), does not look very useful. Thanks, Ok, I'm good with this patch for the fix. Can you also provide a follow up patch(es) to address what you've discussed? For the fix: Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Thanks, Tom > > > >> Thanks, >> Tom >> >>> >>> Fixes: 24512afa4336 ("crypto: ccp: Handle the legacy TMR allocation >>> when SNP is enabled") >>> Signed-off-by: Alexey Kardashevskiy <aik@amd.com> >>> --- >>> drivers/crypto/ccp/sev-dev.c | 8 ++++---- >>> 1 file changed, 4 insertions(+), 4 deletions(-) >>> >>> diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c >>> index 3451bada884e..16a11d5efe46 100644 >>> --- a/drivers/crypto/ccp/sev-dev.c >>> +++ b/drivers/crypto/ccp/sev-dev.c >>> @@ -434,7 +434,7 @@ static int rmp_mark_pages_firmware(unsigned long >>> paddr, unsigned int npages, boo >>> return rc; >>> } >>> -static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int >>> order) >>> +static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int >>> order, bool locked) >>> { >>> unsigned long npages = 1ul << order, paddr; >>> struct sev_device *sev; >>> @@ -453,7 +453,7 @@ static struct page >>> *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order) >>> return page; >>> paddr = __pa((unsigned long)page_address(page)); >>> - if (rmp_mark_pages_firmware(paddr, npages, false)) >>> + if (rmp_mark_pages_firmware(paddr, npages, locked)) >>> return NULL; >>> return page; >>> @@ -463,7 +463,7 @@ void *snp_alloc_firmware_page(gfp_t gfp_mask) >>> { >>> struct page *page; >>> - page = __snp_alloc_firmware_pages(gfp_mask, 0); >>> + page = __snp_alloc_firmware_pages(gfp_mask, 0, false); >>> return page ? page_address(page) : NULL; >>> } >>> @@ -498,7 +498,7 @@ static void *sev_fw_alloc(unsigned long len) >>> { >>> struct page *page; >>> - page = __snp_alloc_firmware_pages(GFP_KERNEL, get_order(len)); >>> + page = __snp_alloc_firmware_pages(GFP_KERNEL, get_order(len), true); >>> if (!page) >>> return NULL; >>> > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] crypto/ccp: Fix locking on alloc failure handling 2025-06-17 9:43 [PATCH] crypto/ccp: Fix locking on alloc failure handling Alexey Kardashevskiy 2025-06-20 19:20 ` Tom Lendacky @ 2025-06-26 14:50 ` Pratik R. Sampat 2025-07-07 3:29 ` Herbert Xu 2 siblings, 0 replies; 6+ messages in thread From: Pratik R. Sampat @ 2025-06-26 14:50 UTC (permalink / raw) To: Alexey Kardashevskiy, linux-crypto Cc: linux-kernel, Ashish Kalra, Tom Lendacky, John Allen, Herbert Xu, Borislav Petkov (AMD), Michael Roth Hi Alexey, On 6/17/25 4:43 AM, Alexey Kardashevskiy wrote: > The __snp_alloc_firmware_pages() helper allocates pages in the firmware > state (alloc + rmpupdate). In case of failed rmpupdate, it tries > reclaiming pages with already changed state. This requires calling > the PSP firmware and since there is sev_cmd_mutex to guard such calls, > the helper takes a "locked" parameter so specify if the lock needs to > be held. > > Most calls happen from snp_alloc_firmware_page() which executes without > the lock. However > > commit 24512afa4336 ("crypto: ccp: Handle the legacy TMR allocation when SNP is enabled") > > switched sev_fw_alloc() from alloc_pages() (which does not call the PSP) to > __snp_alloc_firmware_pages() (which does) but did not account for the fact > that sev_fw_alloc() is called from __sev_platform_init_locked() > (via __sev_platform_init_handle_tmr()) and executes with the lock held. > > Add a "locked" parameter to __snp_alloc_firmware_pages(). > Make sev_fw_alloc() use the new parameter to prevent potential deadlock in > rmp_mark_pages_firmware() if rmpupdate() failed. > > Fixes: 24512afa4336 ("crypto: ccp: Handle the legacy TMR allocation when SNP is enabled") > Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Thank you for fixing this. I was facing a locking issue and was writing a similar patch when I discovered this! Reviewed-by: Pratik R. Sampat <prsampat@amd.com> > --- > drivers/crypto/ccp/sev-dev.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c > index 3451bada884e..16a11d5efe46 100644 > --- a/drivers/crypto/ccp/sev-dev.c > +++ b/drivers/crypto/ccp/sev-dev.c > @@ -434,7 +434,7 @@ static int rmp_mark_pages_firmware(unsigned long paddr, unsigned int npages, boo > return rc; > } > > -static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order) > +static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order, bool locked) > { > unsigned long npages = 1ul << order, paddr; > struct sev_device *sev; > @@ -453,7 +453,7 @@ static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order) > return page; > > paddr = __pa((unsigned long)page_address(page)); > - if (rmp_mark_pages_firmware(paddr, npages, false)) > + if (rmp_mark_pages_firmware(paddr, npages, locked)) > return NULL; > > return page; > @@ -463,7 +463,7 @@ void *snp_alloc_firmware_page(gfp_t gfp_mask) > { > struct page *page; > > - page = __snp_alloc_firmware_pages(gfp_mask, 0); > + page = __snp_alloc_firmware_pages(gfp_mask, 0, false); > > return page ? page_address(page) : NULL; > } > @@ -498,7 +498,7 @@ static void *sev_fw_alloc(unsigned long len) > { > struct page *page; > > - page = __snp_alloc_firmware_pages(GFP_KERNEL, get_order(len)); > + page = __snp_alloc_firmware_pages(GFP_KERNEL, get_order(len), true); > if (!page) > return NULL; > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] crypto/ccp: Fix locking on alloc failure handling 2025-06-17 9:43 [PATCH] crypto/ccp: Fix locking on alloc failure handling Alexey Kardashevskiy 2025-06-20 19:20 ` Tom Lendacky 2025-06-26 14:50 ` Pratik R. Sampat @ 2025-07-07 3:29 ` Herbert Xu 2 siblings, 0 replies; 6+ messages in thread From: Herbert Xu @ 2025-07-07 3:29 UTC (permalink / raw) To: Alexey Kardashevskiy Cc: linux-crypto, linux-kernel, Ashish Kalra, Tom Lendacky, John Allen, Borislav Petkov (AMD), Michael Roth On Tue, Jun 17, 2025 at 07:43:54PM +1000, Alexey Kardashevskiy wrote: > The __snp_alloc_firmware_pages() helper allocates pages in the firmware > state (alloc + rmpupdate). In case of failed rmpupdate, it tries > reclaiming pages with already changed state. This requires calling > the PSP firmware and since there is sev_cmd_mutex to guard such calls, > the helper takes a "locked" parameter so specify if the lock needs to > be held. > > Most calls happen from snp_alloc_firmware_page() which executes without > the lock. However > > commit 24512afa4336 ("crypto: ccp: Handle the legacy TMR allocation when SNP is enabled") > > switched sev_fw_alloc() from alloc_pages() (which does not call the PSP) to > __snp_alloc_firmware_pages() (which does) but did not account for the fact > that sev_fw_alloc() is called from __sev_platform_init_locked() > (via __sev_platform_init_handle_tmr()) and executes with the lock held. > > Add a "locked" parameter to __snp_alloc_firmware_pages(). > Make sev_fw_alloc() use the new parameter to prevent potential deadlock in > rmp_mark_pages_firmware() if rmpupdate() failed. > > Fixes: 24512afa4336 ("crypto: ccp: Handle the legacy TMR allocation when SNP is enabled") > Signed-off-by: Alexey Kardashevskiy <aik@amd.com> > --- > drivers/crypto/ccp/sev-dev.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) Patch applied. Thanks. -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-07-07 3:29 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-06-17 9:43 [PATCH] crypto/ccp: Fix locking on alloc failure handling Alexey Kardashevskiy 2025-06-20 19:20 ` Tom Lendacky 2025-06-24 1:41 ` Alexey Kardashevskiy 2025-06-24 15:20 ` Tom Lendacky 2025-06-26 14:50 ` Pratik R. Sampat 2025-07-07 3:29 ` Herbert Xu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).