From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>,
Ryan Roberts <ryan.roberts@arm.com>,
"Yang Shi" <yang@os.amperecomputing.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
Kevin Brodsky <kevin.brodsky@arm.com>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Andrew Morton <akpm@linux-foundation.org>,
Zhenhua Huang <quic_zhenhuah@quicinc.com>,
Joey Gouly <joey.gouly@arm.com>
Subject: Re: [PATCH 1/2] arm64/mm: Allow __create_pgd_mapping() to propagate pgtable_alloc() errors
Date: Thu, 14 Aug 2025 17:22:55 +0100 [thread overview]
Message-ID: <20250814172255.000039ca@huawei.com> (raw)
In-Reply-To: <20250813145607.1612234-2-chaitanyas.prakash@arm.com>
On Wed, 13 Aug 2025 20:26:06 +0530
Chaitanya S Prakash <chaitanyas.prakash@arm.com> wrote:
> arch_add_memory() is used to hotplug memory into a system but as a part
> of its implementation it calls __create_pgd_mapping(), which uses
> pgtable_alloc() in order to build intermediate page tables. As this path
> was initally only used during early boot pgtable_alloc() is designed to
> BUG_ON() on failure. However, in the event that memory hotplug is
> attempted when the system's memory is extremely tight and the allocation
> were to fail, it would lead to panicking the system, which is not
> desirable. Hence update __create_pgd_mapping and all it's callers to be
> non void and propagate -ENOMEM on allocation failure to allow system to
> fail gracefully.
>
> But during early boot if there is an allocation failure, we want the
> system to panic, hence create a wrapper around __create_pgd_mapping()
> called ___create_pgd_mapping() which is designed to BUG_ON(ret), if ret
> is non zero value. All the init calls are updated to use the wrapper
> rather than the modified __create_pgd_mapping() to restore
> functionality.
>
> Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
> ---
> arch/arm64/mm/mmu.c | 156 +++++++++++++++++++++++++++++++++-----------
> 1 file changed, 117 insertions(+), 39 deletions(-)
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 00ab1d648db62..db7f45ef16574 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -196,12 +196,13 @@ static void init_pte(pte_t *ptep, unsigned long addr, unsigned long end,
> } while (ptep++, addr += PAGE_SIZE, addr != end);
> }
>
> -static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
> +static int alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
> unsigned long end, phys_addr_t phys,
> pgprot_t prot,
> phys_addr_t (*pgtable_alloc)(enum pgtable_type),
> int flags)
> {
> + int ret = 0;
> unsigned long next;
> pmd_t pmd = READ_ONCE(*pmdp);
> pte_t *ptep;
> @@ -215,6 +216,10 @@ static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
> pmdval |= PMD_TABLE_PXN;
> BUG_ON(!pgtable_alloc);
> pte_phys = pgtable_alloc(TABLE_PTE);
> + if (!pte_phys) {
> + ret = -ENOMEM;
return -ENOMEM;
When I saw this I wondered if local style was to always exit at the end of
functions, but it isn't so early returns should be fine and simplify this
a fair bit.
> + goto out;
> + }
> ptep = pte_set_fixmap(pte_phys);
> init_clear_pgtable(ptep);
> ptep += pte_index(addr);
> @@ -246,12 +251,16 @@ static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
> * walker.
> */
> pte_clear_fixmap();
> +
> +out:
> + return ret;
return 0;
> }
>
> -static void init_pmd(pmd_t *pmdp, unsigned long addr, unsigned long end,
> +static int init_pmd(pmd_t *pmdp, unsigned long addr, unsigned long end,
> phys_addr_t phys, pgprot_t prot,
> phys_addr_t (*pgtable_alloc)(enum pgtable_type), int flags)
> {
> + int ret = 0;
> unsigned long next;
>
> do {
> @@ -271,22 +280,27 @@ static void init_pmd(pmd_t *pmdp, unsigned long addr, unsigned long end,
> BUG_ON(!pgattr_change_is_safe(pmd_val(old_pmd),
> READ_ONCE(pmd_val(*pmdp))));
> } else {
> - alloc_init_cont_pte(pmdp, addr, next, phys, prot,
> - pgtable_alloc, flags);
> + ret = alloc_init_cont_pte(pmdp, addr, next, phys, prot,
> + pgtable_alloc, flags);
> + if (ret)
> + break;
return ret;
>
> BUG_ON(pmd_val(old_pmd) != 0 &&
> pmd_val(old_pmd) != READ_ONCE(pmd_val(*pmdp)));
> }
> phys += next - addr;
> } while (pmdp++, addr = next, addr != end);
> +
> + return ret;
return 0;
Same in the other cases where there is nothing else to do and they are simply
early error returns.
> }
>
> -static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
> +static int __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
> unsigned long virt, phys_addr_t size,
> pgprot_t prot,
> phys_addr_t (*pgtable_alloc)(enum pgtable_type),
> int flags)
> {
> + int ret = 0;
> +
> mutex_lock(&fixmap_lock);
> - __create_pgd_mapping_locked(pgdir, phys, virt, size, prot,
> - pgtable_alloc, flags);
> + ret = __create_pgd_mapping_locked(pgdir, phys, virt, size, prot,
> + pgtable_alloc, flags);
> mutex_unlock(&fixmap_lock);
Could use guard(mutex)(&fixmap_lock);
but maybe not worth introducing that stuff just to simplify this.
> +
> + return ret;
> +}
> +
> +static void ___create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
> + unsigned long virt, phys_addr_t size,
> + pgprot_t prot,
> + phys_addr_t (*pgtable_alloc)(enum pgtable_type),
> + int flags)
> +{
> + int ret = 0;
> +
> + ret = __create_pgd_mapping(pgdir, phys, virt, size, prot, pgtable_alloc,
> + flags);
> + BUG_ON(ret);
> }
>
> #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> @@ -485,9 +553,11 @@ static phys_addr_t __pgd_pgtable_alloc(struct mm_struct *mm,
> {
> /* Page is zeroed by init_clear_pgtable() so don't duplicate effort. */
> struct ptdesc *ptdesc = pagetable_alloc(GFP_PGTABLE_KERNEL & ~__GFP_ZERO, 0);
> - phys_addr_t pa;
> + phys_addr_t pa = 0;
> +
> + if (!ptdesc)
> + goto out;
I'd return 0 here.
>
> - BUG_ON(!ptdesc);
> pa = page_to_phys(ptdesc_page(ptdesc));
>
> switch (pgtable_type) {
> @@ -505,6 +575,7 @@ static phys_addr_t __pgd_pgtable_alloc(struct mm_struct *mm,
> break;
> }
>
> +out:
> return pa;
> }
>
> @@ -533,8 +604,8 @@ void __init create_mapping_noalloc(phys_addr_t phys, unsigned long virt,
> &phys, virt);
> return;
> }
> - __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL,
> - NO_CONT_MAPPINGS);
> + ___create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL,
> + NO_CONT_MAPPINGS);
> }
>
> @@ -1369,23 +1440,30 @@ int arch_add_memory(int nid, u64 start, u64 size,
> if (can_set_direct_map())
> flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
>
> - __create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
> - size, params->pgprot, pgd_pgtable_alloc_init_mm,
> - flags);
> + ret = __create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
> + size, params->pgprot, pgd_pgtable_alloc_init_mm,
> + flags);
> +
> + if (ret)
> + goto out;
>
> memblock_clear_nomap(start, size);
>
> ret = __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT,
> params);
> - if (ret)
> - __remove_pgd_mapping(swapper_pg_dir,
> - __phys_to_virt(start), size);
> - else {
> + if (ret) {
> + goto out;
> + } else {
> /* Address of hotplugged memory can be smaller */
> max_pfn = max(max_pfn, PFN_UP(start + size));
> max_low_pfn = max_pfn;
> }
>
> + return 0;
> +
> +out:
> + __remove_pgd_mapping(swapper_pg_dir,
> + __phys_to_virt(start), size);
This one is fine as common cleanup to do.
Jonathan
> return ret;
> }
>
next prev parent reply other threads:[~2025-08-14 16:58 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-13 14:56 [PATCH 0/2] arm64/mm: prevent panic on -ENOMEM in arch_add_memory() Chaitanya S Prakash
2025-08-13 14:56 ` [PATCH 1/2] arm64/mm: Allow __create_pgd_mapping() to propagate pgtable_alloc() errors Chaitanya S Prakash
2025-08-14 16:22 ` Jonathan Cameron [this message]
2025-08-15 7:30 ` Dev Jain
2025-08-15 12:12 ` Kevin Brodsky
2025-08-13 14:56 ` [PATCH 2/2] arm64/mm: Update create_kpti_ng_temp_pgd() to handle pgtable_alloc failure Chaitanya S Prakash
2025-08-15 12:00 ` Kevin Brodsky
2025-08-19 7:41 ` David Hildenbrand
2025-08-19 8:44 ` Giorgi Tchankvetadze
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250814172255.000039ca@huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=akpm@linux-foundation.org \
--cc=anshuman.khandual@arm.com \
--cc=catalin.marinas@arm.com \
--cc=chaitanyas.prakash@arm.com \
--cc=joey.gouly@arm.com \
--cc=kevin.brodsky@arm.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=quic_zhenhuah@quicinc.com \
--cc=ryan.roberts@arm.com \
--cc=will@kernel.org \
--cc=yang@os.amperecomputing.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.