From: Catalin Marinas <catalin.marinas@arm.com>
To: Zhenhua Huang <quic_zhenhuah@quicinc.com>
Cc: anshuman.khandual@arm.com, will@kernel.org, ardb@kernel.org,
ryan.roberts@arm.com, mark.rutland@arm.com, joey.gouly@arm.com,
dave.hansen@linux.intel.com, akpm@linux-foundation.org,
chenfeiyang@loongson.cn, chenhuacai@kernel.org,
linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, quic_tingweiz@quicinc.com,
stable@vger.kernel.org
Subject: Re: [PATCH v4] arm64: mm: Populate vmemmap/linear at the page level for hotplugged sections
Date: Tue, 7 Jan 2025 19:22:35 +0000 [thread overview]
Message-ID: <Z31--x4unDHRU5Zo@arm.com> (raw)
In-Reply-To: <20250107074252.1062127-1-quic_zhenhuah@quicinc.com>
On Tue, Jan 07, 2025 at 03:42:52PM +0800, Zhenhua Huang wrote:
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index e2739b69e11b..5e0f514de870 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -42,9 +42,11 @@
> #include <asm/pgalloc.h>
> #include <asm/kfence.h>
>
> -#define NO_BLOCK_MAPPINGS BIT(0)
> -#define NO_CONT_MAPPINGS BIT(1)
> -#define NO_EXEC_MAPPINGS BIT(2) /* assumes FEAT_HPDS is not used */
> +#define NO_PMD_BLOCK_MAPPINGS BIT(0)
> +#define NO_PUD_BLOCK_MAPPINGS BIT(1) /* Hotplug case: do not want block mapping for PUD */
> +#define NO_BLOCK_MAPPINGS (NO_PMD_BLOCK_MAPPINGS | NO_PUD_BLOCK_MAPPINGS)
Nit: please use a tab instead of space before (NO_PMD_...)
> +#define NO_CONT_MAPPINGS BIT(2)
> +#define NO_EXEC_MAPPINGS BIT(3) /* assumes FEAT_HPDS is not used */
>
> u64 kimage_voffset __ro_after_init;
> EXPORT_SYMBOL(kimage_voffset);
> @@ -254,7 +256,7 @@ static void init_pmd(pmd_t *pmdp, unsigned long addr, unsigned long end,
>
> /* try section mapping first */
> if (((addr | next | phys) & ~PMD_MASK) == 0 &&
> - (flags & NO_BLOCK_MAPPINGS) == 0) {
> + (flags & NO_PMD_BLOCK_MAPPINGS) == 0) {
> pmd_set_huge(pmdp, phys, prot);
>
> /*
> @@ -356,10 +358,11 @@ static void alloc_init_pud(p4d_t *p4dp, unsigned long addr, unsigned long end,
>
> /*
> * For 4K granule only, attempt to put down a 1GB block
> + * Hotplug case: do not attempt 1GB block
> */
I don't think we need this comment added here. The hotplug case is a
decision of the caller, so better to have the comment there.
> if (pud_sect_supported() &&
> ((addr | next | phys) & ~PUD_MASK) == 0 &&
> - (flags & NO_BLOCK_MAPPINGS) == 0) {
> + (flags & NO_PUD_BLOCK_MAPPINGS) == 0) {
> pud_set_huge(pudp, phys, prot);
Nit: something wrong with the alignment here. I think the unmodified
line after the 'if' one above was misaligned before your patch.
>
> /*
> @@ -1175,9 +1178,21 @@ int __meminit vmemmap_check_pmd(pmd_t *pmdp, int node,
> int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
> struct vmem_altmap *altmap)
> {
> + unsigned long start_pfn;
> + struct mem_section *ms;
> +
> WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));
>
> - if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES))
> + start_pfn = page_to_pfn((struct page *)start);
> + ms = __pfn_to_section(start_pfn);
Hmm, it would have been better if the core code provided the start pfn
as it does for vmemmap_populate_compound_pages() but I'm fine with
deducting it from 'start'.
> + /*
> + * Hotplugged section does not support hugepages as
> + * PMD_SIZE (hence PUD_SIZE) section mapping covers
> + * struct page range that exceeds a SUBSECTION_SIZE
> + * i.e 2MB - for all available base page sizes.
> + */
> + if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES) || !early_section(ms))
> return vmemmap_populate_basepages(start, end, node, altmap);
> else
> return vmemmap_populate_hugepages(start, end, node, altmap);
> @@ -1339,9 +1354,25 @@ int arch_add_memory(int nid, u64 start, u64 size,
> struct mhp_params *params)
> {
> int ret, flags = NO_EXEC_MAPPINGS;
> + unsigned long start_pfn = page_to_pfn((struct page *)start);
> + struct mem_section *ms = __pfn_to_section(start_pfn);
This looks wrong. 'start' here is a physical address, you want
PFN_DOWN() instead.
>
> VM_BUG_ON(!mhp_range_allowed(start, size, true));
>
> + /* should not be invoked by early section */
> + WARN_ON(early_section(ms));
> +
> + /*
> + * 4K base page's PMD_SIZE matches SUBSECTION_SIZE i.e 2MB. Hence
> + * PMD section mapping can be allowed, but only for 4K base pages.
> + * Where as PMD_SIZE (hence PUD_SIZE) for other page sizes exceed
> + * SUBSECTION_SIZE.
> + */
> + if (IS_ENABLED(CONFIG_ARM64_4K_PAGES))
> + flags |= NO_PUD_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
In theory we can allow contiguous PTE mappings but not PMD. You could
probably do the same as a NO_BLOCK_MAPPINGS and split it into multiple
components - NO_PTE_CONT_MAPPINGS and so on.
> + else
> + flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
Similarly with 16K/64K pages we can allow contiguous PTEs as they all go
up to 2MB blocks.
I think we should write the flags setup in a more readable way than
trying to do mental maths on the possible combinations, something like:
flags = NO_PUD_BLOCK_MAPPINGS | NO_PMD_CONT_MAPPINGS;
if (SUBSECTION_SHIFT < PMD_SHIFT)
flags |= NO_PMD_BLOCK_MAPPINGS;
if (SUBSECTION_SHIFT < CONT_PTE_SHIFT)
flags |= NO_PTE_CONT_MAPPINGS;
This way we don't care about the page size and should cover any changes
to SUBSECTION_SHIFT making it smaller than 2MB.
--
Catalin
next prev parent reply other threads:[~2025-01-07 19:27 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-07 7:42 [PATCH v4] arm64: mm: Populate vmemmap/linear at the page level for hotplugged sections Zhenhua Huang
2025-01-07 19:22 ` Catalin Marinas [this message]
2025-01-08 10:07 ` Zhenhua Huang
2025-01-08 10:52 ` Anshuman Khandual
2025-01-09 7:04 ` Zhenhua Huang
2025-01-09 14:32 ` Catalin Marinas
2025-01-10 3:13 ` Zhenhua Huang
2025-01-08 10:11 ` Anshuman Khandual
2025-01-09 7:04 ` Zhenhua Huang
2025-01-09 12:10 ` Catalin Marinas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z31--x4unDHRU5Zo@arm.com \
--to=catalin.marinas@arm.com \
--cc=akpm@linux-foundation.org \
--cc=anshuman.khandual@arm.com \
--cc=ardb@kernel.org \
--cc=chenfeiyang@loongson.cn \
--cc=chenhuacai@kernel.org \
--cc=dave.hansen@linux.intel.com \
--cc=joey.gouly@arm.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mark.rutland@arm.com \
--cc=quic_tingweiz@quicinc.com \
--cc=quic_zhenhuah@quicinc.com \
--cc=ryan.roberts@arm.com \
--cc=stable@vger.kernel.org \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.