* [PATCH 0/2] Fix subsection vmemmap_populate logic
@ 2024-11-21 7:12 Zhenhua Huang
2024-11-21 7:12 ` [PATCH 1/2] arm64: mm: vmemmap populate to page level if not section aligned Zhenhua Huang
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Zhenhua Huang @ 2024-11-21 7:12 UTC (permalink / raw)
To: catalin.marinas, will, ardb, ryan.roberts, mark.rutland,
joey.gouly, dave.hansen, akpm, chenfeiyang, chenhuacai
Cc: linux-arm-kernel, linux-kernel, Zhenhua Huang
To perform memory hotplug operations, the memmap (aka struct page) will be
updated. For arm64 with 4K page size, the typical granularity is 128M,
which corresponds to a 2M memmap buffer.
Commit 2045a3b8911b ("mm/sparse-vmemmap: generalise vmemmap_populate_hugepages()")
optimizes this 2M buffer to be mapped with PMD huge pages. However,
commit ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
which supports 2M subsection hotplug granularity, causes other issues
(refer to the change log of patch #1). The logic is adjusted to populate
with huge pages only if the hotplug address/size is section-aligned.
Zhenhua Huang (2):
arm64: mm: vmemmap populate to page level if not section aligned
arm64: mm: implement vmemmap_check_pmd for arm64
arch/arm64/mm/mmu.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--
2.25.1
* [PATCH 1/2] arm64: mm: vmemmap populate to page level if not section aligned
2024-11-21 7:12 [PATCH 0/2] Fix subsection vmemmap_populate logic Zhenhua Huang
@ 2024-11-21 7:12 ` Zhenhua Huang
2024-12-06 17:13 ` Catalin Marinas
2024-11-21 7:12 ` [PATCH 2/2] arm64: mm: implement vmemmap_check_pmd for arm64 Zhenhua Huang
2024-11-28 7:26 ` [PATCH 0/2] Fix subsection vmemmap_populate logic Zhenhua Huang
2 siblings, 1 reply; 9+ messages in thread
From: Zhenhua Huang @ 2024-11-21 7:12 UTC (permalink / raw)
To: catalin.marinas, will, ardb, ryan.roberts, mark.rutland,
joey.gouly, dave.hansen, akpm, chenfeiyang, chenhuacai
Cc: linux-arm-kernel, linux-kernel, Zhenhua Huang

Commit 2045a3b8911b ("mm/sparse-vmemmap: generalise vmemmap_populate_hugepages()")
optimizes the vmemmap to populate at the PMD section level. However, if start
or end is not aligned to a section boundary, such as when a subsection is hot
added, populating the entire section is inefficient and wasteful. In such
cases, it is more effective to populate at page granularity.

This change also addresses misalignment issues during vmemmap_free(). When
pmd_sect() is true, the entire PMD section is cleared, even if only a
subsection is mapped. For example, if subsections pagemap1 and pagemap2 are
added sequentially and then pagemap1 is removed, vmemmap_free() will clear
the entire PMD section, even though pagemap2 is still active.

Fixes: 2045a3b8911b ("mm/sparse-vmemmap: generalise vmemmap_populate_hugepages()")
Signed-off-by: Zhenhua Huang <quic_zhenhuah@quicinc.com>
---
 arch/arm64/mm/mmu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index fe833de501f7..bfecabac14a3 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1151,7 +1151,9 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 {
 	WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));
 
-	if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES))
+	if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES) ||
+	    !IS_ALIGNED(page_to_pfn((struct page *)start), PAGES_PER_SECTION) ||
+	    !IS_ALIGNED(page_to_pfn((struct page *)end), PAGES_PER_SECTION))
 		return vmemmap_populate_basepages(start, end, node, altmap);
 	else
 		return vmemmap_populate_hugepages(start, end, node, altmap);
--
2.25.1
* Re: [PATCH 1/2] arm64: mm: vmemmap populate to page level if not section aligned
2024-11-21 7:12 ` [PATCH 1/2] arm64: mm: vmemmap populate to page level if not section aligned Zhenhua Huang
@ 2024-12-06 17:13 ` Catalin Marinas
2024-12-09 6:04 ` Zhenhua Huang
0 siblings, 1 reply; 9+ messages in thread
From: Catalin Marinas @ 2024-12-06 17:13 UTC (permalink / raw)
To: Zhenhua Huang
Cc: will, ardb, ryan.roberts, mark.rutland, joey.gouly, dave.hansen,
akpm, chenfeiyang, chenhuacai, linux-arm-kernel, linux-kernel

On Thu, Nov 21, 2024 at 03:12:55PM +0800, Zhenhua Huang wrote:
> Commit 2045a3b8911b ("mm/sparse-vmemmap: generalise vmemmap_populate_hugepages()")
> optimizes the vmemmap to populate at the PMD section level.

Wasn't the above commit just a non-functional change making the code
generic? If there was a functional change, it needs to be spelt out. It
also implies that the code prior to the above commit needs fixing.

> However, if start
> or end is not aligned to a section boundary, such as when a subsection is hot
> added, populating the entire section is inefficient and wasteful. In such
> cases, it is more effective to populate at page granularity.

Do you have any numbers to show how inefficient it is? We trade some
memory for less TLB pressure by using huge pages for vmemmap.

> This change also addresses misalignment issues during vmemmap_free(). When
> pmd_sect() is true, the entire PMD section is cleared, even if only a
> subsection is mapped. For example, if subsections pagemap1 and pagemap2 are
> added sequentially and then pagemap1 is removed, vmemmap_free() will clear the
> entire PMD section, even though pagemap2 is still active.

What do you mean by a PMD section? The whole PAGE_SIZE *
PAGES_PER_SECTION range or a single pmd entry? I couldn't see how the
former happens in the core code but I only looked briefly. If it's just
a pmd entry, I think it's fair to require a 2MB alignment of hotplugged
memory ranges.

--
Catalin
* Re: [PATCH 1/2] arm64: mm: vmemmap populate to page level if not section aligned
2024-12-06 17:13 ` Catalin Marinas
@ 2024-12-09 6:04 ` Zhenhua Huang
0 siblings, 0 replies; 9+ messages in thread
From: Zhenhua Huang @ 2024-12-09 6:04 UTC (permalink / raw)
To: Catalin Marinas
Cc: will, ardb, ryan.roberts, mark.rutland, joey.gouly, dave.hansen,
akpm, chenfeiyang, chenhuacai, linux-arm-kernel, linux-kernel,
Tingwei Zhang

Thanks Catalin for the review!

On 2024/12/7 1:13, Catalin Marinas wrote:
> On Thu, Nov 21, 2024 at 03:12:55PM +0800, Zhenhua Huang wrote:
>> Commit 2045a3b8911b ("mm/sparse-vmemmap: generalise vmemmap_populate_hugepages()")
>> optimizes the vmemmap to populate at the PMD section level.
>
> Wasn't the above commit just a non-functional change making the code
> generic? If there was a functional change, it needs to be spelt out. It
> also implies that the code prior to the above commit needs fixing.

Oh... right. I looked up your change from over a decade ago, identified
by commit c1cc1552616d ("arm64: MMU initialisation"). However, at that
time, there was no support for subsection hotplug, which was later
introduced by commit ba72b4c8cf60 ("mm/sparsemem: support sub-section
hotplug").

>> However, if start
>> or end is not aligned to a section boundary, such as when a subsection is hot
>> added, populating the entire section is inefficient and wasteful. In such
>> cases, it is more effective to populate at page granularity.
>
> Do you have any numbers to show how inefficient it is? We trade some
> memory for less TLB pressure by using huge pages for vmemmap.

I see.. thanks, yeah, TLB efficiency will benefit. The point I want to
make is that even when only one subsection is hot-added, the current
logic still populates the full 2M of backing metadata, although only
2M/64 = 32K is needed.

>> This change also addresses misalignment issues during vmemmap_free(). When
>> pmd_sect() is true, the entire PMD section is cleared, even if only a
>> subsection is mapped. For example, if subsections pagemap1 and pagemap2 are
>> added sequentially and then pagemap1 is removed, vmemmap_free() will clear the
>> entire PMD section, even though pagemap2 is still active.
>
> What do you mean by a PMD section? The whole PAGE_SIZE *
> PAGES_PER_SECTION range or a single pmd entry? I couldn't see how the

I am referring to a single pmd entry, but the buffer it points to
manages a whole PAGE_SIZE * PAGES_PER_SECTION of physical memory. For
arm64 with 4K pages: one pmd entry (2M of struct page metadata) ->
PAGE_SIZE * PAGES_PER_SECTION (128M of physical memory).

pagemap1 (where a subsection's memmap equals 2M/64 = 32K) and pagemap2
are part of a single pmd entry. When pagemap1 is removed, vmemmap_free()
will clear the entire PMD section. IOW, the total 128M of physical
memory will become unusable.

> former happens in the core code but I only looked briefly. If it's just
> a pmd entry, I think it's fair to require a 2MB alignment of hotplugged
> memory ranges.

Agree that a 2MB alignment of hotplugged memory is fair; commit
ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug") supports it.
The issue I want to address here is with its backing struct page
metadata.
* [PATCH 2/2] arm64: mm: implement vmemmap_check_pmd for arm64
2024-11-21 7:12 [PATCH 0/2] Fix subsection vmemmap_populate logic Zhenhua Huang
2024-11-21 7:12 ` [PATCH 1/2] arm64: mm: vmemmap populate to page level if not section aligned Zhenhua Huang
@ 2024-11-21 7:12 ` Zhenhua Huang
2024-11-28 7:26 ` [PATCH 0/2] Fix subsection vmemmap_populate logic Zhenhua Huang
2 siblings, 0 replies; 9+ messages in thread
From: Zhenhua Huang @ 2024-11-21 7:12 UTC (permalink / raw)
To: catalin.marinas, will, ardb, ryan.roberts, mark.rutland,
joey.gouly, dave.hansen, akpm, chenfeiyang, chenhuacai
Cc: linux-arm-kernel, linux-kernel, Zhenhua Huang

vmemmap_check_pmd() is used to determine whether we need to populate at
the base page level. Implement it for arm64.

Fixes: 2045a3b8911b ("mm/sparse-vmemmap: generalise vmemmap_populate_hugepages()")
Signed-off-by: Zhenhua Huang <quic_zhenhuah@quicinc.com>
---
 arch/arm64/mm/mmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index bfecabac14a3..0e19dd1cfc0c 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1143,7 +1143,8 @@ int __meminit vmemmap_check_pmd(pmd_t *pmdp, int node,
 				unsigned long addr, unsigned long next)
 {
 	vmemmap_verify((pte_t *)pmdp, node, addr, next);
-	return 1;
+
+	return pmd_sect(*pmdp);
 }
 
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
--
2.25.1
* Re: [PATCH 0/2] Fix subsection vmemmap_populate logic
2024-11-21 7:12 [PATCH 0/2] Fix subsection vmemmap_populate logic Zhenhua Huang
2024-11-21 7:12 ` [PATCH 1/2] arm64: mm: vmemmap populate to page level if not section aligned Zhenhua Huang
2024-11-21 7:12 ` [PATCH 2/2] arm64: mm: implement vmemmap_check_pmd for arm64 Zhenhua Huang
@ 2024-11-28 7:26 ` Zhenhua Huang
2024-12-06 9:13 ` Zhenhua Huang
2 siblings, 1 reply; 9+ messages in thread
From: Zhenhua Huang @ 2024-11-28 7:26 UTC (permalink / raw)
To: catalin.marinas, will, ardb, ryan.roberts, mark.rutland,
joey.gouly, dave.hansen, akpm, chenfeiyang, chenhuacai
Cc: linux-arm-kernel, linux-kernel

On 2024/11/21 15:12, Zhenhua Huang wrote:
> To perform memory hotplug operations, the memmap (aka struct page) will be
> updated. For arm64 with 4K page size, the typical granularity is 128M,
> which corresponds to a 2M memmap buffer.
> Commit 2045a3b8911b ("mm/sparse-vmemmap: generalise vmemmap_populate_hugepages()")
> optimizes this 2M buffer to be mapped with PMD huge pages. However,
> commit ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
> which supports 2M subsection hotplug granularity, causes other issues
> (refer to the change log of patch #1). The logic is adjusted to populate
> with huge pages only if the hotplug address/size is section-aligned.

Could any expert please help review ?

>
> Zhenhua Huang (2):
>   arm64: mm: vmemmap populate to page level if not section aligned
>   arm64: mm: implement vmemmap_check_pmd for arm64
>
>  arch/arm64/mm/mmu.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
* Re: [PATCH 0/2] Fix subsection vmemmap_populate logic
2024-11-28 7:26 ` [PATCH 0/2] Fix subsection vmemmap_populate logic Zhenhua Huang
@ 2024-12-06 9:13 ` Zhenhua Huang
2024-12-07 6:14 ` Andrew Morton
0 siblings, 1 reply; 9+ messages in thread
From: Zhenhua Huang @ 2024-12-06 9:13 UTC (permalink / raw)
To: catalin.marinas, will, ardb, ryan.roberts, mark.rutland,
joey.gouly, dave.hansen, akpm, chenfeiyang, chenhuacai
Cc: linux-arm-kernel, linux-kernel

On 2024/11/28 15:26, Zhenhua Huang wrote:
>
> On 2024/11/21 15:12, Zhenhua Huang wrote:
>> To perform memory hotplug operations, the memmap (aka struct page)
>> will be
>> updated. For arm64 with 4K page size, the typical granularity is 128M,
>> which corresponds to a 2M memmap buffer.
>> Commit 2045a3b8911b ("mm/sparse-vmemmap: generalise
>> vmemmap_populate_hugepages()")
>> optimizes this 2M buffer to be mapped with PMD huge pages. However,
>> commit ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
>> which supports 2M subsection hotplug granularity, causes other issues
>> (refer to the change log of patch #1). The logic is adjusted to populate
>> with huge pages only if the hotplug address/size is section-aligned.
>
> Could any expert please help review ?

Gentle reminder for review..

>
>> Zhenhua Huang (2):
>>   arm64: mm: vmemmap populate to page level if not section aligned
>>   arm64: mm: implement vmemmap_check_pmd for arm64
>>
>>  arch/arm64/mm/mmu.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
* Re: [PATCH 0/2] Fix subsection vmemmap_populate logic
2024-12-06 9:13 ` Zhenhua Huang
@ 2024-12-07 6:14 ` Andrew Morton
2024-12-09 6:04 ` Zhenhua Huang
0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2024-12-07 6:14 UTC (permalink / raw)
To: Zhenhua Huang
Cc: catalin.marinas, will, ardb, ryan.roberts, mark.rutland,
joey.gouly, dave.hansen, chenfeiyang, chenhuacai,
linux-arm-kernel, linux-kernel

On Fri, 6 Dec 2024 17:13:39 +0800 Zhenhua Huang <quic_zhenhuah@quicinc.com> wrote:

> On 2024/11/28 15:26, Zhenhua Huang wrote:
> >
> > On 2024/11/21 15:12, Zhenhua Huang wrote:
> >> To perform memory hotplug operations, the memmap (aka struct page)
> >> will be
> >> updated. For arm64 with 4K page size, the typical granularity is 128M,
> >> which corresponds to a 2M memmap buffer.
> >> Commit 2045a3b8911b ("mm/sparse-vmemmap: generalise
> >> vmemmap_populate_hugepages()")
> >> optimizes this 2M buffer to be mapped with PMD huge pages. However,
> >> commit ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
> >> which supports 2M subsection hotplug granularity, causes other issues
> >> (refer to the change log of patch #1). The logic is adjusted to populate
> >> with huge pages only if the hotplug address/size is section-aligned.
> >
> > Could any expert please help review ?
>
> Gentle reminder for review..

MM developers work on the linux-mm mailing list, which was not cc'ed.

Please address Catalin's review comment
(https://lkml.kernel.org/r/Z1Mwo5OajFZQYlOg@arm.com) then resend a v2
series with the appropriate cc's.
* Re: [PATCH 0/2] Fix subsection vmemmap_populate logic
2024-12-07 6:14 ` Andrew Morton
@ 2024-12-09 6:04 ` Zhenhua Huang
0 siblings, 0 replies; 9+ messages in thread
From: Zhenhua Huang @ 2024-12-09 6:04 UTC (permalink / raw)
To: Andrew Morton
Cc: catalin.marinas, will, ardb, ryan.roberts, mark.rutland,
joey.gouly, dave.hansen, chenfeiyang, chenhuacai,
linux-arm-kernel, linux-kernel

On 2024/12/7 14:14, Andrew Morton wrote:
> On Fri, 6 Dec 2024 17:13:39 +0800 Zhenhua Huang <quic_zhenhuah@quicinc.com> wrote:
>
>> On 2024/11/28 15:26, Zhenhua Huang wrote:
>>>
>>> On 2024/11/21 15:12, Zhenhua Huang wrote:
>>>> To perform memory hotplug operations, the memmap (aka struct page)
>>>> will be
>>>> updated. For arm64 with 4K page size, the typical granularity is 128M,
>>>> which corresponds to a 2M memmap buffer.
>>>> Commit 2045a3b8911b ("mm/sparse-vmemmap: generalise
>>>> vmemmap_populate_hugepages()")
>>>> optimizes this 2M buffer to be mapped with PMD huge pages. However,
>>>> commit ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
>>>> which supports 2M subsection hotplug granularity, causes other issues
>>>> (refer to the change log of patch #1). The logic is adjusted to populate
>>>> with huge pages only if the hotplug address/size is section-aligned.
>>>
>>> Could any expert please help review ?
>>
>> Gentle reminder for review..
>
> MM developers work on the linux-mm mailing list, which was not cc'ed.
>
> Please address Catalin's review comment
> (https://lkml.kernel.org/r/Z1Mwo5OajFZQYlOg@arm.com) then resend a v2
> series with the appropriate cc's.

Thanks Andrew! Will do. I've realized I can't fully rely on the
get_maintainers script :)
end of thread, other threads:[~2024-12-09 6:06 UTC | newest]

Thread overview: 9+ messages
2024-11-21 7:12 [PATCH 0/2] Fix subsection vmemmap_populate logic Zhenhua Huang
2024-11-21 7:12 ` [PATCH 1/2] arm64: mm: vmemmap populate to page level if not section aligned Zhenhua Huang
2024-12-06 17:13   ` Catalin Marinas
2024-12-09 6:04     ` Zhenhua Huang
2024-11-21 7:12 ` [PATCH 2/2] arm64: mm: implement vmemmap_check_pmd for arm64 Zhenhua Huang
2024-11-28 7:26 ` [PATCH 0/2] Fix subsection vmemmap_populate logic Zhenhua Huang
2024-12-06 9:13   ` Zhenhua Huang
2024-12-07 6:14     ` Andrew Morton
2024-12-09 6:04       ` Zhenhua Huang