From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: linux-mm@kvack.org, akpm@linux-foundation.org,
mpe@ellerman.id.au, linuxppc-dev@lists.ozlabs.org,
npiggin@gmail.com, christophe.leroy@csgroup.eu
Cc: Catalin Marinas <catalin.marinas@arm.com>,
Muchun Song <muchun.song@linux.dev>,
Dan Williams <dan.j.williams@intel.com>,
Oscar Salvador <osalvador@suse.de>, Will Deacon <will@kernel.org>,
Joao Martins <joao.m.martins@oracle.com>,
Mike Kravetz <mike.kravetz@oracle.com>
Subject: Re: [PATCH v5 10/13] powerpc/book3s64/vmemmap: Switch radix to use a different vmemmap handling function
Date: Mon, 24 Jul 2023 23:59:27 +0530 [thread overview]
Message-ID: <87edkx2mew.fsf@linux.ibm.com> (raw)
In-Reply-To: <20230718022934.90447-11-aneesh.kumar@linux.ibm.com>
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> This is in preparation to update radix to implement vmemmap optimization
> for devdax. Below are the rules w.r.t radix vmemmap mapping
>
> 1. First try to map things using PMD (2M)
> 2. With altmap if altmap cross-boundary check returns true, fall back to
> PAGE_SIZE
> 3. If we can't allocate PMD_SIZE backing memory for vmemmap, fallback to
> PAGE_SIZE
>
> On removing vmemmap mapping, check if every subsection that is using the
> vmemmap area is invalid. If found to be invalid, that implies we can safely
> free the vmemmap area. We don't use the PAGE_UNUSED pattern used by x86
> because with 64K page size, we need to do the above check even at the
> PAGE_SIZE granularity.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
> arch/powerpc/include/asm/book3s/64/radix.h | 2 +
> arch/powerpc/include/asm/pgtable.h | 4 +
> arch/powerpc/mm/book3s64/radix_pgtable.c | 326 +++++++++++++++++++--
> arch/powerpc/mm/init_64.c | 26 +-
> 4 files changed, 327 insertions(+), 31 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
> index 2ef92f36340f..f1461289643a 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -331,6 +331,8 @@ extern int __meminit radix__vmemmap_create_mapping(unsigned long start,
> unsigned long phys);
> int __meminit radix__vmemmap_populate(unsigned long start, unsigned long end,
> int node, struct vmem_altmap *altmap);
> +void __ref radix__vmemmap_free(unsigned long start, unsigned long end,
> + struct vmem_altmap *altmap);
> extern void radix__vmemmap_remove_mapping(unsigned long start,
> unsigned long page_size);
>
> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
> index 6a88bfdaa69b..68817ea7f994 100644
> --- a/arch/powerpc/include/asm/pgtable.h
> +++ b/arch/powerpc/include/asm/pgtable.h
> @@ -165,6 +165,10 @@ static inline bool is_ioremap_addr(const void *x)
>
> return addr >= IOREMAP_BASE && addr < IOREMAP_END;
> }
> +
> +int __meminit vmemmap_populated(unsigned long vmemmap_addr, int vmemmap_map_size);
> +bool altmap_cross_boundary(struct vmem_altmap *altmap, unsigned long start,
> + unsigned long page_size);
> #endif /* CONFIG_PPC64 */
>
> #endif /* __ASSEMBLY__ */
> diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
> index 227fea53c217..9a7f3707b6fb 100644
> --- a/arch/powerpc/mm/book3s64/radix_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
> @@ -744,8 +744,59 @@ static void free_pud_table(pud_t *pud_start, p4d_t *p4d)
> p4d_clear(p4d);
> }
>
> +#ifdef CONFIG_SPARSEMEM_VMEMMAP
> +static bool __meminit vmemmap_pmd_is_unused(unsigned long addr, unsigned long end)
> +{
> + unsigned long start = ALIGN_DOWN(addr, PMD_SIZE);
> +
> + return !vmemmap_populated(start, PMD_SIZE);
> +}
> +
> +static bool __meminit vmemmap_page_is_unused(unsigned long addr, unsigned long end)
> +{
> + unsigned long start = ALIGN_DOWN(addr, PAGE_SIZE);
> +
> + return !vmemmap_populated(start, PAGE_SIZE);
> +
> +}
> +#endif
> +
> +static void __meminit free_vmemmap_pages(struct page *page,
> + struct vmem_altmap *altmap,
> + int order)
> +{
> + unsigned int nr_pages = 1 << order;
> +
> + if (altmap) {
> + unsigned long alt_start, alt_end;
> + unsigned long base_pfn = page_to_pfn(page);
> +
> + /*
> + * with 2M vmemmap mmaping we can have things setup
> + * such that even though atlmap is specified we never
> + * used altmap.
> + */
> + alt_start = altmap->base_pfn;
> + alt_end = altmap->base_pfn + altmap->reserve +
> + altmap->free + altmap->alloc + altmap->align;
> +
> + if (base_pfn >= alt_start && base_pfn < alt_end) {
> + vmem_altmap_free(altmap, nr_pages);
> + return;
> + }
> + }
> +
Please take this diff on top of this patch when adding this series to
-mm .
commit 613569d9517be60611a86bf4b9821b150c4c4954
Author: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Date: Mon Jul 24 22:49:29 2023 +0530
powerpc/mm/altmap: Fix altmap boundary check
altmap->free includes the entire free space from which altmap blocks
can be allocated. So when checking whether the kernel is doing altmap
block free, compute the boundary correctly.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 7761c2e93bff..ed63c2953b54 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -766,8 +766,7 @@ static void __meminit free_vmemmap_pages(struct page *page,
* used altmap.
*/
alt_start = altmap->base_pfn;
- alt_end = altmap->base_pfn + altmap->reserve +
- altmap->free + altmap->alloc + altmap->align;
+ alt_end = altmap->base_pfn + altmap->reserve + altmap->free;
if (base_pfn >= alt_start && base_pfn < alt_end) {
vmem_altmap_free(altmap, nr_pages);
WARNING: multiple messages have this Message-ID (diff)
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: linux-mm@kvack.org, akpm@linux-foundation.org,
mpe@ellerman.id.au, linuxppc-dev@lists.ozlabs.org,
npiggin@gmail.com, christophe.leroy@csgroup.eu
Cc: Oscar Salvador <osalvador@suse.de>,
Mike Kravetz <mike.kravetz@oracle.com>,
Dan Williams <dan.j.williams@intel.com>,
Joao Martins <joao.m.martins@oracle.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Muchun Song <muchun.song@linux.dev>,
Will Deacon <will@kernel.org>
Subject: Re: [PATCH v5 10/13] powerpc/book3s64/vmemmap: Switch radix to use a different vmemmap handling function
Date: Mon, 24 Jul 2023 23:59:27 +0530 [thread overview]
Message-ID: <87edkx2mew.fsf@linux.ibm.com> (raw)
In-Reply-To: <20230718022934.90447-11-aneesh.kumar@linux.ibm.com>
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> This is in preparation to update radix to implement vmemmap optimization
> for devdax. Below are the rules w.r.t radix vmemmap mapping
>
> 1. First try to map things using PMD (2M)
> 2. With altmap if altmap cross-boundary check returns true, fall back to
> PAGE_SIZE
> 3. If we can't allocate PMD_SIZE backing memory for vmemmap, fallback to
> PAGE_SIZE
>
> On removing vmemmap mapping, check if every subsection that is using the
> vmemmap area is invalid. If found to be invalid, that implies we can safely
> free the vmemmap area. We don't use the PAGE_UNUSED pattern used by x86
> because with 64K page size, we need to do the above check even at the
> PAGE_SIZE granularity.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
> arch/powerpc/include/asm/book3s/64/radix.h | 2 +
> arch/powerpc/include/asm/pgtable.h | 4 +
> arch/powerpc/mm/book3s64/radix_pgtable.c | 326 +++++++++++++++++++--
> arch/powerpc/mm/init_64.c | 26 +-
> 4 files changed, 327 insertions(+), 31 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
> index 2ef92f36340f..f1461289643a 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -331,6 +331,8 @@ extern int __meminit radix__vmemmap_create_mapping(unsigned long start,
> unsigned long phys);
> int __meminit radix__vmemmap_populate(unsigned long start, unsigned long end,
> int node, struct vmem_altmap *altmap);
> +void __ref radix__vmemmap_free(unsigned long start, unsigned long end,
> + struct vmem_altmap *altmap);
> extern void radix__vmemmap_remove_mapping(unsigned long start,
> unsigned long page_size);
>
> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
> index 6a88bfdaa69b..68817ea7f994 100644
> --- a/arch/powerpc/include/asm/pgtable.h
> +++ b/arch/powerpc/include/asm/pgtable.h
> @@ -165,6 +165,10 @@ static inline bool is_ioremap_addr(const void *x)
>
> return addr >= IOREMAP_BASE && addr < IOREMAP_END;
> }
> +
> +int __meminit vmemmap_populated(unsigned long vmemmap_addr, int vmemmap_map_size);
> +bool altmap_cross_boundary(struct vmem_altmap *altmap, unsigned long start,
> + unsigned long page_size);
> #endif /* CONFIG_PPC64 */
>
> #endif /* __ASSEMBLY__ */
> diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
> index 227fea53c217..9a7f3707b6fb 100644
> --- a/arch/powerpc/mm/book3s64/radix_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
> @@ -744,8 +744,59 @@ static void free_pud_table(pud_t *pud_start, p4d_t *p4d)
> p4d_clear(p4d);
> }
>
> +#ifdef CONFIG_SPARSEMEM_VMEMMAP
> +static bool __meminit vmemmap_pmd_is_unused(unsigned long addr, unsigned long end)
> +{
> + unsigned long start = ALIGN_DOWN(addr, PMD_SIZE);
> +
> + return !vmemmap_populated(start, PMD_SIZE);
> +}
> +
> +static bool __meminit vmemmap_page_is_unused(unsigned long addr, unsigned long end)
> +{
> + unsigned long start = ALIGN_DOWN(addr, PAGE_SIZE);
> +
> + return !vmemmap_populated(start, PAGE_SIZE);
> +
> +}
> +#endif
> +
> +static void __meminit free_vmemmap_pages(struct page *page,
> + struct vmem_altmap *altmap,
> + int order)
> +{
> + unsigned int nr_pages = 1 << order;
> +
> + if (altmap) {
> + unsigned long alt_start, alt_end;
> + unsigned long base_pfn = page_to_pfn(page);
> +
> + /*
> + * with 2M vmemmap mmaping we can have things setup
> + * such that even though atlmap is specified we never
> + * used altmap.
> + */
> + alt_start = altmap->base_pfn;
> + alt_end = altmap->base_pfn + altmap->reserve +
> + altmap->free + altmap->alloc + altmap->align;
> +
> + if (base_pfn >= alt_start && base_pfn < alt_end) {
> + vmem_altmap_free(altmap, nr_pages);
> + return;
> + }
> + }
> +
Please take this diff on top of this patch when adding this series to
-mm .
commit 613569d9517be60611a86bf4b9821b150c4c4954
Author: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Date: Mon Jul 24 22:49:29 2023 +0530
powerpc/mm/altmap: Fix altmap boundary check
altmap->free includes the entire free space from which altmap blocks
can be allocated. So when checking whether the kernel is doing altmap
block free, compute the boundary correctly.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 7761c2e93bff..ed63c2953b54 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -766,8 +766,7 @@ static void __meminit free_vmemmap_pages(struct page *page,
* used altmap.
*/
alt_start = altmap->base_pfn;
- alt_end = altmap->base_pfn + altmap->reserve +
- altmap->free + altmap->alloc + altmap->align;
+ alt_end = altmap->base_pfn + altmap->reserve + altmap->free;
if (base_pfn >= alt_start && base_pfn < alt_end) {
vmem_altmap_free(altmap, nr_pages);
next prev parent reply other threads:[~2023-07-24 18:30 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-18 2:29 [PATCH v5 00/13] Add support for DAX vmemmap optimization for ppc64 Aneesh Kumar K.V
2023-07-18 2:29 ` Aneesh Kumar K.V
2023-07-18 2:29 ` [PATCH v5 01/13] mm/hugepage pud: Allow arch-specific helper function to check huge page pud support Aneesh Kumar K.V
2023-07-18 2:29 ` Aneesh Kumar K.V
2023-07-18 2:29 ` [PATCH v5 02/13] mm: Change pudp_huge_get_and_clear_full take vm_area_struct as arg Aneesh Kumar K.V
2023-07-18 2:29 ` [PATCH v5 03/13] mm/vmemmap: Improve vmemmap_can_optimize and allow architectures to override Aneesh Kumar K.V
2023-07-18 2:29 ` [PATCH v5 04/13] mm/vmemmap: Allow architectures to override how vmemmap optimization works Aneesh Kumar K.V
2023-07-18 2:29 ` Aneesh Kumar K.V
2023-07-18 2:29 ` [PATCH v5 05/13] mm: Add pud_same similar to __HAVE_ARCH_P4D_SAME Aneesh Kumar K.V
2023-07-18 2:29 ` Aneesh Kumar K.V
2023-07-18 2:29 ` [PATCH v5 06/13] mm/huge pud: Use transparent huge pud helpers only with CONFIG_TRANSPARENT_HUGEPAGE Aneesh Kumar K.V
2023-07-18 2:29 ` [PATCH v5 07/13] mm/vmemmap optimization: Split hugetlb and devdax vmemmap optimization Aneesh Kumar K.V
2023-07-18 2:29 ` Aneesh Kumar K.V
2023-07-18 2:29 ` [PATCH v5 08/13] powerpc/mm/trace: Convert trace event to trace event class Aneesh Kumar K.V
2023-07-18 2:29 ` Aneesh Kumar K.V
2023-07-18 2:29 ` [PATCH v5 09/13] powerpc/book3s64/mm: Enable transparent pud hugepage Aneesh Kumar K.V
2023-07-18 2:29 ` Aneesh Kumar K.V
2023-07-18 2:29 ` [PATCH v5 10/13] powerpc/book3s64/vmemmap: Switch radix to use a different vmemmap handling function Aneesh Kumar K.V
2023-07-18 2:29 ` Aneesh Kumar K.V
2023-07-24 18:29 ` Aneesh Kumar K.V [this message]
2023-07-24 18:29 ` Aneesh Kumar K.V
2023-07-24 18:48 ` Andrew Morton
2023-07-24 18:48 ` Andrew Morton
2023-07-18 2:29 ` [PATCH v5 11/13] powerpc/book3s64/radix: Add support for vmemmap optimization for radix Aneesh Kumar K.V
2023-07-18 2:29 ` Aneesh Kumar K.V
2023-07-18 2:29 ` [PATCH v5 12/13] powerpc/book3s64/radix: Remove mmu_vmemmap_psize Aneesh Kumar K.V
2023-07-18 2:29 ` Aneesh Kumar K.V
2023-07-18 2:29 ` [PATCH v5 13/13] powerpc/book3s64/radix: Add debug message to give more details of vmemmap allocation Aneesh Kumar K.V
2023-07-18 2:29 ` Aneesh Kumar K.V
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87edkx2mew.fsf@linux.ibm.com \
--to=aneesh.kumar@linux.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=catalin.marinas@arm.com \
--cc=christophe.leroy@csgroup.eu \
--cc=dan.j.williams@intel.com \
--cc=joao.m.martins@oracle.com \
--cc=linux-mm@kvack.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mike.kravetz@oracle.com \
--cc=mpe@ellerman.id.au \
--cc=muchun.song@linux.dev \
--cc=npiggin@gmail.com \
--cc=osalvador@suse.de \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.