* [PATCH v14 0/3] riscv, mm: detect svnapot cpu support at runtime
@ 2023-03-08 7:48 Qinglin Pan
2023-03-08 7:48 ` [PATCH v14 1/3] riscv: mm: modify pte format for Svnapot Qinglin Pan
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Qinglin Pan @ 2023-03-08 7:48 UTC (permalink / raw)
To: paul.walmsley, palmer, linux-riscv
Cc: jeff, xuyinan, conor, ajones, Qinglin Pan
Svnapot is a RISC-V extension for marking contiguous 4K pages as a non-4K
page. This patch set is for using Svnapot in hugetlb fs and huge vmap.
This patchset adds a Kconfig item for using Svnapot in
"Platform type"->"SVNAPOT extension support". Its default value is on,
and people can set it off if they don't allow kernel to detect Svnapot
hardware support and leverage it.
Tested on:
- qemu rv64 with "Svnapot support" off and svnapot=true.
- qemu rv64 with "Svnapot support" on and svnapot=true.
- qemu rv64 with "Svnapot support" off and svnapot=false.
- qemu rv64 with "Svnapot support" on and svnapot=false.
Changes in v2:
- detect Svnapot hardware support at boot time.
Changes in v3:
- do linear mapping again if has_svnapot
Changes in v4:
- fix some errors/warns reported by checkpatch.pl, thanks @Conor
Changes in v5:
- modify code according to @Conor and @Heiko
Changes in v6:
- use static key insead of alternative errata
Changes in v7:
- add napot_cont_order for possible more napot order in the future
- remove linear mapping related code from this patchset to another patch
- refactor hugetlb code for svnapot according to thanks @Andrew @Conor
- use tools/testing/selftests/vm/map_hugetlb as hugetlb testing program
- support svnapot in huge vmap on newer for-next branch
Changes in v8:
- fix compilation errors in rv32_defconfig
- insert some lines of whitespace according to @Conor's suggestion
Changes in v9:
- use alternative to avoid using static branches inside heavily used
inline functions
- change napot_cont_mask definition
- post test_vmalloc modification about testing vmalloc_huge
Changes in v10:
- fix some nits caught by @Andrew
- collect Reviewed-by/Acked-by
- add memory leak warning in KConfig text
- replace test_vmalloc patch link with the standard one
Changes in v11:
- add more detailed warning about HUGETLBFS and SVNAPOT in KConfig text
- fix missing reverse-xmas tree
Changes in v12:
- rebase on the new ISA extension API [1]
Changes in v13:
- fix warnings reported by patchwork
Changes in v14:
- rebase on for-next
[1]https://lore.kernel.org/all/20230128172856.3814-5-jszhang@kernel.org/
Qinglin Pan (3):
riscv: mm: modify pte format for Svnapot
riscv: mm: support Svnapot in hugetlb page
riscv: mm: support Svnapot in huge vmap
arch/riscv/Kconfig | 21 +-
arch/riscv/include/asm/hugetlb.h | 34 +++-
arch/riscv/include/asm/hwcap.h | 3 +-
arch/riscv/include/asm/page.h | 5 -
arch/riscv/include/asm/pgtable-64.h | 34 ++++
arch/riscv/include/asm/pgtable.h | 39 +++-
arch/riscv/include/asm/vmalloc.h | 61 +++++-
arch/riscv/kernel/cpu.c | 1 +
arch/riscv/kernel/cpufeature.c | 1 +
arch/riscv/mm/hugetlbpage.c | 301 ++++++++++++++++++++++++++++
10 files changed, 490 insertions(+), 10 deletions(-)
--
2.39.2
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v14 1/3] riscv: mm: modify pte format for Svnapot
2023-03-08 7:48 [PATCH v14 0/3] riscv, mm: detect svnapot cpu support at runtime Qinglin Pan
@ 2023-03-08 7:48 ` Qinglin Pan
2023-03-08 9:09 ` Conor Dooley
2023-03-08 7:48 ` [PATCH v14 2/3] riscv: mm: support Svnapot in hugetlb page Qinglin Pan
` (2 subsequent siblings)
3 siblings, 1 reply; 7+ messages in thread
From: Qinglin Pan @ 2023-03-08 7:48 UTC (permalink / raw)
To: paul.walmsley, palmer, linux-riscv
Cc: jeff, xuyinan, conor, ajones, Qinglin Pan, Qinglin Pan
From: Qinglin Pan <panqinglin2020@iscas.ac.cn>
Add one alternative to enable/disable svnapot support, enable this static
key when "svnapot" is in the "riscv,isa" field of fdt and SVNAPOT compile
option is set. It will influence the behavior of has_svnapot. All code
dependent on svnapot should make sure that has_svnapot return true firstly.
Modify PTE definition for Svnapot, and creates some functions in pgtable.h
to mark a PTE as napot and check if it is a Svnapot PTE. Until now, only
64KB napot size is supported in spec, so some macros has only 64KB version.
Signed-off-by: Qinglin Pan <panqinglin00@gmail.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index c5e42cc37604..43a101369aa5 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -398,6 +398,25 @@ config RISCV_ISA_C
If you don't know what to do here, say Y.
+config RISCV_ISA_SVNAPOT
+ bool "SVNAPOT extension support"
+ depends on 64BIT && MMU
+ default y
+ select RISCV_ALTERNATIVE
+ help
+ Allow kernel to detect the SVNAPOT ISA-extension dynamically at boot
+ time and enable its usage.
+
+ The SVNAPOT extension is used to mark contiguous PTEs as a range
+ of contiguous virtual-to-physical translations for a naturally
+ aligned power-of-2 (NAPOT) granularity larger than the base 4KB page
+ size. When HUGETLBFS is also selected this option unconditionally
+ allocates some memory for each NAPOT page size supported by the kernel.
+ When optimizing for low memory consumption and for platforms without
+ the SVNAPOT extension, it may be better to say N here.
+
+ If you don't know what to do here, say Y.
+
config RISCV_ISA_SVPBMT
bool "SVPBMT extension support"
depends on 64BIT && MMU
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index e3021b2590de..ba9f4650f051 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -42,9 +42,10 @@
#define RISCV_ISA_EXT_ZBB 30
#define RISCV_ISA_EXT_ZICBOM 31
#define RISCV_ISA_EXT_ZIHINTPAUSE 32
+#define RISCV_ISA_EXT_SVNAPOT 33
#define RISCV_ISA_EXT_MAX 64
-#define RISCV_ISA_EXT_NAME_LEN_MAX 32
+#define RISCV_ISA_EXT_NAME_LEN_MAX 33
#ifndef __ASSEMBLY__
diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
index 7fed7c431928..0f3220e253bc 100644
--- a/arch/riscv/include/asm/page.h
+++ b/arch/riscv/include/asm/page.h
@@ -16,11 +16,6 @@
#define PAGE_SIZE (_AC(1, UL) << PAGE_SHIFT)
#define PAGE_MASK (~(PAGE_SIZE - 1))
-#ifdef CONFIG_64BIT
-#define HUGE_MAX_HSTATE 2
-#else
-#define HUGE_MAX_HSTATE 1
-#endif
#define HPAGE_SHIFT PMD_SHIFT
#define HPAGE_SIZE (_AC(1, UL) << HPAGE_SHIFT)
#define HPAGE_MASK (~(HPAGE_SIZE - 1))
diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h
index 42a042c0e13e..7a5097202e15 100644
--- a/arch/riscv/include/asm/pgtable-64.h
+++ b/arch/riscv/include/asm/pgtable-64.h
@@ -78,6 +78,40 @@ typedef struct {
*/
#define _PAGE_PFN_MASK GENMASK(53, 10)
+/*
+ * [63] Svnapot definitions:
+ * 0 Svnapot disabled
+ * 1 Svnapot enabled
+ */
+#define _PAGE_NAPOT_SHIFT 63
+#define _PAGE_NAPOT BIT(_PAGE_NAPOT_SHIFT)
+/*
+ * Only 64KB (order 4) napot ptes supported.
+ */
+#define NAPOT_CONT_ORDER_BASE 4
+enum napot_cont_order {
+ NAPOT_CONT64KB_ORDER = NAPOT_CONT_ORDER_BASE,
+ NAPOT_ORDER_MAX,
+};
+
+#define for_each_napot_order(order) \
+ for (order = NAPOT_CONT_ORDER_BASE; order < NAPOT_ORDER_MAX; order++)
+#define for_each_napot_order_rev(order) \
+ for (order = NAPOT_ORDER_MAX - 1; \
+ order >= NAPOT_CONT_ORDER_BASE; order--)
+#define napot_cont_order(val) (__builtin_ctzl((val.pte >> _PAGE_PFN_SHIFT) << 1))
+
+#define napot_cont_shift(order) ((order) + PAGE_SHIFT)
+#define napot_cont_size(order) BIT(napot_cont_shift(order))
+#define napot_cont_mask(order) (~(napot_cont_size(order) - 1UL))
+#define napot_pte_num(order) BIT(order)
+
+#ifdef CONFIG_RISCV_ISA_SVNAPOT
+#define HUGE_MAX_HSTATE (2 + (NAPOT_ORDER_MAX - NAPOT_CONT_ORDER_BASE))
+#else
+#define HUGE_MAX_HSTATE 2
+#endif
+
/*
* [62:61] Svpbmt Memory Type definitions:
*
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index ab05f892d317..3303fc03d724 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -264,10 +264,47 @@ static inline pte_t pud_pte(pud_t pud)
return __pte(pud_val(pud));
}
+#ifdef CONFIG_RISCV_ISA_SVNAPOT
+
+static __always_inline bool has_svnapot(void)
+{
+ return riscv_has_extension_likely(RISCV_ISA_EXT_SVNAPOT);
+}
+
+static inline unsigned long pte_napot(pte_t pte)
+{
+ return pte_val(pte) & _PAGE_NAPOT;
+}
+
+static inline pte_t pte_mknapot(pte_t pte, unsigned int order)
+{
+ int pos = order - 1 + _PAGE_PFN_SHIFT;
+ unsigned long napot_bit = BIT(pos);
+ unsigned long napot_mask = ~GENMASK(pos, _PAGE_PFN_SHIFT);
+
+ return __pte((pte_val(pte) & napot_mask) | napot_bit | _PAGE_NAPOT);
+}
+
+#else
+
+static __always_inline bool has_svnapot(void) { return false; }
+
+static inline unsigned long pte_napot(pte_t pte)
+{
+ return 0;
+}
+
+#endif /* CONFIG_RISCV_ISA_SVNAPOT */
+
/* Yields the page frame number (PFN) of a page table entry */
static inline unsigned long pte_pfn(pte_t pte)
{
- return __page_val_to_pfn(pte_val(pte));
+ unsigned long res = __page_val_to_pfn(pte_val(pte));
+
+ if (has_svnapot() && pte_napot(pte))
+ res = res & (res - 1UL);
+
+ return res;
}
#define pte_page(x) pfn_to_page(pte_pfn(x))
diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index 8400f0cc9704..7d0271c4d525 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -191,6 +191,7 @@ static struct riscv_isa_ext_data isa_ext_arr[] = {
__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
__RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC),
__RISCV_ISA_EXT_DATA(svinval, RISCV_ISA_EXT_SVINVAL),
+ __RISCV_ISA_EXT_DATA(svnapot, RISCV_ISA_EXT_SVNAPOT),
__RISCV_ISA_EXT_DATA(svpbmt, RISCV_ISA_EXT_SVPBMT),
__RISCV_ISA_EXT_DATA("", RISCV_ISA_EXT_MAX),
};
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 59d58ee0f68d..a4728bf60b84 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -224,6 +224,7 @@ void __init riscv_fill_hwcap(void)
SET_ISA_EXT_MAP("sscofpmf", RISCV_ISA_EXT_SSCOFPMF);
SET_ISA_EXT_MAP("sstc", RISCV_ISA_EXT_SSTC);
SET_ISA_EXT_MAP("svinval", RISCV_ISA_EXT_SVINVAL);
+ SET_ISA_EXT_MAP("svnapot", RISCV_ISA_EXT_SVNAPOT);
SET_ISA_EXT_MAP("svpbmt", RISCV_ISA_EXT_SVPBMT);
SET_ISA_EXT_MAP("zbb", RISCV_ISA_EXT_ZBB);
SET_ISA_EXT_MAP("zicbom", RISCV_ISA_EXT_ZICBOM);
--
2.39.2
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v14 2/3] riscv: mm: support Svnapot in hugetlb page
2023-03-08 7:48 [PATCH v14 0/3] riscv, mm: detect svnapot cpu support at runtime Qinglin Pan
2023-03-08 7:48 ` [PATCH v14 1/3] riscv: mm: modify pte format for Svnapot Qinglin Pan
@ 2023-03-08 7:48 ` Qinglin Pan
2023-03-08 7:48 ` [PATCH v14 3/3] riscv: mm: support Svnapot in huge vmap Qinglin Pan
2023-03-15 1:40 ` [PATCH v14 0/3] riscv, mm: detect svnapot cpu support at runtime patchwork-bot+linux-riscv
3 siblings, 0 replies; 7+ messages in thread
From: Qinglin Pan @ 2023-03-08 7:48 UTC (permalink / raw)
To: paul.walmsley, palmer, linux-riscv
Cc: jeff, xuyinan, conor, ajones, Qinglin Pan, Qinglin Pan
From: Qinglin Pan <panqinglin2020@iscas.ac.cn>
Svnapot can be used to support 64KB hugetlb page, so it can become a new
option when using hugetlbfs. Add a basic implementation of hugetlb page,
and support 64KB as a size in it by using Svnapot.
For test, boot kernel with command line contains "default_hugepagesz=64K
hugepagesz=64K hugepages=20" and run a simple test like this:
tools/testing/selftests/vm/map_hugetlb 1 16
And it should be passed.
Signed-off-by: Qinglin Pan <panqinglin00@gmail.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 43a101369aa5..c736dc8e2593 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -44,7 +44,7 @@ config RISCV
select ARCH_USE_QUEUED_RWLOCKS
select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
select ARCH_WANT_FRAME_POINTERS
- select ARCH_WANT_GENERAL_HUGETLB
+ select ARCH_WANT_GENERAL_HUGETLB if !RISCV_ISA_SVNAPOT
select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
select ARCH_WANT_HUGE_PMD_SHARE if 64BIT
select ARCH_WANT_LD_ORPHAN_WARN if !XIP_KERNEL
diff --git a/arch/riscv/include/asm/hugetlb.h b/arch/riscv/include/asm/hugetlb.h
index ec19d6afc896..fe6f23006641 100644
--- a/arch/riscv/include/asm/hugetlb.h
+++ b/arch/riscv/include/asm/hugetlb.h
@@ -2,7 +2,6 @@
#ifndef _ASM_RISCV_HUGETLB_H
#define _ASM_RISCV_HUGETLB_H
-#include <asm-generic/hugetlb.h>
#include <asm/page.h>
static inline void arch_clear_hugepage_flags(struct page *page)
@@ -11,4 +10,37 @@ static inline void arch_clear_hugepage_flags(struct page *page)
}
#define arch_clear_hugepage_flags arch_clear_hugepage_flags
+#ifdef CONFIG_RISCV_ISA_SVNAPOT
+#define __HAVE_ARCH_HUGE_PTE_CLEAR
+void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep, unsigned long sz);
+
+#define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
+void set_huge_pte_at(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep, pte_t pte);
+
+#define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
+pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep);
+
+#define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
+pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep);
+
+#define __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT
+void huge_ptep_set_wrprotect(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep);
+
+#define __HAVE_ARCH_HUGE_PTEP_SET_ACCESS_FLAGS
+int huge_ptep_set_access_flags(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep,
+ pte_t pte, int dirty);
+
+pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags);
+#define arch_make_huge_pte arch_make_huge_pte
+
+#endif /*CONFIG_RISCV_ISA_SVNAPOT*/
+
+#include <asm-generic/hugetlb.h>
+
#endif /* _ASM_RISCV_HUGETLB_H */
diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
index 932dadfdca54..a163a3e0f0d4 100644
--- a/arch/riscv/mm/hugetlbpage.c
+++ b/arch/riscv/mm/hugetlbpage.c
@@ -2,6 +2,305 @@
#include <linux/hugetlb.h>
#include <linux/err.h>
+#ifdef CONFIG_RISCV_ISA_SVNAPOT
+pte_t *huge_pte_alloc(struct mm_struct *mm,
+ struct vm_area_struct *vma,
+ unsigned long addr,
+ unsigned long sz)
+{
+ unsigned long order;
+ pte_t *pte = NULL;
+ pgd_t *pgd;
+ p4d_t *p4d;
+ pud_t *pud;
+ pmd_t *pmd;
+
+ pgd = pgd_offset(mm, addr);
+ p4d = p4d_alloc(mm, pgd, addr);
+ if (!p4d)
+ return NULL;
+
+ pud = pud_alloc(mm, p4d, addr);
+ if (!pud)
+ return NULL;
+
+ if (sz == PUD_SIZE) {
+ pte = (pte_t *)pud;
+ goto out;
+ }
+
+ if (sz == PMD_SIZE) {
+ if (want_pmd_share(vma, addr) && pud_none(*pud))
+ pte = huge_pmd_share(mm, vma, addr, pud);
+ else
+ pte = (pte_t *)pmd_alloc(mm, pud, addr);
+ goto out;
+ }
+
+ pmd = pmd_alloc(mm, pud, addr);
+ if (!pmd)
+ return NULL;
+
+ for_each_napot_order(order) {
+ if (napot_cont_size(order) == sz) {
+ pte = pte_alloc_map(mm, pmd, addr & napot_cont_mask(order));
+ break;
+ }
+ }
+
+out:
+ WARN_ON_ONCE(pte && pte_present(*pte) && !pte_huge(*pte));
+ return pte;
+}
+
+pte_t *huge_pte_offset(struct mm_struct *mm,
+ unsigned long addr,
+ unsigned long sz)
+{
+ unsigned long order;
+ pte_t *pte = NULL;
+ pgd_t *pgd;
+ p4d_t *p4d;
+ pud_t *pud;
+ pmd_t *pmd;
+
+ pgd = pgd_offset(mm, addr);
+ if (!pgd_present(*pgd))
+ return NULL;
+
+ p4d = p4d_offset(pgd, addr);
+ if (!p4d_present(*p4d))
+ return NULL;
+
+ pud = pud_offset(p4d, addr);
+ if (sz == PUD_SIZE)
+ /* must be pud huge, non-present or none */
+ return (pte_t *)pud;
+
+ if (!pud_present(*pud))
+ return NULL;
+
+ pmd = pmd_offset(pud, addr);
+ if (sz == PMD_SIZE)
+ /* must be pmd huge, non-present or none */
+ return (pte_t *)pmd;
+
+ if (!pmd_present(*pmd))
+ return NULL;
+
+ for_each_napot_order(order) {
+ if (napot_cont_size(order) == sz) {
+ pte = pte_offset_kernel(pmd, addr & napot_cont_mask(order));
+ break;
+ }
+ }
+ return pte;
+}
+
+static pte_t get_clear_contig(struct mm_struct *mm,
+ unsigned long addr,
+ pte_t *ptep,
+ unsigned long pte_num)
+{
+ pte_t orig_pte = ptep_get(ptep);
+ unsigned long i;
+
+ for (i = 0; i < pte_num; i++, addr += PAGE_SIZE, ptep++) {
+ pte_t pte = ptep_get_and_clear(mm, addr, ptep);
+
+ if (pte_dirty(pte))
+ orig_pte = pte_mkdirty(orig_pte);
+
+ if (pte_young(pte))
+ orig_pte = pte_mkyoung(orig_pte);
+ }
+
+ return orig_pte;
+}
+
+static pte_t get_clear_contig_flush(struct mm_struct *mm,
+ unsigned long addr,
+ pte_t *ptep,
+ unsigned long pte_num)
+{
+ pte_t orig_pte = get_clear_contig(mm, addr, ptep, pte_num);
+ struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
+ bool valid = !pte_none(orig_pte);
+
+ if (valid)
+ flush_tlb_range(&vma, addr, addr + (PAGE_SIZE * pte_num));
+
+ return orig_pte;
+}
+
+pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags)
+{
+ unsigned long order;
+
+ for_each_napot_order(order) {
+ if (shift == napot_cont_shift(order)) {
+ entry = pte_mknapot(entry, order);
+ break;
+ }
+ }
+ if (order == NAPOT_ORDER_MAX)
+ entry = pte_mkhuge(entry);
+
+ return entry;
+}
+
+void set_huge_pte_at(struct mm_struct *mm,
+ unsigned long addr,
+ pte_t *ptep,
+ pte_t pte)
+{
+ int i, pte_num;
+
+ if (!pte_napot(pte)) {
+ set_pte_at(mm, addr, ptep, pte);
+ return;
+ }
+
+ pte_num = napot_pte_num(napot_cont_order(pte));
+ for (i = 0; i < pte_num; i++, ptep++, addr += PAGE_SIZE)
+ set_pte_at(mm, addr, ptep, pte);
+}
+
+int huge_ptep_set_access_flags(struct vm_area_struct *vma,
+ unsigned long addr,
+ pte_t *ptep,
+ pte_t pte,
+ int dirty)
+{
+ struct mm_struct *mm = vma->vm_mm;
+ unsigned long order;
+ pte_t orig_pte;
+ int i, pte_num;
+
+ if (!pte_napot(pte))
+ return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
+
+ order = napot_cont_order(pte);
+ pte_num = napot_pte_num(order);
+ ptep = huge_pte_offset(mm, addr, napot_cont_size(order));
+ orig_pte = get_clear_contig_flush(mm, addr, ptep, pte_num);
+
+ if (pte_dirty(orig_pte))
+ pte = pte_mkdirty(pte);
+
+ if (pte_young(orig_pte))
+ pte = pte_mkyoung(pte);
+
+ for (i = 0; i < pte_num; i++, addr += PAGE_SIZE, ptep++)
+ set_pte_at(mm, addr, ptep, pte);
+
+ return true;
+}
+
+pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
+ unsigned long addr,
+ pte_t *ptep)
+{
+ pte_t orig_pte = ptep_get(ptep);
+ int pte_num;
+
+ if (!pte_napot(orig_pte))
+ return ptep_get_and_clear(mm, addr, ptep);
+
+ pte_num = napot_pte_num(napot_cont_order(orig_pte));
+
+ return get_clear_contig(mm, addr, ptep, pte_num);
+}
+
+void huge_ptep_set_wrprotect(struct mm_struct *mm,
+ unsigned long addr,
+ pte_t *ptep)
+{
+ pte_t pte = ptep_get(ptep);
+ unsigned long order;
+ int i, pte_num;
+
+ if (!pte_napot(pte)) {
+ ptep_set_wrprotect(mm, addr, ptep);
+ return;
+ }
+
+ order = napot_cont_order(pte);
+ pte_num = napot_pte_num(order);
+ ptep = huge_pte_offset(mm, addr, napot_cont_size(order));
+
+ for (i = 0; i < pte_num; i++, addr += PAGE_SIZE, ptep++)
+ ptep_set_wrprotect(mm, addr, ptep);
+}
+
+pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
+ unsigned long addr,
+ pte_t *ptep)
+{
+ pte_t pte = ptep_get(ptep);
+ int pte_num;
+
+ if (!pte_napot(pte))
+ return ptep_clear_flush(vma, addr, ptep);
+
+ pte_num = napot_pte_num(napot_cont_order(pte));
+
+ return get_clear_contig_flush(vma->vm_mm, addr, ptep, pte_num);
+}
+
+void huge_pte_clear(struct mm_struct *mm,
+ unsigned long addr,
+ pte_t *ptep,
+ unsigned long sz)
+{
+ pte_t pte = READ_ONCE(*ptep);
+ int i, pte_num;
+
+ if (!pte_napot(pte)) {
+ pte_clear(mm, addr, ptep);
+ return;
+ }
+
+ pte_num = napot_pte_num(napot_cont_order(pte));
+ for (i = 0; i < pte_num; i++, addr += PAGE_SIZE, ptep++)
+ pte_clear(mm, addr, ptep);
+}
+
+static __init bool is_napot_size(unsigned long size)
+{
+ unsigned long order;
+
+ if (!has_svnapot())
+ return false;
+
+ for_each_napot_order(order) {
+ if (size == napot_cont_size(order))
+ return true;
+ }
+ return false;
+}
+
+static __init int napot_hugetlbpages_init(void)
+{
+ if (has_svnapot()) {
+ unsigned long order;
+
+ for_each_napot_order(order)
+ hugetlb_add_hstate(order);
+ }
+ return 0;
+}
+arch_initcall(napot_hugetlbpages_init);
+
+#else
+
+static __init bool is_napot_size(unsigned long size)
+{
+ return false;
+}
+
+#endif /*CONFIG_RISCV_ISA_SVNAPOT*/
+
int pud_huge(pud_t pud)
{
return pud_leaf(pud);
@@ -18,6 +317,8 @@ bool __init arch_hugetlb_valid_size(unsigned long size)
return true;
else if (IS_ENABLED(CONFIG_64BIT) && size == PUD_SIZE)
return true;
+ else if (is_napot_size(size))
+ return true;
else
return false;
}
--
2.39.2
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v14 3/3] riscv: mm: support Svnapot in huge vmap
2023-03-08 7:48 [PATCH v14 0/3] riscv, mm: detect svnapot cpu support at runtime Qinglin Pan
2023-03-08 7:48 ` [PATCH v14 1/3] riscv: mm: modify pte format for Svnapot Qinglin Pan
2023-03-08 7:48 ` [PATCH v14 2/3] riscv: mm: support Svnapot in hugetlb page Qinglin Pan
@ 2023-03-08 7:48 ` Qinglin Pan
2023-03-15 1:40 ` [PATCH v14 0/3] riscv, mm: detect svnapot cpu support at runtime patchwork-bot+linux-riscv
3 siblings, 0 replies; 7+ messages in thread
From: Qinglin Pan @ 2023-03-08 7:48 UTC (permalink / raw)
To: paul.walmsley, palmer, linux-riscv
Cc: jeff, xuyinan, conor, ajones, Qinglin Pan, Qinglin Pan,
Conor Dooley
From: Qinglin Pan <panqinglin2020@iscas.ac.cn>
As HAVE_ARCH_HUGE_VMAP and HAVE_ARCH_HUGE_VMALLOC is supported, we can
implement arch_vmap_pte_range_map_size and arch_vmap_pte_supported_shift
for Svnapot to support huge vmap about napot size.
It can be tested by huge vmap used in pci driver. Huge vmalloc with svnapot
can be tested by test_vmalloc with [1] applied, and probe this
module to run fix_size_alloc_test with use_huge true.
[1]https://lore.kernel.org/all/20221212055657.698420-1-panqinglin2020@iscas.ac.cn/
Signed-off-by: Qinglin Pan <panqinglin00@gmail.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
diff --git a/arch/riscv/include/asm/vmalloc.h b/arch/riscv/include/asm/vmalloc.h
index 48da5371f1e9..58d3e447f191 100644
--- a/arch/riscv/include/asm/vmalloc.h
+++ b/arch/riscv/include/asm/vmalloc.h
@@ -17,6 +17,65 @@ static inline bool arch_vmap_pmd_supported(pgprot_t prot)
return true;
}
-#endif
+#ifdef CONFIG_RISCV_ISA_SVNAPOT
+#include <linux/pgtable.h>
+#define arch_vmap_pte_range_map_size arch_vmap_pte_range_map_size
+static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr, unsigned long end,
+ u64 pfn, unsigned int max_page_shift)
+{
+ unsigned long map_size = PAGE_SIZE;
+ unsigned long size, order;
+
+ if (!has_svnapot())
+ return map_size;
+
+ for_each_napot_order_rev(order) {
+ if (napot_cont_shift(order) > max_page_shift)
+ continue;
+
+ size = napot_cont_size(order);
+ if (end - addr < size)
+ continue;
+
+ if (!IS_ALIGNED(addr, size))
+ continue;
+
+ if (!IS_ALIGNED(PFN_PHYS(pfn), size))
+ continue;
+
+ map_size = size;
+ break;
+ }
+
+ return map_size;
+}
+
+#define arch_vmap_pte_supported_shift arch_vmap_pte_supported_shift
+static inline int arch_vmap_pte_supported_shift(unsigned long size)
+{
+ int shift = PAGE_SHIFT;
+ unsigned long order;
+
+ if (!has_svnapot())
+ return shift;
+
+ WARN_ON_ONCE(size >= PMD_SIZE);
+
+ for_each_napot_order_rev(order) {
+ if (napot_cont_size(order) > size)
+ continue;
+
+ if (!IS_ALIGNED(size, napot_cont_size(order)))
+ continue;
+
+ shift = napot_cont_shift(order);
+ break;
+ }
+
+ return shift;
+}
+
+#endif /* CONFIG_RISCV_ISA_SVNAPOT */
+#endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */
#endif /* _ASM_RISCV_VMALLOC_H */
--
2.39.2
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v14 1/3] riscv: mm: modify pte format for Svnapot
2023-03-08 7:48 ` [PATCH v14 1/3] riscv: mm: modify pte format for Svnapot Qinglin Pan
@ 2023-03-08 9:09 ` Conor Dooley
2023-03-08 11:34 ` Qinglin Pan
0 siblings, 1 reply; 7+ messages in thread
From: Conor Dooley @ 2023-03-08 9:09 UTC (permalink / raw)
To: Qinglin Pan
Cc: paul.walmsley, palmer, linux-riscv, jeff, xuyinan, ajones,
Qinglin Pan
[-- Attachment #1.1: Type: text/plain, Size: 1312 bytes --]
On Wed, Mar 08, 2023 at 03:48:51PM +0800, Qinglin Pan wrote:
> From: Qinglin Pan <panqinglin2020@iscas.ac.cn>
>
> Add one alternative to enable/disable svnapot support, enable this static
> key when "svnapot" is in the "riscv,isa" field of fdt and SVNAPOT compile
> option is set. It will influence the behavior of has_svnapot. All code
> dependent on svnapot should make sure that has_svnapot return true firstly.
>
> Modify PTE definition for Svnapot, and creates some functions in pgtable.h
> to mark a PTE as napot and check if it is a Svnapot PTE. Until now, only
> 64KB napot size is supported in spec, so some macros has only 64KB version.
>
> Signed-off-by: Qinglin Pan <panqinglin00@gmail.com>
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
> index e3021b2590de..ba9f4650f051 100644
> --- a/arch/riscv/include/asm/hwcap.h
> +++ b/arch/riscv/include/asm/hwcap.h
> @@ -42,9 +42,10 @@
> #define RISCV_ISA_EXT_ZBB 30
> #define RISCV_ISA_EXT_ZICBOM 31
> #define RISCV_ISA_EXT_ZIHINTPAUSE 32
> +#define RISCV_ISA_EXT_SVNAPOT 33
>
> #define RISCV_ISA_EXT_MAX 64
> -#define RISCV_ISA_EXT_NAME_LEN_MAX 32
> +#define RISCV_ISA_EXT_NAME_LEN_MAX 33
I think this change is not needed.
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
[-- Attachment #2: Type: text/plain, Size: 161 bytes --]
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v14 1/3] riscv: mm: modify pte format for Svnapot
2023-03-08 9:09 ` Conor Dooley
@ 2023-03-08 11:34 ` Qinglin Pan
0 siblings, 0 replies; 7+ messages in thread
From: Qinglin Pan @ 2023-03-08 11:34 UTC (permalink / raw)
To: Conor Dooley, Qinglin Pan
Cc: paul.walmsley, palmer, linux-riscv, jeff, xuyinan, ajones
On 3/8/23 5:09 PM, Conor Dooley wrote:
> On Wed, Mar 08, 2023 at 03:48:51PM +0800, Qinglin Pan wrote:
>> From: Qinglin Pan <panqinglin2020@iscas.ac.cn>
>>
>> Add one alternative to enable/disable svnapot support, enable this static
>> key when "svnapot" is in the "riscv,isa" field of fdt and SVNAPOT compile
>> option is set. It will influence the behavior of has_svnapot. All code
>> dependent on svnapot should make sure that has_svnapot return true firstly.
>>
>> Modify PTE definition for Svnapot, and creates some functions in pgtable.h
>> to mark a PTE as napot and check if it is a Svnapot PTE. Until now, only
>> 64KB napot size is supported in spec, so some macros has only 64KB version.
>>
>> Signed-off-by: Qinglin Pan <panqinglin00@gmail.com>
>> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
>> diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
>> index e3021b2590de..ba9f4650f051 100644
>> --- a/arch/riscv/include/asm/hwcap.h
>> +++ b/arch/riscv/include/asm/hwcap.h
>> @@ -42,9 +42,10 @@
>> #define RISCV_ISA_EXT_ZBB 30
>> #define RISCV_ISA_EXT_ZICBOM 31
>> #define RISCV_ISA_EXT_ZIHINTPAUSE 32
>> +#define RISCV_ISA_EXT_SVNAPOT 33
>>
>> #define RISCV_ISA_EXT_MAX 64
>
>> -#define RISCV_ISA_EXT_NAME_LEN_MAX 32
>> +#define RISCV_ISA_EXT_NAME_LEN_MAX 33
>
> I think this change is not needed.
Hi Conor,
You are right. I change this by mistake..
A v15 will be sent out two days later :)
Thanks,
Qinglin.
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v14 0/3] riscv, mm: detect svnapot cpu support at runtime
2023-03-08 7:48 [PATCH v14 0/3] riscv, mm: detect svnapot cpu support at runtime Qinglin Pan
` (2 preceding siblings ...)
2023-03-08 7:48 ` [PATCH v14 3/3] riscv: mm: support Svnapot in huge vmap Qinglin Pan
@ 2023-03-15 1:40 ` patchwork-bot+linux-riscv
3 siblings, 0 replies; 7+ messages in thread
From: patchwork-bot+linux-riscv @ 2023-03-15 1:40 UTC (permalink / raw)
To: Qinglin Pan
Cc: linux-riscv, paul.walmsley, palmer, jeff, xuyinan, conor, ajones
Hello:
This series was applied to riscv/linux.git (for-next)
by Palmer Dabbelt <palmer@rivosinc.com>:
On Wed, 8 Mar 2023 15:48:50 +0800 you wrote:
> Svnapot is a RISC-V extension for marking contiguous 4K pages as a non-4K
> page. This patch set is for using Svnapot in hugetlb fs and huge vmap.
>
> This patchset adds a Kconfig item for using Svnapot in
> "Platform type"->"SVNAPOT extension support". Its default value is on,
> and people can set it off if they don't allow kernel to detect Svnapot
> hardware support and leverage it.
>
> [...]
Here is the summary with links:
- [v14,1/3] riscv: mm: modify pte format for Svnapot
(no matching commit)
- [v14,2/3] riscv: mm: support Svnapot in hugetlb page
(no matching commit)
- [v14,3/3] riscv: mm: support Svnapot in huge vmap
https://git.kernel.org/riscv/c/ce173474cf19
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-03-15 1:40 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-03-08 7:48 [PATCH v14 0/3] riscv, mm: detect svnapot cpu support at runtime Qinglin Pan
2023-03-08 7:48 ` [PATCH v14 1/3] riscv: mm: modify pte format for Svnapot Qinglin Pan
2023-03-08 9:09 ` Conor Dooley
2023-03-08 11:34 ` Qinglin Pan
2023-03-08 7:48 ` [PATCH v14 2/3] riscv: mm: support Svnapot in hugetlb page Qinglin Pan
2023-03-08 7:48 ` [PATCH v14 3/3] riscv: mm: support Svnapot in huge vmap Qinglin Pan
2023-03-15 1:40 ` [PATCH v14 0/3] riscv, mm: detect svnapot cpu support at runtime patchwork-bot+linux-riscv
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox