* [powerpc:merge] BUILD SUCCESS 7c25bda14d66718f9fa428808dae289dd84f1da3
From: kernel test robot @ 2020-08-21 5:00 UTC
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
branch HEAD: 7c25bda14d66718f9fa428808dae289dd84f1da3 Automatic merge of 'master', 'next' and 'fixes' (2020-08-20 23:20)
elapsed time: 926m
configs tested: 69
configs skipped: 2
The following configs have been built successfully.
More configs may be tested in the coming days.
arm defconfig
arm64 allyesconfig
arm64 defconfig
arm allyesconfig
arm allmodconfig
m68k m5275evb_defconfig
arm keystone_defconfig
s390 alldefconfig
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k allmodconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
arc allyesconfig
nds32 allnoconfig
c6x allyesconfig
nds32 defconfig
nios2 allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 allyesconfig
parisc allyesconfig
s390 defconfig
i386 allyesconfig
sparc allyesconfig
sparc defconfig
i386 defconfig
mips allyesconfig
mips allmodconfig
powerpc allyesconfig
powerpc allmodconfig
powerpc allnoconfig
powerpc defconfig
i386 randconfig-a002-20200820
i386 randconfig-a004-20200820
i386 randconfig-a005-20200820
i386 randconfig-a003-20200820
i386 randconfig-a006-20200820
i386 randconfig-a001-20200820
x86_64 randconfig-a015-20200820
x86_64 randconfig-a012-20200820
x86_64 randconfig-a016-20200820
x86_64 randconfig-a014-20200820
x86_64 randconfig-a011-20200820
x86_64 randconfig-a013-20200820
i386 randconfig-a013-20200820
i386 randconfig-a012-20200820
i386 randconfig-a011-20200820
i386 randconfig-a016-20200820
i386 randconfig-a014-20200820
i386 randconfig-a015-20200820
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
x86_64 rhel
x86_64 allyesconfig
x86_64 rhel-7.6-kselftests
x86_64 defconfig
x86_64 rhel-8.3
x86_64 kexec
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* [powerpc:fixes-test] BUILD SUCCESS 90a9b102eddf6a3f987d15f4454e26a2532c1c98
From: kernel test robot @ 2020-08-21 5:00 UTC
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git fixes-test
branch HEAD: 90a9b102eddf6a3f987d15f4454e26a2532c1c98 powerpc/pseries: Do not initiate shutdown when system is running on UPS
elapsed time: 927m
configs tested: 75
configs skipped: 75
The following configs have been built successfully.
More configs may be tested in the coming days.
arm defconfig
arm64 allyesconfig
arm64 defconfig
arm allyesconfig
arm allmodconfig
m68k m5275evb_defconfig
arm keystone_defconfig
s390 alldefconfig
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k allmodconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
arc allyesconfig
nds32 allnoconfig
c6x allyesconfig
nds32 defconfig
nios2 allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 allyesconfig
parisc allyesconfig
s390 defconfig
i386 allyesconfig
sparc allyesconfig
sparc defconfig
i386 defconfig
mips allyesconfig
mips allmodconfig
powerpc allyesconfig
powerpc allmodconfig
powerpc allnoconfig
powerpc defconfig
i386 randconfig-a002-20200820
i386 randconfig-a004-20200820
i386 randconfig-a005-20200820
i386 randconfig-a003-20200820
i386 randconfig-a006-20200820
i386 randconfig-a001-20200820
x86_64 randconfig-a015-20200820
x86_64 randconfig-a012-20200820
x86_64 randconfig-a016-20200820
x86_64 randconfig-a014-20200820
x86_64 randconfig-a011-20200820
x86_64 randconfig-a013-20200820
i386 randconfig-a013-20200820
i386 randconfig-a012-20200820
i386 randconfig-a011-20200820
i386 randconfig-a016-20200820
i386 randconfig-a014-20200820
i386 randconfig-a015-20200820
i386 randconfig-a013-20200821
i386 randconfig-a012-20200821
i386 randconfig-a011-20200821
i386 randconfig-a016-20200821
i386 randconfig-a014-20200821
i386 randconfig-a015-20200821
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
x86_64 rhel
x86_64 allyesconfig
x86_64 rhel-7.6-kselftests
x86_64 defconfig
x86_64 rhel-8.3
x86_64 kexec
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* Re: [PATCH v2 3/6] powerpc/32s: Only leave NX unset on segments used for modules
From: Christophe Leroy @ 2020-08-21 5:11 UTC
To: Andreas Schwab; +Cc: Paul Mackerras, linuxppc-dev, linux-kernel
In-Reply-To: <87eeo1kmet.fsf@igel.home>
On 21/08/2020 at 00:00, Andreas Schwab wrote:
> On Jun 29 2020, Christophe Leroy wrote:
>
>> Instead of leaving NX unset on all segments above the start
>> of vmalloc space, only leave NX unset on segments used for
>> modules.
>
> I'm getting this crash:
>
> kernel tried to execute exec-protected page (f294b000) - exploit attempt (uid: 0)
> BUG: Unable to handle kernel instruction fetch
> Faulting instruction address: 0xf294b000
> Oops: Kernel access of bad area, sig: 11 [#1]
> BE PAGE_SIZE=4K MMU=Hash PowerMac
> Modules linked in: pata_macio(+)
> CPU: 0 PID: 87 Comm: udevd Not tainted 5.8.0-rc2-test #49
> NIP: f294b000 LR: c0005c60 CTR: f294b000
> REGS: f18d9cc0 TRAP: 0400 Not tainted (5.8.0-rc2-test)
> MSR: 10009032 <E,ME,IR,DR,RI> CR: 84222422 XER: 20000000
> GPR00: c0005c14 f18d9d78 ef30ca20 00000000 ef0000e0 c00993d0 ef6da038 0000005e
> GPR08: c09050b8 c08b0000 00000000 f18d9d78 44222422 10072070 00000000 0fefaca4
> GPR16: 1006a00c f294d50b 00000120 00000124 c0096ea8 0000000e ef2776c0 ef2776e4
> GPR24: f18fd6e8 00000001 c086fe64 c086fe04 00000000 c08b0000 f294b000 ffffffff
> NIP [f294b000] pata_macio_init+0x0/0xc0 [pata_macio]
> LR [c0005c60] do_one_initcall+0x6c/0x160
> Call Trace:
> [f18d9d78] [c0005c14] do_one_initcall+0x20/0x160 (unreliable)
> [f18d9dd8] [c009a22c] do_init_module+0x60/0x1c0
> [f18d9df8] [c00993d8] load_module+0x16a8/0x1c14
> [f18d9ea8] [c0099aa4] sys_finit_module+0x8c/0x94
> [f18d9f38] [c0012174] ret_from_syscall+0x0/0x34
> --- interrupt: c01 at 0xfdb4318
> LR = 0xfeee9c0
> Instruction dump:
> XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX <3d20c08b> 3d40c086 9421ffe0 8129106c
> ---[ end trace 85a98cc836109871 ]---
>
Please try the patch at
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/07884ed033c31e074747b7eb8eaa329d15db07ec.1596641219.git.christophe.leroy@csgroup.eu/
And if you are using KAsan, also take
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/6eddca2d5611fd57312a88eae31278c87a8fc99d.1596641224.git.christophe.leroy@csgroup.eu/
Although I have some doubt that it will fix it, because the faulting
instruction address is 0xf294b000, which is within the vmalloc area.
In the likely case that the patch doesn't fix the issue, can you provide your
.config and a dump of /sys/kernel/debug/powerpc/segment_registers (you
need CONFIG_PPC_PTDUMP enabled for that), and also the part of the boot
log shown below?
[ 0.000000] Memory: 509556K/524288K available (7088K kernel code,
592K rwdata, 1304K rodata, 356K init, 803K bss, 14732K reserved, 0K
cma-reserved)
[ 0.000000] Kernel virtual memory layout:
[ 0.000000] * 0xff7ff000..0xfffff000 : fixmap
[ 0.000000] * 0xff7fd000..0xff7ff000 : early ioremap
[ 0.000000] * 0xe1000000..0xff7fd000 : vmalloc & ioremap
Thanks
Christophe
* Re: [PATCH v5 5/8] mm: HUGE_VMAP arch support cleanup
From: Christophe Leroy @ 2020-08-21 5:40 UTC
To: Nicholas Piggin, linux-mm, Andrew Morton
Cc: linux-arch, Zefan Li, linuxppc-dev, linux-kernel,
Jonathan Cameron
In-Reply-To: <20200821044427.736424-6-npiggin@gmail.com>
On 21/08/2020 at 06:44, Nicholas Piggin wrote:
> This changes the awkward approach where architectures provide init
> functions to determine which levels they can provide large mappings for,
> to one where the arch is queried for each call.
>
> This removes code and indirection, and allows constant-folding of dead
> code for unsupported levels.
I think that in order to allow constant-folding of dead code for
unsupported levels, you must define arch_vmap_xxx_supported() as static
inline in a .h file.
If you have them in .c files, you'll get calls to tiny functions that
always return false, and the dead code won't be eliminated.
Performance-wise, that's probably not optimal either.
Christophe
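
The point above can be sketched as follows. This is only a minimal
illustration, not code from the series: the pgprot_t typedef is a
stand-in for the kernel type, and try_huge_p4d() is a hypothetical
caller loosely modelled on vmap_try_huge_p4d(). Because the stub's body
is visible in the header, the compiler can fold the check to a constant
and drop the code after it.

```c
#include <stdbool.h>

/* Stand-in for the kernel's pgprot_t, for illustration only. */
typedef struct { unsigned long pgprot; } pgprot_t;

/*
 * As it would appear in a header: the body is visible to every caller,
 * so the compiler can see that the result is the constant false.
 */
static inline bool arch_vmap_p4d_supported(pgprot_t prot)
{
	(void)prot;
	return false;
}

/* Hypothetical caller, loosely modelled on vmap_try_huge_p4d(). */
static int try_huge_p4d(pgprot_t prot)
{
	if (!arch_vmap_p4d_supported(prot))
		return 0;	/* always taken once the stub is inlined */

	/* With the stub inlined, everything from here on is dead code. */
	return 1;
}
```

With the same function defined in a .c file, the caller only sees a
prototype, so the call and the code after it would stay (unless LTO or
similar is in use).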
>
> This also adds a prot argument to the arch query. This is unused
> currently but could help with some architectures (e.g., some powerpc
> processors can't map uncacheable memory with large pages).
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/arm64/mm/mmu.c | 12 +--
> arch/powerpc/mm/book3s64/radix_pgtable.c | 10 ++-
> arch/x86/mm/ioremap.c | 12 +--
> include/linux/io.h | 9 ---
> include/linux/vmalloc.h | 10 +++
> init/main.c | 1 -
> mm/ioremap.c | 96 +++++++++++-------------
> 7 files changed, 73 insertions(+), 77 deletions(-)
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 75df62fea1b6..bbb3ccf6a7ce 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1304,12 +1304,13 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot)
> return dt_virt;
> }
>
> -int __init arch_ioremap_p4d_supported(void)
> +#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
> +bool arch_vmap_p4d_supported(pgprot_t prot)
> {
> - return 0;
> + return false;
> }
>
> -int __init arch_ioremap_pud_supported(void)
> +bool arch_vmap_pud_supported(pgprot_t prot)
> {
> /*
> * Only 4k granule supports level 1 block mappings.
> @@ -1319,11 +1320,12 @@ int __init arch_ioremap_pud_supported(void)
> !IS_ENABLED(CONFIG_PTDUMP_DEBUGFS);
> }
>
> -int __init arch_ioremap_pmd_supported(void)
> +bool arch_vmap_pmd_supported(pgprot_t prot)
> {
> - /* See arch_ioremap_pud_supported() */
> + /* See arch_vmap_pud_supported() */
> return !IS_ENABLED(CONFIG_PTDUMP_DEBUGFS);
> }
> +#endif
>
> int pud_set_huge(pud_t *pudp, phys_addr_t phys, pgprot_t prot)
> {
> diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
> index ae823bba29f2..7d3a620c5adf 100644
> --- a/arch/powerpc/mm/book3s64/radix_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
> @@ -1182,13 +1182,14 @@ void radix__ptep_modify_prot_commit(struct vm_area_struct *vma,
> set_pte_at(mm, addr, ptep, pte);
> }
>
> -int __init arch_ioremap_pud_supported(void)
> +#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
> +bool arch_vmap_pud_supported(pgprot_t prot)
> {
> /* HPT does not cope with large pages in the vmalloc area */
> return radix_enabled();
> }
>
> -int __init arch_ioremap_pmd_supported(void)
> +bool arch_vmap_pmd_supported(pgprot_t prot)
> {
> return radix_enabled();
> }
> @@ -1197,6 +1198,7 @@ int p4d_free_pud_page(p4d_t *p4d, unsigned long addr)
> {
> return 0;
> }
> +#endif
>
> int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
> {
> @@ -1282,7 +1284,7 @@ int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
> return 1;
> }
>
> -int __init arch_ioremap_p4d_supported(void)
> +bool arch_vmap_p4d_supported(pgprot_t prot)
> {
> - return 0;
> + return false;
> }
> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> index 84d85dbd1dad..5b8b495ab4ed 100644
> --- a/arch/x86/mm/ioremap.c
> +++ b/arch/x86/mm/ioremap.c
> @@ -481,24 +481,26 @@ void iounmap(volatile void __iomem *addr)
> }
> EXPORT_SYMBOL(iounmap);
>
> -int __init arch_ioremap_p4d_supported(void)
> +#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
> +bool arch_vmap_p4d_supported(pgprot_t prot)
> {
> - return 0;
> + return false;
> }
>
> -int __init arch_ioremap_pud_supported(void)
> +bool arch_vmap_pud_supported(pgprot_t prot)
> {
> #ifdef CONFIG_X86_64
> return boot_cpu_has(X86_FEATURE_GBPAGES);
> #else
> - return 0;
> + return false;
> #endif
> }
>
> -int __init arch_ioremap_pmd_supported(void)
> +bool arch_vmap_pmd_supported(pgprot_t prot)
> {
> return boot_cpu_has(X86_FEATURE_PSE);
> }
> +#endif
>
> /*
> * Convert a physical pointer to a virtual kernel pointer for /dev/mem
> diff --git a/include/linux/io.h b/include/linux/io.h
> index 8394c56babc2..f1effd4d7a3c 100644
> --- a/include/linux/io.h
> +++ b/include/linux/io.h
> @@ -31,15 +31,6 @@ static inline int ioremap_page_range(unsigned long addr, unsigned long end,
> }
> #endif
>
> -#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
> -void __init ioremap_huge_init(void);
> -int arch_ioremap_p4d_supported(void);
> -int arch_ioremap_pud_supported(void);
> -int arch_ioremap_pmd_supported(void);
> -#else
> -static inline void ioremap_huge_init(void) { }
> -#endif
> -
> /*
> * Managed iomap interface
> */
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index 0221f852a7e1..787d77ad7536 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -84,6 +84,16 @@ struct vmap_area {
> };
> };
>
> +#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
> +bool arch_vmap_p4d_supported(pgprot_t prot);
> +bool arch_vmap_pud_supported(pgprot_t prot);
> +bool arch_vmap_pmd_supported(pgprot_t prot);
> +#else
> +static inline bool arch_vmap_p4d_supported(pgprot_t prot) { return false; }
> +static inline bool arch_vmap_pud_supported(pgprot_t prot) { return false; }
> +static inline bool arch_vmap_pmd_supported(pgprot_t prot) { return false; }
> +#endif
> +
> /*
> * Highlevel APIs for driver use
> */
> diff --git a/init/main.c b/init/main.c
> index ae78fb68d231..1c89aa127b8f 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -820,7 +820,6 @@ static void __init mm_init(void)
> pgtable_init();
> debug_objects_mem_init();
> vmalloc_init();
> - ioremap_huge_init();
> /* Should be run before the first non-init thread is created */
> init_espfix_bsp();
> /* Should be run after espfix64 is set up. */
> diff --git a/mm/ioremap.c b/mm/ioremap.c
> index 6016ae3227ad..b0032dbadaf7 100644
> --- a/mm/ioremap.c
> +++ b/mm/ioremap.c
> @@ -16,49 +16,16 @@
> #include "pgalloc-track.h"
>
> #ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
> -static int __read_mostly ioremap_p4d_capable;
> -static int __read_mostly ioremap_pud_capable;
> -static int __read_mostly ioremap_pmd_capable;
> -static int __read_mostly ioremap_huge_disabled;
> +static bool __ro_after_init iomap_allow_huge = true;
>
> static int __init set_nohugeiomap(char *str)
> {
> - ioremap_huge_disabled = 1;
> + iomap_allow_huge = false;
> return 0;
> }
> early_param("nohugeiomap", set_nohugeiomap);
> -
> -void __init ioremap_huge_init(void)
> -{
> - if (!ioremap_huge_disabled) {
> - if (arch_ioremap_p4d_supported())
> - ioremap_p4d_capable = 1;
> - if (arch_ioremap_pud_supported())
> - ioremap_pud_capable = 1;
> - if (arch_ioremap_pmd_supported())
> - ioremap_pmd_capable = 1;
> - }
> -}
> -
> -static inline int ioremap_p4d_enabled(void)
> -{
> - return ioremap_p4d_capable;
> -}
> -
> -static inline int ioremap_pud_enabled(void)
> -{
> - return ioremap_pud_capable;
> -}
> -
> -static inline int ioremap_pmd_enabled(void)
> -{
> - return ioremap_pmd_capable;
> -}
> -
> -#else /* !CONFIG_HAVE_ARCH_HUGE_VMAP */
> -static inline int ioremap_p4d_enabled(void) { return 0; }
> -static inline int ioremap_pud_enabled(void) { return 0; }
> -static inline int ioremap_pmd_enabled(void) { return 0; }
> +#else /* CONFIG_HAVE_ARCH_HUGE_VMAP */
> +static const bool iomap_allow_huge = false;
> #endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */
>
> static int vmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
> @@ -81,9 +48,12 @@ static int vmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
> }
>
> static int vmap_try_huge_pmd(pmd_t *pmd, unsigned long addr, unsigned long end,
> - phys_addr_t phys_addr, pgprot_t prot)
> + phys_addr_t phys_addr, pgprot_t prot, unsigned int max_page_shift)
> {
> - if (!ioremap_pmd_enabled())
> + if (max_page_shift < PMD_SHIFT)
> + return 0;
> +
> + if (!arch_vmap_pmd_supported(prot))
> return 0;
>
> if ((end - addr) != PMD_SIZE)
> @@ -102,7 +72,8 @@ static int vmap_try_huge_pmd(pmd_t *pmd, unsigned long addr, unsigned long end,
> }
>
> static int vmap_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
> - phys_addr_t phys_addr, pgprot_t prot, pgtbl_mod_mask *mask)
> + phys_addr_t phys_addr, pgprot_t prot, unsigned int max_page_shift,
> + pgtbl_mod_mask *mask)
> {
> pmd_t *pmd;
> unsigned long next;
> @@ -113,7 +84,7 @@ static int vmap_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
> do {
> next = pmd_addr_end(addr, end);
>
> - if (vmap_try_huge_pmd(pmd, addr, next, phys_addr, prot)) {
> + if (vmap_try_huge_pmd(pmd, addr, next, phys_addr, prot, max_page_shift)) {
> *mask |= PGTBL_PMD_MODIFIED;
> continue;
> }
> @@ -125,9 +96,12 @@ static int vmap_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
> }
>
> static int vmap_try_huge_pud(pud_t *pud, unsigned long addr, unsigned long end,
> - phys_addr_t phys_addr, pgprot_t prot)
> + phys_addr_t phys_addr, pgprot_t prot, unsigned int max_page_shift)
> {
> - if (!ioremap_pud_enabled())
> + if (max_page_shift < PUD_SHIFT)
> + return 0;
> +
> + if (!arch_vmap_pud_supported(prot))
> return 0;
>
> if ((end - addr) != PUD_SIZE)
> @@ -146,7 +120,8 @@ static int vmap_try_huge_pud(pud_t *pud, unsigned long addr, unsigned long end,
> }
>
> static int vmap_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end,
> - phys_addr_t phys_addr, pgprot_t prot, pgtbl_mod_mask *mask)
> + phys_addr_t phys_addr, pgprot_t prot, unsigned int max_page_shift,
> + pgtbl_mod_mask *mask)
> {
> pud_t *pud;
> unsigned long next;
> @@ -157,21 +132,24 @@ static int vmap_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end,
> do {
> next = pud_addr_end(addr, end);
>
> - if (vmap_try_huge_pud(pud, addr, next, phys_addr, prot)) {
> + if (vmap_try_huge_pud(pud, addr, next, phys_addr, prot, max_page_shift)) {
> *mask |= PGTBL_PUD_MODIFIED;
> continue;
> }
>
> - if (vmap_pmd_range(pud, addr, next, phys_addr, prot, mask))
> + if (vmap_pmd_range(pud, addr, next, phys_addr, prot, max_page_shift, mask))
> return -ENOMEM;
> } while (pud++, phys_addr += (next - addr), addr = next, addr != end);
> return 0;
> }
>
> static int vmap_try_huge_p4d(p4d_t *p4d, unsigned long addr, unsigned long end,
> - phys_addr_t phys_addr, pgprot_t prot)
> + phys_addr_t phys_addr, pgprot_t prot, unsigned int max_page_shift)
> {
> - if (!ioremap_p4d_enabled())
> + if (max_page_shift < P4D_SHIFT)
> + return 0;
> +
> + if (!arch_vmap_p4d_supported(prot))
> return 0;
>
> if ((end - addr) != P4D_SIZE)
> @@ -190,7 +168,8 @@ static int vmap_try_huge_p4d(p4d_t *p4d, unsigned long addr, unsigned long end,
> }
>
> static int vmap_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end,
> - phys_addr_t phys_addr, pgprot_t prot, pgtbl_mod_mask *mask)
> + phys_addr_t phys_addr, pgprot_t prot, unsigned int max_page_shift,
> + pgtbl_mod_mask *mask)
> {
> p4d_t *p4d;
> unsigned long next;
> @@ -201,18 +180,19 @@ static int vmap_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end,
> do {
> next = p4d_addr_end(addr, end);
>
> - if (vmap_try_huge_p4d(p4d, addr, next, phys_addr, prot)) {
> + if (vmap_try_huge_p4d(p4d, addr, next, phys_addr, prot, max_page_shift)) {
> *mask |= PGTBL_P4D_MODIFIED;
> continue;
> }
>
> - if (vmap_pud_range(p4d, addr, next, phys_addr, prot, mask))
> + if (vmap_pud_range(p4d, addr, next, phys_addr, prot, max_page_shift, mask))
> return -ENOMEM;
> } while (p4d++, phys_addr += (next - addr), addr = next, addr != end);
> return 0;
> }
>
> -int ioremap_page_range(unsigned long addr, unsigned long end, phys_addr_t phys_addr, pgprot_t prot)
> +static int vmap_range(unsigned long addr, unsigned long end, phys_addr_t phys_addr, pgprot_t prot,
> + unsigned int max_page_shift)
> {
> pgd_t *pgd;
> unsigned long start;
> @@ -227,7 +207,7 @@ int ioremap_page_range(unsigned long addr, unsigned long end, phys_addr_t phys_a
> pgd = pgd_offset_k(addr);
> do {
> next = pgd_addr_end(addr, end);
> - err = vmap_p4d_range(pgd, addr, next, phys_addr, prot, &mask);
> + err = vmap_p4d_range(pgd, addr, next, phys_addr, prot, max_page_shift, &mask);
> if (err)
> break;
> } while (pgd++, phys_addr += (next - addr), addr = next, addr != end);
> @@ -240,6 +220,16 @@ int ioremap_page_range(unsigned long addr, unsigned long end, phys_addr_t phys_a
> return err;
> }
>
> +int ioremap_page_range(unsigned long addr, unsigned long end, phys_addr_t phys_addr, pgprot_t prot)
> +{
> + unsigned int max_page_shift = PAGE_SHIFT;
> +
> + if (iomap_allow_huge)
> + max_page_shift = P4D_SHIFT;
> +
> + return vmap_range(addr, end, phys_addr, prot, max_page_shift);
> +}
> +
> #ifdef CONFIG_GENERIC_IOREMAP
> void __iomem *ioremap_prot(phys_addr_t addr, size_t size, unsigned long prot)
> {
>
* Re: [PATCH v5 3/8] mm/vmalloc: rename vmap_*_range vmap_pages_*_range
From: Christoph Hellwig @ 2020-08-21 5:45 UTC
To: Nicholas Piggin
Cc: linux-arch, linux-kernel, linux-mm, Zefan Li, Jonathan Cameron,
Andrew Morton, linuxppc-dev
In-Reply-To: <20200821044427.736424-4-npiggin@gmail.com>
On Fri, Aug 21, 2020 at 02:44:22PM +1000, Nicholas Piggin wrote:
> The vmalloc mapper operates on a struct page * array rather than a
> linear physical address, re-name it to make this distinction clear.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> mm/vmalloc.c | 28 ++++++++++++----------------
> 1 file changed, 12 insertions(+), 16 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 49f225b0f855..3a1e45fd1626 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -190,9 +190,8 @@ void unmap_kernel_range_noflush(unsigned long start, unsigned long size)
> arch_sync_kernel_mappings(start, end);
> }
>
> -static int vmap_pte_range(pmd_t *pmd, unsigned long addr,
> - unsigned long end, pgprot_t prot, struct page **pages, int *nr,
> - pgtbl_mod_mask *mask)
> +static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
> + pgprot_t prot, struct page **pages, int *nr, pgtbl_mod_mask *mask)
Please don't add lines longer than 80 characters without a good reason.
* Re: [PATCH v5 4/8] lib/ioremap: rename ioremap_*_range to vmap_*_range
From: Christoph Hellwig @ 2020-08-21 5:45 UTC
To: Nicholas Piggin
Cc: linux-arch, linux-kernel, linux-mm, Zefan Li, Jonathan Cameron,
Andrew Morton, linuxppc-dev
In-Reply-To: <20200821044427.736424-5-npiggin@gmail.com>
On Fri, Aug 21, 2020 at 02:44:23PM +1000, Nicholas Piggin wrote:
> This will be moved to mm/ and used as a generic kernel virtual mapping
> function, so re-name it in preparation.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> mm/ioremap.c | 55 ++++++++++++++++++++++------------------------------
> 1 file changed, 23 insertions(+), 32 deletions(-)
>
> diff --git a/mm/ioremap.c b/mm/ioremap.c
> index 5fa1ab41d152..6016ae3227ad 100644
> --- a/mm/ioremap.c
> +++ b/mm/ioremap.c
> @@ -61,9 +61,8 @@ static inline int ioremap_pud_enabled(void) { return 0; }
> static inline int ioremap_pmd_enabled(void) { return 0; }
> #endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */
>
> -static int ioremap_pte_range(pmd_t *pmd, unsigned long addr,
> - unsigned long end, phys_addr_t phys_addr, pgprot_t prot,
> - pgtbl_mod_mask *mask)
> +static int vmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
> + phys_addr_t phys_addr, pgprot_t prot, pgtbl_mod_mask *mask)
Same here.
* Re: [PATCH v5 5/8] mm: HUGE_VMAP arch support cleanup
From: Christoph Hellwig @ 2020-08-21 5:46 UTC
To: Nicholas Piggin
Cc: linux-arch, linux-kernel, linux-mm, Zefan Li, Jonathan Cameron,
Andrew Morton, linuxppc-dev
In-Reply-To: <20200821044427.736424-6-npiggin@gmail.com>
> static int vmap_try_huge_pmd(pmd_t *pmd, unsigned long addr, unsigned long end,
> - phys_addr_t phys_addr, pgprot_t prot)
> + phys_addr_t phys_addr, pgprot_t prot, unsigned int max_page_shift)
> {
... and here.
* Re: [PATCH v5 6/8] mm: Move vmap_range from lib/ioremap.c to mm/vmalloc.c
From: Christoph Hellwig @ 2020-08-21 5:46 UTC
To: Nicholas Piggin
Cc: linux-arch, linux-kernel, linux-mm, Zefan Li, Jonathan Cameron,
Andrew Morton, linuxppc-dev
In-Reply-To: <20200821044427.736424-7-npiggin@gmail.com>
On Fri, Aug 21, 2020 at 02:44:25PM +1000, Nicholas Piggin wrote:
> This is a generic kernel virtual memory mapper, not specific to ioremap.
lib/ioremap doesn't exist any more.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> include/linux/vmalloc.h | 2 +
> mm/ioremap.c | 192 ----------------------------------------
> mm/vmalloc.c | 191 +++++++++++++++++++++++++++++++++++++++
> 3 files changed, 193 insertions(+), 192 deletions(-)
>
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index 787d77ad7536..e3590e93bfff 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -181,6 +181,8 @@ extern struct vm_struct *remove_vm_area(const void *addr);
> extern struct vm_struct *find_vm_area(const void *addr);
>
> #ifdef CONFIG_MMU
> +extern int vmap_range(unsigned long addr, unsigned long end, phys_addr_t phys_addr, pgprot_t prot,
> + unsigned int max_page_shift);
Please avoid the pointlessly long line. And don't add the pointless
extern.
* Re: [PATCH v5 0/8] huge vmalloc mappings
From: Christophe Leroy @ 2020-08-21 5:47 UTC
To: Nicholas Piggin, linux-mm, Andrew Morton
Cc: linux-arch, Zefan Li, linuxppc-dev, linux-kernel,
Jonathan Cameron
In-Reply-To: <20200821044427.736424-1-npiggin@gmail.com>
On 21/08/2020 at 06:44, Nicholas Piggin wrote:
> I made this powerpc-only for the time being. It shouldn't be too hard to
> add support for other archs that define HUGE_VMAP. I have booted x86
> with it enabled, just may not have audited everything.
I like this series, but if I understand correctly it enables huge
vmalloc mappings only for hugepage sizes matching a page directory
level, i.e. on PPC32 it would work only for 4M hugepages.
On the 8xx, we only have 8M and 512k hugepages. Any chance that it can
support these as well one day?
Christophe
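
To make the size mismatch concrete, here is a tiny sketch. It assumes a
PPC32 layout with 4K pages and a two-level page table, where one page
directory entry covers 4M; the PGDIR_SHIFT value and the helper name are
assumptions for illustration, not taken from the series.

```c
#include <stdbool.h>

/* Assumed: PPC32, 4K pages, one page directory entry covers 4M. */
#define PGDIR_SHIFT	22

/* True if a mapping of (1 << shift) bytes fills exactly one directory entry. */
static bool matches_pgdir_level(unsigned int shift)
{
	return shift == PGDIR_SHIFT;
}
```

Under that assumption a 4M page (shift 22) matches, while 512k (shift
19) and 8M (shift 23) do not line up with any level.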
>
> Hi Andrew, would you care to put this in your tree?
>
> Thanks,
> Nick
>
> Since v4:
> - Fixed an off-by-page-order bug in v4
> - Several minor cleanups.
> - Added page order to /proc/vmallocinfo
> - Added hugepage to alloc_large_system_hash output.
> - Made an architecture config option, powerpc only for now.
>
> Since v3:
> - Fixed an off-by-one bug in a loop
> - Fix !CONFIG_HAVE_ARCH_HUGE_VMAP build fail
> - Hopefully this time fix the arm64 vmap stack bug, thanks Jonathan
> Cameron for debugging the cause of this (hopefully).
>
> Since v2:
> - Rebased on vmalloc cleanups, split series into simpler pieces.
> - Fixed several compile errors and warnings
> - Keep the page array and accounting in small page units because
> struct vm_struct is an interface (this should fix x86 vmap stack debug
> assert). [Thanks Zefan]
>
> Nicholas Piggin (8):
> mm/vmalloc: fix vmalloc_to_page for huge vmap mappings
> mm: apply_to_pte_range warn and fail if a large pte is encountered
> mm/vmalloc: rename vmap_*_range vmap_pages_*_range
> lib/ioremap: rename ioremap_*_range to vmap_*_range
> mm: HUGE_VMAP arch support cleanup
> mm: Move vmap_range from lib/ioremap.c to mm/vmalloc.c
> mm/vmalloc: add vmap_range_noflush variant
> mm/vmalloc: Hugepage vmalloc mappings
>
> .../admin-guide/kernel-parameters.txt | 2 +
> arch/Kconfig | 4 +
> arch/arm64/mm/mmu.c | 12 +-
> arch/powerpc/Kconfig | 1 +
> arch/powerpc/mm/book3s64/radix_pgtable.c | 10 +-
> arch/x86/mm/ioremap.c | 12 +-
> include/linux/io.h | 9 -
> include/linux/vmalloc.h | 13 +
> init/main.c | 1 -
> mm/ioremap.c | 231 +--------
> mm/memory.c | 60 ++-
> mm/page_alloc.c | 4 +-
> mm/vmalloc.c | 456 +++++++++++++++---
> 13 files changed, 476 insertions(+), 339 deletions(-)
>
* Re: [PATCH v5 6/8] mm: Move vmap_range from lib/ioremap.c to mm/vmalloc.c
From: Christophe Leroy @ 2020-08-21 5:48 UTC
To: Nicholas Piggin, linux-mm, Andrew Morton
Cc: linux-arch, Zefan Li, linuxppc-dev, linux-kernel,
Jonathan Cameron
In-Reply-To: <20200821044427.736424-7-npiggin@gmail.com>
On 21/08/2020 at 06:44, Nicholas Piggin wrote:
> This is a generic kernel virtual memory mapper, not specific to ioremap.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> include/linux/vmalloc.h | 2 +
> mm/ioremap.c | 192 ----------------------------------------
> mm/vmalloc.c | 191 +++++++++++++++++++++++++++++++++++++++
> 3 files changed, 193 insertions(+), 192 deletions(-)
>
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index 787d77ad7536..e3590e93bfff 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -181,6 +181,8 @@ extern struct vm_struct *remove_vm_area(const void *addr);
> extern struct vm_struct *find_vm_area(const void *addr);
>
> #ifdef CONFIG_MMU
> +extern int vmap_range(unsigned long addr, unsigned long end, phys_addr_t phys_addr, pgprot_t prot,
> + unsigned int max_page_shift);
The extern keyword is redundant on function prototypes and deprecated.
Please don't add new function prototypes with that keyword.
> extern int map_kernel_range_noflush(unsigned long start, unsigned long size,
> pgprot_t prot, struct page **pages);
> int map_kernel_range(unsigned long start, unsigned long size, pgprot_t prot,
Christophe
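
As a small illustration of why the keyword is redundant: in C, function
declarations have external linkage by default, so the two prototypes
below declare exactly the same thing. vmap_demo() is a hypothetical
function used only for this sketch.

```c
/* Hypothetical function, used only to illustrate the point. */
extern int vmap_demo(unsigned long addr, unsigned long end);
int vmap_demo(unsigned long addr, unsigned long end); /* same declaration */

int vmap_demo(unsigned long addr, unsigned long end)
{
	return addr < end;
}
```

Both prototypes are compatible with the definition, and the compiler
accepts them side by side.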
* Re: [PATCH v2 3/6] powerpc/32s: Only leave NX unset on segments used for modules
From: Christophe Leroy @ 2020-08-21 6:43 UTC
To: Andreas Schwab; +Cc: linuxppc-dev, Paul Mackerras, linux-kernel
In-Reply-To: <6c480b23-297a-4f3d-daff-962a01b0b54c@csgroup.eu>
On 08/21/2020 05:11 AM, Christophe Leroy wrote:
>
>
> On 21/08/2020 at 00:00, Andreas Schwab wrote:
>> On Jun 29 2020, Christophe Leroy wrote:
>>
>>> Instead of leaving NX unset on all segments above the start
>>> of vmalloc space, only leave NX unset on segments used for
>>> modules.
>>
>> I'm getting this crash:
>>
>> kernel tried to execute exec-protected page (f294b000) - exploit
>> attempt (uid: 0)
>> BUG: Unable to handle kernel instruction fetch
>> Faulting instruction address: 0xf294b000
>> Oops: Kernel access of bad area, sig: 11 [#1]
>> BE PAGE_SIZE=4K MMU=Hash PowerMac
>> Modules linked in: pata_macio(+)
>> CPU: 0 PID: 87 Comm: udevd Not tainted 5.8.0-rc2-test #49
>> NIP: f294b000 LR: c0005c60 CTR: f294b000
>> REGS: f18d9cc0 TRAP: 0400 Not tainted (5.8.0-rc2-test)
>> MSR: 10009032 <E,ME,IR,DR,RI> CR: 84222422 XER: 20000000
>> GPR00: c0005c14 f18d9d78 ef30ca20 00000000 ef0000e0 c00993d0 ef6da038
>> 0000005e
>> GPR08: c09050b8 c08b0000 00000000 f18d9d78 44222422 10072070 00000000
>> 0fefaca4
>> GPR16: 1006a00c f294d50b 00000120 00000124 c0096ea8 0000000e ef2776c0
>> ef2776e4
>> GPR24: f18fd6e8 00000001 c086fe64 c086fe04 00000000 c08b0000 f294b000
>> ffffffff
>> NIP [f294b000] pata_macio_init+0x0/0xc0 [pata_macio]
>> LR [c0005c60] do_one_initcall+0x6c/0x160
>> Call Trace:
>> [f18d9d78] [c0005c14] do_one_initcall+0x20/0x160 (unreliable)
>> [f18d9dd8] [c009a22c] do_init_module+0x60/0x1c0
>> [f18d9df8] [c00993d8] load_module+0x16a8/0x1c14
>> [f18d9ea8] [c0099aa4] sys_finit_module+0x8c/0x94
>> [f18d9f38] [c0012174] ret_from_syscall+0x0/0x34
>> --- interrupt: c01 at 0xfdb4318
>> LR = 0xfeee9c0
>> Instruction dump:
>> XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX <3d20c08b> 3d40c086 9421ffe0 8129106c
>> ---[ end trace 85a98cc836109871 ]---
>>
>
> Please try the patch at
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/07884ed033c31e074747b7eb8eaa329d15db07ec.1596641219.git.christophe.leroy@csgroup.eu/
>
>
> And if you are using KAsan, also take
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/6eddca2d5611fd57312a88eae31278c87a8fc99d.1596641224.git.christophe.leroy@csgroup.eu/
>
>
> Although I have some doubt that it will fix it, because the faulting
> instruction address is 0xf294b000, which is within the vmalloc area.
> In the likely case that the patch doesn't fix the issue, can you provide your
> .config and a dump of /sys/kernel/debug/powerpc/segment_registers (you
> need CONFIG_PPC_PTDUMP enabled for that), and also the part of the boot
> log shown below?
>
> [ 0.000000] Memory: 509556K/524288K available (7088K kernel code,
> 592K rwdata, 1304K rodata, 356K init, 803K bss, 14732K reserved, 0K
> cma-reserved)
> [ 0.000000] Kernel virtual memory layout:
> [ 0.000000] * 0xff7ff000..0xfffff000 : fixmap
> [ 0.000000] * 0xff7fd000..0xff7ff000 : early ioremap
> [ 0.000000] * 0xe1000000..0xff7fd000 : vmalloc & ioremap
>
I found the issue: when VMALLOC_END is above 0xf0000000,
ALIGN(VMALLOC_END, SZ_256M) wraps around to 0, so the test
addr >= ALIGN(VMALLOC_END, SZ_256M) is always true and
is_module_segment() always returns false.
The change below should fix it.
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 82ae9e06a773..d426eaf76bb0 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -194,12 +194,12 @@ static bool is_module_segment(unsigned long addr)
#ifdef MODULES_VADDR
if (addr < ALIGN_DOWN(MODULES_VADDR, SZ_256M))
return false;
- if (addr >= ALIGN(MODULES_END, SZ_256M))
+ if (addr > ALIGN(MODULES_END, SZ_256M) - 1)
return false;
#else
if (addr < ALIGN_DOWN(VMALLOC_START, SZ_256M))
return false;
- if (addr >= ALIGN(VMALLOC_END, SZ_256M))
+ if (addr > ALIGN(VMALLOC_END, SZ_256M) - 1)
return false;
#endif
return true;
Christophe
* Re: [PATCH v2 00/13] mm/debug_vm_pgtable fixes
From: Aneesh Kumar K.V @ 2020-08-21 6:53 UTC (permalink / raw)
To: Anshuman Khandual, linux-mm, akpm; +Cc: linuxppc-dev
In-Reply-To: <856eb6d7-9c09-728e-b374-d787145ac052@arm.com>
On 8/21/20 9:03 AM, Anshuman Khandual wrote:
>
>
> On 08/19/2020 07:15 PM, Aneesh Kumar K.V wrote:
>> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>>
>>> This patch series includes fixes for debug_vm_pgtable test code so that
>>> they follow page table updates rules correctly. The first two patches introduce
>>> changes w.r.t ppc64. The patches are included in this series for completeness. We can
>>> merge them via ppc64 tree if required.
>>>
>>> Hugetlb test is disabled on ppc64 because that needs larger change to satisfy
>>> page table update rules.
>>>
>>> Changes from V1:
>>> * Address review feedback
>>> * drop test specific pfn_pte and pfn_pmd.
>>> * Update ppc64 page table helper to add _PAGE_PTE
>>>
>>> Aneesh Kumar K.V (13):
>>> powerpc/mm: Add DEBUG_VM WARN for pmd_clear
>>> powerpc/mm: Move setting pte specific flags to pfn_pte
>>> mm/debug_vm_pgtable/ppc64: Avoid setting top bits in radom value
>>> mm/debug_vm_pgtables/hugevmap: Use the arch helper to identify huge
>>> vmap support.
>>> mm/debug_vm_pgtable/savedwrite: Enable savedwrite test with
>>> CONFIG_NUMA_BALANCING
>>> mm/debug_vm_pgtable/THP: Mark the pte entry huge before using
>>> set_pmd/pud_at
>>> mm/debug_vm_pgtable/set_pte/pmd/pud: Don't use set_*_at to update an
>>> existing pte entry
>>> mm/debug_vm_pgtable/thp: Use page table depost/withdraw with THP
>>> mm/debug_vm_pgtable/locks: Move non page table modifying test together
>>> mm/debug_vm_pgtable/locks: Take correct page table lock
>>> mm/debug_vm_pgtable/pmd_clear: Don't use pmd/pud_clear on pte entries
>>> mm/debug_vm_pgtable/hugetlb: Disable hugetlb test on ppc64
>>> mm/debug_vm_pgtable: populate a pte entry before fetching it
>>>
>>> arch/powerpc/include/asm/book3s/64/pgtable.h | 29 +++-
>>> arch/powerpc/include/asm/nohash/pgtable.h | 5 -
>>> arch/powerpc/mm/book3s64/pgtable.c | 2 +-
>>> arch/powerpc/mm/pgtable.c | 5 -
>>> include/linux/io.h | 12 ++
>>> mm/debug_vm_pgtable.c | 151 +++++++++++--------
>>> 6 files changed, 127 insertions(+), 77 deletions(-)
>>>
>>
>> BTW I picked the wrong branch when sending this. Attaching the diff
>> against what I want to send. pfn_pmd() no longer updates _PAGE_PTE
>> because that is handled by pmd_mkhuge().
>>
>> diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
>> index 3b4da7c63e28..e18ae50a275c 100644
>> --- a/arch/powerpc/mm/book3s64/pgtable.c
>> +++ b/arch/powerpc/mm/book3s64/pgtable.c
>> @@ -141,7 +141,7 @@ pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot)
>> unsigned long pmdv;
>>
>> pmdv = (pfn << PAGE_SHIFT) & PTE_RPN_MASK;
>> - return __pmd(pmdv | pgprot_val(pgprot) | _PAGE_PTE);
>> + return pmd_set_protbits(__pmd(pmdv), pgprot);
>> }
>>
>> pmd_t mk_pmd(struct page *page, pgprot_t pgprot)
>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>> index 7d9f8e1d790f..cad61d22f33a 100644
>> --- a/mm/debug_vm_pgtable.c
>> +++ b/mm/debug_vm_pgtable.c
>> @@ -229,7 +229,7 @@ static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot)
>>
>> static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot)
>> {
>> - pmd_t pmd = pfn_pmd(pfn, prot);
>> + pmd_t pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
>>
>> if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
>> return;
>>
>
> Cover letter does not mention which branch or tag this series applies on.
> Just assumed it to be 5.9-rc1. Should the above changes be captured as a
> pre-requisite patch ?
>
> Anyway, the series fails to build on arm64.
>
> A) Without CONFIG_TRANSPARENT_HUGEPAGE
>
> mm/debug_vm_pgtable.c: In function ‘debug_vm_pgtable’:
> mm/debug_vm_pgtable.c:1045:2: error: too many arguments to function ‘pmd_advanced_tests’
> pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot, saved_ptep);
> ^~~~~~~~~~~~~~~~~~
> mm/debug_vm_pgtable.c:366:20: note: declared here
> static void __init pmd_advanced_tests(struct mm_struct *mm,
> ^~~~~~~~~~~~~~~~~~
>
> B) As mentioned previously, this should be solved by including <linux/io.h>
>
> mm/debug_vm_pgtable.c: In function ‘pmd_huge_tests’:
> mm/debug_vm_pgtable.c:215:7: error: implicit declaration of function ‘arch_ioremap_pmd_supported’; did you mean ‘arch_disable_smp_support’? [-Werror=implicit-function-declaration]
> if (!arch_ioremap_pmd_supported())
> ^~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Please make sure that the series builds on all enabled platforms i.e x86,
> arm64, ppc32, ppc64, arc, s390 along with selectively enabling/disabling
> all the features that make various #ifdefs in the test.
>
I was hoping to get a kernel test robot build report to verify that. But
if you can help with that, I have pushed a branch to GitHub with fixes for
the reported build failures.
https://github.com/kvaneesh/linux/tree/debug_vm_pgtable
I still haven't looked at the PMD_FOLDED feedback from Christophe
because I am not sure I follow why we are checking for PMD folded there.
-aneesh
* Re: kernel since 5.6 do not boot anymore on Apple PowerBook
From: Christophe Leroy @ 2020-08-21 6:55 UTC (permalink / raw)
To: Giuseppe Sacco, linuxppc-dev
In-Reply-To: <b70a6343-a380-ff08-a401-04f9ab50be6b@csgroup.eu>
Hi Giuseppe,
On 08/07/2020 at 20:44, Christophe Leroy wrote:
>
>
> On 08/07/2020 at 19:36, Giuseppe Sacco wrote:
>> Hi Christophe,
>>
>> On Wed, 08/07/2020 at 19.09 +0200, Christophe Leroy
>> wrote:
>>> Hi
>>>
>>> On 08/07/2020 at 19:00, Giuseppe Sacco wrote:
>>>> Hello,
>>>> while trying to debug a problem using git bisect, I am now at a point
>>>> where I cannot build the kernel at all. This is the error message I
>>>> get:
>>>>
>>>> $ LANG=C make ARCH=powerpc \
>>>> CROSS_COMPILE=powerpc-linux- \
>>>> CONFIG_MODULE_COMPRESS_GZIP=true \
>>>> INSTALL_MOD_STRIP=1 CONFIG_MODULE_COMPRESS=1 \
>>>> -j4 INSTALL_MOD_PATH=$BOOT INSTALL_PATH=$BOOT \
>>>> CONFIG_DEBUG_INFO_COMPRESSED=1 \
>>>> install modules_install
>>>> make[2]: *** No rule to make target 'vmlinux', needed by
>>>
>>> Surprising.
>>>
>>> Did you make any change to Makefiles ?
>>
>> No
>>
>>> Are you in the middle of a bisect ? If so, if the previous builds
>>> worked, I'd do 'git bisect skip'
>>
>> Yes, the previous one worked.
>>
>>> What's the result with:
>>>
>>> LANG=C make ARCH=powerpc CROSS_COMPILE=powerpc-linux- vmlinux
>>
>> $ LANG=C make ARCH=powerpc CROSS_COMPILE=powerpc-linux- vmlinux
>> CALL scripts/checksyscalls.sh
>> CALL scripts/atomic/check-atomics.sh
>> CHK include/generated/compile.h
>> CC kernel/module.o
>> kernel/module.c: In function 'do_init_module':
>> kernel/module.c:3593:2: error: implicit declaration of function
>> 'module_enable_ro'; did you mean 'module_enable_x'? [-Werror=implicit-
>> function-declaration]
>> 3593 | module_enable_ro(mod, true);
>> | ^~~~~~~~~~~~~~~~
>> | module_enable_x
>> cc1: some warnings being treated as errors
>> make[1]: *** [scripts/Makefile.build:267: kernel/module.o] Error 1
>> make: *** [Makefile:1735: kernel] Error 2
>>
>> So, should I 'git bisect skip'?
>
> Ah yes, I had the exact same problem last time I bisected.
>
> So yes, do 'git bisect skip'. You'll probably hit this problem half a
> dozen times, but in the end you should get a useful bisect anyway.
>
Were you able to make progress?
Christophe
* Re: [PATCH v2 07/13] mm/debug_vm_pgtable/set_pte/pmd/pud: Don't use set_*_at to update an existing pte entry
From: Aneesh Kumar K.V @ 2020-08-21 7:14 UTC (permalink / raw)
To: Christophe Leroy, linux-mm, akpm; +Cc: linuxppc-dev, Anshuman Khandual
In-Reply-To: <b21d1dbb-7439-d317-8516-94c80f333e92@csgroup.eu>
On 8/20/20 8:02 PM, Christophe Leroy wrote:
>
>
> On 19/08/2020 at 15:01, Aneesh Kumar K.V wrote:
>> set_pte_at() should not be used to set a pte entry at locations that
>> already hold a valid pte entry. Architectures like ppc64 don't do a TLB
>> invalidate in set_pte_at() and hence expect it to be used only to set
>> locations
>> that are not a valid PTE.
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>> mm/debug_vm_pgtable.c | 35 +++++++++++++++--------------------
>> 1 file changed, 15 insertions(+), 20 deletions(-)
>>
>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>> index 76f4c713e5a3..9c7e2c9cfc76 100644
>> --- a/mm/debug_vm_pgtable.c
>> +++ b/mm/debug_vm_pgtable.c
>> @@ -74,15 +74,18 @@ static void __init pte_advanced_tests(struct
>> mm_struct *mm,
>> {
>> pte_t pte = pfn_pte(pfn, prot);
>> + /*
>> + * Architectures optimize set_pte_at by avoiding TLB flush.
>> + * This requires set_pte_at to be not used to update an
>> + * existing pte entry. Clear pte before we do set_pte_at
>> + */
>> +
>> pr_debug("Validating PTE advanced\n");
>> pte = pfn_pte(pfn, prot);
>> set_pte_at(mm, vaddr, ptep, pte);
>> ptep_set_wrprotect(mm, vaddr, ptep);
>> pte = ptep_get(ptep);
>> WARN_ON(pte_write(pte));
>> -
>> - pte = pfn_pte(pfn, prot);
>> - set_pte_at(mm, vaddr, ptep, pte);
>> ptep_get_and_clear(mm, vaddr, ptep);
>> pte = ptep_get(ptep);
>> WARN_ON(!pte_none(pte));
>> @@ -96,13 +99,11 @@ static void __init pte_advanced_tests(struct
>> mm_struct *mm,
>> ptep_set_access_flags(vma, vaddr, ptep, pte, 1);
>> pte = ptep_get(ptep);
>> WARN_ON(!(pte_write(pte) && pte_dirty(pte)));
>> -
>> - pte = pfn_pte(pfn, prot);
>> - set_pte_at(mm, vaddr, ptep, pte);
>> ptep_get_and_clear_full(mm, vaddr, ptep, 1);
>> pte = ptep_get(ptep);
>> WARN_ON(!pte_none(pte));
>> + pte = pfn_pte(pfn, prot);
>> pte = pte_mkyoung(pte);
>> set_pte_at(mm, vaddr, ptep, pte);
>> ptep_test_and_clear_young(vma, vaddr, ptep);
>> @@ -164,9 +165,6 @@ static void __init pmd_advanced_tests(struct
>> mm_struct *mm,
>> pmdp_set_wrprotect(mm, vaddr, pmdp);
>> pmd = READ_ONCE(*pmdp);
>> WARN_ON(pmd_write(pmd));
>> -
>> - pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
>> - set_pmd_at(mm, vaddr, pmdp, pmd);
>> pmdp_huge_get_and_clear(mm, vaddr, pmdp);
>> pmd = READ_ONCE(*pmdp);
>> WARN_ON(!pmd_none(pmd));
>> @@ -180,13 +178,11 @@ static void __init pmd_advanced_tests(struct
>> mm_struct *mm,
>> pmdp_set_access_flags(vma, vaddr, pmdp, pmd, 1);
>> pmd = READ_ONCE(*pmdp);
>> WARN_ON(!(pmd_write(pmd) && pmd_dirty(pmd)));
>> -
>> - pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
>> - set_pmd_at(mm, vaddr, pmdp, pmd);
>> pmdp_huge_get_and_clear_full(vma, vaddr, pmdp, 1);
>> pmd = READ_ONCE(*pmdp);
>> WARN_ON(!pmd_none(pmd));
>> + pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
>> pmd = pmd_mkyoung(pmd);
>> set_pmd_at(mm, vaddr, pmdp, pmd);
>> pmdp_test_and_clear_young(vma, vaddr, pmdp);
>> @@ -283,18 +279,10 @@ static void __init pud_advanced_tests(struct
>> mm_struct *mm,
>> WARN_ON(pud_write(pud));
>> #ifndef __PAGETABLE_PMD_FOLDED
>
> Same as below, once set_pud_at() is gone, I don't think this #ifndef
> __PAGETABLE_PMD_FOLDED is still needed; it should be possible to replace it with
> 'if (mm_pmd_folded())'
I would skip that change in this series because I still haven't worked
out what it means to have a folded PMD with
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
We should probably push that as a cleanup later, and somebody who can
test that config can do that? Currently I can't boot ppc64 with
DEBUG_VM_PGTABLE enabled because it is all buggy w.r.t. the rules.
-aneesh
* [PATCH] powerpc/32s: Fix module loading failure when VMALLOC_END is over 0xf0000000
From: Christophe Leroy @ 2020-08-21 7:15 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, schwab
Cc: linuxppc-dev, linux-kernel
In is_module_segment(), when VMALLOC_END is over 0xf0000000,
ALIGN(VMALLOC_END, SZ_256M) has value 0.
In that case, addr >= ALIGN(VMALLOC_END, SZ_256M) is always
true, so is_module_segment() always returns false.
Use (ALIGN(VMALLOC_END, SZ_256M) - 1) which will have
value 0xffffffff and will be suitable for the comparison.
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Fixes: c49643319715 ("powerpc/32s: Only leave NX unset on segments used for modules")
---
arch/powerpc/mm/book3s32/mmu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 82ae9e06a773..d426eaf76bb0 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -194,12 +194,12 @@ static bool is_module_segment(unsigned long addr)
#ifdef MODULES_VADDR
if (addr < ALIGN_DOWN(MODULES_VADDR, SZ_256M))
return false;
- if (addr >= ALIGN(MODULES_END, SZ_256M))
+ if (addr > ALIGN(MODULES_END, SZ_256M) - 1)
return false;
#else
if (addr < ALIGN_DOWN(VMALLOC_START, SZ_256M))
return false;
- if (addr >= ALIGN(VMALLOC_END, SZ_256M))
+ if (addr > ALIGN(VMALLOC_END, SZ_256M) - 1)
return false;
#endif
return true;
--
2.25.0
* Re: [PATCH v5 8/8] mm/vmalloc: Hugepage vmalloc mappings
From: kernel test robot @ 2020-08-21 7:17 UTC (permalink / raw)
To: Nicholas Piggin, linux-mm, Andrew Morton
Cc: linux-arch, kbuild-all, linux-kernel, Nicholas Piggin,
Linux Memory Management List, Zefan Li, Jonathan Cameron,
linuxppc-dev
In-Reply-To: <20200821044427.736424-9-npiggin@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 1387 bytes --]
Hi Nicholas,
I love your patch! Yet something to improve:
[auto build test ERROR on powerpc/next]
[also build test ERROR on arm64/for-next/core tip/x86/mm linus/master v5.9-rc1]
[cannot apply to hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Nicholas-Piggin/huge-vmalloc-mappings/20200821-124543
base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: riscv-allnoconfig (attached as .config)
compiler: riscv64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=riscv
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
riscv64-linux-ld: mm/page_alloc.o: in function `.L1578':
>> page_alloc.c:(.init.text+0x11a4): undefined reference to `find_vm_area'
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 5502 bytes --]
* Re: [PATCH v2 00/13] mm/debug_vm_pgtable fixes
From: Anshuman Khandual @ 2020-08-21 8:01 UTC (permalink / raw)
To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev
In-Reply-To: <46cc2987-0d1e-f8e8-ecaf-2d246b33413e@linux.ibm.com>
On 08/21/2020 12:23 PM, Aneesh Kumar K.V wrote:
> On 8/21/20 9:03 AM, Anshuman Khandual wrote:
>>
>>
>> On 08/19/2020 07:15 PM, Aneesh Kumar K.V wrote:
>>> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>>>
>>>> This patch series includes fixes for debug_vm_pgtable test code so that
>>>> they follow page table updates rules correctly. The first two patches introduce
>>>> changes w.r.t ppc64. The patches are included in this series for completeness. We can
>>>> merge them via ppc64 tree if required.
>>>>
>>>> Hugetlb test is disabled on ppc64 because that needs larger change to satisfy
>>>> page table update rules.
>>>>
>>>> Changes from V1:
>>>> * Address review feedback
>>>> * drop test specific pfn_pte and pfn_pmd.
>>>> * Update ppc64 page table helper to add _PAGE_PTE
>>>>
>>>> Aneesh Kumar K.V (13):
>>>> powerpc/mm: Add DEBUG_VM WARN for pmd_clear
>>>> powerpc/mm: Move setting pte specific flags to pfn_pte
>>>> mm/debug_vm_pgtable/ppc64: Avoid setting top bits in radom value
>>>> mm/debug_vm_pgtables/hugevmap: Use the arch helper to identify huge
>>>> vmap support.
>>>> mm/debug_vm_pgtable/savedwrite: Enable savedwrite test with
>>>> CONFIG_NUMA_BALANCING
>>>> mm/debug_vm_pgtable/THP: Mark the pte entry huge before using
>>>> set_pmd/pud_at
>>>> mm/debug_vm_pgtable/set_pte/pmd/pud: Don't use set_*_at to update an
>>>> existing pte entry
>>>> mm/debug_vm_pgtable/thp: Use page table depost/withdraw with THP
>>>> mm/debug_vm_pgtable/locks: Move non page table modifying test together
>>>> mm/debug_vm_pgtable/locks: Take correct page table lock
>>>> mm/debug_vm_pgtable/pmd_clear: Don't use pmd/pud_clear on pte entries
>>>> mm/debug_vm_pgtable/hugetlb: Disable hugetlb test on ppc64
>>>> mm/debug_vm_pgtable: populate a pte entry before fetching it
>>>>
>>>> arch/powerpc/include/asm/book3s/64/pgtable.h | 29 +++-
>>>> arch/powerpc/include/asm/nohash/pgtable.h | 5 -
>>>> arch/powerpc/mm/book3s64/pgtable.c | 2 +-
>>>> arch/powerpc/mm/pgtable.c | 5 -
>>>> include/linux/io.h | 12 ++
>>>> mm/debug_vm_pgtable.c | 151 +++++++++++--------
>>>> 6 files changed, 127 insertions(+), 77 deletions(-)
>>>>
>>>
>>> BTW I picked the wrong branch when sending this. Attaching the diff
>>> against what I want to send. pfn_pmd() no longer updates _PAGE_PTE
>>> because that is handled by pmd_mkhuge().
>>>
>>> diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
>>> index 3b4da7c63e28..e18ae50a275c 100644
>>> --- a/arch/powerpc/mm/book3s64/pgtable.c
>>> +++ b/arch/powerpc/mm/book3s64/pgtable.c
>>> @@ -141,7 +141,7 @@ pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot)
>>> unsigned long pmdv;
>>> pmdv = (pfn << PAGE_SHIFT) & PTE_RPN_MASK;
>>> - return __pmd(pmdv | pgprot_val(pgprot) | _PAGE_PTE);
>>> + return pmd_set_protbits(__pmd(pmdv), pgprot);
>>> }
>>> pmd_t mk_pmd(struct page *page, pgprot_t pgprot)
>>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>>> index 7d9f8e1d790f..cad61d22f33a 100644
>>> --- a/mm/debug_vm_pgtable.c
>>> +++ b/mm/debug_vm_pgtable.c
>>> @@ -229,7 +229,7 @@ static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot)
>>> static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot)
>>> {
>>> - pmd_t pmd = pfn_pmd(pfn, prot);
>>> + pmd_t pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
>>> if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
>>> return;
>>>
>>
>> Cover letter does not mention which branch or tag this series applies on.
>> Just assumed it to be 5.9-rc1. Should the above changes be captured as a
>> pre-requisite patch ?
>>
>> Anyway, the series fails to build on arm64.
>>
>> A) Without CONFIG_TRANSPARENT_HUGEPAGE
>>
>> mm/debug_vm_pgtable.c: In function ‘debug_vm_pgtable’:
>> mm/debug_vm_pgtable.c:1045:2: error: too many arguments to function ‘pmd_advanced_tests’
>> pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot, saved_ptep);
>> ^~~~~~~~~~~~~~~~~~
>> mm/debug_vm_pgtable.c:366:20: note: declared here
>> static void __init pmd_advanced_tests(struct mm_struct *mm,
>> ^~~~~~~~~~~~~~~~~~
>>
>> B) As mentioned previously, this should be solved by including <linux/io.h>
>>
>> mm/debug_vm_pgtable.c: In function ‘pmd_huge_tests’:
>> mm/debug_vm_pgtable.c:215:7: error: implicit declaration of function ‘arch_ioremap_pmd_supported’; did you mean ‘arch_disable_smp_support’? [-Werror=implicit-function-declaration]
>> if (!arch_ioremap_pmd_supported())
>> ^~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> Please make sure that the series builds on all enabled platforms i.e x86,
>> arm64, ppc32, ppc64, arc, s390 along with selectively enabling/disabling
>> all the features that make various #ifdefs in the test.
>>
>
> I was hoping to get a kernel test robot build report to verify that. But if you can help with that, I have pushed a branch to GitHub with fixes for the reported build failures.
>
> https://github.com/kvaneesh/linux/tree/debug_vm_pgtable
>
> I still haven't looked at the PMD_FOLDED feedback from Christophe because I am not sure I follow why we are checking for PMD folded there.
If this series does not build on existing enabled platforms, wondering
how effective the review could be, assuming that things would need to
change again to fix those build failures on various platforms. Getting
this to build here is essential, as not all page table constructs are
available across these platforms. Hence wondering, it might be better
if you could resend the series after fixing build issues.
* Re: [PATCH v2 10/13] mm/debug_vm_pgtable/locks: Take correct page table lock
From: Anshuman Khandual @ 2020-08-21 8:03 UTC (permalink / raw)
To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev
In-Reply-To: <20200819130107.478414-11-aneesh.kumar@linux.ibm.com>
On 08/19/2020 06:31 PM, Aneesh Kumar K.V wrote:
> Make sure we call pte accessors with correct lock held.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
> mm/debug_vm_pgtable.c | 34 ++++++++++++++++++++--------------
> 1 file changed, 20 insertions(+), 14 deletions(-)
>
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 69fe3cd8126c..8f7a8ccb5a54 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -1024,33 +1024,39 @@ static int __init debug_vm_pgtable(void)
> pmd_thp_tests(pmd_aligned, prot);
> pud_thp_tests(pud_aligned, prot);
>
> + hugetlb_basic_tests(pte_aligned, prot);
> +
> /*
> * Page table modifying tests
> */
> - pte_clear_tests(mm, ptep, vaddr);
> - pmd_clear_tests(mm, pmdp);
> - pud_clear_tests(mm, pudp);
> - p4d_clear_tests(mm, p4dp);
> - pgd_clear_tests(mm, pgdp);
>
> ptep = pte_alloc_map_lock(mm, pmdp, vaddr, &ptl);
> + pte_clear_tests(mm, ptep, vaddr);
> pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
> - pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot, saved_ptep);
> - pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
> - hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
> -
> + pte_unmap_unlock(ptep, ptl);
>
> + ptl = pmd_lock(mm, pmdp);
> + pmd_clear_tests(mm, pmdp);
> + pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot, saved_ptep);
> pmd_huge_tests(pmdp, pmd_aligned, prot);
> + pmd_populate_tests(mm, pmdp, saved_ptep);
> + spin_unlock(ptl);
> +
> + ptl = pud_lock(mm, pudp);
> + pud_clear_tests(mm, pudp);
> + pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
> pud_huge_tests(pudp, pud_aligned, prot);
> + pud_populate_tests(mm, pudp, saved_pmdp);
> + spin_unlock(ptl);
>
> - pte_unmap_unlock(ptep, ptl);
> + //hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
Commenting out an existing test in the middle of another change ?
* [PATCH] powerpc/perf/hv-24x7: Move cpumask file to top folder of hv-24x7 driver
From: Kajol Jain @ 2020-08-21 8:06 UTC (permalink / raw)
To: mpe, linuxppc-dev; +Cc: kjain, suka, maddy
Commit 792f73f747b8 ("powerpc/hv-24x7: Add sysfs files inside hv-24x7
device to show cpumask") added the cpumask file as part of the hv-24x7 driver
inside the interface folder. The cpumask file is supposed to be in the top
folder of the PMU driver in order for hotplug to work.
This patch fixes that issue by creating a new group, 'cpumask_attr_group',
that holds the cpumask file and ensures it is added to the top folder.
command:# cat /sys/devices/hv_24x7/cpumask
0
Fixes: 792f73f747b8 ("powerpc/hv-24x7: Add sysfs files inside hv-24x7
device to show cpumask")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
---
.../testing/sysfs-bus-event_source-devices-hv_24x7 | 2 +-
arch/powerpc/perf/hv-24x7.c | 11 ++++++++++-
2 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
index f7e32f218f73..e82fc37be802 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
@@ -43,7 +43,7 @@ Description: read only
This sysfs interface exposes the number of cores per chip
present in the system.
-What: /sys/devices/hv_24x7/interface/cpumask
+What: /sys/devices/hv_24x7/cpumask
Date: July 2020
Contact: Linux on PowerPC Developer List <linuxppc-dev@lists.ozlabs.org>
Description: read only
diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index cdb7bfbd157e..6e7e820508df 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -1128,6 +1128,15 @@ static struct bin_attribute *if_bin_attrs[] = {
NULL,
};
+static struct attribute *cpumask_attrs[] = {
+ &dev_attr_cpumask.attr,
+ NULL,
+};
+
+static struct attribute_group cpumask_attr_group = {
+ .attrs = cpumask_attrs,
+};
+
static struct attribute *if_attrs[] = {
&dev_attr_catalog_len.attr,
&dev_attr_catalog_version.attr,
@@ -1135,7 +1144,6 @@ static struct attribute *if_attrs[] = {
&dev_attr_sockets.attr,
&dev_attr_chipspersocket.attr,
&dev_attr_coresperchip.attr,
- &dev_attr_cpumask.attr,
NULL,
};
@@ -1151,6 +1159,7 @@ static const struct attribute_group *attr_groups[] = {
&event_desc_group,
&event_long_desc_group,
&if_group,
+ &cpumask_attr_group,
NULL,
};
--
2.18.2
* Re: [PATCH v2 10/13] mm/debug_vm_pgtable/locks: Take correct page table lock
From: Aneesh Kumar K.V @ 2020-08-21 8:08 UTC (permalink / raw)
To: Anshuman Khandual, linux-mm, akpm; +Cc: linuxppc-dev
In-Reply-To: <af7282c1-0d59-3f29-2a59-05575cd9d7f3@arm.com>
On 8/21/20 1:33 PM, Anshuman Khandual wrote:
>
>
> On 08/19/2020 06:31 PM, Aneesh Kumar K.V wrote:
>> Make sure we call pte accessors with correct lock held.
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>> mm/debug_vm_pgtable.c | 34 ++++++++++++++++++++--------------
>> 1 file changed, 20 insertions(+), 14 deletions(-)
>>
>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>> index 69fe3cd8126c..8f7a8ccb5a54 100644
>> --- a/mm/debug_vm_pgtable.c
>> +++ b/mm/debug_vm_pgtable.c
>> @@ -1024,33 +1024,39 @@ static int __init debug_vm_pgtable(void)
>> pmd_thp_tests(pmd_aligned, prot);
>> pud_thp_tests(pud_aligned, prot);
>>
>> + hugetlb_basic_tests(pte_aligned, prot);
>> +
>> /*
>> * Page table modifying tests
>> */
>> - pte_clear_tests(mm, ptep, vaddr);
>> - pmd_clear_tests(mm, pmdp);
>> - pud_clear_tests(mm, pudp);
>> - p4d_clear_tests(mm, p4dp);
>> - pgd_clear_tests(mm, pgdp);
>>
>> ptep = pte_alloc_map_lock(mm, pmdp, vaddr, &ptl);
>> + pte_clear_tests(mm, ptep, vaddr);
>> pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
>> - pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot, saved_ptep);
>> - pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
>> - hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
>> -
>> + pte_unmap_unlock(ptep, ptl);
>>
>> + ptl = pmd_lock(mm, pmdp);
>> + pmd_clear_tests(mm, pmdp);
>> + pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot, saved_ptep);
>> pmd_huge_tests(pmdp, pmd_aligned, prot);
>> + pmd_populate_tests(mm, pmdp, saved_ptep);
>> + spin_unlock(ptl);
>> +
>> + ptl = pud_lock(mm, pudp);
>> + pud_clear_tests(mm, pudp);
>> + pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
>> pud_huge_tests(pudp, pud_aligned, prot);
>> + pud_populate_tests(mm, pudp, saved_pmdp);
>> + spin_unlock(ptl);
>>
>> - pte_unmap_unlock(ptep, ptl);
>> + //hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
>
> Commenting out an existing test in the middle of another change ?
>
That is already fixed. That was me creating a git diff against a wrong
branch.
Thanks.
-aneesh
* Re: [PATCH v2 00/13] mm/debug_vm_pgtable fixes
From: Aneesh Kumar K.V @ 2020-08-21 8:10 UTC (permalink / raw)
To: Anshuman Khandual, linux-mm, akpm; +Cc: linuxppc-dev
In-Reply-To: <6927a5cf-4100-e43e-6aba-5d7bc0533276@arm.com>
On 8/21/20 1:31 PM, Anshuman Khandual wrote:
>
>
> On 08/21/2020 12:23 PM, Aneesh Kumar K.V wrote:
>> On 8/21/20 9:03 AM, Anshuman Khandual wrote:
>>>
>>>
>>> On 08/19/2020 07:15 PM, Aneesh Kumar K.V wrote:
>>>> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>>>>
>>>>> This patch series includes fixes for debug_vm_pgtable test code so that
>>>>> they follow page table updates rules correctly. The first two patches introduce
>>>>> changes w.r.t ppc64. The patches are included in this series for completeness. We can
>>>>> merge them via ppc64 tree if required.
>>>>>
>>>>> Hugetlb test is disabled on ppc64 because that needs larger change to satisfy
>>>>> page table update rules.
>>>>>
>>>>> Changes from V1:
>>>>> * Address review feedback
>>>>> * drop test specific pfn_pte and pfn_pmd.
>>>>> * Update ppc64 page table helper to add _PAGE_PTE
>>>>>
>>>>> Aneesh Kumar K.V (13):
>>>>> powerpc/mm: Add DEBUG_VM WARN for pmd_clear
>>>>> powerpc/mm: Move setting pte specific flags to pfn_pte
>>>>> mm/debug_vm_pgtable/ppc64: Avoid setting top bits in radom value
>>>>> mm/debug_vm_pgtables/hugevmap: Use the arch helper to identify huge
>>>>> vmap support.
>>>>> mm/debug_vm_pgtable/savedwrite: Enable savedwrite test with
>>>>> CONFIG_NUMA_BALANCING
>>>>> mm/debug_vm_pgtable/THP: Mark the pte entry huge before using
>>>>> set_pmd/pud_at
>>>>> mm/debug_vm_pgtable/set_pte/pmd/pud: Don't use set_*_at to update an
>>>>> existing pte entry
>>>>> mm/debug_vm_pgtable/thp: Use page table depost/withdraw with THP
>>>>> mm/debug_vm_pgtable/locks: Move non page table modifying test together
>>>>> mm/debug_vm_pgtable/locks: Take correct page table lock
>>>>> mm/debug_vm_pgtable/pmd_clear: Don't use pmd/pud_clear on pte entries
>>>>> mm/debug_vm_pgtable/hugetlb: Disable hugetlb test on ppc64
>>>>> mm/debug_vm_pgtable: populate a pte entry before fetching it
>>>>>
>>>>> arch/powerpc/include/asm/book3s/64/pgtable.h | 29 +++-
>>>>> arch/powerpc/include/asm/nohash/pgtable.h | 5 -
>>>>> arch/powerpc/mm/book3s64/pgtable.c | 2 +-
>>>>> arch/powerpc/mm/pgtable.c | 5 -
>>>>> include/linux/io.h | 12 ++
>>>>> mm/debug_vm_pgtable.c | 151 +++++++++++--------
>>>>> 6 files changed, 127 insertions(+), 77 deletions(-)
>>>>>
>>>>
>>>> BTW I picked the wrong branch when sending this. Attaching the diff
>>>> against what I want to send. pfn_pmd() no longer updates _PAGE_PTE
>>>> because that is handled by pmd_mkhuge().
>>>>
>>>> diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
>>>> index 3b4da7c63e28..e18ae50a275c 100644
>>>> --- a/arch/powerpc/mm/book3s64/pgtable.c
>>>> +++ b/arch/powerpc/mm/book3s64/pgtable.c
>>>> @@ -141,7 +141,7 @@ pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot)
>>>> unsigned long pmdv;
>>>> pmdv = (pfn << PAGE_SHIFT) & PTE_RPN_MASK;
>>>> - return __pmd(pmdv | pgprot_val(pgprot) | _PAGE_PTE);
>>>> + return pmd_set_protbits(__pmd(pmdv), pgprot);
>>>> }
>>>> pmd_t mk_pmd(struct page *page, pgprot_t pgprot)
>>>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>>>> index 7d9f8e1d790f..cad61d22f33a 100644
>>>> --- a/mm/debug_vm_pgtable.c
>>>> +++ b/mm/debug_vm_pgtable.c
>>>> @@ -229,7 +229,7 @@ static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot)
>>>> static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot)
>>>> {
>>>> - pmd_t pmd = pfn_pmd(pfn, prot);
>>>> + pmd_t pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
>>>> if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
>>>> return;
>>>>
>>>
>>> The cover letter does not mention which branch or tag this series applies on.
>>> I just assumed it to be 5.9-rc1. Should the above changes be captured as a
>>> prerequisite patch?
>>>
>>> Anyway, the series fails to build on arm64.
>>>
>>> A) Without CONFIG_TRANSPARENT_HUGEPAGE
>>>
>>> mm/debug_vm_pgtable.c: In function ‘debug_vm_pgtable’:
>>> mm/debug_vm_pgtable.c:1045:2: error: too many arguments to function ‘pmd_advanced_tests’
>>> pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot, saved_ptep);
>>> ^~~~~~~~~~~~~~~~~~
>>> mm/debug_vm_pgtable.c:366:20: note: declared here
>>> static void __init pmd_advanced_tests(struct mm_struct *mm,
>>> ^~~~~~~~~~~~~~~~~~
>>>
>>> B) As mentioned previously, this should be solved by including <linux/io.h>
>>>
>>> mm/debug_vm_pgtable.c: In function ‘pmd_huge_tests’:
>>> mm/debug_vm_pgtable.c:215:7: error: implicit declaration of function ‘arch_ioremap_pmd_supported’; did you mean ‘arch_disable_smp_support’? [-Werror=implicit-function-declaration]
>>> if (!arch_ioremap_pmd_supported())
>>> ^~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> Please make sure that the series builds on all enabled platforms, i.e. x86,
>>> arm64, ppc32, ppc64, arc, s390, while selectively enabling/disabling
>>> all the features that control the various #ifdefs in the test.
>>>
>>
>> I was hoping to get a kernel test robot build report to verify that. But if you can help with that, I have pushed a branch to GitHub with fixes for the reported build failures.
>>
>> https://github.com/kvaneesh/linux/tree/debug_vm_pgtable
>>
>> I still haven't looked at the PMD_FOLDED feedback from Christophe because I am not sure I follow why we are checking for PMD folded there.
>
> If this series does not build on existing enabled platforms, wondering
> how effective the review could be, assuming that things would need to
> change again to fix those build failures on various platforms. Getting
> this to build here is essential, as not all page table constructs are
> available across these platforms. Hence wondering, it might be better
> if you could resend the series after fixing build issues.
>
Sure. I am hoping the kernel test robot will pick this up. I did an x86 build and
about 19 different ppc config builds with the series. The git tree above
was pushed with that. Considering you authored the change, I am wondering
if you could help with checking other architectures (maybe at least an arm
variant).
-aneesh
* Re: [PATCH v2 07/13] mm/debug_vm_pgtable/set_pte/pmd/pud: Don't use set_*_at to update an existing pte entry
From: Anshuman Khandual @ 2020-08-21 8:20 UTC (permalink / raw)
To: Aneesh Kumar K.V, Christophe Leroy, linux-mm, akpm; +Cc: linuxppc-dev
In-Reply-To: <3d966519-0a6b-3ccb-fd21-b7f06c8e4df7@linux.ibm.com>
On 08/21/2020 12:44 PM, Aneesh Kumar K.V wrote:
> On 8/20/20 8:02 PM, Christophe Leroy wrote:
>>
>>
>> Le 19/08/2020 à 15:01, Aneesh Kumar K.V a écrit :
>>> set_pte_at() should not be used to set a pte entry at locations that
>>> already holds a valid pte entry. Architectures like ppc64 don't do TLB
>>> invalidate in set_pte_at() and hence expect it to be used to set locations
>>> that are not a valid PTE.
>>>
>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>> ---
>>> mm/debug_vm_pgtable.c | 35 +++++++++++++++--------------------
>>> 1 file changed, 15 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>>> index 76f4c713e5a3..9c7e2c9cfc76 100644
>>> --- a/mm/debug_vm_pgtable.c
>>> +++ b/mm/debug_vm_pgtable.c
>>> @@ -74,15 +74,18 @@ static void __init pte_advanced_tests(struct mm_struct *mm,
>>> {
>>> pte_t pte = pfn_pte(pfn, prot);
>>> + /*
>>> + * Architectures optimize set_pte_at by avoiding TLB flush.
>>> + * This requires set_pte_at to be not used to update an
>>> + * existing pte entry. Clear pte before we do set_pte_at
>>> + */
>>> +
>>> pr_debug("Validating PTE advanced\n");
>>> pte = pfn_pte(pfn, prot);
>>> set_pte_at(mm, vaddr, ptep, pte);
>>> ptep_set_wrprotect(mm, vaddr, ptep);
>>> pte = ptep_get(ptep);
>>> WARN_ON(pte_write(pte));
>>> -
>>> - pte = pfn_pte(pfn, prot);
>>> - set_pte_at(mm, vaddr, ptep, pte);
>>> ptep_get_and_clear(mm, vaddr, ptep);
>>> pte = ptep_get(ptep);
>>> WARN_ON(!pte_none(pte));
>>> @@ -96,13 +99,11 @@ static void __init pte_advanced_tests(struct mm_struct *mm,
>>> ptep_set_access_flags(vma, vaddr, ptep, pte, 1);
>>> pte = ptep_get(ptep);
>>> WARN_ON(!(pte_write(pte) && pte_dirty(pte)));
>>> -
>>> - pte = pfn_pte(pfn, prot);
>>> - set_pte_at(mm, vaddr, ptep, pte);
>>> ptep_get_and_clear_full(mm, vaddr, ptep, 1);
>>> pte = ptep_get(ptep);
>>> WARN_ON(!pte_none(pte));
>>> + pte = pfn_pte(pfn, prot);
>>> pte = pte_mkyoung(pte);
>>> set_pte_at(mm, vaddr, ptep, pte);
>>> ptep_test_and_clear_young(vma, vaddr, ptep);
>>> @@ -164,9 +165,6 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
>>> pmdp_set_wrprotect(mm, vaddr, pmdp);
>>> pmd = READ_ONCE(*pmdp);
>>> WARN_ON(pmd_write(pmd));
>>> -
>>> - pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
>>> - set_pmd_at(mm, vaddr, pmdp, pmd);
>>> pmdp_huge_get_and_clear(mm, vaddr, pmdp);
>>> pmd = READ_ONCE(*pmdp);
>>> WARN_ON(!pmd_none(pmd));
>>> @@ -180,13 +178,11 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
>>> pmdp_set_access_flags(vma, vaddr, pmdp, pmd, 1);
>>> pmd = READ_ONCE(*pmdp);
>>> WARN_ON(!(pmd_write(pmd) && pmd_dirty(pmd)));
>>> -
>>> - pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
>>> - set_pmd_at(mm, vaddr, pmdp, pmd);
>>> pmdp_huge_get_and_clear_full(vma, vaddr, pmdp, 1);
>>> pmd = READ_ONCE(*pmdp);
>>> WARN_ON(!pmd_none(pmd));
>>> + pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
>>> pmd = pmd_mkyoung(pmd);
>>> set_pmd_at(mm, vaddr, pmdp, pmd);
>>> pmdp_test_and_clear_young(vma, vaddr, pmdp);
>>> @@ -283,18 +279,10 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
>>> WARN_ON(pud_write(pud));
>>> #ifndef __PAGETABLE_PMD_FOLDED
>>
>> Same as below, once set_pud_at() is gone, I don't think this #ifndef __PAGETABLE_PMD_FOLDED is still needed; it should be possible to replace it with 'if (mm_pmd_folded())'
>
> I would skip that change in this series because I still haven't worked out what it means to have FOLDED PMD with CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
>
>
> We should probably push that as a cleanup later, and somebody who can test that config can do that? Currently I can't boot ppc64 with DEBUG_VM_PGTABLE enabled because the test does not follow the ppc64 page table rules.
Agreed. I think it's OK not to address these changes/improvements in this particular
series, which is trying to modify the test to make it run on the ppc64 platform. I will
probably look into that later.
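
For anyone following along, the rule from the commit message (set_pte_at() is
only for slots that do not already hold a valid entry, so an existing entry has
to be cleared first) can be modelled in userspace roughly as below. This is only
a sketch: model_pte_t, struct model_mmu, model_set_pte_at() and
model_get_and_clear() are made-up names, a value of 0 stands for pte_none(),
and nothing here is the real kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy PTE: the value 0 plays the role of pte_none(). */
typedef uint64_t model_pte_t;

/* Hypothetical one-entry "page table". */
struct model_mmu {
	model_pte_t slot;
};

/*
 * Toy counterpart of set_pte_at() on an arch that does no TLB
 * invalidate here: installing over a valid entry is refused,
 * because the old translation would never be invalidated.
 */
int model_set_pte_at(struct model_mmu *mmu, model_pte_t pte)
{
	assert(mmu != NULL);
	if (mmu->slot != 0)
		return -1;
	mmu->slot = pte;
	return 0;
}

/* Toy counterpart of a clearing helper: returns the old entry, empties the slot. */
model_pte_t model_get_and_clear(struct model_mmu *mmu)
{
	model_pte_t old;

	assert(mmu != NULL);
	old = mmu->slot;
	mmu->slot = 0;
	return old;
}
```

In this model the sequence set, clear, set succeeds, while set followed by set
is rejected. That is the ordering the reworked test is moving towards.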
* Re: [PATCH v3] pseries/drmem: don't cache node id in drmem_lmb struct
From: Laurent Dufour @ 2020-08-21 8:33 UTC (permalink / raw)
To: Scott Cheloha, linuxppc-dev
Cc: Nathan Lynch, Michal Suchanek, David Hildenbrand, Rick Lindsley
In-Reply-To: <20200811015115.63677-1-cheloha@linux.ibm.com>
Le 11/08/2020 à 03:51, Scott Cheloha a écrit :
> At memory hot-remove time we can retrieve an LMB's nid from its
> corresponding memory_block. There is no need to store the nid
> in multiple locations.
>
> Note that lmb_to_memblock() uses find_memory_block() to get the
> corresponding memory_block. As find_memory_block() runs in sub-linear
> time this approach is negligibly slower than what we do at present.
>
> In exchange for this lookup at hot-remove time we no longer need to
> call memory_add_physaddr_to_nid() during drmem_init() for each LMB.
> On powerpc, memory_add_physaddr_to_nid() is a linear search, so this
> spares us an O(n^2) initialization during boot.
>
> On systems with many LMBs that initialization overhead is palpable and
> disruptive. For example, on a box with 249854 LMBs we're seeing
> drmem_init() take upwards of 30 seconds to complete:
>
> [ 53.721639] drmem: initializing drmem v2
> [ 80.604346] watchdog: BUG: soft lockup - CPU#65 stuck for 23s! [swapper/0:1]
> [ 80.604377] Modules linked in:
> [ 80.604389] CPU: 65 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc2+ #4
> [ 80.604397] NIP: c0000000000a4980 LR: c0000000000a4940 CTR: 0000000000000000
> [ 80.604407] REGS: c0002dbff8493830 TRAP: 0901 Not tainted (5.6.0-rc2+)
> [ 80.604412] MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 44000248 XER: 0000000d
> [ 80.604431] CFAR: c0000000000a4a38 IRQMASK: 0
> [ 80.604431] GPR00: c0000000000a4940 c0002dbff8493ac0 c000000001904400 c0003cfffffede30
> [ 80.604431] GPR04: 0000000000000000 c000000000f4095a 000000000000002f 0000000010000000
> [ 80.604431] GPR08: c0000bf7ecdb7fb8 c0000bf7ecc2d3c8 0000000000000008 c00c0002fdfb2001
> [ 80.604431] GPR12: 0000000000000000 c00000001e8ec200
> [ 80.604477] NIP [c0000000000a4980] hot_add_scn_to_nid+0xa0/0x3e0
> [ 80.604486] LR [c0000000000a4940] hot_add_scn_to_nid+0x60/0x3e0
> [ 80.604492] Call Trace:
> [ 80.604498] [c0002dbff8493ac0] [c0000000000a4940] hot_add_scn_to_nid+0x60/0x3e0 (unreliable)
> [ 80.604509] [c0002dbff8493b20] [c000000000087c10] memory_add_physaddr_to_nid+0x20/0x60
> [ 80.604521] [c0002dbff8493b40] [c0000000010d4880] drmem_init+0x25c/0x2f0
> [ 80.604530] [c0002dbff8493c10] [c000000000010154] do_one_initcall+0x64/0x2c0
> [ 80.604540] [c0002dbff8493ce0] [c0000000010c4aa0] kernel_init_freeable+0x2d8/0x3a0
> [ 80.604550] [c0002dbff8493db0] [c000000000010824] kernel_init+0x2c/0x148
> [ 80.604560] [c0002dbff8493e20] [c00000000000b648] ret_from_kernel_thread+0x5c/0x74
> [ 80.604567] Instruction dump:
> [ 80.604574] 392918e8 e9490000 e90a000a e92a0000 80ea000c 1d080018 3908ffe8 7d094214
> [ 80.604586] 7fa94040 419d00dc e9490010 714a0088 <2faa0008> 409e00ac e9490000 7fbe5040
> [ 89.047390] drmem: 249854 LMB(s)
>
> With a patched kernel on the same machine we're no longer seeing the
> soft lockup. drmem_init() now completes in negligible time, even when
> the LMB count is large.
>
> Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com>
> ---
> v1:
> - RFC
>
> v2:
> - Adjusted commit message.
> - Miscellaneous cleanup.
>
> v3:
> - Correct issue found by Laurent Dufour <ldufour@linux.vnet.ibm.com>:
> - Add missing put_device() call in dlpar_remove_lmb() for the
> lmb's associated mem_block.
>
> arch/powerpc/include/asm/drmem.h | 21 ----------------
> arch/powerpc/mm/drmem.c | 6 +----
> .../platforms/pseries/hotplug-memory.c | 24 ++++++++++++-------
> 3 files changed, 17 insertions(+), 34 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
> index 414d209f45bb..34e4e9b257f5 100644
> --- a/arch/powerpc/include/asm/drmem.h
> +++ b/arch/powerpc/include/asm/drmem.h
> @@ -13,9 +13,6 @@ struct drmem_lmb {
> u32 drc_index;
> u32 aa_index;
> u32 flags;
> -#ifdef CONFIG_MEMORY_HOTPLUG
> - int nid;
> -#endif
> };
>
> struct drmem_lmb_info {
> @@ -104,22 +101,4 @@ static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
> lmb->aa_index = 0xffffffff;
> }
>
> -#ifdef CONFIG_MEMORY_HOTPLUG
> -static inline void lmb_set_nid(struct drmem_lmb *lmb)
> -{
> - lmb->nid = memory_add_physaddr_to_nid(lmb->base_addr);
> -}
> -static inline void lmb_clear_nid(struct drmem_lmb *lmb)
> -{
> - lmb->nid = -1;
> -}
> -#else
> -static inline void lmb_set_nid(struct drmem_lmb *lmb)
> -{
> -}
> -static inline void lmb_clear_nid(struct drmem_lmb *lmb)
> -{
> -}
> -#endif
> -
> #endif /* _ASM_POWERPC_LMB_H */
> diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
> index 59327cefbc6a..873fcfc7b875 100644
> --- a/arch/powerpc/mm/drmem.c
> +++ b/arch/powerpc/mm/drmem.c
> @@ -362,10 +362,8 @@ static void __init init_drmem_v1_lmbs(const __be32 *prop)
> if (!drmem_info->lmbs)
> return;
>
> - for_each_drmem_lmb(lmb) {
> + for_each_drmem_lmb(lmb)
> read_drconf_v1_cell(lmb, &prop);
> - lmb_set_nid(lmb);
> - }
> }
>
> static void __init init_drmem_v2_lmbs(const __be32 *prop)
> @@ -410,8 +408,6 @@ static void __init init_drmem_v2_lmbs(const __be32 *prop)
>
> lmb->aa_index = dr_cell.aa_index;
> lmb->flags = dr_cell.flags;
> -
> - lmb_set_nid(lmb);
> }
> }
> }
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index 5ace2f9a277e..e34326d22400 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -356,25 +356,32 @@ static int dlpar_add_lmb(struct drmem_lmb *);
>
> static int dlpar_remove_lmb(struct drmem_lmb *lmb)
> {
> + struct memory_block *mem_block;
> unsigned long block_sz;
> int rc;
>
> if (!lmb_is_removable(lmb))
> return -EINVAL;
>
> + mem_block = lmb_to_memblock(lmb);
> + if (mem_block == NULL)
> + return -EINVAL;
> +
> rc = dlpar_offline_lmb(lmb);
> - if (rc)
> + if (rc) {
> + put_device(&mem_block->dev);
> return rc;
> + }
>
> block_sz = pseries_memory_block_size();
>
> - __remove_memory(lmb->nid, lmb->base_addr, block_sz);
> + __remove_memory(mem_block->nid, lmb->base_addr, block_sz);
> + put_device(&mem_block->dev);
>
> /* Update memory regions for memory remove */
> memblock_remove(lmb->base_addr, block_sz);
>
> invalidate_lmb_associativity_index(lmb);
> - lmb_clear_nid(lmb);
> lmb->flags &= ~DRCONF_MEM_ASSIGNED;
>
> return 0;
> @@ -631,7 +638,7 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index)
> static int dlpar_add_lmb(struct drmem_lmb *lmb)
> {
> unsigned long block_sz;
> - int rc;
> + int nid, rc;
>
> if (lmb->flags & DRCONF_MEM_ASSIGNED)
> return -EINVAL;
> @@ -642,11 +649,13 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
> return rc;
> }
>
> - lmb_set_nid(lmb);
> block_sz = memory_block_size_bytes();
>
> + /* Find the node id for this address. */
> + nid = memory_add_physaddr_to_nid(lmb->base_addr);
I think we could be more efficient here.

Here is the call stack behind memory_add_physaddr_to_nid():

memory_add_physaddr_to_nid(lmb->base_addr)
  hot_add_scn_to_nid()
    if (of_find_node_by_path("/ibm,dynamic-reconfiguration-memory")) == true*
    then
      hot_add_drconf_scn_to_nid()
        for_each_drmem_lmb() to find the LMB based on lmb->base_addr
          of_drconf_to_nid_single(found LMB)
            use lmb->aa_index to get the nid.

* That test is necessarily true when called from dlpar_add_lmb(), otherwise the
call to update_lmb_associativity_index() would have failed earlier.

Basically, we have an LMB and we later walk all the LMBs to find that LMB again.

In the case of dlpar_add_lmb(), it would be more efficient to call
of_drconf_to_nid_single() directly. That function is not exported from
arch/powerpc/mm/numa.c, but it may be good to export it as part of this patch.
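
To illustrate the difference, here is a small userspace model of the two paths.
It is only a sketch under simplifying assumptions: struct model_lmb, the
model_aa_to_nid table and both functions are hypothetical stand-ins, not the
kernel's structures or helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct drmem_lmb. */
struct model_lmb {
	uint64_t base_addr;
	uint32_t aa_index;
};

/* Hypothetical associativity table: aa_index -> node id. */
static const int model_aa_to_nid[] = { 0, 0, 1, 1 };

/* Direct path: the caller already holds the LMB, so no search is needed. */
int model_nid_from_lmb(const struct model_lmb *lmb)
{
	size_t n = sizeof(model_aa_to_nid) / sizeof(model_aa_to_nid[0]);

	assert(lmb != NULL);
	if (lmb->aa_index >= n)
		return -1;
	return model_aa_to_nid[lmb->aa_index];
}

/* Indirect path: walk every LMB to find the address, then do the same lookup. */
int model_nid_from_addr(const struct model_lmb *lmbs, size_t count,
			uint64_t addr, uint64_t lmb_size)
{
	for (size_t i = 0; i < count; i++) {
		if (addr >= lmbs[i].base_addr &&
		    addr < lmbs[i].base_addr + lmb_size)
			return model_nid_from_lmb(&lmbs[i]);
	}
	return -1;
}
```

Both functions return the same node id for a given LMB. The second one just
pays for a walk over the whole array first, which is the cost that calling
of_drconf_to_nid_single() directly would avoid.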
> +
> /* Add the memory */
> - rc = __add_memory(lmb->nid, lmb->base_addr, block_sz);
> + rc = __add_memory(nid, lmb->base_addr, block_sz);
> if (rc) {
> invalidate_lmb_associativity_index(lmb);
> return rc;
> @@ -654,9 +663,8 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
>
> rc = dlpar_online_lmb(lmb);
> if (rc) {
> - __remove_memory(lmb->nid, lmb->base_addr, block_sz);
> + __remove_memory(nid, lmb->base_addr, block_sz);
> invalidate_lmb_associativity_index(lmb);
> - lmb_clear_nid(lmb);
> } else {
> lmb->flags |= DRCONF_MEM_ASSIGNED;
> }
>
* Re: [PATCH v2 04/13] mm/debug_vm_pgtables/hugevmap: Use the arch helper to identify huge vmap support.
From: Anshuman Khandual @ 2020-08-21 8:38 UTC (permalink / raw)
To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev
In-Reply-To: <20200819130107.478414-5-aneesh.kumar@linux.ibm.com>
On 08/19/2020 06:30 PM, Aneesh Kumar K.V wrote:
> ppc64 supports huge vmap only with radix translation. Hence use arch helper
> to determine the huge vmap support.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
> include/linux/io.h | 12 ++++++++++++
> mm/debug_vm_pgtable.c | 4 ++--
> 2 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/io.h b/include/linux/io.h
> index 8394c56babc2..0b1ecda0cc86 100644
> --- a/include/linux/io.h
> +++ b/include/linux/io.h
> @@ -38,6 +38,18 @@ int arch_ioremap_pud_supported(void);
> int arch_ioremap_pmd_supported(void);
> #else
> static inline void ioremap_huge_init(void) { }
> +static inline int arch_ioremap_p4d_supported(void)
> +{
> + return false;
> +}
> +static inline int arch_ioremap_pud_supported(void)
> +{
> + return false;
> +}
> +static inline int arch_ioremap_pmd_supported(void)
> +{
> + return false;
> +}
> #endif
>
> /*
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 57259e2dbd17..cf3c4792b4a2 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
This would need an explicit inclusion of <linux/io.h> in order
to prevent build failure in some cases.
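
As a rough userspace sketch of the pattern the header hunk above sets up: the
arch-provided helper when the feature is configured, a static inline stub that
returns 0 otherwise, and a caller that sees whichever one the header provides.
MODEL_HAVE_ARCH_HUGE_VMAP, model_ioremap_pmd_supported() and
model_pmd_huge_tests() are hypothetical names used only for illustration. In the
real code the declaration and the stub both live in <linux/io.h>, hence the
need to include it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * MODEL_HAVE_ARCH_HUGE_VMAP is a hypothetical stand-in for
 * CONFIG_HAVE_ARCH_HUGE_VMAP. Build with -DMODEL_HAVE_ARCH_HUGE_VMAP
 * to take the "arch provides the helper" branch.
 */
#ifdef MODEL_HAVE_ARCH_HUGE_VMAP
/* Pretend arch implementation: huge mappings are supported. */
int model_ioremap_pmd_supported(void)
{
	return 1;
}
#else
/* Generic stub: without arch support the answer is always "no". */
static inline int model_ioremap_pmd_supported(void)
{
	return 0;
}
#endif

/*
 * Toy counterpart of pmd_huge_tests(): bumps *checks_run and returns 1
 * when the body would run, returns 0 when the test is skipped.
 */
int model_pmd_huge_tests(int *checks_run)
{
	assert(checks_run != NULL);
	if (!model_ioremap_pmd_supported())
		return 0;
	(*checks_run)++;
	return 1;
}
```

With the stub in place the caller needs no #ifdef of its own, but it does
depend on the header that declares the helper being included. Otherwise the
call ends up as an implicit declaration on some configurations.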
> @@ -206,7 +206,7 @@ static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot)
> {
> pmd_t pmd;
>
> - if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
> + if (!arch_ioremap_pmd_supported())
> return;
>
> pr_debug("Validating PMD huge\n");
> @@ -320,7 +320,7 @@ static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
> {
> pud_t pud;
>
> - if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
> + if (!arch_ioremap_pud_supported())
> return;
>
> pr_debug("Validating PUD huge\n");
>
* Re: [PATCH v2 00/13] mm/debug_vm_pgtable fixes
From: Aneesh Kumar K.V @ 2020-08-21 8:50 UTC (permalink / raw)
To: Anshuman Khandual, linux-mm, akpm; +Cc: linuxppc-dev
In-Reply-To: <9b01e909-e6c3-1e6d-ae83-249bdab84ece@linux.ibm.com>
> Sure. I am hoping the kernel test robot will pick this up. I did an x86 build and
> about 19 different ppc config builds with the series. The git tree above
> was pushed with that. Considering you authored the change, I am wondering
> if you could help with checking other architectures (maybe at least an arm
> variant).
>
I updated the tree after defconfig builds on arm64, s390 and x86. I
will not be able to boot test them.
Can you help with boot testing on arm?
-aneesh