LinuxPPC-Dev Archive on lore.kernel.org
* [PATCH v3 01/45] powerpc/kasan: Fix error detection on memory allocation
From: Christophe Leroy @ 2020-05-11 11:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1589196133.git.christophe.leroy@csgroup.eu>

If (k_start & PAGE_MASK) does not equal k_start, 'va' will never be
NULL even though 'block' is NULL.

Check the return of memblock_alloc() directly instead of
the resulting address in the loop.
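
The failure mode can be reproduced in user space with plain integer
arithmetic. The sketch below is illustrative only: the SKETCH_* macros and
all function names are stand-ins invented for this example (4K pages are
assumed), and addresses are modelled as uintptr_t rather than kernel
pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the kernel macros; 4K pages are an assumption. */
#define SKETCH_PAGE_SIZE ((uintptr_t)4096)
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/*
 * Old check: test the per-page address computed on the first loop pass.
 * Returns true if a failed allocation (block == 0) would be noticed.
 */
static bool old_check_detects_failure(uintptr_t block, uintptr_t k_start)
{
	uintptr_t k_cur = k_start & SKETCH_PAGE_MASK;
	uintptr_t va = block + k_cur - k_start;	/* same shape as the loop body */

	return va == 0;
}

/* New check: test the allocator's return value directly. */
static bool new_check_detects_failure(uintptr_t block)
{
	return block == 0;
}

void kasan_check_demo(void)
{
	/* Page-aligned start: the old check happens to work. */
	assert(old_check_detects_failure(0, 0x1000));

	/* Unaligned start: 'va' wraps to a non-zero value, failure is missed. */
	assert(!old_check_detects_failure(0, 0x1234));

	/* Checking 'block' itself does not depend on alignment. */
	assert(new_check_detects_failure(0));
}
```

Calling kasan_check_demo() exercises both cases; the second assertion is
the situation described above.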

Fixes: 509cd3f2b473 ("powerpc/32: Simplify KASAN init")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/mm/kasan/kasan_init_32.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
index cbcad369fcb2..8b15fe09b967 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -76,15 +76,14 @@ static int __init kasan_init_region(void *start, size_t size)
 		return ret;
 
 	block = memblock_alloc(k_end - k_start, PAGE_SIZE);
+	if (!block)
+		return -ENOMEM;
 
 	for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
 		pmd_t *pmd = pmd_ptr_k(k_cur);
 		void *va = block + k_cur - k_start;
 		pte_t pte = pfn_pte(PHYS_PFN(__pa(va)), PAGE_KERNEL);
 
-		if (!va)
-			return -ENOMEM;
-
 		__set_pte_at(&init_mm, k_cur, pte_offset_kernel(pmd, k_cur), pte, 0);
 	}
 	flush_tlb_kernel_range(k_start, k_end);
-- 
2.25.0



* [PATCH v3 04/45] powerpc/kasan: Remove unnecessary page table locking
From: Christophe Leroy @ 2020-05-11 11:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1589196133.git.christophe.leroy@csgroup.eu>

Commit 45ff3c559585 ("powerpc/kasan: Fix parallel loading of
modules.") added spinlocks to manage parallel module loading.

Since then commit 47febbeeec44 ("powerpc/32: Force KASAN_VMALLOC for
modules") converted the module loading to KASAN_VMALLOC.

The spinlocking is therefore no longer needed and can be removed to
simplify kasan_init_shadow_page_tables().

Also remove inclusion of linux/moduleloader.h and linux/vmalloc.h
which are not needed anymore since the removal of module management.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/mm/kasan/kasan_init_32.c | 19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
index b7c287adfd59..91e2ade75192 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -5,9 +5,7 @@
 #include <linux/kasan.h>
 #include <linux/printk.h>
 #include <linux/memblock.h>
-#include <linux/moduleloader.h>
 #include <linux/sched/task.h>
-#include <linux/vmalloc.h>
 #include <asm/pgalloc.h>
 #include <asm/code-patching.h>
 #include <mm/mmu_decl.h>
@@ -34,31 +32,22 @@ static int __init kasan_init_shadow_page_tables(unsigned long k_start, unsigned
 {
 	pmd_t *pmd;
 	unsigned long k_cur, k_next;
-	pte_t *new = NULL;
 
 	pmd = pmd_ptr_k(k_start);
 
 	for (k_cur = k_start; k_cur != k_end; k_cur = k_next, pmd++) {
+		pte_t *new;
+
 		k_next = pgd_addr_end(k_cur, k_end);
 		if ((void *)pmd_page_vaddr(*pmd) != kasan_early_shadow_pte)
 			continue;
 
-		if (!new)
-			new = memblock_alloc(PTE_FRAG_SIZE, PTE_FRAG_SIZE);
+		new = memblock_alloc(PTE_FRAG_SIZE, PTE_FRAG_SIZE);
 
 		if (!new)
 			return -ENOMEM;
 		kasan_populate_pte(new, PAGE_KERNEL);
-
-		smp_wmb(); /* See comment in __pte_alloc */
-
-		spin_lock(&init_mm.page_table_lock);
-			/* Has another populated it ? */
-		if (likely((void *)pmd_page_vaddr(*pmd) == kasan_early_shadow_pte)) {
-			pmd_populate_kernel(&init_mm, pmd, new);
-			new = NULL;
-		}
-		spin_unlock(&init_mm.page_table_lock);
+		pmd_populate_kernel(&init_mm, pmd, new);
 	}
 	return 0;
 }
-- 
2.25.0



* [PATCH v3 03/45] powerpc/kasan: Fix shadow pages allocation failure
From: Christophe Leroy @ 2020-05-11 11:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1589196133.git.christophe.leroy@csgroup.eu>

Allocating the KASAN shadow pages in MMU_init() is too early: the kernel
does not yet have access to the entire memory space, so memblock_alloc()
fails when the kernel image is large.

Do it from kasan_init() instead.

Fixes: 2edb16efc899 ("powerpc/32: Add KASAN support")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/kasan.h      | 2 --
 arch/powerpc/mm/init_32.c             | 2 --
 arch/powerpc/mm/kasan/kasan_init_32.c | 4 +++-
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/kasan.h b/arch/powerpc/include/asm/kasan.h
index fc900937f653..4769bbf7173a 100644
--- a/arch/powerpc/include/asm/kasan.h
+++ b/arch/powerpc/include/asm/kasan.h
@@ -27,12 +27,10 @@
 
 #ifdef CONFIG_KASAN
 void kasan_early_init(void);
-void kasan_mmu_init(void);
 void kasan_init(void);
 void kasan_late_init(void);
 #else
 static inline void kasan_init(void) { }
-static inline void kasan_mmu_init(void) { }
 static inline void kasan_late_init(void) { }
 #endif
 
diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index 872df48ae41b..a6991ef8727d 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -170,8 +170,6 @@ void __init MMU_init(void)
 	btext_unmap();
 #endif
 
-	kasan_mmu_init();
-
 	setup_kup();
 
 	/* Shortly after that, the entire linear mapping will be available */
diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
index 8b15fe09b967..b7c287adfd59 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -131,7 +131,7 @@ static void __init kasan_unmap_early_shadow_vmalloc(void)
 	flush_tlb_kernel_range(k_start, k_end);
 }
 
-void __init kasan_mmu_init(void)
+static void __init kasan_mmu_init(void)
 {
 	int ret;
 	struct memblock_region *reg;
@@ -159,6 +159,8 @@ void __init kasan_mmu_init(void)
 
 void __init kasan_init(void)
 {
+	kasan_mmu_init();
+
 	kasan_remap_early_shadow_ro();
 
 	clear_page(kasan_early_shadow_page);
-- 
2.25.0



* Re: [PATCH] powerpc/kvm: silence kmemleak false positives
From: Michael Ellerman @ 2020-05-11 11:15 UTC (permalink / raw)
  To: Qian Cai; +Cc: linux-kernel, kvm-ppc, Qian Cai, catalin.marinas, linuxppc-dev
In-Reply-To: <20200509015538.3183-1-cai@lca.pw>

Qian Cai <cai@lca.pw> writes:
> kvmppc_pmd_alloc() and kvmppc_pte_alloc() allocate some memory but then
> pud_populate() and pmd_populate() will use __pa() to reference the newly
> allocated memory. The same applies to xive_native_provision_pages().
>
> Since kmemleak cannot track references held only as physical addresses,
> these allocations show up as false positives. Silence them with
> kmemleak_ignore().

There is kmemleak_alloc_phys(), which according to the docs can be used
for tracking a phys address.

Did you try that?
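
For readers following along, here is a rough user-space model of why an
object referenced only through __pa() looks leaked to a scanner that
searches memory for virtual pointer values. Everything in it is a stand-in
made up for illustration (MODEL_PAGE_OFFSET, the flag bits, all function
names); it does not use or reproduce the kmemleak API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up linear-map offset standing in for the virtual/physical split. */
#define MODEL_PAGE_OFFSET 0xc000000000000000ULL

static uint64_t model_pa(uint64_t vaddr)
{
	return vaddr - MODEL_PAGE_OFFSET;
}

/*
 * Conservative scan: report whether any word in 'mem' holds a value that
 * falls inside the tracked object [obj, obj + size).
 */
static bool model_scan_finds(const uint64_t *mem, size_t nwords,
			     uint64_t obj, size_t size)
{
	for (size_t i = 0; i < nwords; i++)
		if (mem[i] >= obj && mem[i] < obj + size)
			return true;
	return false;
}

void model_kmemleak_demo(void)
{
	const uint64_t obj = 0xc000201c382a1000ULL; /* address from the report */
	uint64_t table[4] = { 0 };

	/* Entry built from the physical address; the low flag bits are made up. */
	table[1] = model_pa(obj) | 0x87;
	assert(!model_scan_finds(table, 4, obj, 4096));

	/* A plain virtual pointer to the same object is found by the scan. */
	table[2] = obj;
	assert(model_scan_finds(table, 4, obj, 4096));
}
```

The first assertion corresponds to the reported false positive: the only
reference is a physical address, which the scan does not match.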

cheers


> unreferenced object 0xc000201c382a1000 (size 4096):
>   comm "qemu-kvm", pid 124828, jiffies 4295733767 (age 341.250s)
>   hex dump (first 32 bytes):
>     c0 00 20 09 f4 60 03 87 c0 00 20 10 72 a0 03 87  .. ..`.... .r...
>     c0 00 20 0e 13 a0 03 87 c0 00 20 1b dc c0 03 87  .. ....... .....
>   backtrace:
>     [<000000004cc2790f>] kvmppc_create_pte+0x838/0xd20 [kvm_hv]
>     kvmppc_pmd_alloc at arch/powerpc/kvm/book3s_64_mmu_radix.c:366
>     (inlined by) kvmppc_create_pte at arch/powerpc/kvm/book3s_64_mmu_radix.c:590
>     [<00000000d123c49a>] kvmppc_book3s_instantiate_page+0x2e0/0x8c0 [kvm_hv]
>     [<00000000bb549087>] kvmppc_book3s_radix_page_fault+0x1b4/0x2b0 [kvm_hv]
>     [<0000000086dddc0e>] kvmppc_book3s_hv_page_fault+0x214/0x12a0 [kvm_hv]
>     [<000000005ae9ccc2>] kvmppc_vcpu_run_hv+0xc5c/0x15f0 [kvm_hv]
>     [<00000000d22162ff>] kvmppc_vcpu_run+0x34/0x48 [kvm]
>     [<00000000d6953bc4>] kvm_arch_vcpu_ioctl_run+0x314/0x420 [kvm]
>     [<000000002543dd54>] kvm_vcpu_ioctl+0x33c/0x950 [kvm]
>     [<0000000048155cd6>] ksys_ioctl+0xd8/0x130
>     [<0000000041ffeaa7>] sys_ioctl+0x28/0x40
>     [<000000004afc4310>] system_call_exception+0x114/0x1e0
>     [<00000000fb70a873>] system_call_common+0xf0/0x278
> unreferenced object 0xc0002001f0c03900 (size 256):
>   comm "qemu-kvm", pid 124830, jiffies 4295735235 (age 326.570s)
>   hex dump (first 32 bytes):
>     c0 00 20 10 fa a0 03 87 c0 00 20 10 fa a1 03 87  .. ....... .....
>     c0 00 20 10 fa a2 03 87 c0 00 20 10 fa a3 03 87  .. ....... .....
>   backtrace:
>     [<0000000023f675b8>] kvmppc_create_pte+0x854/0xd20 [kvm_hv]
>     kvmppc_pte_alloc at arch/powerpc/kvm/book3s_64_mmu_radix.c:356
>     (inlined by) kvmppc_create_pte at arch/powerpc/kvm/book3s_64_mmu_radix.c:593
>     [<00000000d123c49a>] kvmppc_book3s_instantiate_page+0x2e0/0x8c0 [kvm_hv]
>     [<00000000bb549087>] kvmppc_book3s_radix_page_fault+0x1b4/0x2b0 [kvm_hv]
>     [<0000000086dddc0e>] kvmppc_book3s_hv_page_fault+0x214/0x12a0 [kvm_hv]
>     [<000000005ae9ccc2>] kvmppc_vcpu_run_hv+0xc5c/0x15f0 [kvm_hv]
>     [<00000000d22162ff>] kvmppc_vcpu_run+0x34/0x48 [kvm]
>     [<00000000d6953bc4>] kvm_arch_vcpu_ioctl_run+0x314/0x420 [kvm]
>     [<000000002543dd54>] kvm_vcpu_ioctl+0x33c/0x950 [kvm]
>     [<0000000048155cd6>] ksys_ioctl+0xd8/0x130
>     [<0000000041ffeaa7>] sys_ioctl+0x28/0x40
>     [<000000004afc4310>] system_call_exception+0x114/0x1e0
>     [<00000000fb70a873>] system_call_common+0xf0/0x278
> unreferenced object 0xc000201b53e90000 (size 65536):
>   comm "qemu-kvm", pid 124557, jiffies 4295650285 (age 364.370s)
>   hex dump (first 32 bytes):
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<00000000acc2fb77>] xive_native_alloc_vp_block+0x168/0x210
>     xive_native_provision_pages at arch/powerpc/sysdev/xive/native.c:645
>     (inlined by) xive_native_alloc_vp_block at arch/powerpc/sysdev/xive/native.c:674
>     [<000000004d5c7964>] kvmppc_xive_compute_vp_id+0x20c/0x3b0 [kvm]
>     [<0000000055317cd2>] kvmppc_xive_connect_vcpu+0xa4/0x4a0 [kvm]
>     [<0000000093dfc014>] kvm_arch_vcpu_ioctl+0x388/0x508 [kvm]
>     [<00000000d25aea0f>] kvm_vcpu_ioctl+0x15c/0x950 [kvm]
>     [<0000000048155cd6>] ksys_ioctl+0xd8/0x130
>     [<0000000041ffeaa7>] sys_ioctl+0x28/0x40
>     [<000000004afc4310>] system_call_exception+0x114/0x1e0
>     [<00000000fb70a873>] system_call_common+0xf0/0x278
>
> Signed-off-by: Qian Cai <cai@lca.pw>
> ---
>  arch/powerpc/kvm/book3s_64_mmu_radix.c | 16 ++++++++++++++--
>  arch/powerpc/sysdev/xive/native.c      |  4 ++++
>  2 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> index aa12cd4078b3..bc6c1aa3d0e9 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> @@ -353,7 +353,13 @@ static struct kmem_cache *kvm_pmd_cache;
>  
>  static pte_t *kvmppc_pte_alloc(void)
>  {
> -	return kmem_cache_alloc(kvm_pte_cache, GFP_KERNEL);
> +	pte_t *pte;
> +
> +	pte = kmem_cache_alloc(kvm_pte_cache, GFP_KERNEL);
> +	/* pmd_populate() will only reference __pa(pte). */
> +	kmemleak_ignore(pte);
> +
> +	return pte;
>  }
>  
>  static void kvmppc_pte_free(pte_t *ptep)
> @@ -363,7 +369,13 @@ static void kvmppc_pte_free(pte_t *ptep)
>  
>  static pmd_t *kvmppc_pmd_alloc(void)
>  {
> -	return kmem_cache_alloc(kvm_pmd_cache, GFP_KERNEL);
> +	pmd_t *pmd;
> +
> +	pmd = kmem_cache_alloc(kvm_pmd_cache, GFP_KERNEL);
> +	/* pud_populate() will only reference __pa(pmd). */
> +	kmemleak_ignore(pmd);
> +
> +	return pmd;
>  }
>  
>  static void kvmppc_pmd_free(pmd_t *pmdp)
> diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
> index 5218fdc4b29a..2d19f28967a6 100644
> --- a/arch/powerpc/sysdev/xive/native.c
> +++ b/arch/powerpc/sysdev/xive/native.c
> @@ -18,6 +18,7 @@
>  #include <linux/delay.h>
>  #include <linux/cpumask.h>
>  #include <linux/mm.h>
> +#include <linux/kmemleak.h>
>  
>  #include <asm/machdep.h>
>  #include <asm/prom.h>
> @@ -647,6 +648,9 @@ static bool xive_native_provision_pages(void)
>  			pr_err("Failed to allocate provisioning page\n");
>  			return false;
>  		}
> +		/* Kmemleak is unable to track the physical address. */
> +		kmemleak_ignore(p);
> +
>  		opal_xive_donate_page(chip, __pa(p));
>  	}
>  	return true;
> -- 
> 2.21.0 (Apple Git-122.2)


* Re: [PATCH 02/31] arm64: fix the flush_icache_range arguments in machine_kexec
From: Catalin Marinas @ 2020-05-11 11:00 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-ia64, linux-sh, Roman Zippel, linux-mips, linux-mm,
	sparclinux, linux-riscv, Christoph Hellwig, linux-arch,
	linux-c6x-dev, linux-hexagon, x86, linux-xtensa, Arnd Bergmann,
	Jessica Yu, linux-um, linux-m68k, openrisc, linux-arm-kernel,
	Michal Simek, linux-kernel, james.morse, linux-alpha,
	linux-fsdevel, Andrew Morton, linuxppc-dev
In-Reply-To: <20200511075115.GA16134@willie-the-truck>

On Mon, May 11, 2020 at 08:51:15AM +0100, Will Deacon wrote:
> On Sun, May 10, 2020 at 09:54:41AM +0200, Christoph Hellwig wrote:
> > The second argument is the end "pointer", not the length.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  arch/arm64/kernel/machine_kexec.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> > index 8e9c924423b4e..a0b144cfaea71 100644
> > --- a/arch/arm64/kernel/machine_kexec.c
> > +++ b/arch/arm64/kernel/machine_kexec.c
> > @@ -177,6 +177,7 @@ void machine_kexec(struct kimage *kimage)
> >  	 * the offline CPUs. Therefore, we must use the __* variant here.
> >  	 */
> >  	__flush_icache_range((uintptr_t)reboot_code_buffer,
> > +			     (uintptr_t)reboot_code_buffer +
> >  			     arm64_relocate_new_kernel_size);
> 
> Urgh, well spotted. It's annoyingly different from __flush_dcache_area().
> 
> But now I'm wondering what this code actually does... the loop condition
> in invalidate_icache_by_line works with 64-bit arithmetic, so we could
> spend a /very/ long time here afaict.

I think it goes through the loop only once. The 'b.lo' saves us here.
OTOH, there is no I-cache maintenance done.
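
To make the "only once" behaviour concrete, here is a small user-space
model of a by-line loop with an unsigned "lower than" exit test. It only
counts loop iterations; it is not the arm64 assembly and says nothing about
which cache operation each pass performs. The line size, the buffer address
and the function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical line size for the model; real hardware reports its own. */
#define MODEL_LINE_SIZE ((uint64_t)64)

/*
 * Count how many passes a do/while by-line loop makes over the half-open
 * range [start, end). The exit test is an unsigned comparison, which is
 * what the "b.lo" in the discussion amounts to.
 */
static uint64_t model_lines_touched(uint64_t start, uint64_t end)
{
	uint64_t cur = start & ~(MODEL_LINE_SIZE - 1);
	uint64_t touched = 0;

	do {
		touched++;		/* stands in for one per-line operation */
		cur += MODEL_LINE_SIZE;
	} while (cur < end);

	return touched;
}

void model_flush_demo(void)
{
	const uint64_t buf = 0xffff000010000000ULL;	/* made-up address */
	const uint64_t size = 0x180;			/* made-up size, 384 bytes */

	/* Buggy call shape: a length passed where the end address belongs. */
	assert(model_lines_touched(buf, size) == 1);

	/* Fixed call shape: end = start + length covers every line. */
	assert(model_lines_touched(buf, buf + size) == 6);
}
```

With a small length as the second argument, the comparison fails on the
first pass, so the loop exits immediately instead of running for a very
long time.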

> It's also a bit annoying that we do a bunch of redundant D-cache
> maintenance too. Should we use invalidate_icache_range() here instead?

Since we have the __flush_dcache_area() above it for cleaning to PoC, we
could use invalidate_icache_range() here. We probably didn't have this
function at the time; it was added for KVM (commit 4fee94736603cd6).

> (and why does that thing need to toggle uaccess)?

invalidate_icache_range() doesn't need to, it works on the kernel linear
map.

__flush_icache_range() doesn't need to either, that's a side-effect of
the fall-through implementation.

Anyway, I think Christoph's patch needs to go in with a fixes tag:

Fixes: d28f6df1305a ("arm64/kexec: Add core kexec support")
Cc: <stable@vger.kernel.org> # 4.8.x-

and we'll change these functions/helpers going forward for arm64.

Happy to pick this up via the arm64 for-next/fixes branch.

-- 
Catalin


* [PATCH v2] powerpc/64/signal: balance return predictor stack in signal trampoline
From: Nicholas Piggin @ 2020-05-11 10:19 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin, Alan Modra

Returning from an interrupt or syscall to a signal handler currently
begins execution directly at the handler's entry point, with LR set to
the address of the sigreturn trampoline. When the signal handler
function returns, it runs the trampoline. It looks like this:

    # interrupt at user address xyz
    # kernel stuff... signal is raised
    rfid
    # void handler(int sig)
    addis 2,12,.TOC.-.LCF0@ha
    addi 2,2,.TOC.-.LCF0@l
    mflr 0
    std 0,16(1)
    stdu 1,-96(1)
    # handler stuff
    ld 0,16(1)
    mtlr 0
    blr
    # __kernel_sigtramp_rt64
    addi    r1,r1,__SIGNAL_FRAMESIZE
    li      r0,__NR_rt_sigreturn
    sc
    # kernel executes rt_sigreturn
    rfid
    # back to user address xyz

Note the blr with no matching bl. This can corrupt the return predictor.

Solve this by instead resuming execution at the signal trampoline which
then calls the signal handler. qtrace-tools link_stack checker confirms
the entire user/kernel/vdso cycle is balanced after this patch, whereas
it's not upstream.
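
The imbalance can be illustrated with a toy link stack: calls push a
predicted return address, returns pop one and compare it with the real
target. This is a simplified model for illustration only. The stack depth,
the addresses and all names are made up, and real return predictors behave
differently in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LINK_STACK_DEPTH 16	/* made-up depth; real predictors differ */

struct link_stack {
	uint64_t entry[LINK_STACK_DEPTH];
	size_t top;		/* number of valid entries */
	unsigned int mispredicts;
};

/* A call (bl, bctrl) pushes the address the callee will return to. */
static void ls_call(struct link_stack *ls, uint64_t return_addr)
{
	assert(ls->top < LINK_STACK_DEPTH);
	ls->entry[ls->top++] = return_addr;
}

/* A return (blr) pops a prediction and compares it with the real target. */
static void ls_return(struct link_stack *ls, uint64_t actual_target)
{
	bool hit = ls->top > 0 && ls->entry[--ls->top] == actual_target;

	if (!hit)
		ls->mispredicts++;
}

/* Replay one signal delivery while a called function f is running. */
unsigned int ls_signal_demo(bool trampoline_calls_handler)
{
	struct link_stack ls = { .top = 0, .mispredicts = 0 };
	const uint64_t ret_to_caller = 0x1000;	/* made-up addresses */
	const uint64_t tramp = 0x2000;

	ls_call(&ls, ret_to_caller);		/* caller: bl f */
	/* signal arrives while f is running */
	if (trampoline_calls_handler)
		ls_call(&ls, tramp + 4);	/* trampoline: bctrl */
	/* handler: blr, to the instruction after bctrl or to the trampoline */
	ls_return(&ls, trampoline_calls_handler ? tramp + 4 : tramp);
	/* rt_sigreturn resumes f */
	ls_return(&ls, ret_to_caller);		/* f: blr */

	return ls.mispredicts;
}
```

In this model ls_signal_demo(false), the current flow, records two
mispredictions: the handler's blr consumes the caller's entry, and f's own
return then finds the stack empty. ls_signal_demo(true) records none.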

Alan confirms the dwarf unwind info still looks good. gdb still
recognises the signal frame and can step into parent frames if it breaks
inside a signal handler.

Overall performance is noisy and not significantly changed on a POWER9
here, but branch misses are consistently much lower on a microbenchmark
(upstream first, patched second):

 Performance counter stats for './signal':

         13,085.72 msec task-clock                #    1.000 CPUs utilized
    45,024,760,101      cycles                    #    3.441 GHz
    65,102,895,542      instructions              #    1.45  insn per cycle
    11,271,673,787      branches                  #  861.372 M/sec
        59,468,979      branch-misses             #    0.53% of all branches

         12,989.09 msec task-clock                #    1.000 CPUs utilized
    44,692,719,559      cycles                    #    3.441 GHz
    65,109,984,964      instructions              #    1.46  insn per cycle
    11,282,136,057      branches                  #  868.585 M/sec
        39,786,942      branch-misses             #    0.35% of all branches

Cc: Alan Modra <amodra@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
Since v1:
- Fix for the legacy on-stack trampoline path (TRAMP_TRACEBACK was not
  updated to 4).
- Tested legacy case (must also disable no-exec stack to test this).

 arch/powerpc/include/asm/ppc-opcode.h |  1 +
 arch/powerpc/kernel/signal_64.c       | 22 ++++++++++++----------
 arch/powerpc/kernel/vdso64/sigtramp.S | 13 +++++--------
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index c1df75edde44..747b37f1ce09 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -329,6 +329,7 @@
 #define PPC_INST_BLR			0x4e800020
 #define PPC_INST_BLRL			0x4e800021
 #define PPC_INST_BCTR			0x4e800420
+#define PPC_INST_BCTRL			0x4e800421
 #define PPC_INST_MULLD			0x7c0001d2
 #define PPC_INST_MULLW			0x7c0001d6
 #define PPC_INST_MULHWU			0x7c000016
diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index adfde59cf4ba..6bb14cdf256f 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -40,8 +40,8 @@
 #define GP_REGS_SIZE	min(sizeof(elf_gregset_t), sizeof(struct pt_regs))
 #define FP_REGS_SIZE	sizeof(elf_fpregset_t)
 
-#define TRAMP_TRACEBACK	3
-#define TRAMP_SIZE	6
+#define TRAMP_TRACEBACK	4
+#define TRAMP_SIZE	7
 
 /*
  * When we have signals to deliver, we set up on the user stack,
@@ -603,13 +603,15 @@ static long setup_trampoline(unsigned int syscall, unsigned int __user *tramp)
 	int i;
 	long err = 0;
 
+	/* bctrl # call the handler */
+	err |= __put_user(PPC_INST_BCTRL, &tramp[0]);
 	/* addi r1, r1, __SIGNAL_FRAMESIZE  # Pop the dummy stackframe */
 	err |= __put_user(PPC_INST_ADDI | __PPC_RT(R1) | __PPC_RA(R1) |
-			  (__SIGNAL_FRAMESIZE & 0xffff), &tramp[0]);
+			  (__SIGNAL_FRAMESIZE & 0xffff), &tramp[1]);
 	/* li r0, __NR_[rt_]sigreturn| */
-	err |= __put_user(PPC_INST_ADDI | (syscall & 0xffff), &tramp[1]);
+	err |= __put_user(PPC_INST_ADDI | (syscall & 0xffff), &tramp[2]);
 	/* sc */
-	err |= __put_user(PPC_INST_SC, &tramp[2]);
+	err |= __put_user(PPC_INST_SC, &tramp[3]);
 
 	/* Minimal traceback info */
 	for (i=TRAMP_TRACEBACK; i < TRAMP_SIZE ;i++)
@@ -867,12 +869,12 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
 
 	/* Set up to return from userspace. */
 	if (vdso64_rt_sigtramp && tsk->mm->context.vdso_base) {
-		regs->link = tsk->mm->context.vdso_base + vdso64_rt_sigtramp;
+		regs->nip = tsk->mm->context.vdso_base + vdso64_rt_sigtramp;
 	} else {
 		err |= setup_trampoline(__NR_rt_sigreturn, &frame->tramp[0]);
 		if (err)
 			goto badframe;
-		regs->link = (unsigned long) &frame->tramp[0];
+		regs->nip = (unsigned long) &frame->tramp[0];
 	}
 
 	/* Allocate a dummy caller frame for the signal handler. */
@@ -881,8 +883,8 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
 
 	/* Set up "regs" so we "return" to the signal handler. */
 	if (is_elf2_task()) {
-		regs->nip = (unsigned long) ksig->ka.sa.sa_handler;
-		regs->gpr[12] = regs->nip;
+		regs->ctr = (unsigned long) ksig->ka.sa.sa_handler;
+		regs->gpr[12] = regs->ctr;
 	} else {
 		/* Handler is *really* a pointer to the function descriptor for
 		 * the signal routine.  The first entry in the function
@@ -892,7 +894,7 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
 		func_descr_t __user *funct_desc_ptr =
 			(func_descr_t __user *) ksig->ka.sa.sa_handler;
 
-		err |= get_user(regs->nip, &funct_desc_ptr->entry);
+		err |= get_user(regs->ctr, &funct_desc_ptr->entry);
 		err |= get_user(regs->gpr[2], &funct_desc_ptr->toc);
 	}
 
diff --git a/arch/powerpc/kernel/vdso64/sigtramp.S b/arch/powerpc/kernel/vdso64/sigtramp.S
index a8cc0409d7d2..bbf68cd01088 100644
--- a/arch/powerpc/kernel/vdso64/sigtramp.S
+++ b/arch/powerpc/kernel/vdso64/sigtramp.S
@@ -6,6 +6,7 @@
  * Copyright (C) 2004 Benjamin Herrenschmuidt (benh@kernel.crashing.org), IBM Corp.
  * Copyright (C) 2004 Alan Modra (amodra@au.ibm.com)), IBM Corp.
  */
+#include <asm/cache.h>		/* IFETCH_ALIGN_BYTES */
 #include <asm/processor.h>
 #include <asm/ppc_asm.h>
 #include <asm/unistd.h>
@@ -14,21 +15,17 @@
 
 	.text
 
-/* The nop here is a hack.  The dwarf2 unwind routines subtract 1 from
-   the return address to get an address in the middle of the presumed
-   call instruction.  Since we don't have a call here, we artificially
-   extend the range covered by the unwind info by padding before the
-   real start.  */
-	nop
 	.balign 8
+	.balign IFETCH_ALIGN_BYTES
 V_FUNCTION_BEGIN(__kernel_sigtramp_rt64)
-.Lsigrt_start = . - 4
+.Lsigrt_start:
+	bctrl	/* call the handler */
 	addi	r1, r1, __SIGNAL_FRAMESIZE
 	li	r0,__NR_rt_sigreturn
 	sc
 .Lsigrt_end:
 V_FUNCTION_END(__kernel_sigtramp_rt64)
-/* The ".balign 8" above and the following zeros mimic the old stack
+/* The .balign 8 above and the following zeros mimic the old stack
    trampoline layout.  The last magic value is the ucontext pointer,
    chosen in such a way that older libgcc unwind code returns a zero
    for a sigcontext pointer.  */
-- 
2.23.0



* Re: powerpc/pci: [PATCH 1/1]: PCIE PHB reset
From: Oliver O'Halloran @ 2020-05-11 10:17 UTC (permalink / raw)
  To: wenxiong; +Cc: Brian King, linuxppc-dev, wenxiong
In-Reply-To: <1588857037-25950-1-git-send-email-wenxiong@linux.vnet.ibm.com>

On Fri, May 8, 2020 at 12:36 AM <wenxiong@linux.vnet.ibm.com> wrote:
>
> From: Wen Xiong <wenxiong@linux.vnet.ibm.com>
>
> Several device drivers hit EEH (Extended Error Handling) when triggering
> kdump on pSeries PowerVM. This patch implements a reset of the PHBs
> in the general PCI code. The PHB reset stops all PCI transactions from
> the previous kernel. We have tested the patch in several environments:
> - direct slot adapters
> - adapters under the switch
> - a VF adapter in PowerVM
> - a VF adapter/adapter in KVM guest.
>
> Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/pci.c | 153 +++++++++++++++++++++++++++
>  1 file changed, 153 insertions(+)
>
> diff --git a/arch/powerpc/platforms/pseries/pci.c b/arch/powerpc/platforms/pseries/pci.c
> index 911534b89c85..aac7f00696d2 100644
> --- a/arch/powerpc/platforms/pseries/pci.c
> +++ b/arch/powerpc/platforms/pseries/pci.c
> @@ -11,6 +11,8 @@
>  #include <linux/kernel.h>
>  #include <linux/pci.h>
>  #include <linux/string.h>
> +#include <linux/crash_dump.h>
> +#include <linux/delay.h>
>
>  #include <asm/eeh.h>
>  #include <asm/pci-bridge.h>
> @@ -354,3 +356,154 @@ int pseries_root_bridge_prepare(struct pci_host_bridge *bridge)
>
>         return 0;
>  }
> +
> +/**
> + * pseries_get_pdn_addr - Retrieve PHB address
> + * @pe: EEH PE
> + *
> + * Retrieve the assocated PHB address. Actually, there're 2 RTAS
> + * function calls dedicated for the purpose. We need implement
> + * it through the new function and then the old one. Besides,
> + * you should make sure the config address is figured out from
> + * FDT node before calling the function.
> + *
> + */
> +static int pseries_get_pdn_addr(struct pci_controller *phb)
> +{
> +       int ret = -1;
> +       int rets[3];
> +       int ibm_get_config_addr_info;
> +       int ibm_get_config_addr_info2;
> +       int config_addr = 0;
> +       struct pci_dn *root_pdn, *pdn;
> +
> +       ibm_get_config_addr_info2   = rtas_token("ibm,get-config-addr-info2");
> +       ibm_get_config_addr_info    = rtas_token("ibm,get-config-addr-info");
> +
> +       root_pdn = PCI_DN(phb->dn);
> +       pdn = list_first_entry(&root_pdn->child_list, struct pci_dn, list);
> +       config_addr = (pdn->busno << 16) | (pdn->devfn << 8);
> +
> +       if (ibm_get_config_addr_info2 != RTAS_UNKNOWN_SERVICE) {
> +               /*
> +                * First of all, we need to make sure there has one PE
> +                * associated with the device. Otherwise, PE address is
> +                * meaningless.
> +                */
> +               ret = rtas_call(ibm_get_config_addr_info2, 4, 2, rets,
> +                       config_addr, BUID_HI(pdn->phb->buid),
> +                       BUID_LO(pdn->phb->buid), 1);
> +               if (ret || (rets[0] == 0)) {
> +                       pr_warn("%s: Failed to get address for PHB#%x-PE# "
> +                               "option=%d config_addr=%x\n",
> +                               __func__, pdn->phb->global_number, 1, rets[0]);
> +                       return -1;
> +               }
> +
> +               /* Retrieve the associated PE config address */
> +               ret = rtas_call(ibm_get_config_addr_info2, 4, 2, rets,
> +                       config_addr, BUID_HI(pdn->phb->buid),
> +                       BUID_LO(pdn->phb->buid), 0);
> +               if (ret) {
> +                       pr_warn("%s: Failed to get address for PHB#%x-PE# "
> +                               "option=%d config_addr=%x\n",
> +                               __func__, pdn->phb->global_number, 0, rets[0]);
> +                       return -1;
> +               }
> +               return rets[0];
> +       }
> +
> +       if (ibm_get_config_addr_info != RTAS_UNKNOWN_SERVICE) {
> +               ret = rtas_call(ibm_get_config_addr_info, 4, 2, rets,
> +                       config_addr, BUID_HI(pdn->phb->buid),
> +                       BUID_LO(pdn->phb->buid), 0);
> +               if (ret || rets[0]) {
> +                       pr_warn("%s: Failed to get address for PHB#%x-PE# "
> +                               "config_addr=%x\n",
> +                               __func__, pdn->phb->global_number, rets[0]);
> +                       return -1;
> +               }
> +               return rets[0];
> +       }
> +
> +       return ret;
> +}

It'd be nice if we could reduce the amount of duplication between this
function and eeh_pseries_get_pe_addr(). I was planning on reworking
the EEH version though, so this is fine for now.

> +
> +static int __init pseries_phb_reset(void)
> +{
> +       struct pci_controller *phb;
> +       int config_addr;
> +       int ibm_set_slot_reset;
> +       int ibm_configure_pe;
> +       int ret;
> +
> +       if (is_kdump_kernel() || reset_devices) {
> +               pr_info("Issue PHB reset ...\n");
> +               ibm_set_slot_reset = rtas_token("ibm,set-slot-reset");
> +               ibm_configure_pe = rtas_token("ibm,configure-pe");
> +
> +               if (ibm_set_slot_reset == RTAS_UNKNOWN_SERVICE ||
> +                               ibm_configure_pe == RTAS_UNKNOWN_SERVICE) {
> +                       pr_info("%s: EEH functionality not supported\n",
> +                               __func__);
> +               }
> +
> +               list_for_each_entry(phb, &hose_list, list_node) {
> +                       config_addr = pseries_get_pdn_addr(phb);
> +                       if (config_addr == -1)
> +                               continue;

Considering we already cache the buid in the pci_controller we could
also cache the config_addr. That said, considering all this runs
precisely once at boot I'm not that bothered by calling
pseries_get_pdn_addr() again in each loop.


> +
> +                       ret = rtas_call(ibm_set_slot_reset, 4, 1, NULL,
> +                               config_addr, BUID_HI(phb->buid),
> +                               BUID_LO(phb->buid), EEH_RESET_FUNDAMENTAL);
> +
> +                       /* If fundamental-reset not supported, try hot-reset */
> +                       if (ret == -8)
> +                               ret = rtas_call(ibm_set_slot_reset, 4, 1, NULL,
> +                                       config_addr, BUID_HI(phb->buid),
> +                                       BUID_LO(phb->buid), EEH_RESET_HOT);
> +
> +                       if (ret) {
> +                               pr_err("%s: fail with rtas_call fundamental reset=%d\n",
> +                                       __func__, ret);
> +                               continue;
> +                       }
> +               }
> +               msleep(EEH_PE_RST_SETTLE_TIME);
> +
> +               list_for_each_entry(phb, &hose_list, list_node) {
> +                       config_addr = pseries_get_pdn_addr(phb);
> +                       if (config_addr == -1)
> +                               continue;
> +
> +                       ret = rtas_call(ibm_set_slot_reset, 4, 1, NULL,
> +                               config_addr, BUID_HI(phb->buid),
> +                               BUID_LO(phb->buid), EEH_RESET_DEACTIVATE);
> +                       if (ret) {
> +                               pr_err("%s: fail with rtas_call deactive=%d\n",
> +                                       __func__, ret);
> +                               continue;
> +                       }
> +               }
> +               msleep(EEH_PE_RST_SETTLE_TIME);
> +
> +               list_for_each_entry(phb, &hose_list, list_node) {
> +                       config_addr = pseries_get_pdn_addr(phb);
> +                       if (config_addr == -1)
> +                               continue;
> +
> +                       ret = rtas_call(ibm_configure_pe, 3, 1, NULL,
> +                               config_addr, BUID_HI(phb->buid),
> +                               BUID_LO(phb->buid));
> +                       if (ret) {
> +                               pr_err("%s: fail with rtas_call configure_pe =%d\n",
> +                                       __func__, ret);
> +                               continue;
> +                       }
> +               }
> +       }
> +
> +       return 0;
> +}
> +postcore_initcall(pseries_phb_reset);

You probably should use machine_postcore_initcall(pseries,
pseries_phb_reset); so that this only gets run on pseries. Without the
machine type specifier it'll run on PowerNV too.
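
As a rough illustration of the difference, here is a user-space model of a
plain initcall versus one gated on the machine type. It is a sketch under
stated assumptions: the model_* names, the string-based machine check and
the counter are all invented for the example and are not the kernel's
initcall machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Stand-in for the kernel's notion of the running platform. */
static const char *model_machine = "powernv";
static int model_reset_calls;

static bool model_machine_is(const char *name)
{
	return strcmp(model_machine, name) == 0;
}

/* Stand-in for pseries_phb_reset(); it only counts invocations. */
static int model_phb_reset(void)
{
	model_reset_calls++;
	return 0;
}

/* Ungated registration: the function body runs on every platform. */
static int model_plain_initcall(void)
{
	return model_phb_reset();
}

/* Gated registration: skip the call unless the platform matches. */
static int model_machine_initcall(void)
{
	if (model_machine_is("pseries"))
		return model_phb_reset();
	return 0;
}

void model_initcall_demo(void)
{
	model_machine = "powernv";
	model_reset_calls = 0;

	model_machine_initcall();
	assert(model_reset_calls == 0);	/* gated: not run on PowerNV */

	model_plain_initcall();
	assert(model_reset_calls == 1);	/* ungated: runs anyway */
}
```

The gated variant is the behaviour wanted here: the reset code stays
pseries-only without needing its own platform check.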

> +
> --
> 2.18.1
>


* Re: [PATCH v4 11/16] powerpc/64s: machine check interrupt update NMI accounting
From: Michael Ellerman @ 2020-05-11  9:50 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev, kbuild test robot; +Cc: kbuild-all
In-Reply-To: <1589010505.dk8cddftjn.astroid@bobo.none>

Nicholas Piggin <npiggin@gmail.com> writes:
> Excerpts from kbuild test robot's message of May 9, 2020 1:13 pm:
>> Hi Nicholas,
>> 
>> I love your patch! Yet something to improve:
>
> ...
>
>>   1419	#if defined(CONFIG_4xx) || defined(CONFIG_BOOKE)
>>   1420			pr_cont("DEAR: "REG" ESR: "REG" ", regs->dar, regs->dsisr);
>>   1421	#else
>>   1422			pr_cont("DAR: "REG" DSISR: %08lx ", regs->dar, regs->dsisr);
>>   1423	#endif
>>   1424	#ifdef CONFIG_PPC64
>>> 1425		pr_cont("IRQMASK: %lx IN_NMI:%d IN_MCE:%d", regs->softe, (int)get_paca()->in_nmi, (int)get_paca()->in_mce);
>
> Oh I meant to get rid of that hunk, it crept back in :(
>
> mpe if you could please take it out if you're merging this.

Yep. I just came here to tell you I'd dropped that hunk :)

> It was quite useful for debugging this stuff, I might do a proper patch 
> for this, but for now not necessary (it doesn't matter for "normal" 
> crashes only crash crashes).

Yeah would be good to print more of those flags.

cheers

^ permalink raw reply

* Re: [PATCH v2 0/5] Statsfs: a new ram-based file system for Linux kernel statistics
From: Emanuele Giuseppe Esposito @ 2020-05-11  9:37 UTC (permalink / raw)
  To: Paolo Bonzini, Jonathan Adams
  Cc: linux-s390, kvm list, David Hildenbrand, Cornelia Huck,
	Emanuele Giuseppe Esposito, LKML, kvm-ppc, linux-mips,
	Christian Borntraeger, Alexander Viro, David Rientjes,
	linux-fsdevel, Vitaly Kuznetsov, linuxppc-dev, Jim Mattson
In-Reply-To: <29982969-92f6-b6d0-aeae-22edb401e3ac@redhat.com>


On 5/8/20 11:44 AM, Paolo Bonzini wrote:
> So in general I'd say the sources/values model holds up.  We certainly
> want to:
> 
> - switch immediately to callbacks instead of the type constants (so that
> core statsfs code only does signed/unsigned)
> 
> - add a field to distinguish cumulative and floating properties (and use
> it to determine the default file mode)
> 
> - add a new argument to statsfs_create_source and statsfs_create_values
> that makes it not create directories and files respectively
> 
> - add a new API to look for a statsfs_value recursively in all the
> subordinate sources, and pass the source/value pair to a callback
> function; and reimplement recursive aggregation and clear in terms of
> this function.

OK, I will apply this. Thank you for all the suggestions.
I will post the v3 patchset in the next few weeks.

In the meantime, I wrote the documentation you asked for (even though it
is going to change in v3). You can find it here:

https://github.com/esposem/linux/commit/dfa92f270f1aed73d5f3b7f12640b2a1635c711f

Thank you,
Emanuele


^ permalink raw reply

* [PATCH v3 2/2] powerpc/64s/hash: add torture_hpt kernel boot option to increase hash faults
From: Nicholas Piggin @ 2020-05-11  9:20 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20200511092040.1339667-1-npiggin@gmail.com>

This option increases the number of hash misses by limiting the number of
kernel HPT entries: the address is accessed immediately after the PTE is
installed, and the PTE is then removed again. CI entries must not be
accessed, so they are instead removed on the next hash fault.

This helps stress test difficult-to-hit paths in the kernel.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
v3:
- Remove the dead code that didn't work on QEMU, and comment it instead.
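
For readers skimming the diff, here is a user-space model of the
deferred-removal logic in hpt_do_torture() as it appears in this version
of the patch: each CPU remembers the PTE group of the last kernel HPTE
it installed and evicts that group on its next hash fault. It is a
sketch only. NR_CPUS_MODEL, KERNEL_START, model_remove_group() and
removed_group are hypothetical names, and the real code drains the group
with mmu_hash_ops.hpte_remove() instead.

```c
#include <stdint.h>

#define NR_CPUS_MODEL	4			/* hypothetical CPU count */
#define NO_GROUP	UINT64_MAX		/* "nothing remembered" sentinel */
#define KERNEL_START	0xc000000000000000ULL	/* stand-in for PAGE_OFFSET */

static uint64_t last_group[NR_CPUS_MODEL] = {
	NO_GROUP, NO_GROUP, NO_GROUP, NO_GROUP
};
static uint64_t removed_group = NO_GROUP;	/* records the last eviction */

/* Stand-in for draining a PTE group via mmu_hash_ops.hpte_remove(). */
static void model_remove_group(uint64_t group)
{
	removed_group = group;
}

/* Called after a hash fault on 'cpu' installed an HPTE in 'hpte_group'. */
static void model_hpt_do_torture(int cpu, uint64_t ea, uint64_t hpte_group)
{
	/* Evict whatever the previous kernel fault on this CPU installed. */
	if (last_group[cpu] != NO_GROUP) {
		model_remove_group(last_group[cpu]);
		last_group[cpu] = NO_GROUP;
	}

	/* Only kernel addresses are remembered for later eviction. */
	if (ea >= KERNEL_START)
		last_group[cpu] = hpte_group;
}
```

In this model a kernel fault followed by any other fault on the same CPU
evicts the first group, so a kernel mapping survives for at most one
fault interval.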

 .../admin-guide/kernel-parameters.txt         |  9 ++++
 arch/powerpc/include/asm/book3s/64/mmu-hash.h | 11 ++++
 arch/powerpc/mm/book3s64/hash_4k.c            |  3 ++
 arch/powerpc/mm/book3s64/hash_64k.c           |  8 +++
 arch/powerpc/mm/book3s64/hash_utils.c         | 54 ++++++++++++++++++-
 5 files changed, 84 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 8dd4260746dc..f51ed836954f 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -876,6 +876,15 @@
 			them frequently to increase the rate of SLB faults
 			on kernel addresses.
 
+	torture_hpt	[PPC]
+			Limits the number of kernel HPT entries in the hash
+			page table to increase the rate of hash page table
+			faults on kernel addresses.
+
+			This may hang when run on processors / emulators which
+			do not have a TLB, or flush it more often than
+			required, QEMU seems to have problems.
+
 	disable=	[IPV6]
 			See Documentation/networking/ipv6.txt.
 
diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 758de1e0f676..3d50ab629bde 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -8,6 +8,7 @@
  *   PPC64 rework.
  */
 
+#include <linux/jump_label.h>
 #include <asm/page.h>
 #include <asm/bug.h>
 #include <asm/asm-const.h>
@@ -324,6 +325,16 @@ static inline bool torture_slb(void)
 	return static_branch_unlikely(&torture_slb_key);
 }
 
+extern bool torture_hpt_enabled;
+DECLARE_STATIC_KEY_FALSE(torture_hpt_key);
+static inline bool torture_hpt(void)
+{
+	return static_branch_unlikely(&torture_hpt_key);
+}
+
+void hpt_do_torture(unsigned long ea, unsigned long access,
+		    unsigned long rflags, unsigned long hpte_group);
+
 /*
  * This computes the AVPN and B fields of the first dword of a HPTE,
  * for use when we want to match an existing PTE.  The bottom 7 bits
diff --git a/arch/powerpc/mm/book3s64/hash_4k.c b/arch/powerpc/mm/book3s64/hash_4k.c
index 22e787123cdf..54e4ff8c558d 100644
--- a/arch/powerpc/mm/book3s64/hash_4k.c
+++ b/arch/powerpc/mm/book3s64/hash_4k.c
@@ -118,6 +118,9 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 		}
 		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;
 		new_pte |= pte_set_hidx(ptep, rpte, 0, slot, PTRS_PER_PTE);
+
+		if (torture_hpt())
+			hpt_do_torture(ea, access, rflags, hpte_group);
 	}
 	*ptep = __pte(new_pte & ~H_PAGE_BUSY);
 	return 0;
diff --git a/arch/powerpc/mm/book3s64/hash_64k.c b/arch/powerpc/mm/book3s64/hash_64k.c
index 7084ce2951e6..19ea0fc145a9 100644
--- a/arch/powerpc/mm/book3s64/hash_64k.c
+++ b/arch/powerpc/mm/book3s64/hash_64k.c
@@ -216,6 +216,9 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	new_pte |= pte_set_hidx(ptep, rpte, subpg_index, slot, PTRS_PER_PTE);
 	new_pte |= H_PAGE_HASHPTE;
 
+	if (torture_hpt())
+		hpt_do_torture(ea, access, rflags, hpte_group);
+
 	*ptep = __pte(new_pte & ~H_PAGE_BUSY);
 	return 0;
 }
@@ -327,7 +330,12 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 
 		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;
 		new_pte |= pte_set_hidx(ptep, rpte, 0, slot, PTRS_PER_PTE);
+
+		if (torture_hpt())
+			hpt_do_torture(ea, access, rflags, hpte_group);
 	}
+
 	*ptep = __pte(new_pte & ~H_PAGE_BUSY);
+
 	return 0;
 }
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 9c487b5782ef..ffe364269b8a 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -353,8 +353,12 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
 	return ret;
 }
 
-static bool disable_1tb_segments = false;
+static bool disable_1tb_segments __read_mostly = false;
 bool torture_slb_enabled __read_mostly = false;
+bool torture_hpt_enabled __read_mostly = false;
+
+/* per-CPU array allocated if we enable torture_hpt. */
+static unsigned long *torture_hpt_last_group;
 
 static int __init parse_disable_1tb_segments(char *p)
 {
@@ -370,6 +374,13 @@ static int __init parse_torture_slb(char *p)
 }
 early_param("torture_slb", parse_torture_slb);
 
+static int __init parse_torture_hpt(char *p)
+{
+	torture_hpt_enabled = true;
+	return 0;
+}
+early_param("torture_hpt", parse_torture_hpt);
+
 static int __init htab_dt_scan_seg_sizes(unsigned long node,
 					 const char *uname, int depth,
 					 void *data)
@@ -863,6 +874,7 @@ static void __init hash_init_partition_table(phys_addr_t hash_table,
 }
 
 DEFINE_STATIC_KEY_FALSE(torture_slb_key);
+DEFINE_STATIC_KEY_FALSE(torture_hpt_key);
 
 static void __init htab_initialize(void)
 {
@@ -882,6 +894,15 @@ static void __init htab_initialize(void)
 
 	if (torture_slb_enabled)
 		static_branch_enable(&torture_slb_key);
+	if (torture_hpt_enabled) {
+		unsigned long tmp;
+		static_branch_enable(&torture_hpt_key);
+		tmp = memblock_phys_alloc_range(sizeof(unsigned long) * NR_CPUS,
+						  0,
+						  0, MEMBLOCK_ALLOC_ANYWHERE);
+		memset((void *)tmp, 0xff, sizeof(unsigned long) * NR_CPUS);
+		torture_hpt_last_group = __va(tmp);
+	}
 
 	/*
 	 * Calculate the required size of the htab.  We want the number of
@@ -1901,6 +1922,37 @@ long hpte_insert_repeating(unsigned long hash, unsigned long vpn,
 	return slot;
 }
 
+void hpt_do_torture(unsigned long ea, unsigned long access,
+		    unsigned long rflags, unsigned long hpte_group)
+{
+	unsigned long last_group;
+	int cpu = raw_smp_processor_id();
+
+	last_group = torture_hpt_last_group[cpu];
+	if (last_group != -1UL) {
+		while (mmu_hash_ops.hpte_remove(last_group) != -1)
+			;
+		torture_hpt_last_group[cpu] = -1UL;
+	}
+
+	if (ea >= PAGE_OFFSET) {
+		/*
+		 * We would really like to prefetch here to get the TLB loaded,
+		 * then remove the PTE before returning to userspace, to
+		 * increase the hash fault rate.
+		 *
+		 * Unfortunately QEMU TCG does not model the TLB in a way that
+		 * makes this possible, and systemsim (mambo) emulator does not
+		 * bring in TLBs with prefetches (although loads/stores do
+		 * work for non-CI PTEs).
+		 *
+		 * So remember this PTE and clear it on the next hash fault.
+		 */
+		torture_hpt_last_group[cpu] = hpte_group;
+	}
+}
+
+
 #ifdef CONFIG_DEBUG_PAGEALLOC
 static void kernel_map_linear_page(unsigned long vaddr, unsigned long lmi)
 {
-- 
2.23.0


^ permalink raw reply related

* [PATCH v3 1/2] powerpc/64s/hash: add torture_slb kernel boot option to increase SLB faults
From: Nicholas Piggin @ 2020-05-11  9:20 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

This option increases the number of SLB misses by limiting the number of
kernel SLB entries and by flushing cached lookaside information more often.
This helps stress test difficult-to-hit paths in the kernel.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
v3:
- Fix compile error found by ktr
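
As a reading aid, a user-space model of the bookkeeping that
slb_insert_entry() gains under torture_slb: the slb_cache is repurposed
to track non-bolted kernel entries, and all of them are flushed on any
user fault or once three are already cached. This is a sketch under
those assumptions. The model_* names and the flushed counter are
hypothetical, and the real slb_cache_slbie_kernel() additionally skips
the entry covering the kernel stack.

```c
#include <stdbool.h>

#define TORTURE_MAX_KERNEL	3	/* limit used by the patch */

static unsigned long long cache[TORTURE_MAX_KERNEL];
static int cache_ptr;
static int flushed;	/* counts invalidated entries, for illustration */

/* Stand-in for issuing slbie on one cached kernel entry. */
static void model_slbie_kernel(int index)
{
	(void)cache[index];
	flushed++;
}

/* Called just before an SLB entry for 'esid' is inserted. */
static void model_slb_insert(unsigned long long esid, bool kernel)
{
	int idx = cache_ptr;

	/* Flush on any user fault, or when the kernel cache is full. */
	if (!kernel || idx == TORTURE_MAX_KERNEL) {
		for (int i = 0; i < idx; i++)
			model_slbie_kernel(i);
		idx = 0;
	}

	/* Only kernel entries are tracked for later invalidation. */
	if (kernel)
		cache[idx++] = esid;
	cache_ptr = idx;
}
```

Inserting a fourth kernel entry flushes the first three before caching
the new one, so at most three non-bolted kernel entries are live at once.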

 .../admin-guide/kernel-parameters.txt         |   5 +
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |   7 +
 arch/powerpc/mm/book3s64/hash_utils.c         |  13 ++
 arch/powerpc/mm/book3s64/slb.c                | 152 ++++++++++++------
 4 files changed, 132 insertions(+), 45 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 7bc83f3d9bdf..8dd4260746dc 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -871,6 +871,11 @@
 			can be useful when debugging issues that require an SLB
 			miss to occur.
 
+	torture_slb	[PPC]
+			Limits the number of kernel SLB entries, and flushes
+			them frequently to increase the rate of SLB faults
+			on kernel addresses.
+
 	disable=	[IPV6]
 			See Documentation/networking/ipv6.txt.
 
diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 3fa1b962dc27..758de1e0f676 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -317,6 +317,13 @@ extern unsigned long tce_alloc_start, tce_alloc_end;
  */
 extern int mmu_ci_restrictions;
 
+extern bool torture_slb_enabled;
+DECLARE_STATIC_KEY_FALSE(torture_slb_key);
+static inline bool torture_slb(void)
+{
+	return static_branch_unlikely(&torture_slb_key);
+}
+
 /*
  * This computes the AVPN and B fields of the first dword of a HPTE,
  * for use when we want to match an existing PTE.  The bottom 7 bits
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 8ed2411c3f39..9c487b5782ef 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -354,6 +354,7 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
 }
 
 static bool disable_1tb_segments = false;
+bool torture_slb_enabled __read_mostly = false;
 
 static int __init parse_disable_1tb_segments(char *p)
 {
@@ -362,6 +363,13 @@ static int __init parse_disable_1tb_segments(char *p)
 }
 early_param("disable_1tb_segments", parse_disable_1tb_segments);
 
+static int __init parse_torture_slb(char *p)
+{
+	torture_slb_enabled = true;
+	return 0;
+}
+early_param("torture_slb", parse_torture_slb);
+
 static int __init htab_dt_scan_seg_sizes(unsigned long node,
 					 const char *uname, int depth,
 					 void *data)
@@ -854,6 +862,8 @@ static void __init hash_init_partition_table(phys_addr_t hash_table,
 	pr_info("Partition table %p\n", partition_tb);
 }
 
+DEFINE_STATIC_KEY_FALSE(torture_slb_key);
+
 static void __init htab_initialize(void)
 {
 	unsigned long table;
@@ -870,6 +880,9 @@ static void __init htab_initialize(void)
 		printk(KERN_INFO "Using 1TB segments\n");
 	}
 
+	if (torture_slb_enabled)
+		static_branch_enable(&torture_slb_key);
+
 	/*
 	 * Calculate the required size of the htab.  We want the number of
 	 * PTEGs to equal one half the number of real pages.
diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
index 716204aee3da..cf41c4b63d95 100644
--- a/arch/powerpc/mm/book3s64/slb.c
+++ b/arch/powerpc/mm/book3s64/slb.c
@@ -68,7 +68,7 @@ static void assert_slb_presence(bool present, unsigned long ea)
 	 * slbfee. requires bit 24 (PPC bit 39) be clear in RB. Hardware
 	 * ignores all other bits from 0-27, so just clear them all.
 	 */
-	ea &= ~((1UL << 28) - 1);
+	ea &= ~((1UL << SID_SHIFT) - 1);
 	asm volatile(__PPC_SLBFEE_DOT(%0, %1) : "=r"(tmp) : "r"(ea) : "cr0");
 
 	WARN_ON(present == (tmp == 0));
@@ -153,14 +153,42 @@ void slb_flush_all_realmode(void)
 	asm volatile("slbmte %0,%0; slbia" : : "r" (0));
 }
 
+static __always_inline void __slb_flush_and_restore_bolted(bool preserve_kernel_lookaside)
+{
+	struct slb_shadow *p = get_slb_shadow();
+	unsigned long ksp_esid_data, ksp_vsid_data;
+	u32 ih;
+
+	/*
+	 * SLBIA IH=1 on ISA v2.05 and newer processors may preserve lookaside
+	 * information created with Class=0 entries, which we use for kernel
+	 * SLB entries (the SLB entries themselves are still invalidated).
+	 *
+	 * Older processors will ignore this optimisation. Over-invalidation
+	 * is fine because we never rely on lookaside information existing.
+	 */
+	if (preserve_kernel_lookaside)
+		ih = 1;
+	else
+		ih = 0;
+
+	ksp_esid_data = be64_to_cpu(p->save_area[KSTACK_INDEX].esid);
+	ksp_vsid_data = be64_to_cpu(p->save_area[KSTACK_INDEX].vsid);
+
+	asm volatile(PPC_SLBIA(%0)"	\n"
+		     "slbmte	%1, %2	\n"
+		     :: "i" (ih),
+			"r" (ksp_vsid_data),
+			"r" (ksp_esid_data)
+		     : "memory");
+}
+
 /*
  * This flushes non-bolted entries, it can be run in virtual mode. Must
  * be called with interrupts disabled.
  */
 void slb_flush_and_restore_bolted(void)
 {
-	struct slb_shadow *p = get_slb_shadow();
-
 	BUILD_BUG_ON(SLB_NUM_BOLTED != 2);
 
 	WARN_ON(!irqs_disabled());
@@ -171,13 +199,10 @@ void slb_flush_and_restore_bolted(void)
 	 */
 	hard_irq_disable();
 
-	asm volatile("isync\n"
-		     "slbia\n"
-		     "slbmte  %0, %1\n"
-		     "isync\n"
-		     :: "r" (be64_to_cpu(p->save_area[KSTACK_INDEX].vsid)),
-			"r" (be64_to_cpu(p->save_area[KSTACK_INDEX].esid))
-		     : "memory");
+	isync();
+	__slb_flush_and_restore_bolted(false);
+	isync();
+
 	assert_slb_presence(true, get_paca()->kstack);
 
 	get_paca()->slb_cache_ptr = 0;
@@ -400,6 +425,30 @@ void preload_new_slb_context(unsigned long start, unsigned long sp)
 	local_irq_enable();
 }
 
+static void slb_cache_slbie_kernel(unsigned int index)
+{
+	unsigned long slbie_data = get_paca()->slb_cache[index];
+	unsigned long ksp = get_paca()->kstack;
+
+	slbie_data <<= SID_SHIFT;
+	slbie_data |= 0xc000000000000000ULL;
+	if ((ksp & slb_esid_mask(mmu_kernel_ssize)) == slbie_data)
+		return;
+	slbie_data |= mmu_kernel_ssize << SLBIE_SSIZE_SHIFT;
+
+	asm volatile("slbie %0" : : "r" (slbie_data));
+}
+
+static void slb_cache_slbie_user(unsigned int index)
+{
+	unsigned long slbie_data = get_paca()->slb_cache[index];
+
+	slbie_data <<= SID_SHIFT;
+	slbie_data |= user_segment_size(slbie_data) << SLBIE_SSIZE_SHIFT;
+	slbie_data |= SLBIE_C; /* user slbs have C=1 */
+
+	asm volatile("slbie %0" : : "r" (slbie_data));
+}
 
 /* Flush all user entries from the segment table of the current processor. */
 void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
@@ -414,8 +463,14 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
 	 * which would update the slb_cache/slb_cache_ptr fields in the PACA.
 	 */
 	hard_irq_disable();
-	asm volatile("isync" : : : "memory");
-	if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+	isync();
+	if (torture_slb()) {
+		__slb_flush_and_restore_bolted(false);
+		isync();
+		get_paca()->slb_cache_ptr = 0;
+		get_paca()->slb_kern_bitmap = (1U << SLB_NUM_BOLTED) - 1;
+
+	} else if (cpu_has_feature(CPU_FTR_ARCH_300)) {
 		/*
 		 * SLBIA IH=3 invalidates all Class=1 SLBEs and their
 		 * associated lookaside structures, which matches what
@@ -423,47 +478,29 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
 		 * cache.
 		 */
 		asm volatile(PPC_SLBIA(3));
+
 	} else {
 		unsigned long offset = get_paca()->slb_cache_ptr;
 
 		if (!mmu_has_feature(MMU_FTR_NO_SLBIE_B) &&
 		    offset <= SLB_CACHE_ENTRIES) {
-			unsigned long slbie_data = 0;
-
-			for (i = 0; i < offset; i++) {
-				unsigned long ea;
-
-				ea = (unsigned long)
-					get_paca()->slb_cache[i] << SID_SHIFT;
-				/*
-				 * Could assert_slb_presence(true) here, but
-				 * hypervisor or machine check could have come
-				 * in and removed the entry at this point.
-				 */
-
-				slbie_data = ea;
-				slbie_data |= user_segment_size(slbie_data)
-						<< SLBIE_SSIZE_SHIFT;
-				slbie_data |= SLBIE_C; /* user slbs have C=1 */
-				asm volatile("slbie %0" : : "r" (slbie_data));
-			}
+			/*
+			 * Could assert_slb_presence(true) here, but
+			 * hypervisor or machine check could have come
+			 * in and removed the entry at this point.
+			 */
+
+			for (i = 0; i < offset; i++)
+				slb_cache_slbie_user(i);
 
 			/* Workaround POWER5 < DD2.1 issue */
 			if (!cpu_has_feature(CPU_FTR_ARCH_207S) && offset == 1)
-				asm volatile("slbie %0" : : "r" (slbie_data));
+				slb_cache_slbie_user(0);
 
 		} else {
-			struct slb_shadow *p = get_slb_shadow();
-			unsigned long ksp_esid_data =
-				be64_to_cpu(p->save_area[KSTACK_INDEX].esid);
-			unsigned long ksp_vsid_data =
-				be64_to_cpu(p->save_area[KSTACK_INDEX].vsid);
-
-			asm volatile(PPC_SLBIA(1) "\n"
-				     "slbmte	%0,%1\n"
-				     "isync"
-				     :: "r"(ksp_vsid_data),
-					"r"(ksp_esid_data));
+			/* Flush but retain kernel lookaside information */
+			__slb_flush_and_restore_bolted(true);
+			isync();
 
 			get_paca()->slb_kern_bitmap = (1U << SLB_NUM_BOLTED) - 1;
 		}
@@ -503,7 +540,7 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
 	 * address accesses by the kernel (user mode won't happen until
 	 * rfid, which is safe).
 	 */
-	asm volatile("isync" : : : "memory");
+	isync();
 }
 
 void slb_set_size(u16 size)
@@ -571,6 +608,9 @@ static void slb_cache_update(unsigned long esid_data)
 	if (cpu_has_feature(CPU_FTR_ARCH_300))
 		return; /* ISAv3.0B and later does not use slb_cache */
 
+	if (torture_slb())
+		return;
+
 	/*
 	 * Now update slb cache entries
 	 */
@@ -580,7 +620,7 @@ static void slb_cache_update(unsigned long esid_data)
 		 * We have space in slb cache for optimized switch_slb().
 		 * Top 36 bits from esid_data as per ISA
 		 */
-		local_paca->slb_cache[slb_cache_index++] = esid_data >> 28;
+		local_paca->slb_cache[slb_cache_index++] = esid_data >> SID_SHIFT;
 		local_paca->slb_cache_ptr++;
 	} else {
 		/*
@@ -671,6 +711,28 @@ static long slb_insert_entry(unsigned long ea, unsigned long context,
 	 * accesses user memory before it returns to userspace with rfid.
 	 */
 	assert_slb_presence(false, ea);
+	if (torture_slb()) {
+		int slb_cache_index = local_paca->slb_cache_ptr;
+
+		/*
+		 * torture_slb() does not use slb cache, repurpose as a
+		 * cache of inserted (non-bolted) kernel SLB entries. All
+		 * non-bolted kernel entries are flushed on any user fault,
+		 * or if there are already 3 non-bolted kernel entries.
+		 */
+		BUILD_BUG_ON(SLB_CACHE_ENTRIES < 3);
+		if (!kernel || slb_cache_index == 3) {
+			int i;
+
+			for (i = 0; i < slb_cache_index; i++)
+				slb_cache_slbie_kernel(i);
+			slb_cache_index = 0;
+		}
+
+		if (kernel)
+			local_paca->slb_cache[slb_cache_index++] = esid_data >> SID_SHIFT;
+		local_paca->slb_cache_ptr = slb_cache_index;
+	}
 	asm volatile("slbmte %0, %1" : : "r" (vsid_data), "r" (esid_data));
 
 	barrier();
-- 
2.23.0


^ permalink raw reply related

* [powerpc:merge] BUILD SUCCESS 78263190ec9727216ca715bfc0ee8b58b657d1ea
From: kbuild test robot @ 2020-05-11  8:25 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git  merge
branch HEAD: 78263190ec9727216ca715bfc0ee8b58b657d1ea  Automatic merge of 'master', 'next' and 'fixes' (2020-05-07 22:25)

elapsed time: 5480m

configs tested: 118
configs skipped: 26

The following configs have been built successfully.
More configs may be tested in the coming days.

arm                                 defconfig
arm                              allyesconfig
arm                              allmodconfig
arm                               allnoconfig
arm64                            allyesconfig
arm64                               defconfig
arm64                            allmodconfig
arm64                             allnoconfig
sparc                            allyesconfig
m68k                             allyesconfig
m68k                             allmodconfig
i386                             allyesconfig
s390                             allmodconfig
m68k                              allnoconfig
um                                allnoconfig
m68k                                defconfig
i386                              allnoconfig
i386                                defconfig
i386                              debian-10.3
ia64                             allmodconfig
ia64                                defconfig
ia64                              allnoconfig
ia64                             allyesconfig
m68k                           sun3_defconfig
nds32                               defconfig
nds32                             allnoconfig
csky                             allyesconfig
csky                                defconfig
alpha                               defconfig
alpha                            allyesconfig
xtensa                           allyesconfig
h8300                            allyesconfig
h8300                            allmodconfig
xtensa                              defconfig
arc                                 defconfig
arc                              allyesconfig
sh                               allmodconfig
sh                                allnoconfig
microblaze                        allnoconfig
nios2                               defconfig
nios2                            allyesconfig
openrisc                            defconfig
c6x                              allyesconfig
c6x                               allnoconfig
openrisc                         allyesconfig
mips                             allyesconfig
mips                              allnoconfig
mips                             allmodconfig
parisc                            allnoconfig
parisc                              defconfig
parisc                           allyesconfig
parisc                           allmodconfig
powerpc                             defconfig
powerpc                          allyesconfig
powerpc                          rhel-kconfig
powerpc                          allmodconfig
powerpc                           allnoconfig
i386                 randconfig-a006-20200511
i386                 randconfig-a005-20200511
i386                 randconfig-a003-20200511
i386                 randconfig-a001-20200511
i386                 randconfig-a004-20200511
i386                 randconfig-a002-20200511
i386                 randconfig-a005-20200507
i386                 randconfig-a004-20200507
i386                 randconfig-a001-20200507
i386                 randconfig-a002-20200507
i386                 randconfig-a003-20200507
i386                 randconfig-a006-20200507
x86_64               randconfig-a005-20200511
x86_64               randconfig-a003-20200511
x86_64               randconfig-a006-20200511
x86_64               randconfig-a004-20200511
x86_64               randconfig-a001-20200511
x86_64               randconfig-a002-20200511
i386                 randconfig-a012-20200510
i386                 randconfig-a016-20200510
i386                 randconfig-a014-20200510
i386                 randconfig-a011-20200510
i386                 randconfig-a013-20200510
i386                 randconfig-a015-20200510
i386                 randconfig-a012-20200507
i386                 randconfig-a016-20200507
i386                 randconfig-a014-20200507
i386                 randconfig-a011-20200507
i386                 randconfig-a015-20200507
i386                 randconfig-a013-20200507
i386                 randconfig-a012-20200511
i386                 randconfig-a016-20200511
i386                 randconfig-a014-20200511
i386                 randconfig-a011-20200511
i386                 randconfig-a013-20200511
i386                 randconfig-a015-20200511
x86_64               randconfig-a004-20200507
x86_64               randconfig-a006-20200507
x86_64               randconfig-a002-20200507
riscv                            allyesconfig
riscv                             allnoconfig
riscv                               defconfig
riscv                            allmodconfig
s390                             allyesconfig
s390                              allnoconfig
s390                                defconfig
sparc                               defconfig
sparc64                             defconfig
sparc64                           allnoconfig
sparc64                          allyesconfig
sparc64                          allmodconfig
um                               allmodconfig
um                               allyesconfig
um                                  defconfig
x86_64                                   rhel
x86_64                               rhel-7.6
x86_64                    rhel-7.6-kselftests
x86_64                         rhel-7.2-clear
x86_64                                    lkp
x86_64                              fedora-25
x86_64                                  kexec

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply

* [powerpc:fixes] BUILD SUCCESS c44dc6323cd49d8d742c37e234b952e822c35de4
From: kbuild test robot @ 2020-05-11  8:25 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git  fixes
branch HEAD: c44dc6323cd49d8d742c37e234b952e822c35de4  powerpc/64s/kuap: Restore AMR in fast_interrupt_return

elapsed time: 5481m

configs tested: 160
configs skipped: 38

The following configs have been built successfully.
More configs may be tested in the coming days.

arm                                 defconfig
arm                              allyesconfig
arm                              allmodconfig
arm                               allnoconfig
arm64                            allyesconfig
arm64                               defconfig
arm64                            allmodconfig
arm64                             allnoconfig
sparc                            allyesconfig
m68k                             allyesconfig
m68k                             allmodconfig
um                                  defconfig
i386                             allyesconfig
alpha                            allyesconfig
mips                             allmodconfig
s390                             allmodconfig
m68k                              allnoconfig
m68k                                defconfig
um                                allnoconfig
s390                              allnoconfig
riscv                             allnoconfig
riscv                            allmodconfig
parisc                           allyesconfig
i386                              allnoconfig
i386                                defconfig
i386                              debian-10.3
ia64                             allmodconfig
ia64                                defconfig
ia64                              allnoconfig
ia64                             allyesconfig
m68k                           sun3_defconfig
nios2                               defconfig
nios2                            allyesconfig
openrisc                            defconfig
c6x                              allyesconfig
c6x                               allnoconfig
openrisc                         allyesconfig
nds32                               defconfig
nds32                             allnoconfig
csky                             allyesconfig
csky                                defconfig
alpha                               defconfig
xtensa                           allyesconfig
h8300                            allyesconfig
h8300                            allmodconfig
xtensa                              defconfig
arc                                 defconfig
arc                              allyesconfig
sh                               allmodconfig
sh                                allnoconfig
microblaze                        allnoconfig
mips                             allyesconfig
mips                              allnoconfig
parisc                            allnoconfig
parisc                              defconfig
parisc                           allmodconfig
powerpc                             defconfig
powerpc                          allyesconfig
powerpc                          rhel-kconfig
powerpc                          allmodconfig
powerpc                           allnoconfig
x86_64               randconfig-a005-20200508
x86_64               randconfig-a003-20200508
x86_64               randconfig-a001-20200508
x86_64               randconfig-a006-20200508
x86_64               randconfig-a004-20200508
x86_64               randconfig-a002-20200508
i386                 randconfig-a006-20200511
i386                 randconfig-a005-20200511
i386                 randconfig-a003-20200511
i386                 randconfig-a001-20200511
i386                 randconfig-a004-20200511
i386                 randconfig-a002-20200511
i386                 randconfig-a005-20200509
i386                 randconfig-a004-20200509
i386                 randconfig-a003-20200509
i386                 randconfig-a002-20200509
i386                 randconfig-a001-20200509
i386                 randconfig-a006-20200509
i386                 randconfig-a005-20200507
i386                 randconfig-a004-20200507
i386                 randconfig-a001-20200507
i386                 randconfig-a002-20200507
i386                 randconfig-a003-20200507
i386                 randconfig-a006-20200507
i386                 randconfig-a005-20200508
i386                 randconfig-a004-20200508
i386                 randconfig-a003-20200508
i386                 randconfig-a002-20200508
i386                 randconfig-a001-20200508
i386                 randconfig-a006-20200508
x86_64               randconfig-a005-20200511
x86_64               randconfig-a003-20200511
x86_64               randconfig-a006-20200511
x86_64               randconfig-a004-20200511
x86_64               randconfig-a001-20200511
x86_64               randconfig-a002-20200511
x86_64               randconfig-a015-20200507
x86_64               randconfig-a014-20200507
x86_64               randconfig-a012-20200507
x86_64               randconfig-a013-20200507
x86_64               randconfig-a011-20200507
x86_64               randconfig-a016-20200507
x86_64               randconfig-a015-20200509
x86_64               randconfig-a014-20200509
x86_64               randconfig-a011-20200509
x86_64               randconfig-a013-20200509
x86_64               randconfig-a012-20200509
x86_64               randconfig-a016-20200509
x86_64               randconfig-a014-20200508
x86_64               randconfig-a012-20200508
x86_64               randconfig-a016-20200508
x86_64               randconfig-a016-20200511
x86_64               randconfig-a012-20200511
x86_64               randconfig-a014-20200511
x86_64               randconfig-a004-20200507
x86_64               randconfig-a006-20200507
x86_64               randconfig-a002-20200507
i386                 randconfig-a012-20200511
i386                 randconfig-a016-20200511
i386                 randconfig-a014-20200511
i386                 randconfig-a011-20200511
i386                 randconfig-a013-20200511
i386                 randconfig-a015-20200511
i386                 randconfig-a012-20200510
i386                 randconfig-a016-20200510
i386                 randconfig-a014-20200510
i386                 randconfig-a011-20200510
i386                 randconfig-a013-20200510
i386                 randconfig-a015-20200510
i386                 randconfig-a012-20200507
i386                 randconfig-a016-20200507
i386                 randconfig-a014-20200507
i386                 randconfig-a011-20200507
i386                 randconfig-a015-20200507
i386                 randconfig-a013-20200507
i386                 randconfig-a012-20200508
i386                 randconfig-a014-20200508
i386                 randconfig-a016-20200508
i386                 randconfig-a011-20200508
i386                 randconfig-a013-20200508
i386                 randconfig-a015-20200508
riscv                            allyesconfig
riscv                               defconfig
s390                             allyesconfig
s390                                defconfig
sparc                               defconfig
sparc64                             defconfig
sparc64                           allnoconfig
sparc64                          allyesconfig
sparc64                          allmodconfig
um                               allmodconfig
um                               allyesconfig
x86_64                                   rhel
x86_64                               rhel-7.6
x86_64                    rhel-7.6-kselftests
x86_64                         rhel-7.2-clear
x86_64                                    lkp
x86_64                              fedora-25
x86_64                                  kexec

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply

* [powerpc:topic/ppc-kvm] BUILD SUCCESS b1f9be9392f090f08e4ad9e2c68963aeff03bd67
From: kbuild test robot @ 2020-05-11  8:25 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git  topic/ppc-kvm
branch HEAD: b1f9be9392f090f08e4ad9e2c68963aeff03bd67  powerpc/xive: Enforce load-after-store ordering when StoreEOI is active

elapsed time: 484m

configs tested: 95
configs skipped: 71

The following configs have been built successfully.
More configs may be tested in the coming days.

arm                                 defconfig
arm                              allyesconfig
arm                              allmodconfig
arm                               allnoconfig
arm64                            allyesconfig
arm64                               defconfig
arm64                            allmodconfig
arm64                             allnoconfig
sparc                            allyesconfig
m68k                             allyesconfig
parisc                           allyesconfig
i386                              allnoconfig
i386                             allyesconfig
i386                                defconfig
i386                              debian-10.3
ia64                             allmodconfig
ia64                                defconfig
ia64                              allnoconfig
ia64                             allyesconfig
nios2                               defconfig
nios2                            allyesconfig
openrisc                            defconfig
c6x                              allyesconfig
c6x                               allnoconfig
openrisc                         allyesconfig
nds32                               defconfig
nds32                             allnoconfig
csky                             allyesconfig
csky                                defconfig
alpha                               defconfig
alpha                            allyesconfig
xtensa                           allyesconfig
h8300                            allyesconfig
h8300                            allmodconfig
xtensa                              defconfig
arc                                 defconfig
arc                              allyesconfig
sh                               allmodconfig
sh                                allnoconfig
microblaze                        allnoconfig
mips                             allyesconfig
mips                              allnoconfig
mips                             allmodconfig
parisc                            allnoconfig
parisc                              defconfig
parisc                           allmodconfig
powerpc                             defconfig
powerpc                          allyesconfig
powerpc                          rhel-kconfig
powerpc                          allmodconfig
powerpc                           allnoconfig
i386                 randconfig-a006-20200511
i386                 randconfig-a005-20200511
i386                 randconfig-a003-20200511
i386                 randconfig-a001-20200511
i386                 randconfig-a004-20200511
i386                 randconfig-a002-20200511
i386                 randconfig-a012-20200510
i386                 randconfig-a016-20200510
i386                 randconfig-a014-20200510
i386                 randconfig-a011-20200510
i386                 randconfig-a013-20200510
i386                 randconfig-a015-20200510
i386                 randconfig-a012-20200511
i386                 randconfig-a016-20200511
i386                 randconfig-a014-20200511
i386                 randconfig-a011-20200511
i386                 randconfig-a013-20200511
i386                 randconfig-a015-20200511
x86_64               randconfig-a005-20200511
x86_64               randconfig-a003-20200511
x86_64               randconfig-a006-20200511
x86_64               randconfig-a004-20200511
x86_64               randconfig-a001-20200511
x86_64               randconfig-a002-20200511
riscv                            allyesconfig
riscv                             allnoconfig
riscv                               defconfig
riscv                            allmodconfig
s390                             allyesconfig
s390                              allnoconfig
s390                             allmodconfig
s390                                defconfig
sparc64                             defconfig
sparc64                           allnoconfig
sparc64                          allyesconfig
sparc64                          allmodconfig
sparc                               defconfig
x86_64                                   rhel
x86_64                               rhel-7.6
x86_64                    rhel-7.6-kselftests
x86_64                         rhel-7.2-clear
x86_64                                    lkp
x86_64                              fedora-25
x86_64                                  kexec

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply

* [powerpc:fixes-test] BUILD SUCCESS 25107dc0610d8687ebbf4fc56323babf87149cb4
From: kbuild test robot @ 2020-05-11  8:24 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git  fixes-test
branch HEAD: 25107dc0610d8687ebbf4fc56323babf87149cb4  powerpc/64s/kuap: Restore AMR in fast_interrupt_return

elapsed time: 8440m

configs tested: 111
configs skipped: 81

The following configs have been built successfully.
More configs may be tested in the coming days.

arm64                            allyesconfig
arm                              allyesconfig
arm64                            allmodconfig
arm                              allmodconfig
arm64                             allnoconfig
arm                               allnoconfig
arm                                 defconfig
arm64                               defconfig
sparc                            allyesconfig
m68k                             allyesconfig
mips                             allmodconfig
m68k                           sun3_defconfig
riscv                               defconfig
xtensa                           allyesconfig
nds32                             allnoconfig
powerpc                           allnoconfig
parisc                           allyesconfig
i386                              allnoconfig
i386                             allyesconfig
i386                                defconfig
i386                              debian-10.3
ia64                             allmodconfig
ia64                                defconfig
ia64                              allnoconfig
ia64                             allyesconfig
m68k                             allmodconfig
m68k                              allnoconfig
m68k                                defconfig
nios2                               defconfig
nios2                            allyesconfig
openrisc                            defconfig
c6x                              allyesconfig
c6x                               allnoconfig
openrisc                         allyesconfig
nds32                               defconfig
csky                             allyesconfig
csky                                defconfig
alpha                               defconfig
alpha                            allyesconfig
h8300                            allyesconfig
h8300                            allmodconfig
xtensa                              defconfig
arc                                 defconfig
arc                              allyesconfig
sh                               allmodconfig
sh                                allnoconfig
microblaze                        allnoconfig
mips                             allyesconfig
mips                              allnoconfig
parisc                            allnoconfig
parisc                           allmodconfig
parisc                              defconfig
powerpc                          allyesconfig
powerpc                          allmodconfig
powerpc                             defconfig
powerpc                          rhel-kconfig
i386                 randconfig-a006-20200511
i386                 randconfig-a005-20200511
i386                 randconfig-a003-20200511
i386                 randconfig-a001-20200511
i386                 randconfig-a004-20200511
i386                 randconfig-a002-20200511
x86_64               randconfig-a015-20200507
x86_64               randconfig-a014-20200507
x86_64               randconfig-a012-20200507
x86_64               randconfig-a013-20200507
x86_64               randconfig-a011-20200507
x86_64               randconfig-a016-20200507
i386                 randconfig-a012-20200511
i386                 randconfig-a016-20200511
i386                 randconfig-a014-20200511
i386                 randconfig-a011-20200511
i386                 randconfig-a013-20200511
i386                 randconfig-a015-20200511
i386                 randconfig-a012-20200510
i386                 randconfig-a016-20200510
i386                 randconfig-a014-20200510
i386                 randconfig-a011-20200510
i386                 randconfig-a013-20200510
i386                 randconfig-a015-20200510
x86_64               randconfig-a003-20200505
x86_64               randconfig-a001-20200505
i386                 randconfig-a001-20200505
i386                 randconfig-a003-20200505
i386                 randconfig-a002-20200505
x86_64               randconfig-a005-20200511
x86_64               randconfig-a003-20200511
x86_64               randconfig-a006-20200511
x86_64               randconfig-a004-20200511
x86_64               randconfig-a001-20200511
x86_64               randconfig-a002-20200511
riscv                            allyesconfig
riscv                             allnoconfig
riscv                            allmodconfig
s390                             allyesconfig
s390                              allnoconfig
s390                             allmodconfig
s390                                defconfig
sparc                               defconfig
sparc64                             defconfig
sparc64                           allnoconfig
sparc64                          allyesconfig
sparc64                          allmodconfig
um                                  defconfig
x86_64                                   rhel
x86_64                               rhel-7.6
x86_64                    rhel-7.6-kselftests
x86_64                         rhel-7.2-clear
x86_64                                    lkp
x86_64                              fedora-25
x86_64                                  kexec

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply

* Re: [PATCH 02/31] arm64: fix the flush_icache_range arguments in machine_kexec
From: Will Deacon @ 2020-05-11  7:51 UTC (permalink / raw)
  To: Christoph Hellwig, james.morse, catalin.marinas
  Cc: linux-ia64, linux-sh, Roman Zippel, linux-mips, linux-mm,
	sparclinux, linux-riscv, linux-arch, linux-c6x-dev, linux-hexagon,
	x86, linux-xtensa, Arnd Bergmann, Jessica Yu, linux-um,
	linux-m68k, openrisc, linux-arm-kernel, Michal Simek,
	linux-kernel, linux-alpha, linux-fsdevel, Andrew Morton,
	linuxppc-dev
In-Reply-To: <20200510075510.987823-3-hch@lst.de>

[+James and Catalin]

On Sun, May 10, 2020 at 09:54:41AM +0200, Christoph Hellwig wrote:
> The second argument is the end "pointer", not the length.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  arch/arm64/kernel/machine_kexec.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index 8e9c924423b4e..a0b144cfaea71 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -177,6 +177,7 @@ void machine_kexec(struct kimage *kimage)
>  	 * the offline CPUs. Therefore, we must use the __* variant here.
>  	 */
>  	__flush_icache_range((uintptr_t)reboot_code_buffer,
> +			     (uintptr_t)reboot_code_buffer +
>  			     arm64_relocate_new_kernel_size);

Urgh, well spotted. It's annoyingly different from __flush_dcache_area().

But now I'm wondering what this code actually does... the loop condition
in invalidate_icache_by_line works with 64-bit arithmetic, so we could
spend a /very/ long time here afaict. It's also a bit annoying that we
do a bunch of redundant D-cache maintenance too.

Should we use invalidate_icache_range() here instead? (and why does that
thing need to toggle uaccess)? Argh, too many questions!

Will
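
For illustration only, the start/end versus start/length mix-up discussed
above can be sketched in user-space C. The names flush_by_range() and
flush_by_area() and the 64-byte line size are hypothetical stand-ins, not
the arm64 kernel routines; the functions only count the cache lines a
simple "cur < end" loop would visit. The real assembly loop may behave
differently when the end is below the start.

```c
#include <stdint.h>
#include <stddef.h>

#define LINE_SIZE 64u /* assumed cache-line size, for illustration only */

/*
 * Hypothetical stand-in for a (start, end) style routine:
 * counts every line touched in [start, end).
 */
static size_t flush_by_range(uintptr_t start, uintptr_t end)
{
	size_t lines = 0;
	uintptr_t cur;

	/* Align down to the line containing 'start', then walk to 'end'. */
	for (cur = start & ~(uintptr_t)(LINE_SIZE - 1); cur < end; cur += LINE_SIZE)
		lines++;
	return lines;
}

/*
 * Hypothetical stand-in for a (start, size) style routine:
 * converts the size into an end address before walking.
 */
static size_t flush_by_area(uintptr_t start, size_t size)
{
	return flush_by_range(start, start + size);
}
```

With a 256-byte buffer at 0x10000, flush_by_range(0x10000, 0x10000 + 256)
covers four lines, whereas passing the length as the second argument,
flush_by_range(0x10000, 256), covers none in this sketch because the "end"
lies below the start.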

^ permalink raw reply

* Re: sort out the flush_icache_range mess
From: Geert Uytterhoeven @ 2020-05-11  7:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-ia64@vger.kernel.org, Linux-sh list, Roman Zippel,
	open list:BROADCOM NVRAM DRIVER, Linux MM, sparclinux,
	linux-riscv, Linux-Arch, linux-c6x-dev,
	open list:QUALCOMM HEXAGON..., the arch/x86 maintainers,
	open list:TENSILICA XTENSA PORT (xtensa), Arnd Bergmann,
	Jessica Yu, linux-um, linux-m68k, Openrisc, Linux ARM,
	Michal Simek, Linux Kernel Mailing List, alpha, Linux FS Devel,
	Andrew Morton, linuxppc-dev
In-Reply-To: <20200510075510.987823-1-hch@lst.de>

Hi Christoph,

On Sun, May 10, 2020 at 9:55 AM Christoph Hellwig <hch@lst.de> wrote:
> none of which really are used by a typical MMU enabled kernel, as a.out can
> only be built for alpha and m68k to start with.

Quoting myself:
"I think it's safe to assume no one still runs a.out binaries on m68k."
http://lore.kernel.org/r/CAMuHMdW+m0Q+j3rsQdMXnrEPm+XB5Y2AQrxW5sD1mZAKgmEqoA@mail.gmail.com

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
From: Greg KH @ 2020-05-11  7:41 UTC (permalink / raw)
  To: rananta; +Cc: andrew, linuxppc-dev, linux-kernel, jslaby
In-Reply-To: <a033c31f8d8bf121e2cfdabbca138c1a@codeaurora.org>

On Mon, May 11, 2020 at 12:34:44AM -0700, rananta@codeaurora.org wrote:
> On 2020-05-11 00:23, rananta@codeaurora.org wrote:
> > On 2020-05-09 23:48, Greg KH wrote:
> > > On Sat, May 09, 2020 at 06:30:56PM -0700, rananta@codeaurora.org
> > > wrote:
> > > > On 2020-05-06 02:48, Greg KH wrote:
> > > > > On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
> > > > > > > Potentially, hvc_open() can be called in parallel when two tasks call
> > > > > > open() on /dev/hvcX. In such a scenario, if the
> > > > > > hp->ops->notifier_add()
> > > > > > callback in the function fails, where it sets the tty->driver_data to
> > > > > > NULL, the parallel hvc_open() can see this NULL and cause a memory
> > > > > > abort.
> > > > > > Hence, serialize hvc_open and check if tty->private_data is NULL
> > > > > > before
> > > > > > proceeding ahead.
> > > > > >
> > > > > > The issue can be easily reproduced by launching two tasks
> > > > > > simultaneously
> > > > > > > that do nothing but open() and close() on /dev/hvcX.
> > > > > > For example:
> > > > > > $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
> > > > > >
> > > > > > Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
> > > > > > ---
> > > > > >  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
> > > > > >  1 file changed, 14 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/tty/hvc/hvc_console.c
> > > > > > b/drivers/tty/hvc/hvc_console.c
> > > > > > index 436cc51c92c3..ebe26fe5ac09 100644
> > > > > > --- a/drivers/tty/hvc/hvc_console.c
> > > > > > +++ b/drivers/tty/hvc/hvc_console.c
> > > > > > @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
> > > > > >   */
> > > > > >  static DEFINE_MUTEX(hvc_structs_mutex);
> > > > > >
> > > > > > +/* Mutex to serialize hvc_open */
> > > > > > +static DEFINE_MUTEX(hvc_open_mutex);
> > > > > >  /*
> > > > > >   * This value is used to assign a tty->index value to a hvc_struct
> > > > > > based
> > > > > >   * upon order of exposure via hvc_probe(), when we can not match it
> > > > > > to
> > > > > > @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
> > > > > > *driver, struct tty_struct *tty)
> > > > > >   */
> > > > > >  static int hvc_open(struct tty_struct *tty, struct file * filp)
> > > > > >  {
> > > > > > -	struct hvc_struct *hp = tty->driver_data;
> > > > > > +	struct hvc_struct *hp;
> > > > > >  	unsigned long flags;
> > > > > >  	int rc = 0;
> > > > > >
> > > > > > +	mutex_lock(&hvc_open_mutex);
> > > > > > +
> > > > > > +	hp = tty->driver_data;
> > > > > > +	if (!hp) {
> > > > > > +		rc = -EIO;
> > > > > > +		goto out;
> > > > > > +	}
> > > > > > +
> > > > > >  	spin_lock_irqsave(&hp->port.lock, flags);
> > > > > >  	/* Check and then increment for fast path open. */
> > > > > >  	if (hp->port.count++ > 0) {
> > > > > >  		spin_unlock_irqrestore(&hp->port.lock, flags);
> > > > > >  		hvc_kick();
> > > > > > -		return 0;
> > > > > > +		goto out;
> > > > > >  	} /* else count == 0 */
> > > > > >  	spin_unlock_irqrestore(&hp->port.lock, flags);
> > > > >
> > > > > Wait, why isn't this driver just calling tty_port_open() instead of
> > > > > trying to open-code all of this?
> > > > >
> > > > > > Keeping a single mutex for open will not protect it from close, it will
> > > > > just slow things down a bit.  There should already be a tty lock held by
> > > > > the tty core for open() to keep it from racing things, right?
> > > > The tty lock should have been held, but not likely across
> > > > ->install() and
> > > > ->open() callbacks, thus resulting in a race between
> > > > hvc_install() and
> > > > hvc_open(),
> > > 
> > > How?  The tty lock is held in install, and should not conflict with
> > > open(), otherwise, we would be seeing this happen in all tty drivers,
> > > right?
> > > 
> > Well, I was expecting the same, but IIRC, I see that the open() was
> > being
> > called in parallel for the same device node.
> > 
> > Is it expected that the tty core would allow only one thread to
> > access the dev-node, while blocking the other, or is it the client
> > driver's responsibility to handle the exclusiveness?
> Or is there any optimization going on where the second call doesn't go
> through
> install(), but calls open() directly as the file was already opened by the
> first
> thread?

Yes, it should only happen once, look at the logic in tty_kopen().

greg k-h

^ permalink raw reply

* Re: [PATCH 31/31] module: move the set_fs hack for flush_icache_range to m68k
From: Geert Uytterhoeven @ 2020-05-11  7:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-ia64@vger.kernel.org, Linux-sh list, Roman Zippel,
	open list:BROADCOM NVRAM DRIVER, Linux MM, sparclinux,
	linux-riscv, Linux-Arch, linux-c6x-dev,
	open list:QUALCOMM HEXAGON..., the arch/x86 maintainers,
	open list:TENSILICA XTENSA PORT (xtensa), Arnd Bergmann,
	Jessica Yu, linux-um, linux-m68k, Openrisc, Linux ARM,
	Michal Simek, Linux Kernel Mailing List, alpha, Linux FS Devel,
	Andrew Morton, linuxppc-dev
In-Reply-To: <20200510075510.987823-32-hch@lst.de>

On Sun, May 10, 2020 at 9:57 AM Christoph Hellwig <hch@lst.de> wrote:
>
> flush_icache_range generally operates on kernel addresses, but for some
> reason m68k needed a set_fs override.  Move that into the m68k code
> instead of keeping it in the module loader.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
From: Greg KH @ 2020-05-11  7:39 UTC (permalink / raw)
  To: rananta; +Cc: andrew, linuxppc-dev, linux-kernel, jslaby
In-Reply-To: <77d889be4e0cb0e6e30f96199e2d843d@codeaurora.org>

On Mon, May 11, 2020 at 12:23:58AM -0700, rananta@codeaurora.org wrote:
> On 2020-05-09 23:48, Greg KH wrote:
> > On Sat, May 09, 2020 at 06:30:56PM -0700, rananta@codeaurora.org wrote:
> > > On 2020-05-06 02:48, Greg KH wrote:
> > > > On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
> > > > > > Potentially, hvc_open() can be called in parallel when two tasks call
> > > > > open() on /dev/hvcX. In such a scenario, if the
> > > > > hp->ops->notifier_add()
> > > > > callback in the function fails, where it sets the tty->driver_data to
> > > > > NULL, the parallel hvc_open() can see this NULL and cause a memory
> > > > > abort.
> > > > > Hence, serialize hvc_open and check if tty->private_data is NULL
> > > > > before
> > > > > proceeding ahead.
> > > > >
> > > > > The issue can be easily reproduced by launching two tasks
> > > > > simultaneously
> > > > > > that do nothing but open() and close() on /dev/hvcX.
> > > > > For example:
> > > > > $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
> > > > >
> > > > > Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
> > > > > ---
> > > > >  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
> > > > >  1 file changed, 14 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/tty/hvc/hvc_console.c
> > > > > b/drivers/tty/hvc/hvc_console.c
> > > > > index 436cc51c92c3..ebe26fe5ac09 100644
> > > > > --- a/drivers/tty/hvc/hvc_console.c
> > > > > +++ b/drivers/tty/hvc/hvc_console.c
> > > > > @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
> > > > >   */
> > > > >  static DEFINE_MUTEX(hvc_structs_mutex);
> > > > >
> > > > > +/* Mutex to serialize hvc_open */
> > > > > +static DEFINE_MUTEX(hvc_open_mutex);
> > > > >  /*
> > > > >   * This value is used to assign a tty->index value to a hvc_struct
> > > > > based
> > > > >   * upon order of exposure via hvc_probe(), when we can not match it
> > > > > to
> > > > > @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
> > > > > *driver, struct tty_struct *tty)
> > > > >   */
> > > > >  static int hvc_open(struct tty_struct *tty, struct file * filp)
> > > > >  {
> > > > > -	struct hvc_struct *hp = tty->driver_data;
> > > > > +	struct hvc_struct *hp;
> > > > >  	unsigned long flags;
> > > > >  	int rc = 0;
> > > > >
> > > > > +	mutex_lock(&hvc_open_mutex);
> > > > > +
> > > > > +	hp = tty->driver_data;
> > > > > +	if (!hp) {
> > > > > +		rc = -EIO;
> > > > > +		goto out;
> > > > > +	}
> > > > > +
> > > > >  	spin_lock_irqsave(&hp->port.lock, flags);
> > > > >  	/* Check and then increment for fast path open. */
> > > > >  	if (hp->port.count++ > 0) {
> > > > >  		spin_unlock_irqrestore(&hp->port.lock, flags);
> > > > >  		hvc_kick();
> > > > > -		return 0;
> > > > > +		goto out;
> > > > >  	} /* else count == 0 */
> > > > >  	spin_unlock_irqrestore(&hp->port.lock, flags);
> > > >
> > > > Wait, why isn't this driver just calling tty_port_open() instead of
> > > > trying to open-code all of this?
> > > >
> > > > > Keeping a single mutex for open will not protect it from close, it will
> > > > just slow things down a bit.  There should already be a tty lock held by
> > > > the tty core for open() to keep it from racing things, right?
> > > The tty lock should have been held, but not likely across
> > > ->install() and
> > > ->open() callbacks, thus resulting in a race between hvc_install() and
> > > hvc_open(),
> > 
> > How?  The tty lock is held in install, and should not conflict with
> > open(), otherwise, we would be seeing this happen in all tty drivers,
> > right?
> > 
> Well, I was expecting the same, but IIRC, I see that the open() was being
> called in parallel for the same device node.

So open and install are happening at the same time?  And the tty_lock()
does not properly protect the needed fields?  If not, what fields are
being touched without the lock?

> Is it expected that the tty core would allow only one thread to
> access the dev-node, while blocking the other, or is it the client
> driver's responsibility to handle the exclusiveness?

The tty core should handle this correctly, for things that can mess
stuff up (like install and open at the same time).  A driver should not
have to worry about that.

> > > where hvc_install() sets a data and the hvc_open() clears it.
> > > hvc_open()
> > > doesn't
> > > check if the data was set to NULL and proceeds.
> > 
> > What data is being set that hvc_open is checking?
> hvc_install sets tty->private_data to hp, while hvc_open sets it to NULL (in
> one of the paths).

I see no use of private_data in drivers/tty/hvc/ so what exactly are you
referring to?  The file private_data or the port private_data or
something else?

> > And you are not grabbing a lock in your install callback, you are only
> > serializing your open call here, I don't see how this is fixing anything
> > other than perhaps slowing down your codepaths.
> Basically, my intention was to add a NULL check before accessing *hp in
> open().
> The intention of the lock was to protect against this check.
> If the tty layer would have taken care of this, then perhaps there won't be
> a
> need to check for NULL.

Ah, driver_data is what you are referring to, not private_data.

Look at hvc_close(), no locking is done there to test for driver_data,
right?  Why not?  The only thing setting driver_data is in install, and
your lock is not touching that.

And again, install and open should not race, if so, the tty core needs
to be fixed.

> > As an argument why this isn't correct, can you answer why this same type
> > of change wouldn't be required for all tty drivers in the tree?
> > 
> I agree, that if it's already taken care by the tty-core, we don't need it
> here.
> Correct me if I'm wrong, but looks like the tty layer is allowing parallel
> accesses
> to open(),

I do not think that happens, try counting the calls to open(), there
should only be one.  If not, that's a bug somewhere else.

thanks,

greg k-h
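
The simple_open_close program named in the quoted patch description is not
included in this thread. A minimal sketch of what such a reproducer might
look like follows, assuming it does nothing more than loop over open() and
close() on the given path. The function name open_close_loop() and its
iteration parameter are hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open and close 'path' 'iterations' times.
 * Returns the number of iterations in which open() or close() failed.
 */
static int open_close_loop(const char *path, int iterations)
{
	int failures = 0;
	int i;

	for (i = 0; i < iterations; i++) {
		int fd = open(path, O_RDWR);

		if (fd < 0) {
			/* open() failed; count it and try again. */
			failures++;
			continue;
		}
		if (close(fd) < 0)
			failures++;
	}
	return failures;
}
```

Calling this from a main() with argv[1] as the path, and starting two
instances in the background against /dev/hvc0 as in the quoted command
line, would approximate the described test.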

^ permalink raw reply

* Re: [PATCH 26/31] m68k: implement flush_icache_user_range
From: Geert Uytterhoeven @ 2020-05-11  7:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-ia64@vger.kernel.org, Linux-sh list, Roman Zippel,
	open list:BROADCOM NVRAM DRIVER, Linux MM, sparclinux,
	linux-riscv, Linux-Arch, linux-c6x-dev,
	open list:QUALCOMM HEXAGON..., the arch/x86 maintainers,
	open list:TENSILICA XTENSA PORT (xtensa), Arnd Bergmann,
	Jessica Yu, linux-um, linux-m68k, Openrisc, Linux ARM,
	Michal Simek, Linux Kernel Mailing List, alpha, Linux FS Devel,
	Andrew Morton, linuxppc-dev
In-Reply-To: <20200510075510.987823-27-hch@lst.de>

On Sun, May 10, 2020 at 9:57 AM Christoph Hellwig <hch@lst.de> wrote:
> Rename the current flush_icache_range to flush_icache_user_range as
> per commit ae92ef8a4424 ("PATCH] flush icache in correct context") there
> seems to be an assumption that it operates on user addresses.  Add a
> flush_icache_range around it that for now is a no-op.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH 21/31] mm: rename flush_icache_user_range to flush_icache_user_page
From: Geert Uytterhoeven @ 2020-05-11  7:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-ia64@vger.kernel.org, Linux-sh list, Roman Zippel,
	open list:BROADCOM NVRAM DRIVER, Linux MM, sparclinux,
	linux-riscv, Linux-Arch, linux-c6x-dev,
	open list:QUALCOMM HEXAGON..., the arch/x86 maintainers,
	open list:TENSILICA XTENSA PORT (xtensa), Arnd Bergmann,
	Jessica Yu, linux-um, linux-m68k, Openrisc, Linux ARM,
	Michal Simek, Linux Kernel Mailing List, alpha, Linux FS Devel,
	Andrew Morton, linuxppc-dev
In-Reply-To: <20200510075510.987823-22-hch@lst.de>

On Sun, May 10, 2020 at 9:57 AM Christoph Hellwig <hch@lst.de> wrote:
> The function currently known as flush_icache_user_range only operates
> on a single page.  Rename it to flush_icache_user_page as we'll need
> the name flush_icache_user_range for something else soon.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

>  arch/m68k/include/asm/cacheflush_mm.h  |  4 ++--
>  arch/m68k/mm/cache.c                   |  2 +-

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert



* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
From: rananta @ 2020-05-11  7:34 UTC (permalink / raw)
  To: Greg KH; +Cc: andrew, linuxppc-dev, linux-kernel, jslaby
In-Reply-To: <77d889be4e0cb0e6e30f96199e2d843d@codeaurora.org>

On 2020-05-11 00:23, rananta@codeaurora.org wrote:
> On 2020-05-09 23:48, Greg KH wrote:
>> On Sat, May 09, 2020 at 06:30:56PM -0700, rananta@codeaurora.org wrote:
>>> On 2020-05-06 02:48, Greg KH wrote:
>>> > On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
>>> > > Potentially, hvc_open() can be called in parallel when two tasks call
>>> > > open() on /dev/hvcX. In such a scenario, if the hp->ops->notifier_add()
>>> > > callback in the function fails, where it sets the tty->driver_data to
>>> > > NULL, the parallel hvc_open() can see this NULL and cause a memory abort.
>>> > > Hence, serialize hvc_open and check if tty->driver_data is NULL before
>>> > > proceeding ahead.
>>> > >
>>> > > The issue can be easily reproduced by launching two tasks simultaneously
>>> > > that do nothing but open() and close() on /dev/hvcX.
>>> > > For example:
>>> > > $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
>>> > >
>>> > > Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
>>> > > ---
>>> > >  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
>>> > >  1 file changed, 14 insertions(+), 2 deletions(-)
>>> > >
>>> > > diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
>>> > > index 436cc51c92c3..ebe26fe5ac09 100644
>>> > > --- a/drivers/tty/hvc/hvc_console.c
>>> > > +++ b/drivers/tty/hvc/hvc_console.c
>>> > > @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
>>> > >   */
>>> > >  static DEFINE_MUTEX(hvc_structs_mutex);
>>> > >
>>> > > +/* Mutex to serialize hvc_open */
>>> > > +static DEFINE_MUTEX(hvc_open_mutex);
>>> > >  /*
>>> > >   * This value is used to assign a tty->index value to a hvc_struct based
>>> > >   * upon order of exposure via hvc_probe(), when we can not match it to
>>> > > @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver *driver, struct tty_struct *tty)
>>> > >   */
>>> > >  static int hvc_open(struct tty_struct *tty, struct file * filp)
>>> > >  {
>>> > > -	struct hvc_struct *hp = tty->driver_data;
>>> > > +	struct hvc_struct *hp;
>>> > >  	unsigned long flags;
>>> > >  	int rc = 0;
>>> > >
>>> > > +	mutex_lock(&hvc_open_mutex);
>>> > > +
>>> > > +	hp = tty->driver_data;
>>> > > +	if (!hp) {
>>> > > +		rc = -EIO;
>>> > > +		goto out;
>>> > > +	}
>>> > > +
>>> > >  	spin_lock_irqsave(&hp->port.lock, flags);
>>> > >  	/* Check and then increment for fast path open. */
>>> > >  	if (hp->port.count++ > 0) {
>>> > >  		spin_unlock_irqrestore(&hp->port.lock, flags);
>>> > >  		hvc_kick();
>>> > > -		return 0;
>>> > > +		goto out;
>>> > >  	} /* else count == 0 */
>>> > >  	spin_unlock_irqrestore(&hp->port.lock, flags);
>>> >
>>> > Wait, why isn't this driver just calling tty_port_open() instead of
>>> > trying to open-code all of this?
>>> >
>>> > Keeping a single mutex for open will not protect it from close, it will
>>> > just slow things down a bit.  There should already be a tty lock held by
>>> > the tty core for open() to keep it from racing things, right?
>>> The tty lock should have been held, but not likely across ->install() and
>>> ->open() callbacks, thus resulting in a race between hvc_install() and
>>> hvc_open(),
>> 
>> How?  The tty lock is held in install, and should not conflict with
>> open(), otherwise, we would be seeing this happen in all tty drivers,
>> right?
>> 
> Well, I was expecting the same, but IIRC, I see that the open() was being
> called in parallel for the same device node.
> 
> Is it expected that the tty core would allow only one thread to
> access the dev-node, while blocking the other, or is it the client
> driver's responsibility to handle the exclusiveness?
Or is there any optimization going on where the second call doesn't go
through install(), but calls open() directly, as the file was already
opened by the first thread?
>>> where hvc_install() sets a data and the hvc_open() clears it. hvc_open()
>>> doesn't check if the data was set to NULL and proceeds.
>> 
>> What data is being set that hvc_open is checking?
> hvc_install sets tty->driver_data to hp, while hvc_open sets it to
> NULL (in one of the paths).
>> 
>> And you are not grabbing a lock in your install callback, you are only
>> serializing your open call here, I don't see how this is fixing anything
>> other than perhaps slowing down your codepaths.
> Basically, my intention was to add a NULL check before accessing *hp in
> open(). The lock was intended to protect this check.
> If the tty layer had taken care of this, then perhaps there would be no
> need to check for NULL.
>> 
>> As an argument why this isn't correct, can you answer why this same type
>> of change wouldn't be required for all tty drivers in the tree?
>> 
> I agree that if it's already taken care of by the tty core, we don't need
> it here.
> Correct me if I'm wrong, but it looks like the tty layer is allowing
> parallel accesses to open().
>> thanks,
>> 
>> greg k-h


* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
From: rananta @ 2020-05-11  7:23 UTC (permalink / raw)
  To: Greg KH; +Cc: andrew, linuxppc-dev, linux-kernel, jslaby
In-Reply-To: <20200510064819.GB3400311@kroah.com>

On 2020-05-09 23:48, Greg KH wrote:
> On Sat, May 09, 2020 at 06:30:56PM -0700, rananta@codeaurora.org wrote:
>> On 2020-05-06 02:48, Greg KH wrote:
>> > On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
>> > > Potentially, hvc_open() can be called in parallel when two tasks call
>> > > open() on /dev/hvcX. In such a scenario, if the hp->ops->notifier_add()
>> > > callback in the function fails, where it sets the tty->driver_data to
>> > > NULL, the parallel hvc_open() can see this NULL and cause a memory abort.
>> > > Hence, serialize hvc_open and check if tty->driver_data is NULL before
>> > > proceeding ahead.
>> > >
>> > > The issue can be easily reproduced by launching two tasks simultaneously
>> > > that do nothing but open() and close() on /dev/hvcX.
>> > > For example:
>> > > $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
>> > >
>> > > Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
>> > > ---
>> > >  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
>> > >  1 file changed, 14 insertions(+), 2 deletions(-)
>> > >
>> > > diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
>> > > index 436cc51c92c3..ebe26fe5ac09 100644
>> > > --- a/drivers/tty/hvc/hvc_console.c
>> > > +++ b/drivers/tty/hvc/hvc_console.c
>> > > @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
>> > >   */
>> > >  static DEFINE_MUTEX(hvc_structs_mutex);
>> > >
>> > > +/* Mutex to serialize hvc_open */
>> > > +static DEFINE_MUTEX(hvc_open_mutex);
>> > >  /*
>> > >   * This value is used to assign a tty->index value to a hvc_struct based
>> > >   * upon order of exposure via hvc_probe(), when we can not match it to
>> > > @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver *driver, struct tty_struct *tty)
>> > >   */
>> > >  static int hvc_open(struct tty_struct *tty, struct file * filp)
>> > >  {
>> > > -	struct hvc_struct *hp = tty->driver_data;
>> > > +	struct hvc_struct *hp;
>> > >  	unsigned long flags;
>> > >  	int rc = 0;
>> > >
>> > > +	mutex_lock(&hvc_open_mutex);
>> > > +
>> > > +	hp = tty->driver_data;
>> > > +	if (!hp) {
>> > > +		rc = -EIO;
>> > > +		goto out;
>> > > +	}
>> > > +
>> > >  	spin_lock_irqsave(&hp->port.lock, flags);
>> > >  	/* Check and then increment for fast path open. */
>> > >  	if (hp->port.count++ > 0) {
>> > >  		spin_unlock_irqrestore(&hp->port.lock, flags);
>> > >  		hvc_kick();
>> > > -		return 0;
>> > > +		goto out;
>> > >  	} /* else count == 0 */
>> > >  	spin_unlock_irqrestore(&hp->port.lock, flags);
>> >
>> > Wait, why isn't this driver just calling tty_port_open() instead of
>> > trying to open-code all of this?
>> >
>> > Keeping a single mutex for open will not protect it from close, it will
>> > just slow things down a bit.  There should already be a tty lock held by
>> > the tty core for open() to keep it from racing things, right?
>> The tty lock should have been held, but not likely across ->install() and
>> ->open() callbacks, thus resulting in a race between hvc_install() and
>> hvc_open(),
> 
> How?  The tty lock is held in install, and should not conflict with
> open(), otherwise, we would be seeing this happen in all tty drivers,
> right?
> 
Well, I was expecting the same, but IIRC, I see that the open() was being
called in parallel for the same device node.

Is it expected that the tty core would allow only one thread to
access the dev-node, while blocking the other, or is it the client
driver's responsibility to handle the exclusiveness?
>> where hvc_install() sets a data and the hvc_open() clears it. hvc_open()
>> doesn't check if the data was set to NULL and proceeds.
> 
> What data is being set that hvc_open is checking?
hvc_install sets tty->driver_data to hp, while hvc_open sets it to NULL
(in one of the paths).
> 
> And you are not grabbing a lock in your install callback, you are only
> serializing your open call here, I don't see how this is fixing anything
> other than perhaps slowing down your codepaths.
Basically, my intention was to add a NULL check before accessing *hp in
open(). The lock was intended to protect this check.
If the tty layer had taken care of this, then perhaps there would be no
need to check for NULL.
> 
> As an argument why this isn't correct, can you answer why this same type
> of change wouldn't be required for all tty drivers in the tree?
> 
I agree that if it's already taken care of by the tty core, we don't need
it here.
Correct me if I'm wrong, but it looks like the tty layer is allowing
parallel accesses to open().
> thanks,
> 
> greg k-h
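
For reference, the reproducer named in the patch description could look
roughly like the sketch below. The source of simple_open_close was not
posted in this thread, so the function name, the open flags and the
iteration count are all assumptions made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for simple_open_close: open and close `path`
 * `iterations` times.  Returns the number of open() calls that succeeded.
 */
static int open_close_loop(const char *path, int iterations)
{
	int ok = 0;

	assert(path != NULL);

	for (int i = 0; i < iterations; i++) {
		int fd = open(path, O_RDWR | O_NOCTTY);

		if (fd < 0)
			continue;	/* open failed, try again */
		close(fd);
		ok++;
	}
	return ok;
}
```

Running two such loops at the same time against the same /dev/hvcX node,
as in the command line quoted above, is what the patch description says
triggers the abort.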


* Re: [PATCH RFC 3/4] powerpc/microwatt: Add early debug UART support for Microwatt
From: Segher Boessenkool @ 2020-05-11  7:07 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev, Michael Neuling, Benjamin Herrenschmidt
In-Reply-To: <20200509050340.GD1464954@thinks.paulus.ozlabs.org>

Hi!

On Sat, May 09, 2020 at 03:03:40PM +1000, Paul Mackerras wrote:
> +	__asm__ volatile("mtmsrd %3,0; ldcix %0,%1,%2; mtmsrd %4,0"
> +			 : "=r" (val) : "b" (potato_uart_base), "r" (offset),
> +			   "r" (msr & ~MSR_DR), "r" (msr));

That should be  "=&r"(val)  (an earlyclobber), because when %0 is
written, %4 will still be used later.
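
As a small standalone illustration of the same rule (on x86-64 rather
than the PowerPC sequence in the patch, and with a hypothetical helper
name), the output below is written by the first instruction while an
input is still read by the following ones, so it needs "=&r":

```c
#include <assert.h>

/*
 * Hypothetical helper: computes a + 2*b.  The output register is written
 * first and input %2 is read afterwards, so without the earlyclobber "&"
 * the compiler would be free to put `out` and `b` in the same register.
 */
static inline long add_twice(long a, long b)
{
#if defined(__x86_64__) && defined(__GNUC__)
	long out;

	__asm__("mov %1, %0\n\t"	/* out = a: output written early */
		"add %2, %0\n\t"	/* out += b: input still needed */
		"add %2, %0"		/* out += b */
		: "=&r" (out)
		: "r" (a), "r" (b)
		: "cc");
	return out;
#else
	/* Plain C fallback so the sketch builds on other targets. */
	return a + 2 * b;
#endif
}

/* Simple self-check of the helper above. */
static inline void add_twice_selfcheck(void)
{
	assert(add_twice(3, 4) == 11);
}
```

The same reasoning applies to the quoted asm: %0 is written by ldcix
before the final mtmsrd reads %4.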

Looks fine otherwise.

Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>


Segher


