LinuxPPC-Dev Archive on lore.kernel.org
* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
From: Edgecombe, Rick P @ 2020-10-26  0:38 UTC
  To: rppt@kernel.org, akpm@linux-foundation.org
  Cc: david@redhat.com, peterz@infradead.org, catalin.marinas@arm.com,
	dave.hansen@linux.intel.com, linux-mm@kvack.org, paulus@samba.org,
	pavel@ucw.cz, hpa@zytor.com, sparclinux@vger.kernel.org,
	cl@linux.com, will@kernel.org, linux-riscv@lists.infradead.org,
	linux-s390@vger.kernel.org, x86@kernel.org, rppt@linux.ibm.com,
	borntraeger@de.ibm.com, mingo@redhat.com, rientjes@google.com,
	Brown, Len, aou@eecs.berkeley.edu, gor@linux.ibm.com,
	linux-pm@vger.kernel.org, hca@linux.ibm.com, bp@alien8.de,
	luto@kernel.org, paul.walmsley@sifive.com, kirill@shutemov.name,
	tglx@linutronix.de, linux-arm-kernel@lists.infradead.org,
	rjw@rjwysocki.net, linux-kernel@vger.kernel.org,
	penberg@kernel.org, palmer@dabbelt.com, iamjoonsoo.kim@lge.com,
	linuxppc-dev@lists.ozlabs.org, davem@davemloft.net
In-Reply-To: <20201025101555.3057-3-rppt@kernel.org>

On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled, a page may
> not be present in the direct map and has to be explicitly mapped before
> it can be copied.
> 
> On arm64 it is possible that a page would be removed from the direct map
> using set_direct_map_invalid_noflush() but __kernel_map_pages() will refuse
> to map this page back if DEBUG_PAGEALLOC is disabled.

It looks to me like arm64 __kernel_map_pages() will still attempt to
map it if rodata_full is true. How does this happen?

> Explicitly use set_direct_map_{default,invalid}_noflush() for
> ARCH_HAS_SET_DIRECT_MAP case and debug_pagealloc_map_pages() for
> DEBUG_PAGEALLOC case.
> 
> While on that, rename kernel_map_pages() to hibernate_map_page() and drop
> numpages parameter.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---
>  kernel/power/snapshot.c | 29 +++++++++++++++++++----------
>  1 file changed, 19 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index fa499466f645..ecb7b32ce77c 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c
> @@ -76,16 +76,25 @@ static inline void hibernate_restore_protect_page(void *page_address) {}
>  static inline void hibernate_restore_unprotect_page(void *page_address) {}
>  #endif /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
>  
> -#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable)
> +static inline void hibernate_map_page(struct page *page, int enable)
>  {
> -	__kernel_map_pages(page, numpages, enable);
> +	if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
> +		unsigned long addr = (unsigned long)page_address(page);
> +		int ret;
> +
> +		if (enable)
> +			ret = set_direct_map_default_noflush(page);
> +		else
> +			ret = set_direct_map_invalid_noflush(page);
> +
> +		if (WARN_ON(ret))
> +			return;
> +
> +		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +	} else {
> +		debug_pagealloc_map_pages(page, 1, enable);
> +	}
>  }
> -#else
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable) {}
> -#endif
>  
>  static int swsusp_page_is_free(struct page *);
>  static void swsusp_set_page_forbidden(struct page *);
> @@ -1366,9 +1375,9 @@ static void safe_copy_page(void *dst, struct page *s_page)
>  	if (kernel_page_present(s_page)) {
>  		do_copy_page(dst, page_address(s_page));
>  	} else {
> -		kernel_map_pages(s_page, 1, 1);
> +		hibernate_map_page(s_page, 1);
>  		do_copy_page(dst, page_address(s_page));
> -		kernel_map_pages(s_page, 1, 0);
> +		hibernate_map_page(s_page, 0);
>  	}
>  }
>  

If somehow a page was unmapped such that
set_direct_map_default_noflush() would fail, then this code introduces
a WARN, but it will still try to read the unmapped page. Why not just
have the WARNs inside of __kernel_map_pages() if they fail, and then
have a warning for the debug page alloc cases as well? The logic
around both expects them not to fail.
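
A minimal sketch of the concern, written as a userspace model rather than
kernel code. Every model_* name below is a hypothetical stand-in (nothing
here is an existing kernel API); the point is only that if the map step
reports its failure, the caller can skip the copy instead of reading a page
that is still unmapped:

```c
/* Userspace model only: all model_* names are hypothetical stand-ins. */
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16

struct model_page {
	bool present;	/* is the page mapped in the modelled direct map? */
	bool map_fails;	/* force the map operation to fail, for illustration */
	char data[MODEL_PAGE_SIZE];
};

/* Stand-in for set_direct_map_{default,invalid}_noflush(): 0 on success. */
static int model_set_direct_map(struct model_page *page, bool enable)
{
	if (page->map_fails)
		return -1;
	page->present = enable;
	return 0;
}

/* Like hibernate_map_page() in the patch, but the error reaches the caller. */
static int model_hibernate_map_page(struct model_page *page, bool enable)
{
	return model_set_direct_map(page, enable);
}

/* Copy only when the source is known to be mapped; returns 0 on success. */
static int model_safe_copy_page(char *dst, struct model_page *src)
{
	int ret;

	if (src->present) {
		memcpy(dst, src->data, MODEL_PAGE_SIZE);
		return 0;
	}

	ret = model_hibernate_map_page(src, true);
	if (ret)
		return ret;	/* do not touch a page that is still unmapped */

	memcpy(dst, src->data, MODEL_PAGE_SIZE);
	return model_hibernate_map_page(src, false);
}
```

In this model the failure path leaves dst untouched, whereas the patch as
posted would warn and then go on to do_copy_page() anyway.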




* Re: [PATCH 0/3] warn and suppress irqflood
From: Pingfan Liu @ 2020-10-25 11:12 UTC
  To: Thomas Gleixner
  Cc: Maulik Shah, Petr Mladek, Oliver Neukum, Jonathan Corbet,
	Gustavo A. R. Silva, Peter Zijlstra, Marc Zyngier, Linus Walleij,
	Guilherme G. Piccoli, linux-doc, LKML, Lina Iyer, Jisheng Zhang,
	Pawan Gupta, Al Viro, Andrew Morton, afzal mohammed,
	Kexec Mailing List, Mike Kravetz
In-Reply-To: <871rhq7j1h.fsf@nanos.tec.linutronix.de>

On Thu, Oct 22, 2020 at 4:37 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Thu, Oct 22 2020 at 13:56, Pingfan Liu wrote:
> > I hit an irqflood bug on a powerpc platform and, two years ago, on an x86
> > platform. When the bug happens, the kernel is totally occupied by irqs.
> > Currently, there may be nothing, or just a soft lockup warning, shown on
> > the console. It is better to warn users with irq flood info.
> >
> > In the kdump case, the kernel can move on by suppressing the irq flood.
>
> You're curing the symptom not the cause and the cure is just magic and
> can't work reliably.
Yeah, it is magic. But at least it is better to printk something and
alert users about what is happening. With the current code, it may show
nothing when the system hangs.
>
> Where is that irq flood originated from and why is none of the
> mechanisms we have in place to shut it up working?
The bug originates from the tpm_i2c_nuvoton driver, which calls the i2c
bus driver (i2c-opal.c). The bug is triggered after
i2c_opal_send_request().

But things are complicated by a firmware layer: Skiboot. This software
layer hides the details of manipulating the hardware from Linux.

I guess the software logic cannot enter a sane state when the kernel crashes.

Cc the Skiboot and ppc64 communities to see whether anyone has an idea about it.

Thanks,
Pingfan


* [PATCH 4/4] arch, mm: make kernel_page_present() always available
From: Mike Rapoport @ 2020-10-25 10:15 UTC
  To: Andrew Morton
  Cc: David Hildenbrand, Peter Zijlstra, Dave Hansen, linux-mm,
	Paul Mackerras, Pavel Machek, H. Peter Anvin, sparclinux,
	Christoph Lameter, Will Deacon, linux-riscv, linux-s390, x86,
	Mike Rapoport, Christian Borntraeger, Ingo Molnar,
	Catalin Marinas, Len Brown, Albert Ou, Vasily Gorbik, linux-pm,
	Heiko Carstens, David Rientjes, Borislav Petkov, Andy Lutomirski,
	Paul Walmsley, Kirill A. Shutemov, Thomas Gleixner,
	linux-arm-kernel, Rafael J. Wysocki, linux-kernel, Pekka Enberg,
	Palmer Dabbelt, Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev,
	David S. Miller, Mike Rapoport
In-Reply-To: <20201025101555.3057-1-rppt@kernel.org>

From: Mike Rapoport <rppt@linux.ibm.com>

For architectures that enable ARCH_HAS_SET_MEMORY, the ability to verify
that a page is mapped in the kernel direct map can be useful regardless
of hibernation.

Add a RISC-V implementation of kernel_page_present() and update its forward
declarations and stubs to be a part of the set_memory API.
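
As a rough illustration of what such a presence check does, here is a
userspace model with a two-level table instead of the real five-level walk.
The model_* types and the MODEL_* constants are hypothetical and exist only
for this sketch; the walk returns false as soon as any level is missing:

```c
/* Userspace model only: a two-level table, not the kernel page tables. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ENTRIES	4u
#define MODEL_PRESENT	0x1u

struct model_leaf {
	uint32_t pte[MODEL_ENTRIES];		/* bit 0 set means "present" */
};

struct model_top {
	struct model_leaf *leaf[MODEL_ENTRIES];	/* NULL means "level not present" */
};

/* Walk top level, then leaf level; stop at the first missing level. */
static bool model_page_present(const struct model_top *top, unsigned int pfn)
{
	unsigned int top_idx = (pfn / MODEL_ENTRIES) % MODEL_ENTRIES;
	unsigned int leaf_idx = pfn % MODEL_ENTRIES;
	const struct model_leaf *leaf = top->leaf[top_idx];

	if (!leaf)
		return false;

	return (leaf->pte[leaf_idx] & MODEL_PRESENT) != 0;
}
```

The generic stub for architectures without ARCH_HAS_SET_DIRECT_MAP simply
reports every page as present, so callers need no #ifdef around the check.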

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm64/include/asm/cacheflush.h |  1 +
 arch/riscv/include/asm/set_memory.h |  1 +
 arch/riscv/mm/pageattr.c            | 29 +++++++++++++++++++++++++++++
 arch/x86/include/asm/set_memory.h   |  1 +
 arch/x86/mm/pat/set_memory.c        |  2 --
 include/linux/mm.h                  |  7 -------
 include/linux/set_memory.h          |  5 +++++
 7 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index 9384fd8fc13c..45217f21f1fe 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -140,6 +140,7 @@ int set_memory_valid(unsigned long addr, int numpages, int enable);
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 #include <asm-generic/cacheflush.h>
 
diff --git a/arch/riscv/include/asm/set_memory.h b/arch/riscv/include/asm/set_memory.h
index 4c5bae7ca01c..d690b08dff2a 100644
--- a/arch/riscv/include/asm/set_memory.h
+++ b/arch/riscv/include/asm/set_memory.h
@@ -24,6 +24,7 @@ static inline int set_memory_nx(unsigned long addr, int numpages) { return 0; }
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
index 321b09d2e2ea..87ba5a68bbb8 100644
--- a/arch/riscv/mm/pageattr.c
+++ b/arch/riscv/mm/pageattr.c
@@ -198,3 +198,32 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 			     __pgprot(0), __pgprot(_PAGE_PRESENT));
 }
 #endif
+
+bool kernel_page_present(struct page *page)
+{
+	unsigned long addr = (unsigned long)page_address(page);
+	pgd_t *pgd;
+	pud_t *pud;
+	p4d_t *p4d;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	pgd = pgd_offset_k(addr);
+	if (!pgd_present(*pgd))
+		return false;
+
+	p4d = p4d_offset(pgd, addr);
+	if (!p4d_present(*p4d))
+		return false;
+
+	pud = pud_offset(p4d, addr);
+	if (!pud_present(*pud))
+		return false;
+
+	pmd = pmd_offset(pud, addr);
+	if (!pmd_present(*pmd))
+		return false;
+
+	pte = pte_offset_kernel(pmd, addr);
+	return pte_present(*pte);
+}
diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index 5948218f35c5..4352f08bfbb5 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -82,6 +82,7 @@ int set_pages_rw(struct page *page, int numpages);
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 extern int kernel_set_to_readonly;
 
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 7f248fc45317..16f878c26667 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2228,7 +2228,6 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 }
 #endif /* CONFIG_DEBUG_PAGEALLOC */
 
-#ifdef CONFIG_HIBERNATION
 bool kernel_page_present(struct page *page)
 {
 	unsigned int level;
@@ -2240,7 +2239,6 @@ bool kernel_page_present(struct page *page)
 	pte = lookup_address((unsigned long)page_address(page), &level);
 	return (pte_val(*pte) & _PAGE_PRESENT);
 }
-#endif /* CONFIG_HIBERNATION */
 
 int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
 				   unsigned numpages, unsigned long page_flags)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ab0ef6bd351d..44b82f22e76a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2937,16 +2937,9 @@ static inline void debug_pagealloc_map_pages(struct page *page,
 	if (debug_pagealloc_enabled_static())
 		__kernel_map_pages(page, numpages, enable);
 }
-
-#ifdef CONFIG_HIBERNATION
-extern bool kernel_page_present(struct page *page);
-#endif	/* CONFIG_HIBERNATION */
 #else	/* CONFIG_DEBUG_PAGEALLOC */
 static inline void debug_pagealloc_map_pages(struct page *page,
 					     int numpages, int enable) {}
-#ifdef CONFIG_HIBERNATION
-static inline bool kernel_page_present(struct page *page) { return true; }
-#endif	/* CONFIG_HIBERNATION */
 #endif	/* CONFIG_DEBUG_PAGEALLOC */
 
 #ifdef __HAVE_ARCH_GATE_AREA
diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
index 860e0f843c12..fe1aa4e54680 100644
--- a/include/linux/set_memory.h
+++ b/include/linux/set_memory.h
@@ -23,6 +23,11 @@ static inline int set_direct_map_default_noflush(struct page *page)
 {
 	return 0;
 }
+
+static inline bool kernel_page_present(struct page *page)
+{
+	return true;
+}
 #endif
 
 #ifndef set_mce_nospec
-- 
2.28.0



* [PATCH 3/4] arch, mm: restore dependency of __kernel_map_pages() of DEBUG_PAGEALLOC
From: Mike Rapoport @ 2020-10-25 10:15 UTC
  To: Andrew Morton
  Cc: David Hildenbrand, Peter Zijlstra, Dave Hansen, linux-mm,
	Paul Mackerras, Pavel Machek, H. Peter Anvin, sparclinux,
	Christoph Lameter, Will Deacon, linux-riscv, linux-s390, x86,
	Mike Rapoport, Christian Borntraeger, Ingo Molnar,
	Catalin Marinas, Len Brown, Albert Ou, Vasily Gorbik, linux-pm,
	Heiko Carstens, David Rientjes, Borislav Petkov, Andy Lutomirski,
	Paul Walmsley, Kirill A. Shutemov, Thomas Gleixner,
	linux-arm-kernel, Rafael J. Wysocki, linux-kernel, Pekka Enberg,
	Palmer Dabbelt, Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev,
	David S. Miller, Mike Rapoport
In-Reply-To: <20201025101555.3057-1-rppt@kernel.org>

From: Mike Rapoport <rppt@linux.ibm.com>

The design of DEBUG_PAGEALLOC presumes that __kernel_map_pages() must never
fail. With this assumption, it wouldn't be safe to allow general usage of
this function.

Moreover, some architectures that implement __kernel_map_pages() have this
function guarded by #ifdef DEBUG_PAGEALLOC and some refuse to map/unmap
pages when page allocation debugging is disabled at runtime.

As all the users of __kernel_map_pages() were converted to use
debug_pagealloc_map_pages(), it is safe to make it available only when
DEBUG_PAGEALLOC is set.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/Kconfig                     |  3 +++
 arch/arm64/Kconfig               |  4 +---
 arch/arm64/mm/pageattr.c         |  6 ++++--
 arch/powerpc/Kconfig             |  5 +----
 arch/riscv/Kconfig               |  4 +---
 arch/riscv/include/asm/pgtable.h |  2 --
 arch/riscv/mm/pageattr.c         |  2 ++
 arch/s390/Kconfig                |  4 +---
 arch/sparc/Kconfig               |  4 +---
 arch/x86/Kconfig                 |  4 +---
 arch/x86/mm/pat/set_memory.c     |  2 ++
 include/linux/mm.h               | 10 +++++++---
 12 files changed, 24 insertions(+), 26 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 56b6ccc0e32d..56d4752b6db6 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1028,6 +1028,9 @@ config HAVE_STATIC_CALL_INLINE
 	bool
 	depends on HAVE_STATIC_CALL
 
+config ARCH_SUPPORTS_DEBUG_PAGEALLOC
+	bool
+
 source "kernel/gcov/Kconfig"
 
 source "scripts/gcc-plugins/Kconfig"
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 08fa3a1c50f0..1d4da0843668 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -71,6 +71,7 @@ config ARM64
 	select ARCH_USE_QUEUED_RWLOCKS
 	select ARCH_USE_QUEUED_SPINLOCKS
 	select ARCH_USE_SYM_ANNOTATIONS
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	select ARCH_SUPPORTS_MEMORY_FAILURE
 	select ARCH_SUPPORTS_SHADOW_CALL_STACK if CC_HAVE_SHADOW_CALL_STACK
 	select ARCH_SUPPORTS_ATOMIC_RMW
@@ -1004,9 +1005,6 @@ config HOLES_IN_ZONE
 
 source "kernel/Kconfig.hz"
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	def_bool y
-
 config ARCH_SPARSEMEM_ENABLE
 	def_bool y
 	select SPARSEMEM_VMEMMAP_ENABLE
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index 1b94f5b82654..18613d8834db 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -178,13 +178,15 @@ int set_direct_map_default_noflush(struct page *page)
 				   PAGE_SIZE, change_page_range, &data);
 }
 
+#ifdef CONFIG_DEBUG_PAGEALLOC
 void __kernel_map_pages(struct page *page, int numpages, int enable)
 {
-	if (!debug_pagealloc_enabled() && !rodata_full)
+	if (!rodata_full)
 		return;
 
 	set_memory_valid((unsigned long)page_address(page), numpages, enable);
 }
+#endif /* CONFIG_DEBUG_PAGEALLOC */
 
 /*
  * This function is used to determine if a linear map page has been marked as
@@ -204,7 +206,7 @@ bool kernel_page_present(struct page *page)
 	pte_t *ptep;
 	unsigned long addr = (unsigned long)page_address(page);
 
-	if (!debug_pagealloc_enabled() && !rodata_full)
+	if (!rodata_full)
 		return true;
 
 	pgdp = pgd_offset_k(addr);
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e9f13fe08492..ad8a83f3ddca 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -146,6 +146,7 @@ config PPC
 	select ARCH_MIGHT_HAVE_PC_SERIO
 	select ARCH_OPTIONAL_KERNEL_RWX		if ARCH_HAS_STRICT_KERNEL_RWX
 	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC	if PPC32 || PPC_BOOK3S_64
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_CMPXCHG_LOCKREF		if PPC64
 	select ARCH_USE_QUEUED_RWLOCKS		if PPC_QUEUED_SPINLOCKS
@@ -355,10 +356,6 @@ config PPC_OF_PLATFORM_PCI
 	depends on PCI
 	depends on PPC64 # not supported on 32 bits yet
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	depends on PPC32 || PPC_BOOK3S_64
-	def_bool y
-
 config ARCH_SUPPORTS_UPROBES
 	def_bool y
 
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index d5e7ca08f22c..c704562ba45e 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -14,6 +14,7 @@ config RISCV
 	def_bool y
 	select ARCH_CLOCKSOURCE_INIT
 	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC if MMU
 	select ARCH_HAS_BINFMT_FLAT
 	select ARCH_HAS_DEBUG_VM_PGTABLE
 	select ARCH_HAS_DEBUG_VIRTUAL if MMU
@@ -153,9 +154,6 @@ config ARCH_SELECT_MEMORY_MODEL
 config ARCH_WANT_GENERAL_HUGETLB
 	def_bool y
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	def_bool y
-
 config SYS_SUPPORTS_HUGETLBFS
 	depends on MMU
 	def_bool y
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 183f1f4b2ae6..41a72861987c 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -461,8 +461,6 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
 #define VMALLOC_START		0
 #define VMALLOC_END		TASK_SIZE
 
-static inline void __kernel_map_pages(struct page *page, int numpages, int enable) {}
-
 #endif /* !CONFIG_MMU */
 
 #define kern_addr_valid(addr)   (1) /* FIXME */
diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
index 19fecb362d81..321b09d2e2ea 100644
--- a/arch/riscv/mm/pageattr.c
+++ b/arch/riscv/mm/pageattr.c
@@ -184,6 +184,7 @@ int set_direct_map_default_noflush(struct page *page)
 	return ret;
 }
 
+#ifdef CONFIG_DEBUG_PAGEALLOC
 void __kernel_map_pages(struct page *page, int numpages, int enable)
 {
 	if (!debug_pagealloc_enabled())
@@ -196,3 +197,4 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 		__set_memory((unsigned long)page_address(page), numpages,
 			     __pgprot(0), __pgprot(_PAGE_PRESENT));
 }
+#endif
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 34371539a9b9..0a42d457bff4 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -35,9 +35,6 @@ config GENERIC_LOCKBREAK
 config PGSTE
 	def_bool y if KVM
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	def_bool y
-
 config AUDIT_ARCH
 	def_bool y
 
@@ -106,6 +103,7 @@ config S390
 	select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE
 	select ARCH_STACKWALK
 	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	select ARCH_SUPPORTS_NUMA_BALANCING
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_CMPXCHG_LOCKREF
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index a6ca135442f9..2c729b8d097a 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -88,6 +88,7 @@ config SPARC64
 	select HAVE_C_RECORDMCOUNT
 	select HAVE_ARCH_AUDITSYSCALL
 	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	select HAVE_NMI
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select ARCH_USE_QUEUED_RWLOCKS
@@ -148,9 +149,6 @@ config GENERIC_ISA_DMA
 	bool
 	default y if SPARC32
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	def_bool y if SPARC64
-
 config PGTABLE_LEVELS
 	default 4 if 64BIT
 	default 3
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f6946b81f74a..0db3fb1da70c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -91,6 +91,7 @@ config X86
 	select ARCH_STACKWALK
 	select ARCH_SUPPORTS_ACPI
 	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	select ARCH_SUPPORTS_NUMA_BALANCING	if X86_64
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_QUEUED_RWLOCKS
@@ -329,9 +330,6 @@ config ZONE_DMA32
 config AUDIT_ARCH
 	def_bool y if X86_64
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	def_bool y
-
 config KASAN_SHADOW_OFFSET
 	hex
 	depends on KASAN
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 40baa90e74f4..7f248fc45317 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2194,6 +2194,7 @@ int set_direct_map_default_noflush(struct page *page)
 	return __set_pages_p(page, 1);
 }
 
+#ifdef CONFIG_DEBUG_PAGEALLOC
 void __kernel_map_pages(struct page *page, int numpages, int enable)
 {
 	if (PageHighMem(page))
@@ -2225,6 +2226,7 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 
 	arch_flush_lazy_mmu_mode();
 }
+#endif /* CONFIG_DEBUG_PAGEALLOC */
 
 #ifdef CONFIG_HIBERNATION
 bool kernel_page_present(struct page *page)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 14e397f3752c..ab0ef6bd351d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2924,7 +2924,11 @@ static inline bool debug_pagealloc_enabled_static(void)
 	return static_branch_unlikely(&_debug_pagealloc_enabled);
 }
 
-#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
+#ifdef CONFIG_DEBUG_PAGEALLOC
+/*
+ * To support DEBUG_PAGEALLOC architecture must ensure that
+ * __kernel_map_pages() never fails
+ */
 extern void __kernel_map_pages(struct page *page, int numpages, int enable);
 
 static inline void debug_pagealloc_map_pages(struct page *page,
@@ -2937,13 +2941,13 @@ static inline void debug_pagealloc_map_pages(struct page *page,
 #ifdef CONFIG_HIBERNATION
 extern bool kernel_page_present(struct page *page);
 #endif	/* CONFIG_HIBERNATION */
-#else	/* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
+#else	/* CONFIG_DEBUG_PAGEALLOC */
 static inline void debug_pagealloc_map_pages(struct page *page,
 					     int numpages, int enable) {}
 #ifdef CONFIG_HIBERNATION
 static inline bool kernel_page_present(struct page *page) { return true; }
 #endif	/* CONFIG_HIBERNATION */
-#endif	/* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
+#endif	/* CONFIG_DEBUG_PAGEALLOC */
 
 #ifdef __HAVE_ARCH_GATE_AREA
 extern struct vm_area_struct *get_gate_vma(struct mm_struct *mm);
-- 
2.28.0



* [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
From: Mike Rapoport @ 2020-10-25 10:15 UTC
  To: Andrew Morton
  Cc: David Hildenbrand, Peter Zijlstra, Dave Hansen, linux-mm,
	Paul Mackerras, Pavel Machek, H. Peter Anvin, sparclinux,
	Christoph Lameter, Will Deacon, linux-riscv, linux-s390, x86,
	Mike Rapoport, Christian Borntraeger, Ingo Molnar,
	Catalin Marinas, Len Brown, Albert Ou, Vasily Gorbik, linux-pm,
	Heiko Carstens, David Rientjes, Borislav Petkov, Andy Lutomirski,
	Paul Walmsley, Kirill A. Shutemov, Thomas Gleixner,
	linux-arm-kernel, Rafael J. Wysocki, linux-kernel, Pekka Enberg,
	Palmer Dabbelt, Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev,
	David S. Miller, Mike Rapoport
In-Reply-To: <20201025101555.3057-1-rppt@kernel.org>

From: Mike Rapoport <rppt@linux.ibm.com>

When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled, a page may
not be present in the direct map and has to be explicitly mapped before
it can be copied.

On arm64 it is possible that a page would be removed from the direct map
using set_direct_map_invalid_noflush() but __kernel_map_pages() will refuse
to map this page back if DEBUG_PAGEALLOC is disabled.

Explicitly use set_direct_map_{default,invalid}_noflush() for
ARCH_HAS_SET_DIRECT_MAP case and debug_pagealloc_map_pages() for
DEBUG_PAGEALLOC case.

While on that, rename kernel_map_pages() to hibernate_map_page() and drop
numpages parameter.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 kernel/power/snapshot.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index fa499466f645..ecb7b32ce77c 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -76,16 +76,25 @@ static inline void hibernate_restore_protect_page(void *page_address) {}
 static inline void hibernate_restore_unprotect_page(void *page_address) {}
 #endif /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
 
-#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
-static inline void
-kernel_map_pages(struct page *page, int numpages, int enable)
+static inline void hibernate_map_page(struct page *page, int enable)
 {
-	__kernel_map_pages(page, numpages, enable);
+	if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
+		unsigned long addr = (unsigned long)page_address(page);
+		int ret;
+
+		if (enable)
+			ret = set_direct_map_default_noflush(page);
+		else
+			ret = set_direct_map_invalid_noflush(page);
+
+		if (WARN_ON(ret))
+			return;
+
+		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+	} else {
+		debug_pagealloc_map_pages(page, 1, enable);
+	}
 }
-#else
-static inline void
-kernel_map_pages(struct page *page, int numpages, int enable) {}
-#endif
 
 static int swsusp_page_is_free(struct page *);
 static void swsusp_set_page_forbidden(struct page *);
@@ -1366,9 +1375,9 @@ static void safe_copy_page(void *dst, struct page *s_page)
 	if (kernel_page_present(s_page)) {
 		do_copy_page(dst, page_address(s_page));
 	} else {
-		kernel_map_pages(s_page, 1, 1);
+		hibernate_map_page(s_page, 1);
 		do_copy_page(dst, page_address(s_page));
-		kernel_map_pages(s_page, 1, 0);
+		hibernate_map_page(s_page, 0);
 	}
 }
 
-- 
2.28.0



* [PATCH 1/4] mm: introduce debug_pagealloc_map_pages() helper
From: Mike Rapoport @ 2020-10-25 10:15 UTC
  To: Andrew Morton
  Cc: David Hildenbrand, Peter Zijlstra, Dave Hansen, linux-mm,
	Paul Mackerras, Pavel Machek, H. Peter Anvin, sparclinux,
	Christoph Lameter, Will Deacon, linux-riscv, linux-s390, x86,
	Mike Rapoport, Christian Borntraeger, Ingo Molnar,
	Catalin Marinas, Len Brown, Albert Ou, Vasily Gorbik, linux-pm,
	Heiko Carstens, David Rientjes, Borislav Petkov, Andy Lutomirski,
	Paul Walmsley, Kirill A. Shutemov, Thomas Gleixner,
	linux-arm-kernel, Rafael J. Wysocki, linux-kernel, Pekka Enberg,
	Palmer Dabbelt, Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev,
	David S. Miller, Mike Rapoport
In-Reply-To: <20201025101555.3057-1-rppt@kernel.org>

From: Mike Rapoport <rppt@linux.ibm.com>

When CONFIG_DEBUG_PAGEALLOC is enabled, it unmaps pages from the kernel
direct mapping after free_pages(). The pages then need to be mapped back
before they can be used. These mapping operations use
__kernel_map_pages() guarded with debug_pagealloc_enabled().

The only place that calls __kernel_map_pages() without checking whether
DEBUG_PAGEALLOC is enabled is the hibernation code that presumes
availability of this function when ARCH_HAS_SET_DIRECT_MAP is set.
Still, on arm64, __kernel_map_pages() will bail out when DEBUG_PAGEALLOC is
not enabled but set_direct_map_invalid_noflush() may render some pages not
present in the direct map and hibernation code won't be able to save such
pages.

To make page allocation debugging and hibernation interaction more robust,
the dependency on DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP has to be made
more explicit.

Start with combining the guard condition and the call to
__kernel_map_pages() into a single debug_pagealloc_map_pages() function to
emphasize that __kernel_map_pages() should not be called without
DEBUG_PAGEALLOC and use this new function to map/unmap pages when page
allocation debug is enabled.
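
The shape of that helper can be modelled in userspace as follows. This is
only a sketch: the model_* names are hypothetical, a plain bool stands in
for the static key behind debug_pagealloc_enabled_static(), and a counter
stands in for the real mapping work:

```c
/* Userspace model only: all model_* names are hypothetical stand-ins. */
#include <stdbool.h>

/* Stand-in for the static key tested by debug_pagealloc_enabled_static(). */
static bool model_debug_pagealloc_enabled;

/* Counts how many calls actually reach the low-level mapping function. */
static int model_map_calls;

/* Stand-in for __kernel_map_pages(): must only run with debugging enabled. */
static void model_kernel_map_pages(int numpages, int enable)
{
	(void)numpages;
	(void)enable;
	model_map_calls++;
}

/* Guard and call live in one place, so no caller can forget the check. */
static void model_debug_pagealloc_map_pages(int numpages, int enable)
{
	if (model_debug_pagealloc_enabled)
		model_kernel_map_pages(numpages, enable);
}
```

Callers then invoke the helper unconditionally, and it is a no-op when page
allocation debugging is disabled.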

As the only remaining user of kernel_map_pages() is the hibernation code,
move that function into kernel/power/snapshot.c closer to its caller.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 include/linux/mm.h      | 16 +++++++---------
 kernel/power/snapshot.c | 11 +++++++++++
 mm/memory_hotplug.c     |  3 +--
 mm/page_alloc.c         |  6 ++----
 mm/slab.c               |  8 +++-----
 5 files changed, 24 insertions(+), 20 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ef360fe70aaf..14e397f3752c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2927,21 +2927,19 @@ static inline bool debug_pagealloc_enabled_static(void)
 #if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
 extern void __kernel_map_pages(struct page *page, int numpages, int enable);
 
-/*
- * When called in DEBUG_PAGEALLOC context, the call should most likely be
- * guarded by debug_pagealloc_enabled() or debug_pagealloc_enabled_static()
- */
-static inline void
-kernel_map_pages(struct page *page, int numpages, int enable)
+static inline void debug_pagealloc_map_pages(struct page *page,
+					     int numpages, int enable)
 {
-	__kernel_map_pages(page, numpages, enable);
+	if (debug_pagealloc_enabled_static())
+		__kernel_map_pages(page, numpages, enable);
 }
+
 #ifdef CONFIG_HIBERNATION
 extern bool kernel_page_present(struct page *page);
 #endif	/* CONFIG_HIBERNATION */
 #else	/* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
-static inline void
-kernel_map_pages(struct page *page, int numpages, int enable) {}
+static inline void debug_pagealloc_map_pages(struct page *page,
+					     int numpages, int enable) {}
 #ifdef CONFIG_HIBERNATION
 static inline bool kernel_page_present(struct page *page) { return true; }
 #endif	/* CONFIG_HIBERNATION */
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index 46b1804c1ddf..fa499466f645 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -76,6 +76,17 @@ static inline void hibernate_restore_protect_page(void *page_address) {}
 static inline void hibernate_restore_unprotect_page(void *page_address) {}
 #endif /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
 
+#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
+static inline void
+kernel_map_pages(struct page *page, int numpages, int enable)
+{
+	__kernel_map_pages(page, numpages, enable);
+}
+#else
+static inline void
+kernel_map_pages(struct page *page, int numpages, int enable) {}
+#endif
+
 static int swsusp_page_is_free(struct page *);
 static void swsusp_set_page_forbidden(struct page *);
 static void swsusp_unset_page_forbidden(struct page *);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b44d4c7ba73b..e2b6043a4428 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -614,8 +614,7 @@ void generic_online_page(struct page *page, unsigned int order)
 	 * so we should map it first. This is better than introducing a special
 	 * case in page freeing fast path.
 	 */
-	if (debug_pagealloc_enabled_static())
-		kernel_map_pages(page, 1 << order, 1);
+	debug_pagealloc_map_pages(page, 1 << order, 1);
 	__free_pages_core(page, order);
 	totalram_pages_add(1UL << order);
 #ifdef CONFIG_HIGHMEM
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 23f5066bd4a5..9a66a1ff9193 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1272,8 +1272,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
 	 */
 	arch_free_page(page, order);
 
-	if (debug_pagealloc_enabled_static())
-		kernel_map_pages(page, 1 << order, 0);
+	debug_pagealloc_map_pages(page, 1 << order, 0);
 
 	kasan_free_nondeferred_pages(page, order);
 
@@ -2270,8 +2269,7 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
 	set_page_refcounted(page);
 
 	arch_alloc_page(page, order);
-	if (debug_pagealloc_enabled_static())
-		kernel_map_pages(page, 1 << order, 1);
+	debug_pagealloc_map_pages(page, 1 << order, 1);
 	kasan_alloc_pages(page, order);
 	kernel_poison_pages(page, 1 << order, 1);
 	set_page_owner(page, order, gfp_flags);
diff --git a/mm/slab.c b/mm/slab.c
index b1113561b98b..340db0ce74c4 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1431,10 +1431,8 @@ static bool is_debug_pagealloc_cache(struct kmem_cache *cachep)
 #ifdef CONFIG_DEBUG_PAGEALLOC
 static void slab_kernel_map(struct kmem_cache *cachep, void *objp, int map)
 {
-	if (!is_debug_pagealloc_cache(cachep))
-		return;
-
-	kernel_map_pages(virt_to_page(objp), cachep->size / PAGE_SIZE, map);
+	debug_pagealloc_map_pages(virt_to_page(objp),
+				  cachep->size / PAGE_SIZE, map);
 }
 
 #else
@@ -2062,7 +2060,7 @@ int __kmem_cache_create(struct kmem_cache *cachep, slab_flags_t flags)
 
 #if DEBUG
 	/*
-	 * If we're going to use the generic kernel_map_pages()
+	 * If we're going to use the generic debug_pagealloc_map_pages()
 	 * poisoning, then it's going to smash the contents of
 	 * the redzone and userword anyhow, so switch them off.
 	 */
-- 
2.28.0



* [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
From: Mike Rapoport @ 2020-10-25 10:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Peter Zijlstra, Dave Hansen, linux-mm,
	Paul Mackerras, Pavel Machek, H. Peter Anvin, sparclinux,
	Christoph Lameter, Will Deacon, linux-riscv, linux-s390, x86,
	Mike Rapoport, Christian Borntraeger, Ingo Molnar,
	Catalin Marinas, Len Brown, Albert Ou, Vasily Gorbik, linux-pm,
	Heiko Carstens, David Rientjes, Borislav Petkov, Andy Lutomirski,
	Paul Walmsley, Kirill A. Shutemov, Thomas Gleixner,
	linux-arm-kernel, Rafael J. Wysocki, linux-kernel, Pekka Enberg,
	Palmer Dabbelt, Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev,
	David S. Miller, Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

Hi,

During a recent discussion about KVM protected memory, David raised a concern
about the use of __kernel_map_pages() outside of the DEBUG_PAGEALLOC scope [1].

Indeed, for architectures that define CONFIG_ARCH_HAS_SET_DIRECT_MAP it is
possible that __kernel_map_pages() would fail, but since this function is
void, the failure will go unnoticed.

Moreover, the semantics of __kernel_map_pages() are inconsistent across
architectures: some guard this function with #ifdef DEBUG_PAGEALLOC, some
refuse to update the direct map if page allocation debugging is disabled at
run time, and some allow modifying the direct map regardless of
DEBUG_PAGEALLOC settings.

This set straightens this out by restoring the dependency of
__kernel_map_pages() on DEBUG_PAGEALLOC and updating the call sites
accordingly.

[1] https://lore.kernel.org/lkml/2759b4bf-e1e3-d006-7d86-78a40348269d@redhat.com
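
To illustrate the direction of the first patch, here is a rough user-space
model of the helper. It is only a sketch under stated assumptions: the flag
debug_enabled, the counter map_calls and the stub arch_map_pages_stub() are
made-up stand-ins for debug_pagealloc_enabled_static() and the architecture
hook; only the name debug_pagealloc_map_pages() and the shape of the call
sites come from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for debug_pagealloc_enabled_static(). */
static bool debug_enabled;

/* Counts how often the stubbed arch hook was reached. */
static int map_calls;

/*
 * Stub for the arch hook. It returns void, so a failure inside it
 * could not be reported to the caller.
 */
static void arch_map_pages_stub(void *page, int numpages, int enable)
{
	(void)page;
	(void)enable;
	assert(numpages > 0);
	map_calls++;
}

/* Model of the helper: the debugging check lives here, not at call sites. */
static void debug_pagealloc_map_pages(void *page, int numpages, int enable)
{
	if (debug_enabled)
		arch_map_pages_stub(page, numpages, enable);
}

/*
 * Calls the helper once with the given setting and returns the total
 * number of times the arch hook has run so far.
 */
static int call_helper(bool enabled)
{
	int dummy_page = 0;

	debug_enabled = enabled;
	debug_pagealloc_map_pages(&dummy_page, 1, 1);
	return map_calls;
}
```

With debugging off the call is a no-op, so call sites no longer need their
own "if (debug_pagealloc_enabled_static())" test, which is what the hunks in
mm/page_alloc.c and mm/memory_hotplug.c remove.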

Mike Rapoport (4):
  mm: introduce debug_pagealloc_map_pages() helper
  PM: hibernate: improve robustness of mapping pages in the direct map
  arch, mm: restore dependency of __kernel_map_pages() on DEBUG_PAGEALLOC
  arch, mm: make kernel_page_present() always available

 arch/Kconfig                        |  3 +++
 arch/arm64/Kconfig                  |  4 +---
 arch/arm64/include/asm/cacheflush.h |  1 +
 arch/arm64/mm/pageattr.c            |  6 +++--
 arch/powerpc/Kconfig                |  5 +----
 arch/riscv/Kconfig                  |  4 +---
 arch/riscv/include/asm/pgtable.h    |  2 --
 arch/riscv/include/asm/set_memory.h |  1 +
 arch/riscv/mm/pageattr.c            | 31 +++++++++++++++++++++++++
 arch/s390/Kconfig                   |  4 +---
 arch/sparc/Kconfig                  |  4 +---
 arch/x86/Kconfig                    |  4 +---
 arch/x86/include/asm/set_memory.h   |  1 +
 arch/x86/mm/pat/set_memory.c        |  4 ++--
 include/linux/mm.h                  | 35 +++++++++++++----------------
 include/linux/set_memory.h          |  5 +++++
 kernel/power/snapshot.c             | 24 ++++++++++++++++++--
 mm/memory_hotplug.c                 |  3 +--
 mm/page_alloc.c                     |  6 ++---
 mm/slab.c                           |  8 +++----
 20 files changed, 97 insertions(+), 58 deletions(-)

-- 
2.28.0



* [powerpc:merge] BUILD SUCCESS 8cb17737940b156329cb5210669b9c9b23f4dd56
From: kernel test robot @ 2020-10-25  2:48 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git  merge
branch HEAD: 8cb17737940b156329cb5210669b9c9b23f4dd56  Automatic merge of 'master' into merge (2020-10-24 21:33)

elapsed time: 966m

configs tested: 106
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm                                 defconfig
arm64                            allyesconfig
arm64                               defconfig
arm                              allyesconfig
arm                              allmodconfig
mips                     cu1830-neo_defconfig
powerpc                   bluestone_defconfig
powerpc                 mpc8272_ads_defconfig
arm                      tct_hammer_defconfig
powerpc                 canyonlands_defconfig
nios2                               defconfig
powerpc                      mgcoge_defconfig
sh                          r7780mp_defconfig
nios2                            allyesconfig
mips                      loongson3_defconfig
arm                      integrator_defconfig
sh                         ap325rxa_defconfig
sh                          r7785rp_defconfig
sh                              ul2_defconfig
h8300                       h8s-sim_defconfig
sh                           se7722_defconfig
powerpc                          allyesconfig
mips                malta_qemu_32r6_defconfig
m68k                          multi_defconfig
m68k                        m5407c3_defconfig
powerpc                     pseries_defconfig
arm                           tegra_defconfig
m68k                        m5272c3_defconfig
arm                      jornada720_defconfig
m68k                         apollo_defconfig
arm                            lart_defconfig
m68k                             allmodconfig
powerpc                 mpc834x_mds_defconfig
um                            kunit_defconfig
arm                        shmobile_defconfig
powerpc                         wii_defconfig
sh                     sh7710voipgw_defconfig
powerpc                  iss476-smp_defconfig
sh                      rts7751r2d1_defconfig
arm                             pxa_defconfig
mips                         tb0219_defconfig
ia64                             allmodconfig
ia64                                defconfig
ia64                             allyesconfig
m68k                                defconfig
m68k                             allyesconfig
nds32                               defconfig
csky                                defconfig
alpha                               defconfig
alpha                            allyesconfig
xtensa                           allyesconfig
h8300                            allyesconfig
arc                                 defconfig
sh                               allmodconfig
parisc                              defconfig
s390                             allyesconfig
parisc                           allyesconfig
s390                                defconfig
arc                              allyesconfig
nds32                             allnoconfig
c6x                              allyesconfig
i386                             allyesconfig
sparc                            allyesconfig
sparc                               defconfig
i386                                defconfig
mips                             allyesconfig
mips                             allmodconfig
powerpc                          allmodconfig
powerpc                           allnoconfig
i386                 randconfig-a002-20201024
i386                 randconfig-a003-20201024
i386                 randconfig-a005-20201024
i386                 randconfig-a001-20201024
i386                 randconfig-a006-20201024
i386                 randconfig-a004-20201024
x86_64               randconfig-a013-20201024
x86_64               randconfig-a016-20201024
x86_64               randconfig-a015-20201024
x86_64               randconfig-a012-20201024
x86_64               randconfig-a014-20201024
x86_64               randconfig-a011-20201024
i386                 randconfig-a016-20201024
i386                 randconfig-a015-20201024
i386                 randconfig-a014-20201024
i386                 randconfig-a013-20201024
i386                 randconfig-a012-20201024
i386                 randconfig-a011-20201024
riscv                    nommu_k210_defconfig
riscv                            allyesconfig
riscv                    nommu_virt_defconfig
riscv                             allnoconfig
riscv                               defconfig
riscv                          rv32_defconfig
riscv                            allmodconfig
x86_64                                   rhel
x86_64                           allyesconfig
x86_64                    rhel-7.6-kselftests
x86_64                              defconfig
x86_64                               rhel-8.3
x86_64                                  kexec

clang tested configs:
x86_64               randconfig-a001-20201024
x86_64               randconfig-a003-20201024
x86_64               randconfig-a002-20201024
x86_64               randconfig-a006-20201024
x86_64               randconfig-a005-20201024
x86_64               randconfig-a004-20201024

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org


* [powerpc:next-test] BUILD SUCCESS 29b535f8c5da3984d083068bd651af0631dcdca6
From: kernel test robot @ 2020-10-25  2:48 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git  next-test
branch HEAD: 29b535f8c5da3984d083068bd651af0631dcdca6  selftests/powerpc/eeh: disable kselftest timeout setting for eeh-basic

elapsed time: 965m

configs tested: 112
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm                                 defconfig
arm64                            allyesconfig
arm64                               defconfig
arm                              allyesconfig
arm                              allmodconfig
mips                     cu1830-neo_defconfig
powerpc                   bluestone_defconfig
powerpc                 mpc8272_ads_defconfig
arm                      tct_hammer_defconfig
powerpc                 canyonlands_defconfig
s390                       zfcpdump_defconfig
mips                malta_kvm_guest_defconfig
arm                         hackkit_defconfig
m68k                       m5249evb_defconfig
sh                           se7619_defconfig
sh                            hp6xx_defconfig
h8300                       h8s-sim_defconfig
sh                           se7722_defconfig
powerpc                          allyesconfig
mips                malta_qemu_32r6_defconfig
m68k                          multi_defconfig
powerpc                     mpc5200_defconfig
arm                              zx_defconfig
mips                        bcm47xx_defconfig
xtensa                           alldefconfig
xtensa                  nommu_kc705_defconfig
m68k                        m5407c3_defconfig
powerpc                     pseries_defconfig
arm                           tegra_defconfig
m68k                        m5272c3_defconfig
arm                      jornada720_defconfig
m68k                         apollo_defconfig
arm                            lart_defconfig
m68k                             allmodconfig
powerpc                 mpc834x_mds_defconfig
um                            kunit_defconfig
sh                      rts7751r2d1_defconfig
arm                             pxa_defconfig
mips                         tb0219_defconfig
arm                        shmobile_defconfig
sh                             sh03_defconfig
powerpc                      ppc6xx_defconfig
sparc64                          alldefconfig
arm                        magician_defconfig
arm                             ezx_defconfig
ia64                             allmodconfig
ia64                                defconfig
ia64                             allyesconfig
m68k                                defconfig
m68k                             allyesconfig
nios2                               defconfig
arc                              allyesconfig
nds32                             allnoconfig
c6x                              allyesconfig
nds32                               defconfig
nios2                            allyesconfig
csky                                defconfig
alpha                               defconfig
alpha                            allyesconfig
xtensa                           allyesconfig
h8300                            allyesconfig
arc                                 defconfig
sh                               allmodconfig
parisc                              defconfig
s390                             allyesconfig
parisc                           allyesconfig
s390                                defconfig
i386                             allyesconfig
sparc                            allyesconfig
sparc                               defconfig
i386                                defconfig
mips                             allyesconfig
mips                             allmodconfig
powerpc                          allmodconfig
powerpc                           allnoconfig
i386                 randconfig-a002-20201024
i386                 randconfig-a003-20201024
i386                 randconfig-a005-20201024
i386                 randconfig-a001-20201024
i386                 randconfig-a006-20201024
i386                 randconfig-a004-20201024
x86_64               randconfig-a013-20201024
x86_64               randconfig-a016-20201024
x86_64               randconfig-a015-20201024
x86_64               randconfig-a012-20201024
x86_64               randconfig-a014-20201024
x86_64               randconfig-a011-20201024
i386                 randconfig-a016-20201024
i386                 randconfig-a015-20201024
i386                 randconfig-a014-20201024
i386                 randconfig-a013-20201024
i386                 randconfig-a012-20201024
i386                 randconfig-a011-20201024
riscv                    nommu_k210_defconfig
riscv                            allyesconfig
riscv                    nommu_virt_defconfig
riscv                             allnoconfig
riscv                               defconfig
riscv                          rv32_defconfig
riscv                            allmodconfig
x86_64                                   rhel
x86_64                           allyesconfig
x86_64                    rhel-7.6-kselftests
x86_64                              defconfig
x86_64                               rhel-8.3
x86_64                                  kexec

clang tested configs:
x86_64               randconfig-a001-20201024
x86_64               randconfig-a003-20201024
x86_64               randconfig-a002-20201024
x86_64               randconfig-a006-20201024
x86_64               randconfig-a005-20201024
x86_64               randconfig-a004-20201024

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org


* [PATCH] ibmvscsi: fix potential race after loss of transport
From: Tyrel Datwyler @ 2020-10-25  0:13 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel, brking,
	linuxppc-dev

After a loss of transport due to an adapter migration or crash/disconnect from
the host partner there is a tiny window where we can race adjusting the
request_limit of the adapter. The request limit is atomically incremented and
decremented to track the number of inflight requests against the allowed limit
of our VIOS partner. After a transport loss we set the request_limit to zero to
reflect this state. However, there is a window where the adapter may attempt to
queue a command because the transport loss event hasn't been fully processed
yet and request_limit is still greater than zero. The hypercall to send the
event will fail and the error path will increment the request_limit as a
result. If the adapter processes the transport event prior to this increment,
the request_limit becomes out of sync with the adapter state. This can result
in SCSI commands being submitted on the now-reset connection prior to an SRP
Login, which is a protocol violation.

Fix this race by protecting request_limit with the host lock when changing the
value via atomic_set() to indicate no transport.
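
The interleaving described above can be replayed deterministically in a small
user-space model. This is a sketch, not driver code: struct adapter and the
helpers claim_slot(), release_slot() and transport_lost() are hypothetical
names that stand in for the decrement on submission, the increment in the
send error path, and the reset to zero. Serialization under the host lock is
modelled simply by running the queue path to completion before the reset.

```c
#include <assert.h>

/* Simplified adapter state; only the counter matters here. */
struct adapter {
	int request_limit;	/* models the atomic request_limit */
};

/* Queue path, step 1: claim a slot if any is left. */
static int claim_slot(struct adapter *a)
{
	if (a->request_limit <= 0)
		return 0;
	a->request_limit--;
	return 1;
}

/* Queue path, step 2: the send failed, so give the slot back. */
static void release_slot(struct adapter *a)
{
	assert(a->request_limit >= 0);
	a->request_limit++;
}

/* Transport-loss path: nothing may be sent until the next login. */
static void transport_lost(struct adapter *a)
{
	a->request_limit = 0;
}

/* Unserialized order: the reset lands between claim and release. */
static int racy_order(int start)
{
	struct adapter a = { .request_limit = start };
	int claimed = claim_slot(&a);

	transport_lost(&a);
	if (claimed)
		release_slot(&a);
	return a.request_limit;
}

/* Serialized order: claim and release finish before the reset runs. */
static int locked_order(int start)
{
	struct adapter a = { .request_limit = start };

	if (claim_slot(&a))
		release_slot(&a);
	transport_lost(&a);
	return a.request_limit;
}

/* Small usage example for the two orderings. */
static void demo(void)
{
	assert(racy_order(5) == 1);	/* limit is stale: one slot too many */
	assert(locked_order(5) == 0);	/* limit matches the lost transport */
}
```

In the racy order the limit ends up at one although the transport is gone,
which is the out-of-sync state that lets a command slip out before the SRP
Login.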

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 36 +++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index b1f3017b6547..188ed75417a5 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -806,6 +806,22 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code)
 	spin_unlock_irqrestore(hostdata->host->host_lock, flags);
 }
 
+/**
+ * ibmvscsi_set_request_limit - Set the adapter request_limit in response to
+ * an adapter failure, reset, or SRP Login. Done under host lock to prevent
+ * race with scsi command submission.
+ * @hostdata:	adapter to adjust
+ * @limit:	new request limit
+ */
+static void ibmvscsi_set_request_limit(struct ibmvscsi_host_data *hostdata, int limit)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(hostdata->host->host_lock, flags);
+	atomic_set(&hostdata->request_limit, limit);
+	spin_unlock_irqrestore(hostdata->host->host_lock, flags);
+}
+
 /**
  * ibmvscsi_reset_host - Reset the connection to the server
  * @hostdata:	struct ibmvscsi_host_data to reset
@@ -813,7 +829,7 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code)
 static void ibmvscsi_reset_host(struct ibmvscsi_host_data *hostdata)
 {
 	scsi_block_requests(hostdata->host);
-	atomic_set(&hostdata->request_limit, 0);
+	ibmvscsi_set_request_limit(hostdata, 0);
 
 	purge_requests(hostdata, DID_ERROR);
 	hostdata->action = IBMVSCSI_HOST_ACTION_RESET;
@@ -1146,13 +1162,13 @@ static void login_rsp(struct srp_event_struct *evt_struct)
 		dev_info(hostdata->dev, "SRP_LOGIN_REJ reason %u\n",
 			 evt_struct->xfer_iu->srp.login_rej.reason);
 		/* Login failed.  */
-		atomic_set(&hostdata->request_limit, -1);
+		ibmvscsi_set_request_limit(hostdata, -1);
 		return;
 	default:
 		dev_err(hostdata->dev, "Invalid login response typecode 0x%02x!\n",
 			evt_struct->xfer_iu->srp.login_rsp.opcode);
 		/* Login failed.  */
-		atomic_set(&hostdata->request_limit, -1);
+		ibmvscsi_set_request_limit(hostdata, -1);
 		return;
 	}
 
@@ -1163,7 +1179,7 @@ static void login_rsp(struct srp_event_struct *evt_struct)
 	 * This value is set rather than added to request_limit because
 	 * request_limit could have been set to -1 by this client.
 	 */
-	atomic_set(&hostdata->request_limit,
+	ibmvscsi_set_request_limit(hostdata,
 		   be32_to_cpu(evt_struct->xfer_iu->srp.login_rsp.req_lim_delta));
 
 	/* If we had any pending I/Os, kick them */
@@ -1195,13 +1211,13 @@ static int send_srp_login(struct ibmvscsi_host_data *hostdata)
 	login->req_buf_fmt = cpu_to_be16(SRP_BUF_FORMAT_DIRECT |
 					 SRP_BUF_FORMAT_INDIRECT);
 
-	spin_lock_irqsave(hostdata->host->host_lock, flags);
 	/* Start out with a request limit of 0, since this is negotiated in
 	 * the login request we are just sending and login requests always
 	 * get sent by the driver regardless of request_limit.
 	 */
-	atomic_set(&hostdata->request_limit, 0);
+	ibmvscsi_set_request_limit(hostdata, 0);
 
+	spin_lock_irqsave(hostdata->host->host_lock, flags);
 	rc = ibmvscsi_send_srp_event(evt_struct, hostdata, login_timeout * 2);
 	spin_unlock_irqrestore(hostdata->host->host_lock, flags);
 	dev_info(hostdata->dev, "sent SRP login\n");
@@ -1781,7 +1797,7 @@ static void ibmvscsi_handle_crq(struct viosrp_crq *crq,
 		return;
 	case VIOSRP_CRQ_XPORT_EVENT:	/* Hypervisor telling us the connection is closed */
 		scsi_block_requests(hostdata->host);
-		atomic_set(&hostdata->request_limit, 0);
+		ibmvscsi_set_request_limit(hostdata, 0);
 		if (crq->format == 0x06) {
 			/* We need to re-setup the interpartition connection */
 			dev_info(hostdata->dev, "Re-enabling adapter!\n");
@@ -2137,12 +2153,12 @@ static void ibmvscsi_do_work(struct ibmvscsi_host_data *hostdata)
 	}
 
 	hostdata->action = IBMVSCSI_HOST_ACTION_NONE;
+	spin_unlock_irqrestore(hostdata->host->host_lock, flags);
 
 	if (rc) {
-		atomic_set(&hostdata->request_limit, -1);
+		ibmvscsi_set_request_limit(hostdata, -1);
 		dev_err(hostdata->dev, "error after %s\n", action);
 	}
-	spin_unlock_irqrestore(hostdata->host->host_lock, flags);
 
 	scsi_unblock_requests(hostdata->host);
 }
@@ -2226,7 +2242,7 @@ static int ibmvscsi_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 	init_waitqueue_head(&hostdata->work_wait_q);
 	hostdata->host = host;
 	hostdata->dev = dev;
-	atomic_set(&hostdata->request_limit, -1);
+	ibmvscsi_set_request_limit(hostdata, -1);
 	hostdata->host->max_sectors = IBMVSCSI_MAX_SECTORS_DEFAULT;
 
 	if (map_persist_bufs(hostdata)) {
-- 
2.27.0



* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Laight @ 2020-10-24 21:12 UTC (permalink / raw)
  To: 'Segher Boessenkool'
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, Al Viro, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, 'Greg KH', Nick Desaulniers,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201024172903.GK2672@gate.crashing.org>

From: Segher Boessenkool
> Sent: 24 October 2020 18:29
> 
> On Fri, Oct 23, 2020 at 09:28:59PM +0000, David Laight wrote:
> > From: Segher Boessenkool
> > > Sent: 23 October 2020 19:27
> > > On Fri, Oct 23, 2020 at 06:58:57PM +0100, Al Viro wrote:
> > > > On Fri, Oct 23, 2020 at 03:09:30PM +0200, David Hildenbrand wrote:
> > > > On arm64 when callee expects a 32bit argument, the caller is *not* responsible
> > > > for clearing the upper half of 64bit register used to pass the value - it only
> > > > needs to store the actual value into the lower half.  The callee must consider
> > > > the contents of the upper half of that register as undefined.  See AAPCS64 (e.g.
> > > > https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#parameter-passing-rules
> > > > ); AFAICS, the relevant bit is
> > > > 	"Unlike in the 32-bit AAPCS, named integral values must be narrowed by
> > > > the callee rather than the caller."
> > >
> > > Or the formal rule:
> > >
> > > C.9 	If the argument is an Integral or Pointer Type, the size of the
> > > 	argument is less than or equal to 8 bytes and the NGRN is less
> > > 	than 8, the argument is copied to the least significant bits in
> > > 	x[NGRN]. The NGRN is incremented by one. The argument has now
> > > 	been allocated.
> >
> > So, in essence, if the value is in a 64bit register the calling
> > code is independent of the actual type of the formal parameter.
> > Clearly a value might need explicit widening.
> 
> No, this says that if you pass a 32-bit integer in a 64-bit register,
> then the top 32 bits of that register hold an undefined value.

That's sort of what I meant.
The 'normal' junk in the high bits will be there because the variable
in the calling code is wider.

> > I've found a copy of the 64 bit arm instruction set.
> > Unfortunately it is alpha sorted and repetitive so shows none
> > of the symmetry and makes things difficult to find.
> 
> All of this is ABI, not ISA.  Look at the AAPCS64 pointed to above.
> 
> > But, contrary to what someone suggested most register writes
> > (eg from arithmetic) seem to zero/extend the high bits.
> 
> Everything that writes a "w" does, yes.  But that has nothing to do with
> the parameter passing rules, that is ABI.  It just means that very often
> a 32-bit integer will be passed zero-extended in a 64-bit register, but
> that is just luck (or not, it makes finding bugs harder ;-) )

Working out why the code is wrong is more of an ISA issue than an ABI one.
It may be an ABI one, but the analysis is ISA.

I've written a lot of asm over the years - decoding compiler generated
asm isn't that hard.
At least ARM doesn't have annulled delay slots.

	David




* Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.10-2 tag
From: pr-tracker-bot @ 2020-10-24 18:11 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: mikey, srikar, aneesh.kumar, linuxppc-dev, linux-kernel,
	hegdevasant, ganeshgr, jniethe5, oohall, Linus Torvalds
In-Reply-To: <871rhnyk2a.fsf@mpe.ellerman.id.au>

The pull request you sent on Sat, 24 Oct 2020 21:50:21 +1100:

> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.10-2

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/b6f96e75ae121ead54da3f58c545d68184079f90

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Segher Boessenkool @ 2020-10-24 17:29 UTC (permalink / raw)
  To: David Laight
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, Al Viro, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, 'Greg KH', Nick Desaulniers,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <e9a3136ead214186877804aabde74b38@AcuMS.aculab.com>

On Fri, Oct 23, 2020 at 09:28:59PM +0000, David Laight wrote:
> From: Segher Boessenkool
> > Sent: 23 October 2020 19:27
> > On Fri, Oct 23, 2020 at 06:58:57PM +0100, Al Viro wrote:
> > > On Fri, Oct 23, 2020 at 03:09:30PM +0200, David Hildenbrand wrote:
> > > On arm64 when callee expects a 32bit argument, the caller is *not* responsible
> > > for clearing the upper half of 64bit register used to pass the value - it only
> > > needs to store the actual value into the lower half.  The callee must consider
> > > the contents of the upper half of that register as undefined.  See AAPCS64 (e.g.
> > > https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#parameter-passing-rules
> > > ); AFAICS, the relevant bit is
> > > 	"Unlike in the 32-bit AAPCS, named integral values must be narrowed by
> > > the callee rather than the caller."
> > 
> > Or the formal rule:
> > 
> > C.9 	If the argument is an Integral or Pointer Type, the size of the
> > 	argument is less than or equal to 8 bytes and the NGRN is less
> > 	than 8, the argument is copied to the least significant bits in
> > 	x[NGRN]. The NGRN is incremented by one. The argument has now
> > 	been allocated.
> 
> So, in essence, if the value is in a 64bit register the calling
> code is independent of the actual type of the formal parameter.
> Clearly a value might need explicit widening.

No, this says that if you pass a 32-bit integer in a 64-bit register,
then the top 32 bits of that register hold an undefined value.
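
To make that concrete, here is a small C sketch of what "narrowed by the
callee" means. It models the raw 64-bit register as a uint64_t; the function
names callee_narrowing() and callee_trusting() are made up for illustration,
and the junk pattern in the upper half is arbitrary. A real compiler emits
the narrowing itself when the parameter is declared with a 32-bit type; the
sketch only spells the step out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Models a callee whose formal parameter is a 32-bit unsigned integer
 * but whose argument arrives in a 64-bit register. `reg` is the raw
 * register content, upper half included.
 */
static uint64_t callee_narrowing(uint64_t reg)
{
	uint32_t arg = (uint32_t)reg;	/* keep only the low 32 bits */

	return (uint64_t)arg;		/* widen again, zero-extended */
}

/* A buggy callee that trusts the upper half of the register. */
static uint64_t callee_trusting(uint64_t reg)
{
	return reg;
}

/* Small usage example: same low half, junk in the upper half. */
static void demo(void)
{
	uint64_t reg = 0xdeadbeef00000007ULL;

	assert(callee_narrowing(reg) == 7);
	assert(callee_trusting(reg) != 7);
}
```

When the caller happens to pass a zero-extended value, both callees agree,
which is why a missing narrowing step can go unnoticed for a long time.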

> I've found a copy of the 64 bit arm instruction set.
> Unfortunately it is alpha sorted and repetitive so shows none
> of the symmetry and makes things difficult to find.

All of this is ABI, not ISA.  Look at the AAPCS64 pointed to above.

> But, contrary to what someone suggested most register writes
> (eg from arithmetic) seem to zero/extend the high bits.

Everything that writes a "w" does, yes.  But that has nothing to do with
the parameter passing rules, that is ABI.  It just means that very often
a 32-bit integer will be passed zero-extended in a 64-bit register, but
that is just luck (or not, it makes finding bugs harder ;-) )


Segher


* Re: C vdso
From: Christophe Leroy @ 2020-10-24 11:16 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev@ozlabs.org
In-Reply-To: <874kmkx7gi.fsf@mpe.ellerman.id.au>



On 24/10/2020 at 12:07, Michael Ellerman wrote:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>> On 24/09/2020 at 15:17, Christophe Leroy wrote:
>>>> On 17/09/2020 at 14:33, Michael Ellerman wrote:
>>>>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>>>>>
>>>>>> What is the status of the generic C vdso merge?
>>>>>> In some mail, you mentioned having difficulties getting it working on
>>>>>> ppc64, any progress? What's the problem? Can I help?
>>>>>
>>>>> Yeah sorry I was hoping to get time to work on it but haven't been able
>>>>> to.
>>>>>
>>>>> It's causing crashes on ppc64 ie. big endian.
>> ...
>>>>
>>>> Can you tell me which defconfig you are using? I have been able to set up a full glibc PPC64 cross
>>>> compilation chain and to test it under QEMU with success, using Nathan's vdsotest tool.
>>>
>>> What config are you using?
>>
>> ppc64_defconfig + guest.config
>>
>> Or pseries_defconfig.
>>
>> I'm using Ubuntu GCC 9.3.0 mostly, but it happens with other toolchains too.
> 
> I'm also seeing warnings because of the feature fixups:
> 
> 
> That's happening because the 32-bit VDSO is built with CONFIG_PPC32=y,
> due to config-fake32.h, and that causes the feature fixup entries to be
> the wrong size.
> 
> See the logic in feature-fixup.h:
> 
> 
> 
> We expect the fixup entries to still use 64-bit values, even for the
> 32-bit VDSO in a 64-bit kernel.
> 
> TBH I'm not sure how config-fake32.h can work long term, it's so fragile
> to be defining/redefining a handful of CONFIG symbols like that.

I took the idea from mips (arch/mips/vdso/config-n32-o32-env.c) after struggling in
different directions. At that time the generic VDSO code was far less careful and was
including several linux headers IIRC.
I agree with you that it's rather fragile.

> 
> The generic VDSO code is fairly careful to only include uapi and vdso
> headers, not linux ones. So I think we need to better split our headers
> so that we can build the VDSO code with few or no linux headers, and so
> avoid the need to define any (or most) CONFIG symbols.
> 

I'll revisit it when I'm back from vacation (I'm leaving now for two weeks).

Christophe


* [GIT PULL] Please pull powerpc/linux.git powerpc-5.10-2 tag
From: Michael Ellerman @ 2020-10-24 10:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: mikey, srikar, aneesh.kumar, linux-kernel, hegdevasant, ganeshgr,
	jniethe5, oohall, linuxppc-dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi Linus,

Please pull powerpc fixes for 5.10:

The following changes since commit ffd0b25ca049a477cb757e5bcf2d5e1664d12e5d:

  Revert "powerpc/pci: unmap legacy INTx interrupts when a PHB is removed" (2020-10-15 13:42:49 +1100)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.10-2

for you to fetch changes up to 4ff753feab021242144818b9a3ba011238218145:

  powerpc/pseries: Avoid using addr_to_pfn in real mode (2020-10-22 14:34:45 +1100)

- ------------------------------------------------------------------
powerpc fixes for 5.10 #2

A fix for undetected data corruption on Power9 Nimbus <= DD2.1 in the emulation
of VSX loads. The affected CPUs were not widely available.

Two fixes for machine check handling in guests under PowerVM.

A fix for our recent changes to SMP setup, when CONFIG_CPUMASK_OFFSTACK=y.

Three fixes for races in the handling of some of our powernv sysfs attributes.

One change to remove TM from the set of Power10 CPU features.

A couple of other minor fixes.

Thanks to:
  Aneesh Kumar K.V, Christophe Leroy, Ganesh Goudar, Jordan Niethe, Mahesh
  Salgaonkar, Michael Neuling, Oliver O'Halloran, Qian Cai, Srikar Dronamraju,
  Vasant Hegde.

- ------------------------------------------------------------------
Aneesh Kumar K.V (1):
      powerpc/opal_elog: Handle multiple writes to ack attribute

Christophe Leroy (1):
      powerpc/uaccess: Don't use "m<>" constraint with GCC 4.9

Ganesh Goudar (2):
      powerpc/mce: Avoid nmi_enter/exit in real mode on pseries hash
      powerpc/pseries: Avoid using addr_to_pfn in real mode

Jordan Niethe (1):
      powerpc/64s: Remove TM from Power10 features

Michael Neuling (2):
      powerpc: Fix undetected data corruption with P9N DD2.1 VSX CI load emulation
      selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load workaround

Oliver O'Halloran (1):
      powerpc/eeh: Fix eeh_dev_check_failure() for PE#0

Srikar Dronamraju (2):
      powerpc/smp: Remove unnecessary variable
      powerpc/smp: Use GFP_ATOMIC while allocating tmp mask

Vasant Hegde (2):
      powerpc/powernv/dump: Fix race while processing OPAL dump
      powerpc/powernv/dump: Handle multiple writes to ack attribute


 arch/powerpc/include/asm/asm-const.h                          |  13 +++
 arch/powerpc/include/asm/cputable.h                           |   2 +-
 arch/powerpc/include/asm/uaccess.h                            |   4 +-
 arch/powerpc/kernel/cputable.c                                |  13 ++-
 arch/powerpc/kernel/eeh.c                                     |   5 -
 arch/powerpc/kernel/mce.c                                     |   7 +-
 arch/powerpc/kernel/smp.c                                     |  70 ++++++------
 arch/powerpc/kernel/traps.c                                   |   2 +-
 arch/powerpc/platforms/powernv/opal-dump.c                    |  52 ++++++---
 arch/powerpc/platforms/powernv/opal-elog.c                    |  11 +-
 arch/powerpc/platforms/pseries/ras.c                          | 118 ++++++++++++--------
 tools/testing/selftests/powerpc/alignment/alignment_handler.c |   8 +-
 12 files changed, 185 insertions(+), 120 deletions(-)
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl+UBeUACgkQUevqPMjh
pYBHUA//THLt6DJlSPPqn8LQZQGT76Gx82cKyy9DQ7/Elcl13xcuq3XbhHD5asi0
QbJGbLhRqpRhtmj3c8BCYAygi5FXZWH4IeN6FK8xoZGR2bi/gY7VkhIUSzFAHnRi
PFXafzb8eWVS7O5k8xbxrjxdOAu8SjEzywG5I8PPn5IWFwhUwjGosv81QtxJOLVc
V9WwuTBK87nfvoMdfcl3YJXRs+4vKOQQ0Gqa5vTVTUmgdbJOqJi1MvLULnSnKxTJ
G+XplOeSI1N3gk+E2cycPasghTYziTtzEyrTHe0Uufgx+9t6VuN1g2zL81kDA7U9
10Oqqry4Z66V2BhGrDMnXKYGeQNGRO8vNLT2DuuZd5XTN/LpV0knJHm/9F2E+5zl
T+GgQwS0IhXDcbS70TcbxXHPyBe2/eXRH1+rkSlAEjl656JVbKefgi1VUsqSzkjH
JBF2+zCodYelbbnRP89Aj5/03t+VeHbNC/1jixoYDHR7drXiU2XQqjfFZeCHvxOQ
YCznpoC84gcDupGC5q4op3tHBmvULJn0KaHYWryaAEWlCxjdVcjBis48B+GQVww8
JnDMC5WGVvAAxPHc74EkyEvdROx4Q+8UeOj+TXnrRlowEF8Wymxcvy7NUn2Bqq2J
VsRCUzLIReKCckdJQ/+SxG8eb9JUxQRWG76+Q9zzTHbdaBSWuMc=
=9Oxg
-----END PGP SIGNATURE-----

* Re: [PATCH] powerpc/eeh: Fix eeh_dev_check_failure() for PE#0
From: Michael Ellerman @ 2020-10-24 10:27 UTC (permalink / raw)
  To: linuxppc-dev, Oliver O'Halloran
In-Reply-To: <20201021232554.1434687-1-oohall@gmail.com>

On Thu, 22 Oct 2020 10:25:54 +1100, Oliver O'Halloran wrote:
> In commit 269e583357df ("powerpc/eeh: Delete eeh_pe->config_addr") the
> following simplification was made:
> 
> -       if (!pe->addr && !pe->config_addr) {
> +       if (!pe->addr) {
>                 eeh_stats.no_cfg_addr++;
>                 return 0;
>         }
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/eeh: Fix eeh_dev_check_failure() for PE#0
      https://git.kernel.org/powerpc/c/99f6e9795a68fe23f96a2b5b0be07a3dd9457f99

cheers

* Re: [PATCH] powerpc/64s: Remove TM from Power10 features
From: Michael Ellerman @ 2020-10-24 10:27 UTC (permalink / raw)
  To: linuxppc-dev, Jordan Niethe
In-Reply-To: <20200827035529.900-1-jniethe5@gmail.com>

On Thu, 27 Aug 2020 13:55:29 +1000, Jordan Niethe wrote:
> ISA v3.1 removes transactional memory and hence it should not be present
> in cpu_features or cpu_user_features2. Remove CPU_FTR_TM_COMP from
> CPU_FTRS_POWER10. Remove PPC_FEATURE2_HTM_COMP and
> PPC_FEATURE2_HTM_NOSC_COMP from COMMON_USER2_POWER10.

Applied to powerpc/fixes.

[1/1] powerpc/64s: Remove TM from Power10 features
      https://git.kernel.org/powerpc/c/ec613a57fa1d57381f890c3166175fe68cf43f12

cheers

* Re: [PATCH v2 1/3] powerpc/uaccess: Don't use "m<>" constraint with GCC 4.9
From: Michael Ellerman @ 2020-10-24 10:27 UTC (permalink / raw)
  To: Michael Ellerman, Christophe Leroy, Paul Mackerras,
	Benjamin Herrenschmidt, mathieu.desnoyers
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <212d3bc4a52ca71523759517bb9c61f7e477c46a.1603179582.git.christophe.leroy@csgroup.eu>

On Tue, 20 Oct 2020 07:40:07 +0000 (UTC), Christophe Leroy wrote:
> GCC 4.9 sometimes fails to build with "m<>" constraint in
> inline assembly.
> 
>   CC      lib/iov_iter.o
> In file included from ./arch/powerpc/include/asm/cmpxchg.h:6:0,
>                  from ./arch/powerpc/include/asm/atomic.h:11,
>                  from ./include/linux/atomic.h:7,
>                  from ./include/linux/crypto.h:15,
>                  from ./include/crypto/hash.h:11,
>                  from lib/iov_iter.c:2:
> lib/iov_iter.c: In function 'iovec_from_user.part.30':
> ./arch/powerpc/include/asm/uaccess.h:287:2: error: 'asm' operand has impossible constraints
>   __asm__ __volatile__(    \
>   ^
> ./include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
>  # define unlikely(x) __builtin_expect(!!(x), 0)
>                                           ^
> ./arch/powerpc/include/asm/uaccess.h:583:34: note: in expansion of macro 'unsafe_op_wrap'
>  #define unsafe_get_user(x, p, e) unsafe_op_wrap(__get_user_allowed(x, p), e)
>                                   ^
> ./arch/powerpc/include/asm/uaccess.h:329:10: note: in expansion of macro '__get_user_asm'
>   case 4: __get_user_asm(x, (u32 __user *)ptr, retval, "lwz"); break; \
>           ^
> ./arch/powerpc/include/asm/uaccess.h:363:3: note: in expansion of macro '__get_user_size_allowed'
>    __get_user_size_allowed(__gu_val, __gu_addr, __gu_size, __gu_err); \
>    ^
> ./arch/powerpc/include/asm/uaccess.h:100:2: note: in expansion of macro '__get_user_nocheck'
>   __get_user_nocheck((x), (ptr), sizeof(*(ptr)), false)
>   ^
> ./arch/powerpc/include/asm/uaccess.h:583:49: note: in expansion of macro '__get_user_allowed'
>  #define unsafe_get_user(x, p, e) unsafe_op_wrap(__get_user_allowed(x, p), e)
>                                                  ^
> lib/iov_iter.c:1663:3: note: in expansion of macro 'unsafe_get_user'
>    unsafe_get_user(len, &uiov[i].iov_len, uaccess_end);
>    ^
> make[1]: *** [scripts/Makefile.build:283: lib/iov_iter.o] Error 1
> 
> [...]

Patch 1 applied to powerpc/fixes.

[1/3] powerpc/uaccess: Don't use "m<>" constraint with GCC 4.9
      https://git.kernel.org/powerpc/c/592bbe9c505d9a0ef69260f8c8263df47da2698e

cheers

* Re: [PATCH v4] powerpc/pseries: Avoid using addr_to_pfn in real mode
From: Michael Ellerman @ 2020-10-24 10:27 UTC (permalink / raw)
  To: linuxppc-dev, mpe, Ganesh Goudar; +Cc: aneesh.kumar, npiggin, mahesh
In-Reply-To: <20200724063946.21378-1-ganeshgr@linux.ibm.com>

On Fri, 24 Jul 2020 12:09:46 +0530, Ganesh Goudar wrote:
> When a UE or memory error exception is encountered, the MCE handler
> tries to find the pfn using addr_to_pfn(), which takes an effective
> address as an argument. The pfn is later used to poison the page where
> the memory error occurred. Recent rework in this area made addr_to_pfn()
> run in real mode, which can be fatal as it may try to access
> memory outside the RMO region.
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/pseries: Avoid using addr_to_pfn in real mode
      https://git.kernel.org/powerpc/c/4ff753feab021242144818b9a3ba011238218145

cheers

* Re: C vdso
From: Michael Ellerman @ 2020-10-24 10:07 UTC (permalink / raw)
  To: Christophe Leroy; +Cc: linuxppc-dev@ozlabs.org
In-Reply-To: <877drhxeg8.fsf@mpe.ellerman.id.au>

Michael Ellerman <mpe@ellerman.id.au> writes:
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>> Le 24/09/2020 à 15:17, Christophe Leroy a écrit :
>>> Le 17/09/2020 à 14:33, Michael Ellerman a écrit :
>>>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>>>>
>>>>> What is the status of the generic C vdso merge?
>>>>> In some mail, you mentioned having difficulties getting it working on
>>>>> ppc64, any progress? What's the problem? Can I help?
>>>>
>>>> Yeah sorry I was hoping to get time to work on it but haven't been able
>>>> to.
>>>>
>>>> It's causing crashes on ppc64 ie. big endian.
> ...
>>> 
>>> Can you tell me which defconfig you are using? I have been able to set up
>>> a full glibc PPC64 cross-compilation chain and to test it successfully
>>> under QEMU, using Nathan's vdsotest tool.
>>
>> What config are you using?
>
> ppc64_defconfig + guest.config
>
> Or pseries_defconfig.
>
> I'm using Ubuntu GCC 9.3.0 mostly, but it happens with other toolchains too.

I'm also seeing warnings because of the feature fixups:

  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 1 at arch/powerpc/lib/feature-fixups.c:109 .do_feature_fixups+0x80/0xc0
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc2-00261-g107a86292cc4 #11
  NIP:  c0000000000a3660 LR: c0000000000a362c CTR: 0000000000000000
  REGS: c00000007e3a3790 TRAP: 0700   Not tainted  (5.9.0-rc2-00261-g107a86292cc4)
  MSR:  8000000002029032 <SF,VEC,EE,ME,IR,DR,RI>  CR: 48000422  XER: 00000000
  CFAR: c0000000000a3630 IRQMASK: 0
  GPR00: c0000000011e8964 c00000007e3a3a20 c000000001bb2b00 0000000000000001
  GPR04: c000000001bc0bc0 c000000001bc0bf0 0000000066736574 00000000fffffe00
  GPR08: 0000000300000004 0000000000000000 0000000000000000 0000000000000db8
  GPR12: 0000000028000224 c000000001dc0000 c000000000012d40 0000000000000000
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR24: c000000001aae0f0 c000000001063f08 c000000001063f18 c000000001063f28
  GPR28: c00000000106e188 000000eb8f4d91a7 c000000001bc0bf0 c000000001bc0bc0
  NIP [c0000000000a3660] .do_feature_fixups+0x80/0xc0
  LR [c0000000000a362c] .do_feature_fixups+0x4c/0xc0
  Call Trace:
  [c00000007e3a3a20] [c0000000000a362c] .do_feature_fixups+0x4c/0xc0 (unreliable)
  [c00000007e3a3ab0] [c0000000011e8964] .vdso_init+0x498/0x95c
  [c00000007e3a3bd0] [c000000000012780] .do_one_initcall+0x60/0x2b8
  [c00000007e3a3cb0] [c0000000011e4d8c] .kernel_init_freeable+0x2d8/0x370
  [c00000007e3a3da0] [c000000000012d64] .kernel_init+0x24/0x150
  [c00000007e3a3e20] [c00000000000e24c] .ret_from_kernel_thread+0x58/0x6c
  Instruction dump:
  40820030 3bff0030 7c3ef840 4181ffe4 38210090 e8010010 eb81ffe0 eba1ffe8
  ebc1fff0 7c0803a6 ebe1fff8 4e800020 <0fe00000> e8ff0028 e8df0020 7f83e378
  ---[ end trace ece1c957ca5bd6e9 ]---
  Unable to patch feature section at bffffffd01bc0bbc - c000000001bc0bc0 with bffffd9101bc0958 - bfffffe501bc0ba4


That's happening because the 32-bit VDSO is built with CONFIG_PPC32=y,
due to config-fake32.h, and that causes the feature fixup entries to be
the wrong size.

See the logic in feature-fixup.h:

  #if defined(CONFIG_PPC64) && !defined(__powerpc64__)
  /* 64 bits kernel, 32 bits code (ie. vdso32) */
  #define FTR_ENTRY_LONG		.8byte
  #define FTR_ENTRY_OFFSET	.long 0xffffffff; .long
  #elif defined(CONFIG_PPC64)
  #define FTR_ENTRY_LONG		.8byte
  #define FTR_ENTRY_OFFSET	.8byte
  #else
  #define FTR_ENTRY_LONG		.long
  #define FTR_ENTRY_OFFSET	.long
  #endif


We expect the fixup entries to still use 64-bit values, even for the
32-bit VDSO in a 64-bit kernel.
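
As a rough illustration of the size mismatch, here is a sketch only, not the
kernel's code. The struct names and both helper functions are made up for this
example, and the field list is modelled on what I understand struct
fixup_entry in arch/powerpc/lib/feature-fixups.c to look like:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout a 64-bit kernel walks: six 8-byte fields (.8byte each). */
struct fixup_entry64 {
	uint64_t mask, value;
	int64_t start_off, end_off, alt_start_off, alt_end_off;
};

/* Layout emitted when every field is a plain .long (the PPC32 case). */
struct fixup_entry32 {
	uint32_t mask, value;
	int32_t start_off, end_off, alt_start_off, alt_end_off;
};

static_assert(sizeof(struct fixup_entry64) == 48, "six 8-byte fields");
static_assert(sizeof(struct fixup_entry32) == 24, "six 4-byte fields");

/* How many entries a 64-bit walker sees in a section of n32 32-bit entries. */
size_t entries_seen_by_64bit_walker(size_t n32)
{
	return n32 * sizeof(struct fixup_entry32) / sizeof(struct fixup_entry64);
}

/*
 * Models ".long 0xffffffff; .long off" read back as one big-endian 64-bit
 * value: the high word is all ones, so the result equals off only when off
 * is negative. The final cast relies on the usual two's complement
 * behaviour of GCC and Clang.
 */
int64_t widen_offset(int32_t off)
{
	uint64_t raw = (UINT64_C(0xffffffff) << 32) | (uint32_t)off;

	return (int64_t)raw;
}
```

With four 24-byte entries, a walker stepping in 48-byte strides sees two
entries, each glued together from two unrelated smaller ones. That would be
consistent with the nonsensical addresses in the "Unable to patch feature
section" line above.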

TBH I'm not sure how config-fake32.h can work long term, it's so fragile
to be defining/redefining a handful of CONFIG symbols like that.

The generic VDSO code is fairly careful to only include uapi and vdso
headers, not linux ones. So I think we need to better split our headers
so that we can build the VDSO code with few or no linux headers, and so
avoid the need to define any (or most) CONFIG symbols.

cheers

* Re: [PATCH] net: ucc_geth: Drop extraneous parentheses in comparison
From: patchwork-bot+netdevbpf @ 2020-10-24  1:50 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: netdev, linux-kernel, leoyang.li, linuxppc-dev, kuba, davem
In-Reply-To: <20201023033236.3296988-1-mpe@ellerman.id.au>

Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Fri, 23 Oct 2020 14:32:36 +1100 you wrote:
> Clang warns about the extra parentheses in this comparison:
> 
>   drivers/net/ethernet/freescale/ucc_geth.c:1361:28:
>   warning: equality comparison with extraneous parentheses
>     if ((ugeth->phy_interface == PHY_INTERFACE_MODE_SGMII))
>          ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> [...]

Here is the summary with links:
  - net: ucc_geth: Drop extraneous parentheses in comparison
    https://git.kernel.org/netdev/net/c/dab234227cbd

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



* Re: [PATCH] net: ucc_geth: Drop extraneous parentheses in comparison
From: Jakub Kicinski @ 2020-10-24  1:45 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, leoyang.li, davem, linux-kernel, netdev
In-Reply-To: <20201023033236.3296988-1-mpe@ellerman.id.au>

On Fri, 23 Oct 2020 14:32:36 +1100 Michael Ellerman wrote:
> Clang warns about the extra parentheses in this comparison:
> 
>   drivers/net/ethernet/freescale/ucc_geth.c:1361:28:
>   warning: equality comparison with extraneous parentheses
>     if ((ugeth->phy_interface == PHY_INTERFACE_MODE_SGMII))
>          ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> It seems clear the intent here is to do a comparison not an
> assignment, so drop the extra parentheses to avoid any confusion.
> 
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Applied, thanks!
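
For context, a sketch of why the doubled parentheses draw a warning. The enum
and all function names below are stand-ins invented for this example, not the
driver's code. Compilers treat an extra pair of parentheses as the
conventional way of saying "this assignment in a condition is deliberate", so
an equality test wrapped in them looks like a mistyped assignment:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's PHY interface modes. */
enum phy_mode { MODE_OTHER, MODE_SGMII };

/* Plain comparison: single parentheses, nothing to warn about. */
bool is_sgmii(enum phy_mode mode)
{
	if (mode == MODE_SGMII)
		return true;
	return false;
}

/* Deliberate assignment used as a condition: double parentheses mark intent. */
bool set_and_test(enum phy_mode *mode, enum phy_mode new_mode)
{
	if ((*mode = new_mode))
		return true;	/* new_mode was non-zero */
	return false;
}

/* Small self-check exercising both helpers. */
void check_phy_modes(void)
{
	enum phy_mode m = MODE_OTHER;

	assert(!is_sgmii(m));
	assert(set_and_test(&m, MODE_SGMII));
	assert(is_sgmii(m));
}
```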

* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Laight @ 2020-10-23 21:28 UTC (permalink / raw)
  To: 'Segher Boessenkool', Al Viro
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, 'Greg KH', Nick Desaulniers,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201023182713.GG2672@gate.crashing.org>

From: Segher Boessenkool
> Sent: 23 October 2020 19:27
> 
> On Fri, Oct 23, 2020 at 06:58:57PM +0100, Al Viro wrote:
> > On Fri, Oct 23, 2020 at 03:09:30PM +0200, David Hildenbrand wrote:
> >
> > > Now, I am not a compiler expert, but as I already cited, at least on
> > > x86-64 clang expects that the high bits were cleared by the caller - in
> > > contrast to gcc. I suspect it's the same on arm64, but again, I am no
> > > compiler expert.
> > >
> > > If what I said and cited for x86-64 is correct, if the function expects
> > > an "unsigned int", it will happily use 64bit operations without further
> > > checks where valid when assuming high bits are zero. That's why even
> > > converting everything to "unsigned int" as proposed by me won't work on
> > > clang - it assumes high bits are zero (as indicated by Nick).
> > >
> > > As I am neither a compiler expert (did I mention that already? ;) ) nor
> > > an arm64 expert, I can't tell if this is a compiler BUG or not.
> >
> > On arm64 when callee expects a 32bit argument, the caller is *not* responsible
> > for clearing the upper half of 64bit register used to pass the value - it only
> > needs to store the actual value into the lower half.  The callee must consider
> > the contents of the upper half of that register as undefined.  See AAPCS64 (e.g.
> > https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#parameter-passing-rules
> > ); AFAICS, the relevant bit is
> > 	"Unlike in the 32-bit AAPCS, named integral values must be narrowed by
> > the callee rather than the caller."
> 
> Or the formal rule:
> 
> C.9 	If the argument is an Integral or Pointer Type, the size of the
> 	argument is less than or equal to 8 bytes and the NGRN is less
> 	than 8, the argument is copied to the least significant bits in
> 	x[NGRN]. The NGRN is incremented by one. The argument has now
> 	been allocated.

So, in essence, if the value is in a 64bit register the calling
code is independent of the actual type of the formal parameter.
Clearly a value might need explicit widening.
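
A small model of that rule in plain C may help. This is a sketch under
assumptions: the 64-bit "reg" parameter stands in for the argument register,
and the function names are invented for illustration. Real compilers do this
at the instruction level, not in source.

```c
#include <assert.h>
#include <stdint.h>

/* Models a callee that wrongly trusts the upper half to be zero. */
uint64_t index_trusting_upper_bits(uint64_t reg)
{
	return reg;
}

/* Models a callee that narrows to 32 bits itself before using the value. */
uint64_t index_narrowed_by_callee(uint64_t reg)
{
	return (uint32_t)reg;
}

/* The caller only stored 5 in the low half; the high half is leftover junk. */
void check_narrowing(void)
{
	uint64_t reg = UINT64_C(0xdeadbeef00000005);

	assert(index_narrowed_by_callee(reg) == 5);
	assert(index_trusting_upper_bits(reg) != 5);
}
```

Only the narrowing variant recovers the intended value when the upper half
holds junk, which is the behaviour the quoted AAPCS64 text asks of the callee.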

I've found a copy of the 64-bit Arm instruction set.
Unfortunately it is sorted alphabetically and repetitive, so it shows none
of the symmetry and makes things difficult to find.
But, contrary to what someone suggested, most register writes
(e.g. from arithmetic) seem to zero/extend the high bits.

	David


* Re: [PATCH] treewide: Convert macro and uses of __section(foo) to __section("foo")
From: Miguel Ojeda @ 2020-10-23 20:03 UTC (permalink / raw)
  To: Joe Perches
  Cc: clang-built-linux, Nick Desaulniers, Linus Torvalds, linux-kernel,
	linuxppc-dev
In-Reply-To: <64b49cd3680f45808dad286b408e7b196c31ec79.camel@perches.com>

On Fri, Oct 23, 2020 at 10:03 AM Joe Perches <joe@perches.com> wrote:
>
> Thanks Miguel, but IMO it doesn't need time in next.

You're welcome! It never hurts to keep things for a bit there.

> Applying it just before an rc1 minimizes conflicts.

There shouldn't be many conflicts after -rc1. The amount of changes is
reasonable too, so no need to apply the script directly. In any case,
if you prefer that Linus picks it up himself right away for this -rc1,
it looks good to me (with the caveat that it isn't tested):

Reviewed-by: Miguel Ojeda <ojeda@kernel.org>

Cheers,
Miguel

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Segher Boessenkool @ 2020-10-23 18:27 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, 'Greg KH', Nick Desaulniers,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201023175857.GA3576660@ZenIV.linux.org.uk>

On Fri, Oct 23, 2020 at 06:58:57PM +0100, Al Viro wrote:
> On Fri, Oct 23, 2020 at 03:09:30PM +0200, David Hildenbrand wrote:
> 
> > Now, I am not a compiler expert, but as I already cited, at least on
> > x86-64 clang expects that the high bits were cleared by the caller - in
> > contrast to gcc. I suspect it's the same on arm64, but again, I am no
> > compiler expert.
> > 
> > If what I said and cited for x86-64 is correct, if the function expects
> > an "unsigned int", it will happily use 64bit operations without further
> > checks where valid when assuming high bits are zero. That's why even
> > converting everything to "unsigned int" as proposed by me won't work on
> > clang - it assumes high bits are zero (as indicated by Nick).
> > 
> > As I am neither a compiler expert (did I mention that already? ;) ) nor
> > an arm64 expert, I can't tell if this is a compiler BUG or not.
> 
> On arm64 when callee expects a 32bit argument, the caller is *not* responsible
> for clearing the upper half of 64bit register used to pass the value - it only
> needs to store the actual value into the lower half.  The callee must consider
> the contents of the upper half of that register as undefined.  See AAPCS64 (e.g.
> https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#parameter-passing-rules
> ); AFAICS, the relevant bit is
> 	"Unlike in the 32-bit AAPCS, named integral values must be narrowed by
> the callee rather than the caller."

Or the formal rule:

C.9 	If the argument is an Integral or Pointer Type, the size of the
	argument is less than or equal to 8 bytes and the NGRN is less
	than 8, the argument is copied to the least significant bits in
	x[NGRN]. The NGRN is incremented by one. The argument has now
	been allocated.


Segher
