LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 8/9] extable: Use is_kernel_text() helper
From: Kefeng Wang @ 2021-09-26  7:20 UTC (permalink / raw)
  To: arnd, linux-arch, linux-kernel, linuxppc-dev, rostedt, mingo,
	davem, ast, ryabinin.a.a, akpm
  Cc: bpf, paulus, Kefeng Wang
In-Reply-To: <20210926072048.190336-1-wangkefeng.wang@huawei.com>

The core_kernel_text() should check the gate area, as it is part
of kernel text range.

Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 kernel/extable.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/extable.c b/kernel/extable.c
index 98ca627ac5ef..0ba383d850ff 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -64,8 +64,7 @@ const struct exception_table_entry *search_exception_tables(unsigned long addr)
 
 int notrace core_kernel_text(unsigned long addr)
 {
-	if (addr >= (unsigned long)_stext &&
-	    addr < (unsigned long)_etext)
+	if (is_kernel_text(addr))
 		return 1;
 
 	if (system_state < SYSTEM_RUNNING &&
-- 
2.26.2


^ permalink raw reply related

* [PATCH v3 7/9] mm: kasan: Use is_kernel() helper
From: Kefeng Wang @ 2021-09-26  7:20 UTC (permalink / raw)
  To: arnd, linux-arch, linux-kernel, linuxppc-dev, rostedt, mingo,
	davem, ast, ryabinin.a.a, akpm
  Cc: Kefeng Wang, Alexander Potapenko, paulus, bpf, Dmitry Vyukov,
	Andrey Konovalov
In-Reply-To: <20210926072048.190336-1-wangkefeng.wang@huawei.com>

Directly use is_kernel() helper in kernel_or_module_addr().

Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 mm/kasan/report.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 3239fd8f8747..1c955e1c98d5 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -226,7 +226,7 @@ static void describe_object(struct kmem_cache *cache, void *object,
 
 static inline bool kernel_or_module_addr(const void *addr)
 {
-	if (addr >= (void *)_stext && addr < (void *)_end)
+	if (is_kernel((unsigned long)addr))
 		return true;
 	if (is_module_address((unsigned long)addr))
 		return true;
-- 
2.26.2


^ permalink raw reply related

* [PATCH v3 0/9] sections: Unify kernel sections range check and use
From: Kefeng Wang @ 2021-09-26  7:20 UTC (permalink / raw)
  To: arnd, linux-arch, linux-kernel, linuxppc-dev, rostedt, mingo,
	davem, ast, ryabinin.a.a, akpm
  Cc: bpf, paulus, Kefeng Wang

There are three head files(kallsyms.h, kernel.h and sections.h) which
include the kernel sections range check, let's make some cleanup and
unify them.

1. cleanup arch specific text/data check and fix address boundary check
   in kallsyms.h
2. make all the basic/core kernel range check function into sections.h
3. update all the callers, and use the helper in sections.h to simplify
   the code

After this series, we have 5 APIs about kernel sections range check in
sections.h

 * is_kernel_rodata()		--- already in sections.h
 * is_kernel_core_data()	--- come from core_kernel_data() in kernel.h
 * is_kernel_inittext()		--- come from kernel.h and kallsyms.h
 * __is_kernel_text()		--- add new internal helper
 * __is_kernel()		--- add new internal helper

Note: For the last two helpers, people should not use directly, consider to
      use corresponding function in kallsyms.h.

v3:
- Add Steven's RB to patch2
- Introduce two internal helper, then use is_kernel_text() in core_kernel_text()
  and is_kernel() in kernel_or_module_addr() suggested by Steven

v2:
https://lore.kernel.org/linux-arch/20210728081320.20394-1-wangkefeng.wang@huawei.com/
- add ACK/RW to patch2, and drop inappropriate fix tag
- keep 'core' to check kernel data, suggestted by Steven Rostedt
  <rostedt@goodmis.org>, rename is_kernel_data() to is_kernel_core_data()
- drop patch8 which is merged
- drop patch9 which is resend independently

v1:
https://lore.kernel.org/linux-arch/20210626073439.150586-1-wangkefeng.wang@huawei.com

Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arch@vger.kernel.org 
Cc: bpf@vger.kernel.org 

Kefeng Wang (9):
  kallsyms: Remove arch specific text and data check
  kallsyms: Fix address-checks for kernel related range
  sections: Move and rename core_kernel_data() to is_kernel_core_data()
  sections: Move is_kernel_inittext() into sections.h
  x86: mm: Rename __is_kernel_text() to is_x86_32_kernel_text()
  sections: Provide internal __is_kernel() and __is_kernel_text() helper
  mm: kasan: Use is_kernel() helper
  extable: Use is_kernel_text() helper
  powerpc/mm: Use is_kernel_text() and is_kernel_inittext() helper

 arch/powerpc/mm/pgtable_32.c   |  7 +---
 arch/x86/kernel/unwind_orc.c   |  2 +-
 arch/x86/mm/init_32.c          | 14 +++----
 include/asm-generic/sections.h | 75 ++++++++++++++++++++++++++--------
 include/linux/kallsyms.h       | 13 +-----
 include/linux/kernel.h         |  2 -
 kernel/extable.c               | 33 ++-------------
 kernel/locking/lockdep.c       |  3 --
 kernel/trace/ftrace.c          |  2 +-
 mm/kasan/report.c              |  2 +-
 net/sysctl_net.c               |  2 +-
 11 files changed, 75 insertions(+), 80 deletions(-)

-- 
2.26.2


^ permalink raw reply

* [PATCH v3 2/9] kallsyms: Fix address-checks for kernel related range
From: Kefeng Wang @ 2021-09-26  7:20 UTC (permalink / raw)
  To: arnd, linux-arch, linux-kernel, linuxppc-dev, rostedt, mingo,
	davem, ast, ryabinin.a.a, akpm
  Cc: Petr Mladek, Kefeng Wang, Sergey Senozhatsky, Sergey Senozhatsky,
	paulus, bpf
In-Reply-To: <20210926072048.190336-1-wangkefeng.wang@huawei.com>

The is_kernel_inittext/is_kernel_text/is_kernel function should not
include the end address(the labels _einittext, _etext and _end) when
check the address range, the issue exists since Linux v2.6.12.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 include/linux/kallsyms.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
index 2a241e3f063f..b016c62f30a6 100644
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -27,21 +27,21 @@ struct module;
 static inline int is_kernel_inittext(unsigned long addr)
 {
 	if (addr >= (unsigned long)_sinittext
-	    && addr <= (unsigned long)_einittext)
+	    && addr < (unsigned long)_einittext)
 		return 1;
 	return 0;
 }
 
 static inline int is_kernel_text(unsigned long addr)
 {
-	if ((addr >= (unsigned long)_stext && addr <= (unsigned long)_etext))
+	if ((addr >= (unsigned long)_stext && addr < (unsigned long)_etext))
 		return 1;
 	return in_gate_area_no_mm(addr);
 }
 
 static inline int is_kernel(unsigned long addr)
 {
-	if (addr >= (unsigned long)_stext && addr <= (unsigned long)_end)
+	if (addr >= (unsigned long)_stext && addr < (unsigned long)_end)
 		return 1;
 	return in_gate_area_no_mm(addr);
 }
-- 
2.26.2


^ permalink raw reply related

* [PATCH v3 1/9] kallsyms: Remove arch specific text and data check
From: Kefeng Wang @ 2021-09-26  7:20 UTC (permalink / raw)
  To: arnd, linux-arch, linux-kernel, linuxppc-dev, rostedt, mingo,
	davem, ast, ryabinin.a.a, akpm
  Cc: bpf, paulus, Kefeng Wang
In-Reply-To: <20210926072048.190336-1-wangkefeng.wang@huawei.com>

After commit 4ba66a976072 ("arch: remove blackfin port"),
no need arch-specific text/data check.

Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 include/asm-generic/sections.h | 16 ----------------
 include/linux/kallsyms.h       |  3 +--
 kernel/locking/lockdep.c       |  3 ---
 3 files changed, 1 insertion(+), 21 deletions(-)

diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
index d16302d3eb59..817309e289db 100644
--- a/include/asm-generic/sections.h
+++ b/include/asm-generic/sections.h
@@ -64,22 +64,6 @@ extern __visible const void __nosave_begin, __nosave_end;
 #define dereference_kernel_function_descriptor(p) ((void *)(p))
 #endif
 
-/* random extra sections (if any).  Override
- * in asm/sections.h */
-#ifndef arch_is_kernel_text
-static inline int arch_is_kernel_text(unsigned long addr)
-{
-	return 0;
-}
-#endif
-
-#ifndef arch_is_kernel_data
-static inline int arch_is_kernel_data(unsigned long addr)
-{
-	return 0;
-}
-#endif
-
 /*
  * Check if an address is part of freed initmem. This is needed on architectures
  * with virt == phys kernel mapping, for code that wants to check if an address
diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
index 6851c2313cad..2a241e3f063f 100644
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -34,8 +34,7 @@ static inline int is_kernel_inittext(unsigned long addr)
 
 static inline int is_kernel_text(unsigned long addr)
 {
-	if ((addr >= (unsigned long)_stext && addr <= (unsigned long)_etext) ||
-	    arch_is_kernel_text(addr))
+	if ((addr >= (unsigned long)_stext && addr <= (unsigned long)_etext))
 		return 1;
 	return in_gate_area_no_mm(addr);
 }
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 7096384dc60f..dcdbcee391cd 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -803,9 +803,6 @@ static int static_obj(const void *obj)
 	if ((addr >= start) && (addr < end))
 		return 1;
 
-	if (arch_is_kernel_data(addr))
-		return 1;
-
 	/*
 	 * in-kernel percpu var?
 	 */
-- 
2.26.2


^ permalink raw reply related

* [PATCH v3 5/9] x86: mm: Rename __is_kernel_text() to is_x86_32_kernel_text()
From: Kefeng Wang @ 2021-09-26  7:20 UTC (permalink / raw)
  To: arnd, linux-arch, linux-kernel, linuxppc-dev, rostedt, mingo,
	davem, ast, ryabinin.a.a, akpm
  Cc: Kefeng Wang, x86, Borislav Petkov, paulus, bpf
In-Reply-To: <20210926072048.190336-1-wangkefeng.wang@huawei.com>

Commit b56cd05c55a1 ("x86/mm: Rename is_kernel_text to __is_kernel_text"),
add '__' prefix not to get in conflict with existing is_kernel_text() in
<linux/kallsyms.h>.

We will add __is_kernel_text() for the basic kernel text range check in the
next patch, so use private is_x86_32_kernel_text() naming for x86 special
check.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 arch/x86/mm/init_32.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index bd90b8fe81e4..523743ee9dea 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -238,11 +238,7 @@ page_table_range_init(unsigned long start, unsigned long end, pgd_t *pgd_base)
 	}
 }
 
-/*
- * The <linux/kallsyms.h> already defines is_kernel_text,
- * using '__' prefix not to get in conflict.
- */
-static inline int __is_kernel_text(unsigned long addr)
+static inline int is_x86_32_kernel_text(unsigned long addr)
 {
 	if (addr >= (unsigned long)_text && addr <= (unsigned long)__init_end)
 		return 1;
@@ -333,8 +329,8 @@ kernel_physical_mapping_init(unsigned long start,
 				addr2 = (pfn + PTRS_PER_PTE-1) * PAGE_SIZE +
 					PAGE_OFFSET + PAGE_SIZE-1;
 
-				if (__is_kernel_text(addr) ||
-				    __is_kernel_text(addr2))
+				if (is_x86_32_kernel_text(addr) ||
+				    is_x86_32_kernel_text(addr2))
 					prot = PAGE_KERNEL_LARGE_EXEC;
 
 				pages_2m++;
@@ -359,7 +355,7 @@ kernel_physical_mapping_init(unsigned long start,
 				 */
 				pgprot_t init_prot = __pgprot(PTE_IDENT_ATTR);
 
-				if (__is_kernel_text(addr))
+				if (is_x86_32_kernel_text(addr))
 					prot = PAGE_KERNEL_EXEC;
 
 				pages_4k++;
@@ -820,7 +816,7 @@ static void mark_nxdata_nx(void)
 	 */
 	unsigned long start = PFN_ALIGN(_etext);
 	/*
-	 * This comes from __is_kernel_text upper limit. Also HPAGE where used:
+	 * This comes from is_x86_32_kernel_text upper limit. Also HPAGE where used:
 	 */
 	unsigned long size = (((unsigned long)__init_end + HPAGE_SIZE) & HPAGE_MASK) - start;
 
-- 
2.26.2


^ permalink raw reply related

* [PATCH v2] ibmveth: Use dma_alloc_coherent() instead of kmalloc/dma_map_single()
From: Cai Huoqing @ 2021-09-26  6:52 UTC (permalink / raw)
  To: caihuoqing
  Cc: Cristobal Forno, linux-kernel, netdev, Paul Mackerras,
	Jakub Kicinski, linuxppc-dev, David S. Miller

Replacing kmalloc/kfree/dma_map_single/dma_unmap_single()
with dma_alloc_coherent/dma_free_coherent() helps to reduce
code size, and simplify the code, and coherent DMA will not
clear the cache every time.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
---
v1->v2: Remove extra change in Kconfig

 drivers/net/ethernet/ibm/ibmveth.c | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index 3d9b4f99d357..3aedb680adb8 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -605,17 +605,13 @@ static int ibmveth_open(struct net_device *netdev)
 	}
 
 	rc = -ENOMEM;
-	adapter->bounce_buffer =
-	    kmalloc(netdev->mtu + IBMVETH_BUFF_OH, GFP_KERNEL);
-	if (!adapter->bounce_buffer)
-		goto out_free_irq;
 
-	adapter->bounce_buffer_dma =
-	    dma_map_single(&adapter->vdev->dev, adapter->bounce_buffer,
-			   netdev->mtu + IBMVETH_BUFF_OH, DMA_BIDIRECTIONAL);
-	if (dma_mapping_error(dev, adapter->bounce_buffer_dma)) {
-		netdev_err(netdev, "unable to map bounce buffer\n");
-		goto out_free_bounce_buffer;
+	adapter->bounce_buffer = dma_alloc_coherent(&adapter->vdev->dev,
+						    netdev->mtu + IBMVETH_BUFF_OH,
+						    &adapter->bounce_buffer_dma, GFP_KERNEL);
+	if (!adapter->bounce_buffer) {
+		netdev_err(netdev, "unable to alloc bounce buffer\n");
+		goto out_free_irq;
 	}
 
 	netdev_dbg(netdev, "initial replenish cycle\n");
@@ -627,8 +623,6 @@ static int ibmveth_open(struct net_device *netdev)
 
 	return 0;
 
-out_free_bounce_buffer:
-	kfree(adapter->bounce_buffer);
 out_free_irq:
 	free_irq(netdev->irq, netdev);
 out_free_buffer_pools:
@@ -702,10 +696,9 @@ static int ibmveth_close(struct net_device *netdev)
 			ibmveth_free_buffer_pool(adapter,
 						 &adapter->rx_buff_pool[i]);
 
-	dma_unmap_single(&adapter->vdev->dev, adapter->bounce_buffer_dma,
-			 adapter->netdev->mtu + IBMVETH_BUFF_OH,
-			 DMA_BIDIRECTIONAL);
-	kfree(adapter->bounce_buffer);
+	dma_free_coherent(&adapter->vdev->dev,
+			  adapter->netdev->mtu + IBMVETH_BUFF_OH,
+			  adapter->bounce_buffer, adapter->bounce_buffer_dma);
 
 	netdev_dbg(netdev, "close complete\n");
 
-- 
2.25.1


^ permalink raw reply related

* [PATCH] ibmveth: Use dma_alloc_coherent() instead of kmalloc/dma_map_single()
From: Cai Huoqing @ 2021-09-26  6:11 UTC (permalink / raw)
  To: caihuoqing
  Cc: Cristobal Forno, linux-kernel, netdev, Paul Mackerras,
	Jakub Kicinski, linuxppc-dev, David S. Miller

Replacing kmalloc/kfree/dma_map_single/dma_unmap_single()
with dma_alloc_coherent/dma_free_coherent() helps to reduce
code size, and simplify the code, and coherent DMA will not
clear the cache every time.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
---
 drivers/net/ethernet/ibm/Kconfig   |  4 ++--
 drivers/net/ethernet/ibm/ibmveth.c | 25 +++++++++----------------
 2 files changed, 11 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/ibm/Kconfig b/drivers/net/ethernet/ibm/Kconfig
index c0c112d95b89..926c0642784d 100644
--- a/drivers/net/ethernet/ibm/Kconfig
+++ b/drivers/net/ethernet/ibm/Kconfig
@@ -19,7 +19,7 @@ if NET_VENDOR_IBM
 
 config IBMVETH
 	tristate "IBM LAN Virtual Ethernet support"
-	depends on PPC_PSERIES
+	depends on PPC_PSERIES || COMPILE_TEST
 	help
 	  This driver supports virtual ethernet adapters on newer IBM iSeries
 	  and pSeries systems.
@@ -40,7 +40,7 @@ config EHEA
 
 config IBMVNIC
 	tristate "IBM Virtual NIC support"
-	depends on PPC_PSERIES
+	depends on PPC_PSERIES || COMPILE_TEST
 	help
 	  This driver supports Virtual NIC adapters on IBM i and IBM System p
 	  systems.
diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index 3d9b4f99d357..3aedb680adb8 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -605,17 +605,13 @@ static int ibmveth_open(struct net_device *netdev)
 	}
 
 	rc = -ENOMEM;
-	adapter->bounce_buffer =
-	    kmalloc(netdev->mtu + IBMVETH_BUFF_OH, GFP_KERNEL);
-	if (!adapter->bounce_buffer)
-		goto out_free_irq;
 
-	adapter->bounce_buffer_dma =
-	    dma_map_single(&adapter->vdev->dev, adapter->bounce_buffer,
-			   netdev->mtu + IBMVETH_BUFF_OH, DMA_BIDIRECTIONAL);
-	if (dma_mapping_error(dev, adapter->bounce_buffer_dma)) {
-		netdev_err(netdev, "unable to map bounce buffer\n");
-		goto out_free_bounce_buffer;
+	adapter->bounce_buffer = dma_alloc_coherent(&adapter->vdev->dev,
+						    netdev->mtu + IBMVETH_BUFF_OH,
+						    &adapter->bounce_buffer_dma, GFP_KERNEL);
+	if (!adapter->bounce_buffer) {
+		netdev_err(netdev, "unable to alloc bounce buffer\n");
+		goto out_free_irq;
 	}
 
 	netdev_dbg(netdev, "initial replenish cycle\n");
@@ -627,8 +623,6 @@ static int ibmveth_open(struct net_device *netdev)
 
 	return 0;
 
-out_free_bounce_buffer:
-	kfree(adapter->bounce_buffer);
 out_free_irq:
 	free_irq(netdev->irq, netdev);
 out_free_buffer_pools:
@@ -702,10 +696,9 @@ static int ibmveth_close(struct net_device *netdev)
 			ibmveth_free_buffer_pool(adapter,
 						 &adapter->rx_buff_pool[i]);
 
-	dma_unmap_single(&adapter->vdev->dev, adapter->bounce_buffer_dma,
-			 adapter->netdev->mtu + IBMVETH_BUFF_OH,
-			 DMA_BIDIRECTIONAL);
-	kfree(adapter->bounce_buffer);
+	dma_free_coherent(&adapter->vdev->dev,
+			  adapter->netdev->mtu + IBMVETH_BUFF_OH,
+			  adapter->bounce_buffer, adapter->bounce_buffer_dma);
 
 	netdev_dbg(netdev, "close complete\n");
 
-- 
2.25.1


^ permalink raw reply related

* [PATCH] powerpc:85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n
From: Xiaoming Ni @ 2021-09-26  2:51 UTC (permalink / raw)
  To: linux-kernel, oss, mpe, benh, paulus, paul.gortmaker,
	Yuantian.Tang, chenhui.zhao, linuxppc-dev
  Cc: wangle6, nixiaoming, wangxiongfeng2

When CONFIG_SMP=y, timebase synchronization is required when the second
 kernel is started.
	arch/powerpc/kernel/smp.c:
	int __cpu_up(unsigned int cpu, struct task_struct *tidle)
	{
		...
		if (smp_ops->give_timebase)
			smp_ops->give_timebase();
		...
	}

	void start_secondary(void *unused)
	{
		...
		if (smp_ops->take_timebase)
			smp_ops->take_timebase();
		...
	}

When CONFIG_HOTPLUG_CPU=n and CONFIG_KEXEC_CORE=n,
 smp_85xx_ops.give_timebase is NULL,
 smp_85xx_ops.take_timebase is NULL,
As a result, the timebase is not synchronized.

Timebase  synchronization does not depend on CONFIG_HOTPLUG_CPU.

Fixes: 56f1ba280719 ("powerpc/mpc85xx: refactor the PM operations")
Cc: stable@vger.kernel.org #v4.6
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
---
 arch/powerpc/platforms/85xx/Makefile         | 2 +-
 arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c | 4 ++++
 arch/powerpc/platforms/85xx/smp.c            | 9 ++++-----
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/85xx/Makefile b/arch/powerpc/platforms/85xx/Makefile
index 60e4e97a929d..71ce1f6b6966 100644
--- a/arch/powerpc/platforms/85xx/Makefile
+++ b/arch/powerpc/platforms/85xx/Makefile
@@ -3,7 +3,7 @@
 # Makefile for the PowerPC 85xx linux kernel.
 #
 obj-$(CONFIG_SMP) += smp.o
-obj-$(CONFIG_FSL_PMC)		  += mpc85xx_pm_ops.o
+obj-$(CONFIG_SMP) += mpc85xx_pm_ops.o
 
 obj-y += common.o
 
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c b/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c
index 7c0133f558d0..a5656b3e9701 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c
@@ -17,6 +17,7 @@
 
 static struct ccsr_guts __iomem *guts;
 
+#ifdef CONFIG_FSL_PMC
 static void mpc85xx_irq_mask(int cpu)
 {
 
@@ -49,6 +50,7 @@ static void mpc85xx_cpu_up_prepare(int cpu)
 {
 
 }
+#endif
 
 static void mpc85xx_freeze_time_base(bool freeze)
 {
@@ -76,10 +78,12 @@ static const struct of_device_id mpc85xx_smp_guts_ids[] = {
 
 static const struct fsl_pm_ops mpc85xx_pm_ops = {
 	.freeze_time_base = mpc85xx_freeze_time_base,
+#ifdef CONFIG_FSL_PMC
 	.irq_mask = mpc85xx_irq_mask,
 	.irq_unmask = mpc85xx_irq_unmask,
 	.cpu_die = mpc85xx_cpu_die,
 	.cpu_up_prepare = mpc85xx_cpu_up_prepare,
+#endif
 };
 
 int __init mpc85xx_setup_pmc(void)
diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
index c6df294054fe..349298cd9671 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -40,7 +40,6 @@ struct epapr_spin_table {
 	u32	pir;
 };
 
-#ifdef CONFIG_HOTPLUG_CPU
 static u64 timebase;
 static int tb_req;
 static int tb_valid;
@@ -112,6 +111,7 @@ static void mpc85xx_take_timebase(void)
 	local_irq_restore(flags);
 }
 
+#ifdef CONFIG_HOTPLUG_CPU
 static void smp_85xx_cpu_offline_self(void)
 {
 	unsigned int cpu = smp_processor_id();
@@ -499,17 +499,16 @@ void __init mpc85xx_smp_init(void)
 #ifdef CONFIG_FSL_CORENET_RCPM
 	fsl_rcpm_init();
 #endif
-
-#ifdef CONFIG_FSL_PMC
-	mpc85xx_setup_pmc();
 #endif
+	mpc85xx_setup_pmc();
 	if (qoriq_pm_ops) {
 		smp_85xx_ops.give_timebase = mpc85xx_give_timebase;
 		smp_85xx_ops.take_timebase = mpc85xx_take_timebase;
+#ifdef CONFIG_HOTPLUG_CPU
 		smp_85xx_ops.cpu_offline_self = smp_85xx_cpu_offline_self;
 		smp_85xx_ops.cpu_die = qoriq_cpu_kill;
-	}
 #endif
+	}
 	smp_ops = &smp_85xx_ops;
 
 #ifdef CONFIG_KEXEC_CORE
-- 
2.27.0


^ permalink raw reply related

* [PATCH] Revert "ibmvnic: check failover_pending in login response"
From: Desnes A. Nunes do Rosario @ 2021-09-25 15:14 UTC (permalink / raw)
  To: netdev; +Cc: drt, tlfalcon, sukadev, linuxppc-dev, Desnes A. Nunes do Rosario

This reverts commit d437f5aa23aa2b7bd07cd44b839d7546cc17166f.

Code has been duplicated through commit <273c29e944bd> "ibmvnic: check
failover_pending in login response"

Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index a4579b340120..6aa6ff89a765 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -4708,14 +4708,6 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq,
 		return 0;
 	}
 
-	if (adapter->failover_pending) {
-		adapter->init_done_rc = -EAGAIN;
-		netdev_dbg(netdev, "Failover pending, ignoring login response\n");
-		complete(&adapter->init_done);
-		/* login response buffer will be released on reset */
-		return 0;
-	}
-
 	netdev->mtu = adapter->req_mtu - ETH_HLEN;
 
 	netdev_dbg(adapter->netdev, "Login Response Buffer:\n");
-- 
2.31.1


^ permalink raw reply related

* [PATCH v2 3/3] powerpc: Remove init_mem_is_free
From: Christophe Leroy @ 2021-09-25  8:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Michael Neuling, linuxppc-dev, linux-kernel
In-Reply-To: <a7a35416ddab4b42617e2c81fe5816c7f8d0c63f.1632558672.git.christophe.leroy@csgroup.eu>

init_mem_is_free is not used anymore. Remove it.

Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/setup.h | 1 -
 arch/powerpc/mm/mem.c            | 2 --
 2 files changed, 3 deletions(-)

diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h
index 6c1a7d217d1a..426a2d8d028f 100644
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -9,7 +9,6 @@ extern void ppc_printk_progress(char *s, unsigned short hex);
 
 extern unsigned int rtas_data;
 extern unsigned long long memory_limit;
-extern bool init_mem_is_free;
 extern void *zalloc_maybe_bootmem(size_t size, gfp_t mask);
 
 struct device_node;
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index c3c4e31462ec..5b1eae6c0356 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -26,7 +26,6 @@
 #include <mm/mmu_decl.h>
 
 unsigned long long memory_limit;
-bool init_mem_is_free;
 
 unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)] __page_aligned_bss;
 EXPORT_SYMBOL(empty_zero_page);
@@ -312,7 +311,6 @@ void free_initmem(void)
 {
 	ppc_md.progress = ppc_printk_progress;
 	mark_initmem_nx();
-	init_mem_is_free = true;
 	free_initmem_default(POISON_FREE_INITMEM);
 }
 
-- 
2.31.1


^ permalink raw reply related

* [PATCH v2 2/3] powerpc: Stop using init_mem_is_free
From: Christophe Leroy @ 2021-09-25  8:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Michael Neuling, linuxppc-dev, linux-kernel
In-Reply-To: <a7a35416ddab4b42617e2c81fe5816c7f8d0c63f.1632558672.git.christophe.leroy@csgroup.eu>

Generic parts of the kernel for instance core_kernel_text()
use 'system_state' to check whether init memory has been
freed.

Do the same and stop using init_mem_is_free.

Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch_is_kernel_initmem_freed() is likely going to go away via the mm tree, in that case this patch will void.

v2: Also remove declaration of init_mem_is_free.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/sections.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/sections.h b/arch/powerpc/include/asm/sections.h
index 6e4af4492a14..454b8434bfc4 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -10,11 +10,9 @@
 
 #include <asm-generic/sections.h>
 
-extern bool init_mem_is_free;
-
 static inline int arch_is_kernel_initmem_freed(unsigned long addr)
 {
-	if (!init_mem_is_free)
+	if (system_state < SYSTEM_RUNNING)
 		return 0;
 
 	return addr >= (unsigned long)__init_begin &&
-- 
2.31.1


^ permalink raw reply related

* [PATCH v2 1/3] powerpc/code-patching: Improve verification of patchability
From: Christophe Leroy @ 2021-09-25  8:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Michael Neuling, linuxppc-dev, linux-kernel

Today, patch_instruction() assumes that is called exclusively on valid
addresses, and only checks that it is not called on an init address
after init section has been freed.

Improve verification by calling kernel_text_address() instead.

kernel_text_address() already includes a verification of
initmem release.

Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/lib/code-patching.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index f9a3019e37b4..20ec6648aacc 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -190,10 +190,9 @@ static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
 int patch_instruction(u32 *addr, struct ppc_inst instr)
 {
 	/* Make sure we aren't patching a freed init section */
-	if (init_mem_is_free && init_section_contains(addr, 4)) {
-		pr_debug("Skipping init section patching addr: 0x%px\n", addr);
+	if (!kernel_text_address((unsigned long)addr))
 		return 0;
-	}
+
 	return do_patch_instruction(addr, instr);
 }
 NOKPROBE_SYMBOL(patch_instruction);
-- 
2.31.1


^ permalink raw reply related

* Re: ppc32 doesn't boot when linkaddr=0x00900000
From: Christophe Leroy @ 2021-09-25 12:24 UTC (permalink / raw)
  To: cp, linuxppc-dev; +Cc: paulus, torvalds
In-Reply-To: <CA+QBN9AFNSf3+U4iMhwZx7c69MLk-BtSbVODBEA97ObYWRczbQ@mail.gmail.com>

Hi,

Le 25/09/2021 à 11:49, cp a écrit :
> hi
> I am new to this list. Hope this is the right place to ask.
> 
> I am working with a PPC405GP board, and as far as I understand, the
> support for ppc40x platforms like Acadia and Walnut were dropped with
> kernel 5.8.0, so this seems like a pretty straightforward question,
> but extensive experiments from kernel 4.11 to kernel 5.7.19 haven't
> shown a really clear, up-to-date answer.
> 
> In k4.11 .. k5.7.19, when the kernel size is bigger than 8 MB, the
> final kernel doesn't boot but rather arch/powerpc/boot/main.c dies
> before the first message from the kernel shows up.
> 
> Why?

That's the fourth time I receive your email. Please give us time to 
analyse it, there is no point in sending it again and again unless you 
have additional information to provide.

Christophe

^ permalink raw reply

* ppc32 doesn't boot when linkaddr=0x00900000
From: cp @ 2021-09-25  9:49 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: torvalds, paulus

hi
I am new to this list. Hope this is the right place to ask.

I am working with a PPC405GP board, and as far as I understand, the
support for ppc40x platforms like Acadia and Walnut were dropped with
kernel 5.8.0, so this seems like a pretty straightforward question,
but extensive experiments from kernel 4.11 to kernel 5.7.19 haven't
shown a really clear, up-to-date answer.

In k4.11 .. k5.7.19, when the kernel size is bigger than 8 MB, the
final kernel doesn't boot but rather arch/powerpc/boot/main.c dies
before the first message from the kernel shows up.

Why?

Digging deeper I see the relation between the kernel size and link_addr

        # Round the size to next higher MB limit
        round_size=$(((strip_size + 0xfffff) & 0xfff00000))

        round_size=0x$(printf "%x" $round_size)
        link_addr=$(printf "%d" $link_address)

and this is where link_addr is involved

        text_start="-Ttext $link_address"

My kernels are compiled for cuboot, and the code that invokes "kentry"
is entirely located in arch/powerpc/boot/main.c

I instrumned that module, and this is what I see on the condole

The following is the same kernel, compiled with the same .config, but
with two link_addr values

A) with link_addr=0x0080.0000
image loaded from 0x00800000
SP=0x03eb1b80
kernel_size = 7411084 bytes
copying 256 bytes from kernel-image at 0x0080f000 to elfheader
elf_info.loadsize = 0x00700e68
elf_info.memsize  = 0x0074234c
allocating 7611212 bytes for the new kernel
copying ...
from = 0x0081f000
to = 0x00000000
size = 7343720
flush_cache, 32Mbyte flushed
cmdline: uboot bootargs overridden
cmdline=[console=ttyS0,115200 root=/dev/sda2 rootfstype=ext2 rw
init=/sbin/init ]
Finalizing device tree... flat tree at 0xf23b80
ft_addr=0xf23b80
my tp1: success
kernel booting ....
(it boots)

B) with link_addr=0x0080.0000
image loaded from 0x00900000
SP=0x03eb1b80
kernel_size = 7411084
copying 256 bytes from kernel-image at 0x0090f000 to elfheader
elf_info.loadsize = 0x00700e68
elf_info.memsize  = 0x0074234c
allocating 7611212 bytes for the new kernel
copying ...
from = 0x0091f000
to = 0x00000000
size = 7343720
flush_cache, 32Mbyte flushed
cmdline: uboot bootargs overridden
cmdline=[console=ttyS0,115200 root=/dev/sda2 rootfstype=ext2 rw
init=/sbin/init ]
Finalizing device tree... flat tree at 0x1023b80
ft_addr=0x1023b80
my tp2: success
my tp3: success
invalidate_cache 0x00000000+0x02000000
my tp4: (point of no return)
calling kentry()...
kernel booting ....
(it dies at this point, but without a debugger it's like watching
something fall into a black hole)

Any ideas?
I am lost ...

Carlo

^ permalink raw reply

* Re: [PATCH] powerpc: don't select KFENCE on platform PPC_FSL_BOOK3E
From: Liu Shixin @ 2021-09-25  3:10 UTC (permalink / raw)
  To: Christophe Leroy, Marco Elver, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <f8d12860-56d7-5697-7cba-3cac95bb0a1c@csgroup.eu>



On 2021/9/24 14:41, Christophe Leroy wrote:
>
>
> Le 24/09/2021 à 08:39, Liu Shixin a écrit :
>> On platform PPC_FSL_BOOK3E, all lowmem is managed by tlbcam. That means
>> we didn't really map the kfence pool with page granularity. Therefore,
>> if KFENCE is enabled, the system will hit the following panic:
>
> Could you please explain a bit more what the problem is ?
>
> KFENCE has been implemented with the same logic as DEBUG_PAGEALLOC.
>
> DEBUG_PAGEALLOC is enabled on FSL_BOOK3E.
>
> In MMU_setup(), __map_without_ltlbs is set to 1 when KFENCE is enabled.
>
> __map_without_ltlbs should disable the use of tlbcam.
>
>
> So what's wrong really ?
>
> Does DEBUG_PAGEALLOC work on FSL_BOOK3E ?
>
> Thanks
> Christophe
>
hi Christophe,

The phenomenon is that kernel panic in the kfence_protect_page function because
__kfence_pool is not mapped with page granularity.

The problem is that in the mapin_ram function, the return value(i.e base) of mmu_mapin_ram
is equal to top. As a result, no level-2 page table is created for [base, top]. It seems that
__map_without_ltlbs didn't diable the use of tlbcam.

I have tried to force page table for all lowmem, then this problem will go away
but the kfence_test failed, which could be explained by the fact that tlbcam is still used.

By the way, DEBUG_PAGEALLOC works well on FSL_BOOK3E without level-2 page table.

Thanks,
>>
>>      BUG: Kernel NULL pointer dereference on read at 0x00000000
>>      Faulting instruction address: 0xc01de598
>>      Oops: Kernel access of bad area, sig: 11 [#1]
>>      BE PAGE_SIZE=4K SMP NR_CPUS=4 MPC8544 DS
>>      Dumping ftrace buffer:
>>         (ftrace buffer empty)
>>      Modules linked in:
>>      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc3+ #298
>>      NIP:  c01de598 LR: c08ae9c4 CTR: 00000000
>>      REGS: c0b4bea0 TRAP: 0300   Not tainted  (5.12.0-rc3+)
>>      MSR:  00021000 <CE,ME>  CR: 24000228  XER: 20000000
>>      DEAR: 00000000 ESR: 00000000
>>      GPR00: c08ae9c4 c0b4bf60 c0ad64e0 ef720000 00021000 00000000 00000000 00000200
>>      GPR08: c0ad5000 00000000 00000000 00000004 00000000 008fbb30 00000000 00000000
>>      GPR16: 00000000 00000000 00000000 00000000 c0000000 00000000 00000000 00000000
>>      GPR24: c08ca004 c08ca004 c0b6a0e0 c0b60000 c0b58f00 c0850000 c08ca000 ef720000
>>      NIP [c01de598] kfence_protect+0x44/0x6c
>>      LR [c08ae9c4] kfence_init+0xfc/0x2a4
>>      Call Trace:
>>      [c0b4bf60] [efffe160] 0xefffe160 (unreliable)
>>      [c0b4bf70] [c08ae9c4] kfence_init+0xfc/0x2a4
>>      [c0b4bfb0] [c0894d3c] start_kernel+0x3bc/0x574
>>      [c0b4bff0] [c0000470] set_ivor+0x14c/0x188
>>      Instruction dump:
>>      7c0802a6 8109d594 546a653a 90010014 54630026 39200000 7d48502e 2c0a0000
>>      41820010 554a0026 5469b53a 7d295214 <81490000> 38831000 554a003c 91490000
>>      random: get_random_bytes called from print_oops_end_marker+0x40/0x78 with crng_init=0
>>      ---[ end trace 0000000000000000 ]---
>>
>> Signed-off-by: Liu Shixin <liushixin2@huawei.com>
>> ---
>>   arch/powerpc/Kconfig | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index d46db0bfb998..cffd57bcb5e4 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -185,7 +185,7 @@ config PPC
>>       select HAVE_ARCH_KASAN            if PPC32 && PPC_PAGE_SHIFT <= 14
>>       select HAVE_ARCH_KASAN_VMALLOC        if PPC32 && PPC_PAGE_SHIFT <= 14
>>       select HAVE_ARCH_KGDB
>> -    select HAVE_ARCH_KFENCE            if PPC32
>> +    select HAVE_ARCH_KFENCE            if PPC32 && !PPC_FSL_BOOK3E
>>       select HAVE_ARCH_MMAP_RND_BITS
>>       select HAVE_ARCH_MMAP_RND_COMPAT_BITS    if COMPAT
>>       select HAVE_ARCH_NVRAM_OPS
>>
> .
>


^ permalink raw reply

* Re: [PATCH] powerpc/paravirt: correct preempt debug splat in vcpu_is_preempted()
From: Nathan Lynch @ 2021-09-25  0:10 UTC (permalink / raw)
  To: Michael Ellerman, Srikar Dronamraju; +Cc: linuxppc-dev, npiggin
In-Reply-To: <87pmsylli8.fsf@mpe.ellerman.id.au>

Michael Ellerman <mpe@ellerman.id.au> writes:
> Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
>> * Michael Ellerman <mpe@ellerman.id.au> [2021-09-23 17:29:32]:
>>
>>> Nathan Lynch <nathanl@linux.ibm.com> writes:
>>> > Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
>>> >
>>> >> * Nathan Lynch <nathanl@linux.ibm.com> [2021-09-22 11:01:12]:
>>> >>
>>> >>> Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> ...
>>> >> Or can I understand how debug_smp_processor_id() is useful if
>>> >> __smp_processor_id() is defined as raw_smp_processor_id()?
>>> 
>>> debug_smp_processor_id() is useful on powerpc, as well as other arches,
>>> because it checks that we're in a context where the processor id won't
>>> change out from under us.
>>> 
>>> eg. something like this is unsafe:
>>> 
>>>   int counts[NR_CPUS];
>>>   int tmp, cpu;
>>>   
>>>   cpu = smp_processor_id();
>>>   tmp = counts[cpu];
>>>   				<- preempted here and migrated to another CPU
>>>   counts[cpu] = tmp + 1;
>>> 
>>
>> If lets say the above call was replaced by raw_smp_processor_id(), how would
>> it avoid the preemption / migration to another CPU?
>>
>> Replacing it with raw_smp_processor_id() may avoid, the debug splat but the
>> underlying issue would still remain as is. No?
>
> Correct.
>
> Using raw_smp_processor_id() is generally the wrong answer. For this
> example the correct fix is to disable preemption around the code, eg:
>
>    int counts[NR_CPUS];
>    int tmp, cpu;
>    
>    preempt_disable()
>
>    cpu = smp_processor_id();
>    tmp = counts[cpu];
>    counts[cpu] = tmp + 1;
>
>    preempt_enable();
>
>
> For the original issue I think it is OK to use raw_smp_processor_id(),
> because we're already doing a racy check of whether another CPU has been
> preempted by the hypervisor.
>
>         if (!is_kvm_guest()) {
>                 int first_cpu = cpu_first_thread_sibling(smp_processor_id());
>
>                 if (cpu_first_thread_sibling(cpu) == first_cpu)
>                         return false;
>         }
>
> We could disable preemption around that, eg:
>
>         if (!is_kvm_guest()) {
>                 int first_cpu;
>                 bool is_sibling;
>
>                 preempt_disable();
>                 first_cpu = cpu_first_thread_sibling(smp_processor_id());
>                 is_sibling = (cpu_first_thread_sibling(cpu) == first_cpu)
>                 preempt_enable();
>
>                 // Can be preempted here
>
>                 if (is_sibling)
>                     return false;
>         }
>
> But before we return we could be preempted, and then is_sibling is no
> longer necessarily correct. So that doesn't really gain us anything.
>
> The only way to make that result stable is to disable preemption in the
> caller, but most callers don't want to AFAICS, because they know they're
> doing a racy check to begin with.

I'll add that one way I think about this is that when I choose
smp_processor_id(), I am making a claim about the context in which it is
used, and when I use raw_smp_processor_id() I am making a different
claim.

smp_processor_id() => I am sampling the CPU index in a critical section
where the result equals the CPU that executes the entire critical
section, and I am relying on that property for the section's
correctness. This claim is verified by debug_smp_processor_id() when
DEBUG_PREEMPT=y.

raw_smp_processor_id() => I am sampling the CPU index and using the
result in a way that is safe even if it differs from the CPU(s) on which
the surrounding code actually executes.

This framing doesn't cover all situations, but I've found it to be
generally useful.

^ permalink raw reply

* [Bug 213837] "Kernel panic - not syncing: corrupted stack end detected inside scheduler" at building via distcc on a G5
From: bugzilla-daemon @ 2021-09-24 22:25 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <bug-213837-206035@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=213837

--- Comment #12 from Erhard F. (erhard_f@mailbox.org) ---
Created attachment 298963
  --> https://bugzilla.kernel.org/attachment.cgi?id=298963&action=edit
dmesg (5.15-rc2 + patch, PowerMac G5 11,2) #1

Last stack trace with CONFIG_THREAD_SHIFT=14 however did reveal a bit more
data:

[...]
stack: c0000000023effb0: 00000000 28022284 00000000 00000000  ....(.".........
stack: c0000000023effc0: 00000000 00000c00 00003fff 94c95000  ..........?...P.
stack: c0000000023effd0: 00000000 42000000 ffffffff ffffffea  ....B...........
stack: c0000000023effe0: 00000000 00000000 00000000 00000000  ................
stack: c0000000023efff0: 00000000 00000000 00000000 00000000  ................
Kernel panic - not syncing: corrupted stack end detected inside scheduler
CPU: 1 PID: 2652 Comm: cc1plus Tainted: G        W        
5.15.0-rc2-PowerMacG5+ #4
Call Trace:
[c0000000023ef7f0] [c0000000005532d8] .dump_stack_lvl+0x98/0xe0 (unreliable)
[c0000000023ef880] [c000000000069538] .panic+0x14c/0x3e8
[c0000000023ef930] [c00000000081d5a0] .__schedule+0xc0/0x874
[c0000000023efa00] [c00000000081dea0] .preempt_schedule_common+0x28/0x48
[c0000000023efa80] [c00000000081deec] .__cond_resched+0x2c/0x50
[c0000000023efaf0] [c00000000029579c] .dput+0x40/0x218
[c0000000023efba0] [c000000000285204] .path_put+0x1c/0x34
[c0000000023efc20] [c00000000027eab8] .do_readlinkat+0xdc/0x124
[c0000000023efcf0] [c00000000027f310] .__se_sys_readlink+0x20/0x30
[c0000000023efd60] [c000000000022850] .system_call_exception+0x1ac/0x1e4
[c0000000023efe10] [c00000000000b4cc] system_call_common+0xec/0x250
--- interrupt: c00 at 0x3fff95335500
NIP:  00003fff95335500 LR: 00003fff95273f6c CTR: 0000000000000000
REGS: c0000000023efe80 TRAP: 0c00   Tainted: G        W         
(5.15.0-rc2-PowerMacG5+)
MSR:  900000000200f032 <SF,HV,VEC,EE,PR,FP,ME,IR,DR,RI>  CR: 28022284  XER:
00000000
IRQMASK: 0 
GPR00: 0000000000000055 00003fffd5fb3600 00003fff95424300 00003fffd5fb4040 
GPR04: 00003fffd5fb3b20 00000000000003ff 0000000062697473 000000002bcf8581 
GPR08: ffffffffd4307a80 0000000000000000 0000000000000000 0000000000000000 
GPR12: 0000000000000000 00003fff95946430 00000000000003ff 00003fffd5fb4030 
GPR16: 00003fffd5fb3b20 0000000000000000 00003fffd5fb3700 0000000000000000 
GPR20: 000000000000002f 00003fffd5fb3b10 00003fff9593f7b8 00003fffd5fb4080 
GPR24: 0000000000000004 00003fffd5fb3710 00003fffd5fb3b20 00003fffd5fb4040 
GPR28: 00003fffd5fb4538 00003fffd5fb4084 00003fffd5fb4040 0000000000000000 
NIP [00003fff95335500] 0x3fff95335500
LR [00003fff95273f6c] 0x3fff95273f6c
--- interrupt: c00
Rebooting in 40 seconds..

And another one:
[...]
stack: c00000002cd77fb0: 00000000 28042822 00000000 00000000  ....(.("........
stack: c00000002cd77fc0: 00000000 00000c00 00003fff 9b2dd000  ..........?..-..
stack: c00000002cd77fd0: 00000000 42000000 00000000 00000000  ....B...........
stack: c00000002cd77fe0: 00000000 00000000 00000000 00000000  ................
stack: c00000002cd77ff0: 00000000 00000000 00000000 00000000  ................
Kernel panic - not syncing: corrupted stack end detected inside scheduler
CPU: 1 PID: 2713 Comm: cc1plus Tainted: G        W        
5.15.0-rc2-PowerMacG5+ #4
Call Trace:
[c00000002cd76dd0] [c0000000005532d8] .dump_stack_lvl+0x98/0xe0 (unreliable)
[c00000002cd76e60] [c000000000069538] .panic+0x14c/0x3e8
[c00000002cd76f10] [c00000000081d5a0] .__schedule+0xc0/0x874
[c00000002cd76fe0] [c00000000081dea0] .preempt_schedule_common+0x28/0x48
[c00000002cd77060] [c00000000081deec] .__cond_resched+0x2c/0x50
[c00000002cd770d0] [c000000000327848] .__ext4_handle_dirty_metadata+0x24/0x214
[c00000002cd771a0] [c000000000352a6c] .ext4_mb_mark_diskspace_used+0x3e0/0x41c
[c00000002cd77290] [c000000000355ae4] .ext4_mb_new_blocks+0x580/0xe10
[c00000002cd773b0] [c00000000033b2f0] .ext4_ind_map_blocks+0x63c/0xb28
[c00000002cd775a0] [c000000000342bf4] .ext4_map_blocks+0x37c/0x588
[c00000002cd77680] [c000000000342e64] ._ext4_get_block+0x64/0xec
[c00000002cd77730] [c0000000002c36a4] .__block_write_begin_int+0x188/0x4a4
[c00000002cd77850] [c0000000003480f8] .ext4_write_begin+0x2a8/0x3d0
[c00000002cd77970] [c0000000001c9170] .generic_perform_write+0xb8/0x1f4
[c00000002cd77a60] [c000000000333d68] .ext4_buffered_write_iter+0xb8/0x154
[c00000002cd77b00] [c000000000277a14] .new_sync_write+0x94/0xe8
[c00000002cd77c00] [c000000000278d6c] .vfs_write+0x13c/0x140
[c00000002cd77ca0] [c000000000278eb4] .ksys_write+0x78/0xc4
[c00000002cd77d60] [c000000000022850] .system_call_exception+0x1ac/0x1e4
[c00000002cd77e10] [c00000000000b4cc] system_call_common+0xec/0x250
--- interrupt: c00 at 0x3fff9b804b00
NIP:  00003fff9b804b00 LR: 00003fff9b780d04 CTR: 0000000000000000
REGS: c00000002cd77e80 TRAP: 0c00   Tainted: G        W         
(5.15.0-rc2-PowerMacG5+)
MSR:  900000000200d032 <SF,HV,VEC,EE,PR,ME,IR,DR,RI>  CR: 28042822  XER:
00000000
IRQMASK: 0 
GPR00: 0000000000000004 00003fffe21d3000 00003fff9b8f6300 0000000000000001 
GPR04: 0000000026979780 0000000000001000 0000000000000001 0000000000000036 
GPR08: 000000002697a780 0000000000000000 0000000000000000 0000000000000000 
GPR12: 0000000000000000 00003fff9be17430 00000000100f1858 0000000010063a50 
GPR16: 00000000100639f8 0000000010063968 000000012e83eb28 0000000000000000 
GPR20: ffffffffffffffff 0000000011a5b890 0000000000000001 00000000114e5160 
GPR24: 00000000114e5170 0000000000000000 0000000026979780 0000000000001000 
GPR28: 0000000000001000 00003fff9b8f0920 0000000026979780 0000000000001000 
NIP [00003fff9b804b00] 0x3fff9b804b00
LR [00003fff9b780d04] 0x3fff9b780d04
--- interrupt: c00
Rebooting in 40 seconds..

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.

^ permalink raw reply

* [Bug 213837] "Kernel panic - not syncing: corrupted stack end detected inside scheduler" at building via distcc on a G5
From: bugzilla-daemon @ 2021-09-24 22:17 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <bug-213837-206035@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=213837

--- Comment #11 from Erhard F. (erhard_f@mailbox.org) ---
Created attachment 298961
  --> https://bugzilla.kernel.org/attachment.cgi?id=298961&action=edit
System.map (5.15-rc2 + patch + CONFIG_THREAD_SHIFT=15, PowerMac G5 11,2)

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.

^ permalink raw reply

* [Bug 213837] "Kernel panic - not syncing: corrupted stack end detected inside scheduler" at building via distcc on a G5
From: bugzilla-daemon @ 2021-09-24 22:16 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <bug-213837-206035@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=213837

--- Comment #10 from Erhard F. (erhard_f@mailbox.org) ---
Created attachment 298959
  --> https://bugzilla.kernel.org/attachment.cgi?id=298959&action=edit
kernel .config (5.15-rc2 + CONFIG_THREAD_SHIFT=15, PowerMac G5 11,2)

(In reply to mpe from comment #8)
> You could also try changing CONFIG_THREAD_SHIFT to 15, that might keep
> the system running a bit longer and give us some other clues.
The stack seems just large enough with CONFIG_THREAD_SHIFT=15 to not run into
this bug. I let the G5 build stuff via distcc in zram disk for a day without an
issue. With  CONFIG_THREAD_SHIFT=14 I hit the bug within minutes.

Just for completeness I'll upload the System.map and kernel .config with
CONFIG_THREAD_SHIFT=15.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.

^ permalink raw reply

* Re: coherency issue observed after hotplug on POWER8
From: Naveen N. Rao @ 2021-09-24 17:17 UTC (permalink / raw)
  To: Thadeu Lima de Souza Cascardo, linuxppc-dev
In-Reply-To: <YUpIqytZqpohq4EM@mussarela>

Hi Cascardo,
Thanks for reporting this.


Thadeu Lima de Souza Cascardo wrote:
> Hi, there.
> 
> We have been investigating an issue we have observed on POWER8 POWERNV systems.
> When running the kernel selftests reuseport_bpf_cpu after a CPU hotplug, we see
> crashes, in different forms. [1]

Just to re-confirm: you are only seeing this on P8 powernv, and not in a 
P8 guest/LPAR? I haven't been able to reproduce this on a firestone -- 
can you share more details about your power8 machine?

Also, do you only see this with ubuntu kernels, or are you also able to 
reproduce this with the upstream tree?

> 
> I managed to get xmon on that trap, and did some debugging. [2] I tried to dump
> the BPF JIT code, and it looks different when dumped from CPU#0 and CPU#0x9f
> (the one that was hotplugged, offlined, then onlined).

Next time you reproduce this, can you try dumping the SLBs for the cpus 
(command 'u' in xmon)?

> 
> Here is my partial analysis [3]. Basically, the BPF JIT fills a page with
> invalid instructions (traps, in ppc64 case), and puts the BPF program in a
> random offset of the page. In the case of the hotplugged CPU, which was the one
> that compiled the program, the page had the expected contents (BPF program
> started at the offset used to run the program). On the other CPU (in many
> cases, CPU #0), the same memory address/page had different contents, with the
> program starting at a different offset.

From [3], I think fp->aux->jit_data can be NULL if there are subprogs.  
But, I find it interesting that you don't always see the correct 
bpf_func, as reported in comment #25. Can you also try dumping the full 
bpf_prog structure (prog/fp) from xmon?

> 
> Is this a case of a bug in the micro-architecture or the firmware when 
> doing the hotplug? Can someone chime in?

It's possible that something is going wrong when offlining the cpu. Can 
you try booting the kernel with 'powersave=off' and see if the problem 
goes away?

> 
> Notice that we can't reproduce the same issue on a POWER9 system.
> 
> Thanks.
> Cascardo.
> 
> [1] https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1927076
> [2] https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1927076/comments/29
> [3] https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1927076/comments/30
> 

- Naveen


^ permalink raw reply

* [PATCH] powerpc: Activate CONFIG_STRICT_KERNEL_RWX by default
From: Christophe Leroy @ 2021-09-24 17:13 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel, Kees Cook

CONFIG_STRICT_KERNEL_RWX should be set by default on every
architectures (See https://github.com/KSPP/linux/issues/4)

On PPC32 we have to find a compromise between performance and/or
memory wasting and selection of strict_kernel_rwx, because it implies
either smaller memory chunks or larger alignment between RO memory
and RW memory.

For instance the 8xx maps memory with 8M pages. So either the limit
between RO and RW must be 8M aligned or it falls back or 512k pages
which implies more pressure on the TLB.

book3s/32 maps memory with BATs as much as possible. BATS can have
any power-of-two size between 128k and 256M but we have only 4 to 8
BATs so the alignment must be good enough to allow efficient use of
the BATs and avoid falling back on standard page mapping which would
kill performance.

So let's go one step forward and make it the default but still allow
users to unset it when wanted.

Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index ba5b66189358..79332f51185d 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -148,6 +148,7 @@ config PPC
 	select ARCH_MIGHT_HAVE_PC_PARPORT
 	select ARCH_MIGHT_HAVE_PC_SERIO
 	select ARCH_OPTIONAL_KERNEL_RWX		if ARCH_HAS_STRICT_KERNEL_RWX
+	select ARCH_OPTIONAL_KERNEL_RWX_DEFAULT
 	select ARCH_STACKWALK
 	select ARCH_SUPPORTS_ATOMIC_RMW
 	select ARCH_SUPPORTS_DEBUG_PAGEALLOC	if PPC32 || PPC_BOOK3S_64
-- 
2.31.1


^ permalink raw reply related

* Re: [PATCH v2 1/2] serial: 8250: SERIAL_8250_FSL should not default to y when compile-testing
From: Johan Hovold @ 2021-09-24 14:10 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: linux-kernel, Greg Kroah-Hartman, linuxppc-dev, Li Yang,
	Scott Wood, linux-serial, Shawn Guo, Jiri Slaby, linux-arm-kernel
In-Reply-To: <6421f256407262afd658ffa74ec9430581528a7d.1632467477.git.geert+renesas@glider.be>

On Fri, Sep 24, 2021 at 09:12:30AM +0200, Geert Uytterhoeven wrote:
> Commit b1442c55ce8977aa ("serial: 8250: extend compile-test coverage")
> added compile-test support to the Freescale 16550 driver.  However, as
> SERIAL_8250_FSL is an invisible symbol, merely enabling compile-testing
> now enables this driver.
> 
> Fix this by dropping the COMPILE_TEST default again, but making the
> SERIAL_8250_FSL symbol visible instead.
> 
> Fixes: b1442c55ce8977aa ("serial: 8250: extend compile-test coverage")
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> v2:
>   - Split in two parts.

I realised that we can do better than this. I've prepared a patch that
preserves the old behaviour of always enabling the option on platforms
that may need it while also not enabling it by default when compile
testing.

Note that SERIAL_8250_FSL only enables a workaround for an erratum in
the Freescale UARTs in the 8250 driver (leaving the later added ACPI
support aside) and we shouldn't make it easier to disable it by mistake.

> ---
>  drivers/tty/serial/8250/Kconfig | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/tty/serial/8250/Kconfig b/drivers/tty/serial/8250/Kconfig
> index 808268edd2e82a45..0af96f3adab517f6 100644
> --- a/drivers/tty/serial/8250/Kconfig
> +++ b/drivers/tty/serial/8250/Kconfig
> @@ -361,9 +361,12 @@ config SERIAL_8250_BCM2835AUX
>  	  If unsure, say N.
>  
>  config SERIAL_8250_FSL
> -	bool
> +	bool "Freescale 16550-style UART support (8250 based driver)"
>  	depends on SERIAL_8250_CONSOLE
> -	default PPC || ARM || ARM64 || COMPILE_TEST
> +	default PPC || ARM || ARM64
> +	help
> +	  Selecting this option will add support for the 16550-style serial
> +	  port hardware found on Freescale SoCs.
>  
>  config SERIAL_8250_DW
>  	tristate "Support for Synopsys DesignWare 8250 quirks"

Johan

^ permalink raw reply

* [PATCH 1/3] mm: Make generic arch_is_kernel_initmem_freed() do what it says
From: Christophe Leroy @ 2021-09-24 13:48 UTC (permalink / raw)
  To: Andrew Morton, arnd
  Cc: linux-arch, linux-s390, linux-kernel, linux-mm, Gerald Schaefer,
	linuxppc-dev

Commit 7a5da02de8d6 ("locking/lockdep: check for freed initmem in
static_obj()") added arch_is_kernel_initmem_freed() which is supposed
to report whether an object is part of already freed init memory.

For the time being, the generic version of arch_is_kernel_initmem_freed()
always reports 'false', allthough free_initmem() is generically called
on all architectures.

Therefore, change the generic version of arch_is_kernel_initmem_freed()
to check whether free_initmem() has been called. If so, then check
if a given address falls into init memory.

In order to use function init_section_contains(), the fonction is
moved at the end of asm-generic/section.h

Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 include/asm-generic/sections.h | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
index d16302d3eb59..d1e5bb2c6b72 100644
--- a/include/asm-generic/sections.h
+++ b/include/asm-generic/sections.h
@@ -80,20 +80,6 @@ static inline int arch_is_kernel_data(unsigned long addr)
 }
 #endif
 
-/*
- * Check if an address is part of freed initmem. This is needed on architectures
- * with virt == phys kernel mapping, for code that wants to check if an address
- * is part of a static object within [_stext, _end]. After initmem is freed,
- * memory can be allocated from it, and such allocations would then have
- * addresses within the range [_stext, _end].
- */
-#ifndef arch_is_kernel_initmem_freed
-static inline int arch_is_kernel_initmem_freed(unsigned long addr)
-{
-	return 0;
-}
-#endif
-
 /**
  * memory_contains - checks if an object is contained within a memory region
  * @begin: virtual address of the beginning of the memory region
@@ -172,4 +158,21 @@ static inline bool is_kernel_rodata(unsigned long addr)
 	       addr < (unsigned long)__end_rodata;
 }
 
+/*
+ * Check if an address is part of freed initmem. This is needed on architectures
+ * with virt == phys kernel mapping, for code that wants to check if an address
+ * is part of a static object within [_stext, _end]. After initmem is freed,
+ * memory can be allocated from it, and such allocations would then have
+ * addresses within the range [_stext, _end].
+ */
+#ifndef arch_is_kernel_initmem_freed
+static inline int arch_is_kernel_initmem_freed(unsigned long addr)
+{
+	if (system_state < SYSTEM_RUNNING)
+		return 0;
+
+	return init_section_contains((void *)addr, 1);
+}
+#endif
+
 #endif /* _ASM_GENERIC_SECTIONS_H_ */
-- 
2.31.1


^ permalink raw reply related

* [PATCH 2/3] powerpc: Use generic version of arch_is_kernel_initmem_freed()
From: Christophe Leroy @ 2021-09-24 13:48 UTC (permalink / raw)
  To: Andrew Morton, arnd
  Cc: linux-arch, linux-s390, linux-kernel, linux-mm, linuxppc-dev
In-Reply-To: <0b55650058a5bf64f7d74781871a1ada2298c8b4.1632491308.git.christophe.leroy@csgroup.eu>

Generic version of arch_is_kernel_initmem_freed() now does the same
as powerpc version.

Remove the powerpc version.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/sections.h | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/arch/powerpc/include/asm/sections.h b/arch/powerpc/include/asm/sections.h
index 6e4af4492a14..79cb7a25a5fb 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -6,21 +6,8 @@
 #include <linux/elf.h>
 #include <linux/uaccess.h>
 
-#define arch_is_kernel_initmem_freed arch_is_kernel_initmem_freed
-
 #include <asm-generic/sections.h>
 
-extern bool init_mem_is_free;
-
-static inline int arch_is_kernel_initmem_freed(unsigned long addr)
-{
-	if (!init_mem_is_free)
-		return 0;
-
-	return addr >= (unsigned long)__init_begin &&
-		addr < (unsigned long)__init_end;
-}
-
 extern char __head_end[];
 
 #ifdef __powerpc64__
-- 
2.31.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox