LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [GIT PULL] Please pull powerpc/linux.git powerpc-5.11-8 tag
From: Michael Ellerman @ 2021-02-11 23:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: aneesh.kumar, linuxppc-dev, linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi Linus,

Please pull one final powerpc fix for 5.11:

The following changes since commit 24321ac668e452a4942598533d267805f291fdc9:

  powerpc/64/signal: Fix regression in __kernel_sigtramp_rt64() semantics (2021-02-02 22:14:41 +1100)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.11-8

for you to fetch changes up to 8c511eff1827239f24ded212b1bcda7ca5b16203:

  powerpc/kuap: Allow kernel thread to access userspace after kthread_use_mm (2021-02-06 23:13:04 +1100)

- ------------------------------------------------------------------
powerpc fixes for 5.11 #8

One fix for a regression seen in io_uring, introduced by our support for KUAP
(Kernel User Access Prevention) with the Hash MMU.

Thanks to: Aneesh Kumar K.V, Zorro Lang.

- ------------------------------------------------------------------
Aneesh Kumar K.V (1):
      powerpc/kuap: Allow kernel thread to access userspace after kthread_use_mm


 arch/powerpc/include/asm/book3s/64/kup.h   | 16 +++++++++++-----
 arch/powerpc/include/asm/book3s/64/pkeys.h |  4 ----
 arch/powerpc/mm/book3s64/pkeys.c           |  1 +
 3 files changed, 12 insertions(+), 9 deletions(-)
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmAlumMACgkQUevqPMjh
pYBb9BAAqpyhvEjDiyP0xeHUTQ9UsNMnNUW7Hp03jJYH+EKWC9nT7kb3TY1eNyk5
KyMBOtXiQsggzWDE31bR32BpH/JChSeRPaCO4HHdDS9t+3ZIo16RTFksdiIhCN6m
nI4WhnfrdZszstMRsKWzfoLRVDGyNi0hyyQaMS3ypuKvRmozZk6u9K/YNXMa97Wf
ihhB0lYRdfMNgxMm66uaqEtzYt3Z4dRj9Y24LQirJnp32xK+sNgoleHl4gvvKG3m
r7CogqHlcExbkD3dl/PPe/SVEesfXpmTrQPCvJmi0qWm9NzkduQWrSEkUkUp1YQD
T0pBHnCxtI7GAAQpCphBA3gjrz03Og4/RAXmfESgI0JHyh7Vihx91XOwuonuJndn
5ThY2D9+nkZ2vnlis2/AoLx6ClbNgZysr0iAOsRyd2SYR9Er2CcPZ4OuNHXWTHlz
G4SmVYiZj9gnSrzqlEGIqBOVWdykV+x+xkLLQx86HUAI+7f1mFV1+dJ4E5NLGKzS
jB+XwwG2y6q6SnJt2iiybqDu9K1lPWnKFLNeb4at7GfpJ4riKNcRSYrkY9xYxmkR
kgKyXfW+rNt1RcIoy65kNa3hSnXi4p9wc5Lph7joS7zTkZIKR2E3QUJjGQao85Qa
rNxccSz3X7ZAL9OEJP/4nE72Zf3VXkzuKbB79D3dhFJ8SJmbPqQ=
=GDL5
-----END PGP SIGNATURE-----

^ permalink raw reply

* Re: [PATCH 5/6] powerpc/mm/64s/hash: Add real-mode change_memory_range() for hash LPAR
From: Nicholas Piggin @ 2021-02-11 23:16 UTC (permalink / raw)
  To: linuxppc-dev, Michael Ellerman; +Cc: aneesh.kumar
In-Reply-To: <20210211135130.3474832-5-mpe@ellerman.id.au>

Excerpts from Michael Ellerman's message of February 11, 2021 11:51 pm:
> When we enabled STRICT_KERNEL_RWX we received some reports of boot
> failures when using the Hash MMU and running under phyp. The crashes
> are intermittent, and often exhibit as a completely unresponsive
> system, or possibly an oops.
> 
> One example, which was caught in xmon:
> 
>   [   14.068327][    T1] devtmpfs: mounted
>   [   14.069302][    T1] Freeing unused kernel memory: 5568K
>   [   14.142060][  T347] BUG: Unable to handle kernel instruction fetch
>   [   14.142063][    T1] Run /sbin/init as init process
>   [   14.142074][  T347] Faulting instruction address: 0xc000000000004400
>   cpu 0x2: Vector: 400 (Instruction Access) at [c00000000c7475e0]
>       pc: c000000000004400: exc_virt_0x4400_instruction_access+0x0/0x80
>       lr: c0000000001862d4: update_rq_clock+0x44/0x110
>       sp: c00000000c747880
>      msr: 8000000040001031
>     current = 0xc00000000c60d380
>     paca    = 0xc00000001ec9de80   irqmask: 0x03   irq_happened: 0x01
>       pid   = 347, comm = kworker/2:1
>   ...
>   enter ? for help
>   [c00000000c747880] c0000000001862d4 update_rq_clock+0x44/0x110 (unreliable)
>   [c00000000c7478f0] c000000000198794 update_blocked_averages+0xb4/0x6d0
>   [c00000000c7479f0] c000000000198e40 update_nohz_stats+0x90/0xd0
>   [c00000000c747a20] c0000000001a13b4 _nohz_idle_balance+0x164/0x390
>   [c00000000c747b10] c0000000001a1af8 newidle_balance+0x478/0x610
>   [c00000000c747be0] c0000000001a1d48 pick_next_task_fair+0x58/0x480
>   [c00000000c747c40] c000000000eaab5c __schedule+0x12c/0x950
>   [c00000000c747cd0] c000000000eab3e8 schedule+0x68/0x120
>   [c00000000c747d00] c00000000016b730 worker_thread+0x130/0x640
>   [c00000000c747da0] c000000000174d50 kthread+0x1a0/0x1b0
>   [c00000000c747e10] c00000000000e0f0 ret_from_kernel_thread+0x5c/0x6c
> 
> This shows that CPU 2, which was idle, woke up and then appears to
> randomly take an instruction fault on a completely valid area of
> kernel text.
> 
> The cause turns out to be the call to hash__mark_rodata_ro(), late in
> boot. Due to the way we layout text and rodata, that function actually
> changes the permissions for all of text and rodata to read-only plus
> execute.
> 
> To do the permission change we use a hypervisor call, H_PROTECT. On
> phyp that appears to be implemented by briefly removing the mapping of
> the kernel text, before putting it back with the updated permissions.
> If any other CPU is executing during that window, it will see spurious
> faults on the kernel text and/or data, leading to crashes.
> 
> To fix it we use stop machine to collect all other CPUs, and then have
> them drop into real mode (MMU off), while we change the mapping. That
> way they are unaffected by the mapping temporarily disappearing.
> 
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
>  arch/powerpc/mm/book3s64/hash_pgtable.c | 105 +++++++++++++++++++++++-
>  1 file changed, 104 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
> index 3663d3cdffac..01de985df2c4 100644
> --- a/arch/powerpc/mm/book3s64/hash_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
> @@ -8,6 +8,7 @@
>  #include <linux/sched.h>
>  #include <linux/mm_types.h>
>  #include <linux/mm.h>
> +#include <linux/stop_machine.h>
>  
>  #include <asm/sections.h>
>  #include <asm/mmu.h>
> @@ -400,6 +401,19 @@ EXPORT_SYMBOL_GPL(hash__has_transparent_hugepage);
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>  
>  #ifdef CONFIG_STRICT_KERNEL_RWX
> +
> +struct change_memory_parms {
> +	unsigned long start, end, newpp;
> +	unsigned int step, nr_cpus, master_cpu;
> +	atomic_t cpu_counter;
> +};
> +
> +// We'd rather this was on the stack but it has to be in the RMO
> +static struct change_memory_parms chmem_parms;
> +
> +// And therefore we need a lock to protect it from concurrent use
> +static DEFINE_MUTEX(chmem_lock);
> +
>  static void change_memory_range(unsigned long start, unsigned long end,
>  				unsigned int step, unsigned long newpp)
>  {
> @@ -414,6 +428,73 @@ static void change_memory_range(unsigned long start, unsigned long end,
>  							mmu_kernel_ssize);
>  }
>  
> +static int notrace chmem_secondary_loop(struct change_memory_parms *parms)
> +{
> +	unsigned long msr, tmp, flags;
> +	int *p;
> +
> +	p = &parms->cpu_counter.counter;
> +
> +	local_irq_save(flags);
> +	__hard_EE_RI_disable();
> +
> +	asm volatile (
> +	// Switch to real mode and leave interrupts off
> +	"mfmsr	%[msr]			;"
> +	"li	%[tmp], %[MSR_IR_DR]	;"
> +	"andc	%[tmp], %[msr], %[tmp]	;"
> +	"mtmsrd %[tmp]			;"
> +
> +	// Tell the master we are in real mode
> +	"1:				"
> +	"lwarx	%[tmp], 0, %[p]		;"
> +	"addic	%[tmp], %[tmp], -1	;"
> +	"stwcx.	%[tmp], 0, %[p]		;"
> +	"bne-	1b			;"
> +
> +	// Spin until the counter goes to zero
> +	"2:				;"
> +	"lwz	%[tmp], 0(%[p])		;"
> +	"cmpwi	%[tmp], 0		;"
> +	"bne-	2b			;"
> +
> +	// Switch back to virtual mode
> +	"mtmsrd %[msr]			;"
> +
> +	: // outputs
> +	  [msr] "=&r" (msr), [tmp] "=&b" (tmp), "+m" (*p)
> +	: // inputs
> +	  [p] "b" (p), [MSR_IR_DR] "i" (MSR_IR | MSR_DR)
> +	: // clobbers
> +	  "cc", "xer"
> +	);
> +
> +	local_irq_restore(flags);

Hmm. __hard_EE_RI_disable won't get restored by this because it doesn't
set the HARD_DIS flag. Also we don't want RI disabled here because 
tracing will get called first (which might take SLB or HPTE fault).

But it's also slightly rude to ever enable EE under an irq soft mask,
because you don't know if it had been disabled by the masked interrupt 
handler. It's not strictly a problem AFAIK because the interrupt would
just get masked again, but if we try to maintain a good pattern would
be good. Hmm that means we should add a check for irqs soft masked in
__hard_irq_enable(), I'm not sure if all existing users would follow
this rule.

Might be better to call hard_irq_disable(); after the local_irq_save();
and then clear and reset RI inside that region (could just do it at the
same time as disabling MMU).

You could possibly pass old_msr and new_msr into asm directly and do
mfmsr() in C?

Clearing RI here unfortuantely I don't think will prevent interrupt
handlers (sreset or MCE) from trying to go into virtual mode if they
hit here. It only prevents them trying to return.

Thanks,
Nick

^ permalink raw reply

* [PATCH v2 0/2] of: of_device.h cleanups
From: Rob Herring @ 2021-02-11 23:27 UTC (permalink / raw)
  To: Michael Ellerman, Greg Kroah-Hartman, David S. Miller,
	Jakub Kicinski, Frank Rowand, devicetree
  Cc: Felipe Balbi, Michal Marek, Rafael J. Wysocki, netdev, linux-usb,
	Nicolas Palix, Patrice Chotard, linux-kernel, Julia Lawall,
	Gilles Muller, Paul Mackerras, linuxppc-dev, cocci,
	linux-arm-kernel

This is a couple of cleanups for of_device.h. They fell out from my
attempt at decoupling of_device.h and of_platform.h which is a mess
and I haven't finished, but there's no reason to wait on these.

Rob

Rob Herring (2):
  of: Remove of_dev_{get,put}()
  driver core: platform: Drop of_device_node_put() wrapper

 arch/powerpc/platforms/pseries/ibmebus.c |  4 ++--
 drivers/base/platform.c                  |  2 +-
 drivers/net/ethernet/ibm/emac/core.c     | 15 ++++++++-------
 drivers/of/device.c                      | 21 ---------------------
 drivers/of/platform.c                    |  4 ++--
 drivers/of/unittest.c                    |  2 +-
 drivers/usb/dwc3/dwc3-st.c               |  2 +-
 include/linux/of_device.h                | 10 ----------
 scripts/coccinelle/free/put_device.cocci |  1 -
 9 files changed, 15 insertions(+), 46 deletions(-)

-- 
2.27.0


^ permalink raw reply

* [PATCH v2 1/2] of: Remove of_dev_{get,put}()
From: Rob Herring @ 2021-02-11 23:27 UTC (permalink / raw)
  To: Michael Ellerman, Greg Kroah-Hartman, David S. Miller,
	Jakub Kicinski, Frank Rowand, devicetree
  Cc: Felipe Balbi, Michal Marek, Rafael J. Wysocki, netdev, linux-usb,
	Nicolas Palix, Patrice Chotard, linux-kernel, Julia Lawall,
	Gilles Muller, Paul Mackerras, linuxppc-dev, cocci,
	linux-arm-kernel
In-Reply-To: <20210211232745.1498137-1-robh@kernel.org>

of_dev_get() and of_dev_put are just wrappers for get_device()/put_device()
on a platform_device. There's also already platform_device_{get,put}()
wrappers for this purpose. Let's update the few users and remove
of_dev_{get,put}().

Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Patrice Chotard <patrice.chotard@st.com>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Julia Lawall <Julia.Lawall@inria.fr>
Cc: Gilles Muller <Gilles.Muller@inria.fr>
Cc: Nicolas Palix <nicolas.palix@imag.fr>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: netdev@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-usb@vger.kernel.org
Cc: cocci@systeme.lip6.fr
Signed-off-by: Rob Herring <robh@kernel.org>
---
v2:
 - Fix build
---
 arch/powerpc/platforms/pseries/ibmebus.c |  4 ++--
 drivers/net/ethernet/ibm/emac/core.c     | 15 ++++++++-------
 drivers/of/device.c                      | 21 ---------------------
 drivers/of/platform.c                    |  4 ++--
 drivers/of/unittest.c                    |  2 +-
 drivers/usb/dwc3/dwc3-st.c               |  2 +-
 include/linux/of_device.h                |  3 ---
 scripts/coccinelle/free/put_device.cocci |  1 -
 8 files changed, 14 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ibmebus.c b/arch/powerpc/platforms/pseries/ibmebus.c
index 8c6e509f6967..a15ab33646b3 100644
--- a/arch/powerpc/platforms/pseries/ibmebus.c
+++ b/arch/powerpc/platforms/pseries/ibmebus.c
@@ -355,12 +355,12 @@ static int ibmebus_bus_device_probe(struct device *dev)
 	if (!drv->probe)
 		return error;
 
-	of_dev_get(of_dev);
+	get_device(dev);
 
 	if (of_driver_match_device(dev, dev->driver))
 		error = drv->probe(of_dev);
 	if (error)
-		of_dev_put(of_dev);
+		put_device(dev);
 
 	return error;
 }
diff --git a/drivers/net/ethernet/ibm/emac/core.c b/drivers/net/ethernet/ibm/emac/core.c
index c00b9097eeea..471be6ec7e8a 100644
--- a/drivers/net/ethernet/ibm/emac/core.c
+++ b/drivers/net/ethernet/ibm/emac/core.c
@@ -38,6 +38,7 @@
 #include <linux/of_irq.h>
 #include <linux/of_net.h>
 #include <linux/of_mdio.h>
+#include <linux/platform_device.h>
 #include <linux/slab.h>
 
 #include <asm/processor.h>
@@ -2390,11 +2391,11 @@ static int emac_check_deps(struct emac_instance *dev,
 
 static void emac_put_deps(struct emac_instance *dev)
 {
-	of_dev_put(dev->mal_dev);
-	of_dev_put(dev->zmii_dev);
-	of_dev_put(dev->rgmii_dev);
-	of_dev_put(dev->mdio_dev);
-	of_dev_put(dev->tah_dev);
+	platform_device_put(dev->mal_dev);
+	platform_device_put(dev->zmii_dev);
+	platform_device_put(dev->rgmii_dev);
+	platform_device_put(dev->mdio_dev);
+	platform_device_put(dev->tah_dev);
 }
 
 static int emac_of_bus_notify(struct notifier_block *nb, unsigned long action,
@@ -2435,7 +2436,7 @@ static int emac_wait_deps(struct emac_instance *dev)
 	for (i = 0; i < EMAC_DEP_COUNT; i++) {
 		of_node_put(deps[i].node);
 		if (err)
-			of_dev_put(deps[i].ofdev);
+			platform_device_put(deps[i].ofdev);
 	}
 	if (err == 0) {
 		dev->mal_dev = deps[EMAC_DEP_MAL_IDX].ofdev;
@@ -2444,7 +2445,7 @@ static int emac_wait_deps(struct emac_instance *dev)
 		dev->tah_dev = deps[EMAC_DEP_TAH_IDX].ofdev;
 		dev->mdio_dev = deps[EMAC_DEP_MDIO_IDX].ofdev;
 	}
-	of_dev_put(deps[EMAC_DEP_PREV_IDX].ofdev);
+	platform_device_put(deps[EMAC_DEP_PREV_IDX].ofdev);
 	return err;
 }
 
diff --git a/drivers/of/device.c b/drivers/of/device.c
index aedfaaafd3e7..9a748855b39d 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -33,27 +33,6 @@ const struct of_device_id *of_match_device(const struct of_device_id *matches,
 }
 EXPORT_SYMBOL(of_match_device);
 
-struct platform_device *of_dev_get(struct platform_device *dev)
-{
-	struct device *tmp;
-
-	if (!dev)
-		return NULL;
-	tmp = get_device(&dev->dev);
-	if (tmp)
-		return to_platform_device(tmp);
-	else
-		return NULL;
-}
-EXPORT_SYMBOL(of_dev_get);
-
-void of_dev_put(struct platform_device *dev)
-{
-	if (dev)
-		put_device(&dev->dev);
-}
-EXPORT_SYMBOL(of_dev_put);
-
 int of_device_add(struct platform_device *ofdev)
 {
 	BUG_ON(ofdev->dev.of_node == NULL);
diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index 79bd5f5a1bf1..020bf860c72c 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -687,7 +687,7 @@ static int of_platform_notify(struct notifier_block *nb,
 		pdev_parent = of_find_device_by_node(rd->dn->parent);
 		pdev = of_platform_device_create(rd->dn, NULL,
 				pdev_parent ? &pdev_parent->dev : NULL);
-		of_dev_put(pdev_parent);
+		platform_device_put(pdev_parent);
 
 		if (pdev == NULL) {
 			pr_err("%s: failed to create for '%pOF'\n",
@@ -712,7 +712,7 @@ static int of_platform_notify(struct notifier_block *nb,
 		of_platform_device_destroy(&pdev->dev, &children_left);
 
 		/* and put the reference of the find */
-		of_dev_put(pdev);
+		platform_device_put(pdev);
 		break;
 	}
 
diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
index eb51bc147440..eb100627c186 100644
--- a/drivers/of/unittest.c
+++ b/drivers/of/unittest.c
@@ -1286,7 +1286,7 @@ static void __init of_unittest_platform_populate(void)
 			unittest(pdev,
 				 "Could not create device for node '%pOFn'\n",
 				 grandchild);
-			of_dev_put(pdev);
+			platform_device_put(pdev);
 		}
 	}
 
diff --git a/drivers/usb/dwc3/dwc3-st.c b/drivers/usb/dwc3/dwc3-st.c
index e733be840545..b06b7092b1a2 100644
--- a/drivers/usb/dwc3/dwc3-st.c
+++ b/drivers/usb/dwc3/dwc3-st.c
@@ -274,7 +274,7 @@ static int st_dwc3_probe(struct platform_device *pdev)
 
 	dwc3_data->dr_mode = usb_get_dr_mode(&child_pdev->dev);
 	of_node_put(child);
-	of_dev_put(child_pdev);
+	platform_device_put(child_pdev);
 
 	/*
 	 * Configure the USB port as device or host according to the static
diff --git a/include/linux/of_device.h b/include/linux/of_device.h
index 937f32f6aecb..d7a407dfeecb 100644
--- a/include/linux/of_device.h
+++ b/include/linux/of_device.h
@@ -26,9 +26,6 @@ static inline int of_driver_match_device(struct device *dev,
 	return of_match_device(drv->of_match_table, dev) != NULL;
 }
 
-extern struct platform_device *of_dev_get(struct platform_device *dev);
-extern void of_dev_put(struct platform_device *dev);
-
 extern int of_device_add(struct platform_device *pdev);
 extern int of_device_register(struct platform_device *ofdev);
 extern void of_device_unregister(struct platform_device *ofdev);
diff --git a/scripts/coccinelle/free/put_device.cocci b/scripts/coccinelle/free/put_device.cocci
index 120921366e84..f09f1e79bfa6 100644
--- a/scripts/coccinelle/free/put_device.cocci
+++ b/scripts/coccinelle/free/put_device.cocci
@@ -21,7 +21,6 @@ id = of_find_device_by_node@p1(x)
 if (id == NULL || ...) { ... return ...; }
 ... when != put_device(&id->dev)
     when != platform_device_put(id)
-    when != of_dev_put(id)
     when != if (id) { ... put_device(&id->dev) ... }
     when != e1 = (T)id
     when != e1 = (T)(&id->dev)
-- 
2.27.0


^ permalink raw reply related

* [PATCH v2 2/2] driver core: platform: Drop of_device_node_put() wrapper
From: Rob Herring @ 2021-02-11 23:27 UTC (permalink / raw)
  To: Michael Ellerman, Greg Kroah-Hartman, David S. Miller,
	Jakub Kicinski, Frank Rowand, devicetree
  Cc: Felipe Balbi, Michal Marek, Rafael J. Wysocki, netdev, linux-usb,
	Nicolas Palix, Patrice Chotard, linux-kernel, Julia Lawall,
	Gilles Muller, Paul Mackerras, linuxppc-dev, cocci,
	linux-arm-kernel
In-Reply-To: <20210211232745.1498137-1-robh@kernel.org>

of_device_node_put() is just a wrapper for of_node_put(). The platform
driver core is already polluted with of_node pointers and the only 'get'
already uses of_node_get() (though typically the get would happen in
of_device_alloc()).

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
---
v2:
 - Fix build
---
 drivers/base/platform.c   | 2 +-
 include/linux/of_device.h | 7 -------
 2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 95fd1549f87d..9d5171e5f967 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -571,7 +571,7 @@ static void platform_device_release(struct device *dev)
 	struct platform_object *pa = container_of(dev, struct platform_object,
 						  pdev.dev);
 
-	of_device_node_put(&pa->pdev.dev);
+	of_node_put(pa->pdev.dev.of_node);
 	kfree(pa->pdev.dev.platform_data);
 	kfree(pa->pdev.mfd_cell);
 	kfree(pa->pdev.resource);
diff --git a/include/linux/of_device.h b/include/linux/of_device.h
index d7a407dfeecb..1d7992a02e36 100644
--- a/include/linux/of_device.h
+++ b/include/linux/of_device.h
@@ -38,11 +38,6 @@ extern int of_device_request_module(struct device *dev);
 extern void of_device_uevent(struct device *dev, struct kobj_uevent_env *env);
 extern int of_device_uevent_modalias(struct device *dev, struct kobj_uevent_env *env);
 
-static inline void of_device_node_put(struct device *dev)
-{
-	of_node_put(dev->of_node);
-}
-
 static inline struct device_node *of_cpu_device_node_get(int cpu)
 {
 	struct device *cpu_dev;
@@ -94,8 +89,6 @@ static inline int of_device_uevent_modalias(struct device *dev,
 	return -ENODEV;
 }
 
-static inline void of_device_node_put(struct device *dev) { }
-
 static inline const struct of_device_id *of_match_device(
 		const struct of_device_id *matches, const struct device *dev)
 {
-- 
2.27.0


^ permalink raw reply related

* Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.11-8 tag
From: pr-tracker-bot @ 2021-02-11 23:51 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: aneesh.kumar, Linus Torvalds, linuxppc-dev, linux-kernel
In-Reply-To: <87blcqnqkw.fsf@mpe.ellerman.id.au>

The pull request you sent on Fri, 12 Feb 2021 10:15:59 +1100:

> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.11-8

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/dcc0b49040c70ad827a7f3d58a21b01fdb14e749

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply

* Re: Fwd: Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
From: Thiago Jung Bauermann @ 2021-02-11 23:59 UTC (permalink / raw)
  To: Lakshmi Ramasubramanian
  Cc: Rob Herring, linux-integrity, linuxppc-dev, linux-arm-kernel,
	Mimi Zohar
In-Reply-To: <b534117e-333d-097c-a3c0-2a80985bd37f@linux.microsoft.com>


Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:

> On 2/11/21 9:42 AM, Lakshmi Ramasubramanian wrote:
>> Hi Rob,
>> [PATCH] powerpc: Rename kexec elfcorehdr_addr to elf_headers_mem
>> This change causes build problem for x86_64 architecture (please see the 
>> mail from kernel test bot below) since arch/x86/include/asm/kexec.h uses
>> "elf_load_addr" for the ELF header buffer address and not 
>> "elf_headers_mem".
>> struct kimage_arch {
>>      ...
>>      /* Core ELF header buffer */
>>      void *elf_headers;
>>      unsigned long elf_headers_sz;
>>      unsigned long elf_load_addr;
>> };
>> I am thinking of limiting of_kexec_alloc_and_setup_fdt() to ARM64 and 
>> PPC64 since they are the only ones using this function now.
>> #if defined(CONFIG_ARM64) && defined(CONFIG_PPC64)
> Sorry - I meant to say
> #if defined(CONFIG_ARM64) || defined(CONFIG_PPC64)
>

Does it build correctly if you rename elf_headers_mem to elf_load_addr?
Or the other way around, renaming x86's elf_load_addr to
elf_headers_mem. I don't really have a preference.

That would be better than adding an #if condition.

>> void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
>>                     unsigned long initrd_load_addr,
>>                     unsigned long initrd_len,
>>                     const char *cmdline)
>> {
>>      ...
>> }
>> #endif /* defined(CONFIG_ARM64) && defined(CONFIG_PPC64) */
>> Please let me know if you have any concerns.
>> thanks,
>>   -lakshmi

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply

* Re: [PATCH kernel] powerpc/kuap: Restore AMR after replaying soft interrupts
From: Michael Ellerman @ 2021-02-12  0:19 UTC (permalink / raw)
  To: Alexey Kardashevskiy, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20210202091541.36499-1-aik@ozlabs.ru>

On Tue, 2 Feb 2021 20:15:41 +1100, Alexey Kardashevskiy wrote:
> Since de78a9c "powerpc: Add a framework for Kernel Userspace Access
> Protection", user access helpers call user_{read|write}_access_{begin|end}
> when user space access is allowed.
> 
> 890274c "powerpc/64s: Implement KUAP for Radix MMU" made the mentioned
> helpers program a AMR special register to allow such access for a short
> period of time, most of the time AMR is expected to block user memory
> access by the kernel.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/kuap: Restore AMR after replaying soft interrupts
      https://git.kernel.org/powerpc/c/60a707d0c99aff4eadb7fd334c5fd21df386723e

cheers

^ permalink raw reply

* Re: [PATCH kernel v3] powerpc/uaccess: Skip might_fault() when user access is enabled
From: Michael Ellerman @ 2021-02-12  0:19 UTC (permalink / raw)
  To: Alexey Kardashevskiy, linuxppc-dev; +Cc: Jordan Niethe, Nicholas Piggin
In-Reply-To: <20210204121612.32721-1-aik@ozlabs.ru>

On Thu, 4 Feb 2021 23:16:12 +1100, Alexey Kardashevskiy wrote:
> The amount of code executed with enabled user space access (unlocked KUAP)
> should be minimal. However with CONFIG_PROVE_LOCKING or
> CONFIG_DEBUG_ATOMIC_SLEEP enabled, might_fault() may end up replaying
> interrupts which in turn may access the user space and forget to restore
> the KUAP state.
> 
> The problem places are:
> 1. strncpy_from_user (and similar) which unlock KUAP and call
> unsafe_get_user -> __get_user_allowed -> __get_user_nocheck()
> with do_allow=false to skip KUAP as the caller took care of it.
> 2. __put_user_nocheck_goto() which is called with unlocked KUAP.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/uaccess: Avoid might_fault() when user access is enabled
      https://git.kernel.org/powerpc/c/7d506ca97b665b95e698a53697dad99fae813c1a

cheers

^ permalink raw reply

* Re: [PATCH 1/3] powerpc/mm: Enable compound page check for both THP and HugeTLB
From: Michael Ellerman @ 2021-02-12  0:19 UTC (permalink / raw)
  To: Aneesh Kumar K.V, mpe, linuxppc-dev
In-Reply-To: <20210203045812.234439-1-aneesh.kumar@linux.ibm.com>

On Wed, 3 Feb 2021 10:28:10 +0530, Aneesh Kumar K.V wrote:
> THP config results in compound pages. Make sure the kernel enables
> the PageCompound() check with CONFIG_HUGETLB_PAGE disabled and
> CONFIG_TRANSPARENT_HUGEPAGE enabled.
> 
> This makes sure we correctly flush the icache with THP pages.
> flush_dcache_icache_page only matter for platforms that don't support
> COHERENT_ICACHE.

Applied to powerpc/next.

[1/3] powerpc/mm: Enable compound page check for both THP and HugeTLB
      https://git.kernel.org/powerpc/c/c7ba2d636342093cfb842f47640e5b62192adfed
[2/3] powerpc/mm: Add PG_dcache_clean to indicate dcache clean state
      https://git.kernel.org/powerpc/c/ec94b9b23d620d40ab2ced094a30c22bb8d69b9f
[3/3] powerpc/mm: Remove dcache flush from memory remove.
      https://git.kernel.org/powerpc/c/2ac02e5ecec0cc2484d60a73b1bc6394aa2fad28

cheers

^ permalink raw reply

* Re: [PATCH 1/3] spi: mpc52xx: Avoid using get_tbl()
From: Michael Ellerman @ 2021-02-12  0:19 UTC (permalink / raw)
  To: Paul Mackerras, Michael Ellerman, broonie, Christophe Leroy,
	Benjamin Herrenschmidt
  Cc: linuxppc-dev, linux-kernel, linux-spi
In-Reply-To: <99bf008e2970de7f8ed3225cda69a6d06ae1a644.1612866360.git.christophe.leroy@csgroup.eu>

On Tue, 9 Feb 2021 10:26:21 +0000 (UTC), Christophe Leroy wrote:
> get_tbl() is confusing as it returns the content TBL register
> on PPC32 but the concatenation of TBL and TBU on PPC64.
> 
> Use mftb() instead.
> 
> This will allow the removal of get_tbl() in a following patch.

Applied to powerpc/next.

[1/3] spi: mpc52xx: Avoid using get_tbl()
      https://git.kernel.org/powerpc/c/e10656114d32c659768e7ca8aebaaa6ac6e959ab
[2/3] powerpc/time: Avoid using get_tbl()
      https://git.kernel.org/powerpc/c/55d68df623eb679cc91f61137f14751e7f369662
[3/3] powerpc/time: Remove get_tbl()
      https://git.kernel.org/powerpc/c/132f94f133961d18af615cb3503368e59529e9a8

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/32: Preserve cr1 in exception prolog stack check to fix build error
From: Michael Ellerman @ 2021-02-12  0:19 UTC (permalink / raw)
  To: Paul Mackerras, Michael Ellerman, Benjamin Herrenschmidt,
	Christophe Leroy
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <5ae4d545e3ac58e133d2599e0deb88843cb494fc.1612768623.git.christophe.leroy@csgroup.eu>

On Mon, 8 Feb 2021 07:17:40 +0000 (UTC), Christophe Leroy wrote:
> THREAD_ALIGN_SHIFT = THREAD_SHIFT + 1 = PAGE_SHIFT + 1
> Maximum PAGE_SHIFT is 18 for 256k pages so
> THREAD_ALIGN_SHIFT is 19 at the maximum.
> 
> No need to clobber cr1, it can be preserved when moving r1
> into CR when we check stack overflow.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/32: Preserve cr1 in exception prolog stack check to fix build error
      https://git.kernel.org/powerpc/c/3642eb21256a317ac14e9ed560242c6d20cf06d9

cheers

^ permalink raw reply

* Re: [PATCH v5 00/22] powerpc/32: Implement C syscall entry/exit
From: Michael Ellerman @ 2021-02-12  0:19 UTC (permalink / raw)
  To: Paul Mackerras, msuchanek, Michael Ellerman, npiggin,
	Benjamin Herrenschmidt, Christophe Leroy
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1612796617.git.christophe.leroy@csgroup.eu>

On Mon, 8 Feb 2021 15:10:19 +0000 (UTC), Christophe Leroy wrote:
> This series implements C syscall entry/exit for PPC32. It reuses
> the work already done for PPC64.
> 
> This series is based on today's merge-test (b6f72fc05389e3fc694bf5a5fa1bbd33f61879e0)
> 
> In terms on performance we have the following number of cycles on an
> 8xx running null_syscall benchmark:
> - mainline: 296 cycles
> - after patch 4: 283 cycles
> - after patch 16: 304 cycles
> - after patch 17: 348 cycles
> - at the end of the series: 320 cycles
> 
> [...]

Patches 1-15 and 21 applied to powerpc/next.

[01/22] powerpc/32s: Add missing call to kuep_lock on syscall entry
        https://git.kernel.org/powerpc/c/57fdfbce89137ae85cd5cef48be168040a47dd13
[02/22] powerpc/32: Always enable data translation on syscall entry
        https://git.kernel.org/powerpc/c/eca2411040c1ee15b8882c6427fb4eb5a48ada69
[03/22] powerpc/32: On syscall entry, enable instruction translation at the same time as data
        https://git.kernel.org/powerpc/c/76249ddc27080b6b835a89cedcc4185b3b5a6b23
[04/22] powerpc/32: Reorder instructions to avoid using CTR in syscall entry
        https://git.kernel.org/powerpc/c/2c59e5104821c5720e88bafa9e522f8bea9ce8fa
[05/22] powerpc/irq: Add helper to set regs->softe
        https://git.kernel.org/powerpc/c/fb5608fd117a8b48752d2b5a7e70847c1ed33d33
[06/22] powerpc/irq: Rework helpers that manipulate MSR[EE/RI]
        https://git.kernel.org/powerpc/c/08353779f2889305f64e04de3e46ed59ed60f859
[07/22] powerpc/irq: Add stub irq_soft_mask_return() for PPC32
        https://git.kernel.org/powerpc/c/6650c4782d5788346a25a4f698880d124f2699a0
[08/22] powerpc/syscall: Rename syscall_64.c into interrupt.c
        https://git.kernel.org/powerpc/c/ab1a517d55b01b54ba70f5d54f926f5ab4b18339
[09/22] powerpc/syscall: Make interrupt.c buildable on PPC32
        https://git.kernel.org/powerpc/c/344bb20b159dd0996e521c0d4c131a6ae10c322a
[10/22] powerpc/syscall: Use is_compat_task()
        https://git.kernel.org/powerpc/c/72b7a9e56b25babfe4c90bf3ce88285c7fb62ab9
[11/22] powerpc/syscall: Save r3 in regs->orig_r3
        https://git.kernel.org/powerpc/c/8875f47b7681aa4e4484a9b612577b044725f839
[12/22] powerpc/syscall: Change condition to check MSR_RI
        https://git.kernel.org/powerpc/c/c01b916658150e98f00a4981750c37a3224c8735
[13/22] powerpc/32: Always save non volatile GPRs at syscall entry
        https://git.kernel.org/powerpc/c/fbcee2ebe8edbb6a93316f0a189ae7fcfaa7094f
[14/22] powerpc/syscall: implement system call entry/exit logic in C for PPC32
        https://git.kernel.org/powerpc/c/6f76a01173ccaa363739f913394d4e138d92d718
[15/22] powerpc/32: Remove verification of MSR_PR on syscall in the ASM entry
        https://git.kernel.org/powerpc/c/4d67facbcbdb3d9e3c9cb82e4ec47fc63d298dd8
[21/22] powerpc/32: Remove the counter in global_dbcr0
        https://git.kernel.org/powerpc/c/eb595eca74067b78d36fb188b555e30f28686fc7

cheers

^ permalink raw reply

* Re: [PATCH v2 1/3] powerpc/uaccess: get rid of small constant size cases in raw_copy_{to, from}_user()
From: Michael Ellerman @ 2021-02-12  0:19 UTC (permalink / raw)
  To: Paul Mackerras, Michael Ellerman, Benjamin Herrenschmidt,
	Christophe Leroy
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <99d4ccb58a20d8408d0e19874393655ad5b40822.1612879284.git.christophe.leroy@csgroup.eu>

On Tue, 9 Feb 2021 14:02:12 +0000 (UTC), Christophe Leroy wrote:
> Copied from commit 4b842e4e25b1 ("x86: get rid of small
> constant size cases in raw_copy_{to,from}_user()")
> 
> Very few call sites where that would be triggered remain, and none
> of those is anywhere near hot enough to bother.

Applied to powerpc/next.

[1/3] powerpc/uaccess: get rid of small constant size cases in raw_copy_{to,from}_user()
      https://git.kernel.org/powerpc/c/6b385d1d7c0a346758e35b128815afa25d4709ee
[2/3] powerpc/uaccess: Merge __put_user_size_allowed() into __put_user_size()
      https://git.kernel.org/powerpc/c/95d019e0f9225954e33b6efcad315be9d548a4d7
[3/3] powerpc/uaccess: Merge raw_copy_to_user_allowed() into raw_copy_to_user()
      https://git.kernel.org/powerpc/c/052f9d206f6c4b5b512b8c201d375f2dd194be35

cheers

^ permalink raw reply

* Re: [PATCH v6 0/2] powerpc/32: Implement C syscall entry/exit (complement)
From: Michael Ellerman @ 2021-02-12  0:19 UTC (permalink / raw)
  To: Paul Mackerras, msuchanek, Michael Ellerman, npiggin,
	Benjamin Herrenschmidt, Christophe Leroy
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1612898425.git.christophe.leroy@csgroup.eu>

On Tue, 9 Feb 2021 19:29:26 +0000 (UTC), Christophe Leroy wrote:
> This series implements C syscall entry/exit for PPC32. It reuses
> the work already done for PPC64.
> 
> This series is based on today's next-test (f538b53fd47a) where main patchs from v5 are merged in.
> 
> The first patch is important for performance.
> 
> [...]

Applied to powerpc/next.

[1/3] powerpc/syscall: Do not check unsupported scv vector on PPC32
      https://git.kernel.org/powerpc/c/b966f2279048ee9f30d83ef8568b99fa40917c54
[2/3] powerpc/32: Handle bookE debugging in C in syscall entry/exit
      https://git.kernel.org/powerpc/c/d524dda719f06967db4d3ba519edf9267f84c155
[3/3] powerpc/syscall: Avoid storing 'current' in another pointer
      https://git.kernel.org/powerpc/c/5b90b9661a3396e00f6e8bcbb617a0787fb683d0

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/kexec_file: fix FDT size estimation for kdump kernel
From: Michael Ellerman @ 2021-02-12  0:19 UTC (permalink / raw)
  To: Michael Ellerman, Hari Bathini
  Cc: Pingfan Liu, Petr Tesarik, Mahesh J Salgaonkar, stable,
	linuxppc-dev, Sourabh Jain, Dave Young, Thiago Jung Bauermann
In-Reply-To: <161243826811.119001.14083048209224609814.stgit@hbathini>

On Thu, 04 Feb 2021 17:01:10 +0530, Hari Bathini wrote:
> On systems with large amount of memory, loading kdump kernel through
> kexec_file_load syscall may fail with the below error:
> 
>     "Failed to update fdt with linux,drconf-usable-memory property"
> 
> This happens because the size estimation for kdump kernel's FDT does
> not account for the additional space needed to setup usable memory
> properties. Fix it by accounting for the space needed to include
> linux,usable-memory & linux,drconf-usable-memory properties while
> estimating kdump kernel's FDT size.

Applied to powerpc/next.

[1/1] powerpc/kexec_file: fix FDT size estimation for kdump kernel
      https://git.kernel.org/powerpc/c/2377c92e37fe97bc5b365f55cf60f56dfc4849f5

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/xive: Assign boolean values to a bool variable
From: Michael Ellerman @ 2021-02-12  0:19 UTC (permalink / raw)
  To: paulus, Jiapeng Chong; +Cc: linuxppc-dev, linux-kernel, kvm-ppc
In-Reply-To: <1612680192-43116-1-git-send-email-jiapeng.chong@linux.alibaba.com>

On Sun, 7 Feb 2021 14:43:12 +0800, Jiapeng Chong wrote:
> Fix the following coccicheck warnings:
> 
> ./arch/powerpc/kvm/book3s_xive.c:1856:2-17: WARNING: Assignment of 0/1
> to bool variable.
> 
> ./arch/powerpc/kvm/book3s_xive.c:1854:2-17: WARNING: Assignment of 0/1
> to bool variable.

Applied to powerpc/next.

[1/1] powerpc/xive: Assign boolean values to a bool variable
      https://git.kernel.org/powerpc/c/c9df3f809cc98b196548864f52d3c4e280dd1970

cheers

^ permalink raw reply

* Re: [PATCH V3] powerpc/perf: Adds support for programming of Thresholding in P10
From: Michael Ellerman @ 2021-02-12  0:19 UTC (permalink / raw)
  To: Kajol Jain, mpe; +Cc: atrajeev, maddy, linuxppc-dev
In-Reply-To: <20210209095234.837356-1-kjain@linux.ibm.com>

On Tue, 9 Feb 2021 15:22:34 +0530, Kajol Jain wrote:
> Thresholding, a performance monitoring unit feature, can be
> used to identify marked instructions which take more than
> expected cycles between start event and end event.
> Threshold compare (thresh_cmp) bits are programmed in MMCRA
> register. In Power9, thresh_cmp bits were part of the
> event code. But in case of P10, thresh_cmp are not part of
> event code due to inclusion of MMCR3 bits.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/perf: Adds support for programming of Thresholding in P10
      https://git.kernel.org/powerpc/c/82d2c16b350f72aa21ac2a6860c542aa4b43a51e

cheers

^ permalink raw reply

* Re: [PATCH 1/3] powerpc/83xx: Fix build error when CONFIG_PCI=n
From: Michael Ellerman @ 2021-02-12  0:20 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev
In-Reply-To: <20210210130804.3190952-1-mpe@ellerman.id.au>

On Thu, 11 Feb 2021 00:08:02 +1100, Michael Ellerman wrote:
> As reported by lkp:
> 
>   arch/powerpc/platforms/83xx/km83xx.c:183:19: error: 'mpc83xx_setup_pci' undeclared here (not in a function)
>      183 |  .discover_phbs = mpc83xx_setup_pci,
> 	 |                   ^~~~~~~~~~~~~~~~~
> 	 |                   mpc83xx_setup_arch
> 
> [...]

Applied to powerpc/next.

[1/3] powerpc/83xx: Fix build error when CONFIG_PCI=n
      https://git.kernel.org/powerpc/c/5c47c44f157f408c862b144bbd1d1e161a521aa2
[2/3] powerpc/mm/64s: Fix no previous prototype warning
      https://git.kernel.org/powerpc/c/2bb421a3d93601aa81bc39af7aac7280303e0761
[3/3] powerpc/amigaone: Make amigaone_discover_phbs() static
      https://git.kernel.org/powerpc/c/f30520c64f290589e91461d7326b497c23e7f5fd

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/64s: interrupt exit improve bounding of interrupt recursion
From: Michael Ellerman @ 2021-02-12  0:20 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Athira Rajeev
In-Reply-To: <20210123061504.2076317-1-npiggin@gmail.com>

On Sat, 23 Jan 2021 16:15:04 +1000, Nicholas Piggin wrote:
> When replaying pending soft-masked interrupts when an interrupt returns
> to an irqs-enabled context, there is a special case required if this was
> an asynchronous interrupt to avoid unbounded interrupt recursion.
> 
> This case was not tested for in the case the asynchronous interrupt hit
> in user context, because a subsequent nested interrupt would by definition
> hit in kernel mode, which then exits via the kernel path which does test
> this case.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/64s: interrupt exit improve bounding of interrupt recursion
      https://git.kernel.org/powerpc/c/c0ef717305f51e29b5ce0c78a6bfe566b3283415

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/64s: Remove EXSLB interrupt save area
From: Michael Ellerman @ 2021-02-12  0:20 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev
In-Reply-To: <20210208063406.331655-1-npiggin@gmail.com>

On Mon, 8 Feb 2021 16:34:06 +1000, Nicholas Piggin wrote:
> SLB faults should not be taken while the PACA save areas are live, all
> memory accesses should be fetches from the kernel text, and access to
> PACA and the current stack, before C code is called or any other
> accesses are made.
> 
> All of these have pinned SLBs so will not take a SLB fault. Therefore
> EXSLB is not be required.

Applied to powerpc/next.

[1/1] powerpc/64s: Remove EXSLB interrupt save area
      https://git.kernel.org/powerpc/c/ac7c5e9b08acdb54ef3525abcad24bdb3ed05551

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/64s: syscall real mode entry use mtmsrd rather than rfid
From: Michael Ellerman @ 2021-02-12  0:20 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev
In-Reply-To: <20210208063326.331502-1-npiggin@gmail.com>

On Mon, 8 Feb 2021 16:33:26 +1000, Nicholas Piggin wrote:
> Have the real mode system call entry handler branch to the kernel
> 0xc000... address and then use mtmsrd to enable the MMU, rather than use
> SRRs and rfid.
> 
> Commit 8729c26e675c ("powerpc/64s/exception: Move real to virt switch
> into the common handler") implemented this style of real mode entry for
> other interrupt handlers, so this brings system calls into line with
> them, which is the main motivcation for the change.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/64s: syscall real mode entry use mtmsrd rather than rfid
      https://git.kernel.org/powerpc/c/14ad0e7d04f46865775fb010ccd96fb1cc83433a

cheers

^ permalink raw reply

* Re: [PATCH] powerpc: remove interrupt handler functions from the noinstr section
From: Michael Ellerman @ 2021-02-12  0:20 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Stephen Rothwell
In-Reply-To: <20210211063636.236420-1-npiggin@gmail.com>

On Thu, 11 Feb 2021 16:36:36 +1000, Nicholas Piggin wrote:
> The allyesconfig ppc64 kernel fails to link with relocations unable to
> fit after commit 3a96570ffceb ("powerpc: convert interrupt handlers to
> use wrappers"), which is due to the interrupt handler functions being
> put into the .noinstr.text section, which the linker script places on
> the opposite side of the main .text section from the interrupt entry
> asm code which calls the handlers.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc: remove interrupt handler functions from the noinstr section
      https://git.kernel.org/powerpc/c/e4bb64c7a42e61bcb6f8b70279fc1f7805eaad3f

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/pci: Remove unimplemented prototypes
From: Michael Ellerman @ 2021-02-12  0:20 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev
In-Reply-To: <20200902035138.1762531-1-oohall@gmail.com>

On Wed, 2 Sep 2020 13:51:38 +1000, Oliver O'Halloran wrote:
> The corresponding definitions were deleted in commit 3d5134ee8341
> ("[POWERPC] Rewrite IO allocation & mapping on powerpc64") which
> was merged a mere 13 years ago.

Applied to powerpc/next.

[1/1] powerpc/pci: Remove unimplemented prototypes
      https://git.kernel.org/powerpc/c/b3abe590c80e0ba55b6fce48762232d90dbc37a5

cheers

^ permalink raw reply

* Re: [PATCH 5/6] powerpc/mm/64s/hash: Add real-mode change_memory_range() for hash LPAR
From: Nicholas Piggin @ 2021-02-12  0:36 UTC (permalink / raw)
  To: linuxppc-dev, Michael Ellerman; +Cc: aneesh.kumar
In-Reply-To: <20210211135130.3474832-5-mpe@ellerman.id.au>

Excerpts from Michael Ellerman's message of February 11, 2021 11:51 pm:
> When we enabled STRICT_KERNEL_RWX we received some reports of boot
> failures when using the Hash MMU and running under phyp. The crashes
> are intermittent, and often exhibit as a completely unresponsive
> system, or possibly an oops.
> 
> One example, which was caught in xmon:
> 
>   [   14.068327][    T1] devtmpfs: mounted
>   [   14.069302][    T1] Freeing unused kernel memory: 5568K
>   [   14.142060][  T347] BUG: Unable to handle kernel instruction fetch
>   [   14.142063][    T1] Run /sbin/init as init process
>   [   14.142074][  T347] Faulting instruction address: 0xc000000000004400
>   cpu 0x2: Vector: 400 (Instruction Access) at [c00000000c7475e0]
>       pc: c000000000004400: exc_virt_0x4400_instruction_access+0x0/0x80
>       lr: c0000000001862d4: update_rq_clock+0x44/0x110
>       sp: c00000000c747880
>      msr: 8000000040001031
>     current = 0xc00000000c60d380
>     paca    = 0xc00000001ec9de80   irqmask: 0x03   irq_happened: 0x01
>       pid   = 347, comm = kworker/2:1
>   ...
>   enter ? for help
>   [c00000000c747880] c0000000001862d4 update_rq_clock+0x44/0x110 (unreliable)
>   [c00000000c7478f0] c000000000198794 update_blocked_averages+0xb4/0x6d0
>   [c00000000c7479f0] c000000000198e40 update_nohz_stats+0x90/0xd0
>   [c00000000c747a20] c0000000001a13b4 _nohz_idle_balance+0x164/0x390
>   [c00000000c747b10] c0000000001a1af8 newidle_balance+0x478/0x610
>   [c00000000c747be0] c0000000001a1d48 pick_next_task_fair+0x58/0x480
>   [c00000000c747c40] c000000000eaab5c __schedule+0x12c/0x950
>   [c00000000c747cd0] c000000000eab3e8 schedule+0x68/0x120
>   [c00000000c747d00] c00000000016b730 worker_thread+0x130/0x640
>   [c00000000c747da0] c000000000174d50 kthread+0x1a0/0x1b0
>   [c00000000c747e10] c00000000000e0f0 ret_from_kernel_thread+0x5c/0x6c
> 
> This shows that CPU 2, which was idle, woke up and then appears to
> randomly take an instruction fault on a completely valid area of
> kernel text.
> 
> The cause turns out to be the call to hash__mark_rodata_ro(), late in
> boot. Due to the way we layout text and rodata, that function actually
> changes the permissions for all of text and rodata to read-only plus
> execute.
> 
> To do the permission change we use a hypervisor call, H_PROTECT. On
> phyp that appears to be implemented by briefly removing the mapping of
> the kernel text, before putting it back with the updated permissions.
> If any other CPU is executing during that window, it will see spurious
> faults on the kernel text and/or data, leading to crashes.
> 
> To fix it we use stop machine to collect all other CPUs, and then have
> them drop into real mode (MMU off), while we change the mapping. That
> way they are unaffected by the mapping temporarily disappearing.
> 
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
>  arch/powerpc/mm/book3s64/hash_pgtable.c | 105 +++++++++++++++++++++++-
>  1 file changed, 104 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
> index 3663d3cdffac..01de985df2c4 100644
> --- a/arch/powerpc/mm/book3s64/hash_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
> @@ -8,6 +8,7 @@
>  #include <linux/sched.h>
>  #include <linux/mm_types.h>
>  #include <linux/mm.h>
> +#include <linux/stop_machine.h>
>  
>  #include <asm/sections.h>
>  #include <asm/mmu.h>
> @@ -400,6 +401,19 @@ EXPORT_SYMBOL_GPL(hash__has_transparent_hugepage);
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>  
>  #ifdef CONFIG_STRICT_KERNEL_RWX
> +
> +struct change_memory_parms {
> +	unsigned long start, end, newpp;
> +	unsigned int step, nr_cpus, master_cpu;
> +	atomic_t cpu_counter;
> +};
> +
> +// We'd rather this was on the stack but it has to be in the RMO
> +static struct change_memory_parms chmem_parms;
> +
> +// And therefore we need a lock to protect it from concurrent use
> +static DEFINE_MUTEX(chmem_lock);
> +
>  static void change_memory_range(unsigned long start, unsigned long end,
>  				unsigned int step, unsigned long newpp)
>  {
> @@ -414,6 +428,73 @@ static void change_memory_range(unsigned long start, unsigned long end,
>  							mmu_kernel_ssize);
>  }
>  
> +static int notrace chmem_secondary_loop(struct change_memory_parms *parms)
> +{
> +	unsigned long msr, tmp, flags;
> +	int *p;
> +
> +	p = &parms->cpu_counter.counter;
> +
> +	local_irq_save(flags);
> +	__hard_EE_RI_disable();
> +
> +	asm volatile (
> +	// Switch to real mode and leave interrupts off
> +	"mfmsr	%[msr]			;"
> +	"li	%[tmp], %[MSR_IR_DR]	;"
> +	"andc	%[tmp], %[msr], %[tmp]	;"
> +	"mtmsrd %[tmp]			;"
> +
> +	// Tell the master we are in real mode
> +	"1:				"
> +	"lwarx	%[tmp], 0, %[p]		;"
> +	"addic	%[tmp], %[tmp], -1	;"
> +	"stwcx.	%[tmp], 0, %[p]		;"
> +	"bne-	1b			;"
> +
> +	// Spin until the counter goes to zero
> +	"2:				;"
> +	"lwz	%[tmp], 0(%[p])		;"
> +	"cmpwi	%[tmp], 0		;"
> +	"bne-	2b			;"
> +
> +	// Switch back to virtual mode
> +	"mtmsrd %[msr]			;"

Pity we don't have something that can switch to emergency stack and
so we can write this stuff in C.

How's something like this suit you?

---
 arch/powerpc/kernel/misc_64.S | 22 +++++++++++++++++++++
 arch/powerpc/kernel/process.c | 37 +++++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+)

diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index 070465825c21..5e911d0b0b16 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -27,6 +27,28 @@
 
 	.text
 
+#ifdef CONFIG_PPC_BOOK3S_64
+_GLOBAL(__call_realmode)
+	mflr	r0
+	std	r0,16(r1)
+	stdu	r1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r5)
+	mr	r1,r5
+	mtctr	r3
+	mr	r3,r4
+	mfmsr	r4
+	xori	r4,r4,(MSR_IR|MSR_DR)
+	mtmsrd	r4
+	bctrl
+	mfmsr	r4
+	xori	r4,r4,(MSR_IR|MSR_DR)
+	mtmsrd	r4
+	ld	r1,0(r1)
+	ld	r0,16(r1)
+	mtlr	r0
+	blr
+
+#endif
+
 _GLOBAL(call_do_softirq)
 	mflr	r0
 	std	r0,16(r1)
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index a66f435dabbf..260d60f665a3 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -2197,6 +2197,43 @@ void show_stack(struct task_struct *tsk, unsigned long *stack,
 	put_task_stack(tsk);
 }
 
+#ifdef CONFIG_PPC_BOOK3S_64
+int __call_realmode(int (*fn)(void *arg), void *arg, void *sp);
+
+/* XXX: find a better place for this
+ * Executing C code in real-mode in general Book3S-64 code can only be done
+ * via this function that switches the stack to one inside the real-mode-area,
+ * which may cover only a small first part of real memory on hash guest LPARs.
+ * fn must be NOKPROBES, must not access vmalloc or anything outside the RMA,
+ * probably shouldn't enable the MMU or interrupts, etc, and be very careful
+ * about calling other generic kernel or powerpc functions.
+ */
+int call_realmode(int (*fn)(void *arg), void *arg)
+{
+	unsigned long flags;
+	void *cursp, *emsp;
+	int ret;
+
+	/* Stack switch is only really required for HPT LPAR, but do it for all to help test coverage of tricky code */
+	cursp = (void *)(current_stack_pointer & ~(THREAD_SIZE - 1));
+	emsp = (void *)(local_paca->emergency_sp - THREAD_SIZE);
+
+	/* XXX check_stack_overflow(); */
+
+	if (WARN_ON_ONCE(cursp == emsp))
+		return -EBUSY;
+
+	local_irq_save(flags);
+	hard_irq_disable();
+
+	ret = __call_realmode(fn, arg, emsp);
+
+	local_irq_restore(flags);
+
+	return ret;
+}
+#endif
+
 #ifdef CONFIG_PPC64
 /* Called with hard IRQs off */
 void notrace __ppc64_runlatch_on(void)
-- 
2.23.0


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox