LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] cxl: Fix kobject memleak
From: Frederic Barrat @ 2020-06-02 16:21 UTC (permalink / raw)
  To: Wang Hai, ajd, arnd, gregkh; +Cc: linuxppc-dev, imunsie, linux-kernel
In-Reply-To: <20200602120733.5943-1-wanghai38@huawei.com>



Le 02/06/2020 à 14:07, Wang Hai a écrit :
> Currently the error return path from kobject_init_and_add() is not
> followed by a call to kobject_put() - which means we are leaking
> the kobject.
> 
> Fix it by adding a call to kobject_put() in the error path of
> kobject_init_and_add().
> 
> Fixes: b087e6190ddc ("cxl: Export optional AFU configuration record in sysfs")
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: Wang Hai <wanghai38@huawei.com>


Indeed, a call to kobject_put() is needed when the init fails.
Thanks!
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>


> ---
>   drivers/misc/cxl/sysfs.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/misc/cxl/sysfs.c b/drivers/misc/cxl/sysfs.c
> index f0263d1..d97a243 100644
> --- a/drivers/misc/cxl/sysfs.c
> +++ b/drivers/misc/cxl/sysfs.c
> @@ -624,7 +624,7 @@ static struct afu_config_record *cxl_sysfs_afu_new_cr(struct cxl_afu *afu, int c
>   	rc = kobject_init_and_add(&cr->kobj, &afu_config_record_type,
>   				  &afu->dev.kobj, "cr%i", cr->cr);
>   	if (rc)
> -		goto err;
> +		goto err1;
>   
>   	rc = sysfs_create_bin_file(&cr->kobj, &cr->config_attr);
>   	if (rc)
> 

^ permalink raw reply

* Re: [PATCH v2] cxl: Remove dead Kconfig option
From: Frederic Barrat @ 2020-06-02 16:10 UTC (permalink / raw)
  To: Andrew Donnellan, linuxppc-dev
In-Reply-To: <20200602070545.11942-1-ajd@linux.ibm.com>



Le 02/06/2020 à 09:05, Andrew Donnellan a écrit :
> The CXL_AFU_DRIVER_OPS Kconfig option was added to coordinate merging of
> new features. It no longer serves any purpose, so remove it.
> 
> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>


Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>


> 
> ---
> v1->v2:
> - keep CXL_LIB for now to avoid breaking a driver that is currently out of
> tree
> ---
>   drivers/misc/cxl/Kconfig | 4 ----
>   1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/misc/cxl/Kconfig b/drivers/misc/cxl/Kconfig
> index 39eec9031487..984114f7cee9 100644
> --- a/drivers/misc/cxl/Kconfig
> +++ b/drivers/misc/cxl/Kconfig
> @@ -7,9 +7,6 @@ config CXL_BASE
>   	bool
>   	select PPC_COPRO_BASE
>   
> -config CXL_AFU_DRIVER_OPS
> -	bool
> -
>   config CXL_LIB
>   	bool
>   
> @@ -17,7 +14,6 @@ config CXL
>   	tristate "Support for IBM Coherent Accelerators (CXL)"
>   	depends on PPC_POWERNV && PCI_MSI && EEH
>   	select CXL_BASE
> -	select CXL_AFU_DRIVER_OPS
>   	select CXL_LIB
>   	default m
>   	help
> 

^ permalink raw reply

* Re: ppc64le and 32-bit LE userland compatibility
From: Michal Suchánek @ 2020-06-02 15:56 UTC (permalink / raw)
  To: Daniel Kolesa
  Cc: libc-alpha, eery, musl, Will Springer,
	Palmer Dabbelt via binutils, via libc-dev, linuxppc-dev,
	Joseph Myers
In-Reply-To: <454e0d68-d69e-43fc-9a8c-0461dd5817a9@www.fastmail.com>

On Tue, Jun 02, 2020 at 05:40:39PM +0200, Daniel Kolesa wrote:
> 
> 
> On Tue, Jun 2, 2020, at 17:27, Michal Suchánek wrote:
> > On Tue, Jun 02, 2020 at 05:13:25PM +0200, Daniel Kolesa wrote:
> > > On Tue, Jun 2, 2020, at 16:23, Michal Suchánek wrote:
> > > > On Tue, Jun 02, 2020 at 01:40:23PM +0000, Joseph Myers wrote:
> > > > > On Tue, 2 Jun 2020, Daniel Kolesa wrote:
> > > > > 
> > > > > > not be limited to being just userspace under ppc64le, but should be 
> > > > > > runnable on a native kernel as well, which should not be limited to any 
> > > > > > particular baseline other than just PowerPC.
> > > > > 
> > > > > This is a fairly unusual approach to bringing up a new ABI.  Since new 
> > > > > ABIs are more likely to be used on new systems rather than switching ABI 
> > > > > on an existing installation, and since it can take quite some time for all 
> > > > > the software support for a new ABI to become widely available in 
> > > > > distributions, people developing new ABIs are likely to think about what 
> > > > > new systems are going to be relevant in a few years' time when working out 
> > > > > the minimum hardware requirements for the new ABI.  (The POWER8 minimum 
> > > > > for powerpc64le fits in with that, for example.)
> > > > That means that you cannot run ppc64le on FSL embedded CPUs (which lack
> > > > the vector instructions in LE mode). Which may be fine with you but
> > > > other people may want to support these. Can't really say if that's good
> > > > idea or not but I don't foresee them going away in a few years, either.
> > > 
> > > well, ppc64le already cannot be run on those, as far as I know (I don't think it's possible to build ppc64le userland without VSX in any configuration)
> > 
> > What hardware are you targetting then? I did not notice anything
> > specific mentioned in the thread.
> > 
> > Naturally on POWER the first cpu that has LE support is POWER8 so you
> > can count on all other POWER8 features to be present. With other
> > architecture variants the situation is different.
> 
> This is not true; nearly every 32-bit PowerPC CPU has LE support (all the way back to 6xx), these would be the native-hardware targets for the port (would need kernel support implemented, but it's technically possible).
I find dealing with memory management issues on 32bit architectures a
pain. There is never enough address space.
> 
> As far as 64-bit CPUs go, POWER7 is the first one that could in practice run the current ppc64le configuration, but in glibc it's limited to POWER8 and in gcc the default for powerpc64le is also POWER8 (however, it is perfectly possible to configure gcc for POWER7 and use musl libc with it).
That's interesting. I guess I was tricked but the glibc limitation.

Thanks

Michal

^ permalink raw reply

* Re: ppc64le and 32-bit LE userland compatibility
From: Daniel Kolesa @ 2020-06-02 15:40 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: libc-alpha, eery, musl, Will Springer,
	Palmer Dabbelt via binutils, via libc-dev, linuxppc-dev,
	Joseph Myers
In-Reply-To: <20200602152724.GU25173@kitsune.suse.cz>



On Tue, Jun 2, 2020, at 17:27, Michal Suchánek wrote:
> On Tue, Jun 02, 2020 at 05:13:25PM +0200, Daniel Kolesa wrote:
> > On Tue, Jun 2, 2020, at 16:23, Michal Suchánek wrote:
> > > On Tue, Jun 02, 2020 at 01:40:23PM +0000, Joseph Myers wrote:
> > > > On Tue, 2 Jun 2020, Daniel Kolesa wrote:
> > > > 
> > > > > not be limited to being just userspace under ppc64le, but should be 
> > > > > runnable on a native kernel as well, which should not be limited to any 
> > > > > particular baseline other than just PowerPC.
> > > > 
> > > > This is a fairly unusual approach to bringing up a new ABI.  Since new 
> > > > ABIs are more likely to be used on new systems rather than switching ABI 
> > > > on an existing installation, and since it can take quite some time for all 
> > > > the software support for a new ABI to become widely available in 
> > > > distributions, people developing new ABIs are likely to think about what 
> > > > new systems are going to be relevant in a few years' time when working out 
> > > > the minimum hardware requirements for the new ABI.  (The POWER8 minimum 
> > > > for powerpc64le fits in with that, for example.)
> > > That means that you cannot run ppc64le on FSL embedded CPUs (which lack
> > > the vector instructions in LE mode). Which may be fine with you but
> > > other people may want to support these. Can't really say if that's good
> > > idea or not but I don't foresee them going away in a few years, either.
> > 
> > well, ppc64le already cannot be run on those, as far as I know (I don't think it's possible to build ppc64le userland without VSX in any configuration)
> 
> What hardware are you targetting then? I did not notice anything
> specific mentioned in the thread.
> 
> Naturally on POWER the first cpu that has LE support is POWER8 so you
> can count on all other POWER8 features to be present. With other
> architecture variants the situation is different.

This is not true; nearly every 32-bit PowerPC CPU has LE support (all the way back to 6xx), these would be the native-hardware targets for the port (would need kernel support implemented, but it's technically possible).

As far as 64-bit CPUs go, POWER7 is the first one that could in practice run the current ppc64le configuration, but in glibc it's limited to POWER8 and in gcc the default for powerpc64le is also POWER8 (however, it is perfectly possible to configure gcc for POWER7 and use musl libc with it).

> 
> Thanks
> 
> Michal
>

^ permalink raw reply

* Re: ppc64le and 32-bit LE userland compatibility
From: Michal Suchánek @ 2020-06-02 15:27 UTC (permalink / raw)
  To: Daniel Kolesa
  Cc: libc-alpha, eery, musl, Will Springer,
	Palmer Dabbelt via binutils, via libc-dev, linuxppc-dev,
	Joseph Myers
In-Reply-To: <3aeb6dfe-ae23-42f9-ac23-16be6b54a850@www.fastmail.com>

On Tue, Jun 02, 2020 at 05:13:25PM +0200, Daniel Kolesa wrote:
> On Tue, Jun 2, 2020, at 16:23, Michal Suchánek wrote:
> > On Tue, Jun 02, 2020 at 01:40:23PM +0000, Joseph Myers wrote:
> > > On Tue, 2 Jun 2020, Daniel Kolesa wrote:
> > > 
> > > > not be limited to being just userspace under ppc64le, but should be 
> > > > runnable on a native kernel as well, which should not be limited to any 
> > > > particular baseline other than just PowerPC.
> > > 
> > > This is a fairly unusual approach to bringing up a new ABI.  Since new 
> > > ABIs are more likely to be used on new systems rather than switching ABI 
> > > on an existing installation, and since it can take quite some time for all 
> > > the software support for a new ABI to become widely available in 
> > > distributions, people developing new ABIs are likely to think about what 
> > > new systems are going to be relevant in a few years' time when working out 
> > > the minimum hardware requirements for the new ABI.  (The POWER8 minimum 
> > > for powerpc64le fits in with that, for example.)
> > That means that you cannot run ppc64le on FSL embedded CPUs (which lack
> > the vector instructions in LE mode). Which may be fine with you but
> > other people may want to support these. Can't really say if that's good
> > idea or not but I don't foresee them going away in a few years, either.
> 
> well, ppc64le already cannot be run on those, as far as I know (I don't think it's possible to build ppc64le userland without VSX in any configuration)

What hardware are you targetting then? I did not notice anything
specific mentioned in the thread.

Naturally on POWER the first cpu that has LE support is POWER8 so you
can count on all other POWER8 features to be present. With other
architecture variants the situation is different.

Thanks

Michal

^ permalink raw reply

* Re: ppc64le and 32-bit LE userland compatibility
From: Daniel Kolesa @ 2020-06-02 15:13 UTC (permalink / raw)
  To: Michal Suchánek, Joseph Myers
  Cc: libc-alpha, eery, musl, Will Springer,
	Palmer Dabbelt via binutils, via libc-dev, linuxppc-dev
In-Reply-To: <20200602142337.GS25173@kitsune.suse.cz>

On Tue, Jun 2, 2020, at 16:23, Michal Suchánek wrote:
> On Tue, Jun 02, 2020 at 01:40:23PM +0000, Joseph Myers wrote:
> > On Tue, 2 Jun 2020, Daniel Kolesa wrote:
> > 
> > > not be limited to being just userspace under ppc64le, but should be 
> > > runnable on a native kernel as well, which should not be limited to any 
> > > particular baseline other than just PowerPC.
> > 
> > This is a fairly unusual approach to bringing up a new ABI.  Since new 
> > ABIs are more likely to be used on new systems rather than switching ABI 
> > on an existing installation, and since it can take quite some time for all 
> > the software support for a new ABI to become widely available in 
> > distributions, people developing new ABIs are likely to think about what 
> > new systems are going to be relevant in a few years' time when working out 
> > the minimum hardware requirements for the new ABI.  (The POWER8 minimum 
> > for powerpc64le fits in with that, for example.)
> That means that you cannot run ppc64le on FSL embedded CPUs (which lack
> the vector instructions in LE mode). Which may be fine with you but
> other people may want to support these. Can't really say if that's good
> idea or not but I don't foresee them going away in a few years, either.

well, ppc64le already cannot be run on those, as far as I know (I don't think it's possible to build ppc64le userland without VSX in any configuration)

> 
> Thanks
> 
> Michal
>

Daniel

^ permalink raw reply

* Re: [PATCH 11/14] x86/entry: Clarify irq_{enter,exit}_rcu()
From: Peter Zijlstra @ 2020-06-02 15:05 UTC (permalink / raw)
  To: Qian Cai
  Cc: daniel.thompson, andrew.cooper3, bigeasy, x86, linux-kernel,
	sean.j.christopherson, luto, Lai Jiangshan, rostedt, a.darwish,
	tglx, linuxppc-dev
In-Reply-To: <20200602144235.GA1129@lca.pw>

On Tue, Jun 02, 2020 at 10:42:35AM -0400, Qian Cai wrote:

> Reverted this commit fixed the POWER9 boot warning,

ARGH, I'm an idiot. Please try this instead:


diff --git a/kernel/softirq.c b/kernel/softirq.c
index a3eb6eba8c41..c4201b7f42b1 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -438,7 +438,7 @@ void irq_exit_rcu(void)
  */
 void irq_exit(void)
 {
-	irq_exit_rcu();
+	__irq_exit_rcu();
 	rcu_irq_exit();
 	 /* must be last! */
 	lockdep_hardirq_exit();



^ permalink raw reply related

* Re: ppc64le and 32-bit LE userland compatibility
From: Daniel Kolesa @ 2020-06-02 14:52 UTC (permalink / raw)
  To: Joseph Myers
  Cc: libc-alpha, eery, musl, Will Springer,
	Palmer Dabbelt via binutils, via libc-dev, linuxppc-dev
In-Reply-To: <alpine.DEB.2.21.2006021334170.24059@digraph.polyomino.org.uk>

On Tue, Jun 2, 2020, at 15:40, Joseph Myers wrote:
> On Tue, 2 Jun 2020, Daniel Kolesa wrote:
> 
> > not be limited to being just userspace under ppc64le, but should be 
> > runnable on a native kernel as well, which should not be limited to any 
> > particular baseline other than just PowerPC.
> 
> This is a fairly unusual approach to bringing up a new ABI.  Since new 
> ABIs are more likely to be used on new systems rather than switching ABI 
> on an existing installation, and since it can take quite some time for all 
> the software support for a new ABI to become widely available in 
> distributions, people developing new ABIs are likely to think about what 
> new systems are going to be relevant in a few years' time when working out 
> the minimum hardware requirements for the new ABI.  (The POWER8 minimum 
> for powerpc64le fits in with that, for example.)

In this case the whole point is targeting existing hardware that we already have. It also aims to largely utilize things that are already present in all parts of the toolchain, otherwise it's a lot of effort with questionable benefit and artificially limits the baseline for no good reason.

> 
> > either the AIX/ELFv1 nor the ELFv2 ABIs) If we were to introduce new 
> > ports, what would those use? ld64.so.3 for BE/v2? ld.so.2 for LE/32-bit? 
> 
> Rather than relying on numbers such as "3" or 2" in a particular place 
> being unique across all (architecture, ABI) pairs supported by glibc, 
> something more obviously specific to a particular architecture and ABI, 
> e.g. ld-linux-powerpc64be-elfv2.so.1, would be better.

Yes, agreed on that - probably ld-linux-powerpc64-elfv2.so.1, to match existing conventions (powerpc64 implicitly refers to BE in target triples, etc). It's just inconsistent with the existing ports, but I guess the reason in those is legacy in the first place, so not much point in worrying about that.

> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com
>

Daniel

^ permalink raw reply

* Re: [PATCH] powerpc: Fix misleading small cores print
From: Gautham R Shenoy @ 2020-06-02 14:52 UTC (permalink / raw)
  To: Michael Neuling; +Cc: linuxppc-dev, Gautham R . Shenoy
In-Reply-To: <20200528230731.1235752-1-mikey@neuling.org>

On Fri, May 29, 2020 at 09:07:31AM +1000, Michael Neuling wrote:
> Currently when we boot on a big core system, we get this print:
>   [    0.040500] Using small cores at SMT level
> 
> This is misleading as we've actually detected big cores.
> 
> This patch clears up the print to say we've detect big cores but are
> using small cores for scheduling.

Thanks for making the print more meaningful.


> 
> Signed-off-by: Michael Neuling <mikey@neuling.org>

FWIW,
Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  arch/powerpc/kernel/smp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 6d2a3a3666..c820c95162 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -1383,7 +1383,7 @@ void __init smp_cpus_done(unsigned int max_cpus)
> 
>  #ifdef CONFIG_SCHED_SMT
>  	if (has_big_cores) {
> -		pr_info("Using small cores at SMT level\n");
> +		pr_info("Big cores detected but using small core scheduling\n");
>  		power9_topology[0].mask = smallcore_smt_mask;
>  		powerpc_topology[0].mask = smallcore_smt_mask;
>  	}
> -- 
> 2.26.2
> 

^ permalink raw reply

* Re: [PATCH 11/14] x86/entry: Clarify irq_{enter,exit}_rcu()
From: Qian Cai @ 2020-06-02 14:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: daniel.thompson, andrew.cooper3, bigeasy, x86, linux-kernel,
	sean.j.christopherson, luto, Lai Jiangshan, rostedt, a.darwish,
	tglx, linuxppc-dev
In-Reply-To: <20200529213321.359433429@infradead.org>

On Fri, May 29, 2020 at 11:27:39PM +0200, Peter Zijlstra wrote:
> Because:
> 
>   irq_enter_rcu() includes lockdep_hardirq_enter()
>   irq_exit_rcu() does *NOT* include lockdep_hardirq_exit()
> 
> Which resulted in two 'stray' lockdep_hardirq_exit() calls in
> idtentry.h, and me spending a long time trying to find the matching
> enter calls.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/include/asm/idtentry.h |    2 --
>  kernel/softirq.c                |   19 +++++++++++++------
>  2 files changed, 13 insertions(+), 8 deletions(-)
> 
[]
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -404,12 +404,7 @@ static inline void tick_irq_exit(void)
>  #endif
>  }
>  
> -/**
> - * irq_exit_rcu() - Exit an interrupt context without updating RCU
> - *
> - * Also processes softirqs if needed and possible.
> - */
> -void irq_exit_rcu(void)
> +static inline void __irq_exit_rcu(void)
>  {
>  #ifndef __ARCH_IRQ_EXIT_IRQS_DISABLED
>  	local_irq_disable();
> @@ -425,6 +420,18 @@ void irq_exit_rcu(void)
>  }
>  
>  /**
> + * irq_exit_rcu() - Exit an interrupt context without updating RCU
> + *
> + * Also processes softirqs if needed and possible.
> + */
> +void irq_exit_rcu(void)
> +{
> +	__irq_exit_rcu();
> +	 /* must be last! */
> +	lockdep_hardirq_exit();
> +}
> +
> +/**
>   * irq_exit - Exit an interrupt context, update RCU and lockdep
>   *
>   * Also processes softirqs if needed and possible.
> 
>

Reverted this commit fixed the POWER9 boot warning,

[    0.005196][    T0] clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007, max_idle_ns: 440795202126 ns
[    0.012502][    T0] clocksource: timebase mult[1f40000] shift[24] registered
[    0.030273][    T0] ------------[ cut here ]------------
[    0.034421][    T0] DEBUG_LOCKS_WARN_ON(current->hardirq_context)
[    0.034433][    T0] WARNING: CPU: 0 PID: 0 at kernel/locking/lockdep.c:3680 lockdep_hardirqs_on_prepare+0x29c/0x2d0
[    0.045874][    T0] Modules linked in:
[    0.047977][    T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-next-20200602 #1
[    0.053187][    T0] NIP:  c0000000001d2fec LR: c0000000001d2fe8 CTR: c00000000074b0a0
[    0.057395][    T0] REGS: c00000000130f810 TRAP: 0700   Not tainted  (5.7.0-next-20200602)
[    0.062614][    T0] MSR:  9000000000021033 <SF,HV,ME,IR,DR,RI,LE>  CR: 48000422  XER: 20040000
[    0.069856][    T0] CFAR: c00000000010e448 IRQMASK: 1
[    0.069856][    T0] GPR00: c0000000001d2fe8 c00000000130faa0 c00000000130aa00 000000000000002d
[    0.069856][    T0] GPR04: c00000000133c3b0 000000000000000d 000000006e6f635f 72727563284e4f5f
[    0.069856][    T0] GPR08: 0000000000000002 c000000000dcf230 0000000000000001 c0000000012b0280
[    0.069856][    T0] GPR12: 0000000000000000 c0000000057b0000 0000000000000000 0000000000000000
[    0.069856][    T0] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[    0.069856][    T0] GPR20: 0000000000000000 0000000000000001 0000000010004d9c 00000000100053ed
[    0.069856][    T0] GPR24: 0000000010005411 0000000000000001 0000000000000002 0000000000000003
[    0.069856][    T0] GPR28: 0000000000000000 0000000000000000 0000000000000000 c000000003e3b008
[    0.117846][    T0] NIP [c0000000001d2fec] lockdep_hardirqs_on_prepare+0x29c/0x2d0
[    0.123052][    T0] LR [c0000000001d2fe8] lockdep_hardirqs_on_prepare+0x298/0x2d0
[    0.127248][    T0] Call Trace:
[    0.129337][    T0] [c00000000130faa0] [c0000000001d2fe8] lockdep_hardirqs_on_prepare+0x298/0x2d0 (unreliable)
[    0.137613][    T0] [c00000000130fb10] [c0000000002d3834] trace_hardirqs_on+0x94/0x230
trace_hardirqs_on at kernel/trace/trace_preemptirq.c:49
[    0.141824][    T0] [c00000000130fb60] [c000000000039100] interrupt_exit_kernel_prepare+0x110/0x1f0
interrupt_exit_kernel_prepare at arch/powerpc/kernel/syscall_64.c:337
[    0.148069][    T0] [c00000000130fbc0] [c00000000000f328] interrupt_return+0x118/0x1c0
[    0.152281][    T0] --- interrupt: 900 at arch_local_irq_restore+0xc0/0xd0
arch_local_irq_restore at arch/powerpc/kernel/irq.c:367
(inlined by) arch_local_irq_restore at arch/powerpc/kernel/irq.c:318
[    0.152281][    T0]     LR = start_kernel+0x7f0/0x9dc
[    0.153579][    T0] [c00000000130fec0] [c000000001208fa8] init_on_free+0x0/0x2b0 (unreliable)
[    0.159810][    T0] [c00000000130fee0] [c000000000c845c8] start_kernel+0x7e4/0x9dc
start_kernel at init/main.c:961 (discriminator 3)
[    0.165017][    T0] [c00000000130ff90] [c00000000000c890] start_here_common+0x1c/0x8c
[    0.169224][    T0] Instruction dump:
[    0.171324][    T0] 0fe00000 e8010080 ebc10060 ebe10068 7c0803a6 4bfffe7c 3c82ff8b 3c62ff8a
[    0.177558][    T0] 38848808 3863e460 4bf3b3fd 60000000 <0fe00000> e8010080 ebc10060 ebe10068
[    0.183796][    T0] irq event stamp: 16
[    0.186904][    T0] hardirqs last  enabled at (14): [<c00000000020cf14>] rcu_core+0x9a4/0xbe0
[    0.191130][    T0] hardirqs last disabled at (15): [<c000000000a39944>] __do_softirq+0x5d4/0x8d8
[    0.195365][    T0] softirqs last  enabled at (16): [<c000000000a399c8>] __do_softirq+0x658/0x8d8
[    0.201606][    T0] softirqs last disabled at (5): [<c00000000011cbbc>] irq_exit+0x17c/0x1c0
[    0.206832][    T0] ---[ end trace 339d75c2056bfda1 ]---
[    0.208990][    T0] printk: console [hvc0] enabled
[    0.208990][    T0] printk: console [hvc0] enabled

^ permalink raw reply

* Re: ppc64le and 32-bit LE userland compatibility
From: Michal Suchánek @ 2020-06-02 14:23 UTC (permalink / raw)
  To: Joseph Myers
  Cc: libc-alpha, eery, Daniel Kolesa, musl, Will Springer,
	Palmer Dabbelt via binutils, via libc-dev, linuxppc-dev
In-Reply-To: <alpine.DEB.2.21.2006021334170.24059@digraph.polyomino.org.uk>

On Tue, Jun 02, 2020 at 01:40:23PM +0000, Joseph Myers wrote:
> On Tue, 2 Jun 2020, Daniel Kolesa wrote:
> 
> > not be limited to being just userspace under ppc64le, but should be 
> > runnable on a native kernel as well, which should not be limited to any 
> > particular baseline other than just PowerPC.
> 
> This is a fairly unusual approach to bringing up a new ABI.  Since new 
> ABIs are more likely to be used on new systems rather than switching ABI 
> on an existing installation, and since it can take quite some time for all 
> the software support for a new ABI to become widely available in 
> distributions, people developing new ABIs are likely to think about what 
> new systems are going to be relevant in a few years' time when working out 
> the minimum hardware requirements for the new ABI.  (The POWER8 minimum 
> for powerpc64le fits in with that, for example.)
That means that you cannot run ppc64le on FSL embedded CPUs (which lack
the vector instructions in LE mode). Which may be fine with you but
other people may want to support these. Can't really say if that's good
idea or not but I don't foresee them going away in a few years, either.

Thanks

Michal

^ permalink raw reply

* Re: ppc64le and 32-bit LE userland compatibility
From: Joseph Myers @ 2020-06-02 13:50 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: libc-alpha, eery, Daniel Kolesa, musl, Will Springer,
	Palmer Dabbelt via binutils, via libc-dev, linuxppc-dev
In-Reply-To: <20200602021245.GO31009@gate.crashing.org>

On Mon, 1 Jun 2020, Segher Boessenkool wrote:

> > The supported glibc ABIs are listed at 
> > <https://sourceware.org/glibc/wiki/ABIList>.
> 
> powerpcle-linux already does work somewhat, and that should also he
> worth something, official or not ;-)
> 
> (It has worked for very many years already...  That is, I have built it
> a lot, I have no idea about running a full distro that way).

Greg McGary's patches as referenced at 
<https://sourceware.org/legacy-ml/libc-hacker/2006-11/msg00013.html> were 
never merged, and I don't know how big the changes to .S files were or if 
they were ever posted.  Many files were subsequently fixed as part of 
bringing up support for powerpc64le, but without actual 32-bit LE testing 
as part of each release cycle there's not much evidence of correctness for 
LE of code not used for powerpc64le.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply

* Re: ppc64le and 32-bit LE userland compatibility
From: Joseph Myers @ 2020-06-02 13:40 UTC (permalink / raw)
  To: Daniel Kolesa
  Cc: libc-alpha, eery, musl, Will Springer,
	Palmer Dabbelt via binutils, via libc-dev, linuxppc-dev
In-Reply-To: <b44b3aa7-f9cc-43e1-b2c4-0edb6ea06189@www.fastmail.com>

On Tue, 2 Jun 2020, Daniel Kolesa wrote:

> not be limited to being just userspace under ppc64le, but should be 
> runnable on a native kernel as well, which should not be limited to any 
> particular baseline other than just PowerPC.

This is a fairly unusual approach to bringing up a new ABI.  Since new 
ABIs are more likely to be used on new systems rather than switching ABI 
on an existing installation, and since it can take quite some time for all 
the software support for a new ABI to become widely available in 
distributions, people developing new ABIs are likely to think about what 
new systems are going to be relevant in a few years' time when working out 
the minimum hardware requirements for the new ABI.  (The POWER8 minimum 
for powerpc64le fits in with that, for example.)

> either the AIX/ELFv1 nor the ELFv2 ABIs) If we were to introduce new 
> ports, what would those use? ld64.so.3 for BE/v2? ld.so.2 for LE/32-bit? 

Rather than relying on numbers such as "3" or 2" in a particular place 
being unique across all (architecture, ABI) pairs supported by glibc, 
something more obviously specific to a particular architecture and ABI, 
e.g. ld-linux-powerpc64be-elfv2.so.1, would be better.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply

* Re: [PATCH] powerpc/kvm: Enable support for ISA v3.1 guests
From: kbuild test robot @ 2020-06-02 13:05 UTC (permalink / raw)
  To: Alistair Popple, linuxppc-dev
  Cc: ravi.bangoria, mikey, kbuild-all, kvm-ppc, Alistair Popple
In-Reply-To: <20200602055325.6102-1-alistair@popple.id.au>

[-- Attachment #1: Type: text/plain, Size: 4657 bytes --]

Hi Alistair,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.7 next-20200529]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Alistair-Popple/powerpc-kvm-Enable-support-for-ISA-v3-1-guests/20200602-140435
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>, old ones prefixed by <<):

arch/powerpc/kvm/book3s_hv.c: In function 'kvmppc_set_arch_compat':
<<                  from arch/powerpc/kvm/book3s_hv.c:81:
>> arch/powerpc/kvm/book3s_hv.c:356:22: error: 'CPU_FTR_ARCH_31' undeclared (first use in this function); did you mean 'CPU_FTR_ARCH_300'?
356 |  if (cpu_has_feature(CPU_FTR_ARCH_31))
|                      ^~~~~~~~~~~~~~~
|                      CPU_FTR_ARCH_300
arch/powerpc/kvm/book3s_hv.c:356:22: note: each undeclared identifier is reported only once for each function it appears in
<<                  from arch/powerpc/kvm/book3s_hv.c:81:
>> arch/powerpc/kvm/book3s_hv.c:348:25: error: 'PCR_ARCH_300' undeclared (first use in this function); did you mean 'PVR_ARCH_300'?
348 | #define PCR_ARCH_31    (PCR_ARCH_300 << 1)
|                         ^~~~~~~~~~~~
<<                  from arch/powerpc/kvm/book3s_hv.c:81:
>> arch/powerpc/kvm/book3s_hv.c:357:18: note: in expansion of macro 'PCR_ARCH_31'
357 |   host_pcr_bit = PCR_ARCH_31;
|                  ^~~~~~~~~~~
arch/powerpc/kvm/book3s_hv.c: At top level:
arch/powerpc/kvm/book3s_hv.c:3521:5: error: no previous prototype for 'kvmhv_p9_guest_entry' [-Werror=missing-prototypes]
3521 | int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
|     ^~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors

vim +356 arch/powerpc/kvm/book3s_hv.c

   346	
   347	/* Dummy value used in computing PCR value below */
 > 348	#define PCR_ARCH_31    (PCR_ARCH_300 << 1)
   349	
   350	static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
   351	{
   352		unsigned long host_pcr_bit = 0, guest_pcr_bit = 0;
   353		struct kvmppc_vcore *vc = vcpu->arch.vcore;
   354	
   355		/* We can (emulate) our own architecture version and anything older */
 > 356		if (cpu_has_feature(CPU_FTR_ARCH_31))
 > 357			host_pcr_bit = PCR_ARCH_31;
   358		else if (cpu_has_feature(CPU_FTR_ARCH_300))
   359			host_pcr_bit = PCR_ARCH_300;
   360		else if (cpu_has_feature(CPU_FTR_ARCH_207S))
   361			host_pcr_bit = PCR_ARCH_207;
   362		else if (cpu_has_feature(CPU_FTR_ARCH_206))
   363			host_pcr_bit = PCR_ARCH_206;
   364		else
   365			host_pcr_bit = PCR_ARCH_205;
   366	
   367		/* Determine lowest PCR bit needed to run guest in given PVR level */
   368		guest_pcr_bit = host_pcr_bit;
   369		if (arch_compat) {
   370			switch (arch_compat) {
   371			case PVR_ARCH_205:
   372				guest_pcr_bit = PCR_ARCH_205;
   373				break;
   374			case PVR_ARCH_206:
   375			case PVR_ARCH_206p:
   376				guest_pcr_bit = PCR_ARCH_206;
   377				break;
   378			case PVR_ARCH_207:
   379				guest_pcr_bit = PCR_ARCH_207;
   380				break;
   381			case PVR_ARCH_300:
   382				guest_pcr_bit = PCR_ARCH_300;
   383				break;
   384			case PVR_ARCH_31:
   385				guest_pcr_bit = PCR_ARCH_31;
   386				break;
   387			default:
   388				return -EINVAL;
   389			}
   390		}
   391	
   392		/* Check requested PCR bits don't exceed our capabilities */
   393		if (guest_pcr_bit > host_pcr_bit)
   394			return -EINVAL;
   395	
   396		spin_lock(&vc->lock);
   397		vc->arch_compat = arch_compat;
   398		/*
   399		 * Set all PCR bits for which guest_pcr_bit <= bit < host_pcr_bit
   400		 * Also set all reserved PCR bits
   401		 */
   402		vc->pcr = (host_pcr_bit - guest_pcr_bit) | PCR_MASK;
   403		spin_unlock(&vc->lock);
   404	
   405		return 0;
   406	}
   407	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 26146 bytes --]

^ permalink raw reply

* [PATCH] cxl: Fix kobject memleak
From: Wang Hai @ 2020-06-02 12:07 UTC (permalink / raw)
  To: fbarrat, ajd, arnd, gregkh; +Cc: wanghai38, linuxppc-dev, imunsie, linux-kernel

Currently the error return path from kobject_init_and_add() is not
followed by a call to kobject_put() - which means we are leaking
the kobject.

Fix it by adding a call to kobject_put() in the error path of
kobject_init_and_add().

Fixes: b087e6190ddc ("cxl: Export optional AFU configuration record in sysfs")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
---
 drivers/misc/cxl/sysfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/cxl/sysfs.c b/drivers/misc/cxl/sysfs.c
index f0263d1..d97a243 100644
--- a/drivers/misc/cxl/sysfs.c
+++ b/drivers/misc/cxl/sysfs.c
@@ -624,7 +624,7 @@ static struct afu_config_record *cxl_sysfs_afu_new_cr(struct cxl_afu *afu, int c
 	rc = kobject_init_and_add(&cr->kobj, &afu_config_record_type,
 				  &afu->dev.kobj, "cr%i", cr->cr);
 	if (rc)
-		goto err;
+		goto err1;
 
 	rc = sysfs_create_bin_file(&cr->kobj, &cr->config_attr);
 	if (rc)
-- 
1.8.3.1


^ permalink raw reply related

* Re: [PATCH] powerpc/nvram: Replace kmalloc with kzalloc in the error message
From: Dan Carpenter @ 2020-06-02 11:41 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Yi Wang, Tony Luck, Kees Cook, Wang Liang, Greg Kroah-Hartman,
	Anton Vorontsov, kernel-janitors, LKML, Markus Elfring,
	Liao Pingfang, Xue Zhihong, Colin Cross, Joe Perches,
	Paul Mackerras, Thomas Gleixner, linuxppc-dev, Allison Randal
In-Reply-To: <87a71liucy.fsf@mpe.ellerman.id.au>

On Tue, Jun 02, 2020 at 09:23:57PM +1000, Michael Ellerman wrote:
> Markus Elfring <Markus.Elfring@web.de> writes:
> >>>> Please just remove the message instead, it's a tiny allocation that's
> >>>> unlikely to ever fail, and the caller will print an error anyway.
> >>>
> >>> How do you think about to take another look at a previous update suggestion
> >>> like the following?
> >>>
> >>> powerpc/nvram: Delete three error messages for a failed memory allocation
> >>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/00845261-8528-d011-d3b8-e9355a231d3a@users.sourceforge.net/
> >>> https://lore.kernel.org/linuxppc-dev/00845261-8528-d011-d3b8-e9355a231d3a@users.sourceforge.net/
> >>> https://lore.kernel.org/patchwork/patch/752720/
> >>> https://lkml.org/lkml/2017/1/19/537
> >>
> >> That deleted the messages from nvram_scan_partitions(), but neither of
> >> the callers of nvram_scan_paritions() check its return value or print
> >> anything if it fails. So removing those messages would make those
> >> failures silent which is not what we want.
> >
> > * How do you think about information like the following?
> >   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?id=f359287765c04711ff54fbd11645271d8e5ff763#n883
> > “…
> > These generic allocation functions all emit a stack dump on failure when used
> > without __GFP_NOWARN so there is no use in emitting an additional failure
> > message when NULL is returned.
> > …”
> 
> Are you sure that's actually true?
> 
> A quick look around in slub.c leads me to:
> 
> slab_out_of_memory(struct kmem_cache *s, gfp_t gfpflags, int nid)
> {
> #ifdef CONFIG_SLUB_DEBUG

You first have to enable EXPERT mode before you can disable SLUB_DEBUG.
So that hopefully means you *really* want to save memory.  It doesn't
make sense to add a bunch of memory wasting printks when the users want
to go to extra lengths to conserve memory.

regards,
dan carpenter


^ permalink raw reply

* Re: [PATCH] powerpc/nvram: Replace kmalloc with kzalloc in the error message
From: Michael Ellerman @ 2020-06-02 11:23 UTC (permalink / raw)
  To: Markus Elfring, Liao Pingfang, linuxppc-dev, Joe Perches
  Cc: Yi Wang, Tony Luck, Kees Cook, Wang Liang, Anton Vorontsov,
	kernel-janitors, LKML, Colin Cross, Paul Mackerras, Xue Zhihong,
	Greg Kroah-Hartman, Thomas Gleixner, Allison Randal
In-Reply-To: <a3c158fa-3829-f38a-9202-8984b5ef5f21@web.de>

Markus Elfring <Markus.Elfring@web.de> writes:
>>>> Please just remove the message instead, it's a tiny allocation that's
>>>> unlikely to ever fail, and the caller will print an error anyway.
>>>
>>> How do you think about to take another look at a previous update suggestion
>>> like the following?
>>>
>>> powerpc/nvram: Delete three error messages for a failed memory allocation
>>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/00845261-8528-d011-d3b8-e9355a231d3a@users.sourceforge.net/
>>> https://lore.kernel.org/linuxppc-dev/00845261-8528-d011-d3b8-e9355a231d3a@users.sourceforge.net/
>>> https://lore.kernel.org/patchwork/patch/752720/
>>> https://lkml.org/lkml/2017/1/19/537
>>
>> That deleted the messages from nvram_scan_partitions(), but neither of
>> the callers of nvram_scan_paritions() check its return value or print
>> anything if it fails. So removing those messages would make those
>> failures silent which is not what we want.
>
> * How do you think about information like the following?
>   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?id=f359287765c04711ff54fbd11645271d8e5ff763#n883
> “…
> These generic allocation functions all emit a stack dump on failure when used
> without __GFP_NOWARN so there is no use in emitting an additional failure
> message when NULL is returned.
> …”

Are you sure that's actually true?

A quick look around in slub.c leads me to:

slab_out_of_memory(struct kmem_cache *s, gfp_t gfpflags, int nid)
{
#ifdef CONFIG_SLUB_DEBUG
	static DEFINE_RATELIMIT_STATE(slub_oom_rs, DEFAULT_RATELIMIT_INTERVAL,
				      DEFAULT_RATELIMIT_BURST);
	int node;
	struct kmem_cache_node *n;

	if ((gfpflags & __GFP_NOWARN) || !__ratelimit(&slub_oom_rs))
		return;

	pr_warn("SLUB: Unable to allocate memory on node %d, gfp=%#x(%pGg)\n",
		nid, gfpflags, &gfpflags);
	pr_warn("  cache: %s, object size: %u, buffer size: %u, default order: %u, min order: %u\n",
		s->name, s->object_size, s->size, oo_order(s->oo),
		oo_order(s->min));

	if (oo_order(s->min) > get_order(s->object_size))
		pr_warn("  %s debugging increased min order, use slub_debug=O to disable.\n",
			s->name);

	for_each_kmem_cache_node(s, node, n) {
		unsigned long nr_slabs;
		unsigned long nr_objs;
		unsigned long nr_free;

		nr_free  = count_partial(n, count_free);
		nr_slabs = node_nr_slabs(n);
		nr_objs  = node_nr_objs(n);

		pr_warn("  node %d: slabs: %ld, objs: %ld, free: %ld\n",
			node, nr_slabs, nr_objs, nr_free);
	}
#endif
}

Which looks a lot like it won't print anything when CONFIG_SLUB_DEBUG=n.

But maybe I'm looking in the wrong place?

cheers

^ permalink raw reply

* Re: [PATCH] hw_breakpoint: Fix build warnings with clang
From: Ravi Bangoria @ 2020-06-02 11:19 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: christophe.leroy, apopple, mikey, Ravi Bangoria, linux-kernel,
	npiggin, paulus, naveen.n.rao, linuxppc-dev
In-Reply-To: <87d06hivfs.fsf@mpe.ellerman.id.au>



On 6/2/20 4:30 PM, Michael Ellerman wrote:
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>> Le 02/06/2020 à 06:12, Ravi Bangoria a écrit :
>>> kbuild test robot reported few build warnings with hw_breakpoint code
>>> when compiled with clang[1]. Fix those.
>>>
>>> [1]: https://lore.kernel.org/linuxppc-dev/202005192233.oi9CjRtA%25lkp@intel.com/
>>>
> 
> This should have mentioned that some of the errors were recently
> introduced by your commit.

Sure, will take care next time.

> 
>>> Reported-by: kbuild test robot <lkp@intel.com>
>>> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
>>> ---
>>> Note: Prepared on top of powerpc/next.
>>>
>>>    arch/powerpc/include/asm/hw_breakpoint.h | 3 ---
>>>    include/linux/hw_breakpoint.h            | 4 ++++
>>>    2 files changed, 4 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
>>> index f42a55eb77d2..cb424799da0d 100644
>>> --- a/arch/powerpc/include/asm/hw_breakpoint.h
>>> +++ b/arch/powerpc/include/asm/hw_breakpoint.h
>>> @@ -70,9 +70,6 @@ extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
>>>    						unsigned long val, void *data);
>>>    int arch_install_hw_breakpoint(struct perf_event *bp);
>>>    void arch_uninstall_hw_breakpoint(struct perf_event *bp);
>>> -int arch_reserve_bp_slot(struct perf_event *bp);
>>> -void arch_release_bp_slot(struct perf_event *bp);
>>> -void arch_unregister_hw_breakpoint(struct perf_event *bp);
>>>    void hw_breakpoint_pmu_read(struct perf_event *bp);
>>>    extern void flush_ptrace_hw_breakpoint(struct task_struct *tsk);
>>>    
>>> diff --git a/include/linux/hw_breakpoint.h b/include/linux/hw_breakpoint.h
>>> index 6058c3844a76..521481f0d320 100644
>>> --- a/include/linux/hw_breakpoint.h
>>> +++ b/include/linux/hw_breakpoint.h
>>> @@ -80,6 +80,10 @@ extern int dbg_reserve_bp_slot(struct perf_event *bp);
>>>    extern int dbg_release_bp_slot(struct perf_event *bp);
>>>    extern int reserve_bp_slot(struct perf_event *bp);
>>>    extern void release_bp_slot(struct perf_event *bp);
>>> +extern int hw_breakpoint_weight(struct perf_event *bp);
>>> +extern int arch_reserve_bp_slot(struct perf_event *bp);
>>> +extern void arch_release_bp_slot(struct perf_event *bp);
>>> +extern void arch_unregister_hw_breakpoint(struct perf_event *bp);
>>
>> Please no new 'extern'. In the old days 'extern' keyword was used, but
>> new code shall not introduce new unnecessary usage of 'extern' keyword.
>> See report from Checkpatch below:
>>
>> WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description
>> (prefer a maximum 75 chars per line)
>> #9:
>> [1]:
>> https://lore.kernel.org/linuxppc-dev/202005192233.oi9CjRtA%25lkp@intel.com/
>>
>> CHECK:AVOID_EXTERNS: extern prototypes should be avoided in .h files
>> #40: FILE: include/linux/hw_breakpoint.h:83:
>> +extern int hw_breakpoint_weight(struct perf_event *bp);
> 
> I fixed it up when applying.

Thanks Michael.

Ravi

^ permalink raw reply

* Re: [PATCH] hw_breakpoint: Fix build warnings with clang
From: Michael Ellerman @ 2020-06-02 11:00 UTC (permalink / raw)
  To: Christophe Leroy, Ravi Bangoria
  Cc: christophe.leroy, apopple, mikey, linux-kernel, npiggin, paulus,
	naveen.n.rao, linuxppc-dev
In-Reply-To: <0217bbaf-a831-8aea-3ecd-fa217fca1669@csgroup.eu>

Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> Le 02/06/2020 à 06:12, Ravi Bangoria a écrit :
>> kbuild test robot reported few build warnings with hw_breakpoint code
>> when compiled with clang[1]. Fix those.
>> 
>> [1]: https://lore.kernel.org/linuxppc-dev/202005192233.oi9CjRtA%25lkp@intel.com/
>> 

This should have mentioned that some of the errors were recently
introduced by your commit.

>> Reported-by: kbuild test robot <lkp@intel.com>
>> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
>> ---
>> Note: Prepared on top of powerpc/next.
>> 
>>   arch/powerpc/include/asm/hw_breakpoint.h | 3 ---
>>   include/linux/hw_breakpoint.h            | 4 ++++
>>   2 files changed, 4 insertions(+), 3 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
>> index f42a55eb77d2..cb424799da0d 100644
>> --- a/arch/powerpc/include/asm/hw_breakpoint.h
>> +++ b/arch/powerpc/include/asm/hw_breakpoint.h
>> @@ -70,9 +70,6 @@ extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
>>   						unsigned long val, void *data);
>>   int arch_install_hw_breakpoint(struct perf_event *bp);
>>   void arch_uninstall_hw_breakpoint(struct perf_event *bp);
>> -int arch_reserve_bp_slot(struct perf_event *bp);
>> -void arch_release_bp_slot(struct perf_event *bp);
>> -void arch_unregister_hw_breakpoint(struct perf_event *bp);
>>   void hw_breakpoint_pmu_read(struct perf_event *bp);
>>   extern void flush_ptrace_hw_breakpoint(struct task_struct *tsk);
>>   
>> diff --git a/include/linux/hw_breakpoint.h b/include/linux/hw_breakpoint.h
>> index 6058c3844a76..521481f0d320 100644
>> --- a/include/linux/hw_breakpoint.h
>> +++ b/include/linux/hw_breakpoint.h
>> @@ -80,6 +80,10 @@ extern int dbg_reserve_bp_slot(struct perf_event *bp);
>>   extern int dbg_release_bp_slot(struct perf_event *bp);
>>   extern int reserve_bp_slot(struct perf_event *bp);
>>   extern void release_bp_slot(struct perf_event *bp);
>> +extern int hw_breakpoint_weight(struct perf_event *bp);
>> +extern int arch_reserve_bp_slot(struct perf_event *bp);
>> +extern void arch_release_bp_slot(struct perf_event *bp);
>> +extern void arch_unregister_hw_breakpoint(struct perf_event *bp);
>
> Please no new 'extern'. In the old days 'extern' keyword was used, but 
> new code shall not introduce new unnecessary usage of 'extern' keyword. 
> See report from Checkpatch below:
>
> WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
> (prefer a maximum 75 chars per line)
> #9:
> [1]: 
> https://lore.kernel.org/linuxppc-dev/202005192233.oi9CjRtA%25lkp@intel.com/
>
> CHECK:AVOID_EXTERNS: extern prototypes should be avoided in .h files
> #40: FILE: include/linux/hw_breakpoint.h:83:
> +extern int hw_breakpoint_weight(struct perf_event *bp);

I fixed it up when applying.

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/32: disable KASAN with pages bigger than 16k
From: Michael Ellerman @ 2020-06-02 10:25 UTC (permalink / raw)
  To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <afb1a3a9-8d7d-4c99-d42f-f6d78dcef0a5@csgroup.eu>

Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> Le 28/05/2020 à 12:17, Christophe Leroy a écrit :
>> Mapping of early shadow area is implemented by using a single static
>> page table having all entries pointing to the same early shadow page.
>> The shadow area must therefore occupy full PGD entries.
>> 
>> The shadow area has a size of 128Mbytes starting at 0xf8000000.
>> With 4k pages, a PGD entry is 4Mbytes
>> With 16k pages, a PGD entry is 64Mbytes
>> With 64k pages, a PGD entry is 256Mbytes which is too big.
>
> That's for 32k pages that a PGD is 256Mbytes.
>
> With 64k pages, a PGD entry is 1Gbytes which is too big.
>
> Michael, can you fix the commit log ?

Yes.

  powerpc/32: Disable KASAN with pages bigger than 16k
  
  Mapping of early shadow area is implemented by using a single static
  page table having all entries pointing to the same early shadow page.
  The shadow area must therefore occupy full PGD entries.
  
  The shadow area has a size of 128MB starting at 0xf8000000.
  With 4k pages, a PGD entry is 4MB
  With 16k pages, a PGD entry is 64MB
  With 64k pages, a PGD entry is 1GB which is too big.
  
  Until we rework the early shadow mapping, disable KASAN when the page
  size is too big.

  Fixes: 2edb16efc899 ("powerpc/32: Add KASAN support")
  Cc: stable@vger.kernel.org # v5.2+
  Reported-by: kbuild test robot <lkp@intel.com>
  Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  Link: https://lore.kernel.org/r/7195fcde7314ccbf7a081b356084a69d421b10d4.1590660977.git.christophe.leroy@csgroup.eu


cheers

^ permalink raw reply

* [RESEND PATCH v9 2/5] seq_buf: Export seq_buf_printf
From: Vaibhav Jain @ 2020-06-02 10:14 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm, linux-kernel
  Cc: Santosh Sivaraj, Cezary Rojewski, Piotr Maziarz, Steven Rostedt,
	Christoph Hellwig, Oliver O'Halloran, Aneesh Kumar K . V,
	Borislav Petkov, Vaibhav Jain, Dan Williams, Ira Weiny
In-Reply-To: <20200602101438.73929-1-vaibhav@linux.ibm.com>

'seq_buf' provides a very useful abstraction for writing to a string
buffer without needing to worry about it over-flowing. However even
though the API has been stable for couple of years now its still not
exported to kernel loadable modules limiting its usage.

Hence this patch proposes update to 'seq_buf.c' to mark
seq_buf_printf() which is part of the seq_buf API to be exported to
kernel loadable GPL modules. This symbol will be used in later parts
of this patch-set to simplify content creation for a sysfs attribute.

Cc: Piotr Maziarz <piotrx.maziarz@linux.intel.com>
Cc: Cezary Rojewski <cezary.rojewski@intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Borislav Petkov <bp@alien8.de>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:

Resend:
* Added ack from Steven Rostedt

v8..v9:
* None

v7..v8:
* Updated the patch title [ Christoph Hellwig ]
* Updated patch description to replace confusing term 'external kernel
  modules' to 'kernel lodable modules'.

Resend:
* Added ack from Steven Rostedt

v6..v7:
* New patch in the series
---
 lib/seq_buf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/seq_buf.c b/lib/seq_buf.c
index 4e865d42ab03..707453f5d58e 100644
--- a/lib/seq_buf.c
+++ b/lib/seq_buf.c
@@ -91,6 +91,7 @@ int seq_buf_printf(struct seq_buf *s, const char *fmt, ...)
 
 	return ret;
 }
+EXPORT_SYMBOL_GPL(seq_buf_printf);
 
 #ifdef CONFIG_BINARY_PRINTF
 /**
-- 
2.26.2


^ permalink raw reply related

* [RESEND PATCH v9 5/5] powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
From: Vaibhav Jain @ 2020-06-02 10:14 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm, linux-kernel
  Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
	Aneesh Kumar K . V, Vaibhav Jain, Dan Williams, Ira Weiny
In-Reply-To: <20200602101438.73929-1-vaibhav@linux.ibm.com>

This patch implements support for PDSM request 'PAPR_PDSM_HEALTH'
that returns a newly introduced 'struct nd_papr_pdsm_health' instance
containing dimm health information back to user space in response to
ND_CMD_CALL. This functionality is implemented in newly introduced
papr_pdsm_health() that queries the nvdimm health information and
then copies this information to the package payload whose layout is
defined by 'struct nd_papr_pdsm_health'.

The patch also introduces a new member 'struct papr_scm_priv.health'
thats an instance of 'struct nd_papr_pdsm_health' to cache the health
information of a nvdimm. As a result functions drc_pmem_query_health()
and flags_show() are updated to populate and use this new struct
instead of a u64 integer that was earlier used.

Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:

Resend:
* Added ack from Aneesh.

v8..v9:
* s/PAPR_SCM_PDSM_HEALTH/PAPR_PDSM_HEALTH/g  [ Dan , Aneesh ]
* s/PAPR_SCM_PSDM_DIMM_*/PAPR_PDSM_DIMM_*/g
* Renamed papr_scm_get_health() to papr_psdm_health()
* Updated patch description to replace papr-scm dimm with nvdimm.

v7..v8:
* None

Resend:
* None

v6..v7:
* Updated flags_show() to use seq_buf_printf(). [Mpe]
* Updated papr_scm_get_health() to use newly introduced
  __drc_pmem_query_health() bypassing the cache [Mpe].

v5..v6:
* Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to
  gaurd against possibility of different compilers adding different
  paddings to the struct [ Dan Williams ]

* Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of
  'bool' and also updated drc_pmem_query_health() to take this into
  account. [ Dan Williams ]

v4..v5:
* None

v3..v4:
* Call the DSM_PAPR_SCM_HEALTH service function from
  papr_scm_service_dsm() instead of papr_scm_ndctl(). [Aneesh]

v2..v3:
* Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types
  as its exported to the userspace [Aneesh]
* Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health
  from enum to #defines [Aneesh]

v1..v2:
* New patch in the series
---
 arch/powerpc/include/uapi/asm/papr_pdsm.h |  39 +++++++
 arch/powerpc/platforms/pseries/papr_scm.c | 125 +++++++++++++++++++---
 2 files changed, 147 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h b/arch/powerpc/include/uapi/asm/papr_pdsm.h
index 6407fefcc007..411725a91591 100644
--- a/arch/powerpc/include/uapi/asm/papr_pdsm.h
+++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
@@ -115,6 +115,7 @@ struct nd_pdsm_cmd_pkg {
  */
 enum papr_pdsm {
 	PAPR_PDSM_MIN = 0x0,
+	PAPR_PDSM_HEALTH,
 	PAPR_PDSM_MAX,
 };
 
@@ -133,4 +134,42 @@ static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
 		return (void *)(pcmd->payload);
 }
 
+/* Various nvdimm health indicators */
+#define PAPR_PDSM_DIMM_HEALTHY       0
+#define PAPR_PDSM_DIMM_UNHEALTHY     1
+#define PAPR_PDSM_DIMM_CRITICAL      2
+#define PAPR_PDSM_DIMM_FATAL         3
+
+/*
+ * Struct exchanged between kernel & ndctl in for PAPR_PDSM_HEALTH
+ * Various flags indicate the health status of the dimm.
+ *
+ * dimm_unarmed		: Dimm not armed. So contents wont persist.
+ * dimm_bad_shutdown	: Previous shutdown did not persist contents.
+ * dimm_bad_restore	: Contents from previous shutdown werent restored.
+ * dimm_scrubbed	: Contents of the dimm have been scrubbed.
+ * dimm_locked		: Contents of the dimm cant be modified until CEC reboot
+ * dimm_encrypted	: Contents of dimm are encrypted.
+ * dimm_health		: Dimm health indicator. One of PAPR_PDSM_DIMM_XXXX
+ */
+struct nd_papr_pdsm_health_v1 {
+	__u8 dimm_unarmed;
+	__u8 dimm_bad_shutdown;
+	__u8 dimm_bad_restore;
+	__u8 dimm_scrubbed;
+	__u8 dimm_locked;
+	__u8 dimm_encrypted;
+	__u16 dimm_health;
+} __packed;
+
+/*
+ * Typedef the current struct for dimm_health so that any application
+ * or kernel recompiled after introducing a new version automatically
+ * supports the new version.
+ */
+#define nd_papr_pdsm_health nd_papr_pdsm_health_v1
+
+/* Current version number for the dimm health struct */
+#define ND_PAPR_PDSM_HEALTH_VERSION 1
+
 #endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index 5e2237e7ec08..c0606c0c659c 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -88,7 +88,7 @@ struct papr_scm_priv {
 	unsigned long lasthealth_jiffies;
 
 	/* Health information for the dimm */
-	u64 health_bitmap;
+	struct nd_papr_pdsm_health health;
 };
 
 static int drc_pmem_bind(struct papr_scm_priv *p)
@@ -201,6 +201,7 @@ static int drc_pmem_query_n_bind(struct papr_scm_priv *p)
 static int __drc_pmem_query_health(struct papr_scm_priv *p)
 {
 	unsigned long ret[PLPAR_HCALL_BUFSIZE];
+	u64 health;
 	long rc;
 
 	/* issue the hcall */
@@ -208,18 +209,46 @@ static int __drc_pmem_query_health(struct papr_scm_priv *p)
 	if (rc != H_SUCCESS) {
 		dev_err(&p->pdev->dev,
 			 "Failed to query health information, Err:%ld\n", rc);
-		rc = -ENXIO;
-		goto out;
+		return -ENXIO;
 	}
 
 	p->lasthealth_jiffies = jiffies;
-	p->health_bitmap = ret[0] & ret[1];
+	health = ret[0] & ret[1];
 
 	dev_dbg(&p->pdev->dev,
 		"Queried dimm health info. Bitmap:0x%016lx Mask:0x%016lx\n",
 		ret[0], ret[1]);
-out:
-	return rc;
+
+	memset(&p->health, 0, sizeof(p->health));
+
+	/* Check for various masks in bitmap and set the buffer */
+	if (health & PAPR_PMEM_UNARMED_MASK)
+		p->health.dimm_unarmed = 1;
+
+	if (health & PAPR_PMEM_BAD_SHUTDOWN_MASK)
+		p->health.dimm_bad_shutdown = 1;
+
+	if (health & PAPR_PMEM_BAD_RESTORE_MASK)
+		p->health.dimm_bad_restore = 1;
+
+	if (health & PAPR_PMEM_ENCRYPTED)
+		p->health.dimm_encrypted = 1;
+
+	if (health & PAPR_PMEM_SCRUBBED_AND_LOCKED) {
+		p->health.dimm_locked = 1;
+		p->health.dimm_scrubbed = 1;
+	}
+
+	if (health & PAPR_PMEM_HEALTH_UNHEALTHY)
+		p->health.dimm_health = PAPR_PDSM_DIMM_UNHEALTHY;
+
+	if (health & PAPR_PMEM_HEALTH_CRITICAL)
+		p->health.dimm_health = PAPR_PDSM_DIMM_CRITICAL;
+
+	if (health & PAPR_PMEM_HEALTH_FATAL)
+		p->health.dimm_health = PAPR_PDSM_DIMM_FATAL;
+
+	return 0;
 }
 
 /* Min interval in seconds for assuming stable dimm health */
@@ -403,6 +432,58 @@ static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
 	return 0;
 }
 
+/* Fetch the DIMM health info and populate it in provided package. */
+static int papr_pdsm_health(struct papr_scm_priv *p,
+			       struct nd_pdsm_cmd_pkg *pkg)
+{
+	int rc;
+	size_t copysize = sizeof(p->health);
+
+	/* Ensure dimm health mutex is taken preventing concurrent access */
+	rc = mutex_lock_interruptible(&p->health_mutex);
+	if (rc)
+		goto out;
+
+	/* Always fetch upto date dimm health data ignoring cached values */
+	rc = __drc_pmem_query_health(p);
+	if (rc)
+		goto out_unlock;
+	/*
+	 * If the requested payload version is greater than one we know
+	 * about, return the payload version we know about and let
+	 * caller/userspace handle.
+	 */
+	if (pkg->payload_version > ND_PAPR_PDSM_HEALTH_VERSION)
+		pkg->payload_version = ND_PAPR_PDSM_HEALTH_VERSION;
+
+	if (pkg->hdr.nd_size_out < copysize) {
+		dev_dbg(&p->pdev->dev, "Truncated payload (%u). Expected (%lu)",
+			pkg->hdr.nd_size_out, copysize);
+		rc = -ENOSPC;
+		goto out_unlock;
+	}
+
+	dev_dbg(&p->pdev->dev, "Copying payload size=%lu version=0x%x\n",
+		copysize, pkg->payload_version);
+
+	/* Copy the health struct to the payload */
+	memcpy(pdsm_cmd_to_payload(pkg), &p->health, copysize);
+	pkg->hdr.nd_fw_size = copysize;
+
+out_unlock:
+	mutex_unlock(&p->health_mutex);
+
+out:
+	/*
+	 * Put the error in out package and return success from function
+	 * so that errors if any are propogated back to userspace.
+	 */
+	pkg->cmd_status = rc;
+	dev_dbg(&p->pdev->dev, "completion code = %d\n", rc);
+
+	return 0;
+}
+
 static int papr_scm_service_pdsm(struct papr_scm_priv *p,
 				struct nd_pdsm_cmd_pkg *call_pkg)
 {
@@ -417,6 +498,9 @@ static int papr_scm_service_pdsm(struct papr_scm_priv *p,
 
 	/* Depending on the DSM command call appropriate service routine */
 	switch (call_pkg->hdr.nd_command) {
+	case PAPR_PDSM_HEALTH:
+		return papr_pdsm_health(p, call_pkg);
+
 	default:
 		dev_dbg(&p->pdev->dev, "Unsupported PDSM request 0x%llx\n",
 			call_pkg->hdr.nd_command);
@@ -485,34 +569,41 @@ static ssize_t flags_show(struct device *dev,
 	struct nvdimm *dimm = to_nvdimm(dev);
 	struct papr_scm_priv *p = nvdimm_provider_data(dimm);
 	struct seq_buf s;
-	u64 health;
 	int rc;
 
 	rc = drc_pmem_query_health(p);
 	if (rc)
 		return rc;
 
-	/* Copy health_bitmap locally, check masks & update out buffer */
-	health = READ_ONCE(p->health_bitmap);
-
 	seq_buf_init(&s, buf, PAGE_SIZE);
-	if (health & PAPR_PMEM_UNARMED_MASK)
+
+	/* Protect concurrent modifications to papr_scm_priv */
+	rc = mutex_lock_interruptible(&p->health_mutex);
+	if (rc)
+		return rc;
+
+	if (p->health.dimm_unarmed)
 		seq_buf_printf(&s, "not_armed ");
 
-	if (health & PAPR_PMEM_BAD_SHUTDOWN_MASK)
+	if (p->health.dimm_bad_shutdown)
 		seq_buf_printf(&s, "flush_fail ");
 
-	if (health & PAPR_PMEM_BAD_RESTORE_MASK)
+	if (p->health.dimm_bad_restore)
 		seq_buf_printf(&s, "restore_fail ");
 
-	if (health & PAPR_PMEM_ENCRYPTED)
+	if (p->health.dimm_encrypted)
 		seq_buf_printf(&s, "encrypted ");
 
-	if (health & PAPR_PMEM_SMART_EVENT_MASK)
+	if (p->health.dimm_health)
 		seq_buf_printf(&s, "smart_notify ");
 
-	if (health & PAPR_PMEM_SCRUBBED_AND_LOCKED)
-		seq_buf_printf(&s, "scrubbed locked ");
+	if (p->health.dimm_scrubbed)
+		seq_buf_printf(&s, "scrubbed ");
+
+	if (p->health.dimm_locked)
+		seq_buf_printf(&s, "locked ");
+
+	mutex_unlock(&p->health_mutex);
 
 	if (seq_buf_used(&s))
 		seq_buf_printf(&s, "\n");
-- 
2.26.2


^ permalink raw reply related

* [RESEND PATCH v9 4/5] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods
From: Vaibhav Jain @ 2020-06-02 10:14 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm, linux-kernel
  Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
	Aneesh Kumar K . V, Vaibhav Jain, Dan Williams, Ira Weiny
In-Reply-To: <20200602101438.73929-1-vaibhav@linux.ibm.com>

Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
module and add the command family NVDIMM_FAMILY_PAPR to the white list
of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the
nvdimm command mask and implement necessary scaffolding in the module
to handle ND_CMD_CALL ioctl and PDSM requests that we receive.

The layout of the PDSM request as we expect from libnvdimm/libndctl is
described in newly introduced uapi header 'papr_pdsm.h' which
defines a new 'struct nd_pdsm_cmd_pkg' header. This header is used
to communicate the PDSM request via member
'nd_cmd_pkg.nd_command' and size of payload that need to be
sent/received for servicing the PDSM.

A new function is_cmd_valid() is implemented that reads the args to
papr_scm_ndctl() and performs sanity tests on them. A new function
papr_scm_service_pdsm() is introduced and is called from
papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL
command from libnvdimm.

Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:

Resend:
* Added ack from Aneesh.

v8..v9:
* Reduced the usage of term SCM replacing it with appropriate
  replacement [ Dan Williams, Aneesh ]
* Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h'
* s/PAPR_SCM_PDSM_*/PAPR_PDSM_*/g
* s/NVDIMM_FAMILY_PAPR_SCM/NVDIMM_FAMILY_PAPR/g
* Minor updates to 'papr_psdm.h' to replace usage of term 'SCM'.
* Minor update to patch description.

v7..v8:
* Removed the 'payload_offset' field from 'struct
  nd_pdsm_cmd_pkg'. Instead command payload is always assumed to start
  at 'nd_pdsm_cmd_pkg.payload'. [ Aneesh ]
* To enable introducing new fields to 'struct nd_pdsm_cmd_pkg',
  'reserved' field of 10-bytes is introduced. [ Aneesh ]
* Fixed a typo in "Backward Compatibility" section of papr_scm_pdsm.h
  [ Ira ]

Resend:
* None

v6..v7 :
* Removed the re-definitions of __packed macro from papr_scm_pdsm.h
  [Mpe].
* Removed the usage of __KERNEL__ macros in papr_scm_pdsm.h [Mpe].
* Removed macros that were unused in papr_scm.c from papr_scm_pdsm.h
  [Mpe].
* Made functions defined in papr_scm_pdsm.h as static inline. [Mpe]

v5..v6 :
* Changed the usage of the term DSM to PDSM to distinguish it from the
  ACPI term [ Dan Williams ]
* Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various struct
  to reflect the new terminology.
* Updated the patch description and title to reflect the new terminology.
* Squashed patch to introduce new command family in 'ndctl.h' with
  this patch [ Dan Williams ]
* Updated the papr_scm_pdsm method starting index from 0x10000 to 0x0
  [ Dan Williams ]
* Removed redundant license text from the papr_scm_psdm.h file.
  [ Dan Williams ]
* s/envelop/envelope/ at various places [ Dan Williams ]
* Added '__packed' attribute to command package header to gaurd
  against different compiler adding paddings between the fields.
  [ Dan Williams]
* Converted various pr_debug to dev_debug [ Dan Williams ]

v4..v5 :
* None

v3..v4 :
* None

v2..v3 :
* Updated the patch prefix to 'ndctl/uapi' [Aneesh]

v1..v2 :
* None
---
 arch/powerpc/include/uapi/asm/papr_pdsm.h | 136 ++++++++++++++++++++++
 arch/powerpc/platforms/pseries/papr_scm.c | 101 +++++++++++++++-
 include/uapi/linux/ndctl.h                |   1 +
 3 files changed, 232 insertions(+), 6 deletions(-)
 create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h

diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h b/arch/powerpc/include/uapi/asm/papr_pdsm.h
new file mode 100644
index 000000000000..6407fefcc007
--- /dev/null
+++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
@@ -0,0 +1,136 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+/*
+ * PAPR nvDimm Specific Methods (PDSM) and structs for libndctl
+ *
+ * (C) Copyright IBM 2020
+ *
+ * Author: Vaibhav Jain <vaibhav at linux.ibm.com>
+ */
+
+#ifndef _UAPI_ASM_POWERPC_PAPR_PDSM_H_
+#define _UAPI_ASM_POWERPC_PAPR_PDSM_H_
+
+#include <linux/types.h>
+
+/*
+ * PDSM Envelope:
+ *
+ * The ioctl ND_CMD_CALL transfers data between user-space and kernel via
+ * envelope which consists of a header and user-defined payload sections.
+ * The header is described by 'struct nd_pdsm_cmd_pkg' which expects a
+ * payload following it and accessible via 'nd_pdsm_cmd_pkg.payload' field.
+ * There is reserved field that can used to introduce new fields to the
+ * structure in future. It also tries to ensure that 'nd_pdsm_cmd_pkg.payload'
+ * lies at a 8-byte boundary.
+ *
+ *  +-------------+---------------------+---------------------------+
+ *  |   64-Bytes  |       16-Bytes      |       Max 176-Bytes       |
+ *  +-------------+---------------------+---------------------------+
+ *  |               nd_pdsm_cmd_pkg     |                           |
+ *  |-------------+                     |                           |
+ *  |  nd_cmd_pkg |                     |                           |
+ *  +-------------+---------------------+---------------------------+
+ *  | nd_family   |                     |                           |
+ *  | nd_size_out | cmd_status          |                           |
+ *  | nd_size_in  | payload_version     |     payload               |
+ *  | nd_command  | reserved            |                           |
+ *  | nd_fw_size  |                     |                           |
+ *  +-------------+---------------------+---------------------------+
+ *
+ * PDSM Header:
+ *
+ * The header is defined as 'struct nd_pdsm_cmd_pkg' which embeds a
+ * 'struct nd_cmd_pkg' instance. The PDSM command is assigned to member
+ * 'nd_cmd_pkg.nd_command'. Apart from size information of the envelope which is
+ * contained in 'struct nd_cmd_pkg', the header also has members following
+ * members:
+ *
+ * 'cmd_status'		: (Out) Errors if any encountered while servicing PDSM.
+ * 'payload_version'	: (In/Out) Version number associated with the payload.
+ * 'reserved'		: Not used and reserved for future.
+ *
+ * PDSM Payload:
+ *
+ * The layout of the PDSM Payload is defined by various structs shared between
+ * papr_scm and libndctl so that contents of payload can be interpreted. During
+ * servicing of a PDSM the papr_scm module will read input args from the payload
+ * field by casting its contents to an appropriate struct pointer based on the
+ * PDSM command. Similarly the output of servicing the PDSM command will be
+ * copied to the payload field using the same struct.
+ *
+ * 'libnvdimm' enforces a hard limit of 256 bytes on the envelope size, which
+ * leaves around 176 bytes for the envelope payload (ignoring any padding that
+ * the compiler may silently introduce).
+ *
+ * Payload Version:
+ *
+ * A 'payload_version' field is present in PDSM header that indicates a specific
+ * version of the structure present in PDSM Payload for a given PDSM command.
+ * This provides backward compatibility in case the PDSM Payload structure
+ * evolves and different structures are supported by 'papr_scm' and 'libndctl'.
+ *
+ * When sending a PDSM Payload to 'papr_scm', 'libndctl' should send the version
+ * of the payload struct it supports via 'payload_version' field. The 'papr_scm'
+ * module when servicing the PDSM envelope checks the 'payload_version' and then
+ * uses 'payload struct version' == MIN('payload_version field',
+ * 'max payload-struct-version supported by papr_scm') to service the PDSM.
+ * After servicing the PDSM, 'papr_scm' put the negotiated version of payload
+ * struct in returned 'payload_version' field.
+ *
+ * Libndctl on receiving the envelope back from papr_scm again checks the
+ * 'payload_version' field and based on it use the appropriate version dsm
+ * struct to parse the results.
+ *
+ * Backward Compatibility:
+ *
+ * Above scheme of exchanging different versioned PDSM struct between libndctl
+ * and papr_scm should provide backward compatibility until following two
+ * assumptions/conditions when defining new PDSM structs hold:
+ *
+ * Let T(X) = { set of attributes in PDSM struct 'T' versioned X }
+ *
+ * 1. T(X) is a proper subset of T(Y) if Y > X.
+ *    i.e Each new version of PDSM struct should retain existing struct
+ *    attributes from previous version
+ *
+ * 2. If an entity (libndctl or papr_scm) supports a PDSM struct T(X) then
+ *    it should also support T(1), T(2)...T(X - 1).
+ *    i.e When adding support for new version of a PDSM struct, libndctl
+ *    and papr_scm should retain support of the existing PDSM struct
+ *    version they support.
+ */
+
+/* PDSM-header + payload expected with ND_CMD_CALL ioctl from libnvdimm */
+struct nd_pdsm_cmd_pkg {
+	struct nd_cmd_pkg hdr;	/* Package header containing sub-cmd */
+	__s32 cmd_status;	/* Out: Sub-cmd status returned back */
+	__u16 reserved[5];	/* Ignored and to be used in future */
+	__u16 payload_version;	/* In/Out: version of the payload */
+	__u8 payload[];		/* In/Out: Sub-cmd data buffer */
+} __packed;
+
+/*
+ * Methods to be embedded in ND_CMD_CALL request. These are sent to the kernel
+ * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
+ */
+enum papr_pdsm {
+	PAPR_PDSM_MIN = 0x0,
+	PAPR_PDSM_MAX,
+};
+
+/* Convert a libnvdimm nd_cmd_pkg to pdsm specific pkg */
+static inline struct nd_pdsm_cmd_pkg *nd_to_pdsm_cmd_pkg(struct nd_cmd_pkg *cmd)
+{
+	return (struct nd_pdsm_cmd_pkg *) cmd;
+}
+
+/* Return the payload pointer for a given pcmd */
+static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
+{
+	if (pcmd->hdr.nd_size_in == 0 && pcmd->hdr.nd_size_out == 0)
+		return NULL;
+	else
+		return (void *)(pcmd->payload);
+}
+
+#endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index 149431594839..5e2237e7ec08 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -15,13 +15,15 @@
 #include <linux/seq_buf.h>
 
 #include <asm/plpar_wrappers.h>
+#include <asm/papr_pdsm.h>
 
 #define BIND_ANY_ADDR (~0ul)
 
 #define PAPR_SCM_DIMM_CMD_MASK \
 	((1ul << ND_CMD_GET_CONFIG_SIZE) | \
 	 (1ul << ND_CMD_GET_CONFIG_DATA) | \
-	 (1ul << ND_CMD_SET_CONFIG_DATA))
+	 (1ul << ND_CMD_SET_CONFIG_DATA) | \
+	 (1ul << ND_CMD_CALL))
 
 /* DIMM health bitmap bitmap indicators */
 /* SCM device is unable to persist memory contents */
@@ -350,16 +352,97 @@ static int papr_scm_meta_set(struct papr_scm_priv *p,
 	return 0;
 }
 
+/*
+ * Validate the inputs args to dimm-control function and return '0' if valid.
+ * This also does initial sanity validation to ND_CMD_CALL sub-command packages.
+ */
+static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
+		       unsigned int buf_len)
+{
+	unsigned long cmd_mask = PAPR_SCM_DIMM_CMD_MASK;
+	struct nd_pdsm_cmd_pkg *pkg = nd_to_pdsm_cmd_pkg(buf);
+	struct papr_scm_priv *p;
+
+	/* Only dimm-specific calls are supported atm */
+	if (!nvdimm)
+		return -EINVAL;
+
+	/* get the provider date from struct nvdimm */
+	p = nvdimm_provider_data(nvdimm);
+
+	if (!test_bit(cmd, &cmd_mask)) {
+		dev_dbg(&p->pdev->dev, "Unsupported cmd=%u\n", cmd);
+		return -EINVAL;
+	} else if (cmd == ND_CMD_CALL) {
+
+		/* Verify the envelope package */
+		if (!buf || buf_len < sizeof(struct nd_pdsm_cmd_pkg)) {
+			dev_dbg(&p->pdev->dev, "Invalid pkg size=%u\n",
+				buf_len);
+			return -EINVAL;
+		}
+
+		/* Verify that the PDSM family is valid */
+		if (pkg->hdr.nd_family != NVDIMM_FAMILY_PAPR) {
+			dev_dbg(&p->pdev->dev, "Invalid pkg family=0x%llx\n",
+				pkg->hdr.nd_family);
+			return -EINVAL;
+
+		}
+
+		/* We except a payload with all PDSM commands */
+		if (pdsm_cmd_to_payload(pkg) == NULL) {
+			dev_dbg(&p->pdev->dev,
+				"Empty payload for sub-command=0x%llx\n",
+				pkg->hdr.nd_command);
+			return -EINVAL;
+		}
+	}
+
+	/* Command looks valid */
+	return 0;
+}
+
+static int papr_scm_service_pdsm(struct papr_scm_priv *p,
+				struct nd_pdsm_cmd_pkg *call_pkg)
+{
+	/* unknown subcommands return error in packages */
+	if (call_pkg->hdr.nd_command <= PAPR_PDSM_MIN ||
+	    call_pkg->hdr.nd_command >= PAPR_PDSM_MAX) {
+		dev_dbg(&p->pdev->dev, "Invalid PDSM request 0x%llx\n",
+			call_pkg->hdr.nd_command);
+		call_pkg->cmd_status = -EINVAL;
+		return 0;
+	}
+
+	/* Depending on the DSM command call appropriate service routine */
+	switch (call_pkg->hdr.nd_command) {
+	default:
+		dev_dbg(&p->pdev->dev, "Unsupported PDSM request 0x%llx\n",
+			call_pkg->hdr.nd_command);
+		call_pkg->cmd_status = -ENOENT;
+		return 0;
+	}
+}
+
 static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
 			  struct nvdimm *nvdimm, unsigned int cmd, void *buf,
 			  unsigned int buf_len, int *cmd_rc)
 {
 	struct nd_cmd_get_config_size *get_size_hdr;
 	struct papr_scm_priv *p;
+	struct nd_pdsm_cmd_pkg *call_pkg = NULL;
+	int rc;
 
-	/* Only dimm-specific calls are supported atm */
-	if (!nvdimm)
-		return -EINVAL;
+	/* Use a local variable in case cmd_rc pointer is NULL */
+	if (cmd_rc == NULL)
+		cmd_rc = &rc;
+
+	*cmd_rc = is_cmd_valid(nvdimm, cmd, buf, buf_len);
+	if (*cmd_rc) {
+		pr_debug("Invalid cmd=0x%x. Err=%d\n", cmd, *cmd_rc);
+		return *cmd_rc;
+	}
 
 	p = nvdimm_provider_data(nvdimm);
 
@@ -381,13 +464,19 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
 		*cmd_rc = papr_scm_meta_set(p, buf);
 		break;
 
+	case ND_CMD_CALL:
+		call_pkg = nd_to_pdsm_cmd_pkg(buf);
+		*cmd_rc = papr_scm_service_pdsm(p, call_pkg);
+		break;
+
 	default:
-		return -EINVAL;
+		dev_dbg(&p->pdev->dev, "Unknown command = %d\n", cmd);
+		*cmd_rc = -EINVAL;
 	}
 
 	dev_dbg(&p->pdev->dev, "returned with cmd_rc = %d\n", *cmd_rc);
 
-	return 0;
+	return *cmd_rc;
 }
 
 static ssize_t flags_show(struct device *dev,
diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
index de5d90212409..0e09dc5cec19 100644
--- a/include/uapi/linux/ndctl.h
+++ b/include/uapi/linux/ndctl.h
@@ -244,6 +244,7 @@ struct nd_cmd_pkg {
 #define NVDIMM_FAMILY_HPE2 2
 #define NVDIMM_FAMILY_MSFT 3
 #define NVDIMM_FAMILY_HYPERV 4
+#define NVDIMM_FAMILY_PAPR 5
 
 #define ND_IOCTL_CALL			_IOWR(ND_IOCTL, ND_CMD_CALL,\
 					struct nd_cmd_pkg)
-- 
2.26.2


^ permalink raw reply related

* [RESEND PATCH v9 3/5] powerpc/papr_scm: Fetch nvdimm health information from PHYP
From: Vaibhav Jain @ 2020-06-02 10:14 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm, linux-kernel
  Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
	Aneesh Kumar K . V, Vaibhav Jain, Dan Williams, Ira Weiny
In-Reply-To: <20200602101438.73929-1-vaibhav@linux.ibm.com>

Implement support for fetching nvdimm health information via
H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
of 64-bit bitmap, bitwise-and of which is then stored in
'struct papr_scm_priv' and subsequently partially exposed to
user-space via newly introduced dimm specific attribute
'papr/flags'. Since the hcall is costly, the health information is
cached and only re-queried, 60s after the previous successful hcall.

The patch also adds a  documentation text describing flags reported by
the the new sysfs attribute 'papr/flags' is also introduced at
Documentation/ABI/testing/sysfs-bus-papr-pmem.

[1] commit 58b278f568f0 ("powerpc: Provide initial documentation for
PAPR hcalls")

Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:

Resend:
* Added ack from Aneesh.

v8..v9:
* Rename some variables and defines to reduce usage of term SCM
  replacing it with PMEM [Dan Williams, Aneesh]
* s/PAPR_SCM_DIMM/PAPR_PMEM/g
* s/papr_scm_nd_attributes/papr_nd_attributes/g
* s/papr_scm_nd_attribute_group/papr_nd_attribute_group/g
* s/papr_scm_dimm_attr_groups/papr_nd_attribute_groups/g
* Renamed file sysfs-bus-papr-scm to sysfs-bus-papr-pmem

v7..v8:
* Update type of variable 'rc' in __drc_pmem_query_health() and
  drc_pmem_query_health() to long and int respectively. [ Ira ]
* Updated the patch description to s/64 bit Big Endian Number/64-bit
  bitmap/ [ Ira, Aneesh ].

Resend:
* None

v6..v7 :
* Used the exported buf_seq_printf() function to generate content for
  'papr/flags'
* Moved the PAPR_SCM_DIMM_* bit-flags macro definitions to papr_scm.c
  and removed the papr_scm.h file [Mpe]
* Some minor consistency issued in sysfs-bus-papr-scm
  documentation. [Mpe]
* s/dimm_mutex/health_mutex/g [Mpe]
* Split drc_pmem_query_health() into two function one of which takes
  care of caching and locking. [Mpe]
* Fixed a local copy creation of dimm health information using
  READ_ONCE(). [Mpe]

v5..v6 :
* Change the flags sysfs attribute from 'papr_flags' to 'papr/flags'
  [Dan Williams]
* Include documentation for 'papr/flags' attr [Dan Williams]
* Change flag 'save_fail' to 'flush_fail' [Dan Williams]
* Caching of health bitmap to reduce expensive hcalls [Dan Williams]
* Removed usage of PPC_BIT from 'papr-scm.h' header [Mpe]
* Replaced two __be64 integers from papr_scm_priv to a single u64
  integer [Mpe]
* Updated patch description to reflect the changes made in this
  version.
* Removed avoidable usage of 'papr_scm_priv.dimm_mutex' from
  flags_show() [Dan Williams]

v4..v5 :
* None

v3..v4 :
* None

v2..v3 :
* Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
       	 NVDIMM unarmed [Aneesh]

v1..v2 :
* New patch in the series.
---
 Documentation/ABI/testing/sysfs-bus-papr-pmem |  27 +++
 arch/powerpc/platforms/pseries/papr_scm.c     | 169 +++++++++++++++++-
 2 files changed, 194 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem

diff --git a/Documentation/ABI/testing/sysfs-bus-papr-pmem b/Documentation/ABI/testing/sysfs-bus-papr-pmem
new file mode 100644
index 000000000000..5b10d036a8d4
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-papr-pmem
@@ -0,0 +1,27 @@
+What:		/sys/bus/nd/devices/nmemX/papr/flags
+Date:		Apr, 2020
+KernelVersion:	v5.8
+Contact:	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, linux-nvdimm@lists.01.org,
+Description:
+		(RO) Report flags indicating various states of a
+		papr-pmem NVDIMM device. Each flag maps to a one or
+		more bits set in the dimm-health-bitmap retrieved in
+		response to H_SCM_HEALTH hcall. The details of the bit
+		flags returned in response to this hcall is available
+		at 'Documentation/powerpc/papr_hcalls.rst' . Below are
+		the flags reported in this sysfs file:
+
+		* "not_armed"	: Indicates that NVDIMM contents will not
+				  survive a power cycle.
+		* "flush_fail"	: Indicates that NVDIMM contents
+				  couldn't be flushed during last
+				  shut-down event.
+		* "restore_fail": Indicates that NVDIMM contents
+				  couldn't be restored during NVDIMM
+				  initialization.
+		* "encrypted"	: NVDIMM contents are encrypted.
+		* "smart_notify": There is health event for the NVDIMM.
+		* "scrubbed"	: Indicating that contents of the
+				  NVDIMM have been scrubbed.
+		* "locked"	: Indicating that NVDIMM contents cant
+				  be modified until next power cycle.
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index f35592423380..149431594839 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -12,6 +12,7 @@
 #include <linux/libnvdimm.h>
 #include <linux/platform_device.h>
 #include <linux/delay.h>
+#include <linux/seq_buf.h>
 
 #include <asm/plpar_wrappers.h>
 
@@ -22,6 +23,44 @@
 	 (1ul << ND_CMD_GET_CONFIG_DATA) | \
 	 (1ul << ND_CMD_SET_CONFIG_DATA))
 
+/* DIMM health bitmap bitmap indicators */
+/* SCM device is unable to persist memory contents */
+#define PAPR_PMEM_UNARMED                   (1ULL << (63 - 0))
+/* SCM device failed to persist memory contents */
+#define PAPR_PMEM_SHUTDOWN_DIRTY            (1ULL << (63 - 1))
+/* SCM device contents are persisted from previous IPL */
+#define PAPR_PMEM_SHUTDOWN_CLEAN            (1ULL << (63 - 2))
+/* SCM device contents are not persisted from previous IPL */
+#define PAPR_PMEM_EMPTY                     (1ULL << (63 - 3))
+/* SCM device memory life remaining is critically low */
+#define PAPR_PMEM_HEALTH_CRITICAL           (1ULL << (63 - 4))
+/* SCM device will be garded off next IPL due to failure */
+#define PAPR_PMEM_HEALTH_FATAL              (1ULL << (63 - 5))
+/* SCM contents cannot persist due to current platform health status */
+#define PAPR_PMEM_HEALTH_UNHEALTHY          (1ULL << (63 - 6))
+/* SCM device is unable to persist memory contents in certain conditions */
+#define PAPR_PMEM_HEALTH_NON_CRITICAL       (1ULL << (63 - 7))
+/* SCM device is encrypted */
+#define PAPR_PMEM_ENCRYPTED                 (1ULL << (63 - 8))
+/* SCM device has been scrubbed and locked */
+#define PAPR_PMEM_SCRUBBED_AND_LOCKED       (1ULL << (63 - 9))
+
+/* Bits status indicators for health bitmap indicating unarmed dimm */
+#define PAPR_PMEM_UNARMED_MASK (PAPR_PMEM_UNARMED |		\
+				PAPR_PMEM_HEALTH_UNHEALTHY)
+
+/* Bits status indicators for health bitmap indicating unflushed dimm */
+#define PAPR_PMEM_BAD_SHUTDOWN_MASK (PAPR_PMEM_SHUTDOWN_DIRTY)
+
+/* Bits status indicators for health bitmap indicating unrestored dimm */
+#define PAPR_PMEM_BAD_RESTORE_MASK  (PAPR_PMEM_EMPTY)
+
+/* Bit status indicators for smart event notification */
+#define PAPR_PMEM_SMART_EVENT_MASK (PAPR_PMEM_HEALTH_CRITICAL | \
+				    PAPR_PMEM_HEALTH_FATAL |	\
+				    PAPR_PMEM_HEALTH_UNHEALTHY)
+
+/* private struct associated with each region */
 struct papr_scm_priv {
 	struct platform_device *pdev;
 	struct device_node *dn;
@@ -39,6 +78,15 @@ struct papr_scm_priv {
 	struct resource res;
 	struct nd_region *region;
 	struct nd_interleave_set nd_set;
+
+	/* Protect dimm health data from concurrent read/writes */
+	struct mutex health_mutex;
+
+	/* Last time the health information of the dimm was updated */
+	unsigned long lasthealth_jiffies;
+
+	/* Health information for the dimm */
+	u64 health_bitmap;
 };
 
 static int drc_pmem_bind(struct papr_scm_priv *p)
@@ -144,6 +192,62 @@ static int drc_pmem_query_n_bind(struct papr_scm_priv *p)
 	return drc_pmem_bind(p);
 }
 
+/*
+ * Issue hcall to retrieve dimm health info and populate papr_scm_priv with the
+ * health information.
+ */
+static int __drc_pmem_query_health(struct papr_scm_priv *p)
+{
+	unsigned long ret[PLPAR_HCALL_BUFSIZE];
+	long rc;
+
+	/* issue the hcall */
+	rc = plpar_hcall(H_SCM_HEALTH, ret, p->drc_index);
+	if (rc != H_SUCCESS) {
+		dev_err(&p->pdev->dev,
+			 "Failed to query health information, Err:%ld\n", rc);
+		rc = -ENXIO;
+		goto out;
+	}
+
+	p->lasthealth_jiffies = jiffies;
+	p->health_bitmap = ret[0] & ret[1];
+
+	dev_dbg(&p->pdev->dev,
+		"Queried dimm health info. Bitmap:0x%016lx Mask:0x%016lx\n",
+		ret[0], ret[1]);
+out:
+	return rc;
+}
+
+/* Min interval in seconds for assuming stable dimm health */
+#define MIN_HEALTH_QUERY_INTERVAL 60
+
+/* Query cached health info and if needed call drc_pmem_query_health */
+static int drc_pmem_query_health(struct papr_scm_priv *p)
+{
+	unsigned long cache_timeout;
+	int rc;
+
+	/* Protect concurrent modifications to papr_scm_priv */
+	rc = mutex_lock_interruptible(&p->health_mutex);
+	if (rc)
+		return rc;
+
+	/* Jiffies offset for which the health data is assumed to be same */
+	cache_timeout = p->lasthealth_jiffies +
+		msecs_to_jiffies(MIN_HEALTH_QUERY_INTERVAL * 1000);
+
+	/* Fetch new health info is its older than MIN_HEALTH_QUERY_INTERVAL */
+	if (time_after(jiffies, cache_timeout))
+		rc = __drc_pmem_query_health(p);
+	else
+		/* Assume cached health data is valid */
+		rc = 0;
+
+	mutex_unlock(&p->health_mutex);
+	return rc;
+}
 
 static int papr_scm_meta_get(struct papr_scm_priv *p,
 			     struct nd_cmd_get_config_data_hdr *hdr)
@@ -286,6 +390,64 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
 	return 0;
 }
 
+static ssize_t flags_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct nvdimm *dimm = to_nvdimm(dev);
+	struct papr_scm_priv *p = nvdimm_provider_data(dimm);
+	struct seq_buf s;
+	u64 health;
+	int rc;
+
+	rc = drc_pmem_query_health(p);
+	if (rc)
+		return rc;
+
+	/* Copy health_bitmap locally, check masks & update out buffer */
+	health = READ_ONCE(p->health_bitmap);
+
+	seq_buf_init(&s, buf, PAGE_SIZE);
+	if (health & PAPR_PMEM_UNARMED_MASK)
+		seq_buf_printf(&s, "not_armed ");
+
+	if (health & PAPR_PMEM_BAD_SHUTDOWN_MASK)
+		seq_buf_printf(&s, "flush_fail ");
+
+	if (health & PAPR_PMEM_BAD_RESTORE_MASK)
+		seq_buf_printf(&s, "restore_fail ");
+
+	if (health & PAPR_PMEM_ENCRYPTED)
+		seq_buf_printf(&s, "encrypted ");
+
+	if (health & PAPR_PMEM_SMART_EVENT_MASK)
+		seq_buf_printf(&s, "smart_notify ");
+
+	if (health & PAPR_PMEM_SCRUBBED_AND_LOCKED)
+		seq_buf_printf(&s, "scrubbed locked ");
+
+	if (seq_buf_used(&s))
+		seq_buf_printf(&s, "\n");
+
+	return seq_buf_used(&s);
+}
+DEVICE_ATTR_RO(flags);
+
+/* papr_scm specific dimm attributes */
+static struct attribute *papr_nd_attributes[] = {
+	&dev_attr_flags.attr,
+	NULL,
+};
+
+static struct attribute_group papr_nd_attribute_group = {
+	.name = "papr",
+	.attrs = papr_nd_attributes,
+};
+
+static const struct attribute_group *papr_nd_attr_groups[] = {
+	&papr_nd_attribute_group,
+	NULL,
+};
+
 static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
 {
 	struct device *dev = &p->pdev->dev;
@@ -312,8 +474,8 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
 	dimm_flags = 0;
 	set_bit(NDD_LABELING, &dimm_flags);
 
-	p->nvdimm = nvdimm_create(p->bus, p, NULL, dimm_flags,
-				  PAPR_SCM_DIMM_CMD_MASK, 0, NULL);
+	p->nvdimm = nvdimm_create(p->bus, p, papr_nd_attr_groups,
+				  dimm_flags, PAPR_SCM_DIMM_CMD_MASK, 0, NULL);
 	if (!p->nvdimm) {
 		dev_err(dev, "Error creating DIMM object for %pOF\n", p->dn);
 		goto err;
@@ -399,6 +561,9 @@ static int papr_scm_probe(struct platform_device *pdev)
 	if (!p)
 		return -ENOMEM;
 
+	/* Initialize the dimm mutex */
+	mutex_init(&p->health_mutex);
+
 	/* optional DT properties */
 	of_property_read_u32(dn, "ibm,metadata-size", &metadata_size);
 
-- 
2.26.2


^ permalink raw reply related

* [RESEND PATCH v9 1/5] powerpc: Document details on H_SCM_HEALTH hcall
From: Vaibhav Jain @ 2020-06-02 10:14 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm, linux-kernel
  Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
	Aneesh Kumar K . V, Vaibhav Jain, Dan Williams, Ira Weiny
In-Reply-To: <20200602101438.73929-1-vaibhav@linux.ibm.com>

Add documentation to 'papr_hcalls.rst' describing the bitmap flags
that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
specification.

Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:

Resend:
* None

v8..v9:
* s/SCM/PMEM device. [ Dan Williams, Aneesh ]

v7..v8:
* Added a clarification on bit-ordering of Health Bitmap

Resend:
* None

v6..v7:
* None

v5..v6:
* New patch in the series
---
 Documentation/powerpc/papr_hcalls.rst | 46 ++++++++++++++++++++++++---
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/Documentation/powerpc/papr_hcalls.rst b/Documentation/powerpc/papr_hcalls.rst
index 3493631a60f8..48fcf1255a33 100644
--- a/Documentation/powerpc/papr_hcalls.rst
+++ b/Documentation/powerpc/papr_hcalls.rst
@@ -220,13 +220,51 @@ from the LPAR memory.
 **H_SCM_HEALTH**
 
 | Input: drcIndex
-| Out: *health-bitmap, health-bit-valid-bitmap*
+| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
 | Return Value: *H_Success, H_Parameter, H_Hardware*
 
 Given a DRC Index return the info on predictive failure and overall health of
-the NVDIMM. The asserted bits in the health-bitmap indicate a single predictive
-failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
-valid.
+the PMEM device. The asserted bits in the health-bitmap indicate one or more states
+(described in table below) of the PMEM device and health-bit-valid-bitmap indicate
+which bits in health-bitmap are valid. The bits are reported in
+reverse bit ordering for example a value of 0xC400000000000000
+indicates bits 0, 1, and 5 are valid.
+
+Health Bitmap Flags:
+
++------+-----------------------------------------------------------------------+
+|  Bit |               Definition                                              |
++======+=======================================================================+
+|  00  | PMEM device is unable to persist memory contents.                     |
+|      | If the system is powered down, nothing will be saved.                 |
++------+-----------------------------------------------------------------------+
+|  01  | PMEM device failed to persist memory contents. Either contents were   |
+|      | not saved successfully on power down or were not restored properly on |
+|      | power up.                                                             |
++------+-----------------------------------------------------------------------+
+|  02  | PMEM device contents are persisted from previous IPL. The data from   |
+|      | the last boot were successfully restored.                             |
++------+-----------------------------------------------------------------------+
+|  03  | PMEM device contents are not persisted from previous IPL. There was no|
+|      | data to restore from the last boot.                                   |
++------+-----------------------------------------------------------------------+
+|  04  | PMEM device memory life remaining is critically low                   |
++------+-----------------------------------------------------------------------+
+|  05  | PMEM device will be garded off next IPL due to failure                |
++------+-----------------------------------------------------------------------+
+|  06  | PMEM device contents cannot persist due to current platform health    |
+|      | status. A hardware failure may prevent data from being saved or       |
+|      | restored.                                                             |
++------+-----------------------------------------------------------------------+
+|  07  | PMEM device is unable to persist memory contents in certain conditions|
++------+-----------------------------------------------------------------------+
+|  08  | PMEM device is encrypted                                              |
++------+-----------------------------------------------------------------------+
+|  09  | PMEM device has successfully completed a requested erase or secure    |
+|      | erase procedure.                                                      |
++------+-----------------------------------------------------------------------+
+|10:63 | Reserved / Unused                                                     |
++------+-----------------------------------------------------------------------+
 
 **H_SCM_PERFORMANCE_STATS**
 
-- 
2.26.2


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox