LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH v2 5/8] powerpc/watchpoint: Rename current H_SET_MODE DAWR macro
From: Ravi Bangoria @ 2020-06-04  3:34 UTC (permalink / raw)
  To: mpe, mikey
  Cc: christophe.leroy, ravi.bangoria, apopple, peterz, fweisbec, oleg,
	npiggin, linux-kernel, paulus, jolsa, naveen.n.rao, linuxppc-dev,
	mingo
In-Reply-To: <20200604033443.70591-1-ravi.bangoria@linux.ibm.com>

Current H_SET_MODE hcall macro name for setting/resetting DAWR0 is
H_SET_MODE_RESOURCE_SET_DAWR. Add suffix 0 to macro name as well.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 arch/powerpc/include/asm/hvcall.h         | 2 +-
 arch/powerpc/include/asm/plpar_wrappers.h | 2 +-
 arch/powerpc/kvm/book3s_hv.c              | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index e90c073e437e..a7f6f1aeda6b 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -354,7 +354,7 @@
 
 /* Values for 2nd argument to H_SET_MODE */
 #define H_SET_MODE_RESOURCE_SET_CIABR		1
-#define H_SET_MODE_RESOURCE_SET_DAWR		2
+#define H_SET_MODE_RESOURCE_SET_DAWR0		2
 #define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE	3
 #define H_SET_MODE_RESOURCE_LE			4
 
diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
index 4497c8afb573..93eb133d572c 100644
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -312,7 +312,7 @@ static inline long plpar_set_ciabr(unsigned long ciabr)
 
 static inline long plpar_set_watchpoint0(unsigned long dawr0, unsigned long dawrx0)
 {
-	return plpar_set_mode(0, H_SET_MODE_RESOURCE_SET_DAWR, dawr0, dawrx0);
+	return plpar_set_mode(0, H_SET_MODE_RESOURCE_SET_DAWR0, dawr0, dawrx0);
 }
 
 static inline long plpar_signal_sys_reset(long cpu)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index a0cf17597838..26820b7bd75c 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -766,7 +766,7 @@ static int kvmppc_h_set_mode(struct kvm_vcpu *vcpu, unsigned long mflags,
 			return H_P3;
 		vcpu->arch.ciabr  = value1;
 		return H_SUCCESS;
-	case H_SET_MODE_RESOURCE_SET_DAWR:
+	case H_SET_MODE_RESOURCE_SET_DAWR0:
 		if (!kvmppc_power8_compatible(vcpu))
 			return H_P2;
 		if (!ppc_breakpoint_available())
-- 
2.26.2


^ permalink raw reply related

* [PATCH v2 4/8] powerpc/watchpoint: Set CPU_FTR_DAWR1 based on pa-features bit
From: Ravi Bangoria @ 2020-06-04  3:34 UTC (permalink / raw)
  To: mpe, mikey
  Cc: christophe.leroy, ravi.bangoria, apopple, peterz, fweisbec, oleg,
	npiggin, linux-kernel, paulus, jolsa, naveen.n.rao, linuxppc-dev,
	mingo
In-Reply-To: <20200604033443.70591-1-ravi.bangoria@linux.ibm.com>

As per the PAPR, bit 0 of byte 64 in pa-features property indicates
availability of 2nd DAWR registers. i.e. If this bit is set, 2nd
DAWR is present, otherwise not. Host generally uses "cpu-features",
which masks "pa-features". But "cpu-features" are still not used for
guests and thus this change is mostly applicable for guests only.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 arch/powerpc/kernel/prom.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 6a3bac357e24..34272cef8ae6 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -175,6 +175,8 @@ static struct ibm_pa_feature {
 	 */
 	{ .pabyte = 22, .pabit = 0, .cpu_features = CPU_FTR_TM_COMP,
 	  .cpu_user_ftrs2 = PPC_FEATURE2_HTM_COMP | PPC_FEATURE2_HTM_NOSC_COMP },
+
+	{ .pabyte = 64, .pabit = 0, .cpu_features = CPU_FTR_DAWR1 },
 };
 
 static void __init scan_features(unsigned long node, const unsigned char *ftrs,
-- 
2.26.2


^ permalink raw reply related

* [PATCH v2 3/8] powerpc/dt_cpu_ftrs: Add feature for 2nd DAWR
From: Ravi Bangoria @ 2020-06-04  3:34 UTC (permalink / raw)
  To: mpe, mikey
  Cc: christophe.leroy, ravi.bangoria, apopple, peterz, fweisbec, oleg,
	npiggin, linux-kernel, paulus, jolsa, naveen.n.rao, linuxppc-dev,
	mingo
In-Reply-To: <20200604033443.70591-1-ravi.bangoria@linux.ibm.com>

Add new device-tree feature for 2nd DAWR. If this feature is present,
2nd DAWR is supported, otherwise not.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 arch/powerpc/include/asm/cputable.h | 7 +++++--
 arch/powerpc/kernel/dt_cpu_ftrs.c   | 7 +++++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
index e506d429b1af..3445c86e1f6f 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -214,6 +214,7 @@ static inline void cpu_feature_keys_init(void) { }
 #define CPU_FTR_P9_TLBIE_ERAT_BUG	LONG_ASM_CONST(0x0001000000000000)
 #define CPU_FTR_P9_RADIX_PREFETCH_BUG	LONG_ASM_CONST(0x0002000000000000)
 #define CPU_FTR_ARCH_31			LONG_ASM_CONST(0x0004000000000000)
+#define CPU_FTR_DAWR1			LONG_ASM_CONST(0x0008000000000000)
 
 #ifndef __ASSEMBLY__
 
@@ -497,14 +498,16 @@ static inline void cpu_feature_keys_init(void) { }
 #define CPU_FTRS_POSSIBLE	\
 	    (CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | CPU_FTRS_POWER8 | \
 	     CPU_FTR_ALTIVEC_COMP | CPU_FTR_VSX_COMP | CPU_FTRS_POWER9 | \
-	     CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
+	     CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
+	     CPU_FTR_DAWR1)
 #else
 #define CPU_FTRS_POSSIBLE	\
 	    (CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | \
 	     CPU_FTRS_POWER6 | CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | \
 	     CPU_FTRS_POWER8 | CPU_FTRS_CELL | CPU_FTRS_PA6T | \
 	     CPU_FTR_VSX_COMP | CPU_FTR_ALTIVEC_COMP | CPU_FTRS_POWER9 | \
-	     CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
+	     CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
+	     CPU_FTR_DAWR1)
 #endif /* CONFIG_CPU_LITTLE_ENDIAN */
 #endif
 #else
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
index 3a409517c031..eafd713ca23d 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -574,6 +574,12 @@ static int __init feat_enable_mma(struct dt_cpu_feature *f)
 	return 1;
 }
 
+static int __init feat_enable_dawr1(struct dt_cpu_feature *f)
+{
+	cur_cpu_spec->cpu_features |= CPU_FTR_DAWR1;
+	return 1;
+}
+
 struct dt_cpu_feature_match {
 	const char *name;
 	int (*enable)(struct dt_cpu_feature *f);
@@ -649,6 +655,7 @@ static struct dt_cpu_feature_match __initdata
 	{"wait-v3", feat_enable, 0},
 	{"prefix-instructions", feat_enable, 0},
 	{"matrix-multiply-assist", feat_enable_mma, 0},
+	{"dawr1", feat_enable_dawr1, 0},
 };
 
 static bool __initdata using_dt_cpu_ftrs;
-- 
2.26.2


^ permalink raw reply related

* [PATCH v2 2/8] powerpc/watchpoint: Enable watchpoint functionality on power10 guest
From: Ravi Bangoria @ 2020-06-04  3:34 UTC (permalink / raw)
  To: mpe, mikey
  Cc: christophe.leroy, ravi.bangoria, apopple, peterz, fweisbec, oleg,
	npiggin, linux-kernel, paulus, jolsa, naveen.n.rao, linuxppc-dev,
	mingo
In-Reply-To: <20200604033443.70591-1-ravi.bangoria@linux.ibm.com>

CPU_FTR_DAWR is by default enabled for host via CPU_FTRS_DT_CPU_BASE
(controlled by CONFIG_PPC_DT_CPU_FTRS). But cpu-features device-tree
node is not PAPR compatible and thus not yet used by kvm or pHyp
guests. Enable watchpoint functionality on power10 guest (both kvm
and powervm) by adding CPU_FTR_DAWR to CPU_FTRS_POWER10. Note that
this change does not enable 2nd DAWR support.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 arch/powerpc/include/asm/cputable.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
index bac2252c839e..e506d429b1af 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -478,7 +478,7 @@ static inline void cpu_feature_keys_init(void) { }
 	    CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
 	    CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
 	    CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_PKEY | \
-	    CPU_FTR_ARCH_31)
+	    CPU_FTR_ARCH_31 | CPU_FTR_DAWR)
 #define CPU_FTRS_CELL	(CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
-- 
2.26.2


^ permalink raw reply related

* [PATCH v2 1/8] powerpc/watchpoint: Fix 512 byte boundary limit
From: Ravi Bangoria @ 2020-06-04  3:34 UTC (permalink / raw)
  To: mpe, mikey
  Cc: christophe.leroy, ravi.bangoria, apopple, peterz, fweisbec, oleg,
	npiggin, linux-kernel, paulus, jolsa, naveen.n.rao, linuxppc-dev,
	mingo
In-Reply-To: <20200604033443.70591-1-ravi.bangoria@linux.ibm.com>

Milton reported that we are aligning start and end address to wrong
size SZ_512M. It should be SZ_512. Fix that.

While doing this change I also found a case where ALIGN() comparison
fails. Within a given aligned range, ALIGN() of two addresses does not
match when start address is pointing to the first byte and end address
is pointing to any other byte except the first one. But that's not true
for ALIGN_DOWN(). ALIGN_DOWN() of any two addresses within that range
will always point to the first byte. So use ALIGN_DOWN() instead of
ALIGN().

Fixes: e68ef121c1f4 ("powerpc/watchpoint: Use builtin ALIGN*() macros")
Reported-by: Milton Miller <miltonm@us.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 arch/powerpc/kernel/hw_breakpoint.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index 0000daf0e1da..031e6defc08e 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -419,7 +419,7 @@ static int hw_breakpoint_validate_len(struct arch_hw_breakpoint *hw)
 	if (dawr_enabled()) {
 		max_len = DAWR_MAX_LEN;
 		/* DAWR region can't cross 512 bytes boundary */
-		if (ALIGN(start_addr, SZ_512M) != ALIGN(end_addr - 1, SZ_512M))
+		if (ALIGN_DOWN(start_addr, SZ_512) != ALIGN_DOWN(end_addr - 1, SZ_512))
 			return -EINVAL;
 	} else if (IS_ENABLED(CONFIG_PPC_8xx)) {
 		/* 8xx can setup a range without limitation */
-- 
2.26.2


^ permalink raw reply related

* [PATCH v2 0/8] powerpc/watchpoint: Enable 2nd DAWR on baremetal and powervm
From: Ravi Bangoria @ 2020-06-04  3:34 UTC (permalink / raw)
  To: mpe, mikey
  Cc: christophe.leroy, ravi.bangoria, apopple, peterz, fweisbec, oleg,
	npiggin, linux-kernel, paulus, jolsa, naveen.n.rao, linuxppc-dev,
	mingo

Last series[1] was to add basic infrastructure support for more than
one watchpoint on Book3S powerpc. This series actually enables the 2nd 
DAWR for baremetal and powervm. Kvm guest is still not supported.

v1: https://lore.kernel.org/linuxppc-dev/20200602040106.127693-1-ravi.bangoria@linux.ibm.com

v1->v2:
  - Milton reported an issue with one patch in last series[1]. patch #1
    fixes that. So patch#1 is new.
  - Rebased to powerpc/next which now has "Base support for POWER10"[2]
    series included.

[1]: https://lore.kernel.org/linuxppc-dev/20200514111741.97993-1-ravi.bangoria@linux.ibm.com/
[2]: https://lore.kernel.org/linuxppc-dev/20200521014341.29095-1-alistair@popple.id.au

Ravi Bangoria (8):
  powerpc/watchpoint: Fix 512 byte boundary limit
  powerpc/watchpoint: Enable watchpoint functionality on power10 guest
  powerpc/dt_cpu_ftrs: Add feature for 2nd DAWR
  powerpc/watchpoint: Set CPU_FTR_DAWR1 based on pa-features bit
  powerpc/watchpoint: Rename current H_SET_MODE DAWR macro
  powerpc/watchpoint: Guest support for 2nd DAWR hcall
  powerpc/watchpoint: Return available watchpoints dynamically
  powerpc/watchpoint: Remove 512 byte boundary

 arch/powerpc/include/asm/cputable.h       | 13 +++++++++----
 arch/powerpc/include/asm/hvcall.h         |  3 ++-
 arch/powerpc/include/asm/hw_breakpoint.h  |  5 +++--
 arch/powerpc/include/asm/machdep.h        |  2 +-
 arch/powerpc/include/asm/plpar_wrappers.h |  7 ++++++-
 arch/powerpc/kernel/dawr.c                |  2 +-
 arch/powerpc/kernel/dt_cpu_ftrs.c         |  7 +++++++
 arch/powerpc/kernel/hw_breakpoint.c       |  5 +++--
 arch/powerpc/kernel/prom.c                |  2 ++
 arch/powerpc/kvm/book3s_hv.c              |  2 +-
 arch/powerpc/platforms/pseries/setup.c    |  7 +++++--
 11 files changed, 40 insertions(+), 15 deletions(-)

-- 
2.26.2


^ permalink raw reply

* Re: [PATCH v4 1/4] riscv: Move kernel mapping to vmalloc zone
From: Zong Li @ 2020-06-04  2:52 UTC (permalink / raw)
  To: Alexandre Ghiti
  Cc: Albert Ou, Anup Patel, linux-kernel@vger.kernel.org List,
	Atish Patra, Paul Mackerras, Paul Walmsley, Palmer Dabbelt,
	linux-riscv, linuxppc-dev
In-Reply-To: <20200603080010.13366-2-alex@ghiti.fr>

On Wed, Jun 3, 2020 at 4:01 PM Alexandre Ghiti <alex@ghiti.fr> wrote:
>
> This is a preparatory patch for relocatable kernel.
>
> The kernel used to be linked at PAGE_OFFSET address and used to be loaded
> physically at the beginning of the main memory. Therefore, we could use
> the linear mapping for the kernel mapping.
>
> But the relocated kernel base address will be different from PAGE_OFFSET
> and since in the linear mapping, two different virtual addresses cannot
> point to the same physical address, the kernel mapping needs to lie outside
> the linear mapping.
>
> In addition, because modules and BPF must be close to the kernel (inside
> +-2GB window), the kernel is placed at the end of the vmalloc zone minus
> 2GB, which leaves room for modules and BPF. The kernel could not be
> placed at the beginning of the vmalloc zone since other vmalloc
> allocations from the kernel could get all the +-2GB window around the
> kernel which would prevent new modules and BPF programs to be loaded.
>
> Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
> ---
>  arch/riscv/boot/loader.lds.S     |  3 +-
>  arch/riscv/include/asm/page.h    | 10 +++++-
>  arch/riscv/include/asm/pgtable.h | 38 ++++++++++++++-------
>  arch/riscv/kernel/head.S         |  3 +-
>  arch/riscv/kernel/module.c       |  4 +--
>  arch/riscv/kernel/vmlinux.lds.S  |  3 +-
>  arch/riscv/mm/init.c             | 58 +++++++++++++++++++++++++-------
>  arch/riscv/mm/physaddr.c         |  2 +-
>  8 files changed, 88 insertions(+), 33 deletions(-)
>
> diff --git a/arch/riscv/boot/loader.lds.S b/arch/riscv/boot/loader.lds.S
> index 47a5003c2e28..62d94696a19c 100644
> --- a/arch/riscv/boot/loader.lds.S
> +++ b/arch/riscv/boot/loader.lds.S
> @@ -1,13 +1,14 @@
>  /* SPDX-License-Identifier: GPL-2.0 */
>
>  #include <asm/page.h>
> +#include <asm/pgtable.h>
>
>  OUTPUT_ARCH(riscv)
>  ENTRY(_start)
>
>  SECTIONS
>  {
> -       . = PAGE_OFFSET;
> +       . = KERNEL_LINK_ADDR;
>
>         .payload : {
>                 *(.payload)
> diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
> index 2d50f76efe48..48bb09b6a9b7 100644
> --- a/arch/riscv/include/asm/page.h
> +++ b/arch/riscv/include/asm/page.h
> @@ -90,18 +90,26 @@ typedef struct page *pgtable_t;
>
>  #ifdef CONFIG_MMU
>  extern unsigned long va_pa_offset;
> +extern unsigned long va_kernel_pa_offset;
>  extern unsigned long pfn_base;
>  #define ARCH_PFN_OFFSET                (pfn_base)
>  #else
>  #define va_pa_offset           0
> +#define va_kernel_pa_offset    0
>  #define ARCH_PFN_OFFSET                (PAGE_OFFSET >> PAGE_SHIFT)
>  #endif /* CONFIG_MMU */
>
>  extern unsigned long max_low_pfn;
>  extern unsigned long min_low_pfn;
> +extern unsigned long kernel_virt_addr;
>
>  #define __pa_to_va_nodebug(x)  ((void *)((unsigned long) (x) + va_pa_offset))
> -#define __va_to_pa_nodebug(x)  ((unsigned long)(x) - va_pa_offset)
> +#define linear_mapping_va_to_pa(x)     ((unsigned long)(x) - va_pa_offset)
> +#define kernel_mapping_va_to_pa(x)     \
> +       ((unsigned long)(x) - va_kernel_pa_offset)
> +#define __va_to_pa_nodebug(x)          \
> +       (((x) >= PAGE_OFFSET) ?         \
> +               linear_mapping_va_to_pa(x) : kernel_mapping_va_to_pa(x))
>
>  #ifdef CONFIG_DEBUG_VIRTUAL
>  extern phys_addr_t __virt_to_phys(unsigned long x);
> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> index 35b60035b6b0..94ef3b49dfb6 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -11,23 +11,29 @@
>
>  #include <asm/pgtable-bits.h>
>
> -#ifndef __ASSEMBLY__
> -
> -/* Page Upper Directory not used in RISC-V */
> -#include <asm-generic/pgtable-nopud.h>
> -#include <asm/page.h>
> -#include <asm/tlbflush.h>
> -#include <linux/mm_types.h>
> -
> -#ifdef CONFIG_MMU
> +#ifndef CONFIG_MMU
> +#define KERNEL_VIRT_ADDR       PAGE_OFFSET
> +#define KERNEL_LINK_ADDR       PAGE_OFFSET
> +#else
> +/*
> + * Leave 2GB for modules and BPF that must lie within a 2GB range around
> + * the kernel.
> + */
> +#define KERNEL_VIRT_ADDR       (VMALLOC_END - SZ_2G + 1)
> +#define KERNEL_LINK_ADDR       KERNEL_VIRT_ADDR
>
>  #define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
>  #define VMALLOC_END      (PAGE_OFFSET - 1)
>  #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)
>
>  #define BPF_JIT_REGION_SIZE    (SZ_128M)
> -#define BPF_JIT_REGION_START   (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
> -#define BPF_JIT_REGION_END     (VMALLOC_END)
> +#define BPF_JIT_REGION_START   PFN_ALIGN((unsigned long)&_end)
> +#define BPF_JIT_REGION_END     (BPF_JIT_REGION_START + BPF_JIT_REGION_SIZE)
> +
> +#ifdef CONFIG_64BIT
> +#define VMALLOC_MODULE_START   BPF_JIT_REGION_END
> +#define VMALLOC_MODULE_END     (((unsigned long)&_start & PAGE_MASK) + SZ_2G)
> +#endif
>
>  /*
>   * Roughly size the vmemmap space to be large enough to fit enough
> @@ -57,9 +63,16 @@
>  #define FIXADDR_SIZE     PGDIR_SIZE
>  #endif
>  #define FIXADDR_START    (FIXADDR_TOP - FIXADDR_SIZE)
> -
>  #endif
>
> +#ifndef __ASSEMBLY__
> +
> +/* Page Upper Directory not used in RISC-V */
> +#include <asm-generic/pgtable-nopud.h>
> +#include <asm/page.h>
> +#include <asm/tlbflush.h>
> +#include <linux/mm_types.h>
> +
>  #ifdef CONFIG_64BIT
>  #include <asm/pgtable-64.h>
>  #else
> @@ -483,6 +496,7 @@ static inline void __kernel_map_pages(struct page *page, int numpages, int enabl
>
>  #define kern_addr_valid(addr)   (1) /* FIXME */
>
> +extern char _start[];
>  extern void *dtb_early_va;
>  void setup_bootmem(void);
>  void paging_init(void);
> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
> index 98a406474e7d..8f5bb7731327 100644
> --- a/arch/riscv/kernel/head.S
> +++ b/arch/riscv/kernel/head.S
> @@ -49,7 +49,8 @@ ENTRY(_start)
>  #ifdef CONFIG_MMU
>  relocate:
>         /* Relocate return address */
> -       li a1, PAGE_OFFSET
> +       la a1, kernel_virt_addr
> +       REG_L a1, 0(a1)
>         la a2, _start
>         sub a1, a1, a2
>         add ra, ra, a1
> diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
> index 8bbe5dbe1341..1a8fbe05accf 100644
> --- a/arch/riscv/kernel/module.c
> +++ b/arch/riscv/kernel/module.c
> @@ -392,12 +392,10 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
>  }
>
>  #if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
> -#define VMALLOC_MODULE_START \
> -        max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
>  void *module_alloc(unsigned long size)
>  {
>         return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START,
> -                                   VMALLOC_END, GFP_KERNEL,
> +                                   VMALLOC_MODULE_END, GFP_KERNEL,
>                                     PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
>                                     __builtin_return_address(0));
>  }
> diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
> index 0339b6bbe11a..a9abde62909f 100644
> --- a/arch/riscv/kernel/vmlinux.lds.S
> +++ b/arch/riscv/kernel/vmlinux.lds.S
> @@ -4,7 +4,8 @@
>   * Copyright (C) 2017 SiFive
>   */
>
> -#define LOAD_OFFSET PAGE_OFFSET
> +#include <asm/pgtable.h>
> +#define LOAD_OFFSET KERNEL_LINK_ADDR
>  #include <asm/vmlinux.lds.h>
>  #include <asm/page.h>
>  #include <asm/cache.h>
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 736de6c8739f..37be2eb45e58 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -22,6 +22,9 @@
>
>  #include "../kernel/head.h"
>
> +unsigned long kernel_virt_addr = KERNEL_VIRT_ADDR;
> +EXPORT_SYMBOL(kernel_virt_addr);
> +
>  unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
>                                                         __page_aligned_bss;
>  EXPORT_SYMBOL(empty_zero_page);
> @@ -178,8 +181,12 @@ void __init setup_bootmem(void)
>  }
>
>  #ifdef CONFIG_MMU
> +/* Offset between linear mapping virtual address and kernel load address */
>  unsigned long va_pa_offset;
>  EXPORT_SYMBOL(va_pa_offset);
> +/* Offset between kernel mapping virtual address and kernel load address */
> +unsigned long va_kernel_pa_offset;
> +EXPORT_SYMBOL(va_kernel_pa_offset);
>  unsigned long pfn_base;
>  EXPORT_SYMBOL(pfn_base);
>
> @@ -271,7 +278,7 @@ static phys_addr_t __init alloc_pmd(uintptr_t va)
>         if (mmu_enabled)
>                 return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
>
> -       pmd_num = (va - PAGE_OFFSET) >> PGDIR_SHIFT;
> +       pmd_num = (va - kernel_virt_addr) >> PGDIR_SHIFT;
>         BUG_ON(pmd_num >= NUM_EARLY_PMDS);
>         return (uintptr_t)&early_pmd[pmd_num * PTRS_PER_PMD];
>  }
> @@ -372,14 +379,30 @@ static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
>  #error "setup_vm() is called from head.S before relocate so it should not use absolute addressing."
>  #endif
>
> +static uintptr_t load_pa, load_sz;
> +
> +void create_kernel_page_table(pgd_t *pgdir, uintptr_t map_size)

It could be static if this function is only used in this file, as
kbuild test reported.
Apart from this, it looks good to me.
Reviewed-by: Zong Li <zong.li@sifive.com>

> +{
> +       uintptr_t va, end_va;
> +
> +       end_va = kernel_virt_addr + load_sz;
> +       for (va = kernel_virt_addr; va < end_va; va += map_size)
> +               create_pgd_mapping(pgdir, va,
> +                                  load_pa + (va - kernel_virt_addr),
> +                                  map_size, PAGE_KERNEL_EXEC);
> +}
> +
>  asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>  {
>         uintptr_t va, end_va;
> -       uintptr_t load_pa = (uintptr_t)(&_start);
> -       uintptr_t load_sz = (uintptr_t)(&_end) - load_pa;
>         uintptr_t map_size = best_map_size(load_pa, MAX_EARLY_MAPPING_SIZE);
>
> +       load_pa = (uintptr_t)(&_start);
> +       load_sz = (uintptr_t)(&_end) - load_pa;
> +
>         va_pa_offset = PAGE_OFFSET - load_pa;
> +       va_kernel_pa_offset = kernel_virt_addr - load_pa;
> +
>         pfn_base = PFN_DOWN(load_pa);
>
>         /*
> @@ -402,26 +425,22 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>         create_pmd_mapping(fixmap_pmd, FIXADDR_START,
>                            (uintptr_t)fixmap_pte, PMD_SIZE, PAGE_TABLE);
>         /* Setup trampoline PGD and PMD */
> -       create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
> +       create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
>                            (uintptr_t)trampoline_pmd, PGDIR_SIZE, PAGE_TABLE);
> -       create_pmd_mapping(trampoline_pmd, PAGE_OFFSET,
> +       create_pmd_mapping(trampoline_pmd, kernel_virt_addr,
>                            load_pa, PMD_SIZE, PAGE_KERNEL_EXEC);
>  #else
>         /* Setup trampoline PGD */
> -       create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
> +       create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
>                            load_pa, PGDIR_SIZE, PAGE_KERNEL_EXEC);
>  #endif
>
>         /*
> -        * Setup early PGD covering entire kernel which will allows
> +        * Setup early PGD covering entire kernel which will allow
>          * us to reach paging_init(). We map all memory banks later
>          * in setup_vm_final() below.
>          */
> -       end_va = PAGE_OFFSET + load_sz;
> -       for (va = PAGE_OFFSET; va < end_va; va += map_size)
> -               create_pgd_mapping(early_pg_dir, va,
> -                                  load_pa + (va - PAGE_OFFSET),
> -                                  map_size, PAGE_KERNEL_EXEC);
> +       create_kernel_page_table(early_pg_dir, map_size);
>
>         /* Create fixed mapping for early FDT parsing */
>         end_va = __fix_to_virt(FIX_FDT) + FIX_FDT_SIZE;
> @@ -441,6 +460,7 @@ static void __init setup_vm_final(void)
>         uintptr_t va, map_size;
>         phys_addr_t pa, start, end;
>         struct memblock_region *reg;
> +       static struct vm_struct vm_kernel = { 0 };
>
>         /* Set mmu_enabled flag */
>         mmu_enabled = true;
> @@ -467,10 +487,22 @@ static void __init setup_vm_final(void)
>                 for (pa = start; pa < end; pa += map_size) {
>                         va = (uintptr_t)__va(pa);
>                         create_pgd_mapping(swapper_pg_dir, va, pa,
> -                                          map_size, PAGE_KERNEL_EXEC);
> +                                          map_size, PAGE_KERNEL);
>                 }
>         }
>
> +       /* Map the kernel */
> +       create_kernel_page_table(swapper_pg_dir, PMD_SIZE);
> +
> +       /* Reserve the vmalloc area occupied by the kernel */
> +       vm_kernel.addr = (void *)kernel_virt_addr;
> +       vm_kernel.phys_addr = load_pa;
> +       vm_kernel.size = (load_sz + PMD_SIZE - 1) & ~(PMD_SIZE - 1);
> +       vm_kernel.flags = VM_MAP | VM_NO_GUARD;
> +       vm_kernel.caller = __builtin_return_address(0);
> +
> +       vm_area_add_early(&vm_kernel);
> +
>         /* Clear fixmap PTE and PMD mappings */
>         clear_fixmap(FIX_PTE);
>         clear_fixmap(FIX_PMD);
> diff --git a/arch/riscv/mm/physaddr.c b/arch/riscv/mm/physaddr.c
> index e8e4dcd39fed..35703d5ef5fd 100644
> --- a/arch/riscv/mm/physaddr.c
> +++ b/arch/riscv/mm/physaddr.c
> @@ -23,7 +23,7 @@ EXPORT_SYMBOL(__virt_to_phys);
>
>  phys_addr_t __phys_addr_symbol(unsigned long x)
>  {
> -       unsigned long kernel_start = (unsigned long)PAGE_OFFSET;
> +       unsigned long kernel_start = (unsigned long)kernel_virt_addr;
>         unsigned long kernel_end = (unsigned long)_end;
>
>         /*
> --
> 2.20.1
>

^ permalink raw reply

* Re: [PATCH v1 3/4] KVM: PPC: Book3S HV: migrate remaining normal-GFNs to secure-GFNs in H_SVM_INIT_DONE
From: Bharata B Rao @ 2020-06-04  2:31 UTC (permalink / raw)
  To: Ram Pai
  Cc: rcampbell, ldufour, cclaudio, kvm-ppc, aneesh.kumar, sukadev,
	linuxppc-dev, bauerman, david
In-Reply-To: <20200603231025.GA5772@oc0525413822.ibm.com>

On Wed, Jun 03, 2020 at 04:10:25PM -0700, Ram Pai wrote:
> On Tue, Jun 02, 2020 at 03:36:39PM +0530, Bharata B Rao wrote:
> > On Mon, Jun 01, 2020 at 12:05:35PM -0700, Ram Pai wrote:
> > > On Mon, Jun 01, 2020 at 05:25:18PM +0530, Bharata B Rao wrote:
> > > > On Sat, May 30, 2020 at 07:27:50PM -0700, Ram Pai wrote:
> > > > > H_SVM_INIT_DONE incorrectly assumes that the Ultravisor has explicitly
> > > > > called H_SVM_PAGE_IN for all secure pages.
> > > > 
> > > > I don't think that is quite true. HV doesn't assume anything about
> > > > secure pages by itself.
> > > 
> > > Yes. Currently, it does not assume anything about secure pages.  But I am
> > > proposing that it should consider all pages (except the shared pages) as
> > > secure pages, when H_SVM_INIT_DONE is called.
> > 
> > Ok, then may be also add the proposed changes to H_SVM_INIT_DONE
> > documentation.
> 
> ok.
> 
> > 
> > > 
> > > In other words, HV should treat all pages; except shared pages, as
> > > secure pages once H_SVM_INIT_DONE is called. And this includes pages
> > > added subsequently through memory hotplug.
> > 
> > So after H_SVM_INIT_DONE, if HV touches a secure page for any
> > reason and gets encrypted contents via page-out, HV drops the
> > device pfn at that time. So what state we would be in that case? We
> > have completed H_SVM_INIT_DONE, but still have a normal (but encrypted)
> > page in HV?
> 
> Good point.
> 
> The corresponding GFN will continue to be a secure GFN. Just that its
> backing PFN is not a device-PFN, but a memory-PFN. Also that backing
> memory-PFN contains encrypted content.
> 
> I will clarify this in the patch; about secure-GFN state.

I feel all this is complicating the states in HV and is avoidable
if UV just issued page-in calls during memslot registration uvcall.

Regards,
Bharata.

^ permalink raw reply

* Re: [PATCH 1/2] powerpc/pseries: remove cede offline state for CPUs
From: Nathan Lynch @ 2020-06-04  1:38 UTC (permalink / raw)
  To: Gautham R Shenoy; +Cc: svaidy, linuxppc-dev, npiggin
In-Reply-To: <20200603103044.GB25460@in.ibm.com>

Hi Gautham,

Gautham R Shenoy <ego@linux.vnet.ibm.com> writes:
> On Mon, Jun 01, 2020 at 11:31:39PM -0500, Nathan Lynch wrote:
>> This effectively reverts commit 3aa565f53c39 ("powerpc/pseries: Add
>> hooks to put the CPU into an appropriate offline state"), which added
>> an offline mode for CPUs which uses the H_CEDE hcall instead of the
>> architected stop-self RTAS function in order to facilitate "folding"
>> of dedicated mode processors on PowerVM platforms to achieve energy
>> savings. This has been the default offline mode since its
>> introduction.
>> 
>> There's nothing about stop-self that would prevent the hypervisor from
>> achieving the energy savings available via H_CEDE, so the original
>> premise of this change appears to be flawed.
>
>
> IIRC, back in 2009, when the Extended-CEDE was introduced, it couldn't
> be exposed via the cpuidle subsystem since this state needs an
> explicit H_PROD as opposed to the H_IPI which wakes up the regular
> CEDE call. So, the alterative was to use the CPU-Hotplug way by having
> a userspace daemon fold the cores which weren't needed currently and
> bring them back online when they were needed. Back then, Long Term
> CEDE was definitely faster compared to stop-self call (It is a pity
> that I didn't post the numbers when I wrote the patch) and the
> time-taken to unfold a core was definitely one of the concerns.
> (https://lkml.org/lkml/2009/9/23/522).

Thanks for the background. Ideally, we would understand what has changed
since then... certainly the offline path would have been slower when
pseries_cpu_die() slept for 200ms between issuing
query-cpu-stopped-state calls, but that was changed in 2008 in
b906cfa397fd ("powerpc/pseries: Fix cpu hotplug") so perhaps it wasn't a
factor in your measurements. Even less sure about what could have
made the online path more expensive.

(For the benefit of anybody else referring to the 2009 discussion that
you linked: that exchange is a bit strange to me because there seems to
be a conception that stop-self releases the processor to the platform in
a way that could prevent the OS from restarting it on demand. I don't
believe this has ever been the case. Processor deallocation is a related
but separate workflow involving different RTAS functions.)

> The patch looks good to me.
> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>

I appreciate the review, thanks!

^ permalink raw reply

* RE: [RESEND PATCH v9 4/5] ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
From: Williams, Dan J @ 2020-06-04  0:24 UTC (permalink / raw)
  To: Vaibhav Jain, linuxppc-dev@lists.ozlabs.org,
	linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org
  Cc: Santosh Sivaraj, Aneesh Kumar K . V, Steven Rostedt,
	Oliver O'Halloran, Weiny, Ira
In-Reply-To: <20200602101438.73929-5-vaibhav@linux.ibm.com>

[ forgive formatting I'm temporarily stuck using Outlook this week... ]

> From: Vaibhav Jain <vaibhav@linux.ibm.com>
[..]
> 
> Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
> module and add the command family NVDIMM_FAMILY_PAPR to the white
> list of NVDIMM command sets. Also advertise support for ND_CMD_CALL for
> the nvdimm command mask and implement necessary scaffolding in the
> module to handle ND_CMD_CALL ioctl and PDSM requests that we receive.
> 
> The layout of the PDSM request as we expect from libnvdimm/libndctl is
> described in newly introduced uapi header 'papr_pdsm.h' which defines a
> new 'struct nd_pdsm_cmd_pkg' header. This header is used to communicate
> the PDSM request via member 'nd_cmd_pkg.nd_command' and size of
> payload that need to be sent/received for servicing the PDSM.
> 
> A new function is_cmd_valid() is implemented that reads the args to
> papr_scm_ndctl() and performs sanity tests on them. A new function
> papr_scm_service_pdsm() is introduced and is called from
> papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL
> command from libnvdimm.
> 
> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
> Changelog:
> 
> Resend:
> * Added ack from Aneesh.
> 
> v8..v9:
> * Reduced the usage of term SCM replacing it with appropriate
>   replacement [ Dan Williams, Aneesh ]
> * Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h'
> * s/PAPR_SCM_PDSM_*/PAPR_PDSM_*/g
> * s/NVDIMM_FAMILY_PAPR_SCM/NVDIMM_FAMILY_PAPR/g
> * Minor updates to 'papr_psdm.h' to replace usage of term 'SCM'.
> * Minor update to patch description.
> 
> v7..v8:
> * Removed the 'payload_offset' field from 'struct
>   nd_pdsm_cmd_pkg'. Instead command payload is always assumed to start
>   at 'nd_pdsm_cmd_pkg.payload'. [ Aneesh ]
> * To enable introducing new fields to 'struct nd_pdsm_cmd_pkg',
>   'reserved' field of 10-bytes is introduced. [ Aneesh ]
> * Fixed a typo in "Backward Compatibility" section of papr_scm_pdsm.h
>   [ Ira ]
> 
> Resend:
> * None
> 
> v6..v7 :
> * Removed the re-definitions of __packed macro from papr_scm_pdsm.h
>   [Mpe].
> * Removed the usage of __KERNEL__ macros in papr_scm_pdsm.h [Mpe].
> * Removed macros that were unused in papr_scm.c from papr_scm_pdsm.h
>   [Mpe].
> * Made functions defined in papr_scm_pdsm.h as static inline. [Mpe]
> 
> v5..v6 :
> * Changed the usage of the term DSM to PDSM to distinguish it from the
>   ACPI term [ Dan Williams ]
> * Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various
> struct
>   to reflect the new terminology.
> * Updated the patch description and title to reflect the new terminology.
> * Squashed patch to introduce new command family in 'ndctl.h' with
>   this patch [ Dan Williams ]
> * Updated the papr_scm_pdsm method starting index from 0x10000 to 0x0
>   [ Dan Williams ]
> * Removed redundant license text from the papr_scm_psdm.h file.
>   [ Dan Williams ]
> * s/envelop/envelope/ at various places [ Dan Williams ]
> * Added '__packed' attribute to command package header to gaurd
>   against different compiler adding paddings between the fields.
>   [ Dan Williams]
> * Converted various pr_debug to dev_debug [ Dan Williams ]
> 
> v4..v5 :
> * None
> 
> v3..v4 :
> * None
> 
> v2..v3 :
> * Updated the patch prefix to 'ndctl/uapi' [Aneesh]
> 
> v1..v2 :
> * None
> ---
>  arch/powerpc/include/uapi/asm/papr_pdsm.h | 136
> ++++++++++++++++++++++  arch/powerpc/platforms/pseries/papr_scm.c |
> 101 +++++++++++++++-
>  include/uapi/linux/ndctl.h                |   1 +
>  3 files changed, 232 insertions(+), 6 deletions(-)  create mode 100644
> arch/powerpc/include/uapi/asm/papr_pdsm.h
> 
> diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h
> b/arch/powerpc/include/uapi/asm/papr_pdsm.h
> new file mode 100644
> index 000000000000..6407fefcc007
> --- /dev/null
> +++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
> @@ -0,0 +1,136 @@
> +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
> +/*
> + * PAPR nvDimm Specific Methods (PDSM) and structs for libndctl
> + *
> + * (C) Copyright IBM 2020
> + *
> + * Author: Vaibhav Jain <vaibhav at linux.ibm.com>  */
> +
> +#ifndef _UAPI_ASM_POWERPC_PAPR_PDSM_H_
> +#define _UAPI_ASM_POWERPC_PAPR_PDSM_H_
> +
> +#include <linux/types.h>
> +
> +/*
> + * PDSM Envelope:
> + *
> + * The ioctl ND_CMD_CALL transfers data between user-space and kernel
> +via
> + * envelope which consists of a header and user-defined payload sections.
> + * The header is described by 'struct nd_pdsm_cmd_pkg' which expects a
> + * payload following it and accessible via 'nd_pdsm_cmd_pkg.payload' field.
> + * There is reserved field that can used to introduce new fields to the
> + * structure in future. It also tries to ensure that
> 'nd_pdsm_cmd_pkg.payload'
> + * lies at a 8-byte boundary.
> + *
> + *  +-------------+---------------------+---------------------------+
> + *  |   64-Bytes  |       16-Bytes      |       Max 176-Bytes       |
> + *  +-------------+---------------------+---------------------------+
> + *  |               nd_pdsm_cmd_pkg     |                           |
> + *  |-------------+                     |                           |
> + *  |  nd_cmd_pkg |                     |                           |
> + *  +-------------+---------------------+---------------------------+
> + *  | nd_family   |                     |                           |
> + *  | nd_size_out | cmd_status          |                           |
> + *  | nd_size_in  | payload_version     |     payload               |
> + *  | nd_command  | reserved            |                           |
> + *  | nd_fw_size  |                     |                           |
> + *  +-------------+---------------------+---------------------------+
> + *
> + * PDSM Header:
> + *
> + * The header is defined as 'struct nd_pdsm_cmd_pkg' which embeds a
> + * 'struct nd_cmd_pkg' instance. The PDSM command is assigned to
> member
> + * 'nd_cmd_pkg.nd_command'. Apart from size information of the
> envelope
> +which is
> + * contained in 'struct nd_cmd_pkg', the header also has members
> +following
> + * members:
> + *
> + * 'cmd_status'		: (Out) Errors if any encountered while
> servicing PDSM.
> + * 'payload_version'	: (In/Out) Version number associated with the
> payload.
> + * 'reserved'		: Not used and reserved for future.
> + *
> + * PDSM Payload:
> + *
> + * The layout of the PDSM Payload is defined by various structs shared
> +between
> + * papr_scm and libndctl so that contents of payload can be
> +interpreted. During
> + * servicing of a PDSM the papr_scm module will read input args from
> +the payload
> + * field by casting its contents to an appropriate struct pointer based
> +on the
> + * PDSM command. Similarly the output of servicing the PDSM command
> +will be
> + * copied to the payload field using the same struct.
> + *
> + * 'libnvdimm' enforces a hard limit of 256 bytes on the envelope size,
> +which
> + * leaves around 176 bytes for the envelope payload (ignoring any
> +padding that
> + * the compiler may silently introduce).
> + *
> + * Payload Version:
> + *
> + * A 'payload_version' field is present in PDSM header that indicates a
> +specific
> + * version of the structure present in PDSM Payload for a given PDSM
> command.
> + * This provides backward compatibility in case the PDSM Payload
> +structure
> + * evolves and different structures are supported by 'papr_scm' and
> 'libndctl'.
> + *
> + * When sending a PDSM Payload to 'papr_scm', 'libndctl' should send
> +the version
> + * of the payload struct it supports via 'payload_version' field. The
> 'papr_scm'
> + * module when servicing the PDSM envelope checks the 'payload_version'
> +and then
> + * uses 'payload struct version' == MIN('payload_version field',
> + * 'max payload-struct-version supported by papr_scm') to service the
> PDSM.
> + * After servicing the PDSM, 'papr_scm' put the negotiated version of
> +payload
> + * struct in returned 'payload_version' field.
> + *
> + * Libndctl on receiving the envelope back from papr_scm again checks
> +the
> + * 'payload_version' field and based on it use the appropriate version
> +dsm
> + * struct to parse the results.
> + *
> + * Backward Compatibility:
> + *
> + * Above scheme of exchanging different versioned PDSM struct between
> +libndctl
> + * and papr_scm should provide backward compatibility until following
> +two
> + * assumptions/conditions when defining new PDSM structs hold:
> + *
> + * Let T(X) = { set of attributes in PDSM struct 'T' versioned X }
> + *
> + * 1. T(X) is a proper subset of T(Y) if Y > X.
> + *    i.e Each new version of PDSM struct should retain existing struct
> + *    attributes from previous version
> + *
> + * 2. If an entity (libndctl or papr_scm) supports a PDSM struct T(X) then
> + *    it should also support T(1), T(2)...T(X - 1).
> + *    i.e When adding support for new version of a PDSM struct, libndctl
> + *    and papr_scm should retain support of the existing PDSM struct
> + *    version they support.
> + */
> +
> +/* PDSM-header + payload expected with ND_CMD_CALL ioctl from
> libnvdimm
> +*/ struct nd_pdsm_cmd_pkg {
> +	struct nd_cmd_pkg hdr;	/* Package header containing sub-
> cmd */
> +	__s32 cmd_status;	/* Out: Sub-cmd status returned back */
> +	__u16 reserved[5];	/* Ignored and to be used in future */
> +	__u16 payload_version;	/* In/Out: version of the payload */
> +	__u8 payload[];		/* In/Out: Sub-cmd data buffer */
> +} __packed;
> +
> +/*
> + * Methods to be embedded in ND_CMD_CALL request. These are sent to
> the
> +kernel
> + * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct  */
> +enum papr_pdsm {
> +	PAPR_PDSM_MIN = 0x0,
> +	PAPR_PDSM_MAX,
> +};
> +
> +/* Convert a libnvdimm nd_cmd_pkg to pdsm specific pkg */ static inline
> +struct nd_pdsm_cmd_pkg *nd_to_pdsm_cmd_pkg(struct nd_cmd_pkg
> *cmd) {
> +	return (struct nd_pdsm_cmd_pkg *) cmd; }
> +
> +/* Return the payload pointer for a given pcmd */ static inline void
> +*pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd) {
> +	if (pcmd->hdr.nd_size_in == 0 && pcmd->hdr.nd_size_out == 0)
> +		return NULL;
> +	else
> +		return (void *)(pcmd->payload);
> +}
> +
> +#endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c
> b/arch/powerpc/platforms/pseries/papr_scm.c
> index 149431594839..5e2237e7ec08 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -15,13 +15,15 @@
>  #include <linux/seq_buf.h>
> 
>  #include <asm/plpar_wrappers.h>
> +#include <asm/papr_pdsm.h>
> 
>  #define BIND_ANY_ADDR (~0ul)
> 
>  #define PAPR_SCM_DIMM_CMD_MASK \
>  	((1ul << ND_CMD_GET_CONFIG_SIZE) | \
>  	 (1ul << ND_CMD_GET_CONFIG_DATA) | \
> -	 (1ul << ND_CMD_SET_CONFIG_DATA))
> +	 (1ul << ND_CMD_SET_CONFIG_DATA) | \
> +	 (1ul << ND_CMD_CALL))
> 
>  /* DIMM health bitmap bitmap indicators */
>  /* SCM device is unable to persist memory contents */ @@ -350,16 +352,97
> @@ static int papr_scm_meta_set(struct papr_scm_priv *p,
>  	return 0;
>  }
> 
> +/*
> + * Validate the inputs args to dimm-control function and return '0' if valid.
> + * This also does initial sanity validation to ND_CMD_CALL sub-command
> packages.
> + */
> +static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void
> *buf,
> +		       unsigned int buf_len)
> +{
> +	unsigned long cmd_mask = PAPR_SCM_DIMM_CMD_MASK;
> +	struct nd_pdsm_cmd_pkg *pkg = nd_to_pdsm_cmd_pkg(buf);
> +	struct papr_scm_priv *p;
> +
> +	/* Only dimm-specific calls are supported atm */
> +	if (!nvdimm)
> +		return -EINVAL;
> +
> +	/* get the provider date from struct nvdimm */
> +	p = nvdimm_provider_data(nvdimm);
> +
> +	if (!test_bit(cmd, &cmd_mask)) {
> +		dev_dbg(&p->pdev->dev, "Unsupported cmd=%u\n", cmd);
> +		return -EINVAL;
> +	} else if (cmd == ND_CMD_CALL) {
> +
> +		/* Verify the envelope package */
> +		if (!buf || buf_len < sizeof(struct nd_pdsm_cmd_pkg)) {
> +			dev_dbg(&p->pdev->dev, "Invalid pkg size=%u\n",
> +				buf_len);
> +			return -EINVAL;
> +		}
> +
> +		/* Verify that the PDSM family is valid */
> +		if (pkg->hdr.nd_family != NVDIMM_FAMILY_PAPR) {
> +			dev_dbg(&p->pdev->dev, "Invalid pkg
> family=0x%llx\n",
> +				pkg->hdr.nd_family);
> +			return -EINVAL;
> +
> +		}
> +
> +		/* We except a payload with all PDSM commands */
> +		if (pdsm_cmd_to_payload(pkg) == NULL) {
> +			dev_dbg(&p->pdev->dev,
> +				"Empty payload for sub-command=0x%llx\n",
> +				pkg->hdr.nd_command);
> +			return -EINVAL;
> +		}
> +	}
> +
> +	/* Command looks valid */

So this is where I would expect the kernel to validate the command vs a known list of supported commands / payloads. One of the goals of requiring public documentation of any commands that libnvdimm might support for the ioctl path is to give the kernel the ability to gate future enabling on consideration of a common kernel front-end interface. I believe this would also address questions about the versioning scheme because userspace would be actively prevented from sending command payloads that were not first explicitly enabled in the kernel. This interface as it stands in this patch set seems to be a very thin / "anything goes" passthrough with no consideration for that policy.

As an example of the utility of this policy, consider the recent support for nvdimm security commands that allow a passphrase to be set and issue commands like "unlock" and "secure erase". The kernel actively prevents those commands from being sent from userspace. See acpi_nfit_clear_to_send() and nd_cmd_clear_to_send(). The reasoning is that it enforces the kernel's nvdimm security model that uses encrypted/trusted keys to protect key material (clear text keys only-ever exist in kernel-space). Yes, that restriction is painful for people that don't want the kernel's security model and just want the simplicity of passing clear-text keys around, but it's necessary for the kernel to have any chance to provide a common abstraction across vendors. The pain of negotiating every single command with what the kernel will support is useful for the long term health of the kernel. It forces ongoing conversations across vendors to consolidate interfaces and reuse kernel best practices like encrypted/trusted keys. Code acceptance is the only real gate the kernel has to enforce cooperation across vendors.

The expectation is that the kernel does not allow any command to pass that is not explicitly listed in a bitmap of known commands. I would expect that if you changed the payload of an existing command that would likely require a new entry in this bitmap. The goal is to give the kernel a chance to constrain the passthrough interface to afford a chance to have a discussion of what might done in a common implementation. Another example is the label-area read-write commands. The kernel needs explicit control to ensure that it owns the label area and that userspace is not able to corrupt it (write it behind the kernel's back).

Now that said, I have battle scars with some OEMs that just want a generic passthrough interface so they never need to work with the kernel community again and can just write their custom validation tooling and be done. I've mostly been successful in that fight outside of the gaping hole of ND_CMD_VENDOR. That's the path that ipmctl has used to issue commands that have not made it into the public specification on docs.pmem.io. My warning shot for that is the "disable_vendor_specific" module option that administrators can set to only allow commands that the kernel explicitly knows the effects of to be issued. The result is only tooling / enabling that submits to this auditing regime is guaranteed to work everywhere.

So, that long explanation out of the way, what does that mean for this patch set? I'd like to understand if you still see a need for a versioning scheme if the implementation is required to explicitly list all the commands it supports? I.e. that the kernel need not worry about userspace sending future unknown payloads because unknown payloads are blocked. Also if your interface has anything similar to a "vendor specific" passthrough I would like to require that go through the ND_CMD_VENDOR ioctl, so that the kernel still has a common check point to prevent vendor specific "I don't want to talk to the kernel community" shenanigans, but even better if ND_CMD_VENDOR is something the kernel can eventually jettison because nobody is using it.

I feel like this is a conversation that will take a few days to resolve, which does not leave time to push this for v5.8. That said, I do think the health flags patches at the beginning of this series are low risk and uncontentious. How about I merge those for v5.8 and circle back to get this ioctl path queued early in v5.8-rc? Apologies for the late feedback on this relative to v5.8.


^ permalink raw reply

* Re: [PATCH v3 2/7] documentation for stats_fs
From: Randy Dunlap @ 2020-06-04  0:23 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, kvm
  Cc: linux-s390, linux-doc, netdev, Emanuele Giuseppe Esposito,
	linux-kernel, kvm-ppc, Jonathan Adams, Christian Borntraeger,
	Alexander Viro, David Rientjes, linux-fsdevel, Paolo Bonzini,
	linux-mips, linuxppc-dev, linux-arm-kernel, Jim Mattson
In-Reply-To: <20200526110318.69006-3-eesposit@redhat.com>

Hi--

Here are a few comments for you.


On 5/26/20 4:03 AM, Emanuele Giuseppe Esposito wrote:
> Html docs for a complete documentation of the stats_fs API,
> filesystem and usage.
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>  Documentation/filesystems/index.rst    |   1 +
>  Documentation/filesystems/stats_fs.rst | 222 +++++++++++++++++++++++++
>  2 files changed, 223 insertions(+)
>  create mode 100644 Documentation/filesystems/stats_fs.rst

> diff --git a/Documentation/filesystems/stats_fs.rst b/Documentation/filesystems/stats_fs.rst
> new file mode 100644
> index 000000000000..292c689ffb98
> --- /dev/null
> +++ b/Documentation/filesystems/stats_fs.rst
> @@ -0,0 +1,222 @@
> +========
> +Stats_FS
> +========
> +
> +Stats_fs is a synthetic ram-based virtual filesystem that takes care of

                           RAM-based

> +gathering and displaying statistics for the Linux kernel subsystems.
> +
> +The motivation for stats_fs comes from the fact that there is no common
> +way for Linux kernel subsystems to expose statistics to userspace shared
> +throughout the Linux kernel; subsystems have to take care of gathering and
> +displaying statistics by themselves, for example in the form of files in
> +debugfs.
> +
> +Allowing each subsystem of the kernel to do so has two disadvantages.
> +First, it will introduce redundant code. Second, debugfs is anyway not the
> +right place for statistics (for example it is affected by lockdown).
> +
> +Stats_fs offers a generic and stable API, allowing any kind of
> +directory/file organization and supporting multiple kind of aggregations

                                                       kinds

> +(not only sum, but also average, max, min and count_zero) and data types
> +(boolean, all unsigned/signed and custom types). The implementation takes
> +care of gathering and displaying information at run time; users only need
> +to specify the values to be included in each source. Optionally, users can
> +also provide a display function for each value, that will take care of
> +displaying the provided value in a custom format.
> +
> +Its main function is to display each statistics as a file in the desired

                                        statistic

> +folder hierarchy defined through the API. Stats_fs files can be read, and
> +possibly cleared if their file mode allows it.
> +
> +Stats_fs is typically mounted with a command like::
> +
> +    mount -t stats_fs stats_fs /sys/kernel/stats_fs
> +
> +(Or an equivalent /etc/fstab line).
> +
> +Stats_fs has two main components: the public API defined by
> +include/linux/stats_fs.h, and the virtual file system in
> +/sys/kernel/stats.
> +
> +The API has two main elements, values and sources. Kernel
> +subsystems will create a source, add child
> +sources/values/aggregates and register it to the root source (that on the
> +virtual fs would be /sys/kernel/stats).
> +
> +The stats_fs API is defined in ``<linux/stats_fs.h>``.
> +
> +    Sources
> +        Sources are created via ``stats_fs_source_create()``, and each
> +        source becomes a directory in the file system. Sources form a
> +        parent-child relationship; root sources are added to the file
> +        system via ``stats_fs_source_register()``. Therefore each Linux
> +        subsystem will add its own entry to the root, filesystem similar

                                                   root filesystem

> +        to what it is done in debugfs. Every other source is added to or
> +        removed from a parent through the
> +        ``stats_fs_source_add_subordinate()`` and
> +        ``stats_fs_source_remove_subordinate()`` APIs. Once a source is
> +        created and added to the tree (via add_subordinate), it will be
> +        used to compute aggregate values in the parent source. A source
> +        can optionally be hidden from the filesystem but still considered
> +        in the aggregation operations if the corresponding flag is set
> +        during initialization.
> +
> +    Values
> +        Values represent quantites that are gathered by the stats_fs user.

                            quantities

> +        Examples of values include the number of vm exits of a given kind,

                                                    VM

> +        the amount of memory used by some data structure, the length of
> +        the longest hash table chain, or anything like that. Values are
> +        defined with the stats_fs_source_add_values function. Each value
> +        is defined by a ``struct stats_fs_value``; the same
> +        ``stats_fs_value`` can be added to many different sources. A value
> +        can be considered "simple" if it fetches data from a user-provided
> +        location, or "aggregate" if it groups all values in the
> +        subordinate sources that include the same ``stats_fs_value``.
> +        Values by default are considered to be cumulative, meaning the
> +        value they represent never decreases, but can also be defined as
> +        floating if they exibith a different behavior. The main difference

                            exhibit

> +        between these two is reflected into the file permission, since a
> +        floating value file does not allow the user to clear it. Each
> +        value has a ``stats_fs_type`` pointer in order to allow the user
> +        to provide custom get and clear functions. The library, however,
> +        also exports default ``stats_fs_type`` structs for the standard
> +        types (all unsigned and signed types plus boolean). A value can
> +        also provide a show function that takes care of displaying the
> +        value in a custom string format. This can be especially useful
> +        when displaying enums.
> +
> +Because stats_fs is a different mountpoint than debugfs, it is not affected
> +by security lockdown.
> +
> +Using Stats_fs
> +================
> +
> +Define a value::
> +
> +        struct statistics{
> +                uint64_t exit;
> +                ...
> +        };
> +
> +        struct kvm {
> +                ...
> +                struct statistics stat;
> +        };
> +
> +        struct stats_fs_value kvm_stats[] = {
> +                { "exit_vm", offsetof(struct kvm, stat.exit), &stats_fs_type_u64,
> +                  STATS_FS_SUM },
> +                { NULL }
> +        };
> +
> +The same ``struct stats_fs_value`` is used for both simple and aggregate
> +values, though the type and offset are only used for simple values.
> +Aggregates merge all values that use the same ``struct stats_fs_value``.
> +
> +Create the parent source::
> +
> +        struct stats_fs_source parent_source = stats_fs_source_create(0, "parent");
> +
> +Register it (files and folders
> +will only be visible after this function is called)::
> +
> +        stats_fs_source_register(parent_source);
> +
> +Create and add a child::
> +
> +        struct stats_fs_source child_source = stats_fs_source_create(STATS_FS_HIDDEN, "child");
> +
> +        stats_fs_source_add_subordinate(parent_source, child_source);
> +
> +The STATS_FS_HIDDEN attribute won't affect the aggregation, it will only
> +block the creation of the files.

Why does HIDDEN block the creation of files?  instead of their visibility?

> +
> +Add values to parent and child (also here order doesn't matter)::
> +
> +        struct kvm *base_ptr = kmalloc(..., sizeof(struct kvm));
> +        ...
> +        stats_fs_source_add_values(child_source, kvm_stats, base_ptr, 0);
> +        stats_fs_source_add_values(parent_source, kvm_stats, NULL, STATS_FS_HIDDEN);
> +
> +``child_source`` will be a simple value, since it has a non-NULL base
> +pointer, while ``parent_source`` will be an aggregate. During the adding
> +phase, also values can optionally be marked as hidden, so that the folder
> +and other values can be still shown.
> +
> +Of course the same ``struct stats_fs_value`` array can be also passed with a
> +different base pointer, to represent the same value but in another instance
> +of the kvm struct.
> +
> +Search:
> +
> +Fetch a value from the child source, returning the value
> +pointed by ``(uint64_t *) base_ptr + kvm_stats[0].offset``::
> +
> +        uint64_t ret_child, ret_parent;
> +
> +        stats_fs_source_get_value(child_source, &kvm_stats[0], &ret_child);
> +
> +Fetch an aggregate value, searching all subsources of ``parent_source`` for
> +the specified ``struct stats_fs_value``::
> +
> +        stats_fs_source_get_value(parent_source, &kvm_stats[0], &ret_parent);
> +
> +        assert(ret_child == ret_parent); // check expected result
> +
> +To make it more interesting, add another child::
> +
> +        struct stats_fs_source child_source2 = stats_fs_source_create(0, "child2");
> +
> +        stats_fs_source_add_subordinate(parent_source, child_source2);
> +        // now  the structure is parent -> child1
> +        //                              -> child2

Is that the same as                 parent -> child1 -> child2
?  It could almost be read as
                                    parent -> child1
                                    parent -> child2

Whichever it is, can you make it more explicit, please?


> +
> +        struct kvm *other_base_ptr = kmalloc(..., sizeof(struct kvm));
> +        ...
> +        stats_fs_source_add_values(child_source2, kvm_stats, other_base_ptr, 0);
> +
> +Note that other_base_ptr points to another instance of kvm, so the struct
> +stats_fs_value is the same but the address at which they point is not.
> +
> +Now get the aggregate value::
> +
> +        uint64_t ret_child, ret_child2, ret_parent;
> +
> +        stats_fs_source_get_value(child_source, &kvm_stats[0], &ret_child);
> +        stats_fs_source_get_value(parent_source, &kvm_stats[0], &ret_parent);
> +        stats_fs_source_get_value(child_source2, &kvm_stats[0], &ret_child2);
> +
> +        assert((ret_child + ret_child2) == ret_parent);
> +
> +Cleanup::
> +
> +        stats_fs_source_remove_subordinate(parent_source, child_source);
> +        stats_fs_source_revoke(child_source);
> +        stats_fs_source_put(child_source);
> +
> +        stats_fs_source_remove_subordinate(parent_source, child_source2);
> +        stats_fs_source_revoke(child_source2);
> +        stats_fs_source_put(child_source2);
> +
> +        stats_fs_source_put(parent_source);
> +        kfree(other_base_ptr);
> +        kfree(base_ptr);
> +
> +Calling stats_fs_source_revoke is very important, because it will ensure

           stats_fs_source_revoke()

> +that stats_fs will not access the data that were passed to
> +stats_fs_source_add_value for this source.
> +
> +Because open files increase the reference count for a stats_fs_source, the
> +source can end up living longer than the data that provides the values for
> +the source.  Calling stats_fs_source_revoke just before the backing data

                        stats_fs_source_revoke()

> +is freed avoids accesses to freed data structures. The sources will return
> +0.
> +
> +This is not needed for the parent_source, since it just contains
> +aggregates that would be 0 anyways if no matching child value exist.
> +
> +API Documentation
> +=================
> +
> +.. kernel-doc:: include/linux/stats_fs.h
> +   :export: fs/stats_fs/*.c
> \ No newline at end of file

Please fix that. ^^^^^


Thanks for the documentation.

-- 
~Randy


^ permalink raw reply

* Re: [PATCH] cxl: Fix kobject memleak
From: Andrew Donnellan @ 2020-06-04  0:08 UTC (permalink / raw)
  To: wanghai (M), fbarrat, arnd, gregkh; +Cc: linuxppc-dev, imunsie, linux-kernel
In-Reply-To: <a9ecd617-c541-aeb1-2e94-93abba475279@huawei.com>

On 3/6/20 9:57 pm, wanghai (M) wrote:
> kfree(cr) can be called when 
> kobject_put()-->kobject_release()-->kobject_cleanup()-->kobj_type->release() 
> is called.  The kobj_type here is afu_config_record_type

Of course, I missed that.

In that case

Acked-by: Andrew Donnellan <ajd@linux.ibm.com>

Thanks for the fix!

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply

* Re: [PATCH 3/3] ASoC: fsl_easrc: Fix "Function parameter not described" warnings
From: Nicolin Chen @ 2020-06-04  0:07 UTC (permalink / raw)
  To: Shengjiu Wang
  Cc: alsa-devel, timur, Xiubo.Lee, linuxppc-dev, tiwai, lgirdwood,
	perex, broonie, festevam, linux-kernel
In-Reply-To: <d166b868e6d294de47a89857be03758ec82a0a61.1591155860.git.shengjiu.wang@nxp.com>

On Wed, Jun 03, 2020 at 11:39:41AM +0800, Shengjiu Wang wrote:
> Obtained with:
> $ make W=1
> 
> sound/soc/fsl/fsl_easrc.c:403: warning: Function parameter or member 'easrc' not described in 'fsl_easrc_normalize_filter'
> sound/soc/fsl/fsl_easrc.c:403: warning: Function parameter or member 'infilter' not described in 'fsl_easrc_normalize_filter'
> sound/soc/fsl/fsl_easrc.c:403: warning: Function parameter or member 'outfilter' not described in 'fsl_easrc_normalize_filter'
> sound/soc/fsl/fsl_easrc.c:403: warning: Function parameter or member 'shift' not described in 'fsl_easrc_normalize_filter'
> 
> Fixes: 955ac624058f ("ASoC: fsl_easrc: Add EASRC ASoC CPU DAI drivers")
> Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
> Reported-by: kbuild test robot <lkp@intel.com>

Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>

^ permalink raw reply

* Re: [PATCH 2/3] ASoC: fsl_easrc: Fix -Wunused-but-set-variable
From: Nicolin Chen @ 2020-06-04  0:07 UTC (permalink / raw)
  To: Shengjiu Wang
  Cc: alsa-devel, timur, Xiubo.Lee, linuxppc-dev, tiwai, lgirdwood,
	perex, broonie, festevam, linux-kernel
In-Reply-To: <91ceb59e3bce31c9e93abba06f5156692ff5c71e.1591155860.git.shengjiu.wang@nxp.com>

On Wed, Jun 03, 2020 at 11:39:40AM +0800, Shengjiu Wang wrote:
> Obtained with:
> $ make W=1
> 
> sound/soc/fsl/fsl_easrc.c: In function 'fsl_easrc_set_rs_ratio':
> sound/soc/fsl/fsl_easrc.c:182:15: warning: variable 'int_bits' set but not used [-Wunused-but-set-variable]
>   unsigned int int_bits;
>                ^
> sound/soc/fsl/fsl_easrc.c: In function 'fsl_easrc_set_ctx_organziation':
> sound/soc/fsl/fsl_easrc.c:1204:17: warning: variable 'dev' set but not used [-Wunused-but-set-variable]
>   struct device *dev;
>                  ^
> sound/soc/fsl/fsl_easrc.c: In function 'fsl_easrc_release_context':
> sound/soc/fsl/fsl_easrc.c:1294:17: warning: variable 'dev' set but not used [-Wunused-but-set-variable]
>   struct device *dev;
>                  ^
> Fixes: 955ac624058f ("ASoC: fsl_easrc: Add EASRC ASoC CPU DAI drivers")
> Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
> Reported-by: kbuild test robot <lkp@intel.com>

Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>

^ permalink raw reply

* Re: [PATCH 1/3] ASoC: fsl_easrc: Fix -Wmissing-prototypes warning
From: Nicolin Chen @ 2020-06-04  0:05 UTC (permalink / raw)
  To: Shengjiu Wang
  Cc: alsa-devel, timur, Xiubo.Lee, linuxppc-dev, tiwai, lgirdwood,
	perex, broonie, festevam, linux-kernel
In-Reply-To: <ab1b83a56c71f4159a98e6da5602c2c36fe59f4d.1591155860.git.shengjiu.wang@nxp.com>

On Wed, Jun 03, 2020 at 11:39:39AM +0800, Shengjiu Wang wrote:
> Obtained with:
> $ make W=1
> 
> sound/soc/fsl/fsl_easrc.c:967:5: warning: no previous prototype for function 'fsl_easrc_config_context' [-Wmissing-prototypes]
> int fsl_easrc_config_context(struct fsl_asrc *easrc, unsigned int ctx_id)
>     ^
> sound/soc/fsl/fsl_easrc.c:967:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
> int fsl_easrc_config_context(struct fsl_asrc *easrc, unsigned int ctx_id)
> ^
> static
> sound/soc/fsl/fsl_easrc.c:1128:5: warning: no previous prototype for function 'fsl_easrc_set_ctx_format' [-Wmissing-prototypes]
> int fsl_easrc_set_ctx_format(struct fsl_asrc_pair *ctx,
>     ^
> sound/soc/fsl/fsl_easrc.c:1128:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
> int fsl_easrc_set_ctx_format(struct fsl_asrc_pair *ctx,
> ^
> static
> sound/soc/fsl/fsl_easrc.c:1201:5: warning: no previous prototype for function 'fsl_easrc_set_ctx_organziation' [-Wmissing-prototypes]
> int fsl_easrc_set_ctx_organziation(struct fsl_asrc_pair *ctx)
>     ^
> sound/soc/fsl/fsl_easrc.c:1201:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
> int fsl_easrc_set_ctx_organziation(struct fsl_asrc_pair *ctx)
> ^
> static
> sound/soc/fsl/fsl_easrc.c:1245:5: warning: no previous prototype for function 'fsl_easrc_request_context' [-Wmissing-prototypes]
> int fsl_easrc_request_context(int channels, struct fsl_asrc_pair *ctx)
>     ^
> sound/soc/fsl/fsl_easrc.c:1245:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
> int fsl_easrc_request_context(int channels, struct fsl_asrc_pair *ctx)
> ^
> static
> sound/soc/fsl/fsl_easrc.c:1290:6: warning: no previous prototype for function 'fsl_easrc_release_context' [-Wmissing-prototypes]
> void fsl_easrc_release_context(struct fsl_asrc_pair *ctx)
>      ^
> sound/soc/fsl/fsl_easrc.c:1290:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
> void fsl_easrc_release_context(struct fsl_asrc_pair *ctx)
> ^
> static
> sound/soc/fsl/fsl_easrc.c:1317:5: warning: no previous prototype for function 'fsl_easrc_start_context' [-Wmissing-prototypes]
> int fsl_easrc_start_context(struct fsl_asrc_pair *ctx)
>     ^
> sound/soc/fsl/fsl_easrc.c:1317:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
> int fsl_easrc_start_context(struct fsl_asrc_pair *ctx)
> ^
> static
> sound/soc/fsl/fsl_easrc.c:1335:5: warning: no previous prototype for function 'fsl_easrc_stop_context' [-Wmissing-prototypes]
> int fsl_easrc_stop_context(struct fsl_asrc_pair *ctx)
>     ^
> sound/soc/fsl/fsl_easrc.c:1335:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
> int fsl_easrc_stop_context(struct fsl_asrc_pair *ctx)
> ^
> static
> sound/soc/fsl/fsl_easrc.c:1382:18: warning: no previous prototype for function 'fsl_easrc_get_dma_channel' [-Wmissing-prototypes]
> struct dma_chan *fsl_easrc_get_dma_channel(struct fsl_asrc_pair *ctx,
>                  ^
> sound/soc/fsl/fsl_easrc.c:1382:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
> struct dma_chan *fsl_easrc_get_dma_channel(struct fsl_asrc_pair *ctx,
> ^
> static
> 
> Fixes: 955ac624058f ("ASoC: fsl_easrc: Add EASRC ASoC CPU DAI drivers")
> Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
> Reported-by: kbuild test robot <lkp@intel.com>

Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>

^ permalink raw reply

* Re: [PATCH] powerpc/vdso32: mark __kernel_datapage_offset as STV_PROTECTED
From: Nick Desaulniers @ 2020-06-03 23:50 UTC (permalink / raw)
  To: Nathan Chancellor, Christophe Leroy, Fangrui Song,
	Michael Ellerman
  Cc: clang-built-linux, Paul Mackerras, linuxppc-dev, LKML
In-Reply-To: <20200207064210.GA13125@ubuntu-x2-xlarge-x86>

On Thu, Feb 6, 2020 at 10:42 PM Nathan Chancellor
<natechancellor@gmail.com> wrote:
>
> On Wed, Feb 05, 2020 at 07:25:59AM +0100, Christophe Leroy wrote:
> >
> >
> > Le 05/02/2020 à 01:50, Fangrui Song a écrit :
> > > A PC-relative relocation (R_PPC_REL16_LO in this case) referencing a
> > > preemptible symbol in a -shared link is not allowed.  GNU ld's powerpc
> > > port is permissive and allows it [1], but lld will report an error after
> > > https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?id=ec0895f08f99515194e9fcfe1338becf6f759d38
> >
> > Note that there is a series whose first two patches aim at dropping
> > __kernel_datapage_offset . See
> > https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=156045 and
> > especially patches https://patchwork.ozlabs.org/patch/1231467/ and
> > https://patchwork.ozlabs.org/patch/1231461/
> >
> > Those patches can be applied independentely of the rest.
> >
> > Christophe
>
> If that is the case, it would be nice if those could be fast tracked to
> 5.6 because as it stands now, all PowerPC builds that were working with
> ld.lld are now broken. Either that or take this patch and rebase that
> series on this one.

So do we still need Fangrui's patch or is it moot?  I'm doing a scrub
of our bug tracker and this issue is still open:
https://github.com/ClangBuiltLinux/linux/issues/851
but it looks like all of our ppc LE targets are linking with LLD just fine
https://travis-ci.com/github/ClangBuiltLinux/continuous-integration/builds/169379039
though it sounds like
https://github.com/ClangBuiltLinux/linux/issues/774
may be a blocker?
Though I don't see Cristophe's
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/5f97f7c921ffc2113ada0f32924e409bccc8277a.1580399657.git.christophe.leroy@c-s.fr/
in mainline or -next.  Was the series not accepted?


>
> Cheers,
> Nathan
>
> --
> You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/20200207064210.GA13125%40ubuntu-x2-xlarge-x86.



--
Thanks,
~Nick Desaulniers

^ permalink raw reply

* Re: [PATCH] arch/{mips,sparc,microblaze,powerpc}: Don't enable pagefault/preempt twice
From: Guenter Roeck @ 2020-06-03 23:44 UTC (permalink / raw)
  To: Ira Weiny, Andrew Morton, Mike Rapoport
  Cc: Peter Zijlstra, Dave Hansen, dri-devel, linux-mips,
	James E.J. Bottomley, Max Filippov, Paul Mackerras,
	H. Peter Anvin, sparclinux, Thomas Gleixner, Helge Deller, x86,
	linux-csky, Christoph Hellwig, Ingo Molnar, linux-snps-arc,
	linux-xtensa, Borislav Petkov, Al Viro, Andy Lutomirski,
	Dan Williams, linux-arm-kernel, Chris Zankel, Thomas Bogendoerfer,
	linux-parisc, linux-kernel, Christian Koenig, linuxppc-dev,
	David S. Miller
In-Reply-To: <20200603211416.GA1740285@iweiny-DESK2.sc.intel.com>

On 6/3/20 2:14 PM, Ira Weiny wrote:
> On Wed, Jun 03, 2020 at 01:57:36PM -0700, Andrew Morton wrote:
>> On Thu, 21 May 2020 10:42:50 -0700 Ira Weiny <ira.weiny@intel.com> wrote:
>>
>>>>>
>>>>> Actually it occurs to me that the patch consolidating kmap_prot is odd for
>>>>> sparc 32 bit...
>>>>>
>>>>> Its a long shot but could you try reverting this patch?
>>>>>
>>>>> 4ea7d2419e3f kmap: consolidate kmap_prot definitions
>>>>>
>>>>
>>>> That is not easy to revert, unfortunately, due to several follow-up patches.
>>>
>>> I have gotten your sparc tests to run and they all pass...
>>>
>>> 08:10:34 > ../linux-build-test/rootfs/sparc/run-qemu-sparc.sh 
>>> Build reference: v5.7-rc4-17-g852b6f2edc0f
>>>
>>> Building sparc32:SPARCClassic:nosmp:scsi:hd ... running ......... passed
>>> Building sparc32:SPARCbook:nosmp:scsi:cd ... running ......... passed
>>> Building sparc32:LX:nosmp:noapc:scsi:hd ... running ......... passed
>>> Building sparc32:SS-4:nosmp:initrd ... running ......... passed
>>> Building sparc32:SS-5:nosmp:scsi:hd ... running ......... passed
>>> Building sparc32:SS-10:nosmp:scsi:cd ... running ......... passed
>>> Building sparc32:SS-20:nosmp:scsi:hd ... running ......... passed
>>> Building sparc32:SS-600MP:nosmp:scsi:hd ... running ......... passed
>>> Building sparc32:Voyager:nosmp:noapc:scsi:hd ... running ......... passed
>>> Building sparc32:SS-4:smp:scsi:hd ... running ......... passed
>>> Building sparc32:SS-5:smp:scsi:cd ... running ......... passed
>>> Building sparc32:SS-10:smp:scsi:hd ... running ......... passed
>>> Building sparc32:SS-20:smp:scsi:hd ... running ......... passed
>>> Building sparc32:SS-600MP:smp:scsi:hd ... running ......... passed
>>> Building sparc32:Voyager:smp:noapc:scsi:hd ... running ......... passed
>>>
>>> Is there another test I need to run?
>>
>> This all petered out, but as I understand it, this patchset still might
>> have issues on various architectures.
>>
>> Can folks please provide an update on the testing status?
> 
> I believe the tests were failing for Guenter due to another patch set...[1]
> 
> My tests with just this series are working.
> 
>>From my understanding the other failures were unrelated.[2]
> 
> 	<quote Mike Rapoport>
> 	I've checked the patch above on top of the mmots which already has
> 	Ira's patches and it booted fine. I've used sparc32_defconfig to build
> 	the kernel and qemu-system-sparc with default machine and CPU.
> 	</quote>
> 
> Mike, am I wrong?  Do you think the kmap() patches are still causing issues?
> 

For my part, all I can say is that -next is in pretty bad shape right now.
The summary of my tests says:

Build results:
	total: 151 pass: 130 fail: 21
Qemu test results:
	total: 430 pass: 375 fail: 55

sparc32 smp images in next-20200603 still crash for me with a spinlock
recursion. s390 images hang early in boot. Several others (alpha, arm64,
various ppc) don't even compile. I can run some more bisects over time,
but this is becoming a full-time job :-(.

Guenter

^ permalink raw reply

* Re: [PATCH] pwm: Add missing "CONFIG_" prefix
From: Kees Cook @ 2020-06-03 23:38 UTC (permalink / raw)
  To: Joe Perches
  Cc: linux-pwm, Uwe Kleine-König, linux-kernel, Thierry Reding,
	Paul Mackerras, linuxppc-dev
In-Reply-To: <b08611018fdb6d88757c6008a5c02fa0e07b32fb.camel@perches.com>

On Wed, Jun 03, 2020 at 04:04:31PM -0700, Joe Perches wrote:
> more odd uses (mostly in comments)
> 
> $ git grep -P -oh '\bIS_ENABLED\s*\(\s*\w+\s*\)'| \
>   sed -r 's/\s+//g'| \
>   grep -v '(CONFIG_' | \
>   sort | uniq -c | sort -rn

I think a missed a bunch because my grep was messy. :) This is much
easier to scan.

>       7 IS_ENABLED(DEBUG)
>       4 IS_ENABLED(cfg)
>       2 IS_ENABLED(opt_name)
>       2 IS_ENABLED(DEBUG_PRINT_TRIE_GRAPHVIZ)
>       2 IS_ENABLED(config)
>       2 IS_ENABLED(cond)
>       2 IS_ENABLED(__BIG_ENDIAN)
>       1 IS_ENABLED(x)
>       1 IS_ENABLED(PWM_DEBUG)
>       1 IS_ENABLED(option)
>       1 IS_ENABLED(DEBUG_RANDOM_TRIE)
>       1 IS_ENABLED(DEBUG_CHACHA20POLY1305_SLOW_CHUNK_TEST)

These seem to be "as expected".

>       4 IS_ENABLED(DRM_I915_SELFTEST)
>       1 IS_ENABLED(STRICT_KERNEL_RWX)
>       1 IS_ENABLED(ETHTOOL_NETLINK)

But these are not.

> 
> STRICT_KERNEL_RWX is misused here in ppc
> 
> ---
> 
> Fix pr_warn without newline too.
> 
>  arch/powerpc/mm/book3s64/hash_utils.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
> index 51e3c15f7aff..dd60c5f2b991 100644
> --- a/arch/powerpc/mm/book3s64/hash_utils.c
> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> @@ -660,11 +660,10 @@ static void __init htab_init_page_sizes(void)
>  		 * Pick a size for the linear mapping. Currently, we only
>  		 * support 16M, 1M and 4K which is the default
>  		 */
> -		if (IS_ENABLED(STRICT_KERNEL_RWX) &&
> +		if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX) &&
>  		    (unsigned long)_stext % 0x1000000) {
>  			if (mmu_psize_defs[MMU_PAGE_16M].shift)
> -				pr_warn("Kernel not 16M aligned, "
> -					"disabling 16M linear map alignment");
> +				pr_warn("Kernel not 16M aligned, disabling 16M linear map alignment\n");
>  			aligned = false;
>  		}

Reviewed-by: Kees Cook <keescook@chromium.org>


-- 
Kees Cook

^ permalink raw reply

* Re: [RESEND PATCH v9 4/5] ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
From: Ira Weiny @ 2020-06-03 23:22 UTC (permalink / raw)
  To: Vaibhav Jain
  Cc: Aneesh Kumar K . V, linuxppc-dev, linux-kernel, Steven Rostedt,
	linux-nvdimm
In-Reply-To: <87lfl3hp2n.fsf@linux.ibm.com>

On Thu, Jun 04, 2020 at 01:58:00AM +0530, Vaibhav Jain wrote:
> Hi Ira,
> 
> Thanks again for reviewing this patch. My Response below:
> 
> Ira Weiny <ira.weiny@intel.com> writes:
> 
> > On Tue, Jun 02, 2020 at 01:51:49PM -0700, 'Ira Weiny' wrote:
> >> On Tue, Jun 02, 2020 at 03:44:37PM +0530, Vaibhav Jain wrote:
> >
> > ...
> >
> >> > +
> >> > +/*
> >> > + * PDSM Envelope:
> >> > + *
> >> > + * The ioctl ND_CMD_CALL transfers data between user-space and kernel via
> >> > + * envelope which consists of a header and user-defined payload sections.
> >> > + * The header is described by 'struct nd_pdsm_cmd_pkg' which expects a
> >> > + * payload following it and accessible via 'nd_pdsm_cmd_pkg.payload' field.
> >> > + * There is reserved field that can used to introduce new fields to the
> >> > + * structure in future. It also tries to ensure that 'nd_pdsm_cmd_pkg.payload'
> >> > + * lies at a 8-byte boundary.
> >> > + *
> >> > + *  +-------------+---------------------+---------------------------+
> >> > + *  |   64-Bytes  |       16-Bytes      |       Max 176-Bytes       |
> >> > + *  +-------------+---------------------+---------------------------+
> >> > + *  |               nd_pdsm_cmd_pkg     |                           |
> >> > + *  |-------------+                     |                           |
> >> > + *  |  nd_cmd_pkg |                     |                           |
> >> > + *  +-------------+---------------------+---------------------------+
> >> > + *  | nd_family   |                     |                           |
> >> > + *  | nd_size_out | cmd_status          |                           |
> >> > + *  | nd_size_in  | payload_version     |     payload               |
> >> > + *  | nd_command  | reserved            |                           |
> >> > + *  | nd_fw_size  |                     |                           |
> >> > + *  +-------------+---------------------+---------------------------+
> >
> > One more comment WRT nd_size_[in|out].  I know that it is defined as the size
> > of the FW payload but normally when you nest headers 'size' in Header A
> > represents everything after Header A, including Header B.  In this case that
> > would be including nd_pdsm_cmd_pkg...
> >
> > It looks like that is not what you have done?  Or perhaps I missed it?
> >
> Not sure if I understand the question correctly.
> 'struct nd_pdsm_cmd_pkg' contains 'struct nd_cmd_pkg' at its head and
> its size_[in|out] are populated by the libndctl in userspace, setting
> them to data following the 'struct nd_cmd_pkg'.
> 
> Copying of 'struct nd_cmd_pkg' to the input/out envelop is implicitly
> done in __nd_ioctl via the command descriptor array
> __nd_cmd_bus_descs. So I dont need to add the size of 'struct
> nd_cmd_pkg' to nd_size_[in|out].

Yea I see that now...  Coming from a networking background I find that odd...
:-/  Usually 'size' in a header includes all data after that header.  Because
header A knows nothing of the rest of the 'payload'...

FWIW you could define nd_size_in anyway you want because you are not really
sending any payload back from firmware directly.  But I suppose I can live with
it.

Ira

> 
> > Ira
> >
> >> > + *
> >> > + * PDSM Header:
> >> > + *
> >> > + * The header is defined as 'struct nd_pdsm_cmd_pkg' which embeds a
> >> > + * 'struct nd_cmd_pkg' instance. The PDSM command is assigned to member
> >> > + * 'nd_cmd_pkg.nd_command'. Apart from size information of the envelope which is
> >> > + * contained in 'struct nd_cmd_pkg', the header also has members following 
> >>                                                              ^^^^^
> >>                                                         ...  the  ...
> >> 
> >> > + * members:
> >> > + *
> >> > + * 'cmd_status'		: (Out) Errors if any encountered while servicing PDSM.
> >> > + * 'payload_version'	: (In/Out) Version number associated with the payload.
> >> > + * 'reserved'		: Not used and reserved for future.
> >> > + *
> >> > + * PDSM Payload:
> >> > + *
> >> > + * The layout of the PDSM Payload is defined by various structs shared between
> >> > + * papr_scm and libndctl so that contents of payload can be interpreted. During
> >> > + * servicing of a PDSM the papr_scm module will read input args from the payload
> >> > + * field by casting its contents to an appropriate struct pointer based on the
> >> > + * PDSM command. Similarly the output of servicing the PDSM command will be
> >> > + * copied to the payload field using the same struct.
> >> > + *
> >> > + * 'libnvdimm' enforces a hard limit of 256 bytes on the envelope size, which
> >> > + * leaves around 176 bytes for the envelope payload (ignoring any padding that
> >> > + * the compiler may silently introduce).
> >> > + *
> >> > + * Payload Version:
> >> > + *
> >> > + * A 'payload_version' field is present in PDSM header that indicates a specific
> >> > + * version of the structure present in PDSM Payload for a given PDSM command.
> >> > + * This provides backward compatibility in case the PDSM Payload structure
> >> > + * evolves and different structures are supported by 'papr_scm' and 'libndctl'.
> >> > + *
> >> > + * When sending a PDSM Payload to 'papr_scm', 'libndctl' should send the version
> >> > + * of the payload struct it supports via 'payload_version' field. The 'papr_scm'
> >> > + * module when servicing the PDSM envelope checks the 'payload_version' and then
> >> > + * uses 'payload struct version' == MIN('payload_version field',
> >> > + * 'max payload-struct-version supported by papr_scm') to service the PDSM.
> >> > + * After servicing the PDSM, 'papr_scm' put the negotiated version of payload
> >> > + * struct in returned 'payload_version' field.
> >> > + *
> >> > + * Libndctl on receiving the envelope back from papr_scm again checks the
> >> > + * 'payload_version' field and based on it use the appropriate version dsm
> >> > + * struct to parse the results.
> >> > + *
> >> > + * Backward Compatibility:
> >> > + *
> >> > + * Above scheme of exchanging different versioned PDSM struct between libndctl
> >> > + * and papr_scm should provide backward compatibility until following two
> >> > + * assumptions/conditions when defining new PDSM structs hold:
> >> > + *
> >> > + * Let T(X) = { set of attributes in PDSM struct 'T' versioned X }
> >> > + *
> >> > + * 1. T(X) is a proper subset of T(Y) if Y > X.
> >> > + *    i.e Each new version of PDSM struct should retain existing struct
> >> > + *    attributes from previous version
> >> > + *
> >> > + * 2. If an entity (libndctl or papr_scm) supports a PDSM struct T(X) then
> >> > + *    it should also support T(1), T(2)...T(X - 1).
> >> > + *    i.e When adding support for new version of a PDSM struct, libndctl
> >> > + *    and papr_scm should retain support of the existing PDSM struct
> >> > + *    version they support.
> >> 
> >> Please see this thread for an example why versions are a bad idea in UAPIs:
> >> 
> >> https://lkml.org/lkml/2020/3/26/213
> >> 
> >> While the use of version is different in that thread the fundamental issues are
> >> the same.  You end up with some weird matrix of supported features and
> >> structure definitions.  For example, you are opening up the possibility of
> >> changing structures with a different version for no good reason.
> >> 
> >> Also having the user query with version Z and get back version X (older) is
> >> odd.  Generally if the kernel does not know about a feature (ie version Z of
> >> the structure) it should return -EINVAL and let the user figure out what to do.
> >> The user may just give up or they could try a different query.
> >> 
> >> > + */
> >> > +
> >> > +/* PDSM-header + payload expected with ND_CMD_CALL ioctl from libnvdimm */
> >> > +struct nd_pdsm_cmd_pkg {
> >> > +	struct nd_cmd_pkg hdr;	/* Package header containing sub-cmd */
> >> > +	__s32 cmd_status;	/* Out: Sub-cmd status returned back */
> >> > +	__u16 reserved[5];	/* Ignored and to be used in future */
> >> 
> >> How do you know when reserved is used for something else in the future?  Is
> >> reserved guaranteed (and checked by the code) to be 0?
> >> 
> >> > +	__u16 payload_version;	/* In/Out: version of the payload */
> >> 
> >> Why is payload_version after reserved?
> >> 
> >> > +	__u8 payload[];		/* In/Out: Sub-cmd data buffer */
> >> > +} __packed;
> >> > +
> >> > +/*
> >> > + * Methods to be embedded in ND_CMD_CALL request. These are sent to the kernel
> >> > + * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
> >> > + */
> >> > +enum papr_pdsm {
> >> > +	PAPR_PDSM_MIN = 0x0,
> >> > +	PAPR_PDSM_MAX,
> >> > +};
> >> > +
> >> > +/* Convert a libnvdimm nd_cmd_pkg to pdsm specific pkg */
> >> > +static inline struct nd_pdsm_cmd_pkg *nd_to_pdsm_cmd_pkg(struct nd_cmd_pkg *cmd)
> >> > +{
> >> > +	return (struct nd_pdsm_cmd_pkg *) cmd;
> >> > +}
> >> > +
> >> > +/* Return the payload pointer for a given pcmd */
> >> > +static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
> >> > +{
> >> > +	if (pcmd->hdr.nd_size_in == 0 && pcmd->hdr.nd_size_out == 0)
> >> > +		return NULL;
> >> > +	else
> >> > +		return (void *)(pcmd->payload);
> >> > +}
> >> > +
> >> > +#endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
> >> > diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> >> > index 149431594839..5e2237e7ec08 100644
> >> > --- a/arch/powerpc/platforms/pseries/papr_scm.c
> >> > +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> >> > @@ -15,13 +15,15 @@
> >> >  #include <linux/seq_buf.h>
> >> >  
> >> >  #include <asm/plpar_wrappers.h>
> >> > +#include <asm/papr_pdsm.h>
> >> >  
> >> >  #define BIND_ANY_ADDR (~0ul)
> >> >  
> >> >  #define PAPR_SCM_DIMM_CMD_MASK \
> >> >  	((1ul << ND_CMD_GET_CONFIG_SIZE) | \
> >> >  	 (1ul << ND_CMD_GET_CONFIG_DATA) | \
> >> > -	 (1ul << ND_CMD_SET_CONFIG_DATA))
> >> > +	 (1ul << ND_CMD_SET_CONFIG_DATA) | \
> >> > +	 (1ul << ND_CMD_CALL))
> >> >  
> >> >  /* DIMM health bitmap bitmap indicators */
> >> >  /* SCM device is unable to persist memory contents */
> >> > @@ -350,16 +352,97 @@ static int papr_scm_meta_set(struct papr_scm_priv *p,
> >> >  	return 0;
> >> >  }
> >> >  
> >> > +/*
> >> > + * Validate the inputs args to dimm-control function and return '0' if valid.
> >> > + * This also does initial sanity validation to ND_CMD_CALL sub-command packages.
> >> > + */
> >> > +static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
> >> > +		       unsigned int buf_len)
> >> > +{
> >> > +	unsigned long cmd_mask = PAPR_SCM_DIMM_CMD_MASK;
> >> > +	struct nd_pdsm_cmd_pkg *pkg = nd_to_pdsm_cmd_pkg(buf);
> >> > +	struct papr_scm_priv *p;
> >> > +
> >> > +	/* Only dimm-specific calls are supported atm */
> >> > +	if (!nvdimm)
> >> > +		return -EINVAL;
> >> > +
> >> > +	/* get the provider date from struct nvdimm */
> >> 
> >> s/date/data
> >> 
> >> > +	p = nvdimm_provider_data(nvdimm);
> >> > +
> >> > +	if (!test_bit(cmd, &cmd_mask)) {
> >> > +		dev_dbg(&p->pdev->dev, "Unsupported cmd=%u\n", cmd);
> >> > +		return -EINVAL;
> >> > +	} else if (cmd == ND_CMD_CALL) {
> >> > +
> >> > +		/* Verify the envelope package */
> >> > +		if (!buf || buf_len < sizeof(struct nd_pdsm_cmd_pkg)) {
> >> > +			dev_dbg(&p->pdev->dev, "Invalid pkg size=%u\n",
> >> > +				buf_len);
> >> > +			return -EINVAL;
> >> > +		}
> >> > +
> >> > +		/* Verify that the PDSM family is valid */
> >> > +		if (pkg->hdr.nd_family != NVDIMM_FAMILY_PAPR) {
> >> > +			dev_dbg(&p->pdev->dev, "Invalid pkg family=0x%llx\n",
> >> > +				pkg->hdr.nd_family);
> >> > +			return -EINVAL;
> >> > +
> >> > +		}
> >> > +
> >> > +		/* We except a payload with all PDSM commands */
> >> > +		if (pdsm_cmd_to_payload(pkg) == NULL) {
> >> > +			dev_dbg(&p->pdev->dev,
> >> > +				"Empty payload for sub-command=0x%llx\n",
> >> > +				pkg->hdr.nd_command);
> >> > +			return -EINVAL;
> >> > +		}
> >> > +	}
> >> > +
> >> > +	/* Command looks valid */
> >> 
> >> I assume the first command to be implemented also checks the { nd_command,
> >> payload_version, payload length } for correctness?
> >> 
> >> > +	return 0;
> >> > +}
> >> > +
> >> > +static int papr_scm_service_pdsm(struct papr_scm_priv *p,
> >> > +				struct nd_pdsm_cmd_pkg *call_pkg)
> >> > +{
> >> > +	/* unknown subcommands return error in packages */
> >> > +	if (call_pkg->hdr.nd_command <= PAPR_PDSM_MIN ||
> >> > +	    call_pkg->hdr.nd_command >= PAPR_PDSM_MAX) {
> >> > +		dev_dbg(&p->pdev->dev, "Invalid PDSM request 0x%llx\n",
> >> > +			call_pkg->hdr.nd_command);
> >> > +		call_pkg->cmd_status = -EINVAL;
> >> > +		return 0;
> >> > +	}
> >> > +
> >> > +	/* Depending on the DSM command call appropriate service routine */
> >> > +	switch (call_pkg->hdr.nd_command) {
> >> > +	default:
> >> > +		dev_dbg(&p->pdev->dev, "Unsupported PDSM request 0x%llx\n",
> >> > +			call_pkg->hdr.nd_command);
> >> > +		call_pkg->cmd_status = -ENOENT;
> >> > +		return 0;
> >> > +	}
> >> > +}
> >> > +
> >> >  static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
> >> >  			  struct nvdimm *nvdimm, unsigned int cmd, void *buf,
> >> >  			  unsigned int buf_len, int *cmd_rc)
> >> >  {
> >> >  	struct nd_cmd_get_config_size *get_size_hdr;
> >> >  	struct papr_scm_priv *p;
> >> > +	struct nd_pdsm_cmd_pkg *call_pkg = NULL;
> >> > +	int rc;
> >> >  
> >> > -	/* Only dimm-specific calls are supported atm */
> >> > -	if (!nvdimm)
> >> > -		return -EINVAL;
> >> > +	/* Use a local variable in case cmd_rc pointer is NULL */
> >> > +	if (cmd_rc == NULL)
> >> > +		cmd_rc = &rc;
> >> 
> >> Why is this needed?  AFAICT The caller of papr_scm_ndctl does not specify null
> >> and you did not change it.
> >> 
> >> > +
> >> > +	*cmd_rc = is_cmd_valid(nvdimm, cmd, buf, buf_len);
> >> > +	if (*cmd_rc) {
> >> > +		pr_debug("Invalid cmd=0x%x. Err=%d\n", cmd, *cmd_rc);
> >> > +		return *cmd_rc;
> >> > +	}
> >> >  
> >> >  	p = nvdimm_provider_data(nvdimm);
> >> >  
> >> > @@ -381,13 +464,19 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
> >> >  		*cmd_rc = papr_scm_meta_set(p, buf);
> >> >  		break;
> >> >  
> >> > +	case ND_CMD_CALL:
> >> > +		call_pkg = nd_to_pdsm_cmd_pkg(buf);
> >> > +		*cmd_rc = papr_scm_service_pdsm(p, call_pkg);
> >> > +		break;
> >> > +
> >> >  	default:
> >> > -		return -EINVAL;
> >> > +		dev_dbg(&p->pdev->dev, "Unknown command = %d\n", cmd);
> >> > +		*cmd_rc = -EINVAL;
> >> 
> >> Is this change related?  If there is a bug where there is a caller of
> >> papr_scm_ndctl() with cmd_rc == NULL this should be a separate patch to fix
> >> that issue.
> >> 
> >> Ira
> >> 
> >> >  	}
> >> >  
> >> >  	dev_dbg(&p->pdev->dev, "returned with cmd_rc = %d\n", *cmd_rc);
> >> >  
> >> > -	return 0;
> >> > +	return *cmd_rc;
> >> >  }
> >> >  
> >> >  static ssize_t flags_show(struct device *dev,
> >> > diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
> >> > index de5d90212409..0e09dc5cec19 100644
> >> > --- a/include/uapi/linux/ndctl.h
> >> > +++ b/include/uapi/linux/ndctl.h
> >> > @@ -244,6 +244,7 @@ struct nd_cmd_pkg {
> >> >  #define NVDIMM_FAMILY_HPE2 2
> >> >  #define NVDIMM_FAMILY_MSFT 3
> >> >  #define NVDIMM_FAMILY_HYPERV 4
> >> > +#define NVDIMM_FAMILY_PAPR 5
> >> >  
> >> >  #define ND_IOCTL_CALL			_IOWR(ND_IOCTL, ND_CMD_CALL,\
> >> >  					struct nd_cmd_pkg)
> >> > -- 
> >> > 2.26.2
> >> > 
> >> _______________________________________________
> >> Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
> >> To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
> 
> -- 
> Cheers
> ~ Vaibhav

^ permalink raw reply

* Re: [RESEND PATCH v9 5/5] powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
From: Ira Weiny @ 2020-06-03 23:18 UTC (permalink / raw)
  To: Vaibhav Jain
  Cc: Santosh Sivaraj, linux-nvdimm, linux-kernel, Steven Rostedt,
	Oliver O'Halloran, Aneesh Kumar K . V, Dan Williams,
	linuxppc-dev
In-Reply-To: <87pnaggee3.fsf@linux.ibm.com>

On Thu, Jun 04, 2020 at 12:34:04AM +0530, Vaibhav Jain wrote:
> Hi Ira,
> 
> Thanks for reviewing this patch. My responses below:
> 
> Ira Weiny <ira.weiny@intel.com> writes:
> 
> > On Tue, Jun 02, 2020 at 03:44:38PM +0530, Vaibhav Jain wrote:
> >> This patch implements support for PDSM request 'PAPR_PDSM_HEALTH'
> >> that returns a newly introduced 'struct nd_papr_pdsm_health' instance
> >> containing dimm health information back to user space in response to
> >> ND_CMD_CALL. This functionality is implemented in newly introduced
> >> papr_pdsm_health() that queries the nvdimm health information and
> >> then copies this information to the package payload whose layout is
> >> defined by 'struct nd_papr_pdsm_health'.
> >> 
> >> The patch also introduces a new member 'struct papr_scm_priv.health'
> >> thats an instance of 'struct nd_papr_pdsm_health' to cache the health
> >> information of a nvdimm. As a result functions drc_pmem_query_health()
> >> and flags_show() are updated to populate and use this new struct
> >> instead of a u64 integer that was earlier used.
> >> 
> >> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
> >> Cc: Dan Williams <dan.j.williams@intel.com>
> >> Cc: Michael Ellerman <mpe@ellerman.id.au>
> >> Cc: Ira Weiny <ira.weiny@intel.com>
> >> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> >> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> >> ---
> >> Changelog:
> >> 
> >> Resend:
> >> * Added ack from Aneesh.
> >> 
> >> v8..v9:
> >> * s/PAPR_SCM_PDSM_HEALTH/PAPR_PDSM_HEALTH/g  [ Dan , Aneesh ]
> >> * s/PAPR_SCM_PSDM_DIMM_*/PAPR_PDSM_DIMM_*/g
> >> * Renamed papr_scm_get_health() to papr_psdm_health()
> >> * Updated patch description to replace papr-scm dimm with nvdimm.
> >> 
> >> v7..v8:
> >> * None
> >> 
> >> Resend:
> >> * None
> >> 
> >> v6..v7:
> >> * Updated flags_show() to use seq_buf_printf(). [Mpe]
> >> * Updated papr_scm_get_health() to use newly introduced
> >>   __drc_pmem_query_health() bypassing the cache [Mpe].
> >> 
> >> v5..v6:
> >> * Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to
> >>   gaurd against possibility of different compilers adding different
> >>   paddings to the struct [ Dan Williams ]
> >> 
> >> * Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of
> >>   'bool' and also updated drc_pmem_query_health() to take this into
> >>   account. [ Dan Williams ]
> >> 
> >> v4..v5:
> >> * None
> >> 
> >> v3..v4:
> >> * Call the DSM_PAPR_SCM_HEALTH service function from
> >>   papr_scm_service_dsm() instead of papr_scm_ndctl(). [Aneesh]
> >> 
> >> v2..v3:
> >> * Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types
> >>   as its exported to the userspace [Aneesh]
> >> * Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health
> >>   from enum to #defines [Aneesh]
> >> 
> >> v1..v2:
> >> * New patch in the series
> >> ---
> >>  arch/powerpc/include/uapi/asm/papr_pdsm.h |  39 +++++++
> >>  arch/powerpc/platforms/pseries/papr_scm.c | 125 +++++++++++++++++++---
> >>  2 files changed, 147 insertions(+), 17 deletions(-)
> >> 
> >> diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h b/arch/powerpc/include/uapi/asm/papr_pdsm.h
> >> index 6407fefcc007..411725a91591 100644
> >> --- a/arch/powerpc/include/uapi/asm/papr_pdsm.h
> >> +++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
> >> @@ -115,6 +115,7 @@ struct nd_pdsm_cmd_pkg {
> >>   */
> >>  enum papr_pdsm {
> >>  	PAPR_PDSM_MIN = 0x0,
> >> +	PAPR_PDSM_HEALTH,
> >>  	PAPR_PDSM_MAX,
> >>  };
> >>  
> >> @@ -133,4 +134,42 @@ static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
> >>  		return (void *)(pcmd->payload);
> >>  }
> >>  
> >> +/* Various nvdimm health indicators */
> >> +#define PAPR_PDSM_DIMM_HEALTHY       0
> >> +#define PAPR_PDSM_DIMM_UNHEALTHY     1
> >> +#define PAPR_PDSM_DIMM_CRITICAL      2
> >> +#define PAPR_PDSM_DIMM_FATAL         3
> >> +
> >> +/*
> >> + * Struct exchanged between kernel & ndctl in for PAPR_PDSM_HEALTH
> >> + * Various flags indicate the health status of the dimm.
> >> + *
> >> + * dimm_unarmed		: Dimm not armed. So contents wont persist.
> >> + * dimm_bad_shutdown	: Previous shutdown did not persist contents.
> >> + * dimm_bad_restore	: Contents from previous shutdown werent restored.
> >> + * dimm_scrubbed	: Contents of the dimm have been scrubbed.
> >> + * dimm_locked		: Contents of the dimm cant be modified until CEC reboot
> >> + * dimm_encrypted	: Contents of dimm are encrypted.
> >> + * dimm_health		: Dimm health indicator. One of PAPR_PDSM_DIMM_XXXX
> >> + */
> >> +struct nd_papr_pdsm_health_v1 {
> >> +	__u8 dimm_unarmed;
> >> +	__u8 dimm_bad_shutdown;
> >> +	__u8 dimm_bad_restore;
> >> +	__u8 dimm_scrubbed;
> >> +	__u8 dimm_locked;
> >> +	__u8 dimm_encrypted;
> >> +	__u16 dimm_health;
> >> +} __packed;
> >> +
> >> +/*
> >> + * Typedef the current struct for dimm_health so that any application
> >> + * or kernel recompiled after introducing a new version automatically
> >> + * supports the new version.
> >> + */
> >> +#define nd_papr_pdsm_health nd_papr_pdsm_health_v1
> >> +
> >> +/* Current version number for the dimm health struct */
> >
> > This can't be the 'current' version.  You will need a list of versions you
> > support.  Because if the user passes in an old version you need to be able to
> > respond with that old version.  Also if you plan to support 'return X for a Y
> > query' then the user will need both X and Y defined to interpret X.
> Yes, and that change will be introduced with addition of version-2 of
> nd_papr_pdsm_health. Earlier version of the patchset[1] had such a table
> implemented. But to simplify the patchset, as we are only dealing with
> version-1 of the structs right now, it was dropped.
> 
> [1] :
> https://lore.kernel.org/linuxppc-dev/20200220095805.197229-9-vaibhav@linux.ibm.com/

I'm not sure I follow that comment.

I feel like there is some confusion about what firmware can return vs the UAPI
structure.  You have already marshaled the data between the 2.  We can define
whatever we want for the UAPI structures throwing away data the kernel does not
understand from the firmware.

> 
> >
> >> +#define ND_PAPR_PDSM_HEALTH_VERSION 1
> >> +
> >>  #endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
> >> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> >> index 5e2237e7ec08..c0606c0c659c 100644
> >> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> >> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> >> @@ -88,7 +88,7 @@ struct papr_scm_priv {
> >>  	unsigned long lasthealth_jiffies;
> >>  
> >>  	/* Health information for the dimm */
> >> -	u64 health_bitmap;
> >> +	struct nd_papr_pdsm_health health;
> >
> > ok so we are throwing away all the #defs from patch 1?  Are they still valid?
> >
> > I'm confused that patch 3 added this and we are throwing it away
> > here...
> The #defines are still valid, only the usage moved to a __drc_pmem_query_health().
> 
> >
> >>  };
> >>  
> >>  static int drc_pmem_bind(struct papr_scm_priv *p)
> >> @@ -201,6 +201,7 @@ static int drc_pmem_query_n_bind(struct papr_scm_priv *p)
> >>  static int __drc_pmem_query_health(struct papr_scm_priv *p)
> >>  {
> >>  	unsigned long ret[PLPAR_HCALL_BUFSIZE];
> >> +	u64 health;
> >>  	long rc;
> >>  
> >>  	/* issue the hcall */
> >> @@ -208,18 +209,46 @@ static int __drc_pmem_query_health(struct papr_scm_priv *p)
> >>  	if (rc != H_SUCCESS) {
> >>  		dev_err(&p->pdev->dev,
> >>  			 "Failed to query health information, Err:%ld\n", rc);
> >> -		rc = -ENXIO;
> >> -		goto out;
> >> +		return -ENXIO;
> >
> > I missed this...  probably did not need the goto in the first patch?
> Yes, will get rid of the goto from patch-1.

Cool.

> 
> >
> >>  	}
> >>  
> >>  	p->lasthealth_jiffies = jiffies;
> >> -	p->health_bitmap = ret[0] & ret[1];
> >> +	health = ret[0] & ret[1];
> >>  
> >>  	dev_dbg(&p->pdev->dev,
> >>  		"Queried dimm health info. Bitmap:0x%016lx Mask:0x%016lx\n",
> >>  		ret[0], ret[1]);
> >> -out:
> >> -	return rc;
> >> +
> >> +	memset(&p->health, 0, sizeof(p->health));
> >> +
> >> +	/* Check for various masks in bitmap and set the buffer */
> >> +	if (health & PAPR_PMEM_UNARMED_MASK)
> >
> > Oh ok...  odd.  (don't add code then just take it away in a series)
> > You could have lead with the user structure and put this code in patch
> > 3.
> The struct nd_papr_pdsm_health in only introduced this patch in header
> 'papr_pdsm.h' as means of exchanging nvdimm health information with
> userspace. Introducing this struct without introducing the necessary
> scafolding in 'papr_pdsm.h' would have been very counter-intutive.

I respectfully disagree.  You intended to use a copy of this structure in
kernel to store the data.  Just do that.

> 
> >
> > Why does the user need u8 to represent a single bit?  Does this help protect
> > against endian issues?
> This was 'bool' earlier but since type 'bool' isnt suitable for ioctl abi
> and I wanted to avoid bit fields here as not sure if their packing may
> differ across compilers hence replaced with u8.
> 

ok works for me...

> >
> >> +		p->health.dimm_unarmed = 1;
> >> +
> >> +	if (health & PAPR_PMEM_BAD_SHUTDOWN_MASK)
> >> +		p->health.dimm_bad_shutdown = 1;
> >> +
> >> +	if (health & PAPR_PMEM_BAD_RESTORE_MASK)
> >> +		p->health.dimm_bad_restore = 1;
> >> +
> >> +	if (health & PAPR_PMEM_ENCRYPTED)
> >> +		p->health.dimm_encrypted = 1;
> >> +
> >> +	if (health & PAPR_PMEM_SCRUBBED_AND_LOCKED) {
> >> +		p->health.dimm_locked = 1;
> >> +		p->health.dimm_scrubbed = 1;
> >> +	}
> >> +
> >> +	if (health & PAPR_PMEM_HEALTH_UNHEALTHY)
> >> +		p->health.dimm_health = PAPR_PDSM_DIMM_UNHEALTHY;
> >> +
> >> +	if (health & PAPR_PMEM_HEALTH_CRITICAL)
> >> +		p->health.dimm_health = PAPR_PDSM_DIMM_CRITICAL;
> >> +
> >> +	if (health & PAPR_PMEM_HEALTH_FATAL)
> >> +		p->health.dimm_health = PAPR_PDSM_DIMM_FATAL;
> >> +
> >> +	return 0;
> >>  }
> >>  
> >>  /* Min interval in seconds for assuming stable dimm health */
> >> @@ -403,6 +432,58 @@ static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
> >>  	return 0;
> >>  }
> >>  
> >> +/* Fetch the DIMM health info and populate it in provided package. */
> >> +static int papr_pdsm_health(struct papr_scm_priv *p,
> >> +			       struct nd_pdsm_cmd_pkg *pkg)
> >> +{
> >> +	int rc;
> >> +	size_t copysize = sizeof(p->health);
> >> +
> >> +	/* Ensure dimm health mutex is taken preventing concurrent access */
> >> +	rc = mutex_lock_interruptible(&p->health_mutex);
> >> +	if (rc)
> >> +		goto out;
> >> +
> >> +	/* Always fetch upto date dimm health data ignoring cached values */
> >> +	rc = __drc_pmem_query_health(p);
> >> +	if (rc)
> >> +		goto out_unlock;
> >> +	/*
> >> +	 * If the requested payload version is greater than one we know
> >> +	 * about, return the payload version we know about and let
> >> +	 * caller/userspace handle.
> >> +	 */
> >> +	if (pkg->payload_version > ND_PAPR_PDSM_HEALTH_VERSION)
> >> +		pkg->payload_version = ND_PAPR_PDSM_HEALTH_VERSION;
> >
> > I know this seems easy now but I do think you will run into trouble later.
> 
> I did addressed this in an earlier iteration of this patchset[1] and
> dropped it in favour of simplicity.
> 
> [1] :
> https://lore.kernel.org/linuxppc-dev/20200220095805.197229-9-vaibhav@linux.ibm.com/
 
I don't see how that addresses this?  See my other email.

Ira

> 
> > Ira
> >
> >> +
> >> +	if (pkg->hdr.nd_size_out < copysize) {
> >> +		dev_dbg(&p->pdev->dev, "Truncated payload (%u). Expected (%lu)",
> >> +			pkg->hdr.nd_size_out, copysize);
> >> +		rc = -ENOSPC;
> >> +		goto out_unlock;
> >> +	}
> >> +
> >> +	dev_dbg(&p->pdev->dev, "Copying payload size=%lu version=0x%x\n",
> >> +		copysize, pkg->payload_version);
> >> +
> >> +	/* Copy the health struct to the payload */
> >> +	memcpy(pdsm_cmd_to_payload(pkg), &p->health, copysize);
> >> +	pkg->hdr.nd_fw_size = copysize;
> >> +
> >> +out_unlock:
> >> +	mutex_unlock(&p->health_mutex);
> >> +
> >> +out:
> >> +	/*
> >> +	 * Put the error in out package and return success from function
> >> +	 * so that errors if any are propogated back to userspace.
> >> +	 */
> >> +	pkg->cmd_status = rc;
> >> +	dev_dbg(&p->pdev->dev, "completion code = %d\n", rc);
> >> +
> >> +	return 0;
> >> +}
> >> +
> >>  static int papr_scm_service_pdsm(struct papr_scm_priv *p,
> >>  				struct nd_pdsm_cmd_pkg *call_pkg)
> >>  {
> >> @@ -417,6 +498,9 @@ static int papr_scm_service_pdsm(struct papr_scm_priv *p,
> >>  
> >>  	/* Depending on the DSM command call appropriate service routine */
> >>  	switch (call_pkg->hdr.nd_command) {
> >> +	case PAPR_PDSM_HEALTH:
> >> +		return papr_pdsm_health(p, call_pkg);
> >> +
> >>  	default:
> >>  		dev_dbg(&p->pdev->dev, "Unsupported PDSM request 0x%llx\n",
> >>  			call_pkg->hdr.nd_command);
> >> @@ -485,34 +569,41 @@ static ssize_t flags_show(struct device *dev,
> >>  	struct nvdimm *dimm = to_nvdimm(dev);
> >>  	struct papr_scm_priv *p = nvdimm_provider_data(dimm);
> >>  	struct seq_buf s;
> >> -	u64 health;
> >>  	int rc;
> >>  
> >>  	rc = drc_pmem_query_health(p);
> >>  	if (rc)
> >>  		return rc;
> >>  
> >> -	/* Copy health_bitmap locally, check masks & update out buffer */
> >> -	health = READ_ONCE(p->health_bitmap);
> >> -
> >>  	seq_buf_init(&s, buf, PAGE_SIZE);
> >> -	if (health & PAPR_PMEM_UNARMED_MASK)
> >> +
> >> +	/* Protect concurrent modifications to papr_scm_priv */
> >> +	rc = mutex_lock_interruptible(&p->health_mutex);
> >> +	if (rc)
> >> +		return rc;
> >> +
> >> +	if (p->health.dimm_unarmed)
> >>  		seq_buf_printf(&s, "not_armed ");
> >>  
> >> -	if (health & PAPR_PMEM_BAD_SHUTDOWN_MASK)
> >> +	if (p->health.dimm_bad_shutdown)
> >>  		seq_buf_printf(&s, "flush_fail ");
> >>  
> >> -	if (health & PAPR_PMEM_BAD_RESTORE_MASK)
> >> +	if (p->health.dimm_bad_restore)
> >>  		seq_buf_printf(&s, "restore_fail ");
> >>  
> >> -	if (health & PAPR_PMEM_ENCRYPTED)
> >> +	if (p->health.dimm_encrypted)
> >>  		seq_buf_printf(&s, "encrypted ");
> >>  
> >> -	if (health & PAPR_PMEM_SMART_EVENT_MASK)
> >> +	if (p->health.dimm_health)
> >>  		seq_buf_printf(&s, "smart_notify ");
> >>  
> >> -	if (health & PAPR_PMEM_SCRUBBED_AND_LOCKED)
> >> -		seq_buf_printf(&s, "scrubbed locked ");
> >> +	if (p->health.dimm_scrubbed)
> >> +		seq_buf_printf(&s, "scrubbed ");
> >> +
> >> +	if (p->health.dimm_locked)
> >> +		seq_buf_printf(&s, "locked ");
> >> +
> >> +	mutex_unlock(&p->health_mutex);
> >>  
> >>  	if (seq_buf_used(&s))
> >>  		seq_buf_printf(&s, "\n");
> >> -- 
> >> 2.26.2
> >> 
> 
> -- 
> Cheers
> ~ Vaibhav

^ permalink raw reply

* Re: [PATCH v1 3/4] KVM: PPC: Book3S HV: migrate remaining normal-GFNs to secure-GFNs in H_SVM_INIT_DONE
From: Ram Pai @ 2020-06-03 23:10 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: rcampbell, ldufour, cclaudio, kvm-ppc, aneesh.kumar, sukadev,
	linuxppc-dev, bauerman, david
In-Reply-To: <20200602100639.GB31382@in.ibm.com>

On Tue, Jun 02, 2020 at 03:36:39PM +0530, Bharata B Rao wrote:
> On Mon, Jun 01, 2020 at 12:05:35PM -0700, Ram Pai wrote:
> > On Mon, Jun 01, 2020 at 05:25:18PM +0530, Bharata B Rao wrote:
> > > On Sat, May 30, 2020 at 07:27:50PM -0700, Ram Pai wrote:
> > > > H_SVM_INIT_DONE incorrectly assumes that the Ultravisor has explicitly
> > > > called H_SVM_PAGE_IN for all secure pages.
> > > 
> > > I don't think that is quite true. HV doesn't assume anything about
> > > secure pages by itself.
> > 
> > Yes. Currently, it does not assume anything about secure pages.  But I am
> > proposing that it should consider all pages (except the shared pages) as
> > secure pages, when H_SVM_INIT_DONE is called.
> 
> Ok, then may be also add the proposed changes to H_SVM_INIT_DONE
> documentation.

ok.

> 
> > 
> > In other words, HV should treat all pages; except shared pages, as
> > secure pages once H_SVM_INIT_DONE is called. And this includes pages
> > added subsequently through memory hotplug.
> 
> So after H_SVM_INIT_DONE, if HV touches a secure page for any
> reason and gets encrypted contents via page-out, HV drops the
> device pfn at that time. So what state we would be in that case? We
> have completed H_SVM_INIT_DONE, but still have a normal (but encrypted)
> page in HV?

Good point.

The corresponding GFN will continue to be a secure GFN. Just that its
backing PFN is not a device-PFN, but a memory-PFN. Also that backing
memory-PFN contains encrypted content.

I will clarify this in the patch; about secure-GFN state.

> 
> > 
> > Yes. the Ultravisor can explicitly request the HV to move the pages
> > individually.  But that will slow down the transition too significantly.
> > It takes above 20min to transition them, for a SVM of size 100G.
> > 
> > With this proposed enhancement, the switch completes in a few seconds.
> 
> I think, many pages during initial switch and most pages for hotplugged
> memory are zero pages, for which we don't anyway issue UV page-in calls.
> So the 20min saving you are observing is purely due to hcall overhead?

Apparently, that seems to be the case.

> 
> How about extending H_SVM_PAGE_IN interface or a new hcall to request
> multiple pages in one request?
> 
> Also, how about requesting for bigger page sizes (2M)? Ralph Campbell
> had patches that added THP support for migrate_vma_* calls.

yes. that should give further boost. I think the API does not stop us
from using that feature. Its the support on the Ultravisor side.
Hopefully we will have contributions to the ultravisor once it is
opensourced.

> 
> > 
> > > 
> > > > These GFNs continue to be
> > > > normal GFNs associated with normal PFNs; when infact, these GFNs should
> > > > have been secure GFNs associated with device PFNs.
> > > 
> > > Transition to secure state is driven by SVM/UV and HV just responds to
> > > hcalls by issuing appropriate uvcalls. SVM/UV is in the best position to
> > > determine the required pages that need to be moved into secure side.
> > > HV just responds to it and tracks such pages as device private pages.
> > > 
> > > If SVM/UV doesn't get in all the pages to secure side by the time
> > > of H_SVM_INIT_DONE, the remaining pages are just normal (shared or
> > > otherwise) pages as far as HV is concerned.  Why should HV assume that
> > > SVM/UV didn't ask for a few pages and hence push those pages during
> > > H_SVM_INIT_DONE?
> > 
> > By definition, SVM is a VM backed by secure pages.
> > Hence all pages(except shared) must turn secure when a VM switches to SVM.
> > 
> > UV is interested in only a certain pages for the VM, which it will
> > request explicitly through H_SVM_PAGE_IN.  All other pages, need not
> > be paged-in through UV_PAGE_IN.  They just need to be switched to
> > device-pages.
> > 
> > > 
> > > I think UV should drive the movement of pages into secure side both
> > > of boot-time SVM memory and hot-plugged memory. HV does memslot
> > > registration uvcall when new memory is plugged in, UV should explicitly
> > > get the required pages in at that time instead of expecting HV to drive
> > > the same.
> > > 
> > > > +static int uv_migrate_mem_slot(struct kvm *kvm,
> > > > +		const struct kvm_memory_slot *memslot)
> > > > +{
> > > > +	unsigned long gfn = memslot->base_gfn;
> > > > +	unsigned long end;
> > > > +	bool downgrade = false;
> > > > +	struct vm_area_struct *vma;
> > > > +	int i, ret = 0;
> > > > +	unsigned long start = gfn_to_hva(kvm, gfn);
> > > > +
> > > > +	if (kvm_is_error_hva(start))
> > > > +		return H_STATE;
> > > > +
> > > > +	end = start + (memslot->npages << PAGE_SHIFT);
> > > > +
> > > > +	down_write(&kvm->mm->mmap_sem);
> > > > +
> > > > +	mutex_lock(&kvm->arch.uvmem_lock);
> > > > +	vma = find_vma_intersection(kvm->mm, start, end);
> > > > +	if (!vma || vma->vm_start > start || vma->vm_end < end) {
> > > > +		ret = H_STATE;
> > > > +		goto out_unlock;
> > > > +	}
> > > > +
> > > > +	ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,
> > > > +			  MADV_UNMERGEABLE, &vma->vm_flags);
> > > > +	downgrade_write(&kvm->mm->mmap_sem);
> > > > +	downgrade = true;
> > > > +	if (ret) {
> > > > +		ret = H_STATE;
> > > > +		goto out_unlock;
> > > > +	}
> > > > +
> > > > +	for (i = 0; i < memslot->npages; i++, ++gfn) {
> > > > +		/* skip paged-in pages and shared pages */
> > > > +		if (kvmppc_gfn_is_uvmem_pfn(gfn, kvm, NULL) ||
> > > > +			kvmppc_gfn_is_uvmem_shared(gfn, kvm))
> > > > +			continue;
> > > > +
> > > > +		start = gfn_to_hva(kvm, gfn);
> > > > +		end = start + (1UL << PAGE_SHIFT);
> > > > +		ret = kvmppc_svm_migrate_page(vma, start, end,
> > > > +			(gfn << PAGE_SHIFT), kvm, PAGE_SHIFT, false);
> > > > +
> > > > +		if (ret)
> > > > +			goto out_unlock;
> > > > +	}
> > > 
> > > Is there a guarantee that the vma you got for the start address remains
> > > valid for all the addresses till end in a memslot? If not, you should
> > > re-get the vma for the current address in each iteration I suppose.
> > 
> > 
> > mm->mmap_sem  is the semaphore that guards the vma. right?  If that
> > semaphore is held, can the vma change?
> 
> I am not sure if the vma you obtained would span the entire address range
> in the memslot.

will check.

Thanks,
RP

^ permalink raw reply

* Re: [PATCH] pwm: Add missing "CONFIG_" prefix
From: Joe Perches @ 2020-06-03 23:04 UTC (permalink / raw)
  To: Kees Cook, Uwe Kleine-König, Thierry Reding
  Cc: linux-pwm, linux-kernel, Paul Mackerras, linuxppc-dev
In-Reply-To: <202006031539.4198EA6@keescook>

On Wed, 2020-06-03 at 15:40 -0700, Kees Cook wrote:
> The IS_ENABLED() use was missing the CONFIG_ prefix which would have
> lead to skipping this code.
> 
> Fixes: 3ad1f3a33286 ("pwm: Implement some checks for lowlevel drivers")
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  drivers/pwm/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pwm/core.c b/drivers/pwm/core.c
> index 9973c442b455..6b3cbc0490c6 100644
> --- a/drivers/pwm/core.c
> +++ b/drivers/pwm/core.c
> @@ -121,7 +121,7 @@ static int pwm_device_request(struct pwm_device *pwm, const char *label)
>  		pwm->chip->ops->get_state(pwm->chip, pwm, &pwm->state);
>  		trace_pwm_get(pwm, &pwm->state);
>  
> -		if (IS_ENABLED(PWM_DEBUG))
> +		if (IS_ENABLED(CONFIG_PWM_DEBUG))
>  			pwm->last = pwm->state;
>  	}
>  
> -- 
> 2.25.1
> 

more odd uses (mostly in comments)

$ git grep -P -oh '\bIS_ENABLED\s*\(\s*\w+\s*\)'| \
  sed -r 's/\s+//g'| \
  grep -v '(CONFIG_' | \
  sort | uniq -c | sort -rn
      7 IS_ENABLED(DEBUG)
      4 IS_ENABLED(DRM_I915_SELFTEST)
      4 IS_ENABLED(cfg)
      2 IS_ENABLED(opt_name)
      2 IS_ENABLED(DEBUG_PRINT_TRIE_GRAPHVIZ)
      2 IS_ENABLED(config)
      2 IS_ENABLED(cond)
      2 IS_ENABLED(__BIG_ENDIAN)
      1 IS_ENABLED(x)
      1 IS_ENABLED(STRICT_KERNEL_RWX)
      1 IS_ENABLED(PWM_DEBUG)
      1 IS_ENABLED(option)
      1 IS_ENABLED(ETHTOOL_NETLINK)
      1 IS_ENABLED(DEBUG_RANDOM_TRIE)
      1 IS_ENABLED(DEBUG_CHACHA20POLY1305_SLOW_CHUNK_TEST)

STRICT_KERNEL_RWX is misused here in ppc

---

Fix pr_warn without newline too.

 arch/powerpc/mm/book3s64/hash_utils.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 51e3c15f7aff..dd60c5f2b991 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -660,11 +660,10 @@ static void __init htab_init_page_sizes(void)
 		 * Pick a size for the linear mapping. Currently, we only
 		 * support 16M, 1M and 4K which is the default
 		 */
-		if (IS_ENABLED(STRICT_KERNEL_RWX) &&
+		if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX) &&
 		    (unsigned long)_stext % 0x1000000) {
 			if (mmu_psize_defs[MMU_PAGE_16M].shift)
-				pr_warn("Kernel not 16M aligned, "
-					"disabling 16M linear map alignment");
+				pr_warn("Kernel not 16M aligned, disabling 16M linear map alignment\n");
 			aligned = false;
 		}
 




^ permalink raw reply related

* Re: [RESEND PATCH v9 4/5] ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
From: Ira Weiny @ 2020-06-03 22:46 UTC (permalink / raw)
  To: Vaibhav Jain
  Cc: Santosh Sivaraj, linux-nvdimm, linux-kernel, Steven Rostedt,
	Oliver O'Halloran, Aneesh Kumar K . V, Dan Williams,
	linuxppc-dev
In-Reply-To: <87tuzsggtd.fsf@linux.ibm.com>

On Wed, Jun 03, 2020 at 11:41:42PM +0530, Vaibhav Jain wrote:
> Hi Ira,
> 
> Thanks for reviewing this patch. My responses below:
> 
> Ira Weiny <ira.weiny@intel.com> writes:
> 

...

> >> + *
> >> + * Payload Version:
> >> + *
> >> + * A 'payload_version' field is present in PDSM header that indicates a specific
> >> + * version of the structure present in PDSM Payload for a given PDSM command.
> >> + * This provides backward compatibility in case the PDSM Payload structure
> >> + * evolves and different structures are supported by 'papr_scm' and 'libndctl'.
> >> + *
> >> + * When sending a PDSM Payload to 'papr_scm', 'libndctl' should send the version
> >> + * of the payload struct it supports via 'payload_version' field. The 'papr_scm'
> >> + * module when servicing the PDSM envelope checks the 'payload_version' and then
> >> + * uses 'payload struct version' == MIN('payload_version field',
> >> + * 'max payload-struct-version supported by papr_scm') to service the PDSM.
> >> + * After servicing the PDSM, 'papr_scm' put the negotiated version of payload
> >> + * struct in returned 'payload_version' field.
> >> + *
> >> + * Libndctl on receiving the envelope back from papr_scm again checks the
> >> + * 'payload_version' field and based on it use the appropriate version dsm
> >> + * struct to parse the results.
> >> + *
> >> + * Backward Compatibility:
> >> + *
> >> + * Above scheme of exchanging different versioned PDSM struct between libndctl
> >> + * and papr_scm should provide backward compatibility until following two
> >> + * assumptions/conditions when defining new PDSM structs hold:
> >> + *
> >> + * Let T(X) = { set of attributes in PDSM struct 'T' versioned X }
> >> + *
> >> + * 1. T(X) is a proper subset of T(Y) if Y > X.
> >> + *    i.e Each new version of PDSM struct should retain existing struct
> >> + *    attributes from previous version
> >> + *
> >> + * 2. If an entity (libndctl or papr_scm) supports a PDSM struct T(X) then
> >> + *    it should also support T(1), T(2)...T(X - 1).
> >> + *    i.e When adding support for new version of a PDSM struct, libndctl
> >> + *    and papr_scm should retain support of the existing PDSM struct
> >> + *    version they support.
> >
> > Please see this thread for an example why versions are a bad idea in UAPIs:
> >
> > https://lkml.org/lkml/2020/3/26/213
> >
> 
> > While the use of version is different in that thread the fundamental issues are
> > the same.  You end up with some weird matrix of supported features and
> > structure definitions.  For example, you are opening up the possibility of
> > changing structures with a different version for no good reason.
> 
> Not really sure I understand the statement correctly "you are opening up
> the possibility of changing structures with a different version for no
> good reason."

What I mean is:

struct v1 {
	u32 x;
	u32 y;
};

struct v2 {
	u32 y;
	u32 x;
};

x and y are the same data but you have now redefined the order of the struct.
You don't need that flexibility/complexity.

Generally I think you are defining:

struct v1 {
	u32 x;
	u32 y;
};

struct v2 {
	u32 x;
	u32 y;
	u32 z;
	u32 a;
};

Which becomes 2 structures...  There is no need.

The easiest thing to do is:

struct user_data {
	u32 x;
	u32 y;
};

And later you modify user_data to:

struct user_data {
	u32 x;
	u32 y;
	u32 z;
	u32 a;
};

libndctl always passes sizeof(struct user_data) to the call. [Do ensure
structures are 64bit aligned for this to work.]

The kernel sees the size and returns the amount of data up to that size.

Therefore, older kernels automatically fill in x and y,  newer kernels fill in
z/a if the buffer was big enough.  libndctl only uses the fields it knows about.

It is _much_ easier this way.  Almost nothing needs to get changed as versions
roll forward.  The only big issue is if libndctl _needs_ z then it has to check
if z is returned.

In that case add a cap_mask with bit fields which the kernel can fill in for
which fields are valid.

struct user_data {
	u64 cap_mask;  /* where bits define extra future capabilities */
	u32 x;
	u32 y;
};

IFF you need to add data within fields which are reserved you can use
capability flags to indicate which fields are requested and which are returned
by the kernel.

But I _think_ for what you want libndctl must survive if z/a are not available
right?  So just adding to the structure should be fine.

> We want to return more data in the struct in future iterations. So
> 'changing structure with different version' is something we are
> expecting. 
> 
> With the backward compatibility constraints 1 & 2 above, it will ensure
> that support matrix looks like a lower traingular matrix with each
> successive version supporting previous version attributes. So supporting
> future versions is relatively simplified.

But you end up with weird switch/if's etc to deal with the multiple structures.

With the size method the kernel simply returns the same size data as the user
requested and everything is done.  No logic required at all.  Literally it can
just copy the data it has (truncating if necessary).

> 
> >
> > Also having the user query with version Z and get back version X (older) is
> > odd.  Generally if the kernel does not know about a feature (ie version Z of
> > the structure) it should return -EINVAL and let the user figure out what to do.
> > The user may just give up or they could try a different query.
> >
> Considering the flow of ndctl/libndctl this is needed. libndctl will
> usually issues only one CMD_CALL ioctl to kernel and if that fails then
> an error is reported and ndctl will exit loosing state.
> 
> Adding mechanism in libndctl to reissue CMD_CALL ioctl to fetch a
> appropriate version of pdsm struct is going to be considerably more
> work.
> 
> This version fall-back mechanism, ensures that libndctl will receive
> usable data without having to reissue a more CMD_CALL ioctls.

Define usable?

What happens if libndctl does not get 'z' in my example above?  What does it
do?  If I understand correctly it does not _need_ z.  So why have a check on
the version from the kernel?

What if we change to:

struct v3 {
	u32 x;
	u32 y;
	u32 z;
	u32 a;
	u32 b;
	u32 c;
};

Now it has to

	if(version 2)
		z/a valid do something()

	if(version 3)
		b/c valid do something else()

if z, a, b, c are all 0 does it matter?

If not, the logic above disappears.

If so, then you need a cap mask.  Then the kernel can say c and a are valid
(but c is 0) or other flexible stuff like that.

> 
> >> + */
> >> +
> >> +/* PDSM-header + payload expected with ND_CMD_CALL ioctl from libnvdimm */
> >> +struct nd_pdsm_cmd_pkg {
> >> +	struct nd_cmd_pkg hdr;	/* Package header containing sub-cmd */
> >> +	__s32 cmd_status;	/* Out: Sub-cmd status returned back */
> >> +	__u16 reserved[5];	/* Ignored and to be used in future */
> >
> > How do you know when reserved is used for something else in the future?  Is
> > reserved guaranteed (and checked by the code) to be 0?
> 
> For current set of pdsm requests ignore these reserved fields. However a
> future pdsm request can leverage these reserved fields. So papr_scm
> just bind the usage of these fields with the value of
> 'nd_cmd_pkg.nd_command' that indicates the pdsm request.
> 
> That being said checking if the reserved fields are set to 0 will be a
> good measure. Will add this check in next iteration.

Exactly, if you don't check them now you will end up with an older libndctl
which passes in garbage and breaks future users...  Basically rendering the
reserved fields useless.

> 
> >
> >> +	__u16 payload_version;	/* In/Out: version of the payload */
> >
> > Why is payload_version after reserved?
> Want to place the payload version field just before the payload data so
> that it can be accessed with simple pointer arithmetic.

I did not see that in the patch.  I thought you were using
nd_pdsm_cmd_pkg->payload_version?

> 
> >
> >> +	__u8 payload[];		/* In/Out: Sub-cmd data buffer */
> >> +} __packed;
> >> +
> >> +/*
> >> + * Methods to be embedded in ND_CMD_CALL request. These are sent to the kernel
> >> + * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
> >> + */
> >> +enum papr_pdsm {
> >> +	PAPR_PDSM_MIN = 0x0,
> >> +	PAPR_PDSM_MAX,
> >> +};
> >> +
> >> +/* Convert a libnvdimm nd_cmd_pkg to pdsm specific pkg */
> >> +static inline struct nd_pdsm_cmd_pkg *nd_to_pdsm_cmd_pkg(struct nd_cmd_pkg *cmd)
> >> +{
> >> +	return (struct nd_pdsm_cmd_pkg *) cmd;
> >> +}
> >> +
> >> +/* Return the payload pointer for a given pcmd */
> >> +static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
> >> +{
> >> +	if (pcmd->hdr.nd_size_in == 0 && pcmd->hdr.nd_size_out == 0)
> >> +		return NULL;
> >> +	else
> >> +		return (void *)(pcmd->payload);
> >> +}
> >> +
> >> +#endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
> >> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> >> index 149431594839..5e2237e7ec08 100644
> >> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> >> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> >> @@ -15,13 +15,15 @@
> >>  #include <linux/seq_buf.h>
> >>  
> >>  #include <asm/plpar_wrappers.h>
> >> +#include <asm/papr_pdsm.h>
> >>  
> >>  #define BIND_ANY_ADDR (~0ul)
> >>  
> >>  #define PAPR_SCM_DIMM_CMD_MASK \
> >>  	((1ul << ND_CMD_GET_CONFIG_SIZE) | \
> >>  	 (1ul << ND_CMD_GET_CONFIG_DATA) | \
> >> -	 (1ul << ND_CMD_SET_CONFIG_DATA))
> >> +	 (1ul << ND_CMD_SET_CONFIG_DATA) | \
> >> +	 (1ul << ND_CMD_CALL))
> >>  
> >>  /* DIMM health bitmap bitmap indicators */
> >>  /* SCM device is unable to persist memory contents */
> >> @@ -350,16 +352,97 @@ static int papr_scm_meta_set(struct papr_scm_priv *p,
> >>  	return 0;
> >>  }
> >>  
> >> +/*
> >> + * Validate the inputs args to dimm-control function and return '0' if valid.
> >> + * This also does initial sanity validation to ND_CMD_CALL sub-command packages.
> >> + */
> >> +static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
> >> +		       unsigned int buf_len)
> >> +{
> >> +	unsigned long cmd_mask = PAPR_SCM_DIMM_CMD_MASK;
> >> +	struct nd_pdsm_cmd_pkg *pkg = nd_to_pdsm_cmd_pkg(buf);
> >> +	struct papr_scm_priv *p;
> >> +
> >> +	/* Only dimm-specific calls are supported atm */
> >> +	if (!nvdimm)
> >> +		return -EINVAL;
> >> +
> >> +	/* get the provider date from struct nvdimm */
> >
> > s/date/data
> Thanks for point this out. Will fix this in next iteration.
> 
> >
> >> +	p = nvdimm_provider_data(nvdimm);
> >> +
> >> +	if (!test_bit(cmd, &cmd_mask)) {
> >> +		dev_dbg(&p->pdev->dev, "Unsupported cmd=%u\n", cmd);
> >> +		return -EINVAL;
> >> +	} else if (cmd == ND_CMD_CALL) {
> >> +
> >> +		/* Verify the envelope package */
> >> +		if (!buf || buf_len < sizeof(struct nd_pdsm_cmd_pkg)) {
> >> +			dev_dbg(&p->pdev->dev, "Invalid pkg size=%u\n",
> >> +				buf_len);
> >> +			return -EINVAL;
> >> +		}
> >> +
> >> +		/* Verify that the PDSM family is valid */
> >> +		if (pkg->hdr.nd_family != NVDIMM_FAMILY_PAPR) {
> >> +			dev_dbg(&p->pdev->dev, "Invalid pkg family=0x%llx\n",
> >> +				pkg->hdr.nd_family);
> >> +			return -EINVAL;
> >> +
> >> +		}
> >> +
> >> +		/* We except a payload with all PDSM commands */
> >> +		if (pdsm_cmd_to_payload(pkg) == NULL) {
> >> +			dev_dbg(&p->pdev->dev,
> >> +				"Empty payload for sub-command=0x%llx\n",
> >> +				pkg->hdr.nd_command);
> >> +			return -EINVAL;
> >> +		}
> >> +	}
> >> +
> >> +	/* Command looks valid */
> >
> > I assume the first command to be implemented also checks the { nd_command,
> > payload_version, payload length } for correctness?
> Yes the pdsm service functions do check the payload_version and
> payload_length. Please see the papr_pdsm_health() that services the
> PAPR_PDSM_HEALTH pdsm in Patch-5
> 

cool.

> >
> >> +	return 0;
> >> +}
> >> +
> >> +static int papr_scm_service_pdsm(struct papr_scm_priv *p,
> >> +				struct nd_pdsm_cmd_pkg *call_pkg)
> >> +{
> >> +	/* unknown subcommands return error in packages */
> >> +	if (call_pkg->hdr.nd_command <= PAPR_PDSM_MIN ||
> >> +	    call_pkg->hdr.nd_command >= PAPR_PDSM_MAX) {
> >> +		dev_dbg(&p->pdev->dev, "Invalid PDSM request 0x%llx\n",
> >> +			call_pkg->hdr.nd_command);
> >> +		call_pkg->cmd_status = -EINVAL;
> >> +		return 0;
> >> +	}
> >> +
> >> +	/* Depending on the DSM command call appropriate service routine */
> >> +	switch (call_pkg->hdr.nd_command) {
> >> +	default:
> >> +		dev_dbg(&p->pdev->dev, "Unsupported PDSM request 0x%llx\n",
> >> +			call_pkg->hdr.nd_command);
> >> +		call_pkg->cmd_status = -ENOENT;
> >> +		return 0;
> >> +	}
> >> +}
> >> +
> >>  static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
> >>  			  struct nvdimm *nvdimm, unsigned int cmd, void *buf,
> >>  			  unsigned int buf_len, int *cmd_rc)
> >>  {
> >>  	struct nd_cmd_get_config_size *get_size_hdr;
> >>  	struct papr_scm_priv *p;
> >> +	struct nd_pdsm_cmd_pkg *call_pkg = NULL;
> >> +	int rc;
> >>  
> >> -	/* Only dimm-specific calls are supported atm */
> >> -	if (!nvdimm)
> >> -		return -EINVAL;
> >> +	/* Use a local variable in case cmd_rc pointer is NULL */
> >> +	if (cmd_rc == NULL)
> >> +		cmd_rc = &rc;
> >
> > Why is this needed?  AFAICT The caller of papr_scm_ndctl does not specify null
> > and you did not change it.
> This pointer is coming from outside the papr_scm code hence need to be
> defensive here. Also as per[1] cmd_rc is "translation of firmware status"
> and not every caller would need it hence making this pointer optional.
> 
> This is evident in acpi_nfit_blk_get_flags() where the 'nd_desc->ndctl'
> is called with 'cmd_rc == NULL'.
> 
> [1] https://lore.kernel.org/linux-nvdimm/CAPcyv4hE_FG0YZXJVA1G=CBq8b9e0K54jxk5Sq5UKU-dnWT2Kg@mail.gmail.com/

Ah... Ok.  So this is a bug fix which needs to happen regardless of the status
of this patch...

> 
> >
> >> +
> >> +	*cmd_rc = is_cmd_valid(nvdimm, cmd, buf, buf_len);
> >> +	if (*cmd_rc) {
> >> +		pr_debug("Invalid cmd=0x%x. Err=%d\n", cmd, *cmd_rc);
> >> +		return *cmd_rc;
> >> +	}
> >>  
> >>  	p = nvdimm_provider_data(nvdimm);
> >>  
> >> @@ -381,13 +464,19 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
> >>  		*cmd_rc = papr_scm_meta_set(p, buf);

... Because this will break here. even without this new code...  right?

Lets get this fix in as a prelim-patch.

> >>  		break;
> >>  
> >> +	case ND_CMD_CALL:
> >> +		call_pkg = nd_to_pdsm_cmd_pkg(buf);
> >> +		*cmd_rc = papr_scm_service_pdsm(p, call_pkg);
> >> +		break;
> >> +
> >>  	default:
> >> -		return -EINVAL;
> >> +		dev_dbg(&p->pdev->dev, "Unknown command = %d\n", cmd);
> >> +		*cmd_rc = -EINVAL;
> >
> > Is this change related?  If there is a bug where there is a caller of
> > papr_scm_ndctl() with cmd_rc == NULL this should be a separate patch to fix
> > that issue.
> This simplifies a bit debugging of errors reported in
> papr_scm_ndctl() as it ensures that subsequest dev_dbg "Returned with
> cmd_rc" is always logged.
> 
> I think, this is a too small change to be carved out as an independent
> patch. Also this doesnt change the behaviour of the code except logging
> some more error info.
> 
> However, If you feel too strongly about it I will spin a separate patch
> in this patch series for this.

This can go in as part of a 'protect against cmd_rc == NULL' preliminary patch.

I flagged this because at first I could not figure out what this had to do with
the ND_CMD_CALL...

For reviewers you want to make your patches concise to what you are
fixing/adding...

Also, based on acpi_nfit_blk_get_flags() using cmd_rc == NULL it looks like we
have a bug which needs to get fixed regardless of the this patch.  And if that
bug exists in earlier kernels you will need a separate patch to backport as a
fix.

So lets get that in first and separate...  :-D

Ira

> 
> >
> > Ira
> >
> >>  	}
> >>  
> >>  	dev_dbg(&p->pdev->dev, "returned with cmd_rc = %d\n", *cmd_rc);
> >>  
> >> -	return 0;
> >> +	return *cmd_rc;
> >>  }
> >>  
> >>  static ssize_t flags_show(struct device *dev,
> >> diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
> >> index de5d90212409..0e09dc5cec19 100644
> >> --- a/include/uapi/linux/ndctl.h
> >> +++ b/include/uapi/linux/ndctl.h
> >> @@ -244,6 +244,7 @@ struct nd_cmd_pkg {
> >>  #define NVDIMM_FAMILY_HPE2 2
> >>  #define NVDIMM_FAMILY_MSFT 3
> >>  #define NVDIMM_FAMILY_HYPERV 4
> >> +#define NVDIMM_FAMILY_PAPR 5
> >>  
> >>  #define ND_IOCTL_CALL			_IOWR(ND_IOCTL, ND_CMD_CALL,\
> >>  					struct nd_cmd_pkg)
> >> -- 
> >> 2.26.2
> >> 
> 
> -- 
> Cheers
> ~ Vaibhav

^ permalink raw reply

* Re: Re: [RESEND PATCH v5 5/5] Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ'
From: Scott Branden @ 2020-06-03 21:32 UTC (permalink / raw)
  To: Bhupesh Sharma
  Cc: Mark Rutland, x86, Linux Doc Mailing List, Catalin Marinas,
	Ard Biesheuvel, kexec mailing list, Linux Kernel Mailing List,
	linuxppc-dev, Kazuhito Hagio, James Morse, Dave Anderson,
	Bhupesh SHARMA, Will Deacon, linux-arm-kernel, Steve Capper
In-Reply-To: <CACi5LpPXdcJ7AmFWiSyM8rG_+7C=wTqiP0oCa9QAPe0Y0_wH=Q@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1113 bytes --]

Hi Bhupesh,

On Wed, 3 Jun 2020 at 13:39, Bhupesh Sharma <bhsharma@redhat.com> wrote:

> Hello Scott,
>
> On Thu, Jun 4, 2020 at 12:17 AM Scott Branden
> <scott.branden@broadcom.com> wrote:
> >
> > Hi Bhupesh,
> >
> > Would be great to get this patch series upstreamed?
> >
> > On 2019-12-25 10:49 a.m., Bhupesh Sharma wrote:
> > > Hi James,
> > >
> > > On 12/12/2019 04:02 PM, James Morse wrote:
> > >> Hi Bhupesh,
> > >
> > > I am sorry this review mail skipped my attention due to holidays and
> > > focus on other urgent issues.
> > >
>
<SNIP>

> > >
> > > Ok, got it. Fixed this in v6, which should be on its way shortly.
> > I can't seem to find v6?
>
> Oops. I remember Cc'ing you to the v6 patchset (may be my email client
> messed up), anyways here is the v6 patchset for your reference:
> <http://lists.infradead.org/pipermail/kexec/2020-May/025095.html>
>
> Do share your review/test comments on the same.
>
I found the cover letter but didn't get the patches.  Thanks for sending
link.
v5 worked fine for us. Will get someone to try out v6 and let you know.

>
> Thanks,
> Bhupesh
>
> Regards,
 Scott

[-- Attachment #2: Type: text/html, Size: 2119 bytes --]

^ permalink raw reply

* [PATCH] net: ethernet: freescale: remove unneeded include for ucc_geth
From: Valentin Longchamp @ 2020-06-03 21:28 UTC (permalink / raw)
  To: netdev; +Cc: kuba, Valentin Longchamp, linuxppc-dev, leoyang.li

net/sch_generic.h does not need to be included, remove it.

Signed-off-by: Valentin Longchamp <valentin@longchamp.me>
---
 drivers/net/ethernet/freescale/ucc_geth.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c
index 552e7554a9f8..db791f60b884 100644
--- a/drivers/net/ethernet/freescale/ucc_geth.c
+++ b/drivers/net/ethernet/freescale/ucc_geth.c
@@ -42,7 +42,6 @@
 #include <soc/fsl/qe/ucc.h>
 #include <soc/fsl/qe/ucc_fast.h>
 #include <asm/machdep.h>
-#include <net/sch_generic.h>
 
 #include "ucc_geth.h"
 
-- 
2.25.1


^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox