LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 1/2] kbuild: introduce ccflags-remove-y and asflags-remove-y
From: Masahiro Yamada @ 2020-07-07  9:21 UTC (permalink / raw)
  To: linux-kbuild
  Cc: Rich Felker, linux-doc, Brendan Higgins, Paul Mackerras,
	linux-kselftest, live-patching, Miroslav Benes, Ingo Molnar,
	Joe Lawrence, Anders Roxell, Herbert Xu, Yoshinori Sato,
	Jonathan Corbet, Masahiro Yamada, linux-sh, Russell King,
	Ingo Molnar, Sami Tolvanen, Petr Mladek, Jiri Kosina,
	Steven Rostedt, Josh Poimboeuf, linux-arm-kernel, kunit-dev,
	Michal Marek, linux-kernel, Tal Gilboa, linux-crypto,
	linuxppc-dev, David S. Miller

CFLAGS_REMOVE_<file>.o filters out flags when compiling a particular
object, but there is no convenient way to do that for every object in
a directory.

Add ccflags-remove-y and asflags-remove-y to make it easily.

Use ccflags-remove-y to clean up some Makefiles.

The add/remove order works as follows:

 [1] KBUILD_CFLAGS specifies compiler flags used globally

 [2] ccflags-y adds compiler flags for all objects in the
     current Makefile

 [3] ccflags-remove-y removes compiler flags for all objects in the
     current Makefile (New feature)

 [4] CFLAGS_<file> adds compiler flags per file.

 [5] CFLAGS_REMOVE_<file> removes compiler flags per file.

Having [3] before [4] allows us to remove flags from most (but not all)
objects in the current Makefile.

For example, kernel/trace/Makefile removes $(CC_FLAGS_FTRACE)
from all objects in the directory, then adds it back to
trace_selftest_dynamic.o and CFLAGS_trace_kprobe_selftest.o

Please note ccflags-remove-y has no effect to the sub-directories.
In contrast, the previous notation got rid of compiler flags also from
all the sub-directories.

  arch/arm/boot/compressed/
  arch/powerpc/xmon/
  arch/sh/
  kernel/trace/

... have no sub-directories.

  lib/

... has several sub-directories.

To keep the behavior, I added ccflags-remove-y to all Makefiles
in subdirectories of lib/, except:

  lib/vdso/Makefile        - Kbuild does not descend into this Makefile
  lib/raid/test/Makefile   - This is not used for the kernel build

I think commit 2464a609ded0 ("ftrace: do not trace library functions")
excluded too much. In later commit, I will try to remove ccflags-remove-y
from sub-directory Makefiles.

Suggested-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
---

Changes in v2:
  - Swap the order of [3] and [4] to keep the current behavior of
    kernel/trace/Makefile.
  - Add ccflags-remove-y to subdir Makefiles of lib/

 Documentation/kbuild/makefiles.rst | 14 ++++++++++++++
 arch/arm/boot/compressed/Makefile  |  6 +-----
 arch/powerpc/xmon/Makefile         |  3 +--
 arch/sh/boot/compressed/Makefile   |  5 +----
 kernel/trace/Makefile              |  4 ++--
 lib/842/Makefile                   |  3 +++
 lib/Makefile                       |  5 +----
 lib/crypto/Makefile                |  2 ++
 lib/dim/Makefile                   |  2 ++
 lib/fonts/Makefile                 |  2 ++
 lib/kunit/Makefile                 |  3 +++
 lib/livepatch/Makefile             |  2 ++
 lib/lz4/Makefile                   |  1 +
 lib/lzo/Makefile                   |  2 ++
 lib/math/Makefile                  |  2 ++
 lib/mpi/Makefile                   |  2 ++
 lib/raid6/Makefile                 |  3 +++
 lib/reed_solomon/Makefile          |  2 ++
 lib/xz/Makefile                    |  3 +++
 lib/zlib_deflate/Makefile          |  2 ++
 lib/zlib_dfltcc/Makefile           |  2 ++
 lib/zlib_inflate/Makefile          |  2 ++
 lib/zstd/Makefile                  |  1 +
 scripts/Makefile.lib               | 14 ++++++++------
 24 files changed, 64 insertions(+), 23 deletions(-)

diff --git a/Documentation/kbuild/makefiles.rst b/Documentation/kbuild/makefiles.rst
index 6515ebc12b6f..14d8e7d23c04 100644
--- a/Documentation/kbuild/makefiles.rst
+++ b/Documentation/kbuild/makefiles.rst
@@ -368,6 +368,14 @@ more details, with real examples.
 
 		subdir-ccflags-y := -Werror
 
+    ccflags-remove-y, asflags-remove-y
+	These flags are used to remove particular flags for the compiler,
+	assembler invocations.
+
+	Example::
+
+		ccflags-remove-$(CONFIG_MCOUNT) += -pg
+
     CFLAGS_$@, AFLAGS_$@
 	CFLAGS_$@ and AFLAGS_$@ only apply to commands in current
 	kbuild makefile.
@@ -375,6 +383,9 @@ more details, with real examples.
 	$(CFLAGS_$@) specifies per-file options for $(CC).  The $@
 	part has a literal value which specifies the file that it is for.
 
+	CFLAGS_$@ has the higher priority than ccflags-remove-y; CFLAGS_$@
+	can re-add compiler flags that were removed by ccflags-remove-y.
+
 	Example::
 
 		# drivers/scsi/Makefile
@@ -387,6 +398,9 @@ more details, with real examples.
 	$(AFLAGS_$@) is a similar feature for source files in assembly
 	languages.
 
+	AFLAGS_$@ has the higher priority than asflags-remove-y; AFLAGS_$@
+	can re-add assembler flags that were removed by asflags-remove-y.
+
 	Example::
 
 		# arch/arm/kernel/Makefile
diff --git a/arch/arm/boot/compressed/Makefile b/arch/arm/boot/compressed/Makefile
index 00602a6fba04..3d5691b23951 100644
--- a/arch/arm/boot/compressed/Makefile
+++ b/arch/arm/boot/compressed/Makefile
@@ -103,13 +103,9 @@ clean-files += piggy_data lib1funcs.S ashldi3.S bswapsdi2.S hyp-stub.S
 
 KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING
 
-ifeq ($(CONFIG_FUNCTION_TRACER),y)
-ORIG_CFLAGS := $(KBUILD_CFLAGS)
-KBUILD_CFLAGS = $(subst -pg, , $(ORIG_CFLAGS))
-endif
-
 ccflags-y := -fpic $(call cc-option,-mno-single-pic-base,) -fno-builtin \
 	     -I$(obj) $(DISABLE_ARM_SSP_PER_TASK_PLUGIN)
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += -pg
 asflags-y := -DZIMAGE
 
 # Supply kernel BSS size to the decompressor via a linker symbol.
diff --git a/arch/powerpc/xmon/Makefile b/arch/powerpc/xmon/Makefile
index 89c76ca35640..eb25d7554ffd 100644
--- a/arch/powerpc/xmon/Makefile
+++ b/arch/powerpc/xmon/Makefile
@@ -7,8 +7,7 @@ UBSAN_SANITIZE := n
 KASAN_SANITIZE := n
 
 # Disable ftrace for the entire directory
-ORIG_CFLAGS := $(KBUILD_CFLAGS)
-KBUILD_CFLAGS = $(subst $(CC_FLAGS_FTRACE),,$(ORIG_CFLAGS))
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
 
 ifdef CONFIG_CC_IS_CLANG
 # clang stores addresses on the stack causing the frame size to blow
diff --git a/arch/sh/boot/compressed/Makefile b/arch/sh/boot/compressed/Makefile
index ad0e2403e56f..589d2d8a573d 100644
--- a/arch/sh/boot/compressed/Makefile
+++ b/arch/sh/boot/compressed/Makefile
@@ -28,10 +28,7 @@ IMAGE_OFFSET	:= $(shell /bin/bash -c 'printf "0x%08x" \
 			$(CONFIG_BOOT_LINK_OFFSET)]')
 endif
 
-ifeq ($(CONFIG_MCOUNT),y)
-ORIG_CFLAGS := $(KBUILD_CFLAGS)
-KBUILD_CFLAGS = $(subst -pg, , $(ORIG_CFLAGS))
-endif
+ccflags-remove-$(CONFIG_MCOUNT) += -pg
 
 LDFLAGS_vmlinux := --oformat $(ld-bfd) -Ttext $(IMAGE_OFFSET) -e startup \
 		   -T $(obj)/../../kernel/vmlinux.lds
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 6575bb0a0434..7492844a8b1b 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -2,9 +2,9 @@
 
 # Do not instrument the tracer itself:
 
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
+
 ifdef CONFIG_FUNCTION_TRACER
-ORIG_CFLAGS := $(KBUILD_CFLAGS)
-KBUILD_CFLAGS = $(subst $(CC_FLAGS_FTRACE),,$(ORIG_CFLAGS))
 
 # Avoid recursion due to instrumentation.
 KCSAN_SANITIZE := n
diff --git a/lib/842/Makefile b/lib/842/Makefile
index 6f7aad269288..b815e824ae37 100644
--- a/lib/842/Makefile
+++ b/lib/842/Makefile
@@ -1,3 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
+
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
+
 obj-$(CONFIG_842_COMPRESS) += 842_compress.o
 obj-$(CONFIG_842_DECOMPRESS) += 842_decompress.o
diff --git a/lib/Makefile b/lib/Makefile
index b1c42c10073b..b2ed4beddd68 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -3,10 +3,7 @@
 # Makefile for some libs needed in the kernel.
 #
 
-ifdef CONFIG_FUNCTION_TRACER
-ORIG_CFLAGS := $(KBUILD_CFLAGS)
-KBUILD_CFLAGS = $(subst $(CC_FLAGS_FTRACE),,$(ORIG_CFLAGS))
-endif
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
 
 # These files are disabled because they produce lots of non-interesting and/or
 # flaky coverage that is not a function of syscall inputs. For example,
diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile
index 3a435629d9ce..b557ef0b07c2 100644
--- a/lib/crypto/Makefile
+++ b/lib/crypto/Makefile
@@ -1,5 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
+
 # chacha is used by the /dev/random driver which is always builtin
 obj-y						+= chacha.o
 obj-$(CONFIG_CRYPTO_LIB_CHACHA_GENERIC)		+= libchacha.o
diff --git a/lib/dim/Makefile b/lib/dim/Makefile
index 1d6858a108cb..97fc3e89d34e 100644
--- a/lib/dim/Makefile
+++ b/lib/dim/Makefile
@@ -2,6 +2,8 @@
 # DIM Dynamic Interrupt Moderation library
 #
 
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
+
 obj-$(CONFIG_DIMLIB) += dim.o
 
 dim-y := dim.o net_dim.o rdma_dim.o
diff --git a/lib/fonts/Makefile b/lib/fonts/Makefile
index ed95070860de..f951750c179e 100644
--- a/lib/fonts/Makefile
+++ b/lib/fonts/Makefile
@@ -1,6 +1,8 @@
 # SPDX-License-Identifier: GPL-2.0
 # Font handling
 
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
+
 font-objs := fonts.o
 
 font-objs-$(CONFIG_FONT_SUN8x16)   += font_sun8x16.o
diff --git a/lib/kunit/Makefile b/lib/kunit/Makefile
index 724b94311ca3..8c847557ab24 100644
--- a/lib/kunit/Makefile
+++ b/lib/kunit/Makefile
@@ -1,3 +1,6 @@
+
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
+
 obj-$(CONFIG_KUNIT) +=			kunit.o
 
 kunit-objs +=				test.o \
diff --git a/lib/livepatch/Makefile b/lib/livepatch/Makefile
index 295b94bff370..9abdf615b088 100644
--- a/lib/livepatch/Makefile
+++ b/lib/livepatch/Makefile
@@ -2,6 +2,8 @@
 #
 # Makefile for livepatch test code.
 
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
+
 obj-$(CONFIG_TEST_LIVEPATCH) += test_klp_atomic_replace.o \
 				test_klp_callbacks_demo.o \
 				test_klp_callbacks_demo2.o \
diff --git a/lib/lz4/Makefile b/lib/lz4/Makefile
index 5b42242afaa2..53da4cab7015 100644
--- a/lib/lz4/Makefile
+++ b/lib/lz4/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 ccflags-y += -O3
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
 
 obj-$(CONFIG_LZ4_COMPRESS) += lz4_compress.o
 obj-$(CONFIG_LZ4HC_COMPRESS) += lz4hc_compress.o
diff --git a/lib/lzo/Makefile b/lib/lzo/Makefile
index 2f58fafbbddd..9565a555275b 100644
--- a/lib/lzo/Makefile
+++ b/lib/lzo/Makefile
@@ -1,4 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
+
 lzo_compress-objs := lzo1x_compress.o
 lzo_decompress-objs := lzo1x_decompress_safe.o
 
diff --git a/lib/math/Makefile b/lib/math/Makefile
index be6909e943bd..49aa50e28185 100644
--- a/lib/math/Makefile
+++ b/lib/math/Makefile
@@ -1,4 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
+
 obj-y += div64.o gcd.o lcm.o int_pow.o int_sqrt.o reciprocal_div.o
 
 obj-$(CONFIG_CORDIC)		+= cordic.o
diff --git a/lib/mpi/Makefile b/lib/mpi/Makefile
index d5874a7f5ff9..df7883521619 100644
--- a/lib/mpi/Makefile
+++ b/lib/mpi/Makefile
@@ -3,6 +3,8 @@
 # MPI multiprecision maths library (from gpg)
 #
 
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
+
 obj-$(CONFIG_MPILIB) = mpi.o
 
 mpi-y = \
diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile
index b4c0df6d706d..3482d6ae3f3b 100644
--- a/lib/raid6/Makefile
+++ b/lib/raid6/Makefile
@@ -1,4 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
+
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
+
 obj-$(CONFIG_RAID6_PQ)	+= raid6_pq.o
 
 raid6_pq-y	+= algos.o recov.o tables.o int1.o int2.o int4.o \
diff --git a/lib/reed_solomon/Makefile b/lib/reed_solomon/Makefile
index 5d4fa68f26cb..a5c9defdac7f 100644
--- a/lib/reed_solomon/Makefile
+++ b/lib/reed_solomon/Makefile
@@ -3,5 +3,7 @@
 # This is a modified version of reed solomon lib,
 #
 
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
+
 obj-$(CONFIG_REED_SOLOMON) += reed_solomon.o
 obj-$(CONFIG_REED_SOLOMON_TEST) += test_rslib.o
diff --git a/lib/xz/Makefile b/lib/xz/Makefile
index fa6af814a8d1..fae9b6c7c389 100644
--- a/lib/xz/Makefile
+++ b/lib/xz/Makefile
@@ -1,4 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
+
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
+
 obj-$(CONFIG_XZ_DEC) += xz_dec.o
 xz_dec-y := xz_dec_syms.o xz_dec_stream.o xz_dec_lzma2.o
 xz_dec-$(CONFIG_XZ_DEC_BCJ) += xz_dec_bcj.o
diff --git a/lib/zlib_deflate/Makefile b/lib/zlib_deflate/Makefile
index 2622e03c0b94..1fcefe73536f 100644
--- a/lib/zlib_deflate/Makefile
+++ b/lib/zlib_deflate/Makefile
@@ -7,6 +7,8 @@
 # decompression code.
 #
 
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
+
 obj-$(CONFIG_ZLIB_DEFLATE) += zlib_deflate.o
 
 zlib_deflate-objs := deflate.o deftree.o deflate_syms.o
diff --git a/lib/zlib_dfltcc/Makefile b/lib/zlib_dfltcc/Makefile
index 8e4d5afbbb10..7a8067f6e772 100644
--- a/lib/zlib_dfltcc/Makefile
+++ b/lib/zlib_dfltcc/Makefile
@@ -6,6 +6,8 @@
 # This is the code for s390 zlib hardware support.
 #
 
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
+
 obj-$(CONFIG_ZLIB_DFLTCC) += zlib_dfltcc.o
 
 zlib_dfltcc-objs := dfltcc.o dfltcc_deflate.o dfltcc_inflate.o dfltcc_syms.o
diff --git a/lib/zlib_inflate/Makefile b/lib/zlib_inflate/Makefile
index 27327d3e9f54..a451e96f9845 100644
--- a/lib/zlib_inflate/Makefile
+++ b/lib/zlib_inflate/Makefile
@@ -14,6 +14,8 @@
 # uncompression can be done without blocking on allocation).
 #
 
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
+
 obj-$(CONFIG_ZLIB_INFLATE) += zlib_inflate.o
 
 zlib_inflate-objs := inffast.o inflate.o infutil.o \
diff --git a/lib/zstd/Makefile b/lib/zstd/Makefile
index f5d778e7e5c7..01be908a2d94 100644
--- a/lib/zstd/Makefile
+++ b/lib/zstd/Makefile
@@ -3,6 +3,7 @@ obj-$(CONFIG_ZSTD_COMPRESS) += zstd_compress.o
 obj-$(CONFIG_ZSTD_DECOMPRESS) += zstd_decompress.o
 
 ccflags-y += -O3
+ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
 
 zstd_compress-y := fse_compress.o huf_compress.o compress.o \
 		   entropy_common.o fse_decompress.o zstd_common.o
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 916b2f7f7098..3629f66646d7 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -111,12 +111,14 @@ basename_flags = -DKBUILD_BASENAME=$(call name-fix,$(basetarget))
 modname_flags  = -DKBUILD_MODNAME=$(call name-fix,$(modname))
 modfile_flags  = -DKBUILD_MODFILE=$(call stringify,$(modfile))
 
-orig_c_flags   = $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS) \
-                 $(ccflags-y) $(CFLAGS_$(target-stem).o)
-_c_flags       = $(filter-out $(CFLAGS_REMOVE_$(target-stem).o), $(orig_c_flags))
-orig_a_flags   = $(KBUILD_CPPFLAGS) $(KBUILD_AFLAGS) \
-                 $(asflags-y) $(AFLAGS_$(target-stem).o)
-_a_flags       = $(filter-out $(AFLAGS_REMOVE_$(target-stem).o), $(orig_a_flags))
+_c_flags       = $(filter-out $(CFLAGS_REMOVE_$(target-stem).o), \
+                     $(filter-out $(ccflags-remove-y), \
+                         $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS) $(ccflags-y)) \
+                     $(CFLAGS_$(target-stem).o))
+_a_flags       = $(filter-out $(AFLAGS_REMOVE_$(target-stem).o), \
+                     $(filter-out $(asflags-remove-y), \
+                         $(KBUILD_CPPFLAGS) $(KBUILD_AFLAGS) $(asflags-y)) \
+                     $(AFLAGS_$(target-stem).o))
 _cpp_flags     = $(KBUILD_CPPFLAGS) $(cppflags-y) $(CPPFLAGS_$(target-stem).lds)
 
 #
-- 
2.25.1


^ permalink raw reply related

* [PATCH 2/2] ASoC: fsl_spdif: Add kctl for configuring TX validity bit
From: Shengjiu Wang @ 2020-07-07  8:54 UTC (permalink / raw)
  To: timur, nicoleotsuka, Xiubo.Lee, festevam, broonie, perex, tiwai,
	alsa-devel
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <1594112066-31297-1-git-send-email-shengjiu.wang@nxp.com>

Add one kctl for configuring TX validity bit from user
space.

The type of this kctl is boolean:
on - Outgoing validity always set
off - Outgoing validity always clear

Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
---
 sound/soc/fsl/fsl_spdif.c | 47 +++++++++++++++++++++++++++++++++++----
 1 file changed, 43 insertions(+), 4 deletions(-)

diff --git a/sound/soc/fsl/fsl_spdif.c b/sound/soc/fsl/fsl_spdif.c
index 576370dc6e70..37053e8f29d0 100644
--- a/sound/soc/fsl/fsl_spdif.c
+++ b/sound/soc/fsl/fsl_spdif.c
@@ -776,8 +776,8 @@ static int fsl_spdif_vbit_info(struct snd_kcontrol *kcontrol,
 }
 
 /* Get valid good bit from interrupt status register */
-static int fsl_spdif_vbit_get(struct snd_kcontrol *kcontrol,
-				struct snd_ctl_elem_value *ucontrol)
+static int fsl_spdif_rx_vbit_get(struct snd_kcontrol *kcontrol,
+				 struct snd_ctl_elem_value *ucontrol)
 {
 	struct snd_soc_dai *cpu_dai = snd_kcontrol_chip(kcontrol);
 	struct fsl_spdif_priv *spdif_priv = snd_soc_dai_get_drvdata(cpu_dai);
@@ -791,6 +791,35 @@ static int fsl_spdif_vbit_get(struct snd_kcontrol *kcontrol,
 	return 0;
 }
 
+static int fsl_spdif_tx_vbit_get(struct snd_kcontrol *kcontrol,
+				 struct snd_ctl_elem_value *ucontrol)
+{
+	struct snd_soc_dai *cpu_dai = snd_kcontrol_chip(kcontrol);
+	struct fsl_spdif_priv *spdif_priv = snd_soc_dai_get_drvdata(cpu_dai);
+	struct regmap *regmap = spdif_priv->regmap;
+	u32 val;
+
+	regmap_read(regmap, REG_SPDIF_SCR, &val);
+	val = (val & SCR_VAL_MASK) >> SCR_VAL_OFFSET;
+	val = 1 - val;
+	ucontrol->value.integer.value[0] = val;
+
+	return 0;
+}
+
+static int fsl_spdif_tx_vbit_put(struct snd_kcontrol *kcontrol,
+				 struct snd_ctl_elem_value *ucontrol)
+{
+	struct snd_soc_dai *cpu_dai = snd_kcontrol_chip(kcontrol);
+	struct fsl_spdif_priv *spdif_priv = snd_soc_dai_get_drvdata(cpu_dai);
+	struct regmap *regmap = spdif_priv->regmap;
+	u32 val = (1 - ucontrol->value.integer.value[0]) << SCR_VAL_OFFSET;
+
+	regmap_update_bits(regmap, REG_SPDIF_SCR, SCR_VAL_MASK, val);
+
+	return 0;
+}
+
 /* DPLL lock information */
 static int fsl_spdif_rxrate_info(struct snd_kcontrol *kcontrol,
 				struct snd_ctl_elem_info *uinfo)
@@ -948,11 +977,21 @@ static struct snd_kcontrol_new fsl_spdif_ctrls[] = {
 	/* Valid bit error controller */
 	{
 		.iface = SNDRV_CTL_ELEM_IFACE_PCM,
-		.name = "IEC958 V-Bit Errors",
+		.name = "IEC958 RX V-Bit Errors",
 		.access = SNDRV_CTL_ELEM_ACCESS_READ |
 			SNDRV_CTL_ELEM_ACCESS_VOLATILE,
 		.info = fsl_spdif_vbit_info,
-		.get = fsl_spdif_vbit_get,
+		.get = fsl_spdif_rx_vbit_get,
+	},
+	{
+		.iface = SNDRV_CTL_ELEM_IFACE_PCM,
+		.name = "IEC958 TX V-Bit",
+		.access = SNDRV_CTL_ELEM_ACCESS_READ |
+			SNDRV_CTL_ELEM_ACCESS_WRITE |
+			SNDRV_CTL_ELEM_ACCESS_VOLATILE,
+		.info = fsl_spdif_vbit_info,
+		.get = fsl_spdif_tx_vbit_get,
+		.put = fsl_spdif_tx_vbit_put,
 	},
 	/* DPLL lock info get controller */
 	{
-- 
2.21.0


^ permalink raw reply related

* [PATCH 0/2] ASoC: fsl_spdif: Clear the validity bit for TX
From: Shengjiu Wang @ 2020-07-07  8:54 UTC (permalink / raw)
  To: timur, nicoleotsuka, Xiubo.Lee, festevam, broonie, perex, tiwai,
	alsa-devel
  Cc: linuxppc-dev, linux-kernel

Clear the validity bit for TX
Add kctl for configuring TX validity bit

Shengjiu Wang (2):
  ASoC: fsl_spdif: Clear the validity bit for TX
  ASoC: fsl_spdif: Add kctl for configuring TX validity bit

 sound/soc/fsl/fsl_spdif.c | 51 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 47 insertions(+), 4 deletions(-)

-- 
2.21.0


^ permalink raw reply

* [PATCH 1/2] ASoC: fsl_spdif: Clear the validity bit for TX
From: Shengjiu Wang @ 2020-07-07  8:54 UTC (permalink / raw)
  To: timur, nicoleotsuka, Xiubo.Lee, festevam, broonie, perex, tiwai,
	alsa-devel
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <1594112066-31297-1-git-send-email-shengjiu.wang@nxp.com>

In IEC958 spec, "The validity bit is logical "0" if the
information in the main data field is reliable, and it
is logical "1" if it is not".

The default value of "ValCtrl" is zero, which means
"Outgoing Validity always set", then all the data is not
reliable, then some spdif sink device will drop the data.

So set "ValCtrl" to 1, that is to clear "Outgoing Validity"
in default.

Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
---
 sound/soc/fsl/fsl_spdif.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/sound/soc/fsl/fsl_spdif.c b/sound/soc/fsl/fsl_spdif.c
index 9fb95c6ee7ba..576370dc6e70 100644
--- a/sound/soc/fsl/fsl_spdif.c
+++ b/sound/soc/fsl/fsl_spdif.c
@@ -985,6 +985,10 @@ static int fsl_spdif_dai_probe(struct snd_soc_dai *dai)
 
 	snd_soc_add_dai_controls(dai, fsl_spdif_ctrls, ARRAY_SIZE(fsl_spdif_ctrls));
 
+	/*Clear the val bit for Tx*/
+	regmap_update_bits(spdif_private->regmap, REG_SPDIF_SCR,
+			   SCR_VAL_MASK, SCR_VAL_CLEAR);
+
 	return 0;
 }
 
-- 
2.21.0


^ permalink raw reply related

* Re: [PATCH] powerpc/numa: Restrict possible nodes based on platform
From: Srikar Dronamraju @ 2020-07-07  8:42 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Nathan Lynch, linuxppc-dev, Bharata B Rao
In-Reply-To: <87lfjv5352.fsf@mpe.ellerman.id.au>

* Michael Ellerman <mpe@ellerman.id.au> [2020-07-07 15:02:17]:

> Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> > As per PAPR, there are 2 device tree property
> > ibm,max-associativity-domains (which defines the maximum number of
> > domains that the firmware i.e PowerVM can support) and
> > ibm,current-associativity-domains (which defines the maximum number of
> > domains that the platform can support). Value of
> > ibm,max-associativity-domains property is always greater than or equal
> > to ibm,current-associativity-domains property.
> 
> Where is it documented?
> 
> It's definitely not in LoPAPR.
> 

https://openpowerfoundation.org/wp-content/uploads/2020/07/LoPAR-20200611.pdf
Page number 833.

which says 
ibm,current-associativity-domains”
	property name to define the current number of associativity
	domains for this platform.
	prop-encoded-array: An associativity list such that all values are
	the number of unique values that the current platform supports
	in that location. The associativity list consisting of a number of
	entries integer (N) encoded as with encode-int followed by N
	integers encoded as with encode-int each representing current
	number of unique associativity domains the platform supports at
	that level.

> > Powerpc currently uses ibm,max-associativity-domains  property while
> > setting the possible number of nodes. This is currently set at 32.
> > However the possible number of nodes for a platform may be significantly
> > less. Hence set the possible number of nodes based on
> > ibm,current-associativity-domains property.
> >
> > $ lsprop /proc/device-tree/rtas/ibm,*associ*-domains
> > /proc/device-tree/rtas/ibm,current-associativity-domains
> > 		 00000005 00000001 00000002 00000002 00000002 00000010
> > /proc/device-tree/rtas/ibm,max-associativity-domains
> > 		 00000005 00000001 00000008 00000020 00000020 00000100
> >
> > $ cat /sys/devices/system/node/possible ##Before patch
> > 0-31
> >
> > $ cat /sys/devices/system/node/possible ##After patch
> > 0-1
> >
> > Note the maximum nodes this platform can support is only 2 but the
> > possible nodes is set to 32.
> 
> But what about LPM to a system with more nodes?
> 

I have very less info on LPM, so I checked with Nathan Lynch before posting
and as per Nathan in the current design of LPM, Linux wouldn't use the new
node numbers. 

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply

* Re: FSL P5020/P5040: DPAA Ethernet issue with the latest Git kernel
From: Christian Zigotzky @ 2020-07-07  7:52 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Darren Stevens, mad skateman, Madalin Bucur,
	netdev@vger.kernel.org, Sascha Hauer, R.T.Dickinson, linuxppc-dev,
	Christian Zigotzky
In-Reply-To: <20200625102223.GA3646@oc3871087118.ibm.com>

On 25 June 2020 at 12:22 pm, Alexander Gordeev wrote:
> On Thu, Jun 25, 2020 at 01:01:52AM +0200, Christian Zigotzky wrote:
> [...]
>> I compiled a test kernel with the option "CONFIG_TEST_BITMAP=y"
>> yesterday. After that Skateman and I booted it and looked for the
>> bitmap tests with "dmesg | grep -i bitmap".
>>
>> Results:
>>
>> FSL P5020:
>>
>> [    0.297756] test_bitmap: loaded.
>> [    0.298113] test_bitmap: parselist: 14: input is '0-2047:128/256'
>> OK, Time: 562
>> [    0.298142] test_bitmap: parselist_user: 14: input is
>> '0-2047:128/256' OK, Time: 761
>> [    0.301553] test_bitmap: all 1663 tests passed
>>
>> FSL P5040:
>>
>> [    0.296563] test_bitmap: loaded.
>> [    0.296894] test_bitmap: parselist: 14: input is '0-2047:128/256'
>> OK, Time: 540
>> [    0.296920] test_bitmap: parselist_user: 14: input is
>> '0-2047:128/256' OK, Time: 680
>> [    0.299994] test_bitmap: all 1663 tests passed
> Thanks for the test! So it works as expected.
>
> I would suggest to compare what is going on on the device probing
> with and without the bisected commit.
>
> There seems to be MAC and PHY mode initialization issue that might
> resulted from the bitmap format change.
>
> I put Madalin and Sascha on CC as they have done some works on
> this part recently.
>
> Thanks!
>

Hi All,

The issue still exists [1] so we still need the dpaa patch [2]. Could 
you please check the problematic commit [3]?

Thanks,
Christian

[1] https://forum.hyperion-entertainment.com/viewtopic.php?p=50885#p50885
[2] https://forum.hyperion-entertainment.com/viewtopic.php?p=50982#p50982
[3] https://forum.hyperion-entertainment.com/viewtopic.php?p=50980#p50980

^ permalink raw reply

* [Bug 208197] OF: /pci@f2000000/mac-io@17/gpio@50/...: could not find phandle
From: bugzilla-daemon @ 2020-07-07  7:23 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <bug-208197-206035@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=208197

Michael Ellerman (michael@ellerman.id.au) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
                 CC|                            |michael@ellerman.id.au

--- Comment #3 from Michael Ellerman (michael@ellerman.id.au) ---
Try this?

diff --git a/drivers/of/base.c b/drivers/of/base.c
index ae03b1218b06..b2961bec57d9 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -1299,6 +1299,7 @@ int of_phandle_iterator_next(struct of_phandle_iterator
*it)
                        if (!it->node) {
                                pr_err("%pOF: could not find phandle\n",
                                       it->parent);
+                               WARN_ON(1);
                                goto err;
                        }

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply related

* Re: [PATCH v2 07/10] powerpc/perf: support BHRB disable bit and new filtering modes
From: Michael Neuling @ 2020-07-07  7:17 UTC (permalink / raw)
  To: Athira Rajeev, mpe; +Cc: Vaidyanathan Srinivasan, ego, maddy, linuxppc-dev
In-Reply-To: <1593595262-1433-8-git-send-email-atrajeev@linux.vnet.ibm.com>

On Wed, 2020-07-01 at 05:20 -0400, Athira Rajeev wrote:
> PowerISA v3.1 has few updates for the Branch History Rolling Buffer(BHRB).
> First is the addition of BHRB disable bit and second new filtering
> modes for BHRB.
> 
> BHRB disable is controlled via Monitor Mode Control Register A (MMCRA)
> bit 26, namely "BHRB Recording Disable (BHRBRD)". This field controls
> whether BHRB entries are written when BHRB recording is enabled by other
> bits. Patch implements support for this BHRB disable bit.

Probably good to note here that this is backwards compatible. So if you have a
kernel that doesn't know about this bit, it'll clear it and hence you still get
BHRB. 

You should also note why you'd want to do disable this (ie. the core will run
faster).

> Secondly PowerISA v3.1 introduce filtering support for
> PERF_SAMPLE_BRANCH_IND_CALL/COND. The patch adds BHRB filter support
> for "ind_call" and "cond" in power10_bhrb_filter_map().
> 
> 'commit bb19af816025 ("powerpc/perf: Prevent kernel address leak to userspace
> via BHRB buffer")'
> added a check in bhrb_read() to filter the kernel address from BHRB buffer.
> Patch here modified
> it to avoid that check for PowerISA v3.1 based processors, since PowerISA v3.1
> allows
> only MSR[PR]=1 address to be written to BHRB buffer.
> 
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> ---
>  arch/powerpc/perf/core-book3s.c       | 27 +++++++++++++++++++++------
>  arch/powerpc/perf/isa207-common.c     | 13 +++++++++++++
>  arch/powerpc/perf/power10-pmu.c       | 13 +++++++++++--
>  arch/powerpc/platforms/powernv/idle.c | 14 ++++++++++++++

This touches the idle code so we should get those guys on CC (adding Vaidy and
Ego).

>  4 files changed, 59 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index fad5159..9709606 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -466,9 +466,13 @@ static void power_pmu_bhrb_read(struct perf_event *event,
> struct cpu_hw_events *
>  			 * addresses at this point. Check the privileges before
>  			 * exporting it to userspace (avoid exposure of regions
>  			 * where we could have speculative execution)
> +			 * Incase of ISA 310, BHRB will capture only user-space
> +			 * address,hence include a check before filtering code
>  			 */
> -			if (is_kernel_addr(addr) && perf_allow_kernel(&event-
> >attr) != 0)
> -				continue;
> +			if (!(ppmu->flags & PPMU_ARCH_310S))
> +				if (is_kernel_addr(addr) &&
> +				perf_allow_kernel(&event->attr) != 0)
> +					continue;
>  
>  			/* Branches are read most recent first (ie. mfbhrb 0 is
>  			 * the most recent branch).
> @@ -1212,7 +1216,7 @@ static void write_mmcr0(struct cpu_hw_events *cpuhw,
> unsigned long mmcr0)
>  static void power_pmu_disable(struct pmu *pmu)
>  {
>  	struct cpu_hw_events *cpuhw;
> -	unsigned long flags, mmcr0, val;
> +	unsigned long flags, mmcr0, val, mmcra = 0;
>  
>  	if (!ppmu)
>  		return;
> @@ -1245,12 +1249,23 @@ static void power_pmu_disable(struct pmu *pmu)
>  		mb();
>  		isync();
>  
> +		val = mmcra = cpuhw->mmcr[2];
> +
>  		/*
>  		 * Disable instruction sampling if it was enabled
>  		 */
> -		if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE) {
> -			mtspr(SPRN_MMCRA,
> -			      cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
> +		if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE)
> +			mmcra = cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE;
> +
> +		/* Disable BHRB via mmcra [:26] for p10 if needed */
> +		if (!(cpuhw->mmcr[2] & MMCRA_BHRB_DISABLE))
> +			mmcra |= MMCRA_BHRB_DISABLE;
> +
> +		/* Write SPRN_MMCRA if mmcra has either disabled
> +		 * instruction sampling or BHRB
> +		 */
> +		if (val != mmcra) {
> +			mtspr(SPRN_MMCRA, mmcra);
>  			mb();
>  			isync();
>  		}
> diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-
> common.c
> index 7d4839e..463d925 100644
> --- a/arch/powerpc/perf/isa207-common.c
> +++ b/arch/powerpc/perf/isa207-common.c
> @@ -404,6 +404,12 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
>  
>  	mmcra = mmcr1 = mmcr2 = mmcr3 = 0;
>  
> +	/* Disable bhrb unless explicitly requested
> +	 * by setting MMCRA [:26] bit.
> +	 */
> +	if (cpu_has_feature(CPU_FTR_ARCH_31))
> +		mmcra |= MMCRA_BHRB_DISABLE;
> +
>  	/* Second pass: assign PMCs, set all MMCR1 fields */
>  	for (i = 0; i < n_ev; ++i) {
>  		pmc     = (event[i] >> EVENT_PMC_SHIFT) & EVENT_PMC_MASK;
> @@ -475,10 +481,17 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
>  		}
>  
>  		if (event[i] & EVENT_WANTS_BHRB) {
> +			/* set MMCRA[:26] to 0 for Power10 to enable BHRB */
> +			if (cpu_has_feature(CPU_FTR_ARCH_31))
> +				mmcra &= ~MMCRA_BHRB_DISABLE;
>  			val = (event[i] >> EVENT_IFM_SHIFT) & EVENT_IFM_MASK;
>  			mmcra |= val << MMCRA_IFM_SHIFT;
>  		}
>  
> +		/* set MMCRA[:26] to 0 if there is user request for BHRB */
> +		if (cpu_has_feature(CPU_FTR_ARCH_31) &&
> has_branch_stack(pevents[i]))
> +			mmcra &= ~MMCRA_BHRB_DISABLE;
> +
>  		if (pevents[i]->attr.exclude_user)
>  			mmcr2 |= MMCR2_FCP(pmc);
>  
> diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
> index d64d69d..07fb919 100644
> --- a/arch/powerpc/perf/power10-pmu.c
> +++ b/arch/powerpc/perf/power10-pmu.c
> @@ -82,6 +82,8 @@
>  
>  /* MMCRA IFM bits - POWER10 */
>  #define POWER10_MMCRA_IFM1		0x0000000040000000UL
> +#define POWER10_MMCRA_IFM2		0x0000000080000000UL
> +#define POWER10_MMCRA_IFM3		0x00000000C0000000UL
>  #define POWER10_MMCRA_BHRB_MASK		0x00000000C0000000UL
>  
>  /* Table of alternatives, sorted by column 0 */
> @@ -233,8 +235,15 @@ static u64 power10_bhrb_filter_map(u64
> branch_sample_type)
>  	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
>  		return -1;
>  
> -	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
> -		return -1;
> +	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL) {
> +		pmu_bhrb_filter |= POWER10_MMCRA_IFM2;
> +		return pmu_bhrb_filter;
> +	}
> +
> +	if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
> +		pmu_bhrb_filter |= POWER10_MMCRA_IFM3;
> +		return pmu_bhrb_filter;
> +	}
>  
>  	if (branch_sample_type & PERF_SAMPLE_BRANCH_CALL)
>  		return -1;
> diff --git a/arch/powerpc/platforms/powernv/idle.c
> b/arch/powerpc/platforms/powernv/idle.c
> index 2dd4673..7db99c7 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -611,6 +611,7 @@ static unsigned long power9_idle_stop(unsigned long psscr,
> bool mmu_on)
>  	unsigned long srr1;
>  	unsigned long pls;
>  	unsigned long mmcr0 = 0;
> +	unsigned long mmcra_bhrb = 0;
>  	struct p9_sprs sprs = {}; /* avoid false used-uninitialised */
>  	bool sprs_saved = false;
>  
> @@ -657,6 +658,15 @@ static unsigned long power9_idle_stop(unsigned long
> psscr, bool mmu_on)
>  		  */
>  		mmcr0		= mfspr(SPRN_MMCR0);
>  	}
> +
> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +		/* POWER10 uses MMCRA[:26] as BHRB disable bit
> +		 * to disable BHRB logic when not used. Hence Save and
> +		 * restore MMCRA after a state-loss idle.
> +		 */
> +		mmcra_bhrb		= mfspr(SPRN_MMCRA);


Why is the bhrb bit of mmcra special here?

> +	}
> +
>  	if ((psscr & PSSCR_RL_MASK) >= pnv_first_spr_loss_level) {
>  		sprs.lpcr	= mfspr(SPRN_LPCR);
>  		sprs.hfscr	= mfspr(SPRN_HFSCR);
> @@ -721,6 +731,10 @@ static unsigned long power9_idle_stop(unsigned long
> psscr, bool mmu_on)
>  			mtspr(SPRN_MMCR0, mmcr0);
>  		}
>  
> +		/* Reload MMCRA to restore BHRB disable bit for POWER10 */
> +		if (cpu_has_feature(CPU_FTR_ARCH_31))
> +			mtspr(SPRN_MMCRA, mmcra_bhrb);
> +
>  		/*
>  		 * DD2.2 and earlier need to set then clear bit 60 in MMCRA
>  		 * to ensure the PMU starts running.


^ permalink raw reply

* Re: Using Firefox hangs system
From: Nicholas Piggin @ 2020-07-07  7:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Michael Ellerman, Paul Mackerras,
	Paul Menzel
  Cc: linuxppc-dev
In-Reply-To: <de2c9ccd-a078-a2ca-e7c7-13a1032cbda3@molgen.mpg.de>

Excerpts from Paul Menzel's message of July 6, 2020 3:20 pm:
> Dear Nicholas,
> 
> 
> Thank you for the quick response.
> 
> 
> Am 06.07.20 um 02:41 schrieb Nicholas Piggin:
>> Excerpts from Paul Menzel's message of July 5, 2020 8:30 pm:
> 
>>> Am 05.07.20 um 11:22 schrieb Paul Menzel:
>>>> [  572.253008] Oops: Exception in kernel mode, sig: 5 [#1]
>>>> [  572.253198] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
>>>> [  572.253232] Modules linked in: tcp_diag inet_diag unix_diag xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter bridge stp llc overlay xfs kvm_hv kvm binfmt_misc joydev uas usb_storage vmx_crypto bnx2x crct10dif_vpmsum ofpart cmdlinepart powernv_flash mtd mdio ibmpowernv at24 ipmi_powernv ipmi_devintf ipmi_msghandler opal_prd powernv_rng sch_fq_codel parport_pc ppdev lp nfsd parport auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic libcrc32c xor zstd_compress raid6_pq input_leds mac_hid hid_generic ast drm_vram_helper drm_ttm_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ahci drm_panel_orientation_quirks libahci usbhid hid crc32c_vpmsum uio_pdrv_genirq uio
>>>> [  572.253639] CPU: 4 PID: 6728 Comm: Web Content Not tainted 5.8.0-rc3+ #1
>>>> [  572.253659] NIP:  c00000000000ff5c LR: c00000000001a8f8 CTR: c0000000001d5f00
>>>> [  572.253835] REGS: c000007f31f0f420 TRAP: 1500   Not tainted  (5.8.0-rc3+)
>>>> [  572.253854] MSR:  900000000290b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28c48482  XER: 20000000
>>>> [  572.253888] CFAR: c00000000000fecc IRQMASK: 1
>>>> [  572.253888] GPR00: c00000000001b228 c000007f31f0f6b0 c000000001f9a900 c000007f351544d0
>>>> [  572.253888] GPR04: 0000000000000000 c000007f31f0fe90 c000007f351544f0 c000007f32e522b0
>>>> [  572.253888] GPR08: 0000000000000000 0000000000002000 9000000000009033 c000007fbcd85800
>>>> [  572.253888] GPR12: 0000000000008800 c000007fffffb680 0000000000000005 0000000000000004
>>>> [  572.253888] GPR16: c000007f35153800 c000007f35154130 0000000000000005 0000000000000001
>>>> [  572.253888] GPR20: 0000000000000024 c000007f32e51e68 c000007f35154028 0000007fd8da0000
>>>> [  572.253888] GPR24: 0000007fd8da0000 c000007f351544d0 c000007e9a4024d0 c000000001665f18
>>>> [  572.253888] GPR28: c000007f351544d0 c000007f35153800 900000000290f033 c000007f35153800
>>>> [  572.254079] NIP [c00000000000ff5c] save_fpu+0xa8/0x2ac
>>>> [  572.254098] LR [c00000000001a8f8] __giveup_fpu+0x28/0x80
>>>> [  572.254114] Call Trace:
>>>> [  572.254128] [c000007f31f0f6b0] [c000007f35153980] 0xc000007f35153980 (unreliable)
>>>> [  572.254156] [c000007f31f0f6e0] [c00000000001b228] giveup_all+0x128/0x150
>>>> [  572.254327] [c000007f31f0f710] [c00000000001c124] __switch_to+0x104/0x490
>>>> [  572.254352] [c000007f31f0f770] [c0000000010d2e34] __schedule+0x2e4/0xa10
>>>> [  572.254374] [c000007f31f0f840] [c0000000010d35d4] schedule+0x74/0x140
>>>> [  572.254397] [c000007f31f0f870] [c0000000010d9478] schedule_timeout+0x358/0x5d0
>>>> [  572.254424] [c000007f31f0f980] [c0000000010d5638] wait_for_completion+0xc8/0x210
>>>> [  572.254451] [c000007f31f0fa00] [c000000000608ed4] do_coredump+0x3a4/0xd60
>>>> [  572.254625] [c000007f31f0fba0] [c00000000018d1cc] get_signal+0x1dc/0xd00
>>>> [  572.254648] [c000007f31f0fcc0] [c00000000001f088] do_notify_resume+0x158/0x450
>>>> [  572.254672] [c000007f31f0fda0] [c000000000037d04] interrupt_exit_user_prepare+0x1c4/0x230
>>>> [  572.254699] [c000007f31f0fe20] [c00000000000f2b4] interrupt_return+0x14/0x1c0
>>>> [  572.254720] Instruction dump:
>>>> [  572.254882] dae60170 db060180 db260190 db4601a0 db6601b0 db8601c0 dba601d0 dbc601e0
>>>> [  572.254912] dbe601f0 48000204 38800000 f0000250 <7c062798> f0000250 38800010 f0210a50
>>>> [  572.254946] ---[ end trace ba4452ee5c77d58e ]---
>>>
>>> Please find all the messages attached.
>> 
>> "Oops: Exception in kernel mode, sig: 5 [#1]"
>> 
>> Unfortunately it's a very poor error message. I think it is a 0x1500
>> exception triggering in the kernel FP register saving. Do you have the
>> CONFIG_PPC_DENORMALISATION config option set?
> 
> Yes, as it’s set in the Ubuntu Linux kernel configuration, I have it set 
> too.
> 
>      $ grep DENORMALI /boot/config-*
>      /boot/config-4.15.0-23-generic:CONFIG_PPC_DENORMALISATION=y
>      /boot/config-5.4.0-40-generic:CONFIG_PPC_DENORMALISATION=y
>      /boot/config-5.7.0-rc5+:CONFIG_PPC_DENORMALISATION=y
>      /boot/config-5.8.0-rc3+:CONFIG_PPC_DENORMALISATION=y

Ah thanks I was able to reproduce with a little denorm test case.

The denorm interrupt handler got broken by some careless person.

This patch should hopefully fix it for you?

Thanks,
Nick

---
 arch/powerpc/kernel/exceptions-64s.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index fa080694e581..0fc8bad878b2 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -2551,7 +2551,7 @@ EXC_VIRT_NONE(0x5400, 0x100)
 INT_DEFINE_BEGIN(denorm_exception)
 	IVEC=0x1500
 	IHSRR=1
-	IBRANCH_COMMON=0
+	IBRANCH_TO_COMMON=0
 	IKVM_REAL=1
 INT_DEFINE_END(denorm_exception)
 
-- 
2.23.0


^ permalink raw reply related

* Re: [RFC PATCH 4/5] powerpc/mm: Remove custom stack expansion checking
From: Michael Ellerman @ 2020-07-07  6:53 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev; +Cc: linux-arch, hughd, linux-kernel
In-Reply-To: <fb3aad5f-17a1-93cc-1a3a-c50fe16ab711@csgroup.eu>

Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> Le 03/07/2020 à 16:13, Michael Ellerman a écrit :
>> We have powerpc specific logic in our page fault handling to decide if
>> an access to an unmapped address below the stack pointer should expand
>> the stack VMA.
>> 
>> The logic aims to prevent userspace from doing bad accesses below the
>> stack pointer. However as long as the stack is < 1MB in size, we allow
>> all accesses without further checks. Adding some debug I see that I
>> can do a full kernel build and LTP run, and not a single process has
>> used more than 1MB of stack. So for the majority of processes the
>> logic never even fires.
>> 
>> We also recently found a nasty bug in this code which could cause
>> userspace programs to be killed during signal delivery. It went
>> unnoticed presumably because most processes use < 1MB of stack.
>> 
>> The generic mm code has also grown support for stack guard pages since
>> this code was originally written, so the most heinous case of the
>> stack expanding into other mappings is now handled for us.
>> 
>> Finally although some other arches have special logic in this path,
>> from what I can tell none of x86, arm64, arm and s390 impose any extra
>> checks other than those in expand_stack().
>> 
>> So drop our complicated logic and like other architectures just let
>> the stack expand as long as its within the rlimit.
>
> I agree that's probably not worth a so complicated logic that is nowhere 
> documented.
>
> This patch looks good to me, minor comments below.

Thanks.

>> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
>> index ed01329dd12b..925a7231abb3 100644
>> --- a/arch/powerpc/mm/fault.c
>> +++ b/arch/powerpc/mm/fault.c
>> @@ -42,39 +42,7 @@
>>   #include <asm/kup.h>
>>   #include <asm/inst.h>
>>   
>> -/*
>> - * Check whether the instruction inst is a store using
>> - * an update addressing form which will update r1.
>> - */
>> -static bool store_updates_sp(struct ppc_inst inst)
>> -{
>> -	/* check for 1 in the rA field */
>> -	if (((ppc_inst_val(inst) >> 16) & 0x1f) != 1)
>> -		return false;
>> -	/* check major opcode */
>> -	switch (ppc_inst_primary_opcode(inst)) {
>> -	case OP_STWU:
>> -	case OP_STBU:
>> -	case OP_STHU:
>> -	case OP_STFSU:
>> -	case OP_STFDU:
>> -		return true;
>> -	case OP_STD:	/* std or stdu */
>> -		return (ppc_inst_val(inst) & 3) == 1;
>> -	case OP_31:
>> -		/* check minor opcode */
>> -		switch ((ppc_inst_val(inst) >> 1) & 0x3ff) {
>> -		case OP_31_XOP_STDUX:
>> -		case OP_31_XOP_STWUX:
>> -		case OP_31_XOP_STBUX:
>> -		case OP_31_XOP_STHUX:
>> -		case OP_31_XOP_STFSUX:
>> -		case OP_31_XOP_STFDUX:
>> -			return true;
>> -		}
>> -	}
>> -	return false;
>> -}
>> +
>
> Do we need this additional blank line ?

I usually leave two blank lines between the end of the includes and the
start of the code, which is what I did here I guess.

>>   /*
>>    * do_page_fault error handling helpers
>>    */
>> @@ -267,54 +235,6 @@ static bool bad_kernel_fault(struct pt_regs *regs, unsigned long error_code,
>>   	return false;
>>   }
>>   
>> -static bool bad_stack_expansion(struct pt_regs *regs, unsigned long address,
>> -				struct vm_area_struct *vma, unsigned int flags,
>> -				bool *must_retry)
>> -{
>> -	/*
>> -	 * N.B. The POWER/Open ABI allows programs to access up to
>> -	 * 288 bytes below the stack pointer.
>> -	 * The kernel signal delivery code writes up to 4KB
>> -	 * below the stack pointer (r1) before decrementing it.
>> -	 * The exec code can write slightly over 640kB to the stack
>> -	 * before setting the user r1.  Thus we allow the stack to
>> -	 * expand to 1MB without further checks.
>> -	 */
>> -	if (address + 0x100000 < vma->vm_end) {
>> -		struct ppc_inst __user *nip = (struct ppc_inst __user *)regs->nip;
>> -		/* get user regs even if this fault is in kernel mode */
>> -		struct pt_regs *uregs = current->thread.regs;
>> -		if (uregs == NULL)
>> -			return true;
>> -
>> -		/*
>> -		 * A user-mode access to an address a long way below
>> -		 * the stack pointer is only valid if the instruction
>> -		 * is one which would update the stack pointer to the
>> -		 * address accessed if the instruction completed,
>> -		 * i.e. either stwu rs,n(r1) or stwux rs,r1,rb
>> -		 * (or the byte, halfword, float or double forms).
>> -		 *
>> -		 * If we don't check this then any write to the area
>> -		 * between the last mapped region and the stack will
>> -		 * expand the stack rather than segfaulting.
>> -		 */
>> -		if (address + 4096 >= uregs->gpr[1])
>> -			return false;
>> -
>> -		if ((flags & FAULT_FLAG_WRITE) && (flags & FAULT_FLAG_USER) &&
>> -		    access_ok(nip, sizeof(*nip))) {
>> -			struct ppc_inst inst;
>> -
>> -			if (!probe_user_read_inst(&inst, nip))
>> -				return !store_updates_sp(inst);
>> -			*must_retry = true;
>> -		}
>> -		return true;
>> -	}
>> -	return false;
>> -}
>> -
>>   #ifdef CONFIG_PPC_MEM_KEYS
>>   static bool access_pkey_error(bool is_write, bool is_exec, bool is_pkey,
>>   			      struct vm_area_struct *vma)
>> @@ -480,7 +400,6 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>>   	int is_user = user_mode(regs);
>>   	int is_write = page_fault_is_write(error_code);
>>   	vm_fault_t fault, major = 0;
>> -	bool must_retry = false;
>>   	bool kprobe_fault = kprobe_page_fault(regs, 11);
>>   
>>   	if (unlikely(debugger_fault_handler(regs) || kprobe_fault))
>> @@ -569,30 +488,15 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>>   	vma = find_vma(mm, address);
>>   	if (unlikely(!vma))
>>   		return bad_area(regs, address);
>> -	if (likely(vma->vm_start <= address))
>> -		goto good_area;
>> -	if (unlikely(!(vma->vm_flags & VM_GROWSDOWN)))
>> -		return bad_area(regs, address);
>>   
>> -	/* The stack is being expanded, check if it's valid */
>> -	if (unlikely(bad_stack_expansion(regs, address, vma, flags,
>> -					 &must_retry))) {
>> -		if (!must_retry)
>> +	if (unlikely(vma->vm_start > address)) {
>> +		if (unlikely(!(vma->vm_flags & VM_GROWSDOWN)))
>
> We are already in an unlikely() branch, I don't think it is worth having 
> a second level of unlikely(), better let gcc decide what's most efficient.

Yeah I did wonder if we need so many unlikelys in here, but I thought
that should be done in a separate patch with some actual analysis of the
generated code.

cheers

^ permalink raw reply

* Re: [PATCH 3/5] selftests/powerpc: Update the stack expansion test
From: Michael Ellerman @ 2020-07-07  6:53 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev; +Cc: linux-arch, hughd, linux-kernel
In-Reply-To: <8f6c5175-32ce-34a2-873d-b5fb3a5d7c4c@csgroup.eu>

Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> Le 03/07/2020 à 16:13, Michael Ellerman a écrit :
>> Update the stack expansion load/store test to take into account the
>> new allowance of 4096 bytes below the stack pointer.
>
> [I didn't receive patch 2, don't know why, hence commenting patch 2 here.]
>
> Shouldn't patch 2 carry a fixes tag and be Cced to stable for 
> application to previous kernel releases ?

Yes it should.

cheers

^ permalink raw reply

* Re: [PATCH v2 06/10] powerpc/perf: power10 Performance Monitoring support
From: Michael Neuling @ 2020-07-07  6:50 UTC (permalink / raw)
  To: Athira Rajeev, mpe; +Cc: maddy, linuxppc-dev
In-Reply-To: <1593595262-1433-7-git-send-email-atrajeev@linux.vnet.ibm.com>


> @@ -480,6 +520,7 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
>  	mmcr[1] = mmcr1;
>  	mmcr[2] = mmcra;
>  	mmcr[3] = mmcr2;
> +	mmcr[4] = mmcr3;

This is fragile like the kvm vcpu case I commented on before but it gets passed
in via a function parameter?! Can you create a struct to store these in rather
than this odd ball numbering?

The cleanup should start in patch 1/10 here:

        /*
         * The order of the MMCR array is:
-        *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2
+        *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2, MMCR3
         *  - 32-bit, MMCR0, MMCR1, MMCR2
         */
-       unsigned long mmcr[4];
+       unsigned long mmcr[5];



mikey

^ permalink raw reply

* Re: [PATCH v4 24/41] powerpc/book3s64/pkeys: Store/restore userspace AMR correctly on entry and exit from kernel
From: Nicholas Piggin @ 2020-07-07  6:23 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev, mpe; +Cc: linuxram, bauerman
In-Reply-To: <83a38d37-099a-7b98-1262-c16b4f5a0cc4@linux.ibm.com>

Excerpts from Aneesh Kumar K.V's message of July 3, 2020 7:30 pm:
> On 7/3/20 2:48 PM, Nicholas Piggin wrote:
>> Excerpts from Aneesh Kumar K.V's message of June 15, 2020 4:14 pm:
>>> This prepare kernel to operate with a different value than userspace AMR.
>>> For this, AMR needs to be saved and restored on entry and return from the
>>> kernel.
>>>
>>> With KUAP we modify kernel AMR when accessing user address from the kernel
>>> via copy_to/from_user interfaces.
>>>
>>> If MMU_FTR_KEY is enabled we always use the key mechanism to implement KUAP
>>> feature. If MMU_FTR_KEY is not supported and if we support MMU_FTR_KUAP
>>> (radix translation on POWER9), we can skip restoring AMR on return
>>> to userspace. Userspace won't be using AMR in that specific config.
>>>
>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>> ---
>>>   arch/powerpc/include/asm/book3s/64/kup.h | 141 ++++++++++++++++++-----
>>>   arch/powerpc/kernel/entry_64.S           |   6 +-
>>>   arch/powerpc/kernel/exceptions-64s.S     |   4 +-
>>>   arch/powerpc/kernel/syscall_64.c         |  26 ++++-
>>>   4 files changed, 144 insertions(+), 33 deletions(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
>>> index e6ee1c34842f..6979cd1a0003 100644
>>> --- a/arch/powerpc/include/asm/book3s/64/kup.h
>>> +++ b/arch/powerpc/include/asm/book3s/64/kup.h
>>> @@ -13,18 +13,47 @@
>>>   
>>>   #ifdef __ASSEMBLY__
>>>   
>>> -.macro kuap_restore_amr	gpr1, gpr2
>>> -#ifdef CONFIG_PPC_KUAP
>>> +.macro kuap_restore_user_amr gpr1
>>> +#if defined(CONFIG_PPC_MEM_KEYS)
>>>   	BEGIN_MMU_FTR_SECTION_NESTED(67)
>>> -	mfspr	\gpr1, SPRN_AMR
>>> +	/*
>>> +	 * AMR is going to be different when
>>> +	 * returning to userspace.
>>> +	 */
>>> +	ld	\gpr1, STACK_REGS_KUAP(r1)
>>> +	isync
>>> +	mtspr	SPRN_AMR, \gpr1
>>> +
>>> +	/* No isync required, see kuap_restore_user_amr() */
>>> +	END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_PKEY , 67)
>>> +#endif
>>> +.endm
>>> +
>>> +.macro kuap_restore_kernel_amr	gpr1, gpr2
>>> +#if defined(CONFIG_PPC_MEM_KEYS)
>>> +	BEGIN_MMU_FTR_SECTION_NESTED(67)
>>> +	b	99f  // handle_pkey_restore_amr
>>> +	END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_PKEY , 67)
>>> +
>>> +	BEGIN_MMU_FTR_SECTION_NESTED(68)
>>> +	b	99f  // handle_kuap_restore_amr
>>> +	MMU_FTR_SECTION_ELSE_NESTED(68)
>>> +	b	100f  // skip_restore_amr
>>> +	ALT_MMU_FTR_SECTION_END_NESTED_IFSET(MMU_FTR_KUAP, 68)
>>> +
>>> +99:
>>> +	/*
>>> +	 * AMR is going to be mostly the same since we are
>>> +	 * returning to the kernel. Compare and do a mtspr.
>>> +	 */
>>>   	ld	\gpr2, STACK_REGS_KUAP(r1)
>>> +	mfspr	\gpr1, SPRN_AMR
>>>   	cmpd	\gpr1, \gpr2
>>> -	beq	998f
>>> +	beq	100f
>>>   	isync
>>>   	mtspr	SPRN_AMR, \gpr2
>>>   	/* No isync required, see kuap_restore_amr() */
>>> -998:
>>> -	END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_KUAP, 67)
>>> +100:  // skip_restore_amr
>> 
>> Can't you code it like this? (_IFCLR requires none of the bits to be
>> set)
>> 
>> BEGIN_MMU_FTR_SECTION_NESTED(67)
>> 	b	99f	// nothing using AMR, no need to restore
>> END_MMU_FTR_SECTION_NESTED_IFCLR(MMU_FTR_PKEY | MMU_FTR_KUAP, 67)
>> 
>> That saves you a branch in the common case of using AMR. Similar
>> for others.
> 
> 
> 
> Yes i could switch to that. The code is taking extra 200 cycles even 
> with KUAP/KUEP disabled and no keys being used on hash. I am yet to 
> analyze this closely. So will rework things based on that analysis.
> 
>> 
>>> @@ -69,22 +133,40 @@
>>>   
>>>   extern u64 default_uamor;
>>>   
>>> -static inline void kuap_restore_amr(struct pt_regs *regs, unsigned long amr)
>>> +static inline void kuap_restore_user_amr(struct pt_regs *regs)
>>>   {
>>> -	if (mmu_has_feature(MMU_FTR_KUAP) && unlikely(regs->kuap != amr)) {
>>> -		isync();
>>> -		mtspr(SPRN_AMR, regs->kuap);
>>> -		/*
>>> -		 * No isync required here because we are about to RFI back to
>>> -		 * previous context before any user accesses would be made,
>>> -		 * which is a CSI.
>>> -		 */
>>> +	if (!mmu_has_feature(MMU_FTR_PKEY))
>>> +		return;
>> 
>> If you have PKEY but not KUAP, do you still have to restore?
>> 
> 
> 
> Yes, because user space pkey is now set on the exit path. This is needed 
> to handle things like exec/fork().
> 
> 
>>> +
>>> +	isync();
>>> +	mtspr(SPRN_AMR, regs->kuap);
>>> +	/*
>>> +	 * No isync required here because we are about to rfi
>>> +	 * back to previous context before any user accesses
>>> +	 * would be made, which is a CSI.
>>> +	 */
>>> +}
>>> +
>>> +static inline void kuap_restore_kernel_amr(struct pt_regs *regs,
>>> +					   unsigned long amr)
>>> +{
>>> +	if (mmu_has_feature(MMU_FTR_KUAP) || mmu_has_feature(MMU_FTR_PKEY)) {
>>> +
>>> +		if (unlikely(regs->kuap != amr)) {
>>> +			isync();
>>> +			mtspr(SPRN_AMR, regs->kuap);
>>> +			/*
>>> +			 * No isync required here because we are about to rfi
>>> +			 * back to previous context before any user accesses
>>> +			 * would be made, which is a CSI.
>>> +			 */
>>> +		}
>>>   	}
>>>   }
>>>   
>>>   static inline unsigned long kuap_get_and_check_amr(void)
>>>   {
>>> -	if (mmu_has_feature(MMU_FTR_KUAP)) {
>>> +	if (mmu_has_feature(MMU_FTR_KUAP) || mmu_has_feature(MMU_FTR_PKEY)) {
>>>   		unsigned long amr = mfspr(SPRN_AMR);
>>>   		if (IS_ENABLED(CONFIG_PPC_KUAP_DEBUG)) /* kuap_check_amr() */
>>>   			WARN_ON_ONCE(amr != AMR_KUAP_BLOCKED);
>> 
>> We could do a static key that's based on this condition, but that can
>> wait for another day.
>> 
>>>   
>>>   static inline void kuap_check_amr(void)
>>>   {
>>> -	if (IS_ENABLED(CONFIG_PPC_KUAP_DEBUG) && mmu_has_feature(MMU_FTR_KUAP))
>>> +	if (IS_ENABLED(CONFIG_PPC_KUAP_DEBUG) &&
>>> +	    (mmu_has_feature(MMU_FTR_KUAP) || mmu_has_feature(MMU_FTR_PKEY)))
>>>   		WARN_ON_ONCE(mfspr(SPRN_AMR) != AMR_KUAP_BLOCKED);
>>>   }
>>>   
>>>   #else /* CONFIG_PPC_MEM_KEYS */
>>>   
>>> -static inline void kuap_restore_amr(struct pt_regs *regs, unsigned long amr)
>>> +static inline void kuap_restore_user_amr(struct pt_regs *regs)
>>> +{
>>> +}
>>> +
>>> +static inline void kuap_restore_kernel_amr(struct pt_regs *regs, unsigned long amr)
>>>   {
>>>   }
>>>   
>>> @@ -113,6 +200,7 @@ static inline unsigned long kuap_get_and_check_amr(void)
>>>   {
>>>   	return 0;
>>>   }
>>> +
>>>   #endif /* CONFIG_PPC_MEM_KEYS */
>>>   
>>>   
>>> @@ -187,7 +275,6 @@ bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
>>>   		    "Bug: %s fault blocked by AMR!", is_write ? "Write" : "Read");
>>>   }
>>>   #endif /* CONFIG_PPC_KUAP */
>>> -
>>>   #endif /* __ASSEMBLY__ */
>>>   
>>>   #endif /* _ASM_POWERPC_BOOK3S_64_KUP_H */
>> 
>> Unnecessary hunks.
> 
> 
> will remove
> 
>> 
>>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>>> index 9d49338e0c85..a087cbe0b17d 100644
>>> --- a/arch/powerpc/kernel/entry_64.S
>>> +++ b/arch/powerpc/kernel/entry_64.S
>>> @@ -481,8 +481,8 @@ _ASM_NOKPROBE_SYMBOL(fast_interrupt_return)
>>>   	kuap_check_amr r3, r4
>>>   	ld	r5,_MSR(r1)
>>>   	andi.	r0,r5,MSR_PR
>>> -	bne	.Lfast_user_interrupt_return
>>> -	kuap_restore_amr r3, r4
>>> +	bne	.Lfast_user_interrupt_return_amr
>>> +	kuap_restore_kernel_amr r3, r4
>>>   	andi.	r0,r5,MSR_RI
>>>   	li	r3,0 /* 0 return value, no EMULATE_STACK_STORE */
>>>   	bne+	.Lfast_kernel_interrupt_return
>>> @@ -502,6 +502,8 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return)
>>>   	cmpdi	r3,0
>>>   	bne-	.Lrestore_nvgprs
>>>   
>>> +.Lfast_user_interrupt_return_amr:
>>> +	kuap_restore_user_amr r3
>>>   .Lfast_user_interrupt_return:
>>>   	ld	r11,_NIP(r1)
>>>   	ld	r12,_MSR(r1)
>>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>>> index e70ebb5c318c..8226af444d77 100644
>>> --- a/arch/powerpc/kernel/exceptions-64s.S
>>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>>> @@ -971,7 +971,7 @@ EXC_COMMON_BEGIN(system_reset_common)
>>>   	ld	r10,SOFTE(r1)
>>>   	stb	r10,PACAIRQSOFTMASK(r13)
>>>   
>>> -	kuap_restore_amr r9, r10
>>> +	kuap_restore_kernel_amr r9, r10
>>>   	EXCEPTION_RESTORE_REGS
>>>   	RFI_TO_USER_OR_KERNEL
>>>   
>>> @@ -2784,7 +2784,7 @@ EXC_COMMON_BEGIN(soft_nmi_common)
>>>   	ld	r10,SOFTE(r1)
>>>   	stb	r10,PACAIRQSOFTMASK(r13)
>>>   
>>> -	kuap_restore_amr r9, r10
>>> +	kuap_restore_kernel_amr r9, r10
>>>   	EXCEPTION_RESTORE_REGS hsrr=0
>>>   	RFI_TO_KERNEL
>>>   
>>> diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
>>> index 7e560a01afa4..fded67982fbe 100644
>>> --- a/arch/powerpc/kernel/syscall_64.c
>>> +++ b/arch/powerpc/kernel/syscall_64.c
>>> @@ -35,7 +35,21 @@ notrace long system_call_exception(long r3, long r4, long r5,
>>>   	BUG_ON(!FULL_REGS(regs));
>>>   	BUG_ON(regs->softe != IRQS_ENABLED);
>>>   
>>> -	kuap_check_amr();
>>> +#ifdef CONFIG_PPC_MEM_KEYS
>>> +	if (mmu_has_feature(MMU_FTR_PKEY)) {
>>> +		unsigned long amr;
>>> +		/*
>>> +		 * When entering from userspace we mostly have the AMR
>>> +		 * different from kernel default values. Hence don't compare.
>>> +		 */
>>> +		amr = mfspr(SPRN_AMR);
>>> +		regs->kuap = amr;
>>> +		if (mmu_has_feature(MMU_FTR_KUAP))
>>> +			mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
>>> +		isync();
>> 
>> isync should be inside the if(). Again do pkeys need to save this if
>> KUAP is not being used? I haven't really looked at how all that works,
>> but what's changing for the PKEY && !KUAP case?
>> 
> 
> There is no SPR switch in context switch now and all the AMR/IAMR 
> handling is now in the exit to userspace.

If we have pkeys and no kuap, we could keep the switch in context 
switch?

If you don't think it's worth bothering to optimise that case because we 
expect KUAP to be used, that's probably okay although maybe an 
adjustment to the comment (we don't expect userspace to have different
from kernel values if kernel is not using it for KUAP).

>>> +	/*
>>> +	 * We do this at the end so that we do context switch with KERNEL AMR
>>> +	 */
>>> +	kuap_restore_user_amr(regs);
>>>   	return ret;
>> 
>> Comment doesn't make sense, newline required before return.
> 
> 
> Ok the detail there was we need to make sure we restore AMR towrads the 
> end and make sure all the kernel code continue to run with KERNEL AMR 
> value. There is a schedule() call in there with _TIF_NEED_RESCHED. But 
> those details are not really relevant. That was me tracking down some 
> issues and writing comment around that part of the code. The only real 
> detail is switch to userspace AMR in the end.

Yep, I don't think that comment is needed at all. A space before the
return would be nice. I guess after the account_cpu_user_exit is fine,
that thing's a pain anyway that needs to be changed to avoid an SPR
stall I think so I'll look at that afterward anyway.

Thanks,
Nick

^ permalink raw reply

* Re: [PATCH v2 04/10] powerpc/perf: Add power10_feat to dt_cpu_ftrs
From: Michael Neuling @ 2020-07-07  6:22 UTC (permalink / raw)
  To: Athira Rajeev, mpe; +Cc: maddy, linuxppc-dev
In-Reply-To: <1593595262-1433-5-git-send-email-atrajeev@linux.vnet.ibm.com>

On Wed, 2020-07-01 at 05:20 -0400, Athira Rajeev wrote:
> From: Madhavan Srinivasan <maddy@linux.ibm.com>
> 
> Add power10 feature function to dt_cpu_ftrs.c along
> with a power10 specific init() to initialize pmu sprs.

Can you say why you're doing this?

Can you add some text about what you're doing to the BHRB in this patch?

Mikey

> 
> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/reg.h        |  3 +++
>  arch/powerpc/kernel/cpu_setup_power.S |  7 +++++++
>  arch/powerpc/kernel/dt_cpu_ftrs.c     | 26 ++++++++++++++++++++++++++
>  3 files changed, 36 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index 21a1b2d..900ada1 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -1068,6 +1068,9 @@
>  #define MMCR0_PMC2_LOADMISSTIME	0x5
>  #endif
>  
> +/* BHRB disable bit for PowerISA v3.10 */
> +#define MMCRA_BHRB_DISABLE	0x0000002000000000
> +
>  /*
>   * SPRG usage:
>   *
> diff --git a/arch/powerpc/kernel/cpu_setup_power.S
> b/arch/powerpc/kernel/cpu_setup_power.S
> index efdcfa7..e8b3370c 100644
> --- a/arch/powerpc/kernel/cpu_setup_power.S
> +++ b/arch/powerpc/kernel/cpu_setup_power.S
> @@ -233,3 +233,10 @@ __init_PMU_ISA207:
>  	li	r5,0
>  	mtspr	SPRN_MMCRS,r5
>  	blr
> +
> +__init_PMU_ISA31:
> +	li	r5,0
> +	mtspr	SPRN_MMCR3,r5
> +	LOAD_REG_IMMEDIATE(r5, MMCRA_BHRB_DISABLE)
> +	mtspr	SPRN_MMCRA,r5
> +	blr
> diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c
> b/arch/powerpc/kernel/dt_cpu_ftrs.c
> index a0edeb3..14a513f 100644
> --- a/arch/powerpc/kernel/dt_cpu_ftrs.c
> +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
> @@ -449,6 +449,31 @@ static int __init feat_enable_pmu_power9(struct
> dt_cpu_feature *f)
>  	return 1;
>  }
>  
> +static void init_pmu_power10(void)
> +{
> +	init_pmu_power9();
> +
> +	mtspr(SPRN_MMCR3, 0);
> +	mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
> +}
> +
> +static int __init feat_enable_pmu_power10(struct dt_cpu_feature *f)
> +{
> +	hfscr_pmu_enable();
> +
> +	init_pmu_power10();
> +	init_pmu_registers = init_pmu_power10;
> +
> +	cur_cpu_spec->cpu_features |= CPU_FTR_MMCRA;
> +	cur_cpu_spec->cpu_user_features |= PPC_FEATURE_PSERIES_PERFMON_COMPAT;
> +
> +	cur_cpu_spec->num_pmcs          = 6;
> +	cur_cpu_spec->pmc_type          = PPC_PMC_IBM;
> +	cur_cpu_spec->oprofile_cpu_type = "ppc64/power10";
> +
> +	return 1;
> +}
> +
>  static int __init feat_enable_tm(struct dt_cpu_feature *f)
>  {
>  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> @@ -638,6 +663,7 @@ struct dt_cpu_feature_match {
>  	{"pc-relative-addressing", feat_enable, 0},
>  	{"machine-check-power9", feat_enable_mce_power9, 0},
>  	{"performance-monitor-power9", feat_enable_pmu_power9, 0},
> +	{"performance-monitor-power10", feat_enable_pmu_power10, 0},
>  	{"event-based-branch-v3", feat_enable, 0},
>  	{"random-number-generator", feat_enable, 0},
>  	{"system-call-vectored", feat_disable, 0},


^ permalink raw reply

* Re: [PATCH v2 02/10] KVM: PPC: Book3S HV: Save/restore new PMU registers
From: Michael Neuling @ 2020-07-07  6:13 UTC (permalink / raw)
  To: Athira Rajeev, mpe; +Cc: Paul Mackerras, maddy, linuxppc-dev
In-Reply-To: <1593595262-1433-3-git-send-email-atrajeev@linux.vnet.ibm.com>

@@ -637,12 +637,12 @@ struct kvm_vcpu_arch {
>  	u32 ccr1;
>  	u32 dbsr;
>  
> -	u64 mmcr[5];
> +	u64 mmcr[6];
>  	u32 pmc[8];
>  	u32 spmc[2];
>  	u64 siar;


> +	mfspr	r5, SPRN_MMCR3
> +	mfspr	r6, SPRN_SIER2
> +	mfspr	r7, SPRN_SIER3
> +	std	r5, VCPU_MMCR + 40(r9)
> +	std	r6, VCPU_SIER + 8(r9)
> +	std	r7, VCPU_SIER + 16(r9)


This is looking pretty fragile now. vcpu mmcr[6] stores (in this strict order):
   mmcr0, mmcr1, mmcra, mmcr2, mmcrs, mmmcr3.

Can we clean that up? Give mmcra and mmcrs their own entries in vcpu and then
have a flat array for mmcr0-3.

Mikey

^ permalink raw reply

* Re: [PATCH v5 23/26] powerpc/book3s64/kuap: Move UAMOR setup to key init function
From: Michael Ellerman @ 2020-07-07  6:05 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev; +Cc: Aneesh Kumar K.V, linuxram, bauerman
In-Reply-To: <20200619135850.47155-24-aneesh.kumar@linux.ibm.com>

"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> UAMOR values are not application-specific.

It used to be, that's worth mentioning.

> The kernel initializes its value based on different reserved keys.
> Remove the thread-specific UAMOR value and don't switch the UAMOR on
> context switch.
>
> Move UAMOR initialization to key initialization code. Now that
> KUAP/KUEP feature depends on PPC_MEM_KEYS, we can start to consolidate
> all register initialization to keys init.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/kup.h |  2 ++
>  arch/powerpc/include/asm/processor.h     |  1 -
>  arch/powerpc/kernel/ptrace/ptrace-view.c | 17 ++++++++----
>  arch/powerpc/kernel/smp.c                |  5 ++++
>  arch/powerpc/mm/book3s64/pkeys.c         | 35 ++++++++++++++----------
>  5 files changed, 39 insertions(+), 21 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
> index 3a0e138d2735..942594745dfa 100644
> --- a/arch/powerpc/include/asm/book3s/64/kup.h
> +++ b/arch/powerpc/include/asm/book3s/64/kup.h
> @@ -67,6 +67,8 @@
>  #include <asm/mmu.h>
>  #include <asm/ptrace.h>
>  
> +extern u64 default_uamor;
> +
>  static inline void kuap_restore_amr(struct pt_regs *regs, unsigned long amr)
>  {
>  	if (mmu_has_feature(MMU_FTR_KUAP) && unlikely(regs->kuap != amr)) {
> diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
> index 52a67835057a..6ac12168f1fe 100644
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -237,7 +237,6 @@ struct thread_struct {
>  #ifdef CONFIG_PPC_MEM_KEYS
>  	unsigned long	amr;
>  	unsigned long	iamr;
> -	unsigned long	uamor;
>  #endif
>  #ifdef CONFIG_KVM_BOOK3S_32_HANDLER
>  	void*		kvm_shadow_vcpu; /* KVM internal data */
> diff --git a/arch/powerpc/kernel/ptrace/ptrace-view.c b/arch/powerpc/kernel/ptrace/ptrace-view.c
> index caeb5822a8f4..689711eb018a 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace-view.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace-view.c
> @@ -488,14 +488,22 @@ static int pkey_active(struct task_struct *target, const struct user_regset *reg
>  static int pkey_get(struct task_struct *target, const struct user_regset *regset,
>  		    unsigned int pos, unsigned int count, void *kbuf, void __user *ubuf)
>  {
> +	int ret;
> +
>  	BUILD_BUG_ON(TSO(amr) + sizeof(unsigned long) != TSO(iamr));
> -	BUILD_BUG_ON(TSO(iamr) + sizeof(unsigned long) != TSO(uamor));
>  
>  	if (!arch_pkeys_enabled())
>  		return -ENODEV;
>  
> -	return user_regset_copyout(&pos, &count, &kbuf, &ubuf, &target->thread.amr,
> -				   0, ELF_NPKEY * sizeof(unsigned long));
> +	ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, &target->thread.amr,
> +				  0, 2 * sizeof(unsigned long));
> +	if (ret)
> +		goto err_out;

Why not just return?

> +
> +	ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, &default_uamor,
> +				  2 * sizeof(unsigned long), 3 * sizeof(unsigned long));
> +err_out:
> +	return ret;
>  }
>  
>  static int pkey_set(struct task_struct *target, const struct user_regset *regset,
> @@ -518,8 +526,7 @@ static int pkey_set(struct task_struct *target, const struct user_regset *regset
>  		return ret;
>  
>  	/* UAMOR determines which bits of the AMR can be set from userspace. */
> -	target->thread.amr = (new_amr & target->thread.uamor) |
> -			     (target->thread.amr & ~target->thread.uamor);
> +	target->thread.amr = (new_amr & default_uamor) | (target->thread.amr & ~default_uamor);

That comment could explain better why we are bothering to mask with ~default_uamor.

>  	return 0;
>  }
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index c820c95162ff..eec40082599f 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -59,6 +59,7 @@
>  #include <asm/asm-prototypes.h>
>  #include <asm/cpu_has_feature.h>
>  #include <asm/ftrace.h>
> +#include <asm/kup.h>
>  
>  #ifdef DEBUG
>  #include <asm/udbg.h>
> @@ -1256,6 +1257,10 @@ void start_secondary(void *unused)
>  	mmgrab(&init_mm);
>  	current->active_mm = &init_mm;
>  
> +#ifdef CONFIG_PPC_MEM_KEYS
> +	mtspr(SPRN_UAMOR, default_uamor);
> +#endif

That's 1) not very pretty and 2) risks blowing up on other CPUs.

It should at least go in early_init_mmu_secondary().

>  	smp_store_cpu_info(cpu);
>  	set_dec(tb_ticks_per_jiffy);
>  	preempt_disable();
> diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
> index aeecc8b8e11c..3f3593f85358 100644
> --- a/arch/powerpc/mm/book3s64/pkeys.c
> +++ b/arch/powerpc/mm/book3s64/pkeys.c
> @@ -24,7 +24,7 @@ static u32  initial_allocation_mask;   /* Bits set for the initially allocated k
>  static u64 default_amr;
>  static u64 default_iamr;
>  /* Allow all keys to be modified by default */
> -static u64 default_uamor = ~0x0UL;
> +u64 default_uamor = ~0x0UL;

__ro_after_init?

>  /*
>   * Key used to implement PROT_EXEC mmap. Denies READ/WRITE
>   * We pick key 2 because 0 is special key and 1 is reserved as per ISA.
> @@ -113,8 +113,16 @@ void __init pkey_early_init_devtree(void)
>  	/* scan the device tree for pkey feature */
>  	pkeys_total = scan_pkey_feature();
>  	if (!pkeys_total) {
> -		/* No support for pkey. Mark it disabled */
> -		return;
> +		/*
> +		 * No key support but on radix we can use key 0
> +		 * to implement kuap.
> +		 */
> +		if (early_radix_enabled())
> +			/*
> +			 * Make sure userspace can't change the AMR
> +			 */
> +			default_uamor = 0;
> +		goto err_out;

Would be cleaner if you inverted that. ie. initialise to 0 and then set
to ~0x0UL when you detect pkeys.

>  	}
>  
>  	cur_cpu_spec->mmu_features |= MMU_FTR_PKEY;
> @@ -197,6 +205,12 @@ void __init pkey_early_init_devtree(void)
>  	initial_allocation_mask |= reserved_allocation_mask;
>  
>  	pr_info("Enabling Memory keys with max key count %d", max_pkey);
> +err_out:

It's not "err" out if the OK path goes via here. That's just "out".

> +	/*
> +	 * Setup uamor on boot cpu
> +	 */
> +	mtspr(SPRN_UAMOR, default_uamor);
> +
>  	return;
>  }
>  
> @@ -232,8 +246,9 @@ void __init setup_kuap(bool disabled)
>  		cur_cpu_spec->mmu_features |= MMU_FTR_KUAP;
>  	}
>  
> -	/* Make sure userspace can't change the AMR */
> -	mtspr(SPRN_UAMOR, 0);

Why not just leave it there. It's extra insurance and it's good
documentation.

> +	/*
> +	 * Set the default kernel AMR values on all cpus.
> +	 */
>  	mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
>  	isync();
>  }
> @@ -278,11 +293,6 @@ static inline u64 read_uamor(void)
>  	return mfspr(SPRN_UAMOR);
>  }
>  
> -static inline void write_uamor(u64 value)
> -{
> -	mtspr(SPRN_UAMOR, value);
> -}
> -
>  static bool is_pkey_enabled(int pkey)
>  {
>  	u64 uamor = read_uamor();
> @@ -353,7 +363,6 @@ void thread_pkey_regs_save(struct thread_struct *thread)
>  	 */
>  	thread->amr = read_amr();
>  	thread->iamr = read_iamr();
> -	thread->uamor = read_uamor();
>  }
>  
>  void thread_pkey_regs_restore(struct thread_struct *new_thread,
> @@ -366,8 +375,6 @@ void thread_pkey_regs_restore(struct thread_struct *new_thread,
>  		write_amr(new_thread->amr);
>  	if (old_thread->iamr != new_thread->iamr)
>  		write_iamr(new_thread->iamr);
> -	if (old_thread->uamor != new_thread->uamor)
> -		write_uamor(new_thread->uamor);
>  }
>  
>  void thread_pkey_regs_init(struct thread_struct *thread)
> @@ -377,11 +384,9 @@ void thread_pkey_regs_init(struct thread_struct *thread)
>  
>  	thread->amr   = default_amr;
>  	thread->iamr  = default_iamr;
> -	thread->uamor = default_uamor;
>  
>  	write_amr(default_amr);
>  	write_iamr(default_iamr);
> -	write_uamor(default_uamor);
>  }
>  
>  int execute_only_pkey(struct mm_struct *mm)

cheers

^ permalink raw reply

* Re: [PATCH v3 0/6] powerpc: queued spinlocks and rwlocks
From: Nicholas Piggin @ 2020-07-07  5:57 UTC (permalink / raw)
  To: linuxppc-dev, Waiman Long
  Cc: linux-arch, Peter Zijlstra, Boqun Feng, linux-kernel, kvm-ppc,
	virtualization, Ingo Molnar, Will Deacon
In-Reply-To: <24f75d2c-60cd-2766-4aab-1a3b1c80646e@redhat.com>

Excerpts from Waiman Long's message of July 7, 2020 4:39 am:
> On 7/6/20 12:35 AM, Nicholas Piggin wrote:
>> v3 is updated to use __pv_queued_spin_unlock, noticed by Waiman (thank you).
>>
>> Thanks,
>> Nick
>>
>> Nicholas Piggin (6):
>>    powerpc/powernv: must include hvcall.h to get PAPR defines
>>    powerpc/pseries: move some PAPR paravirt functions to their own file
>>    powerpc: move spinlock implementation to simple_spinlock
>>    powerpc/64s: implement queued spinlocks and rwlocks
>>    powerpc/pseries: implement paravirt qspinlocks for SPLPAR
>>    powerpc/qspinlock: optimised atomic_try_cmpxchg_lock that adds the
>>      lock hint
>>
>>   arch/powerpc/Kconfig                          |  13 +
>>   arch/powerpc/include/asm/Kbuild               |   2 +
>>   arch/powerpc/include/asm/atomic.h             |  28 ++
>>   arch/powerpc/include/asm/paravirt.h           |  89 +++++
>>   arch/powerpc/include/asm/qspinlock.h          |  91 ++++++
>>   arch/powerpc/include/asm/qspinlock_paravirt.h |   7 +
>>   arch/powerpc/include/asm/simple_spinlock.h    | 292 +++++++++++++++++
>>   .../include/asm/simple_spinlock_types.h       |  21 ++
>>   arch/powerpc/include/asm/spinlock.h           | 308 +-----------------
>>   arch/powerpc/include/asm/spinlock_types.h     |  17 +-
>>   arch/powerpc/lib/Makefile                     |   3 +
>>   arch/powerpc/lib/locks.c                      |  12 +-
>>   arch/powerpc/platforms/powernv/pci-ioda-tce.c |   1 +
>>   arch/powerpc/platforms/pseries/Kconfig        |   5 +
>>   arch/powerpc/platforms/pseries/setup.c        |   6 +-
>>   include/asm-generic/qspinlock.h               |   4 +
>>   16 files changed, 577 insertions(+), 322 deletions(-)
>>   create mode 100644 arch/powerpc/include/asm/paravirt.h
>>   create mode 100644 arch/powerpc/include/asm/qspinlock.h
>>   create mode 100644 arch/powerpc/include/asm/qspinlock_paravirt.h
>>   create mode 100644 arch/powerpc/include/asm/simple_spinlock.h
>>   create mode 100644 arch/powerpc/include/asm/simple_spinlock_types.h
>>
> This patch looks OK to me.

Thanks for reviewing and testing.

> I had run some microbenchmark on powerpc system with or w/o the patch.
> 
> On a 2-socket 160-thread SMT4 POWER9 system (not virtualized):
> 
> 5.8.0-rc4
> =========
> 
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 160, Min/Mean/Max = 77,665/90,153/106,895
> Threads = 160, Total Rate = 1,441,759 op/s; Percpu Rate = 9,011 op/s
> 
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 160, Min/Mean/Max = 47,879/53,807/63,689
> Threads = 160, Total Rate = 860,192 op/s; Percpu Rate = 5,376 op/s
> 
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 80, Min/Mean/Max = 242,907/319,514/463,161
> Threads = 80, Total Rate = 2,555 kop/s; Percpu Rate = 32 kop/s
> 
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 80, Min/Mean/Max = 146,161/187,474/259,270
> Threads = 80, Total Rate = 1,498 kop/s; Percpu Rate = 19 kop/s
> 
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 40, Min/Mean/Max = 646,639/1,000,817/1,455,205
> Threads = 40, Total Rate = 4,001 kop/s; Percpu Rate = 100 kop/s
> 
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 40, Min/Mean/Max = 402,165/597,132/814,555
> Threads = 40, Total Rate = 2,388 kop/s; Percpu Rate = 60 kop/s
> 
> 5.8.0-rc4-qlock+
> ================
> 
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 160, Min/Mean/Max = 123,835/124,580/124,587
> Threads = 160, Total Rate = 1,992 kop/s; Percpu Rate = 12 kop/s
> 
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 160, Min/Mean/Max = 254,210/264,714/276,784
> Threads = 160, Total Rate = 4,231 kop/s; Percpu Rate = 26 kop/s
> 
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 80, Min/Mean/Max = 599,715/603,397/603,450
> Threads = 80, Total Rate = 4,825 kop/s; Percpu Rate = 60 kop/s
> 
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 80, Min/Mean/Max = 492,687/525,224/567,456
> Threads = 80, Total Rate = 4,199 kop/s; Percpu Rate = 52 kop/s
> 
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 40, Min/Mean/Max = 1,325,623/1,325,628/1,325,636
> Threads = 40, Total Rate = 5,299 kop/s; Percpu Rate = 132 kop/s
> 
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 40, Min/Mean/Max = 1,249,731/1,292,977/1,342,815
> Threads = 40, Total Rate = 5,168 kop/s; Percpu Rate = 129 kop/s
> 
> On systems on large number of cpus, qspinlock lock is faster and more fair.
> 
> With some tuning, we may be able to squeeze out more performance.

Yes, powerpc could certainly get more performance out of the slow
paths, and then there are a few parameters to tune.

We don't have a good alternate patching for function calls yet, but
that would be something to do for native vs pv.

And then there seem to be one or two tunable parameters we could
experiment with.

The paravirt locks may need a bit more tuning. Some simple testing
under KVM shows we might be a bit slower in some cases. Whether this
is fairness or something else I'm not sure. The current simple pv
spinlock code can do a directed yield to the lock holder CPU, whereas 
the pv qspl here just does a general yield. I think we might actually
be able to change that to also support directed yield. Though I'm
not sure if this is actually the cause of the slowdown yet.

Thanks,
Nick

^ permalink raw reply

* Re: [PATCH] powerpc/pseries: detect secure and trusted boot state of the system.
From: Michael Ellerman @ 2020-07-07  5:56 UTC (permalink / raw)
  To: Nayna Jain, linuxppc-dev; +Cc: Nayna Jain, linux-kernel, Mimi Zohar
In-Reply-To: <1593882535-21368-1-git-send-email-nayna@linux.ibm.com>

Nayna Jain <nayna@linux.ibm.com> writes:
> The device-tree property to check secure and trusted boot state is
> different for guests(pseries) compared to baremetal(powernv).
>
> This patch updates the existing is_ppc_secureboot_enabled() and
> is_ppc_trustedboot_enabled() function to add support for pseries.
>
> Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
> ---
>  arch/powerpc/kernel/secure_boot.c | 31 +++++++++++++++++++++++++------
>  1 file changed, 25 insertions(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/kernel/secure_boot.c b/arch/powerpc/kernel/secure_boot.c
> index 4b982324d368..43fc6607c7a5 100644
> --- a/arch/powerpc/kernel/secure_boot.c
> +++ b/arch/powerpc/kernel/secure_boot.c
> @@ -6,6 +6,7 @@
>  #include <linux/types.h>
>  #include <linux/of.h>
>  #include <asm/secure_boot.h>
> +#include <asm/machdep.h>
>  
>  static struct device_node *get_ppc_fw_sb_node(void)
>  {
> @@ -23,11 +24,20 @@ bool is_ppc_secureboot_enabled(void)
>  {
>  	struct device_node *node;
>  	bool enabled = false;
> +	const u32 *secureboot;
>  
> -	node = get_ppc_fw_sb_node();
> -	enabled = of_property_read_bool(node, "os-secureboot-enforcing");
> +	if (machine_is(powernv)) {
> +		node = get_ppc_fw_sb_node();
> +		enabled =
> +		    of_property_read_bool(node, "os-secureboot-enforcing");
> +		of_node_put(node);
> +	}

We generally try to avoid adding new machine_is() checks if we can.

In a case like this I think you can just check for both properties
regardless of what platform you're on.
  
> -	of_node_put(node);
> +	if (machine_is(pseries)) {
> +		secureboot = of_get_property(of_root, "ibm,secure-boot", NULL);
> +		if (secureboot)
> +			enabled = (*secureboot > 1) ? true : false;
> +	}

Please don't use of_get_property() in new code. Use one of the properly
typed accessors that handles endian conversion for you.

cheers

^ permalink raw reply

* Re: [PATCH] powerpc: select ARCH_HAS_MEMBARRIER_SYNC_CORE
From: Nicholas Piggin @ 2020-07-07  5:50 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev; +Cc: linux-arch, Mathieu Desnoyers
In-Reply-To: <cf10b0bc-de79-1b2b-8355-fc7bbeec47c3@csgroup.eu>

Excerpts from Christophe Leroy's message of July 6, 2020 7:53 pm:
> 
> 
> Le 06/07/2020 à 04:18, Nicholas Piggin a écrit :
>> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
>> index 47bd4ea0837d..b88cb3a989b6 100644
>> --- a/arch/powerpc/include/asm/exception-64s.h
>> +++ b/arch/powerpc/include/asm/exception-64s.h
>> @@ -68,6 +68,10 @@
>>    *
>>    * The nop instructions allow us to insert one or more instructions to flush the
>>    * L1-D cache when returning to userspace or a guest.
>> + *
>> + * powerpc relies on return from interrupt/syscall being context synchronising
>> + * (which hrfid, rfid, and rfscv are) to support ARCH_HAS_MEMBARRIER_SYNC_CORE
>> + * without additional additional synchronisation instructions.
> 
> This file is dedicated to BOOK3S/64. What about other ones ?
> 
> On 32 bits, this is also valid as 'rfi' is also context synchronising, 
> but then why just add some comment in exception-64s.h and only there ?

Yeah you're right, I basically wanted to keep a note there just in case,
because it's possible we would get a less synchronising return (maybe
unlikely with meltdown) or even return from a kernel interrupt using a
something faster (e.g., bctar if we don't use tar register in the kernel
anywhere).

So I wonder where to add the note, entry_32.S and 64e.h as well?

I should actually change the comment for 64-bit because soft masked 
interrupt replay is an interesting case. I thought it was okay (because 
the IPI would cause a hard interrupt which does do the rfi) but that 
should at least be written. The context synchronisation happens before
the Linux IPI function is called, but for the purpose of membarrier I 
think that is okay (the membarrier just needs to have caused a memory
barrier + context synchronistaion by the time it has done).

Thanks,
Nick

^ permalink raw reply

* Re: [PATCH v3 1/2] powerpc/perf/hv-24x7: Add cpu hotplug support
From: Madhavan Srinivasan @ 2020-07-07  5:36 UTC (permalink / raw)
  To: Michael Ellerman, Kajol Jain, linuxppc-dev
  Cc: nathanl, ego, suka, maddy, anju
In-Reply-To: <87o8or53fh.fsf@mpe.ellerman.id.au>



On 7/7/20 10:26 AM, Michael Ellerman wrote:
> Madhavan Srinivasan <maddy@linux.ibm.com> writes:
>> On 7/6/20 8:43 AM, Michael Ellerman wrote:
>>> Kajol Jain <kjain@linux.ibm.com> writes:
>>>> Patch here adds cpu hotplug functions to hv_24x7 pmu.
>>>> A new cpuhp_state "CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE" enum
>>>> is added.
>>>>
>>>> The online callback function updates the cpumask only if its
>>>> empty. As the primary intention of adding hotplug support
>>>> is to designate a CPU to make HCALL to collect the
>>>> counter data.
>>>>
>>>> The offline function test and clear corresponding cpu in a cpumask
>>>> and update cpumask to any other active cpu.
>>>>
>>>> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
>>>> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
>>>> ---
>>>>    arch/powerpc/perf/hv-24x7.c | 45 +++++++++++++++++++++++++++++++++++++
>>>>    include/linux/cpuhotplug.h  |  1 +
>>>>    2 files changed, 46 insertions(+)
>>>>
>>>> diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
>>>> index db213eb7cb02..ce4739e2b407 100644
>>>> --- a/arch/powerpc/perf/hv-24x7.c
>>>> +++ b/arch/powerpc/perf/hv-24x7.c
>>>> @@ -31,6 +31,8 @@ static int interface_version;
>>>>    /* Whether we have to aggregate result data for some domains. */
>>>>    static bool aggregate_result_elements;
>>>>    
>>>> +static cpumask_t hv_24x7_cpumask;
>>>> +
>>>>    static bool domain_is_valid(unsigned domain)
>>>>    {
>>>>    	switch (domain) {
>>>> @@ -1641,6 +1643,44 @@ static struct pmu h_24x7_pmu = {
>>>>    	.capabilities = PERF_PMU_CAP_NO_EXCLUDE,
>>>>    };
>>>>    
>>>> +static int ppc_hv_24x7_cpu_online(unsigned int cpu)
>>>> +{
>>>> +	/* Make this CPU the designated target for counter collection */
>>> The comment implies every newly onlined CPU will become the target, but
>>> actually it's only the first onlined CPU.
>>>
>>> So I think the comment needs updating, or you could just drop the
>>> comment, I think the code is fairly clear by itself.
>>>
>>>> +	if (cpumask_empty(&hv_24x7_cpumask))
>>>> +		cpumask_set_cpu(cpu, &hv_24x7_cpumask);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int ppc_hv_24x7_cpu_offline(unsigned int cpu)
>>>> +{
>>>> +	int target = -1;
>>> No need to initialise target, you assign to it unconditionally below.
>>>
>>>> +	/* Check if exiting cpu is used for collecting 24x7 events */
>>>> +	if (!cpumask_test_and_clear_cpu(cpu, &hv_24x7_cpumask))
>>>> +		return 0;
>>>> +
>>>> +	/* Find a new cpu to collect 24x7 events */
>>>> +	target = cpumask_last(cpu_active_mask);
>>> Any reason to use cpumask_last() vs cpumask_first(), or a randomly
>>> chosen CPU?
>>>
>>>> +	if (target < 0 || target >= nr_cpu_ids)
>>>> +		return -1;
>>>> +
>>>> +	/* Migrate 24x7 events to the new target */
>>>> +	cpumask_set_cpu(target, &hv_24x7_cpumask);
>>>> +	perf_pmu_migrate_context(&h_24x7_pmu, cpu, target);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int hv_24x7_cpu_hotplug_init(void)
>>>> +{
>>>> +	return cpuhp_setup_state(CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE,
>>>> +			  "perf/powerpc/hv_24x7:online",
>>>> +			  ppc_hv_24x7_cpu_online,
>>>> +			  ppc_hv_24x7_cpu_offline);
>>>> +}
>>>> +
>>>>    static int hv_24x7_init(void)
>>>>    {
>>>>    	int r;
>>>> @@ -1685,6 +1725,11 @@ static int hv_24x7_init(void)
>>>>    	if (r)
>>>>    		return r;
>>>>    
>>>> +	/* init cpuhotplug */
>>>> +	r = hv_24x7_cpu_hotplug_init();
>>>> +	if (r)
>>>> +		pr_err("hv_24x7: CPU hotplug init failed\n");
>>>> +
>>> The hotplug initialisation shouldn't fail unless something is badly
>>> wrong. I think you should just fail initialisation of the entire PMU if
>>> that happens, which will make the error handling in the next patch much
>>> simpler.
>> We  did fail the PMU registration on failure of the hotplug
>> code (and yes error handling is much simpler), but on internal
>> review/discussion,
>> what came up was that, hv_24x7 PMU will still be usable without
>> the hotplug code (with "-C" option to perf tool command line).
> In theory yes.
>
> But in reality no one will ever test that case, so the code will easily
> bit rot.
>
> Even if it doesn't bit rot, you've now created another state the system
> can legally be in (hotplug init failed but PMU still probed), which you
> have to test, document & support.
>
> If the hotplug init fails then something is badly wrong, the best thing
> we can do is bail on the PMU initialisation and hope the rest of the
> system boots OK.

Yep agreed. Thanks for the comments mpe

Maddy

>
> cheers


^ permalink raw reply

* Re: [PATCH v12 00/31] Speculative page faults
From: Chinwen Chang @ 2020-07-07  5:31 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: jack, sergey.senozhatsky.work, peterz, Will Deacon, mhocko,
	linux-mm, paulus, Punit Agrawal, hpa, Michel Lespinasse,
	Alexei Starovoitov, Andrea Arcangeli, ak, Minchan Kim,
	aneesh.kumar, x86, Matthew Wilcox, Daniel Jordan, Ingo Molnar,
	zhong jiang, David Rientjes, paulmck, npiggin, sj38.park,
	Jerome Glisse, dave, kemi.wang, kirill, Thomas Gleixner,
	Haiyan Song, Ganesh Mahendran, Yang Shi, Mike Rapoport,
	linuxppc-dev, linux-kernel, Sergey Senozhatsky, miles.chen,
	vinayak menon, akpm, Tim Chen, haren
In-Reply-To: <490c0811-50cd-0802-2cbc-9c031ef309f6@linux.ibm.com>

On Mon, 2020-07-06 at 14:27 +0200, Laurent Dufour wrote:
> Le 06/07/2020 à 11:25, Chinwen Chang a écrit :
> > On Thu, 2019-06-20 at 16:19 +0800, Haiyan Song wrote:
> >> Hi Laurent,
> >>
> >> I downloaded your script and run it on Intel 2s skylake platform with spf-v12 patch
> >> serials.
> >>
> >> Here attached the output results of this script.
> >>
> >> The following comparison result is statistics from the script outputs.
> >>
> >> a). Enable THP
> >>                                              SPF_0          change       SPF_1
> >> will-it-scale.page_fault2.per_thread_ops    2664190.8      -11.7%       2353637.6
> >> will-it-scale.page_fault3.per_thread_ops    4480027.2      -14.7%       3819331.9
> >>
> >>
> >> b). Disable THP
> >>                                              SPF_0           change      SPF_1
> >> will-it-scale.page_fault2.per_thread_ops    2653260.7       -10%        2385165.8
> >> will-it-scale.page_fault3.per_thread_ops    4436330.1       -12.4%      3886734.2
> >>
> >>
> >> Thanks,
> >> Haiyan Song
> >>
> >>
> >> On Fri, Jun 14, 2019 at 10:44:47AM +0200, Laurent Dufour wrote:
> >>> Le 14/06/2019 à 10:37, Laurent Dufour a écrit :
> >>>> Please find attached the script I run to get these numbers.
> >>>> This would be nice if you could give it a try on your victim node and share the result.
> >>>
> >>> Sounds that the Intel mail fitering system doesn't like the attached shell script.
> >>> Please find it there: https://urldefense.com/v3/__https://gist.github.com/ldu4/a5cc1a93f293108ea387d43d5d5e7f44__;!!CTRNKA9wMg0ARbw!0lux2FMCbIFxFEl824CdSuSQqT0IVWsvyUqfDVJNEVb9gTWyRltm7cpPZg70N_XhXmMZ$ 
> >>>
> >>> Thanks,
> >>> Laurent.
> >>>
> > 
> > Hi Laurent,
> > 
> > We merged SPF v11 and some patches from v12 into our platforms. After
> > several experiments, we observed SPF has obvious improvements on the
> > launch time of applications, especially for those high-TLP ones,
> > 
> > # launch time of applications(s):
> > 
> > package           version      w/ SPF      w/o SPF      improve(%)
> > ------------------------------------------------------------------
> > Baidu maps        10.13.3      0.887       0.98         9.49
> > Taobao            8.4.0.35     1.227       1.293        5.10
> > Meituan           9.12.401     1.107       1.543        28.26
> > WeChat            7.0.3        2.353       2.68         12.20
> > Honor of Kings    1.43.1.6     6.63        6.713        1.24
> 
> That's great news, thanks for reporting this!
> 
> > 
> > By the way, we have verified our platforms with those patches and
> > achieved the goal of mass production.
> 
> Another good news!
> For my information, what is your targeted hardware?
> 
> Cheers,
> Laurent.

Hi Laurent,

Our targeted hardware belongs to ARM64 multi-core series.

Thanks.
Chinwen
> 


^ permalink raw reply

* Re: [PATCH] powerpc/numa: Restrict possible nodes based on platform
From: Michael Ellerman @ 2020-07-07  5:02 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Nathan Lynch, linuxppc-dev, Srikar Dronamraju, Bharata B Rao
In-Reply-To: <20200706064002.14848-1-srikar@linux.vnet.ibm.com>

Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> As per PAPR, there are 2 device tree property
> ibm,max-associativity-domains (which defines the maximum number of
> domains that the firmware i.e PowerVM can support) and
> ibm,current-associativity-domains (which defines the maximum number of
> domains that the platform can support). Value of
> ibm,max-associativity-domains property is always greater than or equal
> to ibm,current-associativity-domains property.

Where is it documented?

It's definitely not in LoPAPR.

> Powerpc currently uses ibm,max-associativity-domains  property while
> setting the possible number of nodes. This is currently set at 32.
> However the possible number of nodes for a platform may be significantly
> less. Hence set the possible number of nodes based on
> ibm,current-associativity-domains property.
>
> $ lsprop /proc/device-tree/rtas/ibm,*associ*-domains
> /proc/device-tree/rtas/ibm,current-associativity-domains
> 		 00000005 00000001 00000002 00000002 00000002 00000010
> /proc/device-tree/rtas/ibm,max-associativity-domains
> 		 00000005 00000001 00000008 00000020 00000020 00000100
>
> $ cat /sys/devices/system/node/possible ##Before patch
> 0-31
>
> $ cat /sys/devices/system/node/possible ##After patch
> 0-1
>
> Note the maximum nodes this platform can support is only 2 but the
> possible nodes is set to 32.

But what about LPM to a system with more nodes?

cheers

^ permalink raw reply

* [PATCH v2] MAINTAINERS: Add Shengjiu to reviewer list of sound/soc/fsl
From: Nicolin Chen @ 2020-07-07  4:58 UTC (permalink / raw)
  To: broonie
  Cc: alsa-devel, shengjiu.wang, Xiubo.Lee, festevam, shengjiu.wang,
	timur, linux-kernel, linuxppc-dev

Add Shengjiu who's actively working on the latest fsl/nxp audio drivers.

Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Acked-by: Shengjiu Wang <shengjiu.wang@gmail.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
---
Changelog
v1->v2:
 * Replaced Shengjiu's emaill address with his gmail one
 * Added Acked-by from Shengjiu and Reviewed-by from Fabio

 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 496fd4eafb68..ff97b8cefaea 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6956,6 +6956,7 @@ M:	Timur Tabi <timur@kernel.org>
 M:	Nicolin Chen <nicoleotsuka@gmail.com>
 M:	Xiubo Li <Xiubo.Lee@gmail.com>
 R:	Fabio Estevam <festevam@gmail.com>
+R:	Shengjiu Wang <shengjiu.wang@gmail.com>
 L:	alsa-devel@alsa-project.org (moderated for non-subscribers)
 L:	linuxppc-dev@lists.ozlabs.org
 S:	Maintained
-- 
2.17.1


^ permalink raw reply related

* Re: [PATCH v3 1/2] powerpc/perf/hv-24x7: Add cpu hotplug support
From: Michael Ellerman @ 2020-07-07  4:56 UTC (permalink / raw)
  To: Madhavan Srinivasan, Kajol Jain, linuxppc-dev
  Cc: nathanl, ego, suka, maddy, anju
In-Reply-To: <3b45d050-c27a-e5e7-8649-924910f5b01b@linux.ibm.com>

Madhavan Srinivasan <maddy@linux.ibm.com> writes:
> On 7/6/20 8:43 AM, Michael Ellerman wrote:
>> Kajol Jain <kjain@linux.ibm.com> writes:
>>> Patch here adds cpu hotplug functions to hv_24x7 pmu.
>>> A new cpuhp_state "CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE" enum
>>> is added.
>>>
>>> The online callback function updates the cpumask only if its
>>> empty. As the primary intention of adding hotplug support
>>> is to designate a CPU to make HCALL to collect the
>>> counter data.
>>>
>>> The offline function test and clear corresponding cpu in a cpumask
>>> and update cpumask to any other active cpu.
>>>
>>> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
>>> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
>>> ---
>>>   arch/powerpc/perf/hv-24x7.c | 45 +++++++++++++++++++++++++++++++++++++
>>>   include/linux/cpuhotplug.h  |  1 +
>>>   2 files changed, 46 insertions(+)
>>>
>>> diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
>>> index db213eb7cb02..ce4739e2b407 100644
>>> --- a/arch/powerpc/perf/hv-24x7.c
>>> +++ b/arch/powerpc/perf/hv-24x7.c
>>> @@ -31,6 +31,8 @@ static int interface_version;
>>>   /* Whether we have to aggregate result data for some domains. */
>>>   static bool aggregate_result_elements;
>>>   
>>> +static cpumask_t hv_24x7_cpumask;
>>> +
>>>   static bool domain_is_valid(unsigned domain)
>>>   {
>>>   	switch (domain) {
>>> @@ -1641,6 +1643,44 @@ static struct pmu h_24x7_pmu = {
>>>   	.capabilities = PERF_PMU_CAP_NO_EXCLUDE,
>>>   };
>>>   
>>> +static int ppc_hv_24x7_cpu_online(unsigned int cpu)
>>> +{
>>> +	/* Make this CPU the designated target for counter collection */
>> The comment implies every newly onlined CPU will become the target, but
>> actually it's only the first onlined CPU.
>>
>> So I think the comment needs updating, or you could just drop the
>> comment, I think the code is fairly clear by itself.
>>
>>> +	if (cpumask_empty(&hv_24x7_cpumask))
>>> +		cpumask_set_cpu(cpu, &hv_24x7_cpumask);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int ppc_hv_24x7_cpu_offline(unsigned int cpu)
>>> +{
>>> +	int target = -1;
>> No need to initialise target, you assign to it unconditionally below.
>>
>>> +	/* Check if exiting cpu is used for collecting 24x7 events */
>>> +	if (!cpumask_test_and_clear_cpu(cpu, &hv_24x7_cpumask))
>>> +		return 0;
>>> +
>>> +	/* Find a new cpu to collect 24x7 events */
>>> +	target = cpumask_last(cpu_active_mask);
>> Any reason to use cpumask_last() vs cpumask_first(), or a randomly
>> chosen CPU?
>>
>>> +	if (target < 0 || target >= nr_cpu_ids)
>>> +		return -1;
>>> +
>>> +	/* Migrate 24x7 events to the new target */
>>> +	cpumask_set_cpu(target, &hv_24x7_cpumask);
>>> +	perf_pmu_migrate_context(&h_24x7_pmu, cpu, target);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int hv_24x7_cpu_hotplug_init(void)
>>> +{
>>> +	return cpuhp_setup_state(CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE,
>>> +			  "perf/powerpc/hv_24x7:online",
>>> +			  ppc_hv_24x7_cpu_online,
>>> +			  ppc_hv_24x7_cpu_offline);
>>> +}
>>> +
>>>   static int hv_24x7_init(void)
>>>   {
>>>   	int r;
>>> @@ -1685,6 +1725,11 @@ static int hv_24x7_init(void)
>>>   	if (r)
>>>   		return r;
>>>   
>>> +	/* init cpuhotplug */
>>> +	r = hv_24x7_cpu_hotplug_init();
>>> +	if (r)
>>> +		pr_err("hv_24x7: CPU hotplug init failed\n");
>>> +
>> The hotplug initialisation shouldn't fail unless something is badly
>> wrong. I think you should just fail initialisation of the entire PMU if
>> that happens, which will make the error handling in the next patch much
>> simpler.
>
> We  did fail the PMU registration on failure of the hotplug
> code (and yes error handling is much simpler), but on internal 
> review/discussion,
> what came up was that, hv_24x7 PMU will still be usable without
> the hotplug code (with "-C" option to perf tool command line).

In theory yes.

But in reality no one will ever test that case, so the code will easily
bit rot.

Even if it doesn't bit rot, you've now created another state the system
can legally be in (hotplug init failed but PMU still probed), which you
have to test, document & support.

If the hotplug init fails then something is badly wrong, the best thing
we can do is bail on the PMU initialisation and hope the rest of the
system boots OK.

cheers

^ permalink raw reply

* Re: [PATCH v3 1/2] powerpc/perf/hv-24x7: Add cpu hotplug support
From: Madhavan Srinivasan @ 2020-07-07  4:15 UTC (permalink / raw)
  To: Michael Ellerman, Kajol Jain, linuxppc-dev
  Cc: nathanl, ego, suka, maddy, anju
In-Reply-To: <87zh8d5oab.fsf@mpe.ellerman.id.au>



On 7/6/20 8:43 AM, Michael Ellerman wrote:
> Kajol Jain <kjain@linux.ibm.com> writes:
>> Patch here adds cpu hotplug functions to hv_24x7 pmu.
>> A new cpuhp_state "CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE" enum
>> is added.
>>
>> The online callback function updates the cpumask only if its
>> empty. As the primary intention of adding hotplug support
>> is to designate a CPU to make HCALL to collect the
>> counter data.
>>
>> The offline function test and clear corresponding cpu in a cpumask
>> and update cpumask to any other active cpu.
>>
>> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
>> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
>> ---
>>   arch/powerpc/perf/hv-24x7.c | 45 +++++++++++++++++++++++++++++++++++++
>>   include/linux/cpuhotplug.h  |  1 +
>>   2 files changed, 46 insertions(+)
>>
>> diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
>> index db213eb7cb02..ce4739e2b407 100644
>> --- a/arch/powerpc/perf/hv-24x7.c
>> +++ b/arch/powerpc/perf/hv-24x7.c
>> @@ -31,6 +31,8 @@ static int interface_version;
>>   /* Whether we have to aggregate result data for some domains. */
>>   static bool aggregate_result_elements;
>>   
>> +static cpumask_t hv_24x7_cpumask;
>> +
>>   static bool domain_is_valid(unsigned domain)
>>   {
>>   	switch (domain) {
>> @@ -1641,6 +1643,44 @@ static struct pmu h_24x7_pmu = {
>>   	.capabilities = PERF_PMU_CAP_NO_EXCLUDE,
>>   };
>>   
>> +static int ppc_hv_24x7_cpu_online(unsigned int cpu)
>> +{
>> +	/* Make this CPU the designated target for counter collection */
> The comment implies every newly onlined CPU will become the target, but
> actually it's only the first onlined CPU.
>
> So I think the comment needs updating, or you could just drop the
> comment, I think the code is fairly clear by itself.
>
>> +	if (cpumask_empty(&hv_24x7_cpumask))
>> +		cpumask_set_cpu(cpu, &hv_24x7_cpumask);
>> +
>> +	return 0;
>> +}
>> +
>> +static int ppc_hv_24x7_cpu_offline(unsigned int cpu)
>> +{
>> +	int target = -1;
> No need to initialise target, you assign to it unconditionally below.
>
>> +	/* Check if exiting cpu is used for collecting 24x7 events */
>> +	if (!cpumask_test_and_clear_cpu(cpu, &hv_24x7_cpumask))
>> +		return 0;
>> +
>> +	/* Find a new cpu to collect 24x7 events */
>> +	target = cpumask_last(cpu_active_mask);
> Any reason to use cpumask_last() vs cpumask_first(), or a randomly
> chosen CPU?
>
>> +	if (target < 0 || target >= nr_cpu_ids)
>> +		return -1;
>> +
>> +	/* Migrate 24x7 events to the new target */
>> +	cpumask_set_cpu(target, &hv_24x7_cpumask);
>> +	perf_pmu_migrate_context(&h_24x7_pmu, cpu, target);
>> +
>> +	return 0;
>> +}
>> +
>> +static int hv_24x7_cpu_hotplug_init(void)
>> +{
>> +	return cpuhp_setup_state(CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE,
>> +			  "perf/powerpc/hv_24x7:online",
>> +			  ppc_hv_24x7_cpu_online,
>> +			  ppc_hv_24x7_cpu_offline);
>> +}
>> +
>>   static int hv_24x7_init(void)
>>   {
>>   	int r;
>> @@ -1685,6 +1725,11 @@ static int hv_24x7_init(void)
>>   	if (r)
>>   		return r;
>>   
>> +	/* init cpuhotplug */
>> +	r = hv_24x7_cpu_hotplug_init();
>> +	if (r)
>> +		pr_err("hv_24x7: CPU hotplug init failed\n");
>> +
> The hotplug initialisation shouldn't fail unless something is badly
> wrong. I think you should just fail initialisation of the entire PMU if
> that happens, which will make the error handling in the next patch much
> simpler.

We  did fail the PMU registration on failure of the hotplug
code (and yes error handling is much simpler), but on internal 
review/discussion,
what came up was that, hv_24x7 PMU will still be usable without
the hotplug code (with "-C" option to perf tool command line).

Maddy

>
> cheers
>
>>   	r = perf_pmu_register(&h_24x7_pmu, h_24x7_pmu.name, -1);
>>   	if (r)
>>   		return r;


^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox