LinuxPPC-Dev Archive on lore.kernel.org
* Re: [PATCH net-next v2 9/9] ibmvnic: Do not replenish RX buffers after every polling loop
From: Thomas Falcon @ 2020-11-19 20:58 UTC (permalink / raw)
  To: ljp
  Cc: cforno12, netdev, ricklind, dnbanerg, Linuxppc-dev, drt, brking,
	kuba, sukadev, linuxppc-dev
In-Reply-To: <7853649c6c1f2f4ce6d8bf9643cd1a43@linux.vnet.ibm.com>

On 11/19/20 2:38 PM, ljp wrote:
> On 2020-11-19 14:26, Thomas Falcon wrote:
>> On 11/19/20 3:43 AM, ljp wrote:
>>
>>> On 2020-11-18 19:12, Thomas Falcon wrote:
>>>
>>>> From: "Dwip N. Banerjee" <dnbanerg@us.ibm.com>
>>>>
>>>> Reduce the amount of time spent replenishing RX buffers by
>>>> only doing so once available buffers has fallen under a certain
>>>> threshold, in this case half of the total number of buffers, or
>>>> if the polling loop exits before the packets processed is less
>>>> than its budget.
>>>>
>>>> Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
>>>> ---
>>>> drivers/net/ethernet/ibm/ibmvnic.c | 5 ++++-
>>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c
>>>> b/drivers/net/ethernet/ibm/ibmvnic.c
>>>> index 96df6d8fa277..9fe43ab0496d 100644
>>>> --- a/drivers/net/ethernet/ibm/ibmvnic.c
>>>> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
>>>> @@ -2537,7 +2537,10 @@ static int ibmvnic_poll(struct napi_struct
>>>>
>>>> *napi, int budget)
>>>> frames_processed++;
>>>> }
>>>>
>>>> - if (adapter->state != VNIC_CLOSING)
>>>> + if (adapter->state != VNIC_CLOSING &&
>>>> + ((atomic_read(&adapter->rx_pool[scrq_num].available) <
>>>> + adapter->req_rx_add_entries_per_subcrq / 2) ||
>>>> + frames_processed < budget))
>>>
>>> 1/2 seems a simple and good algorithm.
>>> Explaining why "frames_process < budget" is necessary in the commit
>>> message
>>> or source code also helps.
>>
>> Hello, Lijun. The patch author, Dwip Banerjee, suggested the modified
>> commit message below:
>>
>> Reduce the amount of time spent replenishing RX buffers by
>>  only doing so once available buffers has fallen under a certain
>>  threshold, in this case half of the total number of buffers, or
>>  if the polling loop exits before the packets processed is less
>>  than its budget. Non-exhaustion of NAPI budget implies lower
>>  incoming packet pressure, allowing the leeway to refill the buffers
>>  in preparation for any impending burst.
>
> It looks good to me.
>
>>
>> Would such an update require a v3?
>
> I assume you ask Jakub, right?
>
>
Yes. There was an issue with my mail client in my earlier response, so I 
am posting Dwip's modified commit message again below.

Reduce the amount of time spent replenishing RX buffers by only doing so
once the number of available buffers has fallen under a certain
threshold, in this case half of the total number of buffers, or if the
polling loop exits having processed fewer packets than its budget.
Non-exhaustion of the NAPI budget implies lower incoming packet pressure,
which leaves leeway to refill the buffers in preparation for any
impending burst.

>>>> replenish_rx_pool(adapter, &adapter->rx_pool[scrq_num]);
>>>> if (frames_processed < budget) {
>>>> if (napi_complete_done(napi, frames_processed)) {
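
For readers following the thread, the refill condition discussed above can
be sketched in plain userspace C. This is only an illustration of the
logic in the hunk, not driver code: should_replenish() and its parameters
are made-up names standing in for adapter->state, the pool's available
counter, req_rx_add_entries_per_subcrq, frames_processed and budget.

```c
#include <stdbool.h>

/*
 * Userspace sketch of the refill decision; all names are illustrative
 * stand-ins for the adapter fields used in ibmvnic_poll().
 */
static bool should_replenish(bool closing, int available, int pool_size,
			     int frames_processed, int budget)
{
	/* Never refill while the device is being closed. */
	if (closing)
		return false;

	/* Refill once fewer than half of the pool's buffers remain... */
	if (available < pool_size / 2)
		return true;

	/* ...or when the poll ended under budget, i.e. traffic is light. */
	return frames_processed < budget;
}
```

With a pool of 100 buffers and a budget of 64, a poll that consumes its
whole budget while 80 buffers are still available skips the refill; the
same poll with only 10 buffers left, or one that processed just 10 frames,
triggers it.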


* [PATCH v2 2/2] kbuild: Disable CONFIG_LD_ORPHAN_WARN for ld.lld 10.0.1
From: Nathan Chancellor @ 2020-11-19 20:46 UTC (permalink / raw)
  To: Masahiro Yamada, Michal Marek, Kees Cook
  Cc: linuxppc-dev, kernelci . org bot, linux-kbuild, Catalin Marinas,
	Mark Brown, x86, Nick Desaulniers, Russell King, linux-kernel,
	clang-built-linux, Arvind Sankar, Ingo Molnar, Borislav Petkov,
	Thomas Gleixner, Will Deacon, Nathan Chancellor, linux-arm-kernel
In-Reply-To: <20201113195553.1487659-1-natechancellor@gmail.com>

ld.lld 10.0.1 spews a bunch of various warnings about .rela sections,
along with a few others. Newer versions of ld.lld do not have these
warnings. As a result, do not add '--orphan-handling=warn' to
LDFLAGS_vmlinux if ld.lld's version is not new enough.

Link: https://github.com/ClangBuiltLinux/linux/issues/1187
Link: https://github.com/ClangBuiltLinux/linux/issues/1193
Reported-by: Arvind Sankar <nivedita@alum.mit.edu>
Reported-by: kernelci.org bot <bot@kernelci.org>
Reported-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
---

v1 -> v2:

* Add condition as a depends on line (Kees Cook)

* Capture output of "$* --version" to avoid invoking linker twice (Nick
  Desaulniers)

* Improve documentation of script in comments (Nick Desaulniers)

* Pick up review tag from Kees

 MAINTAINERS            |  1 +
 init/Kconfig           |  5 +++++
 scripts/lld-version.sh | 20 ++++++++++++++++++++
 3 files changed, 26 insertions(+)
 create mode 100755 scripts/lld-version.sh

diff --git a/MAINTAINERS b/MAINTAINERS
index e451dcce054f..e6f74f130ae1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4284,6 +4284,7 @@ B:	https://github.com/ClangBuiltLinux/linux/issues
 C:	irc://chat.freenode.net/clangbuiltlinux
 F:	Documentation/kbuild/llvm.rst
 F:	scripts/clang-tools/
+F:	scripts/lld-version.sh
 K:	\b(?i:clang|llvm)\b
 
 CLEANCACHE API
diff --git a/init/Kconfig b/init/Kconfig
index 92c58b45abb8..b9037d6c5ab3 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -47,6 +47,10 @@ config CLANG_VERSION
 	int
 	default $(shell,$(srctree)/scripts/clang-version.sh $(CC))
 
+config LLD_VERSION
+	int
+	default $(shell,$(srctree)/scripts/lld-version.sh $(LD))
+
 config CC_CAN_LINK
 	bool
 	default $(success,$(srctree)/scripts/cc-can-link.sh $(CC) $(CLANG_FLAGS) $(m64-flag)) if 64BIT
@@ -1351,6 +1355,7 @@ config LD_DEAD_CODE_DATA_ELIMINATION
 config LD_ORPHAN_WARN
 	def_bool y
 	depends on ARCH_WANT_LD_ORPHAN_WARN
+	depends on !LD_IS_LLD || LLD_VERSION >= 110000
 	depends on $(ld-option,--orphan-handling=warn)
 
 config SYSCTL
diff --git a/scripts/lld-version.sh b/scripts/lld-version.sh
new file mode 100755
index 000000000000..d70edb4d8a4f
--- /dev/null
+++ b/scripts/lld-version.sh
@@ -0,0 +1,20 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+#
+# Usage: $ ./scripts/lld-version.sh ld.lld
+#
+# Print the linker version of `ld.lld' in a 5 or 6-digit form
+# such as `100001' for ld.lld 10.0.1 etc.
+
+linker_string="$($* --version)"
+
+if ! ( echo $linker_string | grep -q LLD ); then
+	echo 0
+	exit 1
+fi
+
+VERSION=$(echo $linker_string | cut -d ' ' -f 2)
+MAJOR=$(echo $VERSION | cut -d . -f 1)
+MINOR=$(echo $VERSION | cut -d . -f 2)
+PATCHLEVEL=$(echo $VERSION | cut -d . -f 3)
+printf "%d%02d%02d\\n" $MAJOR $MINOR $PATCHLEVEL
-- 
2.29.2
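
The version encoding produced by scripts/lld-version.sh, and the
comparison against 110000 in the new "depends on" line, can be illustrated
with a small C sketch. It is not part of the patch: lld_version_code() is
a hypothetical helper, and it assumes the version line begins with
"LLD <major>.<minor>.<patch>", which is stricter than the script (the
script greps for "LLD" anywhere and takes the second space-separated
field).

```c
#include <stdio.h>

/*
 * Hypothetical helper: turn the first line of "ld.lld --version" into the
 * numeric form used for LLD_VERSION. Returns 0 when the string does not
 * look like LLD output.
 */
static int lld_version_code(const char *version_line)
{
	int major = 0, minor = 0, patch = 0;

	/* Assumes the line starts with "LLD <major>.<minor>.<patch>". */
	if (sscanf(version_line, "LLD %d.%d.%d", &major, &minor, &patch) != 3)
		return 0;

	/* Same layout as printf "%d%02d%02d" in the script. */
	return major * 10000 + minor * 100 + patch;
}
```

Under this encoding ld.lld 10.0.1 maps to 100001, which is below 110000,
so LD_ORPHAN_WARN stays disabled for it, while 11.0.0 maps to 110000 and
satisfies the dependency.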



* Re: [PATCH net-next v2 9/9] ibmvnic: Do not replenish RX buffers after every polling loop
From: Thomas Falcon @ 2020-11-19 20:26 UTC (permalink / raw)
  To: ljp
  Cc: cforno12, netdev, ricklind, dnbanerg, Linuxppc-dev, drt, brking,
	kuba, sukadev, linuxppc-dev
In-Reply-To: <1a4e7b1ef1fb101cbb26fb9d5867ee46@linux.vnet.ibm.com>


On 11/19/20 3:43 AM, ljp wrote:
> On 2020-11-18 19:12, Thomas Falcon wrote:
>> From: "Dwip N. Banerjee" <dnbanerg@us.ibm.com>
>>
>> Reduce the amount of time spent replenishing RX buffers by
>> only doing so once available buffers has fallen under a certain
>> threshold, in this case half of the total number of buffers, or
>> if the polling loop exits before the packets processed is less
>> than its budget.
>>
>> Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
>> ---
>>  drivers/net/ethernet/ibm/ibmvnic.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c
>> b/drivers/net/ethernet/ibm/ibmvnic.c
>> index 96df6d8fa277..9fe43ab0496d 100644
>> --- a/drivers/net/ethernet/ibm/ibmvnic.c
>> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
>> @@ -2537,7 +2537,10 @@ static int ibmvnic_poll(struct napi_struct
>> *napi, int budget)
>>          frames_processed++;
>>      }
>>
>> -    if (adapter->state != VNIC_CLOSING)
>> +    if (adapter->state != VNIC_CLOSING &&
>> + ((atomic_read(&adapter->rx_pool[scrq_num].available) <
>> +          adapter->req_rx_add_entries_per_subcrq / 2) ||
>> +          frames_processed < budget))
>
> 1/2 seems a simple and good algorithm.
> Explaining why "frames_process < budget" is necessary in the commit 
> message
> or source code also helps.
>
Hello, Lijun. The patch author, Dwip Banerjee, suggested the modified 
commit message below:

Reduce the amount of time spent replenishing RX buffers by
only doing so once available buffers has fallen under a certain
threshold, in this case half of the total number of buffers, or
if the polling loop exits before the packets processed is less
than its budget. Non-exhaustion of NAPI budget implies lower
incoming packet pressure, allowing the leeway to refill the buffers
in preparation for any impending burst.

Would such an update require a v3?

>
>>          replenish_rx_pool(adapter, &adapter->rx_pool[scrq_num]);
>>      if (frames_processed < budget) {
>>          if (napi_complete_done(napi, frames_processed)) {


* [PATCH v2 1/2] kbuild: Hoist '--orphan-handling' into Kconfig
From: Nathan Chancellor @ 2020-11-19 20:46 UTC (permalink / raw)
  To: Masahiro Yamada, Michal Marek, Kees Cook
  Cc: linuxppc-dev, linux-kbuild, Catalin Marinas, x86,
	Nick Desaulniers, Russell King, linux-kernel, clang-built-linux,
	Arvind Sankar, Ingo Molnar, Borislav Petkov, Thomas Gleixner,
	Will Deacon, Nathan Chancellor, linux-arm-kernel
In-Reply-To: <20201113195553.1487659-1-natechancellor@gmail.com>

Currently, '--orphan-handling=warn' is spread out across four different
architectures in their respective Makefiles, which makes it a little
unruly to deal with in case it needs to be disabled for a specific
linker version (in this case, ld.lld 10.0.1).

To make it easier to control this, hoist this warning into Kconfig and
the main Makefile so that disabling it is simpler, as the warning will
only be enabled in a couple places (main Makefile and a couple of
compressed boot folders that blow away LDFLAGS_vmlinux) and making it
conditional is easier due to Kconfig syntax. One small additional
benefit of this is saving a call to ld-option on incremental builds
because we will have already evaluated it for CONFIG_LD_ORPHAN_WARN.

To keep the list of supported architectures the same, introduce
CONFIG_ARCH_WANT_LD_ORPHAN_WARN, which an architecture can select to
gain this automatically after all of the sections are specified and size
asserted. A special thanks to Kees Cook for the help text on this
config.

Link: https://github.com/ClangBuiltLinux/linux/issues/1187
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
---

v1 -> v2:

* Change

  ifeq ($(CONFIG_LD_ORPHAN_WARN),y)

  to

  ifdef CONFIG_LD_ORPHAN_WARN

  to improve readability (Michael Ellerman)

* Separate conditions for CONFIG_LD_ORPHAN_WARN to improve
  readability (Kees Cook)

* Pick up tags from Kees, Michael, and Nick

 Makefile                          | 6 ++++++
 arch/Kconfig                      | 9 +++++++++
 arch/arm/Kconfig                  | 1 +
 arch/arm/Makefile                 | 4 ----
 arch/arm/boot/compressed/Makefile | 4 +++-
 arch/arm64/Kconfig                | 1 +
 arch/arm64/Makefile               | 4 ----
 arch/powerpc/Kconfig              | 1 +
 arch/powerpc/Makefile             | 1 -
 arch/x86/Kconfig                  | 1 +
 arch/x86/Makefile                 | 3 ---
 arch/x86/boot/compressed/Makefile | 4 +++-
 init/Kconfig                      | 5 +++++
 13 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/Makefile b/Makefile
index e2c3f65c4721..2c7116299f1f 100644
--- a/Makefile
+++ b/Makefile
@@ -984,6 +984,12 @@ ifeq ($(CONFIG_RELR),y)
 LDFLAGS_vmlinux	+= --pack-dyn-relocs=relr
 endif
 
+# We never want expected sections to be placed heuristically by the
+# linker. All sections should be explicitly named in the linker script.
+ifdef CONFIG_LD_ORPHAN_WARN
+LDFLAGS_vmlinux += --orphan-handling=warn
+endif
+
 # Align the bit size of userspace programs with the kernel
 KBUILD_USERCFLAGS  += $(filter -m32 -m64 --target=%, $(KBUILD_CFLAGS))
 KBUILD_USERLDFLAGS += $(filter -m32 -m64 --target=%, $(KBUILD_CFLAGS))
diff --git a/arch/Kconfig b/arch/Kconfig
index 56b6ccc0e32d..ba4e966484ab 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1028,6 +1028,15 @@ config HAVE_STATIC_CALL_INLINE
 	bool
 	depends on HAVE_STATIC_CALL
 
+config ARCH_WANT_LD_ORPHAN_WARN
+	bool
+	help
+	  An arch should select this symbol once all linker sections are explicitly
+	  included, size-asserted, or discarded in the linker scripts. This is
+	  important because we never want expected sections to be placed heuristically
+	  by the linker, since the locations of such sections can change between linker
+	  versions.
+
 source "kernel/gcov/Kconfig"
 
 source "scripts/gcc-plugins/Kconfig"
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index fe2f17eb2b50..002e0cf025f5 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -35,6 +35,7 @@ config ARM
 	select ARCH_USE_CMPXCHG_LOCKREF
 	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
 	select ARCH_WANT_IPC_PARSE_VERSION
+	select ARCH_WANT_LD_ORPHAN_WARN
 	select BINFMT_FLAT_ARGVP_ENVP_ON_STACK
 	select BUILDTIME_TABLE_SORT if MMU
 	select CLONE_BACKWARDS
diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 4d76eab2b22d..e15f76ca2887 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -16,10 +16,6 @@ LDFLAGS_vmlinux	+= --be8
 KBUILD_LDFLAGS_MODULE	+= --be8
 endif
 
-# We never want expected sections to be placed heuristically by the
-# linker. All sections should be explicitly named in the linker script.
-LDFLAGS_vmlinux += $(call ld-option, --orphan-handling=warn)
-
 GZFLAGS		:=-9
 #KBUILD_CFLAGS	+=-pipe
 
diff --git a/arch/arm/boot/compressed/Makefile b/arch/arm/boot/compressed/Makefile
index 47f001ca5499..e1567418a2b1 100644
--- a/arch/arm/boot/compressed/Makefile
+++ b/arch/arm/boot/compressed/Makefile
@@ -129,7 +129,9 @@ LDFLAGS_vmlinux += --no-undefined
 # Delete all temporary local symbols
 LDFLAGS_vmlinux += -X
 # Report orphan sections
-LDFLAGS_vmlinux += $(call ld-option, --orphan-handling=warn)
+ifdef CONFIG_LD_ORPHAN_WARN
+LDFLAGS_vmlinux += --orphan-handling=warn
+endif
 # Next argument is a linker script
 LDFLAGS_vmlinux += -T
 
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1515f6f153a0..a6b5b7ef40ae 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -81,6 +81,7 @@ config ARM64
 	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
 	select ARCH_WANT_FRAME_POINTERS
 	select ARCH_WANT_HUGE_PMD_SHARE if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
+	select ARCH_WANT_LD_ORPHAN_WARN
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
 	select ARM_AMBA
 	select ARM_ARCH_TIMER
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 5789c2d18d43..6a87d592bd00 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -28,10 +28,6 @@ LDFLAGS_vmlinux	+= --fix-cortex-a53-843419
   endif
 endif
 
-# We never want expected sections to be placed heuristically by the
-# linker. All sections should be explicitly named in the linker script.
-LDFLAGS_vmlinux += $(call ld-option, --orphan-handling=warn)
-
 ifeq ($(CONFIG_ARM64_USE_LSE_ATOMICS), y)
   ifneq ($(CONFIG_ARM64_LSE_ATOMICS), y)
 $(warning LSE atomics not supported by binutils)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e9f13fe08492..5181872f9452 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -152,6 +152,7 @@ config PPC
 	select ARCH_USE_QUEUED_SPINLOCKS	if PPC_QUEUED_SPINLOCKS
 	select ARCH_WANT_IPC_PARSE_VERSION
 	select ARCH_WANT_IRQS_OFF_ACTIVATE_MM
+	select ARCH_WANT_LD_ORPHAN_WARN
 	select ARCH_WEAK_RELEASE_ACQUIRE
 	select BINFMT_ELF
 	select BUILDTIME_TABLE_SORT
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index a4d56f0a41d9..d9eb0da845e1 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -123,7 +123,6 @@ endif
 LDFLAGS_vmlinux-y := -Bstatic
 LDFLAGS_vmlinux-$(CONFIG_RELOCATABLE) := -pie
 LDFLAGS_vmlinux	:= $(LDFLAGS_vmlinux-y)
-LDFLAGS_vmlinux += $(call ld-option,--orphan-handling=warn)
 
 ifdef CONFIG_PPC64
 ifeq ($(call cc-option-yn,-mcmodel=medium),y)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f6946b81f74a..fbf26e0f7a6a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -100,6 +100,7 @@ config X86
 	select ARCH_WANT_DEFAULT_BPF_JIT	if X86_64
 	select ARCH_WANTS_DYNAMIC_TASK_STRUCT
 	select ARCH_WANT_HUGE_PMD_SHARE
+	select ARCH_WANT_LD_ORPHAN_WARN
 	select ARCH_WANTS_THP_SWAP		if X86_64
 	select BUILDTIME_TABLE_SORT
 	select CLKEVT_I8253
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 154259f18b8b..1bf21746f4ce 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -209,9 +209,6 @@ ifdef CONFIG_X86_64
 LDFLAGS_vmlinux += -z max-page-size=0x200000
 endif
 
-# We never want expected sections to be placed heuristically by the
-# linker. All sections should be explicitly named in the linker script.
-LDFLAGS_vmlinux += $(call ld-option, --orphan-handling=warn)
 
 archscripts: scripts_basic
 	$(Q)$(MAKE) $(build)=arch/x86/tools relocs
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index ee249088cbfe..40b8fd375d52 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -61,7 +61,9 @@ KBUILD_LDFLAGS += $(call ld-option,--no-ld-generated-unwind-info)
 # Compressed kernel should be built as PIE since it may be loaded at any
 # address by the bootloader.
 LDFLAGS_vmlinux := -pie $(call ld-option, --no-dynamic-linker)
-LDFLAGS_vmlinux += $(call ld-option, --orphan-handling=warn)
+ifdef CONFIG_LD_ORPHAN_WARN
+LDFLAGS_vmlinux += --orphan-handling=warn
+endif
 LDFLAGS_vmlinux += -T
 
 hostprogs	:= mkpiggy
diff --git a/init/Kconfig b/init/Kconfig
index c9446911cf41..92c58b45abb8 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1348,6 +1348,11 @@ config LD_DEAD_CODE_DATA_ELIMINATION
 	  present. This option is not well tested yet, so use at your
 	  own risk.
 
+config LD_ORPHAN_WARN
+	def_bool y
+	depends on ARCH_WANT_LD_ORPHAN_WARN
+	depends on $(ld-option,--orphan-handling=warn)
+
 config SYSCTL
 	bool
 

base-commit: 09162bc32c880a791c6c0668ce0745cf7958f576
-- 
2.29.2



* Re: [PATCH net-next v2 9/9] ibmvnic: Do not replenish RX buffers after every polling loop
From: ljp @ 2020-11-19 20:38 UTC (permalink / raw)
  To: Thomas Falcon
  Cc: cforno12, netdev, ricklind, dnbanerg, Linuxppc-dev, drt, brking,
	kuba, sukadev, linuxppc-dev
In-Reply-To: <83ca37f3-07be-4179-8414-88c8c83bfe56@linux.ibm.com>

On 2020-11-19 14:26, Thomas Falcon wrote:
> On 11/19/20 3:43 AM, ljp wrote:
> 
>> On 2020-11-18 19:12, Thomas Falcon wrote:
>> 
>>> From: "Dwip N. Banerjee" <dnbanerg@us.ibm.com>
>>> 
>>> Reduce the amount of time spent replenishing RX buffers by
>>> only doing so once available buffers has fallen under a certain
>>> threshold, in this case half of the total number of buffers, or
>>> if the polling loop exits before the packets processed is less
>>> than its budget.
>>> 
>>> Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
>>> ---
>>> drivers/net/ethernet/ibm/ibmvnic.c | 5 ++++-
>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c
>>> b/drivers/net/ethernet/ibm/ibmvnic.c
>>> index 96df6d8fa277..9fe43ab0496d 100644
>>> --- a/drivers/net/ethernet/ibm/ibmvnic.c
>>> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
>>> @@ -2537,7 +2537,10 @@ static int ibmvnic_poll(struct napi_struct
>>> 
>>> *napi, int budget)
>>> frames_processed++;
>>> }
>>> 
>>> - if (adapter->state != VNIC_CLOSING)
>>> + if (adapter->state != VNIC_CLOSING &&
>>> + ((atomic_read(&adapter->rx_pool[scrq_num].available) <
>>> + adapter->req_rx_add_entries_per_subcrq / 2) ||
>>> + frames_processed < budget))
>> 
>> 1/2 seems a simple and good algorithm.
>> Explaining why "frames_process < budget" is necessary in the commit
>> message
>> or source code also helps.
> 
> Hello, Lijun. The patch author, Dwip Banerjee, suggested the modified
> commit message below:
> 
> Reduce the amount of time spent replenishing RX buffers by
>  only doing so once available buffers has fallen under a certain
>  threshold, in this case half of the total number of buffers, or
>  if the polling loop exits before the packets processed is less
>  than its budget. Non-exhaustion of NAPI budget implies lower
>  incoming packet pressure, allowing the leeway to refill the buffers
>  in preparation for any impending burst.

It looks good to me.

> 
> Would such an update require a v3?

I assume you ask Jakub, right?


>>> replenish_rx_pool(adapter, &adapter->rx_pool[scrq_num]);
>>> if (frames_processed < budget) {
>>> if (napi_complete_done(napi, frames_processed)) {


* Re: [PATCH v3 0/2] powerpc/ptrace: Hard wire PT_SOFTE value to 1 in gpr_get() too
From: Oleg Nesterov @ 2020-11-19 18:22 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Christophe Leroy, Madhavan Srinivasan, linuxppc-dev,
	Nicholas Piggin, linux-kernel, Paul Mackerras, Al Viro,
	Aneesh Kumar K.V, Jan Kratochvil
In-Reply-To: <d7c3ed05-b7e7-fac0-871f-4c43c1a7e90c@csgroup.eu>

On 11/19, Christophe Leroy wrote:
>
>
> On 19/11/2020 at 17:01, Oleg Nesterov wrote:
> >Can we finally fix this problem? ;)
> >
> >My previous attempt was ignored, see
>
> That doesn't seem right.
>
> Michael made some suggestion it seems, can you respond to it ?

I did, see https://lore.kernel.org/lkml/20200611105830.GB12500@redhat.com/

> >Sorry, uncompiled/untested, I don't have a ppc machine.
>
> I compiled with ppc64_defconfig, that seems ok. Still untested.

Thanks.

Oleg.



* Re: [PATCH v3 1/2] powerpc/ptrace: simplify gpr_get/tm_cgpr_get
From: Oleg Nesterov @ 2020-11-19 18:18 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Christophe Leroy, Madhavan Srinivasan, linuxppc-dev,
	Nicholas Piggin, linux-kernel, Paul Mackerras, Al Viro,
	Aneesh Kumar K.V, Jan Kratochvil
In-Reply-To: <94c56c46-e336-f61c-3623-1b2014fcbb2e@csgroup.eu>

On 11/19, Christophe Leroy wrote:
>
>
> On 19/11/2020 at 17:02, Oleg Nesterov wrote:
> >gpr_get() does membuf_write() twice to override pt_regs->msr in between.
>
> Is there anything wrong with that ?

Nothing wrong, but imo the code and 2/2 look simpler after this patch.
I tried to explain this in the changelog.
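
To make the idea concrete, here is a rough userspace sketch of the
pattern: take a second cursor at the msr offset first, do one bulk copy,
then overwrite the msr slot through the saved cursor. Everything prefixed
demo_ is invented for illustration; the struct only mimics the cursor and
remaining-length shape of struct membuf, the register layout is a toy, and
the body of demo_membuf_at() is my guess at what such a helper does (a
copy of the cursor advanced by offs and clamped to what is left).

```c
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel's struct membuf (cursor + bytes left). */
struct demo_membuf {
	char *p;
	size_t left;
};

/* Copy up to 'size' bytes and advance the cursor, truncating at the end. */
static size_t demo_membuf_write(struct demo_membuf *s, const void *v, size_t size)
{
	if (size > s->left)
		size = s->left;
	memcpy(s->p, v, size);
	s->p += size;
	s->left -= size;
	return s->left;
}

/* Guess at membuf_at(): a copy of 's' advanced by 'offs', clamped to the end. */
static struct demo_membuf demo_membuf_at(const struct demo_membuf *s, size_t offs)
{
	struct demo_membuf n = *s;

	if (offs > n.left)
		offs = n.left;
	n.p += offs;
	n.left -= offs;
	return n;
}

/* Toy register layout; the real struct pt_regs is much larger. */
struct demo_regs {
	unsigned long gpr[4];
	unsigned long msr;
	unsigned long orig_gpr3;
};

/* One bulk copy, then patch the msr slot through the saved cursor. */
static void demo_gpr_get(struct demo_membuf to, const struct demo_regs *regs,
			 unsigned long user_msr)
{
	struct demo_membuf to_msr = demo_membuf_at(&to, offsetof(struct demo_regs, msr));

	demo_membuf_write(&to, regs, sizeof(*regs));
	demo_membuf_write(&to_msr, &user_msr, sizeof(user_msr));
}
```

Because the saved cursor carries its own remaining length, a caller with a
buffer too short to reach msr simply gets nothing written there, which is
why the second store needs no extra bounds check in this sketch.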

> >  int tm_cgpr_get(struct task_struct *target, const struct user_regset *regset,
> >  		struct membuf to)
> >  {
> >+	struct membuf to_msr = membuf_at(&to, offsetof(struct pt_regs, msr));
> >+
> >  	if (!cpu_has_feature(CPU_FTR_TM))
> >  		return -ENODEV;
> >@@ -97,17 +99,12 @@ int tm_cgpr_get(struct task_struct *target, const struct user_regset *regset,
> >  	flush_altivec_to_thread(target);
> >  	membuf_write(&to, &target->thread.ckpt_regs,
> >-			offsetof(struct pt_regs, msr));
> >-	membuf_store(&to, get_user_ckpt_msr(target));
> >+				sizeof(struct user_pt_regs));
>
> This looks mis-aligned. But it should fit on a single line, now we allow up to 100 chars on a line.

OK, I can change this.

> >-	BUILD_BUG_ON(offsetof(struct pt_regs, orig_gpr3) !=
> >-		     offsetof(struct pt_regs, msr) + sizeof(long));
> >+	membuf_store(&to_msr, get_user_ckpt_msr(target));
> >-	membuf_write(&to, &target->thread.ckpt_regs.orig_gpr3,
> >-			sizeof(struct user_pt_regs) -
> >-			offsetof(struct pt_regs, orig_gpr3));
> >  	return membuf_zero(&to, ELF_NGREG * sizeof(unsigned long) -
> >-			sizeof(struct user_pt_regs));
> >+				sizeof(struct user_pt_regs));
>
> I can't see any change here except the alignment. Can you leave it as is ?

I just tried to make tm_cgpr_get() and gpr_get() look similar.

Sure, I can leave it as is.

Better yet, could you please fix this problem somehow so that I could forget
about the bug assigned to me?

I know nothing about powerpc, and personally I do not care about this (minor)
bug, I agree with any changes.

> >-	membuf_write(&to, target->thread.regs, offsetof(struct pt_regs, msr));
> >-	membuf_store(&to, get_user_msr(target));
> >+	membuf_write(&to, target->thread.regs,
> >+				sizeof(struct user_pt_regs));
>
> This should fit on a single line.
>
> >  	return membuf_zero(&to, ELF_NGREG * sizeof(unsigned long) -
> >-				 sizeof(struct user_pt_regs));
> >+				sizeof(struct user_pt_regs));
>
> This should not change, it's not part of the changes for this patch.

See above, I can leave it as is.

> >--- a/include/linux/regset.h
> >+++ b/include/linux/regset.h
> >@@ -46,6 +46,18 @@ static inline int membuf_write(struct membuf *s, const void *v, size_t size)
> >  	return s->left;
> >  }
> >+static inline struct membuf membuf_at(const struct membuf *s, size_t offs)
> >+{
> >+	struct membuf n = *s;
>
> Is there any point in using a struct membuf * instead of a struct membuf as parameter ?

This matches other membuf_ helpers.

Oleg.



* Re: [PATCH v3 0/2] powerpc/ptrace: Hard wire PT_SOFTE value to 1 in gpr_get() too
From: Christophe Leroy @ 2020-11-19 17:19 UTC (permalink / raw)
  To: Oleg Nesterov, Benjamin Herrenschmidt, Madhavan Srinivasan,
	Michael Ellerman, Paul Mackerras
  Cc: Christophe Leroy, Aneesh Kumar K.V, linux-kernel, Nicholas Piggin,
	Jan Kratochvil, Al Viro, linuxppc-dev
In-Reply-To: <20201119160154.GA5183@redhat.com>



On 19/11/2020 at 17:01, Oleg Nesterov wrote:
> Can we finally fix this problem? ;)
> 
> My previous attempt was ignored, see

That doesn't seem right.

Michael made some suggestion it seems, can you respond to it ?

> 
> 	https://lore.kernel.org/lkml/20190917121256.GA8659@redhat.com/
> 
> Now that gpr_get() was changed to use membuf API we can make a simpler fix.
> 
> Sorry, uncompiled/untested, I don't have a ppc machine.

I compiled with ppc64_defconfig, that seems ok. Still untested.

Christophe

> 
> Oleg.
> 
>   arch/powerpc/kernel/ptrace/ptrace-tm.c   | 21 ++++++++++++---------
>   arch/powerpc/kernel/ptrace/ptrace-view.c | 21 ++++++++++++---------
>   include/linux/regset.h                   | 12 ++++++++++++
>   3 files changed, 36 insertions(+), 18 deletions(-)
> 


* Re: [PATCH v3 2/2] powerpc/ptrace: Hard wire PT_SOFTE value to 1 in gpr_get() too
From: Christophe Leroy @ 2020-11-19 17:18 UTC (permalink / raw)
  To: Oleg Nesterov, Benjamin Herrenschmidt, Madhavan Srinivasan,
	Michael Ellerman, Paul Mackerras
  Cc: Christophe Leroy, Aneesh Kumar K.V, linux-kernel, Nicholas Piggin,
	Jan Kratochvil, Al Viro, linuxppc-dev
In-Reply-To: <20201119160247.GB5188@redhat.com>



On 19/11/2020 at 17:02, Oleg Nesterov wrote:
> The commit a8a4b03ab95f ("powerpc: Hard wire PT_SOFTE value to 1 in
> ptrace & signals") changed ptrace_get_reg(PT_SOFTE) to report 0x1,
> but PTRACE_GETREGS still copies pt_regs->softe as is.
> 
> This is not consistent and this breaks the user-regs-peekpoke test
> from https://sourceware.org/systemtap/wiki/utrace/tests/
> 
> Reported-by: Jan Kratochvil <jan.kratochvil@redhat.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>   arch/powerpc/kernel/ptrace/ptrace-tm.c   | 8 +++++++-
>   arch/powerpc/kernel/ptrace/ptrace-view.c | 8 +++++++-
>   2 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/ptrace/ptrace-tm.c b/arch/powerpc/kernel/ptrace/ptrace-tm.c
> index f8fcbd85d4cb..d0d339f86e61 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace-tm.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace-tm.c
> @@ -87,6 +87,10 @@ int tm_cgpr_get(struct task_struct *target, const struct user_regset *regset,
>   		struct membuf to)
>   {
>   	struct membuf to_msr = membuf_at(&to, offsetof(struct pt_regs, msr));
> +#ifdef CONFIG_PPC64
> +	struct membuf to_softe = membuf_at(&to,
> +					offsetof(struct pt_regs, softe));

Should fit on a single line I think.

> +#endif
>   
>   	if (!cpu_has_feature(CPU_FTR_TM))
>   		return -ENODEV;
> @@ -102,7 +106,9 @@ int tm_cgpr_get(struct task_struct *target, const struct user_regset *regset,
>   				sizeof(struct user_pt_regs));
>   
>   	membuf_store(&to_msr, get_user_ckpt_msr(target));
> -
> +#ifdef CONFIG_PPC64
> +	membuf_store(&to_softe, 0x1ul);
> +#endif
>   	return membuf_zero(&to, ELF_NGREG * sizeof(unsigned long) -
>   				sizeof(struct user_pt_regs));
>   }
> diff --git a/arch/powerpc/kernel/ptrace/ptrace-view.c b/arch/powerpc/kernel/ptrace/ptrace-view.c
> index 39686ede40b3..f554ccfcbfae 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace-view.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace-view.c
> @@ -218,6 +218,10 @@ static int gpr_get(struct task_struct *target, const struct user_regset *regset,
>   		   struct membuf to)
>   {
>   	struct membuf to_msr = membuf_at(&to, offsetof(struct pt_regs, msr));
> +#ifdef CONFIG_PPC64
> +	struct membuf to_softe = membuf_at(&to,
> +					offsetof(struct pt_regs, softe));

Should fit on a single line I think.

> +#endif
>   	int i;
>   
>   	if (target->thread.regs == NULL)
> @@ -233,7 +237,9 @@ static int gpr_get(struct task_struct *target, const struct user_regset *regset,
>   				sizeof(struct user_pt_regs));
>   
>   	membuf_store(&to_msr, get_user_msr(target));
> -
> +#ifdef CONFIG_PPC64
> +	membuf_store(&to_softe, 0x1ul);
> +#endif
>   	return membuf_zero(&to, ELF_NGREG * sizeof(unsigned long) -
>   				sizeof(struct user_pt_regs));
>   }
> 

Christophe


* Re: [PATCH v3 1/2] powerpc/ptrace: simplify gpr_get/tm_cgpr_get
From: Christophe Leroy @ 2020-11-19 17:16 UTC (permalink / raw)
  To: Oleg Nesterov, Benjamin Herrenschmidt, Madhavan Srinivasan,
	Michael Ellerman, Paul Mackerras
  Cc: Christophe Leroy, Aneesh Kumar K.V, linux-kernel, Nicholas Piggin,
	Jan Kratochvil, Al Viro, linuxppc-dev
In-Reply-To: <20201119160221.GA5188@redhat.com>



On 19/11/2020 at 17:02, Oleg Nesterov wrote:
> gpr_get() does membuf_write() twice to override pt_regs->msr in between.

Is there anything wrong with that ?

> We can call membuf_write() once and change ->msr in the kernel buffer,
> this simplifies the code and the next fix.
> 
> The patch adds a new simple helper, membuf_at(offs), it returns the new
> membuf which can be safely used after membuf_write().
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>   arch/powerpc/kernel/ptrace/ptrace-tm.c   | 13 +++++--------
>   arch/powerpc/kernel/ptrace/ptrace-view.c | 13 +++++--------
>   include/linux/regset.h                   | 12 ++++++++++++
>   3 files changed, 22 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/ptrace/ptrace-tm.c b/arch/powerpc/kernel/ptrace/ptrace-tm.c
> index 54f2d076206f..f8fcbd85d4cb 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace-tm.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace-tm.c
> @@ -86,6 +86,8 @@ int tm_cgpr_active(struct task_struct *target, const struct user_regset *regset)
>   int tm_cgpr_get(struct task_struct *target, const struct user_regset *regset,
>   		struct membuf to)
>   {
> +	struct membuf to_msr = membuf_at(&to, offsetof(struct pt_regs, msr));
> +
>   	if (!cpu_has_feature(CPU_FTR_TM))
>   		return -ENODEV;
>   
> @@ -97,17 +99,12 @@ int tm_cgpr_get(struct task_struct *target, const struct user_regset *regset,
>   	flush_altivec_to_thread(target);
>   
>   	membuf_write(&to, &target->thread.ckpt_regs,
> -			offsetof(struct pt_regs, msr));
> -	membuf_store(&to, get_user_ckpt_msr(target));
> +				sizeof(struct user_pt_regs));

This looks mis-aligned. But it should fit on a single line, now we allow up to 100 chars on a line.

>   
> -	BUILD_BUG_ON(offsetof(struct pt_regs, orig_gpr3) !=
> -		     offsetof(struct pt_regs, msr) + sizeof(long));
> +	membuf_store(&to_msr, get_user_ckpt_msr(target));
>   
> -	membuf_write(&to, &target->thread.ckpt_regs.orig_gpr3,
> -			sizeof(struct user_pt_regs) -
> -			offsetof(struct pt_regs, orig_gpr3));
>   	return membuf_zero(&to, ELF_NGREG * sizeof(unsigned long) -
> -			sizeof(struct user_pt_regs));
> +				sizeof(struct user_pt_regs));

I can't see any change here except the alignment. Can you leave it as is?


>   }
>   
>   /*
> diff --git a/arch/powerpc/kernel/ptrace/ptrace-view.c b/arch/powerpc/kernel/ptrace/ptrace-view.c
> index 7e6478e7ed07..39686ede40b3 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace-view.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace-view.c
> @@ -217,6 +217,7 @@ int ptrace_put_reg(struct task_struct *task, int regno, unsigned long data)
>   static int gpr_get(struct task_struct *target, const struct user_regset *regset,
>   		   struct membuf to)
>   {
> +	struct membuf to_msr = membuf_at(&to, offsetof(struct pt_regs, msr));
>   	int i;
>   
>   	if (target->thread.regs == NULL)
> @@ -228,17 +229,13 @@ static int gpr_get(struct task_struct *target, const struct user_regset *regset,
>   			target->thread.regs->gpr[i] = NV_REG_POISON;
>   	}
>   
> -	membuf_write(&to, target->thread.regs, offsetof(struct pt_regs, msr));
> -	membuf_store(&to, get_user_msr(target));
> +	membuf_write(&to, target->thread.regs,
> +				sizeof(struct user_pt_regs));

This should fit on a single line.

>   
> -	BUILD_BUG_ON(offsetof(struct pt_regs, orig_gpr3) !=
> -		     offsetof(struct pt_regs, msr) + sizeof(long));
> +	membuf_store(&to_msr, get_user_msr(target));
>   
> -	membuf_write(&to, &target->thread.regs->orig_gpr3,
> -			sizeof(struct user_pt_regs) -
> -			offsetof(struct pt_regs, orig_gpr3));
>   	return membuf_zero(&to, ELF_NGREG * sizeof(unsigned long) -
> -				 sizeof(struct user_pt_regs));
> +				sizeof(struct user_pt_regs));

This should not change, it's not part of the changes for this patch.

>   }
>   
>   static int gpr_set(struct task_struct *target, const struct user_regset *regset,
> diff --git a/include/linux/regset.h b/include/linux/regset.h
> index c3403f328257..a00765f0e8cf 100644
> --- a/include/linux/regset.h
> +++ b/include/linux/regset.h
> @@ -46,6 +46,18 @@ static inline int membuf_write(struct membuf *s, const void *v, size_t size)
>   	return s->left;
>   }
>   
> +static inline struct membuf membuf_at(const struct membuf *s, size_t offs)
> +{
> +	struct membuf n = *s;

Is there any point in using a struct membuf * instead of a struct membuf as the parameter?

> +
> +	if (offs > n.left)
> +		offs = n.left;
> +	n.p += offs;
> +	n.left -= offs;
> +
> +	return n;
> +}
> +
>   /* current s->p must be aligned for v; v must be a scalar */
>   #define membuf_store(s, v)				\
>   ({							\
> 
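
To make the pointer-versus-value question concrete, here is a minimal userspace sketch of the by-value alternative. It is not the kernel code: struct membuf_model, model_write(), model_at(), struct regs_model and model_get() are hypothetical names that only imitate struct membuf, membuf_write(), membuf_at() and gpr_get(), and the register layout is invented for illustration. The assumption is that the caller never needs the helper to modify its cursor, so passing a copy is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace model of the kernel's struct membuf (hypothetical name). */
struct membuf_model {
	char *p;	/* next byte to write */
	size_t left;	/* bytes remaining in the destination */
};

/* Models membuf_write(): copy at most 'left' bytes, then advance. */
static size_t model_write(struct membuf_model *s, const void *v, size_t size)
{
	assert(s->p != NULL && v != NULL);

	if (size > s->left)
		size = s->left;
	memcpy(s->p, v, size);
	s->p += size;
	s->left -= size;
	return s->left;
}

/* By-value variant of membuf_at(): 's' is already a private copy. */
static struct membuf_model model_at(struct membuf_model s, size_t offs)
{
	if (offs > s.left)
		offs = s.left;
	s.p += offs;
	s.left -= offs;
	return s;
}

/* Invented stand-in for a register block with one field to override. */
struct regs_model {
	unsigned long gpr[2];
	unsigned long msr;
	unsigned long orig_gpr3;
};

/* Copy the whole block once, then patch msr through the saved cursor. */
static size_t model_get(struct membuf_model to, const struct regs_model *regs,
			unsigned long user_msr)
{
	struct membuf_model to_msr = model_at(to, offsetof(struct regs_model, msr));

	model_write(&to, regs, sizeof(*regs));
	/* On a short buffer the clamped cursor makes this a no-op. */
	model_write(&to_msr, &user_msr, sizeof(user_msr));
	return to.left;
}
```

With a by-value parameter the local copy inside the helper disappears and the call site reads membuf_at(to, offs) instead of membuf_at(&to, offs). The clamping on short buffers is unchanged in this model.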

Christophe

^ permalink raw reply

* Re: [PATCH v2] ASoC: fsl_sai: Correct the clock source for mclk0
From: Mark Brown @ 2020-11-19 17:09 UTC (permalink / raw)
  To: Xiubo.Lee, Shengjiu Wang, nicoleotsuka, timur, perex, festevam,
	alsa-devel, tiwai
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <1605768038-4582-1-git-send-email-shengjiu.wang@nxp.com>

On Thu, 19 Nov 2020 14:40:38 +0800, Shengjiu Wang wrote:
> On VF610, mclk0 = bus_clk;
> On i.MX6SX/6UL/6ULL/7D, mclk0 = mclk1;
> On i.MX7ULP, mclk0 = bus_clk;
> On i.MX8QM/8QXP, mclk0 = bus_clk;
> On i.MX8MQ/8MN/8MM/8MP, mclk0 = bus_clk;
> 
> So add variable mclk0_is_mclk1 in fsl_sai_soc_data to
> distinguish these platforms.

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/1] ASoC: fsl_sai: Correct the clock source for mclk0
      commit: 53233e40c142b1e0e1df9d9ac0ffc0945cfffbc9

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

^ permalink raw reply

* Re: [PATCH v3 2/2] powerpc/ptrace: Hard wire PT_SOFTE value to 1 in gpr_get() too
From: Oleg Nesterov @ 2020-11-19 16:05 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Madhavan Srinivasan, Michael Ellerman,
	Paul Mackerras
  Cc: Christophe Leroy, Aneesh Kumar K.V, linux-kernel, Nicholas Piggin,
	Jan Kratochvil, Al Viro, linuxppc-dev
In-Reply-To: <20201119160247.GB5188@redhat.com>

On 11/19, Oleg Nesterov wrote:
>
> This is not consistent and this breaks the user-regs-peekpoke test
> from https://sourceware.org/systemtap/wiki/utrace/tests/

See the test-case below.

Oleg.

/* Test case for PTRACE_SETREGS modifying the requested registers.
   x86* counterpart of the s390* testcase `user-area-access.c'.

   This software is provided 'as-is', without any express or implied
   warranty.  In no event will the authors be held liable for any damages
   arising from the use of this software.

   Permission is granted to anyone to use this software for any purpose,
   including commercial applications, and to alter it and redistribute it
   freely.  */

/* FIXME: EFLAGS should be tested restricted on the appropriate bits.  */

#define _GNU_SOURCE 1

#if defined __powerpc__ || defined __sparc__
# define user_regs_struct pt_regs
#endif

#ifdef __ia64__
#define ia64_fpreg ia64_fpreg_DISABLE
#define pt_all_user_regs pt_all_user_regs_DISABLE
#endif	/* __ia64__ */
#include <sys/ptrace.h>
#ifdef __ia64__
#undef ia64_fpreg
#undef pt_all_user_regs
#endif	/* __ia64__ */
#include <linux/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#if defined __i386__ || defined __x86_64__
#include <sys/debugreg.h>
#endif
#include <asm/unistd.h>

#include <assert.h>
#include <errno.h>
#include <error.h>
#include <unistd.h>
#include <signal.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/wait.h>
#include <sys/time.h>
#include <string.h>
#include <stddef.h>

/* ia64 has PTRACE_SETREGS but it has no USER_REGS_STRUCT.  */
#if !defined PTRACE_SETREGS || defined __ia64__

int
main (void)
{
  return 77;
}

#else	/* PTRACE_SETREGS */

/* The minimal alignment we use for the random access ranges.  */
#define REGALIGN (sizeof (long))

static pid_t child;

static void
cleanup (void)
{
  if (child > 0)
    kill (child, SIGKILL);
  child = 0;
}

static void
handler_fail (int signo)
{
  cleanup ();
  signal (SIGABRT, SIG_DFL);
  abort ();
}

int
main (void)
{
  long l;
  int status, i;
  pid_t pid;
  union
    {
      struct user_regs_struct user;
      unsigned char byte[sizeof (struct user_regs_struct)];
    } u, u2;
  int start;

  setbuf (stdout, NULL);
  atexit (cleanup);
  signal (SIGABRT, handler_fail);
  signal (SIGALRM, handler_fail);
  signal (SIGINT, handler_fail);
  i = alarm (10);
  assert (i == 0);

  child = fork ();
  switch (child)
    {
    case -1:
      assert_perror (errno);
      assert (0);

    case 0:
      l = ptrace (PTRACE_TRACEME, 0, NULL, NULL);
      assert (l == 0);

      // Prevent rt_sigprocmask() call called by glibc after raise().
      syscall (__NR_tkill, getpid (), SIGSTOP);
      assert (0);

    default:
      break;
    }

  pid = waitpid (child, &status, 0);
  assert (pid == child);
  assert (WIFSTOPPED (status));
  assert (WSTOPSIG (status) == SIGSTOP);

  /* Fetch U2 from the inferior.  */
  errno = 0;
# ifdef __sparc__
  l = ptrace (PTRACE_GETREGS, child, &u2.user, NULL);
# else
  l = ptrace (PTRACE_GETREGS, child, NULL, &u2.user);
# endif
  assert_perror (errno);
  assert (l == 0);

  /* Initialize U with a pattern.  */
  for (i = 0; i < sizeof u.byte; i++)
    u.byte[i] = i;
#ifdef __x86_64__
  /* non-EFLAGS modifications fail with EIO,  EFLAGS gets back different.  */
  u.user.eflags = u2.user.eflags;
  u.user.cs = u2.user.cs;
  u.user.ds = u2.user.ds;
  u.user.es = u2.user.es;
  u.user.fs = u2.user.fs;
  u.user.gs = u2.user.gs;
  u.user.ss = u2.user.ss;
  u.user.fs_base = u2.user.fs_base;
  u.user.gs_base = u2.user.gs_base;
  /* RHEL-4 refuses to set too high (and invalid) PC values.  */
  u.user.rip = (unsigned long) handler_fail;
  /* 2.6.25 always truncates and sign-extends orig_rax.  */
  u.user.orig_rax = (int) u.user.orig_rax;
#endif	/* __x86_64__ */
#ifdef __i386__
  /* These values get back different.  */
  u.user.xds = u2.user.xds;
  u.user.xes = u2.user.xes;
  u.user.xfs = u2.user.xfs;
  u.user.xgs = u2.user.xgs;
  u.user.xcs = u2.user.xcs;
  u.user.eflags = u2.user.eflags;
  u.user.xss = u2.user.xss;
  /* RHEL-4 refuses to set too high (and invalid) PC values.  */
  u.user.eip = (unsigned long) handler_fail;
#endif	/* __i386__ */
#ifdef __powerpc__
  /* These fields are constrained.  */
  u.user.msr = u2.user.msr;
# ifdef __powerpc64__
  u.user.softe = u2.user.softe;
# else
  u.user.mq = u2.user.mq;
# endif	/* __powerpc64__ */
  u.user.trap = u2.user.trap;
  u.user.dar = u2.user.dar;
  u.user.dsisr = u2.user.dsisr;
  u.user.result = u2.user.result;
#endif	/* __powerpc__ */

  /* Poke U.  */
# ifdef __sparc__
  l = ptrace (PTRACE_SETREGS, child, &u.user, NULL);
# else
  l = ptrace (PTRACE_SETREGS, child, NULL, &u.user);
# endif
  assert (l == 0);

  /* Peek into U2.  */
# ifdef __sparc__
  l = ptrace (PTRACE_GETREGS, child, &u2.user, NULL);
# else
  l = ptrace (PTRACE_GETREGS, child, NULL, &u2.user);
# endif
  assert (l == 0);

  /* Verify it matches.  */
  if (memcmp (&u.user, &u2.user, sizeof u.byte) != 0)
    {
      for (start = 0; start + REGALIGN <= sizeof u.byte; start += REGALIGN)
	if (*(unsigned long *) (u.byte + start)
	    != *(unsigned long *) (u2.byte + start))
	  printf ("\
mismatch at offset %#x: SETREGS wrote %lx GETREGS read %lx\n",
		  start, *(unsigned long *) (u.byte + start),
		  *(unsigned long *) (u2.byte + start));
      return 1;
    }

  /* Reverse the pattern.  */
  for (i = 0; i < sizeof u.byte; i++)
    u.byte[i] ^= -1;
#ifdef __x86_64__
  /* non-EFLAGS modifications fail with EIO,  EFLAGS gets back different.  */
  u.user.eflags = u2.user.eflags;
  u.user.cs = u2.user.cs;
  u.user.ds = u2.user.ds;
  u.user.es = u2.user.es;
  u.user.fs = u2.user.fs;
  u.user.gs = u2.user.gs;
  u.user.ss = u2.user.ss;
  u.user.fs_base = u2.user.fs_base;
  u.user.gs_base = u2.user.gs_base;
  /* RHEL-4 refuses to set too high (and invalid) PC values.  */
  u.user.rip = (unsigned long) handler_fail;
  /* 2.6.25 always truncates and sign-extends orig_rax.  */
  u.user.orig_rax = (int) u.user.orig_rax;
#endif	/* __x86_64__ */
#ifdef __i386__
  /* These values get back different.  */
  u.user.xds = u2.user.xds;
  u.user.xes = u2.user.xes;
  u.user.xfs = u2.user.xfs;
  u.user.xgs = u2.user.xgs;
  u.user.xcs = u2.user.xcs;
  u.user.eflags = u2.user.eflags;
  u.user.xss = u2.user.xss;
  /* RHEL-4 refuses to set too high (and invalid) PC values.  */
  u.user.eip = (unsigned long) handler_fail;
#endif	/* __i386__ */
#ifdef __powerpc__
  /* These fields are constrained.  */
  u.user.msr = u2.user.msr;
# ifdef __powerpc64__
  u.user.softe = u2.user.softe;
# else
  u.user.mq = u2.user.mq;
# endif	/* __powerpc64__ */
  u.user.trap = u2.user.trap;
  u.user.dar = u2.user.dar;
  u.user.dsisr = u2.user.dsisr;
  u.user.result = u2.user.result;
#endif	/* __powerpc__ */

  /* Poke U.  */
# ifdef __sparc__
  l = ptrace (PTRACE_SETREGS, child, &u.user, NULL);
# else
  l = ptrace (PTRACE_SETREGS, child, NULL, &u.user);
# endif
  assert (l == 0);

  /* Peek into U2.  */
# ifdef __sparc__
  l = ptrace (PTRACE_GETREGS, child, &u2.user, NULL);
# else
  l = ptrace (PTRACE_GETREGS, child, NULL, &u2.user);
# endif
  assert (l == 0);

  /* Verify it matches.  */
  if (memcmp (&u.user, &u2.user, sizeof u.byte) != 0)
    {
      for (start = 0; start + REGALIGN <= sizeof u.byte; start += REGALIGN)
	if (*(unsigned long *) (u.byte + start)
	    != *(unsigned long *) (u2.byte + start))
	  printf ("\
mismatch at offset %#x: SETREGS wrote %lx GETREGS read %lx\n",
		  start, *(unsigned long *) (u.byte + start),
		  *(unsigned long *) (u2.byte + start));
      return 1;
    }

  /* Now try poking arbitrary ranges and verifying it reads back right.
     We expect the U area is already a random enough pattern.  */
  for (start = 0; start + REGALIGN <= sizeof u.byte; start += REGALIGN)
    {
      for (i = start; i < start + REGALIGN; i++)
	u.byte[i]++;
#ifdef __x86_64__
      /* non-EFLAGS modifications fail with EIO,  EFLAGS gets back different.  */
      u.user.eflags = u2.user.eflags;
      u.user.cs = u2.user.cs;
      u.user.ds = u2.user.ds;
      u.user.es = u2.user.es;
      u.user.fs = u2.user.fs;
      u.user.gs = u2.user.gs;
      u.user.ss = u2.user.ss;
      u.user.fs_base = u2.user.fs_base;
      u.user.gs_base = u2.user.gs_base;
      /* RHEL-4 refuses to set too high (and invalid) PC values.  */
      u.user.rip = (unsigned long) handler_fail;
      /* 2.6.25 always truncates and sign-extends orig_rax.  */
      u.user.orig_rax = (int) u.user.orig_rax;
#endif	/* __x86_64__ */
#ifdef __i386__
      /* These values get back different.  */
      u.user.xds = u2.user.xds;
      u.user.xes = u2.user.xes;
      u.user.xfs = u2.user.xfs;
      u.user.xgs = u2.user.xgs;
      u.user.xcs = u2.user.xcs;
      u.user.eflags = u2.user.eflags;
      u.user.xss = u2.user.xss;
      /* RHEL-4 refuses to set too high (and invalid) PC values.  */
      u.user.eip = (unsigned long) handler_fail;
#endif	/* __i386__ */
#ifdef __powerpc__
      /* These fields are constrained.  */
      u.user.msr = u2.user.msr;
# ifdef __powerpc64__
      u.user.softe = u2.user.softe;
# else
      u.user.mq = u2.user.mq;
# endif	/* __powerpc64__ */
      u.user.trap = u2.user.trap;
      u.user.dar = u2.user.dar;
      u.user.dsisr = u2.user.dsisr;
      u.user.result = u2.user.result;
      if (start > offsetof (struct pt_regs, ccr))
	break;
#endif	/* __powerpc__ */

      /* Poke U.  */
      l = ptrace (PTRACE_POKEUSER, child, (void *) (unsigned long) start,
		  (void *) *(unsigned long *) (u.byte + start));
      if (l != 0)
	error (1, errno, "PTRACE_POKEUSER at %x", start);

      /* Peek into U2.  */
# ifdef __sparc__
      l = ptrace (PTRACE_GETREGS, child, &u2.user, NULL);
# else
      l = ptrace (PTRACE_GETREGS, child, NULL, &u2.user);
# endif
      assert (l == 0);

      /* Verify it matches.  */
      if (memcmp (&u.user, &u2.user, sizeof u.byte) != 0)
	{
	  printf ("mismatch at offset %#x: poked %lx but GETREGS read %lx\n",
		  start, *(unsigned long *) (u.byte + start),
		  *(unsigned long *) (u2.byte + start));
	  return 1;
	}
    }


  /* Now try peeking arbitrary ranges and verifying it is the same.
     We expect the U area is already a random enough pattern.  */
  for (start = 0; start + REGALIGN <= sizeof u.byte; start += REGALIGN)
    {
      /* Peek for the U comparison.  */
      errno = 0;
      l = ptrace (PTRACE_PEEKUSER, child, (void *) (unsigned long) start,
		  NULL);
      assert_perror (errno);

      /* Verify it matches.  */
      if (*(unsigned long *) (u.byte + start) != l)
	{
	  printf ("mismatch at offset %#x: poked %lx but peeked %lx\n",
		  start, *(unsigned long *) (u.byte + start), l);
	  return 1;
	}
    }


  return 0;
}

#endif	/* PTRACE_SETREGS */


^ permalink raw reply

* [PATCH v3 2/2] powerpc/ptrace: Hard wire PT_SOFTE value to 1 in gpr_get() too
From: Oleg Nesterov @ 2020-11-19 16:02 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Madhavan Srinivasan, Michael Ellerman,
	Paul Mackerras
  Cc: Christophe Leroy, Aneesh Kumar K.V, linux-kernel, Nicholas Piggin,
	Jan Kratochvil, Al Viro, linuxppc-dev
In-Reply-To: <20201119160154.GA5183@redhat.com>

The commit a8a4b03ab95f ("powerpc: Hard wire PT_SOFTE value to 1 in
ptrace & signals") changed ptrace_get_reg(PT_SOFTE) to report 0x1,
but PTRACE_GETREGS still copies pt_regs->softe as is.

This is not consistent and this breaks the user-regs-peekpoke test
from https://sourceware.org/systemtap/wiki/utrace/tests/

Reported-by: Jan Kratochvil <jan.kratochvil@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 arch/powerpc/kernel/ptrace/ptrace-tm.c   | 8 +++++++-
 arch/powerpc/kernel/ptrace/ptrace-view.c | 8 +++++++-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/ptrace/ptrace-tm.c b/arch/powerpc/kernel/ptrace/ptrace-tm.c
index f8fcbd85d4cb..d0d339f86e61 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-tm.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-tm.c
@@ -87,6 +87,10 @@ int tm_cgpr_get(struct task_struct *target, const struct user_regset *regset,
 		struct membuf to)
 {
 	struct membuf to_msr = membuf_at(&to, offsetof(struct pt_regs, msr));
+#ifdef CONFIG_PPC64
+	struct membuf to_softe = membuf_at(&to,
+					offsetof(struct pt_regs, softe));
+#endif
 
 	if (!cpu_has_feature(CPU_FTR_TM))
 		return -ENODEV;
@@ -102,7 +106,9 @@ int tm_cgpr_get(struct task_struct *target, const struct user_regset *regset,
 				sizeof(struct user_pt_regs));
 
 	membuf_store(&to_msr, get_user_ckpt_msr(target));
-
+#ifdef CONFIG_PPC64
+	membuf_store(&to_softe, 0x1ul);
+#endif
 	return membuf_zero(&to, ELF_NGREG * sizeof(unsigned long) -
 				sizeof(struct user_pt_regs));
 }
diff --git a/arch/powerpc/kernel/ptrace/ptrace-view.c b/arch/powerpc/kernel/ptrace/ptrace-view.c
index 39686ede40b3..f554ccfcbfae 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-view.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-view.c
@@ -218,6 +218,10 @@ static int gpr_get(struct task_struct *target, const struct user_regset *regset,
 		   struct membuf to)
 {
 	struct membuf to_msr = membuf_at(&to, offsetof(struct pt_regs, msr));
+#ifdef CONFIG_PPC64
+	struct membuf to_softe = membuf_at(&to,
+					offsetof(struct pt_regs, softe));
+#endif
 	int i;
 
 	if (target->thread.regs == NULL)
@@ -233,7 +237,9 @@ static int gpr_get(struct task_struct *target, const struct user_regset *regset,
 				sizeof(struct user_pt_regs));
 
 	membuf_store(&to_msr, get_user_msr(target));
-
+#ifdef CONFIG_PPC64
+	membuf_store(&to_softe, 0x1ul);
+#endif
 	return membuf_zero(&to, ELF_NGREG * sizeof(unsigned long) -
 				sizeof(struct user_pt_regs));
 }
-- 
2.25.1.362.g51ebf55



^ permalink raw reply related

* [PATCH v3 1/2] powerpc/ptrace: simplify gpr_get/tm_cgpr_get
From: Oleg Nesterov @ 2020-11-19 16:02 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Madhavan Srinivasan, Michael Ellerman,
	Paul Mackerras
  Cc: Christophe Leroy, Aneesh Kumar K.V, linux-kernel, Nicholas Piggin,
	Jan Kratochvil, Al Viro, linuxppc-dev
In-Reply-To: <20201119160154.GA5183@redhat.com>

gpr_get() does membuf_write() twice to override pt_regs->msr in between.
We can call membuf_write() once and change ->msr in the kernel buffer,
this simplifies the code and the next fix.

The patch adds a new simple helper, membuf_at(offs), it returns the new
membuf which can be safely used after membuf_write().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 arch/powerpc/kernel/ptrace/ptrace-tm.c   | 13 +++++--------
 arch/powerpc/kernel/ptrace/ptrace-view.c | 13 +++++--------
 include/linux/regset.h                   | 12 ++++++++++++
 3 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/kernel/ptrace/ptrace-tm.c b/arch/powerpc/kernel/ptrace/ptrace-tm.c
index 54f2d076206f..f8fcbd85d4cb 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-tm.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-tm.c
@@ -86,6 +86,8 @@ int tm_cgpr_active(struct task_struct *target, const struct user_regset *regset)
 int tm_cgpr_get(struct task_struct *target, const struct user_regset *regset,
 		struct membuf to)
 {
+	struct membuf to_msr = membuf_at(&to, offsetof(struct pt_regs, msr));
+
 	if (!cpu_has_feature(CPU_FTR_TM))
 		return -ENODEV;
 
@@ -97,17 +99,12 @@ int tm_cgpr_get(struct task_struct *target, const struct user_regset *regset,
 	flush_altivec_to_thread(target);
 
 	membuf_write(&to, &target->thread.ckpt_regs,
-			offsetof(struct pt_regs, msr));
-	membuf_store(&to, get_user_ckpt_msr(target));
+				sizeof(struct user_pt_regs));
 
-	BUILD_BUG_ON(offsetof(struct pt_regs, orig_gpr3) !=
-		     offsetof(struct pt_regs, msr) + sizeof(long));
+	membuf_store(&to_msr, get_user_ckpt_msr(target));
 
-	membuf_write(&to, &target->thread.ckpt_regs.orig_gpr3,
-			sizeof(struct user_pt_regs) -
-			offsetof(struct pt_regs, orig_gpr3));
 	return membuf_zero(&to, ELF_NGREG * sizeof(unsigned long) -
-			sizeof(struct user_pt_regs));
+				sizeof(struct user_pt_regs));
 }
 
 /*
diff --git a/arch/powerpc/kernel/ptrace/ptrace-view.c b/arch/powerpc/kernel/ptrace/ptrace-view.c
index 7e6478e7ed07..39686ede40b3 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-view.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-view.c
@@ -217,6 +217,7 @@ int ptrace_put_reg(struct task_struct *task, int regno, unsigned long data)
 static int gpr_get(struct task_struct *target, const struct user_regset *regset,
 		   struct membuf to)
 {
+	struct membuf to_msr = membuf_at(&to, offsetof(struct pt_regs, msr));
 	int i;
 
 	if (target->thread.regs == NULL)
@@ -228,17 +229,13 @@ static int gpr_get(struct task_struct *target, const struct user_regset *regset,
 			target->thread.regs->gpr[i] = NV_REG_POISON;
 	}
 
-	membuf_write(&to, target->thread.regs, offsetof(struct pt_regs, msr));
-	membuf_store(&to, get_user_msr(target));
+	membuf_write(&to, target->thread.regs,
+				sizeof(struct user_pt_regs));
 
-	BUILD_BUG_ON(offsetof(struct pt_regs, orig_gpr3) !=
-		     offsetof(struct pt_regs, msr) + sizeof(long));
+	membuf_store(&to_msr, get_user_msr(target));
 
-	membuf_write(&to, &target->thread.regs->orig_gpr3,
-			sizeof(struct user_pt_regs) -
-			offsetof(struct pt_regs, orig_gpr3));
 	return membuf_zero(&to, ELF_NGREG * sizeof(unsigned long) -
-				 sizeof(struct user_pt_regs));
+				sizeof(struct user_pt_regs));
 }
 
 static int gpr_set(struct task_struct *target, const struct user_regset *regset,
diff --git a/include/linux/regset.h b/include/linux/regset.h
index c3403f328257..a00765f0e8cf 100644
--- a/include/linux/regset.h
+++ b/include/linux/regset.h
@@ -46,6 +46,18 @@ static inline int membuf_write(struct membuf *s, const void *v, size_t size)
 	return s->left;
 }
 
+static inline struct membuf membuf_at(const struct membuf *s, size_t offs)
+{
+	struct membuf n = *s;
+
+	if (offs > n.left)
+		offs = n.left;
+	n.p += offs;
+	n.left -= offs;
+
+	return n;
+}
+
 /* current s->p must be aligned for v; v must be a scalar */
 #define membuf_store(s, v)				\
 ({							\
-- 
2.25.1.362.g51ebf55



^ permalink raw reply related

* [PATCH v3 0/2] powerpc/ptrace: Hard wire PT_SOFTE value to 1 in gpr_get() too
From: Oleg Nesterov @ 2020-11-19 16:01 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Madhavan Srinivasan, Michael Ellerman,
	Paul Mackerras
  Cc: Christophe Leroy, Aneesh Kumar K.V, linux-kernel, Nicholas Piggin,
	Jan Kratochvil, Al Viro, linuxppc-dev

Can we finally fix this problem? ;)

My previous attempt was ignored, see

	https://lore.kernel.org/lkml/20190917121256.GA8659@redhat.com/

Now that gpr_get() was changed to use membuf API we can make a simpler fix.

Sorry, uncompiled/untested, I don't have a ppc machine.

Oleg.

 arch/powerpc/kernel/ptrace/ptrace-tm.c   | 21 ++++++++++++---------
 arch/powerpc/kernel/ptrace/ptrace-view.c | 21 ++++++++++++---------
 include/linux/regset.h                   | 12 ++++++++++++
 3 files changed, 36 insertions(+), 18 deletions(-)


^ permalink raw reply

* Re: CONFIG_PPC_VAS depends on 64k pages...?
From: Christophe Leroy @ 2020-11-19 14:43 UTC (permalink / raw)
  To: Will Springer, linuxppc-dev, Sukadev Bhattiprolu; +Cc: daniel
In-Reply-To: <7171078.EvYhyI6sBW@sheen>

Hi,

On 19/11/2020 at 11:58, Will Springer wrote:
> I learned about the POWER9 gzip accelerator a few months ago when the
> support hit upstream Linux 5.8. However, for some reason the Kconfig
> dictates that VAS depends on a 64k page size, which is problematic as I
> run Void Linux, which uses a 4k-page kernel.
> 
> Some early poking by others indicated there wasn't an obvious page size
> dependency in the code, and suggested I try modifying the config to switch
> it on. I did so, but was stopped by a minor complaint of an "unexpected DT
> configuration" by the VAS code. I wasn't equipped to figure out exactly what
> this meant, even after finding the offending condition, so after writing a
> very drawn-out forum post asking for help, I dropped the subject.
> 
> Fast forward to today, when I was reminded of the whole thing again, and
> decided to debug a bit further. Apparently the VAS platform device
> (derived from the DT node) has 5 resources on my 4k kernel, instead of 4
> (which evidently works for others who have had success on 64k kernels). I
> have no idea what this means in practice (I don't know how to introspect
> it), but after making a tiny patch[1], everything came up smoothly and I
> was doing blazing-fast gzip (de)compression in no time.
> 
> Everything seems to work fine on 4k pages. So, what's up? Are there
> pitfalls lurking around that I've yet to stumble over? More reasonably,
> I'm curious as to why the feature supposedly depends on 64k pages, or if
> there's anything else I should be concerned about.
> 

Maybe ask Sukadev, who did the implementation and maintains it?

> I do have to say I'm quite satisfied with the results of the NX
> accelerator, though. Being able to shuffle data to a RaptorCS box over gigE
> and get compressed data back faster than most software gzip could ever
> hope to achieve is no small feat, let alone the instantaneous results locally.
> :)
> 
> Cheers,
> Will Springer [she/her]
> 
> [1]: https://github.com/Skirmisher/void-packages/blob/vas-4k-pages/srcpkgs/linux5.9/patches/ppc-vas-on-4k.patch
> 


Christophe

^ permalink raw reply

* CONFIG_PPC_VAS depends on 64k pages...?
From: Will Springer @ 2020-11-19 10:58 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: daniel

I learned about the POWER9 gzip accelerator a few months ago when the 
support hit upstream Linux 5.8. However, for some reason the Kconfig 
dictates that VAS depends on a 64k page size, which is problematic as I 
run Void Linux, which uses a 4k-page kernel.

Some early poking by others indicated there wasn't an obvious page size 
dependency in the code, and suggested I try modifying the config to switch 
it on. I did so, but was stopped by a minor complaint of an "unexpected DT 
configuration" by the VAS code. I wasn't equipped to figure out exactly what 
this meant, even after finding the offending condition, so after writing a 
very drawn-out forum post asking for help, I dropped the subject.

Fast forward to today, when I was reminded of the whole thing again, and 
decided to debug a bit further. Apparently the VAS platform device 
(derived from the DT node) has 5 resources on my 4k kernel, instead of 4 
(which evidently works for others who have had success on 64k kernels). I 
have no idea what this means in practice (I don't know how to introspect 
it), but after making a tiny patch[1], everything came up smoothly and I 
was doing blazing-fast gzip (de)compression in no time.

Everything seems to work fine on 4k pages. So, what's up? Are there 
pitfalls lurking around that I've yet to stumble over? More reasonably, 
I'm curious as to why the feature supposedly depends on 64k pages, or if 
there's anything else I should be concerned about.

I do have to say I'm quite satisfied with the results of the NX 
accelerator, though. Being able to shuffle data to a RaptorCS box over gigE 
and get compressed data back faster than most software gzip could ever
hope to achieve is no small feat, let alone the instantaneous results locally.
:)

Cheers,
Will Springer [she/her]

[1]: https://github.com/Skirmisher/void-packages/blob/vas-4k-pages/srcpkgs/linux5.9/patches/ppc-vas-on-4k.patch




^ permalink raw reply

* Re: [PATCH v2 00/16] PCI: dwc: Another round of clean-ups
From: Lorenzo Pieralisi @ 2020-11-19 11:01 UTC (permalink / raw)
  To: Rob Herring
  Cc: Kunihiko Hayashi, Neil Armstrong, linux-pci, Binghui Wang,
	Bjorn Andersson, linux-tegra, Thierry Reding, linux-arm-kernel,
	Thomas Petazzoni, Jonathan Chocron, Shawn Guo,
	Kishon Vijay Abraham I, Fabio Estevam, Jerome Brunet,
	Jesper Nilsson, Lorenzo Pieralisi, Kevin Hilman, Pratyush Anand,
	Krzysztof Kozlowski, Jonathan Hunter, Murali Karicheri,
	NXP Linux Team, Xiaowei Song, Marek Szyprowski, Masahiro Yamada,
	Richard Zhu, Martin Blumenstingl, linux-arm-msm, Sascha Hauer,
	Yue Wang, linux-samsung-soc, Bjorn Helgaas, linux-amlogic,
	linux-omap, Mingkai Hu, linux-arm-kernel, Roy Zang, Minghuan Lian,
	Jingoo Han, Andy Gross, Vidya Sagar, Stanimir Varbanov,
	Kukjin Kim, Pengutronix Kernel Team, Gustavo Pimentel,
	linuxppc-dev, Lucas Stach
In-Reply-To: <20201105211159.1814485-1-robh@kernel.org>

On Thu, 5 Nov 2020 15:11:43 -0600, Rob Herring wrote:
> Here's another batch of DWC PCI host refactoring. This series primarily
> moves more of the MSI, link up, and resource handling to the core
> code. Beyond a couple of minor fixes, new in this version is runtime
> detection of iATU regions instead of using DT properties.
> 
> No doubt I've probably broken something. Please test. I've run this thru
> kernelci and checked boards with DWC PCI which currently is just
> Layerscape boards (hint: add boards and/or enable PCI). A git branch is
> here[1].
> 
> [...]

Applied to pci/dwc, thanks!

[01/16] PCI: dwc: Support multiple ATU memory regions
        https://git.kernel.org/lpieralisi/pci/c/9f9e59a480
[02/16] PCI: dwc/intel-gw: Move ATU offset out of driver match data
        https://git.kernel.org/lpieralisi/pci/c/1d567aac46
[03/16] PCI: dwc: Move "dbi", "dbi2", and "addr_space" resource setup into common code
        https://git.kernel.org/lpieralisi/pci/c/a0fd361db8
[04/16] PCI: dwc/intel-gw: Remove some unneeded function wrappers
        https://git.kernel.org/lpieralisi/pci/c/1cc9a55999
[05/16] PCI: dwc: Ensure all outbound ATU windows are reset
        https://git.kernel.org/lpieralisi/pci/c/458ad06c4c
[06/16] PCI: dwc/dra7xx: Use the common MSI irq_chip
        https://git.kernel.org/lpieralisi/pci/c/7f170d35f5
[07/16] PCI: dwc: Drop the .set_num_vectors() host op
        https://git.kernel.org/lpieralisi/pci/c/331e9bcead
[08/16] PCI: dwc: Move MSI interrupt setup into DWC common code
        https://git.kernel.org/lpieralisi/pci/c/5bcb1757e6
[09/16] PCI: dwc: Rework MSI initialization
        https://git.kernel.org/lpieralisi/pci/c/f78f02638a
[10/16] PCI: dwc: Move link handling into common code
        https://git.kernel.org/lpieralisi/pci/c/886a9c1347
[11/16] PCI: dwc: Move dw_pcie_msi_init() into core
        https://git.kernel.org/lpieralisi/pci/c/59fbab1ae4
[12/16] PCI: dwc: Move dw_pcie_setup_rc() to DWC common code
        https://git.kernel.org/lpieralisi/pci/c/b9ac0f9dc8
[13/16] PCI: dwc: Remove unnecessary wrappers around dw_pcie_host_init()
        https://git.kernel.org/lpieralisi/pci/c/60f5b73fa0
[14/16] Revert "PCI: dwc/keystone: Drop duplicated 'num-viewport'"
        https://git.kernel.org/lpieralisi/pci/c/fcde397422
[15/16] PCI: dwc: Move inbound and outbound windows to common struct
        https://git.kernel.org/lpieralisi/pci/c/9ca17af552
[16/16] PCI: dwc: Detect number of iATU windows
        https://git.kernel.org/lpieralisi/pci/c/281f1f99cf

Thanks,
Lorenzo

^ permalink raw reply

* Re: [PATCH net-next v2 8/9] ibmvnic: Use netdev_alloc_skb instead of alloc_skb to replenish RX buffers
From: ljp @ 2020-11-19  9:47 UTC (permalink / raw)
  To: Thomas Falcon
  Cc: cforno12, netdev, ricklind, dnbanerg, Linuxppc-dev, drt, brking,
	kuba, sukadev, linuxppc-dev
In-Reply-To: <1605748345-32062-9-git-send-email-tlfalcon@linux.ibm.com>

On 2020-11-18 19:12, Thomas Falcon wrote:
> From: "Dwip N. Banerjee" <dnbanerg@us.ibm.com>
> 
> Take advantage of the additional optimizations in netdev_alloc_skb when
> allocating socket buffers to be used for packet reception.
> 
> Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>

Acked-by: Lijun Pan <ljp@linux.ibm.com>

^ permalink raw reply

* Re: [PATCH net-next v2 9/9] ibmvnic: Do not replenish RX buffers after every polling loop
From: ljp @ 2020-11-19  9:43 UTC (permalink / raw)
  To: Thomas Falcon
  Cc: cforno12, netdev, ricklind, dnbanerg, Linuxppc-dev, drt, brking,
	kuba, sukadev, linuxppc-dev
In-Reply-To: <1605748345-32062-10-git-send-email-tlfalcon@linux.ibm.com>

On 2020-11-18 19:12, Thomas Falcon wrote:
> From: "Dwip N. Banerjee" <dnbanerg@us.ibm.com>
> 
> Reduce the amount of time spent replenishing RX buffers by
> only doing so once available buffers has fallen under a certain
> threshold, in this case half of the total number of buffers, or
> if the polling loop exits before the packets processed is less
> than its budget.
> 
> Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
> ---
>  drivers/net/ethernet/ibm/ibmvnic.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 96df6d8fa277..9fe43ab0496d 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -2537,7 +2537,10 @@ static int ibmvnic_poll(struct napi_struct *napi, int budget)
>  		frames_processed++;
>  	}
> 
> -	if (adapter->state != VNIC_CLOSING)
> +	if (adapter->state != VNIC_CLOSING &&
> +	    ((atomic_read(&adapter->rx_pool[scrq_num].available) <
> +	      adapter->req_rx_add_entries_per_subcrq / 2) ||
> +	      frames_processed < budget))

Using 1/2 as the threshold seems like a simple and good algorithm.
Explaining why "frames_processed < budget" is necessary, either in the
commit message or in the source code, would also help.


>  		replenish_rx_pool(adapter, &adapter->rx_pool[scrq_num]);
>  	if (frames_processed < budget) {
>  		if (napi_complete_done(napi, frames_processed)) {
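
The replenish condition discussed above can be sketched as a standalone
predicate. This is only an illustration, not the driver code: struct
rx_state and should_replenish() are hypothetical names, and the plain
ints stand in for adapter->state, the atomic "available" counter and
req_rx_add_entries_per_subcrq. One plausible reading of the
"frames_processed < budget" term is that the poll loop ran out of
packets, which is also the case where the quoted code goes on to call
napi_complete_done().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state consulted in ibmvnic_poll(). */
struct rx_state {
	bool closing;      /* models adapter->state == VNIC_CLOSING */
	int available;     /* models atomic_read(&rx_pool[n].available) */
	int pool_entries;  /* models req_rx_add_entries_per_subcrq */
};

/*
 * Return true when the RX pool should be refilled after a poll pass:
 * never while closing; otherwise when fewer than half of the buffers
 * remain, or when the pass ended before the budget was used up.
 */
static bool should_replenish(const struct rx_state *s,
			     int frames_processed, int budget)
{
	assert(s->pool_entries >= 0 && budget >= 0);

	if (s->closing)
		return false;
	if (s->available < s->pool_entries / 2)
		return true;
	return frames_processed < budget;
}
```

With a pool of 100 entries and a budget of 64, a pass that consumes the
whole budget triggers a refill only once fewer than 50 buffers remain,
while a pass that stops early always triggers one.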

^ permalink raw reply

* Re: [PATCH net-next v2 5/9] ibmvnic: Remove send_subcrq function
From: ljp @ 2020-11-19  9:37 UTC (permalink / raw)
  To: Thomas Falcon
  Cc: cforno12, netdev, ricklind, dnbanerg, Linuxppc-dev, drt, brking,
	kuba, sukadev, linuxppc-dev
In-Reply-To: <1605748345-32062-6-git-send-email-tlfalcon@linux.ibm.com>

On 2020-11-18 19:12, Thomas Falcon wrote:
> It is no longer used, so remove it.
> 
> Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>

Acked-by: Lijun Pan <ljp@linux.ibm.com>

^ permalink raw reply

* Re: [PATCH net-next v2 1/9] ibmvnic: Introduce indirect subordinate Command Response Queue buffer
From: ljp @ 2020-11-19  9:34 UTC (permalink / raw)
  To: Thomas Falcon
  Cc: cforno12, netdev, ricklind, dnbanerg, drt, brking, kuba, sukadev,
	linuxppc-dev
In-Reply-To: <1605748345-32062-2-git-send-email-tlfalcon@linux.ibm.com>

On 2020-11-18 19:12, Thomas Falcon wrote:
> This patch introduces the infrastructure to send batched subordinate
> Command Response Queue descriptors, which are used by the ibmvnic
> driver to send TX frame and RX buffer descriptors.
> 
> Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>

Acked-by: Lijun Pan <ljp@linux.ibm.com>

^ permalink raw reply

* [PATCH for 5.4] powerpc/8xx: Always fault when _PAGE_ACCESSED is not set
From: Christophe Leroy @ 2020-11-19  8:47 UTC (permalink / raw)
  To: gregkh, stable; +Cc: linuxppc-dev, linux-kernel

[This is a backport for 5.4 of 29daf869cbab69088fe1755d9dd224e99ba78b56]

The kernel expects pte_young() to work regardless of CONFIG_SWAP.

Make sure a minor fault is taken to set _PAGE_ACCESSED when it
is not already set, regardless of the selection of CONFIG_SWAP.

This adds at least 3 instructions to the TLB miss exception
handlers fast path. Following patch will reduce this overhead.

Also update the rotation instruction to the correct number of bits
to reflect all changes done to _PAGE_ACCESSED over time.

Fixes: d069cb4373fe ("powerpc/8xx: Don't touch ACCESSED when no SWAP.")
Fixes: 5f356497c384 ("powerpc/8xx: remove unused _PAGE_WRITETHRU")
Fixes: e0a8e0d90a9f ("powerpc/8xx: Handle PAGE_USER via APG bits")
Fixes: 5b2753fc3e8a ("powerpc/8xx: Implementation of PAGE_EXEC")
Fixes: a891c43b97d3 ("powerpc/8xx: Prepare handlers for _PAGE_HUGE for 512k pages.")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/af834e8a0f1fa97bfae65664950f0984a70c4750.1602492856.git.christophe.leroy@csgroup.eu
---
 arch/powerpc/kernel/head_8xx.S | 14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 98d8b6832fcb..f6428b90a6c7 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -229,9 +229,7 @@ SystemCall:
 
 InstructionTLBMiss:
 	mtspr	SPRN_SPRG_SCRATCH0, r10
-#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP)
 	mtspr	SPRN_SPRG_SCRATCH1, r11
-#endif
 
 	/* If we are faulting a kernel address, we have to use the
 	 * kernel page tables.
@@ -278,11 +276,9 @@ InstructionTLBMiss:
 #ifdef ITLB_MISS_KERNEL
 	mtcr	r11
 #endif
-#ifdef CONFIG_SWAP
-	rlwinm	r11, r10, 32-5, _PAGE_PRESENT
+	rlwinm	r11, r10, 32-7, _PAGE_PRESENT
 	and	r11, r11, r10
 	rlwimi	r10, r11, 0, _PAGE_PRESENT
-#endif
 	/* The Linux PTE won't go exactly into the MMU TLB.
 	 * Software indicator bits 20 and 23 must be clear.
 	 * Software indicator bits 22, 24, 25, 26, and 27 must be
@@ -296,9 +292,7 @@ InstructionTLBMiss:
 
 	/* Restore registers */
 0:	mfspr	r10, SPRN_SPRG_SCRATCH0
-#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP)
 	mfspr	r11, SPRN_SPRG_SCRATCH1
-#endif
 	rfi
 	patch_site	0b, patch__itlbmiss_exit_1
 
@@ -308,9 +302,7 @@ InstructionTLBMiss:
 	addi	r10, r10, 1
 	stw	r10, (itlb_miss_counter - PAGE_OFFSET)@l(0)
 	mfspr	r10, SPRN_SPRG_SCRATCH0
-#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP)
 	mfspr	r11, SPRN_SPRG_SCRATCH1
-#endif
 	rfi
 #endif
 
@@ -394,11 +386,9 @@ DataStoreTLBMiss:
 	 * r11 = ((r10 & PRESENT) & ((r10 & ACCESSED) >> 5));
 	 * r10 = (r10 & ~PRESENT) | r11;
 	 */
-#ifdef CONFIG_SWAP
-	rlwinm	r11, r10, 32-5, _PAGE_PRESENT
+	rlwinm	r11, r10, 32-7, _PAGE_PRESENT
 	and	r11, r11, r10
 	rlwimi	r10, r11, 0, _PAGE_PRESENT
-#endif
 	/* The Linux PTE won't go exactly into the MMU TLB.
 	 * Software indicator bits 24, 25, 26, and 27 must be
 	 * set.  All other Linux PTE bits control the behavior
-- 
2.25.0
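
For readers less familiar with the rlwinm/and/rlwimi sequence that this
patch makes unconditional, here is a rough C model of what it computes.
It is a sketch under assumptions: PAGE_PRESENT is taken to be bit 0x0001,
and the ACCESSED bit is assumed to sit accessed_shift bits above it (7
in this patch, as implied by the "32-7" rotate). The function names are
made up for illustration. The effect is that PRESENT survives only when
ACCESSED is also set, so a PTE that has not been accessed yet is loaded
as not present and the access faults, which is the minor fault the
commit message refers to.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_PRESENT 0x0001u	/* assumed bit value, for illustration */

/* Rotate right by n bits; "rlwinm rD, rS, 32-n, MASK" rotates left by
 * 32-n, which is the same rotation, and then applies MASK. */
static uint32_t rotr32(uint32_t v, unsigned int n)
{
	assert(n > 0 && n < 32);
	return (v >> n) | (v << (32 - n));
}

/* Model of the three-instruction sequence, with r10 holding the PTE. */
static uint32_t clear_present_if_not_accessed(uint32_t pte,
					      unsigned int accessed_shift)
{
	/* rlwinm r11, r10, 32-shift, _PAGE_PRESENT */
	uint32_t r11 = rotr32(pte, accessed_shift) & PAGE_PRESENT;

	/* and r11, r11, r10 */
	r11 &= pte;

	/* rlwimi r10, r11, 0, _PAGE_PRESENT */
	return (pte & ~PAGE_PRESENT) | r11;
}
```

All other PTE bits pass through unchanged; only the PRESENT bit is
conditionally cleared.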


^ permalink raw reply related

* [PATCH for 4.9] powerpc/8xx: Always fault when _PAGE_ACCESSED is not set
From: Christophe Leroy @ 2020-11-19  8:47 UTC (permalink / raw)
  To: gregkh, stable; +Cc: linuxppc-dev, linux-kernel

[This is a backport for 4.9 of 29daf869cbab69088fe1755d9dd224e99ba78b56]

The kernel expects pte_young() to work regardless of CONFIG_SWAP.

Make sure a minor fault is taken to set _PAGE_ACCESSED when it
is not already set, regardless of the selection of CONFIG_SWAP.

This adds at least 3 instructions to the TLB miss exception
handlers fast path. Following patch will reduce this overhead.

Also update the rotation instruction to the correct number of bits
to reflect all changes done to _PAGE_ACCESSED over time.

Fixes: d069cb4373fe ("powerpc/8xx: Don't touch ACCESSED when no SWAP.")
Fixes: 5f356497c384 ("powerpc/8xx: remove unused _PAGE_WRITETHRU")
Fixes: e0a8e0d90a9f ("powerpc/8xx: Handle PAGE_USER via APG bits")
Fixes: 5b2753fc3e8a ("powerpc/8xx: Implementation of PAGE_EXEC")
Fixes: a891c43b97d3 ("powerpc/8xx: Prepare handlers for _PAGE_HUGE for 512k pages.")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/af834e8a0f1fa97bfae65664950f0984a70c4750.1602492856.git.christophe.leroy@csgroup.eu
---
 arch/powerpc/kernel/head_8xx.S | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 2274be535dda..3801b32b1642 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -359,11 +359,9 @@ InstructionTLBMiss:
 	/* Load the MI_TWC with the attributes for this "segment." */
 	MTSPR_CPU6(SPRN_MI_TWC, r11, r3)	/* Set segment attributes */
 
-#ifdef CONFIG_SWAP
-	rlwinm	r11, r10, 32-5, _PAGE_PRESENT
+	rlwinm	r11, r10, 32-11, _PAGE_PRESENT
 	and	r11, r11, r10
 	rlwimi	r10, r11, 0, _PAGE_PRESENT
-#endif
 	li	r11, RPN_PATTERN
 	/* The Linux PTE won't go exactly into the MMU TLB.
 	 * Software indicator bits 20-23 and 28 must be clear.
@@ -443,11 +441,9 @@ _ENTRY(DTLBMiss_jmp)
 	 * r11 = ((r10 & PRESENT) & ((r10 & ACCESSED) >> 5));
 	 * r10 = (r10 & ~PRESENT) | r11;
 	 */
-#ifdef CONFIG_SWAP
-	rlwinm	r11, r10, 32-5, _PAGE_PRESENT
+	rlwinm	r11, r10, 32-11, _PAGE_PRESENT
 	and	r11, r11, r10
 	rlwimi	r10, r11, 0, _PAGE_PRESENT
-#endif
 	/* The Linux PTE won't go exactly into the MMU TLB.
 	 * Software indicator bits 22 and 28 must be clear.
 	 * Software indicator bits 24, 25, 26, and 27 must be
-- 
2.25.0


^ permalink raw reply related

* [PATCH for 4.14] powerpc/8xx: Always fault when _PAGE_ACCESSED is not set
From: Christophe Leroy @ 2020-11-19  8:47 UTC (permalink / raw)
  To: gregkh, stable; +Cc: linuxppc-dev, linux-kernel

[This is a backport for 4.14 of 29daf869cbab69088fe1755d9dd224e99ba78b56]

The kernel expects pte_young() to work regardless of CONFIG_SWAP.

Make sure a minor fault is taken to set _PAGE_ACCESSED when it
is not already set, regardless of the selection of CONFIG_SWAP.

This adds at least 3 instructions to the TLB miss exception
handlers fast path. Following patch will reduce this overhead.

Also update the rotation instruction to the correct number of bits
to reflect all changes done to _PAGE_ACCESSED over time.

Fixes: d069cb4373fe ("powerpc/8xx: Don't touch ACCESSED when no SWAP.")
Fixes: 5f356497c384 ("powerpc/8xx: remove unused _PAGE_WRITETHRU")
Fixes: e0a8e0d90a9f ("powerpc/8xx: Handle PAGE_USER via APG bits")
Fixes: 5b2753fc3e8a ("powerpc/8xx: Implementation of PAGE_EXEC")
Fixes: a891c43b97d3 ("powerpc/8xx: Prepare handlers for _PAGE_HUGE for 512k pages.")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/af834e8a0f1fa97bfae65664950f0984a70c4750.1602492856.git.christophe.leroy@csgroup.eu
---
 arch/powerpc/kernel/head_8xx.S | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 2d0d89e2cb9a..43884af0e35c 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -398,11 +398,9 @@ _ENTRY(ITLBMiss_cmp)
 #if defined (CONFIG_HUGETLB_PAGE) && defined (CONFIG_PPC_4K_PAGES)
 	rlwimi	r10, r11, 1, MI_SPS16K
 #endif
-#ifdef CONFIG_SWAP
-	rlwinm	r11, r10, 32-5, _PAGE_PRESENT
+	rlwinm	r11, r10, 32-11, _PAGE_PRESENT
 	and	r11, r11, r10
 	rlwimi	r10, r11, 0, _PAGE_PRESENT
-#endif
 	li	r11, RPN_PATTERN
 	/* The Linux PTE won't go exactly into the MMU TLB.
 	 * Software indicator bits 20-23 and 28 must be clear.
@@ -528,11 +526,9 @@ _ENTRY(DTLBMiss_jmp)
 	 * r11 = ((r10 & PRESENT) & ((r10 & ACCESSED) >> 5));
 	 * r10 = (r10 & ~PRESENT) | r11;
 	 */
-#ifdef CONFIG_SWAP
-	rlwinm	r11, r10, 32-5, _PAGE_PRESENT
+	rlwinm	r11, r10, 32-11, _PAGE_PRESENT
 	and	r11, r11, r10
 	rlwimi	r10, r11, 0, _PAGE_PRESENT
-#endif
 	/* The Linux PTE won't go exactly into the MMU TLB.
 	 * Software indicator bits 22 and 28 must be clear.
 	 * Software indicator bits 24, 25, 26, and 27 must be
-- 
2.25.0


^ permalink raw reply related

