LinuxPPC-Dev Archive on lore.kernel.org
* [PATCH v2 0/3] Allow custom PCI resource alignment on pseries
From: Shawn Anastasio @ 2019-05-28  1:54 UTC
  To: linux-pci, linuxppc-dev
  Cc: sbobroff, linux-kernel, rppt, xyjxie, bhelgaas, paulus

Changes from v1 to v2:
  - Fix function declaration warnings caught by sparse

Hello all,

This patch set implements support for user-specified PCI resource
alignment on the pseries platform for hotplugged PCI devices.
Currently on pseries, PCI resource alignments specified with the
pci=resource_alignment commandline argument are ignored, since
the firmware is in charge of managing the PCI resources. In the
case of hotplugged devices, though, the kernel is in charge of 
configuring the resources and should obey alignment requirements.

The current behavior of ignoring the alignment for hotplugged devices
results in sub-page BARs landing between page boundaries and
becoming un-mappable from userspace via the VFIO framework.
This issue was observed on a pseries KVM guest with hotplugged
ivshmem devices.
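
To make the mappability problem concrete, here is a minimal userspace sketch. It assumes 64K pages, and the names SKETCH_PAGE_SIZE and bar_is_page_aligned are hypothetical, not kernel symbols. It only models the page-boundary condition described above; the real VFIO check may involve more than this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64K pages, as is common on ppc64 guests. */
#define SKETCH_PAGE_SIZE 0x10000ULL

/*
 * Hypothetical helper: a sub-page BAR that does not start on a page
 * boundary shares its page with whatever else was placed there, so it
 * cannot be mapped to userspace on its own.
 */
static bool bar_is_page_aligned(uint64_t start)
{
	return (start & (SKETCH_PAGE_SIZE - 1)) == 0;
}
```

With these assumptions, a BAR at 0x20000 passes the check while one at 0x20100 does not.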
 
With these changes, users can specify an appropriate
pci=resource_alignment argument on boot for devices they wish to use 
with VFIO.

In the future, this could be extended to provide page-aligned
resources by default for hotplugged devices, similar to what is done
on powernv by commit 382746376993 ("powerpc/powernv: Override
pcibios_default_alignment() to force PCI devices to be page aligned").

Feedback is appreciated.

Thanks,
Shawn

Shawn Anastasio (3):
  PCI: Introduce pcibios_ignore_alignment_request
  powerpc/64: Enable pcibios_after_init hook on ppc64
  powerpc/pseries: Allow user-specified PCI resource alignment after
    init

 arch/powerpc/include/asm/machdep.h     |  6 ++++--
 arch/powerpc/kernel/pci-common.c       |  9 +++++++++
 arch/powerpc/kernel/pci_64.c           |  4 ++++
 arch/powerpc/platforms/pseries/setup.c | 22 ++++++++++++++++++++++
 drivers/pci/pci.c                      |  9 +++++++--
 include/linux/pci.h                    |  1 +
 6 files changed, 47 insertions(+), 4 deletions(-)

-- 
2.20.1



* [PATCH v2 2/3] powerpc/64: Enable pcibios_after_init hook on ppc64
From: Shawn Anastasio @ 2019-05-28  1:54 UTC
  To: linux-pci, linuxppc-dev
  Cc: sbobroff, linux-kernel, rppt, xyjxie, bhelgaas, paulus
In-Reply-To: <20190528015412.30521-1-shawn@anastas.io>

Enable the pcibios_after_init hook on all powerpc platforms.
This hook is executed at the end of pcibios_init and was previously
only available on CONFIG_PPC32.

Since it is useful and not inherently limited to 32-bit mode,
remove the limitation and allow it on all powerpc platforms.
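
The optional-hook pattern this relies on can be sketched in plain C as follows. This is a simplified userspace model, not the kernel code: struct sketch_machdep_calls stands in for struct machdep_calls, and every sketch_* name is hypothetical.

```c
/* Simplified stand-in for struct machdep_calls; only one hook is modelled. */
struct sketch_machdep_calls {
	void (*pcibios_after_init)(void);
};

static struct sketch_machdep_calls sketch_md;
static int sketch_hook_calls;

/* Example platform callback: just counts how often it was invoked. */
static void sketch_platform_after_init(void)
{
	sketch_hook_calls++;
}

/* Mirrors the tail of pcibios_init(): call the hook only if one is set. */
static int sketch_pcibios_init(void)
{
	if (sketch_md.pcibios_after_init)
		sketch_md.pcibios_after_init();

	return 0;
}
```

A platform that leaves the pointer NULL sees no change in behavior, which is why making the hook available on 64-bit should be safe for platforms that do not use it.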

Signed-off-by: Shawn Anastasio <shawn@anastas.io>
---
 arch/powerpc/include/asm/machdep.h | 3 +--
 arch/powerpc/kernel/pci_64.c       | 4 ++++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 2f0ca6560e47..2fbfaa9176ed 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -150,6 +150,7 @@ struct machdep_calls {
 	void		(*init)(void);
 
 	void		(*kgdb_map_scc)(void);
+#endif /* CONFIG_PPC32 */
 
 	/*
 	 * optional PCI "hooks"
@@ -157,8 +158,6 @@ struct machdep_calls {
 	/* Called at then very end of pcibios_init() */
 	void (*pcibios_after_init)(void);
 
-#endif /* CONFIG_PPC32 */
-
 	/* Called in indirect_* to avoid touching devices */
 	int (*pci_exclude_device)(struct pci_controller *, unsigned char, unsigned char);
 
diff --git a/arch/powerpc/kernel/pci_64.c b/arch/powerpc/kernel/pci_64.c
index 9d8c10d55407..fba7fe6e4a50 100644
--- a/arch/powerpc/kernel/pci_64.c
+++ b/arch/powerpc/kernel/pci_64.c
@@ -68,6 +68,10 @@ static int __init pcibios_init(void)
 
 	printk(KERN_DEBUG "PCI: Probing PCI hardware done\n");
 
+	/* Call machine dependent post-init code */
+	if (ppc_md.pcibios_after_init)
+		ppc_md.pcibios_after_init();
+
 	return 0;
 }
 
-- 
2.20.1



* Re: [PATCH v2 2/2] tests: add close_range() tests
From: Michael Ellerman @ 2019-05-28  2:33 UTC
  To: Christian Brauner, viro, linux-kernel, linux-fsdevel, linux-api,
	torvalds, fweimer
  Cc: linux-ia64, linux-sh, ldv, dhowells, linux-kselftest, sparclinux,
	shuah, linux-arch, linux-s390, miklos, x86, Christian Brauner,
	linux-mips, linux-xtensa, tkjos, arnd, jannh, linux-m68k, tglx,
	linux-arm-kernel, linux-parisc, oleg, linux-alpha, linuxppc-dev
In-Reply-To: <20190523154747.15162-3-christian@brauner.io>

Christian Brauner <christian@brauner.io> writes:
> This adds basic tests for the new close_range() syscall.
> - test that no invalid flags can be passed
> - test that a range of file descriptors is correctly closed
> - test that a range of file descriptors is correctly closed if there
>   are already closed file descriptors in the range
> - test that max_fd is correctly capped to the current fdtable maximum
>
> Signed-off-by: Christian Brauner <christian@brauner.io>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Jann Horn <jannh@google.com>
> Cc: David Howells <dhowells@redhat.com>
> Cc: Dmitry V. Levin <ldv@altlinux.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Florian Weimer <fweimer@redhat.com>
> Cc: linux-api@vger.kernel.org
> ---
> v1: unchanged
> v2:
> - Christian Brauner <christian@brauner.io>:
>   - verify that close_range() correctly closes a single file descriptor
> ---
>  tools/testing/selftests/Makefile              |   1 +
>  tools/testing/selftests/core/.gitignore       |   1 +
>  tools/testing/selftests/core/Makefile         |   6 +
>  .../testing/selftests/core/close_range_test.c | 142 ++++++++++++++++++
>  4 files changed, 150 insertions(+)
>  create mode 100644 tools/testing/selftests/core/.gitignore
>  create mode 100644 tools/testing/selftests/core/Makefile
>  create mode 100644 tools/testing/selftests/core/close_range_test.c
>
> diff --git a/tools/testing/selftests/core/.gitignore b/tools/testing/selftests/core/.gitignore
> new file mode 100644
> index 000000000000..6e6712ce5817
> --- /dev/null
> +++ b/tools/testing/selftests/core/.gitignore
> @@ -0,0 +1 @@
> +close_range_test
> diff --git a/tools/testing/selftests/core/Makefile b/tools/testing/selftests/core/Makefile
> new file mode 100644
> index 000000000000..de3ae68aa345
> --- /dev/null
> +++ b/tools/testing/selftests/core/Makefile
> @@ -0,0 +1,6 @@
> +CFLAGS += -g -I../../../../usr/include/ -I../../../../include

Your second -I pulls the unexported kernel headers in, userspace
programs shouldn't include unexported kernel headers.

It breaks the build on powerpc with eg:

  powerpc64le-linux-gnu-gcc -g -I../../../../usr/include/ -I../../../../include    close_range_test.c  -o /output/kselftest/core/close_range_test
  In file included from /usr/powerpc64le-linux-gnu/include/bits/fcntl-linux.h:346,
                   from /usr/powerpc64le-linux-gnu/include/bits/fcntl.h:62,
                   from /usr/powerpc64le-linux-gnu/include/fcntl.h:35,
                   from close_range_test.c:5:
  ../../../../include/linux/falloc.h:13:2: error: unknown type name '__s16'
    __s16  l_type;
    ^~~~~


Did you do that on purpose or just copy it from one of the other
Makefiles? :)

If you're just wanting to get the syscall number when the headers
haven't been exported, I think the best solution is to do eg:

diff --git a/tools/testing/selftests/core/close_range_test.c b/tools/testing/selftests/core/close_range_test.c
index d6e6079d3d53..34c6f02f25de 100644
--- a/tools/testing/selftests/core/close_range_test.c
+++ b/tools/testing/selftests/core/close_range_test.c
@@ -14,6 +14,10 @@

 #include "../kselftest.h"

+#ifndef __NR_close_range
+#define __NR_close_range       435
+#endif
+
 static inline int sys_close_range(unsigned int fd, unsigned int max_fd,
                                  unsigned int flags)
 {
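
Put together, the fallback approach could look like the sketch below. It is a standalone illustration, not the selftest file itself: the wrapper body is an assumption (the diff above shows only its signature), and 435 is the number proposed in this thread, which may differ from what is eventually merged.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Fallback for builds where the exported headers do not define the
 * number yet. 435 is the value proposed in this thread and is only
 * used when the headers provide nothing.
 */
#ifndef __NR_close_range
#define __NR_close_range 435
#endif

/* Thin wrapper around the raw syscall; returns -1 and sets errno on error. */
static inline int sys_close_range(unsigned int fd, unsigned int max_fd,
				  unsigned int flags)
{
	return syscall(__NR_close_range, fd, max_fd, flags);
}
```

This keeps the test buildable against only the exported headers in usr/include, so the second -I is no longer needed.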


cheers


* [TRIVIAL] [PATCH] powerpc/powernv-eeh: Consisely desribe what this file does
From: Stewart Smith @ 2019-05-28  3:29 UTC
  To: linuxppc-dev; +Cc: sbobroff, oohall, Stewart Smith, paulus

If the previous comment made sense, continue debugging or call your
doctor immediately.

Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index f38078976c5d..bea6708be065 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1,7 +1,5 @@
 /*
- * The file intends to implement the platform dependent EEH operations on
- * powernv platform. Actually, the powernv was created in order to fully
- * hypervisor support.
+ * PowerNV Platform dependent EEH operations
  *
  * Copyright Benjamin Herrenschmidt & Gavin Shan, IBM Corporation 2013.
  *
-- 
2.21.0



* Re: [PATCH v3 2/3] arch: wire-up close_range()
From: Michael Ellerman @ 2019-05-28  3:43 UTC
  To: Christian Brauner, viro, linux-kernel, linux-fsdevel, torvalds,
	fweimer
  Cc: linux-ia64, linux-sh, ldv, dhowells, sparclinux, shuah,
	linux-arch, linux-s390, miklos, x86, Christian Brauner,
	linux-mips, linux-xtensa, tkjos, arnd, jannh, linux-m68k, tglx,
	linux-arm-kernel, linux-parisc, linux-api, oleg, linux-alpha,
	linuxppc-dev
In-Reply-To: <20190524111047.6892-3-christian@brauner.io>

Christian Brauner <christian@brauner.io> writes:
> diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
> index 103655d84b4b..ba2c1f078cbd 100644
> --- a/arch/powerpc/kernel/syscalls/syscall.tbl
> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
> @@ -515,3 +515,4 @@
>  431	common	fsconfig			sys_fsconfig
>  432	common	fsmount				sys_fsmount
>  433	common	fspick				sys_fspick
> +435	common	close_range			sys_close_range

With a minor build fix the selftest passes for me on ppc64le:

  # ./close_range_test 
  1..9
  ok 1 do not allow invalid flag values for close_range()
  ok 2 close_range() from 3 to 53
  ok 3 fcntl() verify closed range from 3 to 53
  ok 4 close_range() from 54 to 95
  ok 5 fcntl() verify closed range from 54 to 95
  ok 6 close_range() from 96 to 102
  ok 7 fcntl() verify closed range from 96 to 102
  ok 8 close_range() closed single file descriptor
  ok 9 fcntl() verify closed single file descriptor
  # Pass 9 Fail 0 Xfail 0 Xpass 0 Skip 0 Error 0


Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)

cheers


* Re: [RESEND PATCH 0/3] Allow custom PCI resource alignment on pseries
From: Oliver @ 2019-05-28  4:01 UTC
  To: Shawn Anastasio
  Cc: Sam Bobroff, linux-pci, Linux Kernel Mailing List, rppt,
	Alexey Kardashevskiy, Paul Mackerras, Bjorn Helgaas, xyjxie,
	linuxppc-dev
In-Reply-To: <20190527225521.5884-1-shawn@anastas.io>

On Tue, May 28, 2019 at 8:56 AM Shawn Anastasio <shawn@anastas.io> wrote:
>
> Hello all,
>
> This patch set implements support for user-specified PCI resource
> alignment on the pseries platform for hotplugged PCI devices.
> Currently on pseries, PCI resource alignments specified with the
> pci=resource_alignment commandline argument are ignored, since
> the firmware is in charge of managing the PCI resources. In the
> case of hotplugged devices, though, the kernel is in charge of
> configuring the resources and should obey alignment requirements.

Are you using hotplug to work around SLOF (the OF we use under qemu)
not aligning BARs to 64K? It looks like there is a commit in SLOF to
fix that (https://git.qemu.org/?p=SLOF.git;a=commit;f=board-qemu/slof/pci-phb.fs;h=1903174472f8800caf50c959b304501b4c01153c).

> The current behavior of ignoring the alignment for hotplugged devices
> results in sub-page BARs landing between page boundaries and
> becoming un-mappable from userspace via the VFIO framework.
> This issue was observed on a pseries KVM guest with hotplugged
> ivshmem devices.

> With these changes, users can specify an appropriate
> pci=resource_alignment argument on boot for devices they wish to use
> with VFIO.
>
> In the future, this could be extended to provide page-aligned
> resources by default for hotplugged devices, similar to what is done
> on powernv by commit 382746376993 ("powerpc/powernv: Override
> pcibios_default_alignment() to force PCI devices to be page aligned").

Can we make aligning the BARs to PAGE_SIZE the default behaviour? The
BAR assignment process is complex enough as-is so I'd rather we didn't
add another platform hack into the mix.

> Feedback is appreciated.
>
> Thanks,
> Shawn
>
> Shawn Anastasio (3):
>   PCI: Introduce pcibios_ignore_alignment_request
>   powerpc/64: Enable pcibios_after_init hook on ppc64
>   powerpc/pseries: Allow user-specified PCI resource alignment after
>     init
>
>  arch/powerpc/include/asm/machdep.h     |  6 ++++--
>  arch/powerpc/kernel/pci-common.c       |  9 +++++++++
>  arch/powerpc/kernel/pci_64.c           |  4 ++++
>  arch/powerpc/platforms/pseries/setup.c | 22 ++++++++++++++++++++++
>  drivers/pci/pci.c                      |  9 +++++++--
>  5 files changed, 46 insertions(+), 4 deletions(-)
>
> --
> 2.20.1
>


* Re: [TRIVIAL] [PATCH] powerpc/powernv-eeh: Consisely desribe what this file does
From: Oliver @ 2019-05-28  4:03 UTC
  To: Stewart Smith; +Cc: Sam Bobroff, Paul Mackerras, linuxppc-dev
In-Reply-To: <20190528032925.8836-1-stewart@linux.ibm.com>

On Tue, May 28, 2019 at 1:29 PM Stewart Smith <stewart@linux.ibm.com> wrote:
>
> If the previous comment made sense, continue debugging or call your
> doctor immediately.
>
> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
> ---
>  arch/powerpc/platforms/powernv/eeh-powernv.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index f38078976c5d..bea6708be065 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -1,7 +1,5 @@
>  /*
> - * The file intends to implement the platform dependent EEH operations on
> - * powernv platform. Actually, the powernv was created in order to fully
> - * hypervisor support.
> + * PowerNV Platform dependent EEH operations
>   *
>   * Copyright Benjamin Herrenschmidt & Gavin Shan, IBM Corporation 2013.

Stewart, thanks for fixing it up. While you're at it, please replace
the maintainer with yourself.

>   *
> --
> 2.21.0
>


* [PATCH v3 3/3] powerpc/pseries: Allow user-specified PCI resource alignment after init
From: Shawn Anastasio @ 2019-05-28  4:03 UTC
  To: linux-pci, linuxppc-dev
  Cc: sbobroff, linux-kernel, rppt, xyjxie, bhelgaas, paulus
In-Reply-To: <20190528040313.35582-1-shawn@anastas.io>

On pseries, custom PCI resource alignment specified with the commandline
argument pci=resource_alignment is disabled due to PCI resources being
managed by the firmware. However, in the case of PCI hotplug the
resources are managed by the kernel, so custom alignments should be
honored in these cases. This is done by only honoring custom
alignments after initial PCI initialization is done, to ensure that
all devices managed by the firmware are excluded.
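
The decision logic described here can be modelled in userspace as a minimal sketch. The two static variables stand in for kernel state (pci_has_flag(PCI_PROBE_ONLY) and the init-done flag), and all sketch_* names are hypothetical.

```c
#include <stdbool.h>

/* Stand-ins for kernel state; both names are hypothetical. */
static bool sketch_probe_only = true;	/* models pci_has_flag(PCI_PROBE_ONLY) */
static bool sketch_init_done;		/* set once initial PCI init finishes */

/* Models the pcibios_after_init hook: record that boot-time init is over. */
static void sketch_after_init(void)
{
	sketch_init_done = true;
}

/* Non-zero means "ignore pci=resource_alignment for this request". */
static int sketch_ignore_alignment_request(void)
{
	if (sketch_init_done)
		return 0;	/* hotplug path: honor the request */

	return sketch_probe_only;
}
```

Before sketch_after_init() runs, requests are ignored whenever the probe-only flag is set; afterwards they are always honored.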

Without this ability, sub-page BARs sometimes get mapped in between
page boundaries for hotplugged devices and are therefore unusable
with the VFIO framework. This change allows users to request
page alignment for devices they wish to access via VFIO using
the pci=resource_alignment commandline argument.

In the future, this could be extended to provide page-aligned
resources by default for hotplugged devices, similar to what is
done on powernv by commit 382746376993 ("powerpc/powernv: Override
pcibios_default_alignment() to force PCI devices to be page aligned").

Signed-off-by: Shawn Anastasio <shawn@anastas.io>
---
 arch/powerpc/include/asm/machdep.h     |  3 +++
 arch/powerpc/kernel/pci-common.c       |  9 +++++++++
 arch/powerpc/platforms/pseries/setup.c | 22 ++++++++++++++++++++++
 3 files changed, 34 insertions(+)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 2fbfaa9176ed..46eb62c0954e 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -179,6 +179,9 @@ struct machdep_calls {
 
 	resource_size_t (*pcibios_default_alignment)(void);
 
+	/* Called when determining PCI resource alignment */
+	int (*pcibios_ignore_alignment_request)(void);
+
 #ifdef CONFIG_PCI_IOV
 	void (*pcibios_fixup_sriov)(struct pci_dev *pdev);
 	resource_size_t (*pcibios_iov_resource_alignment)(struct pci_dev *, int resno);
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index ff4b7539cbdf..8e0d73b4c188 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -238,6 +238,15 @@ resource_size_t pcibios_default_alignment(void)
 	return 0;
 }
 
+int pcibios_ignore_alignment_request(void)
+{
+	if (ppc_md.pcibios_ignore_alignment_request)
+		return ppc_md.pcibios_ignore_alignment_request();
+
+	/* Fall back to default method of checking PCI_PROBE_ONLY */
+	return pci_has_flag(PCI_PROBE_ONLY);
+}
+
 #ifdef CONFIG_PCI_IOV
 resource_size_t pcibios_iov_resource_alignment(struct pci_dev *pdev, int resno)
 {
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index e4f0dfd4ae33..07f03be02afe 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -82,6 +82,8 @@ EXPORT_SYMBOL(CMO_PageSize);
 
 int fwnmi_active;  /* TRUE if an FWNMI handler is present */
 
+static int initial_pci_init_done; /* TRUE if initial pcibios init has completed */
+
 static void pSeries_show_cpuinfo(struct seq_file *m)
 {
 	struct device_node *root;
@@ -749,6 +751,23 @@ static resource_size_t pseries_pci_iov_resource_alignment(struct pci_dev *pdev,
 }
 #endif
 
+static void pseries_after_init(void)
+{
+	initial_pci_init_done = 1;
+}
+
+static int pseries_ignore_alignment_request(void)
+{
+	if (initial_pci_init_done)
+		/*
+		 * Allow custom alignments after init for things
+		 * like PCI hotplugging.
+		 */
+		return 0;
+
+	return pci_has_flag(PCI_PROBE_ONLY);
+}
+
 static void __init pSeries_setup_arch(void)
 {
 	set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);
@@ -797,6 +816,9 @@ static void __init pSeries_setup_arch(void)
 	}
 
 	ppc_md.pcibios_root_bridge_prepare = pseries_root_bridge_prepare;
+	ppc_md.pcibios_after_init = pseries_after_init;
+	ppc_md.pcibios_ignore_alignment_request =
+		pseries_ignore_alignment_request;
 }
 
 static void pseries_panic(char *str)
-- 
2.20.1



* [PATCH v3 0/3] Allow custom PCI resource alignment on pseries
From: Shawn Anastasio @ 2019-05-28  4:03 UTC
  To: linux-pci, linuxppc-dev
  Cc: sbobroff, linux-kernel, rppt, xyjxie, bhelgaas, paulus

Changes from v2 to v3:
  - Fix wrong return type of ppc pcibios_ignore_alignment_request
    (Not sure how my local compile didn't catch that!)

Hello all,

This patch set implements support for user-specified PCI resource
alignment on the pseries platform for hotplugged PCI devices.
Currently on pseries, PCI resource alignments specified with the
pci=resource_alignment commandline argument are ignored, since
the firmware is in charge of managing the PCI resources. In the
case of hotplugged devices, though, the kernel is in charge of 
configuring the resources and should obey alignment requirements.

The current behavior of ignoring the alignment for hotplugged devices
results in sub-page BARs landing between page boundaries and
becoming un-mappable from userspace via the VFIO framework.
This issue was observed on a pseries KVM guest with hotplugged
ivshmem devices.
 
With these changes, users can specify an appropriate
pci=resource_alignment argument on boot for devices they wish to use 
with VFIO.

In the future, this could be extended to provide page-aligned
resources by default for hotplugged devices, similar to what is done
on powernv by commit 382746376993 ("powerpc/powernv: Override
pcibios_default_alignment() to force PCI devices to be page aligned").

Feedback is appreciated.

Thanks,
Shawn

Shawn Anastasio (3):
  PCI: Introduce pcibios_ignore_alignment_request
  powerpc/64: Enable pcibios_after_init hook on ppc64
  powerpc/pseries: Allow user-specified PCI resource alignment after
    init

 arch/powerpc/include/asm/machdep.h     |  6 ++++--
 arch/powerpc/kernel/pci-common.c       |  9 +++++++++
 arch/powerpc/kernel/pci_64.c           |  4 ++++
 arch/powerpc/platforms/pseries/setup.c | 22 ++++++++++++++++++++++
 drivers/pci/pci.c                      |  9 +++++++--
 include/linux/pci.h                    |  1 +
 6 files changed, 47 insertions(+), 4 deletions(-)

-- 
2.20.1



* [PATCH v3 1/3] PCI: Introduce pcibios_ignore_alignment_request
From: Shawn Anastasio @ 2019-05-28  4:03 UTC
  To: linux-pci, linuxppc-dev
  Cc: sbobroff, linux-kernel, rppt, xyjxie, bhelgaas, paulus
In-Reply-To: <20190528040313.35582-1-shawn@anastas.io>

Introduce a new pcibios function pcibios_ignore_alignment_request
which allows the PCI core to defer to platform-specific code to
determine whether or not to ignore alignment requests for PCI resources.

The existing behavior is to simply ignore alignment requests when
PCI_PROBE_ONLY is set. This behavior is maintained by the
default implementation of pcibios_ignore_alignment_request.
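
The weak-default mechanism can be illustrated outside the kernel with the GCC/Clang weak attribute. This is a sketch under stated assumptions: sketch_pci_probe_only models pci_has_flag(PCI_PROBE_ONLY), and both sketch_* names are hypothetical.

```c
#include <stdbool.h>

static bool sketch_pci_probe_only;	/* models pci_has_flag(PCI_PROBE_ONLY) */

/*
 * Default policy. It is marked weak so that another object file can
 * supply a strong definition of the same name, which the linker will
 * then prefer over this one.
 */
__attribute__((weak)) int sketch_pcibios_ignore_alignment_request(void)
{
	return sketch_pci_probe_only;
}
```

With no strong definition linked in, callers get the default policy, which matches the existing PCI_PROBE_ONLY behavior.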

Signed-off-by: Shawn Anastasio <shawn@anastas.io>
---
 drivers/pci/pci.c   | 9 +++++++--
 include/linux/pci.h | 1 +
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8abc843b1615..8207a09085d1 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5882,6 +5882,11 @@ resource_size_t __weak pcibios_default_alignment(void)
 	return 0;
 }
 
+int __weak pcibios_ignore_alignment_request(void)
+{
+	return pci_has_flag(PCI_PROBE_ONLY);
+}
+
 #define RESOURCE_ALIGNMENT_PARAM_SIZE COMMAND_LINE_SIZE
 static char resource_alignment_param[RESOURCE_ALIGNMENT_PARAM_SIZE] = {0};
 static DEFINE_SPINLOCK(resource_alignment_lock);
@@ -5906,9 +5911,9 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
 	p = resource_alignment_param;
 	if (!*p && !align)
 		goto out;
-	if (pci_has_flag(PCI_PROBE_ONLY)) {
+	if (pcibios_ignore_alignment_request()) {
 		align = 0;
-		pr_info_once("PCI: Ignoring requested alignments (PCI_PROBE_ONLY)\n");
+		pr_info_once("PCI: Ignoring requested alignments\n");
 		goto out;
 	}
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 4a5a84d7bdd4..47471dcdbaf9 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1990,6 +1990,7 @@ static inline void pcibios_penalize_isa_irq(int irq, int active) {}
 int pcibios_alloc_irq(struct pci_dev *dev);
 void pcibios_free_irq(struct pci_dev *dev);
 resource_size_t pcibios_default_alignment(void);
+int pcibios_ignore_alignment_request(void);
 
 #ifdef CONFIG_HIBERNATE_CALLBACKS
 extern struct dev_pm_ops pcibios_pm_ops;
-- 
2.20.1



* [PATCH v3 2/3] powerpc/64: Enable pcibios_after_init hook on ppc64
From: Shawn Anastasio @ 2019-05-28  4:03 UTC
  To: linux-pci, linuxppc-dev
  Cc: sbobroff, linux-kernel, rppt, xyjxie, bhelgaas, paulus
In-Reply-To: <20190528040313.35582-1-shawn@anastas.io>

Enable the pcibios_after_init hook on all powerpc platforms.
This hook is executed at the end of pcibios_init and was previously
only available on CONFIG_PPC32.

Since it is useful and not inherently limited to 32-bit mode,
remove the limitation and allow it on all powerpc platforms.

Signed-off-by: Shawn Anastasio <shawn@anastas.io>
---
 arch/powerpc/include/asm/machdep.h | 3 +--
 arch/powerpc/kernel/pci_64.c       | 4 ++++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 2f0ca6560e47..2fbfaa9176ed 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -150,6 +150,7 @@ struct machdep_calls {
 	void		(*init)(void);
 
 	void		(*kgdb_map_scc)(void);
+#endif /* CONFIG_PPC32 */
 
 	/*
 	 * optional PCI "hooks"
@@ -157,8 +158,6 @@ struct machdep_calls {
 	/* Called at then very end of pcibios_init() */
 	void (*pcibios_after_init)(void);
 
-#endif /* CONFIG_PPC32 */
-
 	/* Called in indirect_* to avoid touching devices */
 	int (*pci_exclude_device)(struct pci_controller *, unsigned char, unsigned char);
 
diff --git a/arch/powerpc/kernel/pci_64.c b/arch/powerpc/kernel/pci_64.c
index 9d8c10d55407..fba7fe6e4a50 100644
--- a/arch/powerpc/kernel/pci_64.c
+++ b/arch/powerpc/kernel/pci_64.c
@@ -68,6 +68,10 @@ static int __init pcibios_init(void)
 
 	printk(KERN_DEBUG "PCI: Probing PCI hardware done\n");
 
+	/* Call machine dependent post-init code */
+	if (ppc_md.pcibios_after_init)
+		ppc_md.pcibios_after_init();
+
 	return 0;
 }
 
-- 
2.20.1



* Re: [RESEND PATCH 0/3] Allow custom PCI resource alignment on pseries
From: Shawn Anastasio @ 2019-05-28  4:09 UTC
  To: Oliver
  Cc: Sam Bobroff, linux-pci, Linux Kernel Mailing List, rppt,
	Alexey Kardashevskiy, Paul Mackerras, Bjorn Helgaas, xyjxie,
	linuxppc-dev
In-Reply-To: <CAOSf1CFFyz0YNqdpd5r44MaBV449yoK3WOMBZ1mpgZ=judNfDQ@mail.gmail.com>



On 5/27/19 11:01 PM, Oliver wrote:
> On Tue, May 28, 2019 at 8:56 AM Shawn Anastasio <shawn@anastas.io> wrote:
>>
>> Hello all,
>>
>> This patch set implements support for user-specified PCI resource
>> alignment on the pseries platform for hotplugged PCI devices.
>> Currently on pseries, PCI resource alignments specified with the
>> pci=resource_alignment commandline argument are ignored, since
>> the firmware is in charge of managing the PCI resources. In the
>> case of hotplugged devices, though, the kernel is in charge of
>> configuring the resources and should obey alignment requirements.
> 
> Are you using hotplug to work around SLOF (the OF we use under qemu)
> not aligning BARs to 64K? It looks like there is a commit in SLOF to
> fix that (https://git.qemu.org/?p=SLOF.git;a=commit;f=board-qemu/slof/pci-phb.fs;h=1903174472f8800caf50c959b304501b4c01153c).
> 

No, my application actually requires PCI hotplug at run-time.

>> The current behavior of ignoring the alignment for hotplugged devices
>> results in sub-page BARs landing between page boundaries and
>> becoming un-mappable from userspace via the VFIO framework.
>> This issue was observed on a pseries KVM guest with hotplugged
>> ivshmem devices.
> 
>> With these changes, users can specify an appropriate
>> pci=resource_alignment argument on boot for devices they wish to use
>> with VFIO.
>>
>> In the future, this could be extended to provide page-aligned
>> resources by default for hotplugged devices, similar to what is done
>> on powernv by commit 382746376993 ("powerpc/powernv: Override
>> pcibios_default_alignment() to force PCI devices to be page aligned").
> 
> Can we make aligning the BARs to PAGE_SIZE the default behaviour? The
> BAR assignment process is complex enough as-is so I'd rather we didn't
> add another platform hack into the mix.

Absolutely. This will still require the existing changes so that the 
custom alignment isn't flat-out ignored on pseries, but I can set
it to default to PAGE_SIZE as well, similar to how it's done on PowerNV.
I've just pushed a v3 to fix a typo and I'll incorporate this change
in v4.
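
As a rough illustration of what a PAGE_SIZE default would mean for a hotplugged BAR, here is a minimal userspace sketch. It assumes 64K pages; the sketch_* names are hypothetical and this is not the planned v4 code.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x10000ULL	/* assumption: 64K pages */

/* Models a platform that asks for page alignment by default. */
static uint64_t sketch_default_alignment(void)
{
	return SKETCH_PAGE_SIZE;
}

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t sketch_align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}
```

Under these assumptions, a BAR that would otherwise land at 0x20100 is moved up to 0x30000, while one already at 0x20000 stays where it is.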

>> Feedback is appreciated.
>>
>> Thanks,
>> Shawn
>>
>> Shawn Anastasio (3):
>>    PCI: Introduce pcibios_ignore_alignment_request
>>    powerpc/64: Enable pcibios_after_init hook on ppc64
>>    powerpc/pseries: Allow user-specified PCI resource alignment after
>>      init
>>
>>   arch/powerpc/include/asm/machdep.h     |  6 ++++--
>>   arch/powerpc/kernel/pci-common.c       |  9 +++++++++
>>   arch/powerpc/kernel/pci_64.c           |  4 ++++
>>   arch/powerpc/platforms/pseries/setup.c | 22 ++++++++++++++++++++++
>>   drivers/pci/pci.c                      |  9 +++++++--
>>   5 files changed, 46 insertions(+), 4 deletions(-)
>>
>> --
>> 2.20.1
>>


* Re: [TRIVIAL] [PATCH] powerpc/powernv-eeh: Consisely desribe what this file does
From: Russell Currey @ 2019-05-28  4:20 UTC
  To: Stewart Smith, linuxppc-dev; +Cc: sbobroff, oohall, paulus
In-Reply-To: <20190528032925.8836-1-stewart@linux.ibm.com>

On Tue, 2019-05-28 at 13:29 +1000, Stewart Smith wrote:
> If the previous comment made sense, continue debugging or call your
> doctor immediately.
> 
> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>

This reply intends to implement the ack dependent EEH patch on powernv
platform.  Actually, the reply was created in order to fully ack
support.

Fully-ack-supported-by: Russell Currey <ruscur@russell.cc>




* Re: [TRIVIAL] [PATCH] powerpc/powernv-eeh: Consisely desribe what this file does
From: Stewart Smith @ 2019-05-28  4:43 UTC
  To: Oliver; +Cc: Sam Bobroff, Paul Mackerras, linuxppc-dev
In-Reply-To: <CAOSf1CHj0p8vgc710hFyT771T52zc0mm3UDu=MV1x39m1Ux3cg@mail.gmail.com>

Oliver <oohall@gmail.com> writes:

> On Tue, May 28, 2019 at 1:29 PM Stewart Smith <stewart@linux.ibm.com> wrote:
>>
>> If the previous comment made sense, continue debugging or call your
>> doctor immediately.
>>
>> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
>> ---
>>  arch/powerpc/platforms/powernv/eeh-powernv.c | 4 +---
>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>> index f38078976c5d..bea6708be065 100644
>> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>> @@ -1,7 +1,5 @@
>>  /*
>> - * The file intends to implement the platform dependent EEH operations on
>> - * powernv platform. Actually, the powernv was created in order to fully
>> - * hypervisor support.
>> + * PowerNV Platform dependent EEH operations
>>   *
>>   * Copyright Benjamin Herrenschmidt & Gavin Shan, IBM Corporation 2013.
>
> Stewart, thanks for fixing it up. While you're at it, please replace
> the maintainer with yourself.

This message intends to implement the middle raising operations on the
finger platform. Actually, the EEH was created in order to fully
phalange extension.

:)

-- 
Stewart Smith
OPAL Architect, IBM.



* Re: [RESEND PATCH 0/3] Allow custom PCI resource alignment on pseries
From: Oliver @ 2019-05-28  4:52 UTC
  To: Shawn Anastasio
  Cc: Sam Bobroff, linux-pci, Linux Kernel Mailing List, rppt,
	Alexey Kardashevskiy, Paul Mackerras, Bjorn Helgaas, xyjxie,
	linuxppc-dev
In-Reply-To: <476555b7-b462-d844-57ea-7ca1c6113d9b@anastas.io>

On Tue, May 28, 2019 at 2:09 PM Shawn Anastasio <shawn@anastas.io> wrote:
>
>
>
> On 5/27/19 11:01 PM, Oliver wrote:
> > On Tue, May 28, 2019 at 8:56 AM Shawn Anastasio <shawn@anastas.io> wrote:
> >>
> >> Hello all,
> >>
> >> This patch set implements support for user-specified PCI resource
> >> alignment on the pseries platform for hotplugged PCI devices.
> >> Currently on pseries, PCI resource alignments specified with the
> >> pci=resource_alignment commandline argument are ignored, since
> >> the firmware is in charge of managing the PCI resources. In the
> >> case of hotplugged devices, though, the kernel is in charge of
> >> configuring the resources and should obey alignment requirements.
> >
> > Are you using hotplug to work around SLOF (the OF we use under qemu)
> > not aligning BARs to 64K? It looks like there is a commit in SLOF to
> > fix that (https://git.qemu.org/?p=SLOF.git;a=commit;f=board-qemu/slof/pci-phb.fs;h=1903174472f8800caf50c959b304501b4c01153c).
> >
>
> No, my application actually requires PCI hotplug at run-time.
>
> >> The current behavior of ignoring the alignment for hotplugged devices
> >> results in sub-page BARs landing between page boundaries and
> >> becoming un-mappable from userspace via the VFIO framework.
> >> This issue was observed on a pseries KVM guest with hotplugged
> >> ivshmem devices.
> >
> >> With these changes, users can specify an appropriate
> >> pci=resource_alignment argument on boot for devices they wish to use
> >> with VFIO.
> >>
> >> In the future, this could be extended to provide page-aligned
> >> resources by default for hotplugged devices, similar to what is done
> >> on powernv by commit 382746376993 ("powerpc/powernv: Override
> >> pcibios_default_alignment() to force PCI devices to be page aligned").
> >
> > Can we make aligning the BARs to PAGE_SIZE the default behaviour? The
> > BAR assignment process is complex enough as-is so I'd rather we didn't
> > add another platform hack into the mix.
>
> Absolutely. This will still require the existing changes so that the
> custom alignment isn't flat-out ignored on pseries, but I can set
> it to default to PAGE_SIZE as well, similar to how it's done on PowerNV.
> I've just pushed a v3 to fix a typo and I'll incorporate this change
> in v4.

I was thinking we could get rid of the ppc_md callback and do it in
kernel/pci-common.c. PowerNV is the only platform that implements the
callback and the pseries implementation is going to be identical so I
don't think there's much point in keeping the callback.
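
A rough user-space sketch of the suggestion above, assuming 64K pages. All names here (MODEL_PAGE_SIZE, model_default_alignment, model_align_up, model_bar_is_page_aligned) are hypothetical and only model the idea of one common function returning the page size instead of a per-platform callback; this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64K pages, a common ppc64 configuration. */
#define MODEL_PAGE_SIZE UINT64_C(0x10000)

/* Hypothetical stand-in for a single, common default-alignment function. */
static uint64_t model_default_alignment(void)
{
	return MODEL_PAGE_SIZE;
}

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t model_align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/* True when a BAR starts exactly on a page boundary in this model. */
static bool model_bar_is_page_aligned(uint64_t start)
{
	return (start & (MODEL_PAGE_SIZE - 1)) == 0;
}
```

With these helpers, a sub-page BAR that would start at 0x200081000 is moved up to 0x200090000, the next 64K boundary.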

> >> Feedback is appreciated.
> >>
> >> Thanks,
> >> Shawn
> >>
> >> Shawn Anastasio (3):
> >>    PCI: Introduce pcibios_ignore_alignment_request
> >>    powerpc/64: Enable pcibios_after_init hook on ppc64
> >>    powerpc/pseries: Allow user-specified PCI resource alignment after
> >>      init
> >>
> >>   arch/powerpc/include/asm/machdep.h     |  6 ++++--
> >>   arch/powerpc/kernel/pci-common.c       |  9 +++++++++
> >>   arch/powerpc/kernel/pci_64.c           |  4 ++++
> >>   arch/powerpc/platforms/pseries/setup.c | 22 ++++++++++++++++++++++
> >>   drivers/pci/pci.c                      |  9 +++++++--
> >>   5 files changed, 46 insertions(+), 4 deletions(-)
> >>
> >> --
> >> 2.20.1
> >>


* Re: kmemleak: 1157 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
From: Michael Ellerman @ 2019-05-28  5:21 UTC (permalink / raw)
  To: Mathieu Malaterre, linuxppc-dev
In-Reply-To: <CA+7wUsw_jkgWfknXbpK7_yfy=S5Y0jvQe1KP3kM-LT8fFnUF5g@mail.gmail.com>

Mathieu Malaterre <malat@debian.org> writes:
> Hi there,
>
> Is there a way to dump more context (somewhere in of tree
> flattening?). I cannot make sense of the following:

Hmm. Not that I know of.

Those don't look related to OF flattening/unflattening. That's just
sysfs setup based on the unflattened device tree.

The allocations are happening in safe_name() AFAICS.

int __of_add_property_sysfs(struct device_node *np, struct property *pp)
{
	...
	pp->attr.attr.name = safe_name(&np->kobj, pp->name);

And the free is in __of_sysfs_remove_bin_file():

void __of_sysfs_remove_bin_file(struct device_node *np, struct property *prop)
{
	if (!IS_ENABLED(CONFIG_SYSFS))
		return;

	sysfs_remove_bin_file(&np->kobj, &prop->attr);
	kfree(prop->attr.attr.name);


There is this check which could be failing leading to us not calling the
free at all:

void __of_remove_property_sysfs(struct device_node *np, struct property *prop)
{
	/* at early boot, bail here and defer setup to of_init() */
	if (of_kset && of_node_is_attached(np))
		__of_sysfs_remove_bin_file(np, prop);
}


So maybe stick a printk() in there to see if you're hitting that
condition, eg something like:

	if (of_kset && of_node_is_attached(np))
		__of_sysfs_remove_bin_file(np, prop);
	else
		printk("%s: leaking prop %s on node %pOF\n", __func__, prop->attr.attr.name, np);
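
To make the suspected pattern concrete, here is a small user-space model. It is only a sketch under assumptions: struct model_prop and the model_* functions are hypothetical stand-ins, and the two boolean parameters stand in for the of_kset and of_node_is_attached() checks quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct property; only the name copy matters. */
struct model_prop {
	char *attr_name;
};

/* Models the add path: always makes a heap copy of the name. */
static int model_add_property(struct model_prop *pp, const char *name)
{
	size_t len = strlen(name) + 1;

	pp->attr_name = malloc(len);
	if (!pp->attr_name)
		return -1;
	memcpy(pp->attr_name, name, len);
	return 0;
}

/*
 * Models the guarded remove path. Returns true when the copy was freed,
 * false when the guard failed and the copy stays allocated.
 */
static bool model_remove_property(struct model_prop *pp, bool kset_ready,
				  bool node_attached)
{
	assert(pp != NULL);
	if (kset_ready && node_attached) {
		free(pp->attr_name);
		pp->attr_name = NULL;
		return true;
	}
	return false;
}
```

If the remove path is reached while either condition is false, the copy made on the add path is never released, which is what the extra printk() would reveal.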


cheers

> kmemleak: 1157 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
>
> Where:
>
> # head -40 /sys/kernel/debug/kmemleak
> unreferenced object 0xdf44d180 (size 8):
>   comm "swapper", pid 1, jiffies 4294892297 (age 4766.460s)
>   hex dump (first 8 bytes):
>     62 61 73 65 00 00 00 00                          base....
>   backtrace:
>     [<0ca59825>] kstrdup+0x4c/0xb8
>     [<c8a79377>] kobject_set_name_vargs+0x34/0xc8
>     [<661b4c86>] kobject_add+0x78/0x120
>     [<c1416f27>] __of_attach_node_sysfs+0xa0/0x14c
>     [<2a143d10>] of_core_init+0x90/0x114
>     [<a353d0e1>] driver_init+0x30/0x48
>     [<84ed01b1>] kernel_init_freeable+0xfc/0x3fc
>     [<dc60f815>] kernel_init+0x20/0x110
>     [<faa1c5b0>] ret_from_kernel_thread+0x14/0x1c
> unreferenced object 0xdf44d178 (size 8):
>   comm "swapper", pid 1, jiffies 4294892297 (age 4766.460s)
>   hex dump (first 8 bytes):
>     6d 6f 64 65 6c 00 97 c8                          model...
>   backtrace:
>     [<0ca59825>] kstrdup+0x4c/0xb8
>     [<0eeb0a3b>] __of_add_property_sysfs+0x88/0x12c
>     [<f6c64af0>] __of_attach_node_sysfs+0xcc/0x14c
>     [<2a143d10>] of_core_init+0x90/0x114
>     [<a353d0e1>] driver_init+0x30/0x48
>     [<84ed01b1>] kernel_init_freeable+0xfc/0x3fc
>     [<dc60f815>] kernel_init+0x20/0x110
>     [<faa1c5b0>] ret_from_kernel_thread+0x14/0x1c
> unreferenced object 0xdf4021e0 (size 16):
>   comm "swapper", pid 1, jiffies 4294892297 (age 4766.460s)
>   hex dump (first 16 bytes):
>     63 6f 6d 70 61 74 69 62 6c 65 00 01 00 00 00 00  compatible......
>   backtrace:
>     [<0ca59825>] kstrdup+0x4c/0xb8
>     [<0eeb0a3b>] __of_add_property_sysfs+0x88/0x12c
>     [<f6c64af0>] __of_attach_node_sysfs+0xcc/0x14c
>     [<2a143d10>] of_core_init+0x90/0x114
>     [<a353d0e1>] driver_init+0x30/0x48
>     [<84ed01b1>] kernel_init_freeable+0xfc/0x3fc
>     [<dc60f815>] kernel_init+0x20/0x110
>     [<faa1c5b0>] ret_from_kernel_thread+0x14/0x1c


* [PATCH] powerpc/mm: Move some of the boot time info print to generic file
From: Aneesh Kumar K.V @ 2019-05-28  5:35 UTC (permalink / raw)
  To: npiggin, paulus, mpe; +Cc: Aneesh Kumar K.V, linuxppc-dev

With radix translation enabled we find in dmesg

 hash-mmu: ppc64_pft_size    = 0x0
 hash-mmu: kernel vmalloc start   = 0xc008000000000000
 hash-mmu: kernel IO start        = 0xc00a000000000000
 hash-mmu: kernel vmemmap start   = 0xc00c000000000000

This is because these pr_info calls are in hash_utils.c which has

 #define pr_fmt(fmt) "hash-mmu: " fmt

The information printed is generic, hence move it to a generic file.
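
For readers unfamiliar with the mechanism, the prefixing can be imitated in user space. This is only a sketch: model_pr_info and model_print_vmalloc_start are hypothetical names, and snprintf() stands in for the kernel's logging.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: a user-space imitation of a file-wide pr_fmt() definition. */
#define pr_fmt(fmt) "hash-mmu: " fmt

/* Format into buf the way pr_info() would, with the prefix pasted in front. */
#define model_pr_info(buf, size, fmt, ...) \
	snprintf(buf, size, pr_fmt(fmt), __VA_ARGS__)

/* Every message formatted from this file picks up the "hash-mmu: " prefix. */
static int model_print_vmalloc_start(char *buf, size_t size, unsigned long addr)
{
	assert(buf != NULL && size > 0);
	return model_pr_info(buf, size, "kernel vmalloc start   = 0x%lx\n", addr);
}
```

Because the prefix is applied by string-literal concatenation at compile time, it follows the file the call lives in, not the MMU mode in use at run time.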

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 arch/powerpc/kernel/setup-common.c    | 4 ++++
 arch/powerpc/mm/book3s64/hash_utils.c | 5 -----
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index aad9f5df6ab6..a73a91f2c21f 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -810,6 +810,10 @@ static __init void print_system_info(void)
 	pr_info("mmu_features      = 0x%08x\n", cur_cpu_spec->mmu_features);
 #ifdef CONFIG_PPC64
 	pr_info("firmware_features = 0x%016lx\n", powerpc_firmware_features);
+	pr_info("ppc64_pft_size    = 0x%llx\n", ppc64_pft_size);
+	pr_info("kernel vmalloc start   = 0x%lx\n", KERN_VIRT_START);
+	pr_info("kernel IO start        = 0x%lx\n", KERN_IO_START);
+	pr_info("kernel vmemmap start   = 0x%lx\n", (unsigned long)vmemmap);
 #endif
 
 	print_system_hash_info();
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 919a861a8ec0..2f677914bfd2 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -1950,11 +1950,6 @@ machine_device_initcall(pseries, hash64_debugfs);
 
 void __init print_system_hash_info(void)
 {
-	pr_info("ppc64_pft_size    = 0x%llx\n", ppc64_pft_size);
-
 	if (htab_hash_mask)
 		pr_info("htab_hash_mask    = 0x%lx\n", htab_hash_mask);
-	pr_info("kernel vmalloc start   = 0x%lx\n", KERN_VIRT_START);
-	pr_info("kernel IO start        = 0x%lx\n", KERN_IO_START);
-	pr_info("kernel vmemmap start   = 0x%lx\n", (unsigned long)vmemmap);
 }
-- 
2.21.0



* [PATCH v2 1/3] powerpc/mm: Handle page table allocation failures
From: Aneesh Kumar K.V @ 2019-05-28  5:36 UTC (permalink / raw)
  To: npiggin, paulus, mpe; +Cc: Aneesh Kumar K.V, linuxppc-dev

This fixes a kernel crash caused by not handling page table allocation
failures while allocating a hugetlb page table.

Fixes: e2b3d202d1db ("powerpc: Switch 16GB and 16MB explicit hugepages to a different page table format")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 arch/powerpc/mm/hugetlbpage.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index b5d92dc32844..1de0f43a68e5 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -130,6 +130,8 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
 	} else {
 		pdshift = PUD_SHIFT;
 		pu = pud_alloc(mm, pg, addr);
+		if (!pu)
+			return NULL;
 		if (pshift == PUD_SHIFT)
 			return (pte_t *)pu;
 		else if (pshift > PMD_SHIFT) {
@@ -138,6 +140,8 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
 		} else {
 			pdshift = PMD_SHIFT;
 			pm = pmd_alloc(mm, pu, addr);
+			if (!pm)
+				return NULL;
 			if (pshift == PMD_SHIFT)
 				/* 16MB hugepage */
 				return (pte_t *)pm;
@@ -154,12 +158,16 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
 	} else {
 		pdshift = PUD_SHIFT;
 		pu = pud_alloc(mm, pg, addr);
+		if (!pu)
+			return NULL;
 		if (pshift >= PUD_SHIFT) {
 			ptl = pud_lockptr(mm, pu);
 			hpdp = (hugepd_t *)pu;
 		} else {
 			pdshift = PMD_SHIFT;
 			pm = pmd_alloc(mm, pu, addr);
+			if (!pm)
+				return NULL;
 			ptl = pmd_lockptr(mm, pm);
 			hpdp = (hugepd_t *)pm;
 		}
-- 
2.21.0



* [PATCH v2 3/3] powerpc/mm/hugetlb: Don't enable HugeTLB if we don't have a page table cache
From: Aneesh Kumar K.V @ 2019-05-28  5:36 UTC (permalink / raw)
  To: npiggin, paulus, mpe; +Cc: Aneesh Kumar K.V, linuxppc-dev
In-Reply-To: <20190528053626.2756-1-aneesh.kumar@linux.ibm.com>

This makes sure we don't enable HugeTLB if the cache is not configured.
I am still not sure about this. IMHO hugetlb support should be a hardware
support derivative and any cache allocation failure should be handled as I did
in the earlier patch. But then if we were not able to create hugetlb page table
cache, we can as well declare hugetlb support disabled thereby avoiding calling
into allocation routines.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 arch/powerpc/mm/hugetlbpage.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index f55dc110f2ad..d34540479b1a 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -601,6 +601,7 @@ __setup("hugepagesz=", hugepage_setup_sz);
 
 static int __init hugetlbpage_init(void)
 {
+	bool configured = false;
 	int psize;
 
 	if (hugetlb_disabled) {
@@ -651,10 +652,15 @@ static int __init hugetlbpage_init(void)
 			pgtable_cache_add(pdshift - shift);
 		else if (IS_ENABLED(CONFIG_PPC_FSL_BOOK3E) || IS_ENABLED(CONFIG_PPC_8xx))
 			pgtable_cache_add(PTE_T_ORDER);
+
+		configured = true;
 	}
 
-	if (IS_ENABLED(CONFIG_HUGETLB_PAGE_SIZE_VARIABLE))
-		hugetlbpage_init_default();
+	if (configured) {
+		if (IS_ENABLED(CONFIG_HUGETLB_PAGE_SIZE_VARIABLE))
+			hugetlbpage_init_default();
+	} else
+		pr_info("Failed to initialize. Disabling HugeTLB\n");
 
 	return 0;
 }
-- 
2.21.0



* [PATCH v2 2/3] powerpc/mm/hugetlb: Fix kernel crash if we fail to allocate page table caches
From: Aneesh Kumar K.V @ 2019-05-28  5:36 UTC (permalink / raw)
  To: npiggin, paulus, mpe; +Cc: Aneesh Kumar K.V, linuxppc-dev
In-Reply-To: <20190528053626.2756-1-aneesh.kumar@linux.ibm.com>

We only check for hugetlb allocations, because with hugetlb we do conditional
registration. For PGD/PUD/PMD levels we register them always in
pgtable_cache_init.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 arch/powerpc/mm/hugetlbpage.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 1de0f43a68e5..f55dc110f2ad 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -61,12 +61,17 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
 		num_hugepd = 1;
 	}
 
+	if (!cachep) {
+		WARN_ONCE(1, "No page table cache created for hugetlb tables");
+		return -ENOMEM;
+	}
+
 	new = kmem_cache_alloc(cachep, pgtable_gfp_flags(mm, GFP_KERNEL));
 
 	BUG_ON(pshift > HUGEPD_SHIFT_MASK);
 	BUG_ON((unsigned long)new & HUGEPD_SHIFT_MASK);
 
-	if (! new)
+	if (!new)
 		return -ENOMEM;
 
 	/*
-- 
2.21.0



* Re: [PATCH v3 1/3] PCI: Introduce pcibios_ignore_alignment_request
From: Oliver @ 2019-05-28  5:36 UTC (permalink / raw)
  To: Shawn Anastasio
  Cc: Sam Bobroff, linux-pci, Linux Kernel Mailing List, rppt,
	Paul Mackerras, Bjorn Helgaas, xyjxie, linuxppc-dev
In-Reply-To: <20190528040313.35582-2-shawn@anastas.io>

On Tue, May 28, 2019 at 2:03 PM Shawn Anastasio <shawn@anastas.io> wrote:
>
> Introduce a new pcibios function pcibios_ignore_alignment_request
> which allows the PCI core to defer to platform-specific code to
> determine whether or not to ignore alignment requests for PCI resources.
>
> The existing behavior is to simply ignore alignment requests when
> PCI_PROBE_ONLY is set. This behavior is maintained by the
> default implementation of pcibios_ignore_alignment_request.
>
> Signed-off-by: Shawn Anastasio <shawn@anastas.io>
> ---
>  drivers/pci/pci.c   | 9 +++++++--
>  include/linux/pci.h | 1 +
>  2 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 8abc843b1615..8207a09085d1 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5882,6 +5882,11 @@ resource_size_t __weak pcibios_default_alignment(void)
>         return 0;
>  }
>
> +int __weak pcibios_ignore_alignment_request(void)
> +{
> +       return pci_has_flag(PCI_PROBE_ONLY);
> +}
> +
>  #define RESOURCE_ALIGNMENT_PARAM_SIZE COMMAND_LINE_SIZE
>  static char resource_alignment_param[RESOURCE_ALIGNMENT_PARAM_SIZE] = {0};
>  static DEFINE_SPINLOCK(resource_alignment_lock);
> @@ -5906,9 +5911,9 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>         p = resource_alignment_param;
>         if (!*p && !align)
>                 goto out;
> -       if (pci_has_flag(PCI_PROBE_ONLY)) {
> +       if (pcibios_ignore_alignment_request()) {
>                 align = 0;
> -               pr_info_once("PCI: Ignoring requested alignments (PCI_PROBE_ONLY)\n");
> +               pr_info_once("PCI: Ignoring requested alignments\n");
>                 goto out;
>         }

I think the logic here is questionable to begin with. If the user has
explicitly requested re-aligning a resource via the command line then
we should probably do it even if PCI_PROBE_ONLY is set. When it breaks
they get to keep the pieces.
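
The difference between the quoted hunk and the behaviour suggested above can be sketched as two predicates. This is a user-space model with hypothetical names (struct model_align_ctx, model_ignore_current, model_ignore_suggested); the real code consults pci_has_flag() and the pci=resource_alignment string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inputs for the model. */
struct model_align_ctx {
	bool probe_only;        /* models PCI_PROBE_ONLY being set */
	const char *user_param; /* models pci=resource_alignment, "" if unset */
};

/* Behaviour in the quoted hunk: PCI_PROBE_ONLY always wins. */
static bool model_ignore_current(const struct model_align_ctx *ctx)
{
	assert(ctx != NULL);
	return ctx->probe_only;
}

/* Suggested behaviour: an explicit user request overrides PCI_PROBE_ONLY. */
static bool model_ignore_suggested(const struct model_align_ctx *ctx)
{
	assert(ctx != NULL && ctx->user_param != NULL);
	return ctx->probe_only && ctx->user_param[0] == '\0';
}
```

The two predicates only disagree when PCI_PROBE_ONLY is set and the user passed an alignment request, which is exactly the case under discussion.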

That said, the real issue here is that PCI_PROBE_ONLY probably
shouldn't be set under qemu/kvm. Under the other hypervisor (PowerVM)
hotplugged devices are configured by firmware before it's passed to
the guest and we need to keep the FW assignments otherwise things
break. QEMU however doesn't do any BAR assignments and relies on that
being handled by the guest. At boot time this is done by SLOF, but
Linux only keeps SLOF around until it's extracted the device-tree.
Once that's done SLOF gets blown away and the kernel needs to do its
own BAR assignments. I'm guessing there's a hack in there to make it
work today, but it's a little surprising that it works at all...

IIRC Sam Bobroff was looking at hotplug under pseries recently so he
might have something to add. He's sick at the moment, but I'll ask him
to take a look at this once he's back among the living.

> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 4a5a84d7bdd4..47471dcdbaf9 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1990,6 +1990,7 @@ static inline void pcibios_penalize_isa_irq(int irq, int active) {}
>  int pcibios_alloc_irq(struct pci_dev *dev);
>  void pcibios_free_irq(struct pci_dev *dev);
>  resource_size_t pcibios_default_alignment(void);
> +int pcibios_ignore_alignment_request(void);
>
>  #ifdef CONFIG_HIBERNATE_CALLBACKS
>  extern struct dev_pm_ops pcibios_pm_ops;
> --
> 2.20.1
>


* Re: [PATCH v3 14/16] powerpc/32: implement fast entry for syscalls on BOOKE
From: Michael Ellerman @ 2019-05-28  5:37 UTC (permalink / raw)
  To: Christophe Leroy, Paul Mackerras
  Cc: linuxppc-dev, linux-kernel, Nicholas Piggin
In-Reply-To: <58f0e70f-ed9d-965e-e8d2-cc5d13a4c9eb@c-s.fr>

Christophe Leroy <christophe.leroy@c-s.fr> writes:
> Le 23/05/2019 à 09:00, Christophe Leroy a écrit :
>
> [...]
>
>>> arch/powerpc/kernel/head_fsl_booke.o: In function `SystemCall':
>>> arch/powerpc/kernel/head_fsl_booke.S:416: undefined reference to 
>>> `kvmppc_handler_BOOKE_INTERRUPT_SYSCALL_SPRN_SRR1'
>>> Makefile:1052: recipe for target 'vmlinux' failed
>>>
>>>> +.macro SYSCALL_ENTRY trapno intno
>>>> +    mfspr    r10, SPRN_SPRG_THREAD
>>>> +#ifdef CONFIG_KVM_BOOKE_HV
>>>> +BEGIN_FTR_SECTION
>>>> +    mtspr    SPRN_SPRG_WSCRATCH0, r10
>>>> +    stw    r11, THREAD_NORMSAVE(0)(r10)
>>>> +    stw    r13, THREAD_NORMSAVE(2)(r10)
>>>> +    mfcr    r13            /* save CR in r13 for now       */
>>>> +    mfspr    r11, SPRN_SRR1
>>>> +    mtocrf    0x80, r11    /* check MSR[GS] without clobbering reg */
>>>> +    bf    3, 1975f
>>>> +    b    kvmppc_handler_BOOKE_INTERRUPT_\intno\()_SPRN_SRR1
>>>
>>> It seems to me that the "_SPRN_SRR1" on the end of this line
>>> isn't meant to be there...  However, it still fails to link with that
>>> removed.
>
> It looks like I missed the macro expansion.
>
> The called function should be kvmppc_handler_8_0x01B
>
> Seems like kisskb doesn't build any config like this.

I thought we did, ie:

http://kisskb.ellerman.id.au/kisskb/buildresult/13817941/

But clearly something is missing to trigger the bug.

cheers


* Re: [PATCH v3 1/3] PCI: Introduce pcibios_ignore_alignment_request
From: Shawn Anastasio @ 2019-05-28  5:50 UTC (permalink / raw)
  To: Oliver
  Cc: Sam Bobroff, linux-pci, Linux Kernel Mailing List, rppt,
	Paul Mackerras, Bjorn Helgaas, xyjxie, linuxppc-dev
In-Reply-To: <CAOSf1CEFfbmwfvmdqT1xdt8SFb=tYdYXLfXeyZ8=iRnhg4a3Pg@mail.gmail.com>



On 5/28/19 12:36 AM, Oliver wrote:
> On Tue, May 28, 2019 at 2:03 PM Shawn Anastasio <shawn@anastas.io> wrote:
>>
>> Introduce a new pcibios function pcibios_ignore_alignment_request
>> which allows the PCI core to defer to platform-specific code to
>> determine whether or not to ignore alignment requests for PCI resources.
>>
>> The existing behavior is to simply ignore alignment requests when
>> PCI_PROBE_ONLY is set. This behavior is maintained by the
>> default implementation of pcibios_ignore_alignment_request.
>>
>> Signed-off-by: Shawn Anastasio <shawn@anastas.io>
>> ---
>>   drivers/pci/pci.c   | 9 +++++++--
>>   include/linux/pci.h | 1 +
>>   2 files changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 8abc843b1615..8207a09085d1 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -5882,6 +5882,11 @@ resource_size_t __weak pcibios_default_alignment(void)
>>          return 0;
>>   }
>>
>> +int __weak pcibios_ignore_alignment_request(void)
>> +{
>> +       return pci_has_flag(PCI_PROBE_ONLY);
>> +}
>> +
>>   #define RESOURCE_ALIGNMENT_PARAM_SIZE COMMAND_LINE_SIZE
>>   static char resource_alignment_param[RESOURCE_ALIGNMENT_PARAM_SIZE] = {0};
>>   static DEFINE_SPINLOCK(resource_alignment_lock);
>> @@ -5906,9 +5911,9 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>>          p = resource_alignment_param;
>>          if (!*p && !align)
>>                  goto out;
>> -       if (pci_has_flag(PCI_PROBE_ONLY)) {
>> +       if (pcibios_ignore_alignment_request()) {
>>                  align = 0;
>> -               pr_info_once("PCI: Ignoring requested alignments (PCI_PROBE_ONLY)\n");
>> +               pr_info_once("PCI: Ignoring requested alignments\n");
>>                  goto out;
>>          }
> 
> I think the logic here is questionable to begin with. If the user has
> explicitly requested re-aligning a resource via the command line then
> we should probably do it even if PCI_PROBE_ONLY is set. When it breaks
> they get to keep the pieces.
> 
> That said, the real issue here is that PCI_PROBE_ONLY probably
> shouldn't be set under qemu/kvm. Under the other hypervisor (PowerVM)
> hotplugged devices are configured by firmware before it's passed to
> the guest and we need to keep the FW assignments otherwise things
> break. QEMU however doesn't do any BAR assignments and relies on that
> being handled by the guest. At boot time this is done by SLOF, but
> Linux only keeps SLOF around until it's extracted the device-tree.
> Once that's done SLOF gets blown away and the kernel needs to do its
> own BAR assignments. I'm guessing there's a hack in there to make it
> work today, but it's a little surprising that it works at all...
Interesting, I wasn't aware that hotplugged devices are configured
by the hypervisor on PowerVM. That at least means that this patch is
wrong as-is since it won't handle that properly. Definitely seems like
there will need to be different behavior here depending on the hypervisor.

That being said, wouldn't PCI_PROBE_ONLY still be set on pseries/kvm
(at least for initial boot) to observe SLOF's original BAR assignments?
Perhaps it should be un-set after initial PCI init?
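
One way to picture the idea of un-setting the flag after initial PCI init, with behaviour that depends on the hypervisor, is the following user-space model. Everything here is an assumption for illustration: the model_* helpers, the enum, and the flag's bit value are hypothetical and do not come from the patch set.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit value chosen for this model only; not the kernel's definition. */
#define MODEL_PCI_PROBE_ONLY 0x1u

static unsigned int model_pci_flags;

static void model_add_flags(unsigned int f)
{
	model_pci_flags |= f;
}

static void model_clear_flags(unsigned int f)
{
	model_pci_flags &= ~f;
}

static bool model_has_flag(unsigned int f)
{
	return (model_pci_flags & f) != 0;
}

enum model_hypervisor { MODEL_HV_KVM, MODEL_HV_POWERVM };

/* Boot: keep the firmware's BAR assignments on either hypervisor. */
static void model_boot(void)
{
	model_add_flags(MODEL_PCI_PROBE_ONLY);
}

/* After initial PCI init: only a KVM guest takes over BAR assignment. */
static void model_after_init(enum model_hypervisor hv)
{
	assert(hv == MODEL_HV_KVM || hv == MODEL_HV_POWERVM);
	if (hv == MODEL_HV_KVM)
		model_clear_flags(MODEL_PCI_PROBE_ONLY);
}
```

In this model a PowerVM guest keeps the flag set, so firmware assignments for hotplugged devices stay untouched, while a KVM guest clears it once boot-time probing is done.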

> 
> IIRC Sam Bobroff was looking at hotplug under pseries recently so he
> might have something to add. He's sick at the moment, but I'll ask him
> to take a look at this once he's back among the living

Good to know. I'll await his comments before continuing here.

>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 4a5a84d7bdd4..47471dcdbaf9 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -1990,6 +1990,7 @@ static inline void pcibios_penalize_isa_irq(int irq, int active) {}
>>   int pcibios_alloc_irq(struct pci_dev *dev);
>>   void pcibios_free_irq(struct pci_dev *dev);
>>   resource_size_t pcibios_default_alignment(void);
>> +int pcibios_ignore_alignment_request(void);
>>
>>   #ifdef CONFIG_HIBERNATE_CALLBACKS
>>   extern struct dev_pm_ops pcibios_pm_ops;
>> --
>> 2.20.1
>>


* Re: [PATCH v4 3/3] kselftest: Extend vDSO selftest to clock_getres
From: Michael Ellerman @ 2019-05-28  6:19 UTC (permalink / raw)
  To: Vincenzo Frascino, linux-arch, linuxppc-dev, linux-s390,
	linux-kselftest
  Cc: Arnd Bergmann, Heiko Carstens, Paul Mackerras, Martin Schwidefsky,
	Thomas Gleixner, vincenzo.frascino, Shuah Khan
In-Reply-To: <20190523112116.19233-4-vincenzo.frascino@arm.com>

Vincenzo Frascino <vincenzo.frascino@arm.com> writes:

> The current version of the multiarch vDSO selftest verifies only
> gettimeofday.
>
> Extend the vDSO selftest to clock_getres, to verify that the
> syscall and the vDSO library function return the same information.
>
> The extension has been used to verify the hrtimer_resolution fix.

This is passing for me even without patch 1 applied, shouldn't it fail
without the fix? What am I missing?

# uname -r
5.2.0-rc2-gcc-8.2.0

# ./vdso_clock_getres
clock_id: CLOCK_REALTIME [PASS]
clock_id: CLOCK_BOOTTIME [PASS]
clock_id: CLOCK_TAI [PASS]
clock_id: CLOCK_REALTIME_COARSE [PASS]
clock_id: CLOCK_MONOTONIC [PASS]
clock_id: CLOCK_MONOTONIC_RAW [PASS]
clock_id: CLOCK_MONOTONIC_COARSE [PASS]
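
For reference, the kind of comparison such a test performs can be sketched as follows. This is not the selftest's code: model_compare_clock_getres is a hypothetical helper, it assumes a 64-bit Linux system with glibc, and whether the C library call is actually served by the vDSO depends on the libc version and architecture.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/*
 * Query the resolution of clock id twice: through the C library (which may
 * be served by the vDSO) and through the raw system call. On success returns
 * 0 and sets *same to 1 if both answers agree, 0 otherwise.
 */
static int model_compare_clock_getres(clockid_t id, int *same)
{
	struct timespec lib, sys;

	assert(same != NULL);
	if (clock_getres(id, &lib) != 0)
		return -1;
	if (syscall(SYS_clock_getres, id, &sys) != 0)
		return -1;
	*same = (lib.tv_sec == sys.tv_sec && lib.tv_nsec == sys.tv_nsec);
	return 0;
}
```

A mismatch would only show up on a configuration where the two paths compute the resolution differently, which may be why a given machine passes with or without the fix.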

cheers

> Cc: Shuah Khan <shuah@kernel.org>
> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
> ---
>
> Note: This patch is independent from the others in this series, hence it
> can be merged singularly by the kselftest maintainers.
>
>  tools/testing/selftests/vDSO/Makefile         |   2 +
>  .../selftests/vDSO/vdso_clock_getres.c        | 124 ++++++++++++++++++
>  2 files changed, 126 insertions(+)
>  create mode 100644 tools/testing/selftests/vDSO/vdso_clock_getres.c


* Re: [PATCH v3 1/3] PCI: Introduce pcibios_ignore_alignment_request
From: Alexey Kardashevskiy @ 2019-05-28  6:27 UTC (permalink / raw)
  To: Oliver, Shawn Anastasio
  Cc: Sam Bobroff, linux-pci, Linux Kernel Mailing List, rppt,
	Paul Mackerras, Bjorn Helgaas, xyjxie, linuxppc-dev
In-Reply-To: <CAOSf1CEFfbmwfvmdqT1xdt8SFb=tYdYXLfXeyZ8=iRnhg4a3Pg@mail.gmail.com>



On 28/05/2019 15:36, Oliver wrote:
> On Tue, May 28, 2019 at 2:03 PM Shawn Anastasio <shawn@anastas.io> wrote:
>>
>> Introduce a new pcibios function pcibios_ignore_alignment_request
>> which allows the PCI core to defer to platform-specific code to
>> determine whether or not to ignore alignment requests for PCI resources.
>>
>> The existing behavior is to simply ignore alignment requests when
>> PCI_PROBE_ONLY is set. This behavior is maintained by the
>> default implementation of pcibios_ignore_alignment_request.
>>
>> Signed-off-by: Shawn Anastasio <shawn@anastas.io>
>> ---
>>  drivers/pci/pci.c   | 9 +++++++--
>>  include/linux/pci.h | 1 +
>>  2 files changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 8abc843b1615..8207a09085d1 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -5882,6 +5882,11 @@ resource_size_t __weak pcibios_default_alignment(void)
>>         return 0;
>>  }
>>
>> +int __weak pcibios_ignore_alignment_request(void)
>> +{
>> +       return pci_has_flag(PCI_PROBE_ONLY);
>> +}
>> +
>>  #define RESOURCE_ALIGNMENT_PARAM_SIZE COMMAND_LINE_SIZE
>>  static char resource_alignment_param[RESOURCE_ALIGNMENT_PARAM_SIZE] = {0};
>>  static DEFINE_SPINLOCK(resource_alignment_lock);
>> @@ -5906,9 +5911,9 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>>         p = resource_alignment_param;
>>         if (!*p && !align)
>>                 goto out;
>> -       if (pci_has_flag(PCI_PROBE_ONLY)) {
>> +       if (pcibios_ignore_alignment_request()) {
>>                 align = 0;
>> -               pr_info_once("PCI: Ignoring requested alignments (PCI_PROBE_ONLY)\n");
>> +               pr_info_once("PCI: Ignoring requested alignments\n");
>>                 goto out;
>>         }
> 
> I think the logic here is questionable to begin with. If the user has
> explicitly requested re-aligning a resource via the command line then
> we should probably do it even if PCI_PROBE_ONLY is set. When it breaks
> they get to keep the pieces.
> 
> That said, the real issue here is that PCI_PROBE_ONLY probably
> shouldn't be set under qemu/kvm. Under the other hypervisor (PowerVM)
> hotplugged devices are configured by firmware before it's passed to
> the guest and we need to keep the FW assignments otherwise things
> break. QEMU however doesn't do any BAR assignments and relies on that
> being handled by the guest. At boot time this is done by SLOF, but
> Linux only keeps SLOF around until it's extracted the device-tree.
> Once that's done SLOF gets blown away and the kernel needs to do its
> own BAR assignments. I'm guessing there's a hack in there to make it
> work today, but it's a little surprising that it works at all...


The hack is to run a modified, qemu-aware "/usr/sbin/rtas_errd" in the
guest. It receives an event from qemu (RAS_EPOW in /proc/interrupts),
fetches device tree chunks (as I understand it, these come with BARs
assigned under phyp but without them under qemu) and writes "1" to
"/sys/bus/pci/rescan", which eventually calls pci_assign_resource():

[c000000006e6f960] [c0000000005f62d4] pci_assign_resource+0x44/0x360
[c000000006e6fa10] [c0000000005f8b54] assign_requested_resources_sorted+0x84/0x110
[c000000006e6fa60] [c0000000005f9540] __assign_resources_sorted+0xd0/0x750
[c000000006e6fb40] [c0000000005fb2e0] __pci_bus_assign_resources+0x80/0x280
[c000000006e6fc00] [c0000000005fb95c] pci_assign_unassigned_bus_resources+0xbc/0x100
[c000000006e6fc60] [c0000000005e3d74] pci_rescan_bus+0x34/0x60
[c000000006e6fc90] [c0000000005f1ef4] rescan_store+0x84/0xc0
[c000000006e6fcd0] [c00000000068060c] bus_attr_store+0x3c/0x60
[c000000006e6fcf0] [c00000000037853c] sysfs_kf_write+0x5c/0x80
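
The final step of that sequence, writing "1" to the rescan file, can be sketched in user space as follows. model_trigger_rescan is a hypothetical helper, not code from rtas_errd; the path is a parameter so it can be tried against a scratch file, since writing to "/sys/bus/pci/rescan" on a real guest needs root.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write "1" to a sysfs-style trigger file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
static int model_trigger_rescan(const char *path)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = (fputs("1", f) >= 0);
	/* fclose() flushes the buffer, so a write error can surface here. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

On a guest, calling it with "/sys/bus/pci/rescan" would enter the kernel through sysfs_kf_write() and proceed up the call chain shown in the trace.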





> 
> IIRC Sam Bobroff was looking at hotplug under pseries recently so he
> might have something to add. He's sick at the moment, but I'll ask him
> to take a look at this once he's back among the living
> 
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 4a5a84d7bdd4..47471dcdbaf9 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -1990,6 +1990,7 @@ static inline void pcibios_penalize_isa_irq(int irq, int active) {}
>>  int pcibios_alloc_irq(struct pci_dev *dev);
>>  void pcibios_free_irq(struct pci_dev *dev);
>>  resource_size_t pcibios_default_alignment(void);
>> +int pcibios_ignore_alignment_request(void);
>>
>>  #ifdef CONFIG_HIBERNATE_CALLBACKS
>>  extern struct dev_pm_ops pcibios_pm_ops;
>> --
>> 2.20.1
>>

-- 
Alexey


