* [PATCH 2/2] powernv/npu: Add a debugfs setting to change ATSD threshold
2018-04-17 9:11 [PATCH 1/2] powernv/npu: Do a PID GPU TLB flush when invalidating a large address range Alistair Popple
@ 2018-04-17 9:11 ` Alistair Popple
2018-04-17 21:45 ` Balbir Singh
2018-07-23 15:11 ` [2/2] " Michael Ellerman
2018-04-17 9:17 ` [PATCH 1/2] powernv/npu: Do a PID GPU TLB flush when invalidating a large address range Balbir Singh
` (2 subsequent siblings)
3 siblings, 2 replies; 8+ messages in thread
From: Alistair Popple @ 2018-04-17 9:11 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: mhairgrove, arbab, bsingharora, Alistair Popple
The threshold at which it becomes more efficient to coalesce a range of
ATSDs into a single per-PID ATSD is currently not well understood due to a
lack of real-world work loads. This patch adds a debugfs parameter allowing
the threshold to be altered at runtime in order to aid future development
and refinement of the value.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
---
arch/powerpc/platforms/powernv/npu-dma.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index dc34662e9df9..a765bf576c14 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -17,7 +17,9 @@
 #include <linux/pci.h>
 #include <linux/memblock.h>
 #include <linux/iommu.h>
+#include <linux/debugfs.h>

+#include <asm/debugfs.h>
 #include <asm/tlb.h>
 #include <asm/powernv.h>
 #include <asm/reg.h>
@@ -44,7 +46,8 @@ DEFINE_SPINLOCK(npu_context_lock);
  * entire TLB on the GPU for the given PID rather than each specific address in
  * the range.
  */
-#define ATSD_THRESHOLD (2*1024*1024)
+static uint64_t atsd_threshold = 2 * 1024 * 1024;
+static struct dentry *atsd_threshold_dentry;

 /*
  * Other types of TCE cache invalidation are not functional in the
@@ -682,7 +685,7 @@ static void pnv_npu2_mn_invalidate_range(struct mmu_notifier *mn,
 	struct npu_context *npu_context = mn_to_npu_context(mn);
 	unsigned long address;

-	if (end - start > ATSD_THRESHOLD) {
+	if (end - start > atsd_threshold) {
 		/*
 		 * Just invalidate the entire PID if the address range is too
 		 * large.
@@ -956,6 +959,11 @@ int pnv_npu2_init(struct pnv_phb *phb)
 	static int npu_index;
 	uint64_t rc = 0;

+	if (!atsd_threshold_dentry) {
+		atsd_threshold_dentry = debugfs_create_x64("atsd_threshold",
+				   0600, powerpc_debugfs_root, &atsd_threshold);
+	}
+
 	phb->npu.nmmu_flush =
 		of_property_read_bool(phb->hose->dn, "ibm,nmmu-flush");
 	for_each_child_of_node(phb->hose->dn, dn) {
--
2.11.0
* Re: [PATCH 2/2] powernv/npu: Add a debugfs setting to change ATSD threshold
2018-04-17 9:11 ` [PATCH 2/2] powernv/npu: Add a debugfs setting to change ATSD threshold Alistair Popple
@ 2018-04-17 21:45 ` Balbir Singh
2018-07-23 15:11 ` [2/2] " Michael Ellerman
1 sibling, 0 replies; 8+ messages in thread
From: Balbir Singh @ 2018-04-17 21:45 UTC (permalink / raw)
To: Alistair Popple; +Cc: linuxppc-dev, mpe, mhairgrove, arbab
On Tue, 17 Apr 2018 19:11:29 +1000
Alistair Popple <alistair@popple.id.au> wrote:
> The threshold at which it becomes more efficient to coalesce a range of
> ATSDs into a single per-PID ATSD is currently not well understood due to a
> lack of real-world work loads. This patch adds a debugfs parameter allowing
> the threshold to be altered at runtime in order to aid future development
> and refinement of the value.
>
> Signed-off-by: Alistair Popple <alistair@popple.id.au>
> ---
> arch/powerpc/platforms/powernv/npu-dma.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> index dc34662e9df9..a765bf576c14 100644
> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> @@ -17,7 +17,9 @@
>  #include <linux/pci.h>
>  #include <linux/memblock.h>
>  #include <linux/iommu.h>
> +#include <linux/debugfs.h>
>
> +#include <asm/debugfs.h>
>  #include <asm/tlb.h>
>  #include <asm/powernv.h>
>  #include <asm/reg.h>
> @@ -44,7 +46,8 @@ DEFINE_SPINLOCK(npu_context_lock);
>   * entire TLB on the GPU for the given PID rather than each specific address in
>   * the range.
>   */
> -#define ATSD_THRESHOLD (2*1024*1024)
> +static uint64_t atsd_threshold = 2 * 1024 * 1024;
> +static struct dentry *atsd_threshold_dentry;
>
>  /*
>   * Other types of TCE cache invalidation are not functional in the
> @@ -682,7 +685,7 @@ static void pnv_npu2_mn_invalidate_range(struct mmu_notifier *mn,
>  	struct npu_context *npu_context = mn_to_npu_context(mn);
>  	unsigned long address;
>
> -	if (end - start > ATSD_THRESHOLD) {
> +	if (end - start > atsd_threshold) {
>  		/*
>  		 * Just invalidate the entire PID if the address range is too
>  		 * large.
> @@ -956,6 +959,11 @@ int pnv_npu2_init(struct pnv_phb *phb)
>  	static int npu_index;
>  	uint64_t rc = 0;
>
> +	if (!atsd_threshold_dentry) {
> +		atsd_threshold_dentry = debugfs_create_x64("atsd_threshold",
Nit-picking: can we call this atsd_threshold_in_bytes?
> +				   0600, powerpc_debugfs_root, &atsd_threshold);
> +	}
> +
>  	phb->npu.nmmu_flush =
>  		of_property_read_bool(phb->hose->dn, "ibm,nmmu-flush");
>  	for_each_child_of_node(phb->hose->dn, dn) {
Acked-by: Balbir Singh <bsingharora@gmail.com>
* Re: [2/2] powernv/npu: Add a debugfs setting to change ATSD threshold
2018-04-17 9:11 ` [PATCH 2/2] powernv/npu: Add a debugfs setting to change ATSD threshold Alistair Popple
2018-04-17 21:45 ` Balbir Singh
@ 2018-07-23 15:11 ` Michael Ellerman
1 sibling, 0 replies; 8+ messages in thread
From: Michael Ellerman @ 2018-07-23 15:11 UTC (permalink / raw)
To: Alistair Popple, linuxppc-dev; +Cc: Alistair Popple, mhairgrove, arbab
On Tue, 2018-04-17 at 09:11:29 UTC, Alistair Popple wrote:
> The threshold at which it becomes more efficient to coalesce a range of
> ATSDs into a single per-PID ATSD is currently not well understood due to a
> lack of real-world work loads. This patch adds a debugfs parameter allowing
> the threshold to be altered at runtime in order to aid future development
> and refinement of the value.
>
> Signed-off-by: Alistair Popple <alistair@popple.id.au>
> Acked-by: Balbir Singh <bsingharora@gmail.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/99c3ce33a00bc40cb218af770ef00c
cheers
* Re: [PATCH 1/2] powernv/npu: Do a PID GPU TLB flush when invalidating a large address range
2018-04-17 9:11 [PATCH 1/2] powernv/npu: Do a PID GPU TLB flush when invalidating a large address range Alistair Popple
2018-04-17 9:11 ` [PATCH 2/2] powernv/npu: Add a debugfs setting to change ATSD threshold Alistair Popple
@ 2018-04-17 9:17 ` Balbir Singh
2018-04-17 22:25 ` Balbir Singh
2018-04-20 3:51 ` Alistair Popple
2018-04-24 3:48 ` [1/2] " Michael Ellerman
3 siblings, 1 reply; 8+ messages in thread
From: Balbir Singh @ 2018-04-17 9:17 UTC (permalink / raw)
To: Alistair Popple
Cc: open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), Michael Ellerman,
Mark Hairgrove, arbab
On Tue, Apr 17, 2018 at 7:11 PM, Alistair Popple <alistair@popple.id.au> wrote:
> The NPU has a limited number of address translation shootdown (ATSD)
> registers and the GPU has limited bandwidth to process ATSDs. This can
> result in contention of ATSD registers leading to soft lockups on some
> threads, particularly when invalidating a large address range in
> pnv_npu2_mn_invalidate_range().
>
> At some threshold it becomes more efficient to flush the entire GPU TLB for
> the given MM context (PID) than individually flushing each address in the
> range. This patch will result in ranges greater than 2MB being converted
> from 32+ ATSDs into a single ATSD which will flush the TLB for the given
> PID on each GPU.
>
> Signed-off-by: Alistair Popple <alistair@popple.id.au>
> ---
> arch/powerpc/platforms/powernv/npu-dma.c | 23 +++++++++++++++++++----
> 1 file changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> index 94801d8e7894..dc34662e9df9 100644
> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> @@ -40,6 +40,13 @@
>  DEFINE_SPINLOCK(npu_context_lock);
>
>  /*
> + * When an address shootdown range exceeds this threshold we invalidate the
> + * entire TLB on the GPU for the given PID rather than each specific address in
> + * the range.
> + */
> +#define ATSD_THRESHOLD (2*1024*1024)
> +
> +/*
>   * Other types of TCE cache invalidation are not functional in the
>   * hardware.
>   */
> @@ -675,11 +682,19 @@ static void pnv_npu2_mn_invalidate_range(struct mmu_notifier *mn,
>  	struct npu_context *npu_context = mn_to_npu_context(mn);
>  	unsigned long address;
>
> -	for (address = start; address < end; address += PAGE_SIZE)
> -		mmio_invalidate(npu_context, 1, address, false);
> +	if (end - start > ATSD_THRESHOLD) {
I'm nitpicking, but (end - start) > ATSD_THRESHOLD would be clearer.
> +		/*
> +		 * Just invalidate the entire PID if the address range is too
> +		 * large.
> +		 */
> +		mmio_invalidate(npu_context, 0, 0, true);
> +	} else {
> +		for (address = start; address < end; address += PAGE_SIZE)
> +			mmio_invalidate(npu_context, 1, address, false);
>
> -	/* Do the flush only on the final addess == end */
> -	mmio_invalidate(npu_context, 1, address, true);
> +		/* Do the flush only on the final addess == end */
> +		mmio_invalidate(npu_context, 1, address, true);
> +	}
>  }
>
Acked-by: Balbir Singh <bsingharora@gmail.com>
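
To make the "32+ ATSDs into a single ATSD" arithmetic in the commit message concrete, here is a small standalone model of the patched control flow. It is only a sketch: it counts how many mmio_invalidate() calls the function would issue rather than touching any hardware, and it assumes a 64K page size (which is what makes 2MB equal 32 pages). The names count_invalidates, EXAMPLE_PAGE_SIZE and EXAMPLE_THRESHOLD are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants; the kernel uses PAGE_SIZE and ATSD_THRESHOLD. */
#define EXAMPLE_PAGE_SIZE	(64 * 1024ULL)		/* assumed 64K pages */
#define EXAMPLE_THRESHOLD	(2 * 1024 * 1024ULL)	/* 2MB, as in the patch */

/*
 * Count the mmio_invalidate() calls the patched function would make for
 * the range [start, end). This mirrors the if/else in the diff above.
 */
static uint64_t count_invalidates(uint64_t start, uint64_t end,
				  uint64_t page_size, uint64_t threshold)
{
	uint64_t address, calls = 0;

	assert(page_size > 0 && end >= start);

	if (end - start > threshold)
		return 1;	/* single per-PID invalidate with flush */

	for (address = start; address < end; address += page_size)
		calls++;	/* per-address invalidate, no flush */

	return calls + 1;	/* final invalidate that also flushes */
}
```

Under these assumptions a range of exactly 2MB is not above the threshold and still costs 32 per-page invalidates plus the final flushing one, while a range one page larger collapses to a single call. Note the comparison is strictly greater-than, so the threshold value itself stays on the per-address path.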
* Re: [PATCH 1/2] powernv/npu: Do a PID GPU TLB flush when invalidating a large address range
2018-04-17 9:17 ` [PATCH 1/2] powernv/npu: Do a PID GPU TLB flush when invalidating a large address range Balbir Singh
@ 2018-04-17 22:25 ` Balbir Singh
0 siblings, 0 replies; 8+ messages in thread
From: Balbir Singh @ 2018-04-17 22:25 UTC (permalink / raw)
To: Alistair Popple
Cc: open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), Michael Ellerman,
Mark Hairgrove, arbab
On Tue, Apr 17, 2018 at 7:17 PM, Balbir Singh <bsingharora@gmail.com> wrote:
> On Tue, Apr 17, 2018 at 7:11 PM, Alistair Popple <alistair@popple.id.au> wrote:
>> The NPU has a limited number of address translation shootdown (ATSD)
>> registers and the GPU has limited bandwidth to process ATSDs. This can
>> result in contention of ATSD registers leading to soft lockups on some
>> threads, particularly when invalidating a large address range in
>> pnv_npu2_mn_invalidate_range().
>>
>> At some threshold it becomes more efficient to flush the entire GPU TLB for
>> the given MM context (PID) than individually flushing each address in the
>> range. This patch will result in ranges greater than 2MB being converted
>> from 32+ ATSDs into a single ATSD which will flush the TLB for the given
>> PID on each GPU.
>>
>> Signed-off-by: Alistair Popple <alistair@popple.id.au>
>> + }
>> }
>>
>
> Acked-by: Balbir Singh <bsingharora@gmail.com>
Tested-by: Balbir Singh <bsingharora@gmail.com>
* Re: [PATCH 1/2] powernv/npu: Do a PID GPU TLB flush when invalidating a large address range
2018-04-17 9:11 [PATCH 1/2] powernv/npu: Do a PID GPU TLB flush when invalidating a large address range Alistair Popple
2018-04-17 9:11 ` [PATCH 2/2] powernv/npu: Add a debugfs setting to change ATSD threshold Alistair Popple
2018-04-17 9:17 ` [PATCH 1/2] powernv/npu: Do a PID GPU TLB flush when invalidating a large address range Balbir Singh
@ 2018-04-20 3:51 ` Alistair Popple
2018-04-24 3:48 ` [1/2] " Michael Ellerman
3 siblings, 0 replies; 8+ messages in thread
From: Alistair Popple @ 2018-04-20 3:51 UTC (permalink / raw)
To: linuxppc-dev; +Cc: mpe, mhairgrove, arbab
Sorry, forgot to include:
Fixes: 1ab66d1fbada ("powerpc/powernv: Introduce address translation services for Nvlink2")
Thanks
On Tuesday, 17 April 2018 7:11:28 PM AEST Alistair Popple wrote:
> The NPU has a limited number of address translation shootdown (ATSD)
> registers and the GPU has limited bandwidth to process ATSDs. This can
> result in contention of ATSD registers leading to soft lockups on some
> threads, particularly when invalidating a large address range in
> pnv_npu2_mn_invalidate_range().
>
> At some threshold it becomes more efficient to flush the entire GPU TLB for
> the given MM context (PID) than individually flushing each address in the
> range. This patch will result in ranges greater than 2MB being converted
> from 32+ ATSDs into a single ATSD which will flush the TLB for the given
> PID on each GPU.
>
> Signed-off-by: Alistair Popple <alistair@popple.id.au>
> ---
> arch/powerpc/platforms/powernv/npu-dma.c | 23 +++++++++++++++++++----
> 1 file changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> index 94801d8e7894..dc34662e9df9 100644
> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> @@ -40,6 +40,13 @@
>  DEFINE_SPINLOCK(npu_context_lock);
>
>  /*
> + * When an address shootdown range exceeds this threshold we invalidate the
> + * entire TLB on the GPU for the given PID rather than each specific address in
> + * the range.
> + */
> +#define ATSD_THRESHOLD (2*1024*1024)
> +
> +/*
>   * Other types of TCE cache invalidation are not functional in the
>   * hardware.
>   */
> @@ -675,11 +682,19 @@ static void pnv_npu2_mn_invalidate_range(struct mmu_notifier *mn,
>  	struct npu_context *npu_context = mn_to_npu_context(mn);
>  	unsigned long address;
>
> -	for (address = start; address < end; address += PAGE_SIZE)
> -		mmio_invalidate(npu_context, 1, address, false);
> +	if (end - start > ATSD_THRESHOLD) {
> +		/*
> +		 * Just invalidate the entire PID if the address range is too
> +		 * large.
> +		 */
> +		mmio_invalidate(npu_context, 0, 0, true);
> +	} else {
> +		for (address = start; address < end; address += PAGE_SIZE)
> +			mmio_invalidate(npu_context, 1, address, false);
>
> -	/* Do the flush only on the final addess == end */
> -	mmio_invalidate(npu_context, 1, address, true);
> +		/* Do the flush only on the final addess == end */
> +		mmio_invalidate(npu_context, 1, address, true);
> +	}
>  }
>
>  static const struct mmu_notifier_ops nv_nmmu_notifier_ops = {
>
* Re: [1/2] powernv/npu: Do a PID GPU TLB flush when invalidating a large address range
2018-04-17 9:11 [PATCH 1/2] powernv/npu: Do a PID GPU TLB flush when invalidating a large address range Alistair Popple
` (2 preceding siblings ...)
2018-04-20 3:51 ` Alistair Popple
@ 2018-04-24 3:48 ` Michael Ellerman
3 siblings, 0 replies; 8+ messages in thread
From: Michael Ellerman @ 2018-04-24 3:48 UTC (permalink / raw)
To: Alistair Popple, linuxppc-dev; +Cc: Alistair Popple, mhairgrove, arbab
On Tue, 2018-04-17 at 09:11:28 UTC, Alistair Popple wrote:
> The NPU has a limited number of address translation shootdown (ATSD)
> registers and the GPU has limited bandwidth to process ATSDs. This can
> result in contention of ATSD registers leading to soft lockups on some
> threads, particularly when invalidating a large address range in
> pnv_npu2_mn_invalidate_range().
>
> At some threshold it becomes more efficient to flush the entire GPU TLB for
> the given MM context (PID) than individually flushing each address in the
> range. This patch will result in ranges greater than 2MB being converted
> from 32+ ATSDs into a single ATSD which will flush the TLB for the given
> PID on each GPU.
>
> Signed-off-by: Alistair Popple <alistair@popple.id.au>
> Acked-by: Balbir Singh <bsingharora@gmail.com>
> Tested-by: Balbir Singh <bsingharora@gmail.com>
Patch 1 applied to powerpc fixes, thanks.
https://git.kernel.org/powerpc/c/d0cf9b561ca97d5245bb9e0c4774b7
cheers