linux-arm-kernel.lists.infradead.org archive mirror
* [RFC PATCH] irqchip/gic-v3-its: Allocate ITS tables from corresponding node memory
@ 2016-04-29 12:18 Ashok Kumar
  2016-04-29 13:02 ` Robert Richter
  2016-05-03  7:54 ` Marc Zyngier
  0 siblings, 2 replies; 4+ messages in thread
From: Ashok Kumar @ 2016-04-29 12:18 UTC (permalink / raw)
  To: linux-arm-kernel

On multi-socket systems with multiple ITSes, allocating the ITS device
table, collection table, interrupt translation table and command queue
from node-local memory helps reduce inter-chip traffic, even though the
tables (but not the command queue) could be cached in the GIC.

Signed-off-by: Ashok Kumar <ashoks@broadcom.com>
---
This patch is created on top of Cavium thunderx erratum 23144 patch [1].

I am not sure how to do this for ACPI as GIC ITS ID in MADT doesn't map to
_PXM. Am I missing something here? Any thoughts?

[1] https://lkml.org/lkml/2016/4/15/830 - [PATCH v5] irqchip, gicv3-its, \
numa: Enable workaround for Cavium thunderx erratum 23144

Thanks,
Ashok

CC: marc.zyngier at arm.com
CC: rrichter at caviumnetworks.com
CC: gkulkarni at caviumnetworks.com
CC: jchandra at broadcom.com

 drivers/irqchip/irq-gic-v3-its.c |   12 ++++++++----
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 75f258f..9a187c0 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -860,6 +860,7 @@ static int its_alloc_tables(const char *node_name, struct its_node *its)
 		int alloc_pages;
 		u64 tmp;
 		void *base;
+		struct page *pg;
 
 		if (type == GITS_BASER_TYPE_NONE)
 			continue;
@@ -897,11 +898,13 @@ retry_alloc_baser:
 				node_name, order, alloc_pages);
 		}
 
-		base = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
-		if (!base) {
+		pg = alloc_pages_node(its->numa_node,
+				      GFP_KERNEL | __GFP_ZERO, order);
+		if (!pg) {
 			err = -ENOMEM;
 			goto out_free;
 		}
+		base = page_address(pg);
 
 		its->tables[i].base = base;
 		its->tables[i].order = order;
@@ -1184,7 +1187,7 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 	nr_ites = max(2UL, roundup_pow_of_two(nvecs));
 	sz = nr_ites * its->ite_size;
 	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
-	itt = kzalloc(sz, GFP_KERNEL);
+	itt = kzalloc_node(sz, GFP_KERNEL, its->numa_node);
 	lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);
 	if (lpi_map)
 		col_map = kzalloc(sizeof(*col_map) * nr_lpis, GFP_KERNEL);
@@ -1526,7 +1529,8 @@ static int __init its_probe(struct device_node *node,
 	its->ite_size = ((readl_relaxed(its_base + GITS_TYPER) >> 4) & 0xf) + 1;
 	its->numa_node = of_node_to_nid(node);
 
-	its->cmd_base = kzalloc(ITS_CMD_QUEUE_SZ, GFP_KERNEL);
+	its->cmd_base = kzalloc_node(ITS_CMD_QUEUE_SZ, GFP_KERNEL,
+				     its->numa_node);
 	if (!its->cmd_base) {
 		err = -ENOMEM;
 		goto out_free_its;
-- 
1.7.1
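
Note on the node argument: the patch passes its->numa_node straight to
alloc_pages_node() and kzalloc_node(). To my knowledge both accept
NUMA_NO_NODE and then fall back to the caller's local node, so an ITS
without node information should behave as before. The following is only a
userspace sketch of that fallback, not kernel code: zalloc_on_node(),
struct node_buf, local_node and NR_NODES are hypothetical names made up
for illustration, and calloc() stands in for the real page and slab
allocators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUMA_NO_NODE (-1)  /* same sentinel value the kernel uses */
#define NR_NODES     2     /* hypothetical two-socket machine */

/* Hypothetical stand-in for numa_mem_id(): the node the caller runs on. */
static int local_node = 0;

/* Userspace model of a buffer that remembers which node it was placed on. */
struct node_buf {
	int nid;
	size_t size;
	unsigned char data[];
};

/*
 * Model of a zeroing, node-aware allocation: honour the requested node,
 * fall back to the local node for NUMA_NO_NODE, reject unknown nodes.
 */
static struct node_buf *zalloc_on_node(size_t size, int nid)
{
	struct node_buf *buf;

	assert(size > 0);

	if (nid == NUMA_NO_NODE)
		nid = local_node;
	if (nid < 0 || nid >= NR_NODES)
		return NULL;

	buf = calloc(1, sizeof(*buf) + size);
	if (!buf)
		return NULL;

	buf->nid = nid;
	buf->size = size;
	return buf;
}
```

In this model, a request for node 1 is tagged with node 1, while a request
with NUMA_NO_NODE lands on local_node, which is the behaviour the
unpatched kzalloc() path had for every ITS.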


* [RFC PATCH] irqchip/gic-v3-its: Allocate ITS tables from corresponding node memory
  2016-04-29 12:18 [RFC PATCH] irqchip/gic-v3-its: Allocate ITS tables from corresponding node memory Ashok Kumar
@ 2016-04-29 13:02 ` Robert Richter
  2016-05-03  7:54 ` Marc Zyngier
  1 sibling, 0 replies; 4+ messages in thread
From: Robert Richter @ 2016-04-29 13:02 UTC (permalink / raw)
  To: linux-arm-kernel

On 29.04.16 05:18:42, Ashok Kumar wrote:
> On multi-socket systems with multiple ITSes, allocating the ITS device
> table, collection table, interrupt translation table and command queue
> from node-local memory helps reduce inter-chip traffic, even though the
> tables (but not the command queue) could be cached in the GIC.
> 
> Signed-off-by: Ashok Kumar <ashoks@broadcom.com>
> ---
> This patch is created on top of Cavium thunderx erratum 23144 patch [1].
> 
> I am not sure how to do this for ACPI as GIC ITS ID in MADT doesn't map to
> _PXM. Am I missing something here? Any thoughts?

For ACPI we enable the #23144 workaround differently: there we determine
the node using MPIDR_AFFINITY_LEVEL(). I am going to send a patch for
this soon (but it is ThunderX-specific and only works for the erratum
handler).

-Robert

> [1] https://lkml.org/lkml/2016/4/15/830 - [PATCH v5] irqchip, gicv3-its, \
> numa: Enable workaround for Cavium thunderx erratum 23144
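
For readers unfamiliar with MPIDR_AFFINITY_LEVEL(): it extracts one of the
Aff0..Aff3 fields from a CPU's MPIDR_EL1 value. The sketch below
reproduces the arm64 macro definitions in standalone C so the arithmetic
can be checked in userspace. node_from_mpidr() is a hypothetical helper,
and the choice of Aff2 as the socket/node number is an assumption made
for illustration; the thread does not say which affinity level the
ThunderX handler uses.

```c
#include <stdint.h>

/* Each affinity field is 8 bits wide; Aff3 sits at bit 32, not bit 24. */
#define MPIDR_LEVEL_BITS_SHIFT	3
#define MPIDR_LEVEL_BITS	(1 << MPIDR_LEVEL_BITS_SHIFT)
#define MPIDR_LEVEL_MASK	((1 << MPIDR_LEVEL_BITS) - 1)

/* Yields shifts of 0, 8, 16 and 32 for levels 0, 1, 2 and 3. */
#define MPIDR_LEVEL_SHIFT(level) \
	(((1 << (level)) >> 1) << MPIDR_LEVEL_BITS_SHIFT)

#define MPIDR_AFFINITY_LEVEL(mpidr, level) \
	(((mpidr) >> MPIDR_LEVEL_SHIFT(level)) & MPIDR_LEVEL_MASK)

/*
 * Hypothetical helper: treat Aff2 as the NUMA node number.
 * Which level identifies the socket is platform-specific (assumption).
 */
static int node_from_mpidr(uint64_t mpidr)
{
	return (int)MPIDR_AFFINITY_LEVEL(mpidr, 2);
}
```

With an MPIDR value of 0x0000000400010203, the macro gives Aff0 = 3,
Aff1 = 2, Aff2 = 1 and Aff3 = 4, so node_from_mpidr() returns 1 under the
Aff2 assumption.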


* [RFC PATCH] irqchip/gic-v3-its: Allocate ITS tables from corresponding node memory
  2016-04-29 12:18 [RFC PATCH] irqchip/gic-v3-its: Allocate ITS tables from corresponding node memory Ashok Kumar
  2016-04-29 13:02 ` Robert Richter
@ 2016-05-03  7:54 ` Marc Zyngier
  2016-05-03  8:05   ` Ashok Kumar
  1 sibling, 1 reply; 4+ messages in thread
From: Marc Zyngier @ 2016-05-03  7:54 UTC (permalink / raw)
  To: linux-arm-kernel

[Please CC LKML and all the irqchip maintainers on these patches]

On 29/04/16 13:18, Ashok Kumar wrote:
> On multi-socket systems with multiple ITSes, allocating the ITS device
> table, collection table, interrupt translation table and command queue
> from node-local memory helps reduce inter-chip traffic, even though the
> tables (but not the command queue) could be cached in the GIC.
> 
> Signed-off-by: Ashok Kumar <ashoks@broadcom.com>
> ---
> This patch is created on top of Cavium thunderx erratum 23144 patch [1].
> 
> I am not sure how to do this for ACPI as GIC ITS ID in MADT doesn't map to
> _PXM. Am I missing something here? Any thoughts?

Indeed, and SRAT doesn't provide any valuable information either.

> 
> [1] https://lkml.org/lkml/2016/4/15/830 - [PATCH v5] irqchip, gicv3-its, \
> numa: Enable workaround for Cavium thunderx erratum 23144
> 
> Thanks,
> Ashok
> 
> CC: marc.zyngier at arm.com
> CC: rrichter at caviumnetworks.com
> CC: gkulkarni at caviumnetworks.com
> CC: jchandra at broadcom.com
> 
>  drivers/irqchip/irq-gic-v3-its.c |   12 ++++++++----
>  1 files changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 75f258f..9a187c0 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -860,6 +860,7 @@ static int its_alloc_tables(const char *node_name, struct its_node *its)
>  		int alloc_pages;
>  		u64 tmp;
>  		void *base;
> +		struct page *pg;
>  
>  		if (type == GITS_BASER_TYPE_NONE)
>  			continue;
> @@ -897,11 +898,13 @@ retry_alloc_baser:
>  				node_name, order, alloc_pages);
>  		}
>  
> -		base = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
> -		if (!base) {
> +		pg = alloc_pages_node(its->numa_node,
> +				      GFP_KERNEL | __GFP_ZERO, order);
> +		if (!pg) {
>  			err = -ENOMEM;
>  			goto out_free;
>  		}
> +		base = page_address(pg);
>  
>  		its->tables[i].base = base;
>  		its->tables[i].order = order;
> @@ -1184,7 +1187,7 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
>  	nr_ites = max(2UL, roundup_pow_of_two(nvecs));
>  	sz = nr_ites * its->ite_size;
>  	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
> -	itt = kzalloc(sz, GFP_KERNEL);
> +	itt = kzalloc_node(sz, GFP_KERNEL, its->numa_node);
>  	lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);
>  	if (lpi_map)
>  		col_map = kzalloc(sizeof(*col_map) * nr_lpis, GFP_KERNEL);
> @@ -1526,7 +1529,8 @@ static int __init its_probe(struct device_node *node,
>  	its->ite_size = ((readl_relaxed(its_base + GITS_TYPER) >> 4) & 0xf) + 1;
>  	its->numa_node = of_node_to_nid(node);
>  
> -	its->cmd_base = kzalloc(ITS_CMD_QUEUE_SZ, GFP_KERNEL);
> +	its->cmd_base = kzalloc_node(ITS_CMD_QUEUE_SZ, GFP_KERNEL,
> +				     its->numa_node);
>  	if (!its->cmd_base) {
>  		err = -ENOMEM;
>  		goto out_free_its;
> 

Does this lead to an improvement you've actually measured? If so, I'd
like to see numbers to back it up. Or is that purely theoretical?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


* [RFC PATCH] irqchip/gic-v3-its: Allocate ITS tables from corresponding node memory
  2016-05-03  7:54 ` Marc Zyngier
@ 2016-05-03  8:05   ` Ashok Kumar
  0 siblings, 0 replies; 4+ messages in thread
From: Ashok Kumar @ 2016-05-03  8:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, May 03, 2016 at 08:54:27AM +0100, Marc Zyngier wrote:
> [Please CC LKML and all the irqchip maintainers on these patches]
> 
> On 29/04/16 13:18, Ashok Kumar wrote:
> > On multi-socket systems with multiple ITSes, allocating the ITS device
> > table, collection table, interrupt translation table and command queue
> > from node-local memory helps reduce inter-chip traffic, even though the
> > tables (but not the command queue) could be cached in the GIC.
> > 
> > Signed-off-by: Ashok Kumar <ashoks@broadcom.com>
> > ---
> > This patch is created on top of Cavium thunderx erratum 23144 patch [1].
> > 
> > I am not sure how to do this for ACPI as GIC ITS ID in MADT doesn't map to
> > _PXM. Am I missing something here? Any thoughts?
> 
> Indeed, and SRAT doesn't provide any valuable information either.
> 
> > 
> > [1] https://lkml.org/lkml/2016/4/15/830 - [PATCH v5] irqchip, gicv3-its, \
> > numa: Enable workaround for Cavium thunderx erratum 23144
> > 
> > Thanks,
> > Ashok
> > 
> > CC: marc.zyngier at arm.com
> > CC: rrichter at caviumnetworks.com
> > CC: gkulkarni at caviumnetworks.com
> > CC: jchandra at broadcom.com
> > 
> >  drivers/irqchip/irq-gic-v3-its.c |   12 ++++++++----
> >  1 files changed, 8 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > index 75f258f..9a187c0 100644
> > --- a/drivers/irqchip/irq-gic-v3-its.c
> > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > @@ -860,6 +860,7 @@ static int its_alloc_tables(const char *node_name, struct its_node *its)
> >  		int alloc_pages;
> >  		u64 tmp;
> >  		void *base;
> > +		struct page *pg;
> >  
> >  		if (type == GITS_BASER_TYPE_NONE)
> >  			continue;
> > @@ -897,11 +898,13 @@ retry_alloc_baser:
> >  				node_name, order, alloc_pages);
> >  		}
> >  
> > -		base = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
> > -		if (!base) {
> > +		pg = alloc_pages_node(its->numa_node,
> > +				      GFP_KERNEL | __GFP_ZERO, order);
> > +		if (!pg) {
> >  			err = -ENOMEM;
> >  			goto out_free;
> >  		}
> > +		base = page_address(pg);
> >  
> >  		its->tables[i].base = base;
> >  		its->tables[i].order = order;
> > @@ -1184,7 +1187,7 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
> >  	nr_ites = max(2UL, roundup_pow_of_two(nvecs));
> >  	sz = nr_ites * its->ite_size;
> >  	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
> > -	itt = kzalloc(sz, GFP_KERNEL);
> > +	itt = kzalloc_node(sz, GFP_KERNEL, its->numa_node);
> >  	lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);
> >  	if (lpi_map)
> >  		col_map = kzalloc(sizeof(*col_map) * nr_lpis, GFP_KERNEL);
> > @@ -1526,7 +1529,8 @@ static int __init its_probe(struct device_node *node,
> >  	its->ite_size = ((readl_relaxed(its_base + GITS_TYPER) >> 4) & 0xf) + 1;
> >  	its->numa_node = of_node_to_nid(node);
> >  
> > -	its->cmd_base = kzalloc(ITS_CMD_QUEUE_SZ, GFP_KERNEL);
> > +	its->cmd_base = kzalloc_node(ITS_CMD_QUEUE_SZ, GFP_KERNEL,
> > +				     its->numa_node);
> >  	if (!its->cmd_base) {
> >  		err = -ENOMEM;
> >  		goto out_free_its;
> > 
> 
> Does this lead to an improvement you've actually measured? If so, I'd
> like to see numbers to back it up. Or is that purely theoretical?

It is purely theoretical. I don't have the hardware setup to test it.

Thanks,
Ashok
> 
> Thanks,
> 
> 	M.
> -- 
> Jazz is not dead. It just smells funny...
