* [PATCH] x86 - irq vector assignment
@ 2010-09-21 20:05 Jack Steiner
2010-09-21 20:41 ` Yinghai Lu
0 siblings, 1 reply; 5+ messages in thread
From: Jack Steiner @ 2010-09-21 20:05 UTC (permalink / raw)
To: mingo, tglx; +Cc: linux-kernel, yinghai
Try to assign irq vectors to cpus on the correct node & fall back to global
assignment only if node-local fails. This reduces the chances of
using all of the interrupt vectors of a single cpu.
Signed-off-by: Jack Steiner <steiner@sgi.com>
---
Note: this is a fix for a problem we saw on systems with a large number of IOHs.
The IOHs are distributed across tens of nodes.
Early in boot, the IO infrastructure assigns interrupts for the DMA engines.
Currently, all interrupts are targeted to cpu 0. This uses all interrupt
vectors on cpu 0. Later, some drivers try to create irqs targeted to
cpu 0. The assignment fails because all vectors are assigned.
This is a repost of a patch sent earlier. See
http://marc.info/?l=linux-kernel&m=127740806705617&w=2
http://marc.info/?l=linux-kernel&m=127791052828867&w=2
arch/x86/kernel/apic/io_apic.c | 5 +++++
1 file changed, 5 insertions(+)
Index: linux/arch/x86/kernel/apic/io_apic.c
===================================================================
--- linux.orig/arch/x86/kernel/apic/io_apic.c 2010-09-17 13:00:19.164638447 -0500
+++ linux/arch/x86/kernel/apic/io_apic.c 2010-09-17 13:00:23.448595373 -0500
@@ -3253,6 +3253,11 @@ unsigned int create_irq_nr(unsigned int
 		desc_new = move_irq_desc(desc_new, node);
 		cfg_new = desc_new->chip_data;
 
+#ifdef CONFIG_NUMA
+		if (node >= 0 && __assign_irq_vector(new, cfg_new, node_to_cpumask_map[node]) == 0)
+			irq = new;
+		else
+#endif
 		if (__assign_irq_vector(new, cfg_new, apic->target_cpus()) == 0)
 			irq = new;
 		break;
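
The policy in the hunk above (try the node-local cpus first, fall back
to the global mask only if that fails) can be illustrated with a small
user-space sketch. This is not kernel code: the sizes, the
vectors_used[] array, and the helpers assign_in_range() and
assign_vector() are hypothetical stand-ins for the per-cpu vector
tables and __assign_irq_vector(), and each node is assumed to own a
contiguous range of cpus.

```c
#include <assert.h>

/* Hypothetical toy sizes, chosen only to keep the example small. */
#define NR_CPUS         4
#define CPUS_PER_NODE   2
#define NR_NODES        (NR_CPUS / CPUS_PER_NODE)
#define VECTORS_PER_CPU 3

/* Number of vectors already handed out on each cpu. */
static int vectors_used[NR_CPUS];

/*
 * Stand-in for __assign_irq_vector(): take a vector on the first cpu
 * in [first, last] that still has one free.
 * Returns the chosen cpu, or -1 if every cpu in the range is full.
 */
static int assign_in_range(int first, int last)
{
	for (int cpu = first; cpu <= last; cpu++) {
		if (vectors_used[cpu] < VECTORS_PER_CPU) {
			vectors_used[cpu]++;
			return cpu;
		}
	}
	return -1;
}

/*
 * Node-local first, global fallback second. A negative node means
 * "no node hint", mirroring the node >= 0 check in the patch.
 */
static int assign_vector(int node)
{
	assert(node < NR_NODES);

	if (node >= 0) {
		int first = node * CPUS_PER_NODE;
		int cpu = assign_in_range(first, first + CPUS_PER_NODE - 1);

		if (cpu >= 0)
			return cpu;
	}
	/* Fallback: any cpu, scanned from cpu 0 as the global mask would be. */
	return assign_in_range(0, NR_CPUS - 1);
}
```

With this policy, requests for node 1 consume vectors on cpus 2 and 3
and leave cpu 0 untouched until node 1 is completely full, which is
the behaviour the patch is after.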
* Re: [PATCH] x86 - irq vector assignment
2010-09-21 20:05 [PATCH] x86 - irq vector assignment Jack Steiner
@ 2010-09-21 20:41 ` Yinghai Lu
2010-09-21 21:34 ` Thomas Gleixner
0 siblings, 1 reply; 5+ messages in thread
From: Yinghai Lu @ 2010-09-21 20:41 UTC (permalink / raw)
To: Jack Steiner; +Cc: mingo, tglx, linux-kernel
On Tue, Sep 21, 2010 at 1:05 PM, Jack Steiner <steiner@sgi.com> wrote:
> Try to assign irq vectors to cpus on the correct node & fall back to global
> assignment only if node-local fails. This reduces the chances of
> using all of the interrupt vectors of a single cpu.
>
> Signed-off-by: Jack Steiner <steiner@sgi.com>
>
>
> ---
> Note: this is a fix for a problem we saw on systems with a large number of IOHs.
> The IOHs are distributed across 10's of nodes.
>
> Early in boot, the IO infrastructure assigns interrupts for the DMA engines.
> Currently, all interrupts are targeted to cpu 0. This uses all interrupt
> vectors on cpu 0. Later, some drivers try to create irqs targeted to
> cpu 0. The assignment fails because all vectors are assigned.
>
> This is a repost of a patch sent earlier. See
> http://marc.info/?l=linux-kernel&m=127740806705617&w=2
> http://marc.info/?l=linux-kernel&m=127791052828867&w=2
>
>
>
> arch/x86/kernel/apic/io_apic.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> Index: linux/arch/x86/kernel/apic/io_apic.c
> ===================================================================
> --- linux.orig/arch/x86/kernel/apic/io_apic.c 2010-09-17 13:00:19.164638447 -0500
> +++ linux/arch/x86/kernel/apic/io_apic.c 2010-09-17 13:00:23.448595373 -0500
> @@ -3253,6 +3253,11 @@ unsigned int create_irq_nr(unsigned int
> desc_new = move_irq_desc(desc_new, node);
> cfg_new = desc_new->chip_data;
>
> +#ifdef CONFIG_NUMA
> + if (node >= 0 && __assign_irq_vector(new, cfg_new, node_to_cpumask_map[node]) == 0)
> + irq = new;
> + else
> +#endif
> if (__assign_irq_vector(new, cfg_new, apic->target_cpus()) == 0)
> irq = new;
> break;
target_cpus() for uv_x and x2apic phys mode both return cpu_online_mask,
so we should be able to get a vector from the other cpus. In other
words, __assign_irq_vector() should not fail unless the number of irqs
exceeds nr_irqs.
The current code only makes sure the irq_desc is on the device-local node.
For the vectors, the user can set the irq smp_affinity to move them to
the device-local cpus if needed.
Thanks
Yinghai
* Re: [PATCH] x86 - irq vector assignment
2010-09-21 20:41 ` Yinghai Lu
@ 2010-09-21 21:34 ` Thomas Gleixner
2010-09-21 23:12 ` Yinghai Lu
0 siblings, 1 reply; 5+ messages in thread
From: Thomas Gleixner @ 2010-09-21 21:34 UTC (permalink / raw)
To: Yinghai Lu; +Cc: Jack Steiner, mingo, linux-kernel
On Tue, 21 Sep 2010, Yinghai Lu wrote:
> > arch/x86/kernel/apic/io_apic.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > Index: linux/arch/x86/kernel/apic/io_apic.c
> > ===================================================================
> > --- linux.orig/arch/x86/kernel/apic/io_apic.c 2010-09-17 13:00:19.164638447 -0500
> > +++ linux/arch/x86/kernel/apic/io_apic.c 2010-09-17 13:00:23.448595373 -0500
> > @@ -3253,6 +3253,11 @@ unsigned int create_irq_nr(unsigned int
> > desc_new = move_irq_desc(desc_new, node);
> > cfg_new = desc_new->chip_data;
> >
> > +#ifdef CONFIG_NUMA
> > + if (node >= 0 && __assign_irq_vector(new, cfg_new, node_to_cpumask_map[node]) == 0)
> > + irq = new;
> > + else
> > +#endif
> > if (__assign_irq_vector(new, cfg_new, apic->target_cpus()) == 0)
> > irq = new;
> > break;
>
> target_cpus() for uv_x and x2apic phys mode all have cpu_online_mask()
>
> so we should get the vector for other cpus. aka __assign_irq_vector()
> should not fail. unless you have so many irq > nr_irqs.
Did you even read the changelog ? It's not about "should".
All CPU0 vectors are assigned already just because the current code
takes the first cpu in the target_cpus mask regardless of the node on
which the irq_desc is allocated. That's crap. Why do we allocate
irq_desc on node and leave the vector assigned to node(cpu0) ?
> current code we only make sure irq_desc on device local node.
Brilliant.
> for the vectors, user can set irq smp_affinity move the device local
> cpus if needed.
What a nonsense. If we allocate irq_desc on a target node it does not
make any sense to target the vector to whatever random node/cpu in the
first place and wait for user space to fix it up. What about running
into that situation _before_ we hit user space ?
Thanks,
tglx
* Re: [PATCH] x86 - irq vector assignment
2010-09-21 21:34 ` Thomas Gleixner
@ 2010-09-21 23:12 ` Yinghai Lu
2010-09-21 23:53 ` Jack Steiner
0 siblings, 1 reply; 5+ messages in thread
From: Yinghai Lu @ 2010-09-21 23:12 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Jack Steiner, mingo, linux-kernel
On 09/21/2010 02:34 PM, Thomas Gleixner wrote:
> On Tue, 21 Sep 2010, Yinghai Lu wrote:
>>> arch/x86/kernel/apic/io_apic.c | 5 +++++
>>> 1 file changed, 5 insertions(+)
>>>
>>> Index: linux/arch/x86/kernel/apic/io_apic.c
>>> ===================================================================
>>> --- linux.orig/arch/x86/kernel/apic/io_apic.c 2010-09-17 13:00:19.164638447 -0500
>>> +++ linux/arch/x86/kernel/apic/io_apic.c 2010-09-17 13:00:23.448595373 -0500
>>> @@ -3253,6 +3253,11 @@ unsigned int create_irq_nr(unsigned int
>>> desc_new = move_irq_desc(desc_new, node);
>>> cfg_new = desc_new->chip_data;
>>>
>>> +#ifdef CONFIG_NUMA
>>> + if (node >= 0 && __assign_irq_vector(new, cfg_new, node_to_cpumask_map[node]) == 0)
>>> + irq = new;
>>> + else
>>> +#endif
>>> if (__assign_irq_vector(new, cfg_new, apic->target_cpus()) == 0)
>>> irq = new;
>>> break;
>>
>> target_cpus() for uv_x and x2apic phys mode all have cpu_online_mask()
>>
>> so we should get the vector for other cpus. aka __assign_irq_vector()
>> should not fail. unless you have so many irq > nr_irqs.
>
> Did you even read the changelog ? It's not about "should".
>
> All CPU0 vectors are assigned already just because the current code
> takes the first cpu in the target_cpus mask regardless of the node on
> which the irq_desc is allocated. That's crap. Why do we allocate
> irq_desc on node and leave the vector assigned to node(cpu0) ?
OK, I got it. Vectors from cpus on node 0 are used by devices from other
nodes, so later devices on node 0 cannot get a vector from node 0.
Yinghai
* Re: [PATCH] x86 - irq vector assignment
2010-09-21 23:12 ` Yinghai Lu
@ 2010-09-21 23:53 ` Jack Steiner
0 siblings, 0 replies; 5+ messages in thread
From: Jack Steiner @ 2010-09-21 23:53 UTC (permalink / raw)
To: Yinghai Lu; +Cc: Thomas Gleixner, mingo, linux-kernel
On Tue, Sep 21, 2010 at 04:12:33PM -0700, Yinghai Lu wrote:
> On 09/21/2010 02:34 PM, Thomas Gleixner wrote:
> > On Tue, 21 Sep 2010, Yinghai Lu wrote:
> >>> arch/x86/kernel/apic/io_apic.c | 5 +++++
> >>> 1 file changed, 5 insertions(+)
> >>>
> >>> Index: linux/arch/x86/kernel/apic/io_apic.c
> >>> ===================================================================
> >>> --- linux.orig/arch/x86/kernel/apic/io_apic.c 2010-09-17 13:00:19.164638447 -0500
> >>> +++ linux/arch/x86/kernel/apic/io_apic.c 2010-09-17 13:00:23.448595373 -0500
> >>> @@ -3253,6 +3253,11 @@ unsigned int create_irq_nr(unsigned int
> >>> desc_new = move_irq_desc(desc_new, node);
> >>> cfg_new = desc_new->chip_data;
> >>>
> >>> +#ifdef CONFIG_NUMA
> >>> + if (node >= 0 && __assign_irq_vector(new, cfg_new, node_to_cpumask_map[node]) == 0)
> >>> + irq = new;
> >>> + else
> >>> +#endif
> >>> if (__assign_irq_vector(new, cfg_new, apic->target_cpus()) == 0)
> >>> irq = new;
> >>> break;
> >>
> >> target_cpus() for uv_x and x2apic phys mode all have cpu_online_mask()
> >>
> >> so we should get the vector for other cpus. aka __assign_irq_vector()
> >> should not fail. unless you have so many irq > nr_irqs.
> >
> > Did you even read the changelog ? It's not about "should".
> >
> > All CPU0 vectors are assigned already just because the current code
> > takes the first cpu in the target_cpus mask regardless of the node on
> > which the irq_desc is allocated. That's crap. Why do we allocate
> > irq_desc on node and leave the vector assigned to node(cpu0) ?
>
> ok, i got it. vectors from cpus on node0 are used by devices from others nodes.
> later devices from node0 can not get vector from node0.
Does that resolve all of your questions, or are there still other issues?