From: Peter Zijlstra <peterz@infradead.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Dimitri Sivanich <sivanich@sgi.com>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Ingo Molnar <mingo@elte.hu>,
Suresh Siddha <suresh.b.siddha@intel.com>,
Yinghai Lu <yinghai@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Jesse Barnes <jbarnes@virtuousgeek.org>,
Arjan van de Ven <arjan@infradead.org>,
David Miller <davem@davemloft.net>,
Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>,
"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH v6] x86/apic: limit irq affinity
Date: Tue, 24 Nov 2009 14:39:46 +0100
Message-ID: <1259069986.4531.1453.camel@laptop>
In-Reply-To: <alpine.LFD.2.00.0911241246470.24119@localhost.localdomain>

On Tue, 2009-11-24 at 14:20 +0100, Thomas Gleixner wrote:
> On Sat, 21 Nov 2009, Dimitri Sivanich wrote:
>
> > On Sat, Nov 21, 2009 at 10:49:50AM -0800, Eric W. Biederman wrote:
> > > Dimitri Sivanich <sivanich@sgi.com> writes:
> > >
> > > > This patch allows for hard numa restrictions to irq affinity on x86 systems.
> > > >
> > > > Affinity is masked to allow only those cpus which the subarchitecture
> > > > deems accessible by the given irq.
> > > >
> > > > On some UV systems, this domain will be limited to the nodes accessible
> > > > to the irq's node. Initially other X86 systems will not mask off any cpus
> > > > so non-UV systems will remain unaffected.
> > >
> > > Is this a hardware restriction you are trying to model?
> > > If not this seems wrong.
> >
> > Yes, it's a hardware restriction.
>
> Nevertheless I think that this is the wrong approach.
>
> What we really want is a notion in the irq descriptor which tells us:
> this interrupt is restricted to numa node N.
>
> The solution in this patch is just restricted to x86 and hides that
> information deep in the arch code.
>
> Further the patch adds code which should be in the generic interrupt
> management code as it is useful for other purposes as well:
>
> Driver folks are looking for a way to restrict irq balancing to a
> given numa node when they have all the driver data allocated on that
> node. That's not a hardware restriction as in the UV case but requires
> a similar infrastructure.
>
> One possible solution would be to have a new flag:
> IRQF_NODE_BOUND - irq is bound to desc->node
>
> When an interrupt is set up we would query with a new irq_chip
> function chip->get_node_affinity(irq) which would default to an empty
> implementation returning -1. The arch code can provide its own
> function to return the numa affinity which would express the hardware
> restriction.
>
> The core code would restrict affinity settings to the cpumask of that
> node without any need for the arch code to check it further.
>
> That same infrastructure could be used for the software restriction of
> interrupts to a node on which the device is bound.
>
> Having it in the core code also allows us to expose this information
> to user space so that the irq balancer knows about it and does not try
> to randomly move the affinity to cpus which are not in the allowed set
> of the node.

I think we should not combine these two cases.

Node-bound devices simply prefer the IRQ to be routed to a cpu 'near'
that node; hard-limiting them to that node is policy and is not
something we should do.

Defaulting to the node-mask is debatable, but is, I think, something we
could do. But I think we should allow user-space to write any mask as
long as the hardware can indeed route the IRQ that way, even when the
mask is clearly stupid.

This is where the UV case comes in: UV hardware cannot route IRQs to
every CPU, so it makes sense to limit the possible masks being written.
I do however fully agree that this should be done in generic code, as I
can quite imagine more hardware than UV having limitations in this
regard.

Furthermore, the sysfs topology information should include IRQ routing
data in this case.
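
A minimal user-space sketch of the distinction drawn above, not kernel code: the hardware-routable set is a hard limit on what may be written, while the node mask is only a default. Every name here (toy_cpumask_t, struct toy_irq, toy_default_affinity, toy_set_affinity) is hypothetical and exists only for illustration. Rejecting an unroutable mask, rather than silently trimming it, is also an assumption; the mail does not say which the core code should do.

```c
#include <assert.h>
#include <stdint.h>

/* Toy cpumask: bit n set means CPU n is in the mask (at most 64 CPUs). */
typedef uint64_t toy_cpumask_t;

/* Hypothetical per-IRQ state; these fields do not exist in the kernel. */
struct toy_irq {
	toy_cpumask_t routable;  /* CPUs the hardware can deliver this IRQ to */
	toy_cpumask_t node_cpus; /* CPUs on the device's home node (preference) */
	toy_cpumask_t affinity;  /* currently programmed affinity */
};

/*
 * Default affinity: prefer the home node, trimmed to what the hardware
 * can route. If nothing on the home node is routable, fall back to the
 * whole routable set.
 */
toy_cpumask_t toy_default_affinity(const struct toy_irq *irq)
{
	toy_cpumask_t m = irq->node_cpus & irq->routable;

	return m ? m : irq->routable;
}

/*
 * User-space write: accept any non-empty mask the hardware can route,
 * even one far from the home node. Reject an empty mask or one naming
 * unroutable CPUs. Returns 0 on success, -1 on rejection (assumed policy).
 */
int toy_set_affinity(struct toy_irq *irq, toy_cpumask_t requested)
{
	if (requested == 0 || (requested & ~irq->routable) != 0)
		return -1;

	irq->affinity = requested;
	return 0;
}

/* Small self-check of the two rules above. */
void toy_selftest(void)
{
	struct toy_irq irq = { .routable = 0x0F, .node_cpus = 0x03, .affinity = 0 };

	assert(toy_default_affinity(&irq) == 0x03); /* home node wins by default */
	assert(toy_set_affinity(&irq, 0x0C) == 0);  /* off-node but routable: allowed */
	assert(toy_set_affinity(&irq, 0x10) == -1); /* unroutable CPU: refused */
}
```

In this model the node mask never appears in toy_set_affinity(), so a node-bound device only gets a sensible default while a UV-style routing limit is enforced on every write.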

Thread overview: 34+ messages
2009-11-20 21:11 [PATCH v6] x86/apic: limit irq affinity Dimitri Sivanich
2009-11-21 18:49 ` Eric W. Biederman
2009-11-22 1:14 ` Dimitri Sivanich
2009-11-24 13:20 ` Thomas Gleixner
2009-11-24 13:39 ` Peter Zijlstra [this message]
2009-11-24 13:55 ` Thomas Gleixner
2009-11-24 14:50 ` Arjan van de Ven
2009-11-24 17:41 ` Eric W. Biederman
2009-11-24 18:00 ` Peter P Waskiewicz Jr
2009-11-24 18:20 ` Ingo Molnar
2009-11-24 18:27 ` Yinghai Lu
2009-11-24 18:32 ` Peter Zijlstra
2009-11-24 18:59 ` Yinghai Lu
2009-11-24 21:41 ` Dimitri Sivanich
2009-11-24 21:51 ` Thomas Gleixner
2009-11-24 23:06 ` Eric W. Biederman
2009-11-25 1:23 ` Thomas Gleixner
2009-11-24 22:42 ` Eric W. Biederman
2009-11-25 15:40 ` Arjan van de Ven
2009-12-03 16:50 ` Dimitri Sivanich
2009-12-03 16:53 ` Waskiewicz Jr, Peter P
2009-12-03 17:01 ` Dimitri Sivanich
2009-12-03 17:07 ` Waskiewicz Jr, Peter P
2009-12-03 17:19 ` Dimitri Sivanich
2009-12-03 18:50 ` Waskiewicz Jr, Peter P
2009-12-04 16:42 ` Dimitri Sivanich
2009-12-04 21:17 ` Peter P Waskiewicz Jr
2009-12-04 23:12 ` Eric W. Biederman
2009-12-05 10:38 ` Peter P Waskiewicz Jr
2009-12-07 13:44 ` Dimitri Sivanich
2009-12-07 13:39 ` Dimitri Sivanich
2009-12-07 23:28 ` Peter P Waskiewicz Jr
2009-12-08 15:04 ` Dimitri Sivanich
2009-12-11 3:16 ` david