From: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
"peterz@infradead.org" <peterz@infradead.org>,
"arjan@linux.intel.com" <arjan@linux.intel.com>,
"yong.zhang0@gmail.com" <yong.zhang0@gmail.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"arjan@linux.jf.intel.com" <arjan@linux.jf.intel.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [PATCH] irq: Add node_affinity CPU masks for smarter irqbalance hints
Date: Tue, 24 Nov 2009 10:33:21 -0800
Message-ID: <1259087601.2631.56.camel@ppwaskie-mobl2>
In-Reply-To: <4B0C2547.8030408@gmail.com>
On Tue, 2009-11-24 at 10:26 -0800, Eric Dumazet wrote:
> Peter P Waskiewicz Jr wrote:
> That's the kind of thing PJ is trying to make available.
> >
> > Yes, that's exactly what I'm trying to do. Even further, we want to
> > allocate the ring SW struct itself and descriptor structures on other
> > NUMA nodes, and make sure the interrupt lines up with those allocations.
> >
>
> Say you allocate ring buffers on NUMA node of the CPU handling interrupt
> on a particular queue.
>
> If irqbalance or an admin changes /proc/irq/{number}/smp_affinity,
> do you want to reallocate the ring buffer on another NUMA node?
>
That's why I'm trying to add the node_affinity mechanism that irqbalance
can use to prevent the interrupt from being moved to another node.
> It seems complex to me, maybe optimal thing would be to use a NUMA policy to
> spread vmalloc() allocations to all nodes to get a good bandwidth...
That's exactly what we're doing in our 10GbE driver right now (it isn't
pushed upstream yet; we're still finalizing our testing). We spread to all
NUMA nodes in a semi-intelligent fashion when allocating our rings and
buffers. The last piece is ensuring that the interrupt tied to each
queue is routed to a CPU on the NUMA node where that queue's memory was
allocated. irqbalance needs some kind of hint to make sure it does the
right thing, which today it does not.
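The spreading described above can be illustrated with a small userspace sketch. It is not the driver's code: the topology table `NODE_CPUS`, the round-robin policy, and the helper names `cpus_to_mask` and `spread_queues` are all assumptions made for illustration. It only shows how a per-queue CPU mask could be derived from the node each queue was placed on.

```python
# Hypothetical topology (an assumption, not taken from the thread):
# NUMA node id -> CPUs that belong to that node.
NODE_CPUS = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}


def cpus_to_mask(cpus):
    """Return a hex bitmask string with one bit set per CPU number."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def spread_queues(num_queues, node_cpus):
    """Place queues on nodes round-robin (an assumed policy).

    Returns {queue index: (node id, hex CPU mask of that node)}.
    """
    nodes = sorted(node_cpus)
    plan = {}
    for queue in range(num_queues):
        node = nodes[queue % len(nodes)]
        plan[queue] = (node, cpus_to_mask(node_cpus[node]))
    return plan


if __name__ == "__main__":
    for queue, (node, mask) in spread_queues(4, NODE_CPUS).items():
        print(f"queue {queue}: node {node}, cpu mask {mask}")
```

With the assumed two-node topology, even-numbered queues land on node 0 and odd-numbered queues on node 1, and each queue carries the mask of its node's CPUs.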
I don't see how this is complex though. Driver loads, allocates across
the NUMA nodes for optimal throughput, then writes CPU masks for the
NUMA nodes each interrupt belongs to. irqbalance comes along and looks
at the new mask "hint," and then balances that interrupt within that
hinted mask.
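The balancing step can be sketched in the same spirit. This is not irqbalance's algorithm and not the interface from the patch: the helper names `mask_to_cpus` and `pick_cpu`, the per-CPU load table, and the "least loaded CPU wins" rule are assumptions. The only point is that the choice is restricted to CPUs inside the hinted mask.

```python
def mask_to_cpus(mask_hex):
    """Expand a hex CPU mask string into a sorted list of CPU numbers.

    Commas are dropped so that zero-padded, comma-grouped masks
    (for example "00000000,000000f0") parse the same way.
    """
    mask = int(mask_hex.replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if mask & (1 << cpu)]


def pick_cpu(hint_mask_hex, irq_load_per_cpu):
    """Pick the least-loaded CPU inside the hint mask.

    irq_load_per_cpu maps CPU number -> load figure (hypothetical units);
    CPUs missing from the table count as load 0. Ties go to the lowest
    CPU number. Returns None when the hint mask is empty.
    """
    candidates = mask_to_cpus(hint_mask_hex)
    if not candidates:
        return None
    return min(candidates, key=lambda cpu: (irq_load_per_cpu.get(cpu, 0), cpu))


if __name__ == "__main__":
    load = {0: 5, 1: 2, 4: 0, 5: 9}  # made-up load figures
    print(pick_cpu("f", load))   # choice limited to CPUs 0-3
    print(pick_cpu("f0", load))  # choice limited to CPUs 4-7
```

Whatever policy the balancer applies, the interrupt stays on the node the driver allocated from as long as the candidate set is taken from the hint rather than from all online CPUs.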
Cheers,
-PJ