From: Ben Hutchings <bhutchings@solarflare.com>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>,
	netdev@vger.kernel.org, davem@davemloft.net,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: net: Automatic IRQ siloing for network devices
Date: Sun, 17 Apr 2011 19:38:59 +0100	[thread overview]
Message-ID: <1303065539.5282.938.camel@localhost> (raw)
In-Reply-To: <20110417172010.GA3362@neilslaptop.think-freely.org>

On Sun, 2011-04-17 at 13:20 -0400, Neil Horman wrote:
> On Sat, Apr 16, 2011 at 09:17:04AM -0700, Stephen Hemminger wrote:
[...]
> > My gut feeling is that:
> >   * kernel should default to a simple static sane irq policy without user
> >     space.  This is especially true for multi-queue devices where the default
> >     puts all IRQs on one cpu.
> > 
> That's not how it currently works, AFAICS.  The default kernel policy is
> currently that cpu affinity for any newly requested irq is all cpus.  Any
> restriction beyond that is the purview and doing of userspace (irqbalance or
> manual affinity setting).

Right.  Though it may be reasonable for the kernel to use the hint as
the initial affinity for a newly allocated IRQ (not sure quite how we
determine that).

[...]
> >   * irqbalance should not do the hacks it does to try and guess at network traffic.
> > 
> Well, I can certainly agree with that, but I'm not sure what that looks like.
> 
> I could envision something like:
> 
> 1) Use irqbalance to do a one time placement of interrupts, keeping a simple
> (possibly sub-optimal) policy, perhaps something like new irqs get assigned to
> the least loaded cpu within the numa node of the device the irq is originating
> from.
> 
> 2) Add a udev event on the addition of new interrupts, to rerun irqbalance

Yes, making irqbalance more (or entirely) event-driven seems like a good
thing.
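If such an event existed, the rerun could be a single udev rule along these lines.  This is entirely hypothetical: as of this thread, IRQ allocation generates no uevent, and the subsystem name here is invented; only the irqbalance --oneshot mode is real.

```
# Hypothetical rule: rerun irqbalance once when a new IRQ appears.
ACTION=="add", SUBSYSTEM=="irq", RUN+="/usr/sbin/irqbalance --oneshot"
```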

> 3) Add some exported information to identify processes that are high users of
> network traffic, and correlate that usage to a rxq/irq that produces that
> information (possibly some per-task proc file)
> 
> 4) Create/expand an additional user space daemon to monitor the highest users of
> network traffic on various rxq/irqs (as identified in (3)) and restrict those
> processes execution to those cpus which are on the same L2 cache as the irq
> itself.  The cpuset cgroup could be useful in doing this, perhaps.

I just don't see that you're going to get processes associated with
specific RX queues unless you make use of flow steering.

The 128-entry flow hash indirection table is part of Microsoft's
requirements for RSS, so most multiqueue hardware is going to let you do
limited flow steering that way.

> Actually, as I read back to myself, that actually sounds kind of good to me.  It
> keeps all the policy for this in user space, and minimizes what we have to add
> to the kernel to make it happen (some process information in /proc and another
> udev event).  I'd like to get some feedback before I start implementing this,
> but I think this could be done.  What do you think?

I don't think it's a good idea to override the scheduler dynamically
like this.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


Thread overview: 20+ messages
2011-04-15 20:17 net: Automatic IRQ siloing for network devices Neil Horman
2011-04-15 20:17 ` [PATCH 1/3] irq: Add registered affinity guidance infrastructure Neil Horman
2011-04-16  0:22   ` Thomas Gleixner
2011-04-16  2:11     ` Neil Horman
2011-04-15 20:17 ` [PATCH 2/3] net: Add net device irq siloing feature Neil Horman
2011-04-15 22:49   ` Ben Hutchings
2011-04-16  1:49     ` Neil Horman
2011-04-16  4:52       ` Stephen Hemminger
2011-04-16  6:21         ` Eric Dumazet
2011-04-16 11:55           ` Neil Horman
2011-04-15 20:17 ` [PATCH 3/3] net: Adding siloing irqs to cxgb4 driver Neil Horman
2011-04-15 22:54 ` net: Automatic IRQ siloing for network devices Ben Hutchings
2011-04-16  0:50   ` Ben Hutchings
2011-04-16  1:59   ` Neil Horman
2011-04-16 16:17     ` Stephen Hemminger
2011-04-17 17:20       ` Neil Horman
2011-04-17 18:38         ` Ben Hutchings [this message]
2011-04-18  1:08           ` Neil Horman
2011-04-18 21:51             ` Ben Hutchings
2011-04-19  0:52               ` Neil Horman
