From: Andrew Dickinson <andrew@whydna.net>
To: "Brandeburg, Jesse" <jesse.brandeburg@intel.com>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: receive-side performance issue (ixgbe, core-i7, softirq cpu%)
Date: Thu, 28 Jan 2010 22:06:08 -0800
Message-ID: <606676311001282206q113f6bbbq776996b67fd18adb@mail.gmail.com>
In-Reply-To: <alpine.WNT.2.00.1001281135110.360@jbrandeb-desk1.amr.corp.intel.com>

Short response: CONFIG_HPET was the dirty little bastard!

Answering your questions below in case somebody else stumbles across
this thread...

On Thu, Jan 28, 2010 at 4:18 PM, Brandeburg, Jesse
<jesse.brandeburg@intel.com> wrote:
>
>
> On Thu, 28 Jan 2010, Andrew Dickinson wrote:
>> I'm running into some unexpected performance issues.  I say
>> "unexpected" because I was running the same tests on this same box 5
>> months ago and getting very different (and much better) results.
>
>
> can you try turning off cpuspeed service, C-States in BIOS, and GV3 (aka
> speedstep) support in BIOS?

Yup, everything's set to "maximum performance" in my BIOS's vernacular
(HP DL360 G6): no C-states, etc.
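
In case it helps anyone else reproduce the check from userspace,
something like this works (the sysfs paths are standard, but the
cpuspeed service name is distro-specific):

    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor  # want "performance"
    grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/name    # C-states the kernel sees
    service cpuspeed stop                                      # stop userspace scaling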

> Have you upgraded your BIOS since before?

Not that I'm aware of, but our provisioning folks might have done
something crazy.

> I agree you should be able to see better numbers, I suspect that you are
> getting cross-cpu traffic that is limiting your throughput.

That's what I would have suspected as well.

> How many flows are you pushing?

I'm pushing two streams of traffic, one in each direction.  Each
stream is defined as follows:
    North-bound:
        L2: a0a0a0a0a0a0 -> b0b0b0b0b0b0
        L3: RAND(10.0.0.0/16) -> RAND(100.0.0.0/16)
        L4: UDP with random data
    South-bound is the reverse.

    where "RAND(CIDR)" is a random address within that CIDR (I'm using
an hardware traffic generator).
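
(No hardware generator handy?  The in-kernel pktgen can approximate the
north-bound stream; a rough sketch, assuming eth0 is the transmit port
and the kpktgend_0 thread exists:)

    modprobe pktgen
    echo "rem_device_all" > /proc/net/pktgen/kpktgend_0
    echo "add_device eth0" > /proc/net/pktgen/kpktgend_0
    echo "dst_mac b0:b0:b0:b0:b0:b0" > /proc/net/pktgen/eth0
    echo "src_min 10.0.0.0"      > /proc/net/pktgen/eth0
    echo "src_max 10.0.255.255"  > /proc/net/pktgen/eth0
    echo "dst_min 100.0.0.0"     > /proc/net/pktgen/eth0
    echo "dst_max 100.0.255.255" > /proc/net/pktgen/eth0
    echo "flag IPSRC_RND" > /proc/net/pktgen/eth0   # randomize within the ranges
    echo "flag IPDST_RND" > /proc/net/pktgen/eth0
    echo "count 0" > /proc/net/pktgen/eth0          # 0 = run until stopped
    echo "start" > /proc/net/pktgen/pgctrl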

> Another idea is to compile the "perf" tool in the tools/perf directory of
> the kernel and run "perf record -a -- sleep 10" while running at steady
> state.  then show output of perf report to get an idea of which functions
> are eating all the cpu time.
>
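
(For anyone following along, the whole perf workflow from the kernel
source tree is just:)

    make -C tools/perf                       # build the perf tool
    tools/perf/perf record -a -- sleep 10    # sample all CPUs for 10s at steady state
    tools/perf/perf report                   # per-function CPU time, sorted
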
> did you change to the "tickless" kernel?  We've also found that routing
> performance improves dramatically by disabling tickless, disabling kernel
> preemption, and setting HZ=100.  What about CONFIG_HPET?

yes, yes, yes, and no...

I changed CONFIG_HPET to n, rebooted, and retested...

ta-da!
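
For the record, one way to make that change (scripts/config ships with
the kernel source; menuconfig works just as well, and the last command
confirms which clocksource is actually in use after the reboot):

    grep HPET .config                             # see what's currently set
    scripts/config --disable CONFIG_HPET
    make oldconfig && make && make modules_install install
    cat /sys/devices/system/clocksource/clocksource0/current_clocksource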

> You should try the kernel that the scheduler fixes went into (maybe 31?)
> or at least try 2.6.32.6 so you've tried something fully up to date.

I'll give it a whirl :D

>> === Background ===
>>
>> The box is a dual Core i7 box with a pair of Intel 82598EB's.  I'm
>> running 2.6.30 with the in-kernel ixgbe driver.  My tests 5 months ago
>> were using 2.6.30-rc3 (with a tiny patch from David Miller as seen
>> here: http://kerneltrap.org/mailarchive/linux-netdev/2009/4/30/5605924).
>>  The box is configured with both NICs in a bridge; normally I'm doing
>> some packet processing using ebtables, but for the sake of keeping
>> things simple, I'm not doing anything special.. just straight bridging
>> (no ebtables rules, etc).  I'm not running irqbalance and instead
>> pinning my interrupts, one per core.  I've re-read and double checked
>> various settings based on Intel's README (i.e. gso off, tso off, etc).
>>
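
(Concretely, the pinning and offload setup amounts to something like
this; the IRQ numbers are whatever /proc/interrupts shows for your ixgbe
queues, so 50/51/52 below are placeholders:)

    ethtool -K eth0 tso off gso off      # offloads off, per Intel's README
    ethtool -K eth1 tso off gso off
    echo 1 > /proc/irq/50/smp_affinity   # queue 0 -> cpu0 (hex cpumask)
    echo 2 > /proc/irq/51/smp_affinity   # queue 1 -> cpu1
    echo 4 > /proc/irq/52/smp_affinity   # queue 2 -> cpu2, and so on
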
>> In my previous tests, I was able to pass 3+ Mpps regardless of how that
>> was divided across the two NICs (i.e. 3 Mpps all in one direction,
>> 1.5 Mpps in each direction simultaneously, etc.).  Now I can hardly
>> exceed about 750 kpps x 2 (i.e. 750 kpps in each direction), and I
>> can't do more than 750 kpps in one direction even with no traffic in
>> the other direction.
>>
>> Unfortunately, I didn't take very good notes when I did this last time
>> so I don't have my previous .config and I'm not 100% positive I've got
>> identical ethtool settings, etc.  That being said, I've worked through
>> seemingly every combination of factors that I can think of and I'm
>> still unable to see the old performance (NUMA on/off, Hyperthreading
>> on/off, various irq coalescing settings, etc).
>>
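
(The coalescing knobs in question are ethtool's -c/-C; the values below
are illustrative, not what I ran:)

    ethtool -c eth0                # show current interrupt-coalescing settings
    ethtool -C eth0 rx-usecs 125   # roughly 8000 interrupts/sec per vector
    ethtool -C eth0 rx-usecs 0     # interrupt per packet, at higher CPU cost
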
>> I have two identical boxes and they both see the same thing, so a
>> hardware issue seems unlikely.  My next step is to grab 2.6.30-rc3 and
>> see if I can repro the good performance with that kernel again and
>> determine if there was a regression between 2.6.30-rc3 and 2.6.30...
>> but I'm skeptical that that's the issue since I'm sure other people
>> would have noticed this as well.
>>
>>
>> === What I'm seeing ===
>>
>> CPU% (almost entirely softirq time, which is expected) ramps extremely
>> quickly as the packet rate increases.  The following table shows the
>> packet rate on the left ("150 x 2" means 150 kpps in each direction
>> simultaneously) and the CPU utilization (as measured by %si in top)
>> on the right.
>>
>> 150 x 2:   4%
>> 300 x 2:   8%
>> 450 x 2:  18%
>> 483 x 2:  50%
>> 525 x 2:  66%
>> 600 x 2:  85%
>> 750 x 2: 100% (and dropping frames)
>>
>> I _am_ seeing interrupts getting spread nicely across cores, so in the
>> "150 x 2" case, that's about 4% soft-interrupt time on each of the 16
>> cores.  The CPUs are otherwise idle bar a small amount of hardware
>> interrupt time (less than 1%).
>>
>>
>> === Where it gets weird... ===
>>
>> Trying to isolate the problem, I added an ebtables rule to drop
>> everything on the forward chain.  I was expecting to see the CPU
>> utilization drop since I'd no longer be dealing with the TX-side... no
>> change.
>>
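
(That rule, plus counters to confirm frames are actually hitting it:)

    ebtables -A FORWARD -j DROP   # drop everything crossing the bridge
    ebtables -L --Lc              # list rules with packet/byte counters
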
>> I then decided to switch from a bridge to a route-based solution.  I
>> tore down the bridge, enabled ip_forward, setup some IPs and route
>> entries, etc.  Nothing changes.  CPU performance is identical to
>> what's shown above.  Additionally, if I add an iptables drop on
>> FORWARD, the CPU utilization remains unchanged (just like in the
>> bridging case above).
>>
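
(Roughly, with made-up router-side addresses; the connected /16s cover
the test ranges, so no extra route entries are needed:)

    sysctl -w net.ipv4.ip_forward=1
    ip addr add 10.0.254.1/16 dev eth0     # hypothetical addresses
    ip addr add 100.0.254.1/16 dev eth1
    iptables -A FORWARD -j DROP            # the drop test mentioned above
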
>> The point that [I think] I'm driving at is that there's something
>> fishy going on with the receive side of the packets.  I wish I could
>> point to something more specific or a section of code, but I haven't
>> been able to pare this down to anything more granular in my testing.
>>
>>
>> === Questions ===
>>
>> Has anybody seen this before?  If so, what was wrong?
>> Do you have any recommendations on things to try (either as guesses
>> or, even better, to help eliminate possibilities)?
>> And along those lines, can anybody think of any possible reasons for this?
>
> hope the above helped.
>
>> This is so frustrating since I _know_ this hardware is capable of so
>> much more.  It's relatively painless for me to re-run tests in my lab,
>> so feel free to throw something at me that you think will stick :D
>
> last I checked, I recall with 82599 I was pushing ~4.5 million 64 byte
> packets a second (bidirectional, no drop), after disabling irqbalance and
> 16 tx/rx queues set with set_irq_affinity.sh script (available in our
> ixgbe-foo.tar.gz from sourceforge).  82598 should be a bit lower, but
> probably can get close to that number.
>
> I haven't run the test lately, though; at that point I was likely on
> 2.6.30-ish.
>
> Jesse
>

Thank you so much... I wish I'd sent this email out a week ago ;-P

-A
