From: Jeff Garzik <jgarzik@pobox.com>
To: Tim Mattox <tmattox@engr.uky.edu>
Cc: sfeldma@pobox.com, netdev@oss.sgi.com,
bonding-devel@lists.sourceforge.net,
Scott Feldman <scott.feldman@intel.com>
Subject: Re: [PATCH 2.6] e100: use NAPI mode all the time
Date: Sun, 06 Jun 2004 22:33:26 -0400
Message-ID: <40C3D3F6.6010103@pobox.com>
In-Reply-To: <2DF80C45-B825-11D8-9557-000393652100@engr.uky.edu>
Tim Mattox wrote:
> The problem is caused by the order packets are delivered to the TCP
> stack on the receiving machine. In normal round-robin bonding mode,
> the packets are sent out one per NIC in the bond. For simplicity
> sake, lets say we have two NICs in a bond, eth0 and eth1. When
> sending packets, eth0 will handle all the even packets, and eth1 all
> the odd packets. Similarly when receiving, eth0 would get all
> the even packets, and eth1 all the odd packets from a particular
> TCP stream.
>
> With NAPI (or other interrupt mitigation techniques) the
> receiving machine will process multiple packets in a row from a
> single NIC, before getting packets from another NIC. In the
> above example, eth0 would receive packets 0, 2, 4, 6, etc.
> and pass them to the TCP layer. Followed by eth1's
> packets 1, 3, 5, 7, etc. The specific number of out-of-order
> packets received in a row would depend on many factors.
>
> The TCP layer would need to reorder the packets from something
> like 0, 2, 4, 6, 1, 3, 5, 7 or something
> like 0, 2, 4, 1, 3, 5, 6, 7. With many possible variations.
Ethernet drivers have _always_ processed multiple packets per interrupt,
since before the days of NAPI, and before the days of hardware mitigation.
Therefore, this is mainly an argument against using overly simplistic
load balancing schemes that _create_ this problem :) It's much smarter
to load balance based on flows, for example. I think the ALB mode does
this?
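
To illustrate what flow-based balancing means here, a minimal sketch follows. It is not the bonding driver's ALB code or any real hash policy; the `pick_nic` helper, the CRC32 hash, and the example addresses are all assumptions made for illustration. The point is only that every packet of one TCP stream maps to the same NIC, so the links cannot reorder packets within a stream.

```python
import zlib

def pick_nic(flow, num_nics):
    """Map a flow 4-tuple to a NIC index (hypothetical helper).

    All packets sharing the same (src ip, src port, dst ip, dst port)
    get the same index, so one flow never spans two links.
    """
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_nics

# Two made-up TCP flows between the same pair of hosts.
flow_a = ("10.0.0.1", 40000, "10.0.0.2", 80)
flow_b = ("10.0.0.1", 40001, "10.0.0.2", 80)

# Four packets per flow, tagged with a per-flow sequence number.
packets = [(flow_a, seq) for seq in range(4)] + [(flow_b, seq) for seq in range(4)]

# Distribute packets across a two-NIC bond by flow, not by packet.
per_nic = {0: [], 1: []}
for flow, seq in packets:
    per_nic[pick_nic(flow, 2)].append((flow, seq))
```

The trade-off is that a single flow is limited to the speed of one link; only multiple flows spread across the bond.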
You appear to be making the incorrect assumption that packets sent in
this simplistic, round-robin manner could ever _hope_ to arrive in-order
at the destination. Any number of things serve to gather packets into
bursts: net stack TX queue, hardware DMA ring, hardware FIFO, remote
h/w FIFO, remote hardware DMA ring, remote softirq.
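
A toy model of the effect, as a rough sketch only: packets are striped round-robin across the NICs, and the receiver drains up to `batch` packets from each NIC in turn. The function names and the single `batch` parameter are assumptions for illustration; real burst sizes vary with every queue listed above, and nothing here models the actual driver or NAPI code.

```python
def round_robin_tx(num_packets, num_nics):
    """Stripe sequence numbers 0..num_packets-1 across NICs, one per NIC in turn."""
    queues = [[] for _ in range(num_nics)]
    for seq in range(num_packets):
        queues[seq % num_nics].append(seq)
    return queues

def batched_rx(queues, batch):
    """Drain up to `batch` packets from each NIC in turn until all queues are empty."""
    queues = [list(q) for q in queues]  # work on copies, leave the input intact
    delivered = []
    while any(queues):
        for q in queues:
            delivered.extend(q[:batch])
            del q[:batch]
    return delivered

queues = round_robin_tx(8, 2)
print(batched_rx(queues, 1))  # [0, 1, 2, 3, 4, 5, 6, 7]
print(batched_rx(queues, 3))  # [0, 2, 4, 1, 3, 5, 6, 7]
print(batched_rx(queues, 4))  # [0, 2, 4, 6, 1, 3, 5, 7]
```

Only a burst size of one preserves the order. Any larger burst, whatever its source, reproduces the orderings quoted above.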
> I don't want to slow the progress of Linux networking development.
> I was objecting to the removal of a feature to e100 that already has
> working code and that was, AFAIK, necessary for the performance
> enhancement of bonding.
No, just don't use a bonding mode that kills performance. It has
nothing to do with NAPI.
As I said, ethernet drivers have been processing runs of packets per irq
/ softirq for ages and ages. This isn't new with NAPI, to be sure.
> I have NO problems with NAPI itself, I think it's a wonderful development.
> I would even advocate for making NAPI the default across the board.
> But for bonding, until I see otherwise, I want to be able to not use NAPI.
> As I indicated, I will have a new cluster that I can directly test this
> NAPI vs Bonding issue very soon.
As Scott indicated, people use bonding with tg3 (unconditional NAPI) all
the time.
Further, I hope you're not doing something silly like trying to load
balance on the _same_ ethernet. If you are, that's a signal that deeper
problems exist -- you should be able to do wire speed with one NIC.
Jeff