From: Jan-Bernd Themann <ossthema@de.ibm.com>
To: "Jörn Engel" <joern@logfs.org>
Cc: David Miller <davem@davemloft.net>,
Christoph Raisch <raisch@de.ibm.com>,
Jan-Bernd Themann <themann@de.ibm.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-ppc <linuxppc-dev@ozlabs.org>,
Marcus Eder <meder@de.ibm.com>, Thomas Klein <tklein@de.ibm.com>,
netdev <netdev@vger.kernel.org>,
Andrew Gallatin <gallatin@myri.com>,
Jeff Garzik <jeff@garzik.org>,
Stefan Roscher <stefan.roscher@de.ibm.com>
Subject: Re: [PATCH 1/1] lro: Generic Large Receive Offload for TCP traffic
Date: Mon, 6 Aug 2007 09:51:11 +0200
Message-ID: <200708060951.12408.ossthema@de.ibm.com>
In-Reply-To: <20070803134150.GH19344@lazybastard.org>
Hi Jörn,
On Friday 03 August 2007 15:41, Jörn Engel wrote:
> On Fri, 3 August 2007 14:41:19 +0200, Jan-Bernd Themann wrote:
> >
> > This patch provides generic Large Receive Offload (LRO) functionality
> > for IPv4/TCP traffic.
> >
> > LRO combines received tcp packets to a single larger tcp packet and
> > passes them then to the network stack in order to increase performance
> > (throughput). The interface supports two modes: Drivers can either pass
> > SKBs or fragment lists to the LRO engine.
>
> Maybe this is a stupid question, but why is LRO done at the device
> driver level?
>
> If it is a universal performance benefit, I would have expected it to be
> done generically, i.e. have all packets moved into network layer pass
> through LRO instead.
The driver seems to be the right place:
- There is the "page mode" interface that accepts fragment lists instead of
  SKBs and generates SKBs only at the end (see Andrew Gallatin's
  mails where he described the advantages of this approach)
- Some drivers (in particular for 10G NICs, which actually could benefit
  from LRO) have multiple HW receive queues that do some sort of sorting,
  thus using one lro_mgr per queue increases the likelihood of being able
  to do efficient LRO.
> > +void lro_flush_pkt(struct net_lro_mgr *lro_mgr,
> > + struct iphdr *iph, struct tcphdr *tcph);
> In particular this bit looks like it should be driven by a timeout,
> which would be settable via /proc/sys/net/core/lro_timeout or similar.
No, this function is needed for "page mode": some HW provides extra
handling for small packets, storing them in extra queues instead of in
preallocated pages. The driver therefore needs a way to flush old sessions
for this connection and handle these packets in a different way (for example,
create an SKB and copy the data there).

Timeouts are not used at all. Experiments showed that flushing at the end
of a NAPI poll round seems to be sufficient (see Andrew's test results)
and does not affect the latency too badly.
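
To make the two flush points concrete, here is a minimal userspace sketch.
It is a toy model, not the code from the patch: every toy_* name, the
session table size and the byte counters are assumptions made up for
illustration, and "delivering" a packet only bumps a counter. Flushing the
connection before delivering the small packet reflects my reading that the
merged data should reach the stack first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_SESSIONS 8  /* assumed table size, not from the patch */

/* Hypothetical flow key; a real engine would match on the IP/TCP headers. */
struct toy_flow {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

struct toy_session {
    int active;
    struct toy_flow flow;
    unsigned int bytes;            /* payload merged so far */
};

struct toy_lro_mgr {
    struct toy_session sessions[TOY_MAX_SESSIONS];
    unsigned int delivered_pkts;   /* packets handed to the "stack" */
    unsigned int delivered_bytes;
};

void toy_init(struct toy_lro_mgr *mgr)
{
    size_t i;

    assert(mgr != NULL);
    for (i = 0; i < TOY_MAX_SESSIONS; i++)
        mgr->sessions[i].active = 0;
    mgr->delivered_pkts = 0;
    mgr->delivered_bytes = 0;
}

int toy_same_flow(const struct toy_flow *a, const struct toy_flow *b)
{
    return a->saddr == b->saddr && a->daddr == b->daddr &&
           a->sport == b->sport && a->dport == b->dport;
}

/* Stand-in for passing one packet up to the network stack. */
void toy_deliver(struct toy_lro_mgr *mgr, unsigned int bytes)
{
    mgr->delivered_pkts++;
    mgr->delivered_bytes += bytes;
}

/* Deliver one merged packet for an active session and free the slot. */
void toy_flush_session(struct toy_lro_mgr *mgr, struct toy_session *s)
{
    if (!s->active)
        return;
    toy_deliver(mgr, s->bytes);
    s->active = 0;
    s->bytes = 0;
}

/* Flush only the session of one connection (the role lro_flush_pkt plays). */
void toy_flush_flow(struct toy_lro_mgr *mgr, const struct toy_flow *flow)
{
    size_t i;

    for (i = 0; i < TOY_MAX_SESSIONS; i++) {
        struct toy_session *s = &mgr->sessions[i];

        if (s->active && toy_same_flow(&s->flow, flow))
            toy_flush_session(mgr, s);
    }
}

/* Flush everything; the driver would call this at the end of a poll round. */
void toy_flush_all(struct toy_lro_mgr *mgr)
{
    size_t i;

    for (i = 0; i < TOY_MAX_SESSIONS; i++)
        toy_flush_session(mgr, &mgr->sessions[i]);
}

/* Normal path: merge the segment into its session, or open a new one. */
void toy_receive_page(struct toy_lro_mgr *mgr, const struct toy_flow *flow,
                      unsigned int len)
{
    struct toy_session *free_slot = NULL;
    size_t i;

    for (i = 0; i < TOY_MAX_SESSIONS; i++) {
        struct toy_session *s = &mgr->sessions[i];

        if (s->active && toy_same_flow(&s->flow, flow)) {
            s->bytes += len;
            return;
        }
        if (!s->active && free_slot == NULL)
            free_slot = s;
    }
    if (free_slot == NULL) {       /* table full: deliver unmerged */
        toy_deliver(mgr, len);
        return;
    }
    free_slot->active = 1;
    free_slot->flow = *flow;
    free_slot->bytes = len;
}

/* Small packet from the extra HW queue: flush the connection, then deliver. */
void toy_receive_small(struct toy_lro_mgr *mgr, const struct toy_flow *flow,
                       unsigned int len)
{
    toy_flush_flow(mgr, flow);
    toy_deliver(mgr, len);
}
```

In this model a driver poll routine would call toy_receive_page() or
toy_receive_small() per received packet and toy_flush_all() once before
returning, so no session outlives a poll round and no timer is needed.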
Regards,
Jan-Bernd