From: Ben Hutchings <bhutchings@solarflare.com>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>, netdev@vger.kernel.org
Subject: Re: [PATCH 3/8] net: Add Generic Receive Offload infrastructure
Date: Fri, 12 Dec 2008 22:25:24 +0000
Message-ID: <1229120724.3051.61.camel@achroite>
In-Reply-To: <E1LB0cx-0000r3-N2@gondolin.me.apana.org.au>

On Fri, 2008-12-12 at 16:31 +1100, Herbert Xu wrote:
[...]
> Whenever the skb is merged into an existing entry, the gro_receive
> function should set NAPI_GRO_CB(skb)->same_flow. Note that if an skb
> merely matches an existing entry but can't be merged with it, then
> this shouldn't be set.
So why not call this field "merged"?
[...]
> Once gro_receive has determined that the new skb matches a held packet,
> the held packet may be processed immediately if the new skb cannot be
> merged with it. In this case gro_receive should return the pointer to
> the existing skb in gro_list. Otherwise the new skb should be merged into
> the existing packet and NULL should be returned, unless the new skb makes
> it impossible for any further merges to be made (e.g., FIN packet) where
> the merged skb should be returned.
This belongs in a kernel-doc comment, not in the commit message.
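For illustration only, here is a minimal user-space sketch of how the return contract quoted above might read as a kernel-doc comment on a callback. Nothing here is taken from the patch: `struct pkt`, `toy_gro_receive()` and all field names are hypothetical stand-ins for `struct sk_buff` and the real `gro_receive` hook, and the merge rule (contiguous sequence numbers) is an assumption chosen only to exercise the three return cases.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff; not the kernel type. */
struct pkt {
	struct pkt *next;
	unsigned int flow;	/* flow identifier, assumed precomputed */
	unsigned int seq;	/* sequence number of the first byte */
	unsigned int len;	/* payload length */
	int fin;		/* non-zero if no further merge is possible */
	int same_flow;		/* set on the new packet when it was merged */
};

/**
 * toy_gro_receive - try to merge a new packet into a held one
 * @head: list of currently held packets
 * @skb: newly received packet
 *
 * Sets @skb->same_flow only if @skb was actually merged.
 *
 * Returns NULL if @skb was merged and further merges remain possible,
 * or if no held packet matches.  Returns the list link of the held
 * packet if that packet must be processed now, either because @skb
 * matched but could not be merged, or because the merge ended the flow.
 */
struct pkt **toy_gro_receive(struct pkt **head, struct pkt *skb)
{
	struct pkt **pp;

	skb->same_flow = 0;
	for (pp = head; *pp; pp = &(*pp)->next) {
		struct pkt *p = *pp;

		if (p->flow != skb->flow)
			continue;
		/* Matches a held packet but is not contiguous: flush it. */
		if (p->seq + p->len != skb->seq)
			return pp;
		/* Contiguous: merge and record that fact on the new packet. */
		p->len += skb->len;
		skb->same_flow = 1;
		/* A FIN rules out further merges, so flush the merged packet. */
		return skb->fin ? pp : NULL;
	}
	return NULL;	/* no match; the caller may hold skb as a new entry */
}
```

In this model a caller would unlink and deliver `*pp` whenever the return value is non-NULL, and would free or hold the new packet depending on `same_flow`.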
[...]
> Currently held packets are stored in a singly linked list just like LRO.
> The list is limited to a maximum of 8 entries. In future, this may be
> expanded to use a hash table to allow more flows to be held for merging.
We used a hash table in our own soft-LRO, used in out-of-tree driver
releases. This certainly improved performance in many-to-one
benchmarks. How much it matters in real applications, I'm less sure.
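As a rough sketch of the idea, held packets could be bucketed by a hash of the flow key so that a lookup walks one short chain instead of the whole list. This is a user-space model under stated assumptions: the types, the function names, the bucket count and the hash mix are all hypothetical, and it is not the scheme used in the sfc driver or proposed in the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define FLOW_BUCKETS 16		/* assumed size; must be a power of two */

/* Hypothetical IPv4/TCP flow key. */
struct flow_key {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/* Hypothetical held-packet entry, chained per bucket. */
struct held {
	struct held *next;
	struct flow_key key;
};

struct flow_table {
	struct held *bucket[FLOW_BUCKETS];
};

/* Fold the key into a bucket index; the mixing constant is illustrative. */
unsigned int flow_hash(const struct flow_key *k)
{
	uint32_t h = k->saddr ^ k->daddr ^
		     (((uint32_t)k->sport << 16) | k->dport);

	h *= 0x9e3779b1u;
	return (h >> 16) & (FLOW_BUCKETS - 1);
}

static int flow_equal(const struct flow_key *a, const struct flow_key *b)
{
	return a->saddr == b->saddr && a->daddr == b->daddr &&
	       a->sport == b->sport && a->dport == b->dport;
}

/* Walk only the chain the key hashes to; NULL if the flow is not held. */
struct held *flow_lookup(struct flow_table *t, const struct flow_key *k)
{
	struct held *p;

	for (p = t->bucket[flow_hash(k)]; p; p = p->next)
		if (flow_equal(&p->key, k))
			return p;
	return NULL;
}

/* Push a new entry onto the front of its bucket's chain. */
void flow_insert(struct flow_table *t, struct held *h)
{
	unsigned int i = flow_hash(&h->key);

	h->next = t->bucket[i];
	t->bucket[i] = h;
}
```

With many concurrent senders, the cost of a lookup then depends on chain length rather than on the total number of held flows, which is where the many-to-one benefit would be expected to come from.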
[...]
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 4388e27..5e5132c 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
[...]
> +int napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
> +{
> + struct sk_buff **pp;
> + struct packet_type *ptype;
> + __be16 type = skb->protocol;
> + struct list_head *head = &ptype_base[ntohs(type) & PTYPE_HASH_MASK];
Are you intending for the VLAN driver to call napi_gro_receive()? If
not, I think this should treat VLAN tags as part of the MAC header.
Not every NIC separates them out!
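To show what treating the tag as part of the MAC header could mean, here is a small user-space sketch that computes the header length and the encapsulated protocol for a raw Ethernet frame. It assumes at most one in-line 802.1Q tag (TPID 0x8100); the function and macro names are hypothetical and this is not code from the patch or the VLAN driver.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN_PLAIN	14	/* dst MAC + src MAC + EtherType */
#define VLAN_TAG_LEN	4	/* TPID + TCI */
#define ETHERTYPE_8021Q	0x8100

/*
 * Return the MAC header length of the frame, counting one in-line
 * 802.1Q tag if present, and store the encapsulated EtherType (host
 * byte order) in *proto.  Returns 0 if the frame is too short.
 */
size_t mac_header_len(const uint8_t *frame, size_t len, uint16_t *proto)
{
	size_t hlen = ETH_HLEN_PLAIN;
	uint16_t type;

	if (len < hlen)
		return 0;
	type = (uint16_t)(frame[12] << 8 | frame[13]);
	if (type == ETHERTYPE_8021Q) {
		/* The real EtherType follows the 4-byte tag. */
		hlen += VLAN_TAG_LEN;
		if (len < hlen)
			return 0;
		type = (uint16_t)(frame[16] << 8 | frame[17]);
	}
	*proto = type;
	return hlen;
}
```

With this approach the handler lookup would key on the encapsulated protocol, and the flow comparison over the MAC header would naturally include the VLAN ID because the tag falls inside the compared bytes.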
> + int count = 0;
> + int mac_len;
> +
> + if (!(skb->dev->features & NETIF_F_GRO))
> + goto normal;
> +
> + rcu_read_lock();
> + list_for_each_entry_rcu(ptype, head, list) {
> + struct sk_buff *p;
> +
> + if (ptype->type != type || ptype->dev || !ptype->gro_receive)
> + continue;
> +
> + skb_reset_network_header(skb);
> + mac_len = skb->network_header - skb->mac_header;
> + skb->mac_len = mac_len;
> + NAPI_GRO_CB(skb)->same_flow = 0;
> + NAPI_GRO_CB(skb)->flush = 0;
> + for (p = napi->gro_list; p; p = p->next) {
> + count++;
> + NAPI_GRO_CB(p)->same_flow =
> + p->mac_len == mac_len &&
> + !memcmp(skb_mac_header(p), skb_mac_header(skb),
> + mac_len);
> + NAPI_GRO_CB(p)->flush = 0;
Is this assignment to flush really necessary? Surely any skb on the
gro_list with flush == 1 gets removed before the next call to
napi_gro_receive()?
> + }
> +
> + pp = ptype->gro_receive(&napi->gro_list, skb);
> + break;
> +
> + }
> + rcu_read_unlock();
> +
> + if (&ptype->list == head)
> + goto normal;
The above loop is unclear because most of the body is supposed to run at
most once; I would suggest writing the loop and the failure case as:

	rcu_read_lock();
	list_for_each_entry_rcu(ptype, head, list)
		if (ptype->type == type && !ptype->dev && ptype->gro_receive)
			break;
	if (&ptype->list == head) {
		rcu_read_unlock();
		goto normal;
	}

and then moving the rest of the loop body after this.
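The search-then-test shape relies on the circular-list idiom where a cursor that runs off the end points back at the list head. A self-contained user-space model of that idiom follows; `struct handler`, `find_gro_handler()` and the minimal list helpers are hypothetical re-creations for illustration, not the kernel's list.h or the patch's code, and RCU is left out entirely.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct packet_type. */
struct handler {
	struct list_head list;
	unsigned short type;
	int has_gro;		/* models ptype->gro_receive != NULL */
};

void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/*
 * Search first, then test the sentinel: if the loop ran to completion,
 * &h->list is the head itself and no handler matched.
 */
struct handler *find_gro_handler(struct list_head *head, unsigned short type)
{
	struct handler *h;

	for (h = container_of(head->next, struct handler, list);
	     &h->list != head;
	     h = container_of(h->list.next, struct handler, list))
		if (h->type == type && h->has_gro)
			break;

	if (&h->list == head)
		return NULL;	/* corresponds to "goto normal" */
	return h;		/* the rest of the old loop body runs on this */
}
```

Separating the search from the work makes it plain that the per-packet processing happens at most once, for the single handler that was found.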
The inet_lro code accepts either skbs or pages and the sfc driver takes
advantage of this: so long as most packets can be coalesced by LRO, it's
cheaper to allocate page buffers in advance and then attach them to skbs
during LRO. I think you should support the use of page buffers.
Obviously it adds complexity but there's a real performance benefit.
(Alternatively you could work out how to make skb allocation cheaper, and
everyone would be happy!)
[...]
> +void netif_napi_del(struct napi_struct *napi)
> +{
> + struct sk_buff *skb, *next;
> +
> + list_del(&napi->dev_list);
> +
> + for (skb = napi->gro_list; skb; skb = next) {
> + next = skb->next;
> + skb->next = NULL;
> + kfree_skb(skb);
> + }
[...]
Shouldn't the list already be empty at this point?
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.