From: Alexander Duyck <alexander.h.duyck@intel.com>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
David Miller <davem@davemloft.net>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Wei Liu (Intern)" <wei.liu2@citrix.com>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH 05/10] net: move destructor_arg to the front of sk_buff.
Date: Wed, 11 Apr 2012 09:31:38 -0700 [thread overview]
Message-ID: <4F85B1EA.9000600@intel.com> (raw)
In-Reply-To: <1334131241.12209.199.camel@dagon.hellion.org.uk>
On 04/11/2012 01:00 AM, Ian Campbell wrote:
> On Tue, 2012-04-10 at 20:15 +0100, Alexander Duyck wrote:
>> On 04/10/2012 11:41 AM, Eric Dumazet wrote:
>>> On Tue, 2012-04-10 at 11:33 -0700, Alexander Duyck wrote:
>>>
>>>> Have you checked this for 32 bit as well as 64? Based on my math your
>>>> next patch will still mess up the memset on 32 bit with the structure
>>>> being split somewhere just in front of hwtstamps.
>>>>
>>>> Why not just take frags and move it to the start of the structure? It
>>>> is already an unknown value because it can be either 16 or 17 depending
>>>> on the value of PAGE_SIZE, and since you are making changes to frags the
>>>> changes wouldn't impact the alignment of the other values later on since
>>>> you are aligning the end of the structure. That way you would be
>>>> guaranteed that all of the fields that will be memset would be in the
>>>> last 64 bytes.
>>>>
>>> Now when a fragmented packet is copied in pskb_expand_head(), you access
>>> two separate zones of memory to copy the shinfo. But its supposed to be
>>> slow path.
>>>
>>> Problem with this is that the offsets of often used fields will be big
>>> (instead of being < 127) and code will be bigger on x86.
>> Actually now that I think about it my concerns go much further than the
>> memset. I'm convinced that this is going to cause a pretty significant
>> performance regression on multiple drivers, especially on non x86_64
>> architecture. What we have right now on most platforms is a
>> skb_shared_info structure in which everything up to and including frag 0
>> is all in one cache line. This gives us pretty good performance for igb
>> and ixgbe since that is our common case when jumbo frames are not
>> enabled is to split the head and place the data in a page.
> With all the changes in this series it is still possible to fit a
> maximum standard MTU frame and the shinfo on the same 4K page while also
> have the skb_shared_info up to and including frag [0] aligned to the
> same 64 byte cache line.
>
> The only exception is destructor_arg on 64 bit which is on the preceding
> cache line but that is not a field used in any hot path.
The problem I have is that this is only true on x86_64. Proper work
hasn't been done to guarantee this on any other architectures.
I think what I would like to see is instead of just setting things up
and hoping it comes out cache aligned on nr_frags why not take steps to
guarantee it? You could do something like place and size the structure
based on:
SKB_DATA_ALIGN(sizeof(skb_shared_info) - offsetof(struct
skb_shared_info, nr_frags)) + offsetof(struct skb_shared_info, nr_frags)
That way you would have your alignment still guaranteed based off of the
end of the structure, but anything placed before nr_frags would be
placed on the end of the previous cache line.
>> However the change being recommend here only resolves the issue for one
>> specific architecture, and that is what I don't agree with. What we
>> need is a solution that also works for 64K pages or 32 bit pointers and
>> I am fairly certain this current solution does not.
> I think it does work for 32 bit pointers. What issue to do you see with
> 64K pages?
>
> Ian.
With 64K pages the MAX_SKB_FRAGS value drops from 17 to 16. That will
undoubtedly mess up the alignment.
Thanks,
Alex
next prev parent reply other threads:[~2012-04-11 16:31 UTC|newest]
Thread overview: 70+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-10 14:26 [PATCH v4 0/10] skb paged fragment destructors Ian Campbell
2012-04-10 14:26 ` [PATCH 01/10] net: add and use SKB_ALLOCSIZE Ian Campbell
2012-04-10 14:26 ` Ian Campbell
2012-04-10 14:57 ` Eric Dumazet
2012-04-10 14:57 ` Eric Dumazet
2012-04-10 14:26 ` [PATCH 02/10] net: Use SKB_WITH_OVERHEAD in build_skb Ian Campbell
2012-04-10 14:26 ` Ian Campbell
2012-04-10 14:58 ` Eric Dumazet
2012-04-10 14:58 ` Eric Dumazet
2012-04-10 14:26 ` [PATCH 03/10] chelsio: use SKB_WITH_OVERHEAD Ian Campbell
2012-04-10 14:26 ` Ian Campbell
2012-04-10 14:59 ` Eric Dumazet
2012-04-10 14:59 ` Eric Dumazet
2012-04-10 14:26 ` [PATCH 04/10] net: pad skb data and shinfo as a whole rather than individually Ian Campbell
2012-04-10 14:26 ` Ian Campbell
2012-04-10 15:01 ` Eric Dumazet
2012-04-10 15:01 ` Eric Dumazet
2012-04-10 14:26 ` [PATCH 05/10] net: move destructor_arg to the front of sk_buff Ian Campbell
2012-04-10 15:05 ` Eric Dumazet
2012-04-10 15:19 ` Ian Campbell
2012-04-10 15:19 ` Ian Campbell
2012-04-10 15:05 ` Eric Dumazet
2012-04-10 18:33 ` Alexander Duyck
2012-04-10 18:33 ` Alexander Duyck
2012-04-10 18:41 ` Eric Dumazet
2012-04-10 19:15 ` Alexander Duyck
2012-04-10 19:15 ` Alexander Duyck
2012-04-11 8:00 ` Ian Campbell
2012-04-11 16:31 ` Alexander Duyck [this message]
2012-04-11 17:00 ` Ian Campbell
2012-04-11 17:00 ` Ian Campbell
2012-04-11 16:31 ` Alexander Duyck
2012-04-11 8:00 ` Ian Campbell
2012-04-11 8:20 ` Eric Dumazet
2012-04-11 8:20 ` Eric Dumazet
2012-04-11 16:05 ` Alexander Duyck
2012-04-11 16:05 ` Alexander Duyck
2012-04-10 18:41 ` Eric Dumazet
2012-04-11 7:56 ` Ian Campbell
2012-04-11 7:56 ` Ian Campbell
2012-04-10 14:26 ` Ian Campbell
2012-04-10 14:26 ` [PATCH 06/10] net: add support for per-paged-fragment destructors Ian Campbell
2012-04-10 14:26 ` Ian Campbell
2012-04-26 20:44 ` [Xen-devel] " Konrad Rzeszutek Wilk
2012-04-26 20:44 ` Konrad Rzeszutek Wilk
2012-04-10 14:26 ` [PATCH 07/10] net: only allow paged fragments with the same destructor to be coalesced Ian Campbell
2012-04-10 20:11 ` Ben Hutchings
2012-04-10 20:11 ` Ben Hutchings
2012-04-11 7:45 ` Ian Campbell
2012-04-11 7:45 ` Ian Campbell
2012-04-10 14:26 ` Ian Campbell
2012-04-10 14:26 ` [PATCH 08/10] net: add skb_orphan_frags to copy aside frags with destructors Ian Campbell
2012-04-10 14:26 ` Ian Campbell
[not found] ` <1334067965.5394.22.camel-o4Be2W7LfRlXesXXhkcM7miJhflN2719@public.gmane.org>
2012-04-10 14:26 ` [PATCH 09/10] net: add paged frag destructor support to kernel_sendpage Ian Campbell
2012-04-10 14:26 ` [Ocfs2-devel] " Ian Campbell
2012-04-10 14:26 ` Ian Campbell
2012-04-10 14:26 ` Ian Campbell
2012-04-10 14:26 ` [PATCH 10/10] sunrpc: use SKB fragment destructors to delay completion until page is released by network stack Ian Campbell
2012-04-10 14:26 ` Ian Campbell
2012-04-10 14:26 ` Ian Campbell
2012-04-10 14:58 ` [PATCH v4 0/10] skb paged fragment destructors Michael S. Tsirkin
2012-04-10 14:58 ` Michael S. Tsirkin
2012-04-10 15:00 ` Michael S. Tsirkin
2012-04-10 15:00 ` Michael S. Tsirkin
2012-04-10 15:46 ` Bart Van Assche
2012-04-10 15:46 ` Bart Van Assche
2012-04-10 15:50 ` Ian Campbell
2012-04-11 10:02 ` Bart Van Assche
2012-04-11 10:02 ` Bart Van Assche
2012-04-10 15:50 ` Ian Campbell
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F85B1EA.9000600@intel.com \
--to=alexander.h.duyck@intel.com \
--cc=Ian.Campbell@citrix.com \
--cc=davem@davemloft.net \
--cc=eric.dumazet@gmail.com \
--cc=mst@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=wei.liu2@citrix.com \
--cc=xen-devel@lists.xen.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.