From: Alexander Duyck <alexander.h.duyck@intel.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>,
davem@davemloft.net, netdev@vger.kernel.org, gospo@redhat.com,
sassmann@redhat.com
Subject: Re: [net-next 02/11] ixgbe: Mask off check of frag_off as we only want fragment offset
Date: Fri, 12 Apr 2013 11:22:28 -0700
Message-ID: <516850E4.8020504@intel.com>
In-Reply-To: <1365785511.4459.46.camel@edumazet-glaptop>
On 04/12/2013 09:51 AM, Eric Dumazet wrote:
> On Fri, 2013-04-12 at 09:38 -0700, Alexander Duyck wrote:
>> On 04/12/2013 06:45 AM, Eric Dumazet wrote:
>>> On Fri, 2013-04-12 at 06:28 -0700, Eric Dumazet wrote:
>>>
>>>> I wonder if you could use core functions instead of all this...
>>>>
>>>> A simple wrapper would be :
>>> Or more something like :
>>>
>>> static noinline unsigned int ixgbe_get_headlen(unsigned char *data,
>>>                                                u32 maxlen)
>>> {
>>>         struct sk_buff fake;
>>>         unsigned int res;
>>>
>>>         if (maxlen < ETH_HLEN)
>>>                 return maxlen;
>>>
>>>         fake.data = data + ETH_HLEN;
>>>         fake.head = data;
>>>         fake.data_len = 0;
>>>         fake.len = maxlen - ETH_HLEN;
>>>         skb_reset_network_header(&fake);
>>>         res = __skb_get_poff(&fake);
>>>         return res ? res + ETH_HLEN : maxlen;
>>> }
>> The problem is this is way more than I need, and I would prefer not to
>> allocate a 192+ byte structure on the stack in order to just parse a
>> header that is likely less than 128 bytes.
>>
> That's why I used the 'noinline' keyword.
>
> Your code adds significant icache pressure and latencies.
The footprint of the code itself is not that large. The behavior also
differs enough from skb_flow_dissect, which is what __skb_get_poff
relies on, that I don't think I could get the same result without
adding at least one more protocol (FCoE) and probably some sort of
flag. In our case we want the header length to include the L4 header
for the first frame of a fragmented flow, since the goal is to leave
only payload data in the pages.
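
As a rough illustration of the behavior I mean, here is a userspace
sketch, not the driver code. The names sketch_get_headlen, sk_be16 and
the SK_* constants are made up for this example, and it only handles
Ethernet II, IPv4, TCP and UDP (no VLAN, IPv6 or FCoE). The point is
that only the 13 offset bits of frag_off are tested, so a first
fragment (MF set, offset 0) still gets its L4 header counted:

```c
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical and exist only for this sketch. */
#define SK_ETH_HLEN    14      /* Ethernet II header length */
#define SK_ETH_P_IP    0x0800  /* EtherType for IPv4 */
#define SK_IP_OFFMASK  0x1FFF  /* low 13 bits of frag_off: fragment offset */
#define SK_IPPROTO_TCP 6
#define SK_IPPROTO_UDP 17

/* Read a big-endian 16-bit value from a byte buffer. */
static uint16_t sk_be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Return the length of the headers found in a linear buffer, or maxlen
 * if the buffer ends before the headers do. */
static size_t sketch_get_headlen(const unsigned char *data, size_t maxlen)
{
    size_t off = SK_ETH_HLEN;
    size_t ihl, thl;
    unsigned char proto;

    if (maxlen < SK_ETH_HLEN)
        return maxlen;
    if (sk_be16(data + 12) != SK_ETH_P_IP)
        return off;                 /* unknown L3: stop after Ethernet */

    if (maxlen - off < 20)
        return maxlen;              /* truncated IPv4 header */
    ihl = (size_t)(data[off] & 0x0F) * 4;
    if (ihl < 20)
        return off;                 /* malformed IHL: stop after Ethernet */
    if (maxlen - off < ihl)
        return maxlen;              /* truncated IPv4 options */
    proto = data[off + 9];

    /* Mask off the flag bits (DF/MF) so only the offset is tested.
     * A non-zero offset means a later fragment with no L4 header. */
    if (sk_be16(data + off + 6) & SK_IP_OFFMASK)
        return off + ihl;
    off += ihl;

    if (proto == SK_IPPROTO_TCP) {
        if (maxlen - off < 13)
            return maxlen;          /* cannot read the TCP data offset */
        thl = (size_t)(data[off + 12] >> 4) * 4;
        if (thl < 20)
            return off;             /* malformed TCP header length */
        off += thl;
    } else if (proto == SK_IPPROTO_UDP) {
        off += 8;
    }

    return off < maxlen ? off : maxlen;
}
```

A caller would pass the start of the receive buffer and the number of
valid bytes, then copy that many bytes into the linear area and leave
the rest in the page.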
>> I could probably do something like create a copy of the
>> ixgbe_get_headlen function, maybe named something like
>> etherdev_get_headlen and stored in eth.c that could be used by both igb
>> and ixgbe. That way it would be available for anyone else who might
>> want to do something similar. If that would work for you I could
>> probably submit that patch sometime in the next few hours.
> No please don't do that.
>
> I suggested reusing stuff, not duplicating it.
>
> The main problem is not the cpu cycles spent to parse the header, but
> bringing two cache lines for the memcpy() to pull headers. (TCP uses 66
> bytes of headers)
>
> If you use a prefetch(data + 64), chances are good the current generic
> code will run before hitting the memory stall
The main problem I have with this is that we would have to populate a
number of fields within the fake skb before the parsing could be
completed. It also assumes that nobody will ever add anything that
requires other fields to be set or cleared within the skb, since the
fields in fake are not zeroed by memset the way they are in a standard
skb. That type of issue would be a pain to debug.

For example the code snippet you sent likely wouldn't have worked
because it appears to have missed that it would also need to set
skb->protocol before being called.
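
To make the concern concrete, here is a minimal userspace sketch of
what a safer setup would have to do. struct mock_skb and mock_skb_init
are hypothetical stand-ins invented for this example; they are not the
kernel's struct sk_buff or any existing helper. The idea is simply to
zero the whole structure first and then fill in every field the parser
reads, including the protocol:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the few fields discussed in this thread.
 * This is NOT the kernel's struct sk_buff. */
struct mock_skb {
    unsigned char *head;      /* start of the buffer */
    unsigned char *data;      /* start of the network header */
    unsigned int len;         /* bytes available from data onward */
    unsigned int data_len;    /* bytes held in fragments (none here) */
    uint16_t protocol;        /* EtherType, left in network byte order */
};

/* Fill in a mock skb that describes a linear Ethernet frame.
 * Returns 0 on success, -1 if the buffer cannot hold an Ethernet header. */
static int mock_skb_init(struct mock_skb *fake, unsigned char *data,
                         unsigned int maxlen)
{
    if (maxlen < 14)
        return -1;

    /* Zero everything so fields added later do not hold stack garbage. */
    memset(fake, 0, sizeof(*fake));

    fake->head = data;
    fake->data = data + 14;
    fake->len = maxlen - 14;
    /* Copy the EtherType bytes as-is; this is the step the quoted
     * snippet did not have. */
    memcpy(&fake->protocol, data + 12, sizeof(fake->protocol));
    return 0;
}
```

Even with something like this, the memset cost grows with the size of
the real structure, which is the stack and cycle overhead I would
prefer to avoid.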

I appreciate the desire to reuse. What I meant was that since igb and
ixgbe both use essentially the same function, I could move it to one
central location where both of them could use it, as could any other
low level drivers that just need to quickly parse the header out of a
linear block of data. I just don't feel __skb_get_poff really does what
I am looking for, since it assumes it is working with an skb, not just
a linear block of data. If it could be broken up somehow so that it, or
at least pieces of it, could be used on linear blocks of data, then I
might be more interested in reusing it.

Thanks,
Alex
Thread overview: 30+ messages
2013-04-12 11:24 [net-next 00/11][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
2013-04-12 11:24 ` [net-next 01/11] ixgbe: Support using build_skb in the case that jumbo frames are disabled Jeff Kirsher
2013-04-12 13:10 ` Eric Dumazet
2013-04-12 13:31 ` Eric Dumazet
2013-04-12 22:21 ` Ben Hutchings
2013-04-12 23:50 ` Alexander Duyck
2013-04-12 11:24 ` [net-next 02/11] ixgbe: Mask off check of frag_off as we only want fragment offset Jeff Kirsher
2013-04-12 13:28 ` Eric Dumazet
2013-04-12 13:45 ` Eric Dumazet
2013-04-12 16:38 ` Alexander Duyck
2013-04-12 16:51 ` Eric Dumazet
2013-04-12 18:22 ` Alexander Duyck [this message]
2013-04-12 18:44 ` Eric Dumazet
2013-04-12 20:12 ` Alexander Duyck
2013-04-12 20:29 ` Eric Dumazet
2013-04-12 21:05 ` Alexander Duyck
2013-04-12 21:12 ` Eric Dumazet
2013-04-12 21:34 ` Alexander Duyck
2013-04-12 21:40 ` Eric Dumazet
2013-04-12 11:24 ` [net-next 03/11] ixgbe: don't do arithmetic operations on bitmasks Jeff Kirsher
2013-04-12 11:24 ` [net-next 04/11] ixgbe: Drop check for PAGE_SIZE from ixgbe_xmit_frame_ring Jeff Kirsher
2013-04-12 11:24 ` [net-next 05/11] ixgbe: Enable support for recognizing PCI-e Gen3 link speed Jeff Kirsher
2013-04-12 11:24 ` [net-next 06/11] ixgbe: create conversion functions from link_status to bus/speed Jeff Kirsher
2013-04-12 11:24 ` [net-next 07/11] ixgbe: enable devices with internal switch to read pci parent Jeff Kirsher
2013-04-12 11:24 ` [net-next 08/11] ixgbe: walk pci-e bus to find minimum width Jeff Kirsher
2013-04-13 21:28 ` Or Gerlitz
2013-04-15 20:48 ` Keller, Jacob E
2013-04-12 11:24 ` [net-next 09/11] ixgbe: fix MNG FW support when adapter not up Jeff Kirsher
2013-04-12 11:24 ` [net-next 10/11] ixgbe: Fix 1G link WoL Jeff Kirsher
2013-04-12 11:24 ` [net-next 11/11] ixgbe: bump version number Jeff Kirsher