From: William Allen Simpson <william.allen.simpson@gmail.com>
To: Andi Kleen <andi@firstfloor.org>
Cc: "Linux Kernel Developers" <linux-kernel@vger.kernel.org>,
"Linux Kernel Network Developers" <netdev@vger.kernel.org>,
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>,
"Eric Dumazet" <eric.dumazet@gmail.com>
Subject: Re: [PATCH v4] tcp: harmonize tcp_vx_rcv header length assumptions
Date: Wed, 13 Jan 2010 14:49:52 -0500 [thread overview]
Message-ID: <4B4E23E0.4000007@gmail.com> (raw)
In-Reply-To: <20100113155323.GB24818@basil.fritz.box>
Andi Kleen wrote:
> On Wed, Jan 13, 2010 at 10:36:39AM -0500, William Allen Simpson wrote:
>> What make command would you suggest to get the interleaved asm?
>
> make path/to/file.lst
>
Thank you. So simple. I wish I'd asked you first! (Or that it was
in the README or at kernelnewbies or something.)
The short answer is yes, it saves a bit more than I'd expected!
Of course, much of the savings is due to the elimination of the
various skb->len tests. But the initial code is hit more often.
My obvious and simple load of doff, save, then multiply by 4, save,
has 2 stores into tcp_header_len -- versus 1 old store into th:
- th = tcp_hdr(skb);
-
- if (th->doff < sizeof(struct tcphdr) / 4)
+ /* Check bad doff, compare doff directly to constant value */
+ tcp_header_len = tcp_hdr(skb)->doff;
+ if (tcp_header_len < (sizeof(struct tcphdr) / 4))
goto bad_packet;
- if (!pskb_may_pull(skb, th->doff * 4))
+
+ /* Check too short header and options */
+ tcp_header_len *= 4;
+ if (!pskb_may_pull(skb, tcp_header_len))
goto discard_it;
Existing tcp_v4_rcv() initially has 3 extra loads, a shift, and several
register swaps compared to mine.
Eric's suggestion to use an immediate multiply eliminates my store,
but that turns out to be at the cost of many additional indexed loads!
In my version, the compiler has cleverly kept tcp_hdr(skb) in %edi.
Eric's idea ends up using %edi for tcp_header_len, so it recalculates
tcp_hdr(skb) later, and temporarily stores it on the stack, requiring
%edx for the loads and stores, and preventing the usage of %dl for
comparisons, making it about 3 loads and 3 stores and a large number of
other instructions longer....
My later savings is even better, completely obviating my 1 store.
For example:
==old==
th = tcp_hdr(skb);
iph = ip_hdr(skb);
TCP_SKB_CB(skb)->seq = ntohl(th->seq);
c04f2621: 8d 43 20 lea 0x20(%ebx),%eax
c04f2624: 8b 4f 04 mov 0x4(%edi),%ecx
c04f2627: 0f c9 bswap %ecx
c04f2629: 89 4d e8 mov %ecx,-0x18(%ebp)
c04f262c: 89 48 18 mov %ecx,0x18(%eax)
TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
c04f262f: 0f b6 4f 0d movzbl 0xd(%edi),%ecx
c04f2633: 89 ca mov %ecx,%edx
c04f2635: d0 ea shr %dl
c04f2637: 83 e2 01 and $0x1,%edx
c04f263a: 89 55 e4 mov %edx,-0x1c(%ebp)
c04f263d: 89 ca mov %ecx,%edx
c04f263f: 0f b6 4f 0c movzbl 0xc(%edi),%ecx
c04f2643: 83 e2 01 and $0x1,%edx
c04f2646: 03 55 e4 add -0x1c(%ebp),%edx
c04f2649: 03 53 50 add 0x50(%ebx),%edx
c04f264c: c0 e9 04 shr $0x4,%cl
c04f264f: 0f b6 c9 movzbl %cl,%ecx
c04f2652: c1 e1 02 shl $0x2,%ecx
c04f2655: 29 ca sub %ecx,%edx
c04f2657: 03 55 e8 add -0x18(%ebp),%edx
c04f265a: 89 50 1c mov %edx,0x1c(%eax)
c04f265d: 8b 57 08 mov 0x8(%edi),%edx
skb->len - th->doff * 4);
==new==
th = tcp_hdr(skb);
iph = ip_hdr(skb);
TCP_SKB_CB(skb)->seq = ntohl(th->seq);
c04f25f2: 8d 43 20 lea 0x20(%ebx),%eax
c04f25f5: 8b 4f 04 mov 0x4(%edi),%ecx
c04f25f8: 0f c9 bswap %ecx
c04f25fa: 89 48 18 mov %ecx,0x18(%eax)
TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
c04f25fd: 0f b6 57 0d movzbl 0xd(%edi),%edx
c04f2601: d0 ea shr %dl
c04f2603: 83 e2 01 and $0x1,%edx
c04f2606: 89 55 e0 mov %edx,-0x20(%ebp)
c04f2609: 0f b6 57 0d movzbl 0xd(%edi),%edx
c04f260d: 83 e2 01 and $0x1,%edx
c04f2610: 03 55 e0 add -0x20(%ebp),%edx
c04f2613: 03 53 50 add 0x50(%ebx),%edx
c04f2616: 2b 55 e8 sub -0x18(%ebp),%edx
c04f2619: 01 ca add %ecx,%edx
c04f261b: 89 50 1c mov %edx,0x1c(%eax)
c04f261e: 8b 57 08 mov 0x8(%edi),%edx
skb->len - tcp_header_len);
next prev parent reply other threads:[~2010-01-13 19:49 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-01-10 13:02 query: redundant tcp header length checks? William Allen Simpson
2010-01-12 10:05 ` [PATCH] tcp: harmonize tcp_vx_rcv header length assumptions William Allen Simpson
2010-01-12 10:46 ` Eric Dumazet
2010-01-12 17:11 ` William Allen Simpson
2010-01-13 9:50 ` William Allen Simpson
2010-01-12 17:14 ` William Allen Simpson
2010-01-13 10:48 ` [PATCH v4] " William Allen Simpson
2010-01-13 11:56 ` Andi Kleen
2010-01-13 15:36 ` William Allen Simpson
2010-01-13 15:53 ` Andi Kleen
2010-01-13 16:40 ` [PATCH] Makefile: Document ability to make file.lst and file.S Joe Perches
2010-01-13 17:14 ` Andi Kleen
2010-01-13 17:31 ` Joe Perches
2010-01-13 19:51 ` William Allen Simpson
2010-01-14 3:26 ` Américo Wang
2010-01-18 12:29 ` Michal Marek
2010-01-13 19:49 ` William Allen Simpson [this message]
2010-01-13 20:19 ` [PATCH v4] tcp: harmonize tcp_vx_rcv header length assumptions Andi Kleen
2010-01-13 21:13 ` William Allen Simpson
2010-01-14 1:03 ` Joe Perches
2010-01-14 8:39 ` Patrick McHardy
2010-01-14 15:02 ` William Allen Simpson
2010-01-14 15:10 ` [PATCH v5] " William Allen Simpson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4B4E23E0.4000007@gmail.com \
--to=william.allen.simpson@gmail.com \
--cc=andi@firstfloor.org \
--cc=eric.dumazet@gmail.com \
--cc=ilpo.jarvinen@helsinki.fi \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).