From: David Laight <david.laight.linux@gmail.com>
To: Helge Deller <deller@gmx.de>
Cc: Kuniyuki Iwashima <kuniyu@google.com>,
deller@kernel.org, davem@davemloft.net, dsahern@kernel.org,
linux-kernel@vger.kernel.org, linux-parisc@vger.kernel.org,
netdev@vger.kernel.org, edumazet@google.com
Subject: Re: [PATCH] net: Optimize flush calculation in inet_gro_receive()
Date: Tue, 14 Apr 2026 10:36:33 +0100
Message-ID: <20260414103633.4d5fe92a@pumpkin>
In-Reply-To: <49c05cd8-5ad0-4015-8f55-fed3416784bf@gmx.de>
On Tue, 14 Apr 2026 09:46:55 +0200
Helge Deller <deller@gmx.de> wrote:
> Hi Kikuyu and David,
...
> >>> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> >>> index c7731e300a44..58cad2687c2c 100644
> >>> --- a/net/ipv4/af_inet.c
> >>> +++ b/net/ipv4/af_inet.c
> >>> @@ -1479,7 +1479,7 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
> >>> struct sk_buff *p;
> >>> unsigned int hlen;
> >>> unsigned int off;
> >>> - int flush = 1;
> >>> + u16 flush = 1;
> >>> int proto;
> >>>
> >>> off = skb_gro_offset(skb);
> >>> @@ -1504,7 +1504,8 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
> >>> goto out;
> >>>
> >>> NAPI_GRO_CB(skb)->proto = proto;
> >>> - flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (ntohl(*(__be32 *)&iph->id) & ~IP_DF));
> >>> + flush = (get_unaligned_be16(&iph->tot_len) ^ skb_gro_len(skb)) |
> >>> + (get_unaligned_be16(&iph->frag_off) & ~IP_DF);
> >>
> >> I think here we intentionally use 32-bit loads:
> >>
> >> commit
> >> Author: Herbert Xu <herbert@gondor.apana.org.au>
> >> Date: Tue May 26 18:50:29 2009
> >>
> >> ipv4: Use 32-bit loads for ID and length in GRO
>
> I see, this patch is exactly the opposite of mine.
>
> >> Before your patch, 32-bit load + bswap are used while
> >> 16-bit load + rol 8 after the change.
> >>
> >> I feel the 4-byte aligned load + bswap is faster than
> >> misaligned access + 8 times shift (Is this internally
> >> optimised like xchg for a single word size ?)
> >>
> >> Do you have some numbers ?
>
> No, I don't have.
> In the end it's very platform specific anyway.
>
> > Check on some architecture that doesn't support misaligned loads.
> > Actually, aren't the accesses aligned??
>
> The reason why I touched this code at all, is because I got unaligned
> accesses in that function on parisc.
> But those unaligned accesses were triggered by parisc-specific
> inline assembly, and not by this code here.
The network stack is supposed to ensure that all received packets are
aligned so that the IP header is on a 4-byte boundary.
This typically requires the ethernet receive buffer to be 4n+2 aligned.
Unfortunately there is some ethernet hardware that requires 4n aligned
buffers (often on SoC devices with CPUs that fault on misaligned accesses).
(Just writing two bytes of garbage before the frame solves the issue.)
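
The 4n+2 arithmetic can be sketched in a few lines of userspace C.
This is only an illustration, not kernel code: ip_hdr_misalign() is a
hypothetical helper, and it assumes a plain 14-byte ethernet header
with no VLAN tag.

```c
#include <assert.h>
#include <stdint.h>

#define ETH_HDR_LEN 14u /* dst MAC (6) + src MAC (6) + ethertype (2) */

/*
 * Hypothetical helper: given the address at which the ethernet frame
 * starts, return how far the IP header is from a 4-byte boundary
 * (0 means the IP header is 4-byte aligned).
 */
unsigned int ip_hdr_misalign(uintptr_t frame_start)
{
	return (unsigned int)((frame_start + ETH_HDR_LEN) & 3u);
}

/* A few spot checks of the arithmetic described above. */
void check_ip_hdr_alignment(void)
{
	/* Frame at 4n+2: 2 + 14 = 16, so the IP header is aligned. */
	assert(ip_hdr_misalign(0x1002) == 0);
	/* Frame at 4n: 0 + 14 = 14, so the IP header is off by 2. */
	assert(ip_hdr_misalign(0x1000) == 2);
}
```

With a 4n aligned buffer, two bytes of padding in front of the frame
move the frame start to 4n+2, which is the first case above.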
> So, I believe those accesses here are aligned, and the get_unaligned_XX()
> helpers make the code more readable, but are NOT necessary.
>
> That said, I suggest to drop my patch.
> It makes the code more readable, but probably will not improve speed.
I think the purpose of the 2009 change was to use the hardware's 32-bit
byte-swapping memory loads rather than swapping the 16-bit items in
software.
That shaves off a few instructions, and the saving can be measurable
in some of the network paths with specific workloads.
Remember, save 0.1% 100 times and the code runs about 10% faster.
Every little bit can make a difference.
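
For reference, here is a rough userspace sketch of why the two forms of
the flush expression give the same 16-bit result. It is not the kernel
code: load_be32(), load_be16(), flush_32() and flush_16() are
hypothetical names standing in for ntohl() on a 32-bit load and for
get_unaligned_be16(), and gro_len stands in for skb_gro_len(skb).

```c
#include <assert.h>
#include <stdint.h>

#define IP_DF 0x4000u /* "don't fragment" flag bit in frag_off */

/* Portable big-endian loads, independent of host byte order. */
uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

uint16_t load_be16(const uint8_t *p)
{
	return (uint16_t)(((unsigned int)p[0] << 8) | p[1]);
}

/*
 * 32-bit form: load the words holding (version/ihl, tos, tot_len) and
 * (id, frag_off), then keep only the low 16 bits of the result.
 */
uint16_t flush_32(const uint8_t *iph, uint32_t gro_len)
{
	return (uint16_t)((load_be32(iph) ^ gro_len) |
			  (load_be32(iph + 4) & ~IP_DF));
}

/* 16-bit form: load tot_len (offset 2) and frag_off (offset 6) directly. */
uint16_t flush_16(const uint8_t *iph, uint32_t gro_len)
{
	return (uint16_t)((load_be16(iph + 2) ^ gro_len) |
			  (load_be16(iph + 6) & ~IP_DF));
}
```

Both functions return 0 when tot_len matches gro_len and no fragment
bits other than DF are set. The only difference is how many bytes each
load touches and how the byte swap is done.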
David
>
> Thanks for your help!
> Helge
>
> > Also on ones without 32bit byteswap (some do have byteswapping
> > memory reads).
> >
> > Also you may not want to change 'flush' to u16.
> > On non-x86 it may force the compiler add extra masking instructions.
> >
> > David
> >
> >>
> >>
> >> Before:
> >> flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb))
> >> mov edx,DWORD PTR [rcx]
> >> bswap edx
> >> return skb->len - NAPI_GRO_CB(skb)->data_offset;
> >> mov r8d,DWORD PTR [rsi+0x38]
> >> mov r9d,DWORD PTR [rsi+0x70]
> >> sub r9d,r8d
> >> xor r9d,edx
> >> | (ntohl(*(__be32 *)&iph->id) & ~IP_DF));
> >> mov ebp,0xffbfffff
> >> and ebp,DWORD PTR [rcx+0x4]
> >> bswap ebp
> >> or ebp,r9d
> >>
> >>
> >> After:
> >> flush = (get_unaligned_be16(&iph->tot_len) ^ skb_gro_len(skb))
> >> movzx edx,WORD PTR [rcx+0x2]
> >> rol dx,0x8
> >> return skb->len - NAPI_GRO_CB(skb)->data_offset;
> >> mov r8d,DWORD PTR [rsi+0x38]
> >> mov r9d,DWORD PTR [rsi+0x70]
> >> sub r9d,r8d
> >> xor r9d,edx
> >> | (get_unaligned_be16(&iph->frag_off) & ~IP_DF);
> >> movzx ebp,WORD PTR [rcx+0x6]
> >> and ebp,0xffffffbf
> >> rol bp,0x8
> >> or ebp,r9d
> >>
> >
>
Thread overview: 6+ messages
2026-04-10 14:43 [PATCH] net: Optimize flush calculation in inet_gro_receive() Helge Deller
2026-04-11 5:19 ` Kuniyuki Iwashima
2026-04-11 12:09 ` David Laight
2026-04-14 7:46 ` Helge Deller
2026-04-14 9:36 ` David Laight [this message]
2026-04-14 9:57 ` Eric Dumazet