public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Helge Deller <deller@gmx.de>
To: David Laight <david.laight.linux@gmail.com>,
	Kuniyuki Iwashima <kuniyu@google.com>
Cc: deller@kernel.org, davem@davemloft.net, dsahern@kernel.org,
	linux-kernel@vger.kernel.org, linux-parisc@vger.kernel.org,
	netdev@vger.kernel.org, edumazet@google.com
Subject: Re: [PATCH] net: Optimize flush calculation in inet_gro_receive()
Date: Tue, 14 Apr 2026 09:46:55 +0200	[thread overview]
Message-ID: <49c05cd8-5ad0-4015-8f55-fed3416784bf@gmx.de> (raw)
In-Reply-To: <20260411130958.70202bab@pumpkin>

Hi Kikuyu and David,

On 4/11/26 14:09, David Laight wrote:
> On Sat, 11 Apr 2026 05:19:35 +0000
> Kikuyu Iwashima <kuniyu@google.com> wrote:
> 
>> From: Helge Deller <deller@kernel.org>
>> Date: Fri, 10 Apr 2026 16:43:54 +0200
>>> For the calculation of the flush variable, use the get_unaligned_xxx() helpers
>>> to access only relevant bits of the IP header.
>>>
>>> Note: Since I don't know the network details, I'm not sure if "& ~IP_DF"
>>> (& ~0x4000) is correct, or if "& IP_OFFSET" (& 0x1FFF) should be used instead
>>
>> ~IP_DF is correct (MF bit needs to be checked), see

Ok, Thanks for checking!

>> commit db8caf3dbc77599dc90f4ea0a803cd1d97116f30
>> Author: Eric Dumazet <edumazet@google.com>
>> Date:   Fri May 31 11:18:10 2013
>>
>>      gro: should aggregate frames without DF
>>
>>
>>> (which I believe would be more correct). Instead of possibly breaking things I
>>> left it as is, but maybe some expert can check?
>>>
>>> Signed-off-by: Helge Deller <deller@gmx.de>
>>>
>>> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
>>> index c7731e300a44..58cad2687c2c 100644
>>> --- a/net/ipv4/af_inet.c
>>> +++ b/net/ipv4/af_inet.c
>>> @@ -1479,7 +1479,7 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
>>>   	struct sk_buff *p;
>>>   	unsigned int hlen;
>>>   	unsigned int off;
>>> -	int flush = 1;
>>> +	u16 flush = 1;
>>>   	int proto;
>>>   
>>>   	off = skb_gro_offset(skb);
>>> @@ -1504,7 +1504,8 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
>>>   		goto out;
>>>   
>>>   	NAPI_GRO_CB(skb)->proto = proto;
>>> -	flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (ntohl(*(__be32 *)&iph->id) & ~IP_DF));
>>> +	flush = (get_unaligned_be16(&iph->tot_len) ^ skb_gro_len(skb)) |
>>> +	        (get_unaligned_be16(&iph->frag_off) & ~IP_DF);
>>
>> I think here we intentionally use 32-bit loads:
>>
>> commit 
>> Author: Herbert Xu <herbert@gondor.apana.org.au>
>> Date:   Tue May 26 18:50:29 2009
>>
>>      ipv4: Use 32-bit loads for ID and length in GRO

I see, this patch is exactly the opposite of mine.

>> Before your patch, 32-bit load + bswap are used while
>> 16-bit load + rol 8 after the change.
>>
>> I feel the 4-byte aligned load + bswap is faster than
>> misaligned access + 8 times shift (Is this internally
>> optimised like xchg for a single word size ?)
>>
>> Do you have some numbers ?

No, I don't have.
In the end it's very platform specific anyway.
  
> Check on some architecture that doesn't support misaligned loads.
> Actually, aren't the accesses aligned??

The reason why I touched this code at all, is because I got unaligned
accesses in that function on parisc.
But those unaligned accesses were triggered by parisc-specific
inline assembly, and not by this code here.

So, I believe those accesses here are aligned, and the get_unaligned_XX()
helpers make the code more readable, but are NOT necessary.

That said, I suggest to drop my patch.
It makes the code more readable, but probably will not improve speed.

Thanks for your help!
Helge

> Also on ones without 32bit byteswap (some do have byteswapping
> memory reads).
> 
> Also you may not want to change 'flush' to u16.
> On non-x86 it may force the compiler add extra masking instructions.
> 
> 	David
>   
>>
>>
>> Before:
>> 	flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb))
>> mov    edx,DWORD PTR [rcx]
>> bswap  edx
>> 	return skb->len - NAPI_GRO_CB(skb)->data_offset;
>> mov    r8d,DWORD PTR [rsi+0x38]
>> mov    r9d,DWORD PTR [rsi+0x70]
>> sub    r9d,r8d
>> xor    r9d,edx
>> 	| (ntohl(*(__be32 *)&iph->id) & ~IP_DF));
>> mov    ebp,0xffbfffff
>> and    ebp,DWORD PTR [rcx+0x4]
>> bswap  ebp
>> or     ebp,r9d
>>
>>
>> After:
>> 	flush = (get_unaligned_be16(&iph->tot_len) ^ skb_gro_len(skb))
>> movzx  edx,WORD PTR [rcx+0x2]
>> rol    dx,0x8
>> 	return skb->len - NAPI_GRO_CB(skb)->data_offset;
>> mov    r8d,DWORD PTR [rsi+0x38]
>> mov    r9d,DWORD PTR [rsi+0x70]
>> sub    r9d,r8d
>> xor    r9d,edx
>> 	| (get_unaligned_be16(&iph->frag_off) & ~IP_DF);
>> movzx  ebp,WORD PTR [rcx+0x6]
>> and    ebp,0xffffffbf
>> rol    bp,0x8
>> or     ebp,r9d
>>
> 


  reply	other threads:[~2026-04-14  7:47 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-10 14:43 [PATCH] net: Optimize flush calculation in inet_gro_receive() Helge Deller
2026-04-11  5:19 ` Kuniyuki Iwashima
2026-04-11 12:09   ` David Laight
2026-04-14  7:46     ` Helge Deller [this message]
2026-04-14  9:36       ` David Laight
2026-04-14  9:57         ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=49c05cd8-5ad0-4015-8f55-fed3416784bf@gmx.de \
    --to=deller@gmx.de \
    --cc=davem@davemloft.net \
    --cc=david.laight.linux@gmail.com \
    --cc=deller@kernel.org \
    --cc=dsahern@kernel.org \
    --cc=edumazet@google.com \
    --cc=kuniyu@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox