* [PATCH] net: Optimize flush calculation in inet_gro_receive()
@ 2026-04-10 14:43 Helge Deller
  2026-04-11  5:19 ` Kuniyuki Iwashima
  0 siblings, 1 reply; 6+ messages in thread

From: Helge Deller @ 2026-04-10 14:43 UTC (permalink / raw)
To: netdev, linux-kernel, David S. Miller, David Ahern; +Cc: linux-parisc

For the calculation of the flush variable, use the get_unaligned_xxx()
helpers to access only relevant bits of the IP header.

Note: Since I don't know the network details, I'm not sure if "& ~IP_DF"
(& ~0x4000) is correct, or if "& IP_OFFSET" (& 0x1FFF) should be used
instead (which I believe would be more correct). Instead of possibly
breaking things I left it as is, but maybe some expert can check?

Signed-off-by: Helge Deller <deller@gmx.de>

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index c7731e300a44..58cad2687c2c 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1479,7 +1479,7 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
 	struct sk_buff *p;
 	unsigned int hlen;
 	unsigned int off;
-	int flush = 1;
+	u16 flush = 1;
 	int proto;
 
 	off = skb_gro_offset(skb);
@@ -1504,7 +1504,8 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
 		goto out;
 
 	NAPI_GRO_CB(skb)->proto = proto;
-	flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (ntohl(*(__be32 *)&iph->id) & ~IP_DF));
+	flush = (get_unaligned_be16(&iph->tot_len) ^ skb_gro_len(skb)) |
+		(get_unaligned_be16(&iph->frag_off) & ~IP_DF);
 
 	list_for_each_entry(p, head, list) {
 		struct iphdr *iph2;

^ permalink raw reply related	[flat|nested] 6+ messages in thread
* Re: [PATCH] net: Optimize flush calculation in inet_gro_receive()
  2026-04-10 14:43 [PATCH] net: Optimize flush calculation in inet_gro_receive() Helge Deller
@ 2026-04-11  5:19 ` Kuniyuki Iwashima
  2026-04-11 12:09   ` David Laight
  0 siblings, 1 reply; 6+ messages in thread

From: Kuniyuki Iwashima @ 2026-04-11  5:19 UTC (permalink / raw)
To: deller; +Cc: davem, dsahern, linux-kernel, linux-parisc, netdev, edumazet

From: Helge Deller <deller@kernel.org>
Date: Fri, 10 Apr 2026 16:43:54 +0200
> For the calculation of the flush variable, use the get_unaligned_xxx() helpers
> to access only relevant bits of the IP header.
>
> Note: Since I don't know the network details, I'm not sure if "& ~IP_DF"
> (& ~0x4000) is correct, or if "& IP_OFFSET" (& 0x1FFF) should be used instead

~IP_DF is correct (MF bit needs to be checked), see

commit db8caf3dbc77599dc90f4ea0a803cd1d97116f30
Author: Eric Dumazet <edumazet@google.com>
Date:   Fri May 31 11:18:10 2013

    gro: should aggregate frames without DF

> (which I believe would be more correct). Instead of possibly breaking things I
> left it as is, but maybe some expert can check?
>
> Signed-off-by: Helge Deller <deller@gmx.de>
>
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index c7731e300a44..58cad2687c2c 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -1479,7 +1479,7 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
>  	struct sk_buff *p;
>  	unsigned int hlen;
>  	unsigned int off;
> -	int flush = 1;
> +	u16 flush = 1;
>  	int proto;
>
>  	off = skb_gro_offset(skb);
> @@ -1504,7 +1504,8 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
>  		goto out;
>
>  	NAPI_GRO_CB(skb)->proto = proto;
> -	flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (ntohl(*(__be32 *)&iph->id) & ~IP_DF));
> +	flush = (get_unaligned_be16(&iph->tot_len) ^ skb_gro_len(skb)) |
> +		(get_unaligned_be16(&iph->frag_off) & ~IP_DF);

I think here we intentionally use 32-bit loads:

commit 1075f3f65d0e0f49351b7d4310e9f94483972a51
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Tue May 26 18:50:29 2009

    ipv4: Use 32-bit loads for ID and length in GRO

Before your patch, 32-bit load + bswap are used while
16-bit load + rol 8 after the change.

I feel the 4-byte aligned load + bswap is faster than
misaligned access + 8 times shift (Is this internally
optimised like xchg for a single word size ?)

Do you have some numbers ?
Before:
	flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb))
		mov    edx,DWORD PTR [rcx]
		bswap  edx
	return skb->len - NAPI_GRO_CB(skb)->data_offset;
		mov    r8d,DWORD PTR [rsi+0x38]
		mov    r9d,DWORD PTR [rsi+0x70]
		sub    r9d,r8d
		xor    r9d,edx
	| (ntohl(*(__be32 *)&iph->id) & ~IP_DF));
		mov    ebp,0xffbfffff
		and    ebp,DWORD PTR [rcx+0x4]
		bswap  ebp
		or     ebp,r9d

After:
	flush = (get_unaligned_be16(&iph->tot_len) ^ skb_gro_len(skb))
		movzx  edx,WORD PTR [rcx+0x2]
		rol    dx,0x8
	return skb->len - NAPI_GRO_CB(skb)->data_offset;
		mov    r8d,DWORD PTR [rsi+0x38]
		mov    r9d,DWORD PTR [rsi+0x70]
		sub    r9d,r8d
		xor    r9d,edx
	| (get_unaligned_be16(&iph->frag_off) & ~IP_DF);
		movzx  ebp,WORD PTR [rcx+0x6]
		and    ebp,0xffffffbf
		rol    bp,0x8
		or     ebp,r9d
* Re: [PATCH] net: Optimize flush calculation in inet_gro_receive()
  2026-04-11  5:19 ` Kuniyuki Iwashima
@ 2026-04-11 12:09   ` David Laight
  2026-04-14  7:46     ` Helge Deller
  0 siblings, 1 reply; 6+ messages in thread

From: David Laight @ 2026-04-11 12:09 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: deller, davem, dsahern, linux-kernel, linux-parisc, netdev, edumazet

On Sat, 11 Apr 2026 05:19:35 +0000
Kuniyuki Iwashima <kuniyu@google.com> wrote:

> From: Helge Deller <deller@kernel.org>
> Date: Fri, 10 Apr 2026 16:43:54 +0200
> > For the calculation of the flush variable, use the get_unaligned_xxx() helpers
> > to access only relevant bits of the IP header.
> >
> > Note: Since I don't know the network details, I'm not sure if "& ~IP_DF"
> > (& ~0x4000) is correct, or if "& IP_OFFSET" (& 0x1FFF) should be used instead
>
> ~IP_DF is correct (MF bit needs to be checked), see
>
> commit db8caf3dbc77599dc90f4ea0a803cd1d97116f30
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Fri May 31 11:18:10 2013
>
>     gro: should aggregate frames without DF
>
> > (which I believe would be more correct). Instead of possibly breaking things I
> > left it as is, but maybe some expert can check?
> >
> > Signed-off-by: Helge Deller <deller@gmx.de>
> >
> > diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> > index c7731e300a44..58cad2687c2c 100644
> > --- a/net/ipv4/af_inet.c
> > +++ b/net/ipv4/af_inet.c
> > @@ -1479,7 +1479,7 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
> >  	struct sk_buff *p;
> >  	unsigned int hlen;
> >  	unsigned int off;
> > -	int flush = 1;
> > +	u16 flush = 1;
> >  	int proto;
> >
> >  	off = skb_gro_offset(skb);
> > @@ -1504,7 +1504,8 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
> >  		goto out;
> >
> >  	NAPI_GRO_CB(skb)->proto = proto;
> > -	flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (ntohl(*(__be32 *)&iph->id) & ~IP_DF));
> > +	flush = (get_unaligned_be16(&iph->tot_len) ^ skb_gro_len(skb)) |
> > +		(get_unaligned_be16(&iph->frag_off) & ~IP_DF);
>
> I think here we intentionally use 32-bit loads:
>
> commit 1075f3f65d0e0f49351b7d4310e9f94483972a51
> Author: Herbert Xu <herbert@gondor.apana.org.au>
> Date:   Tue May 26 18:50:29 2009
>
>     ipv4: Use 32-bit loads for ID and length in GRO
>
> Before your patch, 32-bit load + bswap are used while
> 16-bit load + rol 8 after the change.
>
> I feel the 4-byte aligned load + bswap is faster than
> misaligned access + 8 times shift (Is this internally
> optimised like xchg for a single word size ?)
>
> Do you have some numbers ?

Check on some architecture that doesn't support misaligned loads.
Actually, aren't the accesses aligned??

Also on ones without 32bit byteswap (some do have byteswapping
memory reads).

Also you may not want to change 'flush' to u16.
On non-x86 it may force the compiler to add extra masking instructions.
David

>
> Before:
> 	flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb))
> 		mov    edx,DWORD PTR [rcx]
> 		bswap  edx
> 	return skb->len - NAPI_GRO_CB(skb)->data_offset;
> 		mov    r8d,DWORD PTR [rsi+0x38]
> 		mov    r9d,DWORD PTR [rsi+0x70]
> 		sub    r9d,r8d
> 		xor    r9d,edx
> 	| (ntohl(*(__be32 *)&iph->id) & ~IP_DF));
> 		mov    ebp,0xffbfffff
> 		and    ebp,DWORD PTR [rcx+0x4]
> 		bswap  ebp
> 		or     ebp,r9d
>
> After:
> 	flush = (get_unaligned_be16(&iph->tot_len) ^ skb_gro_len(skb))
> 		movzx  edx,WORD PTR [rcx+0x2]
> 		rol    dx,0x8
> 	return skb->len - NAPI_GRO_CB(skb)->data_offset;
> 		mov    r8d,DWORD PTR [rsi+0x38]
> 		mov    r9d,DWORD PTR [rsi+0x70]
> 		sub    r9d,r8d
> 		xor    r9d,edx
> 	| (get_unaligned_be16(&iph->frag_off) & ~IP_DF);
> 		movzx  ebp,WORD PTR [rcx+0x6]
> 		and    ebp,0xffffffbf
> 		rol    bp,0x8
> 		or     ebp,r9d
* Re: [PATCH] net: Optimize flush calculation in inet_gro_receive()
  2026-04-11 12:09 ` David Laight
@ 2026-04-14  7:46   ` Helge Deller
  2026-04-14  9:36     ` David Laight
  0 siblings, 1 reply; 6+ messages in thread

From: Helge Deller @ 2026-04-14  7:46 UTC (permalink / raw)
To: David Laight, Kuniyuki Iwashima
Cc: deller, davem, dsahern, linux-kernel, linux-parisc, netdev, edumazet

Hi Kuniyuki and David,

On 4/11/26 14:09, David Laight wrote:
> On Sat, 11 Apr 2026 05:19:35 +0000
> Kuniyuki Iwashima <kuniyu@google.com> wrote:
>
>> From: Helge Deller <deller@kernel.org>
>> Date: Fri, 10 Apr 2026 16:43:54 +0200
>>> For the calculation of the flush variable, use the get_unaligned_xxx() helpers
>>> to access only relevant bits of the IP header.
>>>
>>> Note: Since I don't know the network details, I'm not sure if "& ~IP_DF"
>>> (& ~0x4000) is correct, or if "& IP_OFFSET" (& 0x1FFF) should be used instead
>>
>> ~IP_DF is correct (MF bit needs to be checked), see

Ok, thanks for checking!

>> commit db8caf3dbc77599dc90f4ea0a803cd1d97116f30
>> Author: Eric Dumazet <edumazet@google.com>
>> Date:   Fri May 31 11:18:10 2013
>>
>>     gro: should aggregate frames without DF
>>
>>> (which I believe would be more correct). Instead of possibly breaking things I
>>> left it as is, but maybe some expert can check?
>>>
>>> Signed-off-by: Helge Deller <deller@gmx.de>
>>>
>>> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
>>> index c7731e300a44..58cad2687c2c 100644
>>> --- a/net/ipv4/af_inet.c
>>> +++ b/net/ipv4/af_inet.c
>>> @@ -1479,7 +1479,7 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
>>>  	struct sk_buff *p;
>>>  	unsigned int hlen;
>>>  	unsigned int off;
>>> -	int flush = 1;
>>> +	u16 flush = 1;
>>>  	int proto;
>>>
>>>  	off = skb_gro_offset(skb);
>>> @@ -1504,7 +1504,8 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
>>>  		goto out;
>>>
>>>  	NAPI_GRO_CB(skb)->proto = proto;
>>> -	flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (ntohl(*(__be32 *)&iph->id) & ~IP_DF));
>>> +	flush = (get_unaligned_be16(&iph->tot_len) ^ skb_gro_len(skb)) |
>>> +	        (get_unaligned_be16(&iph->frag_off) & ~IP_DF);
>>
>> I think here we intentionally use 32-bit loads:
>>
>> commit
>> Author: Herbert Xu <herbert@gondor.apana.org.au>
>> Date:   Tue May 26 18:50:29 2009
>>
>>     ipv4: Use 32-bit loads for ID and length in GRO

I see, this patch is exactly the opposite of mine.

>> Before your patch, 32-bit load + bswap are used while
>> 16-bit load + rol 8 after the change.
>>
>> I feel the 4-byte aligned load + bswap is faster than
>> misaligned access + 8 times shift (Is this internally
>> optimised like xchg for a single word size ?)
>>
>> Do you have some numbers ?

No, I don't have.
In the end it's very platform specific anyway.

> Check on some architecture that doesn't support misaligned loads.
> Actually, aren't the accesses aligned??

The reason why I touched this code at all is because I got unaligned
accesses in that function on parisc.
But those unaligned accesses were triggered by parisc-specific
inline assembly, and not by this code here.

So, I believe those accesses here are aligned, and the get_unaligned_XX()
helpers make the code more readable, but are NOT necessary.

That said, I suggest to drop my patch.
It makes the code more readable, but probably will not improve speed.

Thanks for your help!
Helge

> Also on ones without 32bit byteswap (some do have byteswapping
> memory reads).
>
> Also you may not want to change 'flush' to u16.
> On non-x86 it may force the compiler to add extra masking instructions.
>
> David
>
>>
>> Before:
>> 	flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb))
>> 		mov    edx,DWORD PTR [rcx]
>> 		bswap  edx
>> 	return skb->len - NAPI_GRO_CB(skb)->data_offset;
>> 		mov    r8d,DWORD PTR [rsi+0x38]
>> 		mov    r9d,DWORD PTR [rsi+0x70]
>> 		sub    r9d,r8d
>> 		xor    r9d,edx
>> 	| (ntohl(*(__be32 *)&iph->id) & ~IP_DF));
>> 		mov    ebp,0xffbfffff
>> 		and    ebp,DWORD PTR [rcx+0x4]
>> 		bswap  ebp
>> 		or     ebp,r9d
>>
>> After:
>> 	flush = (get_unaligned_be16(&iph->tot_len) ^ skb_gro_len(skb))
>> 		movzx  edx,WORD PTR [rcx+0x2]
>> 		rol    dx,0x8
>> 	return skb->len - NAPI_GRO_CB(skb)->data_offset;
>> 		mov    r8d,DWORD PTR [rsi+0x38]
>> 		mov    r9d,DWORD PTR [rsi+0x70]
>> 		sub    r9d,r8d
>> 		xor    r9d,edx
>> 	| (get_unaligned_be16(&iph->frag_off) & ~IP_DF);
>> 		movzx  ebp,WORD PTR [rcx+0x6]
>> 		and    ebp,0xffffffbf
>> 		rol    bp,0x8
>> 		or     ebp,r9d
* Re: [PATCH] net: Optimize flush calculation in inet_gro_receive()
  2026-04-14  7:46 ` Helge Deller
@ 2026-04-14  9:36   ` David Laight
  2026-04-14  9:57     ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread

From: David Laight @ 2026-04-14  9:36 UTC (permalink / raw)
To: Helge Deller
Cc: Kuniyuki Iwashima, deller, davem, dsahern, linux-kernel,
	linux-parisc, netdev, edumazet

On Tue, 14 Apr 2026 09:46:55 +0200
Helge Deller <deller@gmx.de> wrote:

> Hi Kuniyuki and David,
...
> >>> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> >>> index c7731e300a44..58cad2687c2c 100644
> >>> --- a/net/ipv4/af_inet.c
> >>> +++ b/net/ipv4/af_inet.c
> >>> @@ -1479,7 +1479,7 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
> >>>  	struct sk_buff *p;
> >>>  	unsigned int hlen;
> >>>  	unsigned int off;
> >>> -	int flush = 1;
> >>> +	u16 flush = 1;
> >>>  	int proto;
> >>>
> >>>  	off = skb_gro_offset(skb);
> >>> @@ -1504,7 +1504,8 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
> >>>  		goto out;
> >>>
> >>>  	NAPI_GRO_CB(skb)->proto = proto;
> >>> -	flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (ntohl(*(__be32 *)&iph->id) & ~IP_DF));
> >>> +	flush = (get_unaligned_be16(&iph->tot_len) ^ skb_gro_len(skb)) |
> >>> +	        (get_unaligned_be16(&iph->frag_off) & ~IP_DF);
> >>
> >> I think here we intentionally use 32-bit loads:
> >>
> >> commit
> >> Author: Herbert Xu <herbert@gondor.apana.org.au>
> >> Date:   Tue May 26 18:50:29 2009
> >>
> >>     ipv4: Use 32-bit loads for ID and length in GRO
>
> I see, this patch is exactly the opposite of mine.
>
> >> Before your patch, 32-bit load + bswap are used while
> >> 16-bit load + rol 8 after the change.
> >>
> >> I feel the 4-byte aligned load + bswap is faster than
> >> misaligned access + 8 times shift (Is this internally
> >> optimised like xchg for a single word size ?)
> >>
> >> Do you have some numbers ?
>
> No, I don't have.
> In the end it's very platform specific anyway.
> > Check on some architecture that doesn't support misaligned loads.
> > Actually, aren't the accesses aligned??
>
> The reason why I touched this code at all, is because I got unaligned
> accesses in that function on parisc.
> But those unaligned accesses were triggered by parisc-specific
> inline assembly, and not by this code here.

The network stack is supposed to ensure that all receive packets are
aligned so that the IP header is on a 4-byte boundary.
This typically requires the ethernet receive buffer be 4n+2 aligned.
Unfortunately there is some ethernet hardware that requires 4n aligned
buffers (often on SoC devices with CPUs that fault on misaligned accesses).
(Just writing two bytes of garbage before the frame solves the issue.)

> So, I believe those accesses here are aligned, and the get_unaligned_XX()
> helpers make the code more readable, but are NOT necessary.
>
> That said, I suggest to drop my patch.
> It makes the code more readable, but probably will not improve speed.

I think the purpose of the change was to use the hardware's 32-bit
byte-swapping memory loads rather than software swapping of the 16-bit
items. That shaves off a few instructions - and they can be measurable
in some of the network paths with specific workloads.

Remember, save 0.1% 100 times and the code runs 10% faster.
Every little bit can make a difference.

David

> Thanks for your help!
> Helge
>
> > Also on ones without 32bit byteswap (some do have byteswapping
> > memory reads).
> >
> > Also you may not want to change 'flush' to u16.
> > On non-x86 it may force the compiler to add extra masking instructions.
> >
> > David
> >
> >>
> >> Before:
> >> 	flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb))
> >> 		mov    edx,DWORD PTR [rcx]
> >> 		bswap  edx
> >> 	return skb->len - NAPI_GRO_CB(skb)->data_offset;
> >> 		mov    r8d,DWORD PTR [rsi+0x38]
> >> 		mov    r9d,DWORD PTR [rsi+0x70]
> >> 		sub    r9d,r8d
> >> 		xor    r9d,edx
> >> 	| (ntohl(*(__be32 *)&iph->id) & ~IP_DF));
> >> 		mov    ebp,0xffbfffff
> >> 		and    ebp,DWORD PTR [rcx+0x4]
> >> 		bswap  ebp
> >> 		or     ebp,r9d
> >>
> >> After:
> >> 	flush = (get_unaligned_be16(&iph->tot_len) ^ skb_gro_len(skb))
> >> 		movzx  edx,WORD PTR [rcx+0x2]
> >> 		rol    dx,0x8
> >> 	return skb->len - NAPI_GRO_CB(skb)->data_offset;
> >> 		mov    r8d,DWORD PTR [rsi+0x38]
> >> 		mov    r9d,DWORD PTR [rsi+0x70]
> >> 		sub    r9d,r8d
> >> 		xor    r9d,edx
> >> 	| (get_unaligned_be16(&iph->frag_off) & ~IP_DF);
> >> 		movzx  ebp,WORD PTR [rcx+0x6]
> >> 		and    ebp,0xffffffbf
> >> 		rol    bp,0x8
> >> 		or     ebp,r9d
* Re: [PATCH] net: Optimize flush calculation in inet_gro_receive()
  2026-04-14  9:36 ` David Laight
@ 2026-04-14  9:57   ` Eric Dumazet
  0 siblings, 0 replies; 6+ messages in thread

From: Eric Dumazet @ 2026-04-14  9:57 UTC (permalink / raw)
To: David Laight
Cc: Helge Deller, Kuniyuki Iwashima, deller, davem, dsahern,
	linux-kernel, linux-parisc, netdev

On Tue, Apr 14, 2026 at 2:36 AM David Laight
<david.laight.linux@gmail.com> wrote:
> The network stack is supposed to ensure that all receive packets are
> aligned so that the IP header is on a 4-byte boundary.
> This typically requires the ethernet receive buffer be 4n+2 aligned.
> Unfortunately there is some ethernet hardware that requires 4n aligned
> buffers (often on SoC devices with CPUs that fault on misaligned accesses).
> (Just writing two bytes of garbage before the frame solves the issue.)

This is mostly true, but it depends on NET_IP_ALIGN.
On x86, this constant was changed to 0 back in 2010:

commit ea812ca1b06113597adcd8e70c0f84a413d97544
Author: Alexander Duyck <alexander.h.duyck@intel.com>
Date:   Tue Jun 29 18:38:00 2010 +0000

    x86: Align skb w/ start of cacheline on newer core 2/Xeon Arch
end of thread, other threads: [~2026-04-14  9:58 UTC | newest]

Thread overview: 6+ messages:
2026-04-10 14:43 [PATCH] net: Optimize flush calculation in inet_gro_receive() Helge Deller
2026-04-11  5:19 ` Kuniyuki Iwashima
2026-04-11 12:09   ` David Laight
2026-04-14  7:46     ` Helge Deller
2026-04-14  9:36       ` David Laight
2026-04-14  9:57         ` Eric Dumazet