netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 net] net: udp: fix Fast/frag0 UDP GRO
@ 2020-11-09 21:15 Alexander Lobakin
  2020-11-09 22:24 ` Willem de Bruijn
  2020-11-13 12:15 ` [net] 0b726f6b31: BUG:unable_to_handle_page_fault_for_address kernel test robot
  0 siblings, 2 replies; 7+ messages in thread
From: Alexander Lobakin @ 2020-11-09 21:15 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: Alexey Kuznetsov, Hideaki YOSHIFUJI, Paolo Abeni,
	Willem de Bruijn, Steffen Klassert, Alexander Lobakin, netdev,
	linux-kernel

While testing UDP GSO fraglists forwarding through driver that uses
Fast GRO (via napi_gro_frags()), I was observing lots of out-of-order
iperf packets:

[ ID] Interval           Transfer     Bitrate         Jitter
[SUM]  0.0-40.0 sec  12106 datagrams received out-of-order

Simple switch to napi_gro_receive() any other method without frag0
shortcut completely resolved them.

I've found that UDP GRO uses udp_hdr(skb) in its .gro_receive()
callback. While it's probably OK for non-frag0 paths (when all
headers or even the entire frame are already in skb->data), this
inline points to junk when using Fast GRO (napi_gro_frags() or
napi_gro_receive() with only Ethernet header in skb->data and all
the rest in shinfo->frags) and breaks GRO packet compilation and
the packet flow itself.
To support both modes, skb_gro_header_fast() + skb_gro_header_slow()
are typically used. UDP even has an inline helper that makes use of
them, udp_gro_udphdr(). Use that instead of troublemaking udp_hdr()
to get rid of the out-of-order delivers.

Present since the introduction of plain UDP GRO in 5.0-rc1.

Since v2 [1]:
 - dropped redundant check introduced in v2 as it's performed right
   before (thanks to Eric);
 - udp_hdr() switched to data + off for skbs from list (also Eric);
 - fixed possible malfunction of {,__}udp{4,6}_lib_lookup_skb() with
   Fast/frag0 due to ip{,v6}_hdr() usage (Willem).

Since v1 [2]:
 - added a NULL pointer check for "uh" as suggested by Willem.

[1] https://lore.kernel.org/netdev/0eaG8xtbtKY1dEKCTKUBubGiC9QawGgB3tVZtNqVdY@cp4-web-030.plabs.ch
[2] https://lore.kernel.org/netdev/YazU6GEzBdpyZMDMwJirxDX7B4sualpDG68ADZYvJI@cp4-web-034.plabs.ch

Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.")
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
---
 net/ipv4/udp.c         | 4 ++--
 net/ipv4/udp_offload.c | 9 ++++++---
 net/ipv6/udp.c         | 4 ++--
 3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 09f0a23d1a01..948ddc9a0212 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -534,7 +534,7 @@ static inline struct sock *__udp4_lib_lookup_skb(struct sk_buff *skb,
 						 __be16 sport, __be16 dport,
 						 struct udp_table *udptable)
 {
-	const struct iphdr *iph = ip_hdr(skb);
+	const struct iphdr *iph = skb_gro_network_header(skb);
 
 	return __udp4_lib_lookup(dev_net(skb->dev), iph->saddr, sport,
 				 iph->daddr, dport, inet_iif(skb),
@@ -544,7 +544,7 @@ static inline struct sock *__udp4_lib_lookup_skb(struct sk_buff *skb,
 struct sock *udp4_lib_lookup_skb(struct sk_buff *skb,
 				 __be16 sport, __be16 dport)
 {
-	const struct iphdr *iph = ip_hdr(skb);
+	const struct iphdr *iph = skb_gro_network_header(skb);
 
 	return __udp4_lib_lookup(dev_net(skb->dev), iph->saddr, sport,
 				 iph->daddr, dport, inet_iif(skb),
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index e67a66fbf27b..dbc4d17c55e9 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -366,11 +366,11 @@ static struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb,
 static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
 					       struct sk_buff *skb)
 {
-	struct udphdr *uh = udp_hdr(skb);
+	struct udphdr *uh = udp_gro_udphdr(skb);
 	struct sk_buff *pp = NULL;
 	struct udphdr *uh2;
 	struct sk_buff *p;
-	unsigned int ulen;
+	u32 ulen, off;
 	int ret = 0;
 
 	/* requires non zero csum, for symmetry with GSO */
@@ -385,6 +385,9 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
 		NAPI_GRO_CB(skb)->flush = 1;
 		return NULL;
 	}
+
+	off = skb_gro_offset(skb);
+
 	/* pull encapsulating udp header */
 	skb_gro_pull(skb, sizeof(struct udphdr));
 
@@ -392,7 +395,7 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
 		if (!NAPI_GRO_CB(p)->same_flow)
 			continue;
 
-		uh2 = udp_hdr(p);
+		uh2 = (void *)p->data + off;
 
 		/* Match ports only, as csum is always non zero */
 		if ((*(u32 *)&uh->source != *(u32 *)&uh2->source)) {
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 29d9691359b9..a256ecce76b2 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -269,7 +269,7 @@ static struct sock *__udp6_lib_lookup_skb(struct sk_buff *skb,
 					  __be16 sport, __be16 dport,
 					  struct udp_table *udptable)
 {
-	const struct ipv6hdr *iph = ipv6_hdr(skb);
+	const struct ipv6hdr *iph = skb_gro_network_header(skb);
 
 	return __udp6_lib_lookup(dev_net(skb->dev), &iph->saddr, sport,
 				 &iph->daddr, dport, inet6_iif(skb),
@@ -279,7 +279,7 @@ static struct sock *__udp6_lib_lookup_skb(struct sk_buff *skb,
 struct sock *udp6_lib_lookup_skb(struct sk_buff *skb,
 				 __be16 sport, __be16 dport)
 {
-	const struct ipv6hdr *iph = ipv6_hdr(skb);
+	const struct ipv6hdr *iph = skb_gro_network_header(skb);
 
 	return __udp6_lib_lookup(dev_net(skb->dev), &iph->saddr, sport,
 				 &iph->daddr, dport, inet6_iif(skb),
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 7+ messages in thread
* Re: [net] 0b726f6b31: BUG:unable_to_handle_page_fault_for_address
@ 2020-11-16 13:30 Alexander Lobakin
  0 siblings, 0 replies; 7+ messages in thread
From: Alexander Lobakin @ 2020-11-16 13:30 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Alexander Lobakin, David S. Miller, Jakub Kicinski,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Paolo Abeni,
	Steffen Klassert, Network Development, linux-kernel, 0day robot,
	lkp, kernel test robot

From: Alexander Lobakin <alobakin@pm.me>
Date: Mon, 16 Nov 2020 12:54:42 +0000

> From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> Date: Fri, 13 Nov 2020 10:51:36 -0500
>
> Hi!
>
>> On Fri, Nov 13, 2020 at 7:00 AM kernel test robot <oliver.sang@intel.com> wrote:
>>>
>>>
>>> Greeting,
>>>
>>> FYI, we noticed the following commit (built with gcc-9):
>>>
>>> commit: 0b726f6b318a07644b6c2388e6e44406740f4754 ("[PATCH v3 net] net: udp: fix Fast/frag0 UDP GRO")
>>> url: https://github.com/0day-ci/linux/commits/Alexander-Lobakin/net-udp-fix-Fast-frag0-UDP-GRO/20201110-052215
>>> base: https://git.kernel.org/cgit/linux/kernel/git/davem/net.git 4e0396c59559264442963b349ab71f66e471f84d
>>>
>>> in testcase: apachebench
>>> version:
>>> with following parameters:
>>>
>>>         runtime: 300s
>>>         concurrency: 2000
>>>         cluster: cs-localhost
>>>         cpufreq_governor: performance
>>>         ucode: 0x7000019
>>>
>>> test-description: apachebench is a tool for benchmarking your Apache Hypertext Transfer Protocol (HTTP) server.
>>> test-url: https://httpd.apache.org/docs/2.4/programs/ab.html
>>>
>>>
>>> on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 48G memory
>>>
>>> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>>>
>>>
>>> If you fix the issue, kindly add following tag
>>> Reported-by: kernel test robot <oliver.sang@intel.com>
>>>
>>>
>>> [   28.582714] BUG: unable to handle page fault for address: fffffffffffffffa
>>> [   28.590164] #PF: supervisor read access in kernel mode
>>> [   28.590164] #PF: error_code(0x0000) - not-present page
>>> [   28.590165] PGD c7e20d067 P4D c7e20d067 PUD c7e20f067 PMD 0
>>> [   28.590169] Oops: 0000 [#1] SMP PTI
>>> [   28.590171] CPU: 15 PID: 0 Comm: swapper/15 Not tainted 5.10.0-rc2-00373-g0b726f6b318a #1
>>> [   28.590172] Hardware name: Supermicro SYS-5018D-FN4T/X10SDV-8C-TLN4F, BIOS 1.1 03/02/2016
>>> [   28.590177] RIP: 0010:__udp4_lib_rcv+0x547/0xbe0
>>> [   28.590178] Code: 74 0a f6 45 3c 80 74 04 44 8b 4d 28 48 8b 55 58 48 83 e2 fe 74 07 8b 52 7c 85 d2 75 06 8b 95 90 00 00 00 48 8b be f0 04 00 00 <44> 8b 58 0c 8b 48 10 55 41 55 44 89 de 41 51 41 89 d1 44 89 d2 e8
>>> [   28.590179] RSP: 0018:ffffc900003b4bb8 EFLAGS: 00010246
>>> [   28.590180] RAX: ffffffffffffffee RBX: 0000000000000011 RCX: ffff888c7bc580e2
>>> [   28.590181] RDX: 0000000000000002 RSI: ffff88810ddc8000 RDI: ffffffff82d68f00
>>> [   28.590182] RBP: ffff888c7bf8f800 R08: 00000000000003b7 R09: 0000000000000000
>>> [   28.590182] R10: 0000000000003500 R11: 0000000000000000 R12: ffff888c7bc580e2
>>> [   28.590183] R13: ffffffff82e072b0 R14: ffffffff82d68f00 R15: 0000000000000034
>>> [   28.590184] FS:  0000000000000000(0000) GS:ffff888c7fdc0000(0000) knlGS:0000000000000000
>>> [   28.590185] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [   28.590186] CR2: fffffffffffffffa CR3: 0000000c7e20a006 CR4: 00000000003706e0
>>> [   28.590186] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> [   28.590187] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>> [   28.590187] Call Trace:
>>> [   28.590189]  <IRQ>
>>> [   28.590193]  ip_protocol_deliver_rcu+0xc5/0x1c0
>>> [   28.590196]  ip_local_deliver_finish+0x4b/0x60
>>> [   28.738714]  ip_local_deliver+0x6e/0x140
>>> [   28.738717]  ip_sublist_rcv_finish+0x57/0x80
>>> [   28.738719]  ip_sublist_rcv+0x199/0x240
>>> [   28.750730]  ip_list_rcv+0x13a/0x160
>>> [   28.750733]  __netif_receive_skb_list_core+0x2a9/0x2e0
>>> [   28.750736]  netif_receive_skb_list_internal+0x1d3/0x320
>>> [   28.764743]  gro_normal_list+0x19/0x40
>>> [   28.764747]  napi_complete_done+0x68/0x160
>>> [   28.773197]  igb_poll+0x63/0x320
>>> [   28.773198]  net_rx_action+0x136/0x3a0
>>> [   28.773201]  __do_softirq+0xe1/0x2c3
>>> [   28.773204]  asm_call_irq_on_stack+0x12/0x20
>>> [   28.773205]  </IRQ>
>>> [   28.773208]  do_softirq_own_stack+0x37/0x40
>>> [   28.773211]  irq_exit_rcu+0xd2/0xe0
>>> [   28.773213]  common_interrupt+0x74/0x140
>>> [   28.773216]  asm_common_interrupt+0x1e/0x40
>>> [   28.773219] RIP: 0010:cpuidle_enter_state+0xd2/0x360
>>
>> This was expected. This v3 of the patch has already been superseded by
>> one that addresses this lookup:
>
> Wait. This page fault happens on IP receive, which is performed after
> all GRO processing. At this point, all headers are pulled to skb->head,
> and no GRO helpers are needed to access them.
> The function that causes that, __udp4_lib_rcv(), uses ip_hdr() a lot,
> and it's safe. There should be another questionable point.

Oh well. I simply missed that this autotest was performed with v3.
I thought that the merged one was found to be broken. Glad it's fine.

>>> @@ -534,7 +534,7 @@ static inline struct sock *__udp4_lib_lookup_skb(struct sk_buff *skb,
>>>                                                  __be16 sport, __be16 dport,
>>>                                                  struct udp_table *udptable)
>>>  {
>>> -       const struct iphdr *iph = ip_hdr(skb);
>>> +       const struct iphdr *iph = skb_gro_network_header(skb);
>>
>> The merged version was v5 and lacks this change.
>
> Thanks,
> Al

Al


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-11-16 13:38 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-11-09 21:15 [PATCH v3 net] net: udp: fix Fast/frag0 UDP GRO Alexander Lobakin
2020-11-09 22:24 ` Willem de Bruijn
2020-11-13 12:15 ` [net] 0b726f6b31: BUG:unable_to_handle_page_fault_for_address kernel test robot
2020-11-13 15:51   ` Willem de Bruijn
2020-11-16 12:54     ` Alexander Lobakin
2020-11-16 13:37       ` Willem de Bruijn
  -- strict thread matches above, loose matches on Subject: below --
2020-11-16 13:30 Alexander Lobakin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).