netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/1] net: gro: fix misuse of CB in udp socket lookup
@ 2023-07-27 15:25 Richard Gobert
  2023-07-27 15:33 ` [PATCH v3 1/1] " Richard Gobert
  2023-07-29 17:20 ` [PATCH v3 0/1] " patchwork-bot+netdevbpf
  0 siblings, 2 replies; 4+ messages in thread
From: Richard Gobert @ 2023-07-27 15:25 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, willemdebruijn.kernel, dsahern,
	tom, netdev, linux-kernel, gal

GRO stack uses `udp_lib_lookup_skb` which relies on IP/IPv6 CB's info, and
at the GRO stage, CB holds `napi_gro_cb` info. Specifically,
`udp_lib_lookup_skb` tries to fetch `iff` and `flags` information from the
CB, to find the relevant udp tunnel socket (GENEVE/VXLAN/..). Up until a
patch I submitted recently [0], it worked merely by luck, due
to the layouts of `napi_gro_cb` and IP6CB.

AFAIU it worked because:
`IP6CB(skb)->flags` is at offset 16 inside IP6CB:
 - Before the patch: `flags` was mapped to `flush`.
 - After the patch: `flags` was mapped to `data_offset`.

`IP6CB(skb)->iff` is at offset 0 inside IP6CB:
 - Before the patch: `iif` was mapped to `frag0`.
 - After the patch: `iif` was mapped to a union of `frag0` and `last`.

After my patch, on the receive phase, while `data_offset` is 40 (since IPv6
header is 40 bytes), `inet_iif` calls `ipv6_l3mdev_skb`, which checks
whether `IP6CB(skb)->flags`'s `IP6SKB_L3SLAVE` bit is on or off (in our
case its off). If it is off, `inet_iif` returns `IP6CB(skb)->iif`, which is
mapped to `napi_gro_cb->frag0`, making `inet_iif` return 0 most of the
times. `inet_sdif` returns zero due to a similar reason caused by
`data_offset` being equal to 40 (and less than 64).

On the other hand, the complete phase behaves differently.
`data_offset` is usually greater than 64 and less than 128 so the
`IP6SKB_L3SLAVE` flag is on.  Thus, `inet_sdif` returns `IP6CB(skb)->iif`,
which is mapped to `last` which contains a pointer. This causes
`udp_sk_bound_dev_eq` to fail, which leads to `udp6_lib_lookup2` failing
and not returning a socket. This leads the receive phase of GRO
to find the right socket, and on the complete phase, it fails to find it 
and makes the throughput go down to nearly zero.

Before [0] `flags` was mapped to `flush`. `flush`'s possible
values were 1 and 0, making `inet6_iff` always returning `skb->skb_iif` and
`inet6_sdif` returning 0, and leading to `udp_sk_bound_dev_eq` returning
true.

A fix is to not rely on CB, and get `iff` and `sdif` using skb->dev. l3mdev
case requires special attention since it has a master and a slave device.

[0] https://lore.kernel.org/netdev/20230601160924.GA9194@debian/

Changelog:

v2 -> v3:
  * change functions to static inline in the header file

v1 -> v2:
  * make functions inline
  * fix logical bug
  * add a comment when we can use the new functions
  * checkpatch fixes

Richard Gobert (1):
  net: gro: fix misuse of CB in udp socket lookup

 include/net/gro.h      | 43 ++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/udp.c         |  8 ++++++--
 net/ipv4/udp_offload.c |  7 +++++--
 net/ipv6/udp.c         |  8 ++++++--
 net/ipv6/udp_offload.c |  7 +++++--
 5 files changed, 65 insertions(+), 8 deletions(-)

-- 
2.36.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v3 1/1] net: gro: fix misuse of CB in udp socket lookup
  2023-07-27 15:25 [PATCH v3 0/1] net: gro: fix misuse of CB in udp socket lookup Richard Gobert
@ 2023-07-27 15:33 ` Richard Gobert
  2023-07-28 21:21   ` David Ahern
  2023-07-29 17:20 ` [PATCH v3 0/1] " patchwork-bot+netdevbpf
  1 sibling, 1 reply; 4+ messages in thread
From: Richard Gobert @ 2023-07-27 15:33 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, willemdebruijn.kernel, dsahern,
	tom, netdev, linux-kernel, gal

This patch fixes a misuse of IP{6}CB(skb) in GRO, while calling to
`udp6_lib_lookup2` when handling udp tunnels. `udp6_lib_lookup2` fetch the
device from CB. The fix changes it to fetch the device from `skb->dev`.
l3mdev case requires special attention since it has a master and a slave
device.

Fixes: a6024562ffd7 ("udp: Add GRO functions to UDP socket")
Reported-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
---
 include/net/gro.h      | 43 ++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/udp.c         |  8 ++++++--
 net/ipv4/udp_offload.c |  7 +++++--
 net/ipv6/udp.c         |  8 ++++++--
 net/ipv6/udp_offload.c |  7 +++++--
 5 files changed, 65 insertions(+), 8 deletions(-)

diff --git a/include/net/gro.h b/include/net/gro.h
index 75efa6fb8441..88644b3ca660 100644
--- a/include/net/gro.h
+++ b/include/net/gro.h
@@ -452,6 +452,49 @@ static inline void gro_normal_one(struct napi_struct *napi, struct sk_buff *skb,
 		gro_normal_list(napi);
 }
 
+/* This function is the alternative of 'inet_iif' and 'inet_sdif'
+ * functions in case we can not rely on fields of IPCB.
+ *
+ * The caller must verify skb_valid_dst(skb) is false and skb->dev is initialized.
+ * The caller must hold the RCU read lock.
+ */
+static inline void inet_get_iif_sdif(const struct sk_buff *skb, int *iif, int *sdif)
+{
+	*iif = inet_iif(skb) ?: skb->dev->ifindex;
+	*sdif = 0;
+
+#if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
+	if (netif_is_l3_slave(skb->dev)) {
+		struct net_device *master = netdev_master_upper_dev_get_rcu(skb->dev);
+
+		*sdif = *iif;
+		*iif = master ? master->ifindex : 0;
+	}
+#endif
+}
+
+/* This function is the alternative of 'inet6_iif' and 'inet6_sdif'
+ * functions in case we can not rely on fields of IP6CB.
+ *
+ * The caller must verify skb_valid_dst(skb) is false and skb->dev is initialized.
+ * The caller must hold the RCU read lock.
+ */
+static inline void inet6_get_iif_sdif(const struct sk_buff *skb, int *iif, int *sdif)
+{
+	/* using skb->dev->ifindex because skb_dst(skb) is not initialized */
+	*iif = skb->dev->ifindex;
+	*sdif = 0;
+
+#if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
+	if (netif_is_l3_slave(skb->dev)) {
+		struct net_device *master = netdev_master_upper_dev_get_rcu(skb->dev);
+
+		*sdif = *iif;
+		*iif = master ? master->ifindex : 0;
+	}
+#endif
+}
+
 extern struct list_head offload_base;
 
 #endif /* _NET_IPV6_GRO_H */
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 8c3ebd95f5b9..1ee9e56dc79a 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -114,6 +114,7 @@
 #include <net/sock_reuseport.h>
 #include <net/addrconf.h>
 #include <net/udp_tunnel.h>
+#include <net/gro.h>
 #if IS_ENABLED(CONFIG_IPV6)
 #include <net/ipv6_stubs.h>
 #endif
@@ -555,10 +556,13 @@ struct sock *udp4_lib_lookup_skb(const struct sk_buff *skb,
 {
 	const struct iphdr *iph = ip_hdr(skb);
 	struct net *net = dev_net(skb->dev);
+	int iif, sdif;
+
+	inet_get_iif_sdif(skb, &iif, &sdif);
 
 	return __udp4_lib_lookup(net, iph->saddr, sport,
-				 iph->daddr, dport, inet_iif(skb),
-				 inet_sdif(skb), net->ipv4.udp_table, NULL);
+				 iph->daddr, dport, iif,
+				 sdif, net->ipv4.udp_table, NULL);
 }
 
 /* Must be called under rcu_read_lock().
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 75aa4de5b731..d734f11e13cc 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -603,10 +603,13 @@ static struct sock *udp4_gro_lookup_skb(struct sk_buff *skb, __be16 sport,
 {
 	const struct iphdr *iph = skb_gro_network_header(skb);
 	struct net *net = dev_net(skb->dev);
+	int iif, sdif;
+
+	inet_get_iif_sdif(skb, &iif, &sdif);
 
 	return __udp4_lib_lookup(net, iph->saddr, sport,
-				 iph->daddr, dport, inet_iif(skb),
-				 inet_sdif(skb), net->ipv4.udp_table, NULL);
+				 iph->daddr, dport, iif,
+				 sdif, net->ipv4.udp_table, NULL);
 }
 
 INDIRECT_CALLABLE_SCOPE
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index b7c972aa09a7..e5da5d1cb215 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -51,6 +51,7 @@
 #include <net/inet6_hashtables.h>
 #include <net/busy_poll.h>
 #include <net/sock_reuseport.h>
+#include <net/gro.h>
 
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
@@ -300,10 +301,13 @@ struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb,
 {
 	const struct ipv6hdr *iph = ipv6_hdr(skb);
 	struct net *net = dev_net(skb->dev);
+	int iif, sdif;
+
+	inet6_get_iif_sdif(skb, &iif, &sdif);
 
 	return __udp6_lib_lookup(net, &iph->saddr, sport,
-				 &iph->daddr, dport, inet6_iif(skb),
-				 inet6_sdif(skb), net->ipv4.udp_table, NULL);
+				 &iph->daddr, dport, iif,
+				 sdif, net->ipv4.udp_table, NULL);
 }
 
 /* Must be called under rcu_read_lock().
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index ad3b8726873e..31f12bd5d0fe 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -119,10 +119,13 @@ static struct sock *udp6_gro_lookup_skb(struct sk_buff *skb, __be16 sport,
 {
 	const struct ipv6hdr *iph = skb_gro_network_header(skb);
 	struct net *net = dev_net(skb->dev);
+	int iif, sdif;
+
+	inet6_get_iif_sdif(skb, &iif, &sdif);
 
 	return __udp6_lib_lookup(net, &iph->saddr, sport,
-				 &iph->daddr, dport, inet6_iif(skb),
-				 inet6_sdif(skb), net->ipv4.udp_table, NULL);
+				 &iph->daddr, dport, iif,
+				 sdif, net->ipv4.udp_table, NULL);
 }
 
 INDIRECT_CALLABLE_SCOPE
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v3 1/1] net: gro: fix misuse of CB in udp socket lookup
  2023-07-27 15:33 ` [PATCH v3 1/1] " Richard Gobert
@ 2023-07-28 21:21   ` David Ahern
  0 siblings, 0 replies; 4+ messages in thread
From: David Ahern @ 2023-07-28 21:21 UTC (permalink / raw)
  To: Richard Gobert, davem, edumazet, kuba, pabeni,
	willemdebruijn.kernel, tom, netdev, linux-kernel, gal

On 7/27/23 9:33 AM, Richard Gobert wrote:
> This patch fixes a misuse of IP{6}CB(skb) in GRO, while calling to
> `udp6_lib_lookup2` when handling udp tunnels. `udp6_lib_lookup2` fetch the
> device from CB. The fix changes it to fetch the device from `skb->dev`.
> l3mdev case requires special attention since it has a master and a slave
> device.
> 
> Fixes: a6024562ffd7 ("udp: Add GRO functions to UDP socket")
> Reported-by: Gal Pressman <gal@nvidia.com>
> Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
> ---
>  include/net/gro.h      | 43 ++++++++++++++++++++++++++++++++++++++++++
>  net/ipv4/udp.c         |  8 ++++++--
>  net/ipv4/udp_offload.c |  7 +++++--
>  net/ipv6/udp.c         |  8 ++++++--
>  net/ipv6/udp_offload.c |  7 +++++--
>  5 files changed, 65 insertions(+), 8 deletions(-)
> 

I still think the text in the cover letter should be in this commit message.

Code wise, it looks ok to me:
Reviewed-by: David Ahern <dsahern@kernel.org>



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v3 0/1] net: gro: fix misuse of CB in udp socket lookup
  2023-07-27 15:25 [PATCH v3 0/1] net: gro: fix misuse of CB in udp socket lookup Richard Gobert
  2023-07-27 15:33 ` [PATCH v3 1/1] " Richard Gobert
@ 2023-07-29 17:20 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 4+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-07-29 17:20 UTC (permalink / raw)
  To: Richard Gobert
  Cc: davem, edumazet, kuba, pabeni, willemdebruijn.kernel, dsahern,
	tom, netdev, linux-kernel, gal

Hello:

This patch was applied to netdev/net.git (main)
by David S. Miller <davem@davemloft.net>:

On Thu, 27 Jul 2023 17:25:11 +0200 you wrote:
> GRO stack uses `udp_lib_lookup_skb` which relies on IP/IPv6 CB's info, and
> at the GRO stage, CB holds `napi_gro_cb` info. Specifically,
> `udp_lib_lookup_skb` tries to fetch `iff` and `flags` information from the
> CB, to find the relevant udp tunnel socket (GENEVE/VXLAN/..). Up until a
> patch I submitted recently [0], it worked merely by luck, due
> to the layouts of `napi_gro_cb` and IP6CB.
> 
> [...]

Here is the summary with links:
  - [v3,1/1] net: gro: fix misuse of CB in udp socket lookup
    https://git.kernel.org/netdev/net/c/7938cd154368

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-07-29 17:20 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-27 15:25 [PATCH v3 0/1] net: gro: fix misuse of CB in udp socket lookup Richard Gobert
2023-07-27 15:33 ` [PATCH v3 1/1] " Richard Gobert
2023-07-28 21:21   ` David Ahern
2023-07-29 17:20 ` [PATCH v3 0/1] " patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).