netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers to dscp_t.
@ 2024-10-01 19:28 Guillaume Nault
  2024-10-01 19:28 ` [PATCH net-next 1/5] ipv4: Convert icmp_route_lookup() " Guillaume Nault
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Guillaume Nault @ 2024-10-01 19:28 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: netdev, David Ahern, Ido Schimmel, Pablo Neira Ayuso,
	Jozsef Kadlecsik, Roopa Prabhu, Nikolay Aleksandrov,
	Steffen Klassert, Herbert Xu

Prepare ip_route_input_slow() and its call chain to future conversion
of ->flowi4_tos.

The ->flowi4_tos field of "struct flowi4" is used in many different
places, which makes it hard to convert it from __u8 to dscp_t.

In order to avoid a big patch updating all its users at once, this
patch series gradually converts some users to dscp_t. Those users now
set ->flowi4_tos from a dscp_t variable that is converted to __u8 using
inet_dscp_to_dsfield().

When all users of ->flowi4_tos will use a dscp_t variable, converting
that field to dscp_t will just be a matter of removing all the
inet_dscp_to_dsfield() conversions.

This series concentrates on ip_route_input_slow() and its direct and
indirect callers.

Guillaume Nault (5):
  ipv4: Convert icmp_route_lookup() to dscp_t.
  ipv4: Convert ip_route_input() to dscp_t.
  ipv4: Convert ip_route_input_noref() to dscp_t.
  ipv4: Convert ip_route_input_rcu() to dscp_t.
  ipv4: Convert ip_route_input_slow() to dscp_t.

 drivers/net/ipvlan/ipvlan_l3s.c |  6 ++++--
 include/net/ip.h                |  5 +++++
 include/net/route.h             |  8 ++++----
 net/bridge/br_netfilter_hooks.c |  8 +++++---
 net/core/lwt_bpf.c              |  5 +++--
 net/ipv4/icmp.c                 | 19 +++++++++----------
 net/ipv4/ip_fragment.c          |  4 ++--
 net/ipv4/ip_input.c             |  2 +-
 net/ipv4/ip_options.c           |  3 ++-
 net/ipv4/route.c                | 32 ++++++++++++++++++--------------
 net/ipv4/xfrm4_input.c          |  2 +-
 net/ipv4/xfrm4_protocol.c       |  2 +-
 net/ipv6/ip6_tunnel.c           |  4 ++--
 13 files changed, 57 insertions(+), 43 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH net-next 1/5] ipv4: Convert icmp_route_lookup() to dscp_t.
  2024-10-01 19:28 [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers to dscp_t Guillaume Nault
@ 2024-10-01 19:28 ` Guillaume Nault
  2024-10-01 19:28 ` [PATCH net-next 2/5] ipv4: Convert ip_route_input() " Guillaume Nault
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Guillaume Nault @ 2024-10-01 19:28 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: netdev, David Ahern, Ido Schimmel

Pass a dscp_t variable to icmp_route_lookup(), instead of a plain u8,
to prevent accidental setting of ECN bits in ->flowi4_tos. Rename that
variable ("tos" -> "dscp") to make the intent clear.

While there, reorganise the function parameters to fill up horizontal
space.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
 net/ipv4/icmp.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index e1384e7331d8..7d7b25ed8d21 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -478,13 +478,11 @@ static struct net_device *icmp_get_route_lookup_dev(struct sk_buff *skb)
 	return route_lookup_dev;
 }
 
-static struct rtable *icmp_route_lookup(struct net *net,
-					struct flowi4 *fl4,
+static struct rtable *icmp_route_lookup(struct net *net, struct flowi4 *fl4,
 					struct sk_buff *skb_in,
-					const struct iphdr *iph,
-					__be32 saddr, u8 tos, u32 mark,
-					int type, int code,
-					struct icmp_bxm *param)
+					const struct iphdr *iph, __be32 saddr,
+					dscp_t dscp, u32 mark, int type,
+					int code, struct icmp_bxm *param)
 {
 	struct net_device *route_lookup_dev;
 	struct dst_entry *dst, *dst2;
@@ -498,7 +496,7 @@ static struct rtable *icmp_route_lookup(struct net *net,
 	fl4->saddr = saddr;
 	fl4->flowi4_mark = mark;
 	fl4->flowi4_uid = sock_net_uid(net, NULL);
-	fl4->flowi4_tos = tos & INET_DSCP_MASK;
+	fl4->flowi4_tos = inet_dscp_to_dsfield(dscp);
 	fl4->flowi4_proto = IPPROTO_ICMP;
 	fl4->fl4_icmp_type = type;
 	fl4->fl4_icmp_code = code;
@@ -547,7 +545,7 @@ static struct rtable *icmp_route_lookup(struct net *net,
 		orefdst = skb_in->_skb_refdst; /* save old refdst */
 		skb_dst_set(skb_in, NULL);
 		err = ip_route_input(skb_in, fl4_dec.daddr, fl4_dec.saddr,
-				     tos, rt2->dst.dev);
+				     inet_dscp_to_dsfield(dscp), rt2->dst.dev);
 
 		dst_release(&rt2->dst);
 		rt2 = skb_rtable(skb_in);
@@ -741,8 +739,9 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
 	ipc.opt = &icmp_param.replyopts.opt;
 	ipc.sockc.mark = mark;
 
-	rt = icmp_route_lookup(net, &fl4, skb_in, iph, saddr, tos, mark,
-			       type, code, &icmp_param);
+	rt = icmp_route_lookup(net, &fl4, skb_in, iph, saddr,
+			       inet_dsfield_to_dscp(tos), mark, type, code,
+			       &icmp_param);
 	if (IS_ERR(rt))
 		goto out_unlock;
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH net-next 2/5] ipv4: Convert ip_route_input() to dscp_t.
  2024-10-01 19:28 [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers to dscp_t Guillaume Nault
  2024-10-01 19:28 ` [PATCH net-next 1/5] ipv4: Convert icmp_route_lookup() " Guillaume Nault
@ 2024-10-01 19:28 ` Guillaume Nault
  2024-10-01 19:28 ` [PATCH net-next 3/5] ipv4: Convert ip_route_input_noref() " Guillaume Nault
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Guillaume Nault @ 2024-10-01 19:28 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: netdev, David Ahern, Ido Schimmel, Pablo Neira Ayuso,
	Jozsef Kadlecsik, Roopa Prabhu, Nikolay Aleksandrov

Pass a dscp_t variable to ip_route_input(), instead of a plain u8, to
prevent accidental setting of ECN bits in ->flowi4_tos.

Callers of ip_route_input() to consider are:

  * input_action_end_dx4_finish() and input_action_end_dt4() in
    net/ipv6/seg6_local.c. These functions set the tos parameter to 0,
    which is already a valid dscp_t value, so they don't need to be
    adjusted for the new prototype.

  * icmp_route_lookup(), which already has a dscp_t variable to pass as
    parameter. We just need to remove the inet_dscp_to_dsfield()
    conversion.

  * br_nf_pre_routing_finish(), ip_options_rcv_srr() and ip4ip6_err(),
    which get the DSCP directly from IPv4 headers. Define a helper to
    read the .tos field of struct iphdr as dscp_t, so that these
    function don't have to do the conversion manually.

While there, declare *iph as const in br_nf_pre_routing_finish(),
declare its local variables in reverse-christmas-tree order and move
the "err = ip_route_input()" assignment out of the conditional to avoid
checkpatch warning.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
 include/net/ip.h                | 5 +++++
 include/net/route.h             | 5 +++--
 net/bridge/br_netfilter_hooks.c | 8 +++++---
 net/ipv4/icmp.c                 | 2 +-
 net/ipv4/ip_options.c           | 3 ++-
 net/ipv6/ip6_tunnel.c           | 4 ++--
 6 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index d92d3bc3ec0e..bab084df1567 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -424,6 +424,11 @@ int ip_decrease_ttl(struct iphdr *iph)
 	return --iph->ttl;
 }
 
+static inline dscp_t ip4h_dscp(const struct iphdr *ip4h)
+{
+	return inet_dsfield_to_dscp(ip4h->tos);
+}
+
 static inline int ip_mtu_locked(const struct dst_entry *dst)
 {
 	const struct rtable *rt = dst_rtable(dst);
diff --git a/include/net/route.h b/include/net/route.h
index 1789f1e6640b..03dd28cf4bc4 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -208,12 +208,13 @@ int ip_route_use_hint(struct sk_buff *skb, __be32 dst, __be32 src,
 		      const struct sk_buff *hint);
 
 static inline int ip_route_input(struct sk_buff *skb, __be32 dst, __be32 src,
-				 u8 tos, struct net_device *devin)
+				 dscp_t dscp, struct net_device *devin)
 {
 	int err;
 
 	rcu_read_lock();
-	err = ip_route_input_noref(skb, dst, src, tos, devin);
+	err = ip_route_input_noref(skb, dst, src, inet_dscp_to_dsfield(dscp),
+				   devin);
 	if (!err) {
 		skb_dst_force(skb);
 		if (!skb_dst(skb))
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 0e8bc0ea6175..c6bab2b5e834 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -369,9 +369,9 @@ br_nf_ipv4_daddr_was_changed(const struct sk_buff *skb,
  */
 static int br_nf_pre_routing_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
 {
-	struct net_device *dev = skb->dev, *br_indev;
-	struct iphdr *iph = ip_hdr(skb);
 	struct nf_bridge_info *nf_bridge = nf_bridge_info_get(skb);
+	struct net_device *dev = skb->dev, *br_indev;
+	const struct iphdr *iph = ip_hdr(skb);
 	struct rtable *rt;
 	int err;
 
@@ -389,7 +389,9 @@ static int br_nf_pre_routing_finish(struct net *net, struct sock *sk, struct sk_
 	}
 	nf_bridge->in_prerouting = 0;
 	if (br_nf_ipv4_daddr_was_changed(skb, nf_bridge)) {
-		if ((err = ip_route_input(skb, iph->daddr, iph->saddr, iph->tos, dev))) {
+		err = ip_route_input(skb, iph->daddr, iph->saddr,
+				     ip4h_dscp(iph), dev);
+		if (err) {
 			struct in_device *in_dev = __in_dev_get_rcu(dev);
 
 			/* If err equals -EHOSTUNREACH the error is due to a
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 7d7b25ed8d21..23664434922e 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -545,7 +545,7 @@ static struct rtable *icmp_route_lookup(struct net *net, struct flowi4 *fl4,
 		orefdst = skb_in->_skb_refdst; /* save old refdst */
 		skb_dst_set(skb_in, NULL);
 		err = ip_route_input(skb_in, fl4_dec.daddr, fl4_dec.saddr,
-				     inet_dscp_to_dsfield(dscp), rt2->dst.dev);
+				     dscp, rt2->dst.dev);
 
 		dst_release(&rt2->dst);
 		rt2 = skb_rtable(skb_in);
diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index a9e22a098872..b4c59708fc09 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -617,7 +617,8 @@ int ip_options_rcv_srr(struct sk_buff *skb, struct net_device *dev)
 
 		orefdst = skb->_skb_refdst;
 		skb_dst_set(skb, NULL);
-		err = ip_route_input(skb, nexthop, iph->saddr, iph->tos, dev);
+		err = ip_route_input(skb, nexthop, iph->saddr, ip4h_dscp(iph),
+				     dev);
 		rt2 = skb_rtable(skb);
 		if (err || (rt2->rt_type != RTN_UNICAST && rt2->rt_type != RTN_LOCAL)) {
 			skb_dst_drop(skb);
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index b60e13c42bca..48fd53b98972 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -630,8 +630,8 @@ ip4ip6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 		}
 		skb_dst_set(skb2, &rt->dst);
 	} else {
-		if (ip_route_input(skb2, eiph->daddr, eiph->saddr, eiph->tos,
-				   skb2->dev) ||
+		if (ip_route_input(skb2, eiph->daddr, eiph->saddr,
+				   ip4h_dscp(eiph), skb2->dev) ||
 		    skb_dst(skb2)->dev->type != ARPHRD_TUNNEL6)
 			goto out;
 	}
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH net-next 3/5] ipv4: Convert ip_route_input_noref() to dscp_t.
  2024-10-01 19:28 [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers to dscp_t Guillaume Nault
  2024-10-01 19:28 ` [PATCH net-next 1/5] ipv4: Convert icmp_route_lookup() " Guillaume Nault
  2024-10-01 19:28 ` [PATCH net-next 2/5] ipv4: Convert ip_route_input() " Guillaume Nault
@ 2024-10-01 19:28 ` Guillaume Nault
  2024-10-01 19:28 ` [PATCH net-next 4/5] ipv4: Convert ip_route_input_rcu() " Guillaume Nault
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Guillaume Nault @ 2024-10-01 19:28 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: netdev, David Ahern, Ido Schimmel, Steffen Klassert, Herbert Xu

Pass a dscp_t variable to ip_route_input_noref(), instead of a plain
u8, to prevent accidental setting of ECN bits in ->flowi4_tos.

Callers of ip_route_input_noref() to consider are:

  * arp_process() in net/ipv4/arp.c. This function sets the tos
    parameter to 0, which is already a valid dscp_t value, so it
    doesn't need to be adjusted for the new prototype.

  * ip_route_input(), which already has a dscp_t variable to pass as
    parameter. We just need to remove the inet_dscp_to_dsfield()
    conversion.

  * ipvlan_l3_rcv(), bpf_lwt_input_reroute(), ip_expire(),
    ip_rcv_finish_core(), xfrm4_rcv_encap_finish() and
    xfrm4_rcv_encap(), which get the DSCP directly from IPv4 headers
    and can simply use the ip4h_dscp() helper.

While there, declare the IPv4 header pointers as const in
ipvlan_l3_rcv() and bpf_lwt_input_reroute().
Also, modify the declaration of ip_route_input_noref() in
include/net/route.h so that it matches the prototype of its
implementation in net/ipv4/route.c.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
 drivers/net/ipvlan/ipvlan_l3s.c | 6 ++++--
 include/net/route.h             | 7 +++----
 net/core/lwt_bpf.c              | 5 +++--
 net/ipv4/ip_fragment.c          | 4 ++--
 net/ipv4/ip_input.c             | 2 +-
 net/ipv4/route.c                | 6 +++---
 net/ipv4/xfrm4_input.c          | 2 +-
 net/ipv4/xfrm4_protocol.c       | 2 +-
 8 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan_l3s.c b/drivers/net/ipvlan/ipvlan_l3s.c
index d5b05e803219..b4ef386bdb1b 100644
--- a/drivers/net/ipvlan/ipvlan_l3s.c
+++ b/drivers/net/ipvlan/ipvlan_l3s.c
@@ -2,6 +2,8 @@
 /* Copyright (c) 2014 Mahesh Bandewar <maheshb@google.com>
  */
 
+#include <net/ip.h>
+
 #include "ipvlan.h"
 
 static unsigned int ipvlan_netid __read_mostly;
@@ -48,11 +50,11 @@ static struct sk_buff *ipvlan_l3_rcv(struct net_device *dev,
 	switch (proto) {
 	case AF_INET:
 	{
-		struct iphdr *ip4h = ip_hdr(skb);
+		const struct iphdr *ip4h = ip_hdr(skb);
 		int err;
 
 		err = ip_route_input_noref(skb, ip4h->daddr, ip4h->saddr,
-					   ip4h->tos, sdev);
+					   ip4h_dscp(ip4h), sdev);
 		if (unlikely(err))
 			goto out;
 		break;
diff --git a/include/net/route.h b/include/net/route.h
index 03dd28cf4bc4..5e4374d66927 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -201,8 +201,8 @@ static inline struct rtable *ip_route_output_gre(struct net *net, struct flowi4
 int ip_mc_validate_source(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 			  u8 tos, struct net_device *dev,
 			  struct in_device *in_dev, u32 *itag);
-int ip_route_input_noref(struct sk_buff *skb, __be32 dst, __be32 src,
-			 u8 tos, struct net_device *devin);
+int ip_route_input_noref(struct sk_buff *skb, __be32 daddr, __be32 saddr,
+			 dscp_t dscp, struct net_device *dev);
 int ip_route_use_hint(struct sk_buff *skb, __be32 dst, __be32 src,
 		      u8 tos, struct net_device *devin,
 		      const struct sk_buff *hint);
@@ -213,8 +213,7 @@ static inline int ip_route_input(struct sk_buff *skb, __be32 dst, __be32 src,
 	int err;
 
 	rcu_read_lock();
-	err = ip_route_input_noref(skb, dst, src, inet_dscp_to_dsfield(dscp),
-				   devin);
+	err = ip_route_input_noref(skb, dst, src, dscp, devin);
 	if (!err) {
 		skb_dst_force(skb);
 		if (!skb_dst(skb))
diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
index 1a14f915b7a4..e0ca24a58810 100644
--- a/net/core/lwt_bpf.c
+++ b/net/core/lwt_bpf.c
@@ -10,6 +10,7 @@
 #include <linux/bpf.h>
 #include <net/lwtunnel.h>
 #include <net/gre.h>
+#include <net/ip.h>
 #include <net/ip6_route.h>
 #include <net/ipv6_stubs.h>
 #include <net/inet_dscp.h>
@@ -91,12 +92,12 @@ static int bpf_lwt_input_reroute(struct sk_buff *skb)
 
 	if (skb->protocol == htons(ETH_P_IP)) {
 		struct net_device *dev = skb_dst(skb)->dev;
-		struct iphdr *iph = ip_hdr(skb);
+		const struct iphdr *iph = ip_hdr(skb);
 
 		dev_hold(dev);
 		skb_dst_drop(skb);
 		err = ip_route_input_noref(skb, iph->daddr, iph->saddr,
-					   iph->tos, dev);
+					   ip4h_dscp(iph), dev);
 		dev_put(dev);
 	} else if (skb->protocol == htons(ETH_P_IPV6)) {
 		skb_dst_drop(skb);
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index a92664a5ef2e..48e2810f1f27 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -175,8 +175,8 @@ static void ip_expire(struct timer_list *t)
 
 	/* skb has no dst, perform route lookup again */
 	iph = ip_hdr(head);
-	err = ip_route_input_noref(head, iph->daddr, iph->saddr,
-					   iph->tos, head->dev);
+	err = ip_route_input_noref(head, iph->daddr, iph->saddr, ip4h_dscp(iph),
+				   head->dev);
 	if (err)
 		goto out;
 
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index b6e7d4921309..c0a2490eb7c1 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -363,7 +363,7 @@ static int ip_rcv_finish_core(struct net *net, struct sock *sk,
 	 */
 	if (!skb_valid_dst(skb)) {
 		err = ip_route_input_noref(skb, iph->daddr, iph->saddr,
-					   iph->tos, dev);
+					   ip4h_dscp(iph), dev);
 		if (unlikely(err))
 			goto drop_error;
 	} else {
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 723ac9181558..00bfc0a11f64 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2465,14 +2465,14 @@ static int ip_route_input_rcu(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 }
 
 int ip_route_input_noref(struct sk_buff *skb, __be32 daddr, __be32 saddr,
-			 u8 tos, struct net_device *dev)
+			 dscp_t dscp, struct net_device *dev)
 {
 	struct fib_result res;
 	int err;
 
-	tos &= INET_DSCP_MASK;
 	rcu_read_lock();
-	err = ip_route_input_rcu(skb, daddr, saddr, tos, dev, &res);
+	err = ip_route_input_rcu(skb, daddr, saddr, inet_dscp_to_dsfield(dscp),
+				 dev, &res);
 	rcu_read_unlock();
 
 	return err;
diff --git a/net/ipv4/xfrm4_input.c b/net/ipv4/xfrm4_input.c
index a620618cc568..b5b06323cfd9 100644
--- a/net/ipv4/xfrm4_input.c
+++ b/net/ipv4/xfrm4_input.c
@@ -33,7 +33,7 @@ static inline int xfrm4_rcv_encap_finish(struct net *net, struct sock *sk,
 		const struct iphdr *iph = ip_hdr(skb);
 
 		if (ip_route_input_noref(skb, iph->daddr, iph->saddr,
-					 iph->tos, skb->dev))
+					 ip4h_dscp(iph), skb->dev))
 			goto drop;
 	}
 
diff --git a/net/ipv4/xfrm4_protocol.c b/net/ipv4/xfrm4_protocol.c
index b146ce88c5d0..4ee624d8e66f 100644
--- a/net/ipv4/xfrm4_protocol.c
+++ b/net/ipv4/xfrm4_protocol.c
@@ -76,7 +76,7 @@ int xfrm4_rcv_encap(struct sk_buff *skb, int nexthdr, __be32 spi,
 		const struct iphdr *iph = ip_hdr(skb);
 
 		if (ip_route_input_noref(skb, iph->daddr, iph->saddr,
-					 iph->tos, skb->dev))
+					 ip4h_dscp(iph), skb->dev))
 			goto drop;
 	}
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH net-next 4/5] ipv4: Convert ip_route_input_rcu() to dscp_t.
  2024-10-01 19:28 [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers to dscp_t Guillaume Nault
                   ` (2 preceding siblings ...)
  2024-10-01 19:28 ` [PATCH net-next 3/5] ipv4: Convert ip_route_input_noref() " Guillaume Nault
@ 2024-10-01 19:28 ` Guillaume Nault
  2024-10-01 19:29 ` [PATCH net-next 5/5] ipv4: Convert ip_route_input_slow() " Guillaume Nault
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Guillaume Nault @ 2024-10-01 19:28 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: netdev, David Ahern, Ido Schimmel

Pass a dscp_t variable to ip_route_input_rcu(), instead of a plain u8,
to prevent accidental setting of ECN bits in ->flowi4_tos.

Callers of ip_route_input_rcu() to consider are:

  * ip_route_input_noref(), which already has a dscp_t variable to pass
    as parameter. We just need to remove the inet_dscp_to_dsfield()
    conversion.

  * inet_rtm_getroute(), which receives a u8 from user space and needs
    to convert it with inet_dsfield_to_dscp().

Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
 net/ipv4/route.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 00bfc0a11f64..a693b57b4111 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2415,7 +2415,8 @@ out:	return err;
 
 /* called with rcu_read_lock held */
 static int ip_route_input_rcu(struct sk_buff *skb, __be32 daddr, __be32 saddr,
-			      u8 tos, struct net_device *dev, struct fib_result *res)
+			      dscp_t dscp, struct net_device *dev,
+			      struct fib_result *res)
 {
 	/* Multicast recognition logic is moved from route cache to here.
 	 * The problem was that too many Ethernet cards have broken/missing
@@ -2456,12 +2457,14 @@ static int ip_route_input_rcu(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 #endif
 		   ) {
 			err = ip_route_input_mc(skb, daddr, saddr,
-						tos, dev, our);
+						inet_dscp_to_dsfield(dscp),
+						dev, our);
 		}
 		return err;
 	}
 
-	return ip_route_input_slow(skb, daddr, saddr, tos, dev, res);
+	return ip_route_input_slow(skb, daddr, saddr,
+				   inet_dscp_to_dsfield(dscp), dev, res);
 }
 
 int ip_route_input_noref(struct sk_buff *skb, __be32 daddr, __be32 saddr,
@@ -2471,8 +2474,7 @@ int ip_route_input_noref(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 	int err;
 
 	rcu_read_lock();
-	err = ip_route_input_rcu(skb, daddr, saddr, inet_dscp_to_dsfield(dscp),
-				 dev, &res);
+	err = ip_route_input_rcu(skb, daddr, saddr, dscp, dev, &res);
 	rcu_read_unlock();
 
 	return err;
@@ -3286,8 +3288,8 @@ static int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh,
 		skb->dev	= dev;
 		skb->mark	= mark;
 		err = ip_route_input_rcu(skb, dst, src,
-					 rtm->rtm_tos & INET_DSCP_MASK, dev,
-					 &res);
+					 inet_dsfield_to_dscp(rtm->rtm_tos),
+					 dev, &res);
 
 		rt = skb_rtable(skb);
 		if (err == 0 && rt->dst.error)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH net-next 5/5] ipv4: Convert ip_route_input_slow() to dscp_t.
  2024-10-01 19:28 [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers to dscp_t Guillaume Nault
                   ` (3 preceding siblings ...)
  2024-10-01 19:28 ` [PATCH net-next 4/5] ipv4: Convert ip_route_input_rcu() " Guillaume Nault
@ 2024-10-01 19:29 ` Guillaume Nault
  2024-10-02  2:45 ` [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers " David Ahern
  2024-10-03 23:40 ` patchwork-bot+netdevbpf
  6 siblings, 0 replies; 8+ messages in thread
From: Guillaume Nault @ 2024-10-01 19:29 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: netdev, David Ahern, Ido Schimmel

Pass a dscp_t variable to ip_route_input_slow(), instead of a plain u8,
to prevent accidental setting of ECN bits in ->flowi4_tos.

Only ip_route_input_rcu() actually calls ip_route_input_slow(). Since
it already has a dscp_t variable to pass as parameter, we only need to
remove the inet_dscp_to_dsfield() conversion.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
 net/ipv4/route.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index a693b57b4111..6e1cd0065b87 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2201,7 +2201,7 @@ static struct net_device *ip_rt_get_dev(struct net *net,
  */
 
 static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
-			       u8 tos, struct net_device *dev,
+			       dscp_t dscp, struct net_device *dev,
 			       struct fib_result *res)
 {
 	struct in_device *in_dev = __in_dev_get_rcu(dev);
@@ -2266,7 +2266,7 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 	fl4.flowi4_oif = 0;
 	fl4.flowi4_iif = dev->ifindex;
 	fl4.flowi4_mark = skb->mark;
-	fl4.flowi4_tos = tos;
+	fl4.flowi4_tos = inet_dscp_to_dsfield(dscp);
 	fl4.flowi4_scope = RT_SCOPE_UNIVERSE;
 	fl4.flowi4_flags = 0;
 	fl4.daddr = daddr;
@@ -2299,8 +2299,9 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 	}
 
 	if (res->type == RTN_LOCAL) {
-		err = fib_validate_source(skb, saddr, daddr, tos,
-					  0, dev, in_dev, &itag);
+		err = fib_validate_source(skb, saddr, daddr,
+					  inet_dscp_to_dsfield(dscp), 0, dev,
+					  in_dev, &itag);
 		if (err < 0)
 			goto martian_source;
 		goto local_input;
@@ -2314,7 +2315,8 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 		goto martian_destination;
 
 make_route:
-	err = ip_mkroute_input(skb, res, in_dev, daddr, saddr, tos, flkeys);
+	err = ip_mkroute_input(skb, res, in_dev, daddr, saddr,
+			       inet_dscp_to_dsfield(dscp), flkeys);
 out:	return err;
 
 brd_input:
@@ -2322,7 +2324,8 @@ out:	return err;
 		goto e_inval;
 
 	if (!ipv4_is_zeronet(saddr)) {
-		err = fib_validate_source(skb, saddr, 0, tos, 0, dev,
+		err = fib_validate_source(skb, saddr, 0,
+					  inet_dscp_to_dsfield(dscp), 0, dev,
 					  in_dev, &itag);
 		if (err < 0)
 			goto martian_source;
@@ -2463,8 +2466,7 @@ static int ip_route_input_rcu(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 		return err;
 	}
 
-	return ip_route_input_slow(skb, daddr, saddr,
-				   inet_dscp_to_dsfield(dscp), dev, res);
+	return ip_route_input_slow(skb, daddr, saddr, dscp, dev, res);
 }
 
 int ip_route_input_noref(struct sk_buff *skb, __be32 daddr, __be32 saddr,
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers to dscp_t.
  2024-10-01 19:28 [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers to dscp_t Guillaume Nault
                   ` (4 preceding siblings ...)
  2024-10-01 19:29 ` [PATCH net-next 5/5] ipv4: Convert ip_route_input_slow() " Guillaume Nault
@ 2024-10-02  2:45 ` David Ahern
  2024-10-03 23:40 ` patchwork-bot+netdevbpf
  6 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2024-10-02  2:45 UTC (permalink / raw)
  To: Guillaume Nault, David Miller, Jakub Kicinski, Paolo Abeni,
	Eric Dumazet
  Cc: netdev, Ido Schimmel, Pablo Neira Ayuso, Jozsef Kadlecsik,
	Roopa Prabhu, Nikolay Aleksandrov, Steffen Klassert, Herbert Xu

On 10/1/24 1:28 PM, Guillaume Nault wrote:
> Prepare ip_route_input_slow() and its call chain to future conversion
> of ->flowi4_tos.
> 
> The ->flowi4_tos field of "struct flowi4" is used in many different
> places, which makes it hard to convert it from __u8 to dscp_t.
> 
> In order to avoid a big patch updating all its users at once, this
> patch series gradually converts some users to dscp_t. Those users now
> set ->flowi4_tos from a dscp_t variable that is converted to __u8 using
> inet_dscp_to_dsfield().
> 
> When all users of ->flowi4_tos will use a dscp_t variable, converting
> that field to dscp_t will just be a matter of removing all the
> inet_dscp_to_dsfield() conversions.
> 
> This series concentrates on ip_route_input_slow() and its direct and
> indirect callers.
> 
> Guillaume Nault (5):
>   ipv4: Convert icmp_route_lookup() to dscp_t.
>   ipv4: Convert ip_route_input() to dscp_t.
>   ipv4: Convert ip_route_input_noref() to dscp_t.
>   ipv4: Convert ip_route_input_rcu() to dscp_t.
>   ipv4: Convert ip_route_input_slow() to dscp_t.
> 

LGTM:

Reviewed-by: David Ahern <dsahern@kernel.org>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers to dscp_t.
  2024-10-01 19:28 [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers to dscp_t Guillaume Nault
                   ` (5 preceding siblings ...)
  2024-10-02  2:45 ` [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers " David Ahern
@ 2024-10-03 23:40 ` patchwork-bot+netdevbpf
  6 siblings, 0 replies; 8+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-10-03 23:40 UTC (permalink / raw)
  To: Guillaume Nault
  Cc: davem, kuba, pabeni, edumazet, netdev, dsahern, idosch, pablo,
	kadlec, roopa, razor, steffen.klassert, herbert

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 1 Oct 2024 21:28:30 +0200 you wrote:
> Prepare ip_route_input_slow() and its call chain to future conversion
> of ->flowi4_tos.
> 
> The ->flowi4_tos field of "struct flowi4" is used in many different
> places, which makes it hard to convert it from __u8 to dscp_t.
> 
> In order to avoid a big patch updating all its users at once, this
> patch series gradually converts some users to dscp_t. Those users now
> set ->flowi4_tos from a dscp_t variable that is converted to __u8 using
> inet_dscp_to_dsfield().
> 
> [...]

Here is the summary with links:
  - [net-next,1/5] ipv4: Convert icmp_route_lookup() to dscp_t.
    https://git.kernel.org/netdev/net-next/c/913c83a610bb
  - [net-next,2/5] ipv4: Convert ip_route_input() to dscp_t.
    https://git.kernel.org/netdev/net-next/c/7e863e5db618
  - [net-next,3/5] ipv4: Convert ip_route_input_noref() to dscp_t.
    https://git.kernel.org/netdev/net-next/c/66fb6386d358
  - [net-next,4/5] ipv4: Convert ip_route_input_rcu() to dscp_t.
    https://git.kernel.org/netdev/net-next/c/be612f5e99e1
  - [net-next,5/5] ipv4: Convert ip_route_input_slow() to dscp_t.
    https://git.kernel.org/netdev/net-next/c/783946aa0358

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-10-03 23:40 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-01 19:28 [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers to dscp_t Guillaume Nault
2024-10-01 19:28 ` [PATCH net-next 1/5] ipv4: Convert icmp_route_lookup() " Guillaume Nault
2024-10-01 19:28 ` [PATCH net-next 2/5] ipv4: Convert ip_route_input() " Guillaume Nault
2024-10-01 19:28 ` [PATCH net-next 3/5] ipv4: Convert ip_route_input_noref() " Guillaume Nault
2024-10-01 19:28 ` [PATCH net-next 4/5] ipv4: Convert ip_route_input_rcu() " Guillaume Nault
2024-10-01 19:29 ` [PATCH net-next 5/5] ipv4: Convert ip_route_input_slow() " Guillaume Nault
2024-10-02  2:45 ` [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers " David Ahern
2024-10-03 23:40 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).