* [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers to dscp_t.
From: Guillaume Nault @ 2024-10-01 19:28 UTC (permalink / raw)
To: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: netdev, David Ahern, Ido Schimmel, Pablo Neira Ayuso,
Jozsef Kadlecsik, Roopa Prabhu, Nikolay Aleksandrov,
Steffen Klassert, Herbert Xu
Prepare ip_route_input_slow() and its call chain for the future
conversion of ->flowi4_tos.

The ->flowi4_tos field of "struct flowi4" is used in many different
places, which makes it hard to convert it from __u8 to dscp_t.

In order to avoid a big patch updating all its users at once, this
patch series gradually converts some users to dscp_t. Those users now
set ->flowi4_tos from a dscp_t variable that is converted to __u8 using
inet_dscp_to_dsfield().

Once all users of ->flowi4_tos use a dscp_t variable, converting
that field to dscp_t will just be a matter of removing all the
inet_dscp_to_dsfield() conversions.

This series concentrates on ip_route_input_slow() and its direct and
indirect callers.
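
The conversion pattern described above can be sketched in user space as
follows. This is only an illustration, under stated assumptions: the two
helpers mirror the semantics of include/net/inet_dscp.h (the 0xfc mask
clears the two ECN bits), but the kernel declares dscp_t as a sparse
__bitwise u8, whereas this sketch wraps the value in a one-member struct
so that a plain C compiler rejects mixing it with a raw byte. The names
struct flowi4_model, set_tos_from_dscp() and dscp_model_selfcheck() are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Mask keeping the six DSCP bits and clearing the two ECN bits. */
#define INET_DSCP_MASK 0xfc

/* User-space stand-in for the kernel's dscp_t (assumption: the kernel
 * uses a sparse __bitwise u8; a struct gives similar type safety here).
 */
typedef struct { uint8_t v; } dscp_t;

static inline dscp_t inet_dsfield_to_dscp(uint8_t dsfield)
{
	return (dscp_t){ .v = (uint8_t)(dsfield & INET_DSCP_MASK) };
}

static inline uint8_t inet_dscp_to_dsfield(dscp_t dscp)
{
	return dscp.v;
}

/* Hypothetical, reduced model of struct flowi4: only the field of interest. */
struct flowi4_model {
	uint8_t flowi4_tos;	/* still a plain byte until the final conversion */
};

/* Hypothetical converted user: takes a dscp_t and converts it back to a
 * byte only when storing into ->flowi4_tos.
 */
static inline void set_tos_from_dscp(struct flowi4_model *fl4, dscp_t dscp)
{
	fl4->flowi4_tos = inet_dscp_to_dsfield(dscp);
}

/* Self-check: ECN bits of the header byte never reach ->flowi4_tos. */
static inline void dscp_model_selfcheck(void)
{
	struct flowi4_model fl4;

	set_tos_from_dscp(&fl4, inet_dsfield_to_dscp(0xbb)); /* 0xb8 plus ECN 0x3 */
	assert(fl4.flowi4_tos == 0xb8);
}
```

In this model, passing a raw uint8_t to set_tos_from_dscp() fails to
compile, which is the kind of mistake the dscp_t annotation is meant to
catch with sparse in the kernel.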

Guillaume Nault (5):
  ipv4: Convert icmp_route_lookup() to dscp_t.
  ipv4: Convert ip_route_input() to dscp_t.
  ipv4: Convert ip_route_input_noref() to dscp_t.
  ipv4: Convert ip_route_input_rcu() to dscp_t.
  ipv4: Convert ip_route_input_slow() to dscp_t.

 drivers/net/ipvlan/ipvlan_l3s.c |  6 ++++--
 include/net/ip.h                |  5 +++++
 include/net/route.h             |  8 ++++----
 net/bridge/br_netfilter_hooks.c |  8 +++++---
 net/core/lwt_bpf.c              |  5 +++--
 net/ipv4/icmp.c                 | 19 +++++++++----------
 net/ipv4/ip_fragment.c          |  4 ++--
 net/ipv4/ip_input.c             |  2 +-
 net/ipv4/ip_options.c           |  3 ++-
 net/ipv4/route.c                | 32 ++++++++++++++++++--------------
 net/ipv4/xfrm4_input.c          |  2 +-
 net/ipv4/xfrm4_protocol.c       |  2 +-
 net/ipv6/ip6_tunnel.c           |  4 ++--
 13 files changed, 57 insertions(+), 43 deletions(-)
--
2.39.2
* [PATCH net-next 1/5] ipv4: Convert icmp_route_lookup() to dscp_t.
From: Guillaume Nault @ 2024-10-01 19:28 UTC (permalink / raw)
To: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: netdev, David Ahern, Ido Schimmel
Pass a dscp_t variable to icmp_route_lookup(), instead of a plain u8,
to prevent accidental setting of ECN bits in ->flowi4_tos. Rename that
variable ("tos" -> "dscp") to make the intent clear.

While there, reorganise the function parameters to fill up horizontal
space.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
net/ipv4/icmp.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index e1384e7331d8..7d7b25ed8d21 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -478,13 +478,11 @@ static struct net_device *icmp_get_route_lookup_dev(struct sk_buff *skb)
return route_lookup_dev;
}
-static struct rtable *icmp_route_lookup(struct net *net,
- struct flowi4 *fl4,
+static struct rtable *icmp_route_lookup(struct net *net, struct flowi4 *fl4,
struct sk_buff *skb_in,
- const struct iphdr *iph,
- __be32 saddr, u8 tos, u32 mark,
- int type, int code,
- struct icmp_bxm *param)
+ const struct iphdr *iph, __be32 saddr,
+ dscp_t dscp, u32 mark, int type,
+ int code, struct icmp_bxm *param)
{
struct net_device *route_lookup_dev;
struct dst_entry *dst, *dst2;
@@ -498,7 +496,7 @@ static struct rtable *icmp_route_lookup(struct net *net,
fl4->saddr = saddr;
fl4->flowi4_mark = mark;
fl4->flowi4_uid = sock_net_uid(net, NULL);
- fl4->flowi4_tos = tos & INET_DSCP_MASK;
+ fl4->flowi4_tos = inet_dscp_to_dsfield(dscp);
fl4->flowi4_proto = IPPROTO_ICMP;
fl4->fl4_icmp_type = type;
fl4->fl4_icmp_code = code;
@@ -547,7 +545,7 @@ static struct rtable *icmp_route_lookup(struct net *net,
orefdst = skb_in->_skb_refdst; /* save old refdst */
skb_dst_set(skb_in, NULL);
err = ip_route_input(skb_in, fl4_dec.daddr, fl4_dec.saddr,
- tos, rt2->dst.dev);
+ inet_dscp_to_dsfield(dscp), rt2->dst.dev);
dst_release(&rt2->dst);
rt2 = skb_rtable(skb_in);
@@ -741,8 +739,9 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
ipc.opt = &icmp_param.replyopts.opt;
ipc.sockc.mark = mark;
- rt = icmp_route_lookup(net, &fl4, skb_in, iph, saddr, tos, mark,
- type, code, &icmp_param);
+ rt = icmp_route_lookup(net, &fl4, skb_in, iph, saddr,
+ inet_dsfield_to_dscp(tos), mark, type, code,
+ &icmp_param);
if (IS_ERR(rt))
goto out_unlock;
--
2.39.2
* [PATCH net-next 2/5] ipv4: Convert ip_route_input() to dscp_t.
From: Guillaume Nault @ 2024-10-01 19:28 UTC (permalink / raw)
To: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: netdev, David Ahern, Ido Schimmel, Pablo Neira Ayuso,
Jozsef Kadlecsik, Roopa Prabhu, Nikolay Aleksandrov
Pass a dscp_t variable to ip_route_input(), instead of a plain u8, to
prevent accidental setting of ECN bits in ->flowi4_tos.

Callers of ip_route_input() to consider are:

  * input_action_end_dx4_finish() and input_action_end_dt4() in
    net/ipv6/seg6_local.c. These functions set the tos parameter to 0,
    which is already a valid dscp_t value, so they don't need to be
    adjusted for the new prototype.

  * icmp_route_lookup(), which already has a dscp_t variable to pass as
    parameter. We just need to remove the inet_dscp_to_dsfield()
    conversion.

  * br_nf_pre_routing_finish(), ip_options_rcv_srr() and ip4ip6_err(),
    which get the DSCP directly from IPv4 headers. Define a helper to
    read the .tos field of struct iphdr as dscp_t, so that these
    functions don't have to do the conversion manually.

While there, declare *iph as const in br_nf_pre_routing_finish(),
declare its local variables in reverse-christmas-tree order and move
the "err = ip_route_input()" assignment out of the conditional to avoid
a checkpatch warning.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
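
To illustrate what the ip4h_dscp() helper is for, here is a small
user-space sketch. It is not the kernel code: struct iphdr_model is a
hypothetical one-field stand-in for struct iphdr, dscp_t is modelled as
a struct rather than a sparse __bitwise u8, INET_ECN_MASK is defined
locally as 0x03, and same_dscp_ignoring_ecn() and ip4h_dscp_selfcheck()
are names invented for the example. Only the masking behaviour of
inet_dsfield_to_dscp() is assumed from include/net/inet_dscp.h.

```c
#include <assert.h>
#include <stdint.h>

#define INET_DSCP_MASK 0xfc	/* upper six bits of the DS field */
#define INET_ECN_MASK  0x03	/* lower two bits carry ECN (local definition) */

typedef struct { uint8_t v; } dscp_t;	/* stand-in for the kernel's dscp_t */

/* Hypothetical reduced IPv4 header: only the byte this example reads. */
struct iphdr_model {
	uint8_t tos;
};

static inline dscp_t inet_dsfield_to_dscp(uint8_t dsfield)
{
	return (dscp_t){ .v = (uint8_t)(dsfield & INET_DSCP_MASK) };
}

/* Same shape as the helper this patch adds to include/net/ip.h. */
static inline dscp_t ip4h_dscp(const struct iphdr_model *ip4h)
{
	return inet_dsfield_to_dscp(ip4h->tos);
}

/* Returns 1 if two headers that differ only in their ECN bits yield the
 * same DSCP value, 0 otherwise.
 */
static inline int same_dscp_ignoring_ecn(uint8_t tos)
{
	struct iphdr_model a = { .tos = tos };
	struct iphdr_model b = { .tos = (uint8_t)(tos ^ INET_ECN_MASK) };

	return ip4h_dscp(&a).v == ip4h_dscp(&b).v;
}

/* Self-check with DSCP EF (46) and ECN codepoint 01 set in the header. */
static inline void ip4h_dscp_selfcheck(void)
{
	struct iphdr_model h = { .tos = (uint8_t)((0x2e << 2) | 0x1) };

	assert(ip4h_dscp(&h).v == 0xb8);
}
```

Callers that previously passed iph->tos straight to ip_route_input()
would, in this model, pass ip4h_dscp(iph) instead and no longer need to
think about the ECN bits.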
 include/net/ip.h                | 5 +++++
 include/net/route.h             | 5 +++--
 net/bridge/br_netfilter_hooks.c | 8 +++++---
 net/ipv4/icmp.c                 | 2 +-
 net/ipv4/ip_options.c           | 3 ++-
 net/ipv6/ip6_tunnel.c           | 4 ++--
 6 files changed, 18 insertions(+), 9 deletions(-)
diff --git a/include/net/ip.h b/include/net/ip.h
index d92d3bc3ec0e..bab084df1567 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -424,6 +424,11 @@ int ip_decrease_ttl(struct iphdr *iph)
return --iph->ttl;
}
+static inline dscp_t ip4h_dscp(const struct iphdr *ip4h)
+{
+ return inet_dsfield_to_dscp(ip4h->tos);
+}
+
static inline int ip_mtu_locked(const struct dst_entry *dst)
{
const struct rtable *rt = dst_rtable(dst);
diff --git a/include/net/route.h b/include/net/route.h
index 1789f1e6640b..03dd28cf4bc4 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -208,12 +208,13 @@ int ip_route_use_hint(struct sk_buff *skb, __be32 dst, __be32 src,
const struct sk_buff *hint);
static inline int ip_route_input(struct sk_buff *skb, __be32 dst, __be32 src,
- u8 tos, struct net_device *devin)
+ dscp_t dscp, struct net_device *devin)
{
int err;
rcu_read_lock();
- err = ip_route_input_noref(skb, dst, src, tos, devin);
+ err = ip_route_input_noref(skb, dst, src, inet_dscp_to_dsfield(dscp),
+ devin);
if (!err) {
skb_dst_force(skb);
if (!skb_dst(skb))
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 0e8bc0ea6175..c6bab2b5e834 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -369,9 +369,9 @@ br_nf_ipv4_daddr_was_changed(const struct sk_buff *skb,
*/
static int br_nf_pre_routing_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
{
- struct net_device *dev = skb->dev, *br_indev;
- struct iphdr *iph = ip_hdr(skb);
struct nf_bridge_info *nf_bridge = nf_bridge_info_get(skb);
+ struct net_device *dev = skb->dev, *br_indev;
+ const struct iphdr *iph = ip_hdr(skb);
struct rtable *rt;
int err;
@@ -389,7 +389,9 @@ static int br_nf_pre_routing_finish(struct net *net, struct sock *sk, struct sk_
}
nf_bridge->in_prerouting = 0;
if (br_nf_ipv4_daddr_was_changed(skb, nf_bridge)) {
- if ((err = ip_route_input(skb, iph->daddr, iph->saddr, iph->tos, dev))) {
+ err = ip_route_input(skb, iph->daddr, iph->saddr,
+ ip4h_dscp(iph), dev);
+ if (err) {
struct in_device *in_dev = __in_dev_get_rcu(dev);
/* If err equals -EHOSTUNREACH the error is due to a
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 7d7b25ed8d21..23664434922e 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -545,7 +545,7 @@ static struct rtable *icmp_route_lookup(struct net *net, struct flowi4 *fl4,
orefdst = skb_in->_skb_refdst; /* save old refdst */
skb_dst_set(skb_in, NULL);
err = ip_route_input(skb_in, fl4_dec.daddr, fl4_dec.saddr,
- inet_dscp_to_dsfield(dscp), rt2->dst.dev);
+ dscp, rt2->dst.dev);
dst_release(&rt2->dst);
rt2 = skb_rtable(skb_in);
diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index a9e22a098872..b4c59708fc09 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -617,7 +617,8 @@ int ip_options_rcv_srr(struct sk_buff *skb, struct net_device *dev)
orefdst = skb->_skb_refdst;
skb_dst_set(skb, NULL);
- err = ip_route_input(skb, nexthop, iph->saddr, iph->tos, dev);
+ err = ip_route_input(skb, nexthop, iph->saddr, ip4h_dscp(iph),
+ dev);
rt2 = skb_rtable(skb);
if (err || (rt2->rt_type != RTN_UNICAST && rt2->rt_type != RTN_LOCAL)) {
skb_dst_drop(skb);
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index b60e13c42bca..48fd53b98972 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -630,8 +630,8 @@ ip4ip6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
}
skb_dst_set(skb2, &rt->dst);
} else {
- if (ip_route_input(skb2, eiph->daddr, eiph->saddr, eiph->tos,
- skb2->dev) ||
+ if (ip_route_input(skb2, eiph->daddr, eiph->saddr,
+ ip4h_dscp(eiph), skb2->dev) ||
skb_dst(skb2)->dev->type != ARPHRD_TUNNEL6)
goto out;
}
--
2.39.2
* [PATCH net-next 3/5] ipv4: Convert ip_route_input_noref() to dscp_t.
From: Guillaume Nault @ 2024-10-01 19:28 UTC (permalink / raw)
To: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: netdev, David Ahern, Ido Schimmel, Steffen Klassert, Herbert Xu
Pass a dscp_t variable to ip_route_input_noref(), instead of a plain
u8, to prevent accidental setting of ECN bits in ->flowi4_tos.

Callers of ip_route_input_noref() to consider are:

  * arp_process() in net/ipv4/arp.c. This function sets the tos
    parameter to 0, which is already a valid dscp_t value, so it
    doesn't need to be adjusted for the new prototype.

  * ip_route_input(), which already has a dscp_t variable to pass as
    parameter. We just need to remove the inet_dscp_to_dsfield()
    conversion.

  * ipvlan_l3_rcv(), bpf_lwt_input_reroute(), ip_expire(),
    ip_rcv_finish_core(), xfrm4_rcv_encap_finish() and
    xfrm4_rcv_encap(), which get the DSCP directly from IPv4 headers
    and can simply use the ip4h_dscp() helper.

While there, declare the IPv4 header pointers as const in
ipvlan_l3_rcv() and bpf_lwt_input_reroute().

Also, modify the declaration of ip_route_input_noref() in
include/net/route.h so that it matches the prototype of its
implementation in net/ipv4/route.c.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
 drivers/net/ipvlan/ipvlan_l3s.c | 6 ++++--
 include/net/route.h             | 7 +++----
 net/core/lwt_bpf.c              | 5 +++--
 net/ipv4/ip_fragment.c          | 4 ++--
 net/ipv4/ip_input.c             | 2 +-
 net/ipv4/route.c                | 6 +++---
 net/ipv4/xfrm4_input.c          | 2 +-
 net/ipv4/xfrm4_protocol.c       | 2 +-
 8 files changed, 18 insertions(+), 16 deletions(-)
diff --git a/drivers/net/ipvlan/ipvlan_l3s.c b/drivers/net/ipvlan/ipvlan_l3s.c
index d5b05e803219..b4ef386bdb1b 100644
--- a/drivers/net/ipvlan/ipvlan_l3s.c
+++ b/drivers/net/ipvlan/ipvlan_l3s.c
@@ -2,6 +2,8 @@
/* Copyright (c) 2014 Mahesh Bandewar <maheshb@google.com>
*/
+#include <net/ip.h>
+
#include "ipvlan.h"
static unsigned int ipvlan_netid __read_mostly;
@@ -48,11 +50,11 @@ static struct sk_buff *ipvlan_l3_rcv(struct net_device *dev,
switch (proto) {
case AF_INET:
{
- struct iphdr *ip4h = ip_hdr(skb);
+ const struct iphdr *ip4h = ip_hdr(skb);
int err;
err = ip_route_input_noref(skb, ip4h->daddr, ip4h->saddr,
- ip4h->tos, sdev);
+ ip4h_dscp(ip4h), sdev);
if (unlikely(err))
goto out;
break;
diff --git a/include/net/route.h b/include/net/route.h
index 03dd28cf4bc4..5e4374d66927 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -201,8 +201,8 @@ static inline struct rtable *ip_route_output_gre(struct net *net, struct flowi4
int ip_mc_validate_source(struct sk_buff *skb, __be32 daddr, __be32 saddr,
u8 tos, struct net_device *dev,
struct in_device *in_dev, u32 *itag);
-int ip_route_input_noref(struct sk_buff *skb, __be32 dst, __be32 src,
- u8 tos, struct net_device *devin);
+int ip_route_input_noref(struct sk_buff *skb, __be32 daddr, __be32 saddr,
+ dscp_t dscp, struct net_device *dev);
int ip_route_use_hint(struct sk_buff *skb, __be32 dst, __be32 src,
u8 tos, struct net_device *devin,
const struct sk_buff *hint);
@@ -213,8 +213,7 @@ static inline int ip_route_input(struct sk_buff *skb, __be32 dst, __be32 src,
int err;
rcu_read_lock();
- err = ip_route_input_noref(skb, dst, src, inet_dscp_to_dsfield(dscp),
- devin);
+ err = ip_route_input_noref(skb, dst, src, dscp, devin);
if (!err) {
skb_dst_force(skb);
if (!skb_dst(skb))
diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
index 1a14f915b7a4..e0ca24a58810 100644
--- a/net/core/lwt_bpf.c
+++ b/net/core/lwt_bpf.c
@@ -10,6 +10,7 @@
#include <linux/bpf.h>
#include <net/lwtunnel.h>
#include <net/gre.h>
+#include <net/ip.h>
#include <net/ip6_route.h>
#include <net/ipv6_stubs.h>
#include <net/inet_dscp.h>
@@ -91,12 +92,12 @@ static int bpf_lwt_input_reroute(struct sk_buff *skb)
if (skb->protocol == htons(ETH_P_IP)) {
struct net_device *dev = skb_dst(skb)->dev;
- struct iphdr *iph = ip_hdr(skb);
+ const struct iphdr *iph = ip_hdr(skb);
dev_hold(dev);
skb_dst_drop(skb);
err = ip_route_input_noref(skb, iph->daddr, iph->saddr,
- iph->tos, dev);
+ ip4h_dscp(iph), dev);
dev_put(dev);
} else if (skb->protocol == htons(ETH_P_IPV6)) {
skb_dst_drop(skb);
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index a92664a5ef2e..48e2810f1f27 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -175,8 +175,8 @@ static void ip_expire(struct timer_list *t)
/* skb has no dst, perform route lookup again */
iph = ip_hdr(head);
- err = ip_route_input_noref(head, iph->daddr, iph->saddr,
- iph->tos, head->dev);
+ err = ip_route_input_noref(head, iph->daddr, iph->saddr, ip4h_dscp(iph),
+ head->dev);
if (err)
goto out;
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index b6e7d4921309..c0a2490eb7c1 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -363,7 +363,7 @@ static int ip_rcv_finish_core(struct net *net, struct sock *sk,
*/
if (!skb_valid_dst(skb)) {
err = ip_route_input_noref(skb, iph->daddr, iph->saddr,
- iph->tos, dev);
+ ip4h_dscp(iph), dev);
if (unlikely(err))
goto drop_error;
} else {
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 723ac9181558..00bfc0a11f64 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2465,14 +2465,14 @@ static int ip_route_input_rcu(struct sk_buff *skb, __be32 daddr, __be32 saddr,
}
int ip_route_input_noref(struct sk_buff *skb, __be32 daddr, __be32 saddr,
- u8 tos, struct net_device *dev)
+ dscp_t dscp, struct net_device *dev)
{
struct fib_result res;
int err;
- tos &= INET_DSCP_MASK;
rcu_read_lock();
- err = ip_route_input_rcu(skb, daddr, saddr, tos, dev, &res);
+ err = ip_route_input_rcu(skb, daddr, saddr, inet_dscp_to_dsfield(dscp),
+ dev, &res);
rcu_read_unlock();
return err;
diff --git a/net/ipv4/xfrm4_input.c b/net/ipv4/xfrm4_input.c
index a620618cc568..b5b06323cfd9 100644
--- a/net/ipv4/xfrm4_input.c
+++ b/net/ipv4/xfrm4_input.c
@@ -33,7 +33,7 @@ static inline int xfrm4_rcv_encap_finish(struct net *net, struct sock *sk,
const struct iphdr *iph = ip_hdr(skb);
if (ip_route_input_noref(skb, iph->daddr, iph->saddr,
- iph->tos, skb->dev))
+ ip4h_dscp(iph), skb->dev))
goto drop;
}
diff --git a/net/ipv4/xfrm4_protocol.c b/net/ipv4/xfrm4_protocol.c
index b146ce88c5d0..4ee624d8e66f 100644
--- a/net/ipv4/xfrm4_protocol.c
+++ b/net/ipv4/xfrm4_protocol.c
@@ -76,7 +76,7 @@ int xfrm4_rcv_encap(struct sk_buff *skb, int nexthdr, __be32 spi,
const struct iphdr *iph = ip_hdr(skb);
if (ip_route_input_noref(skb, iph->daddr, iph->saddr,
- iph->tos, skb->dev))
+ ip4h_dscp(iph), skb->dev))
goto drop;
}
--
2.39.2
* [PATCH net-next 4/5] ipv4: Convert ip_route_input_rcu() to dscp_t.
From: Guillaume Nault @ 2024-10-01 19:28 UTC (permalink / raw)
To: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: netdev, David Ahern, Ido Schimmel
Pass a dscp_t variable to ip_route_input_rcu(), instead of a plain u8,
to prevent accidental setting of ECN bits in ->flowi4_tos.

Callers of ip_route_input_rcu() to consider are:

  * ip_route_input_noref(), which already has a dscp_t variable to pass
    as parameter. We just need to remove the inet_dscp_to_dsfield()
    conversion.

  * inet_rtm_getroute(), which receives a u8 from user space and needs
    to convert it with inet_dsfield_to_dscp().

Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
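
For inet_rtm_getroute(), the open-coded "rtm->rtm_tos & INET_DSCP_MASK"
is replaced by inet_dsfield_to_dscp(rtm->rtm_tos). The user-space sketch
below checks, for every possible byte, that both forms end up with the
same value once converted back to a DS field. It assumes the helper
semantics of include/net/inet_dscp.h and models dscp_t as a struct;
count_mismatches() is a hypothetical name used only here.

```c
#include <assert.h>
#include <stdint.h>

#define INET_DSCP_MASK 0xfc	/* clears the two ECN bits */

typedef struct { uint8_t v; } dscp_t;	/* stand-in for the kernel's dscp_t */

static inline dscp_t inet_dsfield_to_dscp(uint8_t dsfield)
{
	return (dscp_t){ .v = (uint8_t)(dsfield & INET_DSCP_MASK) };
}

static inline uint8_t inet_dscp_to_dsfield(dscp_t dscp)
{
	return dscp.v;
}

/* Counts the input bytes for which the old open-coded mask and the
 * helper-based round trip disagree. A result of 0 means the two forms
 * are interchangeable for every value user space can send.
 */
static inline unsigned int count_mismatches(void)
{
	unsigned int tos, mismatches = 0;

	for (tos = 0; tos <= 0xff; tos++) {
		uint8_t old_val = (uint8_t)(tos & INET_DSCP_MASK);
		uint8_t new_val =
			inet_dscp_to_dsfield(inet_dsfield_to_dscp((uint8_t)tos));

		if (old_val != new_val)
			mismatches++;
	}

	return mismatches;
}
```

Calling count_mismatches() from a test program and comparing the result
with 0 is enough to exercise all 256 inputs.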
net/ipv4/route.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 00bfc0a11f64..a693b57b4111 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2415,7 +2415,8 @@ out: return err;
/* called with rcu_read_lock held */
static int ip_route_input_rcu(struct sk_buff *skb, __be32 daddr, __be32 saddr,
- u8 tos, struct net_device *dev, struct fib_result *res)
+ dscp_t dscp, struct net_device *dev,
+ struct fib_result *res)
{
/* Multicast recognition logic is moved from route cache to here.
* The problem was that too many Ethernet cards have broken/missing
@@ -2456,12 +2457,14 @@ static int ip_route_input_rcu(struct sk_buff *skb, __be32 daddr, __be32 saddr,
#endif
) {
err = ip_route_input_mc(skb, daddr, saddr,
- tos, dev, our);
+ inet_dscp_to_dsfield(dscp),
+ dev, our);
}
return err;
}
- return ip_route_input_slow(skb, daddr, saddr, tos, dev, res);
+ return ip_route_input_slow(skb, daddr, saddr,
+ inet_dscp_to_dsfield(dscp), dev, res);
}
int ip_route_input_noref(struct sk_buff *skb, __be32 daddr, __be32 saddr,
@@ -2471,8 +2474,7 @@ int ip_route_input_noref(struct sk_buff *skb, __be32 daddr, __be32 saddr,
int err;
rcu_read_lock();
- err = ip_route_input_rcu(skb, daddr, saddr, inet_dscp_to_dsfield(dscp),
- dev, &res);
+ err = ip_route_input_rcu(skb, daddr, saddr, dscp, dev, &res);
rcu_read_unlock();
return err;
@@ -3286,8 +3288,8 @@ static int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh,
skb->dev = dev;
skb->mark = mark;
err = ip_route_input_rcu(skb, dst, src,
- rtm->rtm_tos & INET_DSCP_MASK, dev,
- &res);
+ inet_dsfield_to_dscp(rtm->rtm_tos),
+ dev, &res);
rt = skb_rtable(skb);
if (err == 0 && rt->dst.error)
--
2.39.2
* [PATCH net-next 5/5] ipv4: Convert ip_route_input_slow() to dscp_t.
From: Guillaume Nault @ 2024-10-01 19:29 UTC (permalink / raw)
To: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: netdev, David Ahern, Ido Schimmel
Pass a dscp_t variable to ip_route_input_slow(), instead of a plain u8,
to prevent accidental setting of ECN bits in ->flowi4_tos.

Only ip_route_input_rcu() actually calls ip_route_input_slow(). Since
it already has a dscp_t variable to pass as parameter, we only need to
remove the inet_dscp_to_dsfield() conversion.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
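
The shape of the call chain once this patch is applied can be modelled
in user space as below. This is a deliberately reduced sketch: the
*_model functions and struct flow_model are hypothetical, they take only
the DSCP argument, and the other places where ip_route_input_slow()
still converts back to a byte (fib_validate_source(), ip_mkroute_input())
are left out. The point it shows is that dscp_t travels unchanged through
every level and is only turned into a byte at the leaf.

```c
#include <assert.h>
#include <stdint.h>

#define INET_DSCP_MASK 0xfc	/* clears the two ECN bits */

typedef struct { uint8_t v; } dscp_t;	/* stand-in for the kernel's dscp_t */

static inline dscp_t inet_dsfield_to_dscp(uint8_t dsfield)
{
	return (dscp_t){ .v = (uint8_t)(dsfield & INET_DSCP_MASK) };
}

static inline uint8_t inet_dscp_to_dsfield(dscp_t dscp)
{
	return dscp.v;
}

/* Hypothetical stand-in for the flow key built by the slow path. */
struct flow_model {
	uint8_t flowi4_tos;
};

/* Leaf of the chain: the only level in this model that needs a byte. */
static inline int route_input_slow_model(dscp_t dscp, struct flow_model *fl)
{
	fl->flowi4_tos = inet_dscp_to_dsfield(dscp);
	return 0;
}

/* Intermediate levels pass the dscp_t through without converting it. */
static inline int route_input_rcu_model(dscp_t dscp, struct flow_model *fl)
{
	return route_input_slow_model(dscp, fl);
}

static inline int route_input_noref_model(dscp_t dscp, struct flow_model *fl)
{
	return route_input_rcu_model(dscp, fl);
}

/* Entry point model: a caller holding a raw header byte converts it
 * once, before entering the chain.
 */
static inline int route_input_from_tos(uint8_t tos, struct flow_model *fl)
{
	return route_input_noref_model(inet_dsfield_to_dscp(tos), fl);
}
```

With a raw byte of 0x13 at the entry point, the flow key in this model
ends up with 0x10, since the two low ECN bits are dropped at the first
conversion and never reintroduced further down.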
net/ipv4/route.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index a693b57b4111..6e1cd0065b87 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2201,7 +2201,7 @@ static struct net_device *ip_rt_get_dev(struct net *net,
*/
static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
- u8 tos, struct net_device *dev,
+ dscp_t dscp, struct net_device *dev,
struct fib_result *res)
{
struct in_device *in_dev = __in_dev_get_rcu(dev);
@@ -2266,7 +2266,7 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
fl4.flowi4_oif = 0;
fl4.flowi4_iif = dev->ifindex;
fl4.flowi4_mark = skb->mark;
- fl4.flowi4_tos = tos;
+ fl4.flowi4_tos = inet_dscp_to_dsfield(dscp);
fl4.flowi4_scope = RT_SCOPE_UNIVERSE;
fl4.flowi4_flags = 0;
fl4.daddr = daddr;
@@ -2299,8 +2299,9 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
}
if (res->type == RTN_LOCAL) {
- err = fib_validate_source(skb, saddr, daddr, tos,
- 0, dev, in_dev, &itag);
+ err = fib_validate_source(skb, saddr, daddr,
+ inet_dscp_to_dsfield(dscp), 0, dev,
+ in_dev, &itag);
if (err < 0)
goto martian_source;
goto local_input;
@@ -2314,7 +2315,8 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
goto martian_destination;
make_route:
- err = ip_mkroute_input(skb, res, in_dev, daddr, saddr, tos, flkeys);
+ err = ip_mkroute_input(skb, res, in_dev, daddr, saddr,
+ inet_dscp_to_dsfield(dscp), flkeys);
out: return err;
brd_input:
@@ -2322,7 +2324,8 @@ out: return err;
goto e_inval;
if (!ipv4_is_zeronet(saddr)) {
- err = fib_validate_source(skb, saddr, 0, tos, 0, dev,
+ err = fib_validate_source(skb, saddr, 0,
+ inet_dscp_to_dsfield(dscp), 0, dev,
in_dev, &itag);
if (err < 0)
goto martian_source;
@@ -2463,8 +2466,7 @@ static int ip_route_input_rcu(struct sk_buff *skb, __be32 daddr, __be32 saddr,
return err;
}
- return ip_route_input_slow(skb, daddr, saddr,
- inet_dscp_to_dsfield(dscp), dev, res);
+ return ip_route_input_slow(skb, daddr, saddr, dscp, dev, res);
}
int ip_route_input_noref(struct sk_buff *skb, __be32 daddr, __be32 saddr,
--
2.39.2
* Re: [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers to dscp_t.
From: David Ahern @ 2024-10-02 2:45 UTC (permalink / raw)
To: Guillaume Nault, David Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet
Cc: netdev, Ido Schimmel, Pablo Neira Ayuso, Jozsef Kadlecsik,
Roopa Prabhu, Nikolay Aleksandrov, Steffen Klassert, Herbert Xu
On 10/1/24 1:28 PM, Guillaume Nault wrote:
> Prepare ip_route_input_slow() and its call chain to future conversion
> of ->flowi4_tos.
>
> The ->flowi4_tos field of "struct flowi4" is used in many different
> places, which makes it hard to convert it from __u8 to dscp_t.
>
> In order to avoid a big patch updating all its users at once, this
> patch series gradually converts some users to dscp_t. Those users now
> set ->flowi4_tos from a dscp_t variable that is converted to __u8 using
> inet_dscp_to_dsfield().
>
> When all users of ->flowi4_tos will use a dscp_t variable, converting
> that field to dscp_t will just be a matter of removing all the
> inet_dscp_to_dsfield() conversions.
>
> This series concentrates on ip_route_input_slow() and its direct and
> indirect callers.
>
> Guillaume Nault (5):
> ipv4: Convert icmp_route_lookup() to dscp_t.
> ipv4: Convert ip_route_input() to dscp_t.
> ipv4: Convert ip_route_input_noref() to dscp_t.
> ipv4: Convert ip_route_input_rcu() to dscp_t.
> ipv4: Convert ip_route_input_slow() to dscp_t.
>
LGTM:
Reviewed-by: David Ahern <dsahern@kernel.org>
* Re: [PATCH net-next 0/5] ipv4: Convert ip_route_input_slow() and its callers to dscp_t.
From: patchwork-bot+netdevbpf @ 2024-10-03 23:40 UTC (permalink / raw)
To: Guillaume Nault
Cc: davem, kuba, pabeni, edumazet, netdev, dsahern, idosch, pablo,
kadlec, roopa, razor, steffen.klassert, herbert
Hello:
This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Tue, 1 Oct 2024 21:28:30 +0200 you wrote:
> Prepare ip_route_input_slow() and its call chain to future conversion
> of ->flowi4_tos.
>
> The ->flowi4_tos field of "struct flowi4" is used in many different
> places, which makes it hard to convert it from __u8 to dscp_t.
>
> In order to avoid a big patch updating all its users at once, this
> patch series gradually converts some users to dscp_t. Those users now
> set ->flowi4_tos from a dscp_t variable that is converted to __u8 using
> inet_dscp_to_dsfield().
>
> [...]
Here is the summary with links:
- [net-next,1/5] ipv4: Convert icmp_route_lookup() to dscp_t.
https://git.kernel.org/netdev/net-next/c/913c83a610bb
- [net-next,2/5] ipv4: Convert ip_route_input() to dscp_t.
https://git.kernel.org/netdev/net-next/c/7e863e5db618
- [net-next,3/5] ipv4: Convert ip_route_input_noref() to dscp_t.
https://git.kernel.org/netdev/net-next/c/66fb6386d358
- [net-next,4/5] ipv4: Convert ip_route_input_rcu() to dscp_t.
https://git.kernel.org/netdev/net-next/c/be612f5e99e1
- [net-next,5/5] ipv4: Convert ip_route_input_slow() to dscp_t.
https://git.kernel.org/netdev/net-next/c/783946aa0358
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html