* [PATCH net-next 00/12] Unmask upper DSCP bits - part 2
@ 2024-08-27 11:18 Ido Schimmel
  2024-08-27 11:18 ` [PATCH net-next 01/12] ipv4: Unmask upper DSCP bits in RTM_GETROUTE output route lookup Ido Schimmel
                   ` (12 more replies)
  0 siblings, 13 replies; 34+ messages in thread
From: Ido Schimmel @ 2024-08-27 11:18 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, gnault, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf,
	Ido Schimmel

tl;dr - This patchset continues to unmask the upper DSCP bits in the
IPv4 flow key in preparation for allowing IPv4 FIB rules to match on
DSCP. No functional changes are expected. Part 1 was merged in commit
("Merge branch 'unmask-upper-dscp-bits-part-1'").

The TOS field in the IPv4 flow key ('flowi4_tos') is used during FIB
lookup to match against the TOS selector in FIB rules and routes.

It is currently impossible for user space to configure FIB rules that
match on the DSCP value as the upper DSCP bits are either masked in the
various call sites that initialize the IPv4 flow key or along the path
to the FIB core.

In preparation for adding a DSCP selector to IPv4 and IPv6 FIB rules, we
need to make sure the entire DSCP value is present in the IPv4 flow key.
This patchset continues to unmask the upper DSCP bits, but this time in
the output route path.

Patches #1-#3 unmask the upper DSCP bits in the various places that
invoke the core output route lookup functions directly.

Patches #4-#6 do the same in three helpers that are widely used in the
output path to initialize the TOS field in the IPv4 flow key.

The rest of the patches continue to unmask these bits in call sites that
invoke the following wrappers around the core lookup functions:

Patch #7 - __ip_route_output_key()
Patches #8-#12 - ip_route_output_flow()

The next patchset will handle the callers of ip_route_output_ports() and
ip_route_output_key().

No functional changes are expected as commit 1fa3314c14c6 ("ipv4:
Centralize TOS matching") moved the masking of the upper DSCP bits to
the core where 'flowi4_tos' is matched against the TOS selector.
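
For illustration only (not part of the series), here is a minimal
user-space sketch of why widening the mask at the call sites should not
change lookup results while the core still masks the key before
comparing. The mask values mirror the kernel definitions
(IPTOS_TOS_MASK is 0x1E, INET_DSCP_MASK is 0xFC). The sketch_* names
and the shape of the comparison helper are assumptions made for the
example, not the kernel's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Mask values copied from the kernel headers; the SKETCH_ prefix marks
 * them as local redefinitions so the example builds in user space.
 */
#define SKETCH_IPTOS_TOS_MASK	0x1E
#define SKETCH_IPTOS_RT_MASK	(SKETCH_IPTOS_TOS_MASK & ~3)	/* 0x1C */
#define SKETCH_INET_DSCP_MASK	0xFC	/* upper six bits of the DS field */

/* Hypothetical model of the centralized comparison: the core masks the
 * TOS in the flow key before comparing it against the selector.
 */
static int sketch_tos_match(uint8_t selector, uint8_t flowi4_tos)
{
	return selector == (flowi4_tos & SKETCH_IPTOS_RT_MASK);
}

/* Flow key TOS as a call site initialized it before this series. */
static uint8_t sketch_key_old(uint8_t dsfield)
{
	return dsfield & SKETCH_IPTOS_RT_MASK;
}

/* Flow key TOS after this series: full DSCP kept, ECN bits cleared. */
static uint8_t sketch_key_new(uint8_t dsfield)
{
	return dsfield & SKETCH_INET_DSCP_MASK;
}

/* Returns 1 if both initializations give the same match result for
 * every possible DS field and selector value, 0 otherwise.
 */
static int sketch_no_functional_change(void)
{
	unsigned int ds, sel;

	assert(SKETCH_IPTOS_RT_MASK == 0x1C);

	for (ds = 0; ds <= 0xFF; ds++)
		for (sel = 0; sel <= 0xFF; sel++)
			if (sketch_tos_match(sel, sketch_key_old(ds)) !=
			    sketch_tos_match(sel, sketch_key_new(ds)))
				return 0;
	return 1;
}
```

In this model the two initializations differ only in bits that the
comparison masks out again, which is the property the "no functional
changes" statement relies on.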

Ido Schimmel (12):
  ipv4: Unmask upper DSCP bits in RTM_GETROUTE output route lookup
  ipv4: Unmask upper DSCP bits in ip_route_output_key_hash()
  ipv4: icmp: Unmask upper DSCP bits in icmp_route_lookup()
  ipv4: Unmask upper DSCP bits in ip_sock_rt_tos()
  ipv4: Unmask upper DSCP bits in get_rttos()
  ipv4: Unmask upper DSCP bits when building flow key
  xfrm: Unmask upper DSCP bits in xfrm_get_tos()
  ipv4: Unmask upper DSCP bits in ip_send_unicast_reply()
  ipv6: sit: Unmask upper DSCP bits in ipip6_tunnel_xmit()
  ipvlan: Unmask upper DSCP bits in ipvlan_process_v4_outbound()
  vrf: Unmask upper DSCP bits in vrf_process_v4_outbound()
  bpf: Unmask upper DSCP bits in __bpf_redirect_neigh_v4()

 drivers/net/ipvlan/ipvlan_core.c | 4 +++-
 drivers/net/vrf.c                | 3 ++-
 include/net/ip.h                 | 5 ++++-
 include/net/route.h              | 5 ++---
 net/core/filter.c                | 2 +-
 net/ipv4/icmp.c                  | 3 ++-
 net/ipv4/ip_output.c             | 3 ++-
 net/ipv4/route.c                 | 8 ++++----
 net/ipv6/sit.c                   | 5 +++--
 net/xfrm/xfrm_policy.c           | 3 ++-
 10 files changed, 25 insertions(+), 16 deletions(-)

-- 
2.46.0



* [PATCH net-next 01/12] ipv4: Unmask upper DSCP bits in RTM_GETROUTE output route lookup
  2024-08-27 11:18 [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Ido Schimmel
@ 2024-08-27 11:18 ` Ido Schimmel
  2024-08-27 13:55   ` Guillaume Nault
  2024-08-27 11:18 ` [PATCH net-next 02/12] ipv4: Unmask upper DSCP bits in ip_route_output_key_hash() Ido Schimmel
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2024-08-27 11:18 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, gnault, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf,
	Ido Schimmel

Unmask the upper DSCP bits when looking up an output route via the
RTM_GETROUTE netlink message so that in the future the lookup could be
performed according to the full DSCP value.

No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv4/route.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index f6972b24664a..e4b45aa18470 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -3261,7 +3261,7 @@ static int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh,
 
 	fl4.daddr = dst;
 	fl4.saddr = src;
-	fl4.flowi4_tos = rtm->rtm_tos & IPTOS_RT_MASK;
+	fl4.flowi4_tos = rtm->rtm_tos & INET_DSCP_MASK;
 	fl4.flowi4_oif = tb[RTA_OIF] ? nla_get_u32(tb[RTA_OIF]) : 0;
 	fl4.flowi4_mark = mark;
 	fl4.flowi4_uid = uid;
-- 
2.46.0



* [PATCH net-next 02/12] ipv4: Unmask upper DSCP bits in ip_route_output_key_hash()
  2024-08-27 11:18 [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Ido Schimmel
  2024-08-27 11:18 ` [PATCH net-next 01/12] ipv4: Unmask upper DSCP bits in RTM_GETROUTE output route lookup Ido Schimmel
@ 2024-08-27 11:18 ` Ido Schimmel
  2024-08-27 13:57   ` Guillaume Nault
  2024-08-27 11:18 ` [PATCH net-next 03/12] ipv4: icmp: Unmask upper DSCP bits in icmp_route_lookup() Ido Schimmel
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2024-08-27 11:18 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, gnault, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf,
	Ido Schimmel

Unmask the upper DSCP bits so that in the future output routes could be
looked up according to the full DSCP value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv4/route.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index e4b45aa18470..5a77dc6d9c72 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2618,7 +2618,7 @@ struct rtable *ip_route_output_key_hash(struct net *net, struct flowi4 *fl4,
 	struct rtable *rth;
 
 	fl4->flowi4_iif = LOOPBACK_IFINDEX;
-	fl4->flowi4_tos &= IPTOS_RT_MASK;
+	fl4->flowi4_tos &= INET_DSCP_MASK;
 
 	rcu_read_lock();
 	rth = ip_route_output_key_hash_rcu(net, fl4, &res, skb);
-- 
2.46.0



* [PATCH net-next 03/12] ipv4: icmp: Unmask upper DSCP bits in icmp_route_lookup()
  2024-08-27 11:18 [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Ido Schimmel
  2024-08-27 11:18 ` [PATCH net-next 01/12] ipv4: Unmask upper DSCP bits in RTM_GETROUTE output route lookup Ido Schimmel
  2024-08-27 11:18 ` [PATCH net-next 02/12] ipv4: Unmask upper DSCP bits in ip_route_output_key_hash() Ido Schimmel
@ 2024-08-27 11:18 ` Ido Schimmel
  2024-08-27 14:16   ` Guillaume Nault
  2024-08-27 11:18 ` [PATCH net-next 04/12] ipv4: Unmask upper DSCP bits in ip_sock_rt_tos() Ido Schimmel
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2024-08-27 11:18 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, gnault, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf,
	Ido Schimmel

The function is called to resolve a route for an ICMP message that is
sent in response to a situation. Based on the type of the generated ICMP
message, the function is either passed the DS field of the packet that
generated the ICMP message or a DS field that is derived from it.

Unmask the upper DSCP bits before resolving an output route via
ip_route_output_key_hash() so that in the future the lookup could be
performed according to the full DSCP value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv4/icmp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index b8f56d03fcbb..441057f2c903 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -93,6 +93,7 @@
 #include <net/ip_fib.h>
 #include <net/l3mdev.h>
 #include <net/addrconf.h>
+#include <net/inet_dscp.h>
 #define CREATE_TRACE_POINTS
 #include <trace/events/icmp.h>
 
@@ -496,7 +497,7 @@ static struct rtable *icmp_route_lookup(struct net *net,
 	fl4->saddr = saddr;
 	fl4->flowi4_mark = mark;
 	fl4->flowi4_uid = sock_net_uid(net, NULL);
-	fl4->flowi4_tos = RT_TOS(tos);
+	fl4->flowi4_tos = tos & INET_DSCP_MASK;
 	fl4->flowi4_proto = IPPROTO_ICMP;
 	fl4->fl4_icmp_type = type;
 	fl4->fl4_icmp_code = code;
-- 
2.46.0



* [PATCH net-next 04/12] ipv4: Unmask upper DSCP bits in ip_sock_rt_tos()
  2024-08-27 11:18 [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Ido Schimmel
                   ` (2 preceding siblings ...)
  2024-08-27 11:18 ` [PATCH net-next 03/12] ipv4: icmp: Unmask upper DSCP bits in icmp_route_lookup() Ido Schimmel
@ 2024-08-27 11:18 ` Ido Schimmel
  2024-08-27 14:29   ` Guillaume Nault
  2024-08-27 11:18 ` [PATCH net-next 05/12] ipv4: Unmask upper DSCP bits in get_rttos() Ido Schimmel
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2024-08-27 11:18 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, gnault, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf,
	Ido Schimmel

The function is used to read the DS field that was stored in IPv4
sockets via the IP_TOS socket option so that it could be used to
initialize the flowi4_tos field before resolving an output route.

Unmask the upper DSCP bits so that in the future the output route lookup
could be performed according to the full DSCP value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 include/net/route.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/net/route.h b/include/net/route.h
index 93833cfe9c96..b896f086ec8e 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -27,6 +27,7 @@
 #include <net/ip_fib.h>
 #include <net/arp.h>
 #include <net/ndisc.h>
+#include <net/inet_dscp.h>
 #include <linux/in_route.h>
 #include <linux/rtnetlink.h>
 #include <linux/rcupdate.h>
@@ -45,7 +46,7 @@ static inline __u8 ip_sock_rt_scope(const struct sock *sk)
 
 static inline __u8 ip_sock_rt_tos(const struct sock *sk)
 {
-	return RT_TOS(READ_ONCE(inet_sk(sk)->tos));
+	return READ_ONCE(inet_sk(sk)->tos) & INET_DSCP_MASK;
 }
 
 struct ip_tunnel_info;
-- 
2.46.0



* [PATCH net-next 05/12] ipv4: Unmask upper DSCP bits in get_rttos()
  2024-08-27 11:18 [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Ido Schimmel
                   ` (3 preceding siblings ...)
  2024-08-27 11:18 ` [PATCH net-next 04/12] ipv4: Unmask upper DSCP bits in ip_sock_rt_tos() Ido Schimmel
@ 2024-08-27 11:18 ` Ido Schimmel
  2024-08-27 14:43   ` Guillaume Nault
  2024-08-27 11:18 ` [PATCH net-next 06/12] ipv4: Unmask upper DSCP bits when building flow key Ido Schimmel
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2024-08-27 11:18 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, gnault, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf,
	Ido Schimmel

The function is used by a few socket types to retrieve the TOS value
with which to perform the FIB lookup for packets sent through the socket
(flowi4_tos). If a DS field was passed using the IP_TOS control message,
then it is used. Otherwise, the one specified via the IP_TOS socket
option is used.

Unmask the upper DSCP bits so that in the future the lookup could be
performed according to the full DSCP value.
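
For illustration only, a user-space model of the selection logic
described above: prefer the DS field from the IP_TOS control message
when one was supplied, fall back to the socket option otherwise, then
clear only the ECN bits. The sketch_* structs are hypothetical
stand-ins for the kernel's ipcm_cookie and inet_sock and carry only the
one field the example needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INET_DSCP_MASK	0xFC	/* upper six bits of the DS field */

/* Hypothetical stand-ins for the two inputs of get_rttos(). */
struct sketch_ipcm_cookie {
	int tos;		/* -1 means no IP_TOS control message */
};

struct sketch_inet_sock {
	uint8_t tos;		/* value set via the IP_TOS socket option */
};

/* Pick the DS field for the route lookup and keep its full DSCP. */
static uint8_t sketch_get_rttos(const struct sketch_ipcm_cookie *ipc,
				const struct sketch_inet_sock *inet)
{
	uint8_t dsfield;

	assert(ipc != NULL && inet != NULL);

	dsfield = ipc->tos != -1 ? (uint8_t)ipc->tos : inet->tos;

	return dsfield & SKETCH_INET_DSCP_MASK;
}
```

With the socket option set to 0xB9 and no control message, the model
returns 0xB8; with a control message carrying 0x2B it returns 0x28
regardless of the socket option.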

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 include/net/ip.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index c5606cadb1a5..2b43f04c7d03 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -33,6 +33,7 @@
 #include <net/flow_dissector.h>
 #include <net/netns/hash.h>
 #include <net/lwtunnel.h>
+#include <net/inet_dscp.h>
 
 #define IPV4_MAX_PMTU		65535U		/* RFC 2675, Section 5.1 */
 #define IPV4_MIN_MTU		68			/* RFC 791 */
@@ -258,7 +259,9 @@ static inline u8 ip_sendmsg_scope(const struct inet_sock *inet,
 
 static inline __u8 get_rttos(struct ipcm_cookie* ipc, struct inet_sock *inet)
 {
-	return (ipc->tos != -1) ? RT_TOS(ipc->tos) : RT_TOS(READ_ONCE(inet->tos));
+	u8 dsfield = ipc->tos != -1 ? ipc->tos : READ_ONCE(inet->tos);
+
+	return dsfield & INET_DSCP_MASK;
 }
 
 /* datagram.c */
-- 
2.46.0



* [PATCH net-next 06/12] ipv4: Unmask upper DSCP bits when building flow key
  2024-08-27 11:18 [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Ido Schimmel
                   ` (4 preceding siblings ...)
  2024-08-27 11:18 ` [PATCH net-next 05/12] ipv4: Unmask upper DSCP bits in get_rttos() Ido Schimmel
@ 2024-08-27 11:18 ` Ido Schimmel
  2024-08-27 14:51   ` Guillaume Nault
  2024-08-27 11:18 ` [PATCH net-next 07/12] xfrm: Unmask upper DSCP bits in xfrm_get_tos() Ido Schimmel
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2024-08-27 11:18 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, gnault, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf,
	Ido Schimmel

build_sk_flow_key() and __build_flow_key() are used to build an IPv4
flow key before calling one of the FIB lookup APIs.

Unmask the upper DSCP bits so that in the future the lookup could be
performed according to the full DSCP value.

Remove IPTOS_RT_MASK since it is no longer used.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 include/net/route.h | 2 --
 net/ipv4/route.c    | 4 ++--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/include/net/route.h b/include/net/route.h
index b896f086ec8e..1789f1e6640b 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -266,8 +266,6 @@ static inline void ip_rt_put(struct rtable *rt)
 	dst_release(&rt->dst);
 }
 
-#define IPTOS_RT_MASK	(IPTOS_TOS_MASK & ~3)
-
 extern const __u8 ip_tos2prio[16];
 
 static inline char rt_tos2priority(u8 tos)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 5a77dc6d9c72..723ac9181558 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -512,7 +512,7 @@ static void __build_flow_key(const struct net *net, struct flowi4 *fl4,
 						    sk->sk_protocol;
 	}
 
-	flowi4_init_output(fl4, oif, mark, tos & IPTOS_RT_MASK, scope,
+	flowi4_init_output(fl4, oif, mark, tos & INET_DSCP_MASK, scope,
 			   prot, flow_flags, iph->daddr, iph->saddr, 0, 0,
 			   sock_net_uid(net, sk));
 }
@@ -541,7 +541,7 @@ static void build_sk_flow_key(struct flowi4 *fl4, const struct sock *sk)
 	if (inet_opt && inet_opt->opt.srr)
 		daddr = inet_opt->opt.faddr;
 	flowi4_init_output(fl4, sk->sk_bound_dev_if, READ_ONCE(sk->sk_mark),
-			   ip_sock_rt_tos(sk) & IPTOS_RT_MASK,
+			   ip_sock_rt_tos(sk),
 			   ip_sock_rt_scope(sk),
 			   inet_test_bit(HDRINCL, sk) ?
 				IPPROTO_RAW : sk->sk_protocol,
-- 
2.46.0



* [PATCH net-next 07/12] xfrm: Unmask upper DSCP bits in xfrm_get_tos()
  2024-08-27 11:18 [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Ido Schimmel
                   ` (5 preceding siblings ...)
  2024-08-27 11:18 ` [PATCH net-next 06/12] ipv4: Unmask upper DSCP bits when building flow key Ido Schimmel
@ 2024-08-27 11:18 ` Ido Schimmel
  2024-08-27 14:54   ` Guillaume Nault
  2024-08-27 11:18 ` [PATCH net-next 08/12] ipv4: Unmask upper DSCP bits in ip_send_unicast_reply() Ido Schimmel
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2024-08-27 11:18 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, gnault, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf,
	Ido Schimmel

The function returns a value that is used to initialize 'flowi4_tos'
before being passed to the FIB lookup API in the following call chain:

xfrm_bundle_create()
	tos = xfrm_get_tos(fl, family)
	xfrm_dst_lookup(..., tos, ...)
		__xfrm_dst_lookup(..., tos, ...)
			xfrm4_dst_lookup(..., tos, ...)
				__xfrm4_dst_lookup(..., tos, ...)
					fl4->flowi4_tos = tos
					__ip_route_output_key(net, fl4)

Unmask the upper DSCP bits so that in the future the output route lookup
could be performed according to the full DSCP value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/xfrm/xfrm_policy.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index c56c61b0c12e..b22767c0c078 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -45,6 +45,7 @@
 #ifdef CONFIG_XFRM_ESPINTCP
 #include <net/espintcp.h>
 #endif
+#include <net/inet_dscp.h>
 
 #include "xfrm_hash.h"
 
@@ -2561,7 +2562,7 @@ xfrm_tmpl_resolve(struct xfrm_policy **pols, int npols, const struct flowi *fl,
 static int xfrm_get_tos(const struct flowi *fl, int family)
 {
 	if (family == AF_INET)
-		return IPTOS_RT_MASK & fl->u.ip4.flowi4_tos;
+		return fl->u.ip4.flowi4_tos & INET_DSCP_MASK;
 
 	return 0;
 }
-- 
2.46.0



* [PATCH net-next 08/12] ipv4: Unmask upper DSCP bits in ip_send_unicast_reply()
  2024-08-27 11:18 [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Ido Schimmel
                   ` (6 preceding siblings ...)
  2024-08-27 11:18 ` [PATCH net-next 07/12] xfrm: Unmask upper DSCP bits in xfrm_get_tos() Ido Schimmel
@ 2024-08-27 11:18 ` Ido Schimmel
  2024-08-27 15:09   ` Guillaume Nault
  2024-08-27 11:18 ` [PATCH net-next 09/12] ipv6: sit: Unmask upper DSCP bits in ipip6_tunnel_xmit() Ido Schimmel
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2024-08-27 11:18 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, gnault, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf,
	Ido Schimmel

The function calls flowi4_init_output() to initialize an IPv4 flow key
with which it then performs a FIB lookup using ip_route_output_flow().

'arg->tos' with which the TOS value in the IPv4 flow key (flowi4_tos) is
initialized contains the full DS field. Unmask the upper DSCP bits so
that in the future the FIB lookup could be performed according to the
full DSCP value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv4/ip_output.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index b90d0f78ac80..eea443b7f65e 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -77,6 +77,7 @@
 #include <net/inetpeer.h>
 #include <net/inet_ecn.h>
 #include <net/lwtunnel.h>
+#include <net/inet_dscp.h>
 #include <linux/bpf-cgroup.h>
 #include <linux/igmp.h>
 #include <linux/netfilter_ipv4.h>
@@ -1621,7 +1622,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 
 	flowi4_init_output(&fl4, oif,
 			   IP4_REPLY_MARK(net, skb->mark) ?: sk->sk_mark,
-			   RT_TOS(arg->tos),
+			   arg->tos & INET_DSCP_MASK,
 			   RT_SCOPE_UNIVERSE, ip_hdr(skb)->protocol,
 			   ip_reply_arg_flowi_flags(arg),
 			   daddr, saddr,
-- 
2.46.0



* [PATCH net-next 09/12] ipv6: sit: Unmask upper DSCP bits in ipip6_tunnel_xmit()
  2024-08-27 11:18 [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Ido Schimmel
                   ` (7 preceding siblings ...)
  2024-08-27 11:18 ` [PATCH net-next 08/12] ipv4: Unmask upper DSCP bits in ip_send_unicast_reply() Ido Schimmel
@ 2024-08-27 11:18 ` Ido Schimmel
  2024-08-27 15:17   ` Guillaume Nault
  2024-08-27 11:18 ` [PATCH net-next 10/12] ipvlan: Unmask upper DSCP bits in ipvlan_process_v4_outbound() Ido Schimmel
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2024-08-27 11:18 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, gnault, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf,
	Ido Schimmel

The function calls flowi4_init_output() to initialize an IPv4 flow key
with which it then performs a FIB lookup using ip_route_output_flow().

The 'tos' variable with which the TOS value in the IPv4 flow key
(flowi4_tos) is initialized contains the full DS field. Unmask the upper
DSCP bits so that in the future the FIB lookup could be performed
according to the full DSCP value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv6/sit.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 83b195f09561..3b2eed7fc765 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -51,6 +51,7 @@
 #include <net/dsfield.h>
 #include <net/net_namespace.h>
 #include <net/netns/generic.h>
+#include <net/inet_dscp.h>
 
 /*
    This version of net/ipv6/sit.c is cloned of net/ipv4/ip_gre.c
@@ -935,8 +936,8 @@ static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb,
 	}
 
 	flowi4_init_output(&fl4, tunnel->parms.link, tunnel->fwmark,
-			   RT_TOS(tos), RT_SCOPE_UNIVERSE, IPPROTO_IPV6,
-			   0, dst, tiph->saddr, 0, 0,
+			   tos & INET_DSCP_MASK, RT_SCOPE_UNIVERSE,
+			   IPPROTO_IPV6, 0, dst, tiph->saddr, 0, 0,
 			   sock_net_uid(tunnel->net, NULL));
 
 	rt = dst_cache_get_ip4(&tunnel->dst_cache, &fl4.saddr);
-- 
2.46.0



* [PATCH net-next 10/12] ipvlan: Unmask upper DSCP bits in ipvlan_process_v4_outbound()
  2024-08-27 11:18 [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Ido Schimmel
                   ` (8 preceding siblings ...)
  2024-08-27 11:18 ` [PATCH net-next 09/12] ipv6: sit: Unmask upper DSCP bits in ipip6_tunnel_xmit() Ido Schimmel
@ 2024-08-27 11:18 ` Ido Schimmel
  2024-08-27 15:19   ` Guillaume Nault
  2024-08-27 11:18 ` [PATCH net-next 11/12] vrf: Unmask upper DSCP bits in vrf_process_v4_outbound() Ido Schimmel
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2024-08-27 11:18 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, gnault, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf,
	Ido Schimmel

Unmask the upper DSCP bits when calling ip_route_output_flow() so that
in the future it could perform the FIB lookup according to the full DSCP
value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/ipvlan/ipvlan_core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index fef4eff7753a..b1afcb8740de 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -2,6 +2,8 @@
 /* Copyright (c) 2014 Mahesh Bandewar <maheshb@google.com>
  */
 
+#include <net/inet_dscp.h>
+
 #include "ipvlan.h"
 
 static u32 ipvlan_jhash_secret __read_mostly;
@@ -420,7 +422,7 @@ static noinline_for_stack int ipvlan_process_v4_outbound(struct sk_buff *skb)
 	int err, ret = NET_XMIT_DROP;
 	struct flowi4 fl4 = {
 		.flowi4_oif = dev->ifindex,
-		.flowi4_tos = RT_TOS(ip4h->tos),
+		.flowi4_tos = ip4h->tos & INET_DSCP_MASK,
 		.flowi4_flags = FLOWI_FLAG_ANYSRC,
 		.flowi4_mark = skb->mark,
 		.daddr = ip4h->daddr,
-- 
2.46.0



* [PATCH net-next 11/12] vrf: Unmask upper DSCP bits in vrf_process_v4_outbound()
  2024-08-27 11:18 [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Ido Schimmel
                   ` (9 preceding siblings ...)
  2024-08-27 11:18 ` [PATCH net-next 10/12] ipvlan: Unmask upper DSCP bits in ipvlan_process_v4_outbound() Ido Schimmel
@ 2024-08-27 11:18 ` Ido Schimmel
  2024-08-27 15:22   ` Guillaume Nault
  2024-08-27 11:18 ` [PATCH net-next 12/12] bpf: Unmask upper DSCP bits in __bpf_redirect_neigh_v4() Ido Schimmel
  2024-08-27 13:47 ` [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Guillaume Nault
  12 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2024-08-27 11:18 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, gnault, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf,
	Ido Schimmel

Unmask the upper DSCP bits when calling ip_route_output_flow() so that
in the future it could perform the FIB lookup according to the full DSCP
value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/vrf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index 040f0bb36c0e..a900908eb24a 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -37,6 +37,7 @@
 #include <net/sch_generic.h>
 #include <net/netns/generic.h>
 #include <net/netfilter/nf_conntrack.h>
+#include <net/inet_dscp.h>
 
 #define DRV_NAME	"vrf"
 #define DRV_VERSION	"1.1"
@@ -520,7 +521,7 @@ static netdev_tx_t vrf_process_v4_outbound(struct sk_buff *skb,
 	/* needed to match OIF rule */
 	fl4.flowi4_l3mdev = vrf_dev->ifindex;
 	fl4.flowi4_iif = LOOPBACK_IFINDEX;
-	fl4.flowi4_tos = RT_TOS(ip4h->tos);
+	fl4.flowi4_tos = ip4h->tos & INET_DSCP_MASK;
 	fl4.flowi4_flags = FLOWI_FLAG_ANYSRC;
 	fl4.flowi4_proto = ip4h->protocol;
 	fl4.daddr = ip4h->daddr;
-- 
2.46.0



* [PATCH net-next 12/12] bpf: Unmask upper DSCP bits in __bpf_redirect_neigh_v4()
  2024-08-27 11:18 [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Ido Schimmel
                   ` (10 preceding siblings ...)
  2024-08-27 11:18 ` [PATCH net-next 11/12] vrf: Unmask upper DSCP bits in vrf_process_v4_outbound() Ido Schimmel
@ 2024-08-27 11:18 ` Ido Schimmel
  2024-08-27 13:47 ` [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Guillaume Nault
  12 siblings, 0 replies; 34+ messages in thread
From: Ido Schimmel @ 2024-08-27 11:18 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, gnault, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf,
	Ido Schimmel

Unmask the upper DSCP bits when calling ip_route_output_flow() so that
in the future it could perform the FIB lookup according to the full DSCP
value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/core/filter.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index f09d875cc053..8569cd2482ee 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2372,7 +2372,7 @@ static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev,
 		struct flowi4 fl4 = {
 			.flowi4_flags = FLOWI_FLAG_ANYSRC,
 			.flowi4_mark  = skb->mark,
-			.flowi4_tos   = RT_TOS(ip4h->tos),
+			.flowi4_tos   = ip4h->tos & INET_DSCP_MASK,
 			.flowi4_oif   = dev->ifindex,
 			.flowi4_proto = ip4h->protocol,
 			.daddr	      = ip4h->daddr,
-- 
2.46.0



* Re: [PATCH net-next 00/12] Unmask upper DSCP bits - part 2
  2024-08-27 11:18 [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Ido Schimmel
                   ` (11 preceding siblings ...)
  2024-08-27 11:18 ` [PATCH net-next 12/12] bpf: Unmask upper DSCP bits in __bpf_redirect_neigh_v4() Ido Schimmel
@ 2024-08-27 13:47 ` Guillaume Nault
  2024-08-27 15:45   ` Ido Schimmel
  12 siblings, 1 reply; 34+ messages in thread
From: Guillaume Nault @ 2024-08-27 13:47 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Tue, Aug 27, 2024 at 02:18:01PM +0300, Ido Schimmel wrote:
> tl;dr - This patchset continues to unmask the upper DSCP bits in the
> IPv4 flow key in preparation for allowing IPv4 FIB rules to match on
> DSCP. No functional changes are expected. Part 1 was merged in commit
> ("Merge branch 'unmask-upper-dscp-bits-part-1'").
> 
> The TOS field in the IPv4 flow key ('flowi4_tos') is used during FIB
> lookup to match against the TOS selector in FIB rules and routes.
> 
> It is currently impossible for user space to configure FIB rules that
> match on the DSCP value as the upper DSCP bits are either masked in the
> various call sites that initialize the IPv4 flow key or along the path
> to the FIB core.
> 
> In preparation for adding a DSCP selector to IPv4 and IPv6 FIB rules, we

Hum, do you plan to add a DSCP selector for IPv6? That shouldn't be
necessary as IPv6 already takes all the DSCP bits into account. Also we
don't need to keep any compatibility with the legacy TOS interpretation,
as it has never been defined nor used in IPv6.

> need to make sure the entire DSCP value is present in the IPv4 flow key.
> This patchset continues to unmask the upper DSCP bits, but this time in
> the output route path.



* Re: [PATCH net-next 01/12] ipv4: Unmask upper DSCP bits in RTM_GETROUTE output route lookup
  2024-08-27 11:18 ` [PATCH net-next 01/12] ipv4: Unmask upper DSCP bits in RTM_GETROUTE output route lookup Ido Schimmel
@ 2024-08-27 13:55   ` Guillaume Nault
  0 siblings, 0 replies; 34+ messages in thread
From: Guillaume Nault @ 2024-08-27 13:55 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Tue, Aug 27, 2024 at 02:18:02PM +0300, Ido Schimmel wrote:
> Unmask the upper DSCP bits when looking up an output route via the
> RTM_GETROUTE netlink message so that in the future the lookup could be
> performed according to the full DSCP value.

Reviewed-by: Guillaume Nault <gnault@redhat.com>



* Re: [PATCH net-next 02/12] ipv4: Unmask upper DSCP bits in ip_route_output_key_hash()
  2024-08-27 11:18 ` [PATCH net-next 02/12] ipv4: Unmask upper DSCP bits in ip_route_output_key_hash() Ido Schimmel
@ 2024-08-27 13:57   ` Guillaume Nault
  0 siblings, 0 replies; 34+ messages in thread
From: Guillaume Nault @ 2024-08-27 13:57 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Tue, Aug 27, 2024 at 02:18:03PM +0300, Ido Schimmel wrote:
> Unmask the upper DSCP bits so that in the future output routes could be
> looked up according to the full DSCP value.

Reviewed-by: Guillaume Nault <gnault@redhat.com>



* Re: [PATCH net-next 03/12] ipv4: icmp: Unmask upper DSCP bits in icmp_route_lookup()
  2024-08-27 11:18 ` [PATCH net-next 03/12] ipv4: icmp: Unmask upper DSCP bits in icmp_route_lookup() Ido Schimmel
@ 2024-08-27 14:16   ` Guillaume Nault
  0 siblings, 0 replies; 34+ messages in thread
From: Guillaume Nault @ 2024-08-27 14:16 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Tue, Aug 27, 2024 at 02:18:04PM +0300, Ido Schimmel wrote:
> The function is called to resolve a route for an ICMP message that is
> sent in response to a situation. Based on the type of the generated ICMP
> message, the function is either passed the DS field of the packet that
> generated the ICMP message or a DS field that is derived from it.
> 
> Unmask the upper DSCP bits before resolving an output route via
> ip_route_output_key_hash() so that in the future the lookup could be
> performed according to the full DSCP value.

Reviewed-by: Guillaume Nault <gnault@redhat.com>


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next 04/12] ipv4: Unmask upper DSCP bits in ip_sock_rt_tos()
  2024-08-27 11:18 ` [PATCH net-next 04/12] ipv4: Unmask upper DSCP bits in ip_sock_rt_tos() Ido Schimmel
@ 2024-08-27 14:29   ` Guillaume Nault
  0 siblings, 0 replies; 34+ messages in thread
From: Guillaume Nault @ 2024-08-27 14:29 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Tue, Aug 27, 2024 at 02:18:05PM +0300, Ido Schimmel wrote:
> The function is used to read the DS field that was stored in IPv4
> sockets via the IP_TOS socket option so that it could be used to
> initialize the flowi4_tos field before resolving an output route.
> 
> Unmask the upper DSCP bits so that in the future the output route lookup
> could be performed according to the full DSCP value.

Reviewed-by: Guillaume Nault <gnault@redhat.com>


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next 05/12] ipv4: Unmask upper DSCP bits in get_rttos()
  2024-08-27 11:18 ` [PATCH net-next 05/12] ipv4: Unmask upper DSCP bits in get_rttos() Ido Schimmel
@ 2024-08-27 14:43   ` Guillaume Nault
  0 siblings, 0 replies; 34+ messages in thread
From: Guillaume Nault @ 2024-08-27 14:43 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Tue, Aug 27, 2024 at 02:18:06PM +0300, Ido Schimmel wrote:
> The function is used by a few socket types to retrieve the TOS value
> with which to perform the FIB lookup for packets sent through the socket
> (flowi4_tos). If a DS field was passed using the IP_TOS control message,
> then it is used. Otherwise, the one specified via the IP_TOS socket
> option is used.
> 
> Unmask the upper DSCP bits so that in the future the lookup could be
> performed according to the full DSCP value.

Reviewed-by: Guillaume Nault <gnault@redhat.com>


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next 06/12] ipv4: Unmask upper DSCP bits when building flow key
  2024-08-27 11:18 ` [PATCH net-next 06/12] ipv4: Unmask upper DSCP bits when building flow key Ido Schimmel
@ 2024-08-27 14:51   ` Guillaume Nault
  2024-08-27 15:37     ` Ido Schimmel
  0 siblings, 1 reply; 34+ messages in thread
From: Guillaume Nault @ 2024-08-27 14:51 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Tue, Aug 27, 2024 at 02:18:07PM +0300, Ido Schimmel wrote:
> build_sk_flow_key() and __build_flow_key() are used to build an IPv4
> flow key before calling one of the FIB lookup APIs.
> 
> Unmask the upper DSCP bits so that in the future the lookup could be
> performed according to the full DSCP value.
> 
> Remove IPTOS_RT_MASK since it is no longer used.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  include/net/route.h | 2 --
>  net/ipv4/route.c    | 4 ++--
>  2 files changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/include/net/route.h b/include/net/route.h
> index b896f086ec8e..1789f1e6640b 100644
> --- a/include/net/route.h
> +++ b/include/net/route.h
> @@ -266,8 +266,6 @@ static inline void ip_rt_put(struct rtable *rt)
>  	dst_release(&rt->dst);
>  }
>  
> -#define IPTOS_RT_MASK	(IPTOS_TOS_MASK & ~3)
> -

IPTOS_RT_MASK is still used by xfrm_get_tos() (net/xfrm/xfrm_policy.c).
To preserve bisectability, this chunk should be moved to the next
patch. Or just swap patch 6 and 7, whatever you prefer :).
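
For context, a minimal user-space sketch of what the removed macro does to
a DS field. The SKETCH_* names and both helper functions are hypothetical;
the 0x1E and 0x1c values mirror the kernel definitions, and the 0xfc mask
is assumed here to be the one that keeps all six DSCP bits while clearing
the two ECN bits:

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the mask values so the sketch builds outside the kernel. */
#define SKETCH_IPTOS_TOS_MASK	0x1E
#define SKETCH_IPTOS_RT_MASK	(SKETCH_IPTOS_TOS_MASK & ~3)	/* 0x1c */
#define SKETCH_DSCP_MASK	0xfc	/* assumed: six DSCP bits, no ECN */

/* Legacy behaviour: only the three TOS bits survive in the flow key. */
static uint8_t legacy_rt_tos(uint8_t dsfield)
{
	return dsfield & SKETCH_IPTOS_RT_MASK;
}

/* Unmasked behaviour: the whole DSCP value survives, ECN is cleared. */
static uint8_t unmasked_rt_tos(uint8_t dsfield)
{
	return dsfield & SKETCH_DSCP_MASK;
}

void sketch_mask_checks(void)
{
	/* 0xb8 is DSCP 46 shifted into the DS field. */
	assert(legacy_rt_tos(0xb8) == 0x18);	/* upper DSCP bits lost */
	assert(unmasked_rt_tos(0xb8) == 0xb8);	/* full DSCP preserved */
	assert(unmasked_rt_tos(0xbb) == 0xb8);	/* ECN bits dropped */
}
```

With the legacy mask, two packets that differ only in the upper three DSCP
bits produce the same flow key value, which is why a rule could not tell
them apart.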


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next 07/12] xfrm: Unmask upper DSCP bits in xfrm_get_tos()
  2024-08-27 11:18 ` [PATCH net-next 07/12] xfrm: Unmask upper DSCP bits in xfrm_get_tos() Ido Schimmel
@ 2024-08-27 14:54   ` Guillaume Nault
  0 siblings, 0 replies; 34+ messages in thread
From: Guillaume Nault @ 2024-08-27 14:54 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Tue, Aug 27, 2024 at 02:18:08PM +0300, Ido Schimmel wrote:
> The function returns a value that is used to initialize 'flowi4_tos'
> before being passed to the FIB lookup API in the following call chain:
> 
> xfrm_bundle_create()
> 	tos = xfrm_get_tos(fl, family)
> 	xfrm_dst_lookup(..., tos, ...)
> 		__xfrm_dst_lookup(..., tos, ...)
> 			xfrm4_dst_lookup(..., tos, ...)
> 				__xfrm4_dst_lookup(..., tos, ...)
> 					fl4->flowi4_tos = tos
> 					__ip_route_output_key(net, fl4)
> 
> Unmask the upper DSCP bits so that in the future the output route lookup
> could be performed according to the full DSCP value.

Reviewed-by: Guillaume Nault <gnault@redhat.com>


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next 08/12] ipv4: Unmask upper DSCP bits in ip_send_unicast_reply()
  2024-08-27 11:18 ` [PATCH net-next 08/12] ipv4: Unmask upper DSCP bits in ip_send_unicast_reply() Ido Schimmel
@ 2024-08-27 15:09   ` Guillaume Nault
  0 siblings, 0 replies; 34+ messages in thread
From: Guillaume Nault @ 2024-08-27 15:09 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Tue, Aug 27, 2024 at 02:18:09PM +0300, Ido Schimmel wrote:
> The function calls flowi4_init_output() to initialize an IPv4 flow key
> with which it then performs a FIB lookup using ip_route_output_flow().
> 
> 'arg->tos' with which the TOS value in the IPv4 flow key (flowi4_tos) is
> initialized contains the full DS field. Unmask the upper DSCP bits so
> that in the future the FIB lookup could be performed according to the
> full DSCP value.

Reviewed-by: Guillaume Nault <gnault@redhat.com>


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next 09/12] ipv6: sit: Unmask upper DSCP bits in ipip6_tunnel_xmit()
  2024-08-27 11:18 ` [PATCH net-next 09/12] ipv6: sit: Unmask upper DSCP bits in ipip6_tunnel_xmit() Ido Schimmel
@ 2024-08-27 15:17   ` Guillaume Nault
  0 siblings, 0 replies; 34+ messages in thread
From: Guillaume Nault @ 2024-08-27 15:17 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Tue, Aug 27, 2024 at 02:18:10PM +0300, Ido Schimmel wrote:
> The function calls flowi4_init_output() to initialize an IPv4 flow key
> with which it then performs a FIB lookup using ip_route_output_flow().
> 
> The 'tos' variable with which the TOS value in the IPv4 flow key
> (flowi4_tos) is initialized contains the full DS field. Unmask the upper
> DSCP bits so that in the future the FIB lookup could be performed
> according to the full DSCP value.

Reviewed-by: Guillaume Nault <gnault@redhat.com>


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next 10/12] ipvlan: Unmask upper DSCP bits in ipvlan_process_v4_outbound()
  2024-08-27 11:18 ` [PATCH net-next 10/12] ipvlan: Unmask upper DSCP bits in ipvlan_process_v4_outbound() Ido Schimmel
@ 2024-08-27 15:19   ` Guillaume Nault
  0 siblings, 0 replies; 34+ messages in thread
From: Guillaume Nault @ 2024-08-27 15:19 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Tue, Aug 27, 2024 at 02:18:11PM +0300, Ido Schimmel wrote:
> Unmask the upper DSCP bits when calling ip_route_output_flow() so that
> in the future it could perform the FIB lookup according to the full DSCP
> value.

Reviewed-by: Guillaume Nault <gnault@redhat.com>


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next 11/12] vrf: Unmask upper DSCP bits in vrf_process_v4_outbound()
  2024-08-27 11:18 ` [PATCH net-next 11/12] vrf: Unmask upper DSCP bits in vrf_process_v4_outbound() Ido Schimmel
@ 2024-08-27 15:22   ` Guillaume Nault
  0 siblings, 0 replies; 34+ messages in thread
From: Guillaume Nault @ 2024-08-27 15:22 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Tue, Aug 27, 2024 at 02:18:12PM +0300, Ido Schimmel wrote:
> Unmask the upper DSCP bits when calling ip_route_output_flow() so that
> in the future it could perform the FIB lookup according to the full DSCP
> value.

Reviewed-by: Guillaume Nault <gnault@redhat.com>


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next 06/12] ipv4: Unmask upper DSCP bits when building flow key
  2024-08-27 14:51   ` Guillaume Nault
@ 2024-08-27 15:37     ` Ido Schimmel
  0 siblings, 0 replies; 34+ messages in thread
From: Ido Schimmel @ 2024-08-27 15:37 UTC (permalink / raw)
  To: Guillaume Nault
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Tue, Aug 27, 2024 at 04:51:49PM +0200, Guillaume Nault wrote:
> On Tue, Aug 27, 2024 at 02:18:07PM +0300, Ido Schimmel wrote:
> > build_sk_flow_key() and __build_flow_key() are used to build an IPv4
> > flow key before calling one of the FIB lookup APIs.
> > 
> > Unmask the upper DSCP bits so that in the future the lookup could be
> > performed according to the full DSCP value.
> > 
> > Remove IPTOS_RT_MASK since it is no longer used.
> > 
> > Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> > ---
> >  include/net/route.h | 2 --
> >  net/ipv4/route.c    | 4 ++--
> >  2 files changed, 2 insertions(+), 4 deletions(-)
> > 
> > diff --git a/include/net/route.h b/include/net/route.h
> > index b896f086ec8e..1789f1e6640b 100644
> > --- a/include/net/route.h
> > +++ b/include/net/route.h
> > @@ -266,8 +266,6 @@ static inline void ip_rt_put(struct rtable *rt)
> >  	dst_release(&rt->dst);
> >  }
> >  
> > -#define IPTOS_RT_MASK	(IPTOS_TOS_MASK & ~3)
> > -
> 
> IPTOS_RT_MASK is still used by xfrm_get_tos() (net/xfrm/xfrm_policy.c).
> To preserve bisectability, this chunk should be moved to the next
> patch. Or just swap patch 6 and 7, whatever you prefer :).

Oops. The order was initially different and I forgot to rebuild each
patch after reordering the patches. Will move this chunk to the next
patch in v2.

Thanks!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next 00/12] Unmask upper DSCP bits - part 2
  2024-08-27 13:47 ` [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Guillaume Nault
@ 2024-08-27 15:45   ` Ido Schimmel
  2024-08-28 12:09     ` Ido Schimmel
  2024-08-29 11:30     ` Guillaume Nault
  0 siblings, 2 replies; 34+ messages in thread
From: Ido Schimmel @ 2024-08-27 15:45 UTC (permalink / raw)
  To: Guillaume Nault
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Tue, Aug 27, 2024 at 03:47:05PM +0200, Guillaume Nault wrote:
> On Tue, Aug 27, 2024 at 02:18:01PM +0300, Ido Schimmel wrote:
> > tl;dr - This patchset continues to unmask the upper DSCP bits in the
> > IPv4 flow key in preparation for allowing IPv4 FIB rules to match on
> > DSCP. No functional changes are expected. Part 1 was merged in commit
> > ("Merge branch 'unmask-upper-dscp-bits-part-1'").
> > 
> > The TOS field in the IPv4 flow key ('flowi4_tos') is used during FIB
> > lookup to match against the TOS selector in FIB rules and routes.
> > 
> > It is currently impossible for user space to configure FIB rules that
> > match on the DSCP value as the upper DSCP bits are either masked in the
> > various call sites that initialize the IPv4 flow key or along the path
> > to the FIB core.
> > 
> > In preparation for adding a DSCP selector to IPv4 and IPv6 FIB rules, we
> 
> Hum, do you plan to add a DSCP selector for IPv6? That shouldn't be
> necessary as IPv6 already takes all the DSCP bits into account. Also we
> don't need to keep any compatibility with the legacy TOS interpretation,
> as it has never been defined nor used in IPv6.

Yes. I want to add the DSCP selector for both families so that user
space would not need to use different selectors for different families.
It's implemented in the patches I previously shared:

https://github.com/idosch/linux/commit/a3289a6838a0d0e6e0a30a61132bdce3d2f71a3c.patch
https://github.com/idosch/linux/commit/ff5dd634fb278431b58437654d7f65b57fd4ae4b.patch
https://github.com/idosch/linux/commit/3060ecb534475eadabfa1d419dd64804f0bd0148.patch
https://github.com/idosch/linux/commit/12ddbce4f519b42477ea1e130b6d2bab1cca137c.patch

> 
> > need to make sure the entire DSCP value is present in the IPv4 flow key.
> > This patchset continues to unmask the upper DSCP bits, but this time in
> > the output route path.
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next 00/12] Unmask upper DSCP bits - part 2
  2024-08-27 15:45   ` Ido Schimmel
@ 2024-08-28 12:09     ` Ido Schimmel
  2024-08-29 11:54       ` Guillaume Nault
  2024-08-29 11:30     ` Guillaume Nault
  1 sibling, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2024-08-28 12:09 UTC (permalink / raw)
  To: Guillaume Nault
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Tue, Aug 27, 2024 at 06:45:53PM +0300, Ido Schimmel wrote:
> On Tue, Aug 27, 2024 at 03:47:05PM +0200, Guillaume Nault wrote:
> > On Tue, Aug 27, 2024 at 02:18:01PM +0300, Ido Schimmel wrote:
> > > tl;dr - This patchset continues to unmask the upper DSCP bits in the
> > > IPv4 flow key in preparation for allowing IPv4 FIB rules to match on
> > > DSCP. No functional changes are expected. Part 1 was merged in commit
> > > ("Merge branch 'unmask-upper-dscp-bits-part-1'").
> > > 
> > > The TOS field in the IPv4 flow key ('flowi4_tos') is used during FIB
> > > lookup to match against the TOS selector in FIB rules and routes.
> > > 
> > > It is currently impossible for user space to configure FIB rules that
> > > match on the DSCP value as the upper DSCP bits are either masked in the
> > > various call sites that initialize the IPv4 flow key or along the path
> > > to the FIB core.
> > > 
> > > In preparation for adding a DSCP selector to IPv4 and IPv6 FIB rules, we
> > 
> > Hum, do you plan to add a DSCP selector for IPv6? That shouldn't be
> > necessary as IPv6 already takes all the DSCP bits into account. Also we
> > don't need to keep any compatibility with the legacy TOS interpretation,
> > as it has never been defined nor used in IPv6.
> 
> Yes. I want to add the DSCP selector for both families so that user
> space would not need to use different selectors for different families.

Another approach could be to add a mask to the existing tos/dsfield. For
example:

# ip -4 rule add dsfield 0x04/0xfc table 100
# ip -6 rule add dsfield 0xf8/0xfc table 100

The default IPv4 mask (when the user doesn't specify one) would be 0x1c and
the default IPv6 mask would be 0xfc.
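
To illustrate the idea, a minimal user-space sketch of a masked match. The
function and macro names are hypothetical and none of this is existing
kernel code; only the value/mask semantics and the two default masks are
taken from the proposal above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Proposed defaults when the user gives no mask (hypothetical names). */
#define SKETCH_DEFAULT_MASK_V4	0x1c
#define SKETCH_DEFAULT_MASK_V6	0xfc

/* A rule matches when packet and rule agree on every bit under the mask. */
static bool dsfield_rule_match(uint8_t pkt_dsfield, uint8_t rule_value,
			       uint8_t rule_mask)
{
	return (pkt_dsfield & rule_mask) == (rule_value & rule_mask);
}

void sketch_match_checks(void)
{
	/* "dsfield 0x04/0xfc": the upper DSCP bits must be zero. */
	assert(dsfield_rule_match(0x04, 0x04, 0xfc));
	assert(!dsfield_rule_match(0x24, 0x04, 0xfc));

	/* "dsfield 0x04" with the IPv4 default: upper bits are ignored. */
	assert(dsfield_rule_match(0x24, 0x04, SKETCH_DEFAULT_MASK_V4));

	/* "dsfield 0xf8" with the IPv6 default: only ECN is ignored. */
	assert(dsfield_rule_match(0xf9, 0xf8, SKETCH_DEFAULT_MASK_V6));
}
```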

WDYT?

> It's implemented in the patches I previously shared:
> 
> https://github.com/idosch/linux/commit/a3289a6838a0d0e6e0a30a61132bdce3d2f71a3c.patch
> https://github.com/idosch/linux/commit/ff5dd634fb278431b58437654d7f65b57fd4ae4b.patch
> https://github.com/idosch/linux/commit/3060ecb534475eadabfa1d419dd64804f0bd0148.patch
> https://github.com/idosch/linux/commit/12ddbce4f519b42477ea1e130b6d2bab1cca137c.patch
> 
> > 
> > > need to make sure the entire DSCP value is present in the IPv4 flow key.
> > > This patchset continues to unmask the upper DSCP bits, but this time in
> > > the output route path.
> > 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next 00/12] Unmask upper DSCP bits - part 2
  2024-08-27 15:45   ` Ido Schimmel
  2024-08-28 12:09     ` Ido Schimmel
@ 2024-08-29 11:30     ` Guillaume Nault
  2024-08-29 14:43       ` Ido Schimmel
  1 sibling, 1 reply; 34+ messages in thread
From: Guillaume Nault @ 2024-08-29 11:30 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Tue, Aug 27, 2024 at 06:45:53PM +0300, Ido Schimmel wrote:
> On Tue, Aug 27, 2024 at 03:47:05PM +0200, Guillaume Nault wrote:
> > On Tue, Aug 27, 2024 at 02:18:01PM +0300, Ido Schimmel wrote:
> > > tl;dr - This patchset continues to unmask the upper DSCP bits in the
> > > IPv4 flow key in preparation for allowing IPv4 FIB rules to match on
> > > DSCP. No functional changes are expected. Part 1 was merged in commit
> > > ("Merge branch 'unmask-upper-dscp-bits-part-1'").
> > > 
> > > The TOS field in the IPv4 flow key ('flowi4_tos') is used during FIB
> > > lookup to match against the TOS selector in FIB rules and routes.
> > > 
> > > It is currently impossible for user space to configure FIB rules that
> > > match on the DSCP value as the upper DSCP bits are either masked in the
> > > various call sites that initialize the IPv4 flow key or along the path
> > > to the FIB core.
> > > 
> > > In preparation for adding a DSCP selector to IPv4 and IPv6 FIB rules, we
> > 
> > Hum, do you plan to add a DSCP selector for IPv6? That shouldn't be
> > necessary as IPv6 already takes all the DSCP bits into account. Also we
> > don't need to keep any compatibility with the legacy TOS interpretation,
> > as it has never been defined nor used in IPv6.
> 
> Yes. I want to add the DSCP selector for both families so that user
> space would not need to use different selectors for different families.
> It's implemented in the patches I previously shared:

Hum, I guess that was a misunderstanding on my side. I read
"adding a DSCP selector to [IPv4 and] IPv6 FIB rules" as "adding the
possibility to match only the 3-bit TOS in fib6_rules". But your
fib6_rule.c patch doesn't modify fib6_rule_match(), so I believe that
what you really meant was just to add the new FRA_DSCP netlink
attribute to IPv6. Am I getting it right?

> https://github.com/idosch/linux/commit/a3289a6838a0d0e6e0a30a61132bdce3d2f71a3c.patch
> https://github.com/idosch/linux/commit/ff5dd634fb278431b58437654d7f65b57fd4ae4b.patch
> https://github.com/idosch/linux/commit/3060ecb534475eadabfa1d419dd64804f0bd0148.patch
> https://github.com/idosch/linux/commit/12ddbce4f519b42477ea1e130b6d2bab1cca137c.patch


> > > need to make sure the entire DSCP value is present in the IPv4 flow key.
> > > This patchset continues to unmask the upper DSCP bits, but this time in
> > > the output route path.
> > 
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next 00/12] Unmask upper DSCP bits - part 2
  2024-08-28 12:09     ` Ido Schimmel
@ 2024-08-29 11:54       ` Guillaume Nault
  2024-08-29 14:52         ` Ido Schimmel
  0 siblings, 1 reply; 34+ messages in thread
From: Guillaume Nault @ 2024-08-29 11:54 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Wed, Aug 28, 2024 at 03:09:19PM +0300, Ido Schimmel wrote:
> On Tue, Aug 27, 2024 at 06:45:53PM +0300, Ido Schimmel wrote:
> > On Tue, Aug 27, 2024 at 03:47:05PM +0200, Guillaume Nault wrote:
> > > On Tue, Aug 27, 2024 at 02:18:01PM +0300, Ido Schimmel wrote:
> > > > tl;dr - This patchset continues to unmask the upper DSCP bits in the
> > > > IPv4 flow key in preparation for allowing IPv4 FIB rules to match on
> > > > DSCP. No functional changes are expected. Part 1 was merged in commit
> > > > ("Merge branch 'unmask-upper-dscp-bits-part-1'").
> > > > 
> > > > The TOS field in the IPv4 flow key ('flowi4_tos') is used during FIB
> > > > lookup to match against the TOS selector in FIB rules and routes.
> > > > 
> > > > It is currently impossible for user space to configure FIB rules that
> > > > match on the DSCP value as the upper DSCP bits are either masked in the
> > > > various call sites that initialize the IPv4 flow key or along the path
> > > > to the FIB core.
> > > > 
> > > > In preparation for adding a DSCP selector to IPv4 and IPv6 FIB rules, we
> > > 
> > > Hum, do you plan to add a DSCP selector for IPv6? That shouldn't be
> > > necessary as IPv6 already takes all the DSCP bits into account. Also we
> > > don't need to keep any compatibility with the legacy TOS interpretation,
> > > as it has never been defined nor used in IPv6.
> > 
> > Yes. I want to add the DSCP selector for both families so that user
> > space would not need to use different selectors for different families.
> 
> Another approach could be to add a mask to the existing tos/dsfield. For
> example:
> 
> # ip -4 rule add dsfield 0x04/0xfc table 100
> # ip -6 rule add dsfield 0xf8/0xfc table 100
> 
> The default IPv4 mask (when the user doesn't specify one) would be 0x1c and
> the default IPv6 mask would be 0xfc.
> 
> WDYT?

For internal implementation, I find the mask option elegant (to avoid
conditionals). But I don't really like the idea of letting user space
provide its own mask. This would let the user create non-standard
behaviours, likely by mistake (as nobody seems to have ever requested
that flexibility).

I think my favourite approach would be to have the new FRA_DSCP
attribute work identically on both IPv4 and IPv6 FIB rules and keep
the behaviour of the old "tos" field of struct fib_rule_hdr unchanged.

This "tos" field would still work differently for IPv4 and IPv6, as it
always did, but people wanting consistent behaviour could just use
FRA_DSCP instead. Also, FRA_DSCP accepts real DSCP values as defined in
RFCs, while "tos" requires a 2-bit shift. For all these reasons, I'm
tempted to just consider "tos" as a legacy option used only for
backward compatibility, while FRA_DSCP would be the "clean" interface.
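
To make the 2-bit shift concrete, a small user-space sketch. The helper
names are hypothetical; the only assumption is the standard DS field
layout, with DSCP in the upper six bits and ECN in the lower two:

```c
#include <assert.h>
#include <stdint.h>

/* DSCP (0..63) to its position in the DS field; ECN bits left at zero. */
static uint8_t dscp_to_dsfield(uint8_t dscp)
{
	return (uint8_t)((dscp & 0x3f) << 2);
}

/* DS field to DSCP; the two ECN bits are shifted out. */
static uint8_t dsfield_to_dscp(uint8_t dsfield)
{
	return dsfield >> 2;
}

void sketch_shift_checks(void)
{
	assert(dscp_to_dsfield(63) == 0xfc);	/* highest DSCP value */
	assert(dscp_to_dsfield(7) == 0x1c);	/* fits in the legacy TOS bits */
	assert(dsfield_to_dscp(0xb8) == 46);
	assert(dsfield_to_dscp(0xbb) == 46);	/* ECN does not change DSCP */
}
```

So a "tos" user has to write 0xb8 for what the RFCs call DSCP 46, while a
FRA_DSCP user would pass 46 directly.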

Is that approach acceptable for you?


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next 00/12] Unmask upper DSCP bits - part 2
  2024-08-29 11:30     ` Guillaume Nault
@ 2024-08-29 14:43       ` Ido Schimmel
  2024-08-29 15:08         ` Guillaume Nault
  0 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2024-08-29 14:43 UTC (permalink / raw)
  To: Guillaume Nault
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Thu, Aug 29, 2024 at 01:30:58PM +0200, Guillaume Nault wrote:
> On Tue, Aug 27, 2024 at 06:45:53PM +0300, Ido Schimmel wrote:
> > On Tue, Aug 27, 2024 at 03:47:05PM +0200, Guillaume Nault wrote:
> > > On Tue, Aug 27, 2024 at 02:18:01PM +0300, Ido Schimmel wrote:
> > > > tl;dr - This patchset continues to unmask the upper DSCP bits in the
> > > > IPv4 flow key in preparation for allowing IPv4 FIB rules to match on
> > > > DSCP. No functional changes are expected. Part 1 was merged in commit
> > > > ("Merge branch 'unmask-upper-dscp-bits-part-1'").
> > > > 
> > > > The TOS field in the IPv4 flow key ('flowi4_tos') is used during FIB
> > > > lookup to match against the TOS selector in FIB rules and routes.
> > > > 
> > > > It is currently impossible for user space to configure FIB rules that
> > > > match on the DSCP value as the upper DSCP bits are either masked in the
> > > > various call sites that initialize the IPv4 flow key or along the path
> > > > to the FIB core.
> > > > 
> > > > In preparation for adding a DSCP selector to IPv4 and IPv6 FIB rules, we
> > > 
> > > Hum, do you plan to add a DSCP selector for IPv6? That shouldn't be
> > > necessary as IPv6 already takes all the DSCP bits into account. Also we
> > > don't need to keep any compatibility with the legacy TOS interpretation,
> > > as it has never been defined nor used in IPv6.
> > 
> > Yes. I want to add the DSCP selector for both families so that user
> > space would not need to use different selectors for different families.
> > It's implemented in the patches I previously shared:
> 
> Hum, I guess that was a misunderstanding on my side. I read
> "adding a DSCP selector to [IPv4 and] IPv6 FIB rules" as "adding the
> possibility to match only the 3-bit TOS in fib6_rules". But your
> fib6_rule.c patch doesn't modify fib6_rule_match(), so I believe that
> what you really meant was just to add the new FRA_DSCP netlink
> attribute to IPv6. Am I getting it right?

Yes. To be clear, you will be able to use the new 'dscp' keyword exactly
the same way with both IPv4 and IPv6:

# ip -4 rule add dscp 63 table 100
# ip -6 rule add dscp 63 table 100

Mixing 'dscp' and 'tos' will not work:

# ip -4 rule add dscp 7 tos 0x1c table 100
Error: Cannot specify both TOS and DSCP.
# ip -6 rule add dscp 7 tos 0x1c table 100
Error: Cannot specify both TOS and DSCP.

> 
> > https://github.com/idosch/linux/commit/a3289a6838a0d0e6e0a30a61132bdce3d2f71a3c.patch
> > https://github.com/idosch/linux/commit/ff5dd634fb278431b58437654d7f65b57fd4ae4b.patch
> > https://github.com/idosch/linux/commit/3060ecb534475eadabfa1d419dd64804f0bd0148.patch
> > https://github.com/idosch/linux/commit/12ddbce4f519b42477ea1e130b6d2bab1cca137c.patch
> 
> 
> > > > need to make sure the entire DSCP value is present in the IPv4 flow key.
> > > > This patchset continues to unmask the upper DSCP bits, but this time in
> > > > the output route path.
> > > 
> > 
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next 00/12] Unmask upper DSCP bits - part 2
  2024-08-29 11:54       ` Guillaume Nault
@ 2024-08-29 14:52         ` Ido Schimmel
  2024-08-29 15:10           ` Guillaume Nault
  0 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2024-08-29 14:52 UTC (permalink / raw)
  To: Guillaume Nault
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Thu, Aug 29, 2024 at 01:54:46PM +0200, Guillaume Nault wrote:
> On Wed, Aug 28, 2024 at 03:09:19PM +0300, Ido Schimmel wrote:
> > On Tue, Aug 27, 2024 at 06:45:53PM +0300, Ido Schimmel wrote:
> > > On Tue, Aug 27, 2024 at 03:47:05PM +0200, Guillaume Nault wrote:
> > > > On Tue, Aug 27, 2024 at 02:18:01PM +0300, Ido Schimmel wrote:
> > > > > tl;dr - This patchset continues to unmask the upper DSCP bits in the
> > > > > IPv4 flow key in preparation for allowing IPv4 FIB rules to match on
> > > > > DSCP. No functional changes are expected. Part 1 was merged in commit
> > > > > ("Merge branch 'unmask-upper-dscp-bits-part-1'").
> > > > > 
> > > > > The TOS field in the IPv4 flow key ('flowi4_tos') is used during FIB
> > > > > lookup to match against the TOS selector in FIB rules and routes.
> > > > > 
> > > > > It is currently impossible for user space to configure FIB rules that
> > > > > match on the DSCP value as the upper DSCP bits are either masked in the
> > > > > various call sites that initialize the IPv4 flow key or along the path
> > > > > to the FIB core.
> > > > > 
> > > > > In preparation for adding a DSCP selector to IPv4 and IPv6 FIB rules, we
> > > > 
> > > > Hum, do you plan to add a DSCP selector for IPv6? That shouldn't be
> > > > necessary as IPv6 already takes all the DSCP bits into account. Also we
> > > > don't need to keep any compatibility with the legacy TOS interpretation,
> > > > as it has never been defined nor used in IPv6.
> > > 
> > > Yes. I want to add the DSCP selector for both families so that user
> > > space would not need to use different selectors for different families.
> > 
> > Another approach could be to add a mask to the existing tos/dsfield. For
> > example:
> > 
> > # ip -4 rule add dsfield 0x04/0xfc table 100
> > # ip -6 rule add dsfield 0xf8/0xfc table 100
> > 
> > The default IPv4 mask (when the user doesn't specify one) would be 0x1c and
> > the default IPv6 mask would be 0xfc.
> > 
> > WDYT?
> 
> For internal implementation, I find the mask option elegant (to avoid
> conditionals). But I don't really like the idea of letting user space
> provide its own mask. This would let the user create non-standard
> behaviours, likely by mistake (as nobody seems to have ever requested
> that flexibility).
> 
> I think my favourite approach would be to have the new FRA_DSCP
> attribute work identically on both IPv4 and IPv6 FIB rules and keep
> the behaviour of the old "tos" field of struct fib_rule_hdr unchanged.
> 
> This "tos" field would still work differently for IPv4 and IPv6, as it
> always did, but people wanting consistent behaviour could just use
> FRA_DSCP instead. Also, FRA_DSCP accepts real DSCP values as defined in
> RFCs, while "tos" requires a 2-bit shift. For all these reasons, I'm
> tempted to just consider "tos" as a legacy option used only for
> backward compatibility, while FRA_DSCP would be the "clean" interface.
> 
> Is that approach acceptable for you?

Yes. The patches I shared implement this approach :)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next 00/12] Unmask upper DSCP bits - part 2
  2024-08-29 14:43       ` Ido Schimmel
@ 2024-08-29 15:08         ` Guillaume Nault
  0 siblings, 0 replies; 34+ messages in thread
From: Guillaume Nault @ 2024-08-29 15:08 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Thu, Aug 29, 2024 at 05:43:17PM +0300, Ido Schimmel wrote:
> On Thu, Aug 29, 2024 at 01:30:58PM +0200, Guillaume Nault wrote:
> > On Tue, Aug 27, 2024 at 06:45:53PM +0300, Ido Schimmel wrote:
> > > On Tue, Aug 27, 2024 at 03:47:05PM +0200, Guillaume Nault wrote:
> > > > On Tue, Aug 27, 2024 at 02:18:01PM +0300, Ido Schimmel wrote:
> > > > > tl;dr - This patchset continues to unmask the upper DSCP bits in the
> > > > > IPv4 flow key in preparation for allowing IPv4 FIB rules to match on
> > > > > DSCP. No functional changes are expected. Part 1 was merged in commit
> > > > > ("Merge branch 'unmask-upper-dscp-bits-part-1'").
> > > > > 
> > > > > The TOS field in the IPv4 flow key ('flowi4_tos') is used during FIB
> > > > > lookup to match against the TOS selector in FIB rules and routes.
> > > > > 
> > > > > It is currently impossible for user space to configure FIB rules that
> > > > > match on the DSCP value as the upper DSCP bits are either masked in the
> > > > > various call sites that initialize the IPv4 flow key or along the path
> > > > > to the FIB core.
> > > > > 
> > > > > In preparation for adding a DSCP selector to IPv4 and IPv6 FIB rules, we
> > > > 
> > > > Hum, do you plan to add a DSCP selector for IPv6? That shouldn't be
> > > > necessary as IPv6 already takes all the DSCP bits into account. Also we
> > > > don't need to keep any compatibility with the legacy TOS interpretation,
> > > > as it has never been defined nor used in IPv6.
> > > 
> > > Yes. I want to add the DSCP selector for both families so that user
> > > space would not need to use different selectors for different families.
> > > It's implemented in the patches I previously shared:
> > 
> > Hum, I guess that was a misunderstanding on my side. I read
> > "adding a DSCP selector to [IPv4 and] IPv6 FIB rules" as "adding the
> > possibility to match only the 3-bit TOS in fib6_rules". But your
> > fib6_rule.c patch doesn't modify fib6_rule_match(), so I believe that
> > what you really meant was just to add the new FRA_DSCP netlink
> > attribute to IPv6. Am I getting it right?
> 
> Yes. To be clear, you will be able to use the new 'dscp' keyword exactly
> the same way with both IPv4 and IPv6:
> 
> # ip -4 rule add dscp 63 table 100
> # ip -6 rule add dscp 63 table 100
> 
> Mixing 'dscp' and 'tos' will not work:
> 
> # ip -4 rule add dscp 7 tos 0x1c table 100
> Error: Cannot specify both TOS and DSCP.
> # ip -6 rule add dscp 7 tos 0x1c table 100
> Error: Cannot specify both TOS and DSCP.

Thanks, that's exactly what I had in mind.
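
For anyone following along, the exclusivity check shown in the quoted
output can be modelled roughly as below. This is only a Python sketch,
not the code from the patches: the function name check_rule_selectors()
and its parameters are made up for illustration, and the 0..63 range
check is an assumption based on DSCP being a 6-bit field. Only the error
string is taken from the output quoted above.

```python
from typing import Optional


def check_rule_selectors(tos: Optional[int] = None,
                         dscp: Optional[int] = None) -> None:
    """Reject a rule that carries both selectors (hypothetical helper)."""
    # A rule may match on the legacy "tos" field or on "dscp", never both.
    if tos is not None and dscp is not None:
        raise ValueError("Cannot specify both TOS and DSCP")
    # Assumption: DSCP is a 6-bit value, so anything above 63 is invalid.
    if dscp is not None and not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")


# Each selector on its own is accepted.
check_rule_selectors(dscp=63)
check_rule_selectors(tos=0x1C)

# Both together are refused, mirroring "ip rule add dscp 7 tos 0x1c".
try:
    check_rule_selectors(tos=0x1C, dscp=7)
except ValueError as err:
    print(f"Error: {err}.")
```

The same check applies regardless of address family in this model,
which matches the identical errors shown for "ip -4" and "ip -6".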

> > 
> > > https://github.com/idosch/linux/commit/a3289a6838a0d0e6e0a30a61132bdce3d2f71a3c.patch
> > > https://github.com/idosch/linux/commit/ff5dd634fb278431b58437654d7f65b57fd4ae4b.patch
> > > https://github.com/idosch/linux/commit/3060ecb534475eadabfa1d419dd64804f0bd0148.patch
> > > https://github.com/idosch/linux/commit/12ddbce4f519b42477ea1e130b6d2bab1cca137c.patch
> > 
> > 
> > > > > need to make sure the entire DSCP value is present in the IPv4 flow key.
> > > > > This patchset continues to unmask the upper DSCP bits, but this time in
> > > > > the output route path.
> > > > 
> > > 
> > 
> 



* Re: [PATCH net-next 00/12] Unmask upper DSCP bits - part 2
  2024-08-29 14:52         ` Ido Schimmel
@ 2024-08-29 15:10           ` Guillaume Nault
  0 siblings, 0 replies; 34+ messages in thread
From: Guillaume Nault @ 2024-08-29 15:10 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, ast, daniel,
	martin.lau, john.fastabend, steffen.klassert, herbert, bpf

On Thu, Aug 29, 2024 at 05:52:05PM +0300, Ido Schimmel wrote:
> On Thu, Aug 29, 2024 at 01:54:46PM +0200, Guillaume Nault wrote:
> > On Wed, Aug 28, 2024 at 03:09:19PM +0300, Ido Schimmel wrote:
> > > On Tue, Aug 27, 2024 at 06:45:53PM +0300, Ido Schimmel wrote:
> > > > On Tue, Aug 27, 2024 at 03:47:05PM +0200, Guillaume Nault wrote:
> > > > > On Tue, Aug 27, 2024 at 02:18:01PM +0300, Ido Schimmel wrote:
> > > > > > tl;dr - This patchset continues to unmask the upper DSCP bits in the
> > > > > > IPv4 flow key in preparation for allowing IPv4 FIB rules to match on
> > > > > > DSCP. No functional changes are expected. Part 1 was merged in commit
> > > > > > ("Merge branch 'unmask-upper-dscp-bits-part-1'").
> > > > > > 
> > > > > > The TOS field in the IPv4 flow key ('flowi4_tos') is used during FIB
> > > > > > lookup to match against the TOS selector in FIB rules and routes.
> > > > > > 
> > > > > > It is currently impossible for user space to configure FIB rules that
> > > > > > match on the DSCP value as the upper DSCP bits are either masked in the
> > > > > > various call sites that initialize the IPv4 flow key or along the path
> > > > > > to the FIB core.
> > > > > > 
> > > > > > In preparation for adding a DSCP selector to IPv4 and IPv6 FIB rules, we
> > > > > 
> > > > > Hum, do you plan to add a DSCP selector for IPv6? That shouldn't be
> > > > > necessary as IPv6 already takes all the DSCP bits into account. Also we
> > > > > don't need to keep any compatibility with the legacy TOS interpretation,
> > > > > as it has never been defined nor used in IPv6.
> > > > 
> > > > Yes. I want to add the DSCP selector for both families so that user
> > > > space would not need to use different selectors for different families.
> > > 
> > > Another approach could be to add a mask to the existing tos/dsfield. For
> > > example:
> > > 
> > > # ip -4 rule add dsfield 0x04/0xfc table 100
> > > # ip -6 rule add dsfield 0xf8/0xfc table 100
> > > 
> > > The default IPv4 mask (when user doesn't specify one) would be 0x1c and
> > > the default IPv6 mask would be 0xfc.
> > > 
> > > WDYT?
> > 
> > For internal implementation, I find the mask option elegant (to avoid
> > conditionals). But I don't really like the idea of letting user space
> > provide its own mask. This would let the user create non-standard
> > behaviours, likely by mistake (as nobody seems to have ever requested
> > that flexibility).
> > 
> > I think my favourite approach would be to have the new FRA_DSCP
> > attribute work identically on both IPv4 and IPv6 FIB rules and keep
> > the behaviour of the old "tos" field of struct fib_rule_hdr unchanged.
> > 
> > This "tos" field would still work differently for IPv4 and IPv6, as it
> > always did, but people wanting consistent behaviour could just use
> > FRA_DSCP instead. Also, FRA_DSCP accepts real DSCP values as defined in
> > RFCs, while "tos" requires a 2-bit shift. For all these reasons, I'm
> > tempted to just consider "tos" as a legacy option used only for
> > backward compatibility, while FRA_DSCP would be the "clean" interface.
> > 
> > Is that approach acceptable for you?
> 
> Yes. The patches I shared implement this approach :)

Thanks for confirming. And sorry for the misunderstanding in v1.
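
To illustrate the difference between the two interfaces discussed
above, here is a small Python sketch (not kernel code). The masks 0x1c
and 0xfc are the ones mentioned earlier in this thread; the constant and
function names are hypothetical. It shows the 2-bit shift between a DSCP
value and the "tos"/dsfield byte, and why DSCP 7 and DSCP 63 cannot be
told apart once the upper DSCP bits are masked.

```python
# Mask values as quoted in this thread; the names are made up for this sketch.
LEGACY_IPV4_TOS_MASK = 0x1C  # legacy IPv4 "tos" interpretation
DSCP_MASK = 0xFC             # all six DSCP bits
ECN_BITS = 2                 # the two low bits of the byte are not DSCP


def dscp_to_dsfield(dscp: int) -> int:
    """Convert a DSCP value (0..63) to its position in the dsfield byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << ECN_BITS


def dsfield_to_dscp(dsfield: int) -> int:
    """Extract the DSCP value from a dsfield byte."""
    return (dsfield & DSCP_MASK) >> ECN_BITS


def legacy_tos_key(dsfield: int) -> int:
    """What a lookup sees when the upper DSCP bits are masked."""
    return dsfield & LEGACY_IPV4_TOS_MASK


def full_dscp_key(dsfield: int) -> int:
    """What a lookup sees when the whole DSCP value is kept."""
    return dsfield & DSCP_MASK


low = dscp_to_dsfield(7)    # 0x1c
high = dscp_to_dsfield(63)  # 0xfc

# With the legacy mask both values collapse to the same key.
print(hex(legacy_tos_key(low)), hex(legacy_tos_key(high)))
# With the full mask they stay distinct.
print(hex(full_dscp_key(low)), hex(full_dscp_key(high)))
```

This is the reason a "dscp 63" rule needs the whole DSCP value in the
flow key, while the old "tos" field can keep its current behaviour.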



end of thread

Thread overview: 34+ messages
2024-08-27 11:18 [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Ido Schimmel
2024-08-27 11:18 ` [PATCH net-next 01/12] ipv4: Unmask upper DSCP bits in RTM_GETROUTE output route lookup Ido Schimmel
2024-08-27 13:55   ` Guillaume Nault
2024-08-27 11:18 ` [PATCH net-next 02/12] ipv4: Unmask upper DSCP bits in ip_route_output_key_hash() Ido Schimmel
2024-08-27 13:57   ` Guillaume Nault
2024-08-27 11:18 ` [PATCH net-next 03/12] ipv4: icmp: Unmask upper DSCP bits in icmp_route_lookup() Ido Schimmel
2024-08-27 14:16   ` Guillaume Nault
2024-08-27 11:18 ` [PATCH net-next 04/12] ipv4: Unmask upper DSCP bits in ip_sock_rt_tos() Ido Schimmel
2024-08-27 14:29   ` Guillaume Nault
2024-08-27 11:18 ` [PATCH net-next 05/12] ipv4: Unmask upper DSCP bits in get_rttos() Ido Schimmel
2024-08-27 14:43   ` Guillaume Nault
2024-08-27 11:18 ` [PATCH net-next 06/12] ipv4: Unmask upper DSCP bits when building flow key Ido Schimmel
2024-08-27 14:51   ` Guillaume Nault
2024-08-27 15:37     ` Ido Schimmel
2024-08-27 11:18 ` [PATCH net-next 07/12] xfrm: Unmask upper DSCP bits in xfrm_get_tos() Ido Schimmel
2024-08-27 14:54   ` Guillaume Nault
2024-08-27 11:18 ` [PATCH net-next 08/12] ipv4: Unmask upper DSCP bits in ip_send_unicast_reply() Ido Schimmel
2024-08-27 15:09   ` Guillaume Nault
2024-08-27 11:18 ` [PATCH net-next 09/12] ipv6: sit: Unmask upper DSCP bits in ipip6_tunnel_xmit() Ido Schimmel
2024-08-27 15:17   ` Guillaume Nault
2024-08-27 11:18 ` [PATCH net-next 10/12] ipvlan: Unmask upper DSCP bits in ipvlan_process_v4_outbound() Ido Schimmel
2024-08-27 15:19   ` Guillaume Nault
2024-08-27 11:18 ` [PATCH net-next 11/12] vrf: Unmask upper DSCP bits in vrf_process_v4_outbound() Ido Schimmel
2024-08-27 15:22   ` Guillaume Nault
2024-08-27 11:18 ` [PATCH net-next 12/12] bpf: Unmask upper DSCP bits in __bpf_redirect_neigh_v4() Ido Schimmel
2024-08-27 13:47 ` [PATCH net-next 00/12] Unmask upper DSCP bits - part 2 Guillaume Nault
2024-08-27 15:45   ` Ido Schimmel
2024-08-28 12:09     ` Ido Schimmel
2024-08-29 11:54       ` Guillaume Nault
2024-08-29 14:52         ` Ido Schimmel
2024-08-29 15:10           ` Guillaume Nault
2024-08-29 11:30     ` Guillaume Nault
2024-08-29 14:43       ` Ido Schimmel
2024-08-29 15:08         ` Guillaume Nault
