* [PATCH net 1/4] xfrm: Don't accidentally set RTO_ONLINK in decode_session4()
2022-01-05 19:56 [PATCH net 0/4] ipv4: Fix accidental RTO_ONLINK flags passed to ip_route_output_key_hash() Guillaume Nault
@ 2022-01-05 19:56 ` Guillaume Nault
2022-01-05 19:56 ` [PATCH net 2/4] gre: Don't accidentally set RTO_ONLINK in gre_fill_metadata_dst() Guillaume Nault
` (3 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Guillaume Nault @ 2022-01-05 19:56 UTC (permalink / raw)
To: David Miller, Jakub Kicinski; +Cc: netdev, Steffen Klassert, Herbert Xu
Similar to commit 94e2238969e8 ("xfrm4: strip ECN bits from tos field"),
clear the ECN bits from iph->tos when setting ->flowi4_tos.
This ensures that the last bit of ->flowi4_tos is cleared, so
ip_route_output_key_hash() isn't going to restrict the scope of the
route lookup.
Use ~INET_ECN_MASK instead of IPTOS_RT_MASK, because we have no reason
to clear the high order bits.
Found by code inspection, compile tested only.
Fixes: 4da3089f2b58 ("[IPSEC]: Use TOS when doing tunnel lookups")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
net/xfrm/xfrm_policy.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 1a06585022ab..0bb82df0f569 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -31,6 +31,7 @@
#include <linux/if_tunnel.h>
#include <net/dst.h>
#include <net/flow.h>
+#include <net/inet_ecn.h>
#include <net/xfrm.h>
#include <net/ip.h>
#if IS_ENABLED(CONFIG_IPV6_MIP6)
@@ -3294,7 +3295,7 @@ decode_session4(struct sk_buff *skb, struct flowi *fl, bool reverse)
fl4->flowi4_proto = iph->protocol;
fl4->daddr = reverse ? iph->saddr : iph->daddr;
fl4->saddr = reverse ? iph->daddr : iph->saddr;
- fl4->flowi4_tos = iph->tos;
+ fl4->flowi4_tos = iph->tos & ~INET_ECN_MASK;
if (!ip_is_fragment(iph)) {
switch (iph->protocol) {
--
2.21.3
^ permalink raw reply related [flat|nested] 9+ messages in thread* [PATCH net 2/4] gre: Don't accidentally set RTO_ONLINK in gre_fill_metadata_dst()
2022-01-05 19:56 [PATCH net 0/4] ipv4: Fix accidental RTO_ONLINK flags passed to ip_route_output_key_hash() Guillaume Nault
2022-01-05 19:56 ` [PATCH net 1/4] xfrm: Don't accidentally set RTO_ONLINK in decode_session4() Guillaume Nault
@ 2022-01-05 19:56 ` Guillaume Nault
2022-01-05 19:56 ` [PATCH net 3/4] libcxgb: Don't accidentally set RTO_ONLINK in cxgb_find_route() Guillaume Nault
` (2 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Guillaume Nault @ 2022-01-05 19:56 UTC (permalink / raw)
To: David Miller, Jakub Kicinski
Cc: netdev, Hideaki YOSHIFUJI, David Ahern, wenxu
Mask the ECN bits before initialising ->flowi4_tos. The tunnel key may
have the last ECN bit set, which will interfere with the route lookup
process as ip_route_output_key_hash() interpretes this bit specially
(to restrict the route scope).
Found by code inspection, compile tested only.
Fixes: 962924fa2b7a ("ip_gre: Refactor collect metatdata mode tunnel xmit to ip_md_tunnel_xmit")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
net/ipv4/ip_gre.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 2ac2b95c5694..99db2e41ed10 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -604,8 +604,9 @@ static int gre_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
key = &info->key;
ip_tunnel_init_flow(&fl4, IPPROTO_GRE, key->u.ipv4.dst, key->u.ipv4.src,
- tunnel_id_to_key32(key->tun_id), key->tos, 0,
- skb->mark, skb_get_hash(skb));
+ tunnel_id_to_key32(key->tun_id),
+ key->tos & ~INET_ECN_MASK, 0, skb->mark,
+ skb_get_hash(skb));
rt = ip_route_output_key(dev_net(dev), &fl4);
if (IS_ERR(rt))
return PTR_ERR(rt);
--
2.21.3
^ permalink raw reply related [flat|nested] 9+ messages in thread* [PATCH net 3/4] libcxgb: Don't accidentally set RTO_ONLINK in cxgb_find_route()
2022-01-05 19:56 [PATCH net 0/4] ipv4: Fix accidental RTO_ONLINK flags passed to ip_route_output_key_hash() Guillaume Nault
2022-01-05 19:56 ` [PATCH net 1/4] xfrm: Don't accidentally set RTO_ONLINK in decode_session4() Guillaume Nault
2022-01-05 19:56 ` [PATCH net 2/4] gre: Don't accidentally set RTO_ONLINK in gre_fill_metadata_dst() Guillaume Nault
@ 2022-01-05 19:56 ` Guillaume Nault
2022-01-05 19:56 ` [PATCH net 4/4] mlx5: Don't accidentally set RTO_ONLINK before mlx5e_route_lookup_ipv4_get() Guillaume Nault
2022-01-10 0:23 ` [PATCH net 0/4] ipv4: Fix accidental RTO_ONLINK flags passed to ip_route_output_key_hash() Jakub Kicinski
4 siblings, 0 replies; 9+ messages in thread
From: Guillaume Nault @ 2022-01-05 19:56 UTC (permalink / raw)
To: David Miller, Jakub Kicinski; +Cc: netdev, Varun Prakash
Mask the ECN bits before calling ip_route_output_ports(). The tos
variable might be passed directly from an IPv4 header, so it may have
the last ECN bit set. This interferes with the route lookup process as
ip_route_output_key_hash() interpretes this bit specially (to restrict
the route scope).
Found by code inspection, compile tested only.
Fixes: 804c2f3e36ef ("libcxgb,iw_cxgb4,cxgbit: add cxgb_find_route()")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
drivers/net/ethernet/chelsio/libcxgb/libcxgb_cm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/chelsio/libcxgb/libcxgb_cm.c b/drivers/net/ethernet/chelsio/libcxgb/libcxgb_cm.c
index d04a6c163445..da8d10475a08 100644
--- a/drivers/net/ethernet/chelsio/libcxgb/libcxgb_cm.c
+++ b/drivers/net/ethernet/chelsio/libcxgb/libcxgb_cm.c
@@ -32,6 +32,7 @@
#include <linux/tcp.h>
#include <linux/ipv6.h>
+#include <net/inet_ecn.h>
#include <net/route.h>
#include <net/ip6_route.h>
@@ -99,7 +100,7 @@ cxgb_find_route(struct cxgb4_lld_info *lldi,
rt = ip_route_output_ports(&init_net, &fl4, NULL, peer_ip, local_ip,
peer_port, local_port, IPPROTO_TCP,
- tos, 0);
+ tos & ~INET_ECN_MASK, 0);
if (IS_ERR(rt))
return NULL;
n = dst_neigh_lookup(&rt->dst, &peer_ip);
--
2.21.3
^ permalink raw reply related [flat|nested] 9+ messages in thread* [PATCH net 4/4] mlx5: Don't accidentally set RTO_ONLINK before mlx5e_route_lookup_ipv4_get()
2022-01-05 19:56 [PATCH net 0/4] ipv4: Fix accidental RTO_ONLINK flags passed to ip_route_output_key_hash() Guillaume Nault
` (2 preceding siblings ...)
2022-01-05 19:56 ` [PATCH net 3/4] libcxgb: Don't accidentally set RTO_ONLINK in cxgb_find_route() Guillaume Nault
@ 2022-01-05 19:56 ` Guillaume Nault
2022-01-06 3:57 ` Saeed Mahameed
2022-01-10 0:23 ` [PATCH net 0/4] ipv4: Fix accidental RTO_ONLINK flags passed to ip_route_output_key_hash() Jakub Kicinski
4 siblings, 1 reply; 9+ messages in thread
From: Guillaume Nault @ 2022-01-05 19:56 UTC (permalink / raw)
To: David Miller, Jakub Kicinski
Cc: netdev, Saeed Mahameed, Leon Romanovsky, Vlad Buslov, Or Gerlitz
Mask the ECN bits before calling mlx5e_route_lookup_ipv4_get(). The
tunnel key might have the last ECN bit set. This interferes with the
route lookup process as ip_route_output_key_hash() interpretes this bit
specially (to restrict the route scope).
Found by code inspection, compile tested only.
Fixes: c7b9038d8af6 ("net/mlx5e: TC preparation refactoring for routing update event")
Fixes: 9a941117fb76 ("net/mlx5e: Maximize ip tunnel key usage on the TC offloading path")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
index a5e450973225..bc5f1dcb75e1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
/* Copyright (c) 2018 Mellanox Technologies. */
+#include <net/inet_ecn.h>
#include <net/vxlan.h>
#include <net/gre.h>
#include <net/geneve.h>
@@ -235,7 +236,7 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv,
int err;
/* add the IP fields */
- attr.fl.fl4.flowi4_tos = tun_key->tos;
+ attr.fl.fl4.flowi4_tos = tun_key->tos & ~INET_ECN_MASK;
attr.fl.fl4.daddr = tun_key->u.ipv4.dst;
attr.fl.fl4.saddr = tun_key->u.ipv4.src;
attr.ttl = tun_key->ttl;
@@ -350,7 +351,7 @@ int mlx5e_tc_tun_update_header_ipv4(struct mlx5e_priv *priv,
int err;
/* add the IP fields */
- attr.fl.fl4.flowi4_tos = tun_key->tos;
+ attr.fl.fl4.flowi4_tos = tun_key->tos & ~INET_ECN_MASK;
attr.fl.fl4.daddr = tun_key->u.ipv4.dst;
attr.fl.fl4.saddr = tun_key->u.ipv4.src;
attr.ttl = tun_key->ttl;
--
2.21.3
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH net 4/4] mlx5: Don't accidentally set RTO_ONLINK before mlx5e_route_lookup_ipv4_get()
2022-01-05 19:56 ` [PATCH net 4/4] mlx5: Don't accidentally set RTO_ONLINK before mlx5e_route_lookup_ipv4_get() Guillaume Nault
@ 2022-01-06 3:57 ` Saeed Mahameed
2022-01-06 11:47 ` Guillaume Nault
0 siblings, 1 reply; 9+ messages in thread
From: Saeed Mahameed @ 2022-01-06 3:57 UTC (permalink / raw)
To: Guillaume Nault, roid
Cc: David Miller, Jakub Kicinski, netdev, Saeed Mahameed,
Leon Romanovsky, Vlad Buslov, Or Gerlitz
On Wed, Jan 05, 2022 at 08:56:28PM +0100, Guillaume Nault wrote:
>Mask the ECN bits before calling mlx5e_route_lookup_ipv4_get(). The
>tunnel key might have the last ECN bit set. This interferes with the
>route lookup process as ip_route_output_key_hash() interpretes this bit
>specially (to restrict the route scope).
>
>Found by code inspection, compile tested only.
>
>Fixes: c7b9038d8af6 ("net/mlx5e: TC preparation refactoring for routing update event")
>Fixes: 9a941117fb76 ("net/mlx5e: Maximize ip tunnel key usage on the TC offloading path")
>Signed-off-by: Guillaume Nault <gnault@redhat.com>
>---
> drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
>index a5e450973225..bc5f1dcb75e1 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
>@@ -1,6 +1,7 @@
> /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
> /* Copyright (c) 2018 Mellanox Technologies. */
>
>+#include <net/inet_ecn.h>
> #include <net/vxlan.h>
> #include <net/gre.h>
> #include <net/geneve.h>
>@@ -235,7 +236,7 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv,
> int err;
>
> /* add the IP fields */
>- attr.fl.fl4.flowi4_tos = tun_key->tos;
>+ attr.fl.fl4.flowi4_tos = tun_key->tos & ~INET_ECN_MASK;
This is TC control path, why would ecn bits be ON in TC act->tunnel info ?
I don't believe these bits are on and if they were, TC tunnels layer should
clear them before calling the driver's offload callback.
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH net 4/4] mlx5: Don't accidentally set RTO_ONLINK before mlx5e_route_lookup_ipv4_get()
2022-01-06 3:57 ` Saeed Mahameed
@ 2022-01-06 11:47 ` Guillaume Nault
0 siblings, 0 replies; 9+ messages in thread
From: Guillaume Nault @ 2022-01-06 11:47 UTC (permalink / raw)
To: Saeed Mahameed
Cc: roid, David Miller, Jakub Kicinski, netdev, Saeed Mahameed,
Leon Romanovsky, Vlad Buslov, Or Gerlitz
On Wed, Jan 05, 2022 at 07:57:23PM -0800, Saeed Mahameed wrote:
> On Wed, Jan 05, 2022 at 08:56:28PM +0100, Guillaume Nault wrote:
> > Mask the ECN bits before calling mlx5e_route_lookup_ipv4_get(). The
> > tunnel key might have the last ECN bit set. This interferes with the
> > route lookup process as ip_route_output_key_hash() interpretes this bit
> > specially (to restrict the route scope).
> >
> > Found by code inspection, compile tested only.
> >
> > Fixes: c7b9038d8af6 ("net/mlx5e: TC preparation refactoring for routing update event")
> > Fixes: 9a941117fb76 ("net/mlx5e: Maximize ip tunnel key usage on the TC offloading path")
> > Signed-off-by: Guillaume Nault <gnault@redhat.com>
> > ---
> > drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
> > index a5e450973225..bc5f1dcb75e1 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
> > @@ -1,6 +1,7 @@
> > /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
> > /* Copyright (c) 2018 Mellanox Technologies. */
> >
> > +#include <net/inet_ecn.h>
> > #include <net/vxlan.h>
> > #include <net/gre.h>
> > #include <net/geneve.h>
> > @@ -235,7 +236,7 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv,
> > int err;
> >
> > /* add the IP fields */
> > - attr.fl.fl4.flowi4_tos = tun_key->tos;
> > + attr.fl.fl4.flowi4_tos = tun_key->tos & ~INET_ECN_MASK;
>
> This is TC control path, why would ecn bits be ON in TC act->tunnel info ?
As far as I understand, the value of tun_key->tos can be set by
act_tunnel_key.c, which doesn't impose any restriction on the tos value
(TCA_TUNNEL_KEY_ENC_TOS). Unless I've missed something (I'm not
familiar with the hardware offload infrastructure), tun_key->tos is
effectively under user control.
> I don't believe these bits are on and if they were, TC tunnels layer should
> clear them before calling the driver's offload callback.
We could reject TOS values that have the ECN bits set in
act_tunnel_key.c, but that'd be a much broader change, and a user
visible one. At the very least it'd be something for net-next, while
this series tries to fix bugs in net.
Also, from a logical point of view, callers of ip_route_output_key()
are responsible for properly setting or clearing the RTO_ONLINK flag.
We shouldn't have to change act_tunnel_key.c because one of the drivers
offload path might call ip_route_output_key().
Even though I agree that act_tunnel_key should ideally not accept ECN
bits in TCA_TUNNEL_KEY_ENC_TOS, it should refuse them for user
understandable reasons (decouple DSCP and ECN), not because of some
drivers implementation details.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 0/4] ipv4: Fix accidental RTO_ONLINK flags passed to ip_route_output_key_hash()
2022-01-05 19:56 [PATCH net 0/4] ipv4: Fix accidental RTO_ONLINK flags passed to ip_route_output_key_hash() Guillaume Nault
` (3 preceding siblings ...)
2022-01-05 19:56 ` [PATCH net 4/4] mlx5: Don't accidentally set RTO_ONLINK before mlx5e_route_lookup_ipv4_get() Guillaume Nault
@ 2022-01-10 0:23 ` Jakub Kicinski
2022-01-10 13:59 ` Guillaume Nault
4 siblings, 1 reply; 9+ messages in thread
From: Jakub Kicinski @ 2022-01-10 0:23 UTC (permalink / raw)
To: Guillaume Nault
Cc: David Miller, netdev, Steffen Klassert, Herbert Xu,
Hideaki YOSHIFUJI, David Ahern, wenxu, Varun Prakash,
Saeed Mahameed, Leon Romanovsky, Vlad Buslov, Or Gerlitz
On Wed, 5 Jan 2022 20:56:16 +0100 Guillaume Nault wrote:
> The IPv4 stack generally uses the last bit of ->flowi4_tos as a flag
> indicating link scope for route lookups (RTO_ONLINK). Therefore, we
> have to be careful when copying a TOS value to ->flowi4_tos. In
> particular, the ->tos field of IPv4 packets may have this bit set
> because of ECN. Also tunnel keys generally accept any user value for
> the tos.
>
> This series fixes several places where ->flowi4_tos was set from
> non-sanitised values and the flowi4 structure was later used by
> ip_route_output_key_hash().
>
> Note that the IPv4 stack usually clears the RTO_ONLINK bit using
> RT_TOS(). However this macro is based on an obsolete interpretation of
> the old IPv4 TOS field (RFC 1349) and clears the three high order bits.
> Since we don't need to clear these bits and since it doesn't make sense
> to clear only one of the ECN bits, this patch series uses INET_ECN_MASK
> instead.
>
> All patches were compile tested only.
Does not apply cleanly to net any more, could you respin?
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH net 0/4] ipv4: Fix accidental RTO_ONLINK flags passed to ip_route_output_key_hash()
2022-01-10 0:23 ` [PATCH net 0/4] ipv4: Fix accidental RTO_ONLINK flags passed to ip_route_output_key_hash() Jakub Kicinski
@ 2022-01-10 13:59 ` Guillaume Nault
0 siblings, 0 replies; 9+ messages in thread
From: Guillaume Nault @ 2022-01-10 13:59 UTC (permalink / raw)
To: Jakub Kicinski
Cc: David Miller, netdev, Steffen Klassert, Herbert Xu,
Hideaki YOSHIFUJI, David Ahern, wenxu, Varun Prakash,
Saeed Mahameed, Leon Romanovsky, Vlad Buslov, Or Gerlitz
On Sun, Jan 09, 2022 at 04:23:22PM -0800, Jakub Kicinski wrote:
> On Wed, 5 Jan 2022 20:56:16 +0100 Guillaume Nault wrote:
> > The IPv4 stack generally uses the last bit of ->flowi4_tos as a flag
> > indicating link scope for route lookups (RTO_ONLINK). Therefore, we
> > have to be careful when copying a TOS value to ->flowi4_tos. In
> > particular, the ->tos field of IPv4 packets may have this bit set
> > because of ECN. Also tunnel keys generally accept any user value for
> > the tos.
> >
> > This series fixes several places where ->flowi4_tos was set from
> > non-sanitised values and the flowi4 structure was later used by
> > ip_route_output_key_hash().
> >
> > Note that the IPv4 stack usually clears the RTO_ONLINK bit using
> > RT_TOS(). However this macro is based on an obsolete interpretation of
> > the old IPv4 TOS field (RFC 1349) and clears the three high order bits.
> > Since we don't need to clear these bits and since it doesn't make sense
> > to clear only one of the ECN bits, this patch series uses INET_ECN_MASK
> > instead.
> >
> > All patches were compile tested only.
>
> Does not apply cleanly to net any more, could you respin?
Yes, done:
https://lore.kernel.org/netdev/cover.1641821242.git.gnault@redhat.com/
^ permalink raw reply [flat|nested] 9+ messages in thread