Netdev List
* [PATCH net-next 0/2] sit: prepare for RTNL-less link dumping
@ 2026-07-01 15:51 Eric Dumazet
  2026-07-01 15:51 ` [PATCH net-next 1/2] ip_tunnel: use WRITE_ONCE in ip_tunnel_encap_setup Eric Dumazet
  2026-07-01 15:51 ` [PATCH net-next 2/2] sit: no longer rely on RTNL in ipip6_fill_info() Eric Dumazet
  0 siblings, 2 replies; 5+ messages in thread
From: Eric Dumazet @ 2026-07-01 15:51 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Ido Schimmel, David Ahern,
	netdev, eric.dumazet, Eric Dumazet

This series prepares the SIT tunnel driver for RTNL-less link info dumping.

To do so, configuration reads in ipip6_fill_info() must be able to run
safely alongside concurrent configuration updates
(via netlink changelink or ioctls).

The first patch fixes a long-standing bug in the shared
ip_tunnel_encap_setup() helper (used by SIT and other IPv4 tunnels),
which cleared the encapsulation configuration even if the setup failed,
and adds WRITE_ONCE() annotations to protect concurrent readers.

The second patch updates SIT's fill_info() to use READ_ONCE()
for lockless reads, and updates the write paths to use WRITE_ONCE().

Eric Dumazet (2):
  ip_tunnel: use WRITE_ONCE in ip_tunnel_encap_setup
  sit: no longer rely on RTNL in ipip6_fill_info()

 net/ipv4/ip_tunnel.c | 14 ++++-----
 net/ipv6/sit.c       | 70 +++++++++++++++++++++++++-------------------
 2 files changed, 46 insertions(+), 38 deletions(-)

-- 
2.55.0.rc0.799.gd6f94ed593-goog
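
For readers less familiar with the annotation pattern this series relies on, here is a minimal userspace sketch. It is not kernel code: the two macros are simplified stand-ins built on a volatile cast (an assumption; the kernel's own definitions handle more cases), and struct demo_tunnel, demo_update(), demo_dump() and demo_selfcheck() are hypothetical names made up for illustration. It needs the GCC/Clang __typeof__ extension.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins (assumption): a volatile access keeps the compiler
 * from tearing, fusing or repeating the load/store. The kernel's real
 * READ_ONCE()/WRITE_ONCE() are more elaborate than this. */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical, trimmed-down tunnel configuration. */
struct demo_tunnel {
	uint32_t link;
	uint8_t  ttl;
	uint8_t  tos;
};

/* Writer side: one marked store per field. */
static void demo_update(struct demo_tunnel *t, uint32_t link,
			uint8_t ttl, uint8_t tos)
{
	WRITE_ONCE(t->link, link);
	WRITE_ONCE(t->ttl, ttl);
	WRITE_ONCE(t->tos, tos);
}

/* Reader side: load each field exactly once into a local, then work
 * only on the locals. Each field is individually untorn; the set of
 * fields is not guaranteed to come from the same update. */
static uint32_t demo_dump(const struct demo_tunnel *t)
{
	uint32_t link = READ_ONCE(t->link);
	uint8_t ttl = READ_ONCE(t->ttl);
	uint8_t tos = READ_ONCE(t->tos);

	return (link << 16) | ((uint32_t)ttl << 8) | tos;
}

/* Single-threaded usage example. */
static void demo_selfcheck(void)
{
	struct demo_tunnel t = { 0, 0, 0 };

	demo_update(&t, 3, 64, 0x10);
	assert(demo_dump(&t) == ((3u << 16) | (64u << 8) | 0x10u));
}
```

The sketch only shows per-field marking. It says nothing about ordering between fields, which a dump path of this kind does not appear to need.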



* [PATCH net-next 1/2] ip_tunnel: use WRITE_ONCE in ip_tunnel_encap_setup
  2026-07-01 15:51 [PATCH net-next 0/2] sit: prepare for RTNL-less link dumping Eric Dumazet
@ 2026-07-01 15:51 ` Eric Dumazet
  2026-07-02  3:11   ` Kuniyuki Iwashima
  2026-07-01 15:51 ` [PATCH net-next 2/2] sit: no longer rely on RTNL in ipip6_fill_info() Eric Dumazet
  1 sibling, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2026-07-01 15:51 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Ido Schimmel, David Ahern,
	netdev, eric.dumazet, Eric Dumazet

Update ip_tunnel_encap_setup() to use WRITE_ONCE() when writing
to encap fields (type, sport, dport, flags) and hlen fields.
This ensures that concurrent lockless readers (like fill_info)
do not see torn writes.

Also remove the unsafe memset() on t->encap which could cause
concurrent readers to transiently see zeroed fields.
Removing it also fixes a bug where t->encap was left cleared
even if ip_encap_hlen() failed, resulting in partial configuration.

Fixes: 56328486539d ("net: Changes to ip_tunnel to support foo-over-udp encapsulation")
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/ip_tunnel.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index 9d114bd575f928b0ab46ef3007e456692d82b497..f7bbdd1bd323b1973ec1ea1d93f0ed6ab703bcd3 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -491,19 +491,17 @@ int ip_tunnel_encap_setup(struct ip_tunnel *t,
 {
 	int hlen;
 
-	memset(&t->encap, 0, sizeof(t->encap));
-
 	hlen = ip_encap_hlen(ipencap);
 	if (hlen < 0)
 		return hlen;
 
-	t->encap.type = ipencap->type;
-	t->encap.sport = ipencap->sport;
-	t->encap.dport = ipencap->dport;
-	t->encap.flags = ipencap->flags;
+	WRITE_ONCE(t->encap.type, ipencap->type);
+	WRITE_ONCE(t->encap.sport, ipencap->sport);
+	WRITE_ONCE(t->encap.dport, ipencap->dport);
+	WRITE_ONCE(t->encap.flags, ipencap->flags);
 
-	t->encap_hlen = hlen;
-	t->hlen = t->encap_hlen + t->tun_hlen;
+	WRITE_ONCE(t->encap_hlen, hlen);
+	WRITE_ONCE(t->hlen, hlen + t->tun_hlen);
 
 	return 0;
 }
-- 
2.55.0.rc0.799.gd6f94ed593-goog
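
The ordering problem fixed by removing the memset() can be reproduced outside the kernel. The sketch below is a simplified userspace model, not the kernel function: struct demo_encap, struct demo_tunnel, demo_encap_hlen() and both demo_setup_*() helpers are hypothetical, and plain stores are used instead of WRITE_ONCE() so that only the validate-then-write ordering is in focus.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct demo_encap {
	int type;
	int sport;
	int dport;
	int flags;
};

struct demo_tunnel {
	struct demo_encap encap;
	int encap_hlen;
};

/* Hypothetical validator: only types 0 and 1 are accepted. */
static int demo_encap_hlen(const struct demo_encap *e)
{
	switch (e->type) {
	case 0:
		return 0;
	case 1:
		return 8;
	default:
		return -EINVAL;
	}
}

/* Old ordering: clear first, validate second. */
static int demo_setup_old(struct demo_tunnel *t, const struct demo_encap *e)
{
	int hlen;

	memset(&t->encap, 0, sizeof(t->encap));
	hlen = demo_encap_hlen(e);
	if (hlen < 0)
		return hlen;	/* t->encap is already wiped here */
	t->encap = *e;
	t->encap_hlen = hlen;
	return 0;
}

/* New ordering: validate first, write only on success. */
static int demo_setup_new(struct demo_tunnel *t, const struct demo_encap *e)
{
	int hlen = demo_encap_hlen(e);

	if (hlen < 0)
		return hlen;	/* previous configuration left untouched */
	t->encap = *e;
	t->encap_hlen = hlen;
	return 0;
}

/* Usage example: feed an invalid request to both variants. */
static void demo_selfcheck(void)
{
	const struct demo_encap bad = { .type = 99 };
	struct demo_tunnel a = { .encap = { 1, 1000, 2000, 0 }, .encap_hlen = 8 };
	struct demo_tunnel b = a;

	assert(demo_setup_old(&a, &bad) == -EINVAL);
	/* Cleared encap next to a stale header length. */
	assert(a.encap.type == 0 && a.encap_hlen == 8);

	assert(demo_setup_new(&b, &bad) == -EINVAL);
	assert(b.encap.type == 1 && b.encap.sport == 1000 && b.encap_hlen == 8);
}
```

In the old variant a rejected request still leaves the tunnel with zeroed encap fields and the previous encap_hlen, which is the partial configuration the commit message refers to.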



* [PATCH net-next 2/2] sit: no longer rely on RTNL in ipip6_fill_info()
  2026-07-01 15:51 [PATCH net-next 0/2] sit: prepare for RTNL-less link dumping Eric Dumazet
  2026-07-01 15:51 ` [PATCH net-next 1/2] ip_tunnel: use WRITE_ONCE in ip_tunnel_encap_setup Eric Dumazet
@ 2026-07-01 15:51 ` Eric Dumazet
  2026-07-02  3:13   ` Kuniyuki Iwashima
  1 sibling, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2026-07-01 15:51 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Ido Schimmel, David Ahern,
	netdev, eric.dumazet, Eric Dumazet

Update ipip6_fill_info() to read configuration fields (link, ttl, tos,
proto, i_flags, fwmark, 6rd prefix, encap type/sport/dport/flags)
locklessly using READ_ONCE().

Annotate the bitmap reads for i_flags by copying the first element
atomically using READ_ONCE() into a local variable, as the whole
bitmap fits in one unsigned long.

This allows ipip6_fill_info() to run safely without RTNL.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv6/sit.c | 70 ++++++++++++++++++++++++++++----------------------
 1 file changed, 40 insertions(+), 30 deletions(-)

diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index a38b24fb838424b6d3cb063d77aa85cf719ce6c5..c7abbb09bfd3dfada6fc1e1682ac42acb7248ad9 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -1143,17 +1143,17 @@ static void ipip6_tunnel_update(struct ip_tunnel *t,
 
 	ipip6_tunnel_unlink(sitn, t);
 	synchronize_net();
-	t->parms.iph.saddr = p->iph.saddr;
-	t->parms.iph.daddr = p->iph.daddr;
+	WRITE_ONCE(t->parms.iph.saddr, p->iph.saddr);
+	WRITE_ONCE(t->parms.iph.daddr, p->iph.daddr);
 	__dev_addr_set(t->dev, &p->iph.saddr, 4);
 	memcpy(t->dev->broadcast, &p->iph.daddr, 4);
 	ipip6_tunnel_link(sitn, t);
-	t->parms.iph.ttl = p->iph.ttl;
-	t->parms.iph.tos = p->iph.tos;
-	t->parms.iph.frag_off = p->iph.frag_off;
-	if (t->parms.link != p->link || t->fwmark != fwmark) {
-		t->parms.link = p->link;
-		t->fwmark = fwmark;
+	WRITE_ONCE(t->parms.iph.ttl, p->iph.ttl);
+	WRITE_ONCE(t->parms.iph.tos, p->iph.tos);
+	WRITE_ONCE(t->parms.iph.frag_off, p->iph.frag_off);
+	if (READ_ONCE(t->parms.link) != p->link || READ_ONCE(t->fwmark) != fwmark) {
+		WRITE_ONCE(t->parms.link, p->link);
+		WRITE_ONCE(t->fwmark, fwmark);
 		ipip6_tunnel_bind_dev(t->dev);
 	}
 	dst_cache_reset(&t->dst_cache);
@@ -1184,9 +1184,9 @@ static int ipip6_tunnel_update_6rd(struct ip_tunnel *t,
 		return -EINVAL;
 
 	t->ip6rd.prefix = prefix;
-	t->ip6rd.relay_prefix = relay_prefix;
-	t->ip6rd.prefixlen = ip6rd->prefixlen;
-	t->ip6rd.relay_prefixlen = ip6rd->relay_prefixlen;
+	WRITE_ONCE(t->ip6rd.relay_prefix, relay_prefix);
+	WRITE_ONCE(t->ip6rd.prefixlen, ip6rd->prefixlen);
+	WRITE_ONCE(t->ip6rd.relay_prefixlen, ip6rd->relay_prefixlen);
 	dst_cache_reset(&t->dst_cache);
 	netdev_state_change(t->dev);
 	return 0;
@@ -1693,42 +1693,52 @@ static size_t ipip6_get_size(const struct net_device *dev)
 
 static int ipip6_fill_info(struct sk_buff *skb, const struct net_device *dev)
 {
-	struct ip_tunnel *tunnel = netdev_priv(dev);
-	struct ip_tunnel_parm_kern *parm = &tunnel->parms;
-
-	if (nla_put_u32(skb, IFLA_IPTUN_LINK, parm->link) ||
-	    nla_put_in_addr(skb, IFLA_IPTUN_LOCAL, parm->iph.saddr) ||
-	    nla_put_in_addr(skb, IFLA_IPTUN_REMOTE, parm->iph.daddr) ||
-	    nla_put_u8(skb, IFLA_IPTUN_TTL, parm->iph.ttl) ||
-	    nla_put_u8(skb, IFLA_IPTUN_TOS, parm->iph.tos) ||
+	const struct ip_tunnel *tunnel = netdev_priv(dev);
+	const struct ip_tunnel_parm_kern *parm;
+	IP_TUNNEL_DECLARE_FLAGS(i_flags);
+	__be16 frag_off;
+	__be32 daddr;
+	__be32 saddr;
+
+	parm = &tunnel->parms;
+	i_flags[0] = READ_ONCE(parm->i_flags[0]);
+	frag_off = READ_ONCE(parm->iph.frag_off);
+	saddr = READ_ONCE(parm->iph.saddr);
+	daddr = READ_ONCE(parm->iph.daddr);
+
+	if (nla_put_u32(skb, IFLA_IPTUN_LINK, READ_ONCE(parm->link)) ||
+	    nla_put_in_addr(skb, IFLA_IPTUN_LOCAL, saddr) ||
+	    nla_put_in_addr(skb, IFLA_IPTUN_REMOTE, daddr) ||
+	    nla_put_u8(skb, IFLA_IPTUN_TTL, READ_ONCE(parm->iph.ttl)) ||
+	    nla_put_u8(skb, IFLA_IPTUN_TOS, READ_ONCE(parm->iph.tos)) ||
 	    nla_put_u8(skb, IFLA_IPTUN_PMTUDISC,
-		       !!(parm->iph.frag_off & htons(IP_DF))) ||
-	    nla_put_u8(skb, IFLA_IPTUN_PROTO, parm->iph.protocol) ||
+		       !!(frag_off & htons(IP_DF))) ||
+	    nla_put_u8(skb, IFLA_IPTUN_PROTO, READ_ONCE(parm->iph.protocol)) ||
 	    nla_put_be16(skb, IFLA_IPTUN_FLAGS,
-			 ip_tunnel_flags_to_be16(parm->i_flags)) ||
-	    nla_put_u32(skb, IFLA_IPTUN_FWMARK, tunnel->fwmark))
+			 ip_tunnel_flags_to_be16(i_flags)) ||
+	    nla_put_u32(skb, IFLA_IPTUN_FWMARK, READ_ONCE(tunnel->fwmark)))
 		goto nla_put_failure;
 
 #ifdef CONFIG_IPV6_SIT_6RD
 	if (nla_put_in6_addr(skb, IFLA_IPTUN_6RD_PREFIX,
 			     &tunnel->ip6rd.prefix) ||
 	    nla_put_in_addr(skb, IFLA_IPTUN_6RD_RELAY_PREFIX,
-			    tunnel->ip6rd.relay_prefix) ||
+			    READ_ONCE(tunnel->ip6rd.relay_prefix)) ||
 	    nla_put_u16(skb, IFLA_IPTUN_6RD_PREFIXLEN,
-			tunnel->ip6rd.prefixlen) ||
+			READ_ONCE(tunnel->ip6rd.prefixlen)) ||
 	    nla_put_u16(skb, IFLA_IPTUN_6RD_RELAY_PREFIXLEN,
-			tunnel->ip6rd.relay_prefixlen))
+			READ_ONCE(tunnel->ip6rd.relay_prefixlen)))
 		goto nla_put_failure;
 #endif
 
 	if (nla_put_u16(skb, IFLA_IPTUN_ENCAP_TYPE,
-			tunnel->encap.type) ||
+			READ_ONCE(tunnel->encap.type)) ||
 	    nla_put_be16(skb, IFLA_IPTUN_ENCAP_SPORT,
-			tunnel->encap.sport) ||
+			READ_ONCE(tunnel->encap.sport)) ||
 	    nla_put_be16(skb, IFLA_IPTUN_ENCAP_DPORT,
-			tunnel->encap.dport) ||
+			READ_ONCE(tunnel->encap.dport)) ||
 	    nla_put_u16(skb, IFLA_IPTUN_ENCAP_FLAGS,
-			tunnel->encap.flags))
+			READ_ONCE(tunnel->encap.flags)))
 		goto nla_put_failure;
 
 	return 0;
-- 
2.55.0.rc0.799.gd6f94ed593-goog
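
The i_flags handling above copies one word of a bitmap into a local before testing bits. A minimal userspace sketch of that idea follows. Everything in it is hypothetical (DEMO_FLAG_BITS, struct demo_parms, demo_test_bit(), demo_flags_summary()), the READ_ONCE() macro is a simplified volatile-cast stand-in rather than the kernel's definition, and C11 plus the GCC/Clang __typeof__ extension are assumed.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Simplified userspace stand-in (assumption), not the kernel macro. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

#define DEMO_FLAG_BITS     17	/* hypothetical flag count */
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_FLAG_WORDS \
	((DEMO_FLAG_BITS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* The one-word snapshot is only valid while the bitmap fits one word. */
_Static_assert(DEMO_FLAG_WORDS == 1, "snapshot copies a single word");

struct demo_parms {
	unsigned long i_flags[DEMO_FLAG_WORDS];
};

static bool demo_test_bit(const unsigned long *map, unsigned int nr)
{
	return (map[nr / DEMO_BITS_PER_LONG] >> (nr % DEMO_BITS_PER_LONG)) & 1UL;
}

/* Report bit 0 as value 1 and bit 3 as value 2, from a private copy. */
static unsigned int demo_flags_summary(const struct demo_parms *p)
{
	unsigned long snap[DEMO_FLAG_WORDS];

	snap[0] = READ_ONCE(p->i_flags[0]);	/* single marked load */

	/* Both tests see the same word, even if p->i_flags changes now. */
	return (demo_test_bit(snap, 0) ? 1u : 0u) |
	       (demo_test_bit(snap, 3) ? 2u : 0u);
}

/* Single-threaded usage example. */
static void demo_selfcheck(void)
{
	struct demo_parms p = { { (1UL << 0) | (1UL << 3) } };

	assert(demo_flags_summary(&p) == 3u);
	p.i_flags[0] = 1UL << 3;
	assert(demo_flags_summary(&p) == 2u);
}
```

The static assertion is a design choice of this sketch: if the hypothetical flag count ever grew past one word, the single-load snapshot would silently miss bits, so the build fails instead.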



* Re: [PATCH net-next 1/2] ip_tunnel: use WRITE_ONCE in ip_tunnel_encap_setup
  2026-07-01 15:51 ` [PATCH net-next 1/2] ip_tunnel: use WRITE_ONCE in ip_tunnel_encap_setup Eric Dumazet
@ 2026-07-02  3:11   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 5+ messages in thread
From: Kuniyuki Iwashima @ 2026-07-02  3:11 UTC
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Ido Schimmel, David Ahern, netdev, eric.dumazet

On Wed, Jul 1, 2026 at 8:51 AM Eric Dumazet <edumazet@google.com> wrote:
>
> Update ip_tunnel_encap_setup() to use WRITE_ONCE() when writing
> to encap fields (type, sport, dport, flags) and hlen fields.
> This ensures that concurrent lockless readers (like fill_info)
> do not see torn writes.
>
> Also remove the unsafe memset() on t->encap which could cause
> concurrent readers to transiently see zeroed fields.
> Removing it also fixes a bug where t->encap was left cleared
> even if ip_encap_hlen() failed, resulting in partial configuration.
>
> Fixes: 56328486539d ("net: Changes to ip_tunnel to support foo-over-udp encapsulation")
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>


* Re: [PATCH net-next 2/2] sit: no longer rely on RTNL in ipip6_fill_info()
  2026-07-01 15:51 ` [PATCH net-next 2/2] sit: no longer rely on RTNL in ipip6_fill_info() Eric Dumazet
@ 2026-07-02  3:13   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 5+ messages in thread
From: Kuniyuki Iwashima @ 2026-07-02  3:13 UTC
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Ido Schimmel, David Ahern, netdev, eric.dumazet

On Wed, Jul 1, 2026 at 8:51 AM Eric Dumazet <edumazet@google.com> wrote:
>
> Update ipip6_fill_info() to read configuration fields (link, ttl, tos,
> proto, i_flags, fwmark, 6rd prefix, encap type/sport/dport/flags)
> locklessly using READ_ONCE().
>
> Annotate the bitmap reads for i_flags by copying the first element
> atomically using READ_ONCE() into a local variable, as the whole
> bitmap fits in one unsigned long.
>
> This allows ipip6_fill_info() to run safely without RTNL.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
