From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Eric Dumazet <edumazet@google.com>,
Simon Horman <horms@kernel.org>, Jakub Kicinski <kuba@kernel.org>,
Sasha Levin <sashal@kernel.org>,
davem@davemloft.net, dsahern@kernel.org, netdev@vger.kernel.org
Subject: [PATCH AUTOSEL 6.19-6.18] ipv6: annotate data-races in net/ipv6/route.c
Date: Sat, 14 Feb 2026 16:23:30 -0500
Message-ID: <20260214212452.782265-65-sashal@kernel.org>
In-Reply-To: <20260214212452.782265-1-sashal@kernel.org>
From: Eric Dumazet <edumazet@google.com>
[ Upstream commit f062e8e25102324364aada61b8283356235bc3c1 ]
sysctls are read while their values can change,
add READ_ONCE() annotations.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260115094141.3124990-9-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
## Analysis: ipv6: annotate data-races in net/ipv6/route.c
### Commit Message Analysis
The commit message is straightforward: it adds `READ_ONCE()` annotations
to sysctl reads in the IPv6 routing code where values can be
concurrently modified. The author is Eric Dumazet, a networking
maintainer at Google who regularly submits data-race annotation
patches. The patch was reviewed by Simon Horman.
### Code Change Analysis
The patch adds `READ_ONCE()` annotations to the following sysctl fields
read from `net->ipv6.sysctl.*`:
1. **`ip6_rt_mtu_expires`** in `rt6_do_update_pmtu()` — MTU expiration
timer
2. **`ip6_rt_min_advmss`** in `ip6_default_advmss()` — minimum
advertised MSS (also refactored from if/assign to `max_t()`)
3. **`ip6_rt_gc_min_interval`** in `ip6_dst_gc()` — GC minimum interval
4. **`ip6_rt_gc_elasticity`** in `ip6_dst_gc()` — GC elasticity
5. **`ip6_rt_gc_timeout`** in `ip6_dst_gc()` — GC timeout
6. **`ip6_rt_last_gc`** in `ip6_dst_gc()` — last GC timestamp (this is
`READ_ONCE` on a non-sysctl field, but still a concurrent access)
7. **`skip_notify_on_dev_down`** in `rt6_sync_down_dev()` — device down
notification skip flag
8. **`fib_notify_on_flag_change`** in `fib6_info_hw_flags_set()` — FIB
notification flag (read into local variable to ensure consistent use)
9. **`flush_delay`** in `ipv6_sysctl_rtcache_flush()` — route cache
flush delay
### Bug Classification
These are **data-race annotations** of the kind KCSAN reports; the
commit message itself does not cite a specific KCSAN report. The sysctl
values are written from userspace via `/proc/sys/net/ipv6/...` while
being read concurrently from network processing paths. Without
`READ_ONCE()`:
- The compiler is free to reload the value multiple times, potentially
getting different values within the same function
- This can cause inconsistent behavior (e.g., in
`fib6_info_hw_flags_set()`, where `fib_notify_on_flag_change` is
tested twice, once for `== 2` and once for zero; if the value changes
between the two loads, one call acts on two different settings)
- KCSAN can report these plain accesses as data races
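A minimal userspace sketch of the reload problem described above, not
kernel code. `READ_ONCE()` is approximated here by a volatile cast, and
`struct demo_sysctl`, `gc_timeout`, and `clamp_timeout()` are
hypothetical names chosen only for illustration:

```c
/* Userspace stand-in for the kernel macro (assumption: GCC/Clang
 * __typeof__). The volatile access makes the compiler emit exactly
 * one load for this expression. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for a sysctl block that another thread may
 * write at any time. */
struct demo_sysctl {
	int gc_timeout;
};

/* Clamp the tunable to [lo, hi] using a single snapshot, so both
 * comparisons and the returned value agree with each other. */
int clamp_timeout(struct demo_sysctl *s, int lo, int hi)
{
	int t = READ_ONCE(s->gc_timeout);

	if (t < lo)
		return lo;
	if (t > hi)
		return hi;
	return t;
}
```

Because `t` comes from one marked load, a concurrent store to
`gc_timeout` cannot make the range checks and the return value disagree.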
### Specific Risk Assessment
The most interesting change is in `fib6_info_hw_flags_set()` where the
sysctl value is read once into a local variable and then used for both
comparisons. Without this, the value could change between the `== 2`
check and the later zero check, so a single call could evaluate the two
tests against different settings and send or suppress a notification
inconsistently. This is a logic consistency issue rather than a purely
cosmetic annotation, although the window for it is narrow.
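The snapshot pattern can be sketched in userspace as follows. This is a
simplified, hypothetical model of the decision only (`should_notify()`
and its parameters are not kernel names), with `READ_ONCE()`
approximated by a volatile cast:

```c
/* Userspace stand-in for the kernel macro (assumption: GCC/Clang
 * __typeof__). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical model of the notification decision:
 *   mode 0 = never notify
 *   mode 1 = always notify
 *   mode 2 = notify only if offload_failed changed
 * Returns 1 when a notification should be sent, 0 otherwise. */
int should_notify(unsigned char *mode_sysctl, int offload_failed_changed)
{
	/* One snapshot feeds both tests below. */
	unsigned char mode = READ_ONCE(*mode_sysctl);

	if (mode == 2 && !offload_failed_changed)
		return 0;
	if (!mode)
		return 0;
	return 1;
}
```

With two separate plain loads, a writer flipping the sysctl between the
tests could make the first test see one mode and the second test see
another; the local copy rules that out.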
The `ip6_dst_gc()` changes turn each GC parameter read into a single
marked load, so the compiler cannot reload or split the access while
the value is being updated.
### Scope and Risk
- **Files changed**: 1 (net/ipv6/route.c)
- **Lines changed**: 24 (13 insertions, 11 deletions), mostly mechanical
`READ_ONCE()` additions
- **Risk**: Very low. `READ_ONCE()` is a volatile access that makes the
compiler emit a single load and keeps it from re-reading or fusing the
access; it is not a memory barrier and does not change any locking.
The `max_t()` refactor in `ip6_default_advmss()` is functionally
equivalent to the original if/assign pattern.
- **Dependencies**: None — `READ_ONCE()` is a basic kernel macro
available in all stable trees.
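The equivalence of the if/assign and `max_t()` forms can be checked with
a small userspace sketch. The function names are hypothetical, the types
(`unsigned int` MTU, `int` sysctl) are assumptions about the kernel
fields, and `READ_ONCE()` is approximated by a volatile cast:

```c
/* Userspace stand-in for the kernel macro (assumption: GCC/Clang
 * __typeof__). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Old shape: two plain loads of the shared value (compare, then
 * assign). */
unsigned int advmss_old(unsigned int mtu, int *min_advmss)
{
	if (mtu < (unsigned int)*min_advmss)
		mtu = (unsigned int)*min_advmss;
	return mtu;
}

/* New shape: one marked load, then an unsigned maximum, which is what
 * max_t(unsigned int, ...) computes. */
unsigned int advmss_new(unsigned int mtu, int *min_advmss)
{
	unsigned int min_mss = (unsigned int)READ_ONCE(*min_advmss);

	return mtu > min_mss ? mtu : min_mss;
}
```

For a value that does not change during the call, both functions return
the same result; the new shape only removes the second load.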
### Stable Criteria Assessment
1. **Obviously correct and tested**: Yes — mechanical `READ_ONCE()`
additions, reviewed by Simon Horman
2. **Fixes a real bug**: Yes, data races of the kind KCSAN flags; the
`fib_notify_on_flag_change` double-read is a logic consistency issue
3. **Important issue**: Medium — data races in networking code can cause
subtle misbehavior. KCSAN annotations are routinely backported
4. **Small and contained**: Yes, single file, 24 lines changed, mostly
mechanical
5. **No new features**: Correct — pure annotation/fix
6. **Applies cleanly**: Should apply cleanly to recent stable trees
### Precedent
Eric Dumazet's `READ_ONCE()`/`WRITE_ONCE()` annotation patches for
networking sysctls are regularly backported to stable trees. This is
part of an ongoing effort to make the networking stack KCSAN-clean, and
these patches have a strong track record of being safe and beneficial.
### Conclusion
This is a low-risk, well-understood data-race fix in core IPv6 routing
code. The `fib6_info_hw_flags_set()` fix addresses a real logic
consistency bug where a sysctl value could change between two reads
within the same function. The remaining annotations prevent compiler-
induced re-reads of concurrently modifiable values. This type of patch
is routinely and safely backported to stable.
**YES**
net/ipv6/route.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index e3a260a5564ba..cd229974b7974 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2895,7 +2895,7 @@ static void rt6_do_update_pmtu(struct rt6_info *rt, u32 mtu)
dst_metric_set(&rt->dst, RTAX_MTU, mtu);
rt->rt6i_flags |= RTF_MODIFIED;
- rt6_update_expires(rt, net->ipv6.sysctl.ip6_rt_mtu_expires);
+ rt6_update_expires(rt, READ_ONCE(net->ipv6.sysctl.ip6_rt_mtu_expires));
}
static bool rt6_cache_allowed_for_pmtu(const struct rt6_info *rt)
@@ -3256,8 +3256,8 @@ static unsigned int ip6_default_advmss(const struct dst_entry *dst)
rcu_read_lock();
net = dst_dev_net_rcu(dst);
- if (mtu < net->ipv6.sysctl.ip6_rt_min_advmss)
- mtu = net->ipv6.sysctl.ip6_rt_min_advmss;
+ mtu = max_t(unsigned int, mtu,
+ READ_ONCE(net->ipv6.sysctl.ip6_rt_min_advmss));
rcu_read_unlock();
@@ -3359,10 +3359,10 @@ struct dst_entry *icmp6_dst_alloc(struct net_device *dev,
static void ip6_dst_gc(struct dst_ops *ops)
{
struct net *net = container_of(ops, struct net, ipv6.ip6_dst_ops);
- int rt_min_interval = net->ipv6.sysctl.ip6_rt_gc_min_interval;
- int rt_elasticity = net->ipv6.sysctl.ip6_rt_gc_elasticity;
- int rt_gc_timeout = net->ipv6.sysctl.ip6_rt_gc_timeout;
- unsigned long rt_last_gc = net->ipv6.ip6_rt_last_gc;
+ int rt_min_interval = READ_ONCE(net->ipv6.sysctl.ip6_rt_gc_min_interval);
+ int rt_elasticity = READ_ONCE(net->ipv6.sysctl.ip6_rt_gc_elasticity);
+ int rt_gc_timeout = READ_ONCE(net->ipv6.sysctl.ip6_rt_gc_timeout);
+ unsigned long rt_last_gc = READ_ONCE(net->ipv6.ip6_rt_last_gc);
unsigned int val;
int entries;
@@ -5008,7 +5008,7 @@ void rt6_sync_down_dev(struct net_device *dev, unsigned long event)
};
struct net *net = dev_net(dev);
- if (net->ipv6.sysctl.skip_notify_on_dev_down)
+ if (READ_ONCE(net->ipv6.sysctl.skip_notify_on_dev_down))
fib6_clean_all_skip_notify(net, fib6_ifdown, &arg);
else
fib6_clean_all(net, fib6_ifdown, &arg);
@@ -6408,6 +6408,7 @@ void fib6_rt_update(struct net *net, struct fib6_info *rt,
void fib6_info_hw_flags_set(struct net *net, struct fib6_info *f6i,
bool offload, bool trap, bool offload_failed)
{
+ u8 fib_notify_on_flag_change;
struct sk_buff *skb;
int err;
@@ -6419,8 +6420,9 @@ void fib6_info_hw_flags_set(struct net *net, struct fib6_info *f6i,
WRITE_ONCE(f6i->offload, offload);
WRITE_ONCE(f6i->trap, trap);
+ fib_notify_on_flag_change = READ_ONCE(net->ipv6.sysctl.fib_notify_on_flag_change);
/* 2 means send notifications only if offload_failed was changed. */
- if (net->ipv6.sysctl.fib_notify_on_flag_change == 2 &&
+ if (fib_notify_on_flag_change == 2 &&
READ_ONCE(f6i->offload_failed) == offload_failed)
return;
@@ -6432,7 +6434,7 @@ void fib6_info_hw_flags_set(struct net *net, struct fib6_info *f6i,
*/
return;
- if (!net->ipv6.sysctl.fib_notify_on_flag_change)
+ if (!fib_notify_on_flag_change)
return;
skb = nlmsg_new(rt6_nlmsg_size(f6i), GFP_KERNEL);
@@ -6529,7 +6531,7 @@ static int ipv6_sysctl_rtcache_flush(const struct ctl_table *ctl, int write,
return ret;
net = (struct net *)ctl->extra1;
- delay = net->ipv6.sysctl.flush_delay;
+ delay = READ_ONCE(net->ipv6.sysctl.flush_delay);
fib6_run_gc(delay <= 0 ? 0 : (unsigned long)delay, net, delay > 0);
return 0;
}
--
2.51.0