From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EF9C2313276; Sat, 14 Feb 2026 21:26:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771104404; cv=none; b=oZq9Rw47Qpjt4b0AgmtaxycWBesEw+i+oDHqtOwQLHSPHBxchEMPvfWef7cWNG7Cbt2PH1NwX+3XROQa57Y8VdxbUGueewh9R4uas04TyXHoJr4e/BHj2Au8sjWgOAAeqaceUQVaaFz4dxi0vUN+npaWCeSek5XoCjbIBO7xoDw= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771104404; c=relaxed/simple; bh=RbMrL6gLKfAnQPVcaNts4Jte0dJZt7ijalnP54S/bKQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=qOOfkt9oZbbGDnataNgVN80G6c4BMwMiAuY2bzvI6Y9L6WaANC/u1Y8+RArzJs8qmhqSJ6+yx5XHiL3WLxWW7l8TNEz0WycOORGG9SqdbDY0eF4sSeMHunRfqSA0DrzXx3V3+vzwuSW3s+AM+P4+LywnbBYk6RFl5GN+pYe35+Y= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ry/TSq7E; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ry/TSq7E" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E432FC2BC87; Sat, 14 Feb 2026 21:26:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1771104403; bh=RbMrL6gLKfAnQPVcaNts4Jte0dJZt7ijalnP54S/bKQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ry/TSq7Eyd43Cr6K7qlvFIwhJ6CnVAAeQ0QsAynH2BO/GtL3TTRpZdBqRz8q6iR3w O17xhc6M63mcfgf4+ivlhyppgY3an6qv2Wo/OdkE9nElaLPxnrBTjBUK7wn1e44JqK 7buAtz9N+LOus9t2XSwR5NonjBVRyS8Jf56haJCgIghjnii6yury3lK3nQ382g04+D O5qitkNco92IPyf/7URjBAW61q3HNk7v5kqwIruz3cyS18DaPxKz+L7fyQUnZPOICT EgzQVfKQKSnCzqZ5OonvHzOVuvk8JAGpe/d/TQEvbS5qyz5EUYT2yzmt589XBL0Zo8 y4PSJ9MD9zr/Q== From: Sasha Levin To: patches@lists.linux.dev, stable@vger.kernel.org Cc: Eric Dumazet , Simon Horman , Jakub Kicinski , Sasha Levin , davem@davemloft.net, dsahern@kernel.org, netdev@vger.kernel.org Subject: [PATCH AUTOSEL 6.19-6.18] ipv6: annotate data-races in net/ipv6/route.c Date: Sat, 14 Feb 2026 16:23:30 -0500 Message-ID: <20260214212452.782265-65-sashal@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20260214212452.782265-1-sashal@kernel.org> References: <20260214212452.782265-1-sashal@kernel.org> Precedence: bulk X-Mailing-List: stable@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore X-stable-base: Linux 6.19 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit From: Eric Dumazet [ Upstream commit f062e8e25102324364aada61b8283356235bc3c1 ] sysctls are read while their values can change, add READ_ONCE() annotations. Signed-off-by: Eric Dumazet Reviewed-by: Simon Horman Link: https://patch.msgid.link/20260115094141.3124990-9-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- LLM Generated explanations, may be completely bogus: ## Analysis: ipv6: annotate data-races in net/ipv6/route.c ### Commit Message Analysis The commit message is straightforward: it adds `READ_ONCE()` annotations to sysctl reads in the IPv6 routing code where values can be concurrently modified. This is authored by Eric Dumazet, a prolific networking maintainer at Google who regularly submits KCSAN data-race annotation patches. Reviewed by Simon Horman. ### Code Change Analysis The patch adds `READ_ONCE()` annotations to the following sysctl fields read from `net->ipv6.sysctl.*`: 1. **`ip6_rt_mtu_expires`** in `rt6_do_update_pmtu()` — MTU expiration timer 2. **`ip6_rt_min_advmss`** in `ip6_default_advmss()` — minimum advertised MSS (also refactored from if/assign to `max_t()`) 3. **`ip6_rt_gc_min_interval`** in `ip6_dst_gc()` — GC minimum interval 4. **`ip6_rt_gc_elasticity`** in `ip6_dst_gc()` — GC elasticity 5. **`ip6_rt_gc_timeout`** in `ip6_dst_gc()` — GC timeout 6. **`ip6_rt_last_gc`** in `ip6_dst_gc()` — last GC timestamp (this is `READ_ONCE` on a non-sysctl field, but still a concurrent access) 7. **`skip_notify_on_dev_down`** in `rt6_sync_down_dev()` — device down notification skip flag 8. **`fib_notify_on_flag_change`** in `fib6_info_hw_flags_set()` — FIB notification flag (read into local variable to ensure consistent use) 9. **`flush_delay`** in `ipv6_sysctl_rtcache_flush()` — route cache flush delay ### Bug Classification These are **KCSAN data-race fixes**. The sysctl values are written from userspace via `/proc/sys/net/ipv6/...` while being read concurrently from network processing paths. Without `READ_ONCE()`: - The compiler is free to reload the value multiple times, potentially getting different values within the same function - This can cause inconsistent behavior (e.g., in `fib6_info_hw_flags_set()` where `fib_notify_on_flag_change` is checked twice — once for `== 2` and once for `!= 0` — a torn read could skip both checks or trigger unexpected paths) - KCSAN will report these as data races ### Specific Risk Assessment The most interesting change is in `fib6_info_hw_flags_set()` where the sysctl value is read once into a local variable and then used for both comparisons. Without this, the value could change between the `== 2` check and the `!= 0` check, leading to potentially skipping the notification when it shouldn't be skipped (or vice versa). This is a real logic consistency bug, not just a theoretical annotation. The `ip6_dst_gc()` changes protect GC parameters that could be torn- read, potentially affecting garbage collection timing and behavior. ### Scope and Risk - **Files changed**: 1 (net/ipv6/route.c) - **Lines changed**: ~20 lines, all mechanical `READ_ONCE()` additions - **Risk**: Very low — `READ_ONCE()` is a compiler barrier that prevents optimization-based re-reading. It doesn't change any logic or locking. The `max_t()` refactor in `ip6_default_advmss()` is functionally equivalent to the original if/assign pattern. - **Dependencies**: None — `READ_ONCE()` is a basic kernel macro available in all stable trees. ### Stable Criteria Assessment 1. **Obviously correct and tested**: Yes — mechanical `READ_ONCE()` additions, reviewed by Simon Horman 2. **Fixes a real bug**: Yes — data races detected by KCSAN; the `fib_notify_on_flag_change` double-read is a real logic bug 3. **Important issue**: Medium — data races in networking code can cause subtle misbehavior. KCSAN annotations are routinely backported 4. **Small and contained**: Yes — single file, ~20 lines of mechanical changes 5. **No new features**: Correct — pure annotation/fix 6. **Applies cleanly**: Should apply cleanly to recent stable trees ### Precedent Eric Dumazet's `READ_ONCE()`/`WRITE_ONCE()` annotation patches for networking sysctls are regularly backported to stable trees. This is part of an ongoing effort to make the networking stack KCSAN-clean, and these patches have a strong track record of being safe and beneficial. ### Conclusion This is a low-risk, well-understood data-race fix in core IPv6 routing code. The `fib6_info_hw_flags_set()` fix addresses a real logic consistency bug where a sysctl value could change between two reads within the same function. The remaining annotations prevent compiler- induced re-reads of concurrently modifiable values. This type of patch is routinely and safely backported to stable. **YES** net/ipv6/route.c | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index e3a260a5564ba..cd229974b7974 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -2895,7 +2895,7 @@ static void rt6_do_update_pmtu(struct rt6_info *rt, u32 mtu) dst_metric_set(&rt->dst, RTAX_MTU, mtu); rt->rt6i_flags |= RTF_MODIFIED; - rt6_update_expires(rt, net->ipv6.sysctl.ip6_rt_mtu_expires); + rt6_update_expires(rt, READ_ONCE(net->ipv6.sysctl.ip6_rt_mtu_expires)); } static bool rt6_cache_allowed_for_pmtu(const struct rt6_info *rt) @@ -3256,8 +3256,8 @@ static unsigned int ip6_default_advmss(const struct dst_entry *dst) rcu_read_lock(); net = dst_dev_net_rcu(dst); - if (mtu < net->ipv6.sysctl.ip6_rt_min_advmss) - mtu = net->ipv6.sysctl.ip6_rt_min_advmss; + mtu = max_t(unsigned int, mtu, + READ_ONCE(net->ipv6.sysctl.ip6_rt_min_advmss)); rcu_read_unlock(); @@ -3359,10 +3359,10 @@ struct dst_entry *icmp6_dst_alloc(struct net_device *dev, static void ip6_dst_gc(struct dst_ops *ops) { struct net *net = container_of(ops, struct net, ipv6.ip6_dst_ops); - int rt_min_interval = net->ipv6.sysctl.ip6_rt_gc_min_interval; - int rt_elasticity = net->ipv6.sysctl.ip6_rt_gc_elasticity; - int rt_gc_timeout = net->ipv6.sysctl.ip6_rt_gc_timeout; - unsigned long rt_last_gc = net->ipv6.ip6_rt_last_gc; + int rt_min_interval = READ_ONCE(net->ipv6.sysctl.ip6_rt_gc_min_interval); + int rt_elasticity = READ_ONCE(net->ipv6.sysctl.ip6_rt_gc_elasticity); + int rt_gc_timeout = READ_ONCE(net->ipv6.sysctl.ip6_rt_gc_timeout); + unsigned long rt_last_gc = READ_ONCE(net->ipv6.ip6_rt_last_gc); unsigned int val; int entries; @@ -5008,7 +5008,7 @@ void rt6_sync_down_dev(struct net_device *dev, unsigned long event) }; struct net *net = dev_net(dev); - if (net->ipv6.sysctl.skip_notify_on_dev_down) + if (READ_ONCE(net->ipv6.sysctl.skip_notify_on_dev_down)) fib6_clean_all_skip_notify(net, fib6_ifdown, &arg); else fib6_clean_all(net, fib6_ifdown, &arg); @@ -6408,6 +6408,7 @@ void fib6_rt_update(struct net *net, struct fib6_info *rt, void fib6_info_hw_flags_set(struct net *net, struct fib6_info *f6i, bool offload, bool trap, bool offload_failed) { + u8 fib_notify_on_flag_change; struct sk_buff *skb; int err; @@ -6419,8 +6420,9 @@ void fib6_info_hw_flags_set(struct net *net, struct fib6_info *f6i, WRITE_ONCE(f6i->offload, offload); WRITE_ONCE(f6i->trap, trap); + fib_notify_on_flag_change = READ_ONCE(net->ipv6.sysctl.fib_notify_on_flag_change); /* 2 means send notifications only if offload_failed was changed. */ - if (net->ipv6.sysctl.fib_notify_on_flag_change == 2 && + if (fib_notify_on_flag_change == 2 && READ_ONCE(f6i->offload_failed) == offload_failed) return; @@ -6432,7 +6434,7 @@ void fib6_info_hw_flags_set(struct net *net, struct fib6_info *f6i, */ return; - if (!net->ipv6.sysctl.fib_notify_on_flag_change) + if (!fib_notify_on_flag_change) return; skb = nlmsg_new(rt6_nlmsg_size(f6i), GFP_KERNEL); @@ -6529,7 +6531,7 @@ static int ipv6_sysctl_rtcache_flush(const struct ctl_table *ctl, int write, return ret; net = (struct net *)ctl->extra1; - delay = net->ipv6.sysctl.flush_delay; + delay = READ_ONCE(net->ipv6.sysctl.flush_delay); fib6_run_gc(delay <= 0 ? 0 : (unsigned long)delay, net, delay > 0); return 0; } -- 2.51.0