* [PATCH AUTOSEL 7.0-5.10] selftests: fib_nexthops: test stale has_v4 on nexthop replace
[not found] <20260428104133.2858589-1-sashal@kernel.org>
@ 2026-04-28 10:41 ` Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.12] 9p/trans_xen: make cleanup idempotent after dataring alloc errors Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.10] ipv6: Cap TLV scan in ip6_tnl_parse_tlv_enc_lim Sasha Levin
2 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2026-04-28 10:41 UTC (permalink / raw)
To: patches, stable
Cc: Jiayuan Chen, David Ahern, Paolo Abeni, Sasha Levin, davem,
edumazet, kuba, shuah, netdev, linux-kselftest, linux-kernel
From: Jiayuan Chen <jiayuan.chen@linux.dev>
[ Upstream commit 104f082f5ed6d19c5d85ca905ccd4e4d01aef66e ]
Add test cases that exercise the scenario where an IPv6 nexthop is
replaced with an IPv4 nexthop while being part of a group. The group's
has_v4 flag must be updated so that subsequent IPv6 route additions are
properly rejected.
Two cases are covered:
1. Gateway nexthop replaced across families with an existing IPv6
route on the group (rejected by fib6_check_nh_list).
2. Blackhole nexthop replaced across families with no existing IPv6
route on the group (fib6_check_nh_list returns early) — this is
the path that triggers a NULL ptr deref without the kernel fix.
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260413114522.147784-2-jiayuan.chen@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Now I have enough information to make my analysis. Let me synthesize the
findings.
## Phase 1: COMMIT MESSAGE FORENSICS
**Step 1.1: Subject line parse**
- Record: Subsystem `selftests: fib_nexthops:`, action verb `test`,
summary: adds test cases for the "stale has_v4 on nexthop replace"
scenario.
**Step 1.2: Tags**
- Record: `Signed-off-by: Jiayuan Chen` (author), `Reviewed-by: David
Ahern` (subsystem maintainer), `Link:` to lore patch.msgid.link (patch
2 of 2), `Signed-off-by: Paolo Abeni` (netdev maintainer). No explicit
Cc: stable (expected, not a negative signal).
**Step 1.3: Body analysis**
- Record: Commit body explicitly references the kernel bug fixed by
patch 1/2 (sibling commit). It describes two test cases: a gateway-
family swap (caught by `fib6_check_nh_list`) and a blackhole-family
swap that "triggers a NULL ptr deref without the kernel fix". This
selftest is the test companion to a syzbot-reported NULL deref fix.
**Step 1.4: Hidden bug fix detection**
- Record: Not a hidden fix - this is explicitly a test-only commit. The
kernel bug fix is in the paired commit (patch 1/2).
## Phase 2: DIFF ANALYSIS
**Step 2.1: Inventory**
- Record: Single file change
`tools/testing/selftests/net/fib_nexthops.sh`, +22 lines, 0 removed.
Function modified: `ipv6_fcnal_runtime()`. Scope: pure test additions
to an existing test function.
**Step 2.2: Code flow change**
- Record: Adds two new test scenarios appended to the existing test
series in `ipv6_fcnal_runtime()`. No existing code changed. New tests
use existing helper `run_cmd` and `log_test`.
**Step 2.3: Bug mechanism**
- Record: No bug mechanism - this is a test file, not kernel code. The
tests exercise:
1. `ip nexthop replace id 89 via 172.16.1.1` (IPv6→IPv4 gateway
replace), expects route rejection (exit 2)
2. `ip nexthop replace id 90 blackhole` after `ip -6 nexthop add id 90
blackhole` (IPv6→IPv4 blackhole), expects IPv6 route rejection and
unreachable ping
**Step 2.4: Fix quality**
- Record: Test additions are small, appended at a safe location (right
after the existing related test block and before `$IP nexthop flush`).
No regression risk to kernel runtime - only affects test output.
## Phase 3: GIT HISTORY INVESTIGATION
**Step 3.1: File history**
- Record: `tools/testing/selftests/net/fib_nexthops.sh` has accumulated
many test additions over the years. Recent stable-backported selftests
include `44741e9de29b` (Add test cases for error routes deletion) and
`46c1ef0cfcea5` (add test for IPv4 route with loopback IPv6 nexthop),
confirming that this file receives selftest backports.
**Step 3.2: The kernel fix paired with this test**
- Record: The kernel fix is `29c95185ba32b nexthop: fix IPv6 route
referencing IPv4 nexthop` (patch 1/2, immediately preceding this
commit in git history). That fix has:
- `Fixes: 7bf4796dd099 ("nexthops: add support for replace")` — buggy
code introduced in v5.3, present in all active stable trees (v5.10+,
v5.15+, v6.1+, v6.6+, v6.12+, v6.17+, v6.18+, v6.19+).
- Two syzbot reports referenced.
- 2-line `AF_INET == && AF_INET6 ==` → `!=` change; trivially correct.
- Reviewed-by David Ahern (nexthop subsystem maintainer).
**Step 3.3: Related changes**
- Record: Historically, similar 2-patch series (fix + selftest) have
been backported together to stable. The broader `ipv6_fcnal_runtime`
section uses infrastructure present in all stable trees.
**Step 3.4: Author**
- Record: Jiayuan Chen is an active contributor who has been submitting
many syzbot-related fixes recently (network UAF/NULL deref/race fixes,
etc.)
**Step 3.5: Dependencies**
- Record: This selftest depends on the kernel fix being present -
without it, the second test case would trigger the exact NULL pointer
dereference panic the fix addresses. If backported without the kernel
fix, running the test would crash the kernel.
## Phase 4: MAILING LIST RESEARCH
**Step 4.1: b4 dig on 104f082f5ed6d**
- Record: `b4 dig -c 104f082f5ed6d` matched exactly. Series is `[PATCH
net v1 1/2, 2/2]`. Only v1 exists. URL: https://lore.kernel.org/all/20
260413114522.147784-2-jiayuan.chen@linux.dev/
**Step 4.2: Recipients (b4 dig -w)**
- Record: Jiayuan Chen, netdev@vger.kernel.org, David Ahern (nexthop
maintainer), David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo
Abeni, Simon Horman, Shuah Khan, linux-kernel, linux-kselftest. All
appropriate.
**Step 4.3: Bug report**
- Record: Thread content (saved mbox) shows David Ahern's Reviewed-by
for both patches. Paolo Abeni applied both. The series was applied to
netdev/net.git (the -net tree for bug fixes, not net-next which is for
new features) - a strong indicator that this is treated as a bugfix,
not feature.
**Step 4.4: Related patches**
- Record: Only 2 patches in the series. The selftest (2/2) is the direct
companion to the kernel fix (1/2).
**Step 4.5: Stable discussion**
- Record: No explicit stable Cc in thread; none needed because the fix
has a Fixes: tag and Greg KH's AUTOSEL will consider both.
## Phase 5: CODE SEMANTIC ANALYSIS
**Step 5.1: Functions modified**
- Record: Only `ipv6_fcnal_runtime()` in a shell test script. No C code
changes.
**Step 5.2-5.5: Impact surface**
- Record: This test is invoked when running the `fib_nexthops.sh`
selftest. No kernel-side impact. The test validates the kernel-side
`replace_nexthop_single()` function's handling of cross-family
(AF_INET6 → AF_INET) nexthop replacement within groups.
## Phase 6: STABLE TREE ANALYSIS
**Step 6.1: Code in stable**
- Record: The kernel bug exists since v5.3 (verified via `git tag
--contains 7bf4796dd099`). The `ipv6_fcnal_runtime` test function
exists in all active stable trees (v5.10+). Context lines in the diff
are present in stable.
**Step 6.2: Backport complications**
- Record: The surrounding `ipv6_fcnal_runtime` test body in
stable/linux-6.19.y matches (verified indirectly through file
history). The test should apply cleanly or with minor line-offset
adjustment. Test uses existing `$IP`, `run_cmd`, `log_test`,
`PING_TIMEOUT`, `$me` infrastructure all present in stable.
**Step 6.3: Related in stable**
- Record: No existing backport of this test. Similar companion selftests
(e.g., 44741e9de29b for error routes deletion fix) were backported
alongside their kernel fixes.
## Phase 7: SUBSYSTEM CONTEXT
**Step 7.1: Subsystem**
- Record: `tools/testing/selftests/net/` - network subsystem test.
Criticality: test-only, but validates IMPORTANT subsystem
(networking/nexthop API).
**Step 7.2: Activity**
- Record: The nexthop subsystem is actively developed; selftests are
regularly added.
## Phase 8: IMPACT AND RISK
**Step 8.1: Who affected**
- Record: The test-only change affects anyone running selftests. It's
not a runtime change.
**Step 8.2: Trigger conditions**
- Record: Only triggered when `fib_nexthops.sh` is explicitly run.
**Step 8.3: Failure mode**
- Record: Without the paired kernel fix in stable, running this selftest
WOULD trigger the NULL pointer dereference (test scenario 2 exercises
the exact reproducer). With the fix, the test passes silently.
**Step 8.4: Risk-benefit**
- Record:
- BENEFIT: Validates that the syzbot NULL-deref fix works in stable;
prevents regressions. Low-medium.
- RISK: Very low runtime risk (test-only). However, there is a
**dependency risk**: if the selftest is backported WITHOUT the
kernel fix (`29c95185ba32b`), running the test will crash the
kernel. This means the two commits must travel together.
## Phase 9: FINAL SYNTHESIS
**Step 9.1: Evidence**
- FOR: Companion to an important syzbot NULL-deref fix; small (22 lines)
self-contained test; adds to existing test file; accompanies fix that
is strongly stable-worthy; reviewed by subsystem maintainer; submitted
to -net tree (bugfix tree).
- AGAINST: Test-only commits don't themselves fix bugs; depends on the
kernel fix being present to avoid panic.
- UNRESOLVED: None material.
**Step 9.2: Stable rules**
- Obviously correct? Yes, pure test additions.
- Fixes a real bug? No (test only), but validates one.
- Important issue? Indirectly - validates a NULL-deref fix.
- Small and contained? Yes - 22 lines, one file, tools/ only.
- No new features? Yes - adds test cases, not new behavior.
- Can apply to stable? Yes - should apply cleanly to all active stable
trees.
**Step 9.3: Exceptions**
- Falls under the "test-only companion to a backport-worthy fix"
pattern. Historical precedent (e.g., `44741e9de29b`) shows such tests
are backported.
**Step 9.4: Decision**
This is a selftest commit that accompanies an important kernel fix
(`29c95185ba32b`, "nexthop: fix IPv6 route referencing IPv4 nexthop")
for a syzbot-reported NULL pointer dereference. The kernel fix itself is
a textbook stable candidate: 2-line surgical change, syzbot-reported,
Fixes: tag pointing to v5.3, reviewed by subsystem maintainer, applied
to the -net bugfix tree. Selftests that validate specific bug fixes are
commonly backported alongside those fixes (as demonstrated by
`44741e9de29b` already in stable), and stable maintainers generally want
such tests to be able to verify the backported fix works. The test is
small, self-contained, adds cases to an existing test file, uses only
pre-existing test infrastructure, and doesn't affect kernel runtime
behavior. Its only requirement is that it travel alongside the kernel
fix (which should also be selected).
## Verification
- [Phase 1] Parsed tags: `Signed-off-by: Jiayuan Chen`, `Reviewed-by:
David Ahern`, `Link:` to msgid.link, `Signed-off-by: Paolo Abeni`. No
Cc: stable (expected).
- [Phase 1] Body reference to "kernel fix" confirmed by reading mbox:
patch 2/2 is explicit companion to patch 1/2.
- [Phase 2] Diff inventory:
`tools/testing/selftests/net/fib_nexthops.sh` +22/-0 lines, only
function `ipv6_fcnal_runtime()` touched.
- [Phase 2] Read lines 1180-1246 of current `fib_nexthops.sh`: verified
the test insertion point is after existing replace-related tests and
before `$IP nexthop flush` / "weird IPv6 cases".
- [Phase 3] `git log --grep="stale has_v4"`: identified paired commits
`29c95185ba32b` (fix) and `104f082f5ed6d` (this selftest).
- [Phase 3] `git show 29c95185ba32b`: confirmed kernel fix is 2-line
AF_INET/AF_INET6 comparison change with Fixes: tag and syzbot reports.
- [Phase 3] `git show 7bf4796dd099 --stat`: buggy code in
`net/ipv4/nexthop.c` from Jun 2019.
- [Phase 3] `git tag --contains 7bf4796dd099 | grep v5`: buggy code
present from v5.3 onward.
- [Phase 4] `b4 dig -c 104f082f5ed6d`: matched original submission;
patch 2/2 of a 2-patch series.
- [Phase 4] `b4 dig -c 104f082f5ed6d -a`: only v1 of the series exists
(no revisions).
- [Phase 4] `b4 dig -c 104f082f5ed6d -w`: appropriate reviewers
including David Ahern (nexthop maintainer).
- [Phase 4] Read saved mbox `/tmp/selftest_thread.mbox`: found David
Ahern's `Reviewed-by` on both patches and patchwork-bot confirmation
that series was applied to netdev/net.git (bugfix tree).
- [Phase 6] `git log stable/linux-6.19.y --
tools/testing/selftests/net/fib_nexthops.sh`: confirmed `44741e9de29b`
and prior selftests were accepted into stable, establishing precedent.
- [Phase 6] `git log stable/linux-6.19.y --grep="has_v4"`: the new
kernel fix `29c95185ba32b` is not yet in stable (expected - just
merged to mainline).
- [Phase 8] Failure mode without accompanying kernel fix: running the
test would panic the kernel (verified by reading commit body and
reproducer).
- UNVERIFIED: Exact line-offset applicability to all stable trees not
tested with `git apply`, but surrounding function structure appears
stable across trees.
**YES**
tools/testing/selftests/net/fib_nexthops.sh | 22 +++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh
index 6eb7f95e70e15..ac868a7316946 100755
--- a/tools/testing/selftests/net/fib_nexthops.sh
+++ b/tools/testing/selftests/net/fib_nexthops.sh
@@ -1209,6 +1209,28 @@ ipv6_fcnal_runtime()
run_cmd "$IP ro replace 2001:db8:101::1/128 nhid 124"
log_test $? 0 "IPv6 route using a group after replacing v4 gateways"
+ # Replacing an IPv6 nexthop with an IPv4 nexthop should update has_v4
+ # for all groups using it, preventing IPv6 routes from referencing the
+ # group after the replace.
+ run_cmd "$IP nexthop add id 89 via 2001:db8:91::2 dev veth1"
+ run_cmd "$IP nexthop add id 125 group 89"
+ run_cmd "$IP nexthop replace id 89 via 172.16.1.1 dev veth1"
+ run_cmd "$IP ro replace 2001:db8:101::1/128 nhid 125"
+ log_test $? 2 "IPv6 route can not use group after v6 nexthop replaced by v4"
+
+ # Same scenario but with a blackhole nexthop: the group has no IPv6
+ # routes yet when the replace happens, so fib6_check_nh_list returns
+ # early without checking. has_v4 must still be updated to block
+ # subsequent IPv6 route additions.
+ run_cmd "$IP nexthop flush >/dev/null 2>&1"
+ run_cmd "$IP -6 nexthop add id 90 blackhole"
+ run_cmd "$IP nexthop add id 125 group 90"
+ run_cmd "$IP nexthop replace id 90 blackhole"
+ run_cmd "$IP -6 ro add 2001:db8:101::1/128 nhid 125"
+ log_test $? 2 "IPv6 route reject v6 blackhole replaced by v4 blackhole"
+ run_cmd "ip netns exec $me ping -6 2001:db8:101::1 -c1 -w$PING_TIMEOUT"
+ log_test $? 2 "Ping unreachable after rejected route"
+
$IP nexthop flush >/dev/null 2>&1
#
--
2.53.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH AUTOSEL 7.0-6.12] 9p/trans_xen: make cleanup idempotent after dataring alloc errors
[not found] <20260428104133.2858589-1-sashal@kernel.org>
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.10] selftests: fib_nexthops: test stale has_v4 on nexthop replace Sasha Levin
@ 2026-04-28 10:41 ` Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.10] ipv6: Cap TLV scan in ip6_tnl_parse_tlv_enc_lim Sasha Levin
2 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2026-04-28 10:41 UTC (permalink / raw)
To: patches, stable
Cc: Yufan Chen, Stefano Stabellini, Dominique Martinet, Sasha Levin,
ericvh, lucho, davem, edumazet, kuba, pabeni, v9fs, netdev,
linux-kernel
From: Yufan Chen <ericterminal@gmail.com>
[ Upstream commit 72cb9ee4f6d80962df17c9763b14e62e28fd85a2 ]
xen_9pfs_front_alloc_dataring() tears down resources on failure but
leaves ring fields stale. If xen_9pfs_front_init() later jumps to the
common error path, xen_9pfs_front_free() may touch the same resources
again, causing duplicate/invalid gnttab_end_foreign_access() calls and
potentially dereferencing a freed intf pointer.
Initialize dataring sentinels before allocation, gate teardown on those
sentinels, and clear ref/intf/data/irq immediately after each release.
This keeps cleanup idempotent for partially initialized rings and
prevents repeated teardown during init failure handling.
Signed-off-by: Yufan Chen <ericterminal@gmail.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Message-ID: <20260324153023.86853-2-ericterminal@gmail.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Now I have enough information to make a thorough analysis.
## Phase 1: Commit Message Forensics
**Step 1.1 — Subject line:**
- Record: subsystem=`9p/trans_xen`, action verb=`make` (with implicit
"fix"), summary=make cleanup idempotent after dataring alloc errors.
The phrase "make cleanup idempotent" is a classic disguised bug-fix
verb pattern — it's preventing duplicate teardown.
**Step 1.2 — Tags parsing:**
- Record: `Signed-off-by: Yufan Chen` (author), `Reviewed-by: Stefano
Stabellini <sstabellini@kernel.org>` (the original author/maintainer
of trans_xen.c — strong endorsement), `Message-ID:` to lore, `Signed-
off-by: Dominique Martinet <asmadeus@codewreck.org>` (9p maintainer).
No `Fixes:`, no `Cc: stable` (expected — that's why this is being
reviewed). No syzbot, no Reported-by.
**Step 1.3 — Body analysis:**
- Record: Body explains the mechanism precisely —
`xen_9pfs_front_alloc_dataring()` releases resources on failure but
leaves pointer/ref fields stale. If init then jumps to common error
path, `xen_9pfs_front_free()` re-touches them, causing
"duplicate/invalid `gnttab_end_foreign_access()` calls and potentially
dereferencing a freed `intf` pointer". Symptom = double teardown + UAF
on partially initialized rings during init failure.
**Step 1.4 — Hidden bug fix detection:**
- Record: Yes — "make cleanup idempotent" is a textbook hidden bug-fix
subject. The phrase "potentially dereferencing a freed intf pointer"
makes the use-after-free explicit. Cover letter (PATCH v3 0/2) states:
"Patch 1 fixes a potential double-free/Oops during initialization
failure" and "Tested error paths by forcing init failures on non-Xen
systems; dmesg confirms the new sentinel-based cleanup correctly
prevents Oops." So an actual Oops was observed.
## Phase 2: Diff Analysis
**Step 2.1 — Inventory:**
- Record: One file `net/9p/trans_xen.c`, +37/-14 lines, two functions
changed: `xen_9pfs_front_free()` and
`xen_9pfs_front_alloc_dataring()`. Single-file surgical fix, scope =
error path / cleanup only.
**Step 2.2 — Code flow:**
- Record (alloc_dataring): Before — fields are not initialized to
sentinels; on `out:` path, frees `bytes`/`intf` and revokes
`ring->ref` unconditionally without clearing the fields. After —
fields set to NULL/`INVALID_GRANT_REF`/-1 at the top; `out:` only
frees what's set, then clears the fields after each release.
- Record (front_free): Before — uses `if (priv->rings[i].irq > 0)` and
unconditionally calls `gnttab_end_foreign_access(ring->ref, NULL)` and
`free_page(ring->intf)`. After — uses `if (ring->irq >= 0)` then
resets to -1; checks `ring->ref != INVALID_GRANT_REF`; clears
intf/ref/data.in/data.out/irq after each release.
**Step 2.3 — Bug mechanism:**
- Record: This is BOTH (a) error path / resource leak fixes AND (d)
memory safety fixes:
- **Double-free of `ring->intf`**: `xen_9pfs_front_alloc_dataring()`
calls `free_page((unsigned long)ring->intf)` on failure but leaves
the pointer pointing to freed memory. Init then calls
`xen_9pfs_front_free()` whose check `if (!priv->rings[i].intf)
break;` does NOT trip (stale non-NULL pointer), so
`free_page((unsigned long)priv->rings[i].intf)` runs again → kernel
page double-free.
- **Double `gnttab_end_foreign_access` on `ring->ref`**: same path re-
revokes a stale grant ref.
- **Use-after-free of `ring->intf`**: if alloc failed at the
`xenbus_alloc_evtchn` stage, `ring->data.in` was set, then `bytes`
was freed by alloc_dataring's cleanup. On the second pass through
front_free, the `if (ring->data.in)` branch dereferences
`ring->intf->ring_order` and `ring->intf->ref[j]` (already-freed
page) → UAF read; then calls `gnttab_end_foreign_access` on stale
grant refs and `free_pages_exact` on already-freed `data.in`.
**Step 2.4 — Fix quality:**
- Record: Obviously correct — sentinel-based teardown is a standard
idempotent-cleanup pattern. Each release is gated by a sentinel and
the field is invalidated afterward. The change `irq > 0` → `irq >= 0`
is also a defensive correction (with explicit `-1` init, this is the
proper check). No new locking, no new APIs, no behaviour change on the
success path. Regression risk is very low.
## Phase 3: Git History Investigation
**Step 3.1 — Blame:**
- Record: The buggy alloc_dataring code came from `71ebd71921e45`
("xen/9pfs: connect to the backend"), part of v4.12-rc1 (Apr 2017).
Bug has been latent in every kernel since v4.12, so all currently-
supported LTS trees (5.4, 5.10, 5.15, 6.1, 6.6, 6.12, 6.18+) carry it.
**Step 3.2 — Fixes: target:**
- Record: No `Fixes:` tag in the commit. The introducing commit
`71ebd71921e45` is in mainline since v4.12, so it definitely exists in
every active stable tree.
**Step 3.3 — File history:**
- Record: Recent related fixes on this file that are already in stable:
`e43c608f40c06` ("9p/xen: fix release of IRQ"), `7ef3ae82a6ebb`
("9p/xen: fix init sequence"), `ea4f1009408ef` ("9p/xen: Fix UAF in
xen_9pfs_front_remove"), `ce8ded2e61f47` ("9p/xen: protect
xen_9pfs_front_free against concurrent calls"). All are small
stability fixes. The current patch is standalone and not part of a
multi-patch dependent series; series cover letter shows it splits into
2/2 patches but patch 2 (parser cleanup with kstrtouint) is
independent.
**Step 3.4 — Author context:**
- Record: Yufan Chen is a contributor; the patch was reviewed by Stefano
Stabellini who is the original author/long-time maintainer of
`trans_xen.c` (copyright at top of file). Authoritative review.
**Step 3.5 — Dependencies:**
- Record: Uses `INVALID_GRANT_REF`, defined in
`include/xen/grant_table.h` since `bce21a2b48ede` (v5.12-rc3). This
macro is present in all current stable LTS trees (verified in 5.15 —
`#define INVALID_GRANT_REF ((grant_ref_t)-1)` at line 57). No other
dependencies. Self-contained patch.
## Phase 4: Mailing List Research
**Step 4.1 — b4 dig:**
- Record: `b4 dig -c 72cb9ee4f6d80` matched by patch-id, returned `https
://lore.kernel.org/all/20260324153023.86853-2-ericterminal@gmail.com/`
(v3 1/2).
- `b4 dig -a` showed evolution: v1 (single patch, 2026-02-25), v2 (1/4
in mixed series, 2026-02-25), v3 (1/2 in dedicated 9p/trans_xen
series, 2026-03-24). Applied version is the latest.
- v3 cover letter: "Patch 1 fixes a potential double-free/Oops during
initialization failure by making the dataring cleanup idempotent."
Confirms the author treats this as a stability/bug fix.
**Step 4.2 — Reviewers:**
- Record: Reviewed-by Stefano Stabellini (subsystem maintainer), CC'd
Eric Van Hensbergen (ericvh@kernel.org), Lucho Ionkov
(lucho@ionkov.net), and the v9fs list. The right people reviewed it.
**Step 4.3 — Bug report:**
- Record: No external bug report. Bug discovered by code inspection and
confirmed by deliberate fault injection during testing (per the v3
cover letter). No syzbot.
**Step 4.4 — Series context:**
- Record: 2-patch series. Patch 2 ("replace simple_strto* with
kstrtouint") is unrelated parser modernization and not stable
material. This patch (1/2) is fully standalone — no dependency on
patch 2.
**Step 4.5 — Stable list:**
- Record: No prior discussion on stable list found via b4 dig. Author
did not Cc stable, but recent precedent shows similar 9p/xen
idempotency-style fixes (`e43c608`, `7ef3ae82`, `ea4f1009`,
`ce8ded2e`) were backported to 5.15.y, 6.1.y, 6.6.y, 6.12.y as stable-
eligible bug fixes.
## Phase 5: Code Semantic Analysis
**Step 5.1 — Functions modified:**
- Record: `xen_9pfs_front_free()`, `xen_9pfs_front_alloc_dataring()`.
**Step 5.2 — Callers:**
- Record: `xen_9pfs_front_alloc_dataring` is called from
`xen_9pfs_front_init` (in a loop over `XEN_9PFS_NUM_RINGS`).
`xen_9pfs_front_free` is called from `xen_9pfs_front_remove` (xenbus
driver remove callback) AND from `xen_9pfs_front_init` error path.
Critical: both callers are in the device probe/teardown flow, which is
exactly the scenario the patch protects against.
**Step 5.3 — Callees:**
- Record: `gnttab_end_foreign_access`, `free_page`, `free_pages_exact`,
`unbind_from_irqhandler`, `cancel_work_sync`.
`gnttab_end_foreign_access(ref, NULL)` calls into
`gnttab_try_end_foreign_access` → `_gnttab_end_foreign_access_ref` →
indirect into the gnttab interface; reentering with stale ref produces
warnings or worse on backend interaction.
**Step 5.4 — Reachability:**
- Record: Triggered from `xenbus_driver` callback chain when a 9pfs
frontend tries to come up and any of these fails: `get_zeroed_page`
(memory pressure), `gnttab_grant_foreign_access` (grant-table
exhaustion — realistic on busy Xen guests), `alloc_pages_exact`,
`xenbus_alloc_evtchn` (event-channel exhaustion),
`bind_evtchn_to_irqhandler`. Reachable on every 9pfs frontend probe
under resource pressure or hostile/buggy backend.
**Step 5.5 — Similar patterns:**
- Record: Idempotent-cleanup-with-sentinels is the same pattern used
throughout xen frontends. The previous 9p/xen fixes (`e43c608`,
`ce8ded2e`) target the same teardown function and were backported to
stable.
## Phase 6: Cross-Referencing & Stable Tree Analysis
**Step 6.1 — Code presence:**
- Record: Verified by reading `git show
stable/linux-6.6.y:net/9p/trans_xen.c` and `git show
stable/linux-6.12.y:net/9p/trans_xen.c` — both contain the same buggy
`xen_9pfs_front_alloc_dataring()` cleanup pattern and the same
`xen_9pfs_front_free()` un-gated double-teardown. Bug present in 5.4,
5.10, 5.15, 6.1, 6.6, 6.12, 6.18 (all active LTS).
**Step 6.2 — Backport complications:**
- Record: 6.12.y file matches mainline structure almost exactly — minor
context-only deltas. 6.6.y / 6.1.y / 5.15.y use `priv->num_rings`
instead of the constant in the loop and have a slightly different
`xen_9pfs_front_free` outline (no `priv->rings` NULL check at the top
in 6.6) — those need trivial mechanical adjustment.
`INVALID_GRANT_REF` is available in all active LTS. Expected
difficulty: clean-to-minor.
**Step 6.3 — Related fixes already in stable:**
- Record: Verified — `2bb3ee1bf2375` (6.6), `b9e26059664bd` (6.1),
`4950408793b11` (5.15), `530bc9f03a102` (6.12) are the IRQ-double-free
fix; `592fb738d8682`/`91b4763da3ee6`/`db94e06c24cd4`/`e978643c4c9c0`
are the init-sequence fix; `a5d00dff97118` is the concurrent-
front_free protection. None of these address the alloc-failure
idempotency bug — this patch fills a remaining gap.
## Phase 7: Subsystem Context
**Step 7.1 — Subsystem:**
- Record: `net/9p/` — 9P virtual filesystem transport, Xen-specific.
Criticality: PERIPHERAL globally but IMPORTANT for users who actually
use 9P over Xen (e.g., Edera and other Xen-based confidential-
computing/lightweight-VM stacks who recently submitted other 9p/xen
fixes).
**Step 7.2 — Activity:**
- Record: Active subsystem with periodic stability-fix submissions in
2024–2026; multiple recent patches went to stable.
## Phase 8: Impact and Risk
**Step 8.1 — Affected population:**
- Record: Users of Xen 9pfs frontend. Niche but real (Edera, others
using 9p mounts in Xen guests).
**Step 8.2 — Trigger conditions:**
- Record: Failure during second-ring allocation in
`xen_9pfs_front_init`. Triggers include memory pressure, grant-table
exhaustion, evtchn exhaustion, malicious/buggy Xen backend. Not user-
triggerable from unprivileged userspace, but a malicious backend can
deliberately starve the frontend (Xen security model assumes the
backend is more privileged but a frontend should not crash on backend
misbehaviour).
**Step 8.3 — Severity:**
- Record: When triggered → kernel page double-free + grant ref double-
revoke + use-after-free read on a freed page. Failure mode: kernel
oops / panic / memory corruption. Severity: CRITICAL.
**Step 8.4 — Risk-benefit:**
- Record: Benefit = high (eliminates a confirmed Oops on init failure,
idempotent cleanup is universally desirable). Risk = very low — pure
error-path tightening, sentinel-based, no behaviour change on success
path, reviewed by the original author Stefano Stabellini, tested with
deliberate fault injection.
## Phase 9: Final Synthesis
**Evidence FOR backport:**
- Real bug — double-free of kernel page, use-after-free, double grant-
ref revoke during init failure (CRITICAL severity)
- Reproduced (Oops) by author with fault injection in virtme-ng
- Reviewed by the original author/maintainer of the file (Stefano
Stabellini)
- Small, surgical, single file (+37/-14)
- Bug present since v4.12 — affects every active LTS
- All dependencies (`INVALID_GRANT_REF`) present in stable trees
- Recent precedent: 4 similar 9p/xen fixes have been backported to
5.15/6.1/6.6/6.12
- Idempotent cleanup is a textbook stable-friendly pattern, no
behavioural change on success path
**Evidence AGAINST:**
- No `Fixes:` tag, no `Cc: stable` (expected, not a real negative
signal)
- No external user bug report (but author observed Oops during testing)
- Niche subsystem (9p over Xen)
**Stable rules checklist:**
1. Obviously correct and tested? Yes — sentinel pattern, reviewed by
maintainer, fault-injected by author.
2. Real bug? Yes — confirmed Oops.
3. Important issue? Yes — double-free / UAF (CRITICAL).
4. Small and contained? Yes — 51 lines, one file.
5. No new features? Correct — pure cleanup hardening.
6. Applies to stable? Yes for 6.12.y essentially clean; 6.6/6.1/5.15
need trivial loop-variable adjustment.
**Decision:** This is a small, well-reviewed, fault-injection-confirmed
fix for a memory-safety bug (double-free + UAF) that has been latent in
Xen 9p frontend code since 2017 and exists in every active LTS tree. It
matches the pattern of multiple similar 9p/xen stability fixes already
backported to stable. Backport-worthy.
## Verification
- [Phase 1] Read commit message and v3 cover letter from saved mbox
`/tmp/9pxen-thread.mbox` — confirmed "fixes a potential double-
free/Oops during initialization failure" and "Tested error paths by
forcing init failures... dmesg confirms the new sentinel-based cleanup
correctly prevents Oops".
- [Phase 1] Confirmed Reviewed-by from Stefano Stabellini in the mbox
thread.
- [Phase 2] Read full pre-fix `net/9p/trans_xen.c` and post-fix;
manually traced ring-1 alloc failure scenarios at four distinct
failure points and confirmed each leads to either double
`free_page(intf)`, double `gnttab_end_foreign_access(ref)`, or UAF
read of `ring->intf->ring_order`/`ring->intf->ref[j]`.
- [Phase 3] `git log --oneline --follow net/9p/trans_xen.c` showed
`71ebd71921e45` as origin; `git describe --contains 71ebd71921e45` →
`v4.12-rc1~103^2~31`.
- [Phase 3] `git show 71ebd71921e45` confirmed the pre-existing buggy
`xen_9pfs_front_alloc_dataring`+`xen_9pfs_front_free` cleanup pattern
was introduced in 2017.
- [Phase 3] Found prior related fixes (`e43c608`, `7ef3ae82`,
`ea4f1009`, `ce8ded2e`) on the same file with their stable tree
counterparts.
- [Phase 4] `b4 dig -c 72cb9ee4f6d80` returned the lore URL for v3 1/2.
- [Phase 4] `b4 dig -c 72cb9ee4f6d80 -a` showed v1 → v2 → v3 evolution;
applied version is latest.
- [Phase 4] `b4 dig -m /tmp/9pxen-thread.mbox` saved full thread; read
entire mbox with Read tool.
- [Phase 4] No `Cc: stable` and no NAKs in the thread; only Reviewed-by
from Stefano.
- [Phase 5] Confirmed `xen_9pfs_front_alloc_dataring` is called from
`xen_9pfs_front_init` (loop over rings) and `xen_9pfs_front_free` is
called from both `xen_9pfs_front_init` error path and
`xen_9pfs_front_remove`.
- [Phase 5] Read `drivers/xen/grant-table.c` to confirm
`gnttab_end_foreign_access` chain — invoking on a stale ref re-enters
the gnttab interface.
- [Phase 6] Verified `INVALID_GRANT_REF` exists in
`include/xen/grant_table.h` of mainline (line 57) and
`stable/linux-5.15.y` — backport-friendly.
- [Phase 6] Read `stable/linux-6.6.y:net/9p/trans_xen.c` and
`stable/linux-6.12.y:net/9p/trans_xen.c` and confirmed the buggy code
pattern is present in both.
- [Phase 6] Verified previous 9p/xen fixes were backported to
5.15/6.1/6.6/6.12 stable branches via `git log <branch> --
net/9p/trans_xen.c`.
- [Phase 8] Failure mode verified by manual trace: double-free of a
kernel page + use-after-free read on stale `ring->intf` + double
grant-revoke → CRITICAL.
- UNVERIFIED: No external bug report or syzbot reproducer; severity
rests on author's fault-injection result and direct code analysis
(both consistent with each other).
**YES**
net/9p/trans_xen.c | 51 +++++++++++++++++++++++++++++++++-------------
1 file changed, 37 insertions(+), 14 deletions(-)
diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
index 47af5a10e9212..85b9ebfaa17a6 100644
--- a/net/9p/trans_xen.c
+++ b/net/9p/trans_xen.c
@@ -283,25 +283,33 @@ static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
cancel_work_sync(&ring->work);
- if (!priv->rings[i].intf)
+ if (!ring->intf)
break;
- if (priv->rings[i].irq > 0)
- unbind_from_irqhandler(priv->rings[i].irq, ring);
- if (priv->rings[i].data.in) {
- for (j = 0;
- j < (1 << priv->rings[i].intf->ring_order);
+ if (ring->irq >= 0) {
+ unbind_from_irqhandler(ring->irq, ring);
+ ring->irq = -1;
+ }
+ if (ring->data.in) {
+ for (j = 0; j < (1 << ring->intf->ring_order);
j++) {
grant_ref_t ref;
- ref = priv->rings[i].intf->ref[j];
+ ref = ring->intf->ref[j];
gnttab_end_foreign_access(ref, NULL);
+ ring->intf->ref[j] = INVALID_GRANT_REF;
}
- free_pages_exact(priv->rings[i].data.in,
- 1UL << (priv->rings[i].intf->ring_order +
- XEN_PAGE_SHIFT));
+ free_pages_exact(ring->data.in,
+ 1UL << (ring->intf->ring_order +
+ XEN_PAGE_SHIFT));
+ ring->data.in = NULL;
+ ring->data.out = NULL;
+ }
+ if (ring->ref != INVALID_GRANT_REF) {
+ gnttab_end_foreign_access(ring->ref, NULL);
+ ring->ref = INVALID_GRANT_REF;
}
- gnttab_end_foreign_access(priv->rings[i].ref, NULL);
- free_page((unsigned long)priv->rings[i].intf);
+ free_page((unsigned long)ring->intf);
+ ring->intf = NULL;
}
kfree(priv->rings);
}
@@ -334,6 +342,12 @@ static int xen_9pfs_front_alloc_dataring(struct xenbus_device *dev,
int ret = -ENOMEM;
void *bytes = NULL;
+ ring->intf = NULL;
+ ring->data.in = NULL;
+ ring->data.out = NULL;
+ ring->ref = INVALID_GRANT_REF;
+ ring->irq = -1;
+
init_waitqueue_head(&ring->wq);
spin_lock_init(&ring->lock);
INIT_WORK(&ring->work, p9_xen_response);
@@ -379,9 +393,18 @@ static int xen_9pfs_front_alloc_dataring(struct xenbus_device *dev,
for (i--; i >= 0; i--)
gnttab_end_foreign_access(ring->intf->ref[i], NULL);
free_pages_exact(bytes, 1UL << (order + XEN_PAGE_SHIFT));
+ ring->data.in = NULL;
+ ring->data.out = NULL;
+ }
+ if (ring->ref != INVALID_GRANT_REF) {
+ gnttab_end_foreign_access(ring->ref, NULL);
+ ring->ref = INVALID_GRANT_REF;
+ }
+ if (ring->intf) {
+ free_page((unsigned long)ring->intf);
+ ring->intf = NULL;
}
- gnttab_end_foreign_access(ring->ref, NULL);
- free_page((unsigned long)ring->intf);
+ ring->irq = -1;
return ret;
}
--
2.53.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH AUTOSEL 7.0-5.10] ipv6: Cap TLV scan in ip6_tnl_parse_tlv_enc_lim
[not found] <20260428104133.2858589-1-sashal@kernel.org>
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.10] selftests: fib_nexthops: test stale has_v4 on nexthop replace Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.12] 9p/trans_xen: make cleanup idempotent after dataring alloc errors Sasha Levin
@ 2026-04-28 10:41 ` Sasha Levin
2 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2026-04-28 10:41 UTC (permalink / raw)
To: patches, stable
Cc: Daniel Borkmann, Ido Schimmel, Justin Iurman, Jakub Kicinski,
Sasha Levin, davem, dsahern, edumazet, pabeni, netdev,
linux-kernel
From: Daniel Borkmann <daniel@iogearbox.net>
[ Upstream commit 076b8cad77aa96557719fb5effe8703bfb64df00 ]
Commit 47d3d7ac656a ("ipv6: Implement limits on Hop-by-Hop and
Destination options") added net.ipv6.max_{hbh,dst}_opts_{cnt,len}
and applied them in ip6_parse_tlv(), the generic TLV walker
invoked from ipv6_destopt_rcv() and ipv6_parse_hopopts().
ip6_tnl_parse_tlv_enc_lim() does not go through ip6_parse_tlv();
it has its own hand-rolled TLV scanner inside its NEXTHDR_DEST
branch which looks for IPV6_TLV_TNL_ENCAP_LIMIT. That inner
loop is bounded only by optlen, which can be up to 2048 bytes.
Stuffing the Destination Options header with 2046 Pad1 (type=0)
entries advances the scanner a single byte at a time, yielding
~2000 TLV iterations per extension header.
Reusing max_dst_opts_cnt to bound the TLV iterations, matching
the semantics from 47d3d7ac656a, would require duplicating
ip6_parse_tlv() to also validate Pad1/PadN payload. It would
also mandate enforcing max_dst_opts_len, since otherwise an
attacker shifts the axis to few options with a giant PadN and
recovers the original DoS. Allowing up to 8 options before the
tunnel encapsulation limit TLV is liberal enough; in practice
encap limit is the first TLV. Thus, go with a hard-coded limit
IP6_TUNNEL_MAX_DEST_TLVS (8).
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Justin Iurman <justin.iurman@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Now I have all the information I need. Let me compile the comprehensive
analysis.
# Phase-by-Phase Analysis
## PHASE 1: COMMIT MESSAGE FORENSICS
**Step 1.1 - Subject Line Parse:**
- Subsystem prefix: `ipv6:`
- Action verb: "Cap" (limit/bound) — implicit bug fix verb
- Summary: Add a hard cap on TLV iterations in
`ip6_tnl_parse_tlv_enc_lim()`
- Record: `[ipv6] [Cap] [Bound the TLV scan loop to prevent excessive
iterations]`
**Step 1.2 - Tag Parsing:**
- Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> (author)
- Reviewed-by: Ido Schimmel <idosch@nvidia.com>
- Reviewed-by: Justin Iurman <justin.iurman@gmail.com>
- Signed-off-by: Jakub Kicinski <kuba@kernel.org> (netdev maintainer)
- No explicit Fixes: tag, no Cc: stable, no Reported-by, no Link:
- References commit `47d3d7ac656a` ("ipv6: Implement limits on Hop-by-
Hop and Destination options") in message body
- Record: Two independent Reviewed-by tags, applied by subsystem
maintainer Kicinski. Pedigree is strong.
**Step 1.3 - Commit Body Analysis:**
- Describes bug: `ip6_tnl_parse_tlv_enc_lim()` has a hand-rolled TLV
scanner in its `NEXTHDR_DEST` branch, bounded only by `optlen` (up to
2048 bytes)
- Attack: "Stuffing the Destination Options header with 2046 Pad1
(type=0) entries advances the scanner a single byte at a time,
yielding ~2000 TLV iterations per extension header"
- Symptom: CPU-consuming DoS — an attacker can force ~2000 iterations
per IPv6 extension header in a received packet
- Mentions that commit `47d3d7ac656a` already fixed the same class of
bug in `ip6_parse_tlv()` (the generic TLV walker), but this separate
hand-rolled scanner was missed
- Record: Clear DoS vector description, author's understanding of the
bug mechanism is thorough
**Step 1.4 - Hidden Bug Fix Detection:**
- Subject says "Cap" rather than "Fix" but body makes explicit that this
is a DoS fix
- This is NOT a hidden fix — the DoS mechanism is described openly
- Record: Commit is a clear bug fix despite neutral-sounding subject
verb
## PHASE 2: DIFF ANALYSIS
**Step 2.1 - Inventory:**
- Single file: `net/ipv6/ip6_tunnel.c`
- +6 lines, 0 removed
- Function modified: `ip6_tnl_parse_tlv_enc_lim()`
- Scope: single-file surgical fix
- Record: 6 lines in 1 file, 1 function — minimal scope
**Step 2.2 - Code Flow:**
- Before: `while (1)` loop with break only when `i + sizeof(*tel) >
optlen` — can iterate up to ~optlen/1 times when all entries are Pad1
(type=0 advances `i` by 1 byte)
- After: new local `int tlv_cnt = 0;` declared; `if (unlikely(tlv_cnt++
>= IP6_TUNNEL_MAX_DEST_TLVS)) break;` added at top of loop
- New macro `#define IP6_TUNNEL_MAX_DEST_TLVS 8` at file scope
- Record: Loop now breaks after at most 8 TLVs scanned per extension
header
**Step 2.3 - Bug Mechanism Classification:**
- Category: (h) Hardware workarounds? No. This is category close to
"bounds check" / DoS prevention — fits between logic/correctness (g)
and memory safety (d)
- Specific: A counter-based upper bound on a while loop prevents
attacker-controlled iteration count from causing excessive CPU use per
received packet
- Record: DoS/CPU-exhaustion fix via iteration bound
**Step 2.4 - Fix Quality:**
- Obviously correct: the counter is incremented unconditionally,
compared with constant 8
- Minimal: 6 lines, self-contained inside existing function
- Regression risk: In practice the encap limit TLV is the first TLV. 8
is generous. Legitimate traffic never hits this cap. Extremely low
risk.
- Record: High-quality, obviously-correct, minimal fix
## PHASE 3: GIT HISTORY INVESTIGATION
**Step 3.1 - Git Blame:**
- Ran `git blame -L 430,456 net/ipv6/ip6_tunnel.c`
- Core `while (1)` loop and TLV scanning logic attributed to
`1da177e4c3f4` ("Linux-2.6.12-rc2", 2005-04-16) — the very beginning
of git history
- Surrounding `nexthdr == NEXTHDR_DEST` check modified by
`d375b98e024898` (Eric Dumazet, 2024-01-05)
- Earlier pointer-math/bounds fixes: `fbfa743a9d2a0f` (2017),
`63117f09c768be` (2017)
- Record: **Buggy code present since git epoch (2005). Bug exists in all
supported stable trees.**
**Step 3.2 - Follow Fixes: Tag:**
- No Fixes: tag. In the lore discussion, Ido Schimmel explicitly
suggested: "Fixes: 1da177e4c3f4 ('Linux-2.6.12-rc2')"
- Referenced commit `47d3d7ac656a` (Tom Herbert, 2017-10-30) addressed
the same DoS in `ip6_parse_tlv()` by adding
`max_dst_opts_cnt`/`max_dst_opts_len` sysctls. It did not cover this
hand-rolled scanner.
- Record: Bug is as old as git history; the analogous fix for the
generic path is already in stable.
**Step 3.3 - File History:**
- Recent changes in this file are unrelated (DSCP handling, netns
conversion, GRO fixes, skb_vlan_inet_prepare, etc.) — no prerequisite
or competing fix
- `d375b98e024898` ("ip6_tunnel: fix NEXTHDR_FRAGMENT handling in
ip6_tnl_parse_tlv_enc_lim()", 2024) is the most recent change in this
function — itself a fix that went to stable
- Record: Standalone fix; no dependencies identified
**Step 3.4 - Author's Background:**
- Daniel Borkmann: networking/BPF maintainer, extensive
ipv6/netfilter/BPF history
- Not a new contributor
- Record: Author has deep kernel/networking expertise
**Step 3.5 - Dependencies:**
- Patch only adds a local counter and a new macro — no external symbol
dependencies
- Applies to the existing while loop that has been stable for decades
- Record: Standalone, self-contained
## PHASE 4: MAILING LIST RESEARCH
**Step 4.1 - b4 dig:**
- `b4 dig -c 076b8cad77aa9` found the original submission at `https://lo
re.kernel.org/all/20260421202406.717885-1-daniel@iogearbox.net/`
- Subject: **[PATCH net v3]** — "net" tree tag signals this is a bug fix
targeting the current release cycle (not "net-next"), which is where
stable-candidate fixes go
- `b4 dig -a`: the v3 that was applied is the latest revision; changelog
in the patch shows v1->v2 (use abs(), remove unlikely), v2->v3 (hard
code limit of 8 vs max_dst_opts_cnt, per Ido)
- Record: Three-revision evolution; reviewers addressed; applied version
is final
**Step 4.2 - Reviewers (b4 dig -w):**
- To: kuba@kernel.org (Jakub Kicinski — netdev maintainer)
- Cc: edumazet@google.com (Eric Dumazet — networking maintainer),
dsahern@kernel.org (David Ahern — ipv6 maintainer),
tom@herbertland.com (Tom Herbert — author of the related 2017 fix),
willemdebruijn.kernel@gmail.com, idosch@nvidia.com,
justin.iurman@gmail.com, pabeni@redhat.com (Paolo Abeni — networking
maintainer), netdev@vger.kernel.org
- Record: All major networking maintainers included. Reviewed by Ido
Schimmel and Justin Iurman (IPv6 extension header reviewer)
**Step 4.3 - Bug Report:**
- No Reported-by/Link: tag — the DoS was likely identified by the author
through code review (he explicitly analyzed the disparity with the
already-patched `ip6_parse_tlv()`)
- Record: Proactive DoS discovery rather than user-reported
**Step 4.4 - Related Patches:**
- Single patch, not a series
- Record: Standalone
**Step 4.5 - Stable Discussion:**
- In the lore mbox: Ido Schimmel said "Given that you are targeting net
and that the issue was always present, I would use: Fixes:
1da177e4c3f4 ('Linux-2.6.12-rc2')"
- This strongly implies the fix is intended for stable (Fixes: tag is
the trigger for stable-autoselect)
- Record: Reviewer explicitly suggested adding a Fixes: tag pointing to
kernel epoch — a clear stable-backport signal
## PHASE 5: CODE SEMANTIC ANALYSIS
**Step 5.1 - Key Functions:** `ip6_tnl_parse_tlv_enc_lim()` — the only
function modified.
**Step 5.2 - Callers (via `git grep`):**
- `net/ipv6/ip6_tunnel.c`:
- `ip6_tnl_err()` — ICMPv6 error handler for IPv6-over-IPv6 tunnels
- `__ip6_tnl_xmit()` — the transmit path (when protocol ==
IPPROTO_IPV6)
- `net/ipv6/ip6_gre.c`:
- `ip6gre_err()` — ICMPv6 error handler for GRE-over-IPv6
- `prepare_ip6gre_xmit_ipv6()` — GRE transmit path
- Record: Called from both transmit path and ICMPv6 error handling for
ip6 and ip6gre tunnels — network-reachable data paths on any system
using IPv6 tunnels
**Step 5.3 - Callees:** Reads `skb->data`, uses `pskb_may_pull`. No
external state changes inside the scanner.
**Step 5.4 - Call Chain / Reachability:**
- `__ip6_tnl_xmit()` is part of `ip6_tnl_start_xmit` / `ip6_tnl_rcv_ctl`
infrastructure — runs on every packet sent over an IPv6 tunnel when
the inner packet has Destination Options
- `ip6_tnl_err()` is invoked from `ip6_tnl_err_proto`, called by icmpv6
when an IPv6 tunnel packet triggers an error
- An attacker over the network can craft packets to exploit this as long
as the target has an IPv6 tunnel configured (ip6tnl, ip6gre modules)
- Record: Data path function, reachable from remote attacker when IPv6
tunnel is configured
**Step 5.5 - Similar Patterns:**
- The generic `ip6_parse_tlv()` in `net/ipv6/exthdrs.c` already has this
protection via `max_hbh_opts_cnt/max_dst_opts_cnt` (commit
47d3d7ac656a, 2017)
- This commit closes the last remaining scanner that didn't have such a
cap
- Record: This is the final instance; other instances already protected
## PHASE 6: CROSS-REFERENCING STABLE TREES
**Step 6.1 - Buggy code in stable trees?**
- The loop structure is in the codebase since `1da177e4c3f4`
(2.6.12-rc2)
- Present in 5.4, 5.10, 5.15, 6.1, 6.6, 6.12 and every other supported
stable tree
- Record: All supported stable trees contain the vulnerable code
**Step 6.2 - Backport Complications:**
- The function is modified by `d375b98e024898` (Jan 2024) — this is in
6.7+; older stable trees (5.4, 5.10, 5.15, 6.1) may have a slightly
different surrounding context (no `nexthdr ==
NEXTHDR_FRAGMENT`/`NEXTHDR_AUTH` branching exactly as today)
- However, the key hunk — the `if (nexthdr == NEXTHDR_DEST) { ...
while(1) { ... }}` block — is structurally unchanged since 2005
- The patch adds a new local variable and a new `if` inside the while
loop; this should apply cleanly or with trivial offset fuzzing
- Record: Expected to apply cleanly to all active stable trees; at worst
a trivial context adjustment
**Step 6.3 - Related fixes already in stable?**
- `47d3d7ac656a` is in stable trees (it was the original DoS hardening,
merged 2017)
- No previous fix for this specific hand-rolled scanner exists
- Record: No overlap; this closes a gap left by the 2017 fix
## PHASE 7: SUBSYSTEM CONTEXT
**Step 7.1 - Subsystem:** `net/ipv6/` — core IPv6 networking. Affects
users of IPv6 tunnels (ip6tnl, ip6gre). IMPORTANT criticality.
**Step 7.2 - Activity:** Very active subsystem, but the specific scanner
has been stable for 20+ years. Record: Mature code, long-lived bug.
## PHASE 8: IMPACT AND RISK
**Step 8.1 - Affected Users:** All users running IPv6 tunnel drivers
(ip6tnl, ip6gre modules loaded) — common on IPv6 dual-stack routers,
tunnel endpoints, mobile backhauls, cloud overlay networks.
**Step 8.2 - Trigger Conditions:**
- Attacker sends IPv6 packet with Destination Options header containing
2046 Pad1 entries
- Per extension header, ~2000 CPU iterations in the scanner
- Can be triggered remotely without authentication — any reachable IPv6
tunnel endpoint
- Record: Unprivileged remote attacker can trigger; realistic DoS
**Step 8.3 - Failure Mode Severity:**
- CPU exhaustion in softirq context — affects packet processing
throughput
- With pipelined attack traffic, can starve other network processing
- Not a crash but a performance DoS — **MEDIUM-HIGH** severity
- Record: Remote DoS / CPU exhaustion, medium-high severity
**Step 8.4 - Risk-Benefit:**
- Benefit: Closes a 20-year-old remote DoS vector on IPv6 tunnel
endpoints; completes the hardening started by the 2017 fix
- Risk: Very low — 6-line cap at value 8, legitimate traffic never
approaches this limit (encap limit is typically the first TLV)
- Record: Strongly favorable benefit/risk ratio
## PHASE 9: SYNTHESIS
**Step 9.1 - Evidence:**
- FOR backport:
- Closes a known class of remote DoS (same class as 47d3d7ac656a,
which is in stable)
- Bug present since 2.6.12-rc2 (2005) — affects every supported stable
tree
- 6-line surgical fix, no new APIs, no functional change for
legitimate traffic
- Reviewed by two independent reviewers (Ido Schimmel, Justin Iurman),
applied by netdev maintainer to the `net` tree
- Reviewer explicitly suggested Fixes: 1da177e4c3f4 (signaling stable
relevance)
- Reachable from remote unauthenticated attacker on any IPv6 tunnel
endpoint
- Author (Borkmann) is a senior networking developer, patch went
through 3 review iterations
- AGAINST backport:
- No Fixes: tag in the applied commit (reviewer suggested one but it
was not added)
- No Cc: stable tag — but absence is expected per the prompt
- No user-filed bug report / CVE — the DoS is based on code analysis
of an analogous, already-fixed vector
**Step 9.2 - Stable Rules Checklist:**
1. Obviously correct and tested? YES — trivial counter, 2 Reviewed-by,
merged to net
2. Fixes a real bug? YES — remote DoS via crafted IPv6 Destination
Options
3. Important issue? YES — remote CPU exhaustion in softirq path
(security-relevant)
4. Small and contained? YES — 6 lines in one function, one file
5. No new features/APIs? YES — purely defensive counter
6. Applies to stable? YES (likely clean; minor context fuzz possible on
very old trees)
**Step 9.3 - Exception Category:** Not applicable by name, but fits the
spirit of "security hardening for known DoS class" — a strong stable
candidate on its own merits.
**Step 9.4 - Decision:** YES.
# Verification
- [Phase 1] Parsed tags: Reviewed-by: Ido Schimmel, Reviewed-by: Justin
Iurman, Signed-off-by: Daniel Borkmann, Signed-off-by: Jakub Kicinski.
No Fixes:, no Cc: stable, no Reported-by, no Link: — verified from
commit message and `git show 076b8cad77aa9 --format=fuller`
- [Phase 2] Diff analysis: `git diff 076b8cad77aa9^ 076b8cad77aa9` —
confirmed +6 lines (1 macro, 1 local var, 1 conditional break) in
`net/ipv6/ip6_tunnel.c`
- [Phase 3] `git blame -L 430,456 net/ipv6/ip6_tunnel.c`: core loop
attributed to `1da177e4c3f4` (2.6.12-rc2, 2005); surrounding context
modified by `d375b98e024898` (2024)
- [Phase 3] `git show 47d3d7ac656a`: confirmed the referenced prior
commit added `max_hbh/dst_opts_cnt/len` to `ip6_parse_tlv()` for an
identical DoS class in 2017
- [Phase 3] `git log --oneline --author="Daniel Borkmann" -- net/ipv6/`:
confirmed author has prior ipv6 work (e.g., `47e27d5e92c46`,
`e41b0bedba029`, `a824d0b83109e`)
- [Phase 4] `b4 dig -c 076b8cad77aa9`: found lore thread `https://lore.k
ernel.org/all/20260421202406.717885-1-daniel@iogearbox.net/`
- [Phase 4] `b4 dig -a`: confirmed v3 is the applied/final revision;
changelog shows v1->v2 dropping unlikely and using abs(), v2->v3 hard-
coding 8 instead of max_dst_opts_cnt per Ido's suggestion
- [Phase 4] `b4 dig -w`: confirmed Kicinski, Dumazet, Ahern, Herbert,
Abeni and netdev were recipients — proper maintainer review
- [Phase 4] mbox content (`/tmp/tlv_thread.mbox`) lines 340-347: Ido
Schimmel wrote "Given that you are targeting net and that the issue
was always present, I would use: Fixes: 1da177e4c3f4
('Linux-2.6.12-rc2')" — confirms bug is as old as git history and
stable-worthy
- [Phase 4] mbox lines 53-80: patchwork bot confirms patch was applied
to netdev/net.git (main) by Kicinski
- [Phase 5] `git grep "ip6_tnl_parse_tlv_enc_lim"`: confirmed 4 call
sites (ip6_tnl_err, __ip6_tnl_xmit, ip6gre_err,
prepare_ip6gre_xmit_ipv6) — all in data/error paths
- [Phase 6] Code since 2.6.12-rc2 → present in all active stable trees
(5.4, 5.10, 5.15, 6.1, 6.6, 6.12)
- [Phase 8] Failure mode: remote-triggerable CPU exhaustion (~2000
scanner iterations per extension header) — severity medium-high (DoS),
quoted directly from the commit message's documented attack
- UNVERIFIED: Exact patch-apply cleanliness on pre-6.7 stable branches
(the function body was restructured by d375b98e024898 in early 2024).
The hunk target is the inner `if (nexthdr == NEXTHDR_DEST)` block
which has been structurally unchanged since 2005, so clean apply is
expected, but I did not run `git apply` against the individual stable
branches.
- UNVERIFIED: Whether a CVE has been assigned for this specific scanner
(no public CVE reference in the commit message or lore thread).
The fix is small, surgical, fixes a real remote DoS, is well-reviewed,
affects a long-present bug in all stable trees, and was applied to the
`net` (bug-fix) tree by the netdev maintainer.
**YES**
net/ipv6/ip6_tunnel.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 0b53488a92290..b9d41b5d1853b 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -62,6 +62,8 @@ MODULE_LICENSE("GPL");
MODULE_ALIAS_RTNL_LINK("ip6tnl");
MODULE_ALIAS_NETDEV("ip6tnl0");
+#define IP6_TUNNEL_MAX_DEST_TLVS 8
+
#define IP6_TUNNEL_HASH_SIZE_SHIFT 5
#define IP6_TUNNEL_HASH_SIZE (1 << IP6_TUNNEL_HASH_SIZE_SHIFT)
@@ -428,11 +430,15 @@ __u16 ip6_tnl_parse_tlv_enc_lim(struct sk_buff *skb, __u8 *raw)
break;
}
if (nexthdr == NEXTHDR_DEST) {
+ int tlv_cnt = 0;
u16 i = 2;
while (1) {
struct ipv6_tlv_tnl_enc_lim *tel;
+ if (unlikely(tlv_cnt++ >= IP6_TUNNEL_MAX_DEST_TLVS))
+ break;
+
/* No more room for encapsulation limit */
if (i + sizeof(*tel) > optlen)
break;
--
2.53.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2026-04-28 10:43 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20260428104133.2858589-1-sashal@kernel.org>
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.10] selftests: fib_nexthops: test stale has_v4 on nexthop replace Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.12] 9p/trans_xen: make cleanup idempotent after dataring alloc errors Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.10] ipv6: Cap TLV scan in ip6_tnl_parse_tlv_enc_lim Sasha Levin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox