From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Yuvraj Singh Chauhan <ysinghcin@gmail.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Sasha Levin <sashal@kernel.org>,
linux-staging@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 7.0-5.10] staging: octeon: fix free_irq dev_id mismatch in cvm_oct_rx_shutdown
Date: Tue, 28 Apr 2026 06:40:55 -0400 [thread overview]
Message-ID: <20260428104133.2858589-44-sashal@kernel.org> (raw)
In-Reply-To: <20260428104133.2858589-1-sashal@kernel.org>
From: Yuvraj Singh Chauhan <ysinghcin@gmail.com>
[ Upstream commit 41db5b76eeb4cc11a1097384caba7cfc659f7293 ]
In cvm_oct_rx_initialize(), request_irq() is called with
&oct_rx_group[i].napi as the dev_id:
request_irq(oct_rx_group[i].irq, cvm_oct_do_interrupt, 0, "Ethernet",
&oct_rx_group[i].napi);
However, cvm_oct_rx_shutdown() passes cvm_oct_device (an array of
struct net_device pointers) as the dev_id to free_irq():
free_irq(oct_rx_group[i].irq, cvm_oct_device);
Since __free_irq() matches the action to remove by comparing
dev_id pointers, the mismatched cookie means the IRQ handler is
never found, triggering a WARN and leaving the IRQ line permanently
allocated. This prevents proper driver cleanup on module removal.
Fix the mismatch by passing &oct_rx_group[i].napi as the dev_id
to free_irq(), matching what was used during request_irq().
Signed-off-by: Yuvraj Singh Chauhan <ysinghcin@gmail.com>
Link: https://patch.msgid.link/20260212171903.1417804-1-ysinghcin@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Now I have enough information to provide a comprehensive analysis.
## Analysis
### PHASE 1: COMMIT MESSAGE FORENSICS
**Step 1.1:** Parse subject line
- Record: Subsystem `staging: octeon:`, verb `fix`, summary: "free_irq
dev_id mismatch in cvm_oct_rx_shutdown"
**Step 1.2:** Parse commit message tags
- Record: `Signed-off-by: Yuvraj Singh Chauhan`; `Link:` to
patch.msgid.link; `Signed-off-by: Greg Kroah-Hartman`. No Fixes:,
Reported-by:, Reviewed-by:, or Tested-by: tags. No syzbot reporter. No
Cc: stable.
**Step 1.3:** Analyze commit body
- Record: Bug is a mismatch between the dev_id cookie passed to
`request_irq()` (`&oct_rx_group[i].napi`) vs. `free_irq()`
(`cvm_oct_device`). `__free_irq()` matches the action to remove by
comparing dev_id pointers. With the mismatch, the action is never
found, triggering `WARN(1, "Trying to free already-free IRQ %d\n",
irq)` and leaving the IRQ line permanently allocated.
- Failure mode: WARN splat on module remove; IRQ handler stays
registered; driver cleanup is incomplete.
**Step 1.4:** Hidden bug fix check
- Record: This is an explicit "fix" - not hidden. The description
clearly explains the mechanism.
### PHASE 2: DIFF ANALYSIS
**Step 2.1:** Inventory
- Record: 1 file (`drivers/staging/octeon/ethernet-rx.c`), +1/-1 = 1
line effective change inside `cvm_oct_rx_shutdown()`. Scope
classification: single-file surgical one-liner.
**Step 2.2:** Code flow
- Record: Before: `free_irq(oct_rx_group[i].irq, cvm_oct_device)` -
wrong cookie. After: `free_irq(oct_rx_group[i].irq,
&oct_rx_group[i].napi)` - matches the `request_irq()` cookie used in
`cvm_oct_rx_initialize()` at line 481.
**Step 2.3:** Bug mechanism
- Record: Category: logic/correctness (mismatched function-pair
parameter). Specifically, the request_irq() is called with
`&oct_rx_group[i].napi` and `free_irq()` must pass the same pointer.
Verified in kernel/irq/manage.c:1886 (`if (action->dev_id ==
dev_id)`).
**Step 2.4:** Fix quality
- Record: Obviously correct - it literally makes the teardown mirror the
setup. Zero regression risk - if the module were working with the
"old" dev_id in free_irq (it wasn't, as shown by the WARN behavior),
no user would depend on that behavior.
### PHASE 3: GIT HISTORY INVESTIGATION
**Step 3.1:** Blame
- Record: The line `free_irq(oct_rx_group[i].irq, cvm_oct_device)` was
touched by revert `422d97b8b05ed` (2020) but that was a mass revert
restoring the driver. The actual bug was introduced earlier.
**Step 3.2:** Follow Fixes tag
- Record: No Fixes: tag present. Tracing manually: commit
`08712f9de1013` ("staging: octeon: pass the NAPI instance reference to
irq handler", Aug 2016) changed `request_irq()` from using
`cvm_oct_device` to `&cvm_oct_napi` but did NOT update the matching
`free_irq()` - introducing this mismatch. This went into v4.9. Commit
`e971a119f713a` then extended it to multiple rx groups (also v4.9),
still leaving `free_irq()` with `cvm_oct_device`.
**Step 3.3:** Related changes
- Record: No recent churn in this file's shutdown path. Last functional
changes around napi/irq were in 2016. Related commit `60c85e23bed17`
(switch to netif_napi_add_weight) did not touch free_irq.
**Step 3.4:** Author context
- Record: Yuvraj Singh Chauhan - first-time contributor to the kernel
based on the lore thread (no Reviewed-by/Tested-by responses). Patch
applied directly by Greg KH (staging maintainer).
**Step 3.5:** Dependencies
- Record: No dependencies. The code structure (`oct_rx_group[i].napi`)
exists in all stable trees since v4.9. Standalone fix.
### PHASE 4: MAILING LIST RESEARCH
**Step 4.1/4.2:** b4 dig results
- Record: Found single submission at https://lore.kernel.org/all/2026021
2171903.1417804-1-ysinghcin@gmail.com/. Only v1, no revisions. Thread
mbox contains only the original patch - no review replies, no NAKs, no
stable suggestions. Applied by Greg KH without external review (common
for trivial staging fixes).
**Step 4.3:** Bug report
- Record: No Reported-by: tag. No bug report linked. Appears to be
discovered via code inspection.
**Step 4.4:** Related patches
- Record: Standalone patch, not part of a series.
**Step 4.5:** Stable mailing list
- Record: No prior discussion on stable@ for this specific bug found.
### PHASE 5: CODE SEMANTIC ANALYSIS
**Step 5.1-5.4:** Function impact
- Record: `cvm_oct_rx_shutdown()` is called only from `cvm_oct_remove()`
in `drivers/staging/octeon/ethernet.c:936`, which is the
platform_device remove callback. Trigger path: module unload or device
unbind. Limited trigger frequency but reachable from standard module
lifecycle.
**Step 5.5:** Similar patterns
- Record: `drivers/staging/octeon/ethernet-tx.c` uses `cvm_oct_device`
consistently for BOTH its `request_irq()` (line 663) and `free_irq()`
(line 672) - that pair is correctly matched. The bug is isolated to
the rx path.
### PHASE 6: CROSS-REFERENCING AND STABLE TREE ANALYSIS
**Step 6.1:** Buggy code in stable
- Record: Verified identical `free_irq(oct_rx_group[i].irq,
cvm_oct_device);` in v6.1, v6.6, v6.12 at line 538. Bug exists in all
active LTS trees. Bug introduced in v4.9 (2016).
**Step 6.2:** Backport complexity
- Record: The one-line change would apply cleanly to all stable trees
since the surrounding code (the for-loop structure with
`oct_rx_group[i].irq`) is stable since v4.9.
**Step 6.3:** Already in stable
- Record: No prior fix exists; the patch would be the first for this
bug.
### PHASE 7: SUBSYSTEM AND MAINTAINER CONTEXT
**Step 7.1:** Criticality
- Record: `drivers/staging/octeon/` - staging driver for Cavium OCTEON
MIPS SoCs. PERIPHERAL criticality - affects only OCTEON hardware users
(real users exist per TODO file stating "functional and supports
Ethernet on OCTEON+/OCTEON2/OCTEON3").
**Step 7.2:** Subsystem activity
- Record: Low activity - mostly cleanup/style commits over recent years.
The driver is mature but still in staging.
### PHASE 8: IMPACT AND RISK ASSESSMENT
**Step 8.1:** Affected users
- Record: Driver-specific - users of Cavium OCTEON MIPS SoCs with
`CONFIG_OCTEON_ETHERNET=m`.
**Step 8.2:** Trigger conditions
- Record: Triggered on module unload (`rmmod octeon_ethernet`) or driver
unbind via sysfs. Not triggered during normal operation. Not
triggerable by unprivileged users.
**Step 8.3:** Failure mode severity
- Record:
- Immediate: WARN splat with stack trace (MEDIUM - no crash).
- Persistent: IRQ line permanently allocated after module removal
(MEDIUM - resource leak).
- Potential: If module re-loaded, `request_irq()` could fail or re-
register on the same IRQ unpredictably. More concerning, after
module unload the IRQ handler function pointer in the irqaction
still references unloaded module code - if an interrupt fires, it
would be a kernel oops.
- Overall severity: MEDIUM (resource leak + latent crash risk on
unload+interrupt).
**Step 8.4:** Risk-benefit
- Record: Benefit: LOW-MEDIUM (narrow hardware audience, uncommon
trigger). Risk: VERY LOW (1 line, trivially correct, mirrors well-
understood request/free pair). Ratio: favorable for backport.
### PHASE 9: FINAL SYNTHESIS
**Step 9.1:** Evidence
- FOR: Obviously correct fix, minimal scope, bug present since v4.9 in
all stable trees, clean application expected, mirrors existing pattern
in ethernet-tx.c.
- AGAINST: Staging driver, narrow audience, no reviewer endorsement, no
user bug report, WARN is typically considered medium severity.
- UNRESOLVED: Whether any real user has hit this WARN (no Reported-by).
**Step 9.2:** Stable rules checklist
1. Obviously correct and tested? YES (the patch mirrors the matching
setup call; fix is trivially verifiable)
2. Fixes a real bug? YES (mismatch is real, `__free_irq` logic verified
in kernel/irq/manage.c)
3. Important issue? BORDERLINE-YES (WARN + resource leak on module
unload; latent crash potential)
4. Small and contained? YES (1 line)
5. No new features? YES
6. Applies to stable? YES
**Step 9.3:** Exception category
- Not an explicit exception category but qualifies as a "small, surgical
bug fix" per stable rules.
**Step 9.4:** Decision
- The fix is obviously correct, one line, zero regression risk, and
fixes a genuine kernel WARN plus IRQ leak on module unload in a driver
with real hardware users. It applies to all active stable trees. Even
though it's a staging driver and the trigger is module unload, stable
routinely backports such clearly correct tiny fixes. The cost-benefit
is clearly positive.
## Verification
- [Phase 1] Read commit 41db5b76eeb4c with `git show` - confirmed tags
and body
- [Phase 2] Diff confirmed: one line change in `cvm_oct_rx_shutdown()`
replacing `cvm_oct_device` with `&oct_rx_group[i].napi`
- [Phase 2] Read `drivers/staging/octeon/ethernet-rx.c` at HEAD and at
41db5b76eeb4c to confirm before/after states
- [Phase 2] Verified `__free_irq()` semantics in
`kernel/irq/manage.c:1858-1889` - WARN when action not found, returns
NULL
- [Phase 2] Verified `action->dev_id = dev_id` stored at
`kernel/irq/manage.c:2167` and `:2307`, matched at `:1886`
- [Phase 3] `git blame` on lines 530-545 - returned revert commit
422d97b8b05ed3 (2020)
- [Phase 3] `git log --follow --diff-filter=A` traced first addition to
80ff0fd3ab645 (v2.6.30) and revert
- [Phase 3] `git show 08712f9de1013` - confirmed this commit (v4.9, Aug
2016) changed request_irq to `&cvm_oct_napi` but left free_irq
unchanged (line 505 still used `cvm_oct_device`)
- [Phase 3] `git show e971a119f713a` - extended to per-group arrays in
v4.9, still with mismatched free_irq
- [Phase 3] `git tag --contains 08712f9de1013` - showed v4.9 as first
containing tag
- [Phase 4] `b4 dig -c 41db5b76eeb4c` - found original submission at
patch.msgid.link
- [Phase 4] `b4 dig -c 41db5b76eeb4c -a` - only v1, no prior revisions
- [Phase 4] `b4 dig -c 41db5b76eeb4c -w` - only gregkh and staging/lkml
mailing lists were CC'd
- [Phase 4] Thread mbox (117 lines) - only the original submission, no
replies/reviews
- [Phase 4] lore.kernel.org direct fetch was blocked by Anubis; relied
on mbox content instead
- [Phase 5] `git grep "cvm_oct_device\b"` showed tx.c uses it
consistently for both request_irq and free_irq (correctly matched),
confirming only rx.c is buggy
- [Phase 5] `git grep "cvm_oct_rx_shutdown"` - called from
`cvm_oct_remove()` (platform_device remove callback only)
- [Phase 6] Verified bug exists in v6.1, v6.6, v6.12 with `git show
vX.Y:drivers/staging/octeon/ethernet-rx.c | grep free_irq` - all show
the buggy line at 538
- [Phase 7] Read `drivers/staging/octeon/TODO` - driver supports real
hardware (OCTEON+/OCTEON2/OCTEON3 up to CN7030)
- [Phase 8] Trigger path confirmed via `cvm_oct_remove()` callchain in
ethernet.c:936
- UNVERIFIED: Exact kernel version the bug was first introduced (v4.9
per `git tag --contains`, but I did not check if v4.9.y was the oldest
still-supported LTS)
- UNVERIFIED: Whether any user has actually reported hitting the WARN in
production - no Reported-by in commit, no bug tracker link
The fix is a trivial, obviously-correct one-line change that corrects a
genuine bug present since v4.9, in a driver with real users, with zero
regression risk. It meets all stable kernel criteria.
**YES**
drivers/staging/octeon/ethernet-rx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/octeon/ethernet-rx.c b/drivers/staging/octeon/ethernet-rx.c
index 965330eec80a8..d0b43d50b83ce 100644
--- a/drivers/staging/octeon/ethernet-rx.c
+++ b/drivers/staging/octeon/ethernet-rx.c
@@ -535,7 +535,7 @@ void cvm_oct_rx_shutdown(void)
cvmx_write_csr(CVMX_POW_WQ_INT_THRX(i), 0);
/* Free the interrupt handler */
- free_irq(oct_rx_group[i].irq, cvm_oct_device);
+ free_irq(oct_rx_group[i].irq, &oct_rx_group[i].napi);
netif_napi_del(&oct_rx_group[i].napi);
}
--
2.53.0
next parent reply other threads:[~2026-04-28 10:42 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20260428104133.2858589-1-sashal@kernel.org>
2026-04-28 10:40 ` Sasha Levin [this message]
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-5.10] staging: fbtft: fix unchecked write return value in fb_agm1264k-fl Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260428104133.2858589-44-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=gregkh@linuxfoundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-staging@lists.linux.dev \
--cc=patches@lists.linux.dev \
--cc=stable@vger.kernel.org \
--cc=ysinghcin@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox