public inbox for netdev@vger.kernel.org
* [PATCH AUTOSEL 6.18] net: stmmac: Fix PTP ref clock for Tegra234
       [not found] <20260420131539.986432-1-sashal@kernel.org>
@ 2026-04-20 13:07 ` Sasha Levin
  2026-04-20 13:07 ` [PATCH AUTOSEL 7.0-6.12] wifi: mac80211: properly handle error in ieee80211_add_virtual_monitor Sasha Levin
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2026-04-20 13:07 UTC (permalink / raw)
  To: patches, stable
  Cc: Jon Hunter, Simon Horman, Jakub Kicinski, Sasha Levin,
	alexandre.torgue, joabreu, davem, edumazet, pabeni,
	mcoquelin.stm32, thierry.reding, vbhadram, ruppala, netdev,
	linux-stm32, linux-arm-kernel, linux-tegra, linux-kernel

From: Jon Hunter <jonathanh@nvidia.com>

[ Upstream commit 1345e9f4e3f3bc7d8a0a2138ae29e205a857a555 ]

Since commit 030ce919e114 ("net: stmmac: make sure that ptp_rate is not
0 before configuring timestamping") was added the following error is
observed on Tegra234:

 ERR KERN tegra-mgbe 6800000.ethernet eth0: Invalid PTP clock rate
 WARNING KERN tegra-mgbe 6800000.ethernet eth0: PTP init failed

It turns out that the Tegra234 device-tree binding defines the PTP ref
clock name as 'ptp-ref' and not 'ptp_ref' and the above commit now
exposes this and that the PTP clock is not configured correctly.

In order to update device-tree to use the correct 'ptp_ref' name, update
the Tegra MGBE driver to use 'ptp_ref' by default and fallback to using
'ptp-ref' if this clock name is present.

Fixes: d8ca113724e7 ("net: stmmac: tegra: Add MGBE support")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260401102941.17466-2-jonathanh@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

 .../net/ethernet/stmicro/stmmac/dwmac-tegra.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-tegra.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-tegra.c
index d765acbe37548..21a0a11fc0118 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-tegra.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-tegra.c
@@ -9,7 +9,7 @@
 #include "stmmac_platform.h"
 
 static const char *const mgbe_clks[] = {
-	"rx-pcs", "tx", "tx-pcs", "mac-divider", "mac", "mgbe", "ptp-ref", "mac"
+	"rx-pcs", "tx", "tx-pcs", "mac-divider", "mac", "mgbe", "ptp_ref", "mac"
 };
 
 struct tegra_mgbe {
@@ -215,6 +215,7 @@ static int tegra_mgbe_probe(struct platform_device *pdev)
 {
 	struct plat_stmmacenet_data *plat;
 	struct stmmac_resources res;
+	bool use_legacy_ptp = false;
 	struct tegra_mgbe *mgbe;
 	int irq, err, i;
 	u32 value;
@@ -257,9 +258,23 @@ static int tegra_mgbe_probe(struct platform_device *pdev)
 	if (!mgbe->clks)
 		return -ENOMEM;
 
-	for (i = 0; i <  ARRAY_SIZE(mgbe_clks); i++)
+	/* Older device-trees use 'ptp-ref' rather than 'ptp_ref'.
+	 * Fall back when the legacy name is present.
+	 */
+	if (of_property_match_string(pdev->dev.of_node, "clock-names",
+				     "ptp-ref") >= 0)
+		use_legacy_ptp = true;
+
+	for (i = 0; i < ARRAY_SIZE(mgbe_clks); i++) {
 		mgbe->clks[i].id = mgbe_clks[i];
 
+		if (use_legacy_ptp && !strcmp(mgbe_clks[i], "ptp_ref")) {
+			dev_warn(mgbe->dev,
+				 "Device-tree update needed for PTP clock!\n");
+			mgbe->clks[i].id = "ptp-ref";
+		}
+	}
+
 	err = devm_clk_bulk_get(mgbe->dev, ARRAY_SIZE(mgbe_clks), mgbe->clks);
 	if (err < 0)
 		return err;
-- 
2.53.0


* [PATCH AUTOSEL 7.0-6.12] wifi: mac80211: properly handle error in ieee80211_add_virtual_monitor
       [not found] <20260420131539.986432-1-sashal@kernel.org>
  2026-04-20 13:07 ` [PATCH AUTOSEL 6.18] net: stmmac: Fix PTP ref clock for Tegra234 Sasha Levin
@ 2026-04-20 13:07 ` Sasha Levin
  2026-04-20 13:07 ` [PATCH AUTOSEL 7.0-5.10] net: qrtr: fix endian handling of confirm_rx field Sasha Levin
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2026-04-20 13:07 UTC (permalink / raw)
  To: patches, stable
  Cc: Miri Korenblit, Johannes Berg, Sasha Levin, johannes, davem,
	edumazet, kuba, pabeni, linux-wireless, netdev, linux-kernel

From: Miri Korenblit <miriam.rachel.korenblit@intel.com>

[ Upstream commit 876565d4a826f3f04ef36f1cef6123ed4b150aa3 ]

In case of an error in ieee80211_add_virtual_monitor,
SDATA_STATE_RUNNING should be cleared as it was set in this function.
Do it there instead of in the error path of ieee80211_do_open.

Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260320141312.5546126313b1.I689dba2f54069b259702e8d246cedf79a73b82c6@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

---

## PHASE 1: COMMIT MESSAGE FORENSICS

### Step 1.1: Subject Line
- **Subsystem**: wifi: mac80211
- **Action verb**: "properly handle" (strong indicator of a bug fix)
- **Summary**: Fix error handling in `ieee80211_add_virtual_monitor` --
  `SDATA_STATE_RUNNING` bit not properly managed on error paths.

Record: [wifi: mac80211] [properly handle] [Fix SDATA_STATE_RUNNING
management on error paths in virtual monitor addition]

### Step 1.2: Tags
- **Reviewed-by**: Johannes Berg (mac80211 subsystem maintainer)
- **Signed-off-by**: Miri Korenblit (Intel WiFi developer, regular
  mac80211 contributor)
- **Signed-off-by**: Johannes Berg (maintainer applied the patch)
- **Link**: patch.msgid.link URL (lore.kernel.org backed)
- No Fixes: tag (expected for AUTOSEL candidates)
- No Reported-by: tag
- No Cc: stable tag

Record: Reviewed and signed-off by the subsystem maintainer. Strong
quality signal.

### Step 1.3: Commit Body
The commit explains: when `ieee80211_add_virtual_monitor` encounters an
error, `SDATA_STATE_RUNNING` should be cleared within the function
itself (since it was set there), rather than relying on the caller's
error path in `ieee80211_do_open` (where it was clearing a bit that was
never set on the caller's sdata).

Record: Bug is incorrect state management -- SDATA_STATE_RUNNING bit
left set on error paths within the function. The caller's cleanup was a
no-op.

### Step 1.4: Hidden Bug Fix Detection
"properly handle error" is a direct bug fix description. This fixes a
state consistency issue where `SDATA_STATE_RUNNING` is set but never
cleared on failure, which could cause incorrect behavior in the
extensive code paths that check `ieee80211_sdata_running()`.

Record: This is clearly a bug fix, not disguised.

---

## PHASE 2: DIFF ANALYSIS

### Step 2.1: Inventory
- **Files changed**: 1 (`net/mac80211/iface.c`)
- **Lines**: ~5 lines changed effectively (moved set_bit, added
  clear_bit, removed dead code)
- **Functions modified**: `ieee80211_add_virtual_monitor` and
  `ieee80211_do_open`

Record: Single file, surgical change, two functions affected. Scope:
very small.

### Step 2.2: Code Flow Changes

**Hunk 1** (`ieee80211_add_virtual_monitor`):
- **Before**: `set_bit(SDATA_STATE_RUNNING)` at line 1225 BEFORE
  `ieee80211_check_queues`; if check_queues fails, sdata is freed with
  RUNNING still set.
- **After**: `set_bit(SDATA_STATE_RUNNING)` moved AFTER
  `ieee80211_check_queues`. The bit is only set once the queues are
  verified. In the `ieee80211_link_use_channel` error path,
  `clear_bit(SDATA_STATE_RUNNING)` is added before `kfree(sdata)`.

**Hunk 2** (`ieee80211_do_open`):
- **Before**: `clear_bit(SDATA_STATE_RUNNING, &sdata->state)` in error
  path with comment "might already be clear but that doesn't matter."
- **After**: This `clear_bit` is removed because `SDATA_STATE_RUNNING`
  is only set at line 1541 (after all error gotos), so clearing it in
  the error path was always a no-op.
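
The ordering described in Hunk 1 can be modelled outside the kernel. The
following is a minimal user-space sketch, not the mac80211 code:
`struct fake_sdata`, `add_virtual_monitor_model()`, `STATE_RUNNING`, and
the two boolean failure switches are hypothetical names introduced only
for illustration, and the chosen errno values are assumptions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define STATE_RUNNING (1u << 0)	/* stand-in for SDATA_STATE_RUNNING */

/* Hypothetical stand-in for the interface data; not the kernel struct. */
struct fake_sdata {
	unsigned int state;
};

/* State word observed at the moment an error path released the object. */
static unsigned int state_at_failure;

/*
 * Models the patched ordering: validate first, set RUNNING only once
 * validation has succeeded, and clear it again if a later step fails.
 * Returns 0 on success or a negative errno-style value on failure.
 */
static int add_virtual_monitor_model(bool queues_fail, bool channel_fail)
{
	struct fake_sdata *sdata = calloc(1, sizeof(*sdata));

	if (!sdata)
		return -ENOMEM;

	if (queues_fail) {
		/* RUNNING was never set, so there is nothing to clear. */
		state_at_failure = sdata->state;
		free(sdata);
		return -EINVAL;
	}

	sdata->state |= STATE_RUNNING;

	if (channel_fail) {
		/* Undo what this function set before freeing the object. */
		sdata->state &= ~STATE_RUNNING;
		state_at_failure = sdata->state;
		free(sdata);
		return -EBUSY;
	}

	/* The model does not publish the object; it is simply released. */
	free(sdata);
	return 0;
}
```

In this model both failure paths leave `state_at_failure` with the
RUNNING bit clear, which is the invariant the patch aims for: the
function that sets the bit is also the one that undoes it.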

### Step 2.3: Bug Mechanism
This is a **state management / initialization bug**. The
`SDATA_STATE_RUNNING` bit gates behavior in ~50+ call sites across
mac80211 (TX, RX, scan, reconfig, offchannel, etc.). Setting it
prematurely or failing to clear it on error leads to inconsistent state.

The correct pattern is shown in `ieee80211_del_virtual_monitor` (lines
1301-1312):

```c
        clear_bit(SDATA_STATE_RUNNING, &sdata->state);
        ieee80211_link_release_channel(&sdata->deflink);
        // ...
        drv_remove_interface(local, sdata);
        // ...
        kfree(sdata);
```

The error path was missing the `clear_bit` before teardown, inconsistent
with this established pattern.

### Step 2.4: Fix Quality
- Obviously correct: follows the established pattern in
  `ieee80211_del_virtual_monitor`
- Minimal/surgical: only moves one bit-set and adds one bit-clear
- Regression risk: very low -- the removed `clear_bit` in
  `ieee80211_do_open` was a no-op
- Reviewed by maintainer Johannes Berg

---

## PHASE 3: GIT HISTORY INVESTIGATION

### Step 3.1: Blame
The `set_bit(SDATA_STATE_RUNNING)` was introduced by commit
`bdeca45a0cc58` ("mac80211: set SDATA_STATE_RUNNING for monitor
interfaces") from 2020-11-29. This commit was added to fix HW restart
issues but didn't account for all error paths.

The buggy code has existed since v5.10 (confirmed via
`git merge-base --is-ancestor`), and is present in v5.15, v6.1, v6.6, and
all newer stable trees.

### Step 3.2: Fixes Tag
No Fixes: tag present. The implicit fix target is `bdeca45a0cc58` from
2020-11-29.

### Step 3.3: Related Changes
- `cbf0dc37bb4e9` ("wifi: mac80211: fix list iteration in
  ieee80211_add_virtual_monitor()") -- a syzbot-reported fix in the same
  function, shows the function has known bug history.
- `c0d82ba9612fb` -- Miri's other commit about `ieee80211_sdata_running`
  checks.

### Step 3.4: Author
Miri Korenblit is a regular Intel WiFi developer with many commits to
mac80211. The patch was reviewed by Johannes Berg, the mac80211
maintainer.

### Step 3.5: Dependencies
No dependencies found. The fix is self-contained and modifies only the
ordering and presence of `set_bit`/`clear_bit` calls on existing state
bits. Should apply cleanly to stable trees back to v5.10.

---

## PHASE 4: MAILING LIST RESEARCH

Lore.kernel.org was behind anti-bot protection and could not be fetched.
The Link: tag points to the original submission. The patch was reviewed
by the maintainer.

Record: Could not fetch lore discussion due to anti-bot protection. The
Reviewed-by from Johannes Berg is a strong quality signal.

---

## PHASE 5: CODE SEMANTIC ANALYSIS

### Step 5.1-5.4: Functions and Call Chains
`ieee80211_sdata_running()` (which checks `SDATA_STATE_RUNNING`) is
called from **50+ locations** across mac80211:
- TX hot path (`tx.c:2319, 4291, 4523, 4707`)
- RX path (`rx.c:5396`)
- HW reconfig (`util.c:1925, 1942, 1955, 1985, 2157, 2248`)
- Scanning (`scan.c:532, 942, 1152`)
- Channel management (`chan.c:93, 568`)
- Configuration (`cfg.c` multiple locations)

The critical path is HW reconfig at `util.c:1954-1956`:
```c
sdata = wiphy_dereference(local->hw.wiphy, local->monitor_sdata);
if (sdata && ieee80211_sdata_running(sdata))
    ieee80211_assign_chanctx(local, sdata, &sdata->deflink);
```

If the sdata was partially initialized (RUNNING set but channel context
failed), this could attempt operations on invalid state.

### Step 5.5: Similar Patterns
The proper pattern (`clear_bit` before teardown) is consistently used in
`ieee80211_del_virtual_monitor` (line 1301) and `ieee80211_do_stop`
(line 490). The error path was the outlier.

---

## PHASE 6: STABLE TREE ANALYSIS

### Step 6.1: Buggy Code in Stable
The buggy commit `bdeca45a0cc58` from Nov 2020 is present in **all
active stable trees**: v5.10, v5.15, v6.1, v6.6, v6.12. The fix is
relevant to all of them.

### Step 6.2: Backport Complications
The function signature changed (added `creator_sdata` parameter), but
the core logic and error paths are the same. Minor conflicts possible in
older trees but the fix concept applies cleanly.

### Step 6.3: Related Fixes in Stable
No other fix for this specific issue found in stable.

---

## PHASE 7: SUBSYSTEM CONTEXT

### Step 7.1: Criticality
- **Subsystem**: WiFi (net/mac80211) -- IMPORTANT level
- WiFi is used by vast majority of laptops, embedded systems, IoT
  devices
- mac80211 is the core WiFi stack used by most WiFi drivers

### Step 7.2: Activity
Very active subsystem (87 changes since v6.6 for this single file).

---

## PHASE 8: IMPACT AND RISK ASSESSMENT

### Step 8.1: Affected Users
All WiFi users whose hardware uses mac80211 virtual monitor interfaces
(common during scanning, monitoring).

### Step 8.2: Trigger Conditions
Triggered when `ieee80211_add_virtual_monitor` fails -- specifically
when `ieee80211_check_queues` or `ieee80211_link_use_channel` return
errors. This can happen during:
- HW restart/reconfig (util.c:2269)
- Opening a monitor interface (iface.c:1437)
- Channel context assignment failures

### Step 8.3: Failure Mode
- Incorrect `SDATA_STATE_RUNNING` state could cause code paths gated by
  `ieee80211_sdata_running()` to operate on improperly initialized sdata
- In the worst case, during HW reconfig, could lead to inconsistent
  driver state, potential crashes, or resource leaks
- Severity: **MEDIUM-HIGH** (state corruption in WiFi stack, potential
  for cascading issues)

### Step 8.4: Risk-Benefit
- **Benefit**: Fixes state management bug in widely used WiFi code,
  present since v5.10
- **Risk**: Very low -- moves one set_bit, adds one clear_bit, removes
  dead code. Reviewed by maintainer.
- **Ratio**: Favorable for backport

---

## PHASE 9: FINAL SYNTHESIS

### Step 9.1: Evidence Summary

**FOR backporting:**
- Fixes a real error handling bug (SDATA_STATE_RUNNING not cleared on
  error paths)
- `ieee80211_sdata_running()` checked in 50+ locations -- incorrect
  state has wide impact
- Bug present since v5.10, affects all stable trees
- Small, surgical fix (5 effective lines changed)
- Reviewed and signed-off by Johannes Berg (mac80211 maintainer)
- Author is a regular Intel WiFi contributor
- Follows established cleanup pattern from
  `ieee80211_del_virtual_monitor`
- Single file change, no dependencies

**AGAINST backporting:**
- No specific crash/bug report referenced (no Reported-by)
- Failure requires specific error conditions (channel context failure)
- Minor function signature differences in older stable trees

### Step 9.2: Stable Rules Checklist
1. Obviously correct and tested? **YES** -- follows existing pattern,
   reviewed by maintainer
2. Fixes a real bug? **YES** -- state bit not cleared on error path
3. Important issue? **YES** -- affects WiFi state management in
   widely used code
4. Small and contained? **YES** -- 5 lines across one file
5. No new features or APIs? **YES** -- purely error handling fix
6. Can apply to stable trees? **YES** -- code exists in all active
   stable trees

### Step 9.3: Exceptions
Not applicable.

---

## Verification

- [Phase 1] Parsed tags: Reviewed-by Johannes Berg (maintainer), SOB
  from Miri Korenblit (Intel wifi dev)
- [Phase 2] Diff analysis: set_bit moved after ieee80211_check_queues;
  clear_bit added in ieee80211_link_use_channel error path; dead
  clear_bit removed from ieee80211_do_open
- [Phase 3] git blame: SDATA_STATE_RUNNING introduced by bdeca45a0cc58
  (2020-11-29), present since v5.10
- [Phase 3] git merge-base: confirmed bdeca45a0cc58 is in v5.10, v5.15,
  v6.1, v6.6
- [Phase 3] Related fix cbf0dc37bb4e9 (syzbot-found bug in same
  function) confirms function has bug history
- [Phase 4] Could not fetch lore discussion (anti-bot protection).
  UNVERIFIED: no stable nomination in discussion
- [Phase 5] grep: ieee80211_sdata_running() used in 50+ call sites
  across TX, RX, scan, reconfig, cfg
- [Phase 5] Compared with ieee80211_del_virtual_monitor (line 1301):
  proper pattern clears RUNNING before teardown
- [Phase 6] Bug code present in all active stable trees (v5.10+)
- [Phase 8] Failure mode: inconsistent SDATA_STATE_RUNNING during error
  recovery, severity MEDIUM-HIGH

The fix is small, correct, matches established patterns, reviewed by the
subsystem maintainer, and fixes a real state management bug in code
present in all stable trees.

**YES**

 net/mac80211/iface.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 676b2a43c9f2f..989e60d4b721d 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -1222,14 +1222,14 @@ int ieee80211_add_virtual_monitor(struct ieee80211_local *local,
 		}
 	}
 
-	set_bit(SDATA_STATE_RUNNING, &sdata->state);
-
 	ret = ieee80211_check_queues(sdata, NL80211_IFTYPE_MONITOR);
 	if (ret) {
 		kfree(sdata);
 		return ret;
 	}
 
+	set_bit(SDATA_STATE_RUNNING, &sdata->state);
+
 	mutex_lock(&local->iflist_mtx);
 	rcu_assign_pointer(local->monitor_sdata, sdata);
 	mutex_unlock(&local->iflist_mtx);
@@ -1242,6 +1242,7 @@ int ieee80211_add_virtual_monitor(struct ieee80211_local *local,
 		mutex_unlock(&local->iflist_mtx);
 		synchronize_net();
 		drv_remove_interface(local, sdata);
+		clear_bit(SDATA_STATE_RUNNING, &sdata->state);
 		kfree(sdata);
 		return ret;
 	}
@@ -1550,8 +1551,6 @@ int ieee80211_do_open(struct wireless_dev *wdev, bool coming_up)
 	sdata->bss = NULL;
 	if (sdata->vif.type == NL80211_IFTYPE_AP_VLAN)
 		list_del(&sdata->u.vlan.list);
-	/* might already be clear but that doesn't matter */
-	clear_bit(SDATA_STATE_RUNNING, &sdata->state);
 	return res;
 }
 
-- 
2.53.0


* [PATCH AUTOSEL 7.0-5.10] net: qrtr: fix endian handling of confirm_rx field
       [not found] <20260420131539.986432-1-sashal@kernel.org>
  2026-04-20 13:07 ` [PATCH AUTOSEL 6.18] net: stmmac: Fix PTP ref clock for Tegra234 Sasha Levin
  2026-04-20 13:07 ` [PATCH AUTOSEL 7.0-6.12] wifi: mac80211: properly handle error in ieee80211_add_virtual_monitor Sasha Levin
@ 2026-04-20 13:07 ` Sasha Levin
  2026-04-20 13:08 ` [PATCH AUTOSEL 6.18] netfilter: xt_multiport: validate range encoding in checkentry Sasha Levin
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2026-04-20 13:07 UTC (permalink / raw)
  To: patches, stable
  Cc: Alexander Wilhelm, Manivannan Sadhasivam, Simon Horman,
	David S. Miller, Sasha Levin, edumazet, kuba, pabeni,
	linux-arm-msm, netdev, linux-kernel

From: Alexander Wilhelm <alexander.wilhelm@westermo.com>

[ Upstream commit e4cf6087cab382c7031e6b436ec55202fa9f2d7b ]

Convert confirm_rx to little endian when enqueueing and convert it back on
receive. This fixes control flow on big endian hosts, little endian is
unaffected.

On transmit, store confirm_rx as __le32 using cpu_to_le32(). On receive,
apply le32_to_cpu() before using the value. !! ensures the value is 0 or 1
in native endianness, so the conversion isn’t strictly required here, but
it is kept for consistency and clarity.

Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Alexander Wilhelm <alexander.wilhelm@westermo.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

---

## PHASE 1: COMMIT MESSAGE FORENSICS

### Step 1.1: PARSE THE SUBJECT LINE
Record: **[net: qrtr]** **[fix]** Fix endian handling of `confirm_rx`
field — the word "fix" is used explicitly, targeting an endianness bug
in QRTR protocol.

### Step 1.2: PARSE ALL COMMIT MESSAGE TAGS
- **Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>** —
  QRTR/Qualcomm ARM SoC maintainer
- **Signed-off-by: Alexander Wilhelm <alexander.wilhelm@westermo.com>**
  — Author (Westermo, embedded networking)
- **Reviewed-by: Simon Horman <horms@kernel.org>** — Netdev reviewer
- **Signed-off-by: David S. Miller <davem@davemloft.net>** — Net
  subsystem maintainer merged it

No Fixes: tag (it was deliberately removed at Simon Horman's request
during review — see Phase 4). No Reported-by tag. No Cc: stable tag.

### Step 1.3: ANALYZE THE COMMIT BODY TEXT
The commit explains that:
- On transmit, `confirm_rx` needs `cpu_to_le32()` because the header
  struct field is `__le32`
- On receive, `le32_to_cpu()` should be applied before use
- The `!!` ensures the value is 0 or 1, so the receive-side conversion
  isn't strictly required but is kept for consistency
- This "fixes control flow on big endian hosts"

Record: Bug: Missing endian conversion for the `confirm_rx` field on
both TX and RX paths. Symptom: Broken flow control on big-endian hosts.
Little-endian unaffected. Root cause: `confirm_rx` was stored/read as
native endian into a `__le32` field.

### Step 1.4: DETECT HIDDEN BUG FIXES
Not hidden — this is an explicit endianness bug fix. The subject says
"fix" directly.

---

## PHASE 2: DIFF ANALYSIS — LINE BY LINE

### Step 2.1: INVENTORY THE CHANGES
- **File**: `net/qrtr/af_qrtr.c` — 2 lines changed (1 modified in TX
  path, 1 modified in RX path)
- **Functions modified**: `qrtr_node_enqueue()` (TX),
  `qrtr_endpoint_post()` (RX)
- **Scope**: Single-file, extremely surgical fix

### Step 2.2: UNDERSTAND THE CODE FLOW CHANGE

**Hunk 1 (line 364, TX path in `qrtr_node_enqueue`):**
- Before: `hdr->confirm_rx = !!confirm_rx;` — stores native-endian int
  into `__le32` field
- After: `hdr->confirm_rx = cpu_to_le32(!!confirm_rx);` — properly
  converts to little-endian
- On LE hosts: `cpu_to_le32` is a no-op, identical behavior
- On BE hosts: the value 1 was stored in native big-endian byte order
  (bytes `00 00 00 01`), which a little-endian reader decodes as
  `0x01000000`. It is now stored as little-endian 1 (bytes
  `01 00 00 00`).

**Hunk 2 (line 465, RX path in `qrtr_endpoint_post`):**
- Before: `cb->confirm_rx = !!v1->confirm_rx;` — reads `__le32` as
  native int
- After: `cb->confirm_rx = !!le32_to_cpu(v1->confirm_rx);` — properly
  converts from LE first
- Due to `!!`, the result on the receive side was already correct (any
  non-zero becomes 1). The fix adds the conversion for
  correctness/consistency.
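
To make the byte-order argument concrete, here is a small
host-independent sketch. It does not use the kernel's `cpu_to_le32()` or
`le32_to_cpu()`; the helpers `put_le32()`, `put_be32()`, `get_le32()`,
`peer_sees_unconverted()`, and `peer_sees_converted()` are hypothetical
names that spell out byte positions explicitly, with `put_be32()`
standing in for what an unconverted native store does on a big-endian
host.

```c
#include <stdint.h>

/* Write v in little-endian byte order (the layout cpu_to_le32() yields). */
static void put_le32(uint8_t buf[4], uint32_t v)
{
	buf[0] = (uint8_t)(v & 0xff);
	buf[1] = (uint8_t)((v >> 8) & 0xff);
	buf[2] = (uint8_t)((v >> 16) & 0xff);
	buf[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Write v in big-endian byte order: models an unconverted native store
 * on a big-endian host.
 */
static void put_be32(uint8_t buf[4], uint32_t v)
{
	buf[0] = (uint8_t)((v >> 24) & 0xff);
	buf[1] = (uint8_t)((v >> 16) & 0xff);
	buf[2] = (uint8_t)((v >> 8) & 0xff);
	buf[3] = (uint8_t)(v & 0xff);
}

/* Read four wire bytes as a little-endian value, as the peer would. */
static uint32_t get_le32(const uint8_t buf[4])
{
	return (uint32_t)buf[0] |
	       ((uint32_t)buf[1] << 8) |
	       ((uint32_t)buf[2] << 16) |
	       ((uint32_t)buf[3] << 24);
}

/* What a little-endian peer decodes when a big-endian sender stores
 * confirm_rx without conversion (the pre-patch TX behaviour).
 */
static uint32_t peer_sees_unconverted(int confirm_rx)
{
	uint8_t wire[4];

	put_be32(wire, (uint32_t)!!confirm_rx);
	return get_le32(wire);
}

/* What the peer decodes when the sender converts first (post-patch). */
static uint32_t peer_sees_converted(int confirm_rx)
{
	uint8_t wire[4];

	put_le32(wire, (uint32_t)!!confirm_rx);
	return get_le32(wire);
}
```

Under this model `peer_sees_unconverted(1)` yields `0x01000000` while
`peer_sees_converted(1)` yields `1`. Applying `!!` to either result
gives 1, which is why the receive side was already tolerant.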

### Step 2.3: IDENTIFY THE BUG MECHANISM
Category: **Endianness/type bug (f)**. The `qrtr_hdr_v1` struct declares
`confirm_rx` as `__le32`, and every other field in the struct uses
proper `cpu_to_le32()`/`le32_to_cpu()` conversions — except
`confirm_rx`. This is the one field that was missed.

### Step 2.4: ASSESS THE FIX QUALITY
- **Obviously correct**: Yes — it follows the exact same pattern as all
  adjacent fields (type, src_node_id, etc.)
- **Minimal**: Yes — 2 lines, exactly matching the existing code pattern
- **Regression risk**: Essentially zero. On LE hosts (the vast
  majority), these are no-ops. On BE hosts, this makes the behavior
  correct.

---

## PHASE 3: GIT HISTORY INVESTIGATION

### Step 3.1: BLAME THE CHANGED LINES
- TX line (364): `hdr->confirm_rx = !!confirm_rx;` — introduced by
  commit **5fdeb0d372ab** ("net: qrtr: Implement outgoing flow
  control"), authored 2020-01-13, first appeared in **v5.6-rc1**
- RX line (465): `cb->confirm_rx = !!v1->confirm_rx;` — introduced by
  commit **194ccc88297ae** ("net: qrtr: Support decoding incoming v2
  packets"), authored 2017-10-10, first appeared in **v4.15**

Record: The buggy TX code has been present since v5.6 and the buggy RX
code since v4.15. Both are in all active stable trees (5.10, 5.15, 6.1,
6.6, 6.12, 7.0).

### Step 3.2: FOLLOW THE FIXES: TAG
The v2 submission HAD `Fixes: 5fdeb0d372ab` but it was removed at Simon
Horman's request. The original buggy commit 5fdeb0d372ab ("Implement
outgoing flow control") is present in v5.6+ and all active stable trees.

### Step 3.3: CHECK FILE HISTORY
Recent changes to `af_qrtr.c` are unrelated refactoring (xarray
conversion, treewide changes, proto_ops changes). No recent endianness
fixes.

### Step 3.4: CHECK THE AUTHOR'S OTHER COMMITS
Alexander Wilhelm from Westermo has a clear pattern of fixing endianness
bugs in Qualcomm subsystems: QMI encoding/decoding, MHI BHI vector
table, ath12k QMI data. This is part of an effort to make Qualcomm
subsystems work on big-endian platforms.

### Step 3.5: CHECK FOR DEPENDENT/PREREQUISITE COMMITS
None. The fix applies directly to the original buggy lines without any
prerequisites.

---

## PHASE 4: MAILING LIST AND EXTERNAL RESEARCH

### Step 4.1: FIND THE ORIGINAL PATCH DISCUSSION
Found via yhbt.net mirror of lore.kernel.org. The patch went through 3
versions:
- **v1** (2026-03-20): Initial submission with Fixes tag, targeted at
  `net`
- **v2** (2026-03-24): Rebase on latest net tree, improved commit
  message, still had Fixes tag
- **v3** (2026-03-26): Rebase on `net-next`, Fixes tag removed at Simon
  Horman's request

### Step 4.2: KEY REVIEWER FEEDBACK
**Simon Horman** (netdev reviewer): "But as this isn't strictly
necessary let's target net-next and drop the Fixes tag." This is a
**negative signal** for stable backport — the netdev reviewer explicitly
downgraded from fix to enhancement.

**Manivannan Sadhasivam** (QRTR maintainer) disagreed: "FWIW: Adding
Fixes tag doesn't mean that the patch should be queued for -rcS." Mani
thought the Fixes tag was appropriate.

### Step 4.3: BUG REPORT
No external bug report. The author found this during systematic
endianness auditing.

### Step 4.4: RELATED PATCHES
This is a standalone fix. Not part of a series.

### Step 4.5: STABLE MAILING LIST HISTORY
No stable-specific discussion found.

---

## PHASE 5: CODE SEMANTIC ANALYSIS

### Step 5.1: IDENTIFY KEY FUNCTIONS
- `qrtr_node_enqueue()` — TX path
- `qrtr_endpoint_post()` — RX path

### Step 5.2: TRACE CALLERS
- `qrtr_node_enqueue()` is called from: `qrtr_sendmsg()` (the main
  sendmsg path), `qrtr_send_resume_tx()`, and broadcast path. It's the
  core TX function.
- `qrtr_endpoint_post()` is called from: MHI driver (`qrtr_mhi.c`), SMD
  driver (`qrtr_smd.c`), tun driver (`qrtr_tun.c`). It's the core RX
  entry point — called for EVERY incoming QRTR packet.

### Step 5.3-5.4: CALL CHAIN
`qrtr_endpoint_post()` is called directly from hardware transport
drivers on every received packet. `qrtr_node_enqueue()` is called on
every transmitted packet. Both are hot-path functions.

### Step 5.5: SIMILAR PATTERNS
All other fields in `qrtr_hdr_v1` already use proper endian conversions.
`confirm_rx` was the only one missed.

---

## PHASE 6: CROSS-REFERENCING AND STABLE TREE ANALYSIS

### Step 6.1: DOES THE BUGGY CODE EXIST IN STABLE TREES?
The TX bug (5fdeb0d372ab) exists in **v5.6+**, so all active stable
trees: 5.10.y, 5.15.y, 6.1.y, 6.6.y, 6.12.y.
The RX bug (194ccc88297ae) exists since **v4.15**.

### Step 6.2: BACKPORT COMPLICATIONS
The code at these two lines has not changed since introduction. The
patch should apply cleanly to all active stable trees.

### Step 6.3: RELATED FIXES ALREADY IN STABLE
None found.

---

## PHASE 7: SUBSYSTEM AND MAINTAINER CONTEXT

### Step 7.1: SUBSYSTEM CRITICALITY
**net/qrtr** — Qualcomm IPC Router, used for communication between Linux
and Qualcomm firmware (modem, WiFi, etc.).
Criticality: **PERIPHERAL** — affects users of Qualcomm SoC platforms
running big-endian kernels (very niche). Qualcomm SoCs are little-endian
ARM, so the primary users are unaffected.

### Step 7.2: SUBSYSTEM ACTIVITY
Moderate activity — mostly maintenance fixes, not heavy development.

---

## PHASE 8: IMPACT AND RISK ASSESSMENT

### Step 8.1: WHO IS AFFECTED
Only big-endian hosts that use QRTR. This is extremely niche — Qualcomm
SoCs are LE ARM. However, Westermo (author's company) apparently runs BE
systems with QRTR, and there could be other embedded platforms.

### Step 8.2: TRIGGER CONDITIONS
Every QRTR data transmission on a big-endian host. The TX side stores
the wrong endianness, which means the remote end receives a malformed
`confirm_rx` value. The RX side is actually mitigated by `!!` (any
non-zero value normalizes to 1).

### Step 8.3: FAILURE MODE SEVERITY
On big-endian hosts: The flow control mechanism (confirm_rx/resume_tx)
breaks. The TX side sends `confirm_rx` in wrong byte order. If the
remote firmware compares `confirm_rx` directly to 1 (rather than using
`!!`), it won't send RESUME_TX messages, causing the sender to
eventually block or exhaust remote resources.
Severity: **MEDIUM** — broken flow control on a niche platform.
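
The difference between the two receiver behaviours mentioned above can
be sketched as follows. Both functions are hypothetical illustrations;
whether any real remote firmware performs the strict comparison is not
established by this analysis.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical strict peer: only an exact 1 triggers a RESUME_TX. */
static bool strict_peer_confirms(uint32_t decoded_confirm_rx)
{
	return decoded_confirm_rx == 1;
}

/* Tolerant peer: any non-zero value counts, matching the '!!' idiom. */
static bool tolerant_peer_confirms(uint32_t decoded_confirm_rx)
{
	return !!decoded_confirm_rx;
}
```

With the byte-swapped value `0x01000000`, the strict variant would not
confirm while the tolerant variant would, so the observable impact
depends on how the remote side interprets the field.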

### Step 8.4: RISK-BENEFIT RATIO
- **Benefit**: Fixes real protocol-level correctness bug on BE hosts.
  Low impact population but real for those affected.
- **Risk**: Essentially zero. `cpu_to_le32`/`le32_to_cpu` are no-ops on
  LE. The fix follows the established pattern used by every other field
  in the same struct.

---

## PHASE 9: FINAL SYNTHESIS

### Step 9.1: COMPILE THE EVIDENCE

**FOR backporting:**
- Real endianness correctness bug
- Extremely small fix (2 lines), obviously correct
- Zero regression risk on LE systems (no-op), fixes BE systems
- Follows the identical pattern of all adjacent header fields
- Present in all stable trees since v5.6
- Reviewed by 2 reviewers, merged by David Miller
- Manivannan Sadhasivam (QRTR maintainer) considered the Fixes tag
  appropriate

**AGAINST backporting:**
- Simon Horman explicitly said "not strictly necessary," asked to drop
  Fixes tag and target net-next
- Big-endian QRTR usage is extremely niche
- The `!!` on the RX path mitigates the receive side (only TX is truly
  broken)
- No user bug report — found during code audit

### Step 9.2: STABLE RULES CHECKLIST
1. Obviously correct and tested? **Yes** — standard endian conversion
   pattern
2. Fixes a real bug? **Yes** — broken wire protocol on BE hosts
3. Important issue? **Borderline** — broken flow control, but only on BE
   hosts using QRTR
4. Small and contained? **Yes** — 2 lines in 1 file
5. No new features? **Correct**
6. Can apply to stable? **Yes** — clean apply expected

### Step 9.3: EXCEPTION CATEGORIES
None applicable.

### Step 9.4: DECISION
This is a genuine endianness bug fix with zero regression risk. While
the affected population is niche, the fix is so small and obviously
correct that the risk/benefit ratio strongly favors inclusion. The
netdev reviewer's pushback was about targeting net vs. net-next (merge
window process), not about whether the fix is correct. The QRTR
maintainer (Mani) disagreed with dropping the Fixes tag. For stable
users on big-endian platforms running QRTR, this fixes a real
protocol-level correctness issue in the flow control mechanism.

---

## Verification

- [Phase 1] Parsed tags: Reviewed-by from Manivannan Sadhasivam and
  Simon Horman, SOB from David S. Miller (merger)
- [Phase 2] Diff analysis: 2 lines changed — adding `cpu_to_le32()` on
  TX, `le32_to_cpu()` on RX, matching pattern of all other fields
- [Phase 2] Verified `qrtr_hdr_v1.confirm_rx` is declared as `__le32`
  (line 44 of af_qrtr.c)
- [Phase 3] git blame: TX line introduced by 5fdeb0d372ab (v5.6,
  2020-01-13), RX line by 194ccc88297ae (v4.15, 2017-10-10)
- [Phase 3] git merge-base: Confirmed 5fdeb0d372ab is in v5.10, v5.15,
  v6.1, v6.6 (all active stable trees)
- [Phase 3] Author check: Alexander Wilhelm has 7 commits all fixing
  Qualcomm endianness bugs
- [Phase 4] Mailing list (yhbt.net mirror): Found full v2 thread. Simon
  Horman said "not strictly necessary," Mani disagreed
- [Phase 4] Patch went v1->v2->v3; v3 dropped Fixes tag, targeted
  net-next at reviewer request
- [Phase 5] Callers verified: `qrtr_node_enqueue` is core TX path,
  `qrtr_endpoint_post` is core RX entry point (EXPORT_SYMBOL_GPL)
- [Phase 5] Verified all other `qrtr_hdr_v1` fields use proper endian
  conversions — only `confirm_rx` was missed
- [Phase 6] Code is unchanged at buggy lines since introduction — clean
  apply expected
- [Phase 8] Risk assessment: zero risk on LE (no-op conversions), fixes
  correctness on BE
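
The effect of the two conversions can be sketched in portable userspace
C. This is only an illustration: `sketch_cpu_to_le32()`,
`sketch_le32_to_cpu()` and `first_wire_byte()` are hypothetical
stand-ins written for this note, not the kernel helpers (which compile
to no-ops on little-endian hosts).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for cpu_to_le32(): lay the value out in memory
 * least-significant byte first, whatever the host byte order is. */
static uint32_t sketch_cpu_to_le32(uint32_t v)
{
	uint8_t b[4] = {
		(uint8_t)(v & 0xff),
		(uint8_t)((v >> 8) & 0xff),
		(uint8_t)((v >> 16) & 0xff),
		(uint8_t)((v >> 24) & 0xff),
	};
	uint32_t wire;

	memcpy(&wire, b, sizeof(wire));
	return wire;
}

/* Hypothetical stand-in for le32_to_cpu(): the inverse of the above. */
static uint32_t sketch_le32_to_cpu(uint32_t wire)
{
	uint8_t b[4];

	memcpy(b, &wire, sizeof(b));
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* First byte of the field as it would appear in the header on the wire. */
static uint8_t first_wire_byte(uint32_t wire)
{
	uint8_t b[4];

	memcpy(b, &wire, sizeof(b));
	return b[0];
}
```

With these helpers, `first_wire_byte(sketch_cpu_to_le32(!!confirm_rx))`
is 1 on any host when the flag is set, whereas storing `!!confirm_rx`
directly would put the 1 in the last byte on a big-endian host.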

**YES**

 net/qrtr/af_qrtr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/qrtr/af_qrtr.c b/net/qrtr/af_qrtr.c
index d77e9c8212da5..7cec6a7859b03 100644
--- a/net/qrtr/af_qrtr.c
+++ b/net/qrtr/af_qrtr.c
@@ -361,7 +361,7 @@ static int qrtr_node_enqueue(struct qrtr_node *node, struct sk_buff *skb,
 	}
 
 	hdr->size = cpu_to_le32(len);
-	hdr->confirm_rx = !!confirm_rx;
+	hdr->confirm_rx = cpu_to_le32(!!confirm_rx);
 
 	rc = skb_put_padto(skb, ALIGN(len, 4) + sizeof(*hdr));
 
@@ -462,7 +462,7 @@ int qrtr_endpoint_post(struct qrtr_endpoint *ep, const void *data, size_t len)
 		cb->type = le32_to_cpu(v1->type);
 		cb->src_node = le32_to_cpu(v1->src_node_id);
 		cb->src_port = le32_to_cpu(v1->src_port_id);
-		cb->confirm_rx = !!v1->confirm_rx;
+		cb->confirm_rx = !!le32_to_cpu(v1->confirm_rx);
 		cb->dst_node = le32_to_cpu(v1->dst_node_id);
 		cb->dst_port = le32_to_cpu(v1->dst_port_id);
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH AUTOSEL 6.18] netfilter: xt_multiport: validate range encoding in checkentry
       [not found] <20260420131539.986432-1-sashal@kernel.org>
                   ` (2 preceding siblings ...)
  2026-04-20 13:07 ` [PATCH AUTOSEL 7.0-5.10] net: qrtr: fix endian handling of confirm_rx field Sasha Levin
@ 2026-04-20 13:08 ` Sasha Levin
  2026-04-20 13:08 ` [PATCH AUTOSEL 6.18] ice: ptp: don't WARN when controlling PF is unavailable Sasha Levin
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2026-04-20 13:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Ren Wei, Yifan Wu, Juefei Pu, Yuan Tan, Xin Liu, Yuhang Zheng,
	Florian Westphal, Sasha Levin, pablo, kadlec, davem, edumazet,
	kuba, pabeni, yasuyuki.kozakai, kaber, netfilter-devel, coreteam,
	netdev, linux-kernel

From: Ren Wei <n05ec@lzu.edu.cn>

[ Upstream commit ff64c5bfef12461df8450e0f50bb693b5269c720 ]

ports_match_v1() treats any non-zero pflags entry as the start of a
port range and unconditionally consumes the next ports[] element as
the range end.

The checkentry path currently validates protocol, flags and count, but
it does not validate the range encoding itself. As a result, malformed
rules can mark the last slot as a range start or place two range starts
back to back, leaving ports_match_v1() to step past the last valid
ports[] element while interpreting the rule.

Reject malformed multiport v1 rules in checkentry by validating that
each range start has a following element and that the following element
is not itself marked as another range start.

Fixes: a89ecb6a2ef7 ("[NETFILTER]: x_tables: unify IPv4/IPv6 multiport match")
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Co-developed-by: Yuan Tan <yuantan098@gmail.com>
Signed-off-by: Yuan Tan <yuantan098@gmail.com>
Suggested-by: Xin Liu <bird@lzu.edu.cn>
Tested-by: Yuhang Zheng <z1652074432@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Error: Failed to generate final synthesis

 net/netfilter/xt_multiport.c | 34 ++++++++++++++++++++++++++++++----
 1 file changed, 30 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/xt_multiport.c b/net/netfilter/xt_multiport.c
index 44a00f5acde8a..a1691ff405d3c 100644
--- a/net/netfilter/xt_multiport.c
+++ b/net/netfilter/xt_multiport.c
@@ -105,6 +105,28 @@ multiport_mt(const struct sk_buff *skb, struct xt_action_param *par)
 	return ports_match_v1(multiinfo, ntohs(pptr[0]), ntohs(pptr[1]));
 }
 
+static bool
+multiport_valid_ranges(const struct xt_multiport_v1 *multiinfo)
+{
+	unsigned int i;
+
+	for (i = 0; i < multiinfo->count; i++) {
+		if (!multiinfo->pflags[i])
+			continue;
+
+		if (++i >= multiinfo->count)
+			return false;
+
+		if (multiinfo->pflags[i])
+			return false;
+
+		if (multiinfo->ports[i - 1] > multiinfo->ports[i])
+			return false;
+	}
+
+	return true;
+}
+
 static inline bool
 check(u_int16_t proto,
       u_int8_t ip_invflags,
@@ -127,8 +149,10 @@ static int multiport_mt_check(const struct xt_mtchk_param *par)
 	const struct ipt_ip *ip = par->entryinfo;
 	const struct xt_multiport_v1 *multiinfo = par->matchinfo;
 
-	return check(ip->proto, ip->invflags, multiinfo->flags,
-		     multiinfo->count) ? 0 : -EINVAL;
+	if (!check(ip->proto, ip->invflags, multiinfo->flags, multiinfo->count))
+		return -EINVAL;
+
+	return multiport_valid_ranges(multiinfo) ? 0 : -EINVAL;
 }
 
 static int multiport_mt6_check(const struct xt_mtchk_param *par)
@@ -136,8 +160,10 @@ static int multiport_mt6_check(const struct xt_mtchk_param *par)
 	const struct ip6t_ip6 *ip = par->entryinfo;
 	const struct xt_multiport_v1 *multiinfo = par->matchinfo;
 
-	return check(ip->proto, ip->invflags, multiinfo->flags,
-		     multiinfo->count) ? 0 : -EINVAL;
+	if (!check(ip->proto, ip->invflags, multiinfo->flags, multiinfo->count))
+		return -EINVAL;
+
+	return multiport_valid_ranges(multiinfo) ? 0 : -EINVAL;
 }
 
 static struct xt_match multiport_mt_reg[] __read_mostly = {
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH AUTOSEL 6.18] ice: ptp: don't WARN when controlling PF is unavailable
       [not found] <20260420131539.986432-1-sashal@kernel.org>
                   ` (3 preceding siblings ...)
  2026-04-20 13:08 ` [PATCH AUTOSEL 6.18] netfilter: xt_multiport: validate range encoding in checkentry Sasha Levin
@ 2026-04-20 13:08 ` Sasha Levin
  2026-04-20 13:08 ` [PATCH AUTOSEL 6.18] e1000: check return value of e1000_read_eeprom Sasha Levin
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2026-04-20 13:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Kohei Enju, Aleksandr Loktionov, Tony Nguyen, Sasha Levin,
	jesse.brandeburg, davem, edumazet, kuba, pabeni, horms,
	przemyslaw.kitszel, sergey.temerkhanov, intel-wired-lan, netdev,
	linux-kernel

From: Kohei Enju <kohei@enjuk.jp>

[ Upstream commit bb3f21edc7056cdf44a7f7bd7ba65af40741838c ]

In VFIO passthrough setups, it is possible to pass through only a PF
which doesn't own the source timer. In that case the PTP controlling PF
(adapter->ctrl_pf) is never initialized in the VM, so ice_get_ctrl_ptp()
returns NULL and triggers WARN_ON() in ice_ptp_setup_pf().

Since this is an expected behavior in that configuration, replace
WARN_ON() with an informational message and return -EOPNOTSUPP.

Fixes: e800654e85b5 ("ice: Use ice_adapter for PTP shared data instead of auxdev")
Signed-off-by: Kohei Enju <kohei@enjuk.jp>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Error: Failed to generate final synthesis

 drivers/net/ethernet/intel/ice/ice_ptp.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index df38345b12d72..02517772fb5f4 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -3041,7 +3041,13 @@ static int ice_ptp_setup_pf(struct ice_pf *pf)
 	struct ice_ptp *ctrl_ptp = ice_get_ctrl_ptp(pf);
 	struct ice_ptp *ptp = &pf->ptp;
 
-	if (WARN_ON(!ctrl_ptp) || pf->hw.mac_type == ICE_MAC_UNKNOWN)
+	if (!ctrl_ptp) {
+		dev_info(ice_pf_to_dev(pf),
+			 "PTP unavailable: no controlling PF\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (pf->hw.mac_type == ICE_MAC_UNKNOWN)
 		return -ENODEV;
 
 	INIT_LIST_HEAD(&ptp->port.list_node);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH AUTOSEL 6.18] e1000: check return value of e1000_read_eeprom
       [not found] <20260420131539.986432-1-sashal@kernel.org>
                   ` (4 preceding siblings ...)
  2026-04-20 13:08 ` [PATCH AUTOSEL 6.18] ice: ptp: don't WARN when controlling PF is unavailable Sasha Levin
@ 2026-04-20 13:08 ` Sasha Levin
  2026-04-20 13:08 ` [PATCH AUTOSEL 7.0-6.6] bpf, sockmap: Annotate af_unix sock:: Sk_state data-races Sasha Levin
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2026-04-20 13:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Agalakov Daniil, Iskhakov Daniil, Aleksandr Loktionov,
	Tony Nguyen, Sasha Levin, jesse.brandeburg, davem, edumazet, kuba,
	pabeni, intel-wired-lan, netdev, linux-kernel

From: Agalakov Daniil <ade@amicon.ru>

[ Upstream commit d3baa34a470771399c1495bc04b1e26ac15d598e ]

[Why]
e1000_set_eeprom() performs a read-modify-write operation when the write
range is not word-aligned. This requires reading the first and last words
of the range from the EEPROM to preserve the unmodified bytes.

However, the code does not check the return value of e1000_read_eeprom().
If the read fails, the operation continues using uninitialized data from
eeprom_buff. This results in corrupted data being written back to the
EEPROM for the boundary words.

Add the missing error checks and abort the operation if reading fails.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Co-developed-by: Iskhakov Daniil <dish@amicon.ru>
Signed-off-by: Iskhakov Daniil <dish@amicon.ru>
Signed-off-by: Agalakov Daniil <ade@amicon.ru>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Error: Failed to generate final synthesis

 drivers/net/ethernet/intel/e1000/e1000_ethtool.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
index 726365c567ef3..75d0bfa7530b4 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
@@ -496,14 +496,19 @@ static int e1000_set_eeprom(struct net_device *netdev,
 		 */
 		ret_val = e1000_read_eeprom(hw, first_word, 1,
 					    &eeprom_buff[0]);
+		if (ret_val)
+			goto out;
+
 		ptr++;
 	}
-	if (((eeprom->offset + eeprom->len) & 1) && (ret_val == 0)) {
+	if ((eeprom->offset + eeprom->len) & 1) {
 		/* need read/modify/write of last changed EEPROM word
 		 * only the first byte of the word is being modified
 		 */
 		ret_val = e1000_read_eeprom(hw, last_word, 1,
 					    &eeprom_buff[last_word - first_word]);
+		if (ret_val)
+			goto out;
 	}
 
 	/* Device's eeprom is always little-endian, word addressable */
@@ -522,6 +527,7 @@ static int e1000_set_eeprom(struct net_device *netdev,
 	if ((ret_val == 0) && (first_word <= EEPROM_CHECKSUM_REG))
 		e1000_update_eeprom_checksum(hw);
 
+out:
 	kfree(eeprom_buff);
 	return ret_val;
 }
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH AUTOSEL 7.0-6.6] bpf, sockmap: Annotate af_unix sock:: Sk_state data-races
       [not found] <20260420131539.986432-1-sashal@kernel.org>
                   ` (5 preceding siblings ...)
  2026-04-20 13:08 ` [PATCH AUTOSEL 6.18] e1000: check return value of e1000_read_eeprom Sasha Levin
@ 2026-04-20 13:08 ` Sasha Levin
  2026-04-20 13:08 ` [PATCH AUTOSEL 7.0-6.18] net: wangxun: reorder timer and work sync cancellations Sasha Levin
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2026-04-20 13:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Michal Luczaj, Kuniyuki Iwashima, Martin KaFai Lau,
	Martin KaFai Lau, Jiayuan Chen, Sasha Levin, john.fastabend,
	jakub, davem, edumazet, kuba, pabeni, netdev, bpf, linux-kernel

From: Michal Luczaj <mhal@rbox.co>

[ Upstream commit a25566084e391348385a72dd507e0cc0c268dd5d ]

sock_map_sk_state_allowed() and sock_map_redirect_allowed() read af_unix
socket sk_state locklessly.

Use READ_ONCE(). Note that for sock_map_redirect_allowed() change affects
not only af_unix, but all non-TCP sockets (UDP, af_vsock).

Suggested-by: Kuniyuki Iwashima <kuniyu@google.com>
Suggested-by: Martin KaFai Lau <martin.lau@linux.dev>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260414-unix-proto-update-null-ptr-deref-v4-1-2af6fe97918e@rbox.co
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

## Phase 1: Commit Message Forensics

**Step 1.1 - Subject Line Parse:**
- Subsystem: `bpf, sockmap:`
- Action verb: "Annotate" (not "fix")
- Summary: Adds READ_ONCE() to two lockless reads of af_unix sk_state in
  sock_map.c

Record: bpf/sockmap subsystem, annotation (data race), adds READ_ONCE()
to sk_state reads

**Step 1.2 - Tags:**
- `Suggested-by: Kuniyuki Iwashima` (networking/af_unix expert)
- `Suggested-by: Martin KaFai Lau` (BPF maintainer)
- `Signed-off-by: Michal Luczaj` (original author)
- `Signed-off-by: Martin KaFai Lau` (maintainer applied)
- `Reviewed-by: Jiayuan Chen`
- `Reviewed-by: Kuniyuki Iwashima`
- `Link:` to v4 on patch.msgid.link
- **No** Fixes: tag
- **No** Cc: stable
- **No** syzbot/KCSAN report

Record: Strong review endorsement (Kuniyuki both suggested and reviewed
the change; Martin suggested and applied it); no Fixes:/stable tags; no
concrete bug report cited.

**Step 1.3 - Body Text:**
The commit explicitly acknowledges that `sock_map_sk_state_allowed()`
and `sock_map_redirect_allowed()` read sk_state "locklessly". Change
uses READ_ONCE(). Notes the redirect_allowed change also affects UDP and
af_vsock. No crash/panic/reproducer described in body — it's purely a
data race annotation.

Record: Describes data race (paired writer uses WRITE_ONCE); no user-
visible symptom documented.

**Step 1.4 - Hidden fix?**
"Annotate data-races" is standard kernel terminology for adding
READ_ONCE/WRITE_ONCE pairings. This is a recognized synchronization bug
fix pattern per KCSAN/C11 memory model, even without a concrete crash.

## Phase 2: Diff Analysis

**Step 2.1 - Inventory:** Single file `net/core/sock_map.c`, 2 lines
changed (2 insertions, 2 deletions). Two functions modified:
`sock_map_redirect_allowed()` and `sock_map_sk_state_allowed()`. Minimal
surgical scope.

**Step 2.2 - Code flow:** Runtime logic is unchanged. READ_ONCE() forces
a single volatile load, so the compiler cannot tear, fuse, or repeat the
lockless read of sk_state.

**Step 2.3 - Bug mechanism:** Category (b) Synchronization. This is a
completion of a WRITE_ONCE/READ_ONCE pair: writer side at
`net/unix/af_unix.c:1775` uses `WRITE_ONCE(sk->sk_state,
TCP_ESTABLISHED)` in `unix_stream_connect()`, but sock_map.c readers
were plain reads — a data race per kernel/C11 rules.

**Step 2.4 - Fix quality:** Obviously correct, with negligible
regression risk: READ_ONCE() is a single volatile load (not a general
compiler barrier) and adds no runtime cost on aligned reads.
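
The annotated check can be sketched in standalone C. Everything below is
a simplified assumption made for illustration: `struct sketch_sock`,
`sketch_read_once()` and `sketch_state_allowed()` are hypothetical
names, and the real READ_ONCE() macro is generic over types rather than
fixed to one field type.

```c
#include <stdbool.h>

/* Values mirror the kernel's TCP state enum; restated here so the
 * sketch stands alone. */
enum { SKETCH_TCP_ESTABLISHED = 1, SKETCH_TCP_LISTEN = 10 };
#define SKETCH_TCPF_ESTABLISHED (1 << SKETCH_TCP_ESTABLISHED)

/* Hypothetical miniature of struct sock: only the raced-on field. */
struct sketch_sock {
	unsigned char sk_state;
};

/* Simplified READ_ONCE() for one field type: a single volatile load,
 * so the compiler may not tear, fuse, or repeat the access. */
static unsigned char sketch_read_once(const unsigned char *p)
{
	return *(const volatile unsigned char *)p;
}

/* Same shape as the af_unix branch of sock_map_sk_state_allowed(). */
static bool sketch_state_allowed(const struct sketch_sock *sk)
{
	return (1 << sketch_read_once(&sk->sk_state)) &
	       SKETCH_TCPF_ESTABLISHED;
}
```

The check accepts a socket in the established state and rejects one in
the listen state; only the way the state is loaded differs from a plain
read.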

## Phase 3: Git History

**Step 3.1 - Blame:** The affected lines originated from:
- `sock_map_redirect_allowed`: commit `122e6c79efe1c2` (Cong Wang, 2021)
- `sock_map_sk_state_allowed` af_unix branch: commit `8d6650646ce49e`
  (John Fastabend, Dec 2023 - fixing syzkaller null ptr deref in
  unix_bpf)

The af_unix branch of sock_map_sk_state_allowed was added in v6.8 (Dec
2023).

**Step 3.2 - Fixes tag:** No Fixes: tag on this patch. The series' patch
5/5 has `Fixes: c63829182c37 ("af_unix: Implement
->psock_update_sk_prot()")` — the null-ptr-deref is fixed there, not
here.

**Step 3.3 - File history:** Recent active development on sock_map.c.
This is patch 1/5 of a series fixing a null-ptr-deref that crashes via
`unix_stream_bpf_update_proto+0xa0`. Patch 5/5 is the actual crash fix.

**Step 3.4 - Author's role:** Michal Luczaj is an active vsock/unix
contributor. Martin KaFai Lau (BPF maintainer) applied it. Kuniyuki
Iwashima is the af_unix expert.

**Step 3.5 - Dependencies:** This patch is independent of the rest of
the series — it touches separate code than patches 2-5. No dependencies.

## Phase 4: Mailing List Research

**Step 4.1 - Original submission:** Found v3 on lore/yhbt and v4
referenced in the commit (20260414-unix-proto-update-null-ptr-deref-v4).
Series: `[PATCH bpf v3 0/5] bpf, sockmap: Fix af_unix null-ptr-deref in
proto update`.

**Step 4.2 - Reviewers:** The thread involved networking maintainers
(Paolo, Jakub, Kuniyuki, Eric Dumazet) and BPF maintainers (Martin,
Alexei, Daniel). af_unix expert Kuniyuki explicitly reviewed and ACKed
this patch.

**Step 4.3 - Bug report:** The series is motivated by a NULL ptr deref
crash (shown in patch 5/5's commit message) but THIS specific patch has
no explicit crash reporter. Kuniyuki notes: "Actually TCP path also
needs READ_ONCE(), but I think it's okay for now since this series
focuses on AF_UNIX" — confirming this is the known-pattern data race
annotation.

**Step 4.4 - Series context:** This is 1/5 of a series:
1. (this one) READ_ONCE annotations
2. Refactor to sock_map_sk_{acquire,release}() helpers
3. Fix af_unix iter deadlock (has Fixes: tag)
4. Selftest
5. Adapt sockmap for af_unix locking (has Fixes: tag — the actual null-
   ptr-deref fix)

**Step 4.5 - Stable discussion:** No explicit stable nomination. Author
himself noted patch 5/5's locking would make this READ_ONCE redundant
for the af_unix path, but the patch was kept as a minimal standalone
hardening.

## Phase 5: Code Semantic Analysis

**Step 5.1 - Functions modified:** `sock_map_redirect_allowed()`,
`sock_map_sk_state_allowed()`

**Step 5.2 - Callers:**
- `sock_map_redirect_allowed()`: 4 callers — `bpf_sk_redirect_map`,
  `bpf_msg_redirect_map`, `bpf_sk_redirect_hash`,
  `bpf_msg_redirect_hash`. Called from BPF programs at runtime (hot
  path).
- `sock_map_sk_state_allowed()`: 2 callers in `sock_map_update_elem_sys`
  and `sock_map_update_elem` — invoked on BPF_MAP_UPDATE_ELEM syscall.

**Step 5.3 - Callees:** Just reads sk->sk_state and does bitmask
comparison.

**Step 5.4 - Reachability:** Reachable from userspace via bpf() syscall
(BPF_MAP_UPDATE_ELEM) and from BPF programs redirecting sockets —
CONFIRMED reachable.

**Step 5.5 - Similar patterns:** Found the same pattern in
`net/unix/diag.c` (commit `0aa3be7b3e1f8 "af_unix: Annotate data-races
around sk->sk_state in UNIX_DIAG"`) — had Fixes: tags and went to
**ALL** stable trees: 5.10, 5.15, 6.1, 6.6, 6.12, 6.17, 6.18. Strong
precedent.

## Phase 6: Cross-referencing Stable Trees

**Step 6.1 - Code in stable:**
- `sock_map_redirect_allowed()`: Present in ALL active stable trees
  (5.10, 5.15, 6.1, 6.6, 6.12, 6.17, 6.18)
- `sock_map_sk_state_allowed()` af_unix branch (hunk 2): Only in 6.6.y
  and newer (added upstream in v6.8 by 8d6650646ce49 and backported to
  6.6.y)

**Step 6.2 - Backport complications:** Hunk 1
(sock_map_redirect_allowed) applies to all trees. Hunk 2
(sock_map_sk_state_allowed af_unix branch) only applies to 6.6+ where
the af_unix branch exists. Minor adjustment for older trees (drop hunk
2). Clean for 6.6+.

**Step 6.3 - Related fixes in stable:** Precedent 0aa3be7b3e1f8 is in
ALL stable trees — shows the same type of annotation is routinely
accepted.

## Phase 7: Subsystem Context

**Step 7.1 - Subsystem:** `net/core/sock_map.c` = networking core + BPF.
IMPORTANT criticality (BPF sockmap is used by eBPF-based networking
software such as Cilium).

**Step 7.2 - Activity:** Actively developed subsystem with recent bug
fixes.

## Phase 8: Impact Assessment

**Step 8.1 - Affected users:** BPF sockmap users on systems with af_unix
BPF usage, and any users with BPF programs using
sk_redirect/msg_redirect on non-TCP sockets.

**Step 8.2 - Trigger:** Concurrent BPF_MAP_UPDATE_ELEM on a socket
that's undergoing state change (e.g., unix_stream_connect). On most
archs with aligned int reads, load tearing is unlikely, but compiler
fusion/reordering is possible. KCSAN would flag this.

**Step 8.3 - Failure mode:** Without READ_ONCE, the risk is theoretical
compiler-induced misbehavior (torn or fused reads of sk_state leading to
wrong state checks). The concrete null-ptr-deref described in the series
is NOT fixed by this patch alone; that is fixed by patch 5/5. Severity:
LOW-MEDIUM for this patch in isolation.

**Step 8.4 - Risk/benefit:**
- Risk: ~zero (READ_ONCE is a single volatile load; no runtime change on
  aligned reads)
- Benefit: Completes WRITE_ONCE/READ_ONCE pair, silences KCSAN, hardens
  against compiler edge cases

## Phase 9: Final Synthesis

**Evidence FOR backporting:**
- Fixes a genuine data race (plain read paired with WRITE_ONCE on the
  writer)
- Very small, surgical, obviously correct (2 lines)
- Strong reviewer endorsement (Kuniyuki, Jiayuan, Martin)
- Strong precedent: `0aa3be7b3e1f8` (identical pattern) was accepted
  into ALL stable trees with Fixes: tags
- Zero regression risk

**Evidence AGAINST backporting:**
- No Fixes: tag on THIS specific patch
- No Cc: stable tag
- No concrete syzbot/KCSAN report cited
- By itself, does NOT fix the null-ptr-deref (that's patch 5/5)
- Author acknowledged patch 5/5's proper locking would make this
  redundant on af_unix path
- "Annotation only" — preventive rather than curative for the crash

**Stable Rules checklist:**
1. Obviously correct and tested? YES (trivial READ_ONCE; reviewed by
   multiple maintainers)
2. Fixes a real bug? YES (data race per C11/kernel rules) but not a
   user-visible crash alone
3. Important issue? MEDIUM (data race without concrete trigger)
4. Small and contained? YES (2 lines)
5. No new features/APIs? YES
6. Applies to stable? Mostly — hunk 2 needs 6.6+ but hunk 1 applies
   everywhere

**Decision analysis:** This is a borderline case. The patch itself is
pure annotation with no concrete user-visible crash fix in isolation.
However, the precedent (0aa3be7b3e1f8) strongly shows this annotation
pattern IS routinely accepted into stable, including down to 5.10.y. The
patch is 2 lines, zero-risk, and completes a real WRITE_ONCE/READ_ONCE
pair. The Kuniyuki review noting "TCP path also needs READ_ONCE()"
indicates this is a recognized pattern.

The balance favors backporting, given:
- The precedent of identical-pattern commits going to all stable trees
- Zero regression risk
- Real (if benign) data race being fixed
- Strong review from domain experts

## Verification

- [Phase 1] Parsed tags: no Fixes:, no Cc: stable, but Reviewed-by from
  Kuniyuki Iwashima and Jiayuan Chen, and applied by BPF maintainer
  Martin KaFai Lau
- [Phase 2] Diff analysis: exactly 2 lines changed, adds READ_ONCE() to
  2 lockless sk_state reads
- [Phase 2] Verified writer pairs with WRITE_ONCE at
  net/unix/af_unix.c:1556, 1775 in unix_stream_connect
- [Phase 3] git blame: affected code introduced by 122e6c79efe1c2 (2021)
  and 8d6650646ce49e (Dec 2023, v6.8)
- [Phase 3] git show c63829182c37: confirmed Fixes: target from patch
  5/5 is from 2021 (widely in stable)
- [Phase 4] Found lore discussion on yhbt.net showing full series thread
- [Phase 4] Confirmed this is patch 1/5; real null-ptr-deref crash is
  fixed by patch 5/5
- [Phase 4] Kuniyuki review comment: "Actually TCP path also needs
  READ_ONCE()" — confirms this is an incremental fix of a broader race
  pattern
- [Phase 5] grep sock_map_redirect_allowed: 4 callers in sockmap BPF
  redirect hooks
- [Phase 5] grep sock_map_sk_state_allowed: 2 callers in
  BPF_MAP_UPDATE_ELEM paths
- [Phase 5] Found precedent commit 0aa3be7b3e1f8 ("af_unix: Annotate
  data-races around sk->sk_state in UNIX_DIAG") — same pattern
- [Phase 6] Verified code presence in stable trees: redirect_allowed
  exists in all; state_allowed af_unix branch only in 6.6+ (6.6, 6.12,
  6.17, 6.18 checked)
- [Phase 6] Confirmed precedent 0aa3be7b3e1f8 backported to
  linux-5.10.y, 5.15.y, 6.1.y, 6.6.y, 6.12.y, 6.17.y, 6.18.y — all
  active stable trees
- [Phase 8] Failure mode: data race (compiler-level), severity LOW-
  MEDIUM in isolation
- UNVERIFIED: Whether the v4 of the series has been applied to mainline
  (appears not yet, as I couldn't locate the commit SHA in local
  mainline snapshot — checked bpf-next and linus-next/master)
- UNVERIFIED: Whether the rest of the series (patches 2-5) will also go
  through autoselection

The patch is 2 lines of low-risk data race annotation with strong
precedent for acceptance into all active stable trees. While it doesn't
fix the null-ptr-deref in isolation (that's patch 5/5), it genuinely
completes a WRITE_ONCE/READ_ONCE pair and silences KCSAN. Stable
maintainers routinely accept these annotations, as evidenced by
`0aa3be7b3e1f8` being in every active stable tree including 5.10.y.

**YES**

 net/core/sock_map.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index b0e96337a2698..02a68be3002a2 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -530,7 +530,7 @@ static bool sock_map_redirect_allowed(const struct sock *sk)
 	if (sk_is_tcp(sk))
 		return sk->sk_state != TCP_LISTEN;
 	else
-		return sk->sk_state == TCP_ESTABLISHED;
+		return READ_ONCE(sk->sk_state) == TCP_ESTABLISHED;
 }
 
 static bool sock_map_sk_is_suitable(const struct sock *sk)
@@ -543,7 +543,7 @@ static bool sock_map_sk_state_allowed(const struct sock *sk)
 	if (sk_is_tcp(sk))
 		return (1 << sk->sk_state) & (TCPF_ESTABLISHED | TCPF_LISTEN);
 	if (sk_is_stream_unix(sk))
-		return (1 << sk->sk_state) & TCPF_ESTABLISHED;
+		return (1 << READ_ONCE(sk->sk_state)) & TCPF_ESTABLISHED;
 	if (sk_is_vsock(sk) &&
 	    (sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET))
 		return (1 << sk->sk_state) & TCPF_ESTABLISHED;
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH AUTOSEL 7.0-6.18] net: wangxun: reorder timer and work sync cancellations
       [not found] <20260420131539.986432-1-sashal@kernel.org>
                   ` (6 preceding siblings ...)
  2026-04-20 13:08 ` [PATCH AUTOSEL 7.0-6.6] bpf, sockmap: Annotate af_unix sock:: Sk_state data-races Sasha Levin
@ 2026-04-20 13:08 ` Sasha Levin
  2026-04-20 13:08 ` [PATCH AUTOSEL 7.0-5.15] net: mvneta: support EPROBE_DEFER when reading MAC address Sasha Levin
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2026-04-20 13:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Jiawen Wu, Jakub Kicinski, Sasha Levin, mengyuanlou, davem,
	edumazet, pabeni, netdev, linux-kernel

From: Jiawen Wu <jiawenwu@trustnetic.com>

[ Upstream commit 58f6303572ec66e7c2967ac168125f444c9e880d ]

When removing the device, timer_delete_sync(&wx->service_timer) is
called in .ndo_stop() after cancel_work_sync(&wx->service_task). This
may cause new work to be queued after device down.

Move unregister_netdev() before cancel_work_sync(), and use
timer_shutdown_sync() to prevent the timer from being re-armed.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Link: https://patch.msgid.link/20260407025616.33652-7-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

---

## PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: Subject Line**
- Subsystem: `net: wangxun:` (Wangxun network driver)
- Action verb: "reorder" - implies fixing incorrect ordering of
  operations, suggesting a race condition fix
- Summary: Reorders timer and work synchronization cancellations during
  device removal

**Step 1.2: Tags**
- `Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>` - author, primary
  wangxun driver developer
- `Link: https://patch.msgid.link/20260407025616.33652-7-
  jiawenwu@trustnetic.com` - patch 7 of a series
- `Signed-off-by: Jakub Kicinski <kuba@kernel.org>` - net maintainer
  accepted the patch
- No Fixes: tag (expected for this review process)
- No Cc: stable tag (expected)

**Step 1.3: Commit Body**
The commit message clearly describes the bug: on device removal,
`timer_delete_sync()` runs in `.ndo_stop()`, which is reached only AFTER
`cancel_work_sync()`, meaning the timer can fire and re-queue work after
the work cancellation. The fix moves `unregister_netdev()` before
`cancel_work_sync()` and uses `timer_shutdown_sync()` to prevent
re-arming.

**Step 1.4: Hidden Bug Fix Detection**
This IS a bug fix despite using "reorder" language. The reordering fixes
a race condition where work can be queued after device teardown begins.

## PHASE 2: DIFF ANALYSIS

**Step 2.1: Inventory**
- `drivers/net/ethernet/wangxun/libwx/wx_vf_common.c`: +3/-1 lines
- `drivers/net/ethernet/wangxun/txgbe/txgbe_main.c`: +4/-2 lines
- Total: ~7 lines changed, 2 functions modified (`wxvf_remove`,
  `txgbe_remove`)
- Classification: small, surgical fix

**Step 2.2: Code Flow Changes**

For `wxvf_remove()` - BEFORE:
```
cancel_work_sync(&wx->service_task);  // step 1: cancel work
netdev = wx->netdev;
unregister_netdev(netdev);            // step 2: unregister (stops timer
                                      //         via .ndo_stop)
```

AFTER:
```
netdev = wx->netdev;
unregister_netdev(netdev);               // step 1: unregister (stops timer)
timer_shutdown_sync(&wx->service_timer); // step 2: prevent timer re-arm
cancel_work_sync(&wx->service_task);     // step 3: cancel work
```

Same pattern for `txgbe_remove()`.

**Step 2.3: Bug Mechanism**
Race condition. `wx_service_timer()` in
`drivers/net/ethernet/wangxun/libwx/wx_lib.c` both re-arms itself via
`mod_timer()` and queues `service_task` via
`wx_service_event_schedule()`:

```
void wx_service_timer(struct timer_list *t)
{
        struct wx *wx = timer_container_of(wx, t, service_timer);
        unsigned long next_event_offset = HZ * 2;
        mod_timer(&wx->service_timer, next_event_offset + jiffies);
        wx_service_event_schedule(wx);
}
```

In the old code, after `cancel_work_sync()` returns, the timer can still
fire, re-arm itself AND queue new work. That work can then run during or
after device teardown.

**Step 2.4: Fix Quality**
The fix is obviously correct: stop the timer first (via
`unregister_netdev` calling `.ndo_stop`), prevent re-arming
(`timer_shutdown_sync`), then cancel remaining work
(`cancel_work_sync`). Very low regression risk.
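
The ordering argument can be replayed with a deterministic toy model.
This is a sketch under loud assumptions: it is single-threaded, every
`sketch_*` name is hypothetical, and the three flags only approximate
what the real timer and workqueue code tracks.

```c
#include <stdbool.h>

/* Deterministic toy model; none of this is kernel API. */
struct sketch_dev {
	bool timer_armed;    /* service timer is pending */
	bool timer_shutdown; /* re-arming is refused from now on */
	bool work_pending;   /* service task is queued */
};

/* Timer callback: like wx_service_timer(), it re-arms and queues work. */
static void sketch_timer_fire(struct sketch_dev *d)
{
	if (!d->timer_armed)
		return;
	d->timer_armed = !d->timer_shutdown;
	d->work_pending = true;
}

static void sketch_cancel_work(struct sketch_dev *d)
{
	d->work_pending = false;
}

static void sketch_timer_delete(struct sketch_dev *d)
{
	d->timer_armed = false;
}

static void sketch_timer_shutdown(struct sketch_dev *d)
{
	d->timer_armed = false;
	d->timer_shutdown = true;
}

/* Old remove order: the timer expires between cancel and stop. */
static bool sketch_old_order_leaves_work(void)
{
	struct sketch_dev d = { .timer_armed = true };

	sketch_cancel_work(&d);
	sketch_timer_fire(&d);   /* timer expires here */
	sketch_timer_delete(&d); /* .ndo_stop via unregister_netdev() */
	return d.work_pending;
}

/* New remove order: stop and shut down the timer, then cancel work. */
static bool sketch_new_order_leaves_work(void)
{
	struct sketch_dev d = { .timer_armed = true };

	sketch_timer_fire(&d);   /* a last expiry before teardown */
	sketch_timer_delete(&d); /* .ndo_stop via unregister_netdev() */
	sketch_timer_shutdown(&d);
	sketch_timer_fire(&d);   /* no-op: timer is not armed */
	sketch_cancel_work(&d);
	sketch_timer_fire(&d);   /* still a no-op */
	return d.work_pending;
}
```

In this model the old order ends with work still queued, while the new
order ends with nothing pending, which is the property the reordering is
meant to provide.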

## PHASE 3: GIT HISTORY

**Step 3.1: Blame**
- `cancel_work_sync` in `txgbe_remove` was added by `343929799ace12`
  (v6.16, 2025-05-21) as part of AML GPIO IRQ support
- `cancel_work_sync` in `wxvf_remove` was added by `bf68010acc4bc8`
  (v6.17, 2025-07-04) as part of VF driver addition
- `timer_delete_sync` in `.ndo_stop` paths has existed since the timer
  mechanism was added

**Step 3.2: Fixes tag** - No Fixes: tag present.

**Step 3.3: File History** - Both files have recent activity (feature
additions), but `txgbe_remove()` structure has been stable since v6.0.

**Step 3.4: Author** - Jiawen Wu is the primary wangxun/txgbe driver
developer with 15+ commits to this subsystem. This is the domain expert.

**Step 3.5: Dependencies** - `timer_shutdown_sync()` was added in v6.2
(`f571faf6e443b`), available in all relevant stable trees. The patch
applies standalone - no other patches needed.

## PHASE 4: MAILING LIST RESEARCH

**Step 4.1-4.2:** b4 dig confirmed the related commit `343929799ace12`
was reviewed by Simon Horman and accepted by Paolo Abeni. The current
commit was accepted by Jakub Kicinski. The patch is part 7 of a series
(from message-id `33652-7`), but this specific fix is self-contained -
it only changes the ordering of existing calls.

**Step 4.3-4.5:** Lore is behind anti-bot protection; could not fetch
discussion thread directly.

## PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1-5.2:** Functions modified are `wxvf_remove()` and
`txgbe_remove()` - PCI remove callbacks. They are called by the PCI
subsystem during device removal (driver unload, device hot-unplug,
system shutdown).

**Step 5.3-5.4:** The work function `txgbe_service_task()` /
`wxvf_service_task()` accesses device state (link detection, SFP
identification, reset subtasks). Running this work after device teardown
begins can access freed resources.

**Step 5.5:** The same pattern (`timer_delete_sync` + `cancel_work_sync`
in error paths) exists in `txgbevf` and `ngbevf` probe error paths, but
those are before the timer/work are active so the order is less
critical.

## PHASE 6: STABLE TREE ANALYSIS

**Step 6.1:**
- The `cancel_work_sync` in `txgbe_remove` exists since v6.16 (stable
  6.16.y affected)
- `wxvf_remove` exists since v6.17 (stable 6.17.y+ affected)
- Earlier stable trees (6.12.y, 6.6.y, etc.) don't have the buggy code

**Step 6.2:** The patch should apply cleanly to 6.16.y and later. The
`timer_shutdown_sync` API is available since v6.10.

**Step 6.3:** No related fixes already in stable.

## PHASE 7: SUBSYSTEM CONTEXT

**Step 7.1:** Network driver (PERIPHERAL criticality) - affects Wangxun
10G/25G/40G NIC users.

**Step 7.2:** Actively developed subsystem with recent feature
additions.

## PHASE 8: IMPACT AND RISK ASSESSMENT

**Step 8.1:** Affects users of Wangxun NICs (txgbe and txgbevf drivers).

**Step 8.2:** Trigger: device removal (driver unload, PCI hot-unplug,
system shutdown). Common operation when managing network devices, though
not a frequent hot path.

**Step 8.3:** Failure mode: Work task runs during/after device teardown.
This can lead to use-after-free, accessing freed memory, or other
undefined behavior. Severity: **HIGH** (potential UAF, crash during
device removal).

**Step 8.4:**
- BENEFIT: Prevents potential crash/UAF during device removal -
  medium-high (device removal is a common operation)
- RISK: Very low - 7 lines, just reordering existing operations + adding
  belt-and-suspenders `timer_shutdown_sync`
- Ratio: Clearly favorable

## PHASE 9: FINAL SYNTHESIS

**Evidence FOR:**
- Fixes a real race condition (timer re-queuing work after cancellation)
- Potential UAF/crash during device removal
- Small, surgical fix (7 lines across 2 files)
- Obviously correct - proper ordering of timer stop -> timer shutdown ->
  work cancel
- Written by the subsystem's primary developer
- Accepted by net maintainer (Jakub Kicinski)
- No dependencies on other patches
- `timer_shutdown_sync` API available in all relevant stable trees

**Evidence AGAINST:**
- No Reported-by (bug was found by code inspection, not user report)
- Only affects newer stable trees (6.16.y+)
- Device removal race may be hard to trigger in practice (small race
  window)

**Stable Rules Checklist:**
1. Obviously correct? YES - proper ordering of teardown operations
2. Fixes a real bug? YES - race condition in device removal
3. Important issue? YES - potential UAF/crash
4. Small and contained? YES - 7 lines, 2 files, same subsystem
5. No new features? CORRECT - no new features
6. Applies to stable? YES - for 6.16.y+ (txgbe) and 6.17.y+ (wxvf)

**Verification:**
- [Phase 1] Parsed tags: Signed-off-by Jakub Kicinski (net maintainer),
  Link to message-id
- [Phase 2] Diff analysis: Reorders cancel_work_sync after
  unregister_netdev, adds timer_shutdown_sync in wxvf_remove and
  txgbe_remove
- [Phase 2] Verified wx_service_timer() re-arms via mod_timer AND queues
  work - confirms the race
- [Phase 3] git blame: cancel_work_sync in txgbe_remove from
  343929799ace12 (v6.16), wxvf_remove from bf68010acc4bc8 (v6.17)
- [Phase 3] git tag --contains: confirmed affected stable trees are
  6.16.y+ (txgbe) and 6.17.y+ (wxvf)
- [Phase 3] git tag --contains f571faf6e443b: timer_shutdown_sync
  available since v6.10
- [Phase 4] b4 dig -c 343929799ace12: found original submission reviewed
  by Simon Horman
- [Phase 5] Verified work task wxvf_service_task/txgbe_service_task
  accesses device state
- [Phase 5] Verified txgbe_close → txgbe_down → txgbe_disable_device
  calls timer_delete_sync (line 230)
- [Phase 6] Confirmed buggy code only in 6.16.y+ for txgbe, 6.17.y+ for
  wxvf
- [Phase 8] Failure mode: work runs during teardown → potential UAF,
  severity HIGH
- UNVERIFIED: Could not access lore thread for the specific commit due
  to anti-bot protection

**YES**

 drivers/net/ethernet/wangxun/libwx/wx_vf_common.c | 3 ++-
 drivers/net/ethernet/wangxun/txgbe/txgbe_main.c   | 5 +++--
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/wangxun/libwx/wx_vf_common.c b/drivers/net/ethernet/wangxun/libwx/wx_vf_common.c
index ade2bfe563aaa..e8a14aa066c69 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_vf_common.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_vf_common.c
@@ -48,9 +48,10 @@ void wxvf_remove(struct pci_dev *pdev)
 	struct wx *wx = pci_get_drvdata(pdev);
 	struct net_device *netdev;
 
-	cancel_work_sync(&wx->service_task);
 	netdev = wx->netdev;
 	unregister_netdev(netdev);
+	timer_shutdown_sync(&wx->service_timer);
+	cancel_work_sync(&wx->service_task);
 	kfree(wx->vfinfo);
 	kfree(wx->rss_key);
 	kfree(wx->mac_table);
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
index 0de051450a823..bc51a84d1b143 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
@@ -950,12 +950,13 @@ static void txgbe_remove(struct pci_dev *pdev)
 	struct txgbe *txgbe = wx->priv;
 	struct net_device *netdev;
 
-	cancel_work_sync(&wx->service_task);
-
 	netdev = wx->netdev;
 	wx_disable_sriov(wx);
 	unregister_netdev(netdev);
 
+	timer_shutdown_sync(&wx->service_timer);
+	cancel_work_sync(&wx->service_task);
+
 	txgbe_remove_phy(txgbe);
 	wx_free_isb_resources(wx);
 
-- 
2.53.0



* [PATCH AUTOSEL 7.0-5.15] net: mvneta: support EPROBE_DEFER when reading MAC address
       [not found] <20260420131539.986432-1-sashal@kernel.org>
                   ` (7 preceding siblings ...)
  2026-04-20 13:08 ` [PATCH AUTOSEL 7.0-6.18] net: wangxun: reorder timer and work sync cancellations Sasha Levin
@ 2026-04-20 13:08 ` Sasha Levin
  2026-04-20 13:08 ` [PATCH AUTOSEL 7.0-6.1] net/mlx5e: XSK, Increase size for chunk_size param Sasha Levin
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2026-04-20 13:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Rosen Penev, Simon Horman, Jakub Kicinski, Sasha Levin,
	marcin.s.wojtas, andrew+netdev, davem, edumazet, pabeni, netdev,
	linux-kernel

From: Rosen Penev <rosenp@gmail.com>

[ Upstream commit 73a864352570fd30d942652f05bfe9340d7a2055 ]

If nvmem loads after the ethernet driver, mac address assignments will
not take effect. of_get_ethdev_address returns EPROBE_DEFER in such a
case so we need to handle that to avoid eth_hw_addr_random.

Add extra goto section to just free stats as they are allocated right
above.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260307031709.640141-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have all the information needed for the full analysis.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

### Step 1.1: Subject Line
- **Subsystem**: `net: mvneta:` (Marvell NETA ethernet driver)
- **Action verb**: "support" - enabling proper handling of an error
  condition
- **Summary**: Handle EPROBE_DEFER from `of_get_ethdev_address()` in
  mvneta probe

Record: [net: mvneta] [support] [Handle EPROBE_DEFER to avoid incorrect
random MAC assignment when nvmem loads after ethernet driver]

### Step 1.2: Tags
- **Signed-off-by**: Rosen Penev (author), Jakub Kicinski (net
  maintainer)
- **Reviewed-by**: Simon Horman (networking reviewer) - strong quality
  signal
- **Link**:
  https://patch.msgid.link/20260307031709.640141-1-rosenp@gmail.com
  (lore blocked by Anubis)
- No Fixes: tag (expected for this review pipeline)
- No Cc: stable (expected)

Record: Reviewed by Simon Horman (experienced networking reviewer),
committed by Jakub Kicinski (net maintainer). No syzbot, no multiple
reporters.

### Step 1.3: Body Text
The commit explains that when nvmem loads after the ethernet driver,
`of_get_ethdev_address()` returns `-EPROBE_DEFER`. Without handling
this, the driver falls through to `eth_hw_addr_random`, assigning a
random MAC address instead of deferring probe. The fix adds an
`err_free_stats` goto section to properly clean up.

Record: Bug: EPROBE_DEFER not handled, causing random MAC assignment
instead of probe deferral. Symptom: Device gets a random MAC instead of
its stored nvmem MAC. Root cause: Missing EPROBE_DEFER check after
`of_get_ethdev_address()`.

### Step 1.4: Hidden Bug Fix Detection
This is a real bug fix despite using "support" rather than "fix".
Without this, users get a non-deterministic MAC address that changes
every boot, breaking network configurations.

Record: YES - this is a real bug fix. The consequence is a random MAC
address instead of the correct one from nvmem storage.

---

## PHASE 2: DIFF ANALYSIS

### Step 2.1: Inventory
- **Files changed**: 1 (`drivers/net/ethernet/marvell/mvneta.c`)
- **Lines added**: 3 (2 lines for EPROBE_DEFER check + 1 label)
- **Functions modified**: `mvneta_probe()` only
- **Scope**: Single-file surgical fix

### Step 2.2: Code Flow Change
**Hunk 1** (around line 5622): After `of_get_ethdev_address()`, adds
check for EPROBE_DEFER to jump to cleanup.
- Before: EPROBE_DEFER falls into the `else` branch and assigns a random
  MAC.
- After: EPROBE_DEFER causes probe to fail and return -EPROBE_DEFER.

**Hunk 2** (around line 5758): Adds `err_free_stats:` label before
`free_percpu(pp->stats)`.
- This provides a proper cleanup path that frees stats, ports, phylink,
  clocks, and IRQ.

### Step 2.3: Bug Mechanism
Category: **Logic/correctness fix** - missing error path handling. The
`of_get_ethdev_address()` call can return `-EPROBE_DEFER` when the nvmem
provider isn't loaded yet. Without handling this specific error, the
driver proceeds with a random MAC address. The fix returns the error so
the driver framework retries probe later when nvmem is available.
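
The control flow described in Steps 2.2 and 2.3 can be sketched in
userspace as follows. This is a simplified model under stated
assumptions: `struct probe_model`, `model_probe()` and `model_release()`
are hypothetical, `lookup_err` stands in for the return value of
`of_get_ethdev_address()`, and `EPROBE_DEFER` is defined locally because
it is a kernel-internal errno value.

```c
#include <errno.h>
#include <stdlib.h>

/* Kernel-internal errno value, not provided by userspace <errno.h>. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

enum mac_source { MAC_NONE, MAC_DEVICE_TREE, MAC_FALLBACK };

/* Hypothetical stand-in for the probe state touched on this path. */
struct probe_model {
	void *ports;
	void *stats;
	enum mac_source mac_from;
};

/* Returns 0 on success, or a negative errno after unwinding. */
int model_probe(struct probe_model *pp, int lookup_err)
{
	int err;

	pp->mac_from = MAC_NONE;
	pp->stats = NULL;
	pp->ports = malloc(16);
	if (!pp->ports)
		return -ENOMEM;
	pp->stats = malloc(16);
	if (!pp->stats) {
		err = -ENOMEM;
		goto err_free_ports;
	}

	err = lookup_err;
	if (err == -EPROBE_DEFER)
		goto err_free_stats; /* let the core retry probe later */
	if (!err)
		pp->mac_from = MAC_DEVICE_TREE;
	else
		pp->mac_from = MAC_FALLBACK; /* other errors still fall back */
	return 0;

err_free_stats:
	free(pp->stats);
	pp->stats = NULL;
err_free_ports:
	free(pp->ports);
	pp->ports = NULL;
	return err;
}

/* Release what a successful model_probe() allocated. */
void model_release(struct probe_model *pp)
{
	free(pp->stats);
	free(pp->ports);
	pp->stats = pp->ports = NULL;
}
```

The label falls through to the next one, so jumping to the newest label
unwinds everything allocated so far in reverse order. Only the deferral
case aborts; any other lookup error keeps the existing fallback
behaviour.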

### Step 2.4: Fix Quality
- Obviously correct - follows established pattern used in macb,
  ucc_geth, gianfar, airoha, mtk_eth_soc
- Minimal (3 lines), surgical, well-contained
- No regression risk - the new `err_free_stats` cleanup path correctly
  unwinds only what was allocated (stats, ports, phylink, clocks, IRQ)
- No red flags

---

## PHASE 3: GIT HISTORY INVESTIGATION

### Step 3.1: Blame
- `of_get_ethdev_address()` call was introduced by commit 9ca01b25dffffe
  ("ethernet: use of_get_ethdev_address()") in v5.16
- The original MAC address reading pattern has been present since commit
  8cc3e439ab9207 (2013)
- The nvmem path in `of_get_mac_address()` was present since ~v5.0
  (`of_get_mac_address_nvmem()`)
- Bug has been present since v5.16 when mvneta switched to
  `of_get_ethdev_address()`

### Step 3.2: Fixes Tag
No Fixes: tag present. The implicit "fixes" target would be
9ca01b25dffffe (v5.16) which exists in all active stable trees (6.1,
6.6, 6.12).

### Step 3.3: Related Changes
The author (Rosen Penev) has been systematically fixing this same
pattern across drivers:
- `b2d9544070d05` "net: gianfar: fix NVMEM mac address" (v6.12)
- `2575897640328` "net: ucc_geth: fix usage with NVMEM MAC address"
  (v6.13)
- `be04024a24a93` "net: ag71xx: support probe deferral for getting MAC
  address"

This mvneta fix is standalone and does not depend on other patches.

### Step 3.4: Author
Rosen Penev is a regular contributor with multiple accepted patches in
the networking subsystem, especially around nvmem MAC address handling.
Not the subsystem maintainer but an experienced contributor.

### Step 3.5: Dependencies
No dependencies. The fix uses existing infrastructure
(`of_get_ethdev_address()` and error labels already in the function).
Applies standalone.

---

## PHASE 4: MAILING LIST AND EXTERNAL RESEARCH

### Step 4.1-4.5
b4 dig did not find the submission. Lore was blocked by Anubis
anti-scraping. However, the commit metadata is clear:
- Reviewed by Simon Horman (experienced networking reviewer)
- Accepted by Jakub Kicinski (netdev maintainer)
- This is part of an established series by the same author fixing the
  same pattern in multiple drivers

---

## PHASE 5: CODE SEMANTIC ANALYSIS

### Step 5.1: Key Functions
- `mvneta_probe()` - the sole modified function

### Step 5.2: Callers
`mvneta_probe()` is called by the platform driver framework during
device enumeration. This is a standard probe path for all Marvell NETA
Ethernet controllers.

### Step 5.3-5.4: Call Chain
`of_get_ethdev_address()` -> `of_get_mac_address()` ->
`of_get_mac_address_nvmem()` -> `of_nvmem_cell_get()` which returns
`-EPROBE_DEFER` when the nvmem provider hasn't been loaded yet. This is
a standard Linux device model flow.

### Step 5.5: Similar Patterns
Multiple other drivers handle this pattern correctly:
- `drivers/net/ethernet/mediatek/mtk_eth_soc.c` - handles EPROBE_DEFER
- `drivers/net/ethernet/freescale/ucc_geth.c` - handles EPROBE_DEFER
- `drivers/net/ethernet/cadence/macb_main.c` - handles EPROBE_DEFER
- `drivers/net/ethernet/airoha/airoha_eth.c` - handles EPROBE_DEFER

mvneta was an outlier in NOT handling it.

---

## PHASE 6: CROSS-REFERENCING AND STABLE TREE ANALYSIS

### Step 6.1: Buggy Code in Stable
The `of_get_ethdev_address()` call in mvneta was introduced in v5.16
(commit 9ca01b25dffffe). All active stable trees (6.1.y, 6.6.y, 6.12.y)
contain this buggy code.

### Step 6.2: Backport Complications
The fix is very small (3 lines) and only uses existing infrastructure.
The error cleanup chain structure has been stable since 2022. Expected
to apply cleanly.

### Step 6.3: Related Fixes Already in Stable
No prior fix for this specific issue in mvneta.

---

## PHASE 7: SUBSYSTEM AND MAINTAINER CONTEXT

### Step 7.1: Subsystem
- **Subsystem**: Network drivers (drivers/net/ethernet/marvell/)
- **Criticality**: IMPORTANT - Marvell NETA is a widely used Ethernet
  controller in Armada SoCs, common in embedded/networking appliances
  (routers, NAS devices, network gateways)

### Step 7.2: Subsystem Activity
The mvneta driver is actively maintained with regular updates.

---

## PHASE 8: IMPACT AND RISK ASSESSMENT

### Step 8.1: Who Is Affected
Users with Marvell NETA Ethernet controllers whose MAC address is stored
in nvmem (common in embedded devices). These are typically ARM-based
platforms.

### Step 8.2: Trigger Conditions
Triggered when:
- Device tree specifies MAC in nvmem (common configuration)
- nvmem driver loads after mvneta driver (depends on module load order,
  kernel config)
- This is a real-world scenario, especially on systems with complex
  device tree dependencies

### Step 8.3: Failure Mode Severity
- **Without fix**: Random MAC address assigned, networking may break
  (DHCP gets new IP, static configs fail, MAC-based filtering fails)
- **Severity**: HIGH for affected users - unstable MAC address across
  reboots can cause network connectivity failures and operational issues
  in embedded deployments
- Not a crash, but a functional correctness issue with real-world impact

### Step 8.4: Risk-Benefit Ratio
- **Benefit**: HIGH - restores correct MAC address behavior for embedded
  users
- **Risk**: VERY LOW - 3-line change, follows established pattern,
  proper cleanup
- **Ratio**: Strongly favorable

---

## PHASE 9: FINAL SYNTHESIS

### Step 9.1: Evidence Summary

**FOR backporting:**
- Fixes a real functional bug (wrong MAC address assignment)
- Tiny change: 3 lines added
- Follows an established pattern already used by 4+ other drivers
- Same author has had identical fixes accepted in other drivers
- Reviewed by Simon Horman, committed by Jakub Kicinski
- Affects all stable trees (6.1+)
- No dependencies, standalone fix
- Proper cleanup path (no resource leaks)
- Real-world impact on embedded/ARM devices using nvmem for MAC storage

**AGAINST backporting:**
- Not a crash, security, or data corruption issue
- Affects a specific hardware platform (Marvell NETA SoCs with nvmem
  MAC)
- Could be seen as "enabling a feature" (nvmem MAC support), but nvmem
  support has been present since v5.0 - this just fixes the error
  handling

### Step 9.2: Stable Rules Checklist
1. Obviously correct and tested? **YES** - trivial pattern, reviewed by
   expert
2. Fixes a real bug? **YES** - wrong MAC address when nvmem loads late
3. Important issue? **YES** - networking breaks with wrong MAC
4. Small and contained? **YES** - 3 lines, 1 file
5. No new features? **CORRECT** - this enables existing nvmem MAC
   feature to work correctly
6. Can apply to stable? **YES** - no dependencies, clean apply expected

### Step 9.3: Exception Categories
Not strictly an exception category, but this is a straightforward
correctness fix.

### Step 9.4: Decision
This is a small, obviously correct fix for a real functional bug. It
follows an established pattern already used in multiple other drivers.
The risk is minimal and the benefit is significant for embedded users
with Marvell NETA hardware using nvmem MAC addresses.

---

## Verification

- [Phase 1] Parsed tags: Reviewed-by Simon Horman, signed by Jakub
  Kicinski. No Fixes: tag (expected).
- [Phase 2] Diff analysis: 3 lines added - 2 lines for EPROBE_DEFER
  check, 1 err_free_stats label in mvneta_probe()
- [Phase 3] git blame: `of_get_ethdev_address()` call introduced by
  9ca01b25dffffe (v5.16), present in all active stable trees
- [Phase 3] git log author: Rosen Penev has multiple identical fixes
  accepted (ucc_geth v6.13, gianfar v6.12)
- [Phase 3] File history: no prerequisites or conflicts identified
- [Phase 4] b4 dig: could not find match; lore blocked by Anubis
- [Phase 5] Grep for similar patterns: confirmed 4+ other drivers handle
  EPROBE_DEFER from of_get_ethdev_address() correctly (macb, ucc_geth,
  airoha, mtk_eth_soc)
- [Phase 5] of_get_ethdev_address() -> of_get_mac_address() ->
  of_get_mac_address_nvmem() can return -EPROBE_DEFER (verified in
  net/core/of_net.c lines 61-97, 126-147, 162-171)
- [Phase 6] Code exists in all active stable trees (6.1+), verified via
  git describe --contains for 9ca01b25dffffe (v5.16-rc1)
- [Phase 6] Expected clean apply - fix uses only existing error labels
  and infrastructure
- [Phase 7] mvneta is a widely-used Marvell Armada SoC ethernet driver
- [Phase 8] Failure mode: random MAC assignment instead of correct nvmem
  MAC, severity HIGH for embedded users
- UNVERIFIED: Could not access mailing list discussion due to Anubis
  blocking

**YES**

 drivers/net/ethernet/marvell/mvneta.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 9ba4aef7080c0..0c061fb0ed072 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -5620,6 +5620,8 @@ static int mvneta_probe(struct platform_device *pdev)
 	}
 
 	err = of_get_ethdev_address(dn, dev);
+	if (err == -EPROBE_DEFER)
+		goto err_free_stats;
 	if (!err) {
 		mac_from = "device tree";
 	} else {
@@ -5755,6 +5757,7 @@ static int mvneta_probe(struct platform_device *pdev)
 				       1 << pp->id);
 		mvneta_bm_put(pp->bm_priv);
 	}
+err_free_stats:
 	free_percpu(pp->stats);
 err_free_ports:
 	free_percpu(pp->ports);
-- 
2.53.0



* [PATCH AUTOSEL 7.0-6.1] net/mlx5e: XSK, Increase size for chunk_size param
       [not found] <20260420131539.986432-1-sashal@kernel.org>
                   ` (8 preceding siblings ...)
  2026-04-20 13:08 ` [PATCH AUTOSEL 7.0-5.15] net: mvneta: support EPROBE_DEFER when reading MAC address Sasha Levin
@ 2026-04-20 13:08 ` Sasha Levin
  2026-04-20 13:08 ` [PATCH AUTOSEL 7.0-5.10] ppp: disconnect channel before nullifying pch->chan Sasha Levin
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2026-04-20 13:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Dragos Tatulea, Carolina Jubran, Tariq Toukan, Paolo Abeni,
	Sasha Levin, saeedm, mbloch, andrew+netdev, davem, edumazet, kuba,
	netdev, linux-rdma, linux-kernel

From: Dragos Tatulea <dtatulea@nvidia.com>

[ Upstream commit 1047e14b44edecbbab02a86514a083b8db9fde4d ]

When 64K pages are used, chunk_size can take the 64K value
which doesn't fit in u16. This results in overflows that
are detected in mlx5e_mpwrq_log_wqe_sz().

Increase the type to u32 to fix this.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260403090927.139042-2-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have a comprehensive picture. Let me compile the full analysis.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: Subject Line**
- Subsystem: `net/mlx5e` (Mellanox ConnectX Ethernet driver, XSK
  subsystem)
- Action verb: "Increase" (but the intent is clearly to fix an overflow
  bug)
- Summary: Widen `chunk_size` from u16 to u32 to prevent integer
  overflow on 64K page systems.

Record: [net/mlx5e XSK] [increase/fix] [type widening to prevent
overflow when chunk_size=65536]

**Step 1.2: Tags**
- Reviewed-by: Carolina Jubran (Nvidia engineer)
- Signed-off-by: Tariq Toukan (mlx5 submaintainer at Nvidia)
- Link:
  https://patch.msgid.link/20260403090927.139042-2-tariqt@nvidia.com
  (patch 2 in a series)
- Signed-off-by: Paolo Abeni (netdev maintainer)
- No Fixes: tag (expected for autosel candidates)
- No Cc: stable (expected)

Record: Reviewed by Nvidia staff. Merged through standard netdev tree.
Patch 2/N series.

**Step 1.3: Commit Body**
- Bug: On systems with 64K pages (ARM64), `chunk_size` can be 65536.
  Stored in u16, this overflows to 0.
- Symptom: "overflows that are detected in `mlx5e_mpwrq_log_wqe_sz()`"
- Root cause: u16 type is too narrow for the value 65536 (0x10000).

**Step 1.4: Hidden Bug Fix Detection**
This is explicitly described as fixing overflows. The word "Increase"
obscures the fix nature, but the body clearly explains the overflow bug.

## PHASE 2: DIFF ANALYSIS

**Step 2.1: Inventory**
- 1 file changed: `drivers/net/ethernet/mellanox/mlx5/core/en/params.h`
- 1 line changed: `u16 chunk_size` -> `u32 chunk_size`
- Scope: Single-file, single-line, surgical fix

**Step 2.2: Code Flow Change**
Before: `chunk_size` stored as u16 (max 65535). When set to 65536 (1 <<
16 on 64K page systems), it silently wraps to 0.
After: `chunk_size` stored as u32 (max ~4 billion). Value 65536 is
stored correctly.

**Step 2.3: Bug Mechanism**
Category: **Integer overflow / type size bug**

The overflow is triggered in `params.c` lines 1125-1131, where a
temporary `mlx5e_xsk_param` is constructed:

```c
/* drivers/net/ethernet/mellanox/mlx5/core/en/params.c, lines 1125-1131 */
for (frame_shift = XDP_UMEM_MIN_CHUNK_SHIFT;
     frame_shift <= PAGE_SHIFT; frame_shift++) {
    struct mlx5e_xsk_param xsk = {
        .chunk_size = 1 << frame_shift, /* 65536 when frame_shift is 16 */
        .unaligned = false,
    };
    /* ... rest of the loop body not shown ... */
}
```

On 64K page systems (`PAGE_SHIFT=16`), `1 << 16 = 65536` overflows u16
to 0. This then propagates to `order_base_2(0)` in
`mlx5e_mpwrq_page_shift()`, which is undefined behavior.
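
The truncation itself can be reproduced in userspace. The sketch below
mirrors only the loop bounds and the assignment; the function names are
hypothetical, and `MODEL_MIN_CHUNK_SHIFT` is assumed to match the
kernel's `XDP_UMEM_MIN_CHUNK_SHIFT`.

```c
#include <stdint.h>

/* Assumed to match XDP_UMEM_MIN_CHUNK_SHIFT (2048-byte chunks). */
#define MODEL_MIN_CHUNK_SHIFT 11

/* Count iterations whose chunk size reads back as 0 from a u16 field. */
unsigned int zero_chunks_u16(unsigned int page_shift)
{
	unsigned int shift, zeros = 0;

	for (shift = MODEL_MIN_CHUNK_SHIFT; shift <= page_shift; shift++) {
		/* 1 << 16 does not fit in 16 bits and is stored as 0 */
		uint16_t chunk_size = (uint16_t)(1u << shift);

		if (chunk_size == 0)
			zeros++;
	}
	return zeros;
}

/* Same loop with the widened u32 field. */
unsigned int zero_chunks_u32(unsigned int page_shift)
{
	unsigned int shift, zeros = 0;

	for (shift = MODEL_MIN_CHUNK_SHIFT; shift <= page_shift; shift++) {
		uint32_t chunk_size = 1u << shift;

		if (chunk_size == 0)
			zeros++;
	}
	return zeros;
}
```

With a page shift of 12 (4K pages) neither variant produces a zero chunk
size. With a page shift of 16 (64K pages) the u16 variant produces
exactly one, for the last iteration, and the u32 variant produces none.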

**Step 2.4: Fix Quality**
- Obviously correct: widening u16 to u32 cannot break anything
- Minimal/surgical: exactly one type change
- Regression risk: effectively zero - u32 holds all values u16 can, plus
  the needed 65536
- The struct grows from 6 to 12 bytes on common ABIs (the u32 field
  raises the struct alignment to 4), which is negligible
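
The layout effect of widening the field can be checked with a small
userspace sketch. The two structs below are hypothetical mirrors of
`struct mlx5e_xsk_param` before and after the patch, using `<stdint.h>`
types in place of the kernel's `u16`/`u32`. The resulting sizes depend
on the ABI.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the struct before the patch. */
struct xsk_param_before {
	uint16_t headroom;
	uint16_t chunk_size;
	bool unaligned;
};

/* Hypothetical mirror of the struct after the patch. */
struct xsk_param_after {
	uint16_t headroom;
	uint32_t chunk_size; /* needs 4-byte alignment: padding before it */
	bool unaligned;
};

size_t size_before(void)
{
	return sizeof(struct xsk_param_before);
}

size_t size_after(void)
{
	return sizeof(struct xsk_param_after);
}

size_t chunk_offset_after(void)
{
	return offsetof(struct xsk_param_after, chunk_size);
}
```

On x86-64 and arm64 Linux the sizes are 6 and 12 bytes, with
`chunk_size` moving to offset 4. Other ABIs may pad differently.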

## PHASE 3: GIT HISTORY INVESTIGATION

**Step 3.1: Blame**
The `u16 chunk_size` field was introduced in commit `a069e977d6d8f2` by
Maxim Mikityanskiy on 2019-06-26, first in v5.3-rc1. The bug has been
present since then - approximately 7 years.

**Step 3.2: Prior Related Fixes**
Commit `a5535e5336943` ("mlx5: stop warning for 64KB pages", 2024-03-28)
was a workaround for a compiler warning about this exact issue. It added
`(size_t)` cast in `mlx5e_validate_xsk_param()` to suppress the warning,
but didn't fix the underlying type issue. That commit's message even
noted "64KB chunks are really not all that useful, so just shut up the
warning by adding a cast."

**Step 3.3: File History**
8 changes to params.h since v6.1. The struct itself has remained stable
- `chunk_size` field unchanged since its introduction.

**Step 3.4: Author Context**
Dragos Tatulea is a regular contributor to mlx5 at Nvidia, with multiple
fixes for 64K page issues (SHAMPO fixes). The submitter Tariq Toukan is
the mlx5e submaintainer. Paolo Abeni (netdev maintainer) merged it.

**Step 3.5: Dependencies**
The diff context shows a `struct mlx5e_rq_opt_param` that doesn't exist
in the v7.0 tree. This means the patch was made against a slightly newer
codebase. However, the actual change (u16->u32 on line 11 of the struct)
is independent and applies with trivial context adjustment.

## PHASE 4: MAILING LIST RESEARCH

**Step 4.1-4.2:** Lore is blocked by Anubis protection. From b4 dig on
the related commit `a5535e5336943`, I confirmed the earlier fix was
patch 8/9 in Arnd Bergmann's series. The current commit appears to be
the proper type-level fix.

**Step 4.3:** No external bug report references. The bug was found
internally by the mlx5 team.

**Step 4.4:** Patch 2/N series (from message-id `-2-`). The companion
patches likely include the `mlx5e_rq_opt_param` struct addition and
possibly removal of the `<= 0xffff` sanity check in
`mlx5e_xsk_is_pool_sane()`. This type-widening patch is standalone for
fixing the overflow.

## PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1:** The modified struct `mlx5e_xsk_param` is used across the
entire mlx5e XSK/MPWRQ subsystem.

**Step 5.2: Callers of chunk_size**
- `mlx5e_mpwrq_page_shift()` - calls `order_base_2(xsk->chunk_size)` →
  undefined on 0
- `mlx5e_mpwrq_umr_mode()` - compares chunk_size with page_shift
- `mlx5e_validate_xsk_param()` - bounds check
- `mlx5e_build_xsk_param()` - stores pool chunk_size into struct
- Internal calculation loop in params.c - creates temporary structs
- `mlx5e_create_rq_umr_mkey()` in en_main.c - passes to hardware

**Step 5.4: Reachability**
The overflow triggers when ANY XDP program is loaded on an mlx5
interface on a 64K page system. The calculation loop runs during channel
configuration, not just when XSK is explicitly used. This is a common
scenario for ARM64 servers.

## PHASE 6: STABLE TREE ANALYSIS

**Step 6.1:** The buggy code (`u16 chunk_size` in `mlx5e_xsk_param`)
exists in all stable trees from v5.3 onward (introduced 2019-06-26).

**Step 6.2:** Minor context adjustment needed (surrounding struct
differs). The one-line change itself is trivially backportable.

**Step 6.3:** The earlier workaround (`a5535e5336943`) only suppressed a
compiler warning but didn't fix the runtime overflow.

## PHASE 7: SUBSYSTEM CONTEXT

**Step 7.1:** Subsystem: drivers/net (networking, Mellanox ConnectX).
Criticality: IMPORTANT - widely used enterprise network hardware.

**Step 7.2:** Very active subsystem with frequent fixes, especially for
64K page support issues (multiple SHAMPO fixes by the same author).

## PHASE 8: IMPACT AND RISK ASSESSMENT

**Step 8.1: Affected Users**
ARM64 systems with 64K pages running mlx5 (Mellanox ConnectX) NICs with
XDP programs. This includes ARM64 servers in data centers.

**Step 8.2: Trigger Conditions**
Loading any XDP program on an mlx5 interface on a 64K page system
triggers the internal calculation loop. The overflow happens during
channel parameter computation.

**Step 8.3: Failure Mode**
- `order_base_2(0)` is undefined behavior, potentially returning garbage
- Wrong `page_shift` propagates through `mlx5e_mpwrq_log_wqe_sz()`,
  detected as overflow
- At minimum: WARN_ON triggers and incorrect hardware configuration
- At worst: incorrect WQE sizes could cause hardware errors, packet
  loss, or crashes
- Severity: **HIGH**

**Step 8.4: Risk-Benefit Ratio**
- Benefit: HIGH - fixes undefined behavior and incorrect calculations on
  64K page ARM64 systems
- Risk: VERY LOW - changing u16 to u32 is trivially correct, cannot
  introduce regression
- Ratio: Strongly favorable for backporting

## PHASE 9: FINAL SYNTHESIS

**Evidence FOR backporting:**
- Fixes a real integer overflow bug causing undefined behavior
- Affects 64K page ARM64 systems with widely-used enterprise hardware
- One-line, obviously correct fix (type widening)
- Zero regression risk
- Bug present since v5.3 (7 years)
- Author is a known mlx5 contributor, reviewed by Nvidia staff, merged
  by netdev maintainer
- The earlier workaround (compiler warning fix) acknowledged the problem
  existed

**Evidence AGAINST backporting:**
- Needs minor context adjustment (surrounding struct differs)
- 64K page systems are a subset of users
- The `mlx5e_xsk_is_pool_sane()` check may prevent user-facing triggers
  (but NOT the internal calculation path)

**Stable Rules Checklist:**
1. Obviously correct and tested? YES - trivial type widening
2. Fixes a real bug? YES - integer overflow causing undefined behavior
3. Important issue? YES - undefined behavior, potential incorrect
   hardware config
4. Small and contained? YES - one line change
5. No new features or APIs? CORRECT - just a type fix
6. Can apply to stable trees? YES with trivial context adjustment

**Verification:**
- [Phase 1] Parsed tags: Reviewed-by Nvidia, merged by netdev
  maintainer. Patch 2/N series.
- [Phase 2] Diff analysis: single line type change u16->u32 in struct
  mlx5e_xsk_param
- [Phase 3] git blame: chunk_size as u16 introduced in a069e977d6d8f2
  (v5.3-rc1, 2019), present in all active stable trees
- [Phase 3] git show a5535e5336943: prior workaround only added a cast,
  didn't fix the type
- [Phase 4] b4 dig for related commit found lore thread; Anubis blocked
  direct access
- [Phase 5] Traced callers: overflow at params.c:1129 feeds into
  order_base_2(0) in mpwrq_page_shift()
- [Phase 5] Verified xsk_pool_get_chunk_size() returns u32, truncated
  when stored in u16
- [Phase 6] Code exists in all stable trees from v5.3+; 8 changes to
  file since v6.1
- [Phase 6] Minor context conflict (mlx5e_rq_opt_param not in stable),
  trivially resolvable
- [Phase 8] Trigger: loading any XDP program on mlx5 on 64K page system;
  severity HIGH
- UNVERIFIED: Could not access lore.kernel.org to read full mailing list
  discussion

The fix is minimal, obviously correct, and addresses a real integer
overflow that causes undefined behavior on ARM64 systems with 64K pages.
The risk is negligible and the benefit is clear.

**YES**

 drivers/net/ethernet/mellanox/mlx5/core/en/params.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
index 00617c65fe3cd..c5aaaa4ac3648 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
@@ -8,7 +8,7 @@
 
 struct mlx5e_xsk_param {
 	u16 headroom;
-	u16 chunk_size;
+	u32 chunk_size;
 	bool unaligned;
 };
 
-- 
2.53.0



* [PATCH AUTOSEL 7.0-5.10] ppp: disconnect channel before nullifying pch->chan
       [not found] <20260420131539.986432-1-sashal@kernel.org>
                   ` (9 preceding siblings ...)
  2026-04-20 13:08 ` [PATCH AUTOSEL 7.0-6.1] net/mlx5e: XSK, Increase size for chunk_size param Sasha Levin
@ 2026-04-20 13:08 ` Sasha Levin
  2026-04-20 13:08 ` [PATCH AUTOSEL 6.18] netfilter: nfnetlink_queue: make hash table per queue Sasha Levin
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2026-04-20 13:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Qingfang Deng, Paolo Abeni, Sasha Levin, andrew+netdev, davem,
	edumazet, kuba, linux-ppp, netdev, linux-kernel

From: Qingfang Deng <dqfext@gmail.com>

[ Upstream commit 6a196e83a1a7e50be93482d1cd4305641f1a9fb1 ]

In ppp_unregister_channel(), pch->chan is set to NULL before calling
ppp_disconnect_channel(), which removes the channel from ppp->channels
list using list_del_rcu() + synchronize_net(). This creates an
intermediate state where the channel is still connected (on the list)
but already unregistered (pch->chan == NULL).

Call ppp_disconnect_channel() before setting pch->chan to NULL. After
the synchronize_net(), no new reader on the transmit path will hold a
reference to the channel from the list.

This eliminates the problematic state, and prepares for removing the
pch->chan NULL checks from the transmit path in a subsequent patch.

Signed-off-by: Qingfang Deng <dqfext@gmail.com>
Link: https://patch.msgid.link/20260312093732.277254-1-dqfext@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have all the information needed. Let me compile the complete
analysis.

---

## Phase 1: Commit Message Forensics

### Step 1.1: Subject Line
- **Subsystem**: `ppp` (PPP networking subsystem)
- **Action verb**: "disconnect" / implied "fix" - reorder operations to
  eliminate a race
- **Summary**: Reorder `ppp_disconnect_channel()` and `pch->chan = NULL`
  to eliminate a problematic intermediate state

### Step 1.2: Tags
- **Signed-off-by**: Qingfang Deng (author, active PPP contributor with
  9+ commits to the subsystem)
- **Link**:
  `https://patch.msgid.link/20260312093732.277254-1-dqfext@gmail.com`
  (the `-1-` suggests a single patch, not a series)
- **Signed-off-by**: Paolo Abeni (netdev co-maintainer, merge authority)
- No Fixes: tag (expected - that's why it's under review)
- No Reported-by (no explicit crash report)

### Step 1.3: Body Analysis
The commit describes a concrete race condition: `pch->chan` is set to
NULL before `ppp_disconnect_channel()` removes the channel from
`ppp->channels`. This creates an intermediate state where the channel is
still on the list but already has `pch->chan == NULL`. The fix reorders
the operations: disconnect first (remove from list), then NULL-ify.

### Step 1.4: Hidden Bug Fix Detection
The phrase "This eliminates the problematic state" clearly describes
fixing a race condition. The mention of "prepares for removing the
pch->chan NULL checks from the transmit path in a subsequent patch"
suggests a follow-up cleanup, but the reordering itself has standalone
correctness value.

**Record**: This IS a race condition fix, disguised slightly by the
preparatory framing.

---

## Phase 2: Diff Analysis

### Step 2.1: Inventory
- **1 file changed**: `drivers/net/ppp/ppp_generic.c`
- **Net change**: 1 insertion, 1 deletion (net 0 lines) - purely a
  reorder of one line
- **Function modified**: `ppp_unregister_channel()`
- **Scope**: single-file, single-function, surgical

### Step 2.2: Code Flow Change
**Before**: In `ppp_unregister_channel()`:
1. `down_write(&pch->chan_sem)` + `spin_lock_bh(&pch->downl)` +
   `WRITE_ONCE(pch->chan, NULL)` + unlock
2. `ppp_disconnect_channel(pch)` - removes from `ppp->channels` via
   `list_del_rcu()` + `synchronize_net()`

**After**:
1. `ppp_disconnect_channel(pch)` - removes from list first
2. `down_write(&pch->chan_sem)` + `spin_lock_bh(&pch->downl)` +
   `WRITE_ONCE(pch->chan, NULL)` + unlock

### Step 2.3: Bug Mechanism - RACE CONDITION / NULL POINTER DEREFERENCE

The race occurs in `ppp_mp_explode()` (multilink transmit path):

```1977:2001:drivers/net/ppp/ppp_generic.c
list_for_each_entry(pch, &ppp->channels, clist) {
    if (pch->chan) {          // Read #1 - no downl lock held
        pch->avail = 1;
        navail++;
        // Read #2 - TOCTOU: pch->chan can be NULL here!
        pch->speed = pch->chan->speed;
    }
```

**Race scenario**:
1. CPU A (`ppp_mp_explode`, under `ppp_xmit_lock`): reads `pch->chan !=
   NULL` at line 1978
2. CPU B (`ppp_unregister_channel`): does `WRITE_ONCE(pch->chan, NULL)`
   under `downl` (NOT under `ppp_xmit_lock`)
3. CPU A: dereferences `pch->chan->speed` at line 1981 - **NULL pointer
   dereference / kernel oops**

The channel is still on `ppp->channels` (hasn't been disconnected yet),
so `ppp_mp_explode` finds it while iterating.

With the fix, `ppp_disconnect_channel()` removes the channel from the
list (under `ppp_lock`, which includes `ppp_xmit_lock`) FIRST. After it
returns, no transmit code iterating `ppp->channels` will see this
channel. Then setting `pch->chan = NULL` is safe.

### Step 2.4: Fix Quality
- Obviously correct - eliminates the problematic state entirely
- Minimal/surgical - one line moved
- No regression risk - the semantic operations are the same, just
  reordered
- No API changes, no locking changes, no new features

---

## Phase 3: Git History Investigation

### Step 3.1: Blame
From git blame, the buggy code ordering dates back to `^1da177e4c3f41`
(Linus Torvalds, 2005-04-16) - the initial kernel tree import. The
`WRITE_ONCE` was added by `0417adf367a0af` (August 2025), but the
fundamental ordering issue exists since v2.6.12.

### Step 3.2: No Fixes: tag (expected)

### Step 3.3: Related Changes
The same author (Qingfang Deng) previously fixed a related race in
`0417adf367a0af` ("ppp: fix race conditions in ppp_fill_forward_path"),
which explicitly documented: "pch->chan can be NULL. When
ppp_unregister_channel() is called, pch->chan is set to NULL before pch
is removed from ppp->channels." That commit added workarounds (NULL
checks, RCU). This commit fixes the **root cause**.

### Step 3.4: Author
Qingfang Deng has 9+ PPP commits, including critical race fixes and
performance improvements. They are a de facto domain expert for this
code.

### Step 3.5: Dependencies
The diff references `WRITE_ONCE(pch->chan, NULL)` which was introduced
by `0417adf367a0af`. That commit has `Fixes: f6efc675c9dd` and was
likely already auto-selected for stable. For trees without that commit,
a trivial context adjustment (`pch->chan = NULL` instead of
`WRITE_ONCE`) would be needed. The reordering logic works correctly
regardless.

---

## Phase 4: Mailing List Research

### Step 4.1-4.2
Lore is currently blocked by Anubis anti-bot protection. b4 dig could
not find this specific commit by message-id (the commit doesn't exist in
this tree). However, the patch was accepted and merged by Paolo Abeni,
netdev co-maintainer, which provides strong confidence in correctness.

### Step 4.3-4.5
The earlier commit `0417adf367a0af` explicitly documents the bug that
this commit fixes at the root: "pch->chan can be NULL. When
ppp_unregister_channel() is called, pch->chan is set to NULL before pch
is removed from ppp->channels." This was a known problem.

---

## Phase 5: Code Semantic Analysis

### Step 5.1-5.4: Call Path Tracing
The vulnerable function `ppp_mp_explode()` is reached via:
- `ppp_xmit_process()` -> `__ppp_xmit_process()` -> `ppp_send_frame()` ->
  `ppp_push()` -> `ppp_mp_explode()`

This is the **hot transmit path** for PPP multilink, triggered every
time a packet is sent on a multilink PPP connection. The race triggers
when a channel is unregistered while multilink transmission is active -
a common scenario during PPP session teardown or link failure.

`ppp_fill_forward_path()` (already patched with NULL checks by
`0417adf367a0af`) is also affected but has workarounds. This commit
fixes the root cause for all paths.

### Step 5.5: Similar Patterns
The same TOCTOU pattern (check `pch->chan`, then dereference) also
appears at:
- Line 1978/1981: `ppp_mp_explode()` - **vulnerable** (no `downl` lock)
- Line 1912-1914: `ppp_push()` - **safe** (holds `pch->downl`)
- Line 2059-2060: `ppp_mp_explode()` phase 2 - **safe** (holds
  `pch->downl`)
- Line 2185-2189: `__ppp_channel_push()` - **safe** (holds `pch->downl`)

---

## Phase 6: Stable Tree Analysis

### Step 6.1: Buggy Code in Stable
The buggy ordering exists since v2.6.12 (the very first git commit). It
is present in ALL active stable trees.

### Step 6.2: Backport Complications
For 7.0.y: applies cleanly (code matches exactly).
For older trees without `0417adf367a0af`: trivial context change needed
(`pch->chan = NULL` vs `WRITE_ONCE(pch->chan, NULL)`), and
`list_del_rcu`/`synchronize_net()` may not be present in
`ppp_disconnect_channel()`. However, the reordering is still correct for
the transmit path because `ppp_disconnect_channel()` takes `ppp_lock()`
(which includes `ppp_xmit_lock`), ensuring mutual exclusion with
transmit path iteration.

### Step 6.3: Related Fixes
`0417adf367a0af` added workarounds (NULL checks) for the same underlying
issue. This commit fixes the root cause.

---

## Phase 7: Subsystem Context

### Step 7.1: PPP networking - **IMPORTANT** subsystem
PPP is used by DSL/dial-up connections, VPN tunnels, and
embedded/routing devices. Multilink PPP aggregates multiple physical
links, common in WAN/enterprise networking.

### Step 7.2: Maintenance
Actively maintained by the author (9+ commits), merged by netdev
maintainers.

---

## Phase 8: Impact and Risk Assessment

### Step 8.1: Affected Users
Users running PPP with multilink (`SC_MULTILINK` flag set). This
includes enterprise WAN, embedded networking, and PPPoE configurations.

### Step 8.2: Trigger Conditions
- Multilink PPP transmitting while a channel is being unregistered
- Happens during link failure, session teardown, or module unload
- A timing-dependent race, but the window exists every time a channel is
  unregistered during active multilink transmission

### Step 8.3: Failure Mode
**NULL pointer dereference** at `pch->chan->speed` -> kernel oops/crash.
Severity: **HIGH** (kernel crash, potential data loss from interrupted
operations).

### Step 8.4: Risk-Benefit Ratio
- **Benefit**: HIGH - prevents kernel crash in PPP multilink transmit
  path
- **Risk**: VERY LOW - moving one line, no semantic change, obviously
  correct
- **Ratio**: Strongly favorable

---

## Phase 9: Final Synthesis

### Step 9.1: Evidence
**FOR backporting**:
- Fixes a real NULL pointer dereference race condition
- The race exists since v2.6.12 (all stable trees affected)
- Extremely small change (1 line moved)
- Obviously correct - eliminates invalid intermediate state
- Author is a domain expert (9+ PPP commits)
- Accepted by netdev co-maintainer
- The same root cause was already documented by the earlier fix
  (0417adf367a0af)
- PPP multilink is a commonly used feature

**AGAINST backporting**:
- No crash report or syzbot reproduction (theoretical race, but
  realistic trigger)
- Commit message mentions "prepares for" a subsequent patch (partially
  preparatory)
- No Fixes: tag (expected)

### Step 9.2: Stable Rules Checklist
1. Obviously correct and tested? **YES** - merged by Paolo Abeni
2. Fixes a real bug? **YES** - NULL deref race in multilink transmit
3. Important issue? **YES** - kernel crash (NULL pointer deref)
4. Small and contained? **YES** - 1 line moved in 1 function
5. No new features/APIs? **YES** - pure reorder
6. Applies to stable? **YES** - cleanly to 7.0.y

### Step 9.3: Exception Categories
Not an exception category - this is a standard race condition fix.

---

## Verification

- [Phase 1] Parsed tags: Signed-off-by author (Qingfang Deng) +
  maintainer (Paolo Abeni), Link to patch
- [Phase 2] Diff analysis: 1 line moved in `ppp_unregister_channel()`,
  reordering `ppp_disconnect_channel()` before `WRITE_ONCE(pch->chan,
  NULL)`
- [Phase 2] Verified TOCTOU in `ppp_mp_explode()` lines 1978/1981: reads
  `pch->chan` without `downl` lock
- [Phase 3] git blame: buggy ordering from `^1da177e4c3f41` (Linus
  Torvalds, 2005) - present in all stable trees
- [Phase 3] Related commit `0417adf367a0af` explicitly documents this
  same race condition as known bug
- [Phase 3] Author has 9 commits to drivers/net/ppp/ - domain expert
- [Phase 4] Lore blocked by Anubis; could not verify mailing list
  discussion directly
- [Phase 5] Traced call chain: `ppp_xmit_process` ->
  `__ppp_xmit_process` -> `ppp_push` -> `ppp_mp_explode` - hot transmit
  path under `ppp_xmit_lock`
- [Phase 5] Verified `ppp_disconnect_channel()` takes `ppp_lock()`
  (includes `ppp_xmit_lock`) - mutual exclusion with transmit path
- [Phase 5] Verified `WRITE_ONCE(pch->chan, NULL)` is under
  `chan_sem+downl` only, NOT `ppp_xmit_lock` - confirms race window
- [Phase 6] Code exists in all active stable trees since v2.6.12
- [Phase 6] Patch applies cleanly to 7.0.y; older trees need trivial
  context adjustment
- [Phase 8] Failure mode: NULL pointer dereference -> kernel oops,
  severity HIGH
- UNVERIFIED: Could not access lore.kernel.org to verify if stable was
  requested by a reviewer

The fix is a minimal, obviously correct reordering that eliminates a
real NULL pointer dereference race condition in the PPP multilink
transmit path. The bug has existed since the original kernel tree and
affects all stable trees. The risk is negligible (one line moved) and
the benefit is preventing a kernel crash.

**YES**

 drivers/net/ppp/ppp_generic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
index e9b41777be809..7cd936bc6a7ea 100644
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -3023,12 +3023,12 @@ ppp_unregister_channel(struct ppp_channel *chan)
 	 * This ensures that we have returned from any calls into
 	 * the channel's start_xmit or ioctl routine before we proceed.
 	 */
+	ppp_disconnect_channel(pch);
 	down_write(&pch->chan_sem);
 	spin_lock_bh(&pch->downl);
 	WRITE_ONCE(pch->chan, NULL);
 	spin_unlock_bh(&pch->downl);
 	up_write(&pch->chan_sem);
-	ppp_disconnect_channel(pch);
 
 	pn = ppp_pernet(pch->chan_net);
 	spin_lock_bh(&pn->all_channels_lock);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH AUTOSEL 6.18] netfilter: nfnetlink_queue: make hash table per queue
       [not found] <20260420131539.986432-1-sashal@kernel.org>
                   ` (10 preceding siblings ...)
  2026-04-20 13:08 ` [PATCH AUTOSEL 7.0-5.10] ppp: disconnect channel before nullifying pch->chan Sasha Levin
@ 2026-04-20 13:08 ` Sasha Levin
  2026-04-20 13:08 ` [PATCH AUTOSEL 6.18] net: sfp: add quirks for Hisense and HSGQ GPON ONT SFP modules Sasha Levin
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2026-04-20 13:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Florian Westphal, Scott Mitchell, Sasha Levin, pablo, davem,
	edumazet, kuba, pabeni, netfilter-devel, coreteam, netdev,
	linux-kernel

From: Florian Westphal <fw@strlen.de>

[ Upstream commit 936206e3f6ff411581e615e930263d6f8b78df9d ]

Sharing a global hash table among all queues is tempting, but
it can cause crash:

BUG: KASAN: slab-use-after-free in nfqnl_recv_verdict+0x11ac/0x15e0 [nfnetlink_queue]
[..]
 nfqnl_recv_verdict+0x11ac/0x15e0 [nfnetlink_queue]
 nfnetlink_rcv_msg+0x46a/0x930
 kmem_cache_alloc_node_noprof+0x11e/0x450

struct nf_queue_entry is freed via kfree, but parallel cpu can still
encounter such an nf_queue_entry when walking the list.

Alternative fix is to free the nf_queue_entry via kfree_rcu() instead,
but as we have to alloc/free for each skb this will cause more mem
pressure.

Cc: Scott Mitchell <scott.k.mitch1@gmail.com>
Fixes: e19079adcd26 ("netfilter: nfnetlink_queue: optimize verdict lookup with hash table")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Error: Failed to generate final synthesis

 include/net/netfilter/nf_queue.h |   1 -
 net/netfilter/nfnetlink_queue.c  | 139 +++++++++++--------------------
 2 files changed, 49 insertions(+), 91 deletions(-)

diff --git a/include/net/netfilter/nf_queue.h b/include/net/netfilter/nf_queue.h
index 45eb26b2e95b3..d17035d14d96c 100644
--- a/include/net/netfilter/nf_queue.h
+++ b/include/net/netfilter/nf_queue.h
@@ -23,7 +23,6 @@ struct nf_queue_entry {
 	struct nf_hook_state	state;
 	bool			nf_ct_is_unconfirmed;
 	u16			size; /* sizeof(entry) + saved route keys */
-	u16			queue_num;
 
 	/* extra space to store route keys */
 };
diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
index a39d3b989063c..fe5942535245d 100644
--- a/net/netfilter/nfnetlink_queue.c
+++ b/net/netfilter/nfnetlink_queue.c
@@ -49,8 +49,8 @@
 #endif
 
 #define NFQNL_QMAX_DEFAULT 1024
-#define NFQNL_HASH_MIN     1024
-#define NFQNL_HASH_MAX     1048576
+#define NFQNL_HASH_MIN     8
+#define NFQNL_HASH_MAX     32768
 
 /* We're using struct nlattr which has 16bit nla_len. Note that nla_len
  * includes the header length. Thus, the maximum packet length that we
@@ -60,29 +60,10 @@
  */
 #define NFQNL_MAX_COPY_RANGE (0xffff - NLA_HDRLEN)
 
-/* Composite key for packet lookup: (net, queue_num, packet_id) */
-struct nfqnl_packet_key {
-	possible_net_t net;
-	u32 packet_id;
-	u16 queue_num;
-} __aligned(sizeof(u32));  /* jhash2 requires 32-bit alignment */
-
-/* Global rhashtable - one for entire system, all netns */
-static struct rhashtable nfqnl_packet_map __read_mostly;
-
-/* Helper to initialize composite key */
-static inline void nfqnl_init_key(struct nfqnl_packet_key *key,
-				  struct net *net, u32 packet_id, u16 queue_num)
-{
-	memset(key, 0, sizeof(*key));
-	write_pnet(&key->net, net);
-	key->packet_id = packet_id;
-	key->queue_num = queue_num;
-}
-
 struct nfqnl_instance {
 	struct hlist_node hlist;		/* global list of queues */
-	struct rcu_head rcu;
+	struct rhashtable nfqnl_packet_map;
+	struct rcu_work	rwork;
 
 	u32 peer_portid;
 	unsigned int queue_maxlen;
@@ -106,6 +87,7 @@ struct nfqnl_instance {
 
 typedef int (*nfqnl_cmpfn)(struct nf_queue_entry *, unsigned long);
 
+static struct workqueue_struct *nfq_cleanup_wq __read_mostly;
 static unsigned int nfnl_queue_net_id __read_mostly;
 
 #define INSTANCE_BUCKETS	16
@@ -124,34 +106,10 @@ static inline u_int8_t instance_hashfn(u_int16_t queue_num)
 	return ((queue_num >> 8) ^ queue_num) % INSTANCE_BUCKETS;
 }
 
-/* Extract composite key from nf_queue_entry for hashing */
-static u32 nfqnl_packet_obj_hashfn(const void *data, u32 len, u32 seed)
-{
-	const struct nf_queue_entry *entry = data;
-	struct nfqnl_packet_key key;
-
-	nfqnl_init_key(&key, entry->state.net, entry->id, entry->queue_num);
-
-	return jhash2((u32 *)&key, sizeof(key) / sizeof(u32), seed);
-}
-
-/* Compare stack-allocated key against entry */
-static int nfqnl_packet_obj_cmpfn(struct rhashtable_compare_arg *arg,
-				  const void *obj)
-{
-	const struct nfqnl_packet_key *key = arg->key;
-	const struct nf_queue_entry *entry = obj;
-
-	return !net_eq(entry->state.net, read_pnet(&key->net)) ||
-	       entry->queue_num != key->queue_num ||
-	       entry->id != key->packet_id;
-}
-
 static const struct rhashtable_params nfqnl_rhashtable_params = {
 	.head_offset = offsetof(struct nf_queue_entry, hash_node),
-	.key_len = sizeof(struct nfqnl_packet_key),
-	.obj_hashfn = nfqnl_packet_obj_hashfn,
-	.obj_cmpfn = nfqnl_packet_obj_cmpfn,
+	.key_offset = offsetof(struct nf_queue_entry, id),
+	.key_len = sizeof(u32),
 	.automatic_shrinking = true,
 	.min_size = NFQNL_HASH_MIN,
 	.max_size = NFQNL_HASH_MAX,
@@ -190,6 +148,10 @@ instance_create(struct nfnl_queue_net *q, u_int16_t queue_num, u32 portid)
 	spin_lock_init(&inst->lock);
 	INIT_LIST_HEAD(&inst->queue_list);
 
+	err = rhashtable_init(&inst->nfqnl_packet_map, &nfqnl_rhashtable_params);
+	if (err < 0)
+		goto out_free;
+
 	spin_lock(&q->instances_lock);
 	if (instance_lookup(q, queue_num)) {
 		err = -EEXIST;
@@ -210,6 +172,8 @@ instance_create(struct nfnl_queue_net *q, u_int16_t queue_num, u32 portid)
 
 out_unlock:
 	spin_unlock(&q->instances_lock);
+	rhashtable_destroy(&inst->nfqnl_packet_map);
+out_free:
 	kfree(inst);
 	return ERR_PTR(err);
 }
@@ -217,15 +181,18 @@ instance_create(struct nfnl_queue_net *q, u_int16_t queue_num, u32 portid)
 static void nfqnl_flush(struct nfqnl_instance *queue, nfqnl_cmpfn cmpfn,
 			unsigned long data);
 
-static void
-instance_destroy_rcu(struct rcu_head *head)
+static void instance_destroy_work(struct work_struct *work)
 {
-	struct nfqnl_instance *inst = container_of(head, struct nfqnl_instance,
-						   rcu);
+	struct nfqnl_instance *inst;
 
+	inst = container_of(to_rcu_work(work), struct nfqnl_instance,
+			    rwork);
 	rcu_read_lock();
 	nfqnl_flush(inst, NULL, 0);
 	rcu_read_unlock();
+
+	rhashtable_destroy(&inst->nfqnl_packet_map);
+
 	kfree(inst);
 	module_put(THIS_MODULE);
 }
@@ -234,7 +201,9 @@ static void
 __instance_destroy(struct nfqnl_instance *inst)
 {
 	hlist_del_rcu(&inst->hlist);
-	call_rcu(&inst->rcu, instance_destroy_rcu);
+
+	INIT_RCU_WORK(&inst->rwork, instance_destroy_work);
+	queue_rcu_work(nfq_cleanup_wq, &inst->rwork);
 }
 
 static void
@@ -250,9 +219,7 @@ __enqueue_entry(struct nfqnl_instance *queue, struct nf_queue_entry *entry)
 {
 	int err;
 
-	entry->queue_num = queue->queue_num;
-
-	err = rhashtable_insert_fast(&nfqnl_packet_map, &entry->hash_node,
+	err = rhashtable_insert_fast(&queue->nfqnl_packet_map, &entry->hash_node,
 				     nfqnl_rhashtable_params);
 	if (unlikely(err))
 		return err;
@@ -266,23 +233,19 @@ __enqueue_entry(struct nfqnl_instance *queue, struct nf_queue_entry *entry)
 static void
 __dequeue_entry(struct nfqnl_instance *queue, struct nf_queue_entry *entry)
 {
-	rhashtable_remove_fast(&nfqnl_packet_map, &entry->hash_node,
+	rhashtable_remove_fast(&queue->nfqnl_packet_map, &entry->hash_node,
 			       nfqnl_rhashtable_params);
 	list_del(&entry->list);
 	queue->queue_total--;
 }
 
 static struct nf_queue_entry *
-find_dequeue_entry(struct nfqnl_instance *queue, unsigned int id,
-		   struct net *net)
+find_dequeue_entry(struct nfqnl_instance *queue, unsigned int id)
 {
-	struct nfqnl_packet_key key;
 	struct nf_queue_entry *entry;
 
-	nfqnl_init_key(&key, net, id, queue->queue_num);
-
 	spin_lock_bh(&queue->lock);
-	entry = rhashtable_lookup_fast(&nfqnl_packet_map, &key,
+	entry = rhashtable_lookup_fast(&queue->nfqnl_packet_map, &id,
 				       nfqnl_rhashtable_params);
 
 	if (entry)
@@ -1531,7 +1494,7 @@ static int nfqnl_recv_verdict(struct sk_buff *skb, const struct nfnl_info *info,
 
 	verdict = ntohl(vhdr->verdict);
 
-	entry = find_dequeue_entry(queue, ntohl(vhdr->id), info->net);
+	entry = find_dequeue_entry(queue, ntohl(vhdr->id));
 	if (entry == NULL)
 		return -ENOENT;
 
@@ -1880,40 +1843,38 @@ static int __init nfnetlink_queue_init(void)
 {
 	int status;
 
-	status = rhashtable_init(&nfqnl_packet_map, &nfqnl_rhashtable_params);
-	if (status < 0)
-		return status;
+	nfq_cleanup_wq = alloc_ordered_workqueue("nfq_workqueue", 0);
+	if (!nfq_cleanup_wq)
+		return -ENOMEM;
 
 	status = register_pernet_subsys(&nfnl_queue_net_ops);
-	if (status < 0) {
-		pr_err("failed to register pernet ops\n");
-		goto cleanup_rhashtable;
-	}
+	if (status < 0)
+		goto cleanup_pernet_subsys;
 
-	netlink_register_notifier(&nfqnl_rtnl_notifier);
-	status = nfnetlink_subsys_register(&nfqnl_subsys);
-	if (status < 0) {
-		pr_err("failed to create netlink socket\n");
-		goto cleanup_netlink_notifier;
-	}
+	status = netlink_register_notifier(&nfqnl_rtnl_notifier);
+	if (status < 0)
+	       goto cleanup_rtnl_notifier;
 
 	status = register_netdevice_notifier(&nfqnl_dev_notifier);
-	if (status < 0) {
-		pr_err("failed to register netdevice notifier\n");
-		goto cleanup_netlink_subsys;
-	}
+	if (status < 0)
+		goto cleanup_dev_notifier;
+
+	status = nfnetlink_subsys_register(&nfqnl_subsys);
+	if (status < 0)
+		goto cleanup_nfqnl_subsys;
 
 	nf_register_queue_handler(&nfqh);
 
 	return status;
 
-cleanup_netlink_subsys:
-	nfnetlink_subsys_unregister(&nfqnl_subsys);
-cleanup_netlink_notifier:
+cleanup_nfqnl_subsys:
+	unregister_netdevice_notifier(&nfqnl_dev_notifier);
+cleanup_dev_notifier:
 	netlink_unregister_notifier(&nfqnl_rtnl_notifier);
+cleanup_rtnl_notifier:
 	unregister_pernet_subsys(&nfnl_queue_net_ops);
-cleanup_rhashtable:
-	rhashtable_destroy(&nfqnl_packet_map);
+cleanup_pernet_subsys:
+	destroy_workqueue(nfq_cleanup_wq);
 	return status;
 }
 
@@ -1924,9 +1885,7 @@ static void __exit nfnetlink_queue_fini(void)
 	nfnetlink_subsys_unregister(&nfqnl_subsys);
 	netlink_unregister_notifier(&nfqnl_rtnl_notifier);
 	unregister_pernet_subsys(&nfnl_queue_net_ops);
-
-	rhashtable_destroy(&nfqnl_packet_map);
-
+	destroy_workqueue(nfq_cleanup_wq);
 	rcu_barrier(); /* Wait for completion of call_rcu()'s */
 }
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH AUTOSEL 6.18] net: sfp: add quirks for Hisense and HSGQ GPON ONT SFP modules
       [not found] <20260420131539.986432-1-sashal@kernel.org>
                   ` (11 preceding siblings ...)
  2026-04-20 13:08 ` [PATCH AUTOSEL 6.18] netfilter: nfnetlink_queue: make hash table per queue Sasha Levin
@ 2026-04-20 13:08 ` Sasha Levin
  2026-04-20 13:08 ` [PATCH AUTOSEL 6.18] ixgbevf: add missing negotiate_features op to Hyper-V ops table Sasha Levin
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2026-04-20 13:08 UTC (permalink / raw)
  To: patches, stable
  Cc: John Pavlick, Russell King (Oracle), Marcin Nita, Jakub Kicinski,
	Sasha Levin, linux, andrew, hkallweit1, davem, edumazet, pabeni,
	netdev, linux-kernel

From: John Pavlick <jspavlick@posteo.net>

[ Upstream commit 95aca8602ef70ffd3d971675751c81826e124f90 ]

Several GPON ONT SFP sticks based on Realtek RTL960x report
1000BASE-LX at 1300MBd in their EEPROM but can operate at 2500base-X.
On hosts capable of 2500base-X (e.g. Banana Pi R3 / MT7986), the
kernel negotiates only 1G because it trusts the incorrect EEPROM data.

Add quirks for:
- Hisense-Leox LXT-010S-H
- Hisense ZNID-GPON-2311NA
- HSGQ HSGQ-XPON-Stick

Each quirk advertises 2500base-X and ignores TX_FAULT during the
module's ~40s Linux boot time.

Tested on Banana Pi R3 (MT7986) with OpenWrt 25.12.1, confirmed
2.5Gbps link and full throughput with flow offloading.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Suggested-by: Marcin Nita <marcin.nita@leolabs.pl>
Signed-off-by: John Pavlick <jspavlick@posteo.net>
Link: https://patch.msgid.link/20260406132321.72563-1-jspavlick@posteo.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Error: Failed to generate final synthesis

 drivers/net/phy/sfp.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 7a85b758fb1e6..c62e3f364ea73 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -543,6 +543,22 @@ static const struct sfp_quirk sfp_quirks[] = {
 	SFP_QUIRK("HUAWEI", "MA5671A", sfp_quirk_2500basex,
 		  sfp_fixup_ignore_tx_fault_and_los),
 
+	// Hisense LXT-010S-H is a GPON ONT SFP (sold as LEOX LXT-010S-H) that
+	// can operate at 2500base-X, but reports 1000BASE-LX / 1300MBd in its
+	// EEPROM
+	SFP_QUIRK("Hisense-Leox", "LXT-010S-H", sfp_quirk_2500basex,
+		  sfp_fixup_ignore_tx_fault),
+
+	// Hisense ZNID-GPON-2311NA can operate at 2500base-X, but reports
+	// 1000BASE-LX / 1300MBd in its EEPROM
+	SFP_QUIRK("Hisense", "ZNID-GPON-2311NA", sfp_quirk_2500basex,
+		  sfp_fixup_ignore_tx_fault),
+
+	// HSGQ HSGQ-XPON-Stick can operate at 2500base-X, but reports
+	// 1000BASE-LX / 1300MBd in its EEPROM
+	SFP_QUIRK("HSGQ", "HSGQ-XPON-Stick", sfp_quirk_2500basex,
+		  sfp_fixup_ignore_tx_fault),
+
 	// Lantech 8330-262D-E and 8330-265D can operate at 2500base-X, but
 	// incorrectly report 2500MBd NRZ in their EEPROM.
 	// Some 8330-265D modules have inverted LOS, while all of them report
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH AUTOSEL 6.18] ixgbevf: add missing negotiate_features op to Hyper-V ops table
       [not found] <20260420131539.986432-1-sashal@kernel.org>
                   ` (12 preceding siblings ...)
  2026-04-20 13:08 ` [PATCH AUTOSEL 6.18] net: sfp: add quirks for Hisense and HSGQ GPON ONT SFP modules Sasha Levin
@ 2026-04-20 13:08 ` Sasha Levin
  2026-04-20 13:09 ` [PATCH AUTOSEL 7.0-6.18] wifi: ath12k: Fix the assignment of logical link index Sasha Levin
  2026-04-20 13:09 ` [PATCH AUTOSEL 6.18] Bluetooth: hci_sync: annotate data-races around hdev->req_status Sasha Levin
  15 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2026-04-20 13:08 UTC (permalink / raw)
  To: patches, stable
  Cc: Michal Schmidt, Xiaoqiang Xiong, Aleksandr Loktionov, Tony Nguyen,
	Sasha Levin, przemyslaw.kitszel, andrew+netdev, davem, edumazet,
	kuba, pabeni, jacob.e.keller, jedrzej.jagielski, intel-wired-lan,
	netdev, linux-kernel

From: Michal Schmidt <mschmidt@redhat.com>

[ Upstream commit 4821d563cd7f251ae728be1a6d04af82a294a5b9 ]

Commit a7075f501bd3 ("ixgbevf: fix mailbox API compatibility by
negotiating supported features") added the .negotiate_features callback
to ixgbe_mac_operations and populated it in ixgbevf_mac_ops, but forgot
to add it to ixgbevf_hv_mac_ops. This leaves the function pointer NULL
on Hyper-V VMs.

During probe, ixgbevf_negotiate_api() calls ixgbevf_set_features(),
which unconditionally dereferences hw->mac.ops.negotiate_features().
On Hyper-V this results in a NULL pointer dereference:

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  [...]
  Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine [...]
  Workqueue: events work_for_cpu_fn
  RIP: 0010:0x0
  [...]
  Call Trace:
   ixgbevf_negotiate_api+0x66/0x160 [ixgbevf]
   ixgbevf_sw_init+0xe4/0x1f0 [ixgbevf]
   ixgbevf_probe+0x20f/0x4a0 [ixgbevf]
   local_pci_probe+0x50/0xa0
   work_for_cpu_fn+0x1a/0x30
   [...]

Add ixgbevf_hv_negotiate_features_vf() that returns -EOPNOTSUPP and
wire it into ixgbevf_hv_mac_ops. The caller already handles -EOPNOTSUPP
gracefully.

Fixes: a7075f501bd3 ("ixgbevf: fix mailbox API compatibility by negotiating supported features")
Reported-by: Xiaoqiang Xiong <xxiong@redhat.com>
Closes: https://issues.redhat.com/browse/RHEL-155455
Assisted-by: Claude:claude-4.6-opus-high Cursor
Tested-by: Xiaoqiang Xiong <xxiong@redhat.com>
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Error: Failed to generate final synthesis

 drivers/net/ethernet/intel/ixgbevf/vf.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbevf/vf.c b/drivers/net/ethernet/intel/ixgbevf/vf.c
index b67b580f7f1c9..f6df86d124b9e 100644
--- a/drivers/net/ethernet/intel/ixgbevf/vf.c
+++ b/drivers/net/ethernet/intel/ixgbevf/vf.c
@@ -709,6 +709,12 @@ static int ixgbevf_negotiate_features_vf(struct ixgbe_hw *hw, u32 *pf_features)
 	return err;
 }
 
+static int ixgbevf_hv_negotiate_features_vf(struct ixgbe_hw *hw,
+					    u32 *pf_features)
+{
+	return -EOPNOTSUPP;
+}
+
 /**
  *  ixgbevf_set_vfta_vf - Set/Unset VLAN filter table address
  *  @hw: pointer to the HW structure
@@ -1142,6 +1148,7 @@ static const struct ixgbe_mac_operations ixgbevf_hv_mac_ops = {
 	.setup_link		= ixgbevf_setup_mac_link_vf,
 	.check_link		= ixgbevf_hv_check_mac_link_vf,
 	.negotiate_api_version	= ixgbevf_hv_negotiate_api_version_vf,
+	.negotiate_features	= ixgbevf_hv_negotiate_features_vf,
 	.set_rar		= ixgbevf_hv_set_rar_vf,
 	.update_mc_addr_list	= ixgbevf_hv_update_mc_addr_list_vf,
 	.update_xcast_mode	= ixgbevf_hv_update_xcast_mode,
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH AUTOSEL 7.0-6.18] wifi: ath12k: Fix the assignment of logical link index
       [not found] <20260420131539.986432-1-sashal@kernel.org>
                   ` (13 preceding siblings ...)
  2026-04-20 13:08 ` [PATCH AUTOSEL 6.18] ixgbevf: add missing negotiate_features op to Hyper-V ops table Sasha Levin
@ 2026-04-20 13:09 ` Sasha Levin
  2026-04-20 13:09 ` [PATCH AUTOSEL 6.18] Bluetooth: hci_sync: annotate data-races around hdev->req_status Sasha Levin
  15 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2026-04-20 13:09 UTC (permalink / raw)
  To: patches, stable
  Cc: Manish Dharanenthiran, Roopni Devanathan, Rameshkumar Sundaram,
	Baochen Qiang, Vasanthakumar Thiagarajan, Jeff Johnson,
	Sasha Levin, kvalo, davem, kuba, linux-wireless, netdev,
	linux-kernel

From: Manish Dharanenthiran <manish.dharanenthiran@oss.qualcomm.com>

[ Upstream commit aecb569d7fb689e3e5b0005ca7bd0a2ef28915e8 ]

Per-link logical index is assigned from the global counter,
ahsta->num_peer. This logical index is sent to firmware during peer
association. If there is a failure in creating a link station,
ath12k_mac_free_unassign_link_sta() clears the link, but does not decrement
the logical link index. This will result in a higher logical link index for
the next link station created. Also, because every new link station is
assigned the incremented num_peer, the leaked indices add up and the index
can eventually exceed the maximum valid value of 15.

As an example, consider a 2 GHz + 5 GHz + 6 GHz MLO setup whose links have
logical link indices 0, 1 and 2, respectively. If the 5 GHz link is
removed, logical link index 1 becomes available, but num_peer is not
reduced to 2 and remains at 3. If a new 5 GHz link is added later, it gets
index 3 instead of reusing link index 1. Also, num_peer is increased to 4,
though only 3 links are present.

To resolve these, create a bitmap, free_logical_link_idx, that tracks the
available logical link indices. When a link station is created, select the
first free logical index and when a link station is removed, mark its
logical link index as available by setting the bit.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.6-01181-QCAHKSWPL_SILICONZ-1

Signed-off-by: Manish Dharanenthiran <manish.dharanenthiran@oss.qualcomm.com>
Signed-off-by: Roopni Devanathan <roopni.devanathan@oss.qualcomm.com>
Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20260226041911.2434999-1-roopni.devanathan@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have all the information I need. Let me compile the final
analysis.

---

## Complete Analysis

### PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1:**
- Subsystem: `wifi: ath12k`
- Action verb: "Fix"
- Summary: Fixes the assignment of logical link index for MLO (Multi-
  Link Operation) link stations.

**Step 1.2: Tags:**
- `Tested-on: QCN9274 hw2.0 PCI` — hardware tested
- `Signed-off-by: Manish Dharanenthiran` — original author
- `Signed-off-by: Roopni Devanathan` — submitter
- `Reviewed-by: Rameshkumar Sundaram`, `Baochen Qiang`, `Vasanthakumar
  Thiagarajan` — 3 Qualcomm reviewers
- `Link:` to patch.msgid.link — original submission
- `Signed-off-by: Jeff Johnson` — ath12k maintainer applied it
- No Fixes: tag, no Reported-by, no syzbot, no Cc: stable — expected for
  autoselection candidates.

**Step 1.3:** The commit message describes a clear bug: `num_peer` is a
monotonically incrementing counter used to assign logical link indices.
When links are removed, the counter is never decremented, so index
values leak. After enough link add/remove cycles, the index can exceed
the firmware's maximum valid value of 15.
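
The leak can be reproduced with a small userspace sketch of the old
counter scheme. `demo_sta`, `demo_assign_link_idx()` and
`demo_free_link_idx()` are hypothetical names used only for
illustration; the one detail taken from the patch is that the index
comes from `num_peer++` and that removal never touches the counter.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-station state. */
struct demo_sta {
	uint8_t num_peer;	/* only ever incremented */
};

/* Old scheme: hand out the next counter value as the link index. */
static uint8_t demo_assign_link_idx(struct demo_sta *sta)
{
	return sta->num_peer++;
}

/* Old scheme: removing a link does not give its index back. */
static void demo_free_link_idx(struct demo_sta *sta, uint8_t idx)
{
	(void)sta;
	(void)idx;
}
```

Replaying the commit message's example (assign three links, free the
second, assign again) hands out index 3 and leaves `num_peer` at 4
while only three links exist.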

**Step 1.4:** This is NOT a hidden bug fix — the subject explicitly says
"Fix".

### PHASE 2: DIFF ANALYSIS

**Step 2.1:**
- `core.h`: 1 line changed (`u8 num_peer` -> `u16
  free_logical_link_idx_map`)
- `mac.c`: 16 lines changed (14 insertions, 2 deletions) across 3 functions
- Functions modified: `ath12k_mac_free_unassign_link_sta`,
  `ath12k_mac_assign_link_sta`, `ath12k_mac_op_sta_state`
- Scope: well-contained, single-subsystem fix

**Step 2.2:**
- In `ath12k_mac_free_unassign_link_sta`: adds
  `ahsta->free_logical_link_idx_map |= BIT(arsta->link_idx)` — returns
  the freed index to the pool
- In `ath12k_mac_assign_link_sta`: replaces `arsta->link_idx =
  ahsta->num_peer++` with bitmap-based allocation using `__ffs()` + adds
  `-ENOSPC` check
- In `ath12k_mac_op_sta_state`: initializes
  `ahsta->free_logical_link_idx_map = U16_MAX` when a new station is
  created (all bits set = all indices free)

**Step 2.3:** Bug category: Logic/correctness bug — resource index leak.
The old approach only increments, never reuses indices. The new bitmap
approach properly tracks available indices.

**Step 2.4:** Fix quality:
- The fix is correct — bitmap tracks available indices, `__ffs` gets the
  lowest free bit, removal sets the bit back
- It adds a proper `-ENOSPC` check for when all indices are exhausted
- Minimal regression risk — the logic is straightforward and only
  touches the specific allocation/deallocation paths
- The U16_MAX initialization means 16 indices (0-15), which matches the
  firmware's maximum
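
A userspace sketch of the bitmap scheme described in Steps 2.2 and 2.4,
assuming a 16-bit free map as in the patch. The names (`demo_map_sta`,
`demo_alloc_link_idx()`, and so on) are hypothetical, and POSIX `ffs()`
stands in for the kernel's `__ffs()`: `ffs()` is 1-based and returns 0
for an empty word, whereas `__ffs()` is 0-based and undefined for 0,
which is why the empty-map check has to come first.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <strings.h>	/* ffs() */

/* Hypothetical stand-in for the per-station state. */
struct demo_map_sta {
	uint16_t free_idx_map;	/* bit n set means index n is free */
};

/* New station: all 16 indices (0-15) start out free. */
static void demo_map_sta_init(struct demo_map_sta *sta)
{
	sta->free_idx_map = UINT16_MAX;
}

/* Take the lowest free index, or fail when none is left. */
static int demo_alloc_link_idx(struct demo_map_sta *sta)
{
	int idx;

	if (!sta->free_idx_map)
		return -ENOSPC;

	idx = ffs(sta->free_idx_map) - 1;	/* convert to 0-based */
	sta->free_idx_map &= (uint16_t)~(1u << idx);
	return idx;
}

/* Return an index to the pool so a later link can reuse it. */
static void demo_free_link_idx(struct demo_map_sta *sta, int idx)
{
	assert(idx >= 0 && idx < 16);
	sta->free_idx_map |= (uint16_t)(1u << idx);
}
```

After allocating indices 0, 1 and 2 and freeing index 1, the next
allocation returns 1 again, and the 17th allocation without any free
returns `-ENOSPC`.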

### PHASE 3: GIT HISTORY INVESTIGATION

**Step 3.1:** `git blame` confirms both the buggy code (`num_peer++` at
line 7124) and the incomplete cleanup function were introduced by the
same commit: `8e6f8bc286031` ("Add MLO station state change handling")
by Sriram R, dated 2024-11-21, first in v6.14-rc1.

**Step 3.2:** No Fixes: tag present. The bug was introduced by
8e6f8bc286031.

**Step 3.3:** No intermediate fixes for the same issue. No prerequisites
found — the patch modifies code that exists in the tree as-is.

**Step 3.4:** The author (Manish Dharanenthiran) is a regular ath12k
contributor with 9+ commits in the subsystem. Jeff Johnson (ath12k
maintainer) applied it.

**Step 3.5:** This is a standalone single-patch fix. No dependencies on
other commits.

### PHASE 4: MAILING LIST RESEARCH

Lore was not accessible due to anti-bot protection. b4 dig could not
find the exact commit (it hasn't landed in the main tree yet from the
perspective of this 7.0 tree). The patch was sent to
`ath12k@lists.infradead.org` and `linux-wireless@vger.kernel.org`. It
was reviewed by 3 Qualcomm engineers and applied by the ath12k
maintainer Jeff Johnson.

### PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1:** Modified functions: `ath12k_mac_free_unassign_link_sta`,
`ath12k_mac_assign_link_sta`, `ath12k_mac_op_sta_state`

**Step 5.2:** `arsta->link_idx` is used in `ath12k_peer_assoc_h_mlo()`
(line 3531) to populate `ml->logical_link_idx` which is sent to firmware
via `wmi.c` line 2348 as `ml_params->logical_link_idx`. This is a WMI
command parameter — an invalid value directly impacts firmware behavior.

**Step 5.4:** The path: `ath12k_mac_op_sta_state` ->
`ath12k_mac_assign_link_sta` -> sets `link_idx` -> later used in
`ath12k_peer_assoc_h_mlo` -> sent via WMI to firmware. This is a
standard MLO station association path triggered during Wi-Fi connection
setup.

### PHASE 6: STABLE TREE ANALYSIS

**Step 6.1:** The buggy code (`num_peer` field) was introduced in commit
`8e6f8bc286031`, first in v6.14-rc1. It is:
- **NOT in v6.13, v6.12, or any earlier LTS tree**
- Present in v6.14, v6.15, v6.16, v6.17, v6.18, v6.19, v7.0

For the 7.0.y stable tree specifically, the buggy code IS present.

**Step 6.2:** The code in v7.0 matches exactly what the patch expects
(verified by reading lines 7096-7137 and 6771-6798 of mac.c). The patch
should apply cleanly.

### PHASE 7: SUBSYSTEM CONTEXT

**Step 7.1:** Subsystem: wireless driver (ath12k) — IMPORTANT for WiFi 7
users with Qualcomm QCN9274 and similar chipsets. MLO is a key WiFi 7
feature.

**Step 7.2:** ath12k is very actively developed (183 commits to mac.c
between v6.14 and v7.0).

### PHASE 8: IMPACT AND RISK ASSESSMENT

**Step 8.1:** Affected users: Users of Qualcomm ath12k WiFi 7 hardware
with MLO enabled (QCN9274, etc.).

**Step 8.2:** Trigger: Happens when MLO links are removed and re-added —
occurs during roaming, channel switching, or temporary link degradation.
In a typical MLO setup with frequent link changes, this can be triggered
relatively easily.

**Step 8.3:** Failure mode: Sending an invalid logical link index (>15)
to firmware can cause firmware malfunction, potential firmware crash, or
incorrect MLO behavior. Severity: **HIGH** — firmware receives invalid
commands.

**Step 8.4:**
- Benefit: Prevents firmware from receiving invalid index values during
  MLO operations, which could cause connection instability or firmware
  crashes
- Risk: LOW — the change is ~20 lines, well-contained, uses standard
  bitmap operations, reviewed by 3 engineers plus maintainer
- Ratio: Favorable

### PHASE 9: FINAL SYNTHESIS

**Evidence FOR backporting:**
- Fixes a real, clearly described bug (index leak leading to invalid
  firmware commands)
- Small, well-contained fix (~20 lines across 2 files)
- Obviously correct bitmap-based approach
- 3 Reviewed-by tags from Qualcomm engineers + maintainer sign-off
- Tested on hardware (QCN9274 hw2.0)
- The buggy code exists in v7.0 (target tree)
- Should apply cleanly

**Evidence AGAINST backporting:**
- No Fixes: tag, no syzbot/crash report, no user-reported symptom
- Replaces a data structure member (counter -> bitmap), slightly more
  than a trivial patch
- MLO is relatively new; impact limited to WiFi 7 MLO users
- No explicit Cc: stable nomination
- The bug requires specific MLO link add/remove cycles to manifest

**Stable rules checklist:**
1. Obviously correct and tested? **Yes** — bitmap approach is
   straightforward, hardware tested
2. Fixes a real bug? **Yes** — index leak causing invalid firmware
   commands
3. Important issue? **Medium-High** — can cause firmware malfunction in
   MLO
4. Small and contained? **Yes** — ~20 lines in 2 files, single subsystem
5. No new features? **Correct** — pure bug fix
6. Can apply to stable? **Yes** — code matches cleanly

**Verification:**
- [Phase 1] Parsed tags: 3 Reviewed-by, tested-on, Link, maintainer SOB
- [Phase 2] Diff analysis: replaces `num_peer` counter with
  `free_logical_link_idx_map` bitmap across 3 functions
- [Phase 3] git blame: buggy code introduced in 8e6f8bc286031
  (v6.14-rc1), confirmed by `git merge-base --is-ancestor` checks
- [Phase 3] `git merge-base --is-ancestor 8e6f8bc286031 v6.13` → Not in
  v6.13; `v6.14` → Yes
- [Phase 3] Author has 9+ ath12k commits, is a regular contributor
- [Phase 4] b4 dig could not find the commit (not merged in this tree);
  lore blocked by anti-bot
- [Phase 5] Traced `link_idx` usage: set in `ath12k_mac_assign_link_sta`
  → used in `ath12k_peer_assoc_h_mlo` (line 3531) → sent to firmware via
  WMI (wmi.c line 2348)
- [Phase 6] Buggy code confirmed present in v7.0 tree; code at lines
  7096-7137 matches patch context exactly
- [Phase 6] NOT in v6.13 or earlier LTS trees (v6.12.y, v6.6.y, etc.)
- [Phase 8] Invalid logical_link_idx sent to firmware can cause firmware
  malfunction
- UNVERIFIED: Could not access lore discussion to check for reviewer
  stable nominations or concerns

This is a real bug fix for a resource index leak in WiFi 7 MLO handling
that can lead to invalid firmware commands. The fix is small, self-
contained, well-reviewed, and hardware-tested. The buggy code exists in
the target 7.0.y tree.

**YES**

 drivers/net/wireless/ath/ath12k/core.h |  2 +-
 drivers/net/wireless/ath/ath12k/mac.c  | 16 ++++++++++++++--
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/ath/ath12k/core.h b/drivers/net/wireless/ath/ath12k/core.h
index 990934ec92fca..5498ff285102b 100644
--- a/drivers/net/wireless/ath/ath12k/core.h
+++ b/drivers/net/wireless/ath/ath12k/core.h
@@ -522,7 +522,7 @@ struct ath12k_sta {
 	u16 links_map;
 	u8 assoc_link_id;
 	u16 ml_peer_id;
-	u8 num_peer;
+	u16 free_logical_link_idx_map;
 
 	enum ieee80211_sta_state state;
 };
diff --git a/drivers/net/wireless/ath/ath12k/mac.c b/drivers/net/wireless/ath/ath12k/mac.c
index b253d1e3f4052..769d240e3ae24 100644
--- a/drivers/net/wireless/ath/ath12k/mac.c
+++ b/drivers/net/wireless/ath/ath12k/mac.c
@@ -6784,6 +6784,8 @@ static void ath12k_mac_free_unassign_link_sta(struct ath12k_hw *ah,
 		return;
 
 	ahsta->links_map &= ~BIT(link_id);
+	ahsta->free_logical_link_idx_map |= BIT(arsta->link_idx);
+
 	rcu_assign_pointer(ahsta->link[link_id], NULL);
 	synchronize_rcu();
 
@@ -7102,6 +7104,7 @@ static int ath12k_mac_assign_link_sta(struct ath12k_hw *ah,
 	struct ieee80211_sta *sta = ath12k_ahsta_to_sta(ahsta);
 	struct ieee80211_link_sta *link_sta;
 	struct ath12k_link_vif *arvif;
+	int link_idx;
 
 	lockdep_assert_wiphy(ah->hw->wiphy);
 
@@ -7120,8 +7123,16 @@ static int ath12k_mac_assign_link_sta(struct ath12k_hw *ah,
 
 	ether_addr_copy(arsta->addr, link_sta->addr);
 
-	/* logical index of the link sta in order of creation */
-	arsta->link_idx = ahsta->num_peer++;
+	if (!ahsta->free_logical_link_idx_map)
+		return -ENOSPC;
+
+	/*
+	 * Allocate a logical link index by selecting the first available bit
+	 * from the free logical index map
+	 */
+	link_idx = __ffs(ahsta->free_logical_link_idx_map);
+	ahsta->free_logical_link_idx_map &= ~BIT(link_idx);
+	arsta->link_idx = link_idx;
 
 	arsta->link_id = link_id;
 	ahsta->links_map |= BIT(arsta->link_id);
@@ -7630,6 +7641,7 @@ int ath12k_mac_op_sta_state(struct ieee80211_hw *hw,
 	if (old_state == IEEE80211_STA_NOTEXIST &&
 	    new_state == IEEE80211_STA_NONE) {
 		memset(ahsta, 0, sizeof(*ahsta));
+		ahsta->free_logical_link_idx_map = U16_MAX;
 
 		arsta = &ahsta->deflink;
 
-- 
2.53.0



* [PATCH AUTOSEL 6.18] Bluetooth: hci_sync: annotate data-races around hdev->req_status
       [not found] <20260420131539.986432-1-sashal@kernel.org>
                   ` (14 preceding siblings ...)
  2026-04-20 13:09 ` [PATCH AUTOSEL 7.0-6.18] wifi: ath12k: Fix the assignment of logical link index Sasha Levin
@ 2026-04-20 13:09 ` Sasha Levin
  15 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2026-04-20 13:09 UTC (permalink / raw)
  To: patches, stable
  Cc: Cen Zhang, Luiz Augusto von Dentz, Sasha Levin, marcel,
	johan.hedberg, davem, kuba, linux-bluetooth, netdev, linux-kernel

From: Cen Zhang <zzzccc427@gmail.com>

[ Upstream commit b6807cfc195ef99e1ac37b2e1e60df40295daa8c ]

__hci_cmd_sync_sk() sets hdev->req_status under hdev->req_lock:

    hdev->req_status = HCI_REQ_PEND;

However, several other functions read or write hdev->req_status without
holding any lock:

  - hci_send_cmd_sync() reads req_status in hci_cmd_work (workqueue)
  - hci_cmd_sync_complete() reads/writes from HCI event completion
  - hci_cmd_sync_cancel() / hci_cmd_sync_cancel_sync() read/write
  - hci_abort_conn() reads in connection abort path

Since __hci_cmd_sync_sk() runs on hdev->req_workqueue while
hci_send_cmd_sync() runs on hdev->workqueue, these are different
workqueues that can execute concurrently on different CPUs. The plain
C accesses constitute a data race.

Add READ_ONCE()/WRITE_ONCE() annotations on all concurrent accesses
to hdev->req_status to prevent potential compiler optimizations that
could affect correctness (e.g., load fusing in the wait_event
condition or store reordering).

Signed-off-by: Cen Zhang <zzzccc427@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Error: Failed to generate final synthesis
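
A rough userspace sketch of the annotation pattern from the commit
message. `DEMO_READ_ONCE()` and `DEMO_WRITE_ONCE()` are simplified
approximations built on volatile accesses (using the GNU `__typeof__`
extension), not the kernel macros, and `demo_dev` and the `DEMO_REQ_*`
values are hypothetical. The sketch is single-threaded: it shows only
how the marked accesses are written, so that the compiler performs each
load and store exactly once, and it does not reproduce the race or
provide any ordering between CPUs.

```c
#include <stdint.h>

/* Simplified approximations; assumed to be used on scalar fields only. */
#define DEMO_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))

/* Illustrative state values, not the kernel's definitions. */
enum { DEMO_REQ_PEND = 1, DEMO_REQ_DONE = 2, DEMO_REQ_CANCELED = 3 };

/* Hypothetical stand-in for the device state touched by the patch. */
struct demo_dev {
	uint32_t req_status;
	uint32_t req_result;
};

/* Completion side: same shape as the hci_cmd_sync_complete() hunk. */
static void demo_complete(struct demo_dev *dev, uint32_t result)
{
	if (DEMO_READ_ONCE(dev->req_status) != DEMO_REQ_PEND)
		return;

	dev->req_result = result;
	DEMO_WRITE_ONCE(dev->req_status, DEMO_REQ_DONE);
}

/* Waiter side: every call performs a fresh load of req_status. */
static int demo_still_pending(struct demo_dev *dev)
{
	return DEMO_READ_ONCE(dev->req_status) == DEMO_REQ_PEND;
}
```

A waiter that polls `demo_still_pending()` in a loop re-reads the field
on every iteration, which is the load-fusing concern the commit message
mentions for the `wait_event` condition.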

 net/bluetooth/hci_conn.c |  2 +-
 net/bluetooth/hci_core.c |  2 +-
 net/bluetooth/hci_sync.c | 20 ++++++++++----------
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
index 24b71ec8897ff..71a24be2a6d67 100644
--- a/net/bluetooth/hci_conn.c
+++ b/net/bluetooth/hci_conn.c
@@ -2967,7 +2967,7 @@ int hci_abort_conn(struct hci_conn *conn, u8 reason)
 	 * hci_connect_le serializes the connection attempts so only one
 	 * connection can be in BT_CONNECT at time.
 	 */
-	if (conn->state == BT_CONNECT && hdev->req_status == HCI_REQ_PEND) {
+	if (conn->state == BT_CONNECT && READ_ONCE(hdev->req_status) == HCI_REQ_PEND) {
 		switch (hci_skb_event(hdev->sent_cmd)) {
 		case HCI_EV_CONN_COMPLETE:
 		case HCI_EV_LE_CONN_COMPLETE:
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 8ccec73dce45c..0f86b81b39730 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -4125,7 +4125,7 @@ static int hci_send_cmd_sync(struct hci_dev *hdev, struct sk_buff *skb)
 		kfree_skb(skb);
 	}
 
-	if (hdev->req_status == HCI_REQ_PEND &&
+	if (READ_ONCE(hdev->req_status) == HCI_REQ_PEND &&
 	    !hci_dev_test_and_set_flag(hdev, HCI_CMD_PENDING)) {
 		kfree_skb(hdev->req_skb);
 		hdev->req_skb = skb_clone(hdev->sent_cmd, GFP_KERNEL);
diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
index 9a7bd4a4b14c4..f498ab28f1aa0 100644
--- a/net/bluetooth/hci_sync.c
+++ b/net/bluetooth/hci_sync.c
@@ -25,11 +25,11 @@ static void hci_cmd_sync_complete(struct hci_dev *hdev, u8 result, u16 opcode,
 {
 	bt_dev_dbg(hdev, "result 0x%2.2x", result);
 
-	if (hdev->req_status != HCI_REQ_PEND)
+	if (READ_ONCE(hdev->req_status) != HCI_REQ_PEND)
 		return;
 
 	hdev->req_result = result;
-	hdev->req_status = HCI_REQ_DONE;
+	WRITE_ONCE(hdev->req_status, HCI_REQ_DONE);
 
 	/* Free the request command so it is not used as response */
 	kfree_skb(hdev->req_skb);
@@ -167,20 +167,20 @@ struct sk_buff *__hci_cmd_sync_sk(struct hci_dev *hdev, u16 opcode, u32 plen,
 
 	hci_cmd_sync_add(&req, opcode, plen, param, event, sk);
 
-	hdev->req_status = HCI_REQ_PEND;
+	WRITE_ONCE(hdev->req_status, HCI_REQ_PEND);
 
 	err = hci_req_sync_run(&req);
 	if (err < 0)
 		return ERR_PTR(err);
 
 	err = wait_event_interruptible_timeout(hdev->req_wait_q,
-					       hdev->req_status != HCI_REQ_PEND,
+					       READ_ONCE(hdev->req_status) != HCI_REQ_PEND,
 					       timeout);
 
 	if (err == -ERESTARTSYS)
 		return ERR_PTR(-EINTR);
 
-	switch (hdev->req_status) {
+	switch (READ_ONCE(hdev->req_status)) {
 	case HCI_REQ_DONE:
 		err = -bt_to_errno(hdev->req_result);
 		break;
@@ -194,7 +194,7 @@ struct sk_buff *__hci_cmd_sync_sk(struct hci_dev *hdev, u16 opcode, u32 plen,
 		break;
 	}
 
-	hdev->req_status = 0;
+	WRITE_ONCE(hdev->req_status, 0);
 	hdev->req_result = 0;
 	skb = hdev->req_rsp;
 	hdev->req_rsp = NULL;
@@ -665,9 +665,9 @@ void hci_cmd_sync_cancel(struct hci_dev *hdev, int err)
 {
 	bt_dev_dbg(hdev, "err 0x%2.2x", err);
 
-	if (hdev->req_status == HCI_REQ_PEND) {
+	if (READ_ONCE(hdev->req_status) == HCI_REQ_PEND) {
 		hdev->req_result = err;
-		hdev->req_status = HCI_REQ_CANCELED;
+		WRITE_ONCE(hdev->req_status, HCI_REQ_CANCELED);
 
 		queue_work(hdev->workqueue, &hdev->cmd_sync_cancel_work);
 	}
@@ -683,12 +683,12 @@ void hci_cmd_sync_cancel_sync(struct hci_dev *hdev, int err)
 {
 	bt_dev_dbg(hdev, "err 0x%2.2x", err);
 
-	if (hdev->req_status == HCI_REQ_PEND) {
+	if (READ_ONCE(hdev->req_status) == HCI_REQ_PEND) {
 		/* req_result is __u32 so error must be positive to be properly
 		 * propagated.
 		 */
 		hdev->req_result = err < 0 ? -err : err;
-		hdev->req_status = HCI_REQ_CANCELED;
+		WRITE_ONCE(hdev->req_status, HCI_REQ_CANCELED);
 
 		wake_up_interruptible(&hdev->req_wait_q);
 	}
-- 
2.53.0


