[PATCH AUTOSEL 6.19-5.10] ntb: ntb_hw_switchtec: Fix shift-out-of-bounds for 0 mw lut

public inbox for stable@vger.kernel.org

* [PATCH AUTOSEL 6.19-5.10] ntb: ntb_hw_switchtec: Fix shift-out-of-bounds for 0 mw lut
@ 2026-02-23 16:17 Sasha Levin
  2026-02-23 16:17 ` [PATCH AUTOSEL 6.19-5.10] ntb: ntb_hw_switchtec: Fix array-index-out-of-bounds access Sasha Levin
  2026-02-23 16:17 ` [PATCH AUTOSEL 6.19-6.1] rtc: zynqmp: correct frequency value Sasha Levin
  0 siblings, 2 replies; 3+ messages in thread
From: Sasha Levin @ 2026-02-23 16:17 UTC (permalink / raw)
  To: patches, stable
  Cc: Maciej Grochowski, Jon Mason, Sasha Levin, kurt.schwemmer, logang,
	dave.jiang, allenbh, linux-pci, ntb, linux-kernel

From: Maciej Grochowski <Maciej.Grochowski@sony.com>

[ Upstream commit 186615f8855a0be4ee7d3fcd09a8ecc10e783b08 ]

Number of MW LUTs depends on NTB configuration and can be set to zero.
In that scenario rounddown_pow_of_two will cause undefined behaviour and
should not be performed.
This patch ensures that rounddown_pow_of_two is only called on a valid
value.

Signed-off-by: Maciej Grochowski <Maciej.Grochowski@sony.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

The file has been present since 2017 (v4.14 era), so it exists in all
active stable trees.

## Analysis

### Problem
The commit fixes undefined behavior (UB) in `switchtec_ntb_init_mw()`.
When `nr_lut_mw` or `peer_nr_lut_mw` is read as 0 from hardware (via
`ioread16`), calling `rounddown_pow_of_two(0)` results in:

- `1UL << (fls_long(0) - 1)` = `1UL << (0 - 1)`; since `fls_long()`
  returns an unsigned value, `0 - 1` wraps around to a huge shift count

This is explicitly documented as undefined in the kernel: the comment in
`include/linux/log2.h` says "the result is undefined when n == 0". This
is a **shift-out-of-bounds** bug that:
- Triggers UBSAN warnings
- Produces an incorrect (garbage) value for `nr_lut_mw`, which could
  cause further issues downstream

### Fix
The fix adds a simple `if (sndev->nr_lut_mw)` guard before calling
`rounddown_pow_of_two()` in two places — for both `self` and `peer` LUT
MW counts. If the value is 0, it stays 0 (which is correct — no LUT
memory windows).
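
The guard can be illustrated with a small userspace sketch. This is not
the kernel code: `fls_long_sketch()`, `rounddown_pow_of_two_sketch()`
and `sanitize_lut_count()` are hypothetical names that only mimic the
shape `1UL << (fls_long(n) - 1)` quoted above, and the `assert()` stands
in for the documented "undefined when n == 0" precondition.

```c
#include <assert.h>

/* Userspace stand-in for fls_long(): 1-based position of the highest
 * set bit, or 0 when no bit is set. */
static unsigned int fls_long_sketch(unsigned long x)
{
	unsigned int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* Same shape as 1UL << (fls_long(n) - 1); only meaningful for n != 0. */
static unsigned long rounddown_pow_of_two_sketch(unsigned long n)
{
	assert(n != 0); /* with n == 0 the shift count would wrap around */
	return 1UL << (fls_long_sketch(n) - 1);
}

/* Same pattern as the patch: a zero count is left untouched. */
static unsigned int sanitize_lut_count(unsigned int raw)
{
	if (raw)
		raw = (unsigned int)rounddown_pow_of_two_sketch(raw);
	return raw;
}
```

With this sketch, `sanitize_lut_count(0)` returns 0 without ever
reaching the shift, while non-zero counts are rounded down as before.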

### Stable Criteria Assessment
1. **Obviously correct and tested**: Yes — trivially correct. If the
   count is 0, rounding down 0 should remain 0.
2. **Fixes a real bug**: Yes — undefined behavior from shift-out-of-
   bounds. This can cause UBSAN splats and potentially incorrect values.
3. **Important issue**: Medium — UB can have unpredictable consequences
   depending on compiler optimizations. The value 0 is a valid hardware
   configuration.
4. **Small and contained**: Yes: 4 insertions and 2 deletions (2 `if`
   guards added), single file, single function.
5. **No new features**: Correct — purely a bug fix.
6. **Applies cleanly**: The code has been stable since 2017; should
   apply to all active stable trees.

### Risk Assessment
- **Risk**: Extremely low. The guard only adds a check for zero before
  calling a function that explicitly documents UB for zero input.
- **Benefit**: Eliminates undefined behavior and potential UBSAN splats
  on hardware configurations with 0 LUT memory windows.

### Verification
- Verified `rounddown_pow_of_two` is documented as "result is undefined
  when n == 0" in `include/linux/log2.h`
- Verified the implementation: `1UL << (fls_long(n) - 1)` with n=0
  produces `1UL << (0-1)` = shift-out-of-bounds
- Verified the file has existed since 2017 (commit 33dea5aae032),
  present in all active stable trees
- Verified the fix is minimal: 2 `if` guards added, no other behavioral
  changes
- Verified there's a related prior shift fix in the same file
  (ff148d8ac53e5), showing this class of bugs has been addressed before

**YES**

 drivers/ntb/hw/mscc/ntb_hw_switchtec.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
index f15ebab138144..0536521fa6ccc 100644
--- a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
+++ b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
@@ -1202,7 +1202,8 @@ static void switchtec_ntb_init_mw(struct switchtec_ntb *sndev)
 				       sndev->mmio_self_ctrl);

 	sndev->nr_lut_mw = ioread16(&sndev->mmio_self_ctrl->lut_table_entries);
-	sndev->nr_lut_mw = rounddown_pow_of_two(sndev->nr_lut_mw);
+	if (sndev->nr_lut_mw)
+		sndev->nr_lut_mw = rounddown_pow_of_two(sndev->nr_lut_mw);

 	dev_dbg(&sndev->stdev->dev, "MWs: %d direct, %d lut\n",
 		sndev->nr_direct_mw, sndev->nr_lut_mw);
@@ -1212,7 +1213,8 @@ static void switchtec_ntb_init_mw(struct switchtec_ntb *sndev)

 	sndev->peer_nr_lut_mw =
 		ioread16(&sndev->mmio_peer_ctrl->lut_table_entries);
-	sndev->peer_nr_lut_mw = rounddown_pow_of_two(sndev->peer_nr_lut_mw);
+	if (sndev->peer_nr_lut_mw)
+		sndev->peer_nr_lut_mw = rounddown_pow_of_two(sndev->peer_nr_lut_mw);

 	dev_dbg(&sndev->stdev->dev, "Peer MWs: %d direct, %d lut\n",
 		sndev->peer_nr_direct_mw, sndev->peer_nr_lut_mw);
-- 
2.51.0


* [PATCH AUTOSEL 6.19-5.10] ntb: ntb_hw_switchtec: Fix array-index-out-of-bounds access
  2026-02-23 16:17 [PATCH AUTOSEL 6.19-5.10] ntb: ntb_hw_switchtec: Fix shift-out-of-bounds for 0 mw lut Sasha Levin
@ 2026-02-23 16:17 ` Sasha Levin
  2026-02-23 16:17 ` [PATCH AUTOSEL 6.19-6.1] rtc: zynqmp: correct frequency value Sasha Levin
  1 sibling, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2026-02-23 16:17 UTC (permalink / raw)
  To: patches, stable
  Cc: Maciej Grochowski, Jon Mason, Sasha Levin, kurt.schwemmer, logang,
	dave.jiang, allenbh, linux-pci, ntb, linux-kernel

From: Maciej Grochowski <Maciej.Grochowski@sony.com>

[ Upstream commit c8ba7ad2cc1c7b90570aa347b8ebbe279f1eface ]

Number of MW LUTs depends on NTB configuration and can be set to MAX_MWS.
This patch protects against an out-of-bounds index access to mw_sizes.
On an invalid access, print a message telling the user that the
configuration is not valid.

Signed-off-by: Maciej Grochowski <Maciej.Grochowski@sony.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Key findings about the bug:
- `mw_sizes` is declared as `u64 mw_sizes[MAX_MWS]` where `MAX_MWS =
  128`
- `nr_direct_mw` can be at most 6 (MAX_DIRECT_MW)
- `nr_lut_mw` is read from hardware via `ioread16()` and rounded down to
  a power of 2, so by this analysis it can be as large as 256 or 512
- The buggy loop computes `idx = nr_direct_mw + i` where `i` ranges from
  `0` to `nr_lut_mw - 1`
- If `nr_lut_mw` is large enough (e.g., 256), then `idx` can reach or
  exceed `MAX_MWS` (128), causing an out-of-bounds write to
  `mw_sizes[idx]`

This is a real out-of-bounds array access bug. The `nr_lut_mw` value
comes from hardware registers (`ioread16`), and there's no validation
that `nr_direct_mw + nr_lut_mw` stays within `MAX_MWS`. If the hardware
reports a large number of LUT entries, the loop will write past the end
of the `mw_sizes[128]` array, corrupting adjacent memory in the
`shared_mw` structure (the `spad[128]` array) or beyond.
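
The index arithmetic can be sketched in userspace as follows. This is an
illustration only: `struct shared_mw_sketch`, `fill_lut_sizes()` and the
`LUT_SIZE_SKETCH` value are hypothetical stand-ins, and only `MAX_MWS =
128` and the `idx = nr_direct_mw + i` computation come from the analysis
above. The sketch assumes non-negative counts.

```c
#include <stdint.h>

#define MAX_MWS 128                    /* array size quoted in the analysis */
#define LUT_SIZE_SKETCH 0x10000ULL     /* placeholder value (assumption) */

/* Hypothetical, reduced stand-in for the shared structure. */
struct shared_mw_sketch {
	uint64_t mw_sizes[MAX_MWS];
};

/*
 * Fill the LUT entries that follow the direct windows. Stops as soon as
 * the index would leave mw_sizes[] and returns the number of entries
 * actually written.
 */
static int fill_lut_sizes(struct shared_mw_sketch *sh,
			  int nr_direct_mw, int nr_lut_mw)
{
	int i;

	for (i = 0; i < nr_lut_mw; i++) {
		int idx = nr_direct_mw + i;

		if (idx >= MAX_MWS)
			break; /* would write past mw_sizes[MAX_MWS - 1] */

		sh->mw_sizes[idx] = LUT_SIZE_SKETCH;
	}
	return i;
}
```

For example, with 6 direct windows and 256 reported LUT entries, only
indices 6 through 127 are written (122 entries) and the loop stops
before index 128.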

## Analysis

### What the commit fixes
An array-index-out-of-bounds write in `switchtec_ntb_init_shared()`. The
`nr_lut_mw` value is read from hardware registers and can exceed
`MAX_MWS - nr_direct_mw`. When this happens,
`sndev->self_shared->mw_sizes[idx]` writes past the 128-element array
boundary, corrupting the subsequent `spad[128]` field or memory beyond
the structure.

### Bug severity
- **Out-of-bounds write**: This is a memory corruption bug. Writing past
  `mw_sizes` corrupts the `spad` array in the shared memory window
  structure, which could cause unpredictable behavior.
- The shared memory buffer is DMA-allocated (`dma_alloc_coherent`), so
  corrupting it could affect hardware/firmware interaction.
- Triggered by hardware configuration — if a Switchtec NTB device
  reports many LUT table entries, this will fire during driver
  initialization.

### Meets stable criteria
1. **Obviously correct**: The fix adds a simple bounds check `if (idx >=
   MAX_MWS)` before the array access, prints an error, and breaks out of
   the loop. This is straightforward and safe.
2. **Fixes a real bug**: Out-of-bounds array write — memory corruption.
3. **Small and contained**: Only adds 5 lines of bounds-checking code in
   a single function.
4. **No new features**: Pure defensive fix.
5. **Low risk**: The break simply stops filling in MW sizes for indices
   beyond the array — existing valid entries are unaffected.

### Risk assessment
- **Very low risk**. The change is a simple bounds check that prevents
  memory corruption. It cannot break any working configuration — it only
  affects cases where the index would have been out of bounds.
- The affected code has existed since the driver was introduced, so this
  fix applies to all stable trees that include this driver.

### Verification

- Confirmed `MAX_MWS = 128` at line 32, `mw_sizes[MAX_MWS]` at line 38
  of `ntb_hw_switchtec.c`
- Confirmed `nr_lut_mw` is read from hardware via `ioread16()` at line
  1204 and rounded to power of 2 at line 1205 — can be up to 256 or 512
- Confirmed `nr_direct_mw` max is 6 (bounded by `MAX_DIRECT_MW =
  ARRAY_SIZE(bar_entry)` where `bar_entry[6]`)
- Confirmed the `shared_mw` struct layout: `mw_sizes[128]` followed by
  `spad[128]` — OOB write corrupts `spad`
- `git log` shows the file has had other bug fixes backported (shift-
  out-of-bounds, UAF), confirming the driver is in stable trees
- The first loop over `nr_direct_mw` is safe (max index 5), but the
  second loop over `nr_lut_mw` is unbounded before this fix
- Could NOT verify via lore.kernel.org the specific mailing list
  discussion (not fetched), but the commit message and code are clear

**YES**

 drivers/ntb/hw/mscc/ntb_hw_switchtec.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
index f851397b65d6e..f15ebab138144 100644
--- a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
+++ b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
@@ -1314,6 +1314,12 @@ static void switchtec_ntb_init_shared(struct switchtec_ntb *sndev)
 	for (i = 0; i < sndev->nr_lut_mw; i++) {
 		int idx = sndev->nr_direct_mw + i;
 
+		if (idx >= MAX_MWS) {
+			dev_err(&sndev->stdev->dev,
+				"Total number of MW cannot be bigger than %d", MAX_MWS);
+			break;
+		}
+
 		sndev->self_shared->mw_sizes[idx] = LUT_SIZE;
 	}
 }
-- 
2.51.0



* [PATCH AUTOSEL 6.19-6.1] rtc: zynqmp: correct frequency value
  2026-02-23 16:17 [PATCH AUTOSEL 6.19-5.10] ntb: ntb_hw_switchtec: Fix shift-out-of-bounds for 0 mw lut Sasha Levin
  2026-02-23 16:17 ` [PATCH AUTOSEL 6.19-5.10] ntb: ntb_hw_switchtec: Fix array-index-out-of-bounds access Sasha Levin
@ 2026-02-23 16:17 ` Sasha Levin
  1 sibling, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2026-02-23 16:17 UTC (permalink / raw)
  To: patches, stable
  Cc: Tomas Melin, Harini T, Michal Simek, Alexandre Belloni,
	Sasha Levin, linux-rtc, linux-arm-kernel, linux-kernel

From: Tomas Melin <tomas.melin@vaisala.com>

[ Upstream commit 2724fb4d429cbb724dcb6fa17953040918ebe3a2 ]

Fix calibration value in case a clock reference is provided.
The actual calibration value written into register is
frequency - 1.

Reviewed-by: Harini T <harini.t@amd.com>
Tested-by: Harini T <harini.t@amd.com>
Signed-off-by: Tomas Melin <tomas.melin@vaisala.com>
Acked-by: Michal Simek <michal.simek@amd.com>
Link: https://patch.msgid.link/20260122-zynqmp-rtc-updates-v4-1-d4edb966b499@vaisala.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

## Analysis

### What the commit fixes

This commit fixes an off-by-one error in the RTC calibration value when
the frequency is obtained from a clock reference (via `clk_get_rate()`).
The ZynqMP RTC hardware register expects a calibration value of
`frequency - 1`, but the code was writing the raw frequency value
directly.

**Evidence:**
- `RTC_CALIB_DEF` = `0x7FFF` = 32767 = 32768 - 1 (the default is already
  correctly decremented)
- The `"calibration"` DT property presumably provides the register-ready
  value (already freq-1)
- But `clk_get_rate()` returns the raw clock rate (e.g., 32768), so it
  needs the `-1` adjustment
- Without the fix, the RTC counts one extra tick per second, causing
  time drift

### Code change assessment

The fix adds an `else` branch with `xrtcdev->freq--` when the frequency
comes from `clk_get_rate()` (i.e., when `xrtcdev->freq` is non-zero from
the clock). This is a 2-line addition, surgically targeted.
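
The selection logic described above can be sketched as a pure function.
This is a userspace illustration, not the driver code: `calib_value()`
and its parameters are hypothetical, and the assumption that the
`"calibration"` property already holds the register-ready value is the
same one made in the evidence list above.

```c
#include <stdint.h>

#define RTC_CALIB_DEF 0x7FFF /* default quoted above: 32768 - 1 */

/*
 * clk_rate:     rate reported by the clock reference, 0 if none.
 * has_dt_calib: non-zero if a "calibration" property was found.
 * dt_calib:     value of that property (assumed register-ready).
 */
static uint32_t calib_value(unsigned long clk_rate, int has_dt_calib,
			    uint32_t dt_calib)
{
	if (!clk_rate)
		return has_dt_calib ? dt_calib : RTC_CALIB_DEF;

	/* A raw clock rate must be converted to "frequency - 1". */
	return (uint32_t)(clk_rate - 1);
}
```

Under these assumptions a 32768 Hz clock reference yields 0x7FFF, the
same value as `RTC_CALIB_DEF`.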

### Dependency analysis

The clock name fix `2a388ff22d2cb` ("rtc: zynqmp: Fix optional clock
name property") was already tagged `Cc: stable@kernel.org` and is
targeted at v6.14-rc1. Before that fix, the driver was looking for clock
name "rtc_clk" instead of "rtc" (matching the DT binding), so the clock-
based frequency path was effectively dead code. With `2a388ff22d2cb`
being backported to stable, the clock can now actually be found, making
this off-by-one bug reachable.

The underlying calibration infrastructure was introduced in
`07dcc6f9c762` (v6.0-rc1), so stable trees v6.1.y and later have the
affected code.

### Stable criteria evaluation

- **Fixes a real bug:** Yes - incorrect RTC calibration causes time
  drift
- **Obviously correct:** Yes - the register needs freq-1, this subtracts
  1
- **Small and contained:** Yes - 2 lines in one file
- **No new features:** Correct - purely fixes calibration logic
- **Tested:** Yes - has Tested-by and Reviewed-by from AMD engineer,
  Acked-by from Michal Simek

### Risk assessment

**Very low risk.** The change only affects the path where a clock
reference provides the frequency. It cannot break the default path
(`RTC_CALIB_DEF`) or the DT `"calibration"` property path. The worst
case if something were wrong would be an RTC running at the wrong rate -
exactly the same as the current bug.

### Verification

- Read the full driver source: confirmed `RTC_CALIB_DEF` = 0x7FFF =
  32767 (line 40)
- Verified `clk_get_rate()` returns raw frequency, not register value,
  per kernel API
- `git show 85cab027d4e31`: confirmed previous calibration fix changed
  default from 0x198233 to 0x7FFF (32768-1)
- `git show 07dcc6f9c762`: confirmed this is the commit that introduced
  clock-based calibration (v6.0-rc1)
- `git describe --contains 2a388ff22d2cb`: confirmed clock name fix is
  in v6.14-rc1, already tagged for stable
- `git describe --contains 07dcc6f9c762`: confirmed calibration support
  is in v6.0-rc1, present in all current stable trees
- The fix directly corresponds to the relationship: `RTC_CALIB_DEF`
  (default) = 0x7FFF = 32768 - 1, confirming the register semantics

This is a small, well-tested fix for incorrect RTC timekeeping. It's a
companion to the already-stable-tagged clock name fix. Without this fix,
any board using the ZynqMP RTC with a clock reference will have
incorrect time calibration.

**YES**

 drivers/rtc/rtc-zynqmp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/rtc/rtc-zynqmp.c b/drivers/rtc/rtc-zynqmp.c
index 3baa2b481d9f2..856bc1678e7d3 100644
--- a/drivers/rtc/rtc-zynqmp.c
+++ b/drivers/rtc/rtc-zynqmp.c
@@ -345,7 +345,10 @@ static int xlnx_rtc_probe(struct platform_device *pdev)
 					   &xrtcdev->freq);
 		if (ret)
 			xrtcdev->freq = RTC_CALIB_DEF;
+	} else {
+		xrtcdev->freq--;
 	}
+
 	ret = readl(xrtcdev->reg_base + RTC_CALIB_RD);
 	if (!ret)
 		writel(xrtcdev->freq, (xrtcdev->reg_base + RTC_CALIB_WR));
-- 
2.51.0

