* [PATCH v3 0/6] firmware: samsung: acpm: Various fixes for sashiko bug reports
@ 2026-04-29 13:11 Tudor Ambarus
2026-04-29 13:11 ` [PATCH v3 1/6] firmware: samsung: acpm: Fix cross-thread RX length corruption Tudor Ambarus
` (6 more replies)
0 siblings, 7 replies; 9+ messages in thread
From: Tudor Ambarus @ 2026-04-29 13:11 UTC (permalink / raw)
To: Krzysztof Kozlowski, Alim Akhtar
Cc: linux-kernel, linux-samsung-soc, linux-arm-kernel, peter.griffin,
andre.draszik, jyescas, kernel-team, Tudor Ambarus, stable
Fixes for bugs that were identified by sashiko when proposing the
GS101 ACPM TMU addition.
While the bugs are sane, we haven't hit them yet, maybe because we
don't have enough ACPM clients upstreamed. The fixes can go either
as fixes at -rc phase, or as regular patches for the next merge window.
If the later, we'll need a dedicated branch, as these patches toghether
with the other ACPM thermal preparatory patches will be needed by the
GS101 ACPM thermal driver. I'm thinking a dedicated branch and a tag
will do. I will respin the GS101 ACPM thermal driver series once this
fixes set gets in.
Thanks,
ta
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
---
Changes in v3:
- validate more SRAM parameters and queue pointers (sashiko)
- consider/fix the acquire path (Krzysztof) - patch was moved
last in the series to avoid touching the same code twice.
- Link to v2: https://lore.kernel.org/r/20260427-acpm-fixes-sashiko-reports-v2-0-1ff8de94a997@linaro.org
Changes in v2:
- drop patch "firmware: samsung: acpm: Fix sequence number leak and infinite loop"
The patch freed sequence numbers on mailbox failures or timeouts. Because
the message is already in SRAM and tx.front was advanced, a delayed
firmware wake-up will process that abandoned message, stealing the
sequence number from a new thread and causing silent data corruption.
- fix mailbox channel leak when `acpm_achan_alloc_cmds()` failed. Did it
by moving the `devm_add_action_or_reset()` call.
- new patches, last 3 in the set, they fix some more sashiko reports.
- Link to v1: https://lore.kernel.org/r/20260423-acpm-fixes-sashiko-reports-v1-0-2217b790925e@linaro.org
---
Tudor Ambarus (6):
firmware: samsung: acpm: Fix cross-thread RX length corruption
firmware: samsung: acpm: Fix mailbox channel leak on probe error
firmware: samsung: acpm: Fix dummy stubs to return ERR_PTR
firmware: samsung: acpm: Validate SRAM shared memory and queue pointers
firmware: samsung: acpm: Fix infinite loop on sequence number exhaustion
firmware: samsung: acpm: Fix memory ordering races in RX and polling paths
drivers/firmware/samsung/exynos-acpm-dvfs.c | 3 +
drivers/firmware/samsung/exynos-acpm.c | 113 ++++++++++++++++-----
.../linux/firmware/samsung/exynos-acpm-protocol.h | 3 +-
3 files changed, 90 insertions(+), 29 deletions(-)
---
base-commit: 254f49634ee16a731174d2ae34bc50bd5f45e731
change-id: 20260423-acpm-fixes-sashiko-reports-ae28b6ed5581
Best regards,
--
Tudor Ambarus <tudor.ambarus@linaro.org>
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v3 1/6] firmware: samsung: acpm: Fix cross-thread RX length corruption
2026-04-29 13:11 [PATCH v3 0/6] firmware: samsung: acpm: Various fixes for sashiko bug reports Tudor Ambarus
@ 2026-04-29 13:11 ` Tudor Ambarus
2026-04-29 13:11 ` [PATCH v3 2/6] firmware: samsung: acpm: Fix mailbox channel leak on probe error Tudor Ambarus
` (5 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Tudor Ambarus @ 2026-04-29 13:11 UTC (permalink / raw)
To: Krzysztof Kozlowski, Alim Akhtar
Cc: linux-kernel, linux-samsung-soc, linux-arm-kernel, peter.griffin,
andre.draszik, jyescas, kernel-team, Tudor Ambarus, stable
Sashiko identified a cross-thread RX length corruption bug when
reviewing the thermal addition to ACPM [1].
When multiple threads concurrently send IPC requests, the ACPM polling
mechanism can encounter responses belonging to other threads. To drain
the queue, the driver saves these concurrent responses into an internal
cache (`rx_data->cmd`) to be retrieved later by the owning thread.
Previously, the driver incorrectly used `xfer->rxcnt` (the expected
receive length of the *current* polling thread) when copying data for
*other* threads into this cache. If the threads expected responses of
different lengths, this resulted in buffer underflows (leading to reads
of uninitialized memory) or potential buffer overflows.
Fix this by replacing the boolean `response` flag in
`struct acpm_rx_data` with `rxcnt`, caching the exact expected receive
length for each specific transaction during transfer preparation. Use
this cached length when saving concurrent responses.
Consequently, ensure that `xfer->rxcnt` is explicitly zeroed in driver
helpers (e.g., `acpm_dvfs_set_xfer`) for fire-and-forget messages to
prevent uninitialized stack garbage from being interpreted as a massive
expected receive length.
Cc: stable@vger.kernel.org
Fixes: a88927b534ba ("firmware: add Exynos ACPM protocol driver")
Closes: https://sashiko.dev/#/patchset/20260420-acpm-tmu-v3-0-3dc8e93f0b26%40linaro.org [1]
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
---
drivers/firmware/samsung/exynos-acpm-dvfs.c | 3 +++
drivers/firmware/samsung/exynos-acpm.c | 15 ++++++++-------
2 files changed, 11 insertions(+), 7 deletions(-)
diff --git a/drivers/firmware/samsung/exynos-acpm-dvfs.c b/drivers/firmware/samsung/exynos-acpm-dvfs.c
index 06bdf62dea1f..fdea7aa24ca0 100644
--- a/drivers/firmware/samsung/exynos-acpm-dvfs.c
+++ b/drivers/firmware/samsung/exynos-acpm-dvfs.c
@@ -31,6 +31,9 @@ static void acpm_dvfs_set_xfer(struct acpm_xfer *xfer, u32 *cmd, size_t cmdlen,
if (response) {
xfer->rxcnt = cmdlen;
xfer->rxd = cmd;
+ } else {
+ xfer->rxcnt = 0;
+ xfer->rxd = NULL;
}
}
diff --git a/drivers/firmware/samsung/exynos-acpm.c b/drivers/firmware/samsung/exynos-acpm.c
index 16c46ed60837..e95edc350efa 100644
--- a/drivers/firmware/samsung/exynos-acpm.c
+++ b/drivers/firmware/samsung/exynos-acpm.c
@@ -104,12 +104,12 @@ struct acpm_queue {
*
* @cmd: pointer to where the data shall be saved.
* @n_cmd: number of 32-bit commands.
- * @response: true if the client expects the RX data.
+ * @rxcnt: expected length of the response in 32-bit words.
*/
struct acpm_rx_data {
u32 *cmd;
size_t n_cmd;
- bool response;
+ size_t rxcnt;
};
#define ACPM_SEQNUM_MAX 64
@@ -199,7 +199,7 @@ static void acpm_get_saved_rx(struct acpm_chan *achan,
const struct acpm_rx_data *rx_data = &achan->rx_data[tx_seqnum - 1];
u32 rx_seqnum;
- if (!rx_data->response)
+ if (!rx_data->rxcnt)
return;
rx_seqnum = FIELD_GET(ACPM_PROTOCOL_SEQNUM, rx_data->cmd[0]);
@@ -256,7 +256,7 @@ static int acpm_get_rx(struct acpm_chan *achan, const struct acpm_xfer *xfer)
seqnum = rx_seqnum - 1;
rx_data = &achan->rx_data[seqnum];
- if (rx_data->response) {
+ if (rx_data->rxcnt) {
if (rx_seqnum == tx_seqnum) {
__ioread32_copy(xfer->rxd, addr, xfer->rxcnt);
rx_set = true;
@@ -268,7 +268,8 @@ static int acpm_get_rx(struct acpm_chan *achan, const struct acpm_xfer *xfer)
* clear yet the bitmap. It will be cleared
* after the response is copied to the request.
*/
- __ioread32_copy(rx_data->cmd, addr, xfer->rxcnt);
+ __ioread32_copy(rx_data->cmd, addr,
+ rx_data->rxcnt);
}
} else {
clear_bit(seqnum, achan->bitmap_seqnum);
@@ -380,8 +381,8 @@ static void acpm_prepare_xfer(struct acpm_chan *achan,
/* Clear data for upcoming responses */
rx_data = &achan->rx_data[achan->seqnum - 1];
memset(rx_data->cmd, 0, sizeof(*rx_data->cmd) * rx_data->n_cmd);
- if (xfer->rxd)
- rx_data->response = true;
+ /* zero means no response expected */
+ rx_data->rxcnt = xfer->rxcnt;
/* Flag the index based on seqnum. (seqnum: 1~63, bitmap: 0~62) */
set_bit(achan->seqnum - 1, achan->bitmap_seqnum);
--
2.54.0.545.g6539524ca2-goog
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v3 2/6] firmware: samsung: acpm: Fix mailbox channel leak on probe error
2026-04-29 13:11 [PATCH v3 0/6] firmware: samsung: acpm: Various fixes for sashiko bug reports Tudor Ambarus
2026-04-29 13:11 ` [PATCH v3 1/6] firmware: samsung: acpm: Fix cross-thread RX length corruption Tudor Ambarus
@ 2026-04-29 13:11 ` Tudor Ambarus
2026-04-29 13:11 ` [PATCH v3 3/6] firmware: samsung: acpm: Fix dummy stubs to return ERR_PTR Tudor Ambarus
` (4 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Tudor Ambarus @ 2026-04-29 13:11 UTC (permalink / raw)
To: Krzysztof Kozlowski, Alim Akhtar
Cc: linux-kernel, linux-samsung-soc, linux-arm-kernel, peter.griffin,
andre.draszik, jyescas, kernel-team, Tudor Ambarus, stable
Sashiko identified the leak at [1].
The ACPM driver allocates hardware mailbox channels using
`mbox_request_channel()` during `acpm_channels_init()`. However, the
driver lacked a `.remove` callback and did not free these channels on
subsequent error paths inside `acpm_probe()`.
Additionally, if `acpm_achan_alloc_cmds()` failed during the channel
initialization loop, the function returned immediately, bypassing the
manual cleanup and permanently leaking any channels successfully
requested in previous loop iterations.
Fix this by modifying `acpm_free_mbox_chans()` to match the `devres`
action signature and registering it via `devm_add_action_or_reset()`.
Cc: stable@vger.kernel.org
Fixes: a88927b534ba ("firmware: add Exynos ACPM protocol driver")
Closes: https://sashiko.dev/#/patchset/20260420-acpm-tmu-v3-0-3dc8e93f0b26%40linaro.org [1]
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
---
drivers/firmware/samsung/exynos-acpm.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/firmware/samsung/exynos-acpm.c b/drivers/firmware/samsung/exynos-acpm.c
index e95edc350efa..bd0d48e9d157 100644
--- a/drivers/firmware/samsung/exynos-acpm.c
+++ b/drivers/firmware/samsung/exynos-acpm.c
@@ -529,8 +529,9 @@ static int acpm_achan_alloc_cmds(struct acpm_chan *achan)
* acpm_free_mbox_chans() - free mailbox channels.
* @acpm: pointer to driver data.
*/
-static void acpm_free_mbox_chans(struct acpm_info *acpm)
+static void acpm_free_mbox_chans(void *data)
{
+ struct acpm_info *acpm = data;
int i;
for (i = 0; i < acpm->num_chans; i++)
@@ -558,6 +559,10 @@ static int acpm_channels_init(struct acpm_info *acpm)
if (!acpm->chans)
return -ENOMEM;
+ ret = devm_add_action_or_reset(dev, acpm_free_mbox_chans, acpm);
+ if (ret)
+ return dev_err_probe(dev, ret, "Failed to add mbox free action.\n");
+
chans_shmem = acpm->sram_base + readl(&shmem->chans);
for (i = 0; i < acpm->num_chans; i++) {
@@ -579,10 +584,8 @@ static int acpm_channels_init(struct acpm_info *acpm)
cl->dev = dev;
achan->chan = mbox_request_channel(cl, 0);
- if (IS_ERR(achan->chan)) {
- acpm_free_mbox_chans(acpm);
+ if (IS_ERR(achan->chan))
return PTR_ERR(achan->chan);
- }
}
return 0;
--
2.54.0.545.g6539524ca2-goog
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v3 3/6] firmware: samsung: acpm: Fix dummy stubs to return ERR_PTR
2026-04-29 13:11 [PATCH v3 0/6] firmware: samsung: acpm: Various fixes for sashiko bug reports Tudor Ambarus
2026-04-29 13:11 ` [PATCH v3 1/6] firmware: samsung: acpm: Fix cross-thread RX length corruption Tudor Ambarus
2026-04-29 13:11 ` [PATCH v3 2/6] firmware: samsung: acpm: Fix mailbox channel leak on probe error Tudor Ambarus
@ 2026-04-29 13:11 ` Tudor Ambarus
2026-04-29 13:11 ` [PATCH v3 4/6] firmware: samsung: acpm: Validate SRAM shared memory and queue pointers Tudor Ambarus
` (3 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Tudor Ambarus @ 2026-04-29 13:11 UTC (permalink / raw)
To: Krzysztof Kozlowski, Alim Akhtar
Cc: linux-kernel, linux-samsung-soc, linux-arm-kernel, peter.griffin,
andre.draszik, jyescas, kernel-team, Tudor Ambarus, stable
Sashiko identified a potential NULL pointer dereference [1].
The dummy stub implementation for devm_acpm_get_by_node() returns NULL
when CONFIG_EXYNOS_ACPM_PROTOCOL is disabled.
However, the active implementation of this function returns an ERR_PTR
on failure, and the consumer driver checks the return value using
IS_ERR(). Because IS_ERR(NULL) evaluates to false, returning NULL from
the stub tricks consumer drivers into treating the NULL return as a
valid handle. Subsequent attempts to access handle->ops result in a
fatal NULL pointer dereference.
Fix this by returning ERR_PTR(-ENODEV) in the disabled configuration
to correctly propagate the disabled state and match the API contract.
Cc: stable@vger.kernel.org
Fixes: 6837c006d4e7 ("firmware: exynos-acpm: add empty method to allow compile test")
Closes: https://sashiko.dev/#/patchset/20260420-acpm-tmu-v3-0-3dc8e93f0b26%40linaro.org [1]
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
---
include/linux/firmware/samsung/exynos-acpm-protocol.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/linux/firmware/samsung/exynos-acpm-protocol.h b/include/linux/firmware/samsung/exynos-acpm-protocol.h
index 13f17dc4443b..d4db2796a6fb 100644
--- a/include/linux/firmware/samsung/exynos-acpm-protocol.h
+++ b/include/linux/firmware/samsung/exynos-acpm-protocol.h
@@ -8,6 +8,7 @@
#ifndef __EXYNOS_ACPM_PROTOCOL_H
#define __EXYNOS_ACPM_PROTOCOL_H
+#include <linux/err.h>
#include <linux/types.h>
struct acpm_handle;
@@ -57,7 +58,7 @@ struct acpm_handle *devm_acpm_get_by_node(struct device *dev,
static inline struct acpm_handle *devm_acpm_get_by_node(struct device *dev,
struct device_node *np)
{
- return NULL;
+ return ERR_PTR(-ENODEV);
}
#endif
--
2.54.0.545.g6539524ca2-goog
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v3 4/6] firmware: samsung: acpm: Validate SRAM shared memory and queue pointers
2026-04-29 13:11 [PATCH v3 0/6] firmware: samsung: acpm: Various fixes for sashiko bug reports Tudor Ambarus
` (2 preceding siblings ...)
2026-04-29 13:11 ` [PATCH v3 3/6] firmware: samsung: acpm: Fix dummy stubs to return ERR_PTR Tudor Ambarus
@ 2026-04-29 13:11 ` Tudor Ambarus
2026-05-01 12:23 ` Tudor Ambarus
2026-04-29 13:11 ` [PATCH v3 5/6] firmware: samsung: acpm: Fix infinite loop on sequence number exhaustion Tudor Ambarus
` (2 subsequent siblings)
6 siblings, 1 reply; 9+ messages in thread
From: Tudor Ambarus @ 2026-04-29 13:11 UTC (permalink / raw)
To: Krzysztof Kozlowski, Alim Akhtar
Cc: linux-kernel, linux-samsung-soc, linux-arm-kernel, peter.griffin,
andre.draszik, jyescas, kernel-team, Tudor Ambarus, stable
Sashiko identified multiple missing validation checks [1].
The ACPM driver reads queue pointers (rx_front, rx_rear, tx_front) and
configuration parameters (qlen) directly from shared SRAM without
verifying their validity. Relying blindly on firmware-provided values
leaves the kernel vulnerable to crashes or infinite loops if the
firmware misbehaves.
This patch fixes three specific vulnerabilities:
1. RX path infinite loop and OOB read: The rear pointer ('i') is used
to calculate the MMIO address before the modulo operation is applied.
If 'rx_front' or 'i' are >= achan->qlen, the driver performs an
out-of-bounds read. Furthermore, because 'i' is mathematically capped
by the modulo operator, if 'rx_front' is >= qlen, 'i' will never
equal 'rx_front', causing the CPU to spin forever and deadlock the
polling thread.
2. TX path out-of-bounds: 'tx_front' is used to calculate queue indices.
If it exceeds the queue length, it causes invalid state tracking and
out-of-bounds memory accesses during __iowrite32_copy().
3. Divide-by-zero panics: 'qlen' is read from SRAM during channel
initialization. If 'qlen' is 0, any subsequent modulo operations
(% achan->qlen) will trigger a divide-by-zero kernel panic.
Protect the kernel by strictly validating the initialization parameters
and MMIO queue offsets immediately after reading them.
Cc: stable@vger.kernel.org
Fixes: a88927b534ba ("firmware: add Exynos ACPM protocol driver")
Closes: https://sashiko.dev/#/patchset/20260420-acpm-tmu-v3-0-3dc8e93f0b26%40linaro.org [1]
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
---
drivers/firmware/samsung/exynos-acpm.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/drivers/firmware/samsung/exynos-acpm.c b/drivers/firmware/samsung/exynos-acpm.c
index bd0d48e9d157..e4d8d1120192 100644
--- a/drivers/firmware/samsung/exynos-acpm.c
+++ b/drivers/firmware/samsung/exynos-acpm.c
@@ -230,6 +230,13 @@ static int acpm_get_rx(struct acpm_chan *achan, const struct acpm_xfer *xfer)
rx_front = readl(achan->rx.front);
i = readl(achan->rx.rear);
+ if (rx_front >= achan->qlen || i >= achan->qlen) {
+ dev_err(achan->acpm->dev,
+ "Invalid RX queue pointers from firmware: front=%u rear=%u qlen=%u\n",
+ rx_front, i, achan->qlen);
+ return -EIO;
+ }
+
tx_seqnum = FIELD_GET(ACPM_PROTOCOL_SEQNUM, xfer->txd[0]);
if (i == rx_front) {
@@ -439,6 +446,14 @@ int acpm_do_xfer(struct acpm_handle *handle, const struct acpm_xfer *xfer)
scoped_guard(mutex, &achan->tx_lock) {
tx_front = readl(achan->tx.front);
+
+ if (tx_front >= achan->qlen) {
+ dev_err(achan->acpm->dev,
+ "Invalid TX front pointer from firmware: %u (qlen: %u)\n",
+ tx_front, achan->qlen);
+ return -EIO;
+ }
+
idx = (tx_front + 1) % achan->qlen;
ret = acpm_wait_for_queue_slots(achan, idx);
@@ -574,6 +589,12 @@ static int acpm_channels_init(struct acpm_info *acpm)
acpm_chan_shmem_get_params(achan, chan_shmem);
+ if (!achan->qlen) {
+ dev_err(dev, "Invalid shared memory parameters for channel %d: qlen=%u\n",
+ i, achan->qlen);
+ return -EIO;
+ }
+
ret = acpm_achan_alloc_cmds(achan);
if (ret)
return ret;
--
2.54.0.545.g6539524ca2-goog
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v3 5/6] firmware: samsung: acpm: Fix infinite loop on sequence number exhaustion
2026-04-29 13:11 [PATCH v3 0/6] firmware: samsung: acpm: Various fixes for sashiko bug reports Tudor Ambarus
` (3 preceding siblings ...)
2026-04-29 13:11 ` [PATCH v3 4/6] firmware: samsung: acpm: Validate SRAM shared memory and queue pointers Tudor Ambarus
@ 2026-04-29 13:11 ` Tudor Ambarus
2026-04-29 13:11 ` [PATCH v3 6/6] firmware: samsung: acpm: Fix memory ordering races in RX and polling paths Tudor Ambarus
2026-05-01 10:23 ` [PATCH v3 0/6] firmware: samsung: acpm: Various fixes for sashiko bug reports Tudor Ambarus
6 siblings, 0 replies; 9+ messages in thread
From: Tudor Ambarus @ 2026-04-29 13:11 UTC (permalink / raw)
To: Krzysztof Kozlowski, Alim Akhtar
Cc: linux-kernel, linux-samsung-soc, linux-arm-kernel, peter.griffin,
andre.draszik, jyescas, kernel-team, Tudor Ambarus, stable
Sashiko identified a possible infinite loop [1].
ACPM IPC sequence numbers are tracked via a 64-bit bitmap. Previously,
acpm_prepare_xfer() used a do...while loop to search for a free
sequence number.
If all 63 available sequence numbers are leaked due to transient
hardware timeouts or mailbox failures, the bitmap becomes full.
The next call to acpm_prepare_xfer() would enter an infinite loop.
Fix this by utilizing the kernel's optimized bitmap search functions
(find_next_zero_bit / find_first_zero_bit). If the pool is completely
exhausted, log the failure and return -EBUSY to allow the kernel to
fail gracefully instead of hanging.
Cc: stable@vger.kernel.org
Fixes: a88927b534ba ("firmware: add Exynos ACPM protocol driver")
Closes: https://sashiko.dev/#/patchset/20260420-acpm-tmu-v3-0-3dc8e93f0b26%40linaro.org [1]
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
---
drivers/firmware/samsung/exynos-acpm.c | 36 +++++++++++++++++++++++-----------
1 file changed, 25 insertions(+), 11 deletions(-)
diff --git a/drivers/firmware/samsung/exynos-acpm.c b/drivers/firmware/samsung/exynos-acpm.c
index e4d8d1120192..b8a4978b091b 100644
--- a/drivers/firmware/samsung/exynos-acpm.c
+++ b/drivers/firmware/samsung/exynos-acpm.c
@@ -12,6 +12,7 @@
#include <linux/container_of.h>
#include <linux/delay.h>
#include <linux/device.h>
+#include <linux/find.h>
#include <linux/firmware/samsung/exynos-acpm-protocol.h>
#include <linux/io.h>
#include <linux/iopoll.h>
@@ -370,29 +371,40 @@ static int acpm_wait_for_queue_slots(struct acpm_chan *achan, u32 next_tx_front)
* TX queue.
* @achan: ACPM channel info.
* @xfer: reference to the transfer being prepared.
+ *
+ * Return: 0 on success, -EBUSY if the sequence number pool is exhausted.
*/
-static void acpm_prepare_xfer(struct acpm_chan *achan,
- const struct acpm_xfer *xfer)
+static int acpm_prepare_xfer(struct acpm_chan *achan,
+ const struct acpm_xfer *xfer)
{
struct acpm_rx_data *rx_data;
u32 *txd = (u32 *)xfer->txd;
+ unsigned long size = ACPM_SEQNUM_MAX - 1;
+ unsigned long bit;
+
+ bit = find_next_zero_bit(achan->bitmap_seqnum, size, achan->seqnum);
+ if (bit >= size) {
+ bit = find_first_zero_bit(achan->bitmap_seqnum, size);
+ if (bit >= size) {
+ dev_err_ratelimited(achan->acpm->dev,
+ "ACPM sequence number pool exhausted\n");
+ return -EBUSY;
+ }
+ }
- /* Prevent chan->seqnum from being re-used */
- do {
- if (++achan->seqnum == ACPM_SEQNUM_MAX)
- achan->seqnum = 1;
- } while (test_bit(achan->seqnum - 1, achan->bitmap_seqnum));
+ /* Flag the index based on seqnum. (seqnum: 1~63, bitmap: 0~62) */
+ achan->seqnum = bit + 1;
+ set_bit(bit, achan->bitmap_seqnum);
txd[0] |= FIELD_PREP(ACPM_PROTOCOL_SEQNUM, achan->seqnum);
/* Clear data for upcoming responses */
- rx_data = &achan->rx_data[achan->seqnum - 1];
+ rx_data = &achan->rx_data[bit];
memset(rx_data->cmd, 0, sizeof(*rx_data->cmd) * rx_data->n_cmd);
/* zero means no response expected */
rx_data->rxcnt = xfer->rxcnt;
- /* Flag the index based on seqnum. (seqnum: 1~63, bitmap: 0~62) */
- set_bit(achan->seqnum - 1, achan->bitmap_seqnum);
+ return 0;
}
/**
@@ -460,7 +472,9 @@ int acpm_do_xfer(struct acpm_handle *handle, const struct acpm_xfer *xfer)
if (ret)
return ret;
- acpm_prepare_xfer(achan, xfer);
+ ret = acpm_prepare_xfer(achan, xfer);
+ if (ret)
+ return ret;
/* Write TX command. */
__iowrite32_copy(achan->tx.base + achan->mlen * tx_front,
--
2.54.0.545.g6539524ca2-goog
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v3 6/6] firmware: samsung: acpm: Fix memory ordering races in RX and polling paths
2026-04-29 13:11 [PATCH v3 0/6] firmware: samsung: acpm: Various fixes for sashiko bug reports Tudor Ambarus
` (4 preceding siblings ...)
2026-04-29 13:11 ` [PATCH v3 5/6] firmware: samsung: acpm: Fix infinite loop on sequence number exhaustion Tudor Ambarus
@ 2026-04-29 13:11 ` Tudor Ambarus
2026-05-01 10:23 ` [PATCH v3 0/6] firmware: samsung: acpm: Various fixes for sashiko bug reports Tudor Ambarus
6 siblings, 0 replies; 9+ messages in thread
From: Tudor Ambarus @ 2026-04-29 13:11 UTC (permalink / raw)
To: Krzysztof Kozlowski, Alim Akhtar
Cc: linux-kernel, linux-samsung-soc, linux-arm-kernel, peter.griffin,
andre.draszik, jyescas, kernel-team, Tudor Ambarus, stable
Sashiko identified a memory ordering race in the RX path [1].
Sequence numbers are allocated by the TX thread and freed by the RX
thread. Because the TX path is protected by 'tx_lock' and the RX path
is protected by 'rx_lock', the shared 'bitmap_seqnum' is modified
across two separate lock domains. Thus, the operations acting on the
bitmap are effectively lockless and require explicit memory barriers.
This patch addresses missing barriers in two paths:
1. The Release Path (RX thread):
When draining the RX queue, the driver reads the payload and uses a
relaxed clear_bit() to free the sequence number. On weakly ordered
architectures like ARM64, this allows the CPU to make the cleared bit
globally visible before the preceding memory reads (memcpy or
__ioread32_copy) have completed. If a concurrent TX thread allocates
the newly freed sequence number, it can execute memset() and corrupt
the buffer while the RX thread is still actively reading from it.
Fix this by replacing clear_bit() with clear_bit_unlock() to enforce
Release semantics.
2. The Acquire Path (Polling thread):
In polling mode, zero-length messages (rxcnt == 0) can have their bits
cleared by a concurrent thread that happens to be draining the queue.
The polling thread waits on test_bit(). Because test_bit() lacks an
acquire barrier, the CPU can speculatively execute the client driver's
subsequent instructions before the RX thread's memory updates are
globally visible. Fix this by pairing the release with
test_bit_acquire().
Note that the TX allocation path (acpm_prepare_xfer) is safe as-is
and does not require an explicit acquire barrier (like
test_and_set_bit_lock) for two reasons:
* Address Dependency: The CPU mathematically cannot calculate the
destination pointer for the memset() until the non-atomic
find_next_zero_bit() returns the index, naturally preventing
speculative execution of the buffer wipe.
* Lock Boundaries: The visibility of the atomic set_bit() to the next
TX thread is safely protected by the 'tx_lock' boundaries
(specifically the Release semantics of mutex_unlock).
Cc: stable@vger.kernel.org
Fixes: a88927b534ba ("firmware: add Exynos ACPM protocol driver")
Closes: https://sashiko.dev/#/patchset/20260423-acpm-fixes-sashiko-reports-v1-0-2217b790925e%40linaro.org [1]
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
---
drivers/firmware/samsung/exynos-acpm.c | 32 +++++++++++++++++++++++++-------
1 file changed, 25 insertions(+), 7 deletions(-)
diff --git a/drivers/firmware/samsung/exynos-acpm.c b/drivers/firmware/samsung/exynos-acpm.c
index b8a4978b091b..15627b439838 100644
--- a/drivers/firmware/samsung/exynos-acpm.c
+++ b/drivers/firmware/samsung/exynos-acpm.c
@@ -7,7 +7,7 @@
#include <linux/bitfield.h>
#include <linux/bitmap.h>
-#include <linux/bits.h>
+#include <linux/bitops.h>
#include <linux/cleanup.h>
#include <linux/container_of.h>
#include <linux/delay.h>
@@ -207,7 +207,7 @@ static void acpm_get_saved_rx(struct acpm_chan *achan,
if (rx_seqnum == tx_seqnum) {
memcpy(xfer->rxd, rx_data->cmd, xfer->rxcnt * sizeof(*xfer->rxd));
- clear_bit(rx_seqnum - 1, achan->bitmap_seqnum);
+ clear_bit_unlock(rx_seqnum - 1, achan->bitmap_seqnum);
}
}
@@ -268,7 +268,7 @@ static int acpm_get_rx(struct acpm_chan *achan, const struct acpm_xfer *xfer)
if (rx_seqnum == tx_seqnum) {
__ioread32_copy(xfer->rxd, addr, xfer->rxcnt);
rx_set = true;
- clear_bit(seqnum, achan->bitmap_seqnum);
+ clear_bit_unlock(seqnum, achan->bitmap_seqnum);
} else {
/*
* The RX data corresponds to another request.
@@ -280,7 +280,7 @@ static int acpm_get_rx(struct acpm_chan *achan, const struct acpm_xfer *xfer)
rx_data->rxcnt);
}
} else {
- clear_bit(seqnum, achan->bitmap_seqnum);
+ clear_bit_unlock(seqnum, achan->bitmap_seqnum);
}
i = (i + 1) % achan->qlen;
@@ -322,7 +322,14 @@ static int acpm_dequeue_by_polling(struct acpm_chan *achan,
if (ret)
return ret;
- if (!test_bit(seqnum - 1, achan->bitmap_seqnum))
+ /*
+ * For zero-length messages (rxcnt == 0), the bit can be
+ * cleared by a concurrent thread draining the queue. Use
+ * test_bit_acquire() to prevent the CPU from speculatively
+ * executing the caller's subsequent instructions before the
+ * hardware transaction is fully synchronized.
+ */
+ if (!test_bit_acquire(seqnum - 1, achan->bitmap_seqnum))
return 0;
/* Determined experimentally. */
@@ -392,13 +399,24 @@ static int acpm_prepare_xfer(struct acpm_chan *achan,
}
}
- /* Flag the index based on seqnum. (seqnum: 1~63, bitmap: 0~62) */
+ /*
+ * Flag the index based on seqnum. (seqnum: 1~63, bitmap: 0~62). We do
+ * not need an explicit acquire barrier here because visibility to the
+ * next TX thread is safely protected by the tx_lock boundaries
+ * (mutex_unlock provides Release semantics). The RX thread only
+ * blind-clears bits and doesn't care about this.
+ */
achan->seqnum = bit + 1;
set_bit(bit, achan->bitmap_seqnum);
txd[0] |= FIELD_PREP(ACPM_PROTOCOL_SEQNUM, achan->seqnum);
- /* Clear data for upcoming responses */
+ /*
+ * Clear data for upcoming responses. Speculative execution of memset()
+ * is prevented by the strict Address Dependency (implicit barrier) on
+ * 'bit'. The CPU mathematically cannot calculate the destination
+ * pointer until find_next/first_zero_bit() returns.
+ */
rx_data = &achan->rx_data[bit];
memset(rx_data->cmd, 0, sizeof(*rx_data->cmd) * rx_data->n_cmd);
/* zero means no response expected */
--
2.54.0.545.g6539524ca2-goog
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v3 0/6] firmware: samsung: acpm: Various fixes for sashiko bug reports
2026-04-29 13:11 [PATCH v3 0/6] firmware: samsung: acpm: Various fixes for sashiko bug reports Tudor Ambarus
` (5 preceding siblings ...)
2026-04-29 13:11 ` [PATCH v3 6/6] firmware: samsung: acpm: Fix memory ordering races in RX and polling paths Tudor Ambarus
@ 2026-05-01 10:23 ` Tudor Ambarus
6 siblings, 0 replies; 9+ messages in thread
From: Tudor Ambarus @ 2026-05-01 10:23 UTC (permalink / raw)
To: Krzysztof Kozlowski, Alim Akhtar
Cc: linux-kernel, linux-samsung-soc, linux-arm-kernel, peter.griffin,
andre.draszik, jyescas, kernel-team, stable
FYI, sashiko identified few improvements for the current set.
I'm preparing v4.
Thanks!
ta
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v3 4/6] firmware: samsung: acpm: Validate SRAM shared memory and queue pointers
2026-04-29 13:11 ` [PATCH v3 4/6] firmware: samsung: acpm: Validate SRAM shared memory and queue pointers Tudor Ambarus
@ 2026-05-01 12:23 ` Tudor Ambarus
0 siblings, 0 replies; 9+ messages in thread
From: Tudor Ambarus @ 2026-05-01 12:23 UTC (permalink / raw)
To: Krzysztof Kozlowski, Alim Akhtar
Cc: linux-kernel, linux-samsung-soc, linux-arm-kernel, peter.griffin,
andre.draszik, jyescas, kernel-team, stable
Hi everyone,
After further review, I've decided to drop this specific SRAM validation
patch from the upcoming v4 series. I had the typical band-aid vs. the
surgery dilemma.
While the intention was to protect against compromised firmware
providing out-of-bounds offsets, implementing this in the probe() path
turns out to be an incomplete band-aid. I identified two firmware quirks
that complicate static validation during initialization:
1) Some channels seem to use absolute physical DRAM addresses instead of
SRAM offsets. Mapping these against the SRAM ioremap triggers a fatal
page fault if we don't explicitly fence them off.
2) Some channels are purely doorbells (mlen == 0). acpm_do_xfer() and
acpm_get_rx() functions currently assume all channels have payloads and
sequence numbers, meaning these channels will break at runtime if we let
them pass probe.
Trying to silently disable or bypass these unsupported configurations
during probe() makes the code brittle and creates a false sense of
security regarding bounds checking.
I think the proper architectural fix is to introduce a formal
acpm_request_channel() API. This will allow the driver to validate
parameters at probe time, mark the unsupported channels and gracefully
return -EOPNOTSUPP to client drivers, rather than hacking workarounds
into the probe and runtime paths.
I'll leave the broader SRAM boundary validation for a future patch
series.
Thanks,
ta
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2026-05-01 12:23 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-29 13:11 [PATCH v3 0/6] firmware: samsung: acpm: Various fixes for sashiko bug reports Tudor Ambarus
2026-04-29 13:11 ` [PATCH v3 1/6] firmware: samsung: acpm: Fix cross-thread RX length corruption Tudor Ambarus
2026-04-29 13:11 ` [PATCH v3 2/6] firmware: samsung: acpm: Fix mailbox channel leak on probe error Tudor Ambarus
2026-04-29 13:11 ` [PATCH v3 3/6] firmware: samsung: acpm: Fix dummy stubs to return ERR_PTR Tudor Ambarus
2026-04-29 13:11 ` [PATCH v3 4/6] firmware: samsung: acpm: Validate SRAM shared memory and queue pointers Tudor Ambarus
2026-05-01 12:23 ` Tudor Ambarus
2026-04-29 13:11 ` [PATCH v3 5/6] firmware: samsung: acpm: Fix infinite loop on sequence number exhaustion Tudor Ambarus
2026-04-29 13:11 ` [PATCH v3 6/6] firmware: samsung: acpm: Fix memory ordering races in RX and polling paths Tudor Ambarus
2026-05-01 10:23 ` [PATCH v3 0/6] firmware: samsung: acpm: Various fixes for sashiko bug reports Tudor Ambarus
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox