From: sashiko-bot@kernel.org
To: "Sebastian Andrzej Siewior" <bigeasy@linutronix.de>
Cc: Frank.Li@kernel.org, imx@lists.linux.dev
Subject: Re: [PATCH v2 1/9] mailbox: imx: Add a channel shutdown field
Date: Wed, 03 Jun 2026 13:20:28 +0000 [thread overview]
Message-ID: <20260603132029.98A3D1F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260603-imx_mbox_rproc-v2-1-a0059dc3b69a@linutronix.de>
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
Pre-existing issues:
- [Critical] Timeout in imx_mu_seco_rxdb() skips clearing the hardware interrupt status, causing an infinite IRQ storm.
- [High] The patch addresses a race where an RX interrupt can be re-enabled during channel shutdown, but fails to fix the exact same race for TX channels. This can result in an unhandled interrupt and the kernel disabling the shared IRQ line.
- [High] Error paths in imx_mu_specific_rx() permanently wedge the RX channel by leaving the receive interrupt disabled.
- [High] The driver performs a multi-second busy-wait polling loop inside RT-priority threaded handlers, triggering softlockups.
--
commit e0f87aaf04945c03d65f3581c6859712272ba596
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
mailbox: imx: Add a channel shutdown field
Introduce imx_mu_con_priv::shutdown to signal that the channel is shutting
down. This prevents a race condition where an RX event could be disabled in
shutdown() and then incorrectly re-enabled by the ISR.
> diff --git a/drivers/mailbox/imx-mailbox.c b/drivers/mailbox/imx-mailbox.c
> index 246a9a9e39520..34edc2f31dcb5 100644
> --- a/drivers/mailbox/imx-mailbox.c
> +++ b/drivers/mailbox/imx-mailbox.c
[ ... ]
> @@ -376,7 +407,7 @@ static int imx_mu_specific_rx(struct imx_mu_priv *priv, struct imx_mu_con_priv *
> *data++ = imx_mu_read(priv, priv->dcfg->xRR + (i % num_rr) * 4);
> }
>
> - imx_mu_xcr_rmw(priv, IMX_MU_RCR, IMX_MU_xCR_RIEn(priv->dcfg->type, 0), 0);
> + imx_mu_xcr_set_act(priv, cp, IMX_MU_RCR, IMX_MU_xCR_RIEn(priv->dcfg->type, 0));
> mbox_chan_received_data(cp->chan, (void *)priv->msg);
>
> return 0;
[Severity: High]
This isn't a bug introduced by this patch, but can error paths in
imx_mu_specific_rx() permanently wedge the RX channel by leaving the receive
interrupt disabled?
In the updated threaded interrupt architecture, the receive interrupt
(RIEn) is explicitly disabled in the primary handler. The threaded handler
(imx_mu_isr_th) invokes the channel callback and conditionally unmasks the
interrupt only if the callback returns 0:
if (!priv->dcfg->rx(priv, cp))
imx_mu_xcr_set_act(priv, cp, IMX_MU_RCR,
IMX_MU_xCR_RIEn(priv->dcfg->type, cp->idx));
If imx_mu_specific_rx() encounters an error (such as exceeding the maximum
message size):
if (size > max_size / 4) {
dev_err(priv->dev, ...);
return -EINVAL;
}
It returns a non-zero error code. The threaded handler will evaluate the
condition as false and skip imx_mu_xcr_set_act(), leaving the receive
interrupt permanently disabled. The hardware will never generate another
receive interrupt for this channel, permanently deadlocking communication.
[Severity: High]
This isn't a bug introduced by this patch, but is it safe to perform a
multi-second busy-wait polling loop inside RT-priority threaded handlers, or
could this trigger softlockups?
The driver uses multi-second busy-wait polling in functions called from the
threaded interrupt handler. imx_mu_specific_rx() uses readl_poll_timeout()
with a 5-second timeout:
ret = readl_poll_timeout(..., 0, 5 * USEC_PER_SEC);
Similarly, imx_mu_rx_waiting_read() loops for 3 seconds without sleeping:
} while (!can_read && time_is_after_jiffies64(timeout_time));
Because threaded IRQ handlers run as Real-Time tasks (SCHED_FIFO,
priority 50), spinning for up to 5 seconds will entirely stall the CPU
core, starve normal priority tasks, delay RCU grace periods, and
trigger softlockup and RCU stall panics.
[ ... ]
> @@ -638,13 +670,13 @@ static void imx_mu_shutdown(struct mbox_chan *chan)
>
> switch (cp->type) {
> case IMX_MU_TYPE_TX:
> - imx_mu_xcr_rmw(priv, IMX_MU_TCR, 0, IMX_MU_xCR_TIEn(priv->dcfg->type, cp->idx));
> + imx_mu_xcr_clr_shut(priv, cp, IMX_MU_TCR, IMX_MU_xCR_TIEn(priv->dcfg->type, cp->idx));
> break;
[Severity: High]
This isn't a bug introduced by this patch, but would the same race condition
exist for TX channels, potentially resulting in unhandled interrupts?
When shutting down a TX channel, imx_mu_shutdown() calls
imx_mu_xcr_clr_shut() to clear the TX interrupt enable bit (TIEn) and set
cp->shutdown = true. It then calls free_irq(), which waits for any running
threaded ISR to finish.
If the threaded ISR (imx_mu_isr_th) is running concurrently, it acknowledges
the interrupt and calls mbox_chan_txdone(). This can push the next queued
message via send_data(), invoking imx_mu_generic_tx() or
imx_mu_specific_tx().
Both of these TX functions unconditionally re-enable the TX interrupt using
imx_mu_xcr_rmw(), completely ignoring the cp->shutdown flag:
imx_mu_xcr_rmw(priv, IMX_MU_TCR,
IMX_MU_xCR_TIEn(priv->dcfg->type, cp->idx), 0);
After the handler finishes, free_irq() unregisters the interrupt handler.
The hardware will eventually assert the TX interrupt because TIEn was
re-enabled. Since the IRQ is shared, other handlers will return IRQ_NONE,
leading to an interrupt storm and the kernel disabling the shared IRQ line.
Should the fix have used imx_mu_xcr_set_act() in the TX paths, just as it did
for RX?
> case IMX_MU_TYPE_RX:
> - imx_mu_xcr_rmw(priv, IMX_MU_RCR, 0, IMX_MU_xCR_RIEn(priv->dcfg->type, cp->idx));
> + imx_mu_xcr_clr_shut(priv, cp, IMX_MU_RCR, IMX_MU_xCR_RIEn(priv->dcfg->type, cp->idx));
> break;
> case IMX_MU_TYPE_RXDB:
> - imx_mu_xcr_rmw(priv, IMX_MU_GIER, 0, IMX_MU_xCR_GIEn(priv->dcfg->type, cp->idx));
> + imx_mu_xcr_clr_shut(priv, cp, IMX_MU_GIER, IMX_MU_xCR_GIEn(priv->dcfg->type, cp->idx));
[Severity: Critical]
This isn't a bug introduced by this patch, but does a timeout in
imx_mu_seco_rxdb() skip clearing the hardware interrupt status, causing an
infinite IRQ storm?
In imx_mu_seco_rxdb(), if imx_mu_rx_waiting_read() times out, the function
jumps to the error label:
if (err) {
dev_err(priv->dev, "Timeout rx %d\n", i);
goto error;
}
This bypasses the hardware write that clears the General Purpose Interrupt
Pending (GIPn) bit. The threaded handler (imx_mu_isr_th) ignores the error
return value from rxdb() and unconditionally unmasks the interrupt:
priv->dcfg->rxdb(priv, cp);
imx_mu_xcr_set_act(priv, cp, IMX_MU_GIER,
IMX_MU_xCR_GIEn(priv->dcfg->type, cp->idx));
Because the hardware interrupt is never acknowledged or disabled, the
interrupt controller will immediately re-trigger the ISR. This cycle repeats
infinitely: trigger, thread busy-waits 3 seconds, timeout, unmask, trigger
again, completely locking up the CPU.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260603-imx_mbox_rproc-v2-0-a0059dc3b69a@linutronix.de?part=1
next prev parent reply other threads:[~2026-06-03 13:20 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-03 13:05 [PATCH v2 0/9] mailbox: imx: Use threaded handler to avoid kworker in imx's remoteproc Sebastian Andrzej Siewior
2026-06-03 13:05 ` [PATCH v2 1/9] mailbox: imx: Add a channel shutdown field Sebastian Andrzej Siewior
2026-06-03 13:20 ` sashiko-bot [this message]
2026-06-03 14:59 ` Sebastian Andrzej Siewior
2026-06-03 13:05 ` [PATCH v2 2/9] mailbox: imx: Use devm_pm_runtime_enable() Sebastian Andrzej Siewior
2026-06-03 13:27 ` sashiko-bot
2026-06-03 15:00 ` Sebastian Andrzej Siewior
2026-06-03 13:05 ` [PATCH v2 3/9] mailbox: imx: use devm_of_platform_populate() Sebastian Andrzej Siewior
2026-06-03 13:37 ` sashiko-bot
2026-06-03 15:01 ` Sebastian Andrzej Siewior
2026-06-03 13:05 ` [PATCH v2 4/9] mailbox: imx: Use channel index instead of zero in imx_mu_specific_rx() Sebastian Andrzej Siewior
2026-06-03 13:05 ` [PATCH v2 5/9] mailbox: imx: Start splitting the IRQ handler in primary and threaded handler Sebastian Andrzej Siewior
2026-06-03 13:58 ` sashiko-bot
2026-06-03 15:07 ` Sebastian Andrzej Siewior
2026-06-03 13:05 ` [PATCH v2 6/9] mailbox: imx: Move the RX part of the mailbox into the " Sebastian Andrzej Siewior
2026-06-03 13:18 ` Daniel Baluta
2026-06-03 13:20 ` Sebastian Andrzej Siewior
2026-06-03 13:45 ` Daniel Baluta
2026-06-03 15:26 ` Sebastian Andrzej Siewior
2026-06-03 13:05 ` [PATCH v2 7/9] mailbox: imx: Move the RXDB " Sebastian Andrzej Siewior
2026-06-03 14:19 ` sashiko-bot
2026-06-03 13:05 ` [PATCH v2 8/9] mailbox: imx: Don't force-thread the primary handler Sebastian Andrzej Siewior
2026-06-03 14:31 ` sashiko-bot
2026-06-03 15:20 ` Sebastian Andrzej Siewior
2026-06-03 13:05 ` [PATCH v2 9/9] remoteproc: imx_rproc: Invoke the callback directly Sebastian Andrzej Siewior
2026-06-03 14:45 ` sashiko-bot
2026-06-03 15:21 ` Sebastian Andrzej Siewior
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260603132029.98A3D1F00893@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=Frank.Li@kernel.org \
--cc=bigeasy@linutronix.de \
--cc=imx@lists.linux.dev \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.