Linux Tegra architecture development
* [PATCH 6.18.y] mailbox: Fix NULL message support in mbox_send_message()
@ 2026-05-07  6:21 Joonwon Kang
  2026-05-09  2:08 ` Sasha Levin
  0 siblings, 1 reply; 3+ messages in thread
From: Joonwon Kang @ 2026-05-07  6:21 UTC (permalink / raw)
  To: stable, jassisinghbrar
  Cc: sudeep.holla, thierry.reding, jonathanh, linux-kernel, linux-acpi,
	linux-tegra, joonwonkang, Douglas Anderson

From: Jassi Brar <jassisinghbrar@gmail.com>

commit c58e9456e30c ("mailbox: Fix NULL message support in mbox_send_message()") upstream.

The active_req field serves double duty as both the "is a TX in
flight" flag (NULL means idle) and the storage for the in-flight
message pointer. When a client sends NULL via mbox_send_message(),
active_req is set to NULL, which the framework misinterprets as
"no active request". This breaks the TX state machine in two ways:

 - tx_tick() short-circuits on (!mssg), skipping the tx_done
   callback and the tx_complete completion.
 - txdone_hrtimer() skips the channel entirely since active_req
   is NULL, so poll-based TX-done detection never fires.

Fix this by introducing a MBOX_NO_MSG sentinel value that means
"no active request," freeing NULL to be valid message data. The
sentinel is defined in the subsystem-internal mailbox.h so that
controller drivers within drivers/mailbox/ can reference it, but
it is not exposed to clients outside the subsystem.
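
As a rough userspace illustration of the sentinel idea (not the kernel
code itself), the sketch below models a single channel. Only the
MBOX_NO_MSG definition is taken from the patch; struct model_chan and
the model_* helpers are hypothetical names invented for this example,
and the real mbox_send_message() queues messages in a ring buffer
rather than rejecting a send while the channel is busy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same definition as in the patch; everything below is a hypothetical model. */
#define MBOX_NO_MSG ((void *)-1)

struct model_chan {
	void *active_req;   /* in-flight message, or MBOX_NO_MSG when idle */
	int tx_done_calls;  /* counts simulated client notifications */
};

/* A channel starts idle, which is now MBOX_NO_MSG rather than NULL. */
static void model_chan_init(struct model_chan *chan)
{
	chan->active_req = MBOX_NO_MSG;
	chan->tx_done_calls = 0;
}

/*
 * Simplified send: fails if the channel is busy or if the caller passes
 * the reserved sentinel. NULL is accepted as ordinary message data.
 */
static bool model_send(struct model_chan *chan, void *mssg)
{
	if (mssg == MBOX_NO_MSG || chan->active_req != MBOX_NO_MSG)
		return false;
	chan->active_req = mssg;
	return true;
}

/* "Is a TX in flight?" no longer depends on the message being non-NULL. */
static bool model_tx_in_flight(const struct model_chan *chan)
{
	return chan->active_req != MBOX_NO_MSG;
}

/* Mirrors the patched tx_tick(): notify unless the channel was already idle. */
static void model_tx_tick(struct model_chan *chan)
{
	void *mssg = chan->active_req;

	chan->active_req = MBOX_NO_MSG;
	if (mssg == MBOX_NO_MSG)
		return;
	chan->tx_done_calls++;
}
```

In this model, a NULL send leaves the channel marked busy until
model_tx_tick() runs, and a second tick on an idle channel notifies
nobody.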

Fifteen in-tree callers send NULL (doorbell-style IPCs on Qualcomm,
Tegra, TI, Xilinx, i.MX, SCMI, and PCC platforms). All were
audited for regression:

 - Most already work around the bug via knows_txdone=true with a
   manual mbox_client_txdone() call, making the framework's
   tracking irrelevant. These are unaffected.

 - Poll-based callers (Xilinx zynqmp/r5) are strictly better off:
   the poll timer now correctly detects NULL-active channels
   instead of silently skipping them.

 - irq-qcom-mpm.c had a pre-existing bug: it was the only Qualcomm
   caller that omitted the knows_txdone + mbox_client_txdone()
   pattern. Fixed in a companion commit ("irqchip/qcom-mpm: Fix
   missing mailbox TX done acknowledgment").

 - No caller both sets a tx_done callback and sends NULL, and none
   combines tx_block=true with NULL sends, so the newly reachable
   callback/completion paths are never exercised.

Also update tegra-hsp's flush callback, which directly inspects
active_req to wait for the channel to drain: the old "!= NULL"
check becomes "!= MBOX_NO_MSG", otherwise flush spins until
timeout since the sentinel is non-NULL.
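
The flush problem can be reproduced in a small userspace model. This
is only a sketch: model_flush(), drained_old() and drained_new() are
hypothetical names, the poll budget stands in for the driver's
timeout, and -1 stands in for whatever error the real flush callback
returns on timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same definition as in the patch. */
#define MBOX_NO_MSG ((void *)-1)

/* Old drain check: the channel is considered empty when active_req is NULL. */
static bool drained_old(void *active_req)
{
	return active_req == NULL;
}

/* New drain check: the channel is empty when active_req holds the sentinel. */
static bool drained_new(void *active_req)
{
	return active_req == MBOX_NO_MSG;
}

/*
 * Bounded poll loop shaped like the flush callback: each iteration
 * simulates a TX-done tick, which in the patched core resets active_req
 * to MBOX_NO_MSG, then asks whether the channel has drained.
 * Returns 0 once drained, -1 when the poll budget runs out.
 */
static int model_flush(void **active_req, bool (*drained)(void *),
		       int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		*active_req = MBOX_NO_MSG;  /* simulated txdone tick */

		if (!drained(*active_req))
			continue;

		return 0;
	}

	return -1;
}
```

With drained_new the loop exits on the first poll; with drained_old it
uses up the whole budget, because the idle value is no longer NULL.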

The only tradeoff is that 'MBOX_NO_MSG' cannot be used as a message
by clients.

Reported-by: Joonwon Kang <joonwonkang@google.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
[ add the MBOX_NO_MSG check to drivers/mailbox/pcc.c. ]
Signed-off-by: Joonwon Kang <joonwonkang@google.com>
---
 drivers/mailbox/mailbox.c          | 15 ++++++++-------
 drivers/mailbox/pcc.c              |  2 +-
 drivers/mailbox/tegra-hsp.c        |  2 +-
 include/linux/mailbox_controller.h |  3 +++
 4 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
index 2acc6ec229a4..caa98e38ce04 100644
--- a/drivers/mailbox/mailbox.c
+++ b/drivers/mailbox/mailbox.c
@@ -52,7 +52,7 @@ static void msg_submit(struct mbox_chan *chan)
 	int err = -EBUSY;
 
 	scoped_guard(spinlock_irqsave, &chan->lock) {
-		if (!chan->msg_count || chan->active_req)
+		if (!chan->msg_count || chan->active_req != MBOX_NO_MSG)
 			break;
 
 		count = chan->msg_count;
@@ -87,13 +87,13 @@ static void tx_tick(struct mbox_chan *chan, int r)
 
 	scoped_guard(spinlock_irqsave, &chan->lock) {
 		mssg = chan->active_req;
-		chan->active_req = NULL;
+		chan->active_req = MBOX_NO_MSG;
 	}
 
 	/* Submit next message */
 	msg_submit(chan);
 
-	if (!mssg)
+	if (mssg == MBOX_NO_MSG)
 		return;
 
 	/* Notify the client */
@@ -114,7 +114,7 @@ static enum hrtimer_restart txdone_hrtimer(struct hrtimer *hrtimer)
 	for (i = 0; i < mbox->num_chans; i++) {
 		struct mbox_chan *chan = &mbox->chans[i];
 
-		if (chan->active_req && chan->cl) {
+		if (chan->active_req != MBOX_NO_MSG && chan->cl) {
 			txdone = chan->mbox->ops->last_tx_done(chan);
 			if (txdone)
 				tx_tick(chan, 0);
@@ -246,7 +246,7 @@ int mbox_send_message(struct mbox_chan *chan, void *mssg)
 {
 	int t;
 
-	if (!chan || !chan->cl)
+	if (!chan || !chan->cl || mssg == MBOX_NO_MSG)
 		return -EINVAL;
 
 	t = add_to_rbuf(chan, mssg);
@@ -319,7 +319,7 @@ static int __mbox_bind_client(struct mbox_chan *chan, struct mbox_client *cl)
 	scoped_guard(spinlock_irqsave, &chan->lock) {
 		chan->msg_free = 0;
 		chan->msg_count = 0;
-		chan->active_req = NULL;
+		chan->active_req = MBOX_NO_MSG;
 		chan->cl = cl;
 		init_completion(&chan->tx_complete);
 
@@ -477,7 +477,7 @@ void mbox_free_channel(struct mbox_chan *chan)
 	/* The queued TX requests are simply aborted, no callbacks are made */
 	scoped_guard(spinlock_irqsave, &chan->lock) {
 		chan->cl = NULL;
-		chan->active_req = NULL;
+		chan->active_req = MBOX_NO_MSG;
 		if (chan->txdone_method == TXDONE_BY_ACK)
 			chan->txdone_method = TXDONE_BY_POLL;
 	}
@@ -534,6 +534,7 @@ int mbox_controller_register(struct mbox_controller *mbox)
 
 		chan->cl = NULL;
 		chan->mbox = mbox;
+		chan->active_req = MBOX_NO_MSG;
 		chan->txdone_method = txdone;
 		spin_lock_init(&chan->lock);
 	}
diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
index ff292b9e0be9..7a2baeca2ba4 100644
--- a/drivers/mailbox/pcc.c
+++ b/drivers/mailbox/pcc.c
@@ -361,7 +361,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
 	if (pchan->chan.rx_alloc)
 		handle = write_response(pchan);
 
-	if (chan->active_req) {
+	if (chan->active_req != MBOX_NO_MSG) {
 		pcc_header = chan->active_req;
 		if (pcc_header->flags & PCC_CMD_COMPLETION_NOTIFY)
 			mbox_chan_txdone(chan, 0);
diff --git a/drivers/mailbox/tegra-hsp.c b/drivers/mailbox/tegra-hsp.c
index ed9a0bb2bcd8..7991e8dba579 100644
--- a/drivers/mailbox/tegra-hsp.c
+++ b/drivers/mailbox/tegra-hsp.c
@@ -497,7 +497,7 @@ static int tegra_hsp_mailbox_flush(struct mbox_chan *chan,
 			mbox_chan_txdone(chan, 0);
 
 			/* Wait until channel is empty */
-			if (chan->active_req != NULL)
+			if (chan->active_req != MBOX_NO_MSG)
 				continue;
 
 			return 0;
diff --git a/include/linux/mailbox_controller.h b/include/linux/mailbox_controller.h
index 80a427c7ca29..1db0069c27c5 100644
--- a/include/linux/mailbox_controller.h
+++ b/include/linux/mailbox_controller.h
@@ -11,6 +11,9 @@
 
 struct mbox_chan;
 
+/* Sentinel value distinguishing "no active request" from "NULL message data" */
+#define MBOX_NO_MSG	((void *)-1)
+
 /**
  * struct mbox_chan_ops - methods to control mailbox channels
  * @send_data:	The API asks the MBOX controller driver, in atomic
-- 
2.54.0.545.g6539524ca2-goog



* Re: [PATCH 6.18.y] mailbox: Fix NULL message support in mbox_send_message()
  2026-05-07  6:21 [PATCH 6.18.y] mailbox: Fix NULL message support in mbox_send_message() Joonwon Kang
@ 2026-05-09  2:08 ` Sasha Levin
  2026-05-10  5:50   ` Joonwon Kang
  0 siblings, 1 reply; 3+ messages in thread
From: Sasha Levin @ 2026-05-09  2:08 UTC (permalink / raw)
  To: stable, jassisinghbrar
  Cc: Sasha Levin, sudeep.holla, thierry.reding, jonathanh,
	linux-kernel, linux-acpi, linux-tegra, joonwonkang,
	Douglas Anderson

On Thu, May 07, 2026 at 06:21:07AM +0000, Joonwon Kang wrote:
> diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
> index ff292b9e0be9..7a2baeca2ba4 100644
> --- a/drivers/mailbox/pcc.c
> +++ b/drivers/mailbox/pcc.c
> @@ -361,7 +361,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
>  	if (pchan->chan.rx_alloc)
>  		handle = write_response(pchan);
>
> -	if (chan->active_req) {
> +	if (chan->active_req != MBOX_NO_MSG) {
>  		pcc_header = chan->active_req;
>  		if (pcc_header->flags & PCC_CMD_COMPLETION_NOTIFY)
>  			mbox_chan_txdone(chan, 0);

This pcc.c hunk does not apply on 6.18.y: commit 5378bdf6a611 ("mailbox/pcc:
support mailbox management of the shared buffer") was reverted upstream by
f82c3e62b6b8, and that revert is already queued in 6.18 as 2cafad617431.
write_response() and the active_req-driven txdone path no longer exist in
pcc_mbox_irq() on 6.18, so this hunk is both syntactically inapplicable and
semantically unnecessary.

Could you send a v2 omitting the pcc.c hunk? The other three hunks
(mailbox.c, tegra-hsp.c, mailbox_controller.h) apply cleanly and I'm
happy to queue those for 6.18.y.

--
Thanks,
Sasha


* Re: [PATCH 6.18.y] mailbox: Fix NULL message support in mbox_send_message()
  2026-05-09  2:08 ` Sasha Levin
@ 2026-05-10  5:50   ` Joonwon Kang
  0 siblings, 0 replies; 3+ messages in thread
From: Joonwon Kang @ 2026-05-10  5:50 UTC (permalink / raw)
  To: sashal
  Cc: dianders, jassisinghbrar, jonathanh, joonwonkang, linux-acpi,
	linux-kernel, linux-tegra, stable, sudeep.holla, thierry.reding

> On Thu, May 07, 2026 at 06:21:07AM +0000, Joonwon Kang wrote:
> > diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
> > index ff292b9e0be9..7a2baeca2ba4 100644
> > --- a/drivers/mailbox/pcc.c
> > +++ b/drivers/mailbox/pcc.c
> > @@ -361,7 +361,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
> >  	if (pchan->chan.rx_alloc)
> >  		handle = write_response(pchan);
> >
> > -	if (chan->active_req) {
> > +	if (chan->active_req != MBOX_NO_MSG) {
> >  		pcc_header = chan->active_req;
> >  		if (pcc_header->flags & PCC_CMD_COMPLETION_NOTIFY)
> >  			mbox_chan_txdone(chan, 0);
> 
> This pcc.c hunk does not apply on 6.18.y: commit 5378bdf6a611 ("mailbox/pcc:
> support mailbox management of the shared buffer") was reverted upstream by
> f82c3e62b6b8, and that revert is already queued in 6.18 as 2cafad617431.
> write_response() and the active_req-driven txdone path no longer exist in
> pcc_mbox_irq() on 6.18, so this hunk is both syntactically inapplicable and
> semantically unnecessary.
> 

Indeed. Thanks for letting me know of this. My local environment was quite
a bit behind the latest.

> Could you send a v2 omitting the pcc.c hunk? The other three hunks
> (mailbox.c, tegra-hsp.c, mailbox_controller.h) apply cleanly and I'm
> happy to queue those for 6.18.y.

Sure, I will send a new version.

Thanks,
Joonwon Kang
