public inbox for linux-arm-kernel@lists.infradead.org
From: Richard GENOUD <richard.genoud@bootlin.com>
To: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org>,
	Marco Crivellari <marco.crivellari@suse.com>,
	Kees Cook <kees@kernel.org>, Roy Pledge <roy.pledge@nxp.com>,
	Claudiu Manoil <claudiu.manoil@nxp.com>,
	Scott Wood <oss@buserror.net>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>,
	linuxppc-dev@lists.ozlabs.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] soc: fsl: qbman: fix race condition in qman_destroy_fq
Date: Wed, 7 Jan 2026 10:03:00 +0100
Message-ID: <9a8eeb82-c40a-421c-8d57-dbbe317c270d@bootlin.com>
In-Reply-To: <da2988cf-8f4b-4f85-a915-d83ebcb9923e@kernel.org>

Hi Christophe,

On 06/01/2026 at 19:29, Christophe Leroy (CS GROUP) wrote:
> 
> 
> On 23/12/2025 at 08:25, Richard Genoud wrote:
>>
>> When QMAN_FQ_FLAG_DYNAMIC_FQID is set, there's a race condition between
>> fq_table[fq->idx] state and freeing/allocating from the pool and
>> WARN_ON(fq_table[fq->idx]) in qman_create_fq() gets triggered.
>>
>> Indeed, we can have:
>>           Thread A                             Thread B
>>      qman_destroy_fq()                    qman_create_fq()
>>        qman_release_fqid()
>>          qman_shutdown_fq()
>>          gen_pool_free()
>>             -- At this point, the fqid is available again --
>>                                             qman_alloc_fqid()
>>             -- so, we can get the just-freed fqid in thread B --
>>                                             fq->fqid = fqid;
>>                                             fq->idx = fqid * 2;
>>                                             WARN_ON(fq_table[fq->idx]);
>>                                             fq_table[fq->idx] = fq;
>>       fq_table[fq->idx] = NULL;
>>
>> And adding some logs between qman_release_fqid() and
>> fq_table[fq->idx] = NULL makes the WARN_ON() trigger a lot more.
>>
>> To prevent that, ensure that fq_table[fq->idx] is set to NULL before
>> gen_pool_free() is called by using smp_wmb().
> 
> You dismantle/reimplement qman_release_fqid(). Is that the only possible
> approach?
> Isn't it possible to just clear fq_table[fq->idx] _before_ calling
> qman_release_fqid()?
> 
I'm afraid that clearing fq_table[fq->idx] before calling
qman_release_fqid() will create another race condition:
In qm_mr_process_task() we have:
			case QM_MR_VERB_FQRN:
			case QM_MR_VERB_FQRL:
				/* Lookup in the retirement table */
				fq = fqid_to_fq(qm_fqid_get(&msg->fq));
				if (WARN_ON(!fq))
					break;
				fq_state_change(p, fq, msg, verb);
				if (fq->cb.fqs)
					fq->cb.fqs(p, fq, msg);
				break;
https://elixir.bootlin.com/linux/v6.19-rc4/source/drivers/soc/fsl/qbman/qman.c#L1509-L1518
with fqid_to_fq(fqid) expanding to fq_table[fqid * 2];

Is it possible for those "cases" (QM_MR_VERB_FQRN/QM_MR_VERB_FQRL) to
happen after fq_table[fq->idx] is cleared and before qman_shutdown_fq()
is called?
Being unsure, I chose the safe side, though I'm not 100% sure this can
actually happen.
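
To make the two orderings concrete, here is a small userspace model of
them. It is only a sketch: every model_* name is hypothetical, the pool
is a plain array rather than a gen_pool, and atomic_thread_fence() is
just a rough userspace stand-in for smp_wmb(). It is single-threaded on
purpose, so that the interleaving from the commit message can be
replayed step by step.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model. Every model_* name is hypothetical, not from qman.c. */
#define MODEL_NUM_FQIDS 4

struct model_fq {
	unsigned int fqid;
	unsigned int idx;
};

static struct model_fq *model_fq_table[MODEL_NUM_FQIDS * 2];
static bool model_fqid_used[MODEL_NUM_FQIDS];   /* stands in for the gen_pool */
static bool model_hw_quiesced[MODEL_NUM_FQIDS]; /* set by the "shutdown" step */
static int model_warnings;                      /* counts would-be WARN_ON() hits */

/* Same index mapping as fqid_to_fq(): the slot is fq_table[fqid * 2]. */
static struct model_fq *model_fqid_to_fq(unsigned int fqid)
{
	return model_fq_table[fqid * 2];
}

/* Stand-in for qman_create_fq() with a dynamic FQID; returns the FQID or -1. */
static int model_create_fq(struct model_fq *fq)
{
	for (unsigned int fqid = 0; fqid < MODEL_NUM_FQIDS; fqid++) {
		if (model_fqid_used[fqid])
			continue;
		model_fqid_used[fqid] = true;
		model_hw_quiesced[fqid] = false;
		fq->fqid = fqid;
		fq->idx = fqid * 2;
		if (model_fq_table[fq->idx])
			model_warnings++;	/* where WARN_ON() would fire */
		model_fq_table[fq->idx] = fq;
		return (int)fqid;
	}
	return -1;
}

/* Old order, first half: shut down and hand the FQID back to the pool. */
static void model_release_fqid(struct model_fq *fq)
{
	model_hw_quiesced[fq->fqid] = true;
	model_fqid_used[fq->fqid] = false;
}

/* Clearing the table slot on its own, so a caller can interleave it. */
static void model_clear_slot(struct model_fq *fq)
{
	model_fq_table[fq->idx] = NULL;
}

/* Order used by the patch: shut down, clear the slot, barrier, free the FQID. */
static void model_destroy_fq(struct model_fq *fq)
{
	model_hw_quiesced[fq->fqid] = true;		/* qman_shutdown_fq() */
	assert(model_fq_table[fq->idx]);		/* DPAA_ASSERT() */
	model_clear_slot(fq);
	atomic_thread_fence(memory_order_release);	/* rough analogue of smp_wmb() */
	model_fqid_used[fq->fqid] = false;		/* gen_pool_free() */
}
```

Replaying the old order (model_release_fqid(), then a model_create_fq()
from the "other thread", then model_clear_slot()) bumps model_warnings
and ends up clearing the new owner's slot. With model_destroy_fq() the
FQID only becomes allocatable once its slot is already NULL. Calling
model_clear_slot() first, as suggested, leaves a window where
model_fqid_to_fq() returns NULL while the FQ is not yet shut down, which
is exactly the case I am unsure about.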


Thanks!

Regards,
Richard
>>
>> Fixes: c535e923bb97 ("soc/fsl: Introduce DPAA 1.x QMan device driver")
>> Signed-off-by: Richard Genoud <richard.genoud@bootlin.com>
>> ---
>>   drivers/soc/fsl/qbman/qman.c | 24 ++++++++++++++++++++++--
>>   1 file changed, 22 insertions(+), 2 deletions(-)
>>
>> NB: I'm not 100% sure a barrier is needed here, since even without
>> it, the WARN_ON() wasn't triggered any more.
>>
>> diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c
>> index 6b392b3ad4b1..39a3e7aab6ff 100644
>> --- a/drivers/soc/fsl/qbman/qman.c
>> +++ b/drivers/soc/fsl/qbman/qman.c
>> @@ -1827,6 +1827,8 @@ EXPORT_SYMBOL(qman_create_fq);
>>
>>   void qman_destroy_fq(struct qman_fq *fq)
>>   {
>> +       int leaked;
>> +
>>          /*
>>           * We don't need to lock the FQ as it is a pre-condition that the FQ be
>>           * quiesced. Instead, run some checks.
>> @@ -1834,11 +1836,29 @@ void qman_destroy_fq(struct qman_fq *fq)
>>          switch (fq->state) {
>>          case qman_fq_state_parked:
>>          case qman_fq_state_oos:
>> -               if (fq_isset(fq, QMAN_FQ_FLAG_DYNAMIC_FQID))
>> -                       qman_release_fqid(fq->fqid);
>> +               /*
>> +                * There's a race condition here on releasing the fqid,
>> +                * setting the fq_table to NULL, and freeing the fqid.
>> +                * To prevent it, this order should be respected:
>> +                */
>> +               if (fq_isset(fq, QMAN_FQ_FLAG_DYNAMIC_FQID)) {
>> +                       leaked = qman_shutdown_fq(fq->fqid);
>> +                       if (leaked)
>> +                               pr_debug("FQID %d leaked\n", fq->fqid);
>> +               }
>>
>>                  DPAA_ASSERT(fq_table[fq->idx]);
>>                  fq_table[fq->idx] = NULL;
>> +
>> +               if (fq_isset(fq, QMAN_FQ_FLAG_DYNAMIC_FQID) && !leaked) {
>> +                       /*
>> +                        * fq_table[fq->idx] should be set to null before
>> +                        * freeing fq->fqid otherwise it could be allocated by
>> +                        * qman_alloc_fqid() while still being !NULL
>> +                        */
>> +                       smp_wmb();
>> +                       gen_pool_free(qm_fqalloc, fq->fqid | DPAA_GENALLOC_OFF, 1);
>> +               }
>>                  return;
>>          default:
>>                  break;
>>
>> base-commit: 9448598b22c50c8a5bb77a9103e2d49f134c9578
>> -- 
>> 2.47.3
>>
> 


-- 
Richard Genoud, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


