public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
From: Leon Romanovsky <leon@kernel.org>
To: "Czurylo, Krzysztof" <krzysztof.czurylo@intel.com>
Cc: "jgg@nvidia.com" <jgg@nvidia.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"Nikolova, Tatyana E" <tatyana.e.nikolova@intel.com>
Subject: Re: [for-next 02/12] RDMA/irdma: Fix data race on cqp_request->request_done
Date: Tue, 17 Mar 2026 15:22:53 +0200	[thread overview]
Message-ID: <20260317132253.GX61385@unreal> (raw)
In-Reply-To: <DS0PR11MB7736961A569FE56FDB09631EEB41A@DS0PR11MB7736.namprd11.prod.outlook.com>

On Tue, Mar 17, 2026 at 12:14:21PM +0000, Czurylo, Krzysztof wrote:
> 
> 
> > -----Original Message-----
> > From: Leon Romanovsky <leon@kernel.org>
> > Sent: 17 March, 2026 12:13
> > To: Nikolova, Tatyana E <tatyana.e.nikolova@intel.com>
> > Cc: jgg@nvidia.com; linux-rdma@vger.kernel.org; Czurylo, Krzysztof
> > <krzysztof.czurylo@intel.com>
> > Subject: Re: [for-next 02/12] RDMA/irdma: Fix data race on cqp_request-
> > >request_done
> > 
> > On Mon, Mar 16, 2026 at 01:39:39PM -0500, Tatyana Nikolova wrote:
> > > From: Krzysztof Czurylo <krzysztof.czurylo@intel.com>
> > >
> > > Changes type of request_done flag from bool to atomic_t to fix
> > > data race in irdma_complete_cqp_request / irdma_wait_event
> > > reported by KCSAN:
> > 
> > Again, this fix is wrong too.
> 
> Could you please elaborate on what is wrong with this fix?
> And/or suggest how to fix it properly?
> 
> Please note 'request_done' is _not_ a bitfield and we only do simple
> load/store operations on it.  There is no RMW.
> Despite this, KCSAN still reports a data race on it.
> 
> Honestly, the original idea was just to change the type from
> 'bool' to 'u8'.  This is enough to silence KCSAN, but it is
> not clear why.  Perhaps it indicates a bug in KCSAN?

Yes, both u8 and atomic_t behave the same, they can't be interrupted
during read/write. This is why KCSAN doesn't warn you.

> I mean, maybe the report below is a false positive?

Sounds like that.

> 
> Thanks
> 
> > 
> > Thanks
> > 
> > >
> > > BUG: KCSAN: data-race in irdma_complete_cqp_request [irdma] /
> > irdma_wait_event [irdma]
> > >
> > > write (marked) to 0xffffa0bef390ad5c of 1 bytes by task 7761 on cpu 0:
> > >  irdma_complete_cqp_request+0x19/0x90 [irdma]
> > >  irdma_cqp_ce_handler+0x22d/0x290 [irdma]
> > >  cqp_compl_worker+0x1f/0x30 [irdma]
> > >  process_one_work+0x3dc/0x7c0
> > >  worker_thread+0xa6/0x6c0
> > >  kthread+0x17f/0x1c0
> > >  ret_from_fork+0x2c/0x50
> > >
> > > read to 0xffffa0bef390ad5c of 1 bytes by task 28566 on cpu 3:
> > >  irdma_wait_event+0x242/0x390 [irdma]
> > >  irdma_handle_cqp_op+0x93/0x210 [irdma]
> > >  irdma_hw_modify_qp+0xe3/0x2f0 [irdma]
> > >  irdma_modify_qp_roce+0x13df/0x1630 [irdma]
> > >  ib_security_modify_qp+0x34a/0x640 [ib_core]
> > >  _ib_modify_qp+0x16b/0x620 [ib_core]
> > >  ib_modify_qp_with_udata+0x3c/0x50 [ib_core]
> > >  modify_qp+0x6e1/0x920 [ib_uverbs]
> > >  ib_uverbs_ex_modify_qp+0xa6/0xf0 [ib_uverbs]
> > >  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x16c/0x200 [ib_uverbs]
> > >  ib_uverbs_run_method+0x342/0x380 [ib_uverbs]
> > >  ib_uverbs_cmd_verbs+0x365/0x440 [ib_uverbs]
> > >  ib_uverbs_ioctl+0x111/0x190 [ib_uverbs]
> > >  __x64_sys_ioctl+0xc3/0x100
> > >  do_syscall_64+0x3f/0x90
> > >  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> > >
> > > value changed: 0x00 -> 0x01
> > >
> > > Fixes: f0842bb3d388 ("RDMA/irdma: Fix data race on CQP request done")
> > > Signed-off-by: Krzysztof Czurylo <krzysztof.czurylo@intel.com>
> > > Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
> > > ---
> > >  drivers/infiniband/hw/irdma/hw.c    | 2 +-
> > >  drivers/infiniband/hw/irdma/main.h  | 2 +-
> > >  drivers/infiniband/hw/irdma/utils.c | 6 +++---
> > >  3 files changed, 5 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/infiniband/hw/irdma/hw.c
> > b/drivers/infiniband/hw/irdma/hw.c
> > > index 6e0466ce83d1..3ba4809bc1ef 100644
> > > --- a/drivers/infiniband/hw/irdma/hw.c
> > > +++ b/drivers/infiniband/hw/irdma/hw.c
> > > @@ -235,7 +235,7 @@ static void irdma_complete_cqp_request(struct
> > irdma_cqp *cqp,
> > >  				       struct irdma_cqp_request *cqp_request)
> > >  {
> > >  	if (cqp_request->waiting) {
> > > -		WRITE_ONCE(cqp_request->request_done, true);
> > > +		atomic_set(&cqp_request->request_done, true);
> > >  		wake_up(&cqp_request->waitq);
> > >  	} else if (cqp_request->callback_fcn) {
> > >  		cqp_request->callback_fcn(cqp_request);
> > > diff --git a/drivers/infiniband/hw/irdma/main.h
> > b/drivers/infiniband/hw/irdma/main.h
> > > index 3d49bd57bae7..e22160e2ba33 100644
> > > --- a/drivers/infiniband/hw/irdma/main.h
> > > +++ b/drivers/infiniband/hw/irdma/main.h
> > > @@ -167,7 +167,7 @@ struct irdma_cqp_request {
> > >  	void (*callback_fcn)(struct irdma_cqp_request *cqp_request);
> > >  	void *param;
> > >  	struct irdma_cqp_compl_info compl_info;
> > > -	bool request_done; /* READ/WRITE_ONCE macros operate on it */
> > > +	atomic_t request_done;
> > >  	bool waiting:1;
> > >  	bool dynamic:1;
> > >  	bool pending:1;
> > > diff --git a/drivers/infiniband/hw/irdma/utils.c
> > b/drivers/infiniband/hw/irdma/utils.c
> > > index ab8c5284d4be..f9c99c216a2c 100644
> > > --- a/drivers/infiniband/hw/irdma/utils.c
> > > +++ b/drivers/infiniband/hw/irdma/utils.c
> > > @@ -480,7 +480,7 @@ void irdma_free_cqp_request(struct irdma_cqp *cqp,
> > >  	if (cqp_request->dynamic) {
> > >  		kfree(cqp_request);
> > >  	} else {
> > > -		WRITE_ONCE(cqp_request->request_done, false);
> > > +		atomic_set(&cqp_request->request_done, false);
> > >  		cqp_request->callback_fcn = NULL;
> > >  		cqp_request->waiting = false;
> > >  		cqp_request->pending = false;
> > > @@ -515,7 +515,7 @@ irdma_free_pending_cqp_request(struct irdma_cqp
> > *cqp,
> > >  {
> > >  	if (cqp_request->waiting) {
> > >  		cqp_request->compl_info.error = true;
> > > -		WRITE_ONCE(cqp_request->request_done, true);
> > > +		atomic_set(&cqp_request->request_done, true);
> > >  		wake_up(&cqp_request->waitq);
> > >  	}
> > >  	wait_event_timeout(cqp->remove_wq,
> > > @@ -610,7 +610,7 @@ static int irdma_wait_event(struct irdma_pci_f *rf,
> > >  	do {
> > >  		irdma_cqp_ce_handler(rf, &rf->ccq.sc_cq);
> > >  		if (wait_event_timeout(cqp_request->waitq,
> > > -				       READ_ONCE(cqp_request->request_done),
> > > +				       atomic_read(&cqp_request->request_done),
> > >  				       msecs_to_jiffies(CQP_COMPL_WAIT_TIME_MS)))
> > >  			break;
> > >
> > > --
> > > 2.31.1
> > >
> 

  reply	other threads:[~2026-03-17 13:23 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-16 18:39 [for-next 00/12] RDMA/irdma: A few fixes for irdma Tatyana Nikolova
2026-03-16 18:39 ` [for-next 01/12] RDMA/irdma: Initialize free_qp completion before using it Tatyana Nikolova
2026-03-16 18:39 ` [for-next 02/12] RDMA/irdma: Fix data race on cqp_request->request_done Tatyana Nikolova
2026-03-17 11:12   ` Leon Romanovsky
2026-03-17 12:14     ` Czurylo, Krzysztof
2026-03-17 13:22       ` Leon Romanovsky [this message]
2026-03-17 19:27         ` Nikolova, Tatyana E
2026-03-18 10:19           ` Leon Romanovsky
2026-03-16 18:39 ` [for-next 03/12] RDMA/irdma: Change ah_valid type to atomic Tatyana Nikolova
2026-03-17 11:11   ` Leon Romanovsky
2026-03-16 18:39 ` [for-next 04/12] RDMA/irdma: Update ibqp state to error if QP is already in error state Tatyana Nikolova
2026-03-16 18:39 ` [for-next 05/12] RDMA/irdma: Remove a NOP wait_event() in irdma_modify_qp_roce() Tatyana Nikolova
2026-03-16 18:39 ` [for-next 06/12] RDMA/irdma: Clean up unnecessary dereference of event->cm_node Tatyana Nikolova
2026-03-16 18:39 ` [for-next 07/12] RDMA/irdma: Remove reset check from irdma_modify_qp_to_err() Tatyana Nikolova
2026-03-16 18:39 ` [for-next 08/12] RDMA/irdma: Fix deadlock during netdev reset with active connections Tatyana Nikolova
2026-03-16 18:39 ` [for-next 09/12] RDMA/irdma: Return EINVAL for invalid arp index error Tatyana Nikolova
2026-03-16 18:39 ` [for-next 10/12] RDMA/irdma: Harden depth calculation functions Tatyana Nikolova
2026-03-16 18:39 ` [for-next 11/12] RDMA/irdma: Provide scratch buffers to firmware for internal use Tatyana Nikolova
2026-03-16 18:39 ` [for-next 12/12] RDMA/irdma: Add support for GEN4 hardware Tatyana Nikolova
2026-03-18 10:24 ` (subset) [for-next 00/12] RDMA/irdma: A few fixes for irdma Leon Romanovsky
2026-03-18 10:25   ` Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260317132253.GX61385@unreal \
    --to=leon@kernel.org \
    --cc=jgg@nvidia.com \
    --cc=krzysztof.czurylo@intel.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=tatyana.e.nikolova@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox