DMA Engine development
 help / color / mirror / Atom feed
From: Mark Rutland <mark.rutland@arm.com>
To: Fenghua Yu <fenghua.yu@intel.com>
Cc: Vinod Koul <vkoul@kernel.org>, Dave Jiang <dave.jiang@intel.com>,
	dmaengine@vger.kernel.org,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Nikhil Rao <nikhil.rao@intel.com>, Tony Zhu <tony.zhu@intel.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Subject: Re: [PATCH] dmaengine: idxd: Change wmb() to smp_wmb() when copying completion record to user space
Date: Tue, 30 Jan 2024 18:14:11 +0000	[thread overview]
Message-ID: <Zbk8c6l5gZslj6wJ@FVFF77S0Q05N> (raw)
In-Reply-To: <Zbk4wGNcB-g91Vr0@FVFF77S0Q05N>

On Tue, Jan 30, 2024 at 05:58:24PM +0000, Mark Rutland wrote:
> This patch might be ok (it looks reasonable as an optimization), but I think
> the description of wmb() and smp_wmb() is incorrect. I also think that you're
> missing an rmb()/smp_rmb()eor equivalent on the reader side.

Sorry, the above should have said:
	
	an rmb()/smp_rmb() *or* equivalent

> 
> On Mon, Jan 29, 2024 at 06:58:06PM -0800, Fenghua Yu wrote:
> > wmb() is used to ensure status in the completion record is written
> > after the rest of the completion record, making it visible to the user.
> > However, on SMP systems, this may not guarantee visibility across
> > different CPUs.
> > 
> > Considering this scenario that event log handler is running on CPU1 while
> > user app is polling completion record (cr) status on CPU2:
> > 
> > 	CPU1				CPU2
> > event log handler			user app
> > 
> > 					1. cr = 0 (status = 0)
> > 2. copy X to user cr except "status"
> > 3. wmb()
> > 4. copy Y to user cr "status"
> > 					5. poll status value Y
> > 				 	6. read rest cr which is still 0.
> > 					   cr handling fails
> > 					7. cr value X visible now
> > 
> > Although wmb() ensure value Y is written and visible after X is written
> > on CPU1, the order is not guaranteed on CPU2. So user app may see status
> > value Y while cr value X is still not visible yet on CPU2. This will
> > cause reading 0 from the rest of cr and cr handling fails.
> 
> The wmb() on CPU1 ensures the order of the reads, but you need an rmb() on CPU2

Sorry again, the above should have said:

	The wmb() on CPU1 ensures the order of the *writes*

Apologies for any confusion resulting from those mistakes.

Mark.

> between reading the 'status' and 'rest' parts; otherwise CPU2 (or the
> compiler!) is permitted to hoist the read of 'rest' early, before reading from
> 'status', and hence you can end up with a sequence that is effectively:
> 
> 	CPU1				CPU2
>   event log handler			user app
> 					
>   					1. cr = 0 (status = 0)
>   				 	6a. read rest cr which is still 0.
>   2. copy X to user cr except "status"
>   3. wmb()
>   4. copy Y to user cr "status"
>   					5. poll status value Y
>   					6b. cr handling fails
>   					7. cr value X visible now
> 
> Since this is all to regular cacheable memory, it's *sufficient* to use
> smp_wmb() and smp_rmb(), but that's an optimization rather than an ordering
> fix.
> 
> Note that on x86_64, TSO means that the stores are in-order (and so smp_wmb()
> is just a compiler barrier), and IIUC loads are not reordered w.r.t. other
> loads (and so smp_rmb() is also just a compiler barrier).
> 
> > Changing wmb() to smp_wmb() ensures Y is written after X on both CPU1
> > and CPU2. This guarantees that user app can consume cr in right order.
> 
> This implies that smp_wmb() is *stronger* than wmb(), whereas smp_wmb() is
> actually *weaker* (e.g. on x86_64 wmb() is an sfence, whereas smp_wmb() is a
> barrier()).
> 
> Thanks,
> Mark.
> 
> > 
> > Fixes: b022f59725f0 ("dmaengine: idxd: add idxd_copy_cr() to copy user completion record during page fault handling")
> > Suggested-by: Nikhil Rao <nikhil.rao@intel.com>
> > Tested-by: Tony Zhu <tony.zhu@intel.com>
> > Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> > ---
> >  drivers/dma/idxd/cdev.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/dma/idxd/cdev.c b/drivers/dma/idxd/cdev.c
> > index 77f8885cf407..9b7388a23cbe 100644
> > --- a/drivers/dma/idxd/cdev.c
> > +++ b/drivers/dma/idxd/cdev.c
> > @@ -681,9 +681,10 @@ int idxd_copy_cr(struct idxd_wq *wq, ioasid_t pasid, unsigned long addr,
> >  		 * Ensure that the completion record's status field is written
> >  		 * after the rest of the completion record has been written.
> >  		 * This ensures that the user receives the correct completion
> > -		 * record information once polling for a non-zero status.
> > +		 * record information on any CPU once polling for a non-zero
> > +		 * status.
> >  		 */
> > -		wmb();
> > +		smp_wmb();
> >  		status = *(u8 *)cr;
> >  		if (put_user(status, (u8 __user *)addr))
> >  			left += status_size;
> > -- 
> > 2.37.1
> > 
> > 
> 

  reply	other threads:[~2024-01-30 18:14 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-30  2:58 [PATCH] dmaengine: idxd: Change wmb() to smp_wmb() when copying completion record to user space Fenghua Yu
2024-01-30 15:35 ` Dave Jiang
2024-01-30 16:01 ` Lijun Pan
2024-01-30 17:58 ` Mark Rutland
2024-01-30 18:14   ` Mark Rutland [this message]
2024-01-30 19:53   ` Boqun Feng
2024-01-30 20:30     ` Dave Hansen
2024-01-30 20:41       ` Fenghua Yu
2024-01-31  4:18         ` Rao, Nikhil

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zbk8c6l5gZslj6wJ@FVFF77S0Q05N \
    --to=mark.rutland@arm.com \
    --cc=dave.jiang@intel.com \
    --cc=dmaengine@vger.kernel.org \
    --cc=fenghua.yu@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=nikhil.rao@intel.com \
    --cc=tony.zhu@intel.com \
    --cc=vkoul@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox