linux-btrfs.vger.kernel.org archive mirror
* [PATCH] btrfs: Relax memory barrier in btrfs_tree_unlock
@ 2018-02-14 12:37 Nikolay Borisov
  2018-02-24  0:14 ` David Sterba
  2018-03-07 16:05 ` David Sterba
  0 siblings, 2 replies; 4+ messages in thread
From: Nikolay Borisov @ 2018-02-14 12:37 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Nikolay Borisov

When performing an unlock on an extent buffer we'd like to order the
decrement of extent_buffer::blocking_writers with waking up any
waiters. In such situations it's sufficient to use smp_mb__after_atomic
rather than the heavy smp_mb. On architectures where atomic operations
are fully ordered (such as x86 or s390) unconditionally executing
a heavyweight smp_mb instruction causes a severe hit to performance
while bringing no improvement in terms of correctness.

The better thing is to use the appropriate smp_mb__after_atomic routine
which will do the correct thing (invoke a full smp_mb or in the case
of ordered atomics insert a compiler barrier). Put another way,
an RMW atomic op + smp_mb__after_atomic is equivalent, in terms of
semantics, to a full smp_mb. This ensures that none of the problems
described in the accompanying comment of waitqueue_active occur.
No functional changes.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---
 fs/btrfs/locking.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
index d13128c70ddd..621083f8932c 100644
--- a/fs/btrfs/locking.c
+++ b/fs/btrfs/locking.c
@@ -290,7 +290,7 @@ void btrfs_tree_unlock(struct extent_buffer *eb)
 		/*
 		 * Make sure counter is updated before we wake up waiters.
 		 */
-		smp_mb();
+		smp_mb__after_atomic();
 		if (waitqueue_active(&eb->write_lock_wq))
 			wake_up(&eb->write_lock_wq);
 	} else {
-- 
2.7.4
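
The ordering argument above can be illustrated with a minimal userspace
sketch. This is not the btrfs code. It assumes C11 atomics and pthreads as
stand-ins: a relaxed atomic_fetch_sub followed by a seq_cst fence plays the
role of atomic_dec() + smp_mb__after_atomic(), and a hypothetical
nr_waiters counter plays the role of waitqueue_active(). All names here
(the global blocking_writers, nr_waiters, waiter, unlock_side,
run_handoff) are illustrative assumptions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins, not the btrfs structures. */
atomic_int blocking_writers;
atomic_int nr_waiters;          /* analog of waitqueue_active() */
pthread_mutex_t wq_lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t wq = PTHREAD_COND_INITIALIZER;

/* Waiter: announce itself, full fence, then check the condition. */
void *waiter(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&wq_lock);
	atomic_fetch_add_explicit(&nr_waiters, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	while (atomic_load_explicit(&blocking_writers,
				    memory_order_relaxed) != 0)
		pthread_cond_wait(&wq, &wq_lock);
	atomic_fetch_sub_explicit(&nr_waiters, 1, memory_order_relaxed);
	pthread_mutex_unlock(&wq_lock);
	return NULL;
}

/* Unlock side: RMW decrement, full fence, then the lockless waiter check. */
void unlock_side(void)
{
	atomic_fetch_sub_explicit(&blocking_writers, 1, memory_order_relaxed);
	/* Userspace analog (assumption) of smp_mb__after_atomic(). */
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(&nr_waiters, memory_order_relaxed) > 0) {
		/* Taking the lock ensures the waiter is already sleeping. */
		pthread_mutex_lock(&wq_lock);
		pthread_cond_broadcast(&wq);
		pthread_mutex_unlock(&wq_lock);
	}
}

/* Runs one waiter against one unlock; returns 0 if both counters drain. */
int run_handoff(void)
{
	pthread_t t;
	int rc;

	atomic_store(&blocking_writers, 1);
	atomic_store(&nr_waiters, 0);
	if (pthread_create(&t, NULL, waiter, NULL) != 0)
		return -1;
	unlock_side();
	rc = pthread_join(t, NULL);
	assert(rc == 0);
	return atomic_load(&blocking_writers) + atomic_load(&nr_waiters);
}
```

Calling run_handoff() from a main() built with -pthread exercises one
handoff. The fence on the unlock side is paired with a fence on the waiter
side, so either the unlocker observes the waiter and wakes it, or the
waiter observes the counter already at zero and never sleeps. Dropping
either fence reopens the lost-wakeup window that the waitqueue_active
comment warns about.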



* Re: [PATCH] btrfs: Relax memory barrier in btrfs_tree_unlock
  2018-02-14 12:37 [PATCH] btrfs: Relax memory barrier in btrfs_tree_unlock Nikolay Borisov
@ 2018-02-24  0:14 ` David Sterba
  2018-02-24 10:59   ` Nikolay Borisov
  2018-03-07 16:05 ` David Sterba
  1 sibling, 1 reply; 4+ messages in thread
From: David Sterba @ 2018-02-24  0:14 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: linux-btrfs

On Wed, Feb 14, 2018 at 02:37:26PM +0200, Nikolay Borisov wrote:
> When performing an unlock on an extent buffer we'd like to order the
> decrement of extent_buffer::blocking_writers with waking up any
> waiters. In such situations it's sufficient to use smp_mb__after_atomic
> rather than the heavy smp_mb. On architectures where atomic operations
> are fully ordered (such as x86 or s390) unconditionally executing
> a heavyweight smp_mb instruction causes a severe hit to performance
> while bringing no improvement in terms of correctness.

Have you measured this severe performance hit? There is an impact, but I
doubt you'll ever notice it in the profiles given where the
btrfs_tree_unlock appears.

> The better thing is to use the appropriate smp_mb__after_atomic routine
> which will do the correct thing (invoke a full smp_mb or in the case
> of ordered atomics insert a compiler barrier). Put another way,
> an RMW atomic op + smp_mb__after_atomic is equivalent, in terms of
> semantics, to a full smp_mb. This ensures that none of the problems
> described in the accompanying comment of waitqueue_active occur.
> No functional changes.

I tend to agree.


* Re: [PATCH] btrfs: Relax memory barrier in btrfs_tree_unlock
  2018-02-24  0:14 ` David Sterba
@ 2018-02-24 10:59   ` Nikolay Borisov
  0 siblings, 0 replies; 4+ messages in thread
From: Nikolay Borisov @ 2018-02-24 10:59 UTC (permalink / raw)
  To: dsterba, linux-btrfs



On 24.02.2018 02:14, David Sterba wrote:
> On Wed, Feb 14, 2018 at 02:37:26PM +0200, Nikolay Borisov wrote:
>> When performing an unlock on an extent buffer we'd like to order the
>> decrement of extent_buffer::blocking_writers with waking up any
>> waiters. In such situations it's sufficient to use smp_mb__after_atomic
>> rather than the heavy smp_mb. On architectures where atomic operations
>> are fully ordered (such as x86 or s390) unconditionally executing
>> a heavyweight smp_mb instruction causes a severe hit to performance
>> while bringing no improvement in terms of correctness.
> 
> Have you measured this severe performance hit? There is an impact, but I
> doubt you'll ever notice it in the profiles given where the
> btrfs_tree_unlock appears.

Admittedly I haven't :) But I'd say "every little bit helps"

> 
>> The better thing is to use the appropriate smp_mb__after_atomic routine
>> which will do the correct thing (invoke a full smp_mb or in the case
>> of ordered atomics insert a compiler barrier). Put another way,
>> an RMW atomic op + smp_mb__after_atomic is equivalent, in terms of
>> semantics, to a full smp_mb. This ensures that none of the problems
>> described in the accompanying comment of waitqueue_active occur.
>> No functional changes.
> 
> I tend to agree.
> 


* Re: [PATCH] btrfs: Relax memory barrier in btrfs_tree_unlock
  2018-02-14 12:37 [PATCH] btrfs: Relax memory barrier in btrfs_tree_unlock Nikolay Borisov
  2018-02-24  0:14 ` David Sterba
@ 2018-03-07 16:05 ` David Sterba
  1 sibling, 0 replies; 4+ messages in thread
From: David Sterba @ 2018-03-07 16:05 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: linux-btrfs

On Wed, Feb 14, 2018 at 02:37:26PM +0200, Nikolay Borisov wrote:
> When performing an unlock on an extent buffer we'd like to order the
> decrement of extent_buffer::blocking_writers with waking up any
> waiters. In such situations it's sufficient to use smp_mb__after_atomic
> rather than the heavy smp_mb. On architectures where atomic operations
> are fully ordered (such as x86 or s390) unconditionally executing
> a heavyweight smp_mb instruction causes a severe hit to performance
> while bringing no improvement in terms of correctness.
> 
> The better thing is to use the appropriate smp_mb__after_atomic routine
> which will do the correct thing (invoke a full smp_mb or in the case
> of ordered atomics insert a compiler barrier). Put another way,
> an RMW atomic op + smp_mb__after_atomic is equivalent, in terms of
> semantics, to a full smp_mb. This ensures that none of the problems
> described in the accompanying comment of waitqueue_active occur.
> No functional changes.
> 
> Signed-off-by: Nikolay Borisov <nborisov@suse.com>

Reviewed-by: David Sterba <dsterba@suse.com>
