* [PATCH] btrfs: Relax memory barrier in btrfs_tree_unlock
@ 2018-02-14 12:37 Nikolay Borisov
2018-02-24 0:14 ` David Sterba
2018-03-07 16:05 ` David Sterba
0 siblings, 2 replies; 4+ messages in thread
From: Nikolay Borisov @ 2018-02-14 12:37 UTC (permalink / raw)
To: linux-btrfs; +Cc: Nikolay Borisov
When performing an unlock on an extent buffer we'd like to order the
decrement of extent_buffer::blocking_writers with waking up any
waiters. In such situations it's sufficient to use smp_mb__after_atomic
rather than the heavy smp_mb. On architectures where atomic operations
are fully ordered (such as x86 or s390) unconditionally executing
a heavyweight smp_mb instruction causes a severe hit to performance
while bringing no improvements in terms of correctness.
The better thing is to use the appropriate smp_mb__after_atomic routine
which will do the correct thing (invoke a full smp_mb or in the case
of ordered atomics insert a compiler barrier). Put another way,
an RMW atomic op + smp_mb__after_atomic is equivalent, in terms of
semantics, to a full smp_mb. This ensures that none of the problems
described in the accompanying comment of waitqueue_active occur.
No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---
fs/btrfs/locking.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
index d13128c70ddd..621083f8932c 100644
--- a/fs/btrfs/locking.c
+++ b/fs/btrfs/locking.c
@@ -290,7 +290,7 @@ void btrfs_tree_unlock(struct extent_buffer *eb)
 		/*
 		 * Make sure counter is updated before we wake up waiters.
 		 */
-		smp_mb();
+		smp_mb__after_atomic();
 		if (waitqueue_active(&eb->write_lock_wq))
 			wake_up(&eb->write_lock_wq);
 	} else {
--
2.7.4
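
The ordering the patch relies on can be sketched in userspace C11. This is only an illustration, not the btrfs code: struct demo_lock, demo_unlock() and demo_prepare_wait() are hypothetical names, has_waiter stands in for waitqueue_active(), and the seq_cst fence plays the role that smp_mb__after_atomic() (waker side) and the barrier in the waiter's sleep path play in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the extent buffer lock state. */
struct demo_lock {
	atomic_int blocking_writers;	/* mirrors extent_buffer::blocking_writers */
	atomic_bool has_waiter;		/* stand-in for waitqueue_active() */
};

/*
 * Waker side: decrement the counter, issue a full barrier, then check
 * for waiters. Returns true when a wake-up would be issued.
 */
static bool demo_unlock(struct demo_lock *l)
{
	int old = atomic_fetch_sub_explicit(&l->blocking_writers, 1,
					    memory_order_relaxed);
	assert(old > 0);

	/* Orders the RMW above before the waiter check below. */
	atomic_thread_fence(memory_order_seq_cst);

	return atomic_load_explicit(&l->has_waiter, memory_order_relaxed);
}

/*
 * Waiter side: announce the waiter, issue a full barrier, then re-check
 * the counter. Returns true when the caller would have to sleep.
 */
static bool demo_prepare_wait(struct demo_lock *l)
{
	atomic_store_explicit(&l->has_waiter, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);

	return atomic_load_explicit(&l->blocking_writers,
				    memory_order_relaxed) != 0;
}
```

With a full barrier on both sides, at least one party observes the other's write: either the waker sees the waiter and wakes it, or the waiter sees the decremented counter and does not sleep. Dropping either barrier reopens the missed wake-up window that the waitqueue_active comment warns about.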
* Re: [PATCH] btrfs: Relax memory barrier in btrfs_tree_unlock
2018-02-14 12:37 [PATCH] btrfs: Relax memory barrier in btrfs_tree_unlock Nikolay Borisov
@ 2018-02-24 0:14 ` David Sterba
2018-02-24 10:59 ` Nikolay Borisov
2018-03-07 16:05 ` David Sterba
1 sibling, 1 reply; 4+ messages in thread
From: David Sterba @ 2018-02-24 0:14 UTC (permalink / raw)
To: Nikolay Borisov; +Cc: linux-btrfs
On Wed, Feb 14, 2018 at 02:37:26PM +0200, Nikolay Borisov wrote:
> When performing an unlock on an extent buffer we'd like to order the
> decrement of extent_buffer::blocking_writers with waking up any
> waiters. In such situations it's sufficient to use smp_mb__after_atomic
> rather than the heavy smp_mb. On architectures where atomic operations
> are fully ordered (such as x86 or s390) unconditionally executing
> a heavyweight smp_mb instruction causes a severe hit to performance
> while bringing no improvements in terms of correctness.
Have you measured this severe performance hit? There is an impact, but I
doubt you'll ever notice it in the profiles given where the
btrfs_tree_unlock appears.
> The better thing is to use the appropriate smp_mb__after_atomic routine
> which will do the correct thing (invoke a full smp_mb or in the case
> of ordered atomics insert a compiler barrier). Put another way,
> an RMW atomic op + smp_mb__after_atomic is equivalent, in terms of
> semantics, to a full smp_mb. This ensures that none of the problems
> described in the accompanying comment of waitqueue_active occur.
> No functional changes.
I tend to agree.
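
For reference, the "full smp_mb or compiler barrier" choice in the quoted text can be sketched as a per-architecture macro. This is a rough userspace approximation that assumes GCC or Clang builtins; every demo_* name is hypothetical and the kernel's real definitions live in its per-arch barrier headers.

```c
#include <assert.h>

/* Compiler-only barrier: emits no instruction, only blocks compiler reordering. */
#define demo_barrier()	__asm__ __volatile__("" ::: "memory")

/* Full memory barrier, standing in for smp_mb(). */
#define demo_smp_mb()	__atomic_thread_fence(__ATOMIC_SEQ_CST)

#if defined(__x86_64__) || defined(__i386__)
/* Atomic RMW ops are LOCK-prefixed here and already fully ordered. */
#define demo_smp_mb__after_atomic()	demo_barrier()
#else
/* Otherwise fall back to the full barrier. */
#define demo_smp_mb__after_atomic()	demo_smp_mb()
#endif

/* Decrement *counter atomically, then order it against later accesses. */
static int demo_dec_then_order(int *counter)
{
	assert(counter != 0);

	int now = __atomic_sub_fetch(counter, 1, __ATOMIC_RELAXED);

	demo_smp_mb__after_atomic();
	return now;
}
```

On the fully ordered branch the macro costs nothing at run time, which is where any saving over an unconditional smp_mb would come from; how visible that saving is in practice is the open question raised above.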
* Re: [PATCH] btrfs: Relax memory barrier in btrfs_tree_unlock
2018-02-24 0:14 ` David Sterba
@ 2018-02-24 10:59 ` Nikolay Borisov
0 siblings, 0 replies; 4+ messages in thread
From: Nikolay Borisov @ 2018-02-24 10:59 UTC (permalink / raw)
To: dsterba, linux-btrfs
On 24.02.2018 02:14, David Sterba wrote:
> On Wed, Feb 14, 2018 at 02:37:26PM +0200, Nikolay Borisov wrote:
>> When performing an unlock on an extent buffer we'd like to order the
>> decrement of extent_buffer::blocking_writers with waking up any
>> waiters. In such situations it's sufficient to use smp_mb__after_atomic
>> rather than the heavy smp_mb. On architectures where atomic operations
>> are fully ordered (such as x86 or s390) unconditionally executing
>> a heavyweight smp_mb instruction causes a severe hit to performance
>> while bringing no improvements in terms of correctness.
>
> Have you measured this severe performance hit? There is an impact, but I
> doubt you'll ever notice it in the profiles given where the
> btrfs_tree_unlock appears.
Admittedly I haven't :) But I'd say "every little bit helps"
>
>> The better thing is to use the appropriate smp_mb__after_atomic routine
>> which will do the correct thing (invoke a full smp_mb or in the case
>> of ordered atomics insert a compiler barrier). Put another way,
>> an RMW atomic op + smp_mb__after_atomic is equivalent, in terms of
>> semantics, to a full smp_mb. This ensures that none of the problems
>> described in the accompanying comment of waitqueue_active occur.
>> No functional changes.
>
> I tend to agree.
>
* Re: [PATCH] btrfs: Relax memory barrier in btrfs_tree_unlock
2018-02-14 12:37 [PATCH] btrfs: Relax memory barrier in btrfs_tree_unlock Nikolay Borisov
2018-02-24 0:14 ` David Sterba
@ 2018-03-07 16:05 ` David Sterba
1 sibling, 0 replies; 4+ messages in thread
From: David Sterba @ 2018-03-07 16:05 UTC (permalink / raw)
To: Nikolay Borisov; +Cc: linux-btrfs
On Wed, Feb 14, 2018 at 02:37:26PM +0200, Nikolay Borisov wrote:
> When performing an unlock on an extent buffer we'd like to order the
> decrement of extent_buffer::blocking_writers with waking up any
> waiters. In such situations it's sufficient to use smp_mb__after_atomic
> rather than the heavy smp_mb. On architectures where atomic operations
> are fully ordered (such as x86 or s390) unconditionally executing
> a heavyweight smp_mb instruction causes a severe hit to performance
> while bringing no improvements in terms of correctness.
>
> The better thing is to use the appropriate smp_mb__after_atomic routine
> which will do the correct thing (invoke a full smp_mb or in the case
> of ordered atomics insert a compiler barrier). Put another way,
> an RMW atomic op + smp_mb__after_atomic is equivalent, in terms of
> semantics, to a full smp_mb. This ensures that none of the problems
> described in the accompanying comment of waitqueue_active occur.
> No functional changes.
>
> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>