* [PATCH] btrfs: fix waitqueue_active without memory barrier in btrfs
@ 2015-10-09 0:35 Kosuke Tatsukawa
2015-10-09 4:32 ` Josef Bacik
2015-10-09 14:21 ` David Sterba
0 siblings, 2 replies; 5+ messages in thread
From: Kosuke Tatsukawa @ 2015-10-09 0:35 UTC (permalink / raw)
To: Chris Mason, Josef Bacik, David Sterba
Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org
btrfs_bio_counter_sub() seems to be missing a memory barrier which might
cause the waker to not notice the waiter and miss sending a wake_up as
in the following figure.
btrfs_bio_counter_sub btrfs_rm_dev_replace_blocked
------------------------------------------------------------------------
if (waitqueue_active(&fs_info->replace_wait))
/* The CPU might reorder the test for
the waitqueue up here, before
prior writes complete */
/* wait_event */
/* __wait_event */
/* ___wait_event */
long __int = prepare_to_wait_event(&wq,
&__wait, state);
if (!percpu_counter_sum(&fs_info->bio_counter))
percpu_counter_sub(&fs_info->bio_counter,
amount);
schedule()
------------------------------------------------------------------------
This patch removes the call to waitqueue_active() leaving just wake_up()
behind. This fixes the problem because the call to spin_lock_irqsave()
in wake_up() will be an ACQUIRE operation.
I found this issue when I was looking through the linux source code
for places calling waitqueue_active() before wake_up*(), but without
preceding memory barriers, after sending a patch to fix a similar
issue in drivers/tty/n_tty.c (Details about the original issue can be
found here: https://lkml.org/lkml/2015/9/28/849).
Signed-off-by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com>
---
fs/btrfs/dev-replace.c | 4 +---
1 files changed, 1 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index e54dd59..ecb3e71 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -918,9 +918,7 @@ void btrfs_bio_counter_inc_noblocked(struct btrfs_fs_info *fs_info)
void btrfs_bio_counter_sub(struct btrfs_fs_info *fs_info, s64 amount)
{
percpu_counter_sub(&fs_info->bio_counter, amount);
-
- if (waitqueue_active(&fs_info->replace_wait))
- wake_up(&fs_info->replace_wait);
+ wake_up(&fs_info->replace_wait);
}
void btrfs_bio_counter_inc_blocked(struct btrfs_fs_info *fs_info)
--
1.7.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] btrfs: fix waitqueue_active without memory barrier in btrfs
2015-10-09 0:35 [PATCH] btrfs: fix waitqueue_active without memory barrier in btrfs Kosuke Tatsukawa
@ 2015-10-09 4:32 ` Josef Bacik
2015-10-09 4:50 ` Kosuke Tatsukawa
2015-10-09 14:21 ` David Sterba
1 sibling, 1 reply; 5+ messages in thread
From: Josef Bacik @ 2015-10-09 4:32 UTC (permalink / raw)
To: Kosuke Tatsukawa, Chris Mason, David Sterba
Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org
On 10/08/2015 05:35 PM, Kosuke Tatsukawa wrote:
> btrfs_bio_counter_sub() seems to be missing a memory barrier which might
> cause the waker to not notice the waiter and miss sending a wake_up as
> in the following figure.
>
> btrfs_bio_counter_sub btrfs_rm_dev_replace_blocked
> ------------------------------------------------------------------------
> if (waitqueue_active(&fs_info->replace_wait))
> /* The CPU might reorder the test for
> the waitqueue up here, before
> prior writes complete */
> /* wait_event */
> /* __wait_event */
> /* ___wait_event */
> long __int = prepare_to_wait_event(&wq,
> &__wait, state);
> if (!percpu_counter_sum(&fs_info->bio_counter))
> percpu_counter_sub(&fs_info->bio_counter,
> amount);
> schedule()
percpu_counter_sub can't be reordered, in its most basic form it does
preempt_disable/enable which in its most basic form does barrier(). Thanks,
Josef
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] btrfs: fix waitqueue_active without memory barrier in btrfs
2015-10-09 4:32 ` Josef Bacik
@ 2015-10-09 4:50 ` Kosuke Tatsukawa
0 siblings, 0 replies; 5+ messages in thread
From: Kosuke Tatsukawa @ 2015-10-09 4:50 UTC (permalink / raw)
To: Josef Bacik
Cc: Chris Mason, David Sterba, linux-btrfs@vger.kernel.org,
linux-kernel@vger.kernel.org
Josef Bacik wrote:
> On 10/08/2015 05:35 PM, Kosuke Tatsukawa wrote:
>> btrfs_bio_counter_sub() seems to be missing a memory barrier which might
>> cause the waker to not notice the waiter and miss sending a wake_up as
>> in the following figure.
>>
>> btrfs_bio_counter_sub btrfs_rm_dev_replace_blocked
>> ------------------------------------------------------------------------
>> if (waitqueue_active(&fs_info->replace_wait))
>> /* The CPU might reorder the test for
>> the waitqueue up here, before
>> prior writes complete */
>> /* wait_event */
>> /* __wait_event */
>> /* ___wait_event */
>> long __int = prepare_to_wait_event(&wq,
>> &__wait, state);
>> if (!percpu_counter_sum(&fs_info->bio_counter))
>> percpu_counter_sub(&fs_info->bio_counter,
>> amount);
>> schedule()
>
> percpu_counter_sub can't be reordered, in its most basic form it does
> preempt_disable/enable which in its most basic form does barrier(). Thanks,
It's not the compiler, but the CPU that is doing the reordering.
The CPU can delay the write of the counter, so that the following read
of &fs_info->replace_wait is completed first. Hence a memory barrier is
required, and not just a barrier.
---
Kosuke TATSUKAWA | 3rd IT Platform Department
| IT Platform Division, NEC Corporation
| tatsu@ab.jp.nec.com
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] btrfs: fix waitqueue_active without memory barrier in btrfs
2015-10-09 0:35 [PATCH] btrfs: fix waitqueue_active without memory barrier in btrfs Kosuke Tatsukawa
2015-10-09 4:32 ` Josef Bacik
@ 2015-10-09 14:21 ` David Sterba
2015-10-10 5:03 ` Kosuke Tatsukawa
1 sibling, 1 reply; 5+ messages in thread
From: David Sterba @ 2015-10-09 14:21 UTC (permalink / raw)
To: Kosuke Tatsukawa
Cc: Chris Mason, Josef Bacik, David Sterba,
linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org
On Fri, Oct 09, 2015 at 12:35:48AM +0000, Kosuke Tatsukawa wrote:
> This patch removes the call to waitqueue_active() leaving just wake_up()
> behind. This fixes the problem because the call to spin_lock_irqsave()
> in wake_up() will be an ACQUIRE operation.
Either we can switch it to wake_up or put the barrier before the check.
Not all instances of waitqueue_active need the barrier though.
> I found this issue when I was looking through the linux source code
> for places calling waitqueue_active() before wake_up*(), but without
> preceding memory barriers, after sending a patch to fix a similar
> issue in drivers/tty/n_tty.c (Details about the original issue can be
> found here: https://lkml.org/lkml/2015/9/28/849).
There are more in btrfs:
https://www.mail-archive.com/linux-btrfs%40vger.kernel.org/msg41914.html
> @@ -918,9 +918,7 @@ void btrfs_bio_counter_inc_noblocked(struct btrfs_fs_info *fs_info)
> void btrfs_bio_counter_sub(struct btrfs_fs_info *fs_info, s64 amount)
> {
> percpu_counter_sub(&fs_info->bio_counter, amount);
> -
> - if (waitqueue_active(&fs_info->replace_wait))
> - wake_up(&fs_info->replace_wait);
> + wake_up(&fs_info->replace_wait);
Chris had a comment on that one in
https://www.mail-archive.com/linux-btrfs%40vger.kernel.org/msg42551.html
it's in performance critial context and the explicit wake_up is even
worse than the barrier.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] btrfs: fix waitqueue_active without memory barrier in btrfs
2015-10-09 14:21 ` David Sterba
@ 2015-10-10 5:03 ` Kosuke Tatsukawa
0 siblings, 0 replies; 5+ messages in thread
From: Kosuke Tatsukawa @ 2015-10-10 5:03 UTC (permalink / raw)
To: dsterba@suse.cz
Cc: Chris Mason, Josef Bacik, David Sterba,
linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org
David Sterba wrote:
> On Fri, Oct 09, 2015 at 12:35:48AM +0000, Kosuke Tatsukawa wrote:
>> This patch removes the call to waitqueue_active() leaving just wake_up()
>> behind. This fixes the problem because the call to spin_lock_irqsave()
>> in wake_up() will be an ACQUIRE operation.
>
> Either we can switch it to wake_up or put the barrier before the check.
> Not all instances of waitqueue_active need the barrier though.
>
>> I found this issue when I was looking through the linux source code
>> for places calling waitqueue_active() before wake_up*(), but without
>> preceding memory barriers, after sending a patch to fix a similar
>> issue in drivers/tty/n_tty.c (Details about the original issue can be
>> found here: https://lkml.org/lkml/2015/9/28/849).
>
> There are more in btrfs:
>
> https://www.mail-archive.com/linux-btrfs%40vger.kernel.org/msg41914.html
Thank you for the pointer.
Your patch seems better than mine.
I think the other places in btrfs that use waitqueue_active() before
wake_up are preceded by either a smp_mb or some kind of atomic
operation.
The latter still needs smp_mb__after_atomic() but it's light-weight
compared to smp_mb().
>> @@ -918,9 +918,7 @@ void btrfs_bio_counter_inc_noblocked(struct btrfs_fs_info *fs_info)
>> void btrfs_bio_counter_sub(struct btrfs_fs_info *fs_info, s64 amount)
>> {
>> percpu_counter_sub(&fs_info->bio_counter, amount);
>> -
>> - if (waitqueue_active(&fs_info->replace_wait))
>> - wake_up(&fs_info->replace_wait);
>> + wake_up(&fs_info->replace_wait);
>
> Chris had a comment on that one in
> https://www.mail-archive.com/linux-btrfs%40vger.kernel.org/msg42551.html
> it's in performance critial context and the explicit wake_up is even
> worse than the barrier.
---
Kosuke TATSUKAWA | 3rd IT Platform Department
| IT Platform Division, NEC Corporation
| tatsu@ab.jp.nec.com
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2015-10-10 5:04 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-10-09 0:35 [PATCH] btrfs: fix waitqueue_active without memory barrier in btrfs Kosuke Tatsukawa
2015-10-09 4:32 ` Josef Bacik
2015-10-09 4:50 ` Kosuke Tatsukawa
2015-10-09 14:21 ` David Sterba
2015-10-10 5:03 ` Kosuke Tatsukawa
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.