public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] Fix data race with transaction->state
@ 2025-11-21 14:59 Josef Bacik
  2025-11-21 14:59 ` [PATCH v2 1/2] btrfs: fix data race on transaction->state Josef Bacik
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Josef Bacik @ 2025-11-21 14:59 UTC (permalink / raw)
  To: linux-btrfs

v1: https://lore.kernel.org/linux-btrfs/cover.1763481355.git.josef@toxicpanda.com/

v1->v2:
- I'm rusty and forgot READ_ONCE/WRITE_ONCE doesn't mean smp consistency, fixed
  the race with a proper locked check.
- Updated the smp_mb usage in start_transaction to use the proper helper.

I want to note that this isn't actually an observed hang, there was a problem
with MMIO based block IO in my version of QEMU that was making IO just stop. I
happened to notice this because the hung tasks looked very much like a deadlock.
This fixes a real data race, and we would for sure miss wakeups without these
fixes, but I don't actually have a reproducer for any sort of deadlock in this
area.

=== Original email ===

I've been setting up Claude to setup fstests and run vms automatically and I
kept hitting hangs. This turned out to be a bug with qemu's microvm, but at some
point I was convinced there was a deadlock with running out of block tags and
ordered extent completion and transaction commit. This actually wasn't the case,
however this data race is in fact real. We can easily miss wakeups if we have to
wait on transaction state to change because we do it outside of a lock and we do
not have proper barriers around transaction->state. I suspect this explains the
random hangs that I would see in production while at Meta that would clear up
eventually (we do call wakeup on the transaction wait thing a lot). In any case
this is a data race, even if it wasn't my particular bug, we should fix it.
I've run it through fstests a few times, but obviously spot check it since I'm a
little rusty with this stuff at the moment. Thanks,

Josef

Josef Bacik (2):
  btrfs: fix data race on transaction->state
  btrfs: remove useless smp_mb in start_transaction

 fs/btrfs/transaction.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

-- 
2.51.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/2] btrfs: fix data race on transaction->state
  2025-11-21 14:59 [PATCH v2 0/2] Fix data race with transaction->state Josef Bacik
@ 2025-11-21 14:59 ` Josef Bacik
  2025-11-21 15:29   ` Filipe Manana
  2025-11-21 14:59 ` [PATCH v2 2/2] btrfs: remove useless smp_mb in start_transaction Josef Bacik
  2025-11-21 19:49 ` [PATCH v2 0/2] Fix data race with transaction->state Qu Wenruo
  2 siblings, 1 reply; 8+ messages in thread
From: Josef Bacik @ 2025-11-21 14:59 UTC (permalink / raw)
  To: linux-btrfs

Debugging a hang with btrfs on QEMU I discovered a data race with
transaction->state. In wait_current_trans we do

wait_event(fs_info->transaction_wait,
	   cur_trans->state>=TRANS_STATE_UNBLOCKED ||
	   TRANS_ABORTED(cur_trans));

however we're doing this outside of the fs_info->trans_lock. This
generally isn't super problematic because we hit
wake_up(fs_info->transaction_wait) quite a bit, but it could lead to
latencies where we miss wakeups, or in the worst case (like the compiler
re-orders the load of the ->state outside of the wait_event loop) we
could hang completely.

Fix this by using a helper that takes the fs_info->trans_lock to do the
check safely.

I've added a lockdep_assert for the other helper to make sure nobody
uses that one without holding the lock.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/transaction.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 89ae0c7a610a..863e145a3c26 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -509,11 +509,25 @@ int btrfs_record_root_in_trans(struct btrfs_trans_handle *trans,
 
 static inline int is_transaction_blocked(struct btrfs_transaction *trans)
 {
+	lockdep_assert_held(&trans->fs_info->trans_lock);
+
 	return (trans->state >= TRANS_STATE_COMMIT_START &&
 		trans->state < TRANS_STATE_UNBLOCKED &&
 		!TRANS_ABORTED(trans));
 }
 
+/* Helper to check transaction state under lock for wait_event */
+static bool trans_unblocked(struct btrfs_transaction *trans)
+{
+	struct btrfs_fs_info *fs_info = trans->fs_info;
+	bool ret;
+
+	spin_lock(&fs_info->trans_lock);
+	ret = trans->state >= TRANS_STATE_UNBLOCKED || TRANS_ABORTED(trans);
+	spin_unlock(&fs_info->trans_lock);
+	return ret;
+}
+
 /* wait for commit against the current transaction to become unblocked
  * when this is done, it is safe to start a new transaction, but the current
  * transaction might not be fully on disk.
@@ -529,9 +543,7 @@ static void wait_current_trans(struct btrfs_fs_info *fs_info)
 		spin_unlock(&fs_info->trans_lock);
 
 		btrfs_might_wait_for_state(fs_info, BTRFS_LOCKDEP_TRANS_UNBLOCKED);
-		wait_event(fs_info->transaction_wait,
-			   cur_trans->state >= TRANS_STATE_UNBLOCKED ||
-			   TRANS_ABORTED(cur_trans));
+		wait_event(fs_info->transaction_wait, trans_unblocked(cur_trans));
 		btrfs_put_transaction(cur_trans);
 	} else {
 		spin_unlock(&fs_info->trans_lock);
-- 
2.51.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/2] btrfs: remove useless smp_mb in start_transaction
  2025-11-21 14:59 [PATCH v2 0/2] Fix data race with transaction->state Josef Bacik
  2025-11-21 14:59 ` [PATCH v2 1/2] btrfs: fix data race on transaction->state Josef Bacik
@ 2025-11-21 14:59 ` Josef Bacik
  2025-11-24 17:23   ` David Sterba
  2025-11-21 19:49 ` [PATCH v2 0/2] Fix data race with transaction->state Qu Wenruo
  2 siblings, 1 reply; 8+ messages in thread
From: Josef Bacik @ 2025-11-21 14:59 UTC (permalink / raw)
  To: linux-btrfs

trans->state is protected by a spin_lock() and there's no mb when we set
it, simply the spin_unlock(). The correct pairing with that is
smp_load_acquire(), so replace the smp_mb with the appropriate helper.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/transaction.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 863e145a3c26..be2f1b88a0c1 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -737,8 +737,7 @@ start_transaction(struct btrfs_root *root, unsigned int num_items,
 	INIT_LIST_HEAD(&h->new_bgs);
 	btrfs_init_metadata_block_rsv(fs_info, &h->delayed_rsv, BTRFS_BLOCK_RSV_DELOPS);
 
-	smp_mb();
-	if (cur_trans->state >= TRANS_STATE_COMMIT_START &&
+	if (smp_load_acquire(&cur_trans->state) >= TRANS_STATE_COMMIT_START &&
 	    may_wait_transaction(fs_info, type)) {
 		current->journal_info = h;
 		btrfs_commit_transaction(h);
-- 
2.51.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] btrfs: fix data race on transaction->state
  2025-11-21 14:59 ` [PATCH v2 1/2] btrfs: fix data race on transaction->state Josef Bacik
@ 2025-11-21 15:29   ` Filipe Manana
  2025-11-21 20:29     ` Qu Wenruo
  0 siblings, 1 reply; 8+ messages in thread
From: Filipe Manana @ 2025-11-21 15:29 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs

On Fri, Nov 21, 2025 at 3:13 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> Debugging a hang with btrfs on QEMU I discovered a data race with
> transaction->state. In wait_current_trans we do
>
> wait_event(fs_info->transaction_wait,
>            cur_trans->state>=TRANS_STATE_UNBLOCKED ||
>            TRANS_ABORTED(cur_trans));
>
> however we're doing this outside of the fs_info->trans_lock. This
> generally isn't super problematic because we hit
> wake_up(fs_info->transaction_wait) quite a bit, but it could lead to
> latencies where we miss wakeups, or in the worst case (like the compiler
> re-orders the load of the ->state outside of the wait_event loop) we
> could hang completely.
>
> Fix this by using a helper that takes the fs_info->trans_lock to do the
> check safely.
>
> I've added a lockdep_assert for the other helper to make sure nobody
> uses that one without holding the lock.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/transaction.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 89ae0c7a610a..863e145a3c26 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -509,11 +509,25 @@ int btrfs_record_root_in_trans(struct btrfs_trans_handle *trans,
>
>  static inline int is_transaction_blocked(struct btrfs_transaction *trans)
>  {
> +       lockdep_assert_held(&trans->fs_info->trans_lock);
> +

It seems odd to sneak this in when no other change in the patch
introduces a call to this function.
I would make this a separate patch.

>         return (trans->state >= TRANS_STATE_COMMIT_START &&
>                 trans->state < TRANS_STATE_UNBLOCKED &&
>                 !TRANS_ABORTED(trans));
>  }
>
> +/* Helper to check transaction state under lock for wait_event */
> +static bool trans_unblocked(struct btrfs_transaction *trans)

This could have a name that is closer to the other helper:
is_transaction_unblocked()

> +{
> +       struct btrfs_fs_info *fs_info = trans->fs_info;
> +       bool ret;
> +
> +       spin_lock(&fs_info->trans_lock);
> +       ret = trans->state >= TRANS_STATE_UNBLOCKED || TRANS_ABORTED(trans);
> +       spin_unlock(&fs_info->trans_lock);
> +       return ret;
> +}
> +
>  /* wait for commit against the current transaction to become unblocked
>   * when this is done, it is safe to start a new transaction, but the current
>   * transaction might not be fully on disk.
> @@ -529,9 +543,7 @@ static void wait_current_trans(struct btrfs_fs_info *fs_info)
>                 spin_unlock(&fs_info->trans_lock);
>
>                 btrfs_might_wait_for_state(fs_info, BTRFS_LOCKDEP_TRANS_UNBLOCKED);
> -               wait_event(fs_info->transaction_wait,
> -                          cur_trans->state >= TRANS_STATE_UNBLOCKED ||
> -                          TRANS_ABORTED(cur_trans));
> +               wait_event(fs_info->transaction_wait, trans_unblocked(cur_trans));

Instead of adding an helper and locking, couldn't this be simply:

wait_event(fs_info->transaction_wait,
smp_load_acquire(cur_trans->state) >= TRANS_STATE_UNBLOCKED ||
TRANS_ABORTED(cur_trans));

Thanks.

>                 btrfs_put_transaction(cur_trans);
>         } else {
>                 spin_unlock(&fs_info->trans_lock);
> --
> 2.51.1
>
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 0/2] Fix data race with transaction->state
  2025-11-21 14:59 [PATCH v2 0/2] Fix data race with transaction->state Josef Bacik
  2025-11-21 14:59 ` [PATCH v2 1/2] btrfs: fix data race on transaction->state Josef Bacik
  2025-11-21 14:59 ` [PATCH v2 2/2] btrfs: remove useless smp_mb in start_transaction Josef Bacik
@ 2025-11-21 19:49 ` Qu Wenruo
  2 siblings, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2025-11-21 19:49 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs



在 2025/11/22 01:29, Josef Bacik 写道:
> v1: https://lore.kernel.org/linux-btrfs/cover.1763481355.git.josef@toxicpanda.com/
> 
> v1->v2:
> - I'm rusty and forgot READ_ONCE/WRITE_ONCE doesn't mean smp consistency, fixed
>    the race with a proper locked check.
> - Updated the smp_mb usage in start_transaction to use the proper helper.
> 
> I want to note that this isn't actually an observed hang, there was a problem
> with MMIO based block IO in my version of QEMU that was making IO just stop. I
> happened to notice this because the hung tasks looked very much like a deadlock.
> This fixes a real data race, and we would for sure miss wakeups without these
> fixes, but I don't actually have a reproducer for any sort of deadlock in this
> area.

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> 
> === Original email ===
> 
> I've been setting up Claude to setup fstests and run vms automatically and I
> kept hitting hangs. This turned out to be a bug with qemu's microvm, but at some
> point I was convinced there was a deadlock with running out of block tags and
> ordered extent completion and transaction commit. This actually wasn't the case,
> however this data race is in fact real. We can easily miss wakeups if we have to
> wait on transaction state to change because we do it outside of a lock and we do
> not have proper barriers around transaction->state. I suspect this explains the
> random hangs that I would see in production while at Meta that would clear up
> eventually (we do call wakeup on the transaction wait thing a lot). In any case
> this is a data race, even if it wasn't my particular bug, we should fix it.
> I've run it through fstests a few times, but obviously spot check it since I'm a
> little rusty with this stuff at the moment. Thanks,
> 
> Josef
> 
> Josef Bacik (2):
>    btrfs: fix data race on transaction->state
>    btrfs: remove useless smp_mb in start_transaction
> 
>   fs/btrfs/transaction.c | 21 ++++++++++++++++-----
>   1 file changed, 16 insertions(+), 5 deletions(-)
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] btrfs: fix data race on transaction->state
  2025-11-21 15:29   ` Filipe Manana
@ 2025-11-21 20:29     ` Qu Wenruo
  2025-11-21 20:51       ` Filipe Manana
  0 siblings, 1 reply; 8+ messages in thread
From: Qu Wenruo @ 2025-11-21 20:29 UTC (permalink / raw)
  To: Filipe Manana, Josef Bacik; +Cc: linux-btrfs



在 2025/11/22 01:59, Filipe Manana 写道:
> On Fri, Nov 21, 2025 at 3:13 PM Josef Bacik <josef@toxicpanda.com> wrote:
>>
>> Debugging a hang with btrfs on QEMU I discovered a data race with
>> transaction->state. In wait_current_trans we do
>>
>> wait_event(fs_info->transaction_wait,
>>             cur_trans->state>=TRANS_STATE_UNBLOCKED ||
>>             TRANS_ABORTED(cur_trans));
>>
>> however we're doing this outside of the fs_info->trans_lock. This
>> generally isn't super problematic because we hit
>> wake_up(fs_info->transaction_wait) quite a bit, but it could lead to
>> latencies where we miss wakeups, or in the worst case (like the compiler
>> re-orders the load of the ->state outside of the wait_event loop) we
>> could hang completely.
>>
>> Fix this by using a helper that takes the fs_info->trans_lock to do the
>> check safely.
>>
>> I've added a lockdep_assert for the other helper to make sure nobody
>> uses that one without holding the lock.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>> ---
>>   fs/btrfs/transaction.c | 18 +++++++++++++++---
>>   1 file changed, 15 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
>> index 89ae0c7a610a..863e145a3c26 100644
>> --- a/fs/btrfs/transaction.c
>> +++ b/fs/btrfs/transaction.c
>> @@ -509,11 +509,25 @@ int btrfs_record_root_in_trans(struct btrfs_trans_handle *trans,
>>
>>   static inline int is_transaction_blocked(struct btrfs_transaction *trans)
>>   {
>> +       lockdep_assert_held(&trans->fs_info->trans_lock);
>> +
> 
> It seems odd to sneak this in when no other change in the patch
> introduces a call to this function.
> I would make this a separate patch.
> 
>>          return (trans->state >= TRANS_STATE_COMMIT_START &&
>>                  trans->state < TRANS_STATE_UNBLOCKED &&
>>                  !TRANS_ABORTED(trans));
>>   }
>>
>> +/* Helper to check transaction state under lock for wait_event */
>> +static bool trans_unblocked(struct btrfs_transaction *trans)
> 
> This could have a name that is closer to the other helper:
> is_transaction_unblocked()
> 
>> +{
>> +       struct btrfs_fs_info *fs_info = trans->fs_info;
>> +       bool ret;
>> +
>> +       spin_lock(&fs_info->trans_lock);
>> +       ret = trans->state >= TRANS_STATE_UNBLOCKED || TRANS_ABORTED(trans);
>> +       spin_unlock(&fs_info->trans_lock);
>> +       return ret;
>> +}
>> +
>>   /* wait for commit against the current transaction to become unblocked
>>    * when this is done, it is safe to start a new transaction, but the current
>>    * transaction might not be fully on disk.
>> @@ -529,9 +543,7 @@ static void wait_current_trans(struct btrfs_fs_info *fs_info)
>>                  spin_unlock(&fs_info->trans_lock);
>>
>>                  btrfs_might_wait_for_state(fs_info, BTRFS_LOCKDEP_TRANS_UNBLOCKED);
>> -               wait_event(fs_info->transaction_wait,
>> -                          cur_trans->state >= TRANS_STATE_UNBLOCKED ||
>> -                          TRANS_ABORTED(cur_trans));
>> +               wait_event(fs_info->transaction_wait, trans_unblocked(cur_trans));
> 
> Instead of adding an helper and locking, couldn't this be simply:
> 
> wait_event(fs_info->transaction_wait,
> smp_load_acquire(cur_trans->state) >= TRANS_STATE_UNBLOCKED ||
> TRANS_ABORTED(cur_trans));

Not an expert on memory barriers, but isn't the key point here that 
we're accessing two different variables in a non-atomic way?

And to be honest, I really do not like introducing low level memory 
barrier callers, it's really hard to get it right.

Thanks,
Qu

> 
> Thanks.
> 
>>                  btrfs_put_transaction(cur_trans);
>>          } else {
>>                  spin_unlock(&fs_info->trans_lock);
>> --
>> 2.51.1
>>
>>
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] btrfs: fix data race on transaction->state
  2025-11-21 20:29     ` Qu Wenruo
@ 2025-11-21 20:51       ` Filipe Manana
  0 siblings, 0 replies; 8+ messages in thread
From: Filipe Manana @ 2025-11-21 20:51 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Josef Bacik, linux-btrfs

On Fri, Nov 21, 2025 at 8:29 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> 在 2025/11/22 01:59, Filipe Manana 写道:
> > On Fri, Nov 21, 2025 at 3:13 PM Josef Bacik <josef@toxicpanda.com> wrote:
> >>
> >> Debugging a hang with btrfs on QEMU I discovered a data race with
> >> transaction->state. In wait_current_trans we do
> >>
> >> wait_event(fs_info->transaction_wait,
> >>             cur_trans->state>=TRANS_STATE_UNBLOCKED ||
> >>             TRANS_ABORTED(cur_trans));
> >>
> >> however we're doing this outside of the fs_info->trans_lock. This
> >> generally isn't super problematic because we hit
> >> wake_up(fs_info->transaction_wait) quite a bit, but it could lead to
> >> latencies where we miss wakeups, or in the worst case (like the compiler
> >> re-orders the load of the ->state outside of the wait_event loop) we
> >> could hang completely.
> >>
> >> Fix this by using a helper that takes the fs_info->trans_lock to do the
> >> check safely.
> >>
> >> I've added a lockdep_assert for the other helper to make sure nobody
> >> uses that one without holding the lock.
> >>
> >> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> >> ---
> >>   fs/btrfs/transaction.c | 18 +++++++++++++++---
> >>   1 file changed, 15 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> >> index 89ae0c7a610a..863e145a3c26 100644
> >> --- a/fs/btrfs/transaction.c
> >> +++ b/fs/btrfs/transaction.c
> >> @@ -509,11 +509,25 @@ int btrfs_record_root_in_trans(struct btrfs_trans_handle *trans,
> >>
> >>   static inline int is_transaction_blocked(struct btrfs_transaction *trans)
> >>   {
> >> +       lockdep_assert_held(&trans->fs_info->trans_lock);
> >> +
> >
> > It seems odd to sneak this in when no other change in the patch
> > introduces a call to this function.
> > I would make this a separate patch.
> >
> >>          return (trans->state >= TRANS_STATE_COMMIT_START &&
> >>                  trans->state < TRANS_STATE_UNBLOCKED &&
> >>                  !TRANS_ABORTED(trans));
> >>   }
> >>
> >> +/* Helper to check transaction state under lock for wait_event */
> >> +static bool trans_unblocked(struct btrfs_transaction *trans)
> >
> > This could have a name that is closer to the other helper:
> > is_transaction_unblocked()
> >
> >> +{
> >> +       struct btrfs_fs_info *fs_info = trans->fs_info;
> >> +       bool ret;
> >> +
> >> +       spin_lock(&fs_info->trans_lock);
> >> +       ret = trans->state >= TRANS_STATE_UNBLOCKED || TRANS_ABORTED(trans);
> >> +       spin_unlock(&fs_info->trans_lock);
> >> +       return ret;
> >> +}
> >> +
> >>   /* wait for commit against the current transaction to become unblocked
> >>    * when this is done, it is safe to start a new transaction, but the current
> >>    * transaction might not be fully on disk.
> >> @@ -529,9 +543,7 @@ static void wait_current_trans(struct btrfs_fs_info *fs_info)
> >>                  spin_unlock(&fs_info->trans_lock);
> >>
> >>                  btrfs_might_wait_for_state(fs_info, BTRFS_LOCKDEP_TRANS_UNBLOCKED);
> >> -               wait_event(fs_info->transaction_wait,
> >> -                          cur_trans->state >= TRANS_STATE_UNBLOCKED ||
> >> -                          TRANS_ABORTED(cur_trans));
> >> +               wait_event(fs_info->transaction_wait, trans_unblocked(cur_trans));
> >
> > Instead of adding an helper and locking, couldn't this be simply:
> >
> > wait_event(fs_info->transaction_wait,
> > smp_load_acquire(cur_trans->state) >= TRANS_STATE_UNBLOCKED ||
> > TRANS_ABORTED(cur_trans));
>
> Not an expert on memory barriers, but isn't the key point here that
> we're accessing two different variables in a non-atomic way?

So what?
We can continue accessing the trans aborted case as we did because
it's not expected to happen, so any latency in the waiting due to a
transaction abort is fine.

>
> And to be honest, I really do not like introducing low level memory
> barrier callers, it's really hard to get it right.

I generally don't like it either. But here it saves quite some code
and it's for a very rare case anyway.

Plus there's also this inconsistency with the second patch, which
favors using smp_load_acquire() instead of taking the lock.

__nfs_lookup_revalidate() uses smp_load_acquire() precisely in the
same situation,
in a wait_var_event() condition to read dentry->d_fsdata, which is
updated under the protection of dentry->d_lock.


>
> Thanks,
> Qu
>
> >
> > Thanks.
> >
> >>                  btrfs_put_transaction(cur_trans);
> >>          } else {
> >>                  spin_unlock(&fs_info->trans_lock);
> >> --
> >> 2.51.1
> >>
> >>
> >
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] btrfs: remove useless smp_mb in start_transaction
  2025-11-21 14:59 ` [PATCH v2 2/2] btrfs: remove useless smp_mb in start_transaction Josef Bacik
@ 2025-11-24 17:23   ` David Sterba
  0 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2025-11-24 17:23 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs

On Fri, Nov 21, 2025 at 09:59:22AM -0500, Josef Bacik wrote:
> trans->state is protected by a spin_lock() and there's no mb when we set
> it, simply the spin_unlock(). The correct pairing with that is
> smp_load_acquire(), so replace the smp_mb with the appropriate helper.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/transaction.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 863e145a3c26..be2f1b88a0c1 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -737,8 +737,7 @@ start_transaction(struct btrfs_root *root, unsigned int num_items,
>  	INIT_LIST_HEAD(&h->new_bgs);
>  	btrfs_init_metadata_block_rsv(fs_info, &h->delayed_rsv, BTRFS_BLOCK_RSV_DELOPS);
>  
> -	smp_mb();
> -	if (cur_trans->state >= TRANS_STATE_COMMIT_START &&
> +	if (smp_load_acquire(&cur_trans->state) >= TRANS_STATE_COMMIT_START &&

This is a memory barrier so it should be documented.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-11-24 17:23 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-21 14:59 [PATCH v2 0/2] Fix data race with transaction->state Josef Bacik
2025-11-21 14:59 ` [PATCH v2 1/2] btrfs: fix data race on transaction->state Josef Bacik
2025-11-21 15:29   ` Filipe Manana
2025-11-21 20:29     ` Qu Wenruo
2025-11-21 20:51       ` Filipe Manana
2025-11-21 14:59 ` [PATCH v2 2/2] btrfs: remove useless smp_mb in start_transaction Josef Bacik
2025-11-24 17:23   ` David Sterba
2025-11-21 19:49 ` [PATCH v2 0/2] Fix data race with transaction->state Qu Wenruo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox