public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3] btrfs: qgroup: fix deadlock between rescan worker and remove qgroup
@ 2022-02-26 16:03 Sidong Yang
  2022-02-28  0:55 ` Shinichiro Kawasaki
  0 siblings, 1 reply; 3+ messages in thread
From: Sidong Yang @ 2022-02-26 16:03 UTC (permalink / raw)
  To: Filipe Manana, dsterba@suse.cz, linux-btrfs@vger.kernel.org,
	Shinichiro Kawasaki
  Cc: Sidong Yang, Filipe Manana

The commit e804861bd4e6 ("btrfs: fix deadlock between quota disable and
qgroup rescan worker") by Kawasaki resolves deadlock between quota
disable and qgroup rescan worker. but also there is a deadlock case like
it. It's about enabling or disabling quota and creating or removing
qgroup. It can be reproduced in simple script below.

for i in {1..100}
do
    btrfs quota enable /mnt &
    btrfs qgroup create 1/0 /mnt &
    btrfs qgroup destroy 1/0 /mnt &
    btrfs quota disable /mnt &
done

Here's why the deadlock happens:

1) The quota rescan task is running.

2) Task A calls btrfs_quota_disable(), locks the qgroup_ioctl_lock
   mutex, and then calls btrfs_qgroup_wait_for_completion(), to wait for
   the quota rescan task to complete.

3) Task B calls btrfs_remove_qgroup() and it blocks when trying to lock
   the qgroup_ioctl_lock mutex, because it's being held by task A. At that
   To  resolve this issue, The thread disabling quota should unlock
   int task B is holding a transaction handle for the current transaction.

4) The quota rescan task calls btrfs_commit_transaction(). This results
   in it waiting for all other tasks to release their handles on the
   transaction, but task B is blocked on the qgroup_ioctl_lock mutex
   while holding a handle on the transaction, and that mutex is being held
   by task A, which is waiting for the quota rescan task to complete,
   resulting in a deadlock between these 3 tasks.

To resolve this issue, The thread disabling quota should unlock
qgroup_ioctl_lock before waiting rescan completion. This patch moves
btrfs_qgroup_wait_for_completion() after qgroup_ioctl_lock().

Fixes: e804861bd4e6 ("btrfs: fix deadlock between quota disable and
qgroup rescan worker")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Sidong Yang <realwakka@gmail.com>
---
v3: fix comments, typos, changelog.
v2: add comments, move locking before clear_bit.
---
 fs/btrfs/qgroup.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 2c0dd6b8a80c..1866b1f0da01 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1213,6 +1213,14 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
 	if (!fs_info->quota_root)
 		goto out;
 
+	/*
+	 * Unlock the qgroup_ioctl_lock mutex before waiting for the rescan worker to
+	 * complete. Otherwise we can deadlock because btrfs_remove_qgroup() needs
+	 * to lock that mutex while holding a transaction handle and the rescan
+	 * worker needs to commit a transaction.
+	 */
+	mutex_unlock(&fs_info->qgroup_ioctl_lock);
+
 	/*
 	 * Request qgroup rescan worker to complete and wait for it. This wait
 	 * must be done before transaction start for quota disable since it may
@@ -1220,7 +1228,6 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
 	 */
 	clear_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
 	btrfs_qgroup_wait_for_completion(fs_info, false);
-	mutex_unlock(&fs_info->qgroup_ioctl_lock);
 
 	/*
 	 * 1 For the root item
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v3] btrfs: qgroup: fix deadlock between rescan worker and remove qgroup
  2022-02-26 16:03 [PATCH v3] btrfs: qgroup: fix deadlock between rescan worker and remove qgroup Sidong Yang
@ 2022-02-28  0:55 ` Shinichiro Kawasaki
  2022-02-28  1:19   ` Sidong Yang
  0 siblings, 1 reply; 3+ messages in thread
From: Shinichiro Kawasaki @ 2022-02-28  0:55 UTC (permalink / raw)
  To: Sidong Yang
  Cc: Filipe Manana, dsterba@suse.cz, linux-btrfs@vger.kernel.org,
	Filipe Manana

Thanks for this v3 patch. I found some more points to improve in the commit
message.

On Feb 26, 2022 / 16:03, Sidong Yang wrote:
> The commit e804861bd4e6 ("btrfs: fix deadlock between quota disable and
> qgroup rescan worker") by Kawasaki resolves deadlock between quota
> disable and qgroup rescan worker. but also there is a deadlock case like

nit: s/but also/But also/

> it. It's about enabling or disabling quota and creating or removing
> qgroup. It can be reproduced in simple script below.
> 
> for i in {1..100}
> do
>     btrfs quota enable /mnt &
>     btrfs qgroup create 1/0 /mnt &
>     btrfs qgroup destroy 1/0 /mnt &
>     btrfs quota disable /mnt &
> done
> 
> Here's why the deadlock happens:
> 
> 1) The quota rescan task is running.
> 
> 2) Task A calls btrfs_quota_disable(), locks the qgroup_ioctl_lock
>    mutex, and then calls btrfs_qgroup_wait_for_completion(), to wait for
>    the quota rescan task to complete.
> 
> 3) Task B calls btrfs_remove_qgroup() and it blocks when trying to lock
>    the qgroup_ioctl_lock mutex, because it's being held by task A. At that
>    To  resolve this issue, The thread disabling quota should unlock
>    int task B is holding a transaction handle for the current transaction.

I think you made a mistake in the two lines above, when you copied the text that
Filipe suggested.

> 
> 4) The quota rescan task calls btrfs_commit_transaction(). This results
>    in it waiting for all other tasks to release their handles on the
>    transaction, but task B is blocked on the qgroup_ioctl_lock mutex
>    while holding a handle on the transaction, and that mutex is being held
>    by task A, which is waiting for the quota rescan task to complete,
>    resulting in a deadlock between these 3 tasks.
> 
> To resolve this issue, The thread disabling quota should unlock

nit: s/The thread/the thread/

> qgroup_ioctl_lock before waiting rescan completion. This patch moves
> btrfs_qgroup_wait_for_completion() after qgroup_ioctl_lock().

Do you mean 'unlock of qgroup_ioctl_lock' instead of 'qgroup_ioclt_lock()?

Also, according to Documentation/process/submitting-patches.rst, imperative
mood is recommended for commit messages. Then the sentence above can be:

  Move btrfs_qgroup_wait_for_completion() after unlock of qgroup_ioctl_lock.

-- 
Best Regards,
Shin'ichiro Kawasaki

> 
> Fixes: e804861bd4e6 ("btrfs: fix deadlock between quota disable and
> qgroup rescan worker")
> Reviewed-by: Filipe Manana <fdmanana@suse.com>
> Signed-off-by: Sidong Yang <realwakka@gmail.com>
> ---
> v3: fix comments, typos, changelog.
> v2: add comments, move locking before clear_bit.
> ---
>  fs/btrfs/qgroup.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index 2c0dd6b8a80c..1866b1f0da01 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -1213,6 +1213,14 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
>  	if (!fs_info->quota_root)
>  		goto out;
>  
> +	/*
> +	 * Unlock the qgroup_ioctl_lock mutex before waiting for the rescan worker to
> +	 * complete. Otherwise we can deadlock because btrfs_remove_qgroup() needs
> +	 * to lock that mutex while holding a transaction handle and the rescan
> +	 * worker needs to commit a transaction.
> +	 */
> +	mutex_unlock(&fs_info->qgroup_ioctl_lock);
> +
>  	/*
>  	 * Request qgroup rescan worker to complete and wait for it. This wait
>  	 * must be done before transaction start for quota disable since it may
> @@ -1220,7 +1228,6 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
>  	 */
>  	clear_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
>  	btrfs_qgroup_wait_for_completion(fs_info, false);
> -	mutex_unlock(&fs_info->qgroup_ioctl_lock);
>  
>  	/*
>  	 * 1 For the root item
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v3] btrfs: qgroup: fix deadlock between rescan worker and remove qgroup
  2022-02-28  0:55 ` Shinichiro Kawasaki
@ 2022-02-28  1:19   ` Sidong Yang
  0 siblings, 0 replies; 3+ messages in thread
From: Sidong Yang @ 2022-02-28  1:19 UTC (permalink / raw)
  To: Shinichiro Kawasaki
  Cc: Filipe Manana, dsterba@suse.cz, linux-btrfs@vger.kernel.org,
	Filipe Manana

On Mon, Feb 28, 2022 at 12:55:47AM +0000, Shinichiro Kawasaki wrote:

Hi, Shinichiro.
Thanks for detailed review!

> Thanks for this v3 patch. I found some more points to improve in the commit
> message.
> 
> On Feb 26, 2022 / 16:03, Sidong Yang wrote:
> > The commit e804861bd4e6 ("btrfs: fix deadlock between quota disable and
> > qgroup rescan worker") by Kawasaki resolves deadlock between quota
> > disable and qgroup rescan worker. but also there is a deadlock case like
> 
> nit: s/but also/But also/

Thanks.
> 
> > it. It's about enabling or disabling quota and creating or removing
> > qgroup. It can be reproduced in simple script below.
> > 
> > for i in {1..100}
> > do
> >     btrfs quota enable /mnt &
> >     btrfs qgroup create 1/0 /mnt &
> >     btrfs qgroup destroy 1/0 /mnt &
> >     btrfs quota disable /mnt &
> > done
> > 
> > Here's why the deadlock happens:
> > 
> > 1) The quota rescan task is running.
> > 
> > 2) Task A calls btrfs_quota_disable(), locks the qgroup_ioctl_lock
> >    mutex, and then calls btrfs_qgroup_wait_for_completion(), to wait for
> >    the quota rescan task to complete.
> > 
> > 3) Task B calls btrfs_remove_qgroup() and it blocks when trying to lock
> >    the qgroup_ioctl_lock mutex, because it's being held by task A. At that
> >    To  resolve this issue, The thread disabling quota should unlock
> >    int task B is holding a transaction handle for the current transaction.
> 
> I think you made a mistake in the two lines above, when you copied the text that
> Filipe suggested.

Oops. It's my mistake.
> 
> > 
> > 4) The quota rescan task calls btrfs_commit_transaction(). This results
> >    in it waiting for all other tasks to release their handles on the
> >    transaction, but task B is blocked on the qgroup_ioctl_lock mutex
> >    while holding a handle on the transaction, and that mutex is being held
> >    by task A, which is waiting for the quota rescan task to complete,
> >    resulting in a deadlock between these 3 tasks.
> > 
> > To resolve this issue, The thread disabling quota should unlock
> 
> nit: s/The thread/the thread/
> 
> > qgroup_ioctl_lock before waiting rescan completion. This patch moves
> > btrfs_qgroup_wait_for_completion() after qgroup_ioctl_lock().
> 
> Do you mean 'unlock of qgroup_ioctl_lock' instead of 'qgroup_ioclt_lock()?

Yeah, It should be unlock of qgroup_ioctl_lock. qgroup_ioctl_lock is not
a function but value. 

> 
> Also, according to Documentation/process/submitting-patches.rst, imperative
> mood is recommended for commit messages. Then the sentence above can be:
> 
>   Move btrfs_qgroup_wait_for_completion() after unlock of qgroup_ioctl_lock.

Thanks, What I wrote is exact bad example for this. I need to read the
guide carefully.

Thanks,
Sidong
> 
> -- 
> Best Regards,
> Shin'ichiro Kawasaki
> 
> > 
> > Fixes: e804861bd4e6 ("btrfs: fix deadlock between quota disable and
> > qgroup rescan worker")
> > Reviewed-by: Filipe Manana <fdmanana@suse.com>
> > Signed-off-by: Sidong Yang <realwakka@gmail.com>
> > ---
> > v3: fix comments, typos, changelog.
> > v2: add comments, move locking before clear_bit.
> > ---
> >  fs/btrfs/qgroup.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> > index 2c0dd6b8a80c..1866b1f0da01 100644
> > --- a/fs/btrfs/qgroup.c
> > +++ b/fs/btrfs/qgroup.c
> > @@ -1213,6 +1213,14 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
> >  	if (!fs_info->quota_root)
> >  		goto out;
> >  
> > +	/*
> > +	 * Unlock the qgroup_ioctl_lock mutex before waiting for the rescan worker to
> > +	 * complete. Otherwise we can deadlock because btrfs_remove_qgroup() needs
> > +	 * to lock that mutex while holding a transaction handle and the rescan
> > +	 * worker needs to commit a transaction.
> > +	 */
> > +	mutex_unlock(&fs_info->qgroup_ioctl_lock);
> > +
> >  	/*
> >  	 * Request qgroup rescan worker to complete and wait for it. This wait
> >  	 * must be done before transaction start for quota disable since it may
> > @@ -1220,7 +1228,6 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
> >  	 */
> >  	clear_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
> >  	btrfs_qgroup_wait_for_completion(fs_info, false);
> > -	mutex_unlock(&fs_info->qgroup_ioctl_lock);
> >  
> >  	/*
> >  	 * 1 For the root item
> > -- 
> > 2.25.1
> > 

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-02-28  1:19 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-02-26 16:03 [PATCH v3] btrfs: qgroup: fix deadlock between rescan worker and remove qgroup Sidong Yang
2022-02-28  0:55 ` Shinichiro Kawasaki
2022-02-28  1:19   ` Sidong Yang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox