linux-fsdevel.vger.kernel.org archive mirror
* [PATCH 0/4] protect relocation with sb_start_write
@ 2022-03-11  7:38 Naohiro Aota
  2022-03-11  7:38 ` [PATCH 1/4] btrfs: mark resumed async balance as writing Naohiro Aota
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Naohiro Aota @ 2022-03-11  7:38 UTC (permalink / raw)
  To: linux-btrfs; +Cc: johannes.thumshirn, linux-fsdevel, viro, david, Naohiro Aota

This series is a follow-up to the series below. The old series added
an assertion to btrfs_relocate_chunk() to check that it is protected
with sb_start_write(). However, it revealed another location where we
need to add sb_start_write() [1].

https://lore.kernel.org/linux-btrfs/cover.1645157220.git.naohiro.aota@wdc.com/T/

[1] https://lore.kernel.org/linux-btrfs/cover.1645157220.git.naohiro.aota@wdc.com/T/#e06eecc07ce1cd1e45bfd30a374bd2d15b4fd76d8

Patches 1 and 2 add sb_{start,end}_write() to the resumed async
balancing and device addition.

Patches 3 and 4 add an ASSERT to catch a future error.

Note: I used commit 5accdf82ba25 ("fs: Improve filesystem freezing
handling") for the Fixes tag, because the sb_start_write() calls have
been missing ever since that commit introduced it. However, I am not
sure whether this is the correct commit to reference.

Naohiro Aota (4):
  btrfs: mark resumed async balance as writing
  btrfs: mark device addition as sb_writing
  fs: add check functions for sb_start_{write,pagefault,intwrite}
  btrfs: assert that relocation is protected with sb_start_write()

 fs/btrfs/ioctl.c   |  2 ++
 fs/btrfs/volumes.c |  5 +++++
 include/linux/fs.h | 20 ++++++++++++++++++++
 3 files changed, 27 insertions(+)

-- 
2.35.1



* [PATCH 1/4] btrfs: mark resumed async balance as writing
  2022-03-11  7:38 [PATCH 0/4] protect relocation with sb_start_write Naohiro Aota
@ 2022-03-11  7:38 ` Naohiro Aota
  2022-03-11 14:08   ` Filipe Manana
  2022-03-11  7:38 ` [PATCH 2/4] btrfs: mark device addition as sb_writing Naohiro Aota
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Naohiro Aota @ 2022-03-11  7:38 UTC (permalink / raw)
  To: linux-btrfs; +Cc: johannes.thumshirn, linux-fsdevel, viro, david, Naohiro Aota

When a btrfs balance is interrupted by umount, the background balance
resumes on the next mount. There is a potential deadlock with FS freezing
here, as described in commit 26559780b953 ("btrfs: zoned: mark
relocation as writing").

Mark the process as sb_writing. To preserve the order of sb_start_write()
(or mnt_want_write_file()) and btrfs_exclop_start(), call sb_start_write()
in btrfs_resume_balance_async() before taking fs_info->super_lock.

Fixes: 5accdf82ba25 ("fs: Improve filesystem freezing handling")
Cc: stable@vger.kernel.org
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/volumes.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1be7cb2f955f..0d27d8d35c7a 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4443,6 +4443,7 @@ static int balance_kthread(void *data)
 	if (fs_info->balance_ctl)
 		ret = btrfs_balance(fs_info, fs_info->balance_ctl, NULL);
 	mutex_unlock(&fs_info->balance_mutex);
+	sb_end_write(fs_info->sb);
 
 	return ret;
 }
@@ -4463,6 +4464,7 @@ int btrfs_resume_balance_async(struct btrfs_fs_info *fs_info)
 		return 0;
 	}
 
+	sb_start_write(fs_info->sb);
 	spin_lock(&fs_info->super_lock);
 	ASSERT(fs_info->exclusive_operation == BTRFS_EXCLOP_BALANCE_PAUSED);
 	fs_info->exclusive_operation = BTRFS_EXCLOP_BALANCE;
-- 
2.35.1



* [PATCH 2/4] btrfs: mark device addition as sb_writing
  2022-03-11  7:38 [PATCH 0/4] protect relocation with sb_start_write Naohiro Aota
  2022-03-11  7:38 ` [PATCH 1/4] btrfs: mark resumed async balance as writing Naohiro Aota
@ 2022-03-11  7:38 ` Naohiro Aota
  2022-03-11 14:21   ` Filipe Manana
  2022-03-11  7:38 ` [PATCH 3/4] fs: add check functions for sb_start_{write,pagefault,intwrite} Naohiro Aota
  2022-03-11  7:38 ` [PATCH 4/4] btrfs: assert that relocation is protected with sb_start_write() Naohiro Aota
  3 siblings, 1 reply; 12+ messages in thread
From: Naohiro Aota @ 2022-03-11  7:38 UTC (permalink / raw)
  To: linux-btrfs; +Cc: johannes.thumshirn, linux-fsdevel, viro, david, Naohiro Aota

btrfs_init_new_device() calls btrfs_relocate_sys_chunk(), which incurs
filesystem-internal writes. Those writes can cause a deadlock with
FS freezing, as described in commit 26559780b953 ("btrfs: zoned: mark
relocation as writing").

Mark the device addition as sb_writing. This is also consistent with
the device removal ioctl.

Fixes: 5accdf82ba25 ("fs: Improve filesystem freezing handling")
Cc: stable@vger.kernel.org
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/ioctl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 238cee5b5254..ffa30fd3eed2 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3484,6 +3484,7 @@ static long btrfs_ioctl_add_dev(struct btrfs_fs_info *fs_info, void __user *arg)
 		return -EINVAL;
 	}
 
+	sb_start_write(fs_info->sb);
 	if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_DEV_ADD)) {
 		if (!btrfs_exclop_start_try_lock(fs_info, BTRFS_EXCLOP_DEV_ADD))
 			return BTRFS_ERROR_DEV_EXCL_RUN_IN_PROGRESS;
@@ -3516,6 +3517,7 @@ static long btrfs_ioctl_add_dev(struct btrfs_fs_info *fs_info, void __user *arg)
 		btrfs_exclop_balance(fs_info, BTRFS_EXCLOP_BALANCE_PAUSED);
 	else
 		btrfs_exclop_finish(fs_info);
+	sb_end_write(fs_info->sb);
 	return ret;
 }
 
-- 
2.35.1



* [PATCH 3/4] fs: add check functions for sb_start_{write,pagefault,intwrite}
  2022-03-11  7:38 [PATCH 0/4] protect relocation with sb_start_write Naohiro Aota
  2022-03-11  7:38 ` [PATCH 1/4] btrfs: mark resumed async balance as writing Naohiro Aota
  2022-03-11  7:38 ` [PATCH 2/4] btrfs: mark device addition as sb_writing Naohiro Aota
@ 2022-03-11  7:38 ` Naohiro Aota
  2022-03-11 14:28   ` Filipe Manana
  2022-03-11  7:38 ` [PATCH 4/4] btrfs: assert that relocation is protected with sb_start_write() Naohiro Aota
  3 siblings, 1 reply; 12+ messages in thread
From: Naohiro Aota @ 2022-03-11  7:38 UTC (permalink / raw)
  To: linux-btrfs; +Cc: johannes.thumshirn, linux-fsdevel, viro, david, Naohiro Aota

Add a function, sb_write_started(), that returns whether
sb_start_write() has been called. It is used in the next patch.

Also add similar functions for sb_start_pagefault() and
sb_start_intwrite().

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 include/linux/fs.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 27746a3da8fd..0c8714d64169 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1732,6 +1732,11 @@ static inline bool __sb_start_write_trylock(struct super_block *sb, int level)
 #define __sb_writers_release(sb, lev)	\
 	percpu_rwsem_release(&(sb)->s_writers.rw_sem[(lev)-1], 1, _THIS_IP_)
 
+static inline bool __sb_write_started(struct super_block *sb, int level)
+{
+	return lockdep_is_held_type(sb->s_writers.rw_sem + level - 1, 1);
+}
+
 /**
  * sb_end_write - drop write access to a superblock
  * @sb: the super we wrote to
@@ -1797,6 +1802,11 @@ static inline bool sb_start_write_trylock(struct super_block *sb)
 	return __sb_start_write_trylock(sb, SB_FREEZE_WRITE);
 }
 
+static inline bool sb_write_started(struct super_block *sb)
+{
+	return __sb_write_started(sb, SB_FREEZE_WRITE);
+}
+
 /**
  * sb_start_pagefault - get write access to a superblock from a page fault
  * @sb: the super we write to
@@ -1821,6 +1831,11 @@ static inline void sb_start_pagefault(struct super_block *sb)
 	__sb_start_write(sb, SB_FREEZE_PAGEFAULT);
 }
 
+static inline bool sb_pagefault_started(struct super_block *sb)
+{
+	return __sb_write_started(sb, SB_FREEZE_PAGEFAULT);
+}
+
 /**
  * sb_start_intwrite - get write access to a superblock for internal fs purposes
  * @sb: the super we write to
@@ -1844,6 +1859,11 @@ static inline bool sb_start_intwrite_trylock(struct super_block *sb)
 	return __sb_start_write_trylock(sb, SB_FREEZE_FS);
 }
 
+static inline bool sb_intwrite_started(struct super_block *sb)
+{
+	return __sb_write_started(sb, SB_FREEZE_FS);
+}
+
 bool inode_owner_or_capable(struct user_namespace *mnt_userns,
 			    const struct inode *inode);
 
-- 
2.35.1
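
The level-to-slot mapping used by __sb_write_started() above can be illustrated with a small userspace model. This is only a sketch: struct sb_writers_model and the model_* helpers are hypothetical names, and a plain boolean per level stands in for the percpu rwsem and for lockdep's bookkeeping. The enum values match the freeze levels in include/linux/fs.h, which is why the patch indexes the semaphore array with level - 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Freeze levels with the same values as in include/linux/fs.h. */
enum {
	SB_FREEZE_WRITE = 1,
	SB_FREEZE_PAGEFAULT = 2,
	SB_FREEZE_FS = 3,
};

/* Number of slots in this model, one per level above. */
#define MODEL_FREEZE_LEVELS 3

/* Hypothetical model: one "held" flag per level instead of a percpu rwsem. */
struct sb_writers_model {
	bool held[MODEL_FREEZE_LEVELS];
};

/* Levels start at 1, array slots at 0, hence "level - 1". */
static void model_start(struct sb_writers_model *w, int level)
{
	w->held[level - 1] = true;
}

static void model_end(struct sb_writers_model *w, int level)
{
	w->held[level - 1] = false;
}

static bool model_started(const struct sb_writers_model *w, int level)
{
	return w->held[level - 1];
}

/* Thin per-level wrappers, mirroring the three helpers added by the patch. */
static bool model_write_started(const struct sb_writers_model *w)
{
	return model_started(w, SB_FREEZE_WRITE);
}

static bool model_pagefault_started(const struct sb_writers_model *w)
{
	return model_started(w, SB_FREEZE_PAGEFAULT);
}

static bool model_intwrite_started(const struct sb_writers_model *w)
{
	return model_started(w, SB_FREEZE_FS);
}
```

In this model each level is tracked independently, so holding the write level says nothing about the pagefault or internal-write levels.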



* [PATCH 4/4] btrfs: assert that relocation is protected with sb_start_write()
  2022-03-11  7:38 [PATCH 0/4] protect relocation with sb_start_write Naohiro Aota
                   ` (2 preceding siblings ...)
  2022-03-11  7:38 ` [PATCH 3/4] fs: add check functions for sb_start_{write,pagefault,intwrite} Naohiro Aota
@ 2022-03-11  7:38 ` Naohiro Aota
  2022-03-11 14:33   ` Filipe Manana
  3 siblings, 1 reply; 12+ messages in thread
From: Naohiro Aota @ 2022-03-11  7:38 UTC (permalink / raw)
  To: linux-btrfs; +Cc: johannes.thumshirn, linux-fsdevel, viro, david, Naohiro Aota

btrfs_relocate_chunk() initiates new ordered extents. They can cause a
hang when a process is trying to thaw the filesystem.

The caller should already have called sb_start_write() so that the
filesystem cannot be frozen in the meantime. Add an ASSERT to check
that it is protected.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/volumes.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 0d27d8d35c7a..b558fd293ffa 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3239,6 +3239,9 @@ int btrfs_relocate_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset)
 	u64 length;
 	int ret;
 
+	/* Assert we called sb_start_write(), not to race with FS freezing */
+	ASSERT(sb_write_started(fs_info->sb));
+
 	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
 		btrfs_err(fs_info,
 			  "relocate: not supported on extent tree v2 yet");
-- 
2.35.1



* Re: [PATCH 1/4] btrfs: mark resumed async balance as writing
  2022-03-11  7:38 ` [PATCH 1/4] btrfs: mark resumed async balance as writing Naohiro Aota
@ 2022-03-11 14:08   ` Filipe Manana
  2022-03-14  2:29     ` Naohiro Aota
  0 siblings, 1 reply; 12+ messages in thread
From: Filipe Manana @ 2022-03-11 14:08 UTC (permalink / raw)
  To: Naohiro Aota; +Cc: linux-btrfs, johannes.thumshirn, linux-fsdevel, viro, david

On Fri, Mar 11, 2022 at 04:38:02PM +0900, Naohiro Aota wrote:
> When btrfs balance is interrupted with umount, the background balance
> resumes on the next mount. There is a potential deadlock with FS freezing
> here, as described in commit 26559780b953 ("btrfs: zoned: mark
> relocation as writing").
> 
> Mark the process as sb_writing. To preserve the order of sb_start_write()
> (or mnt_want_write_file()) and btrfs_exclop_start(), call sb_start_write()
> at btrfs_resume_balance_async() before taking fs_info->super_lock.
> 
> Fixes: 5accdf82ba25 ("fs: Improve filesystem freezing handling")

This seems odd to me. I read the note you left on the cover letter about
this, but honestly I don't think it's fair to blame that commit. I see
it more as a btrfs-specific problem.

Plus it's a 10-year-old commit, so instead of the Fixes tag, adding a
minimal kernel version to the CC stable tag below makes more sense.

> Cc: stable@vger.kernel.org
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> ---
>  fs/btrfs/volumes.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 1be7cb2f955f..0d27d8d35c7a 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -4443,6 +4443,7 @@ static int balance_kthread(void *data)
>  	if (fs_info->balance_ctl)
>  		ret = btrfs_balance(fs_info, fs_info->balance_ctl, NULL);
>  	mutex_unlock(&fs_info->balance_mutex);
> +	sb_end_write(fs_info->sb);
>  
>  	return ret;
>  }
> @@ -4463,6 +4464,7 @@ int btrfs_resume_balance_async(struct btrfs_fs_info *fs_info)
>  		return 0;
>  	}
>  
> +	sb_start_write(fs_info->sb);

I don't understand this.

We are doing the sb_start_write() here, in the task doing the mount, and then
we do the sb_end_write() at the kthread that runs balance_kthread().

Why not do the sb_start_write() in the kthread?

This is also buggy in the case the call below to kthread_run() fails, as
we end up never calling sb_end_write().

Thanks.

>  	spin_lock(&fs_info->super_lock);
>  	ASSERT(fs_info->exclusive_operation == BTRFS_EXCLOP_BALANCE_PAUSED);
>  	fs_info->exclusive_operation = BTRFS_EXCLOP_BALANCE;
> -- 
> 2.35.1
> 
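
The two concerns raised in this reply (start and end happening in different tasks, and the missing sb_end_write() when kthread_run() fails) can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel code: struct sb_model and the model_* functions are hypothetical, a plain counter stands in for freeze protection, and the worker runs synchronously, which a real kthread would not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a superblock's freeze protection. */
struct sb_model {
	int writers;	/* outstanding sb_start_write()-style holds */
};

static void model_start_write(struct sb_model *sb)
{
	sb->writers++;
}

static void model_end_write(struct sb_model *sb)
{
	assert(sb->writers > 0);
	sb->writers--;
}

/* Worker body: take and drop write protection in the same context. */
static int model_balance_worker(struct sb_model *sb)
{
	model_start_write(sb);
	/* ... relocation work would run here ... */
	model_end_write(sb);
	return 0;
}

/*
 * Launcher for the suggested shape. spawn_ok models whether starting
 * the worker succeeded.
 */
static int model_resume_balance(struct sb_model *sb, bool spawn_ok)
{
	if (!spawn_ok)
		return -1;	/* nothing was taken, so nothing can leak */
	return model_balance_worker(sb);
}

/* Shape of the posted patch: the launcher takes, the worker drops. */
static int model_resume_balance_split(struct sb_model *sb, bool spawn_ok)
{
	model_start_write(sb);
	if (!spawn_ok)
		return -1;	/* the hold is never dropped on this path */
	model_end_write(sb);	/* stands in for the worker's sb_end_write() */
	return 0;
}
```

With the hold taken inside the worker, a failed spawn leaves the counter at zero; with the split shape, the same failure leaves one hold outstanding.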


* Re: [PATCH 2/4] btrfs: mark device addition as sb_writing
  2022-03-11  7:38 ` [PATCH 2/4] btrfs: mark device addition as sb_writing Naohiro Aota
@ 2022-03-11 14:21   ` Filipe Manana
  2022-03-14  2:31     ` Naohiro Aota
  0 siblings, 1 reply; 12+ messages in thread
From: Filipe Manana @ 2022-03-11 14:21 UTC (permalink / raw)
  To: Naohiro Aota; +Cc: linux-btrfs, johannes.thumshirn, linux-fsdevel, viro, david

On Fri, Mar 11, 2022 at 04:38:03PM +0900, Naohiro Aota wrote:
> btrfs_init_new_device() calls btrfs_relocate_sys_chunk() which incurs
> file-system internal writing. That writing can cause a deadlock with
> FS freezing, as described in commit
> 26559780b953 ("btrfs: zoned: mark relocation as writing").
> 
> Mark the device addition as sb_writing. This is also consistent with
> the removing device ioctl counterpart.
> 
> Fixes: 5accdf82ba25 ("fs: Improve filesystem freezing handling")

Same comment as the previous patch about this.

> Cc: stable@vger.kernel.org
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> ---
>  fs/btrfs/ioctl.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 238cee5b5254..ffa30fd3eed2 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -3484,6 +3484,7 @@ static long btrfs_ioctl_add_dev(struct btrfs_fs_info *fs_info, void __user *arg)
>  		return -EINVAL;
>  	}
>  
> +	sb_start_write(fs_info->sb);

Why not use mnt_want_write_file(), just like all the other ioctls that need
to do some change to the fs?

We don't have the struct file * here at btrfs_ioctl_add_dev(), but we have
it in its caller, btrfs_ioctl().

Thanks.

>  	if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_DEV_ADD)) {
>  		if (!btrfs_exclop_start_try_lock(fs_info, BTRFS_EXCLOP_DEV_ADD))
>  			return BTRFS_ERROR_DEV_EXCL_RUN_IN_PROGRESS;
> @@ -3516,6 +3517,7 @@ static long btrfs_ioctl_add_dev(struct btrfs_fs_info *fs_info, void __user *arg)
>  		btrfs_exclop_balance(fs_info, BTRFS_EXCLOP_BALANCE_PAUSED);
>  	else
>  		btrfs_exclop_finish(fs_info);
> +	sb_end_write(fs_info->sb);
>  	return ret;
>  }
>  
> -- 
> 2.35.1
> 
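
The suggestion above, taking write access in the caller that has the struct file, can be sketched as a userspace model. Everything here is hypothetical (struct file_model and the model_* functions); the model only mirrors the general shape of mnt_want_write_file() returning an error code and mnt_drop_write_file() releasing the hold. One property it shows is that a handler with early returns, like the exclusive-operation check in the hunk above, cannot skip the release when the pairing lives in the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the write-access state behind a struct file. */
struct file_model {
	bool read_only;		/* models a read-only mount */
	int write_holds;	/* outstanding want-write holds */
};

/* Models mnt_want_write_file(): 0 on success, negative errno on failure. */
static int model_want_write_file(struct file_model *f)
{
	if (f->read_only)
		return -EROFS;
	f->write_holds++;
	return 0;
}

static void model_drop_write_file(struct file_model *f)
{
	f->write_holds--;
}

/* Handler with several exits, none of which touches the write hold. */
static long model_add_dev(bool exclop_busy)
{
	if (exclop_busy)
		return -1;	/* early return, as in the exclusive-op check */
	return 0;
}

/* Caller: one acquire and one release around every handler exit. */
static long model_ioctl_add_dev(struct file_model *f, bool exclop_busy)
{
	long ret;
	int err = model_want_write_file(f);

	if (err)
		return err;
	ret = model_add_dev(exclop_busy);
	model_drop_write_file(f);
	return ret;
}
```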


* Re: [PATCH 3/4] fs: add check functions for sb_start_{write,pagefault,intwrite}
  2022-03-11  7:38 ` [PATCH 3/4] fs: add check functions for sb_start_{write,pagefault,intwrite} Naohiro Aota
@ 2022-03-11 14:28   ` Filipe Manana
  0 siblings, 0 replies; 12+ messages in thread
From: Filipe Manana @ 2022-03-11 14:28 UTC (permalink / raw)
  To: Naohiro Aota; +Cc: linux-btrfs, johannes.thumshirn, linux-fsdevel, viro, david

On Fri, Mar 11, 2022 at 04:38:04PM +0900, Naohiro Aota wrote:
> Add a function sb_write_started() to return if sb_start_write() is
> properly called. It is used in the next commit.
> 
> Also, add the similar functions for sb_start_pagefault() and
> sb_start_intwrite().
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Looks good, thanks.

> ---
>  include/linux/fs.h | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 27746a3da8fd..0c8714d64169 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1732,6 +1732,11 @@ static inline bool __sb_start_write_trylock(struct super_block *sb, int level)
>  #define __sb_writers_release(sb, lev)	\
>  	percpu_rwsem_release(&(sb)->s_writers.rw_sem[(lev)-1], 1, _THIS_IP_)
>  
> +static inline bool __sb_write_started(struct super_block *sb, int level)
> +{
> +	return lockdep_is_held_type(sb->s_writers.rw_sem + level - 1, 1);
> +}
> +
>  /**
>   * sb_end_write - drop write access to a superblock
>   * @sb: the super we wrote to
> @@ -1797,6 +1802,11 @@ static inline bool sb_start_write_trylock(struct super_block *sb)
>  	return __sb_start_write_trylock(sb, SB_FREEZE_WRITE);
>  }
>  
> +static inline bool sb_write_started(struct super_block *sb)
> +{
> +	return __sb_write_started(sb, SB_FREEZE_WRITE);
> +}
> +
>  /**
>   * sb_start_pagefault - get write access to a superblock from a page fault
>   * @sb: the super we write to
> @@ -1821,6 +1831,11 @@ static inline void sb_start_pagefault(struct super_block *sb)
>  	__sb_start_write(sb, SB_FREEZE_PAGEFAULT);
>  }
>  
> +static inline bool sb_pagefault_started(struct super_block *sb)
> +{
> +	return __sb_write_started(sb, SB_FREEZE_PAGEFAULT);
> +}
> +
>  /**
>   * sb_start_intwrite - get write access to a superblock for internal fs purposes
>   * @sb: the super we write to
> @@ -1844,6 +1859,11 @@ static inline bool sb_start_intwrite_trylock(struct super_block *sb)
>  	return __sb_start_write_trylock(sb, SB_FREEZE_FS);
>  }
>  
> +static inline bool sb_intwrite_started(struct super_block *sb)
> +{
> +	return __sb_write_started(sb, SB_FREEZE_FS);
> +}
> +
>  bool inode_owner_or_capable(struct user_namespace *mnt_userns,
>  			    const struct inode *inode);
>  
> -- 
> 2.35.1
> 


* Re: [PATCH 4/4] btrfs: assert that relocation is protected with sb_start_write()
  2022-03-11  7:38 ` [PATCH 4/4] btrfs: assert that relocation is protected with sb_start_write() Naohiro Aota
@ 2022-03-11 14:33   ` Filipe Manana
  0 siblings, 0 replies; 12+ messages in thread
From: Filipe Manana @ 2022-03-11 14:33 UTC (permalink / raw)
  To: Naohiro Aota; +Cc: linux-btrfs, johannes.thumshirn, linux-fsdevel, viro, david

On Fri, Mar 11, 2022 at 04:38:05PM +0900, Naohiro Aota wrote:
> btrfs_relocate_chunk() initiates new ordered extents. They can cause a
> hang when a process is trying to thaw the filesystem.
> 
> We should have called sb_start_write(), so the filesystem is not being
> frozen. Add an ASSERT to check it is protected.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

> ---
>  fs/btrfs/volumes.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 0d27d8d35c7a..b558fd293ffa 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -3239,6 +3239,9 @@ int btrfs_relocate_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset)
>  	u64 length;
>  	int ret;
>  
> +	/* Assert we called sb_start_write(), not to race with FS freezing */
> +	ASSERT(sb_write_started(fs_info->sb));

Does this pass the scenario of patch 1/4 (resuming balance on mount)?

Because as commented in that patch, we have the sb_start_write() done
in the mount task, and not by the task that actually runs balance - the
balance kthread.

Anyway, this change looks good, my concerns are only about patch 1/4.

Thanks.

> +
>  	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
>  		btrfs_err(fs_info,
>  			  "relocate: not supported on extent tree v2 yet");
> -- 
> 2.35.1
> 
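
The question above, whether the assertion holds when a different task took the protection, can be illustrated with a userspace model. The sketch assumes that the check is answered for the calling task, since lockdep records held locks per task; struct sb_task_model, the task identifiers, and the model_task_* functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical task identifiers for the two contexts discussed above. */
enum {
	MODEL_MOUNT_TASK = 0,
	MODEL_BALANCE_KTHREAD = 1,
	MODEL_NR_TASKS = 2,
};

/* Hypothetical model: write protection is recorded per holding task. */
struct sb_task_model {
	bool held_by[MODEL_NR_TASKS];
};

static void model_task_start_write(struct sb_task_model *sb, int task)
{
	sb->held_by[task] = true;
}

static void model_task_end_write(struct sb_task_model *sb, int task)
{
	sb->held_by[task] = false;
}

/* The check only sees holds recorded for the calling task. */
static bool model_task_write_started(const struct sb_task_model *sb, int task)
{
	return sb->held_by[task];
}

/*
 * Models the new check in btrfs_relocate_chunk(): returns false where
 * the ASSERT would trip.
 */
static bool model_relocate_chunk(const struct sb_task_model *sb, int task)
{
	return model_task_write_started(sb, task);
}
```

In this model, a hold taken by the mount task does not satisfy the check when relocation runs in the balance kthread; taking it in the kthread does.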


* Re: [PATCH 1/4] btrfs: mark resumed async balance as writing
  2022-03-11 14:08   ` Filipe Manana
@ 2022-03-14  2:29     ` Naohiro Aota
  2022-03-14 11:25       ` Filipe Manana
  0 siblings, 1 reply; 12+ messages in thread
From: Naohiro Aota @ 2022-03-14  2:29 UTC (permalink / raw)
  To: Filipe Manana
  Cc: linux-btrfs@vger.kernel.org, Johannes Thumshirn,
	linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk,
	david@fromorbit.com

On Fri, Mar 11, 2022 at 02:08:37PM +0000, Filipe Manana wrote:
> On Fri, Mar 11, 2022 at 04:38:02PM +0900, Naohiro Aota wrote:
> > When btrfs balance is interrupted with umount, the background balance
> > resumes on the next mount. There is a potential deadlock with FS freezing
> > here, as described in commit 26559780b953 ("btrfs: zoned: mark
> > relocation as writing").
> > 
> > Mark the process as sb_writing. To preserve the order of sb_start_write()
> > (or mnt_want_write_file()) and btrfs_exclop_start(), call sb_start_write()
> > at btrfs_resume_balance_async() before taking fs_info->super_lock.
> > 
> > Fixes: 5accdf82ba25 ("fs: Improve filesystem freezing handling")
> 
> This seems odd to me. I read the note you left on the cover letter about
> this, but honestly I don't think it's fair to blame that commit. I see
> it more as btrfs specific problem.

Yeah, I was really not sure how I should write the tag. The issue is
that we failed to add sb_start_write() after this commit.

> Plus it's a 10 years old commit, so instead of the Fixes tag, adding a
> minimal kernel version to the CC stable tag below makes more sense.

So, only with "Cc: stable@vger.kernel.org # 3.6+" ?

> > Cc: stable@vger.kernel.org
> > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> > ---
> >  fs/btrfs/volumes.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > index 1be7cb2f955f..0d27d8d35c7a 100644
> > --- a/fs/btrfs/volumes.c
> > +++ b/fs/btrfs/volumes.c
> > @@ -4443,6 +4443,7 @@ static int balance_kthread(void *data)
> >  	if (fs_info->balance_ctl)
> >  		ret = btrfs_balance(fs_info, fs_info->balance_ctl, NULL);
> >  	mutex_unlock(&fs_info->balance_mutex);
> > +	sb_end_write(fs_info->sb);
> >  
> >  	return ret;
> >  }
> > @@ -4463,6 +4464,7 @@ int btrfs_resume_balance_async(struct btrfs_fs_info *fs_info)
> >  		return 0;
> >  	}
> >  
> > +	sb_start_write(fs_info->sb);
> 
> I don't understand this.
> 
> We are doing the sb_start_write() here, in the task doing the mount, and then
> we do the sb_end_write() at the kthread that runs balance_kthread().

Oops, I made a mistake here. It actually printed the lockdep warning
"lock held when returning to user space!".

> Why not do the sb_start_write() in the kthread?
> 
> This is also buggy in the case the call below to kthread_run() fails, as
> we end up never calling sb_end_write().

I was trying to preserve the lock-taking order: sb_start_write() ->
spin_lock(fs_info->super_lock). But it might not be a big deal as
long as we don't call sb_start_write() while holding super_lock.

> Thanks.
> 
> >  	spin_lock(&fs_info->super_lock);
> >  	ASSERT(fs_info->exclusive_operation == BTRFS_EXCLOP_BALANCE_PAUSED);
> >  	fs_info->exclusive_operation = BTRFS_EXCLOP_BALANCE;
> > -- 
> > 2.35.1
> > 


* Re: [PATCH 2/4] btrfs: mark device addition as sb_writing
  2022-03-11 14:21   ` Filipe Manana
@ 2022-03-14  2:31     ` Naohiro Aota
  0 siblings, 0 replies; 12+ messages in thread
From: Naohiro Aota @ 2022-03-14  2:31 UTC (permalink / raw)
  To: Filipe Manana
  Cc: linux-btrfs@vger.kernel.org, Johannes Thumshirn,
	linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk,
	david@fromorbit.com

On Fri, Mar 11, 2022 at 02:21:14PM +0000, Filipe Manana wrote:
> On Fri, Mar 11, 2022 at 04:38:03PM +0900, Naohiro Aota wrote:
> > btrfs_init_new_device() calls btrfs_relocate_sys_chunk() which incurs
> > file-system internal writing. That writing can cause a deadlock with
> > FS freezing, as described in commit
> > 26559780b953 ("btrfs: zoned: mark relocation as writing").
> > 
> > Mark the device addition as sb_writing. This is also consistent with
> > the removing device ioctl counterpart.
> > 
> > Fixes: 5accdf82ba25 ("fs: Improve filesystem freezing handling")
> 
> Same comment as the previous patch about this.
> 
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> > ---
> >  fs/btrfs/ioctl.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> > index 238cee5b5254..ffa30fd3eed2 100644
> > --- a/fs/btrfs/ioctl.c
> > +++ b/fs/btrfs/ioctl.c
> > @@ -3484,6 +3484,7 @@ static long btrfs_ioctl_add_dev(struct btrfs_fs_info *fs_info, void __user *arg)
> >  		return -EINVAL;
> >  	}
> >  
> > +	sb_start_write(fs_info->sb);
> 
> Why not use mnt_want_write_file(), just like all the other ioctls that need
> to do some change to the fs?

This is just because there is no "struct file *" here.

> We don't have the struct file * here at btrfs_ioctl_add_dev(), but we have
> it in its caller, btrfs_ioctl().

So, I'll fix the patch in this way. Thanks.

> Thanks.
> 
> >  	if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_DEV_ADD)) {
> >  		if (!btrfs_exclop_start_try_lock(fs_info, BTRFS_EXCLOP_DEV_ADD))
> >  			return BTRFS_ERROR_DEV_EXCL_RUN_IN_PROGRESS;
> > @@ -3516,6 +3517,7 @@ static long btrfs_ioctl_add_dev(struct btrfs_fs_info *fs_info, void __user *arg)
> >  		btrfs_exclop_balance(fs_info, BTRFS_EXCLOP_BALANCE_PAUSED);
> >  	else
> >  		btrfs_exclop_finish(fs_info);
> > +	sb_end_write(fs_info->sb);
> >  	return ret;
> >  }
> >  
> > -- 
> > 2.35.1
> > 


* Re: [PATCH 1/4] btrfs: mark resumed async balance as writing
  2022-03-14  2:29     ` Naohiro Aota
@ 2022-03-14 11:25       ` Filipe Manana
  0 siblings, 0 replies; 12+ messages in thread
From: Filipe Manana @ 2022-03-14 11:25 UTC (permalink / raw)
  To: Naohiro Aota
  Cc: linux-btrfs@vger.kernel.org, Johannes Thumshirn,
	linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk,
	david@fromorbit.com

On Mon, Mar 14, 2022 at 02:29:22AM +0000, Naohiro Aota wrote:
> On Fri, Mar 11, 2022 at 02:08:37PM +0000, Filipe Manana wrote:
> > On Fri, Mar 11, 2022 at 04:38:02PM +0900, Naohiro Aota wrote:
> > > When btrfs balance is interrupted with umount, the background balance
> > > resumes on the next mount. There is a potential deadlock with FS freezing
> > > here, as described in commit 26559780b953 ("btrfs: zoned: mark
> > > relocation as writing").
> > > 
> > > Mark the process as sb_writing. To preserve the order of sb_start_write()
> > > (or mnt_want_write_file()) and btrfs_exclop_start(), call sb_start_write()
> > > at btrfs_resume_balance_async() before taking fs_info->super_lock.
> > > 
> > > Fixes: 5accdf82ba25 ("fs: Improve filesystem freezing handling")
> > 
> > This seems odd to me. I read the note you left on the cover letter about
> > this, but honestly I don't think it's fair to blame that commit. I see
> > it more as btrfs specific problem.
> 
> Yeah, I was really not sure how I should write the tag. The issue is
> we missed to add sb_start_write() after this commit.
> 
> > Plus it's a 10 years old commit, so instead of the Fixes tag, adding a
> > minimal kernel version to the CC stable tag below makes more sense.
> 
> So, only with "Cc: stable@vger.kernel.org # 3.6+" ?

Looking at kernel.org, the oldest stable kernel is 4.9, so anything older
than that is pointless.

> 
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> > > ---
> > >  fs/btrfs/volumes.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > > index 1be7cb2f955f..0d27d8d35c7a 100644
> > > --- a/fs/btrfs/volumes.c
> > > +++ b/fs/btrfs/volumes.c
> > > @@ -4443,6 +4443,7 @@ static int balance_kthread(void *data)
> > >  	if (fs_info->balance_ctl)
> > >  		ret = btrfs_balance(fs_info, fs_info->balance_ctl, NULL);
> > >  	mutex_unlock(&fs_info->balance_mutex);
> > > +	sb_end_write(fs_info->sb);
> > >  
> > >  	return ret;
> > >  }
> > > @@ -4463,6 +4464,7 @@ int btrfs_resume_balance_async(struct btrfs_fs_info *fs_info)
> > >  		return 0;
> > >  	}
> > >  
> > > +	sb_start_write(fs_info->sb);
> > 
> > I don't understand this.
> > 
> > We are doing the sb_start_write() here, in the task doing the mount, and then
> > we do the sb_end_write() at the kthread that runs balance_kthread().
> 
> Oops, I made a mistake here. It actually printed the lockdep warning
> "lock held when returning to user space!".
> 
> > Why not do the sb_start_write() in the kthread?
> > 
> > This is also buggy in the case the call below to kthread_run() fails, as
> > we end up never calling sb_end_write().
> 
> I was trying to preserve the lock taking order: sb_start_write() ->
> spin_lock(fs_info->super_lock). But, it might not be a big deal as
> long as we don't call sb_start_write() in the super_lock.
> 
> > Thanks.
> > 
> > >  	spin_lock(&fs_info->super_lock);
> > >  	ASSERT(fs_info->exclusive_operation == BTRFS_EXCLOP_BALANCE_PAUSED);
> > >  	fs_info->exclusive_operation = BTRFS_EXCLOP_BALANCE;
> > > -- 
> > > 2.35.1
> > > 
