public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] btrfs: fix extent state leaks after device replace
@ 2023-04-26 17:12 fdmanana
  2023-04-26 17:13 ` [PATCH 1/2] btrfs: fix leak of source device allocation state " fdmanana
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: fdmanana @ 2023-04-26 17:12 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

This fixes a recent regression (on its way to Linus' tree) that results in
leaking extent state records in the allocation io tree of a device used as
a source device for a device replace. Also unexport btrfs_free_device().

Filipe Manana (2):
  btrfs: fix leak of source device allocation state after device replace
  btrfs: make btrfs_free_device() static

 fs/btrfs/volumes.c | 3 ++-
 fs/btrfs/volumes.h | 1 -
 2 files changed, 2 insertions(+), 2 deletions(-)

-- 
2.35.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] btrfs: fix leak of source device allocation state after device replace
  2023-04-26 17:12 [PATCH 0/2] btrfs: fix extent state leaks after device replace fdmanana
@ 2023-04-26 17:13 ` fdmanana
  2023-04-27  3:15   ` Qu Wenruo
  2023-04-27 15:16   ` Anand Jain
  2023-04-26 17:13 ` [PATCH 2/2] btrfs: make btrfs_free_device() static fdmanana
  2023-04-27 22:52 ` [PATCH 0/2] btrfs: fix extent state leaks after device replace David Sterba
  2 siblings, 2 replies; 12+ messages in thread
From: fdmanana @ 2023-04-26 17:13 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

When a device replace finishes, the source device is freed by calling
btrfs_free_device() at btrfs_rm_dev_replace_free_srcdev(), but the
allocation state, tracked in the device's alloc_state io tree, is never
freed.

This is a regression recently introduced by commit f0bb5474cff0 ("btrfs:
remove redundant release of btrfs_device::alloc_state"), which removed a
call to extent_io_tree_release() from btrfs_free_device(), with the
rationale that btrfs_close_one_device() already releases the allocation
state from a device and btrfs_close_one_device() is always called before
a device is freed with btrfs_free_device(). However that is not true for
the device replace case, as btrfs_free_device() is called without any
previous call to btrfs_close_one_device().

The issue is trivial to reproduce, for example, by running test btrfs/027
from fstests:

  $ ./check btrfs/027
  $ rmmod btrfs
  $ dmesg
  (...)
  [84519.395485] BTRFS info (device sdc): dev_replace from <missing disk> (devid 2) to /dev/sdg started
  [84519.466224] BTRFS info (device sdc): dev_replace from <missing disk> (devid 2) to /dev/sdg finished
  [84519.552251] BTRFS info (device sdc): scrub: started on devid 1
  [84519.552277] BTRFS info (device sdc): scrub: started on devid 2
  [84519.552332] BTRFS info (device sdc): scrub: started on devid 3
  [84519.552705] BTRFS info (device sdc): scrub: started on devid 4
  [84519.604261] BTRFS info (device sdc): scrub: finished on devid 4 with status: 0
  [84519.609374] BTRFS info (device sdc): scrub: finished on devid 3 with status: 0
  [84519.610818] BTRFS info (device sdc): scrub: finished on devid 1 with status: 0
  [84519.610927] BTRFS info (device sdc): scrub: finished on devid 2 with status: 0
  [84559.503795] BTRFS: state leak: start 1048576 end 1351614463 state 1 in tree 1 refs 1
  [84559.506764] BTRFS: state leak: start 1048576 end 1347420159 state 1 in tree 1 refs 1
  [84559.510294] BTRFS: state leak: start 1048576 end 1351614463 state 1 in tree 1 refs 1

So fix this by adding back the call to extent_io_tree_release() at
btrfs_free_device().

Fixes: f0bb5474cff0 ("btrfs: remove redundant release of btrfs_device::alloc_state")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/volumes.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 03f52e4a20aa..841e799dece5 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -395,6 +395,7 @@ void btrfs_free_device(struct btrfs_device *device)
 {
 	WARN_ON(!list_empty(&device->post_commit_list));
 	rcu_string_free(device->name);
+	extent_io_tree_release(&device->alloc_state);
 	btrfs_destroy_dev_zone_info(device);
 	kfree(device);
 }
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/2] btrfs: make btrfs_free_device() static
  2023-04-26 17:12 [PATCH 0/2] btrfs: fix extent state leaks after device replace fdmanana
  2023-04-26 17:13 ` [PATCH 1/2] btrfs: fix leak of source device allocation state " fdmanana
@ 2023-04-26 17:13 ` fdmanana
  2023-04-27 11:31   ` Qu Wenruo
  2023-04-27 15:17   ` Anand Jain
  2023-04-27 22:52 ` [PATCH 0/2] btrfs: fix extent state leaks after device replace David Sterba
  2 siblings, 2 replies; 12+ messages in thread
From: fdmanana @ 2023-04-26 17:13 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

The function btrfs_free_device() is never used outside of volumes.c, so
make it static and remove its prototype declaration at volumes.h.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/volumes.c | 2 +-
 fs/btrfs/volumes.h | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 841e799dece5..1a7620680f50 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -391,7 +391,7 @@ static struct btrfs_fs_devices *alloc_fs_devices(const u8 *fsid,
 	return fs_devs;
 }
 
-void btrfs_free_device(struct btrfs_device *device)
+static void btrfs_free_device(struct btrfs_device *device)
 {
 	WARN_ON(!list_empty(&device->post_commit_list));
 	rcu_string_free(device->name);
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index bf47a1a70813..5cbbee32748c 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -617,7 +617,6 @@ struct btrfs_device *btrfs_alloc_device(struct btrfs_fs_info *fs_info,
 					const u64 *devid, const u8 *uuid,
 					const char *path);
 void btrfs_put_dev_args_from_path(struct btrfs_dev_lookup_args *args);
-void btrfs_free_device(struct btrfs_device *device);
 int btrfs_rm_device(struct btrfs_fs_info *fs_info,
 		    struct btrfs_dev_lookup_args *args,
 		    struct block_device **bdev, fmode_t *mode);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] btrfs: fix leak of source device allocation state after device replace
  2023-04-26 17:13 ` [PATCH 1/2] btrfs: fix leak of source device allocation state " fdmanana
@ 2023-04-27  3:15   ` Qu Wenruo
  2023-04-27 15:16   ` Anand Jain
  1 sibling, 0 replies; 12+ messages in thread
From: Qu Wenruo @ 2023-04-27  3:15 UTC (permalink / raw)
  To: fdmanana, linux-btrfs



On 2023/4/27 01:13, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> When a device replace finishes, the source device is freed by calling
> btrfs_free_device() at btrfs_rm_dev_replace_free_srcdev(), but the
> allocation state, tracked in the device's alloc_state io tree, is never
> freed.
> 
> This is a regression recently introduced by commit f0bb5474cff0 ("btrfs:
> remove redundant release of btrfs_device::alloc_state"), which removed a
> call to extent_io_tree_release() from btrfs_free_device(), with the
> rationale that btrfs_close_one_device() already releases the allocation
> state from a device and btrfs_close_one_device() is always called before
> a device is freed with btrfs_free_device(). However that is not true for
> the device replace case, as btrfs_free_device() is called without any
> previous call to btrfs_close_one_device().
> 
> The issue is trivial to reproduce, for example, by running test btrfs/027
> from fstests:
> 
>    $ ./check btrfs/027
>    $ rmmod btrfs
>    $ dmesg
>    (...)
>    [84519.395485] BTRFS info (device sdc): dev_replace from <missing disk> (devid 2) to /dev/sdg started
>    [84519.466224] BTRFS info (device sdc): dev_replace from <missing disk> (devid 2) to /dev/sdg finished
>    [84519.552251] BTRFS info (device sdc): scrub: started on devid 1
>    [84519.552277] BTRFS info (device sdc): scrub: started on devid 2
>    [84519.552332] BTRFS info (device sdc): scrub: started on devid 3
>    [84519.552705] BTRFS info (device sdc): scrub: started on devid 4
>    [84519.604261] BTRFS info (device sdc): scrub: finished on devid 4 with status: 0
>    [84519.609374] BTRFS info (device sdc): scrub: finished on devid 3 with status: 0
>    [84519.610818] BTRFS info (device sdc): scrub: finished on devid 1 with status: 0
>    [84519.610927] BTRFS info (device sdc): scrub: finished on devid 2 with status: 0
>    [84559.503795] BTRFS: state leak: start 1048576 end 1351614463 state 1 in tree 1 refs 1
>    [84559.506764] BTRFS: state leak: start 1048576 end 1347420159 state 1 in tree 1 refs 1
>    [84559.510294] BTRFS: state leak: start 1048576 end 1351614463 state 1 in tree 1 refs 1
> 
> So fix this by adding back the call to extent_io_tree_release() at
> btrfs_free_device().
> 
> Fixes: f0bb5474cff0 ("btrfs: remove redundant release of btrfs_device::alloc_state")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>   fs/btrfs/volumes.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 03f52e4a20aa..841e799dece5 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -395,6 +395,7 @@ void btrfs_free_device(struct btrfs_device *device)
>   {
>   	WARN_ON(!list_empty(&device->post_commit_list));
>   	rcu_string_free(device->name);
> +	extent_io_tree_release(&device->alloc_state);
>   	btrfs_destroy_dev_zone_info(device);
>   	kfree(device);
>   }

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] btrfs: make btrfs_free_device() static
  2023-04-26 17:13 ` [PATCH 2/2] btrfs: make btrfs_free_device() static fdmanana
@ 2023-04-27 11:31   ` Qu Wenruo
  2023-04-27 15:17   ` Anand Jain
  1 sibling, 0 replies; 12+ messages in thread
From: Qu Wenruo @ 2023-04-27 11:31 UTC (permalink / raw)
  To: fdmanana, linux-btrfs



On 2023/4/27 01:13, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> The function btrfs_free_device() is never used outside of volumes.c, so
> make it static and remove its prototype declaration at volumes.h.
> 
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>   fs/btrfs/volumes.c | 2 +-
>   fs/btrfs/volumes.h | 1 -
>   2 files changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 841e799dece5..1a7620680f50 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -391,7 +391,7 @@ static struct btrfs_fs_devices *alloc_fs_devices(const u8 *fsid,
>   	return fs_devs;
>   }
>   
> -void btrfs_free_device(struct btrfs_device *device)
> +static void btrfs_free_device(struct btrfs_device *device)
>   {
>   	WARN_ON(!list_empty(&device->post_commit_list));
>   	rcu_string_free(device->name);
> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
> index bf47a1a70813..5cbbee32748c 100644
> --- a/fs/btrfs/volumes.h
> +++ b/fs/btrfs/volumes.h
> @@ -617,7 +617,6 @@ struct btrfs_device *btrfs_alloc_device(struct btrfs_fs_info *fs_info,
>   					const u64 *devid, const u8 *uuid,
>   					const char *path);
>   void btrfs_put_dev_args_from_path(struct btrfs_dev_lookup_args *args);
> -void btrfs_free_device(struct btrfs_device *device);
>   int btrfs_rm_device(struct btrfs_fs_info *fs_info,
>   		    struct btrfs_dev_lookup_args *args,
>   		    struct block_device **bdev, fmode_t *mode);

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] btrfs: fix leak of source device allocation state after device replace
  2023-04-26 17:13 ` [PATCH 1/2] btrfs: fix leak of source device allocation state " fdmanana
  2023-04-27  3:15   ` Qu Wenruo
@ 2023-04-27 15:16   ` Anand Jain
  2023-04-27 16:34     ` Filipe Manana
  1 sibling, 1 reply; 12+ messages in thread
From: Anand Jain @ 2023-04-27 15:16 UTC (permalink / raw)
  To: fdmanana, linux-btrfs

On 27/04/2023 01:13, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> When a device replace finishes, the source device is freed by calling
> btrfs_free_device() at btrfs_rm_dev_replace_free_srcdev(), but the
> allocation state, tracked in the device's alloc_state io tree, is never
> freed.
> 
> This is a regression recently introduced by commit f0bb5474cff0 ("btrfs:
> remove redundant release of btrfs_device::alloc_state"), which removed a
> call to extent_io_tree_release() from btrfs_free_device(), with the
> rationale that btrfs_close_one_device() already releases the allocation
> state from a device and btrfs_close_one_device() is always called before
> a device is freed with btrfs_free_device(). However that is not true for
> the device replace case, as btrfs_free_device() is called without any
> previous call to btrfs_close_one_device().
> 
> The issue is trivial to reproduce, for example, by running test btrfs/027
> from fstests:
> 
>    $ ./check btrfs/027
>    $ rmmod btrfs

Ah, module reload is a useful way to verify. I have now enabled
it in the fstests by setting TEST_FS_MODULE_RELOAD=1, and I am
able to reproduce the issue.

>    $ dmesg
>    (...)
>    [84519.395485] BTRFS info (device sdc): dev_replace from <missing disk> (devid 2) to /dev/sdg started
>    [84519.466224] BTRFS info (device sdc): dev_replace from <missing disk> (devid 2) to /dev/sdg finished
>    [84519.552251] BTRFS info (device sdc): scrub: started on devid 1
>    [84519.552277] BTRFS info (device sdc): scrub: started on devid 2
>    [84519.552332] BTRFS info (device sdc): scrub: started on devid 3
>    [84519.552705] BTRFS info (device sdc): scrub: started on devid 4
>    [84519.604261] BTRFS info (device sdc): scrub: finished on devid 4 with status: 0
>    [84519.609374] BTRFS info (device sdc): scrub: finished on devid 3 with status: 0
>    [84519.610818] BTRFS info (device sdc): scrub: finished on devid 1 with status: 0
>    [84519.610927] BTRFS info (device sdc): scrub: finished on devid 2 with status: 0
>    [84559.503795] BTRFS: state leak: start 1048576 end 1351614463 state 1 in tree 1 refs 1
>    [84559.506764] BTRFS: state leak: start 1048576 end 1347420159 state 1 in tree 1 refs 1
>    [84559.510294] BTRFS: state leak: start 1048576 end 1351614463 state 1 in tree 1 refs 1
> 
> So fix this by adding back the call to extent_io_tree_release() at
> btrfs_free_device().
> 
> Fixes: f0bb5474cff0 ("btrfs: remove redundant release of btrfs_device::alloc_state")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
>   fs/btrfs/volumes.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 03f52e4a20aa..841e799dece5 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -395,6 +395,7 @@ void btrfs_free_device(struct btrfs_device *device)
>   {
>   	WARN_ON(!list_empty(&device->post_commit_list));
>   	rcu_string_free(device->name);
> +	extent_io_tree_release(&device->alloc_state);
>   	btrfs_destroy_dev_zone_info(device);
>   	kfree(device);
>   }


Hmm. Is this fix incomplete? It does not address the concern raised in
f0bb5474cff0. Why don't we also do this...

-----
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 40ef429d10a5..e8c26856426e 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1133,7 +1133,6 @@ static void btrfs_close_one_device(struct 
btrfs_device *device)

         device->fs_info = NULL;
         atomic_set(&device->dev_stats_ccnt, 0);
-       extent_io_tree_release(&device->alloc_state);

         /*
          * Reset the flush error record. We might have a transient 
flush error
-----

Thanks, Anand



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] btrfs: make btrfs_free_device() static
  2023-04-26 17:13 ` [PATCH 2/2] btrfs: make btrfs_free_device() static fdmanana
  2023-04-27 11:31   ` Qu Wenruo
@ 2023-04-27 15:17   ` Anand Jain
  1 sibling, 0 replies; 12+ messages in thread
From: Anand Jain @ 2023-04-27 15:17 UTC (permalink / raw)
  To: fdmanana, linux-btrfs

On 27/04/2023 01:13, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> The function btrfs_free_device() is never used outside of volumes.c, so
> make it static and remove its prototype declaration at volumes.h.
> 
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Reviewed-by: Anand Jain <anand.jain@oracle.com>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] btrfs: fix leak of source device allocation state after device replace
  2023-04-27 15:16   ` Anand Jain
@ 2023-04-27 16:34     ` Filipe Manana
  2023-04-27 23:39       ` Anand Jain
  0 siblings, 1 reply; 12+ messages in thread
From: Filipe Manana @ 2023-04-27 16:34 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Thu, Apr 27, 2023 at 4:16 PM Anand Jain <anand.jain@oracle.com> wrote:
>
> On 27/04/2023 01:13, fdmanana@kernel.org wrote:
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > When a device replace finishes, the source device is freed by calling
> > btrfs_free_device() at btrfs_rm_dev_replace_free_srcdev(), but the
> > allocation state, tracked in the device's alloc_state io tree, is never
> > freed.
> >
> > This is a regression recently introduced by commit f0bb5474cff0 ("btrfs:
> > remove redundant release of btrfs_device::alloc_state"), which removed a
> > call to extent_io_tree_release() from btrfs_free_device(), with the
> > rationale that btrfs_close_one_device() already releases the allocation
> > state from a device and btrfs_close_one_device() is always called before
> > a device is freed with btrfs_free_device(). However that is not true for
> > the device replace case, as btrfs_free_device() is called without any
> > previous call to btrfs_close_one_device().
> >
> > The issue is trivial to reproduce, for example, by running test btrfs/027
> > from fstests:
> >
> >    $ ./check btrfs/027
> >    $ rmmod btrfs
>
> Ah, module reload is a useful way to verify. I have now enabled
> it in the fstests by setting TEST_FS_MODULE_RELOAD=1, and I am
> able to reproduce the issue.
>
> >    $ dmesg
> >    (...)
> >    [84519.395485] BTRFS info (device sdc): dev_replace from <missing disk> (devid 2) to /dev/sdg started
> >    [84519.466224] BTRFS info (device sdc): dev_replace from <missing disk> (devid 2) to /dev/sdg finished
> >    [84519.552251] BTRFS info (device sdc): scrub: started on devid 1
> >    [84519.552277] BTRFS info (device sdc): scrub: started on devid 2
> >    [84519.552332] BTRFS info (device sdc): scrub: started on devid 3
> >    [84519.552705] BTRFS info (device sdc): scrub: started on devid 4
> >    [84519.604261] BTRFS info (device sdc): scrub: finished on devid 4 with status: 0
> >    [84519.609374] BTRFS info (device sdc): scrub: finished on devid 3 with status: 0
> >    [84519.610818] BTRFS info (device sdc): scrub: finished on devid 1 with status: 0
> >    [84519.610927] BTRFS info (device sdc): scrub: finished on devid 2 with status: 0
> >    [84559.503795] BTRFS: state leak: start 1048576 end 1351614463 state 1 in tree 1 refs 1
> >    [84559.506764] BTRFS: state leak: start 1048576 end 1347420159 state 1 in tree 1 refs 1
> >    [84559.510294] BTRFS: state leak: start 1048576 end 1351614463 state 1 in tree 1 refs 1
> >
> > So fix this by adding back the call to extent_io_tree_release() at
> > btrfs_free_device().
> >
> > Fixes: f0bb5474cff0 ("btrfs: remove redundant release of btrfs_device::alloc_state")
> > Signed-off-by: Filipe Manana <fdmanana@suse.com>
> > ---
> >   fs/btrfs/volumes.c | 1 +
> >   1 file changed, 1 insertion(+)
> >
> > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > index 03f52e4a20aa..841e799dece5 100644
> > --- a/fs/btrfs/volumes.c
> > +++ b/fs/btrfs/volumes.c
> > @@ -395,6 +395,7 @@ void btrfs_free_device(struct btrfs_device *device)
> >   {
> >       WARN_ON(!list_empty(&device->post_commit_list));
> >       rcu_string_free(device->name);
> > +     extent_io_tree_release(&device->alloc_state);
> >       btrfs_destroy_dev_zone_info(device);
> >       kfree(device);
> >   }
>
>
> Hmm. Is this fix incomplete? It does not address the concern raised in
> f0bb5474cff0. Why don't we also do this...

What's the concern exactly?
One extra call to extent_io_tree_release() that's actually needed?

>
> -----
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 40ef429d10a5..e8c26856426e 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1133,7 +1133,6 @@ static void btrfs_close_one_device(struct
> btrfs_device *device)
>
>          device->fs_info = NULL;
>          atomic_set(&device->dev_stats_ccnt, 0);
> -       extent_io_tree_release(&device->alloc_state);

This will change semantics a bit.
Removing this, will make us hold on to more memory for a longer period
when fs_devices->opened > 1, so btrfs_free_device() /
free_fs_devices() will be called potentially much later.

I'd rather keep the behaviour we had unless there's a stronger reason
other than removing 1 line of code.

Thanks.



>
>          /*
>           * Reset the flush error record. We might have a transient
> flush error
> -----
>
> Thanks, Anand
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] btrfs: fix extent state leaks after device replace
  2023-04-26 17:12 [PATCH 0/2] btrfs: fix extent state leaks after device replace fdmanana
  2023-04-26 17:13 ` [PATCH 1/2] btrfs: fix leak of source device allocation state " fdmanana
  2023-04-26 17:13 ` [PATCH 2/2] btrfs: make btrfs_free_device() static fdmanana
@ 2023-04-27 22:52 ` David Sterba
  2023-04-27 23:51   ` Anand Jain
  2 siblings, 1 reply; 12+ messages in thread
From: David Sterba @ 2023-04-27 22:52 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs

On Wed, Apr 26, 2023 at 06:12:59PM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> This fixes a recent regression (on its way to Linus' tree) that results in
> leaking extent state records in the allocation io tree of a device used as
> a source device for a device replace. Also unexport btrfs_free_device().
> 
> Filipe Manana (2):
>   btrfs: fix leak of source device allocation state after device replace
>   btrfs: make btrfs_free_device() static

Added to misc-next, thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] btrfs: fix leak of source device allocation state after device replace
  2023-04-27 16:34     ` Filipe Manana
@ 2023-04-27 23:39       ` Anand Jain
  0 siblings, 0 replies; 12+ messages in thread
From: Anand Jain @ 2023-04-27 23:39 UTC (permalink / raw)
  To: Filipe Manana; +Cc: linux-btrfs



On 28/4/23 00:34, Filipe Manana wrote:
> On Thu, Apr 27, 2023 at 4:16 PM Anand Jain <anand.jain@oracle.com> wrote:
>>
>> On 27/04/2023 01:13, fdmanana@kernel.org wrote:
>>> From: Filipe Manana <fdmanana@suse.com>
>>>
>>> When a device replace finishes, the source device is freed by calling
>>> btrfs_free_device() at btrfs_rm_dev_replace_free_srcdev(), but the
>>> allocation state, tracked in the device's alloc_state io tree, is never
>>> freed.
>>>
>>> This is a regression recently introduced by commit f0bb5474cff0 ("btrfs:
>>> remove redundant release of btrfs_device::alloc_state"), which removed a
>>> call to extent_io_tree_release() from btrfs_free_device(), with the
>>> rationale that btrfs_close_one_device() already releases the allocation
>>> state from a device and btrfs_close_one_device() is always called before
>>> a device is freed with btrfs_free_device(). However that is not true for
>>> the device replace case, as btrfs_free_device() is called without any
>>> previous call to btrfs_close_one_device().
>>>
>>> The issue is trivial to reproduce, for example, by running test btrfs/027
>>> from fstests:
>>>
>>>     $ ./check btrfs/027
>>>     $ rmmod btrfs
>>
>> Ah, module reload is a useful way to verify. I have now enabled
>> it in the fstests by setting TEST_FS_MODULE_RELOAD=1, and I am
>> able to reproduce the issue.
>>
>>>     $ dmesg
>>>     (...)
>>>     [84519.395485] BTRFS info (device sdc): dev_replace from <missing disk> (devid 2) to /dev/sdg started
>>>     [84519.466224] BTRFS info (device sdc): dev_replace from <missing disk> (devid 2) to /dev/sdg finished
>>>     [84519.552251] BTRFS info (device sdc): scrub: started on devid 1
>>>     [84519.552277] BTRFS info (device sdc): scrub: started on devid 2
>>>     [84519.552332] BTRFS info (device sdc): scrub: started on devid 3
>>>     [84519.552705] BTRFS info (device sdc): scrub: started on devid 4
>>>     [84519.604261] BTRFS info (device sdc): scrub: finished on devid 4 with status: 0
>>>     [84519.609374] BTRFS info (device sdc): scrub: finished on devid 3 with status: 0
>>>     [84519.610818] BTRFS info (device sdc): scrub: finished on devid 1 with status: 0
>>>     [84519.610927] BTRFS info (device sdc): scrub: finished on devid 2 with status: 0
>>>     [84559.503795] BTRFS: state leak: start 1048576 end 1351614463 state 1 in tree 1 refs 1
>>>     [84559.506764] BTRFS: state leak: start 1048576 end 1347420159 state 1 in tree 1 refs 1
>>>     [84559.510294] BTRFS: state leak: start 1048576 end 1351614463 state 1 in tree 1 refs 1
>>>
>>> So fix this by adding back the call to extent_io_tree_release() at
>>> btrfs_free_device().
>>>
>>> Fixes: f0bb5474cff0 ("btrfs: remove redundant release of btrfs_device::alloc_state")
>>> Signed-off-by: Filipe Manana <fdmanana@suse.com>
>>> ---
>>>    fs/btrfs/volumes.c | 1 +
>>>    1 file changed, 1 insertion(+)
>>>
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index 03f52e4a20aa..841e799dece5 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -395,6 +395,7 @@ void btrfs_free_device(struct btrfs_device *device)
>>>    {
>>>        WARN_ON(!list_empty(&device->post_commit_list));
>>>        rcu_string_free(device->name);
>>> +     extent_io_tree_release(&device->alloc_state);
>>>        btrfs_destroy_dev_zone_info(device);
>>>        kfree(device);
>>>    }
>>
>>
>> Hmm. Is this fix incomplete? It does not address the concern raised in
>> f0bb5474cff0. Why don't we also do this...
> 
> What's the concern exactly?
> One extra call to extent_io_tree_release() that's actually needed?
I'm looking for a solution that meets the need without leading to a
situation where extent_io_tree_release() is invoked twice elsewhere.

> 
>>
>> -----
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 40ef429d10a5..e8c26856426e 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -1133,7 +1133,6 @@ static void btrfs_close_one_device(struct
>> btrfs_device *device)
>>
>>           device->fs_info = NULL;
>>           atomic_set(&device->dev_stats_ccnt, 0);
>> -       extent_io_tree_release(&device->alloc_state);
> 
> This will change semantics a bit.
> Removing this, will make us hold on to more memory for a longer period
> when fs_devices->opened > 1, so btrfs_free_device() /
> free_fs_devices() will be called potentially much later.
> 
> I'd rather keep the behaviour we had unless there's a stronger reason
> other than removing 1 line of code.

I'm trying to simplify by merging steps in either 
btrfs_rm_dev_replace_remove_srcdev() or 
btrfs_rm_dev_replace_free_srcdev() with btrfs_close_one_device().

This could beautifully address the issue. Let's see if it works.

Thanks, Anand

> 
> Thanks.
> 
> 
> 
>>
>>           /*
>>            * Reset the flush error record. We might have a transient
>> flush error
>> -----
>>
>> Thanks, Anand
>>
>>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] btrfs: fix extent state leaks after device replace
  2023-04-27 22:52 ` [PATCH 0/2] btrfs: fix extent state leaks after device replace David Sterba
@ 2023-04-27 23:51   ` Anand Jain
  2023-04-28 13:07     ` David Sterba
  0 siblings, 1 reply; 12+ messages in thread
From: Anand Jain @ 2023-04-27 23:51 UTC (permalink / raw)
  To: dsterba, fdmanana; +Cc: linux-btrfs

On 28/4/23 06:52, David Sterba wrote:
> On Wed, Apr 26, 2023 at 06:12:59PM +0100, fdmanana@kernel.org wrote:
>> From: Filipe Manana <fdmanana@suse.com>
>>
>> This fixes a recent regression (on its way to Linus' tree) that results in
>> leaking extent state records in the allocation io tree of a device used as
>> a source device for a device replace. Also unexport btrfs_free_device().
>>
>> Filipe Manana (2):
>>    btrfs: fix leak of source device allocation state after device replace
>>    btrfs: make btrfs_free_device() static
> 


> Added to misc-next, thanks.

Oh, I hope you saw the discussions in the patch 1/2.
We can address both calling extent_io_tree_release() twice and the
leak after device replace. Or no need of it?



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] btrfs: fix extent state leaks after device replace
  2023-04-27 23:51   ` Anand Jain
@ 2023-04-28 13:07     ` David Sterba
  0 siblings, 0 replies; 12+ messages in thread
From: David Sterba @ 2023-04-28 13:07 UTC (permalink / raw)
  To: Anand Jain; +Cc: fdmanana, linux-btrfs

On Fri, Apr 28, 2023 at 07:51:42AM +0800, Anand Jain wrote:
> On 28/4/23 06:52, David Sterba wrote:
> > On Wed, Apr 26, 2023 at 06:12:59PM +0100, fdmanana@kernel.org wrote:
> >> From: Filipe Manana <fdmanana@suse.com>
> >>
> >> This fixes a recent regression (on its way to Linus' tree) that results in
> >> leaking extent state records in the allocation io tree of a device used as
> >> a source device for a device replace. Also unexport btrfs_free_device().
> >>
> >> Filipe Manana (2):
> >>    btrfs: fix leak of source device allocation state after device replace
> >>    btrfs: make btrfs_free_device() static
> > 
> 
> 
> > Added to misc-next, thanks.
> 
> Oh, I hope you saw the discussions in the patch 1/2.
> We can address both calling extent_io_tree_release() twice and the
> leak after device replace. Or no need of it?

Yes I saw the discussion, I'd rather revert back to the known behaviour,
this does not seem to be worth the optimization, apparently it's easy to
miss some cases.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-04-28 13:13 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-04-26 17:12 [PATCH 0/2] btrfs: fix extent state leaks after device replace fdmanana
2023-04-26 17:13 ` [PATCH 1/2] btrfs: fix leak of source device allocation state " fdmanana
2023-04-27  3:15   ` Qu Wenruo
2023-04-27 15:16   ` Anand Jain
2023-04-27 16:34     ` Filipe Manana
2023-04-27 23:39       ` Anand Jain
2023-04-26 17:13 ` [PATCH 2/2] btrfs: make btrfs_free_device() static fdmanana
2023-04-27 11:31   ` Qu Wenruo
2023-04-27 15:17   ` Anand Jain
2023-04-27 22:52 ` [PATCH 0/2] btrfs: fix extent state leaks after device replace David Sterba
2023-04-27 23:51   ` Anand Jain
2023-04-28 13:07     ` David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox