* [PATCH v3] btrfs: fix broken drop_caches on extent_buffer folios
@ 2025-04-18 18:25 Boris Burkov
2025-04-22 9:23 ` Daniel Vacek
0 siblings, 1 reply; 4+ messages in thread
From: Boris Burkov @ 2025-04-18 18:25 UTC (permalink / raw)
To: linux-btrfs, kernel-team
The (correct) commit
e41c81d0d30e ("mm/truncate: Replace page_mapped() call in invalidate_inode_page()")
replaced the page_mapped(page) check with a refcount check. However,
this refcount check does not work as expected with drop_caches for
btrfs's metadata pages.
Btrfs has a per-sb metadata inode with cached pages, and when not in
active use by btrfs, those pages have a refcount of 3: one from the
initial call to alloc_pages, one (nr_pages == 1) from filemap_add_folio,
and one from folio_attach_private. We would expect such pages to get
dropped by drop_caches. However, drop_caches calls into
mapping_evict_folio via mapping_try_invalidate, which takes an extra
reference on the folio in find_lock_entries(). As a result, these pages
have a refcount of 4 and fail the check.
For what it's worth, such pages do get reclaimed under memory pressure,
so I would say that while this behavior is surprising, it is not really
dangerously broken.
When I asked the mm folks about the expected refcount in this case, I
was told that the correct thing to do is to donate the refcount from the
original allocation to the page cache after inserting it.
https://lore.kernel.org/linux-mm/ZrwhTXKzgDnCK76Z@casper.infradead.org/
Therefore, attempt to fix this by adding a folio_put() at the critical
spot in alloc_extent_buffer where we are sure that we have really
allocated and attached new folios. We must also adjust
detach_extent_buffer_folio to properly handle being the last reference
to the folio and not do a use-after-free after folio_detach_private().
Finally, extent_buffers allocated by btrfs_clone_extent_buffer() and
alloc_dummy_extent_buffer() are unmapped, so this transfer of ownership
from allocation to insertion in the mapping does not apply to them.
However, we can still safely folio_put() their folios once they are
fully allocated and folio_attach_private() has been called.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
---
Changelog:
v3:
* call folio_put() before using extent_buffers allocated with
btrfs_clone_extent_buffer() and alloc_dummy_extent_buffer() to get rid
of the UNMAPPED exception in btrfs_release_extent_buffer_folios().
v2:
* Used Filipe's suggestion to simplify detach_extent_buffer_folio and
lose the extra folio_get()/folio_put() pair
* Noticed a memory leak causing failures in fstests on smaller VMs.
Fixed a bug where I was missing a folio_put() for ebs that never got
their folios added to the mapping.
fs/btrfs/extent_io.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 5f08615b334f..a6c74c1957ff 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2752,6 +2752,7 @@ static bool folio_range_has_eb(struct folio *folio)
static void detach_extent_buffer_folio(const struct extent_buffer *eb, struct folio *folio)
{
struct btrfs_fs_info *fs_info = eb->fs_info;
+ struct address_space *mapping = folio->mapping;
const bool mapped = !test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags);
/*
@@ -2759,11 +2760,11 @@ static void detach_extent_buffer_folio(const struct extent_buffer *eb, struct fo
* be done under the i_private_lock.
*/
if (mapped)
- spin_lock(&folio->mapping->i_private_lock);
+ spin_lock(&mapping->i_private_lock);
if (!folio_test_private(folio)) {
if (mapped)
- spin_unlock(&folio->mapping->i_private_lock);
+ spin_unlock(&mapping->i_private_lock);
return;
}
@@ -2783,7 +2784,7 @@ static void detach_extent_buffer_folio(const struct extent_buffer *eb, struct fo
folio_detach_private(folio);
}
if (mapped)
- spin_unlock(&folio->mapping->i_private_lock);
+ spin_unlock(&mapping->i_private_lock);
return;
}
@@ -2806,7 +2807,7 @@ static void detach_extent_buffer_folio(const struct extent_buffer *eb, struct fo
if (!folio_range_has_eb(folio))
btrfs_detach_subpage(fs_info, folio, BTRFS_SUBPAGE_METADATA);
- spin_unlock(&folio->mapping->i_private_lock);
+ spin_unlock(&mapping->i_private_lock);
}
/* Release all folios attached to the extent buffer */
@@ -2821,9 +2822,6 @@ static void btrfs_release_extent_buffer_folios(const struct extent_buffer *eb)
continue;
detach_extent_buffer_folio(eb, folio);
-
- /* One for when we allocated the folio. */
- folio_put(folio);
}
}
@@ -2889,6 +2887,7 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src)
return NULL;
}
WARN_ON(folio_test_dirty(folio));
+ folio_put(folio);
}
copy_extent_buffer_full(new, src);
set_extent_buffer_uptodate(new);
@@ -2915,6 +2914,8 @@ struct extent_buffer *alloc_dummy_extent_buffer(struct btrfs_fs_info *fs_info,
if (ret < 0)
goto out_detach;
}
+ for (int i = 0; i < num_extent_folios(eb); i++)
+ folio_put(eb->folios[i]);
set_extent_buffer_uptodate(eb);
btrfs_set_header_nritems(eb, 0);
@@ -3365,8 +3366,15 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
* btree_release_folio will correctly detect that a page belongs to a
* live buffer and won't free them prematurely.
*/
- for (int i = 0; i < num_extent_folios(eb); i++)
+ for (int i = 0; i < num_extent_folios(eb); i++) {
folio_unlock(eb->folios[i]);
+ /*
+ * A folio that has been added to an address_space mapping
+ * should not continue holding the refcount from its original
+ * allocation indefinitely.
+ */
+ folio_put(eb->folios[i]);
+ }
return eb;
out:
--
2.49.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v3] btrfs: fix broken drop_caches on extent_buffer folios
2025-04-18 18:25 [PATCH v3] btrfs: fix broken drop_caches on extent_buffer folios Boris Burkov
@ 2025-04-22 9:23 ` Daniel Vacek
2025-04-22 10:21 ` Filipe Manana
0 siblings, 1 reply; 4+ messages in thread
From: Daniel Vacek @ 2025-04-22 9:23 UTC (permalink / raw)
To: Boris Burkov; +Cc: linux-btrfs, kernel-team
On Fri, 18 Apr 2025 at 20:24, Boris Burkov <boris@bur.io> wrote:
>
> [...]
>
> @@ -2889,6 +2887,7 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src)
> return NULL;
> }
> WARN_ON(folio_test_dirty(folio));
> + folio_put(folio);
Would this cause double puts in case the second iteration of the loop
fails to attach?
Other than that, it looks good to me.
> [...]
* Re: [PATCH v3] btrfs: fix broken drop_caches on extent_buffer folios
2025-04-22 9:23 ` Daniel Vacek
@ 2025-04-22 10:21 ` Filipe Manana
2025-04-22 15:21 ` Boris Burkov
0 siblings, 1 reply; 4+ messages in thread
From: Filipe Manana @ 2025-04-22 10:21 UTC (permalink / raw)
To: Daniel Vacek; +Cc: Boris Burkov, linux-btrfs, kernel-team
On Tue, Apr 22, 2025 at 10:24 AM Daniel Vacek <neelx@suse.com> wrote:
>
> On Fri, 18 Apr 2025 at 20:24, Boris Burkov <boris@bur.io> wrote:
> >
> > [...]
> >
> > @@ -2889,6 +2887,7 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src)
> > return NULL;
> > }
> > WARN_ON(folio_test_dirty(folio));
> > + folio_put(folio);
>
> Would this cause double puts in case the second iteration of the loop
> fails to attach?
Nope, because a folio_put() was removed from
btrfs_release_extent_buffer_folios() in this patch.
But this introduces a leak here in case attach_extent_buffer_folio()
returns an error, since we will miss a folio_put() for all the folios
we haven't attached.
So this folio_put() should be done in a subsequent loop, and ideally
have a comment similar to the one added to alloc_extent_buffer() in
this patch.
Boris, btw, I see you kept my Reviewed-by tag in v2 and v3, yet the
patches changed in non-trivial ways.
A Reviewed-by tag is meant to be preserved only in case of trivial
changes, like fixing typos for example, or exclusively applying
suggestions from a reviewer.
>
> [...]
* Re: [PATCH v3] btrfs: fix broken drop_caches on extent_buffer folios
2025-04-22 10:21 ` Filipe Manana
@ 2025-04-22 15:21 ` Boris Burkov
0 siblings, 0 replies; 4+ messages in thread
From: Boris Burkov @ 2025-04-22 15:21 UTC (permalink / raw)
To: Filipe Manana; +Cc: Daniel Vacek, linux-btrfs, kernel-team
On Tue, Apr 22, 2025 at 11:21:46AM +0100, Filipe Manana wrote:
> On Tue, Apr 22, 2025 at 10:24 AM Daniel Vacek <neelx@suse.com> wrote:
> >
> > On Fri, 18 Apr 2025 at 20:24, Boris Burkov <boris@bur.io> wrote:
> > >
> > > [...]
> > >
> > > @@ -2889,6 +2887,7 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src)
> > > return NULL;
> > > }
> > > WARN_ON(folio_test_dirty(folio));
> > > + folio_put(folio);
> >
> > Would this cause double puts in case the second iteration of the loop
> > fails to attach?
>
> Nop, because a folio_put() was removed from
> btrfs_release_extent_buffer_folios() in this patch.
> But this introduces a leak here in case attach_extent_buffer_folio()
> returns an error, since we will miss a folio_put() for all the folios
> we haven't attached.
>
> So this folio_put() should be done in a subsequent loop, and ideally
> have a comment similar to the one added to alloc_extent_buffer() in
> this patch.
I'll send a patch with the proper error handling.
>
> Boris btw, I see you kept my Review tag in v2 and v3, yet the patches
> changed in non-trivial ways.
> A Review tag is meant to be preserved only in case of trivial changes
> like fixing typos for example or exclusively applying suggestions from
> a reviewer.
My mistake. I wanted to honor your existing review work, not steal it,
but I will be sure to remove the tag for any non-trivial changes after
review in the future, as you should not be accountable for bugs you
didn't review.