* [PATCH v3 0/2] btrfs: fix __folio_put refcount errors
@ 2024-07-02 14:31 Boris Burkov
2024-07-02 14:31 ` [PATCH v3 1/2] btrfs: fix __folio_put refcount in btrfs_do_encoded_write Boris Burkov
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Boris Burkov @ 2024-07-02 14:31 UTC (permalink / raw)
To: linux-btrfs, kernel-team
Switching from __free_page to __folio_put introduced a bug because
__free_page called put_page_testzero while __folio_put does not. Fix the
two affected callers by changing to folio_put which does call
folio_put_testzero.
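The difference described above can be sketched with a small userspace model. This is only an illustration under stated assumptions: `struct model_folio` and the `model_*` functions are hypothetical names invented here, not kernel code. The model keeps nothing but a refcount and flags a folio as "bad" when it is handed to the free routine while still referenced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a folio: only the refcount matters here. */
struct model_folio {
	int refcount;   /* models the folio's _refcount */
	bool freed;     /* set once the model "allocator" took the folio back */
	bool bad_state; /* set if a free was attempted while still referenced */
};

/* Models the *_put_testzero() helpers: drop one reference, report if it was the last. */
static bool model_put_testzero(struct model_folio *f)
{
	return --f->refcount == 0;
}

/*
 * Models the __folio_put() style of call: it never touches the refcount and
 * expects the caller to have dropped the last reference already. As a
 * modelling choice, a folio that is still referenced is flagged and not freed.
 */
static void model_free_no_put(struct model_folio *f)
{
	if (f->refcount != 0) {
		f->bad_state = true;
		fprintf(stderr, "model: bad page state, nonzero refcount %d\n",
			f->refcount);
		return;
	}
	f->freed = true;
}

/* Models the folio_put() style of call: decrement first, free only at zero. */
static void model_put(struct model_folio *f)
{
	if (model_put_testzero(f))
		model_free_no_put(f);
}
```

In this model, calling `model_free_no_put()` on a folio whose refcount is still 1 sets `bad_state`, while `model_put()` on the same starting state decrements to zero and frees cleanly.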
--
Changelog:
v3:
- split up patches for backporting
v2:
- add second callsite
Boris Burkov (2):
btrfs: fix __folio_put refcount in btrfs_do_encoded_write
btrfs: fix __folio_put refcount in __alloc_dummy_extent_buffer
fs/btrfs/extent_io.c | 2 +-
fs/btrfs/inode.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
--
2.45.2
* [PATCH v3 1/2] btrfs: fix __folio_put refcount in btrfs_do_encoded_write
2024-07-02 14:31 [PATCH v3 0/2] btrfs: fix __folio_put refcount errors Boris Burkov
@ 2024-07-02 14:31 ` Boris Burkov
2024-07-02 16:19 ` David Sterba
2024-07-02 14:31 ` [PATCH v3 2/2] btrfs: fix __folio_put refcount in __alloc_dummy_extent_buffer Boris Burkov
` (2 subsequent siblings)
3 siblings, 1 reply; 8+ messages in thread
From: Boris Burkov @ 2024-07-02 14:31 UTC (permalink / raw)
To: linux-btrfs, kernel-team
The conversion to folios switched __free_page to __folio_put in the
error path in btrfs_do_encoded_write.
However, this gets the page refcounting wrong. If we do hit that error
path (I reproduced by modifying btrfs_do_encoded_write to pretend to
always fail in a way that jumps to out_folios and running the xfstest
btrfs/281), then we always hit the following BUG freeing the folio:
BUG: Bad page state in process btrfs pfn:40ab0b
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x61be5 pfn:0x40ab0b
flags: 0x5ffff0000000000(node=0|zone=2|lastcpupid=0x1ffff)
raw: 05ffff0000000000 0000000000000000 dead000000000122 0000000000000000
raw: 0000000000061be5 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: nonzero _refcount
Call Trace:
<TASK>
dump_stack_lvl+0x3d/0xe0
bad_page+0xea/0xf0
free_unref_page+0x8e1/0x900
? __mem_cgroup_uncharge+0x69/0x90
__folio_put+0xe6/0x190
btrfs_do_encoded_write+0x445/0x780
? current_time+0x25/0xd0
btrfs_do_write_iter+0x2cc/0x4b0
btrfs_ioctl_encoded_write+0x2b6/0x340
It turns out __free_page dereferenced the page while __folio_put does
not. Switch __folio_put to folio_put which does dereference the folio
first.
Fixes: 400b172b8cdc ("btrfs: compression: migrate compression/decompression paths to folios")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
---
fs/btrfs/inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 0a11d309ee89..12fb7e8056a1 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9558,7 +9558,7 @@ ssize_t btrfs_do_encoded_write(struct kiocb *iocb, struct iov_iter *from,
out_folios:
for (i = 0; i < nr_folios; i++) {
if (folios[i])
- __folio_put(folios[i]);
+ folio_put(folios[i]);
}
kvfree(folios);
out:
--
2.45.2
* Re: [PATCH v3 1/2] btrfs: fix __folio_put refcount in btrfs_do_encoded_write
2024-07-02 14:31 ` [PATCH v3 1/2] btrfs: fix __folio_put refcount in btrfs_do_encoded_write Boris Burkov
@ 2024-07-02 16:19 ` David Sterba
2024-07-02 16:51 ` Boris Burkov
0 siblings, 1 reply; 8+ messages in thread
From: David Sterba @ 2024-07-02 16:19 UTC (permalink / raw)
To: Boris Burkov; +Cc: linux-btrfs, kernel-team
On Tue, Jul 02, 2024 at 07:31:13AM -0700, Boris Burkov wrote:
> The conversion to folios switched __free_page to __folio_put in the
> error path in btrfs_do_encoded_write.
>
> However, this gets the page refcounting wrong. If we do hit that error
> path (I reproduced by modifying btrfs_do_encoded_write to pretend to
> always fail in a way that jumps to out_folios and running the xfstest
> btrfs/281), then we always hit the following BUG freeing the folio:
>
> BUG: Bad page state in process btrfs pfn:40ab0b
> page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x61be5 pfn:0x40ab0b
> flags: 0x5ffff0000000000(node=0|zone=2|lastcpupid=0x1ffff)
> raw: 05ffff0000000000 0000000000000000 dead000000000122 0000000000000000
> raw: 0000000000061be5 0000000000000000 00000001ffffffff 0000000000000000
> page dumped because: nonzero _refcount
> Call Trace:
> <TASK>
> dump_stack_lvl+0x3d/0xe0
> bad_page+0xea/0xf0
> free_unref_page+0x8e1/0x900
> ? __mem_cgroup_uncharge+0x69/0x90
> __folio_put+0xe6/0x190
> btrfs_do_encoded_write+0x445/0x780
> ? current_time+0x25/0xd0
> btrfs_do_write_iter+0x2cc/0x4b0
> btrfs_ioctl_encoded_write+0x2b6/0x340
>
> It turns out __free_page dereferenced the page while __folio_put does
> not. Switch __folio_put to folio_put which does dereference the folio
> first.
By 'dereferenced' you mean to decrease the reference count? Because
dereference is usually said about pointers, it's confusing in this
context.
* Re: [PATCH v3 1/2] btrfs: fix __folio_put refcount in btrfs_do_encoded_write
2024-07-02 16:19 ` David Sterba
@ 2024-07-02 16:51 ` Boris Burkov
0 siblings, 0 replies; 8+ messages in thread
From: Boris Burkov @ 2024-07-02 16:51 UTC (permalink / raw)
To: David Sterba; +Cc: linux-btrfs, kernel-team
On Tue, Jul 02, 2024 at 06:19:16PM +0200, David Sterba wrote:
> On Tue, Jul 02, 2024 at 07:31:13AM -0700, Boris Burkov wrote:
> > The conversion to folios switched __free_page to __folio_put in the
> > error path in btrfs_do_encoded_write.
> >
> > However, this gets the page refcounting wrong. If we do hit that error
> > path (I reproduced by modifying btrfs_do_encoded_write to pretend to
> > always fail in a way that jumps to out_folios and running the xfstest
> > btrfs/281), then we always hit the following BUG freeing the folio:
> >
> > BUG: Bad page state in process btrfs pfn:40ab0b
> > page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x61be5 pfn:0x40ab0b
> > flags: 0x5ffff0000000000(node=0|zone=2|lastcpupid=0x1ffff)
> > raw: 05ffff0000000000 0000000000000000 dead000000000122 0000000000000000
> > raw: 0000000000061be5 0000000000000000 00000001ffffffff 0000000000000000
> > page dumped because: nonzero _refcount
> > Call Trace:
> > <TASK>
> > dump_stack_lvl+0x3d/0xe0
> > bad_page+0xea/0xf0
> > free_unref_page+0x8e1/0x900
> > ? __mem_cgroup_uncharge+0x69/0x90
> > __folio_put+0xe6/0x190
> > btrfs_do_encoded_write+0x445/0x780
> > ? current_time+0x25/0xd0
> > btrfs_do_write_iter+0x2cc/0x4b0
> > btrfs_ioctl_encoded_write+0x2b6/0x340
> >
> > It turns out __free_page dereferenced the page while __folio_put does
> > not. Switch __folio_put to folio_put which does dereference the folio
> > first.
>
> By 'dereferenced' you mean to decrease the reference count? Because
> dereference is usually said about pointers, it's confusing in this
> context.
Yes, you're right. It is "removing a reference" but not the best use of
"de-reference" in that case :) "decrement the refcount" is certainly
much clearer.
* [PATCH v3 2/2] btrfs: fix __folio_put refcount in __alloc_dummy_extent_buffer
2024-07-02 14:31 [PATCH v3 0/2] btrfs: fix __folio_put refcount errors Boris Burkov
2024-07-02 14:31 ` [PATCH v3 1/2] btrfs: fix __folio_put refcount in btrfs_do_encoded_write Boris Burkov
@ 2024-07-02 14:31 ` Boris Burkov
2024-07-02 14:41 ` [PATCH v3 0/2] btrfs: fix __folio_put refcount errors Filipe Manana
2024-07-02 17:23 ` David Sterba
3 siblings, 0 replies; 8+ messages in thread
From: Boris Burkov @ 2024-07-02 14:31 UTC (permalink / raw)
To: linux-btrfs, kernel-team
This is another improper use of __folio_put, in an error path that runs
after freshly allocating pages/folios, which are returned with their
refcount initialized to 1. The refactor from __free_pages to __folio_put
(rather than folio_put) dropped the refcount decrement that __free_pages
and folio_put perform but __folio_put does not.
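As a rough illustration of why the error path needs the decrementing variant, here is a minimal userspace sketch. Everything in it is hypothetical (the `toy_*` names, the `fail_at` knob used to force a failure); it is not the btrfs code. It only models the pattern: each fresh allocation starts with one reference, so the unwind loop has to drop that reference for the memory to be released.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct toy_folio {
	int refcount;
};

static int toy_live_folios; /* allocated and not yet freed */

/* A fresh allocation hands the caller exactly one reference. */
static struct toy_folio *toy_folio_alloc(void)
{
	struct toy_folio *f = malloc(sizeof(*f));

	if (!f)
		return NULL;
	f->refcount = 1;
	toy_live_folios++;
	return f;
}

/* Drop one reference; release the memory when it was the last one. */
static void toy_folio_put(struct toy_folio *f)
{
	if (--f->refcount == 0) {
		free(f);
		toy_live_folios--;
	}
}

/*
 * Fill folios[0..n-1]. fail_at simulates a failure at that index
 * (pass -1 for no failure). On failure, every folio allocated so far
 * is released by dropping the single reference the allocation gave us.
 */
static int toy_alloc_all(struct toy_folio **folios, int n, int fail_at)
{
	int i;

	for (i = 0; i < n; i++)
		folios[i] = NULL;

	for (i = 0; i < n; i++) {
		folios[i] = (i == fail_at) ? NULL : toy_folio_alloc();
		if (!folios[i])
			goto out_folios;
	}
	return 0;

out_folios:
	for (i = 0; i < n; i++) {
		if (folios[i]) {
			toy_folio_put(folios[i]);
			folios[i] = NULL;
		}
	}
	return -1;
}
```

If the unwind loop called a free routine that skips the decrement, each folio in this model would still have a refcount of 1 when freed, which is the situation the "nonzero _refcount" report describes.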
Fixes: 13df3775efca ("btrfs: cleanup metadata page pointer usage")
Signed-off-by: Boris Burkov <boris@bur.io>
---
fs/btrfs/extent_io.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index d3ce07ab9692..cb315779af30 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2775,7 +2775,7 @@ struct extent_buffer *__alloc_dummy_extent_buffer(struct btrfs_fs_info *fs_info,
for (int i = 0; i < num_folios; i++) {
if (eb->folios[i]) {
detach_extent_buffer_folio(eb, eb->folios[i]);
- __folio_put(eb->folios[i]);
+ folio_put(eb->folios[i]);
}
}
__free_extent_buffer(eb);
--
2.45.2
* Re: [PATCH v3 0/2] btrfs: fix __folio_put refcount errors
2024-07-02 14:31 [PATCH v3 0/2] btrfs: fix __folio_put refcount errors Boris Burkov
2024-07-02 14:31 ` [PATCH v3 1/2] btrfs: fix __folio_put refcount in btrfs_do_encoded_write Boris Burkov
2024-07-02 14:31 ` [PATCH v3 2/2] btrfs: fix __folio_put refcount in __alloc_dummy_extent_buffer Boris Burkov
@ 2024-07-02 14:41 ` Filipe Manana
2024-07-02 17:23 ` David Sterba
3 siblings, 0 replies; 8+ messages in thread
From: Filipe Manana @ 2024-07-02 14:41 UTC (permalink / raw)
To: Boris Burkov; +Cc: linux-btrfs, kernel-team
On Tue, Jul 2, 2024 at 3:32 PM Boris Burkov <boris@bur.io> wrote:
>
> Switching from __free_page to __folio_put introduced a bug because
> __free_page called put_page_testzero while __folio_put does not. Fix the
> two affected callers by changing to folio_put which does call
> folio_put_testzero.
> --
> Changelog:
> v3:
> - split up patches for backporting
> v2:
> - add second callsite
>
> Boris Burkov (2):
> btrfs: fix __folio_put refcount in btrfs_do_encoded_write
> btrfs: fix __folio_put refcount in __alloc_dummy_extent_buffer
>
> fs/btrfs/extent_io.c | 2 +-
> fs/btrfs/inode.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
For both patches:
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Thanks.
>
> --
> 2.45.2
>
>
* Re: [PATCH v3 0/2] btrfs: fix __folio_put refcount errors
2024-07-02 14:31 [PATCH v3 0/2] btrfs: fix __folio_put refcount errors Boris Burkov
` (2 preceding siblings ...)
2024-07-02 14:41 ` [PATCH v3 0/2] btrfs: fix __folio_put refcount errors Filipe Manana
@ 2024-07-02 17:23 ` David Sterba
3 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2024-07-02 17:23 UTC (permalink / raw)
To: Boris Burkov; +Cc: linux-btrfs, kernel-team
On Tue, Jul 02, 2024 at 07:31:12AM -0700, Boris Burkov wrote:
> Switching from __free_page to __folio_put introduced a bug because
> __free_page called put_page_testzero while __folio_put does not. Fix the
> two affected callers by changing to folio_put which does call
> folio_put_testzero.
> --
> Changelog:
> v3:
> - split up patches for backporting
> v2:
> - add second callsite
>
> Boris Burkov (2):
> btrfs: fix __folio_put refcount in btrfs_do_encoded_write
> btrfs: fix __folio_put refcount in __alloc_dummy_extent_buffer
Reviewed-by: David Sterba <dsterba@suse.com>
* [PATCH v3 0/2] btrfs: fix __folio_put refcount errors
@ 2024-07-03 18:47 Ed T.
0 siblings, 0 replies; 8+ messages in thread
From: Ed T. @ 2024-07-03 18:47 UTC (permalink / raw)
To: Boris Burkov; +Cc: linux-btrfs
From: Filipe Manana <fdmanana@kernel.org>
To: Boris Burkov <boris@bur.io>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v3 0/2] btrfs: fix __folio_put refcount errors
Date: Tue, 2 Jul 2024 15:41:19 +0100
Message-ID: <CAL3q7H7Gu2SQV+V1WMuVsuMmffAyKVTC5miagZVeitVQps6YuA@mail.gmail.com>
In-Reply-To: <cover.1719930430.git.boris@bur.io>
On Tue, Jul 2, 2024 at 3:32 PM Boris Burkov <boris@bur.io> wrote:
>
> Switching from __free_page to __folio_put introduced a bug because
> __free_page called put_page_testzero while __folio_put does not. Fix the
> two affected callers by changing to folio_put which does call
> folio_put_testzero.
> --
> Changelog:
> v3:
> - split up patches for backporting
> v2:
> - add second callsite
>
> Boris Burkov (2):
> btrfs: fix __folio_put refcount in btrfs_do_encoded_write
> btrfs: fix __folio_put refcount in __alloc_dummy_extent_buffer
>
> fs/btrfs/extent_io.c | 2 +-
> fs/btrfs/inode.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
(resent as plaintext)
For both patches:
Tested-by: Ed Tomlinson <edtoml@gmail.com>
In both 6.10-rc5 & rc6 I was seeing many 'bad page' errors in my logs
during my backups, which create, send & delete snapshots. With these
patches applied to rc6 the logs are clean.
Thanks
Ed Tomlinson