* [REGRESSION] fuse: copy_file_range() fails with EIO
@ 2024-08-09 15:53 Jürg Billeter
2024-08-09 16:03 ` Matthew Wilcox
0 siblings, 1 reply; 19+ messages in thread
From: Jürg Billeter @ 2024-08-09 15:53 UTC (permalink / raw)
To: Miklos Szeredi, Matthew Wilcox (Oracle); +Cc: linux-fsdevel, regressions
Starting with 6.10, I'm seeing `copy_file_range()`, with source and
destination being on the same FUSE filesystem[1], failing with EIO in
some cases. The (low-level libfuse3) userspace filesystem does not
implement `copy_file_range`, so the kernel falls back to the generic
implementation. The userspace filesystem receives read requests and
replies with the `FUSE_BUF_SPLICE_MOVE` flag.
I'm not sure what exactly triggers the issue but it may depend on the
file size, among other things. I can reproduce it fairly reliably
attempting to copy files that are exactly 65536 bytes in size.
6.9 works fine but I see the issue in 6.10, 6.10.3 and also in current
master ee9a43b7cfe2.
413e8f014c8b848e4ce939156f210df59fbd1c24 is the first bad commit
commit 413e8f014c8b848e4ce939156f210df59fbd1c24 (HEAD)
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date: Sat Apr 20 03:50:06 2024 +0100
fuse: Convert fuse_readpages_end() to use folio_end_read()
Nobody checks the error flag on fuse folios, so stop setting it.
Optimise the (optional) setting of the uptodate flag and clearing
of the lock flag by using folio_end_read().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
I've confirmed the bisection by reverting this commit on top of 6.10.3,
which resolves the issue.
#regzbot introduced: 413e8f014c8b
Cheers,
Jürg
[1] https://gitlab.com/BuildGrid/buildbox/buildbox/-/tree/master/fuse
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [REGRESSION] fuse: copy_file_range() fails with EIO
2024-08-09 15:53 [REGRESSION] fuse: copy_file_range() fails with EIO Jürg Billeter
@ 2024-08-09 16:03 ` Matthew Wilcox
2024-08-09 16:22 ` [PATCH 1/3] fuse: remove call to SetPageError Matthew Wilcox (Oracle)
` (3 more replies)
0 siblings, 4 replies; 19+ messages in thread
From: Matthew Wilcox @ 2024-08-09 16:03 UTC (permalink / raw)
To: Jürg Billeter; +Cc: Miklos Szeredi, linux-fsdevel, regressions
On Fri, Aug 09, 2024 at 05:53:26PM +0200, Jürg Billeter wrote:
> Starting with 6.10, I'm seeing `copy_file_range()`, with source and
> destination being on the same FUSE filesystem[1], failing with EIO in
> some cases. The (low-level libfuse3) userspace filesystem does not
> implement `copy_file_range`, so the kernel falls back to the generic
> implementation. The userspace filesystem receives read requests and
> replies with the `FUSE_BUF_SPLICE_MOVE` flag.
>
> I'm not sure what exactly triggers the issue but it may depend on the
> file size, among other things. I can reproduce it fairly reliably
> attempting to copy files that are exactly 65536 bytes in size.
>
> 6.9 works fine but I see the issue in 6.10, 6.10.3 and also in current
> master ee9a43b7cfe2.
>
> 413e8f014c8b848e4ce939156f210df59fbd1c24 is the first bad commit
> commit 413e8f014c8b848e4ce939156f210df59fbd1c24 (HEAD)
> Author: Matthew Wilcox (Oracle) <willy@infradead.org>
> Date: Sat Apr 20 03:50:06 2024 +0100
>
> fuse: Convert fuse_readpages_end() to use folio_end_read()
>
> Nobody checks the error flag on fuse folios, so stop setting it.
> Optimise the (optional) setting of the uptodate flag and clearing
> of the lock flag by using folio_end_read().
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
>
> I've confirmed the bisection by reverting this commit on top of 6.10.3,
> which resolves the issue.
Umm. I don't see it. This is all that's in the commit:
for (i = 0; i < ap->num_pages; i++) {
- struct page *page = ap->pages[i];
+ struct folio *folio = page_folio(ap->pages[i]);
- if (!err)
- SetPageUptodate(page);
- else
- SetPageError(page);
- unlock_page(page);
- put_page(page);
+ folio_end_read(folio, !err);
+ folio_put(folio);
}
Do you have CONFIG_DEBUG_VM enabled? There are some debugging asserts
which that will enable that might indicate a problem.
Otherwise we can try splitting the commit into individual steps and
seeing which one shows the problem.
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 1/3] fuse: remove call to SetPageError
2024-08-09 16:03 ` Matthew Wilcox
@ 2024-08-09 16:22 ` Matthew Wilcox (Oracle)
2024-08-09 16:22 ` [PATCH 2/3] fuse: Use a folio instead of a page Matthew Wilcox (Oracle)
` (2 subsequent siblings)
3 siblings, 0 replies; 19+ messages in thread
From: Matthew Wilcox (Oracle) @ 2024-08-09 16:22 UTC (permalink / raw)
To: Jürg Billeter; +Cc: Matthew Wilcox (Oracle), Miklos Szeredi, linux-fsdevel
part one
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/fuse/file.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index b57ce4157640..4bf452d5ba9d 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -939,8 +939,6 @@ static void fuse_readpages_end(struct fuse_mount *fm, struct fuse_args *args,
if (!err)
SetPageUptodate(page);
- else
- SetPageError(page);
unlock_page(page);
put_page(page);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 2/3] fuse: Use a folio instead of a page
2024-08-09 16:03 ` Matthew Wilcox
2024-08-09 16:22 ` [PATCH 1/3] fuse: remove call to SetPageError Matthew Wilcox (Oracle)
@ 2024-08-09 16:22 ` Matthew Wilcox (Oracle)
2024-08-09 16:22 ` [PATCH 3/3] fuse: use folio_end_read Matthew Wilcox (Oracle)
2024-08-10 5:56 ` [REGRESSION] fuse: copy_file_range() fails with EIO Jürg Billeter
3 siblings, 0 replies; 19+ messages in thread
From: Matthew Wilcox (Oracle) @ 2024-08-09 16:22 UTC (permalink / raw)
To: Jürg Billeter; +Cc: Matthew Wilcox (Oracle), Miklos Szeredi, linux-fsdevel
part two
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/fuse/file.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 4bf452d5ba9d..2b5533e41a62 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -935,12 +935,12 @@ static void fuse_readpages_end(struct fuse_mount *fm, struct fuse_args *args,
}
for (i = 0; i < ap->num_pages; i++) {
- struct page *page = ap->pages[i];
+ struct folio *folio = page_folio(ap->pages[i]);
if (!err)
- SetPageUptodate(page);
- unlock_page(page);
- put_page(page);
+ folio_mark_uptodate(folio);
+ folio_unlock(folio);
+ folio_put(folio);
}
if (ia->ff)
fuse_file_put(ia->ff, false);
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 3/3] fuse: use folio_end_read
2024-08-09 16:03 ` Matthew Wilcox
2024-08-09 16:22 ` [PATCH 1/3] fuse: remove call to SetPageError Matthew Wilcox (Oracle)
2024-08-09 16:22 ` [PATCH 2/3] fuse: Use a folio instead of a page Matthew Wilcox (Oracle)
@ 2024-08-09 16:22 ` Matthew Wilcox (Oracle)
2024-08-10 12:24 ` Jürg Billeter
2024-08-10 5:56 ` [REGRESSION] fuse: copy_file_range() fails with EIO Jürg Billeter
3 siblings, 1 reply; 19+ messages in thread
From: Matthew Wilcox (Oracle) @ 2024-08-09 16:22 UTC (permalink / raw)
To: Jürg Billeter; +Cc: Matthew Wilcox (Oracle), Miklos Szeredi, linux-fsdevel
part three
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/fuse/file.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 2b5533e41a62..f39456c65ed7 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -937,9 +937,7 @@ static void fuse_readpages_end(struct fuse_mount *fm, struct fuse_args *args,
for (i = 0; i < ap->num_pages; i++) {
struct folio *folio = page_folio(ap->pages[i]);
- if (!err)
- folio_mark_uptodate(folio);
- folio_unlock(folio);
+ folio_end_read(folio, !err);
folio_put(folio);
}
if (ia->ff)
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [REGRESSION] fuse: copy_file_range() fails with EIO
2024-08-09 16:03 ` Matthew Wilcox
` (2 preceding siblings ...)
2024-08-09 16:22 ` [PATCH 3/3] fuse: use folio_end_read Matthew Wilcox (Oracle)
@ 2024-08-10 5:56 ` Jürg Billeter
2024-08-10 15:12 ` Matthew Wilcox
3 siblings, 1 reply; 19+ messages in thread
From: Jürg Billeter @ 2024-08-10 5:56 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Miklos Szeredi, linux-fsdevel, regressions
Hi Matthew,
Thanks for the quick response.
On Fri, 2024-08-09 at 17:03 +0100, Matthew Wilcox wrote:
> Do you have CONFIG_DEBUG_VM enabled? There are some debugging asserts
> which that will enable that might indicate a problem.
With CONFIG_DEBUG_VM enabled, I get:
page: refcount:2 mapcount:0 mapping:00000000b2c30835 index:0x0 pfn:0x12a113
memcg:ffff9d8e3a660800
aops:0xffffffff8a056820 ino:21 dentry name:"bash"
flags: 0x24000000000022d(locked|referenced|uptodate|lru|workingset|node=0|zone=2)
raw: 024000000000022d ffffd9ce04a827c8 ffffd9ce04a84508 ffff9d8e0bbc99f0
raw: 0000000000000000 0000000000000000 00000002ffffffff ffff9d8e3a660800
page dumped because: VM_BUG_ON_FOLIO(folio_test_uptodate(folio))
------------[ cut here ]------------
kernel BUG at mm/filemap.c:1534!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 0 UID: 1000 PID: 1638 Comm: buildbox-fuse Not tainted 6.11.0-rc2+ #26
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:folio_end_read+0xa2/0xb0
Code: 37 8a e8 21 1b 05 00 0f 0b 48 c7 c6 48 7f 39 8a e8 13 1b 05 00 0f 0b 31 f6 e9 7a f9 ff ff 48 c7 c6 a8 7f 39 8a e8 fe 1a 05 00 <0f> 0b 90 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90
RSP: 0018:ffffaa0ec289fc68 EFLAGS: 00010246
RAX: 0000000000000040 RBX: ffffd9ce04a844c0 RCX: 0000000000000027
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff9d8f77c1c8c0
RBP: ffff9d8e07298a38 R08: 00000000ffffefff R09: ffffffff8a6b0d68
R10: 0000000000000003 R11: 0000000000000002 R12: 0000000000000000
R13: 0000000000000001 R14: ffff9d8e305ba098 R15: 0000000000000000
FS: 00007f83d62b4140(0000) GS:ffff9d8f77c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564ef5598078 CR3: 00000001013be000 CR4: 0000000000350ef0
Call Trace:
<TASK>
? __die+0x56/0x97
? die+0x2e/0x50
? do_trap+0x10a/0x110
? do_error_trap+0x65/0x80
? folio_end_read+0xa2/0xb0
? exc_invalid_op+0x50/0x70
? folio_end_read+0xa2/0xb0
? asm_exc_invalid_op+0x1a/0x20
? folio_end_read+0xa2/0xb0
? folio_end_read+0xa2/0xb0
fuse_readpages_end+0xc3/0x150
fuse_request_end+0x84/0x170
fuse_dev_do_write+0x24d/0x1050
? __kmalloc_node_noprof+0x25e/0x480
? fuse_dev_splice_write+0x9d/0x390
fuse_dev_splice_write+0x2b0/0x390
do_splice+0x311/0x8b0
__do_splice+0x204/0x220
__x64_sys_splice+0xb2/0x120
do_syscall_64+0x54/0x140
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f83d5f29a03
Code: 64 89 02 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 80 3d 29 48 0d 00 00 49 89 ca 74 14 b8 13 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
RSP: 002b:00007fff2c0b8ce8 EFLAGS: 00000202 ORIG_RAX: 0000000000000113
RAX: ffffffffffffffda RBX: 00007fff2c0b8eb0 RCX: 00007f83d5f29a03
RDX: 0000000000000105 RSI: 0000000000000000 RDI: 0000000000000106
RBP: 0000564ef555a4f0 R08: 0000000000004010 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000202 R12: 00007fff2c0b8e20
R13: 0000000000000001 R14: 00000000000000fb R15: 0000000000000000
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:folio_end_read+0xa2/0xb0
Code: 37 8a e8 21 1b 05 00 0f 0b 48 c7 c6 48 7f 39 8a e8 13 1b 05 00 0f 0b 31 f6 e9 7a f9 ff ff 48 c7 c6 a8 7f 39 8a e8 fe 1a 05 00 <0f> 0b 90 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90
RSP: 0018:ffffaa0ec289fc68 EFLAGS: 00010246
RAX: 0000000000000040 RBX: ffffd9ce04a844c0 RCX: 0000000000000027
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff9d8f77c1c8c0
RBP: ffff9d8e07298a38 R08: 00000000ffffefff R09: ffffffff8a6b0d68
R10: 0000000000000003 R11: 0000000000000002 R12: 0000000000000000
R13: 0000000000000001 R14: ffff9d8e305ba098 R15: 0000000000000000
FS: 00007f83d62b4140(0000) GS:ffff9d8f77c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564ef5598078 CR3: 00000001013be000 CR4: 0000000000350ef0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x8000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Cheers,
Jürg
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] fuse: use folio_end_read
2024-08-09 16:22 ` [PATCH 3/3] fuse: use folio_end_read Matthew Wilcox (Oracle)
@ 2024-08-10 12:24 ` Jürg Billeter
2024-08-20 12:57 ` Jürg Billeter
0 siblings, 1 reply; 19+ messages in thread
From: Jürg Billeter @ 2024-08-10 12:24 UTC (permalink / raw)
To: Matthew Wilcox (Oracle); +Cc: Miklos Szeredi, linux-fsdevel
On Fri, 2024-08-09 at 17:22 +0100, Matthew Wilcox (Oracle) wrote:
> part three
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
> fs/fuse/file.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 2b5533e41a62..f39456c65ed7 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -937,9 +937,7 @@ static void fuse_readpages_end(struct fuse_mount
> *fm, struct fuse_args *args,
> for (i = 0; i < ap->num_pages; i++) {
> struct folio *folio = page_folio(ap->pages[i]);
>
> - if (!err)
> - folio_mark_uptodate(folio);
> - folio_unlock(folio);
> + folio_end_read(folio, !err);
> folio_put(folio);
> }
> if (ia->ff)
Reverting this part is sufficient to fix the issue for me.
Cheers,
Jürg
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [REGRESSION] fuse: copy_file_range() fails with EIO
2024-08-10 5:56 ` [REGRESSION] fuse: copy_file_range() fails with EIO Jürg Billeter
@ 2024-08-10 15:12 ` Matthew Wilcox
2024-08-22 13:04 ` Miklos Szeredi
0 siblings, 1 reply; 19+ messages in thread
From: Matthew Wilcox @ 2024-08-10 15:12 UTC (permalink / raw)
To: Jürg Billeter; +Cc: Miklos Szeredi, linux-fsdevel, regressions
On Sat, Aug 10, 2024 at 07:56:21AM +0200, Jürg Billeter wrote:
> Thanks for the quick response.
Thanks for the quick test!
> On Fri, 2024-08-09 at 17:03 +0100, Matthew Wilcox wrote:
> > Do you have CONFIG_DEBUG_VM enabled? There are some debugging asserts
> > which that will enable that might indicate a problem.
>
> With CONFIG_DEBUG_VM enabled, I get:
>
> page: refcount:2 mapcount:0 mapping:00000000b2c30835 index:0x0 pfn:0x12a113
> memcg:ffff9d8e3a660800
> aops:0xffffffff8a056820 ino:21 dentry name:"bash"
> flags: 0x24000000000022d(locked|referenced|uptodate|lru|workingset|node=0|zone=2)
> raw: 024000000000022d ffffd9ce04a827c8 ffffd9ce04a84508 ffff9d8e0bbc99f0
> raw: 0000000000000000 0000000000000000 00000002ffffffff ffff9d8e3a660800
> page dumped because: VM_BUG_ON_FOLIO(folio_test_uptodate(folio))
That's what I suspected was going wrong -- we're trying to end a read on
a folio that is already uptodate. Miklos, what the hell is FUSE doing
here?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] fuse: use folio_end_read
2024-08-10 12:24 ` Jürg Billeter
@ 2024-08-20 12:57 ` Jürg Billeter
2024-08-20 13:50 ` Matthew Wilcox
0 siblings, 1 reply; 19+ messages in thread
From: Jürg Billeter @ 2024-08-20 12:57 UTC (permalink / raw)
To: Matthew Wilcox (Oracle); +Cc: Miklos Szeredi, linux-fsdevel
On Sat, 2024-08-10 at 14:24 +0200, Jürg Billeter wrote:
> On Fri, 2024-08-09 at 17:22 +0100, Matthew Wilcox (Oracle) wrote:
> > part three
> >
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > ---
> > fs/fuse/file.c | 4 +---
> > 1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> > index 2b5533e41a62..f39456c65ed7 100644
> > --- a/fs/fuse/file.c
> > +++ b/fs/fuse/file.c
> > @@ -937,9 +937,7 @@ static void fuse_readpages_end(struct
> > fuse_mount
> > *fm, struct fuse_args *args,
> > for (i = 0; i < ap->num_pages; i++) {
> > struct folio *folio = page_folio(ap->pages[i]);
> >
> > - if (!err)
> > - folio_mark_uptodate(folio);
> > - folio_unlock(folio);
> > + folio_end_read(folio, !err);
> > folio_put(folio);
> > }
> > if (ia->ff)
>
> Reverting this part is sufficient to fix the issue for me.
Would it make sense to get a revert of this part (or a full revert of
commit 413e8f014c8b) into mainline and also 6.10 stable if a proper fix
will take more time?
Cheers,
Jürg
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] fuse: use folio_end_read
2024-08-20 12:57 ` Jürg Billeter
@ 2024-08-20 13:50 ` Matthew Wilcox
2024-08-22 11:58 ` Miklos Szeredi
0 siblings, 1 reply; 19+ messages in thread
From: Matthew Wilcox @ 2024-08-20 13:50 UTC (permalink / raw)
To: Jürg Billeter; +Cc: Miklos Szeredi, linux-fsdevel
On Tue, Aug 20, 2024 at 02:57:04PM +0200, Jürg Billeter wrote:
> On Sat, 2024-08-10 at 14:24 +0200, Jürg Billeter wrote:
> > On Fri, 2024-08-09 at 17:22 +0100, Matthew Wilcox (Oracle) wrote:
> > > part three
> > >
> > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > > ---
> > > fs/fuse/file.c | 4 +---
> > > 1 file changed, 1 insertion(+), 3 deletions(-)
> > >
> > > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> > > index 2b5533e41a62..f39456c65ed7 100644
> > > --- a/fs/fuse/file.c
> > > +++ b/fs/fuse/file.c
> > > @@ -937,9 +937,7 @@ static void fuse_readpages_end(struct
> > > fuse_mount
> > > *fm, struct fuse_args *args,
> > > for (i = 0; i < ap->num_pages; i++) {
> > > struct folio *folio = page_folio(ap->pages[i]);
> > >
> > > - if (!err)
> > > - folio_mark_uptodate(folio);
> > > - folio_unlock(folio);
> > > + folio_end_read(folio, !err);
> > > folio_put(folio);
> > > }
> > > if (ia->ff)
> >
> > Reverting this part is sufficient to fix the issue for me.
>
> Would it make sense to get a revert of this part (or a full revert of
> commit 413e8f014c8b) into mainline and also 6.10 stable if a proper fix
> will take more time?
As far as I'm concerned, I've found the fix. It's just that Miklos
isn't responding. On holiday, perhaps?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] fuse: use folio_end_read
2024-08-20 13:50 ` Matthew Wilcox
@ 2024-08-22 11:58 ` Miklos Szeredi
2024-08-22 12:25 ` Matthew Wilcox
0 siblings, 1 reply; 19+ messages in thread
From: Miklos Szeredi @ 2024-08-22 11:58 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Jürg Billeter, linux-fsdevel
On Tue, 20 Aug 2024 at 15:50, Matthew Wilcox <willy@infradead.org> wrote:
> > Would it make sense to get a revert of this part (or a full revert of
> > commit 413e8f014c8b) into mainline and also 6.10 stable if a proper fix
> > will take more time?
>
> As far as I'm concerned, I've found the fix. It's just that Miklos
> isn't responding. On holiday, perhaps?
I was, but not anymore.
I'm trying to dig though the unread pile, but haven't seen your fix.
Can you please point me in the right direction?
Thanks,
Miklos
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] fuse: use folio_end_read
2024-08-22 11:58 ` Miklos Szeredi
@ 2024-08-22 12:25 ` Matthew Wilcox
0 siblings, 0 replies; 19+ messages in thread
From: Matthew Wilcox @ 2024-08-22 12:25 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Jürg Billeter, linux-fsdevel
On Thu, Aug 22, 2024 at 01:58:28PM +0200, Miklos Szeredi wrote:
> On Tue, 20 Aug 2024 at 15:50, Matthew Wilcox <willy@infradead.org> wrote:
>
> > > Would it make sense to get a revert of this part (or a full revert of
> > > commit 413e8f014c8b) into mainline and also 6.10 stable if a proper fix
> > > will take more time?
> >
> > As far as I'm concerned, I've found the fix. It's just that Miklos
> > isn't responding. On holiday, perhaps?
>
> I was, but not anymore.
>
> I'm trying to dig though the unread pile, but haven't seen your fix.
>
> Can you please point me in the right direction?
See the other fork of this thread:
> That's what I suspected was going wrong -- we're trying to end a read on
> a folio that is already uptodate. Miklos, what the hell is FUSE doing
> here?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [REGRESSION] fuse: copy_file_range() fails with EIO
2024-08-10 15:12 ` Matthew Wilcox
@ 2024-08-22 13:04 ` Miklos Szeredi
2024-08-22 13:12 ` Jürg Billeter
0 siblings, 1 reply; 19+ messages in thread
From: Miklos Szeredi @ 2024-08-22 13:04 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Jürg Billeter, linux-fsdevel, regressions
On Sat, 10 Aug 2024 at 17:12, Matthew Wilcox <willy@infradead.org> wrote:
>
> On Sat, Aug 10, 2024 at 07:56:21AM +0200, Jürg Billeter wrote:
> > Thanks for the quick response.
>
> Thanks for the quick test!
>
> > On Fri, 2024-08-09 at 17:03 +0100, Matthew Wilcox wrote:
> > > Do you have CONFIG_DEBUG_VM enabled? There are some debugging asserts
> > > which that will enable that might indicate a problem.
> >
> > With CONFIG_DEBUG_VM enabled, I get:
> >
> > page: refcount:2 mapcount:0 mapping:00000000b2c30835 index:0x0 pfn:0x12a113
> > memcg:ffff9d8e3a660800
> > aops:0xffffffff8a056820 ino:21 dentry name:"bash"
> > flags: 0x24000000000022d(locked|referenced|uptodate|lru|workingset|node=0|zone=2)
> > raw: 024000000000022d ffffd9ce04a827c8 ffffd9ce04a84508 ffff9d8e0bbc99f0
> > raw: 0000000000000000 0000000000000000 00000002ffffffff ffff9d8e3a660800
> > page dumped because: VM_BUG_ON_FOLIO(folio_test_uptodate(folio))
>
> That's what I suspected was going wrong -- we're trying to end a read on
> a folio that is already uptodate. Miklos, what the hell is FUSE doing
> here?
Ah, this is the fancy page cache replacement done in fuse_try_move_page().
I understand how this triggers VM_BUG_ON_FOLIO() in folio_end_read().
What I don't understand is how this results in the -EIO that Jürg reported.
Thanks,
Miklos
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [REGRESSION] fuse: copy_file_range() fails with EIO
2024-08-22 13:04 ` Miklos Szeredi
@ 2024-08-22 13:12 ` Jürg Billeter
2024-08-22 13:14 ` Miklos Szeredi
0 siblings, 1 reply; 19+ messages in thread
From: Jürg Billeter @ 2024-08-22 13:12 UTC (permalink / raw)
To: Miklos Szeredi, Matthew Wilcox; +Cc: linux-fsdevel, regressions
On Thu, 2024-08-22 at 15:04 +0200, Miklos Szeredi wrote:
> On Sat, 10 Aug 2024 at 17:12, Matthew Wilcox <willy@infradead.org> wrote:
> > That's what I suspected was going wrong -- we're trying to end a read on
> > a folio that is already uptodate. Miklos, what the hell is FUSE doing
> > here?
>
> Ah, this is the fancy page cache replacement done in
> fuse_try_move_page().
>
> I understand how this triggers VM_BUG_ON_FOLIO() in folio_end_read().
>
> What I don't understand is how this results in the -EIO that Jürg
> reported.
I'm not really familiar with this code but it seems `folio_end_read()`
uses xor to update the `PG_uptodate` flag. So if it was already set, it
will incorrectly clear the `PG_uptodate` set, which I guess triggers
the issue.
Jürg
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [REGRESSION] fuse: copy_file_range() fails with EIO
2024-08-22 13:12 ` Jürg Billeter
@ 2024-08-22 13:14 ` Miklos Szeredi
2024-08-22 13:32 ` Miklos Szeredi
0 siblings, 1 reply; 19+ messages in thread
From: Miklos Szeredi @ 2024-08-22 13:14 UTC (permalink / raw)
To: Jürg Billeter; +Cc: Matthew Wilcox, linux-fsdevel, regressions
On Thu, 22 Aug 2024 at 15:12, Jürg Billeter <j@bitron.ch> wrote:
>
> On Thu, 2024-08-22 at 15:04 +0200, Miklos Szeredi wrote:
> > On Sat, 10 Aug 2024 at 17:12, Matthew Wilcox <willy@infradead.org> wrote:
> > > That's what I suspected was going wrong -- we're trying to end a read on
> > > a folio that is already uptodate. Miklos, what the hell is FUSE doing
> > > here?
> >
> > Ah, this is the fancy page cache replacement done in
> > fuse_try_move_page().
> >
> > I understand how this triggers VM_BUG_ON_FOLIO() in folio_end_read().
> >
> > What I don't understand is how this results in the -EIO that Jürg
> > reported.
>
> I'm not really familiar with this code but it seems `folio_end_read()`
> uses xor to update the `PG_uptodate` flag. So if it was already set, it
> will incorrectly clear the `PG_uptodate` set, which I guess triggers
> the issue.
Indeed, that would explain this.
Thanks,
Miklos
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [REGRESSION] fuse: copy_file_range() fails with EIO
2024-08-22 13:14 ` Miklos Szeredi
@ 2024-08-22 13:32 ` Miklos Szeredi
2024-08-22 14:14 ` Jürg Billeter
2024-08-29 9:27 ` Marcel Ziswiler
0 siblings, 2 replies; 19+ messages in thread
From: Miklos Szeredi @ 2024-08-22 13:32 UTC (permalink / raw)
To: Jürg Billeter; +Cc: Matthew Wilcox, linux-fsdevel, regressions
[-- Attachment #1: Type: text/plain, Size: 489 bytes --]
On Thu, 22 Aug 2024 at 15:14, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > > What I don't understand is how this results in the -EIO that Jürg
> > > reported.
> >
> > I'm not really familiar with this code but it seems `folio_end_read()`
> > uses xor to update the `PG_uptodate` flag. So if it was already set, it
> > will incorrectly clear the `PG_uptodate` set, which I guess triggers
> > the issue.
Untested patch attached. Could you please try this?
Thanks,
Miklos
[-- Attachment #2: test.patch --]
[-- Type: text/x-patch, Size: 730 bytes --]
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 1f1d8a023d35..7cc4f8c9b896 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -789,7 +789,6 @@ static int fuse_check_folio(struct folio *folio)
(folio->flags & PAGE_FLAGS_CHECK_AT_PREP &
~(1 << PG_locked |
1 << PG_referenced |
- 1 << PG_uptodate |
1 << PG_lru |
1 << PG_active |
1 << PG_workingset |
@@ -834,9 +833,7 @@ static int fuse_try_move_page(struct fuse_copy_state *cs, struct page **pagep)
newfolio = page_folio(buf->page);
- if (!folio_test_uptodate(newfolio))
- folio_mark_uptodate(newfolio);
-
+ folio_clear_uptodate(newfolio);
folio_clear_mappedtodisk(newfolio);
if (fuse_check_folio(newfolio) != 0)
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [REGRESSION] fuse: copy_file_range() fails with EIO
2024-08-22 13:32 ` Miklos Szeredi
@ 2024-08-22 14:14 ` Jürg Billeter
2024-08-29 9:27 ` Marcel Ziswiler
1 sibling, 0 replies; 19+ messages in thread
From: Jürg Billeter @ 2024-08-22 14:14 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Matthew Wilcox, linux-fsdevel, regressions
On Thu, 2024-08-22 at 15:32 +0200, Miklos Szeredi wrote:
> On Thu, 22 Aug 2024 at 15:14, Miklos Szeredi <miklos@szeredi.hu>
> wrote:
>
> > > > What I don't understand is how this results in the -EIO that Jürg
> > > > reported.
> > >
> > > I'm not really familiar with this code but it seems `folio_end_read()`
> > > uses xor to update the `PG_uptodate` flag. So if it was already set, it
> > > will incorrectly clear the `PG_uptodate` set, which I guess triggers
> > > the issue.
>
> Untested patch attached. Could you please try this?
I can no longer reproduce the issue with this patch. So at least with
regards to my test case, it looks good to me.
Cheers,
Jürg
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [REGRESSION] fuse: copy_file_range() fails with EIO
2024-08-22 13:32 ` Miklos Szeredi
2024-08-22 14:14 ` Jürg Billeter
@ 2024-08-29 9:27 ` Marcel Ziswiler
2024-08-29 9:30 ` Miklos Szeredi
1 sibling, 1 reply; 19+ messages in thread
From: Marcel Ziswiler @ 2024-08-29 9:27 UTC (permalink / raw)
To: Miklos Szeredi, Jürg Billeter
Cc: Matthew Wilcox, linux-fsdevel, regressions
Hi Miklos
On Thu, 2024-08-22 at 15:32 +0200, Miklos Szeredi wrote:
> On Thu, 22 Aug 2024 at 15:14, Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> > > > What I don't understand is how this results in the -EIO that Jürg
> > > > reported.
> > >
> > > I'm not really familiar with this code but it seems `folio_end_read()`
> > > uses xor to update the `PG_uptodate` flag. So if it was already set, it
> > > will incorrectly clear the `PG_uptodate` set, which I guess triggers
> > > the issue.
>
> Untested patch attached. Could you please try this?
Any progress on this? I can confirm that it does also fix an issue I noticed on Fedora Silverblue 40 running
6.10.6-200.fc40.x86_64. I successfully tested this patch on top of latest kernel-ark 6.11.0-
0.rc4.872cf28b8df9.39.fc40.x86_64. Thanks!
> Thanks,
> Miklos
Cheers
Marcel
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [REGRESSION] fuse: copy_file_range() fails with EIO
2024-08-29 9:27 ` Marcel Ziswiler
@ 2024-08-29 9:30 ` Miklos Szeredi
0 siblings, 0 replies; 19+ messages in thread
From: Miklos Szeredi @ 2024-08-29 9:30 UTC (permalink / raw)
To: Marcel Ziswiler
Cc: Jürg Billeter, Matthew Wilcox, linux-fsdevel, regressions
On Thu, 29 Aug 2024 at 11:27, Marcel Ziswiler <marcel@ziswiler.com> wrote:
> Any progress on this? I can confirm that it does also fix an issue I noticed on Fedora Silverblue 40 running
> 6.10.6-200.fc40.x86_64. I successfully tested this patch on top of latest kernel-ark 6.11.0-
> 0.rc4.872cf28b8df9.39.fc40.x86_64. Thanks!
Planning to send a pull request with fuse fixes this week.
Thanks,
Miklos
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2024-08-29 9:30 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-09 15:53 [REGRESSION] fuse: copy_file_range() fails with EIO Jürg Billeter
2024-08-09 16:03 ` Matthew Wilcox
2024-08-09 16:22 ` [PATCH 1/3] fuse: remove call to SetPageError Matthew Wilcox (Oracle)
2024-08-09 16:22 ` [PATCH 2/3] fuse: Use a folio instead of a page Matthew Wilcox (Oracle)
2024-08-09 16:22 ` [PATCH 3/3] fuse: use folio_end_read Matthew Wilcox (Oracle)
2024-08-10 12:24 ` Jürg Billeter
2024-08-20 12:57 ` Jürg Billeter
2024-08-20 13:50 ` Matthew Wilcox
2024-08-22 11:58 ` Miklos Szeredi
2024-08-22 12:25 ` Matthew Wilcox
2024-08-10 5:56 ` [REGRESSION] fuse: copy_file_range() fails with EIO Jürg Billeter
2024-08-10 15:12 ` Matthew Wilcox
2024-08-22 13:04 ` Miklos Szeredi
2024-08-22 13:12 ` Jürg Billeter
2024-08-22 13:14 ` Miklos Szeredi
2024-08-22 13:32 ` Miklos Szeredi
2024-08-22 14:14 ` Jürg Billeter
2024-08-29 9:27 ` Marcel Ziswiler
2024-08-29 9:30 ` Miklos Szeredi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).