* fuse: trying to steal weird page
@ 2020-05-02 19:09 Nikolaus Rath
  2020-05-02 19:52 ` Nikolaus Rath
  0 siblings, 1 reply; 26+ messages in thread
From: Nikolaus Rath @ 2020-05-02 19:09 UTC (permalink / raw)
  To: linux-fsdevel, fuse-devel, linux-mm, miklos,
	Gabriel Krisman Bertazi, André Almeida
Hello,
I have recently noticed that a FUSE filesystem regularly produces many
kernel messages like this:
[ 2333.009931] fuse: trying to steal weird page
[ 2333.009937] fuse:   page=00000000dd1750e3 index=2022240 flags=17ffffc0000097, count=1, mapcount=0, mapping=00000000125079ad
[ 2334.595835] fuse: trying to steal weird page
[ 2334.595841] fuse:   page=000000009e8626ac index=2052288 flags=17ffffc0000097, count=1, mapcount=0, mapping=00000000125079ad
[ 2334.983568] fuse: trying to steal weird page
[ 2334.983572] fuse:   page=0000000013fdd9e4 index=2059392 flags=17ffffc0000097, count=1, mapcount=0, mapping=00000000125079ad
[ 2335.979905] fuse: trying to steal weird page
[ 2335.979911] fuse:   page=00000000a7057848 index=2078588 flags=17ffffc0000097, count=1, mapcount=0, mapping=00000000125079ad
They do not seem to correlate with userspace errors, but I noticed that
there is no significant performance difference between libfuse using
splice, splice with SPLICE_MOVE, and not using splice. This is somewhat
unexpected, so maybe it is related to the kernel messages?
What are the implications of the above kernel message? Is there a way to
provide more debugging information?
(I have reported a similar issue before
(https://marc.info/?l=linux-mm&m=155290847909996&w=2) and the patch may
not be present in the kernel that I'm using. However, the previous time
the warning had a different set of flags and a null mapping, so I think
this one is different).
$ uname -a
Linux valve 5.3.0-46-generic #38-Ubuntu SMP Fri Mar 27 17:37:05 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Best,
-Nikolaus
-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
             »Time flies like an arrow, fruit flies like a Banana.«
^ permalink raw reply	[flat|nested] 26+ messages in thread- * Re: fuse: trying to steal weird page
  2020-05-02 19:09 fuse: trying to steal weird page Nikolaus Rath
@ 2020-05-02 19:52 ` Nikolaus Rath
  2020-05-03  3:26   ` Matthew Wilcox
  0 siblings, 1 reply; 26+ messages in thread
From: Nikolaus Rath @ 2020-05-02 19:52 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: fuse-devel, linux-mm, miklos, Gabriel Krisman Bertazi,
	André Almeida
On May 02 2020, Nikolaus Rath <Nikolaus@rath.org> wrote:
> Hello,
>
> I have recently noticed that a FUSE filesystem regularly produces many
> kernel messages like this:
>
> [ 2333.009931] fuse: trying to steal weird page
> [ 2333.009937] fuse: page=00000000dd1750e3 index=2022240 flags=17ffffc0000097, count=1,
> mapcount=0, mapping=00000000125079ad
> [ 2334.595835] fuse: trying to steal weird page
> [ 2334.595841] fuse: page=000000009e8626ac index=2052288 flags=17ffffc0000097, count=1,
> mapcount=0, mapping=00000000125079ad
> [ 2334.983568] fuse: trying to steal weird page
> [ 2334.983572] fuse: page=0000000013fdd9e4 index=2059392 flags=17ffffc0000097, count=1,
> mapcount=0, mapping=00000000125079ad
> [ 2335.979905] fuse: trying to steal weird page
> [ 2335.979911] fuse: page=00000000a7057848 index=2078588 flags=17ffffc0000097, count=1,
> mapcount=0, mapping=00000000125079ad
>
> They do not seem to correlate with userspace errors, but I noticed that
> there is no significant performance difference between libfuse using
> splice, splice with SPLICE_MOVE, and not using splice. This is somewhat
> unexpected, so maybe it is related to the kernel messages?
>
> What are the implications of the above kernel message? Is there a way to
> provide more debugging information?
>
> (I have reported a similar issue before
> (https://marc.info/?l=linux-mm&m=155290847909996&w=2) and the patch may
> not be present in the kernel that I'm using. However, the previous time
> the warning had a different set of flags and a null mapping, so I think
> this one is different).
>
> $ uname -a
> Linux valve 5.3.0-46-generic #38-Ubuntu SMP Fri Mar 27 17:37:05 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
I have now reproduced this with a vanilla 5.6 kernel.
I also found that the "mapping" and "flags" parameters always have the
same values - even across different workloads and when re-mounting the
filesystem. The "page" parameter varies.
Best,
-Nikolaus
-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
             »Time flies like an arrow, fruit flies like a Banana.«
^ permalink raw reply	[flat|nested] 26+ messages in thread
- * Re: fuse: trying to steal weird page
  2020-05-02 19:52 ` Nikolaus Rath
@ 2020-05-03  3:26   ` Matthew Wilcox
  2020-05-03  8:43     ` [fuse-devel] " Nikolaus Rath
  0 siblings, 1 reply; 26+ messages in thread
From: Matthew Wilcox @ 2020-05-03  3:26 UTC (permalink / raw)
  To: linux-fsdevel, fuse-devel, linux-mm, miklos,
	Gabriel Krisman Bertazi, André Almeida
On Sat, May 02, 2020 at 08:52:48PM +0100, Nikolaus Rath wrote:
> On May 02 2020, Nikolaus Rath <Nikolaus@rath.org> wrote:
> > I have recently noticed that a FUSE filesystem regularly produces many
> > kernel messages like this:
> >
> > [ 2333.009931] fuse: trying to steal weird page
> > [ 2333.009937] fuse: page=00000000dd1750e3 index=2022240 flags=17ffffc0000097, count=1,
> > mapcount=0, mapping=00000000125079ad
...
> > What are the implications of the above kernel message? Is there a way to
> > provide more debugging information?
It'd be helpful to use the common debugging infrastructure which prints
more useful information:
+++ b/fs/fuse/dev.c
@@ -772,8 +772,7 @@ static int fuse_check_page(struct page *page)
               1 << PG_lru |
               1 << PG_active |
               1 << PG_reclaim))) {
-               pr_warn("trying to steal weird page\n");
-               pr_warn("  page=%p index=%li flags=%08lx, count=%i, mapcount=%i, mapping=%p\n", page, page->index, page->flags, page_count(page), page_mapcount(page), page->mapping);
+               dump_page(page, "fuse: trying to steal weird page");
                return 1;
        }
        return 0;
(whitespace damaged; if you can't make the equivalent change, let me
know and I'll send you a real patch)
^ permalink raw reply	[flat|nested] 26+ messages in thread
- * Re: [fuse-devel] fuse: trying to steal weird page
  2020-05-03  3:26   ` Matthew Wilcox
@ 2020-05-03  8:43     ` Nikolaus Rath
  2020-05-03 10:27       ` Matthew Wilcox
  0 siblings, 1 reply; 26+ messages in thread
From: Nikolaus Rath @ 2020-05-03  8:43 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-fsdevel, fuse-devel, linux-mm, miklos,
	Gabriel Krisman Bertazi, André Almeida
On May 02 2020, Matthew Wilcox <willy@infradead.org> wrote:
> On Sat, May 02, 2020 at 08:52:48PM +0100, Nikolaus Rath wrote:
>> On May 02 2020, Nikolaus Rath <Nikolaus@rath.org> wrote:
>> > I have recently noticed that a FUSE filesystem regularly produces many
>> > kernel messages like this:
>> >
>> > [ 2333.009931] fuse: trying to steal weird page
>> > [ 2333.009937] fuse: page=00000000dd1750e3 index=2022240 flags=17ffffc0000097, count=1,
>> > mapcount=0, mapping=00000000125079ad
> ...
>> > What are the implications of the above kernel message? Is there a way to
>> > provide more debugging information?
>
> It'd be helpful to use the common debugging infrastructure which prints
> more useful information:
>
> +++ b/fs/fuse/dev.c
> @@ -772,8 +772,7 @@ static int fuse_check_page(struct page *page)
>                1 << PG_lru |
>                1 << PG_active |
>                1 << PG_reclaim))) {
> -               pr_warn("trying to steal weird page\n");
> -               pr_warn("  page=%p index=%li flags=%08lx, count=%i, mapcount=%i, mapping=%p\n", page, page->index, page->flags, page_count(page), page_mapcount(page), page->mapping);
> +               dump_page(page, "fuse: trying to steal weird page");
>                 return 1;
>         }
>         return 0;
>
> (whitespace damaged; if you can't make the equivalent change, let me
> know and I'll send you a real patch)
Here's what I got:
[  221.277260] page:ffffec4bbd639880 refcount:1 mapcount:0 mapping:0000000000000000 index:0xd9
[  221.277265] flags: 0x17ffffc0000097(locked|waiters|referenced|uptodate|lru)
[  221.277269] raw: 0017ffffc0000097 ffffec4bbd62f048 ffffec4bbd619308 0000000000000000
[  221.277271] raw: 00000000000000d9 0000000000000000 00000001ffffffff ffff9aec11beb000
[  221.277272] page dumped because: fuse: trying to steal weird page
[  221.277273] page->mem_cgroup:ffff9aec11beb000
[  221.601910] page:ffffec4bbd4b2dc0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x1273
[  221.601915] flags: 0x17ffffc0000097(locked|waiters|referenced|uptodate|lru)
[  221.601919] raw: 0017ffffc0000097 ffffec4bbd51f488 ffffec4bbd4b2d48 0000000000000000
[  221.601921] raw: 0000000000001273 0000000000000000 00000001ffffffff ffff9aec11beb000
[  221.601922] page dumped because: fuse: trying to steal weird page
[  221.601923] page->mem_cgroup:ffff9aec11beb000
[  221.958699] page:ffffec4bbd424d80 refcount:1 mapcount:0 mapping:0000000000000000 index:0x38be
[  221.958703] flags: 0x17ffffc0000097(locked|waiters|referenced|uptodate|lru)
[  221.958707] raw: 0017ffffc0000097 ffffec4bbd424048 ffffec4bbd424d48 0000000000000000
[  221.958709] raw: 00000000000038be 0000000000000000 00000001ffffffff ffff9aec11beb000
[  221.958710] page dumped because: fuse: trying to steal weird page
[  221.958711] page->mem_cgroup:ffff9aec11beb000
Best,
-Nikolaus
-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
             »Time flies like an arrow, fruit flies like a Banana.«
^ permalink raw reply	[flat|nested] 26+ messages in thread
- * Re: [fuse-devel] fuse: trying to steal weird page
  2020-05-03  8:43     ` [fuse-devel] " Nikolaus Rath
@ 2020-05-03 10:27       ` Matthew Wilcox
  2020-05-03 18:28         ` Gabriel Krisman Bertazi
                           ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Matthew Wilcox @ 2020-05-03 10:27 UTC (permalink / raw)
  To: linux-fsdevel, fuse-devel, linux-mm, miklos,
	Gabriel Krisman Bertazi, André Almeida
On Sun, May 03, 2020 at 09:43:41AM +0100, Nikolaus Rath wrote:
> Here's what I got:
> 
> [  221.277260] page:ffffec4bbd639880 refcount:1 mapcount:0 mapping:0000000000000000 index:0xd9
> [  221.277265] flags: 0x17ffffc0000097(locked|waiters|referenced|uptodate|lru)
> [  221.277269] raw: 0017ffffc0000097 ffffec4bbd62f048 ffffec4bbd619308 0000000000000000
> [  221.277271] raw: 00000000000000d9 0000000000000000 00000001ffffffff ffff9aec11beb000
> [  221.277272] page dumped because: fuse: trying to steal weird page
> [  221.277273] page->mem_cgroup:ffff9aec11beb000
Great!  Here's the condition:
        if (page_mapcount(page) ||
            page->mapping != NULL ||
            page_count(page) != 1 ||
            (page->flags & PAGE_FLAGS_CHECK_AT_PREP &
             ~(1 << PG_locked |
               1 << PG_referenced |
               1 << PG_uptodate |
               1 << PG_lru |
               1 << PG_active |
               1 << PG_reclaim))) {
mapcount is 0, mapping is NULL, refcount is 1, so that's all fine.
flags has 'waiters' set, which is not in the allowed list.  I don't
know the internals of FUSE, so I don't know why that is.
Also, page_count() is unstable.  Unless there has been an RCU grace period
between when the page was freed and now, a speculative reference may exist
from the page cache.  So I would say this is a bad thing to check for.
Thanks for the swift provision of the debugging data!
^ permalink raw reply	[flat|nested] 26+ messages in thread
- * Re: [fuse-devel] fuse: trying to steal weird page
  2020-05-03 10:27       ` Matthew Wilcox
@ 2020-05-03 18:28         ` Gabriel Krisman Bertazi
  2020-05-03 20:06           ` Matthew Wilcox
  2020-05-03 20:25           ` Nikolaus Rath
  2020-05-03 21:34         ` Hugh Dickins
  2020-05-18 12:45         ` Miklos Szeredi
  2 siblings, 2 replies; 26+ messages in thread
From: Gabriel Krisman Bertazi @ 2020-05-03 18:28 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-fsdevel, fuse-devel, linux-mm, miklos, André Almeida
Matthew Wilcox <willy@infradead.org> writes:
> On Sun, May 03, 2020 at 09:43:41AM +0100, Nikolaus Rath wrote:
>> Here's what I got:
>> 
>> [  221.277260] page:ffffec4bbd639880 refcount:1 mapcount:0 mapping:0000000000000000 index:0xd9
>> [  221.277265] flags: 0x17ffffc0000097(locked|waiters|referenced|uptodate|lru)
>> [  221.277269] raw: 0017ffffc0000097 ffffec4bbd62f048 ffffec4bbd619308 0000000000000000
>> [  221.277271] raw: 00000000000000d9 0000000000000000 00000001ffffffff ffff9aec11beb000
>> [  221.277272] page dumped because: fuse: trying to steal weird page
>> [  221.277273] page->mem_cgroup:ffff9aec11beb000
>
> Great!  Here's the condition:
>
>         if (page_mapcount(page) ||
>             page->mapping != NULL ||
>             page_count(page) != 1 ||
>             (page->flags & PAGE_FLAGS_CHECK_AT_PREP &
>              ~(1 << PG_locked |
>                1 << PG_referenced |
>                1 << PG_uptodate |
>                1 << PG_lru |
>                1 << PG_active |
>                1 << PG_reclaim))) {
>
> mapcount is 0, mapping is NULL, refcount is 1, so that's all fine.
> flags has 'waiters' set, which is not in the allowed list.  I don't
> know the internals of FUSE, so I don't know why that is.
>
Hi
On the first message, Nikolaus sent the following line:
>> [ 2333.009937] fuse: page=00000000dd1750e3 index=2022240 flags=17ffffc0000097, count=1,
>> mapcount=0, mapping=00000000125079ad
It should be noted that on the second run, where we got the dump_page
log, it indeed had a null mapping, which is similar to what Nikolaus
asked on the previous thread he linked to, but looks like this wasn't
the case on at least some of the reproductions of the issue.  On the
line above, the condition that triggered the warning was page->mapping
!= NULL.  I don't know what to do with this information, though.
-- 
Gabriel Krisman Bertazi
^ permalink raw reply	[flat|nested] 26+ messages in thread
- * Re: [fuse-devel] fuse: trying to steal weird page
  2020-05-03 18:28         ` Gabriel Krisman Bertazi
@ 2020-05-03 20:06           ` Matthew Wilcox
  2020-05-03 20:25           ` Nikolaus Rath
  1 sibling, 0 replies; 26+ messages in thread
From: Matthew Wilcox @ 2020-05-03 20:06 UTC (permalink / raw)
  To: Gabriel Krisman Bertazi
  Cc: linux-fsdevel, fuse-devel, linux-mm, miklos, André Almeida
On Sun, May 03, 2020 at 02:28:34PM -0400, Gabriel Krisman Bertazi wrote:
> Matthew Wilcox <willy@infradead.org> writes:
> > mapcount is 0, mapping is NULL, refcount is 1, so that's all fine.
> > flags has 'waiters' set, which is not in the allowed list.  I don't
> > know the internals of FUSE, so I don't know why that is.
> 
> Hi
> 
> On the first message, Nikolaus sent the following line:
> 
> >> [ 2333.009937] fuse: page=00000000dd1750e3 index=2022240 flags=17ffffc0000097, count=1,
> >> mapcount=0, mapping=00000000125079ad
> 
> It should be noted that on the second run, where we got the dump_page
> log, it indeed had a null mapping, which is similar to what Nikolaus
> asked on the previous thread he linked to, but looks like this wasn't
> the case on at least some of the reproductions of the issue.  On the
> line above, the condition that triggered the warning was page->mapping
> != NULL.  I don't know what to do with this information, though.
I don't see anything in upstream which will print "NULL" for a null
pointer passed to pointer().  ptr_to_id() doesn't check for NULL, nor
does __ptr_to_hashval().
I think we _should_, along with the ERR_PTR values and values less
than 4096.  But that patch isn't upstream.
^ permalink raw reply	[flat|nested] 26+ messages in thread 
- * Re: [fuse-devel] fuse: trying to steal weird page
  2020-05-03 18:28         ` Gabriel Krisman Bertazi
  2020-05-03 20:06           ` Matthew Wilcox
@ 2020-05-03 20:25           ` Nikolaus Rath
  2020-05-06 13:57             ` Vlastimil Babka
  1 sibling, 1 reply; 26+ messages in thread
From: Nikolaus Rath @ 2020-05-03 20:25 UTC (permalink / raw)
  To: Gabriel Krisman Bertazi
  Cc: Matthew Wilcox, linux-fsdevel, fuse-devel, linux-mm, miklos,
	André Almeida
On May 03 2020, Gabriel Krisman Bertazi <krisman@collabora.com> wrote:
> Matthew Wilcox <willy@infradead.org> writes:
>
>> On Sun, May 03, 2020 at 09:43:41AM +0100, Nikolaus Rath wrote:
>>> Here's what I got:
>>> 
>>> [  221.277260] page:ffffec4bbd639880 refcount:1 mapcount:0 mapping:0000000000000000 index:0xd9
>>> [  221.277265] flags: 0x17ffffc0000097(locked|waiters|referenced|uptodate|lru)
>>> [  221.277269] raw: 0017ffffc0000097 ffffec4bbd62f048 ffffec4bbd619308 0000000000000000
>>> [  221.277271] raw: 00000000000000d9 0000000000000000 00000001ffffffff ffff9aec11beb000
>>> [  221.277272] page dumped because: fuse: trying to steal weird page
>>> [  221.277273] page->mem_cgroup:ffff9aec11beb000
>>
>> Great!  Here's the condition:
>>
>>         if (page_mapcount(page) ||
>>             page->mapping != NULL ||
>>             page_count(page) != 1 ||
>>             (page->flags & PAGE_FLAGS_CHECK_AT_PREP &
>>              ~(1 << PG_locked |
>>                1 << PG_referenced |
>>                1 << PG_uptodate |
>>                1 << PG_lru |
>>                1 << PG_active |
>>                1 << PG_reclaim))) {
>>
>> mapcount is 0, mapping is NULL, refcount is 1, so that's all fine.
>> flags has 'waiters' set, which is not in the allowed list.  I don't
>> know the internals of FUSE, so I don't know why that is.
>>
>
> On the first message, Nikolaus sent the following line:
>
>>> [ 2333.009937] fuse: page=00000000dd1750e3 index=2022240 flags=17ffffc0000097, count=1,
>>> mapcount=0, mapping=00000000125079ad
>
> It should be noted that on the second run, where we got the dump_page
> log, it indeed had a null mapping, which is similar to what Nikolaus
> asked on the previous thread he linked to, but looks like this wasn't
> the case on at least some of the reproductions of the issue.  On the
> line above, the condition that triggered the warning was page->mapping
> != NULL.  I don't know what to do with this information, though.
Indeed, that's curious. I've modified the patch slightly to print both
the old and the new message to confirm. And indeed:
[  260.882873] fuse: trying to steal weird page
[  260.882879] fuse:   page=00000000813e7570 index=2010048 flags=17ffffc0000097, count=1, mapcount=0, mapping=0000000094844a11
[  260.882882] page:ffffe13431bcc000 refcount:1 mapcount:0 mapping:0000000000000000 index:0x1eabc0
[  260.882885] flags: 0x17ffffc0000097(locked|waiters|referenced|uptodate|lru)
[  260.882889] raw: 0017ffffc0000097 ffffe13431ca7c48 ffffe13431bcbfc8 0000000000000000
[  260.882891] raw: 00000000001eabc0 0000000000000000 00000001ffffffff ffff91fe31318000
[  260.882892] page dumped because: fuse: trying to steal weird page
[  260.882893] page->mem_cgroup:ffff91fe31318000
[  262.608438] fuse: trying to steal weird page
[  262.608444] fuse:   page=000000000c21d0c7 index=2040704 flags=17ffffc0000097, count=1, mapcount=0, mapping=0000000094844a11
[  262.608447] page:ffffe134319ebf80 refcount:1 mapcount:0 mapping:0000000000000000 index:0x1f2380
[  262.608450] flags: 0x17ffffc0000097(locked|waiters|referenced|uptodate|lru)
[  262.608455] raw: 0017ffffc0000097 ffffe134319e9fc8 ffffe134319ebf48 0000000000000000
[  262.608457] raw: 00000000001f2380 0000000000000000 00000001ffffffff ffff91fe31318000
[  262.608458] page dumped because: fuse: trying to steal weird page
[  262.608459] page->mem_cgroup:ffff91fe31318000
[  262.770209] fuse: trying to steal weird page
[  262.770215] fuse:   page=0000000018a813ac index=2045120 flags=17ffffc0000097, count=1, mapcount=0, mapping=0000000094844a11
[  262.770218] page:ffffe134319953c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x1f34c0
[  262.770221] flags: 0x17ffffc0000097(locked|waiters|referenced|uptodate|lru)
[  262.770225] raw: 0017ffffc0000097 ffffe134319ae788 ffffe13431995388 0000000000000000
[  262.770228] raw: 00000000001f34c0 0000000000000000 00000001ffffffff ffff91fe31318000
[  262.770229] page dumped because: fuse: trying to steal weird page
[  262.770230] page->mem_cgroup:ffff91fe31318000
Best,
-Nikolaus
-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
             »Time flies like an arrow, fruit flies like a Banana.«
^ permalink raw reply	[flat|nested] 26+ messages in thread
- * Re: [fuse-devel] fuse: trying to steal weird page
  2020-05-03 20:25           ` Nikolaus Rath
@ 2020-05-06 13:57             ` Vlastimil Babka
  0 siblings, 0 replies; 26+ messages in thread
From: Vlastimil Babka @ 2020-05-06 13:57 UTC (permalink / raw)
  To: Gabriel Krisman Bertazi, Matthew Wilcox, linux-fsdevel,
	fuse-devel, linux-mm, miklos, André Almeida
On 5/3/20 10:25 PM, Nikolaus Rath wrote:
> On May 03 2020, Gabriel Krisman Bertazi <krisman@collabora.com> wrote:
>> Matthew Wilcox <willy@infradead.org> writes:
>>
>>> On Sun, May 03, 2020 at 09:43:41AM +0100, Nikolaus Rath wrote:
>>>> Here's what I got:
>>>> 
>>>> [  221.277260] page:ffffec4bbd639880 refcount:1 mapcount:0 mapping:0000000000000000 index:0xd9
>>>> [  221.277265] flags: 0x17ffffc0000097(locked|waiters|referenced|uptodate|lru)
>>>> [  221.277269] raw: 0017ffffc0000097 ffffec4bbd62f048 ffffec4bbd619308 0000000000000000
>>>> [  221.277271] raw: 00000000000000d9 0000000000000000 00000001ffffffff ffff9aec11beb000
>>>> [  221.277272] page dumped because: fuse: trying to steal weird page
>>>> [  221.277273] page->mem_cgroup:ffff9aec11beb000
>>>
>>> Great!  Here's the condition:
>>>
>>>         if (page_mapcount(page) ||
>>>             page->mapping != NULL ||
>>>             page_count(page) != 1 ||
>>>             (page->flags & PAGE_FLAGS_CHECK_AT_PREP &
>>>              ~(1 << PG_locked |
>>>                1 << PG_referenced |
>>>                1 << PG_uptodate |
>>>                1 << PG_lru |
>>>                1 << PG_active |
>>>                1 << PG_reclaim))) {
>>>
>>> mapcount is 0, mapping is NULL, refcount is 1, so that's all fine.
>>> flags has 'waiters' set, which is not in the allowed list.  I don't
>>> know the internals of FUSE, so I don't know why that is.
>>>
>>
>> On the first message, Nikolaus sent the following line:
>>
>>>> [ 2333.009937] fuse: page=00000000dd1750e3 index=2022240 flags=17ffffc0000097, count=1,
>>>> mapcount=0, mapping=00000000125079ad
>>
>> It should be noted that on the second run, where we got the dump_page
>> log, it indeed had a null mapping, which is similar to what Nikolaus
>> asked on the previous thread he linked to, but looks like this wasn't
>> the case on at least some of the reproductions of the issue.  On the
>> line above, the condition that triggered the warning was page->mapping
>> != NULL.  I don't know what to do with this information, though.
> 
> Indeed, that's curious. I've modified the patch slightly to print both
> the old and the new message to confirm. And indeed:
> 
> [  260.882873] fuse: trying to steal weird page
> [  260.882879] fuse:   page=00000000813e7570 index=2010048 flags=17ffffc0000097, count=1, mapcount=0, mapping=0000000094844a11
fuse_check_page() is using %p for these, so they are hashed (IIRC that means the
upper part is zeroed and the lower hashed)
> [  260.882882] page:ffffe13431bcc000 refcount:1 mapcount:0 mapping:0000000000000000 index:0x1eabc0
__dump_page() in 5.6 uses %px so they are unmodified. Thus it's really a NULL
pointer.
For extra fun, __dump_page() in 5.7 will also print page pointer unmodified, but
mapping will become hashed too.
Yeah it would be nice if NULLish values were treated specially, as
0000000094844a11 instead of NULL is really misleading. __dump_page() is fine
thanks to the raw dump, but other places perhaps not.
^ permalink raw reply	[flat|nested] 26+ messages in thread
 
 
- * Re: [fuse-devel] fuse: trying to steal weird page
  2020-05-03 10:27       ` Matthew Wilcox
  2020-05-03 18:28         ` Gabriel Krisman Bertazi
@ 2020-05-03 21:34         ` Hugh Dickins
  2020-05-18 12:45         ` Miklos Szeredi
  2 siblings, 0 replies; 26+ messages in thread
From: Hugh Dickins @ 2020-05-03 21:34 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-fsdevel, fuse-devel, linux-mm, miklos,
	Gabriel Krisman Bertazi, Andre Almeida
On Sun, 3 May 2020, Matthew Wilcox wrote:
> On Sun, May 03, 2020 at 09:43:41AM +0100, Nikolaus Rath wrote:
> > Here's what I got:
> > 
> > [  221.277260] page:ffffec4bbd639880 refcount:1 mapcount:0 mapping:0000000000000000 index:0xd9
> > [  221.277265] flags: 0x17ffffc0000097(locked|waiters|referenced|uptodate|lru)
> > [  221.277269] raw: 0017ffffc0000097 ffffec4bbd62f048 ffffec4bbd619308 0000000000000000
> > [  221.277271] raw: 00000000000000d9 0000000000000000 00000001ffffffff ffff9aec11beb000
> > [  221.277272] page dumped because: fuse: trying to steal weird page
> > [  221.277273] page->mem_cgroup:ffff9aec11beb000
> 
> Great!  Here's the condition:
> 
>         if (page_mapcount(page) ||
>             page->mapping != NULL ||
>             page_count(page) != 1 ||
>             (page->flags & PAGE_FLAGS_CHECK_AT_PREP &
>              ~(1 << PG_locked |
>                1 << PG_referenced |
>                1 << PG_uptodate |
>                1 << PG_lru |
>                1 << PG_active |
>                1 << PG_reclaim))) {
> 
> mapcount is 0, mapping is NULL, refcount is 1, so that's all fine.
> flags has 'waiters' set, which is not in the allowed list.  I don't
> know the internals of FUSE, so I don't know why that is.
That list of PG_flags dates back to 2010: which 2016's 62906027091f
("mm: add PageWaiters indicating tasks are waiting for a page bit")
ought to have updated.  Though it's understandable that it did not:
surprising to find a list of PG_flags outside of mm/ and fs/proc/.
Just add PG_waiters to the list and the issue should go away.
> 
> Also, page_count() is unstable.
Agreed: fine to back out if page_count() is high,
but not good to issue a worrying warning about it.
> Unless there has been an RCU grace period
> between when the page was freed and now, a speculative reference may exist
> from the page cache.  So I would say this is a bad thing to check for.
> 
> Thanks for the swift provision of the debugging data!
^ permalink raw reply	[flat|nested] 26+ messages in thread
- * Re: [fuse-devel] fuse: trying to steal weird page
  2020-05-03 10:27       ` Matthew Wilcox
  2020-05-03 18:28         ` Gabriel Krisman Bertazi
  2020-05-03 21:34         ` Hugh Dickins
@ 2020-05-18 12:45         ` Miklos Szeredi
  2020-05-18 14:48           ` Matthew Wilcox
  2 siblings, 1 reply; 26+ messages in thread
From: Miklos Szeredi @ 2020-05-18 12:45 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-fsdevel, fuse-devel, linux-mm, miklos,
	Gabriel Krisman Bertazi, André Almeida
On Sun, May 3, 2020 at 12:27 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Sun, May 03, 2020 at 09:43:41AM +0100, Nikolaus Rath wrote:
> > Here's what I got:
> >
> > [  221.277260] page:ffffec4bbd639880 refcount:1 mapcount:0 mapping:0000000000000000 index:0xd9
> > [  221.277265] flags: 0x17ffffc0000097(locked|waiters|referenced|uptodate|lru)
> > [  221.277269] raw: 0017ffffc0000097 ffffec4bbd62f048 ffffec4bbd619308 0000000000000000
> > [  221.277271] raw: 00000000000000d9 0000000000000000 00000001ffffffff ffff9aec11beb000
> > [  221.277272] page dumped because: fuse: trying to steal weird page
> > [  221.277273] page->mem_cgroup:ffff9aec11beb000
>
> Great!  Here's the condition:
>
>         if (page_mapcount(page) ||
>             page->mapping != NULL ||
>             page_count(page) != 1 ||
>             (page->flags & PAGE_FLAGS_CHECK_AT_PREP &
>              ~(1 << PG_locked |
>                1 << PG_referenced |
>                1 << PG_uptodate |
>                1 << PG_lru |
>                1 << PG_active |
>                1 << PG_reclaim))) {
>
> mapcount is 0, mapping is NULL, refcount is 1, so that's all fine.
> flags has 'waiters' set, which is not in the allowed list.  I don't
> know the internals of FUSE, so I don't know why that is.
>
> Also, page_count() is unstable.  Unless there has been an RCU grace period
> between when the page was freed and now, a speculative reference may exist
> from the page cache.  So I would say this is a bad thing to check for.
page_cache_pipe_buf_steal() calls remove_mapping() which calls
page_ref_unfreeze(page, 1).  That sets the refcount to 1, right?
What am I missing?
Thanks,
Miklos
^ permalink raw reply	[flat|nested] 26+ messages in thread
- * Re: [fuse-devel] fuse: trying to steal weird page
  2020-05-18 12:45         ` Miklos Szeredi
@ 2020-05-18 14:48           ` Matthew Wilcox
  2020-05-18 14:58             ` Miklos Szeredi
  0 siblings, 1 reply; 26+ messages in thread
From: Matthew Wilcox @ 2020-05-18 14:48 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-fsdevel, fuse-devel, linux-mm, miklos,
	Gabriel Krisman Bertazi, André Almeida
On Mon, May 18, 2020 at 02:45:02PM +0200, Miklos Szeredi wrote:
> On Sun, May 3, 2020 at 12:27 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Sun, May 03, 2020 at 09:43:41AM +0100, Nikolaus Rath wrote:
> > > Here's what I got:
> > >
> > > [  221.277260] page:ffffec4bbd639880 refcount:1 mapcount:0 mapping:0000000000000000 index:0xd9
> > > [  221.277265] flags: 0x17ffffc0000097(locked|waiters|referenced|uptodate|lru)
> > > [  221.277269] raw: 0017ffffc0000097 ffffec4bbd62f048 ffffec4bbd619308 0000000000000000
> > > [  221.277271] raw: 00000000000000d9 0000000000000000 00000001ffffffff ffff9aec11beb000
> > > [  221.277272] page dumped because: fuse: trying to steal weird page
> > > [  221.277273] page->mem_cgroup:ffff9aec11beb000
> >
> > Great!  Here's the condition:
> >
> >         if (page_mapcount(page) ||
> >             page->mapping != NULL ||
> >             page_count(page) != 1 ||
> >             (page->flags & PAGE_FLAGS_CHECK_AT_PREP &
> >              ~(1 << PG_locked |
> >                1 << PG_referenced |
> >                1 << PG_uptodate |
> >                1 << PG_lru |
> >                1 << PG_active |
> >                1 << PG_reclaim))) {
> >
> > mapcount is 0, mapping is NULL, refcount is 1, so that's all fine.
> > flags has 'waiters' set, which is not in the allowed list.  I don't
> > know the internals of FUSE, so I don't know why that is.
> >
> > Also, page_count() is unstable.  Unless there has been an RCU grace period
> > between when the page was freed and now, a speculative reference may exist
> > from the page cache.  So I would say this is a bad thing to check for.
> 
> page_cache_pipe_buf_steal() calls remove_mapping() which calls
> page_ref_unfreeze(page, 1).  That sets the refcount to 1, right?
> 
> What am I missing?
find_get_entry() calling page_cache_get_speculative().
In a previous allocation, this page belonged to the page cache.  Then it
was freed, but another thread is in the middle of a page cache lookup and
has already loaded the pointer.  It is now delayed by a few clock ticks.
Now the page is allocated to FUSE, which calls page_ref_unfreeze().
And then the refcount gets bumped to 2 by page_cache_get_speculative().
find_get_entry() calls xas_reload() and discovers this page is no longer
at that index, so it calls put_page(), but in that narrow window, FUSE
checks the refcount and finds it's not 1.
Monumentally unlikely, of course, so you've probably never seen it,
but multiply by the hundreds of millions of devices running Linux,
and somebody will hit it someday.
^ permalink raw reply	[flat|nested] 26+ messages in thread
- * Re: [fuse-devel] fuse: trying to steal weird page
  2020-05-18 14:48           ` Matthew Wilcox
@ 2020-05-18 14:58             ` Miklos Szeredi
  2020-05-18 15:26               ` Matthew Wilcox
  0 siblings, 1 reply; 26+ messages in thread
From: Miklos Szeredi @ 2020-05-18 14:58 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-fsdevel, fuse-devel, linux-mm, miklos,
	Gabriel Krisman Bertazi, André Almeida
On Mon, May 18, 2020 at 4:48 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, May 18, 2020 at 02:45:02PM +0200, Miklos Szeredi wrote:
> > page_cache_pipe_buf_steal() calls remove_mapping() which calls
> > page_ref_unfreeze(page, 1).  That sets the refcount to 1, right?
> >
> > What am I missing?
>
> find_get_entry() calling page_cache_get_speculative().
>
> In a previous allocation, this page belonged to the page cache.  Then it
> was freed, but another thread is in the middle of a page cache lookup and
> has already loaded the pointer.  It is now delayed by a few clock ticks.
>
> Now the page is allocated to FUSE, which calls page_ref_unfreeze().
> And then the refcount gets bumped to 2 by page_cache_get_speculative().
> find_get_entry() calls xas_reload() and discovers this page is no longer
> at that index, so it calls put_page(), but in that narrow window, FUSE
> checks the refcount and finds it's not 1.
What if that page_cache_get_speculative() happens just before
page_ref_unfreeze()?  The speculative reference would be lost and on
put_page() the page would be freed, even though fuse is still holding
the pointer.
Thanks,
Miklos
^ permalink raw reply	[flat|nested] 26+ messages in thread 
- * Re: [fuse-devel] fuse: trying to steal weird page
  2020-05-18 14:58             ` Miklos Szeredi
@ 2020-05-18 15:26               ` Matthew Wilcox
  0 siblings, 0 replies; 26+ messages in thread
From: Matthew Wilcox @ 2020-05-18 15:26 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-fsdevel, fuse-devel, linux-mm, miklos,
	Gabriel Krisman Bertazi, André Almeida
On Mon, May 18, 2020 at 04:58:05PM +0200, Miklos Szeredi wrote:
> On Mon, May 18, 2020 at 4:48 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Mon, May 18, 2020 at 02:45:02PM +0200, Miklos Szeredi wrote:
> 
> > > page_cache_pipe_buf_steal() calls remove_mapping() which calls
> > > page_ref_unfreeze(page, 1).  That sets the refcount to 1, right?
> > >
> > > What am I missing?
> >
> > find_get_entry() calling page_cache_get_speculative().
> >
> > In a previous allocation, this page belonged to the page cache.  Then it
> > was freed, but another thread is in the middle of a page cache lookup and
> > has already loaded the pointer.  It is now delayed by a few clock ticks.
> >
> > Now the page is allocated to FUSE, which calls page_ref_unfreeze().
> > And then the refcount gets bumped to 2 by page_cache_get_speculative().
> > find_get_entry() calls xas_reload() and discovers this page is no longer
> > at that index, so it calls put_page(), but in that narrow window, FUSE
> > checks the refcount and finds it's not 1.
> 
> What if that page_cache_get_speculative() happens just before
> page_ref_unfreeze()?  The speculative reference would be lost and on
> put_page() the page would be freed, even though fuse is still holding
> the pointer.
static inline void page_ref_unfreeze(struct page *page, int count)
{
        VM_BUG_ON_PAGE(page_count(page) != 0, page);
        VM_BUG_ON(count == 0);
        atomic_set_release(&page->_refcount, count);
static inline int __page_cache_add_speculative(struct page *page, int count)
{
        if (unlikely(!page_ref_add_unless(page, count, 0))) {
                return 0;
^ permalink raw reply	[flat|nested] 26+ messages in thread
 
 
 
 
 
 
 
* fuse: trying to steal weird page
@ 2018-12-26 21:43 Nikolaus Rath
  2019-01-07  8:28 ` [fuse-devel] " Miklos Szeredi
  0 siblings, 1 reply; 26+ messages in thread
From: Nikolaus Rath @ 2018-12-26 21:43 UTC (permalink / raw)
  To: fuse-devel, Miklos Szeredi, linux-fsdevel
Hi,
I am seeing relatively regular occurences of
$ sudo dmesg | tail
[21929.138815] fuse: trying to steal weird page
[21929.138821]   page=00000000a7dd2617 index=64 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
[21930.647338] fuse: trying to steal weird page
[21930.647345]   page=00000000a07f32af index=2848 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
[21932.338873] fuse: trying to steal weird page
[21932.338879]   page=0000000067e3a012 index=64 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
[21933.930703] fuse: trying to steal weird page
[21933.930710]   page=00000000046feb25 index=845 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
[21936.163174] fuse: trying to steal weird page
[21936.163180]   page=00000000fb80fe27 index=0 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
They do not seem to correlate with userspace errors.
(I am debugging a mysterious EBADF error that I get from an nftw() walk
and that disappears when running under strace(), but it its occurence
does not correlate).
What are the implications of the above kernel message? Is there a way to
provide more debugging information?
$ uname -a
Linux vostro.rath.org 4.18.0-0.bpo.1-amd64 #1 SMP Debian 4.18.6-1~bpo9+1 (2018-09-13) x86_64 GNU/Linux
Best,
-Nikolaus
-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
             »Time flies like an arrow, fruit flies like a Banana.«
^ permalink raw reply	[flat|nested] 26+ messages in thread- * Re: [fuse-devel] fuse: trying to steal weird page
  2018-12-26 21:43 Nikolaus Rath
@ 2019-01-07  8:28 ` Miklos Szeredi
  2019-01-07 21:05   ` Nikolaus Rath
  0 siblings, 1 reply; 26+ messages in thread
From: Miklos Szeredi @ 2019-01-07  8:28 UTC (permalink / raw)
  To: fuse-devel, Miklos Szeredi, linux-fsdevel
On Wed, Dec 26, 2018 at 10:44 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> Hi,
>
> I am seeing relatively regular occurences of
>
> $ sudo dmesg | tail
> [21929.138815] fuse: trying to steal weird page
> [21929.138821]   page=00000000a7dd2617 index=64 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
> [21930.647338] fuse: trying to steal weird page
> [21930.647345]   page=00000000a07f32af index=2848 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
> [21932.338873] fuse: trying to steal weird page
> [21932.338879]   page=0000000067e3a012 index=64 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
> [21933.930703] fuse: trying to steal weird page
> [21933.930710]   page=00000000046feb25 index=845 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
> [21936.163174] fuse: trying to steal weird page
> [21936.163180]   page=00000000fb80fe27 index=0 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
The page has the PG_dity and PG_waiters flags set which are
incompatible with stealing.  page_cache_pipe_buf_steal() does
apparently filter out dirty ones, so it's not a regular file that we
are trying to streal the page from.  So the question is: what is the
source of the splice()?
Thanks,
Miklos
>
> They do not seem to correlate with userspace errors.
>
> (I am debugging a mysterious EBADF error that I get from an nftw() walk
> and that disappears when running under strace(), but it its occurence
> does not correlate).
>
>
> What are the implications of the above kernel message? Is there a way to
> provide more debugging information?
>
> $ uname -a
> Linux vostro.rath.org 4.18.0-0.bpo.1-amd64 #1 SMP Debian 4.18.6-1~bpo9+1 (2018-09-13) x86_64 GNU/Linux
>
> Best,
> -Nikolaus
>
> --
> GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
>
>              »Time flies like an arrow, fruit flies like a Banana.«
>
>
> --
> fuse-devel mailing list
> To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel
^ permalink raw reply	[flat|nested] 26+ messages in thread 
- * Re: [fuse-devel] fuse: trying to steal weird page
  2019-01-07  8:28 ` [fuse-devel] " Miklos Szeredi
@ 2019-01-07 21:05   ` Nikolaus Rath
  2019-01-08  8:27     ` Miklos Szeredi
  0 siblings, 1 reply; 26+ messages in thread
From: Nikolaus Rath @ 2019-01-07 21:05 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: fuse-devel, Miklos Szeredi, linux-fsdevel
On Jan 07 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Dec 26, 2018 at 10:44 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> Hi,
>>
>> I am seeing relatively regular occurences of
>>
>> $ sudo dmesg | tail
>> [21929.138815] fuse: trying to steal weird page
>> [21929.138821] page=00000000a7dd2617 index=64 flags=17fffc0000000ad,
>> count=1, mapcount=0, mapping= (null)
>> [21930.647338] fuse: trying to steal weird page
>> [21930.647345] page=00000000a07f32af index=2848
>> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>> [21932.338873] fuse: trying to steal weird page
>> [21932.338879] page=0000000067e3a012 index=64 flags=17fffc0000000ad,
>> count=1, mapcount=0, mapping= (null)
>> [21933.930703] fuse: trying to steal weird page
>> [21933.930710] page=00000000046feb25 index=845
>> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>> [21936.163174] fuse: trying to steal weird page
>> [21936.163180] page=00000000fb80fe27 index=0 flags=17fffc0000000ad,
>> count=1, mapcount=0, mapping= (null)
>
> The page has the PG_dity and PG_waiters flags set which are
> incompatible with stealing.  page_cache_pipe_buf_steal() does
> apparently filter out dirty ones, so it's not a regular file that we
> are trying to streal the page from.  So the question is: what is the
> source of the splice()?
 
Hmm. I think it has to be a regular file. But as I mentioned in my other
email, I did have a race condition where fd's were closed
incorrectly. Is it possible that this also triggered the above,
i.e. that the fd was closed sometime during splice?
Best,
-Nikolaus
-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
             »Time flies like an arrow, fruit flies like a Banana.«
^ permalink raw reply	[flat|nested] 26+ messages in thread
- * Re: [fuse-devel] fuse: trying to steal weird page
  2019-01-07 21:05   ` Nikolaus Rath
@ 2019-01-08  8:27     ` Miklos Szeredi
  2019-01-08 10:35       ` Nikolaus Rath
  0 siblings, 1 reply; 26+ messages in thread
From: Miklos Szeredi @ 2019-01-08  8:27 UTC (permalink / raw)
  To: fuse-devel; +Cc: Miklos Szeredi, linux-fsdevel
On Mon, Jan 7, 2019 at 10:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> On Jan 07 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > On Wed, Dec 26, 2018 at 10:44 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> >>
> >> Hi,
> >>
> >> I am seeing relatively regular occurences of
> >>
> >> $ sudo dmesg | tail
> >> [21929.138815] fuse: trying to steal weird page
> >> [21929.138821] page=00000000a7dd2617 index=64 flags=17fffc0000000ad,
> >> count=1, mapcount=0, mapping= (null)
> >> [21930.647338] fuse: trying to steal weird page
> >> [21930.647345] page=00000000a07f32af index=2848
> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
> >> [21932.338873] fuse: trying to steal weird page
> >> [21932.338879] page=0000000067e3a012 index=64 flags=17fffc0000000ad,
> >> count=1, mapcount=0, mapping= (null)
> >> [21933.930703] fuse: trying to steal weird page
> >> [21933.930710] page=00000000046feb25 index=845
> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
> >> [21936.163174] fuse: trying to steal weird page
> >> [21936.163180] page=00000000fb80fe27 index=0 flags=17fffc0000000ad,
> >> count=1, mapcount=0, mapping= (null)
> >
> > The page has the PG_dity and PG_waiters flags set which are
> > incompatible with stealing.  page_cache_pipe_buf_steal() does
> > apparently filter out dirty ones, so it's not a regular file that we
> > are trying to streal the page from.  So the question is: what is the
> > source of the splice()?
>
> Hmm. I think it has to be a regular file. But as I mentioned in my other
> email, I did have a race condition where fd's were closed
> incorrectly. Is it possible that this also triggered the above,
> i.e. that the fd was closed sometime during splice?
Close during a syscall that uses the fd is not an issue, because a ref
to the file is acquired.  So the race is between the close() and the
internal fget(); if the close() wins then fget() will fail and the
syscall will return EBADF.  If the fget() wins, then the syscall can
run normally despite the fact that the fd was closed.
Can you tell me what filesystem is the regular file (the one being
spliced into fuse) is on?
It actually has to be a regular file, since AFAIK nothing else has
dirty pages.   It could be using something other than
page_cache_pipe_buf_steal(), or there's some other mechanism that lets
the page be dirtied after being unmapped, though that looks
impossible...
Thanks,
Miklos
^ permalink raw reply	[flat|nested] 26+ messages in thread 
- * Re: [fuse-devel] fuse: trying to steal weird page
  2019-01-08  8:27     ` Miklos Szeredi
@ 2019-01-08 10:35       ` Nikolaus Rath
  2019-01-09  8:07         ` Miklos Szeredi
  0 siblings, 1 reply; 26+ messages in thread
From: Nikolaus Rath @ 2019-01-08 10:35 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: fuse-devel, linux-fsdevel, Miklos Szeredi
On Jan 08 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Mon, Jan 7, 2019 at 10:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> On Jan 07 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> > On Wed, Dec 26, 2018 at 10:44 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>> >>
>> >> Hi,
>> >>
>> >> I am seeing relatively regular occurences of
>> >>
>> >> $ sudo dmesg | tail
>> >> [21929.138815] fuse: trying to steal weird page
>> >> [21929.138821] page=00000000a7dd2617 index=64 flags=17fffc0000000ad,
>> >> count=1, mapcount=0, mapping= (null)
>> >> [21930.647338] fuse: trying to steal weird page
>> >> [21930.647345] page=00000000a07f32af index=2848
>> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>> >> [21932.338873] fuse: trying to steal weird page
>> >> [21932.338879] page=0000000067e3a012 index=64 flags=17fffc0000000ad,
>> >> count=1, mapcount=0, mapping= (null)
>> >> [21933.930703] fuse: trying to steal weird page
>> >> [21933.930710] page=00000000046feb25 index=845
>> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>> >> [21936.163174] fuse: trying to steal weird page
>> >> [21936.163180] page=00000000fb80fe27 index=0 flags=17fffc0000000ad,
>> >> count=1, mapcount=0, mapping= (null)
>> >
>> > The page has the PG_dity and PG_waiters flags set which are
>> > incompatible with stealing.  page_cache_pipe_buf_steal() does
>> > apparently filter out dirty ones, so it's not a regular file that we
>> > are trying to streal the page from.  So the question is: what is the
>> > source of the splice()?
>>
>> Hmm. I think it has to be a regular file. But as I mentioned in my other
>> email, I did have a race condition where fd's were closed
>> incorrectly. Is it possible that this also triggered the above,
>> i.e. that the fd was closed sometime during splice?
>
> Close during a syscall that uses the fd is not an issue, because a ref
> to the file is acquired.  So the race is between the close() and the
> internal fget(); if the close() wins then fget() will fail and the
> syscall will return EBADF.  If the fget() wins, then the syscall can
> run normally despite the fact that the fd was closed.
>
> Can you tell me what filesystem is the regular file (the one being
> spliced into fuse) is on?
It's ext4.
> It actually has to be a regular file, since AFAIK nothing else has
> dirty pages.   It could be using something other than
> page_cache_pipe_buf_steal(), or there's some other mechanism that lets
> the page be dirtied after being unmapped, though that looks
> impossible...
I can't meaningfully comment on that, sorry...
Best,
-Nikolaus
-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
             »Time flies like an arrow, fruit flies like a Banana.«
^ permalink raw reply	[flat|nested] 26+ messages in thread
- * Re: [fuse-devel] fuse: trying to steal weird page
  2019-01-08 10:35       ` Nikolaus Rath
@ 2019-01-09  8:07         ` Miklos Szeredi
  2019-01-11 15:39           ` Nikolaus Rath
  0 siblings, 1 reply; 26+ messages in thread
From: Miklos Szeredi @ 2019-01-09  8:07 UTC (permalink / raw)
  To: fuse-devel; +Cc: linux-fsdevel, Miklos Szeredi
On Tue, Jan 8, 2019 at 11:35 AM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> On Jan 08 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > On Mon, Jan 7, 2019 at 10:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> >>
> >> On Jan 07 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> >> > On Wed, Dec 26, 2018 at 10:44 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> I am seeing relatively regular occurences of
> >> >>
> >> >> $ sudo dmesg | tail
> >> >> [21929.138815] fuse: trying to steal weird page
> >> >> [21929.138821] page=00000000a7dd2617 index=64 flags=17fffc0000000ad,
> >> >> count=1, mapcount=0, mapping= (null)
> >> >> [21930.647338] fuse: trying to steal weird page
> >> >> [21930.647345] page=00000000a07f32af index=2848
> >> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
> >> >> [21932.338873] fuse: trying to steal weird page
> >> >> [21932.338879] page=0000000067e3a012 index=64 flags=17fffc0000000ad,
> >> >> count=1, mapcount=0, mapping= (null)
> >> >> [21933.930703] fuse: trying to steal weird page
> >> >> [21933.930710] page=00000000046feb25 index=845
> >> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
> >> >> [21936.163174] fuse: trying to steal weird page
> >> >> [21936.163180] page=00000000fb80fe27 index=0 flags=17fffc0000000ad,
> >> >> count=1, mapcount=0, mapping= (null)
> >> >
> >> > The page has the PG_dity and PG_waiters flags set which are
> >> > incompatible with stealing.  page_cache_pipe_buf_steal() does
> >> > apparently filter out dirty ones, so it's not a regular file that we
> >> > are trying to streal the page from.  So the question is: what is the
> >> > source of the splice()?
> >>
> >> Hmm. I think it has to be a regular file. But as I mentioned in my other
> >> email, I did have a race condition where fd's were closed
> >> incorrectly. Is it possible that this also triggered the above,
> >> i.e. that the fd was closed sometime during splice?
> >
> > Close during a syscall that uses the fd is not an issue, because a ref
> > to the file is acquired.  So the race is between the close() and the
> > internal fget(); if the close() wins then fget() will fail and the
> > syscall will return EBADF.  If the fget() wins, then the syscall can
> > run normally despite the fact that the fd was closed.
> >
> > Can you tell me what filesystem is the regular file (the one being
> > spliced into fuse) is on?
>
> It's ext4.
Next question: is file opened with O_DIRECT or is filesystem mounted
with DAX, or anything fancy?
Thanks,
Miklos
^ permalink raw reply	[flat|nested] 26+ messages in thread 
- * Re: fuse: trying to steal weird page
@ 2019-01-11 15:39           ` Nikolaus Rath
  2019-01-11 15:39             ` [fuse-devel] " Nikolaus Rath
  2019-02-10 22:05             ` Nikolaus Rath
  0 siblings, 2 replies; 26+ messages in thread
From: Nikolaus Rath @ 2019-01-11 15:39 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: fuse-devel, linux-fsdevel, Miklos Szeredi
On Jan 09 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Tue, Jan 8, 2019 at 11:35 AM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> On Jan 08 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> > On Mon, Jan 7, 2019 at 10:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>> >>
>> >> On Jan 07 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> >> > On Wed, Dec 26, 2018 at 10:44 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> I am seeing relatively regular occurences of
>> >> >>
>> >> >> $ sudo dmesg | tail
>> >> >> [21929.138815] fuse: trying to steal weird page
>> >> >> [21929.138821] page=00000000a7dd2617 index=64 flags=17fffc0000000ad,
>> >> >> count=1, mapcount=0, mapping= (null)
>> >> >> [21930.647338] fuse: trying to steal weird page
>> >> >> [21930.647345] page=00000000a07f32af index=2848
>> >> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>> >> >> [21932.338873] fuse: trying to steal weird page
>> >> >> [21932.338879] page=0000000067e3a012 index=64 flags=17fffc0000000ad,
>> >> >> count=1, mapcount=0, mapping= (null)
>> >> >> [21933.930703] fuse: trying to steal weird page
>> >> >> [21933.930710] page=00000000046feb25 index=845
>> >> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>> >> >> [21936.163174] fuse: trying to steal weird page
>> >> >> [21936.163180] page=00000000fb80fe27 index=0 flags=17fffc0000000ad,
>> >> >> count=1, mapcount=0, mapping= (null)
>> >> >
>> >> > The page has the PG_dity and PG_waiters flags set which are
>> >> > incompatible with stealing.  page_cache_pipe_buf_steal() does
>> >> > apparently filter out dirty ones, so it's not a regular file that we
>> >> > are trying to streal the page from.  So the question is: what is the
>> >> > source of the splice()?
>> >>
>> >> Hmm. I think it has to be a regular file. But as I mentioned in my other
>> >> email, I did have a race condition where fd's were closed
>> >> incorrectly. Is it possible that this also triggered the above,
>> >> i.e. that the fd was closed sometime during splice?
>> >
>> > Close during a syscall that uses the fd is not an issue, because a ref
>> > to the file is acquired.  So the race is between the close() and the
>> > internal fget(); if the close() wins then fget() will fail and the
>> > syscall will return EBADF.  If the fget() wins, then the syscall can
>> > run normally despite the fact that the fd was closed.
>> >
>> > Can you tell me what filesystem is the regular file (the one being
>> > spliced into fuse) is on?
>>
>> It's ext4.
>
> Next question: is file opened with O_DIRECT or is filesystem mounted
> with DAX, or anything fancy?
Neither. But thinking about this, I guess that (because of the race) the
fd could have been closed and re-opened before the ref was acquired. So
it may have turned into a directory fd.
To be honest, I don't think it's worth investigating this unless I see
it happen again now that the race in my code is fixed.
Best,
-Nikolaus
-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
             »Time flies like an arrow, fruit flies like a Banana.«
-- 
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel
^ permalink raw reply	[flat|nested] 26+ messages in thread
- * Re: [fuse-devel] fuse: trying to steal weird page
  2019-01-11 15:39           ` Nikolaus Rath
@ 2019-01-11 15:39             ` Nikolaus Rath
  2019-02-10 22:05             ` Nikolaus Rath
  1 sibling, 0 replies; 26+ messages in thread
From: Nikolaus Rath @ 2019-01-11 15:39 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: fuse-devel, linux-fsdevel, Miklos Szeredi
On Jan 09 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Tue, Jan 8, 2019 at 11:35 AM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> On Jan 08 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> > On Mon, Jan 7, 2019 at 10:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>> >>
>> >> On Jan 07 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> >> > On Wed, Dec 26, 2018 at 10:44 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> I am seeing relatively regular occurences of
>> >> >>
>> >> >> $ sudo dmesg | tail
>> >> >> [21929.138815] fuse: trying to steal weird page
>> >> >> [21929.138821] page=00000000a7dd2617 index=64 flags=17fffc0000000ad,
>> >> >> count=1, mapcount=0, mapping= (null)
>> >> >> [21930.647338] fuse: trying to steal weird page
>> >> >> [21930.647345] page=00000000a07f32af index=2848
>> >> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>> >> >> [21932.338873] fuse: trying to steal weird page
>> >> >> [21932.338879] page=0000000067e3a012 index=64 flags=17fffc0000000ad,
>> >> >> count=1, mapcount=0, mapping= (null)
>> >> >> [21933.930703] fuse: trying to steal weird page
>> >> >> [21933.930710] page=00000000046feb25 index=845
>> >> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>> >> >> [21936.163174] fuse: trying to steal weird page
>> >> >> [21936.163180] page=00000000fb80fe27 index=0 flags=17fffc0000000ad,
>> >> >> count=1, mapcount=0, mapping= (null)
>> >> >
>> >> > The page has the PG_dity and PG_waiters flags set which are
>> >> > incompatible with stealing.  page_cache_pipe_buf_steal() does
>> >> > apparently filter out dirty ones, so it's not a regular file that we
>> >> > are trying to streal the page from.  So the question is: what is the
>> >> > source of the splice()?
>> >>
>> >> Hmm. I think it has to be a regular file. But as I mentioned in my other
>> >> email, I did have a race condition where fd's were closed
>> >> incorrectly. Is it possible that this also triggered the above,
>> >> i.e. that the fd was closed sometime during splice?
>> >
>> > Close during a syscall that uses the fd is not an issue, because a ref
>> > to the file is acquired.  So the race is between the close() and the
>> > internal fget(); if the close() wins then fget() will fail and the
>> > syscall will return EBADF.  If the fget() wins, then the syscall can
>> > run normally despite the fact that the fd was closed.
>> >
>> > Can you tell me what filesystem is the regular file (the one being
>> > spliced into fuse) is on?
>>
>> It's ext4.
>
> Next question: is file opened with O_DIRECT or is filesystem mounted
> with DAX, or anything fancy?
Neither. But thinking about this, I guess that (because of the race) the
fd could have been closed and re-opened before the ref was acquired. So
it may have turned into a directory fd.
To be honest, I don't think it's worth investigating this unless I see
it happen again now that the race in my code is fixed.
Best,
-Nikolaus
-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
             »Time flies like an arrow, fruit flies like a Banana.«
^ permalink raw reply	[flat|nested] 26+ messages in thread
- * Re: [fuse-devel] fuse: trying to steal weird page
  2019-01-11 15:39           ` Nikolaus Rath
  2019-01-11 15:39             ` [fuse-devel] " Nikolaus Rath
@ 2019-02-10 22:05             ` Nikolaus Rath
  2019-02-12 14:57               ` Miklos Szeredi
  1 sibling, 1 reply; 26+ messages in thread
From: Nikolaus Rath @ 2019-02-10 22:05 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: fuse-devel, linux-fsdevel
On Jan 11 2019, Nikolaus Rath <Nikolaus@rath.org> wrote:
> On Jan 09 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Tue, Jan 8, 2019 at 11:35 AM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>>
>>> On Jan 08 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> > On Mon, Jan 7, 2019 at 10:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>> >>
>>> >> On Jan 07 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> >> > On Wed, Dec 26, 2018 at 10:44 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>> >> >>
>>> >> >> Hi,
>>> >> >>
>>> >> >> I am seeing relatively regular occurences of
>>> >> >>
>>> >> >> $ sudo dmesg | tail
>>> >> >> [21929.138815] fuse: trying to steal weird page
>>> >> >> [21929.138821] page=00000000a7dd2617 index=64 flags=17fffc0000000ad,
>>> >> >> count=1, mapcount=0, mapping= (null)
>>> >> >> [21930.647338] fuse: trying to steal weird page
>>> >> >> [21930.647345] page=00000000a07f32af index=2848
>>> >> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>>> >> >> [21932.338873] fuse: trying to steal weird page
>>> >> >> [21932.338879] page=0000000067e3a012 index=64 flags=17fffc0000000ad,
>>> >> >> count=1, mapcount=0, mapping= (null)
>>> >> >> [21933.930703] fuse: trying to steal weird page
>>> >> >> [21933.930710] page=00000000046feb25 index=845
>>> >> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>>> >> >> [21936.163174] fuse: trying to steal weird page
>>> >> >> [21936.163180] page=00000000fb80fe27 index=0 flags=17fffc0000000ad,
>>> >> >> count=1, mapcount=0, mapping= (null)
>>> >> >
>>> >> > The page has the PG_dity and PG_waiters flags set which are
>>> >> > incompatible with stealing.  page_cache_pipe_buf_steal() does
>>> >> > apparently filter out dirty ones, so it's not a regular file that we
>>> >> > are trying to streal the page from.  So the question is: what is the
>>> >> > source of the splice()?
>>> >>
>>> >> Hmm. I think it has to be a regular file. But as I mentioned in my other
>>> >> email, I did have a race condition where fd's were closed
>>> >> incorrectly. Is it possible that this also triggered the above,
>>> >> i.e. that the fd was closed sometime during splice?
>>> >
>>> > Close during a syscall that uses the fd is not an issue, because a ref
>>> > to the file is acquired.  So the race is between the close() and the
>>> > internal fget(); if the close() wins then fget() will fail and the
>>> > syscall will return EBADF.  If the fget() wins, then the syscall can
>>> > run normally despite the fact that the fd was closed.
>>> >
>>> > Can you tell me what filesystem is the regular file (the one being
>>> > spliced into fuse) is on?
>>>
>>> It's ext4.
>>
>> Next question: is file opened with O_DIRECT or is filesystem mounted
>> with DAX, or anything fancy?
>
> Neither. But thinking about this, I guess that (because of the race) the
> fd could have been closed and re-opened before the ref was acquired. So
> it may have turned into a directory fd.
>
> To be honest, I don't think it's worth investigating this unless I see
> it happen again now that the race in my code is fixed.
Bad news. I can now reliably reproduce the issue again.
I have no clue why it disappeared for a while. This is exactly the same
filesystem code, but I can't rule out that there has been some routine upgrade
in the base system (compiler or kernel).
Any suggestions for debugging this?
As I said before, splice() source is a regular file on ext4, opened
without O_DIRECT or DAX. If I disable FUSE_CAP_SPLICE_WRITE, the error
message no longer occurs.
Best,
-Nikolaus
-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
             »Time flies like an arrow, fruit flies like a Banana.«
^ permalink raw reply	[flat|nested] 26+ messages in thread
- * Re: [fuse-devel] fuse: trying to steal weird page
  2019-02-10 22:05             ` Nikolaus Rath
@ 2019-02-12 14:57               ` Miklos Szeredi
  2019-02-12 21:28                 ` Nikolaus Rath
  0 siblings, 1 reply; 26+ messages in thread
From: Miklos Szeredi @ 2019-02-12 14:57 UTC (permalink / raw)
  To: Miklos Szeredi, fuse-devel, linux-fsdevel
On Sun, Feb 10, 2019 at 11:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> Bad news. I can now reliably reproduce the issue again.
A reliable reproducer is always good news.   Are the messages exactly
the same as last time (value of flags, etc)?
Thanks,
Miklos
^ permalink raw reply	[flat|nested] 26+ messages in thread 
- * Re: [fuse-devel] fuse: trying to steal weird page
  2019-02-12 14:57               ` Miklos Szeredi
@ 2019-02-12 21:28                 ` Nikolaus Rath
  2019-02-25 21:41                   ` Nikolaus Rath
  0 siblings, 1 reply; 26+ messages in thread
From: Nikolaus Rath @ 2019-02-12 21:28 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: fuse-devel, linux-fsdevel
On Feb 12 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Sun, Feb 10, 2019 at 11:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
>> Bad news. I can now reliably reproduce the issue again.
>
> A reliable reproducer is always good news.   Are the messages exactly
> the same as last time (value of flags, etc)?
The flags, count, mapcount and mapping values are always the same. The
page and index is varying. So the general format is:
[ 2562.773181] fuse: trying to steal weird page
[ 2562.773187]   page=<something> index=<something> flags=17ffffc00000ad, count=1, mapcount=0, mapping=          (null)
Best,
-Nikolaus
-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
             »Time flies like an arrow, fruit flies like a Banana.«
^ permalink raw reply	[flat|nested] 26+ messages in thread
- * Re: [fuse-devel] fuse: trying to steal weird page
  2019-02-12 21:28                 ` Nikolaus Rath
@ 2019-02-25 21:41                   ` Nikolaus Rath
  2019-02-26 12:57                     ` Miklos Szeredi
  0 siblings, 1 reply; 26+ messages in thread
From: Nikolaus Rath @ 2019-02-25 21:41 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: fuse-devel, linux-fsdevel
On Feb 12 2019, Nikolaus Rath <Nikolaus@rath.org> wrote:
> On Feb 12 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Sun, Feb 10, 2019 at 11:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>>> Bad news. I can now reliably reproduce the issue again.
>>
>> A reliable reproducer is always good news.   Are the messages exactly
>> the same as last time (value of flags, etc)?
>
> The flags, count, mapcount and mapping values are always the same. The
> page and index is varying. So the general format is:
>
> [ 2562.773181] fuse: trying to steal weird page
> [ 2562.773187] page=<something> index=<something>
> flags=17ffffc00000ad, count=1, mapcount=0, mapping= (null)
Is there anything else I can do to help debugging this?
Best,
-Nikolaus
-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
             »Time flies like an arrow, fruit flies like a Banana.«
^ permalink raw reply	[flat|nested] 26+ messages in thread
- * Re: [fuse-devel] fuse: trying to steal weird page
  2019-02-25 21:41                   ` Nikolaus Rath
@ 2019-02-26 12:57                     ` Miklos Szeredi
  2019-02-26 13:30                       ` Miklos Szeredi
  0 siblings, 1 reply; 26+ messages in thread
From: Miklos Szeredi @ 2019-02-26 12:57 UTC (permalink / raw)
  To: fuse-devel; +Cc: linux-fsdevel
[-- Attachment #1: Type: text/plain, Size: 900 bytes --]
On Mon, Feb 25, 2019 at 10:41 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> On Feb 12 2019, Nikolaus Rath <Nikolaus@rath.org> wrote:
> > On Feb 12 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> >> On Sun, Feb 10, 2019 at 11:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> >>
> >>> Bad news. I can now reliably reproduce the issue again.
> >>
> >> A reliable reproducer is always good news.   Are the messages exactly
> >> the same as last time (value of flags, etc)?
> >
> > The flags, count, mapcount and mapping values are always the same. The
> > page and index is varying. So the general format is:
> >
> > [ 2562.773181] fuse: trying to steal weird page
> > [ 2562.773187] page=<something> index=<something>
> > flags=17ffffc00000ad, count=1, mapcount=0, mapping= (null)
>
> Is there anything else I can do to help debugging this?
Could you please try the attached patch?
Thanks,
Miklos
[-- Attachment #2: test.patch --]
[-- Type: text/x-patch, Size: 451 bytes --]
diff --git a/fs/splice.c b/fs/splice.c
index de2ede048473..07c5c77abc45 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -73,7 +73,7 @@ static int page_cache_pipe_buf_steal(struct pipe_inode_info *pipe,
 		 * If we succeeded in removing the mapping, set LRU flag
 		 * and return good.
 		 */
-		if (remove_mapping(mapping, page)) {
+		if (remove_mapping(mapping, page) && page_count(page) == 1) {
 			buf->flags |= PIPE_BUF_FLAG_LRU;
 			return 0;
 		}
^ permalink raw reply related	[flat|nested] 26+ messages in thread
- * Re: [fuse-devel] fuse: trying to steal weird page
  2019-02-26 12:57                     ` Miklos Szeredi
@ 2019-02-26 13:30                       ` Miklos Szeredi
  0 siblings, 0 replies; 26+ messages in thread
From: Miklos Szeredi @ 2019-02-26 13:30 UTC (permalink / raw)
  To: fuse-devel; +Cc: linux-fsdevel
On Tue, Feb 26, 2019 at 1:57 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Mon, Feb 25, 2019 at 10:41 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> >
> > On Feb 12 2019, Nikolaus Rath <Nikolaus@rath.org> wrote:
> > > On Feb 12 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > >> On Sun, Feb 10, 2019 at 11:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> > >>
> > >>> Bad news. I can now reliably reproduce the issue again.
> > >>
> > >> A reliable reproducer is always good news.   Are the messages exactly
> > >> the same as last time (value of flags, etc)?
> > >
> > > The flags, count, mapcount and mapping values are always the same. The
> > > page and index is varying. So the general format is:
> > >
> > > [ 2562.773181] fuse: trying to steal weird page
> > > [ 2562.773187] page=<something> index=<something>
> > > flags=17ffffc00000ad, count=1, mapcount=0, mapping= (null)
> >
> > Is there anything else I can do to help debugging this?
>
> Could you please try the attached patch?
Looking more, it's very unlikely to help.  remove_mapping() should
already ensure that the page count is 1.
I think this bug report needs to be forwarded to the
<linux-mm@kvack.org> mailing list as this appears to be  a race
somewhere in the memory management subsystem and fuse is only making
it visible due to its sanity checking in the page stealing code.
Thanks,
Miklos
^ permalink raw reply	[flat|nested] 26+ messages in thread 
 
 
 
 
 
 
 
 
 
 
 
end of thread, other threads:[~2020-05-18 15:26 UTC | newest]
Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-05-02 19:09 fuse: trying to steal weird page Nikolaus Rath
2020-05-02 19:52 ` Nikolaus Rath
2020-05-03  3:26   ` Matthew Wilcox
2020-05-03  8:43     ` [fuse-devel] " Nikolaus Rath
2020-05-03 10:27       ` Matthew Wilcox
2020-05-03 18:28         ` Gabriel Krisman Bertazi
2020-05-03 20:06           ` Matthew Wilcox
2020-05-03 20:25           ` Nikolaus Rath
2020-05-06 13:57             ` Vlastimil Babka
2020-05-03 21:34         ` Hugh Dickins
2020-05-18 12:45         ` Miklos Szeredi
2020-05-18 14:48           ` Matthew Wilcox
2020-05-18 14:58             ` Miklos Szeredi
2020-05-18 15:26               ` Matthew Wilcox
  -- strict thread matches above, loose matches on Subject: below --
2018-12-26 21:43 Nikolaus Rath
2019-01-07  8:28 ` [fuse-devel] " Miklos Szeredi
2019-01-07 21:05   ` Nikolaus Rath
2019-01-08  8:27     ` Miklos Szeredi
2019-01-08 10:35       ` Nikolaus Rath
2019-01-09  8:07         ` Miklos Szeredi
2019-01-11 15:39           ` Nikolaus Rath
2019-01-11 15:39             ` [fuse-devel] " Nikolaus Rath
2019-02-10 22:05             ` Nikolaus Rath
2019-02-12 14:57               ` Miklos Szeredi
2019-02-12 21:28                 ` Nikolaus Rath
2019-02-25 21:41                   ` Nikolaus Rath
2019-02-26 12:57                     ` Miklos Szeredi
2019-02-26 13:30                       ` Miklos Szeredi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).