From: David Hildenbrand <david@redhat.com>
To: Ryan Roberts <ryan.roberts@arm.com>,
Yin Fengwei <fengwei.yin@intel.com>,
syzbot <syzbot+50ef73537bbc393a25bb@syzkaller.appspotmail.com>,
akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, syzkaller-bugs@googlegroups.com,
Matthew Wilcox <willy@infradead.org>
Subject: Re: [syzbot] [mm?] WARNING in __folio_rmap_sanity_checks
Date: Fri, 5 Jan 2024 09:56:29 +0100 [thread overview]
Message-ID: <556f8a4f-c739-41e0-85ec-643a0b32a2ce@redhat.com> (raw)
In-Reply-To: <1eb61435-c89c-4ca1-b1b6-aa00b3478cd2@arm.com>
>>>>> If I am not wrong, that triggers:
>>>>>
>>>>> VM_WARN_ON_FOLIO(folio_test_large(folio) &&
>>>>> !folio_test_large_rmappable(folio), folio);
>>>>>
>>>>> So we are trying to rmap a large folio that did not go through
>>>>> folio_prep_large_rmappable().
>
> Would someone mind explaining the rules to me for this? As far as I can see,
> folio_prep_large_rmappable() just inits the _deferred_list and sets a flag so we
> remember to deinit the list on destruction. Why can't we just init that list for
> all folios order-2 or greater? Then everything is rmappable?
I think we much rather want to look into moving all mapcount-related
stuff into folio_prep_large_rmappable(). It doesn't make any sense to
initialize that for any compound pages, especially the ones that will
never get mapped to user space.
>
>>>>>
>>>>> net/packet/af_packet.c calls vm_insert_page() on some pages/folios stoed
>>>>> in the "struct packet_ring_buffer". No idea where that comes from, but I
>>>>> suspect it's simply some compound allocation.
>>>> Looks like:
>>>> alloc_pg_vec
>>>> alloc_one_pg_vec_page
>>>> gfp_t gfp_flags = GFP_KERNEL | __GFP_COMP |
>>>> __GFP_ZERO | __GFP_NOWARN | __GFP_NORETRY;
>>>>
>>>> buffer = (char *) __get_free_pages(gfp_flags, order);
>>>> So you are right here... :).
>>>
>>> Hm, but I wonder if this something that's supposed to work or is this one of
>>> the cases where we should actually use a VM_PFN mapping?
>>>
>>> It's not a pagecache(file/shmem) page after all.
>>>
>>> We could relax that check and document why we expect something that is not
>>> marked rmappable. But it fells wrong. I suspect this should be a VM_PFNMAP
>>> instead (like recent udmabuf changes).
>>
>> VM_PFNMAP looks correct.
>
> And why is making the folio rmappable and mapping it the normal way not the
> right solution here? Because the folio could be order-1? Or something more profound?
>
Think about it: we are adding/removing a page from rmap handling that
can *never* be looked up using the rmap because there is no rmap for
these pages, and folio->index is just completely unexpressive.
VM_MIXEDMAP doesn't have any linearity constraints.
Logically, it doesn't make any sense to involve rmap code although it
currently might work. validate_page_before_insert() blocks off most
pages where the order-0 mapcount would be used for other purposes and
everything would blow up.
Looking at vm_insert_page(), this interface is only for pages the caller
allocated. Maybe we should just not do any rmap accounting when
mapping/unmapping these pages: not involve any rmap code, including
mapcounts?
vm_normal_page() works on these mappings, so we'd also have to skip rmap
code when unmapping these pages etc. Maybe that's the whole reason we
have the rmap handling here: to not special-case the unmap path.
Alternatively, we can:
(1) Require the caller to make sure large folios are rmappable. We
already require allocations to be compound. Should be easy to add.
(2) Allow non-rmappable folios in rmap code just for mapcount tracking.
Confusing but possible.
>>
>> I do have another question: why do we just check the large folio
>> rmappable? Does that mean order0 folio is always rmappable?
>>
We didn't really have a check for that I believe. We simply reject all
pages in vm_insert_page() that are problematic because the pagecount is
overloaded.
>> I ask this because vm_insert_pages() is called in net/ipv4/tcp.c
>> and drivers call vm_insert_page. I suppose they all need be VM_PFNMAP.
Right, similar problem.
--
Cheers,
David / dhildenb
next prev parent reply other threads:[~2024-01-05 8:56 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-03 11:48 [syzbot] [mm?] WARNING in __folio_rmap_sanity_checks syzbot
2024-01-03 12:13 ` David Hildenbrand
2024-01-03 14:16 ` Yin, Fengwei
2024-01-04 21:36 ` David Hildenbrand
2024-01-05 2:20 ` Yin Fengwei
2024-01-05 8:14 ` Ryan Roberts
2024-01-05 8:41 ` Yin Fengwei
2024-01-05 8:56 ` David Hildenbrand [this message]
2024-01-05 10:25 ` Ryan Roberts
2024-01-05 16:05 ` David Hildenbrand
2024-01-05 14:48 ` Yin, Fengwei
2024-01-05 15:45 ` David Hildenbrand
2024-01-06 4:36 ` Yin, Fengwei
2024-01-05 17:04 ` Andrew Morton
2024-01-05 17:16 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=556f8a4f-c739-41e0-85ec-643a0b32a2ce@redhat.com \
--to=david@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=fengwei.yin@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ryan.roberts@arm.com \
--cc=syzbot+50ef73537bbc393a25bb@syzkaller.appspotmail.com \
--cc=syzkaller-bugs@googlegroups.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox