From: Roman Gushchin <roman.gushchin@linux.dev>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org, Hugh Dickins <hughd@google.com>,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH] mm: page_alloc: move mlocked flag clearance into free_pages_prepare()
Date: Mon, 21 Oct 2024 17:17:53 +0000
Message-ID: <ZxaMwfShUXDzQMwQ@google.com>
In-Reply-To: <c5cd0ad5-9d9d-4df3-ab20-c5de2a380894@suse.cz>

On Mon, Oct 21, 2024 at 07:01:59PM +0200, Vlastimil Babka wrote:
> On 10/21/24 18:48, Roman Gushchin wrote:
> > Syzbot reported [1] a bad page state problem caused by a page
> > being freed using free_page() while still having the mlocked flag
> > set at the free_pages_prepare() stage:
> > 
> >   BUG: Bad page state in process syz.0.15  pfn:1137bb
> >   page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff8881137bb870 pfn:0x1137bb
> >   flags: 0x400000000080000(mlocked|node=0|zone=1)
> >   raw: 0400000000080000 0000000000000000 dead000000000122 0000000000000000
> >   raw: ffff8881137bb870 0000000000000000 00000000ffffffff 0000000000000000
> >   page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> >   page_owner tracks the page as allocated
> >   page last allocated via order 0, migratetype Unmovable, gfp_mask
> >   0x400dc0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), pid 3005, tgid
> >   3004 (syz.0.15), ts 61546  608067, free_ts 61390082085
> >    set_page_owner include/linux/page_owner.h:32 [inline]
> >    post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1537
> >    prep_new_page mm/page_alloc.c:1545 [inline]
> >    get_page_from_freelist+0x3008/0x31f0 mm/page_alloc.c:3457
> >    __alloc_pages_noprof+0x292/0x7b0 mm/page_alloc.c:4733
> >    alloc_pages_mpol_noprof+0x3e8/0x630 mm/mempolicy.c:2265
> >    kvm_coalesced_mmio_init+0x1f/0xf0 virt/kvm/coalesced_mmio.c:99
> >    kvm_create_vm virt/kvm/kvm_main.c:1235 [inline]
> >    kvm_dev_ioctl_create_vm virt/kvm/kvm_main.c:5500 [inline]
> >    kvm_dev_ioctl+0x13bb/0x2320 virt/kvm/kvm_main.c:5542
> >    vfs_ioctl fs/ioctl.c:51 [inline]
> >    __do_sys_ioctl fs/ioctl.c:907 [inline]
> >    __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
> >    do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> >    do_syscall_64+0x69/0x110 arch/x86/entry/common.c:83
> >    entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >   page last free pid 951 tgid 951 stack trace:
> >    reset_page_owner include/linux/page_owner.h:25 [inline]
> >    free_pages_prepare mm/page_alloc.c:1108 [inline]
> >    free_unref_page+0xcb1/0xf00 mm/page_alloc.c:2638
> >    vfree+0x181/0x2e0 mm/vmalloc.c:3361
> >    delayed_vfree_work+0x56/0x80 mm/vmalloc.c:3282
> >    process_one_work kernel/workqueue.c:3229 [inline]
> >    process_scheduled_works+0xa5c/0x17a0 kernel/workqueue.c:3310
> >    worker_thread+0xa2b/0xf70 kernel/workqueue.c:3391
> >    kthread+0x2df/0x370 kernel/kthread.c:389
> >    ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> >    ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> > 
> > The problem was originally introduced by
> > commit b109b87050df ("mm/munlock: replace clear_page_mlock() by final
> > clearance"): it was focused on handling pagecache
> > and anonymous memory and wasn't suitable for lower level
> > get_page()/free_page() APIs used for example by KVM, as with
> > this reproducer.
> 
> Does that mean KVM is mlocking pages that are not pagecache nor anonymous,
> thus not LRU? How and why (and since when) is that done?

KVM lets userspace mmap and mlock several pages that it allocates
directly (the allocation trace above points at kvm_coalesced_mmio_init()).
Please take a look at the reproducer:
https://syzkaller.appspot.com/x/repro.c?x=1437939f980000

> 
> > Fix it by moving the mlocked flag clearance down to
> > free_pages_prepare().
> > 
> > The bug itself is fairly old and harmless (aside from generating these
> > warnings), so the stable backport is likely not justified.
> 
> But since there's a Cc: stable below, it will be backported :)

My bad, I changed my mind at the last minute and added Cc: stable but
forgot to drop this sentence.

> 
> > Closes: https://syzkaller.appspot.com/x/report.txt?x=169a47d0580000
> > Fixes: b109b87050df ("mm/munlock: replace clear_page_mlock() by final clearance")
> > Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
> > Cc: <stable@vger.kernel.org>
> > Cc: Hugh Dickins <hughd@google.com>
> > Cc: Matthew Wilcox <willy@infradead.org>
> > ---
> >  mm/page_alloc.c |  9 +++++++++
> >  mm/swap.c       | 14 --------------
> >  2 files changed, 9 insertions(+), 14 deletions(-)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index bc55d39eb372..24200651ad92 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1044,6 +1044,7 @@ __always_inline bool free_pages_prepare(struct page *page,
> >  	bool skip_kasan_poison = should_skip_kasan_poison(page);
> >  	bool init = want_init_on_free();
> >  	bool compound = PageCompound(page);
> > +	struct folio *folio = page_folio(page);
> >  
> >  	VM_BUG_ON_PAGE(PageTail(page), page);
> >  
> > @@ -1053,6 +1054,14 @@ __always_inline bool free_pages_prepare(struct page *page,
> >  	if (memcg_kmem_online() && PageMemcgKmem(page))
> >  		__memcg_kmem_uncharge_page(page, order);
> >  
> > +	if (unlikely(folio_test_mlocked(folio))) {
> > +		long nr_pages = folio_nr_pages(folio);
> > +
> > +		__folio_clear_mlocked(folio);
> > +		zone_stat_mod_folio(folio, NR_MLOCK, -nr_pages);
> > +		count_vm_events(UNEVICTABLE_PGCLEARED, nr_pages);
> > +	}
> 
> Why drop the useful comment?

Agreed. Sounds like I need to restore the comment, drop the
no-stable-backport recommendation, and send v2.
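
For anyone following along, here is a rough userspace model of what the
hunk above does. It is only a sketch: struct toy_page, PG_MLOCKED,
toy_free_pages_prepare() and the two counters are made-up stand-ins,
not the kernel's types or helpers. The point it illustrates is the
ordering: the mlocked flag is cleared and the counters are adjusted
before the leftover-flags check, so a page that reaches the free path
while still mlocked no longer trips the "bad page state" report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit; the real kernel uses PG_mlocked in page->flags. */
enum { PG_MLOCKED = 1u << 0 };

/* Toy page: 'order' means the page covers (1 << order) base pages. */
struct toy_page {
	unsigned int flags;
	unsigned int order;
};

/* Toy counters standing in for NR_MLOCK and UNEVICTABLE_PGCLEARED. */
static long toy_nr_mlock;
static long toy_unevictable_pgcleared;

/*
 * Returns true if the page may be freed, false if a flag that must not
 * be set at free time is still present (the "bad page" case).
 */
static bool toy_free_pages_prepare(struct toy_page *page)
{
	if (page->flags & PG_MLOCKED) {
		long nr_pages = 1L << page->order;

		/* Clear the flag first, then fix up the accounting. */
		page->flags &= ~PG_MLOCKED;
		toy_nr_mlock -= nr_pages;
		toy_unevictable_pgcleared += nr_pages;
	}

	/* Any flag still set at this point would be reported as bad. */
	return page->flags == 0;
}
```

In this model an order-2 page freed with PG_MLOCKED set passes the
check, and toy_nr_mlock drops by four pages while
toy_unevictable_pgcleared rises by four.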

Thank you for taking a look!

