From: Jinjiang Tu <tujinjiang@huawei.com>
To: David Hildenbrand <david@redhat.com>,
syzbot <syzbot+3b220254df55d8ca8a61@syzkaller.appspotmail.com>,
<Liam.Howlett@oracle.com>, <akpm@linux-foundation.org>,
<harry.yoo@oracle.com>, <linux-kernel@vger.kernel.org>,
<linux-mm@kvack.org>, <lorenzo.stoakes@oracle.com>,
<riel@surriel.com>, <syzkaller-bugs@googlegroups.com>,
<vbabka@suse.cz>, Jens Axboe <axboe@kernel.dk>,
Catalin Marinas <catalin.marinas@arm.com>
Subject: Re: [syzbot] [mm?] kernel BUG in try_to_unmap_one (2)
Date: Thu, 5 Jun 2025 15:18:34 +0800
Message-ID: <d1e1896b-8685-fd7c-d17d-f4328939b96f@huawei.com>
In-Reply-To: <b9a43f6d-1865-4074-b91c-a5bd7e10f2a9@redhat.com>
On 2025/6/5 14:37, David Hildenbrand wrote:
> On 05.06.25 08:27, David Hildenbrand wrote:
>> On 05.06.25 08:11, David Hildenbrand wrote:
>>> On 05.06.25 07:38, syzbot wrote:
>>>> Hello,
>>>>
>>>> syzbot found the following issue on:
>>>>
>>>> HEAD commit: d7fa1af5b33e Merge branch 'for-next/core' into
>>>> for-kernelci
>>>
>>> Hmmm, another very odd page-table mapping related problem on that tree
>>> found on arm64 only:
>>
>> In this particular reproducer we seem to be having MADV_HUGEPAGE and
>> io_uring_setup() be racing with MADV_HWPOISON, MADV_PAGEOUT and
>> io_uring_register(IORING_REGISTER_BUFFERS).
>>
>> I assume the issue is related to MADV_HWPOISON, MADV_PAGEOUT and
>> io_uring_register racing, only. I suspect MADV_HWPOISON is trying to
>> split a THP, while MADV_PAGEOUT tries paging it out.
>>
>> IORING_REGISTER_BUFFERS ends up in
>> io_sqe_buffers_register->io_sqe_buffer_register where we GUP-fast and
>> try coalescing buffers.
>>
>> And something about THPs is not particularly happy :)
>>
>
> Not sure if related to io_uring.
>
> unmap_poisoned_folio() calls try_to_unmap() without TTU_SPLIT_HUGE_PMD.
>
> When called from memory_failure(), we make sure to never call it on a
> large folio: WARN_ON(folio_test_large(folio));
>
> However, from shrink_folio_list() we might call unmap_poisoned_folio()
> on a large folio, which doesn't work if it is still PMD-mapped. Maybe
> passing TTU_SPLIT_HUGE_PMD would fix it.
>
TTU_SPLIT_HUGE_PMD only converts the PMD-mapped THP into a PTE-mapped THP,
and it may then trigger the WARN_ON_ONCE below in try_to_unmap_one():

	if (PageHWPoison(subpage) && (flags & TTU_HWPOISON)) {
		...
	} else if (likely(pte_present(pteval)) && pte_unused(pteval) &&
		   !userfaultfd_armed(vma)) {
		...
	} else if (folio_test_anon(folio)) {
		swp_entry_t entry = page_swap_entry(subpage);
		pte_t swp_pte;
		/*
		 * Store the swap location in the pte.
		 * See handle_pte_fault() ...
		 */
		if (unlikely(folio_test_swapbacked(folio) !=
			     folio_test_swapcache(folio))) {
			/*
			 * Here: the subpage isn't hwpoisoned and we haven't
			 * called add_to_swap() for the THP.
			 */
			WARN_ON_ONCE(1);
			goto walk_abort;
		}

If we want to unmap in shrink_folio_list(), we have to call
try_to_split_thp_page() like memory_failure() does. But that is too
complicated; maybe just skipping the hwpoisoned folio is enough? If the folio
is accessed again, memory_failure() will be triggered again and kill the
accessing process, since the folio has already been hwpoisoned.
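
To make the two cases concrete, here is a rough userspace model, not kernel code. struct folio_model, anon_unmap_would_warn() and classify_for_reclaim() are hypothetical names made up for illustration. The model only looks at the hwpoison branch and the anon branch of the excerpt above and ignores the pte_unused() branch. It also assumes one possible reading of "just skip": skip only when the poisoned folio is large, and keep unmapping small ones.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the folio state discussed above (not a kernel type). */
struct folio_model {
	bool hwpoison;   /* at least one page of the folio is poisoned */
	bool large;      /* THP / large folio */
	bool anon;
	bool swapbacked;
	bool swapcache;  /* becomes true once add_to_swap() has run */
};

enum reclaim_action {
	RECLAIM_CONTINUE,        /* not poisoned: normal reclaim path */
	RECLAIM_UNMAP_POISONED,  /* small poisoned folio: unmap it */
	RECLAIM_SKIP_POISONED,   /* large poisoned folio: leave it alone */
};

/*
 * Models the consistency check in the anon branch: a subpage that is not
 * itself poisoned falls through to the anon branch, where
 * swapbacked != swapcache hits the WARN_ON_ONCE.
 */
static bool anon_unmap_would_warn(const struct folio_model *f,
				  bool subpage_hwpoison, bool ttu_hwpoison)
{
	if (subpage_hwpoison && ttu_hwpoison)
		return false;	/* handled by the hwpoison branch */
	if (!f->anon)
		return false;
	return f->swapbacked != f->swapcache;
}

/* Assumed policy: skip large poisoned folios instead of unmapping them. */
static enum reclaim_action classify_for_reclaim(const struct folio_model *f)
{
	if (!f->hwpoison)
		return RECLAIM_CONTINUE;
	if (f->large)
		return RECLAIM_SKIP_POISONED;
	return RECLAIM_UNMAP_POISONED;
}

/* Usage example: a poisoned anon THP that was never added to the swap cache. */
static void check_example(void)
{
	const struct folio_model thp = {
		.hwpoison = true, .large = true, .anon = true,
		.swapbacked = true, .swapcache = false,
	};

	/* A healthy subpage of the poisoned THP would trip the warning. */
	assert(anon_unmap_would_warn(&thp, false, true));
	/* The poisoned subpage itself takes the hwpoison branch. */
	assert(!anon_unmap_would_warn(&thp, true, true));
	/* With the assumed policy, reclaim never gets that far. */
	assert(classify_for_reclaim(&thp) == RECLAIM_SKIP_POISONED);
}
```

In this model, splitting the PMD alone does not help, because the healthy subpages of the THP still reach the anon branch with swapbacked set and swapcache clear.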
>
> Likely the relevant commit is:
>
> commit 1b0449544c6482179ac84530b61fc192a6527bfd
> Author: Jinjiang Tu <tujinjiang@huawei.com>
> Date: Tue Mar 18 16:39:39 2025 +0800
Yes, it is caused by this commit.
>
> mm/vmscan: don't try to reclaim hwpoison folio
> Syzkaller reports a bug as follows:
> Injecting memory failure for pfn 0x18b00e at process virtual
> address 0x20ffd000
> Memory failure: 0x18b00e: dirty swapcache page still referenced by
> 2 users
> Memory failure: 0x18b00e: recovery action for dirty swapcache
> page: Failed
> page: refcount:2 mapcount:0 mapping:0000000000000000 index:0x20ffd
> pfn:0x18b00e
> memcg:ffff0000dd6d9000
> anon flags:
> 0x5ffffe00482011(locked|dirty|arch_1|swapbacked|hwpoison|node=0|zone=2|lastcpupid=0xfffff)
> raw: 005ffffe00482011 dead000000000100 dead000000000122
> ffff0000e232a7c9
> raw: 0000000000020ffd 0000000000000000 00000002ffffffff
> ffff0000dd6d9000
> page dumped because: VM_BUG_ON_FOLIO(!folio_test_uptodate(folio))
>
> CCing Jinjiang Tu
>