From: Lance Yang <lance.yang@linux.dev>
To: david@kernel.org
Cc: lance.yang@linux.dev, kartikey406@gmail.com,
usama.arif@linux.dev, Liam.Howlett@oracle.com, ziy@nvidia.com,
syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com,
akpm@linux-foundation.org, baohua@kernel.org,
baolin.wang@linux.alibaba.com, dev.jain@arm.com,
linux-kernel@vger.kernel.org, linux-mm@kvack.org, ljs@kernel.org,
npache@redhat.com, ryan.roberts@arm.com,
syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot] [mm?] WARNING in deferred_split_folio
Date: Wed, 1 Apr 2026 19:20:49 +0800 [thread overview]
Message-ID: <20260401112049.18634-1-lance.yang@linux.dev> (raw)
In-Reply-To: <f28d968b-1d09-4db9-9f4b-55205f43c951@kernel.org>
On Wed, Apr 01, 2026 at 01:00:13PM +0200, David Hildenbrand (Arm) wrote:
>On 4/1/26 12:53, Lance Yang wrote:
>>
>> +Cc Deepanshu
>>
>> On Wed, Apr 01, 2026 at 12:16:43PM +0200, David Hildenbrand (Arm) wrote:
>>> On 4/1/26 10:59, Lance Yang wrote:
>>>>
>>>> >from another sharer can then remove some of those mappings and reach
>>>>
>>>> Perhaps the WARN is simply too strict there :)
>>>>
>>>> Migration already holds the folio lock on dst, while the competing
>>>> rmap-removal path runs under the page-table lock. So once
>>>> remove_migration_ptes(src, dst, 0) makes dst visible again, this race
>>>> looks hard to avoid.
>>>>
>>>> So maybe the simplest fix is just to drop the WARN in the
>>>> !partially_mapped path:
>>>>
>>>> ---8<---
>>>> Subject: [PATCH 1/1] mm/thp: avoid false warning in deferred_split_folio()
>>>>
>>>> From: Lance Yang <lance.yang@linux.dev>
>>>>
>>>> migrate_folio_move() snapshots src_partially_mapped from src before
>>>> migration and later requeues dst after remove_migration_ptes(src, dst, 0).
>>>>
>>>> Once dst is visible again, a competing rmap-removal path can legally set
>>>> PG_partially_mapped before the migration path reaches
>>>> deferred_split_folio(dst, src_partially_mapped).
>>>>
>>>> Migration already holds the folio lock on dst, while the competing
>>>> rmap-removal path runs under the page-table lock. So once
>>>> remove_migration_ptes(src, dst, 0) makes dst visible again, this race
>>>> looks hard to avoid.
>>>>
>>>> So just drop the WARN in the !partially_mapped path and preserve an
>>>> already-set PG_partially_mapped bit.
>>>>
>>>> Link: https://lore.kernel.org/linux-mm/69ccb65b.050a0220.183828.003a.GAE@google.com/
>>>> Fixes: 8a8ca142a488 ("mm: migrate: requeue destination folio on deferred split queue")
>>>> Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com
>>>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>>>> ---
>>>> mm/huge_memory.c | 3 ---
>>>> 1 file changed, 3 deletions(-)
>>>>
>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>> index 745eb3d0d4a7..8ea8e293dc7c 100644
>>>> --- a/mm/huge_memory.c
>>>> +++ b/mm/huge_memory.c
>>>> @@ -4433,9 +4433,6 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
>>>> mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, 1);
>>>>
>>>> }
>>>> - } else {
>>>> - /* partially mapped folios cannot become non-partially mapped */
>>>> - VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
>>>> }
>>>
>>> Can't we simply move the setting before restoring migration ptes?
>>
>> Afraid not, it closes the remove_migration_ptes() ->
>> deferred_split_folio() race, but opens a new one with the shrinker, IIUC
>>
>> Once dst is on the deferred split queue, deferred_split_scan() can
>> pick it up immediately. The shrinker unconditionally dequeues every
>> folio it visits:
>>
>> list_del_init(&folio->_deferred_list); /* always */
>>
>> Then for a non-partially-mapped folio, if folio_trylock() fails
>> (dst is still locked by migration), it falls through to:
>>
>> next:
>> if (did_split || !folio_test_partially_mapped(folio))
>> continue; /* not requeued, dst silently lost */
>>
>> so it is *not* requeued.
>
>How is that different to the shrinker just trying to lock the folio before we
>unlock it and failing? The race already exists?
Ouch, you're right, I was wrong - the trylock drop is a pre-existing
issue, not caused by the reorder ;)
>
>To sort out that race a trylock must not result in the folio getting
>discarded.
Nice, LGTM!
Given that the "trylock -> drop" behavior seems to exist already today,
do you think it's worth fixing that together with the reorder?
>diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>index ff9a42abd1b6..521989517cd1 100644
>--- a/mm/huge_memory.c
>+++ b/mm/huge_memory.c
>@@ -4558,7 +4558,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
> goto next;
> }
> if (!folio_trylock(folio))
>- goto next;
>+ goto requeue:
> if (!split_folio(folio)) {
> did_split = true;
> if (underused)
>@@ -4569,6 +4569,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
> next:
> if (did_split || !folio_test_partially_mapped(folio))
> continue;
>+requeue:
> /*
> * Only add back to the queue if folio is partially mapped.
> * If thp_underused returns false, or if split_folio fails
>
>
next prev parent reply other threads:[~2026-04-01 11:21 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <EE70ZGRNE12@zendesk.com>
2026-04-01 6:08 ` [syzbot] [mm?] WARNING in deferred_split_folio syzbot
2026-04-01 6:09 ` Request received Yail
2026-04-01 8:10 ` [syzbot] [mm?] WARNING in deferred_split_folio Lance Yang
2026-04-01 8:59 ` Lance Yang
2026-04-01 9:36 ` David Hildenbrand (Arm)
2026-04-01 10:16 ` David Hildenbrand (Arm)
2026-04-01 10:53 ` Lance Yang
2026-04-01 11:00 ` David Hildenbrand (Arm)
2026-04-01 11:20 ` Lance Yang [this message]
2026-04-01 11:22 ` David Hildenbrand (Arm)
2026-04-01 11:34 ` Lance Yang
2026-04-01 11:38 ` David Hildenbrand (Arm)
2026-04-01 11:41 ` Lance Yang
2026-04-01 11:44 ` David Hildenbrand (Arm)
2026-04-01 11:51 ` Lance Yang
2026-04-01 11:54 ` Lance Yang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260401112049.18634-1-lance.yang@linux.dev \
--to=lance.yang@linux.dev \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=baohua@kernel.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=david@kernel.org \
--cc=dev.jain@arm.com \
--cc=kartikey406@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ljs@kernel.org \
--cc=npache@redhat.com \
--cc=ryan.roberts@arm.com \
--cc=syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com \
--cc=syzkaller-bugs@googlegroups.com \
--cc=usama.arif@linux.dev \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox