All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lance Yang <lance.yang@linux.dev>
To: david@kernel.org
Cc: lance.yang@linux.dev, kartikey406@gmail.com,
	usama.arif@linux.dev, Liam.Howlett@oracle.com, ziy@nvidia.com,
	syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com,
	akpm@linux-foundation.org, baohua@kernel.org,
	baolin.wang@linux.alibaba.com, dev.jain@arm.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, ljs@kernel.org,
	npache@redhat.com, ryan.roberts@arm.com,
	syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot] [mm?] WARNING in deferred_split_folio
Date: Wed,  1 Apr 2026 19:20:49 +0800	[thread overview]
Message-ID: <20260401112049.18634-1-lance.yang@linux.dev> (raw)
In-Reply-To: <f28d968b-1d09-4db9-9f4b-55205f43c951@kernel.org>


On Wed, Apr 01, 2026 at 01:00:13PM +0200, David Hildenbrand (Arm) wrote:
>On 4/1/26 12:53, Lance Yang wrote:
>> 
>> +Cc Deepanshu
>> 
>> On Wed, Apr 01, 2026 at 12:16:43PM +0200, David Hildenbrand (Arm) wrote:
>>> On 4/1/26 10:59, Lance Yang wrote:
>>>>
>>>> >from another sharer can then remove some of those mappings and reach
>>>>
>>>> Perhaps the WARN is simply too strict there :)
>>>>
>>>> Migration already holds the folio lock on dst, while the competing
>>>> rmap-removal path runs under the page-table lock. So once
>>>> remove_migration_ptes(src, dst, 0) makes dst visible again, this race
>>>> looks hard to avoid.
>>>>
>>>> So maybe the simplest fix is just to drop the WARN in the
>>>> !partially_mapped path:
>>>>
>>>> ---8<---
>>>> Subject: [PATCH 1/1] mm/thp: avoid false warning in deferred_split_folio()
>>>>
>>>> From: Lance Yang <lance.yang@linux.dev>
>>>>
>>>> migrate_folio_move() snapshots src_partially_mapped from src before
>>>> migration and later requeues dst after remove_migration_ptes(src, dst, 0).
>>>>
>>>> Once dst is visible again, a competing rmap-removal path can legally set
>>>> PG_partially_mapped before the migration path reaches
>>>> deferred_split_folio(dst, src_partially_mapped).
>>>>
>>>> Migration already holds the folio lock on dst, while the competing
>>>> rmap-removal path runs under the page-table lock. So once
>>>> remove_migration_ptes(src, dst, 0) makes dst visible again, this race
>>>> looks hard to avoid.
>>>>
>>>> So just drop the WARN in the !partially_mapped path and preserve an
>>>> already-set PG_partially_mapped bit.
>>>>
>>>> Link: https://lore.kernel.org/linux-mm/69ccb65b.050a0220.183828.003a.GAE@google.com/
>>>> Fixes: 8a8ca142a488 ("mm: migrate: requeue destination folio on deferred split queue")
>>>> Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com
>>>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>>>> ---
>>>>  mm/huge_memory.c | 3 ---
>>>>  1 file changed, 3 deletions(-)
>>>>
>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>> index 745eb3d0d4a7..8ea8e293dc7c 100644
>>>> --- a/mm/huge_memory.c
>>>> +++ b/mm/huge_memory.c
>>>> @@ -4433,9 +4433,6 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
>>>>  			mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, 1);
>>>>
>>>>  		}
>>>> -	} else {
>>>> -		/* partially mapped folios cannot become non-partially mapped */
>>>> -		VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
>>>>  	}
>>>
>>> Can't we simply move the setting before restoring migration ptes?
>> 
>> Afraid not, it closes the remove_migration_ptes() ->
>> deferred_split_folio() race, but opens a new one with the shrinker, IIUC
>> 
>> Once dst is on the deferred split queue, deferred_split_scan() can
>> pick it up immediately. The shrinker unconditionally dequeues every
>> folio it visits:
>> 
>> 	list_del_init(&folio->_deferred_list);   /* always */
>> 
>> Then for a non-partially-mapped folio, if folio_trylock() fails
>> (dst is still locked by migration), it falls through to:
>> 
>> next:
>> 		if (did_split || !folio_test_partially_mapped(folio))
>> 			continue;  /* not requeued, dst silently lost */
>> 
>> so it is *not* requeued.
>
>How is that different to the shrinker just trying to lock the folio before we
>unlock it and failing? The race already exists?

Ouch, you're right, I was wrong - the trylock drop is a pre-existing
issue, not caused by the reorder ;)

>
>To sort out that race a trylock must not result in the folio getting
>discarded.

Nice, LGTM!

Given that the "trylock -> drop" behavior seems to exist already today,
do you think it's worth fixing that together with the reorder?

>diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>index ff9a42abd1b6..521989517cd1 100644
>--- a/mm/huge_memory.c
>+++ b/mm/huge_memory.c
>@@ -4558,7 +4558,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
>                                goto next;
>                }
>                if (!folio_trylock(folio))
>-                       goto next;
>+                       goto requeue:
>                if (!split_folio(folio)) {
>                        did_split = true;
>                        if (underused)
>@@ -4569,6 +4569,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
> next:
>                if (did_split || !folio_test_partially_mapped(folio))
>                        continue;
>+requeue:
>                /*
>                 * Only add back to the queue if folio is partially mapped.
>                 * If thp_underused returns false, or if split_folio fails
>
>


  reply	other threads:[~2026-04-01 11:21 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <EE70ZGRNE12@zendesk.com>
2026-04-01  6:08 ` [syzbot] [mm?] WARNING in deferred_split_folio syzbot
2026-04-01  6:09   ` Request received Yail
2026-04-01  7:52   ` Forwarded: [PATCH] mm/migrate: fix stale partially_mapped arg to deferred_split_folio() syzbot
2026-04-01  8:10   ` [syzbot] [mm?] WARNING in deferred_split_folio Lance Yang
2026-04-01  8:59     ` Lance Yang
2026-04-01  9:36       ` David Hildenbrand (Arm)
2026-04-01 10:16       ` David Hildenbrand (Arm)
2026-04-01 10:53         ` Lance Yang
2026-04-01 11:00           ` David Hildenbrand (Arm)
2026-04-01 11:20             ` Lance Yang [this message]
2026-04-01 11:22               ` David Hildenbrand (Arm)
2026-04-01 11:34                 ` Lance Yang
2026-04-01 11:38                   ` David Hildenbrand (Arm)
2026-04-01 11:41                     ` Lance Yang
2026-04-01 11:44                       ` David Hildenbrand (Arm)
2026-04-01 11:51                         ` Lance Yang
2026-04-01 11:54                           ` Lance Yang
     [not found] <20260401075211.20151-1-kartikey406@gmail.com>
2026-04-01  8:34 ` syzbot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260401112049.18634-1-lance.yang@linux.dev \
    --to=lance.yang@linux.dev \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=baohua@kernel.org \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=david@kernel.org \
    --cc=dev.jain@arm.com \
    --cc=kartikey406@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ljs@kernel.org \
    --cc=npache@redhat.com \
    --cc=ryan.roberts@arm.com \
    --cc=syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=usama.arif@linux.dev \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.