public inbox for linux-mm@kvack.org
From: Lance Yang <lance.yang@linux.dev>
To: david@kernel.org, kartikey406@gmail.com
Cc: lance.yang@linux.dev, usama.arif@linux.dev,
	Liam.Howlett@oracle.com, ziy@nvidia.com,
	syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com,
	akpm@linux-foundation.org, baohua@kernel.org,
	baolin.wang@linux.alibaba.com, dev.jain@arm.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, ljs@kernel.org,
	npache@redhat.com, ryan.roberts@arm.com,
	syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot] [mm?] WARNING in deferred_split_folio
Date: Wed,  1 Apr 2026 18:53:05 +0800
Message-ID: <20260401105305.94886-1-lance.yang@linux.dev>
In-Reply-To: <27d742c6-631a-4878-9c44-bf49bcce9510@kernel.org>


+Cc Deepanshu

On Wed, Apr 01, 2026 at 12:16:43PM +0200, David Hildenbrand (Arm) wrote:
>On 4/1/26 10:59, Lance Yang wrote:
>> 
>> On Wed, Apr 01, 2026 at 04:10:25PM +0800, Lance Yang wrote:
>>>
>>> +Cc Usama
>>>
>>> On Tue, Mar 31, 2026 at 11:08:27PM -0700, syzbot wrote:
>>>> Hello,
>>>>
>>>> syzbot found the following issue on:
>>>>
>>>> HEAD commit:    cf7c3c02fdd0 Add linux-next specific files for 20260330
>>>> git tree:       linux-next
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=154ee46a580000
>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=3944d875fa9bfb67
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=a7067a757858ac8eb085
>>>> compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
>>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12c846ba580000
>>>>
>>>> Downloadable assets:
>>>> disk image: https://storage.googleapis.com/syzbot-assets/053d3b49a360/disk-cf7c3c02.raw.xz
>>>> vmlinux: https://storage.googleapis.com/syzbot-assets/faabb37d41d0/vmlinux-cf7c3c02.xz
>>>> kernel image: https://storage.googleapis.com/syzbot-assets/8d47fe92aaa8/bzImage-cf7c3c02.xz
>>>>
>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>> Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com
>>>>
>>>> free_pages_and_swap_cache+0x2b9/0x490 mm/swap_state.c:401
>>>> __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline]
>>>> tlb_batch_pages_flush mm/mmu_gather.c:151 [inline]
>>>> tlb_flush_mmu_free mm/mmu_gather.c:417 [inline]
>>>> tlb_flush_mmu+0x6d3/0xa30 mm/mmu_gather.c:424
>>>> tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549
>>>> exit_mmap+0x498/0x9e0 mm/mmap.c:1313
>>>> __mmput+0x118/0x430 kernel/fork.c:1177
>>>> exit_mm+0x18e/0x250 kernel/exit.c:581
>>>> do_exit+0x6a2/0x22c0 kernel/exit.c:962
>>>> do_group_exit+0x21b/0x2d0 kernel/exit.c:1116
>>>> __do_sys_exit_group kernel/exit.c:1127 [inline]
>>>> __se_sys_exit_group kernel/exit.c:1125 [inline]
>>>> __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1125
>>>> x64_sys_call+0x221a/0x2240 arch/x86/include/generated/asm/syscalls_64.h:232
>>>> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>>>> do_syscall_64+0x15f/0xf80 arch/x86/entry/syscall_64.c:94
>>>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>> ------------[ cut here ]------------
>>>> 1
>>>> WARNING: mm/huge_memory.c:4371 at deferred_split_folio+0x974/0xaa0 mm/huge_memory.c:4371, CPU#1: syz.3.1110/10500
>>>> Modules linked in:
>>>> CPU: 1 UID: 0 PID: 10500 Comm: syz.3.1110 Not tainted syzkaller #0 PREEMPT(full) 
>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026
>>>> RIP: 0010:deferred_split_folio+0x974/0xaa0 mm/huge_memory.c:4371
>>>> Code: 31 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d e9 c2 67 8d 09 cc e8 8c 73 93 ff 48 89 df 48 c7 c6 20 5b fc 8b e8 dd 2b f5 fe 90 <0f> 0b 90 e9 d4 fe ff ff e8 9f 7a 8a 09 e8 6a 73 93 ff 48 89 df 48
>>>> RSP: 0018:ffffc900047ef540 EFLAGS: 00010046
>>>> RAX: 1c05fb65cfaab100 RBX: ffffea0001840000 RCX: 0000000080000001
>>>> RDX: 0000000000000002 RSI: ffffffff8e4da1c7 RDI: ffff88807d6f9e80
>>>> RBP: ffffc900047ef610 R08: ffff8880b87247d3 R09: 1ffff110170e48fa
>>>> R10: dffffc0000000000 R11: ffffed10170e48fb R12: ffffea0001840040
>>>> R13: 0000000000000000 R14: 0000000000010000 R15: 1ffff920008fdeb0
>>>> FS:  00007f32e32a76c0(0000) GS:ffff8881250e8000(0000) knlGS:0000000000000000
>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> CR2: 00007f5825757930 CR3: 0000000034ad8000 CR4: 00000000003526f0
>>>> Call Trace:
>>>> <TASK>
>>>> migrate_folio_move mm/migrate.c:1411 [inline]
>>>
>>> Looks like a race introduced by commit[1] ("mm: migrate: requeue
>>> destination folio on deferred split queue").
>>>
>>> Between folio migration (mbind) and rmap removal (exit_mmap), I guess :)
>>>
>>> migrate_folio_move() snapshots src_partially_mapped from src before
>>> migration:
>>>
>>> 	if (folio_order(src) > 1 &&
>>> 	    !data_race(list_empty(&src->_deferred_list))) {
>>> 		src_deferred_split = true;
>>> 		src_partially_mapped = folio_test_partially_mapped(src);
>>> 	}
>>>
>>> Then move_to_new_folio() eventually unqueues src in
>>> __folio_migrate_mapping():
>>>
>>> 	folio_unqueue_deferred_split(src);
>>>
>>> After that, migration restores mappings to dst:
>>>
>>> 	if (old_page_state & PAGE_WAS_MAPPED)
>>> 		remove_migration_ptes(src, dst, 0);
>>>
>>> At that point, dst is already visible again. A concurrent unmap path
>>> from another sharer can then remove some of those mappings and reach
>>> deferred_split_folio(dst, true), which sets PG_partially_mapped on
>>> dst.
>>>
>>> Migration then resumes and does:
>>>
>>> 	if (src_deferred_split)
>>> 		deferred_split_folio(dst, src_partially_mapped);
>>>
>>> If the earlier snapshot from src was false, this becomes
>>> deferred_split_folio(dst, false), but dst may already have been marked
>>> partially mapped by the concurrent rmap-removal path, so the WARN in
>>> deferred_split_folio() fires:
>>>
>>> 	if (partially_mapped) {
>>> 		...
>>> 	} else {
>>> 		/* partially mapped folios cannot become non-partially mapped */
>>> 		VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
>>> 	}
>>>
>>> [1] https://lore.kernel.org/all/20260312104723.1351321-1-usama.arif@linux.dev/
>>>
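For reference, the interleaving quoted above can be replayed in a small
user-space model. This is only a sketch: struct race_folio,
race_deferred_split_folio() and race_replay() are hypothetical stand-ins,
not kernel code, and the two "threads" are run back to back in the order
described rather than truly concurrently. A counter stands in for
VM_WARN_ON_FOLIO().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the bits of a folio that matter here. */
struct race_folio {
	bool queued;		/* on the deferred split queue */
	bool partially_mapped;	/* models PG_partially_mapped */
	int warnings;		/* counts would-be VM_WARN_ON_FOLIO() hits */
};

/* Models only the flag handling and the WARN of deferred_split_folio(). */
static void race_deferred_split_folio(struct race_folio *f,
				      bool partially_mapped)
{
	if (partially_mapped) {
		f->partially_mapped = true;
	} else if (f->partially_mapped) {
		/* "partially mapped folios cannot become non-partially mapped" */
		f->warnings++;
	}
	f->queued = true;
}

/*
 * Replays the described order of events for one migration and returns
 * the number of warnings that would have fired on dst.
 */
static int race_replay(bool src_partially_mapped)
{
	struct race_folio dst = { 0 };

	/* 1. migration snapshotted src_partially_mapped (the argument)   */
	/* 2. remove_migration_ptes() has made dst visible again          */
	/* 3. a concurrent unmap path marks dst partially mapped          */
	race_deferred_split_folio(&dst, true);
	/* 4. migration resumes and requeues dst with the stale snapshot  */
	race_deferred_split_folio(&dst, src_partially_mapped);

	return dst.warnings;
}
```

With a snapshot of false, step 4 hits the warning branch once; with a
snapshot of true, it does not.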
>> 
>> Perhaps the WARN is simply too strict there :)
>> 
>> Migration already holds the folio lock on dst, while the competing
>> rmap-removal path runs under the page-table lock. So once
>> remove_migration_ptes(src, dst, 0) makes dst visible again, this race
>> looks hard to avoid.
>> 
>> So maybe the simplest fix is just to drop the WARN in the
>> !partially_mapped path:
>> 
>> ---8<---
>> Subject: [PATCH 1/1] mm/thp: avoid false warning in deferred_split_folio()
>> 
>> From: Lance Yang <lance.yang@linux.dev>
>> 
>> migrate_folio_move() snapshots src_partially_mapped from src before
>> migration and later requeues dst after remove_migration_ptes(src, dst, 0).
>> 
>> Once dst is visible again, a competing rmap-removal path can legally set
>> PG_partially_mapped before the migration path reaches
>> deferred_split_folio(dst, src_partially_mapped).
>> 
>> Migration already holds the folio lock on dst, while the competing
>> rmap-removal path runs under the page-table lock. So once
>> remove_migration_ptes(src, dst, 0) makes dst visible again, this race
>> looks hard to avoid.
>> 
>> So just drop the WARN in the !partially_mapped path and preserve an
>> already-set PG_partially_mapped bit.
>> 
>> Link: https://lore.kernel.org/linux-mm/69ccb65b.050a0220.183828.003a.GAE@google.com/
>> Fixes: 8a8ca142a488 ("mm: migrate: requeue destination folio on deferred split queue")
>> Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com
>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>> ---
>>  mm/huge_memory.c | 3 ---
>>  1 file changed, 3 deletions(-)
>> 
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 745eb3d0d4a7..8ea8e293dc7c 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -4433,9 +4433,6 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
>>  			mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, 1);
>> 
>>  		}
>> -	} else {
>> -		/* partially mapped folios cannot become non-partially mapped */
>> -		VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
>>  	}
>
>Can't we simply move the setting before restoring migration ptes?

Afraid not. That closes the remove_migration_ptes() ->
deferred_split_folio() race, but opens a new one with the shrinker, IIUC.

Once dst is on the deferred split queue, deferred_split_scan() can
pick it up immediately. The shrinker unconditionally dequeues every
folio it visits:

	list_del_init(&folio->_deferred_list);   /* always */

Then for a non-partially-mapped folio, if folio_trylock() fails
(dst is still locked by migration), it falls through to:

next:
		if (did_split || !folio_test_partially_mapped(folio))
			continue;  /* not requeued, dst silently lost */

so it is *not* requeued.
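
To make that path concrete, here is a minimal user-space sketch of a
single shrinker visit as described above. It is not the kernel code:
struct scan_folio, scan_trylock() and scan_visit() are hypothetical
names, and the successful-trylock path is collapsed into "the split
happened", which is a simplification made only for this model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a folio on the deferred split queue. */
struct scan_folio {
	bool queued;		/* on the deferred split queue */
	bool partially_mapped;	/* models PG_partially_mapped */
	bool locked;		/* folio lock held elsewhere, e.g. by migration */
};

/* Models folio_trylock(): fails if the lock is already held. */
static bool scan_trylock(struct scan_folio *f)
{
	if (f->locked)
		return false;
	f->locked = true;
	return true;
}

/*
 * One shrinker visit. Returns true if the folio is still queued
 * afterwards, false if it dropped off the queue.
 */
static bool scan_visit(struct scan_folio *f)
{
	bool did_split = false;

	assert(f->queued);
	f->queued = false;		/* list_del_init(), always */

	if (scan_trylock(f)) {
		did_split = true;	/* simplification: assume the split works */
		f->locked = false;
	}

	/* Mirrors: if (did_split || !folio_test_partially_mapped(folio)) */
	if (did_split || !f->partially_mapped)
		return false;		/* not requeued */

	f->queued = true;		/* only partially mapped folios come back */
	return true;
}
```

Calling scan_visit() on a folio with locked = true and
partially_mapped = false returns false, which is the dst case here; the
same call with partially_mapped = true returns true.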

That seems to recreate the original issue commit[1] was fixing: letting
underused THPs silently fall off the deferred split queue again ...

Hopefully, I didn't miss something important :)

[1] https://lore.kernel.org/all/20260312104723.1351321-1-usama.arif@linux.dev/

>diff --git a/mm/migrate.c b/mm/migrate.c
>index 05cb408846f2..5f222cb0ca90 100644
>--- a/mm/migrate.c
>+++ b/mm/migrate.c
>@@ -1385,6 +1385,15 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>        if (rc)
>                goto out;
> 
>+       /*
>+        * Requeue the destination folio on the deferred split queue if
>+        * the source was on the queue.  The source is unqueued in
>+        * __folio_migrate_mapping(), so we recorded the state from
>+        * before move_to_new_folio().
>+        */
>+       if (src_deferred_split)
>+               deferred_split_folio(dst, src_partially_mapped);
>+
>        /*
>         * When successful, push dst to LRU immediately: so that if it
>         * turns out to be an mlocked page, remove_migration_ptes() will
>@@ -1400,16 +1409,6 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
> 
>        if (old_page_state & PAGE_WAS_MAPPED)
>                remove_migration_ptes(src, dst, 0);
>-
>-       /*
>-        * Requeue the destination folio on the deferred split queue if
>-        * the source was on the queue.  The source is unqueued in
>-        * __folio_migrate_mapping(), so we recorded the state from
>-        * before move_to_new_folio().
>-        */
>-       if (src_deferred_split)
>-               deferred_split_folio(dst, src_partially_mapped);
>-
> out_unlock_both:
>        folio_unlock(dst);
>        folio_set_owner_migrate_reason(dst, reason);
>
>

