From: Lance Yang
To: usama.arif@linux.dev, david@kernel.org, Liam.Howlett@oracle.com, ziy@nvidia.com
Cc: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com, akpm@linux-foundation.org,
	baohua@kernel.org, baolin.wang@linux.alibaba.com, dev.jain@arm.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, ljs@kernel.org,
	npache@redhat.com, ryan.roberts@arm.com, syzkaller-bugs@googlegroups.com,
	Lance Yang
Subject: Re: [syzbot] [mm?] WARNING in deferred_split_folio
Date: Wed, 1 Apr 2026 16:59:32 +0800
Message-Id: <20260401085932.20945-1-lance.yang@linux.dev>
In-Reply-To: <20260401081025.68951-1-lance.yang@linux.dev>
References: <20260401081025.68951-1-lance.yang@linux.dev>

On Wed, Apr 01, 2026 at 04:10:25PM +0800, Lance Yang wrote:
>
>+Cc Usama
>
>On Tue, Mar 31, 2026 at 11:08:27PM -0700, syzbot wrote:
>>Hello,
>>
>>syzbot found the following issue on:
>>
>>HEAD commit: cf7c3c02fdd0 Add linux-next specific files for 20260330
>>git tree: linux-next
>>console output: https://syzkaller.appspot.com/x/log.txt?x=154ee46a580000
>>kernel config: https://syzkaller.appspot.com/x/.config?x=3944d875fa9bfb67
>>dashboard link: https://syzkaller.appspot.com/bug?extid=a7067a757858ac8eb085
>>compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
>>syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12c846ba580000
>>
>>Downloadable assets:
>>disk image: https://storage.googleapis.com/syzbot-assets/053d3b49a360/disk-cf7c3c02.raw.xz
>>vmlinux: https://storage.googleapis.com/syzbot-assets/faabb37d41d0/vmlinux-cf7c3c02.xz
>>kernel image: https://storage.googleapis.com/syzbot-assets/8d47fe92aaa8/bzImage-cf7c3c02.xz
>>
>>IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com
>>
>> free_pages_and_swap_cache+0x2b9/0x490 mm/swap_state.c:401
>> __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline]
>> tlb_batch_pages_flush mm/mmu_gather.c:151 [inline]
>> tlb_flush_mmu_free mm/mmu_gather.c:417 [inline]
>> tlb_flush_mmu+0x6d3/0xa30 mm/mmu_gather.c:424
>> tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549
>> exit_mmap+0x498/0x9e0 mm/mmap.c:1313
>> __mmput+0x118/0x430 kernel/fork.c:1177
>> exit_mm+0x18e/0x250 kernel/exit.c:581
>> do_exit+0x6a2/0x22c0 kernel/exit.c:962
>> do_group_exit+0x21b/0x2d0 kernel/exit.c:1116
>> __do_sys_exit_group kernel/exit.c:1127 [inline]
>> __se_sys_exit_group kernel/exit.c:1125 [inline]
>> __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1125
>> x64_sys_call+0x221a/0x2240 arch/x86/include/generated/asm/syscalls_64.h:232
>> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>> do_syscall_64+0x15f/0xf80 arch/x86/entry/syscall_64.c:94
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>------------[ cut here ]------------
>>1
>>WARNING: mm/huge_memory.c:4371 at deferred_split_folio+0x974/0xaa0 mm/huge_memory.c:4371, CPU#1: syz.3.1110/10500
>>Modules linked in:
>>CPU: 1 UID: 0 PID: 10500 Comm: syz.3.1110 Not tainted syzkaller #0 PREEMPT(full)
>>Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026
>>RIP: 0010:deferred_split_folio+0x974/0xaa0 mm/huge_memory.c:4371
>>Code: 31 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d e9 c2 67 8d 09 cc e8 8c 73 93 ff 48 89 df 48 c7 c6 20 5b fc 8b e8 dd 2b f5 fe 90 <0f> 0b 90 e9 d4 fe ff ff e8 9f 7a 8a 09 e8 6a 73 93 ff 48 89 df 48
>>RSP: 0018:ffffc900047ef540 EFLAGS: 00010046
>>RAX: 1c05fb65cfaab100 RBX: ffffea0001840000 RCX: 0000000080000001
>>RDX: 0000000000000002 RSI: ffffffff8e4da1c7 RDI: ffff88807d6f9e80
>>RBP: ffffc900047ef610 R08: ffff8880b87247d3 R09: 1ffff110170e48fa
>>R10: dffffc0000000000 R11: ffffed10170e48fb R12: ffffea0001840040
>>R13: 0000000000000000 R14: 0000000000010000 R15: 1ffff920008fdeb0
>>FS: 00007f32e32a76c0(0000) GS:ffff8881250e8000(0000) knlGS:0000000000000000
>>CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>CR2: 00007f5825757930 CR3: 0000000034ad8000 CR4: 00000000003526f0
>>Call Trace:
>>
>> migrate_folio_move mm/migrate.c:1411 [inline]
>
>Looks like a race introduced by commit[1] ("mm: migrate: requeue
>destination folio on deferred split queue").
>
>Between folio migration (mbind) and rmap removal (exit_mmap), I guess :)
>
>migrate_folio_move() snapshots src_partially_mapped from src before
>migration:
>
>	if (folio_order(src) > 1 &&
>	    !data_race(list_empty(&src->_deferred_list))) {
>		src_deferred_split = true;
>		src_partially_mapped = folio_test_partially_mapped(src);
>	}
>
>Then move_to_new_folio() eventually unqueues src in
>__folio_migrate_mapping():
>
>	folio_unqueue_deferred_split(src);
>
>After that, migration restores mappings to dst:
>
>	if (old_page_state & PAGE_WAS_MAPPED)
>		remove_migration_ptes(src, dst, 0);
>
>At that point, dst is already visible again. A concurrent unmap path
>from another sharer can then remove some of those mappings and reach
>deferred_split_folio(dst, true), which sets PG_partially_mapped on
>dst.
>
>Migration then resumes and does:
>
>	if (src_deferred_split)
>		deferred_split_folio(dst, src_partially_mapped);
>
>If the earlier snapshot from src was false, this becomes
>deferred_split_folio(dst, false), but dst may already have been marked
>partially mapped by the concurrent rmap-removal path, so the WARN in
>deferred_split_folio() fires:
>
>	if (partially_mapped) {
>		...
>	} else {
>		/* partially mapped folios cannot become non-partially mapped */
>		VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
>	}
>
>[1] https://lore.kernel.org/all/20260312104723.1351321-1-usama.arif@linux.dev/
>

Perhaps the WARN is simply too strict there :)

Migration holds the folio lock on dst, while the competing rmap-removal
path runs under the page-table lock, so neither path excludes the other.
Once remove_migration_ptes(src, dst, 0) makes dst visible again, this
race looks hard to avoid.

So maybe the simplest fix is just to drop the WARN in the
!partially_mapped path:

---8<---
Subject: [PATCH 1/1] mm/thp: avoid false warning in deferred_split_folio()
From: Lance Yang

migrate_folio_move() snapshots src_partially_mapped from src before
migration and later requeues dst after remove_migration_ptes(src, dst, 0).

Once dst is visible again, a competing rmap-removal path can legally set
PG_partially_mapped before the migration path reaches
deferred_split_folio(dst, src_partially_mapped).

Migration holds the folio lock on dst, while the competing rmap-removal
path runs under the page-table lock, so neither path excludes the other.
Once remove_migration_ptes(src, dst, 0) makes dst visible again, this
race looks hard to avoid.

So just drop the WARN in the !partially_mapped path and preserve an
already-set PG_partially_mapped bit.

Link: https://lore.kernel.org/linux-mm/69ccb65b.050a0220.183828.003a.GAE@google.com/
Fixes: 8a8ca142a488 ("mm: migrate: requeue destination folio on deferred split queue")
Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com
Signed-off-by: Lance Yang
---
 mm/huge_memory.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 745eb3d0d4a7..8ea8e293dc7c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -4433,9 +4433,6 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
 			mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, 1);
 
 		}
-	} else {
-		/* partially mapped folios cannot become non-partially mapped */
-		VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
 	}
 	if (list_empty(&folio->_deferred_list)) {
 		struct mem_cgroup *memcg;
---
Thanks,
Lance
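
P.S. The ordering described in this thread can be replayed in userspace
without a reproducer. What follows is only a rough sketch under stated
assumptions: every model_* name is hypothetical and not kernel API, the
folio is reduced to the two bits the race touches, and the interleaving
is fixed by hand instead of coming from real concurrent threads. The
"strict" flag stands in for whether the VM_WARN_ON_FOLIO() in the
!partially_mapped path is still present.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct folio: only the state the race touches. */
struct model_folio {
	bool partially_mapped;	/* models PG_partially_mapped */
	bool on_deferred_list;	/* models !list_empty(&folio->_deferred_list) */
};

static int model_warnings;	/* how many times the modelled WARN fired */

/* Models deferred_split_folio(); strict == true keeps the WARN in place. */
static void model_deferred_split(struct model_folio *folio,
				 bool partially_mapped, bool strict)
{
	if (partially_mapped)
		folio->partially_mapped = true;
	else if (strict && folio->partially_mapped)
		model_warnings++;	/* the check the patch removes */

	folio->on_deferred_list = true;
}

/* Replays the interleaving by hand and returns the number of warnings. */
static int model_race(bool src_partially_mapped, bool strict)
{
	struct model_folio src = {
		.partially_mapped = src_partially_mapped,
		.on_deferred_list = true,
	};
	struct model_folio dst = { 0 };
	bool src_deferred_split, snapshot;

	model_warnings = 0;

	/* Migration: snapshot src state before moving it. */
	src_deferred_split = src.on_deferred_list;
	snapshot = src.partially_mapped;

	/* Migration: src leaves the deferred split queue. */
	src.on_deferred_list = false;

	/* Migration: mappings now point at dst, so dst is visible again. */

	/* Other sharer: unmaps part of dst and marks it partially mapped. */
	model_deferred_split(&dst, true, strict);
	assert(dst.partially_mapped);

	/* Migration resumes: requeue dst using the stale snapshot. */
	if (src_deferred_split)
		model_deferred_split(&dst, snapshot, strict);

	return model_warnings;
}
```

Calling model_race(false, true) corresponds to the reported case (stale
snapshot, WARN still present), and model_race(false, false) to the same
ordering with the WARN dropped. In both cases dst keeps its
partially-mapped bit, which is the behaviour the patch aims to preserve.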