Date: Thu, 05 Mar 2026 17:52:51 -0800
To:
mm-commits@vger.kernel.org,ziy@nvidia.com,stable@vger.kernel.org,ljs@kernel.org,lance.yang@linux.dev,gavinguo@igalia.com,david@kernel.org,baolin.wang@linux.alibaba.com,richard.weiyang@gmail.com,akpm@linux-foundation.org
From: Andrew Morton
Subject: + mm-huge_memory-fix-early-failure-try_to_migrate-when-split-huge-pmd-for-shared-thp.patch added to mm-hotfixes-unstable branch
Message-Id: <20260306015252.4FAF9C116C6@smtp.kernel.org>
Precedence: bulk
X-Mailing-List: stable@vger.kernel.org

The patch titled
     Subject: mm/huge_memory: fix early failure try_to_migrate() when split huge pmd for shared THP
has been added to the -mm mm-hotfixes-unstable branch.  Its filename is
     mm-huge_memory-fix-early-failure-try_to_migrate-when-split-huge-pmd-for-shared-thp.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-huge_memory-fix-early-failure-try_to_migrate-when-split-huge-pmd-for-shared-thp.patch

This patch will later appear in the mm-hotfixes-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via various branches at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days

------------------------------------------------------
From: Wei Yang
Subject: mm/huge_memory: fix early failure try_to_migrate() when split huge pmd for shared THP
Date: Thu, 5 Mar 2026 01:50:06 +0000

Commit 60fbb14396d5 ("mm/huge_memory: adjust try_to_migrate_one() and
split_huge_pmd_locked()") returns false unconditionally after
split_huge_pmd_locked().
This may make try_to_migrate() fail early when TTU_SPLIT_HUGE_PMD is
specified.

The reason is that the above commit adjusted try_to_migrate_one() to
return false unconditionally when a PMD-mapped THP entry is found and
TTU_SPLIT_HUGE_PMD is specified (for example, via unmap_folio()).  This
aborts the rmap walk and makes try_to_migrate() fail early if the
PMD-mapped THP is mapped in multiple processes.

The user-visible impact of this bug could be:

  * Under memory pressure, shrink_folio_list() may split a partially
    mapped folio with split_folio_to_list() and then free the unmapped
    pages without IO.  If the split fails, the folio may not be
    reclaimed.

  * On memory failure, memory_failure() calls try_to_split_thp_page()
    to split the folio that contains the bad page.  If the split
    succeeds, the PG_has_hwpoisoned bit is set only in the after-split
    folio that contains @split_at, which limits the amount of bad
    memory.  If the split fails, the whole folio is unusable.

One way to reproduce:

    Create an anonymous THP range and fork 512 children, so that we
    have a THP shared-mapped in 513 processes.  Then trigger a folio
    split through the /sys/kernel/debug/split_huge_pages debugfs file
    to split the THP folio to order 0.

Without the above commit, we can successfully split to order 0.  With the
above commit, the folio is still a large folio.

Currently there are two core users of TTU_SPLIT_HUGE_PMD:

  * try_to_unmap_one()
  * try_to_migrate_one()

try_to_unmap_one() restarts the rmap walk, so only try_to_migrate_one()
is affected.

We can't simply revert commit 60fbb14396d5 ("mm/huge_memory: adjust
try_to_migrate_one() and split_huge_pmd_locked()"), since it removed some
duplicated checks that are already covered by page_vma_mapped_walk().

This patch fixes this by restarting page_vma_mapped_walk() after
split_huge_pmd_locked().
We cannot simply return "true" to fix the problem, as that would affect
another case:

When invoking folio_try_share_anon_rmap_pmd() from
split_huge_pmd_locked(), the latter can fail and leave a large folio
mapped through PTEs, in which case we ought to return true from
try_to_migrate_one().  This might result in unnecessary walking of the
rmap but is relatively harmless.

Link: https://lkml.kernel.org/r/20260305015006.27343-1-richard.weiyang@gmail.com
Fixes: 60fbb14396d5 ("mm/huge_memory: adjust try_to_migrate_one() and split_huge_pmd_locked()")
Signed-off-by: Wei Yang
Reviewed-by: Baolin Wang
Reviewed-by: Zi Yan
Tested-by: Lance Yang
Reviewed-by: Lance Yang
Reviewed-by: Gavin Guo
Acked-by: David Hildenbrand (arm)
Reviewed-by: Lorenzo Stoakes (Oracle)
Cc:
Signed-off-by: Andrew Morton
---

 mm/rmap.c |   12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

--- a/mm/rmap.c~mm-huge_memory-fix-early-failure-try_to_migrate-when-split-huge-pmd-for-shared-thp
+++ a/mm/rmap.c
@@ -2450,11 +2450,17 @@ static bool try_to_migrate_one(struct fo
 			__maybe_unused pmd_t pmdval;
 
 			if (flags & TTU_SPLIT_HUGE_PMD) {
+				/*
+				 * split_huge_pmd_locked() might leave the
+				 * folio mapped through PTEs. Retry the walk
+				 * so we can detect this scenario and properly
+				 * abort the walk.
+				 */
 				split_huge_pmd_locked(vma, pvmw.address,
 						      pvmw.pmd, true);
-				ret = false;
-				page_vma_mapped_walk_done(&pvmw);
-				break;
+				flags &= ~TTU_SPLIT_HUGE_PMD;
+				page_vma_mapped_walk_restart(&pvmw);
+				continue;
 			}
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
 			pmdval = pmdp_get(pvmw.pmd);
_

Patches currently in -mm which might be from richard.weiyang@gmail.com are

mm-huge_memory-fix-early-failure-try_to_migrate-when-split-huge-pmd-for-shared-thp.patch
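
For readers who want to see why the early "return false" matters for a
shared THP, the control flow can be imitated in userspace.  The following
is only a toy model under loud assumptions: it is not kernel code, every
toy_* / TOY_* name is hypothetical, each array element stands in for one
process mapping the folio, and the case where the split leaves the folio
mapped through PTEs is not modelled at all.  It only contrasts "split,
then abort the rmap walk" with "split, clear the flag, and restart the
walk on the same mapping".

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_TTU_SPLIT_HUGE_PMD	0x1	/* hypothetical value, model only */
#define TOY_MAX_VMAS		1024

enum toy_state { TOY_PMD_MAPPED, TOY_SPLIT };

/* Hypothetical stand-in for one process's mapping of the folio. */
struct toy_vma { enum toy_state state; };

/* Old behaviour: split the PMD, then return false unconditionally. */
static bool toy_migrate_one_old(struct toy_vma *vma, int flags)
{
	if (vma->state == TOY_PMD_MAPPED && (flags & TOY_TTU_SPLIT_HUGE_PMD)) {
		vma->state = TOY_SPLIT;
		return false;	/* the caller stops walking the other mappings */
	}
	return true;
}

/* New behaviour: split, clear the flag, and look at the same mapping again. */
static bool toy_migrate_one_new(struct toy_vma *vma, int flags)
{
	for (;;) {
		if (vma->state == TOY_PMD_MAPPED && (flags & TOY_TTU_SPLIT_HUGE_PMD)) {
			vma->state = TOY_SPLIT;
			flags &= ~TOY_TTU_SPLIT_HUGE_PMD;
			continue;	/* stands in for page_vma_mapped_walk_restart() */
		}
		return true;	/* nothing PMD-mapped left here, keep walking */
	}
}

/* Walk n mappings and return how many are still PMD-mapped afterwards. */
static int toy_walk(int n, bool fixed)
{
	static struct toy_vma vmas[TOY_MAX_VMAS];
	int i, left = 0;

	assert(n > 0 && n <= TOY_MAX_VMAS);
	for (i = 0; i < n; i++)
		vmas[i].state = TOY_PMD_MAPPED;

	for (i = 0; i < n; i++) {
		bool ok = fixed
			? toy_migrate_one_new(&vmas[i], TOY_TTU_SPLIT_HUGE_PMD)
			: toy_migrate_one_old(&vmas[i], TOY_TTU_SPLIT_HUGE_PMD);
		if (!ok)
			break;	/* a false return ends the walk early */
	}

	for (i = 0; i < n; i++)
		left += (vmas[i].state == TOY_PMD_MAPPED);
	return left;
}
```

With 513 mappings, as in the reproducer described in the changelog,
toy_walk(513, false) leaves 512 mappings untouched because the walk stops
after the first split, while toy_walk(513, true) leaves none.  With a
single mapping both variants behave the same in this model, which is
consistent with the changelog's point that the early failure only shows
up when the THP is mapped in multiple processes.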