* [PATCH] mm/migrate_device: pin large folios before splitting
@ 2026-07-01 14:06 Usama Arif
2026-07-01 16:49 ` David Hildenbrand (Arm)
2026-07-01 17:02 ` Zi Yan
0 siblings, 2 replies; 5+ messages in thread
From: Usama Arif @ 2026-07-01 14:06 UTC
To: Andrew Morton, apopple, byungchul, david, gourry, joshua.hahnjy,
linux-kernel, linux-mm, matthew.brost, rakie.kim, ying.huang, ziy
Cc: shakeel.butt, hannes, kernel-team, Usama Arif, sashiko-bot
migrate_vma_collect_pmd() can detect a large folio while holding the PTE
lock, then drop the PTE lock before calling migrate_vma_split_folio(). The
split helper took its own reference, but only after the lock had already
been dropped.

One way to hit this is device migration over a range that contains a large
folio. The walker reads the PTE while holding the PTE lock and derives the
folio either from a present PTE via vm_normal_page(), or from a non-present
PTE that encodes a device-private softleaf entry. It then has to drop the
PTE lock because split_folio() can block. Before migrate_vma_split_folio()
gets a folio reference, concurrent reclaim, migration, or truncation can
replace or clear the entry and drop the last reference to the folio. The
split helper would then take a reference and lock on a stale folio pointer.

Take a temporary reference before dropping the PTE lock and pass that
reference into migrate_vma_split_folio(). The helper consumes the
reference, so split_folio() still sees only the expected caller pin instead
of an extra pin that could make the split fail.
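
The hand-off described above can be sketched in userspace C. This is only
an illustrative model, not the kernel code: toy_folio, toy_slot, toy_get(),
toy_put(), toy_split(), toy_split_consume() and toy_collect() are all
hypothetical names, a pthread mutex stands in for the PTE lock, and the
"refcount == 2" test is a deliberate simplification of the expected-refcount
check that a real split performs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: only the reference count matters. */
struct toy_folio {
	atomic_int refcount;
};

/* Hypothetical stand-in for a page table entry and the lock guarding it. */
struct toy_slot {
	pthread_mutex_t lock;
	struct toy_folio *folio;	/* the slot itself holds one reference */
};

static void toy_get(struct toy_folio *f)
{
	atomic_fetch_add(&f->refcount, 1);
}

static void toy_put(struct toy_folio *f)
{
	int old = atomic_fetch_sub(&f->refcount, 1);

	assert(old > 0);	/* catching a put without a matching get */
}

/*
 * Simplified split: succeeds only if the references are exactly the
 * slot's reference plus one caller pin. Any additional pin fails it.
 */
static int toy_split(struct toy_folio *f)
{
	return atomic_load(&f->refcount) == 2 ? 0 : -1;
}

/* The caller passes in a reference; this helper always consumes it. */
static int toy_split_consume(struct toy_folio *f)
{
	int ret = toy_split(f);

	toy_put(f);
	return ret;
}

static int toy_collect(struct toy_slot *slot)
{
	struct toy_folio *f;

	pthread_mutex_lock(&slot->lock);
	f = slot->folio;
	if (!f) {
		pthread_mutex_unlock(&slot->lock);
		return 0;
	}
	/* Pin while the lock still guarantees the pointer is valid. */
	toy_get(f);
	pthread_mutex_unlock(&slot->lock);

	/* The lock is gone, but the pin keeps f alive for the helper. */
	return toy_split_consume(f);
}
```

In this model, with only the slot's reference held, toy_collect() returns 0
and leaves the count at 1. If some unrelated user holds an extra pin, it
returns -1 and the count is unchanged on return, which mirrors why the
helper must not add a second pin of its own.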
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Link: https://sashiko.dev/#/patchset/20260630164143.1595669-1-usama.arif%40linux.dev
Fixes: 022a12deda53 ("mm/migrate_device: handle partially mapped folios during collection")
Signed-off-by: Usama Arif <usama.arif@linux.dev>
---
mm/migrate_device.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 2f8b646302c2..f5a5f699e98e 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -77,6 +77,9 @@ static int migrate_vma_collect_hole(unsigned long start,
* @folio: the folio to split
* @fault_page: struct page associated with the fault if any
*
+ * If @folio is not the folio containing @fault_page, the caller must hold a
+ * reference on @folio. The helper consumes that reference.
+ *
* Returns 0 on success
*/
static int migrate_vma_split_folio(struct folio *folio,
@@ -86,10 +89,8 @@ static int migrate_vma_split_folio(struct folio *folio,
struct folio *fault_folio = fault_page ? page_folio(fault_page) : NULL;
struct folio *new_fault_folio = NULL;
- if (folio != fault_folio) {
- folio_get(folio);
+ if (folio != fault_folio)
folio_lock(folio);
- }
ret = split_folio(folio);
if (ret) {
@@ -310,6 +311,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
if (folio_test_large(folio)) {
int ret;
+ /*
+ * Keep the folio stable after dropping the PTE
+ * lock. migrate_vma_split_folio() consumes this
+ * reference.
+ */
+ if (folio != fault_folio)
+ folio_get(folio);
lazy_mmu_mode_disable();
pte_unmap_unlock(ptep, ptl);
ret = migrate_vma_split_folio(folio,
@@ -353,6 +361,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
if (folio && folio_test_large(folio)) {
int ret;
+ /*
+ * Keep the folio stable after dropping the
+ * PTE lock. migrate_vma_split_folio() consumes
+ * this reference.
+ */
+ if (folio != fault_folio)
+ folio_get(folio);
lazy_mmu_mode_disable();
pte_unmap_unlock(ptep, ptl);
ret = migrate_vma_split_folio(folio,
--
2.53.0-Meta
* Re: [PATCH] mm/migrate_device: pin large folios before splitting
From: David Hildenbrand (Arm) @ 2026-07-01 16:49 UTC
To: Usama Arif, Andrew Morton, apopple, byungchul, gourry,
joshua.hahnjy, linux-kernel, linux-mm, matthew.brost, rakie.kim,
ying.huang, ziy
Cc: shakeel.butt, hannes, kernel-team, sashiko-bot
On 7/1/26 16:06, Usama Arif wrote:
> migrate_vma_collect_pmd() can detect a large folio while holding the PTE
> lock, then drop the PTE lock before calling migrate_vma_split_folio(). The
> split helper took its own reference, but only after the lock had already
> been dropped.
>
> One way to hit this is device migration over a range that contains a large
> folio. The walker reads the PTE while holding the PTE lock and derives the
> folio either from a present PTE via vm_normal_page(), or from a non-present
> PTE that encodes a device-private softleaf entry. It then has to drop the
> PTE lock because split_folio() can block. Before migrate_vma_split_folio()
> gets a folio reference, concurrent reclaim, migration, or truncation can
> replace or clear the entry and drop the last reference to the folio. The
> split helper would then take a reference and lock on a stale folio pointer.
>
> Take a temporary reference before dropping the PTE lock and pass that
> reference into migrate_vma_split_folio(). The helper consumes the
> reference, so split_folio() still sees only the expected caller pin instead
> of an extra pin that could make the split fail.
>
> Reported-by: sashiko-bot <sashiko-bot@kernel.org>
> Link: https://sashiko.dev/#/patchset/20260630164143.1595669-1-usama.arif%40linux.dev
> Fixes: 022a12deda53 ("mm/migrate_device: handle partially mapped folios during collection")
> Signed-off-by: Usama Arif <usama.arif@linux.dev>
> ---
> mm/migrate_device.c | 21 ++++++++++++++++++---
> 1 file changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
> index 2f8b646302c2..f5a5f699e98e 100644
> --- a/mm/migrate_device.c
> +++ b/mm/migrate_device.c
> @@ -77,6 +77,9 @@ static int migrate_vma_collect_hole(unsigned long start,
> * @folio: the folio to split
> * @fault_page: struct page associated with the fault if any
> *
> + * If @folio is not the folio containing @fault_page, the caller must hold a
> + * reference on @folio. The helper consumes that reference.
> + *
> * Returns 0 on success
> */
> static int migrate_vma_split_folio(struct folio *folio,
> @@ -86,10 +89,8 @@ static int migrate_vma_split_folio(struct folio *folio,
> struct folio *fault_folio = fault_page ? page_folio(fault_page) : NULL;
> struct folio *new_fault_folio = NULL;
>
> - if (folio != fault_folio) {
> - folio_get(folio);
> + if (folio != fault_folio)
> folio_lock(folio);
> - }
>
> ret = split_folio(folio);
> if (ret) {
> @@ -310,6 +311,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
> if (folio_test_large(folio)) {
> int ret;
>
> + /*
> + * Keep the folio stable after dropping the PTE
> + * lock. migrate_vma_split_folio() consumes this
> + * reference.
> + */
Do we really need that comment (and the same one below)? Taking a reference
is the common mechanism for continuing to refer to a folio after dropping
the PTL.
> + if (folio != fault_folio)
> + folio_get(folio);
> lazy_mmu_mode_disable();
> pte_unmap_unlock(ptep, ptl);
> ret = migrate_vma_split_folio(folio,
> @@ -353,6 +361,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
> if (folio && folio_test_large(folio)) {
> int ret;
>
> + /*
> + * Keep the folio stable after dropping the
> + * PTE lock. migrate_vma_split_folio() consumes
> + * this reference.
> + */
> + if (folio != fault_folio)
> + folio_get(folio);
> lazy_mmu_mode_disable();
> pte_unmap_unlock(ptep, ptl);
> ret = migrate_vma_split_folio(folio,
LGTM. The code duplication in this function is concerning.
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
--
Cheers,
David
* Re: [PATCH] mm/migrate_device: pin large folios before splitting
From: Zi Yan @ 2026-07-01 17:02 UTC
To: Usama Arif
Cc: Andrew Morton, apopple, byungchul, david, gourry, joshua.hahnjy,
linux-kernel, linux-mm, matthew.brost, rakie.kim, ying.huang,
shakeel.butt, hannes, kernel-team, sashiko-bot
On 1 Jul 2026, at 10:06, Usama Arif wrote:
> migrate_vma_collect_pmd() can detect a large folio while holding the PTE
> lock, then drop the PTE lock before calling migrate_vma_split_folio(). The
> split helper took its own reference, but only after the lock had already
> been dropped.
>
> One way to hit this is device migration over a range that contains a large
> folio. The walker reads the PTE while holding the PTE lock and derives the
> folio either from a present PTE via vm_normal_page(), or from a non-present
> PTE that encodes a device-private softleaf entry. It then has to drop the
> PTE lock because split_folio() can block. Before migrate_vma_split_folio()
> gets a folio reference, concurrent reclaim, migration, or truncation can
> replace or clear the entry and drop the last reference to the folio. The
> split helper would then take a reference and lock on a stale folio pointer.
>
> Take a temporary reference before dropping the PTE lock and pass that
> reference into migrate_vma_split_folio(). The helper consumes the
> reference, so split_folio() still sees only the expected caller pin instead
> of an extra pin that could make the split fail.
>
> Reported-by: sashiko-bot <sashiko-bot@kernel.org>
> Link: https://sashiko.dev/#/patchset/20260630164143.1595669-1-usama.arif%40linux.dev
> Fixes: 022a12deda53 ("mm/migrate_device: handle partially mapped folios during collection")
> Signed-off-by: Usama Arif <usama.arif@linux.dev>
> ---
> mm/migrate_device.c | 21 ++++++++++++++++++---
> 1 file changed, 18 insertions(+), 3 deletions(-)
>
LGTM. Like David said, the comments might not be needed. Thanks.
Reviewed-by: Zi Yan <ziy@nvidia.com>
Best Regards,
Yan, Zi
* Re: [PATCH] mm/migrate_device: pin large folios before splitting
From: Andrew Morton @ 2026-07-01 19:27 UTC
To: Zi Yan
Cc: Usama Arif, apopple, byungchul, david, gourry, joshua.hahnjy,
linux-kernel, linux-mm, matthew.brost, rakie.kim, ying.huang,
shakeel.butt, hannes, kernel-team, sashiko-bot
On Wed, 01 Jul 2026 13:02:10 -0400 Zi Yan <ziy@nvidia.com> wrote:
> LGTM. Like David said, the comments might not be needed. Thanks.
I like the comments! They may be uninteresting to those who are
already familiar with these things, but they aren't the target audience.
How are others to become familiar, if not by this?
* Re: [PATCH] mm/migrate_device: pin large folios before splitting
From: David Hildenbrand (Arm) @ 2026-07-01 20:06 UTC
To: Andrew Morton, Zi Yan
Cc: Usama Arif, apopple, byungchul, gourry, joshua.hahnjy,
linux-kernel, linux-mm, matthew.brost, rakie.kim, ying.huang,
shakeel.butt, hannes, kernel-team, sashiko-bot
On 7/1/26 21:27, Andrew Morton wrote:
> On Wed, 01 Jul 2026 13:02:10 -0400 Zi Yan <ziy@nvidia.com> wrote:
>
>> LGTM. Like David said, the comments might not be needed. Thanks.
>
> I like the comments! They may be uninteresting to those who are
> already familiar with these things, but they aren't the target audience.
>
> How are others to become familiar, if not by this?
I mean, it's one of the basic rules: if you look up a page in the page table,
it might be invalid the moment you drop the lock.

If we were to document that everywhere... this is not really the secret sauce
we want to document everywhere.
--
Cheers,
David