linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] mm: gup: avoid CMA pinning and clean up stale logic
       [not found] <CGME20250605033401epcas2p46651837ba85629a50ed69db9665a52a2@epcas2p4.samsung.com>
@ 2025-06-05  3:32 ` Hyesoo Yu
       [not found]   ` <CGME20250605033430epcas2p37db099e1ff4f2225c2059e08bbaa97c9@epcas2p3.samsung.com>
       [not found]   ` <CGME20250605033432epcas2p4c024a9d05246b18c217f3562b3f53551@epcas2p4.samsung.com>
  0 siblings, 2 replies; 8+ messages in thread
From: Hyesoo Yu @ 2025-06-05  3:32 UTC (permalink / raw)
  Cc: janghyuck.kim, zhaoyang.huang, jaewon31.kim, david, Hyesoo Yu,
	Andrew Morton, Jason Gunthorpe, John Hubbard, Peter Xu, linux-mm,
	linux-kernel

This patch series addresses an issue where longterm GUP requests
could inadvertently pin unpinnable pages (such as CMA) when no
migratable folios are found. This regression was introduced by
commit 1aaf8c122918 ("mm: gup: fix infinite loop within __get_longterm_locked").

The first patch removes stale codes and unnecessary logic left.

The second patch fixes the logic to return -EAGAIN when unpinnable
pages are detected but no folios could be collected, allowing
migration to be retried rather than pinning the page.

Hyesoo Yu (2):
  mm: gup: clean up stale logic in migrate_longterm_unpinnable_folio()
  mm: gup: avoid CMA page pinning by retrying migration if no migratable
    page

 mm/gup.c | 58 +++++++++++++++++++++++++++++++-------------------------
 1 file changed, 32 insertions(+), 26 deletions(-)

-- 
2.49.0



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/2] mm: gup: clean up stale logic in migrate_longterm_unpinnable_folio()
       [not found]   ` <CGME20250605033430epcas2p37db099e1ff4f2225c2059e08bbaa97c9@epcas2p3.samsung.com>
@ 2025-06-05  3:32     ` Hyesoo Yu
  0 siblings, 0 replies; 8+ messages in thread
From: Hyesoo Yu @ 2025-06-05  3:32 UTC (permalink / raw)
  Cc: janghyuck.kim, zhaoyang.huang, jaewon31.kim, david, Hyesoo Yu,
	Andrew Morton, Jason Gunthorpe, John Hubbard, Peter Xu, linux-mm,
	linux-kernel

migrate_longterm_unpinnable_folio() always receives a non-empty
movable_folio_list. Thus, list_empty() check can be safely removed.

Also, pofs entries are fully unpinned before migration is attempted.
The err label contained unnecessary unpinning logic for pofs, which
is now removed.

No functional change intended.

Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com>
---
 mm/gup.c | 30 ++++++++++--------------------
 1 file changed, 10 insertions(+), 20 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index e065a49842a8..68d91b000199 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2353,7 +2353,12 @@ static int
 migrate_longterm_unpinnable_folios(struct list_head *movable_folio_list,
 				   struct pages_or_folios *pofs)
 {
-	int ret;
+	struct migration_target_control mtc = {
+		.gfp_mask = GFP_USER | __GFP_NOWARN,
+		.reason = MR_LONGTERM_PIN,
+		.nid = NUMA_NO_NODE,
+	};
+	int ret = -EAGAIN;
 	unsigned long i;
 
 	for (i = 0; i < pofs->nr_entries; i++) {
@@ -2370,6 +2375,7 @@ migrate_longterm_unpinnable_folios(struct list_head *movable_folio_list,
 			gup_put_folio(folio, 1, FOLL_PIN);
 
 			if (migrate_device_coherent_folio(folio)) {
+				pofs_unpin(pofs);
 				ret = -EBUSY;
 				goto err;
 			}
@@ -2388,27 +2394,11 @@ migrate_longterm_unpinnable_folios(struct list_head *movable_folio_list,
 		pofs_clear_entry(pofs, i);
 	}
 
-	if (!list_empty(movable_folio_list)) {
-		struct migration_target_control mtc = {
-			.nid = NUMA_NO_NODE,
-			.gfp_mask = GFP_USER | __GFP_NOWARN,
-			.reason = MR_LONGTERM_PIN,
-		};
-
-		if (migrate_pages(movable_folio_list, alloc_migration_target,
-				  NULL, (unsigned long)&mtc, MIGRATE_SYNC,
-				  MR_LONGTERM_PIN, NULL)) {
-			ret = -ENOMEM;
-			goto err;
-		}
-	}
-
-	putback_movable_pages(movable_folio_list);
-
-	return -EAGAIN;
+	if (migrate_pages(movable_folio_list, alloc_migration_target, NULL,
+			  (unsigned long)&mtc, MIGRATE_SYNC, MR_LONGTERM_PIN, NULL))
+		ret = -ENOMEM;
 
 err:
-	pofs_unpin(pofs);
 	putback_movable_pages(movable_folio_list);
 
 	return ret;
-- 
2.49.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/2] mm: gup: avoid CMA page pinning by retrying migration if no migratable page
       [not found]   ` <CGME20250605033432epcas2p4c024a9d05246b18c217f3562b3f53551@epcas2p4.samsung.com>
@ 2025-06-05  3:32     ` Hyesoo Yu
  2025-06-05  3:43       ` Andrew Morton
  2025-06-05  7:39       ` David Hildenbrand
  0 siblings, 2 replies; 8+ messages in thread
From: Hyesoo Yu @ 2025-06-05  3:32 UTC (permalink / raw)
  Cc: janghyuck.kim, zhaoyang.huang, jaewon31.kim, david, Hyesoo Yu,
	Andrew Morton, Jason Gunthorpe, John Hubbard, Peter Xu, linux-mm,
	linux-kernel

Commit 1aaf8c122918 ("mm: gup: fix infinite loop within __get_longterm_locked")
introduced an issue where CMA pages could be pinned by longterm GUP requests.
This occurs when unpinnable pages are detected but the movable_page_list is empty;
the commit would return success without retrying, allowing unpinnable
pages (such as CMA) to become pinned.

CMA pages may be temporarily off the LRU due to concurrent isolation,
for example when multiple longterm GUP requests are racing and therefore
not appear in movable_page_list. Before commit 1aaf8c, the kernel would
retry migration in such cases, which helped avoid accidental CMA pinning.

The original intent of the commit was to support longterm GUP on non-LRU
CMA pages in out-of-tree use cases such as pKVM. However, allowing this
can lead to broader CMA pinning issues.

To avoid this, the logic is restored to return -EAGAIN instead of success
when no folios could be collected but unpinnable pages were found.
This ensures that migration is retried until success, and avoids
inadvertently pinning unpinnable pages.

Fixes: 1aaf8c122918 ("mm: gup: fix infinite loop within __get_longterm_locked")
Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com>
---
 mm/gup.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 68d91b000199..a25c8c894882 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2300,15 +2300,13 @@ static void pofs_unpin(struct pages_or_folios *pofs)
 		unpin_user_pages(pofs->pages, pofs->nr_entries);
 }
 
-/*
- * Returns the number of collected folios. Return value is always >= 0.
- */
-static void collect_longterm_unpinnable_folios(
+static bool collect_longterm_unpinnable_folios(
 		struct list_head *movable_folio_list,
 		struct pages_or_folios *pofs)
 {
 	struct folio *prev_folio = NULL;
 	bool drain_allow = true;
+	bool any_unpinnable = false;
 	unsigned long i;
 
 	for (i = 0; i < pofs->nr_entries; i++) {
@@ -2321,6 +2319,8 @@ static void collect_longterm_unpinnable_folios(
 		if (folio_is_longterm_pinnable(folio))
 			continue;
 
+		any_unpinnable = true;
+
 		if (folio_is_device_coherent(folio))
 			continue;
 
@@ -2342,6 +2342,8 @@ static void collect_longterm_unpinnable_folios(
 				    NR_ISOLATED_ANON + folio_is_file_lru(folio),
 				    folio_nr_pages(folio));
 	}
+
+	return any_unpinnable;
 }
 
 /*
@@ -2407,11 +2409,25 @@ migrate_longterm_unpinnable_folios(struct list_head *movable_folio_list,
 static long
 check_and_migrate_movable_pages_or_folios(struct pages_or_folios *pofs)
 {
+	bool any_unpinnable;
+
 	LIST_HEAD(movable_folio_list);
 
-	collect_longterm_unpinnable_folios(&movable_folio_list, pofs);
-	if (list_empty(&movable_folio_list))
+	any_unpinnable = collect_longterm_unpinnable_folios(&movable_folio_list, pofs);
+
+	if (list_empty(&movable_folio_list)) {
+		/*
+		 * If we find any longterm unpinnable page that we failed to
+		 * isolated for migration, it might be because someone else
+		 * concurrently isolated it. Make the caller retry until it
+		 * succeeds.
+		 */
+		if (any_unpinnable) {
+			pofs_unpin(pofs);
+			return -EAGAIN;
+		}
 		return 0;
+	}
 
 	return migrate_longterm_unpinnable_folios(&movable_folio_list, pofs);
 }
-- 
2.49.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] mm: gup: avoid CMA page pinning by retrying migration if no migratable page
  2025-06-05  3:32     ` [PATCH v2 2/2] mm: gup: avoid CMA page pinning by retrying migration if no migratable page Hyesoo Yu
@ 2025-06-05  3:43       ` Andrew Morton
  2025-06-05  5:11         ` Hyesoo Yu
  2025-06-05  7:39       ` David Hildenbrand
  1 sibling, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2025-06-05  3:43 UTC (permalink / raw)
  To: Hyesoo Yu
  Cc: janghyuck.kim, zhaoyang.huang, jaewon31.kim, david,
	Jason Gunthorpe, John Hubbard, Peter Xu, linux-mm, linux-kernel

On Thu,  5 Jun 2025 12:32:07 +0900 Hyesoo Yu <hyesoo.yu@samsung.com> wrote:

> Commit 1aaf8c122918 ("mm: gup: fix infinite loop within __get_longterm_locked")
> introduced an issue where CMA pages could be pinned by longterm GUP requests.
> This occurs when unpinnable pages are detected but the movable_page_list is empty;
> the commit would return success without retrying, allowing unpinnable
> pages (such as CMA) to become pinned.
> 
> CMA pages may be temporarily off the LRU due to concurrent isolation,
> for example when multiple longterm GUP requests are racing and therefore
> not appear in movable_page_list. Before commit 1aaf8c, the kernel would
> retry migration in such cases, which helped avoid accidental CMA pinning.
> 
> The original intent of the commit was to support longterm GUP on non-LRU
> CMA pages in out-of-tree use cases such as pKVM. However, allowing this
> can lead to broader CMA pinning issues.
> 
> To avoid this, the logic is restored to return -EAGAIN instead of success
> when no folios could be collected but unpinnable pages were found.
> This ensures that migration is retried until success, and avoids
> inadvertently pinning unpinnable pages.
> 
> Fixes: 1aaf8c122918 ("mm: gup: fix infinite loop within __get_longterm_locked")

v6.14.

As ever, a question is "should we backport this fix".  To answer that
we should understand the effect the regression has upon our users. 
Readers can guess, but it's better if you tell us this, please?




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] mm: gup: avoid CMA page pinning by retrying migration if no migratable page
  2025-06-05  3:43       ` Andrew Morton
@ 2025-06-05  5:11         ` Hyesoo Yu
  2025-06-05  5:24           ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Hyesoo Yu @ 2025-06-05  5:11 UTC (permalink / raw)
  To: Andrew Morton
  Cc: janghyuck.kim, zhaoyang.huang, jaewon31.kim, david,
	Jason Gunthorpe, John Hubbard, Peter Xu, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2260 bytes --]

On Wed, Jun 04, 2025 at 08:43:23PM -0700, Andrew Morton wrote:
> On Thu,  5 Jun 2025 12:32:07 +0900 Hyesoo Yu <hyesoo.yu@samsung.com> wrote:
> 
> > Commit 1aaf8c122918 ("mm: gup: fix infinite loop within __get_longterm_locked")
> > introduced an issue where CMA pages could be pinned by longterm GUP requests.
> > This occurs when unpinnable pages are detected but the movable_page_list is empty;
> > the commit would return success without retrying, allowing unpinnable
> > pages (such as CMA) to become pinned.
> > 
> > CMA pages may be temporarily off the LRU due to concurrent isolation,
> > for example when multiple longterm GUP requests are racing and therefore
> > not appear in movable_page_list. Before commit 1aaf8c, the kernel would
> > retry migration in such cases, which helped avoid accidental CMA pinning.
> > 
> > The original intent of the commit was to support longterm GUP on non-LRU
> > CMA pages in out-of-tree use cases such as pKVM. However, allowing this
> > can lead to broader CMA pinning issues.
> > 
> > To avoid this, the logic is restored to return -EAGAIN instead of success
> > when no folios could be collected but unpinnable pages were found.
> > This ensures that migration is retried until success, and avoids
> > inadvertently pinning unpinnable pages.
> > 
> > Fixes: 1aaf8c122918 ("mm: gup: fix infinite loop within __get_longterm_locked")
> 
> v6.14.
> 
> As ever, a question is "should we backport this fix".  To answer that
> we should understand the effect the regression has upon our users. 
> Readers can guess, but it's better if you tell us this, please?
>

Hi Andrew.

We have confirmed that this regression causes CMA pages to be pinned
in our kernel 6.12-based environment.

In addition to CMA allocation failures, we also observed GUP longterm
failures in cases where the same VMA was accessed repeatedly.

Specifically, the first GUP longterm call would pin a CMA page, and a second
call on the same region would fail the migration due to the cma page already
being pinned.

After reverting commit 1aaf8c122918, the issue no longer reproduced.

Therefore, this fix is important to ensure reliable behavior of GUP longterm
and CMA-backed memory, and should be backported to stable.

Thanks,
Regards.

> 
> 

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] mm: gup: avoid CMA page pinning by retrying migration if no migratable page
  2025-06-05  5:11         ` Hyesoo Yu
@ 2025-06-05  5:24           ` Andrew Morton
  2025-06-05  6:59             ` Hyesoo Yu
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2025-06-05  5:24 UTC (permalink / raw)
  To: Hyesoo Yu
  Cc: janghyuck.kim, zhaoyang.huang, jaewon31.kim, david,
	Jason Gunthorpe, John Hubbard, Peter Xu, linux-mm, linux-kernel

On Thu, 5 Jun 2025 14:11:31 +0900 Hyesoo Yu <hyesoo.yu@samsung.com> wrote:

> We have confirmed that this regression causes CMA pages to be pinned
> in our kernel 6.12-based environment.
> 
> In addition to CMA allocation failures, we also observed GUP longterm
> failures in cases where the same VMA was accessed repeatedly.
> 
> Specifically, the first GUP longterm call would pin a CMA page, and a second
> call on the same region would fail the migration due to the cma page already
> being pinned.
> 
> After reverting commit 1aaf8c122918, the issue no longer reproduced.
> 
> Therefore, this fix is important to ensure reliable behavior of GUP longterm
> and CMA-backed memory, and should be backported to stable.

Great, thanks.  Please add this to the patch's changelog.


The problem is, this series combines a non-urgent cleanup with an
important, backportable regression fix.  We shouldn't backport the
cleanup into earlier kernels - that just adds undesirable noise.

So can I ask you to prepare a single standalone fix for the regression
against current -linus and to later propose the cleanup patch for
6.17-rc1?

In other words, pleas reverse the patching order, send the patches
separately and test the regression fix without the presence of the
cleanup?

(I could do these manipulations locally but then what I have for the
regression fix wasn't standalone tested by yourself).

Thanks.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] mm: gup: avoid CMA page pinning by retrying migration if no migratable page
  2025-06-05  5:24           ` Andrew Morton
@ 2025-06-05  6:59             ` Hyesoo Yu
  0 siblings, 0 replies; 8+ messages in thread
From: Hyesoo Yu @ 2025-06-05  6:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: janghyuck.kim, zhaoyang.huang, jaewon31.kim, david,
	Jason Gunthorpe, John Hubbard, Peter Xu, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1704 bytes --]

On Wed, Jun 04, 2025 at 10:24:00PM -0700, Andrew Morton wrote:
> On Thu, 5 Jun 2025 14:11:31 +0900 Hyesoo Yu <hyesoo.yu@samsung.com> wrote:
> 
> > We have confirmed that this regression causes CMA pages to be pinned
> > in our kernel 6.12-based environment.
> > 
> > In addition to CMA allocation failures, we also observed GUP longterm
> > failures in cases where the same VMA was accessed repeatedly.
> > 
> > Specifically, the first GUP longterm call would pin a CMA page, and a second
> > call on the same region would fail the migration due to the cma page already
> > being pinned.
> > 
> > After reverting commit 1aaf8c122918, the issue no longer reproduced.
> > 
> > Therefore, this fix is important to ensure reliable behavior of GUP longterm
> > and CMA-backed memory, and should be backported to stable.
> 
> Great, thanks.  Please add this to the patch's changelog.
> 
> 
> The problem is, this series combines a non-urgent cleanup with an
> important, backportable regression fix.  We shouldn't backport the
> cleanup into earlier kernels - that just adds undesirable noise.
> 
> So can I ask you to prepare a single standalone fix for the regression
> against current -linus and to later propose the cleanup patch for
> 6.17-rc1?
> 
> In other words, pleas reverse the patching order, send the patches
> separately and test the regression fix without the presence of the
> cleanup?
> 
> (I could do these manipulations locally but then what I have for the
> regression fix wasn't standalone tested by yourself).
> 
> Thanks.
> 

Thanks for the clarification. I'll prepare a standalone v3 patch with just the fix,
and send the cleanup separately for 6.17-rc1 as suggested.

Thanks,
Regards.

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] mm: gup: avoid CMA page pinning by retrying migration if no migratable page
  2025-06-05  3:32     ` [PATCH v2 2/2] mm: gup: avoid CMA page pinning by retrying migration if no migratable page Hyesoo Yu
  2025-06-05  3:43       ` Andrew Morton
@ 2025-06-05  7:39       ` David Hildenbrand
  1 sibling, 0 replies; 8+ messages in thread
From: David Hildenbrand @ 2025-06-05  7:39 UTC (permalink / raw)
  To: Hyesoo Yu
  Cc: janghyuck.kim, zhaoyang.huang, jaewon31.kim, Andrew Morton,
	Jason Gunthorpe, John Hubbard, Peter Xu, linux-mm, linux-kernel

On 05.06.25 05:32, Hyesoo Yu wrote:
> Commit 1aaf8c122918 ("mm: gup: fix infinite loop within __get_longterm_locked")
> introduced an issue where CMA pages could be pinned by longterm GUP requests.
> This occurs when unpinnable pages are detected but the movable_page_list is empty;
> the commit would return success without retrying, allowing unpinnable
> pages (such as CMA) to become pinned.
> 
> CMA pages may be temporarily off the LRU due to concurrent isolation,
> for example when multiple longterm GUP requests are racing and therefore
> not appear in movable_page_list. Before commit 1aaf8c, the kernel would
> retry migration in such cases, which helped avoid accidental CMA pinning.
> 
> The original intent of the commit was to support longterm GUP on non-LRU
> CMA pages in out-of-tree use cases such as pKVM. However, allowing this
> can lead to broader CMA pinning issues.
> 
> To avoid this, the logic is restored to return -EAGAIN instead of success
> when no folios could be collected but unpinnable pages were found.
> This ensures that migration is retried until success, and avoids
> inadvertently pinning unpinnable pages.
> 

The fix should always come before the cleanup. But Andrew already asked 
for a separate submission, so that will be taken care of.

> Fixes: 1aaf8c122918 ("mm: gup: fix infinite loop within __get_longterm_locked")
> Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com>
> ---
>   mm/gup.c | 28 ++++++++++++++++++++++------
>   1 file changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/gup.c b/mm/gup.c
> index 68d91b000199..a25c8c894882 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2300,15 +2300,13 @@ static void pofs_unpin(struct pages_or_folios *pofs)
>   		unpin_user_pages(pofs->pages, pofs->nr_entries);
>   }
>   
> -/*
> - * Returns the number of collected folios. Return value is always >= 0.
> - */
> -static void collect_longterm_unpinnable_folios(
> +static bool collect_longterm_unpinnable_folios(
>   		struct list_head *movable_folio_list,
>   		struct pages_or_folios *pofs)
>   {
>   	struct folio *prev_folio = NULL;
>   	bool drain_allow = true;
> +	bool any_unpinnable = false;

bool any_unpinnable = false;
bool drain_allow = true;

(maintain reverse xmas tree)


Apart from that LGTM

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-06-05  7:40 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <CGME20250605033401epcas2p46651837ba85629a50ed69db9665a52a2@epcas2p4.samsung.com>
2025-06-05  3:32 ` [PATCH v2 0/2] mm: gup: avoid CMA pinning and clean up stale logic Hyesoo Yu
     [not found]   ` <CGME20250605033430epcas2p37db099e1ff4f2225c2059e08bbaa97c9@epcas2p3.samsung.com>
2025-06-05  3:32     ` [PATCH v2 1/2] mm: gup: clean up stale logic in migrate_longterm_unpinnable_folio() Hyesoo Yu
     [not found]   ` <CGME20250605033432epcas2p4c024a9d05246b18c217f3562b3f53551@epcas2p4.samsung.com>
2025-06-05  3:32     ` [PATCH v2 2/2] mm: gup: avoid CMA page pinning by retrying migration if no migratable page Hyesoo Yu
2025-06-05  3:43       ` Andrew Morton
2025-06-05  5:11         ` Hyesoo Yu
2025-06-05  5:24           ` Andrew Morton
2025-06-05  6:59             ` Hyesoo Yu
2025-06-05  7:39       ` David Hildenbrand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).