* [merged mm-hotfixes-stable] mm-gup-stop-leaking-pinned-pages-in-low-memory-conditions.patch removed from -mm tree
@ 2024-10-24 6:15 Andrew Morton
0 siblings, 0 replies; 2+ messages in thread
From: Andrew Morton @ 2024-10-24 6:15 UTC (permalink / raw)
To: mm-commits, willy, vivek.kasireddy, syoshida, peterx,
pasha.tatashin, osalvador, minchan, kraxel, junxiao.chang, jgg,
hughd, hch, hch, dongwon.kim, david, daniel.vetter, arnd, apopple,
airlied, jhubbard, akpm
The quilt patch titled
Subject: mm/gup: stop leaking pinned pages in low memory conditions
has been removed from the -mm tree. Its filename was
mm-gup-stop-leaking-pinned-pages-in-low-memory-conditions.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: John Hubbard <jhubbard@nvidia.com>
Subject: mm/gup: stop leaking pinned pages in low memory conditions
Date: Thu, 17 Oct 2024 18:17:10 -0700
Patch series "mm/gup: stop leaking pinned pages in low memory conditions",
v2.
This patch (of 2):
If a driver tries to call any of the pin_user_pages*(FOLL_LONGTERM) family
of functions, and requests "too many" pages, then the call will
erroneously leave pages pinned. This is visible in user space as an
actual memory leak.
Repro is trivial: just make enough pin_user_pages(FOLL_LONGTERM) calls to
exhaust memory.
The root cause of the problem is this sequence, within
__gup_longterm_locked():
__get_user_pages_locked()
rc = check_and_migrate_movable_pages()
...which gets retried in a loop. The loop error handling is incomplete,
clearly due to a somewhat unusual and complicated tri-state error API.
But anyway, if -ENOMEM, or in fact, any unexpected error is returned from
check_and_migrate_movable_pages(), then __gup_longterm_locked() happily
returns the error, while leaving the pages pinned.
In the failed case, which is an app that requests (via a device driver)
30720000000 bytes to be pinned, and then exits, I see this:
$ grep foll /proc/vmstat
nr_foll_pin_acquired 7502048
nr_foll_pin_released 2048
And after applying this patch, it returns to balanced pins:
$ grep foll /proc/vmstat
nr_foll_pin_acquired 7502048
nr_foll_pin_released 7502048
Note that the child routine, check_and_migrate_movable_folios(), avoids
this problem, by unpinning any folios in the **folios argument, before
returning an error.
Fix this by making check_and_migrate_movable_pages() behave in exactly
the same way as check_and_migrate_movable_folios(): unpin all pages in
**pages, before returning an error.
Also, documentation was an aggravating factor, so:
1) Consolidate the documentation for these two routines, now that they
have identical external behavior.
2) Rewrite the consolidated documentation:
a) Clearly list the three return code cases, and what happens in
each case.
b) Mention that one of the cases unpins the pages or folios, before
returning an error code.
[jhubbard@nvidia.com: updates per David]
Link: https://lkml.kernel.org/r/20241018223411.310331-1-jhubbard@nvidia.comLink: https://lkml.kernel.org/r/20241018011711.183642-1-jhubbard@nvidia.com
Link: https://lkml.kernel.org/r/20241018011711.183642-2-jhubbard@nvidia.com
Fixes: 24a95998e9ba ("mm/gup.c: simplify and fix check_and_migrate_movable_pages() return codes")
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: Shigeru Yoshida <syoshida@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Junxiao Chang <junxiao.chang@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/gup.c | 44 ++++++++++++++++++++++++++++++--------------
1 file changed, 30 insertions(+), 14 deletions(-)
--- a/mm/gup.c~mm-gup-stop-leaking-pinned-pages-in-low-memory-conditions
+++ a/mm/gup.c
@@ -2394,20 +2394,25 @@ err:
}
/*
- * Check whether all folios are *allowed* to be pinned indefinitely (longterm).
+ * Check whether all folios are *allowed* to be pinned indefinitely (long term).
* Rather confusingly, all folios in the range are required to be pinned via
* FOLL_PIN, before calling this routine.
*
- * If any folios in the range are not allowed to be pinned, then this routine
- * will migrate those folios away, unpin all the folios in the range and return
- * -EAGAIN. The caller should re-pin the entire range with FOLL_PIN and then
- * call this routine again.
+ * Return values:
*
- * If an error other than -EAGAIN occurs, this indicates a migration failure.
- * The caller should give up, and propagate the error back up the call stack.
- *
- * If everything is OK and all folios in the range are allowed to be pinned,
+ * 0: if everything is OK and all folios in the range are allowed to be pinned,
* then this routine leaves all folios pinned and returns zero for success.
+ *
+ * -EAGAIN: if any folios in the range are not allowed to be pinned, then this
+ * routine will migrate those folios away, unpin all the folios in the range. If
+ * migration of the entire set of folios succeeds, then -EAGAIN is returned. The
+ * caller should re-pin the entire range with FOLL_PIN and then call this
+ * routine again.
+ *
+ * -ENOMEM, or any other -errno: if an error *other* than -EAGAIN occurs, this
+ * indicates a migration failure. The caller should give up, and propagate the
+ * error back up the call stack. The caller does not need to unpin any folios in
+ * that case, because this routine will do the unpinning.
*/
static long check_and_migrate_movable_folios(unsigned long nr_folios,
struct folio **folios)
@@ -2425,10 +2430,8 @@ static long check_and_migrate_movable_fo
}
/*
- * This routine just converts all the pages in the @pages array to folios and
- * calls check_and_migrate_movable_folios() to do the heavy lifting.
- *
- * Please see the check_and_migrate_movable_folios() documentation for details.
+ * Return values and behavior are the same as those for
+ * check_and_migrate_movable_folios().
*/
static long check_and_migrate_movable_pages(unsigned long nr_pages,
struct page **pages)
@@ -2437,8 +2440,10 @@ static long check_and_migrate_movable_pa
long i, ret;
folios = kmalloc_array(nr_pages, sizeof(*folios), GFP_KERNEL);
- if (!folios)
+ if (!folios) {
+ unpin_user_pages(pages, nr_pages);
return -ENOMEM;
+ }
for (i = 0; i < nr_pages; i++)
folios[i] = page_folio(pages[i]);
@@ -2492,6 +2497,17 @@ static long __gup_longterm_locked(struct
/* FOLL_LONGTERM implies FOLL_PIN */
rc = check_and_migrate_movable_pages(nr_pinned_pages, pages);
+
+ /*
+ * The __get_user_pages_locked() call happens before we know if
+ * it's possible to successfully complete the whole operation.
+ * To compensate for this, if we get an unexpected error (such
+ * as -ENOMEM) then we must unpin everything, before erroring
+ * out.
+ */
+ if (rc != -EAGAIN && rc != 0)
+ unpin_user_pages(pages, nr_pinned_pages);
+
} while (rc == -EAGAIN);
memalloc_pin_restore(flags);
return rc ? rc : nr_pinned_pages;
_
Patches currently in -mm which might be from jhubbard@nvidia.com are
kaslr-rename-physmem_end-and-physmem_end-to-direct_map_physmem_end.patch
^ permalink raw reply [flat|nested] 2+ messages in thread
* [merged mm-hotfixes-stable] mm-gup-stop-leaking-pinned-pages-in-low-memory-conditions.patch removed from -mm tree
@ 2024-10-31 3:15 Andrew Morton
0 siblings, 0 replies; 2+ messages in thread
From: Andrew Morton @ 2024-10-31 3:15 UTC (permalink / raw)
To: mm-commits, syoshida, pasha.tatashin, minchan, jgg, david,
apopple, jhubbard, akpm
The quilt patch titled
Subject: mm/gup: stop leaking pinned pages in low memory conditions
has been removed from the -mm tree. Its filename was
mm-gup-stop-leaking-pinned-pages-in-low-memory-conditions.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: John Hubbard <jhubbard@nvidia.com>
Subject: mm/gup: stop leaking pinned pages in low memory conditions
Date: Fri, 18 Oct 2024 15:34:11 -0700
If a driver tries to call any of the pin_user_pages*(FOLL_LONGTERM) family
of functions, and requests "too many" pages, then the call will
erroneously leave pages pinned. This is visible in user space as an
actual memory leak.
Repro is trivial: just make enough pin_user_pages(FOLL_LONGTERM) calls to
exhaust memory.
The root cause of the problem is this sequence, within
__gup_longterm_locked():
__get_user_pages_locked()
rc = check_and_migrate_movable_pages()
...which gets retried in a loop. The loop error handling is incomplete,
clearly due to a somewhat unusual and complicated tri-state error API.
But anyway, if -ENOMEM, or in fact, any unexpected error is returned from
check_and_migrate_movable_pages(), then __gup_longterm_locked() happily
returns the error, while leaving the pages pinned.
In the failed case, which is an app that requests (via a device driver)
30720000000 bytes to be pinned, and then exits, I see this:
$ grep foll /proc/vmstat
nr_foll_pin_acquired 7502048
nr_foll_pin_released 2048
And after applying this patch, it returns to balanced pins:
$ grep foll /proc/vmstat
nr_foll_pin_acquired 7502048
nr_foll_pin_released 7502048
Note that the child routine, check_and_migrate_movable_folios(), avoids
this problem, by unpinning any folios in the **folios argument, before
returning an error.
Fix this by making check_and_migrate_movable_pages() behave in exactly the
same way as check_and_migrate_movable_folios(): unpin all pages in
**pages, before returning an error.
Also, documentation was an aggravating factor, so:
1) Consolidate the documentation for these two routines, now that they
have identical external behavior.
2) Rewrite the consolidated documentation:
a) Clearly list the three return code cases, and what happens in
each case.
b) Mention that one of the cases unpins the pages or folios, before
returning an error code.
Link: https://lkml.kernel.org/r/20241018223411.310331-1-jhubbard@nvidia.com
Fixes: 24a95998e9ba ("mm/gup.c: simplify and fix check_and_migrate_movable_pages() return codes")
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Cc: Shigeru Yoshida <syoshida@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/gup.c | 33 +++++++++++++++++++--------------
1 file changed, 19 insertions(+), 14 deletions(-)
--- a/mm/gup.c~mm-gup-stop-leaking-pinned-pages-in-low-memory-conditions
+++ a/mm/gup.c
@@ -2394,20 +2394,25 @@ err:
}
/*
- * Check whether all folios are *allowed* to be pinned indefinitely (longterm).
+ * Check whether all folios are *allowed* to be pinned indefinitely (long term).
* Rather confusingly, all folios in the range are required to be pinned via
* FOLL_PIN, before calling this routine.
*
- * If any folios in the range are not allowed to be pinned, then this routine
- * will migrate those folios away, unpin all the folios in the range and return
- * -EAGAIN. The caller should re-pin the entire range with FOLL_PIN and then
- * call this routine again.
+ * Return values:
*
- * If an error other than -EAGAIN occurs, this indicates a migration failure.
- * The caller should give up, and propagate the error back up the call stack.
- *
- * If everything is OK and all folios in the range are allowed to be pinned,
+ * 0: if everything is OK and all folios in the range are allowed to be pinned,
* then this routine leaves all folios pinned and returns zero for success.
+ *
+ * -EAGAIN: if any folios in the range are not allowed to be pinned, then this
+ * routine will migrate those folios away, unpin all the folios in the range. If
+ * migration of the entire set of folios succeeds, then -EAGAIN is returned. The
+ * caller should re-pin the entire range with FOLL_PIN and then call this
+ * routine again.
+ *
+ * -ENOMEM, or any other -errno: if an error *other* than -EAGAIN occurs, this
+ * indicates a migration failure. The caller should give up, and propagate the
+ * error back up the call stack. The caller does not need to unpin any folios in
+ * that case, because this routine will do the unpinning.
*/
static long check_and_migrate_movable_folios(unsigned long nr_folios,
struct folio **folios)
@@ -2425,10 +2430,8 @@ static long check_and_migrate_movable_fo
}
/*
- * This routine just converts all the pages in the @pages array to folios and
- * calls check_and_migrate_movable_folios() to do the heavy lifting.
- *
- * Please see the check_and_migrate_movable_folios() documentation for details.
+ * Return values and behavior are the same as those for
+ * check_and_migrate_movable_folios().
*/
static long check_and_migrate_movable_pages(unsigned long nr_pages,
struct page **pages)
@@ -2437,8 +2440,10 @@ static long check_and_migrate_movable_pa
long i, ret;
folios = kmalloc_array(nr_pages, sizeof(*folios), GFP_KERNEL);
- if (!folios)
+ if (!folios) {
+ unpin_user_pages(pages, nr_pages);
return -ENOMEM;
+ }
for (i = 0; i < nr_pages; i++)
folios[i] = page_folio(pages[i]);
_
Patches currently in -mm which might be from jhubbard@nvidia.com are
kaslr-rename-physmem_end-and-physmem_end-to-direct_map_physmem_end.patch
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2024-10-31 3:15 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-24 6:15 [merged mm-hotfixes-stable] mm-gup-stop-leaking-pinned-pages-in-low-memory-conditions.patch removed from -mm tree Andrew Morton
-- strict thread matches above, loose matches on Subject: below --
2024-10-31 3:15 Andrew Morton
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.