Re: [PATCH] mm/gup: stop leaking pinned pages in low memory conditions

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Alistair Popple <apopple@nvidia.com>
To: David Hildenbrand <david@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, Shigeru Yoshida <syoshida@redhat.com>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Minchan Kim <minchan@kernel.org>,
	Pasha Tatashin <pasha.tatashin@soleen.com>
Subject: Re: [PATCH] mm/gup: stop leaking pinned pages in low memory conditions
Date: Fri, 18 Oct 2024 08:28:49 +1100	[thread overview]
Message-ID: <87v7xqmmxt.fsf@nvdebian.thelocal> (raw)
In-Reply-To: <1f8dcae1-416e-43cf-8dda-5440e0db4c00@redhat.com>


David Hildenbrand <david@redhat.com> writes:

> On 16.10.24 22:22, John Hubbard wrote:
>> If a driver tries to call any of the pin_user_pages*(FOLL_LONGTERM)
>> family of functions, and requests "too many" pages, then the call will
>> erroneously leave pages pinned. This is visible in user space as an
>> actual memory leak.
>> Repro is trivial: just make enough pin_user_pages(FOLL_LONGTERM)
>> calls
>> to exhaust memory.
>> The root cause of the problem is this sequence, within
>> __gup_longterm_locked():
>>      __get_user_pages_locked()
>>      rc = check_and_migrate_movable_pages()
>> ...which gets retried in a loop. The loop error handling is
>> incomplete,
>> clearly due to a somewhat unusual and complicated tri-state error API.
>> But anyway, if -ENOMEM, or in fact, any unexpected error is returned
>> from check_and_migrate_movable_pages(), then __gup_longterm_locked()
>> happily returns the error, while leaving the pages pinned.
>> In the failed case, which is an app that requests (via a device
>> driver)
>> 30720000000 bytes to be pinned, and then exits, I see this:
>>      $ grep foll /proc/vmstat
>>          nr_foll_pin_acquired 7502048
>>          nr_foll_pin_released 2048
>> And after applying this patch, it returns to balanced pins:
>>      $ grep foll /proc/vmstat
>>          nr_foll_pin_acquired 7502048
>>          nr_foll_pin_released 7502048
>> Fix this by unpinning the pages that __get_user_pages_locked() has
>> pinned, in such error cases.
>> Fixes: 24a95998e9ba ("mm/gup.c: simplify and fix
>> check_and_migrate_movable_pages() return codes")
>> Cc: Alistair Popple <apopple@nvidia.com>
>> Cc: Shigeru Yoshida <syoshida@redhat.com>
>> Cc: David Hildenbrand <david@redhat.com>
>> Cc: Jason Gunthorpe <jgg@nvidia.com>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>> ---
>>   mm/gup.c | 11 +++++++++++
>>   1 file changed, 11 insertions(+)
>> diff --git a/mm/gup.c b/mm/gup.c
>> index a82890b46a36..24acf53c8294 100644
>> --- a/mm/gup.c
>> +++ b/mm/gup.c
>> @@ -2492,6 +2492,17 @@ static long __gup_longterm_locked(struct mm_struct *mm,
>>     		/* FOLL_LONGTERM implies FOLL_PIN */
>>   		rc = check_and_migrate_movable_pages(nr_pinned_pages, pages);
>> +
>> +		/*
>> +		 * The __get_user_pages_locked() call happens before we know
>> +		 * that whether it's possible to successfully complete the whole
>> +		 * operation. To compensate for this, if we get an unexpected
>> +		 * error (such as -ENOMEM) then we must unpin everything, before
>> +		 * erroring out.
>> +		 */
>> +		if (rc != -EAGAIN && rc != 0)
>> +			unpin_user_pages(pages, nr_pinned_pages);
>> +
>>   	} while (rc == -EAGAIN);
>
> Wouldn't it be cleaner to simply have here after the loop (possibly
> even after the memalloc_pin_restore())
>
> if (rc)
> 	unpin_user_pages(pages, nr_pinned_pages);
>
> But maybe I am missing something.

I initially thought the same thing but I'm not sure it is
correct. Consider what happens when __get_user_pages_locked() fails
earlier in the loop. In this case it will have bailed out of the loop
with rc <= 0 but we shouldn't call unpin_user_pages().

>>   	memalloc_pin_restore(flags);
>>   	return rc ? rc : nr_pinned_pages;

next prev parent reply	other threads:[~2024-10-17 21:36 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-16 20:22 [PATCH] mm/gup: stop leaking pinned pages in low memory conditions John Hubbard
2024-10-16 21:57 ` Andrew Morton
2024-10-16 22:05   ` John Hubbard
2024-10-16 22:41     ` Andrew Morton
2024-10-16 22:13 ` Alistair Popple
2024-10-16 22:22   ` John Hubbard
2024-10-17  8:51 ` David Hildenbrand
2024-10-17  8:53   ` David Hildenbrand
2024-10-17 17:06     ` John Hubbard
2024-10-17 17:10       ` David Hildenbrand
2024-10-17 16:54   ` John Hubbard
2024-10-17 21:28   ` Alistair Popple [this message]
2024-10-17 21:47     ` David Hildenbrand
2024-10-17 21:57       ` John Hubbard
2024-10-17 22:03         ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87v7xqmmxt.fsf@nvdebian.thelocal \
    --to=apopple@nvidia.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan@kernel.org \
    --cc=pasha.tatashin@soleen.com \
    --cc=syoshida@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.