Re: [PATCH] mm/gup: honour FOLL_PIN in NOMMU __get_user_pages_locked()


From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	John Hubbard <jhubbard@nvidia.com>, Peter Xu <peterx@redhat.com>
Subject: Re: [PATCH] mm/gup: honour FOLL_PIN in NOMMU __get_user_pages_locked()
Date: Thu, 23 Apr 2026 15:00:42 +0200
Message-ID: <fff4048d-8fad-4a63-b1d4-688ec3608fe8@kernel.org>
In-Reply-To: <dee5b7e8-864d-48b5-99cf-282eedaba927@kernel.org>

On 4/23/26 14:47, David Hildenbrand (Arm) wrote:
> On 4/23/26 14:31, Greg Kroah-Hartman wrote:
>> The !CONFIG_MMU implementation of __get_user_pages_locked() takes a bare
>> get_page() reference for each page regardless of foll_flags:
>> 	if (pages[i])
>> 		get_page(pages[i]);
>>
>> This is reached from pin_user_pages*() with FOLL_PIN set.
>> unpin_user_page() is shared between MMU and NOMMU configurations and
>> unconditionally calls gup_put_folio(..., FOLL_PIN), which subtracts
>> GUP_PIN_COUNTING_BIAS (1024) from the folio refcount.
>>
>> This means that pin adds 1, and then unpin will subtract 1024.
>>
>> If a user maps a page (refcount 1), registers it 1023 times as an
>> io_uring fixed buffer (1023 pin_user_pages calls -> refcount 1024), then
>> unregisters: the first unpin_user_page subtracts 1024, refcount hits 0,
>> the page is freed and returned to the buddy allocator.  The remaining
>> 1022 unpins write into whatever was reallocated, and the user's VMA
>> still maps the freed page (NOMMU has no MMU to invalidate it).
>> Reallocating the page for an io_uring pbuf_ring then lets userspace
>> corrupt the new owner's data through the stale mapping.
>>
>> Use try_grab_folio() which adds GUP_PIN_COUNTING_BIAS for FOLL_PIN and 1
>> for FOLL_GET, mirroring the CONFIG_MMU path so pin and unpin are
>> symmetric.
>>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: David Hildenbrand <david@kernel.org>
>> Cc: Jason Gunthorpe <jgg@ziepe.ca>
>> Cc: John Hubbard <jhubbard@nvidia.com>
>> Cc: Peter Xu <peterx@redhat.com>
>> Reported-by: Anthropic
>> Assisted-by: gkh_clanker_t1000
>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> ---
>> My first foray into -mm, eeek!
> 
> Oh, nommu ... what a great use of our time.
> 
> I was briefly wondering if we want to add a Fixes: ... but then, this was likely
> broken for years and nobody cared so far in practice.
> 
>>
>> Anyway, this was a crazy report sent to me, and I knocked up this
>> change, and I have a reproducer if people need/want to see that as well
>> (it's for nommu systems, so be wary of it.)
> 
> [...]
> 
>> -				get_page(pages[i]);
>> +			if (pages[i]) {
>> +				/*
>> +				 * pin_user_pages*() arrives here with FOLL_PIN
>> +				 * set; unpin_user_page() (which is not
>> +				 * !CONFIG_MMU-specific) calls
>> +				 * gup_put_folio(..., FOLL_PIN) which subtracts
>> +				 * GUP_PIN_COUNTING_BIAS (1024).  A bare
>> +				 * get_page() here adds only 1, so 1023 pins on
>> +				 * a fresh page bring refcount to 1024 and a
>> +				 * single unpin then frees it out from under the
>> +				 * remaining 1022 pins and any live VMA
>> +				 * mappings. Use the same grab path as the MMU
>> +				 * implementation so pin and unpin are
>> +				 * symmetric.
>> +				 */
> 
> Yeah, drop all that. Especially the hardcoded 1024/1022 is just screaming for
> trouble longterm.
> 
> It just follows what we do everywhere else (e.g., follow_page_pte()).
> 
> 
>> +				if (try_grab_folio(page_folio(pages[i]), 1,
>> +						   foll_flags)) {
>> +					pages[i] = NULL;
>> +					break;
>> +				}
>> +			}
> 
> If it fails on the first iteration, we return -EFAULT instead of -ENOMEM.
> 
> I know, I know, nobody cares. But if we touch it, we might just want to return
> the error we get from try_grab_folio().
> 
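
To make the asymmetry from the patch description concrete, here is a rough
userspace model of the counting. It is only a sketch: the model_* names are
made up for illustration and are not kernel APIs, and MODEL_PIN_BIAS is just
the 1024 quoted in the patch description. The real code should keep using
GUP_PIN_COUNTING_BIAS rather than a literal.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: the bias value quoted in the patch description. */
#define MODEL_PIN_BIAS 1024

/* Userspace stand-in for a folio refcount; not a kernel type. */
struct model_page {
	int refcount;
	bool freed;
};

/* Old NOMMU behaviour: every pin takes a single reference. */
static void model_pin_buggy(struct model_page *p)
{
	p->refcount += 1;
}

/* Patched behaviour: a pin adds the full bias, as on the MMU path. */
static void model_pin_fixed(struct model_page *p)
{
	p->refcount += MODEL_PIN_BIAS;
}

/* Unpin always subtracts the bias; the page is freed when it hits zero. */
static void model_unpin(struct model_page *p)
{
	assert(!p->freed);
	p->refcount -= MODEL_PIN_BIAS;
	if (p->refcount == 0)
		p->freed = true;
}

/* Map a page (refcount 1), pin it npins times, then unpin once. */
static bool model_freed_after_first_unpin(int npins, bool fixed)
{
	struct model_page p = { .refcount = 1, .freed = false };

	for (int i = 0; i < npins; i++) {
		if (fixed)
			model_pin_fixed(&p);
		else
			model_pin_buggy(&p);
	}
	model_unpin(&p);
	return p.freed;
}
```

With 1023 pins, the buggy variant reports the page as freed after the first
unpin, while the fixed variant does not.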


BTW, looking into this, I am not sure if continuing on !pages[i] is a sane thing
to do.

IIRC, we are not supposed to return a NULL page pointer from this function.

We support it in unpin_user_pages() only for internal purposes.
unpin_user_pages_dirty_lock() does not support/expect that.

That should likely be -EFAULT (if it is the first page).
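
Roughly, the return convention I have in mind looks like the userspace model
below. This is a sketch under assumptions, not the actual mm/gup.c code:
demo_try_grab() is a hypothetical stand-in for the grab step that returns 0
or -ENOMEM, and the fail_at parameter exists only to inject a failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-in for a page; not a kernel type. */
struct demo_page {
	int refcount;
};

/* Hypothetical grab step: 0 on success, -ENOMEM on (injected) failure. */
static int demo_try_grab(struct demo_page *p, int fail)
{
	if (fail)
		return -ENOMEM;
	p->refcount++;
	return 0;
}

/*
 * Grab up to n pages. Returns the number of pages grabbed if at least
 * one succeeded; otherwise returns the error from the first page:
 * -EFAULT for a NULL entry, or whatever the grab step reported.
 */
static long demo_grab_pages(struct demo_page **pages, size_t n,
			    size_t fail_at)
{
	size_t i;
	int ret = 0;

	assert(pages != NULL);

	for (i = 0; i < n; i++) {
		if (!pages[i]) {
			/* Stop instead of handing back a NULL entry. */
			ret = -EFAULT;
			break;
		}
		ret = demo_try_grab(pages[i], i == fail_at);
		if (ret) {
			pages[i] = NULL;
			break;
		}
	}

	return i ? (long)i : ret;
}
```

A failure on the first page then propagates the grab error (-ENOMEM here)
rather than a blanket -EFAULT, a NULL first entry gives -EFAULT, and a
failure on a later page returns the count of pages grabbed so far.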

-- 
Cheers,

David

Thread overview: 7+ messages
2026-04-23 12:31 [PATCH] mm/gup: honour FOLL_PIN in NOMMU __get_user_pages_locked() Greg Kroah-Hartman
2026-04-23 12:47 ` David Hildenbrand (Arm)
2026-04-23 12:59   ` Greg Kroah-Hartman
2026-04-23 13:04     ` David Hildenbrand (Arm)
2026-04-23 13:11       ` Greg Kroah-Hartman
2026-04-23 13:42       ` Greg Kroah-Hartman
2026-04-23 13:00   ` David Hildenbrand (Arm) [this message]
