From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 23 Apr 2026 14:59:58 +0200
From: Greg Kroah-Hartman
To: "David Hildenbrand (Arm)"
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton,
	Jason Gunthorpe, John Hubbard, Peter Xu
Subject: Re: [PATCH] mm/gup: honour FOLL_PIN in NOMMU __get_user_pages_locked()
Message-ID: <2026042314-traffic-riverbank-d9a1@gregkh>
References: <2026042334-acutely-unadorned-e05c@gregkh>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: 

On Thu, Apr 23, 2026 at 02:47:23PM +0200, David Hildenbrand (Arm) wrote:
> On 4/23/26 14:31, Greg Kroah-Hartman wrote:
> > The !CONFIG_MMU implementation of __get_user_pages_locked() takes a bare
> > get_page() reference for each page regardless of foll_flags:
> > 
> > 	if (pages[i])
> > 		get_page(pages[i]);
> > 
> > This is reached from pin_user_pages*() with FOLL_PIN set.
> > unpin_user_page() is shared between MMU and NOMMU configurations and
> > unconditionally calls gup_put_folio(..., FOLL_PIN), which subtracts
> > GUP_PIN_COUNTING_BIAS (1024) from the folio refcount.
> > 
> > This means that pin adds 1, and then unpin will subtract 1024.
> > 
> > If a user maps a page (refcount 1), registers it 1023 times as an
> > io_uring fixed buffer (1023 pin_user_pages calls -> refcount 1024), then
> > unregisters: the first unpin_user_page subtracts 1024, refcount hits 0,
> > the page is freed and returned to the buddy allocator. The remaining
> > 1022 unpins write into whatever was reallocated, and the user's VMA
> > still maps the freed page (NOMMU has no MMU to invalidate it).
> > Reallocating the page for an io_uring pbuf_ring then lets userspace
> > corrupt the new owner's data through the stale mapping.
> > 
> > Use try_grab_folio() which adds GUP_PIN_COUNTING_BIAS for FOLL_PIN and 1
> > for FOLL_GET, mirroring the CONFIG_MMU path so pin and unpin are
> > symmetric.
> > 
> > Cc: Andrew Morton
> > Cc: David Hildenbrand
> > Cc: Jason Gunthorpe
> > Cc: John Hubbard
> > Cc: Peter Xu
> > Reported-by: Anthropic
> > Assisted-by: gkh_clanker_t1000
> > Signed-off-by: Greg Kroah-Hartman
> > ---
> > My first foray into -mm, eeek!
> 
> Oh, nommu ... what a great use of our time.

Yeah, tell me about it.  I have been cursing a specific company's name a
lot these past days...

> I was briefly wondering if we want to add a Fixes: ... but then, this was likely
> broken for years and nobody cared so far in practice.

Agreed.

> > 
> > Anyway, this was a crazy report sent to me, and I knocked up this
> > change, and I have a reproducer if people need/want to see that as well
> > (it's for nommu systems, so be wary of it.)
> 
> [...]
> 
> > -				get_page(pages[i]);
> > +			if (pages[i]) {
> > +				/*
> > +				 * pin_user_pages*() arrives here with FOLL_PIN
> > +				 * set; unpin_user_page() (which is not
> > +				 * !CONFIG_MMU-specific) calls
> > +				 * gup_put_folio(..., FOLL_PIN) which subtracts
> > +				 * GUP_PIN_COUNTING_BIAS (1024).  A bare
> > +				 * get_page() here adds only 1, so 1023 pins on
> > +				 * a fresh page bring refcount to 1024 and a
> > +				 * single unpin then frees it out from under the
> > +				 * remaining 1022 pins and any live VMA
> > +				 * mappings.  Use the same grab path as the MMU
> > +				 * implementation so pin and unpin are
> > +				 * symmetric.
> > +				 */
> 
> Yeah, drop all that.  Especially the hardcoded 1024/1022 is just screaming for
> trouble longterm.

Ok, will drop!

> It just follows what we do everywhere else (e.g., follow_page_pte()).
> 
> > +				if (try_grab_folio(page_folio(pages[i]), 1,
> > +						   foll_flags)) {
> > +					pages[i] = NULL;
> > +					break;
> > +				}
> > +			}
> 
> If it fails on the first iteration, we return -EFAULT instead of -ENOMEM.
> 
> I know, I know, nobody cares.  But if we touch it, we might just want to return
> the error we get from try_grab_folio().  So just abort here and return it?

No, that will not work, there's a lock we would jump around.

How about something like this horrid thing, adding back in the relevant
unlikely() to match the other calls like this:

diff --git a/mm/gup.c b/mm/gup.c
index ad9ded39609c..8fa5b37be8b7 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1983,6 +1983,7 @@ static long __get_user_pages_locked(struct mm_struct *mm, unsigned long start,
 	struct vm_area_struct *vma;
 	bool must_unlock = false;
 	vm_flags_t vm_flags;
+	int ret;
 	long i;
 
 	if (!nr_pages)
@@ -2019,8 +2020,15 @@ static long __get_user_pages_locked(struct mm_struct *mm, unsigned long start,
 
 		if (pages) {
 			pages[i] = virt_to_page((void *)start);
-			if (pages[i])
-				get_page(pages[i]);
+			if (pages[i]) {
+				ret = try_grab_folio(page_folio(pages[i]), 1,
+						     foll_flags);
+				if (unlikely(ret)) {
+					pages[i] = NULL;
+					i = ret;
+					break;
+				}
+			}
 		}
 
 		start = (start + PAGE_SIZE) & PAGE_MASK;