From: David Howells <dhowells@redhat.com>
To: Christoph Hellwig <hch@infradead.org>,
David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>, Jens Axboe <axboe@kernel.dk>,
Al Viro <viro@zeniv.linux.org.uk>,
Matthew Wilcox <willy@infradead.org>, Jan Kara <jack@suse.cz>,
Jeff Layton <jlayton@kernel.org>,
Jason Gunthorpe <jgg@nvidia.com>,
Logan Gunthorpe <logang@deltatee.com>,
Hillf Danton <hdanton@sina.com>,
Christian Brauner <brauner@kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Andrew Morton <akpm@linux-foundation.org>
Subject: [RFC PATCH v2 1/3] mm: Don't pin ZERO_PAGE in pin_user_pages()
Date: Thu, 25 May 2023 23:39:51 +0100 [thread overview]
Message-ID: <20230525223953.225496-2-dhowells@redhat.com> (raw)
In-Reply-To: <20230525223953.225496-1-dhowells@redhat.com>
Make pin_user_pages*() leave a ZERO_PAGE unpinned if it extracts a pointer
to it from the page tables and make unpin_user_page*() correspondingly
ignore a ZERO_PAGE when unpinning. We don't want to risk overrunning a
zero page's refcount as we're only allowed ~2 million pins on it -
something that userspace can conceivably trigger.
Add a pair of functions to test whether a page or a folio is a ZERO_PAGE.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@infradead.org>
cc: David Hildenbrand <david@redhat.com>
cc: Andrew Morton <akpm@linux-foundation.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Matthew Wilcox <willy@infradead.org>
cc: Jan Kara <jack@suse.cz>
cc: Jeff Layton <jlayton@kernel.org>
cc: Jason Gunthorpe <jgg@nvidia.com>
cc: Logan Gunthorpe <logang@deltatee.com>
cc: Hillf Danton <hdanton@sina.com>
cc: Christian Brauner <brauner@kernel.org>
cc: Linus Torvalds <torvalds@linux-foundation.org>
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-kernel@vger.kernel.org
cc: linux-mm@kvack.org
---
Notes:
ver #2)
- Fix use of ZERO_PAGE().
- Add is_zero_page() and is_zero_folio() wrappers.
- Return the zero page obtained, not ZERO_PAGE(0) unconditionally.
include/linux/pgtable.h | 10 ++++++++++
mm/gup.c | 25 ++++++++++++++++++++++++-
2 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index c5a51481bbb9..2b0431a11de2 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1245,6 +1245,16 @@ static inline unsigned long my_zero_pfn(unsigned long addr)
}
#endif /* CONFIG_MMU */
+static inline bool is_zero_page(const struct page *page)
+{
+ return is_zero_pfn(page_to_pfn(page));
+}
+
+static inline bool is_zero_folio(const struct folio *folio)
+{
+ return is_zero_page(&folio->page);
+}
+
#ifdef CONFIG_MMU
#ifndef CONFIG_TRANSPARENT_HUGEPAGE
diff --git a/mm/gup.c b/mm/gup.c
index bbe416236593..69b002628f5d 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -51,7 +51,8 @@ static inline void sanity_check_pinned_pages(struct page **pages,
struct page *page = *pages;
struct folio *folio = page_folio(page);
- if (!folio_test_anon(folio))
+ if (is_zero_page(page) ||
+ !folio_test_anon(folio))
continue;
if (!folio_test_large(folio) || folio_test_hugetlb(folio))
VM_BUG_ON_PAGE(!PageAnonExclusive(&folio->page), page);
@@ -131,6 +132,13 @@ struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags)
else if (flags & FOLL_PIN) {
struct folio *folio;
+ /*
+ * Don't take a pin on the zero page - it's not going anywhere
+ * and it is used in a *lot* of places.
+ */
+ if (is_zero_page(page))
+ return page_folio(page);
+
/*
* Can't do FOLL_LONGTERM + FOLL_PIN gup fast path if not in a
* right zone, so fail and let the caller fall back to the slow
@@ -180,6 +188,8 @@ struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags)
static void gup_put_folio(struct folio *folio, int refs, unsigned int flags)
{
if (flags & FOLL_PIN) {
+ if (is_zero_folio(folio))
+ return;
node_stat_mod_folio(folio, NR_FOLL_PIN_RELEASED, refs);
if (folio_test_large(folio))
atomic_sub(refs, &folio->_pincount);
@@ -224,6 +234,13 @@ int __must_check try_grab_page(struct page *page, unsigned int flags)
if (flags & FOLL_GET)
folio_ref_inc(folio);
else if (flags & FOLL_PIN) {
+ /*
+ * Don't take a pin on the zero page - it's not going anywhere
+ * and it is used in a *lot* of places.
+ */
+ if (is_zero_page(page))
+ return 0;
+
/*
* Similar to try_grab_folio(): be sure to *also*
* increment the normal page refcount field at least once,
@@ -3079,6 +3096,9 @@ EXPORT_SYMBOL_GPL(get_user_pages_fast);
*
* FOLL_PIN means that the pages must be released via unpin_user_page(). Please
* see Documentation/core-api/pin_user_pages.rst for further details.
+ *
+ * Note that if the zero_page is amongst the returned pages, it will not have
+ * pins in it and unpin_user_page() will not remove pins from it.
*/
int pin_user_pages_fast(unsigned long start, int nr_pages,
unsigned int gup_flags, struct page **pages)
@@ -3161,6 +3181,9 @@ EXPORT_SYMBOL(pin_user_pages);
* pin_user_pages_unlocked() is the FOLL_PIN variant of
* get_user_pages_unlocked(). Behavior is the same, except that this one sets
* FOLL_PIN and rejects FOLL_GET.
+ *
+ * Note that if the zero_page is amongst the returned pages, it will not have
+ * pins in it and unpin_user_page() will not remove pins from it.
*/
long pin_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
struct page **pages, unsigned int gup_flags)
next prev parent reply other threads:[~2023-05-25 22:41 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-25 22:39 [RFC PATCH v2 0/3] block: Make old dio use iov_iter_extract_pages() and page pinning David Howells
2023-05-25 22:39 ` David Howells [this message]
2023-05-26 8:10 ` [RFC PATCH v2 1/3] mm: Don't pin ZERO_PAGE in pin_user_pages() Lorenzo Stoakes
2023-05-26 8:17 ` Christoph Hellwig
2023-05-26 8:22 ` Christoph Hellwig
2023-05-26 8:27 ` Lorenzo Stoakes
2023-05-26 8:29 ` David Howells
2023-05-26 8:43 ` David Howells
2023-05-26 8:58 ` Lorenzo Stoakes
2023-05-26 9:15 ` David Howells
2023-05-26 9:23 ` Lorenzo Stoakes
2023-05-26 9:24 ` David Hildenbrand
2023-05-25 22:39 ` [RFC PATCH v2 2/3] mm: Provide a function to get an additional pin on a page David Howells
2023-05-26 2:29 ` Linus Torvalds
2023-05-26 5:50 ` David Howells
2023-05-26 8:20 ` Christoph Hellwig
2023-05-26 8:31 ` David Howells
2023-05-25 22:39 ` [RFC PATCH v2 3/3] block: Use iov_iter_extract_pages() and page pinning in direct-io.c David Howells
2023-05-26 8:26 ` Christoph Hellwig
2023-05-26 8:33 ` David Howells
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230525223953.225496-2-dhowells@redhat.com \
--to=dhowells@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=axboe@kernel.dk \
--cc=brauner@kernel.org \
--cc=david@redhat.com \
--cc=hch@infradead.org \
--cc=hdanton@sina.com \
--cc=jack@suse.cz \
--cc=jgg@nvidia.com \
--cc=jlayton@kernel.org \
--cc=linux-block@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=logang@deltatee.com \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).