From: Claudio Imbrenda <imbrenda@linux.ibm.com>
To: linux-next@vger.kernel.org, akpm@linux-foundation.org
Cc: borntraeger@de.ibm.com, david@redhat.com, aarcange@redhat.com,
linux-mm@kvack.org, frankja@linux.ibm.com, sfr@canb.auug.org.au,
jhubbard@nvidia.com, linux-kernel@vger.kernel.org,
linux-s390@vger.kernel.org, Will Deacon <will@kernel.org>
Subject: [RFC v1 2/2] mm/gup/writeback: add callbacks for inaccessible pages
Date: Fri, 28 Feb 2020 16:43:22 +0100
Message-ID: <20200228154322.329228-4-imbrenda@linux.ibm.com>
In-Reply-To: <20200228154322.329228-1-imbrenda@linux.ibm.com>

With the introduction of protected KVM guests on s390 there is now a
concept of inaccessible pages. These pages need to be made accessible
before the host can access them.

While CPU accesses will trigger a fault that can be resolved, I/O
accesses will just fail. We need to add a callback into architecture
code for places that will do I/O, namely when writeback is started or
when a page reference is taken.

This is not only to enable paging, file backing, etc.; it is also
necessary to protect the host against malicious user space. For
example, a bad QEMU could simply start direct I/O on such protected
memory. We do not want userspace to be able to trigger I/O errors, and
thus the logic is "whenever somebody accesses that page (gup) or
does I/O, make sure that this page can be accessed". When the guest
tries to access that page we will wait in the page fault handler for
writeback to have finished and for the page_ref to be the expected
value.

On s390x the function is not supposed to fail, so it is OK to use a
WARN_ON on failure. If we ever need more fine-grained handling,
we can tackle this when we know the details.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
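For readers who want to see the control flow outside the kernel, here is a
minimal userspace sketch of the pattern this patch adds: a generic no-op
arch_make_page_accessible() that an architecture can override by defining
HAVE_ARCH_MAKE_PAGE_ACCESSIBLE, and a pin path that undoes the pin when the
hook fails. Everything except the hook name and the guard macro is an
assumption made for illustration: the struct page fields, model_pin(),
model_unpin(), model_pin_for_io(), and the -EFAULT return value do not come
from the patch or from any architecture's implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; these fields do not exist in the kernel. */
struct page {
	bool accessible;	/* false while the page is protected */
	bool convertible;	/* assumption: whether the arch can make it accessible */
	int pincount;		/* simplified pin counter */
};

/*
 * Model of an architecture override. An arch that needs the hook defines
 * HAVE_ARCH_MAKE_PAGE_ACCESSIBLE and supplies its own function.
 * The -EFAULT error code is an arbitrary choice for this sketch.
 */
#define HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
static int arch_make_page_accessible(struct page *page)
{
	if (page->accessible)
		return 0;
	if (!page->convertible)
		return -EFAULT;
	page->accessible = true;
	return 0;
}

/* Generic fallback, same shape as the gfp.h hunk: a no-op that succeeds. */
#ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
static inline int arch_make_page_accessible(struct page *page)
{
	return 0;
}
#endif

/* Hypothetical helpers standing in for taking and dropping a pin. */
static void model_pin(struct page *page)
{
	page->pincount++;
}

static void model_unpin(struct page *page)
{
	assert(page->pincount > 0);
	page->pincount--;
}

/*
 * Mirrors the FOLL_PIN hunks: the pin is taken first, then the page is
 * made accessible; on failure the pin is dropped and the error returned.
 */
static int model_pin_for_io(struct page *page)
{
	int ret;

	model_pin(page);
	ret = arch_make_page_accessible(page);
	if (ret) {
		model_unpin(page);
		return ret;
	}
	return 0;
}
```

In this model, calling model_pin_for_io() on a page with convertible set
leaves it accessible and pinned once, while a page that cannot be converted
comes back with an error and no pin held. The writeback hunk differs in that
it cannot back out, which is why the patch only warns there.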
include/linux/gfp.h | 6 ++++++
mm/gup.c | 19 ++++++++++++++++---
mm/page-writeback.c | 5 +++++
3 files changed, 27 insertions(+), 3 deletions(-)
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index e5b817cb86e7..be2754841369 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -485,6 +485,12 @@ static inline void arch_free_page(struct page *page, int order) { }
#ifndef HAVE_ARCH_ALLOC_PAGE
static inline void arch_alloc_page(struct page *page, int order) { }
#endif
+#ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
+static inline int arch_make_page_accessible(struct page *page)
+{
+ return 0;
+}
+#endif
struct page *
__alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
diff --git a/mm/gup.c b/mm/gup.c
index 0b9a806898f3..86fff6e4e4f3 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -391,6 +391,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
struct page *page;
spinlock_t *ptl;
pte_t *ptep, pte;
+ int ret;
/* FOLL_GET and FOLL_PIN are mutually exclusive. */
if (WARN_ON_ONCE((flags & (FOLL_PIN | FOLL_GET)) ==
@@ -449,8 +450,6 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
if (is_zero_pfn(pte_pfn(pte))) {
page = pte_page(pte);
} else {
- int ret;
-
ret = follow_pfn_pte(vma, address, ptep, flags);
page = ERR_PTR(ret);
goto out;
@@ -458,7 +457,6 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
}
if (flags & FOLL_SPLIT && PageTransCompound(page)) {
- int ret;
get_page(page);
pte_unmap_unlock(ptep, ptl);
lock_page(page);
@@ -475,6 +473,14 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
page = ERR_PTR(-ENOMEM);
goto out;
}
+ if (flags & FOLL_PIN) {
+ ret = arch_make_page_accessible(page);
+ if (ret) {
+ unpin_user_page(page);
+ page = ERR_PTR(ret);
+ goto out;
+ }
+ }
if (flags & FOLL_TOUCH) {
if ((flags & FOLL_WRITE) &&
!pte_dirty(pte) && !PageDirty(page))
@@ -2143,6 +2149,13 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
VM_BUG_ON_PAGE(compound_head(page) != head, page);
+ if (flags & FOLL_PIN) {
+ ret = arch_make_page_accessible(page);
+ if (ret) {
+ unpin_user_page(page);
+ goto pte_unmap;
+ }
+ }
SetPageReferenced(page);
pages[*nr] = page;
(*nr)++;
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index ab5a3cee8ad3..8384be5a2758 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2807,6 +2807,11 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
inc_zone_page_state(page, NR_ZONE_WRITE_PENDING);
}
unlock_page_memcg(page);
+ /*
+ * If writeback has been triggered on a page that cannot be made
+ * accessible, it is too late.
+ */
+ WARN_ON(arch_make_page_accessible(page));
return ret;
}
--
2.24.1