linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/7]  userfaultfd: add support for shared memory
@ 2016-08-04  8:14 Mike Rapoport
  2016-08-04  8:14 ` [PATCH 1/7] userfaultfd: introduce vma_can_userfault Mike Rapoport
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Mike Rapoport @ 2016-08-04  8:14 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel,
	Mike Rapoport

These patches enable userfaultfd support for shared memory mappings. The
VMAs backed with shmem/tmpfs can be registered with userfaultfd which
allows management of page faults in these areas by userland.

This patch set adds implementation of shmem_mcopy_atomic_pte for proper
handling of UFFDIO_COPY command. A callback to handle_userfault is added
to shmem page fault handling path. The userfaultfd register/unregister
methods are extended to allow shmem VMAs.

The UFFDIO_ZEROPAGE and UFFDIO_REGISTER_MODE_WP are not implemented which
is reflected by userfaultfd API handshake methods.

The patches are based on current Andrea's tree:
https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git

Mike Rapoport (7):
  userfaultfd: introduce vma_can_userfault
  userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support
  userfaultfd: shmem: introduce vma_is_shmem
  userfaultfd: shmem: use shmem_mcopy_atomic_pte for shared memory
  userfaultfd: shmem: add userfaultfd hook for shared memory faults
  userfaultfd: shmem: allow registration of shared memory ranges
  userfaultfd: shmem: add userfaultfd_shmem test

 fs/userfaultfd.c                         |  32 ++++---
 include/linux/mm.h                       |  10 +++
 include/linux/shmem_fs.h                 |  11 +++
 include/uapi/linux/userfaultfd.h         |   2 +-
 mm/shmem.c                               | 139 +++++++++++++++++++++++++++++--
 mm/userfaultfd.c                         |  31 ++++---
 tools/testing/selftests/vm/Makefile      |   3 +
 tools/testing/selftests/vm/run_vmtests   |  11 +++
 tools/testing/selftests/vm/userfaultfd.c |  39 ++++++++-
 9 files changed, 237 insertions(+), 41 deletions(-)

-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/7] userfaultfd: introduce vma_can_userfault
  2016-08-04  8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
@ 2016-08-04  8:14 ` Mike Rapoport
  2016-08-04  8:14 ` [PATCH 2/7] userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support Mike Rapoport
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Mike Rapoport @ 2016-08-04  8:14 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel,
	Mike Rapoport

Check whether a VMA can be used with userfault in more compact way

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 fs/userfaultfd.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index f48f709..2aab2e1 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1066,6 +1066,11 @@ static __always_inline int validate_range(struct mm_struct *mm,
 	return 0;
 }
 
+static inline bool vma_can_userfault(struct vm_area_struct *vma)
+{
+	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma);
+}
+
 static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 				unsigned long arg)
 {
@@ -1148,7 +1153,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 
 		/* check not compatible vmas */
 		ret = -EINVAL;
-		if (!vma_is_anonymous(cur) && !is_vm_hugetlb_page(cur))
+		if (!vma_can_userfault(cur))
 			goto out_unlock;
 		/* FIXME: add WP support to hugetlbfs */
 		if (is_vm_hugetlb_page(cur) && vm_flags & VM_UFFD_WP)
@@ -1197,7 +1202,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	do {
 		cond_resched();
 
-		BUG_ON(!vma_is_anonymous(vma) && !is_vm_hugetlb_page(vma));
+		BUG_ON(!vma_can_userfault(vma));
 		BUG_ON(vma->vm_userfaultfd_ctx.ctx &&
 		       vma->vm_userfaultfd_ctx.ctx != ctx);
 
@@ -1335,7 +1340,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 		 * provides for more strict behavior to notice
 		 * unregistration errors.
 		 */
-		if (!vma_is_anonymous(cur) && !is_vm_hugetlb_page(cur))
+		if (!vma_can_userfault(cur))
 			goto out_unlock;
 
 		found = true;
@@ -1349,7 +1354,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 	do {
 		cond_resched();
 
-		BUG_ON(!vma_is_anonymous(vma) && !is_vm_hugetlb_page(vma));
+		BUG_ON(!vma_can_userfault(vma));
 
 		/*
 		 * Nothing to do: this vma is already registered into this
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/7] userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support
  2016-08-04  8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
  2016-08-04  8:14 ` [PATCH 1/7] userfaultfd: introduce vma_can_userfault Mike Rapoport
@ 2016-08-04  8:14 ` Mike Rapoport
  2016-08-04  8:14 ` [PATCH 3/7] userfaultfd: shmem: introduce vma_is_shmem Mike Rapoport
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Mike Rapoport @ 2016-08-04  8:14 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel,
	Mike Rapoport

shmem_mcopy_atomic_pte is the low level routine that implements
the userfaultfd UFFDIO_COPY command.  It is based on the existing
mcopy_atomic_pte routine with modifications for shared memory pages.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 include/linux/shmem_fs.h |  11 +++++
 mm/shmem.c               | 109 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 120 insertions(+)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 4d4780c..8dcbdfd 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -83,4 +83,15 @@ static inline long shmem_fcntl(struct file *f, unsigned int c, unsigned long a)
 
 #endif
 
+#ifdef CONFIG_SHMEM
+extern int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
+				  struct vm_area_struct *dst_vma,
+				  unsigned long dst_addr,
+				  unsigned long src_addr,
+				  struct page **pagep);
+#else
+#define shmem_mcopy_atomic_pte(dst_mm, dst_pte, dst_vma, dst_addr, \
+			       src_addr, pagep)        ({ BUG(); 0; })
+#endif
+
 #endif
diff --git a/mm/shmem.c b/mm/shmem.c
index a361449..fcf560c 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -69,6 +69,7 @@ static struct vfsmount *shm_mnt;
 #include <linux/syscalls.h>
 #include <linux/fcntl.h>
 #include <uapi/linux/memfd.h>
+#include <linux/rmap.h>
 
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
@@ -1548,6 +1549,114 @@ bool shmem_mapping(struct address_space *mapping)
 	return mapping->host->i_sb->s_op == &shmem_ops;
 }
 
+int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm,
+			   pmd_t *dst_pmd,
+			   struct vm_area_struct *dst_vma,
+			   unsigned long dst_addr,
+			   unsigned long src_addr,
+			   struct page **pagep)
+{
+	struct inode *inode = file_inode(dst_vma->vm_file);
+	struct shmem_inode_info *info = SHMEM_I(inode);
+	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
+	struct address_space *mapping = inode->i_mapping;
+	gfp_t gfp = mapping_gfp_mask(mapping);
+	pgoff_t pgoff = linear_page_index(dst_vma, dst_addr);
+	struct mem_cgroup *memcg;
+	spinlock_t *ptl;
+	void *page_kaddr;
+	struct page *page;
+	pte_t _dst_pte, *dst_pte;
+	int ret;
+
+	if (!*pagep) {
+		ret = -ENOMEM;
+		if (shmem_acct_block(info->flags))
+			goto out;
+		if (sbinfo->max_blocks) {
+			if (percpu_counter_compare(&sbinfo->used_blocks,
+						   sbinfo->max_blocks) >= 0)
+				goto out_unacct_blocks;
+			percpu_counter_inc(&sbinfo->used_blocks);
+		}
+
+		page = shmem_alloc_page(gfp, info, pgoff);
+		if (!page)
+			goto out_dec_used_blocks;
+
+		page_kaddr = kmap_atomic(page);
+		ret = copy_from_user(page_kaddr, (const void __user *)src_addr,
+				     PAGE_SIZE);
+		kunmap_atomic(page_kaddr);
+
+		/* fallback to copy_from_user outside mmap_sem */
+		if (unlikely(ret)) {
+			*pagep = page;
+			/* don't free the page */
+			return -EFAULT;
+		}
+	} else {
+		page = *pagep;
+		*pagep = NULL;
+	}
+
+	_dst_pte = mk_pte(page, dst_vma->vm_page_prot);
+	if (dst_vma->vm_flags & VM_WRITE)
+		_dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte));
+
+	ret = -EEXIST;
+	dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
+	if (!pte_none(*dst_pte))
+		goto out_release_uncharge_unlock;
+
+	__SetPageUptodate(page);
+
+	ret = mem_cgroup_try_charge(page, dst_mm, gfp, &memcg,
+				    false);
+	if (ret)
+		goto out_release_uncharge_unlock;
+	ret = radix_tree_maybe_preload(gfp & GFP_RECLAIM_MASK);
+	if (!ret) {
+		ret = shmem_add_to_page_cache(page, mapping, pgoff, NULL);
+		radix_tree_preload_end();
+	}
+	if (ret) {
+		mem_cgroup_cancel_charge(page, memcg, false);
+		goto out_release_uncharge_unlock;
+	}
+
+	mem_cgroup_commit_charge(page, memcg, false, false);
+	lru_cache_add_anon(page);
+
+	spin_lock(&info->lock);
+	info->alloced++;
+	inode->i_blocks += BLOCKS_PER_PAGE;
+	shmem_recalc_inode(inode);
+	spin_unlock(&info->lock);
+
+	inc_mm_counter(dst_mm, mm_counter_file(page));
+	page_add_file_rmap(page);
+	set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
+
+	/* No need to invalidate - it was non-present before */
+	update_mmu_cache(dst_vma, dst_addr, dst_pte);
+	unlock_page(page);
+	pte_unmap_unlock(dst_pte, ptl);
+	ret = 0;
+out:
+	return ret;
+out_release_uncharge_unlock:
+	pte_unmap_unlock(dst_pte, ptl);
+	mem_cgroup_cancel_charge(page, memcg, false);
+	put_page(page);
+out_dec_used_blocks:
+	if (sbinfo->max_blocks)
+		percpu_counter_add(&sbinfo->used_blocks, -1);
+out_unacct_blocks:
+	shmem_unacct_blocks(info->flags, 1);
+	goto out;
+}
+
 #ifdef CONFIG_TMPFS
 static const struct inode_operations shmem_symlink_inode_operations;
 static const struct inode_operations shmem_short_symlink_operations;
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/7] userfaultfd: shmem: introduce vma_is_shmem
  2016-08-04  8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
  2016-08-04  8:14 ` [PATCH 1/7] userfaultfd: introduce vma_can_userfault Mike Rapoport
  2016-08-04  8:14 ` [PATCH 2/7] userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support Mike Rapoport
@ 2016-08-04  8:14 ` Mike Rapoport
  2016-08-04  8:14 ` [PATCH 4/7] userfaultfd: shmem: use shmem_mcopy_atomic_pte for shared memory Mike Rapoport
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Mike Rapoport @ 2016-08-04  8:14 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel,
	Mike Rapoport

Currently userfault relies on vma_is_anonymous and vma_is_hugetlb to ensure
compatibility of a VMA with userfault. Introduction of vma_is_shmem allows
detection if tmpfs backed VMAs, so that they may be used with userfaultfd.
Current implementation presumes usage of vma_is_shmem only by slow path
routines in userfaultfd, therefore the vma_is_shmem is not made inline to
leave the few remaining free bits in vm_flags.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 include/linux/mm.h | 10 ++++++++++
 mm/shmem.c         |  5 +++++
 2 files changed, 15 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1dedeb8..7a20398 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1343,6 +1343,16 @@ static inline bool vma_is_anonymous(struct vm_area_struct *vma)
 	return !vma->vm_ops;
 }
 
+#ifdef CONFIG_SHMEM
+/*
+ * The vma_is_shmem is not inline because it is used only by slow
+ * paths in userfault.
+ */
+bool vma_is_shmem(struct vm_area_struct *vma);
+#else
+static inline bool vma_is_shmem(struct vm_area_struct *vma) { return false; }
+#endif
+
 static inline int stack_guard_page_start(struct vm_area_struct *vma,
 					     unsigned long addr)
 {
diff --git a/mm/shmem.c b/mm/shmem.c
index fcf560c..881b7a0 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -194,6 +194,11 @@ static const struct inode_operations shmem_dir_inode_operations;
 static const struct inode_operations shmem_special_inode_operations;
 static const struct vm_operations_struct shmem_vm_ops;
 
+bool vma_is_shmem(struct vm_area_struct *vma)
+{
+	return vma->vm_ops == &shmem_vm_ops;
+}
+
 static LIST_HEAD(shmem_swaplist);
 static DEFINE_MUTEX(shmem_swaplist_mutex);
 
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/7] userfaultfd: shmem: use shmem_mcopy_atomic_pte for shared memory
  2016-08-04  8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
                   ` (2 preceding siblings ...)
  2016-08-04  8:14 ` [PATCH 3/7] userfaultfd: shmem: introduce vma_is_shmem Mike Rapoport
@ 2016-08-04  8:14 ` Mike Rapoport
  2016-08-04  8:14 ` [PATCH 5/7] userfaultfd: shmem: add userfaultfd hook for shared memory faults Mike Rapoport
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Mike Rapoport @ 2016-08-04  8:14 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel,
	Mike Rapoport

The shmem_mcopy_atomic_pte implements low lever part of UFFDIO_COPY
operation for shared memory VMAs. It's based on mcopy_atomic_pte with
adjustments necessary for shared memory pages.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 mm/userfaultfd.c | 31 ++++++++++++++++++-------------
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index ae4a976..d9259ba 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -16,6 +16,7 @@
 #include <linux/mmu_notifier.h>
 #include <linux/hugetlb.h>
 #include <linux/pagemap.h>
+#include <linux/shmem_fs.h>
 #include <asm/tlbflush.h>
 #include "internal.h"
 
@@ -348,7 +349,9 @@ retry:
 	 */
 	err = -EINVAL;
 	dst_vma = find_vma(dst_mm, dst_start);
-	if (!dst_vma || (dst_vma->vm_flags & VM_SHARED))
+	if (!dst_vma)
+		goto out_unlock;
+	if (!vma_is_shmem(dst_vma) && dst_vma->vm_flags & VM_SHARED)
 		goto out_unlock;
 	if (dst_start < dst_vma->vm_start ||
 	    dst_start + len > dst_vma->vm_end)
@@ -373,11 +376,7 @@ retry:
 	if (!dst_vma->vm_userfaultfd_ctx.ctx)
 		goto out_unlock;
 
-	/*
-	 * FIXME: only allow copying on anonymous vmas, tmpfs should
-	 * be added.
-	 */
-	if (!vma_is_anonymous(dst_vma))
+	if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma))
 		goto out_unlock;
 
 	/*
@@ -386,7 +385,7 @@ retry:
 	 * dst_vma.
 	 */
 	err = -ENOMEM;
-	if (unlikely(anon_vma_prepare(dst_vma)))
+	if (vma_is_anonymous(dst_vma) && unlikely(anon_vma_prepare(dst_vma)))
 		goto out_unlock;
 
 	while (src_addr < src_start + len) {
@@ -423,12 +422,18 @@ retry:
 		BUG_ON(pmd_none(*dst_pmd));
 		BUG_ON(pmd_trans_huge(*dst_pmd));
 
-		if (!zeropage)
-			err = mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma,
-					       dst_addr, src_addr, &page);
-		else
-			err = mfill_zeropage_pte(dst_mm, dst_pmd, dst_vma,
-						 dst_addr);
+		if (vma_is_anonymous(dst_vma)) {
+			if (!zeropage)
+				err = mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma,
+						       dst_addr, src_addr,
+						       &page);
+			else
+				err = mfill_zeropage_pte(dst_mm, dst_pmd,
+							 dst_vma, dst_addr);
+		} else {
+			err = shmem_mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma,
+						     dst_addr, src_addr, &page);
+		}
 
 		cond_resched();
 
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 5/7] userfaultfd: shmem: add userfaultfd hook for shared memory faults
  2016-08-04  8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
                   ` (3 preceding siblings ...)
  2016-08-04  8:14 ` [PATCH 4/7] userfaultfd: shmem: use shmem_mcopy_atomic_pte for shared memory Mike Rapoport
@ 2016-08-04  8:14 ` Mike Rapoport
  2016-08-04  8:14 ` [PATCH 6/7] userfaultfd: shmem: allow registration of shared memory ranges Mike Rapoport
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Mike Rapoport @ 2016-08-04  8:14 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel,
	Mike Rapoport

When processing a page fault in shared memory area for not present page,
check the VMA determine if faults are to be handled by userfaultfd. If so,
delegate the page fault to handle_userfault.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 mm/shmem.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 881b7a0..7ed2a1a 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -69,6 +69,7 @@ static struct vfsmount *shm_mnt;
 #include <linux/syscalls.h>
 #include <linux/fcntl.h>
 #include <uapi/linux/memfd.h>
+#include <linux/userfaultfd_k.h>
 #include <linux/rmap.h>
 
 #include <asm/uaccess.h>
@@ -123,13 +124,14 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
 				struct shmem_inode_info *info, pgoff_t index);
 static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
 		struct page **pagep, enum sgp_type sgp,
-		gfp_t gfp, struct mm_struct *fault_mm, int *fault_type);
+		gfp_t gfp, struct vm_area_struct *vma,
+		struct vm_fault *vmf, int *fault_type);
 
 static inline int shmem_getpage(struct inode *inode, pgoff_t index,
 		struct page **pagep, enum sgp_type sgp)
 {
 	return shmem_getpage_gfp(inode, index, pagep, sgp,
-		mapping_gfp_mask(inode->i_mapping), NULL, NULL);
+		mapping_gfp_mask(inode->i_mapping), NULL, NULL, NULL);
 }
 
 static inline struct shmem_sb_info *SHMEM_SB(struct super_block *sb)
@@ -1129,7 +1131,7 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
  */
 static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
 	struct page **pagep, enum sgp_type sgp, gfp_t gfp,
-	struct mm_struct *fault_mm, int *fault_type)
+	struct vm_area_struct *vma, struct vm_fault *vmf, int *fault_type)
 {
 	struct address_space *mapping = inode->i_mapping;
 	struct shmem_inode_info *info;
@@ -1180,7 +1182,7 @@ repeat:
 	 */
 	info = SHMEM_I(inode);
 	sbinfo = SHMEM_SB(inode->i_sb);
-	charge_mm = fault_mm ? : current->mm;
+	charge_mm = vma ? vma->vm_mm : current->mm;
 
 	if (swap.val) {
 		/* Look it up and read it in.. */
@@ -1190,7 +1192,8 @@ repeat:
 			if (fault_type) {
 				*fault_type |= VM_FAULT_MAJOR;
 				count_vm_event(PGMAJFAULT);
-				mem_cgroup_count_vm_event(fault_mm, PGMAJFAULT);
+				mem_cgroup_count_vm_event(vma->vm_mm,
+							  PGMAJFAULT);
 			}
 			/* Here we actually start the io */
 			page = shmem_swapin(swap, gfp, info, index);
@@ -1259,6 +1262,14 @@ repeat:
 		swap_free(swap);
 
 	} else {
+		if (vma && userfaultfd_missing(vma)) {
+			unsigned long addr =
+				(unsigned long)vmf->virtual_address;
+			*fault_type = handle_userfault(vma, addr, vmf->flags,
+						       VM_UFFD_MISSING);
+			return 0;
+		}
+
 		if (shmem_acct_block(info->flags)) {
 			error = -ENOSPC;
 			goto failed;
@@ -1432,7 +1443,7 @@ static int shmem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	}
 
 	error = shmem_getpage_gfp(inode, vmf->pgoff, &vmf->page, SGP_CACHE,
-				  gfp, vma->vm_mm, &ret);
+				  gfp, vma, vmf, &ret);
 	if (error)
 		return ((error == -ENOMEM) ? VM_FAULT_OOM : VM_FAULT_SIGBUS);
 	return ret;
@@ -3601,7 +3612,7 @@ struct page *shmem_read_mapping_page_gfp(struct address_space *mapping,
 
 	BUG_ON(mapping->a_ops != &shmem_aops);
 	error = shmem_getpage_gfp(inode, index, &page, SGP_CACHE,
-				  gfp, NULL, NULL);
+				  gfp, NULL, NULL, NULL);
 	if (error)
 		page = ERR_PTR(error);
 	else
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 6/7] userfaultfd: shmem: allow registration of shared memory ranges
  2016-08-04  8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
                   ` (4 preceding siblings ...)
  2016-08-04  8:14 ` [PATCH 5/7] userfaultfd: shmem: add userfaultfd hook for shared memory faults Mike Rapoport
@ 2016-08-04  8:14 ` Mike Rapoport
  2016-08-04  8:14 ` [PATCH 7/7] userfaultfd: shmem: add userfaultfd_shmem test Mike Rapoport
  2016-08-04 18:54 ` [PATCH 0/7] userfaultfd: add support for shared memory Andrea Arcangeli
  7 siblings, 0 replies; 9+ messages in thread
From: Mike Rapoport @ 2016-08-04  8:14 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel,
	Mike Rapoport

Expand the userfaultfd_register/unregister routines to allow shared memory
VMAs. Currently, there is no UFFDIO_ZEROPAGE and write-protection support
for shared memory VMAs, which is reflected in ioctl methods supported by
uffdio_register.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 fs/userfaultfd.c                         | 21 +++++++--------------
 include/uapi/linux/userfaultfd.h         |  2 +-
 tools/testing/selftests/vm/userfaultfd.c |  2 +-
 3 files changed, 9 insertions(+), 16 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 2aab2e1..2f9c87e 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1068,7 +1068,8 @@ static __always_inline int validate_range(struct mm_struct *mm,
 
 static inline bool vma_can_userfault(struct vm_area_struct *vma)
 {
-	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma);
+	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
+		vma_is_shmem(vma);
 }
 
 static int userfaultfd_register(struct userfaultfd_ctx *ctx,
@@ -1081,7 +1082,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	struct uffdio_register __user *user_uffdio_register;
 	unsigned long vm_flags, new_flags;
 	bool found;
-	bool huge_pages;
+	bool non_anon_pages;
 	unsigned long start, end, vma_end;
 
 	user_uffdio_register = (struct uffdio_register __user *) arg;
@@ -1138,13 +1139,9 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 
 	/*
 	 * Search for not compatible vmas.
-	 *
-	 * FIXME: this shall be relaxed later so that it doesn't fail
-	 * on tmpfs backed vmas (in addition to the current allowance
-	 * on anonymous vmas).
 	 */
 	found = false;
-	huge_pages = false;
+	non_anon_pages = false;
 	for (cur = vma; cur && cur->vm_start < end; cur = cur->vm_next) {
 		cond_resched();
 
@@ -1188,8 +1185,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 		/*
 		 * Note vmas containing huge pages
 		 */
-		if (is_vm_hugetlb_page(cur))
-			huge_pages = true;
+		if (is_vm_hugetlb_page(cur) || vma_is_shmem(cur))
+			non_anon_pages = true;
 
 		found = true;
 	}
@@ -1260,7 +1257,7 @@ out_unlock:
 		 * userland which ioctls methods are guaranteed to
 		 * succeed on this range.
 		 */
-		if (put_user(huge_pages ? UFFD_API_RANGE_IOCTLS_HPAGE :
+		if (put_user(non_anon_pages ? UFFD_API_RANGE_IOCTLS_BASIC :
 			     UFFD_API_RANGE_IOCTLS,
 			     &user_uffdio_register->ioctls))
 			ret = -EFAULT;
@@ -1320,10 +1317,6 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 
 	/*
 	 * Search for not compatible vmas.
-	 *
-	 * FIXME: this shall be relaxed later so that it doesn't fail
-	 * on tmpfs backed vmas (in addition to the current allowance
-	 * on anonymous vmas).
 	 */
 	found = false;
 	ret = -EINVAL;
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 7a386c5..a9c3389 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -31,7 +31,7 @@
 	 (__u64)1 << _UFFDIO_COPY |		\
 	 (__u64)1 << _UFFDIO_ZEROPAGE |		\
 	 (__u64)1 << _UFFDIO_WRITEPROTECT)
-#define UFFD_API_RANGE_IOCTLS_HPAGE		\
+#define UFFD_API_RANGE_IOCTLS_BASIC		\
 	((__u64)1 << _UFFDIO_WAKE |		\
 	 (__u64)1 << _UFFDIO_COPY)
 
diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 3011711..d753a91 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -129,7 +129,7 @@ static void allocate_area(void **alloc_area)
 
 #else /* HUGETLB_TEST */
 
-#define EXPECTED_IOCTLS		UFFD_API_RANGE_IOCTLS_HPAGE
+#define EXPECTED_IOCTLS		UFFD_API_RANGE_IOCTLS_BASIC
 
 static int release_pages(char *rel_area)
 {
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 7/7] userfaultfd: shmem: add userfaultfd_shmem test
  2016-08-04  8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
                   ` (5 preceding siblings ...)
  2016-08-04  8:14 ` [PATCH 6/7] userfaultfd: shmem: allow registration of shared memory ranges Mike Rapoport
@ 2016-08-04  8:14 ` Mike Rapoport
  2016-08-04 18:54 ` [PATCH 0/7] userfaultfd: add support for shared memory Andrea Arcangeli
  7 siblings, 0 replies; 9+ messages in thread
From: Mike Rapoport @ 2016-08-04  8:14 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel,
	Mike Rapoport

The test verifies that anonymous shared mapping can be used with userfault
using the existing testing method.
The shared memory area is allocated using mmap(..., MAP_SHARED |
MAP_ANONYMOUS, ...) and released using madvise(MADV_REMOVE)

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 tools/testing/selftests/vm/Makefile      |  3 +++
 tools/testing/selftests/vm/run_vmtests   | 11 ++++++++++
 tools/testing/selftests/vm/userfaultfd.c | 37 ++++++++++++++++++++++++++++++--
 3 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/vm/Makefile b/tools/testing/selftests/vm/Makefile
index aaa4225..12ff211 100644
--- a/tools/testing/selftests/vm/Makefile
+++ b/tools/testing/selftests/vm/Makefile
@@ -11,6 +11,7 @@ BINARIES += thuge-gen
 BINARIES += transhuge-stress
 BINARIES += userfaultfd
 BINARIES += userfaultfd_hugetlb
+BINARIES += userfaultfd_shmem
 
 all: $(BINARIES)
 %: %.c
@@ -19,6 +20,8 @@ userfaultfd: userfaultfd.c ../../../../usr/include/linux/kernel.h
 	$(CC) $(CFLAGS) -O2 -o $@ $< -lpthread
 userfaultfd_hugetlb: userfaultfd.c ../../../../usr/include/linux/kernel.h
 	$(CC) $(CFLAGS) -DHUGETLB_TEST -O2 -o $@ $< -lpthread
+userfaultfd_shmem: userfaultfd.c ../../../../usr/include/linux/kernel.h
+	$(CC) $(CFLAGS) -DSHMEM_TEST -O2 -o $@ $< -lpthread
 
 ../../../../usr/include/linux/kernel.h:
 	make -C ../../../.. headers_install
diff --git a/tools/testing/selftests/vm/run_vmtests b/tools/testing/selftests/vm/run_vmtests
index 14d697e..c92f6cf 100755
--- a/tools/testing/selftests/vm/run_vmtests
+++ b/tools/testing/selftests/vm/run_vmtests
@@ -116,6 +116,17 @@ else
 fi
 rm -f $mnt/ufd_test_file
 
+echo "----------------------------"
+echo "running userfaultfd_shmem"
+echo "----------------------------"
+./userfaultfd_shmem 128 32
+if [ $? -ne 0 ]; then
+	echo "[FAIL]"
+	exitcode=1
+else
+	echo "[PASS]"
+fi
+
 #cleanup
 umount $mnt
 rm -rf $mnt
diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index d753a91..a5e5808 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -101,8 +101,9 @@ pthread_attr_t attr;
 				 ~(unsigned long)(sizeof(unsigned long long) \
 						  -  1)))
 
-#ifndef HUGETLB_TEST
+#if !defined(HUGETLB_TEST) && !defined(SHMEM_TEST)
 
+/* Anonymous memory */
 #define EXPECTED_IOCTLS		((1 << _UFFDIO_WAKE) | \
 				 (1 << _UFFDIO_COPY) | \
 				 (1 << _UFFDIO_ZEROPAGE))
@@ -127,10 +128,13 @@ static void allocate_area(void **alloc_area)
 	}
 }
 
-#else /* HUGETLB_TEST */
+#else /* HUGETLB_TEST or SHMEM_TEST */
 
 #define EXPECTED_IOCTLS		UFFD_API_RANGE_IOCTLS_BASIC
 
+#ifdef HUGETLB_TEST
+
+/* HugeTLB memory */
 static int release_pages(char *rel_area)
 {
 	int ret = 0;
@@ -162,8 +166,37 @@ static void allocate_area(void **alloc_area)
 		huge_fd_off0 = *alloc_area;
 }
 
+#elif defined(SHMEM_TEST)
+
+/* Shared memory */
+static int release_pages(char *rel_area)
+{
+	int ret = 0;
+
+	if (madvise(rel_area, nr_pages * page_size, MADV_REMOVE)) {
+		perror("madvise");
+		ret = 1;
+	}
+
+	return ret;
+}
+
+static void allocate_area(void **alloc_area)
+{
+	*alloc_area = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE,
+			   MAP_ANONYMOUS | MAP_SHARED, -1, 0);
+	if (*alloc_area == MAP_FAILED) {
+		fprintf(stderr, "shared memory mmap failed\n");
+		*alloc_area = NULL;
+	}
+}
+
+#else /* SHMEM_TEST */
+#error "Undefined test type"
 #endif /* HUGETLB_TEST */
 
+#endif /* !defined(HUGETLB_TEST) && !defined(SHMEM_TEST) */
+
 static int my_bcmp(char *str1, char *str2, size_t n)
 {
 	unsigned long i;
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/7]  userfaultfd: add support for shared memory
  2016-08-04  8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
                   ` (6 preceding siblings ...)
  2016-08-04  8:14 ` [PATCH 7/7] userfaultfd: shmem: add userfaultfd_shmem test Mike Rapoport
@ 2016-08-04 18:54 ` Andrea Arcangeli
  7 siblings, 0 replies; 9+ messages in thread
From: Andrea Arcangeli @ 2016-08-04 18:54 UTC (permalink / raw)
  To: Mike Rapoport; +Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel

Hi Mike,

On Thu, Aug 04, 2016 at 11:14:11AM +0300, Mike Rapoport wrote:
> These patches enable userfaultfd support for shared memory mappings. The
> VMAs backed with shmem/tmpfs can be registered with userfaultfd which
> allows management of page faults in these areas by userland.
> 
> This patch set adds implementation of shmem_mcopy_atomic_pte for proper
> handling of UFFDIO_COPY command. A callback to handle_userfault is added
> to shmem page fault handling path. The userfaultfd register/unregister
> methods are extended to allow shmem VMAs.
> 
> The UFFDIO_ZEROPAGE and UFFDIO_REGISTER_MODE_WP are not implemented which
> is reflected by userfaultfd API handshake methods.

This looks great.

I'm getting rejects during rebase but not because of your changes, I
think I'll fold some patches that I originally fixed up incrementally,
in order to reduce the reject churn.

Thanks,
Andrea

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-08-04 18:54 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-08-04  8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
2016-08-04  8:14 ` [PATCH 1/7] userfaultfd: introduce vma_can_userfault Mike Rapoport
2016-08-04  8:14 ` [PATCH 2/7] userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support Mike Rapoport
2016-08-04  8:14 ` [PATCH 3/7] userfaultfd: shmem: introduce vma_is_shmem Mike Rapoport
2016-08-04  8:14 ` [PATCH 4/7] userfaultfd: shmem: use shmem_mcopy_atomic_pte for shared memory Mike Rapoport
2016-08-04  8:14 ` [PATCH 5/7] userfaultfd: shmem: add userfaultfd hook for shared memory faults Mike Rapoport
2016-08-04  8:14 ` [PATCH 6/7] userfaultfd: shmem: allow registration of shared memory ranges Mike Rapoport
2016-08-04  8:14 ` [PATCH 7/7] userfaultfd: shmem: add userfaultfd_shmem test Mike Rapoport
2016-08-04 18:54 ` [PATCH 0/7] userfaultfd: add support for shared memory Andrea Arcangeli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).