* [PATCH 0/7] userfaultfd: add support for shared memory
@ 2016-08-04 8:14 Mike Rapoport
2016-08-04 8:14 ` [PATCH 1/7] userfaultfd: introduce vma_can_userfault Mike Rapoport
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: Mike Rapoport @ 2016-08-04 8:14 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel,
Mike Rapoport
These patches enable userfaultfd support for shared memory mappings. VMAs
backed by shmem/tmpfs can be registered with userfaultfd, which allows
page faults in these areas to be managed by userland.

This patch set adds an implementation of shmem_mcopy_atomic_pte for proper
handling of the UFFDIO_COPY command. A callback to handle_userfault is
added to the shmem page fault handling path, and the userfaultfd
register/unregister methods are extended to allow shmem VMAs.

UFFDIO_ZEROPAGE and UFFDIO_REGISTER_MODE_WP are not implemented, which is
reflected by the userfaultfd API handshake methods.

The patches are based on Andrea's current tree:
https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git
Mike Rapoport (7):
userfaultfd: introduce vma_can_userfault
userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support
userfaultfd: shmem: introduce vma_is_shmem
userfaultfd: shmem: use shmem_mcopy_atomic_pte for shared memory
userfaultfd: shmem: add userfaultfd hook for shared memory faults
userfaultfd: shmem: allow registration of shared memory ranges
userfaultfd: shmem: add userfaultfd_shmem test
fs/userfaultfd.c | 32 ++++---
include/linux/mm.h | 10 +++
include/linux/shmem_fs.h | 11 +++
include/uapi/linux/userfaultfd.h | 2 +-
mm/shmem.c | 139 +++++++++++++++++++++++++++++--
mm/userfaultfd.c | 31 ++++---
tools/testing/selftests/vm/Makefile | 3 +
tools/testing/selftests/vm/run_vmtests | 11 +++
tools/testing/selftests/vm/userfaultfd.c | 39 ++++++++-
9 files changed, 237 insertions(+), 41 deletions(-)
--
1.9.1
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [PATCH 1/7] userfaultfd: introduce vma_can_userfault
2016-08-04 8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
@ 2016-08-04 8:14 ` Mike Rapoport
2016-08-04 8:14 ` [PATCH 2/7] userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support Mike Rapoport
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Mike Rapoport @ 2016-08-04 8:14 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel,
Mike Rapoport
Check whether a VMA can be used with userfault in a more compact way.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
fs/userfaultfd.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index f48f709..2aab2e1 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1066,6 +1066,11 @@ static __always_inline int validate_range(struct mm_struct *mm,
return 0;
}
+static inline bool vma_can_userfault(struct vm_area_struct *vma)
+{
+ return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma);
+}
+
static int userfaultfd_register(struct userfaultfd_ctx *ctx,
unsigned long arg)
{
@@ -1148,7 +1153,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
/* check not compatible vmas */
ret = -EINVAL;
- if (!vma_is_anonymous(cur) && !is_vm_hugetlb_page(cur))
+ if (!vma_can_userfault(cur))
goto out_unlock;
/* FIXME: add WP support to hugetlbfs */
if (is_vm_hugetlb_page(cur) && vm_flags & VM_UFFD_WP)
@@ -1197,7 +1202,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
do {
cond_resched();
- BUG_ON(!vma_is_anonymous(vma) && !is_vm_hugetlb_page(vma));
+ BUG_ON(!vma_can_userfault(vma));
BUG_ON(vma->vm_userfaultfd_ctx.ctx &&
vma->vm_userfaultfd_ctx.ctx != ctx);
@@ -1335,7 +1340,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
* provides for more strict behavior to notice
* unregistration errors.
*/
- if (!vma_is_anonymous(cur) && !is_vm_hugetlb_page(cur))
+ if (!vma_can_userfault(cur))
goto out_unlock;
found = true;
@@ -1349,7 +1354,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
do {
cond_resched();
- BUG_ON(!vma_is_anonymous(vma) && !is_vm_hugetlb_page(vma));
+ BUG_ON(!vma_can_userfault(vma));
/*
* Nothing to do: this vma is already registered into this
--
1.9.1
* [PATCH 2/7] userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support
2016-08-04 8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
2016-08-04 8:14 ` [PATCH 1/7] userfaultfd: introduce vma_can_userfault Mike Rapoport
@ 2016-08-04 8:14 ` Mike Rapoport
2016-08-04 8:14 ` [PATCH 3/7] userfaultfd: shmem: introduce vma_is_shmem Mike Rapoport
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Mike Rapoport @ 2016-08-04 8:14 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel,
Mike Rapoport
shmem_mcopy_atomic_pte is the low level routine that implements
the userfaultfd UFFDIO_COPY command. It is based on the existing
mcopy_atomic_pte routine with modifications for shared memory pages.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
include/linux/shmem_fs.h | 11 +++++
mm/shmem.c | 109 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 120 insertions(+)
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 4d4780c..8dcbdfd 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -83,4 +83,15 @@ static inline long shmem_fcntl(struct file *f, unsigned int c, unsigned long a)
#endif
+#ifdef CONFIG_SHMEM
+extern int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
+ struct vm_area_struct *dst_vma,
+ unsigned long dst_addr,
+ unsigned long src_addr,
+ struct page **pagep);
+#else
+#define shmem_mcopy_atomic_pte(dst_mm, dst_pte, dst_vma, dst_addr, \
+ src_addr, pagep) ({ BUG(); 0; })
+#endif
+
#endif
diff --git a/mm/shmem.c b/mm/shmem.c
index a361449..fcf560c 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -69,6 +69,7 @@ static struct vfsmount *shm_mnt;
#include <linux/syscalls.h>
#include <linux/fcntl.h>
#include <uapi/linux/memfd.h>
+#include <linux/rmap.h>
#include <asm/uaccess.h>
#include <asm/pgtable.h>
@@ -1548,6 +1549,114 @@ bool shmem_mapping(struct address_space *mapping)
return mapping->host->i_sb->s_op == &shmem_ops;
}
+int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm,
+ pmd_t *dst_pmd,
+ struct vm_area_struct *dst_vma,
+ unsigned long dst_addr,
+ unsigned long src_addr,
+ struct page **pagep)
+{
+ struct inode *inode = file_inode(dst_vma->vm_file);
+ struct shmem_inode_info *info = SHMEM_I(inode);
+ struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
+ struct address_space *mapping = inode->i_mapping;
+ gfp_t gfp = mapping_gfp_mask(mapping);
+ pgoff_t pgoff = linear_page_index(dst_vma, dst_addr);
+ struct mem_cgroup *memcg;
+ spinlock_t *ptl;
+ void *page_kaddr;
+ struct page *page;
+ pte_t _dst_pte, *dst_pte;
+ int ret;
+
+ if (!*pagep) {
+ ret = -ENOMEM;
+ if (shmem_acct_block(info->flags))
+ goto out;
+ if (sbinfo->max_blocks) {
+ if (percpu_counter_compare(&sbinfo->used_blocks,
+ sbinfo->max_blocks) >= 0)
+ goto out_unacct_blocks;
+ percpu_counter_inc(&sbinfo->used_blocks);
+ }
+
+ page = shmem_alloc_page(gfp, info, pgoff);
+ if (!page)
+ goto out_dec_used_blocks;
+
+ page_kaddr = kmap_atomic(page);
+ ret = copy_from_user(page_kaddr, (const void __user *)src_addr,
+ PAGE_SIZE);
+ kunmap_atomic(page_kaddr);
+
+ /* fallback to copy_from_user outside mmap_sem */
+ if (unlikely(ret)) {
+ *pagep = page;
+ /* don't free the page */
+ return -EFAULT;
+ }
+ } else {
+ page = *pagep;
+ *pagep = NULL;
+ }
+
+ _dst_pte = mk_pte(page, dst_vma->vm_page_prot);
+ if (dst_vma->vm_flags & VM_WRITE)
+ _dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte));
+
+ ret = -EEXIST;
+ dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
+ if (!pte_none(*dst_pte))
+ goto out_release_uncharge_unlock;
+
+ __SetPageUptodate(page);
+
+ ret = mem_cgroup_try_charge(page, dst_mm, gfp, &memcg,
+ false);
+ if (ret)
+ goto out_release_uncharge_unlock;
+ ret = radix_tree_maybe_preload(gfp & GFP_RECLAIM_MASK);
+ if (!ret) {
+ ret = shmem_add_to_page_cache(page, mapping, pgoff, NULL);
+ radix_tree_preload_end();
+ }
+ if (ret) {
+ mem_cgroup_cancel_charge(page, memcg, false);
+ goto out_release_uncharge_unlock;
+ }
+
+ mem_cgroup_commit_charge(page, memcg, false, false);
+ lru_cache_add_anon(page);
+
+ spin_lock(&info->lock);
+ info->alloced++;
+ inode->i_blocks += BLOCKS_PER_PAGE;
+ shmem_recalc_inode(inode);
+ spin_unlock(&info->lock);
+
+ inc_mm_counter(dst_mm, mm_counter_file(page));
+ page_add_file_rmap(page);
+ set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
+
+ /* No need to invalidate - it was non-present before */
+ update_mmu_cache(dst_vma, dst_addr, dst_pte);
+ unlock_page(page);
+ pte_unmap_unlock(dst_pte, ptl);
+ ret = 0;
+out:
+ return ret;
+out_release_uncharge_unlock:
+ pte_unmap_unlock(dst_pte, ptl);
+ mem_cgroup_cancel_charge(page, memcg, false);
+ put_page(page);
+out_dec_used_blocks:
+ if (sbinfo->max_blocks)
+ percpu_counter_add(&sbinfo->used_blocks, -1);
+out_unacct_blocks:
+ shmem_unacct_blocks(info->flags, 1);
+ goto out;
+}
+
#ifdef CONFIG_TMPFS
static const struct inode_operations shmem_symlink_inode_operations;
static const struct inode_operations shmem_short_symlink_operations;
--
1.9.1
* [PATCH 3/7] userfaultfd: shmem: introduce vma_is_shmem
2016-08-04 8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
2016-08-04 8:14 ` [PATCH 1/7] userfaultfd: introduce vma_can_userfault Mike Rapoport
2016-08-04 8:14 ` [PATCH 2/7] userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support Mike Rapoport
@ 2016-08-04 8:14 ` Mike Rapoport
2016-08-04 8:14 ` [PATCH 4/7] userfaultfd: shmem: use shmem_mcopy_atomic_pte for shared memory Mike Rapoport
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Mike Rapoport @ 2016-08-04 8:14 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel,
Mike Rapoport
Currently userfault relies on vma_is_anonymous and vma_is_hugetlb to
ensure compatibility of a VMA with userfault. Introducing vma_is_shmem
allows detection of tmpfs-backed VMAs, so that they may be used with
userfaultfd.

The current implementation presumes that vma_is_shmem is used only by
slow path routines in userfaultfd, therefore vma_is_shmem is not made
inline, which leaves the few remaining free bits in vm_flags untouched.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
include/linux/mm.h | 10 ++++++++++
mm/shmem.c | 5 +++++
2 files changed, 15 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1dedeb8..7a20398 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1343,6 +1343,16 @@ static inline bool vma_is_anonymous(struct vm_area_struct *vma)
return !vma->vm_ops;
}
+#ifdef CONFIG_SHMEM
+/*
+ * The vma_is_shmem is not inline because it is used only by slow
+ * paths in userfault.
+ */
+bool vma_is_shmem(struct vm_area_struct *vma);
+#else
+static inline bool vma_is_shmem(struct vm_area_struct *vma) { return false; }
+#endif
+
static inline int stack_guard_page_start(struct vm_area_struct *vma,
unsigned long addr)
{
diff --git a/mm/shmem.c b/mm/shmem.c
index fcf560c..881b7a0 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -194,6 +194,11 @@ static const struct inode_operations shmem_dir_inode_operations;
static const struct inode_operations shmem_special_inode_operations;
static const struct vm_operations_struct shmem_vm_ops;
+bool vma_is_shmem(struct vm_area_struct *vma)
+{
+ return vma->vm_ops == &shmem_vm_ops;
+}
+
static LIST_HEAD(shmem_swaplist);
static DEFINE_MUTEX(shmem_swaplist_mutex);
--
1.9.1
* [PATCH 4/7] userfaultfd: shmem: use shmem_mcopy_atomic_pte for shared memory
2016-08-04 8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
` (2 preceding siblings ...)
2016-08-04 8:14 ` [PATCH 3/7] userfaultfd: shmem: introduce vma_is_shmem Mike Rapoport
@ 2016-08-04 8:14 ` Mike Rapoport
2016-08-04 8:14 ` [PATCH 5/7] userfaultfd: shmem: add userfaultfd hook for shared memory faults Mike Rapoport
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Mike Rapoport @ 2016-08-04 8:14 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel,
Mike Rapoport
shmem_mcopy_atomic_pte implements the low-level part of the UFFDIO_COPY
operation for shared memory VMAs. It is based on mcopy_atomic_pte with
the adjustments necessary for shared memory pages.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
mm/userfaultfd.c | 31 ++++++++++++++++++-------------
1 file changed, 18 insertions(+), 13 deletions(-)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index ae4a976..d9259ba 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -16,6 +16,7 @@
#include <linux/mmu_notifier.h>
#include <linux/hugetlb.h>
#include <linux/pagemap.h>
+#include <linux/shmem_fs.h>
#include <asm/tlbflush.h>
#include "internal.h"
@@ -348,7 +349,9 @@ retry:
*/
err = -EINVAL;
dst_vma = find_vma(dst_mm, dst_start);
- if (!dst_vma || (dst_vma->vm_flags & VM_SHARED))
+ if (!dst_vma)
+ goto out_unlock;
+ if (!vma_is_shmem(dst_vma) && dst_vma->vm_flags & VM_SHARED)
goto out_unlock;
if (dst_start < dst_vma->vm_start ||
dst_start + len > dst_vma->vm_end)
@@ -373,11 +376,7 @@ retry:
if (!dst_vma->vm_userfaultfd_ctx.ctx)
goto out_unlock;
- /*
- * FIXME: only allow copying on anonymous vmas, tmpfs should
- * be added.
- */
- if (!vma_is_anonymous(dst_vma))
+ if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma))
goto out_unlock;
/*
@@ -386,7 +385,7 @@ retry:
* dst_vma.
*/
err = -ENOMEM;
- if (unlikely(anon_vma_prepare(dst_vma)))
+ if (vma_is_anonymous(dst_vma) && unlikely(anon_vma_prepare(dst_vma)))
goto out_unlock;
while (src_addr < src_start + len) {
@@ -423,12 +422,18 @@ retry:
BUG_ON(pmd_none(*dst_pmd));
BUG_ON(pmd_trans_huge(*dst_pmd));
- if (!zeropage)
- err = mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma,
- dst_addr, src_addr, &page);
- else
- err = mfill_zeropage_pte(dst_mm, dst_pmd, dst_vma,
- dst_addr);
+ if (vma_is_anonymous(dst_vma)) {
+ if (!zeropage)
+ err = mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma,
+ dst_addr, src_addr,
+ &page);
+ else
+ err = mfill_zeropage_pte(dst_mm, dst_pmd,
+ dst_vma, dst_addr);
+ } else {
+ err = shmem_mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma,
+ dst_addr, src_addr, &page);
+ }
cond_resched();
--
1.9.1
* [PATCH 5/7] userfaultfd: shmem: add userfaultfd hook for shared memory faults
2016-08-04 8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
` (3 preceding siblings ...)
2016-08-04 8:14 ` [PATCH 4/7] userfaultfd: shmem: use shmem_mcopy_atomic_pte for shared memory Mike Rapoport
@ 2016-08-04 8:14 ` Mike Rapoport
2016-08-04 8:14 ` [PATCH 6/7] userfaultfd: shmem: allow registration of shared memory ranges Mike Rapoport
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Mike Rapoport @ 2016-08-04 8:14 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel,
Mike Rapoport
When processing a page fault on a not-present page in a shared memory
area, check the VMA to determine whether faults there are to be handled
by userfaultfd. If so, delegate the page fault to handle_userfault.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
mm/shmem.c | 25 ++++++++++++++++++-------
1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index 881b7a0..7ed2a1a 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -69,6 +69,7 @@ static struct vfsmount *shm_mnt;
#include <linux/syscalls.h>
#include <linux/fcntl.h>
#include <uapi/linux/memfd.h>
+#include <linux/userfaultfd_k.h>
#include <linux/rmap.h>
#include <asm/uaccess.h>
@@ -123,13 +124,14 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
struct shmem_inode_info *info, pgoff_t index);
static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
struct page **pagep, enum sgp_type sgp,
- gfp_t gfp, struct mm_struct *fault_mm, int *fault_type);
+ gfp_t gfp, struct vm_area_struct *vma,
+ struct vm_fault *vmf, int *fault_type);
static inline int shmem_getpage(struct inode *inode, pgoff_t index,
struct page **pagep, enum sgp_type sgp)
{
return shmem_getpage_gfp(inode, index, pagep, sgp,
- mapping_gfp_mask(inode->i_mapping), NULL, NULL);
+ mapping_gfp_mask(inode->i_mapping), NULL, NULL, NULL);
}
static inline struct shmem_sb_info *SHMEM_SB(struct super_block *sb)
@@ -1129,7 +1131,7 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
*/
static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
struct page **pagep, enum sgp_type sgp, gfp_t gfp,
- struct mm_struct *fault_mm, int *fault_type)
+ struct vm_area_struct *vma, struct vm_fault *vmf, int *fault_type)
{
struct address_space *mapping = inode->i_mapping;
struct shmem_inode_info *info;
@@ -1180,7 +1182,7 @@ repeat:
*/
info = SHMEM_I(inode);
sbinfo = SHMEM_SB(inode->i_sb);
- charge_mm = fault_mm ? : current->mm;
+ charge_mm = vma ? vma->vm_mm : current->mm;
if (swap.val) {
/* Look it up and read it in.. */
@@ -1190,7 +1192,8 @@ repeat:
if (fault_type) {
*fault_type |= VM_FAULT_MAJOR;
count_vm_event(PGMAJFAULT);
- mem_cgroup_count_vm_event(fault_mm, PGMAJFAULT);
+ mem_cgroup_count_vm_event(vma->vm_mm,
+ PGMAJFAULT);
}
/* Here we actually start the io */
page = shmem_swapin(swap, gfp, info, index);
@@ -1259,6 +1262,14 @@ repeat:
swap_free(swap);
} else {
+ if (vma && userfaultfd_missing(vma)) {
+ unsigned long addr =
+ (unsigned long)vmf->virtual_address;
+ *fault_type = handle_userfault(vma, addr, vmf->flags,
+ VM_UFFD_MISSING);
+ return 0;
+ }
+
if (shmem_acct_block(info->flags)) {
error = -ENOSPC;
goto failed;
@@ -1432,7 +1443,7 @@ static int shmem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
}
error = shmem_getpage_gfp(inode, vmf->pgoff, &vmf->page, SGP_CACHE,
- gfp, vma->vm_mm, &ret);
+ gfp, vma, vmf, &ret);
if (error)
return ((error == -ENOMEM) ? VM_FAULT_OOM : VM_FAULT_SIGBUS);
return ret;
@@ -3601,7 +3612,7 @@ struct page *shmem_read_mapping_page_gfp(struct address_space *mapping,
BUG_ON(mapping->a_ops != &shmem_aops);
error = shmem_getpage_gfp(inode, index, &page, SGP_CACHE,
- gfp, NULL, NULL);
+ gfp, NULL, NULL, NULL);
if (error)
page = ERR_PTR(error);
else
--
1.9.1
* [PATCH 6/7] userfaultfd: shmem: allow registration of shared memory ranges
2016-08-04 8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
` (4 preceding siblings ...)
2016-08-04 8:14 ` [PATCH 5/7] userfaultfd: shmem: add userfaultfd hook for shared memory faults Mike Rapoport
@ 2016-08-04 8:14 ` Mike Rapoport
2016-08-04 8:14 ` [PATCH 7/7] userfaultfd: shmem: add userfaultfd_shmem test Mike Rapoport
2016-08-04 18:54 ` [PATCH 0/7] userfaultfd: add support for shared memory Andrea Arcangeli
7 siblings, 0 replies; 9+ messages in thread
From: Mike Rapoport @ 2016-08-04 8:14 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel,
Mike Rapoport
Expand the userfaultfd_register/unregister routines to allow shared
memory VMAs. Currently there is no UFFDIO_ZEROPAGE and no
write-protection support for shared memory VMAs, which is reflected in
the set of ioctl methods reported by uffdio_register.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
fs/userfaultfd.c | 21 +++++++--------------
include/uapi/linux/userfaultfd.h | 2 +-
tools/testing/selftests/vm/userfaultfd.c | 2 +-
3 files changed, 9 insertions(+), 16 deletions(-)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 2aab2e1..2f9c87e 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1068,7 +1068,8 @@ static __always_inline int validate_range(struct mm_struct *mm,
static inline bool vma_can_userfault(struct vm_area_struct *vma)
{
- return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma);
+ return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
+ vma_is_shmem(vma);
}
static int userfaultfd_register(struct userfaultfd_ctx *ctx,
@@ -1081,7 +1082,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
struct uffdio_register __user *user_uffdio_register;
unsigned long vm_flags, new_flags;
bool found;
- bool huge_pages;
+ bool non_anon_pages;
unsigned long start, end, vma_end;
user_uffdio_register = (struct uffdio_register __user *) arg;
@@ -1138,13 +1139,9 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
/*
* Search for not compatible vmas.
- *
- * FIXME: this shall be relaxed later so that it doesn't fail
- * on tmpfs backed vmas (in addition to the current allowance
- * on anonymous vmas).
*/
found = false;
- huge_pages = false;
+ non_anon_pages = false;
for (cur = vma; cur && cur->vm_start < end; cur = cur->vm_next) {
cond_resched();
@@ -1188,8 +1185,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
/*
* Note vmas containing huge pages
*/
- if (is_vm_hugetlb_page(cur))
- huge_pages = true;
+ if (is_vm_hugetlb_page(cur) || vma_is_shmem(cur))
+ non_anon_pages = true;
found = true;
}
@@ -1260,7 +1257,7 @@ out_unlock:
* userland which ioctls methods are guaranteed to
* succeed on this range.
*/
- if (put_user(huge_pages ? UFFD_API_RANGE_IOCTLS_HPAGE :
+ if (put_user(non_anon_pages ? UFFD_API_RANGE_IOCTLS_BASIC :
UFFD_API_RANGE_IOCTLS,
&user_uffdio_register->ioctls))
ret = -EFAULT;
@@ -1320,10 +1317,6 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
/*
* Search for not compatible vmas.
- *
- * FIXME: this shall be relaxed later so that it doesn't fail
- * on tmpfs backed vmas (in addition to the current allowance
- * on anonymous vmas).
*/
found = false;
ret = -EINVAL;
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 7a386c5..a9c3389 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -31,7 +31,7 @@
(__u64)1 << _UFFDIO_COPY | \
(__u64)1 << _UFFDIO_ZEROPAGE | \
(__u64)1 << _UFFDIO_WRITEPROTECT)
-#define UFFD_API_RANGE_IOCTLS_HPAGE \
+#define UFFD_API_RANGE_IOCTLS_BASIC \
((__u64)1 << _UFFDIO_WAKE | \
(__u64)1 << _UFFDIO_COPY)
diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 3011711..d753a91 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -129,7 +129,7 @@ static void allocate_area(void **alloc_area)
#else /* HUGETLB_TEST */
-#define EXPECTED_IOCTLS UFFD_API_RANGE_IOCTLS_HPAGE
+#define EXPECTED_IOCTLS UFFD_API_RANGE_IOCTLS_BASIC
static int release_pages(char *rel_area)
{
--
1.9.1
* [PATCH 7/7] userfaultfd: shmem: add userfaultfd_shmem test
2016-08-04 8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
` (5 preceding siblings ...)
2016-08-04 8:14 ` [PATCH 6/7] userfaultfd: shmem: allow registration of shared memory ranges Mike Rapoport
@ 2016-08-04 8:14 ` Mike Rapoport
2016-08-04 18:54 ` [PATCH 0/7] userfaultfd: add support for shared memory Andrea Arcangeli
7 siblings, 0 replies; 9+ messages in thread
From: Mike Rapoport @ 2016-08-04 8:14 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel,
Mike Rapoport
The test verifies that an anonymous shared mapping can be used with
userfault using the existing testing method. The shared memory area is
allocated with mmap(..., MAP_SHARED | MAP_ANONYMOUS, ...) and released
with madvise(MADV_REMOVE).
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
tools/testing/selftests/vm/Makefile | 3 +++
tools/testing/selftests/vm/run_vmtests | 11 ++++++++++
tools/testing/selftests/vm/userfaultfd.c | 37 ++++++++++++++++++++++++++++++--
3 files changed, 49 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/vm/Makefile b/tools/testing/selftests/vm/Makefile
index aaa4225..12ff211 100644
--- a/tools/testing/selftests/vm/Makefile
+++ b/tools/testing/selftests/vm/Makefile
@@ -11,6 +11,7 @@ BINARIES += thuge-gen
BINARIES += transhuge-stress
BINARIES += userfaultfd
BINARIES += userfaultfd_hugetlb
+BINARIES += userfaultfd_shmem
all: $(BINARIES)
%: %.c
@@ -19,6 +20,8 @@ userfaultfd: userfaultfd.c ../../../../usr/include/linux/kernel.h
$(CC) $(CFLAGS) -O2 -o $@ $< -lpthread
userfaultfd_hugetlb: userfaultfd.c ../../../../usr/include/linux/kernel.h
$(CC) $(CFLAGS) -DHUGETLB_TEST -O2 -o $@ $< -lpthread
+userfaultfd_shmem: userfaultfd.c ../../../../usr/include/linux/kernel.h
+ $(CC) $(CFLAGS) -DSHMEM_TEST -O2 -o $@ $< -lpthread
../../../../usr/include/linux/kernel.h:
make -C ../../../.. headers_install
diff --git a/tools/testing/selftests/vm/run_vmtests b/tools/testing/selftests/vm/run_vmtests
index 14d697e..c92f6cf 100755
--- a/tools/testing/selftests/vm/run_vmtests
+++ b/tools/testing/selftests/vm/run_vmtests
@@ -116,6 +116,17 @@ else
fi
rm -f $mnt/ufd_test_file
+echo "----------------------------"
+echo "running userfaultfd_shmem"
+echo "----------------------------"
+./userfaultfd_shmem 128 32
+if [ $? -ne 0 ]; then
+ echo "[FAIL]"
+ exitcode=1
+else
+ echo "[PASS]"
+fi
+
#cleanup
umount $mnt
rm -rf $mnt
diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index d753a91..a5e5808 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -101,8 +101,9 @@ pthread_attr_t attr;
~(unsigned long)(sizeof(unsigned long long) \
- 1)))
-#ifndef HUGETLB_TEST
+#if !defined(HUGETLB_TEST) && !defined(SHMEM_TEST)
+/* Anonymous memory */
#define EXPECTED_IOCTLS ((1 << _UFFDIO_WAKE) | \
(1 << _UFFDIO_COPY) | \
(1 << _UFFDIO_ZEROPAGE))
@@ -127,10 +128,13 @@ static void allocate_area(void **alloc_area)
}
}
-#else /* HUGETLB_TEST */
+#else /* HUGETLB_TEST or SHMEM_TEST */
#define EXPECTED_IOCTLS UFFD_API_RANGE_IOCTLS_BASIC
+#ifdef HUGETLB_TEST
+
+/* HugeTLB memory */
static int release_pages(char *rel_area)
{
int ret = 0;
@@ -162,8 +166,37 @@ static void allocate_area(void **alloc_area)
huge_fd_off0 = *alloc_area;
}
+#elif defined(SHMEM_TEST)
+
+/* Shared memory */
+static int release_pages(char *rel_area)
+{
+ int ret = 0;
+
+ if (madvise(rel_area, nr_pages * page_size, MADV_REMOVE)) {
+ perror("madvise");
+ ret = 1;
+ }
+
+ return ret;
+}
+
+static void allocate_area(void **alloc_area)
+{
+ *alloc_area = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE,
+ MAP_ANONYMOUS | MAP_SHARED, -1, 0);
+ if (*alloc_area == MAP_FAILED) {
+ fprintf(stderr, "shared memory mmap failed\n");
+ *alloc_area = NULL;
+ }
+}
+
+#else /* SHMEM_TEST */
+#error "Undefined test type"
#endif /* HUGETLB_TEST */
+#endif /* !defined(HUGETLB_TEST) && !defined(SHMEM_TEST) */
+
static int my_bcmp(char *str1, char *str2, size_t n)
{
unsigned long i;
--
1.9.1
* Re: [PATCH 0/7] userfaultfd: add support for shared memory
2016-08-04 8:14 [PATCH 0/7] userfaultfd: add support for shared memory Mike Rapoport
` (6 preceding siblings ...)
2016-08-04 8:14 ` [PATCH 7/7] userfaultfd: shmem: add userfaultfd_shmem test Mike Rapoport
@ 2016-08-04 18:54 ` Andrea Arcangeli
7 siblings, 0 replies; 9+ messages in thread
From: Andrea Arcangeli @ 2016-08-04 18:54 UTC (permalink / raw)
To: Mike Rapoport; +Cc: Hugh Dickins, Pavel Emelyanov, linux-mm, linux-kernel
Hi Mike,
On Thu, Aug 04, 2016 at 11:14:11AM +0300, Mike Rapoport wrote:
> These patches enable userfaultfd support for shared memory mappings. The
> VMAs backed with shmem/tmpfs can be registered with userfaultfd which
> allows management of page faults in these areas by userland.
>
> This patch set adds implementation of shmem_mcopy_atomic_pte for proper
> handling of UFFDIO_COPY command. A callback to handle_userfault is added
> to shmem page fault handling path. The userfaultfd register/unregister
> methods are extended to allow shmem VMAs.
>
> The UFFDIO_ZEROPAGE and UFFDIO_REGISTER_MODE_WP are not implemented which
> is reflected by userfaultfd API handshake methods.
This looks great.
I'm getting rejects during rebase but not because of your changes, I
think I'll fold some patches that I originally fixed up incrementally,
in order to reduce the reject churn.
Thanks,
Andrea