From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f42.google.com (mail-wm0-f42.google.com [74.125.82.42]) by kanga.kvack.org (Postfix) with ESMTP id 1A9166B007E for ; Sun, 20 Mar 2016 08:42:34 -0400 (EDT)
Received: by mail-wm0-f42.google.com with SMTP id l68so121856250wml.0 for ; Sun, 20 Mar 2016 05:42:34 -0700 (PDT)
Received: from e06smtp14.uk.ibm.com (e06smtp14.uk.ibm.com. [195.75.94.110]) by mx.google.com with ESMTPS id x13si26178579wjw.168.2016.03.20.05.42.32 for (version=TLS1_2 cipher=AES128-SHA bits=128/128); Sun, 20 Mar 2016 05:42:33 -0700 (PDT)
Received: from localhost by e06smtp14.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Sun, 20 Mar 2016 12:42:32 -0000
Received: from b06cxnps4074.portsmouth.uk.ibm.com (d06relay11.portsmouth.uk.ibm.com [9.149.109.196]) by d06dlp02.portsmouth.uk.ibm.com (Postfix) with ESMTP id DEC582190046 for ; Sun, 20 Mar 2016 12:42:10 +0000 (GMT)
Received: from d06av06.portsmouth.uk.ibm.com (d06av06.portsmouth.uk.ibm.com [9.149.37.217]) by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id u2KCgTsI1704328 for ; Sun, 20 Mar 2016 12:42:29 GMT
Received: from d06av06.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av06.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id u2KCgS2L016883 for ; Sun, 20 Mar 2016 08:42:28 -0400
From: Mike Rapoport
Subject: [PATCH 0/5] userfaultfd: extension for non cooperative uffd usage
Date: Sun, 20 Mar 2016 14:42:16 +0200
Message-Id: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Andrea Arcangeli
Cc: Pavel Emelyanov, LKML, linux-mm@kvack.org, Mike Rapoport, Mike Rapoport

Hi,

This set addresses the issues that appear in userfaultfd usage
scenarios when the task monitoring the uffd and the mm-owner do not
cooperate with each other on VM changes such as remaps, madvises and
fork()-s.

The patches are essentially the same as in the previous respin [1];
they've just been rebased on the current tree.

[1] http://thread.gmane.org/gmane.linux.kernel.mm/132662

Pavel Emelyanov (5):
  uffd: Split the find_userfault() routine
  uffd: Add ability to report non-PF events from uffd descriptor
  uffd: Add fork() event
  uffd: Add mremap() event
  uffd: Add madvise() event for MADV_DONTNEED request

 fs/userfaultfd.c                 | 319 ++++++++++++++++++++++++++++++++++++++-
 include/linux/userfaultfd_k.h    |  41 +++++
 include/uapi/linux/userfaultfd.h |  28 +++-
 kernel/fork.c                    |  10 +-
 mm/madvise.c                     |   2 +
 mm/mremap.c                      |  17 ++-
 6 files changed, 395 insertions(+), 22 deletions(-)

-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f53.google.com (mail-wm0-f53.google.com [74.125.82.53]) by kanga.kvack.org (Postfix) with ESMTP id A78DA6B0253 for ; Sun, 20 Mar 2016 08:42:37 -0400 (EDT)
Received: by mail-wm0-f53.google.com with SMTP id r129so19698180wmr.1 for ; Sun, 20 Mar 2016 05:42:37 -0700 (PDT)
Received: from e06smtp16.uk.ibm.com (e06smtp16.uk.ibm.com. [195.75.94.112]) by mx.google.com with ESMTPS id 17si2491602wjv.159.2016.03.20.05.42.36 for (version=TLS1_2 cipher=AES128-SHA bits=128/128); Sun, 20 Mar 2016 05:42:36 -0700 (PDT)
Received: from localhost by e06smtp16.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only!
	Violators will be prosecuted for from ; Sun, 20 Mar 2016 12:42:36 -0000
Received: from b06cxnps3075.portsmouth.uk.ibm.com (d06relay10.portsmouth.uk.ibm.com [9.149.109.195]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id 9333F1B0804B for ; Sun, 20 Mar 2016 12:43:04 +0000 (GMT)
Received: from d06av06.portsmouth.uk.ibm.com (d06av06.portsmouth.uk.ibm.com [9.149.37.217]) by b06cxnps3075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id u2KCgXlU56688700 for ; Sun, 20 Mar 2016 12:42:33 GMT
Received: from d06av06.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av06.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id u2KCgX7L017034 for ; Sun, 20 Mar 2016 08:42:33 -0400
From: Mike Rapoport
Subject: [PATCH 1/5] uffd: Split the find_userfault() routine
Date: Sun, 20 Mar 2016 14:42:17 +0200
Message-Id: <1458477741-6942-2-git-send-email-rapoport@il.ibm.com>
In-Reply-To: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com>
References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Andrea Arcangeli
Cc: Pavel Emelyanov, LKML, linux-mm@kvack.org, Mike Rapoport, Mike Rapoport

From: Pavel Emelyanov

I will need one to look up userfaultfd_wait_queue-s in a different
wait queue.

Signed-off-by: Pavel Emelyanov
Signed-off-by: Mike Rapoport
---
 fs/userfaultfd.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 66cdb44..4f0b53d 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -483,25 +483,30 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
 }
 
 /* fault_pending_wqh.lock must be hold by the caller */
-static inline struct userfaultfd_wait_queue *find_userfault(
-		struct userfaultfd_ctx *ctx)
+static inline struct userfaultfd_wait_queue *find_userfault_in(
+		wait_queue_head_t *wqh)
 {
 	wait_queue_t *wq;
 	struct userfaultfd_wait_queue *uwq;
 
-	VM_BUG_ON(!spin_is_locked(&ctx->fault_pending_wqh.lock));
+	VM_BUG_ON(!spin_is_locked(&wqh->lock));
 
 	uwq = NULL;
-	if (!waitqueue_active(&ctx->fault_pending_wqh))
+	if (!waitqueue_active(wqh))
 		goto out;
 	/* walk in reverse to provide FIFO behavior to read userfaults */
-	wq = list_last_entry(&ctx->fault_pending_wqh.task_list,
-			     typeof(*wq), task_list);
+	wq = list_last_entry(&wqh->task_list, typeof(*wq), task_list);
 	uwq = container_of(wq, struct userfaultfd_wait_queue, wq);
 out:
 	return uwq;
 }
 
+static inline struct userfaultfd_wait_queue *find_userfault(
+		struct userfaultfd_ctx *ctx)
+{
+	return find_userfault_in(&ctx->fault_pending_wqh);
+}
+
 static unsigned int userfaultfd_poll(struct file *file, poll_table *wait)
 {
 	struct userfaultfd_ctx *ctx = file->private_data;
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f42.google.com (mail-wm0-f42.google.com [74.125.82.42]) by kanga.kvack.org (Postfix) with ESMTP id 6372D6B025F for ; Sun, 20 Mar 2016 08:42:45 -0400 (EDT)
Received: by mail-wm0-f42.google.com with SMTP id r129so19700248wmr.1 for ; Sun, 20 Mar 2016 05:42:45 -0700 (PDT)
Received: from e06smtp08.uk.ibm.com (e06smtp08.uk.ibm.com. [195.75.94.104]) by mx.google.com with ESMTPS id g16si18713345wjn.102.2016.03.20.05.42.44 for (version=TLS1_2 cipher=AES128-SHA bits=128/128); Sun, 20 Mar 2016 05:42:44 -0700 (PDT)
Received: from localhost by e06smtp08.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only!
	Violators will be prosecuted for from ; Sun, 20 Mar 2016 12:42:43 -0000
Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id DF0881B0804B for ; Sun, 20 Mar 2016 12:43:10 +0000 (GMT)
Received: from d06av06.portsmouth.uk.ibm.com (d06av06.portsmouth.uk.ibm.com [9.149.37.217]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id u2KCgexY8520036 for ; Sun, 20 Mar 2016 12:42:40 GMT
Received: from d06av06.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av06.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id u2KCgdTF017265 for ; Sun, 20 Mar 2016 08:42:40 -0400
From: Mike Rapoport
Subject: [PATCH 2/5] uffd: Add ability to report non-PF events from uffd descriptor
Date: Sun, 20 Mar 2016 14:42:18 +0200
Message-Id: <1458477741-6942-3-git-send-email-rapoport@il.ibm.com>
In-Reply-To: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com>
References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Andrea Arcangeli
Cc: Pavel Emelyanov, LKML, linux-mm@kvack.org, Mike Rapoport, Mike Rapoport

From: Pavel Emelyanov

The custom events are queued in ctx->event_wqh so as not to disturb
the fast-path-ed PF queue-wait-wakeup functions.

The events to be generated (other than PF-s) are requested in the
UFFD_API ioctl with the uffd_api.features bits. Those known by the
kernel are then turned on and reported back to user-space.

Signed-off-by: Pavel Emelyanov
Signed-off-by: Mike Rapoport
---
 fs/userfaultfd.c | 99 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 97 insertions(+), 2 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 4f0b53d..c8e7039 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -12,6 +12,7 @@
  * mm/ksm.c (mm hashing).
  */
 
+#include
 #include
 #include
 #include
@@ -45,18 +46,23 @@ struct userfaultfd_ctx {
 	wait_queue_head_t fault_wqh;
 	/* waitqueue head for the pseudo fd to wakeup poll/read */
 	wait_queue_head_t fd_wqh;
+	/* waitqueue head for events */
+	wait_queue_head_t event_wqh;
 	/* a refile sequence protected by fault_pending_wqh lock */
 	struct seqcount refile_seq;
 	/* pseudo fd refcounting */
 	atomic_t refcount;
 	/* userfaultfd syscall flags */
 	unsigned int flags;
+	/* features requested from the userspace */
+	unsigned int features;
 	/* state machine */
 	enum userfaultfd_state state;
 	/* released */
 	bool released;
 	/* mm with one ore more vmas attached to this userfaultfd_ctx */
 	struct mm_struct *mm;
+
 };
 
 struct userfaultfd_wait_queue {
@@ -135,6 +141,8 @@ static void userfaultfd_ctx_put(struct userfaultfd_ctx *ctx)
 		VM_BUG_ON(waitqueue_active(&ctx->fault_pending_wqh));
 		VM_BUG_ON(spin_is_locked(&ctx->fault_wqh.lock));
 		VM_BUG_ON(waitqueue_active(&ctx->fault_wqh));
+		VM_BUG_ON(spin_is_locked(&ctx->event_wqh.lock));
+		VM_BUG_ON(waitqueue_active(&ctx->event_wqh));
 		VM_BUG_ON(spin_is_locked(&ctx->fd_wqh.lock));
 		VM_BUG_ON(waitqueue_active(&ctx->fd_wqh));
 		mmput(ctx->mm);
@@ -423,6 +431,59 @@ out:
 	return ret;
 }
 
+static int __maybe_unused userfaultfd_event_wait_completion(
+		struct userfaultfd_ctx *ctx,
+		struct userfaultfd_wait_queue *ewq)
+{
+	int ret = 0;
+
+	ewq->ctx = ctx;
+	init_waitqueue_entry(&ewq->wq, current);
+
+	spin_lock(&ctx->event_wqh.lock);
+	/*
+	 * After the __add_wait_queue the uwq is visible to userland
+	 * through poll/read().
+	 */
+	__add_wait_queue(&ctx->event_wqh, &ewq->wq);
+	for (;;) {
+		set_current_state(TASK_KILLABLE);
+		if (ewq->msg.event == 0)
+			break;
+		if (ACCESS_ONCE(ctx->released) ||
+		    fatal_signal_pending(current)) {
+			ret = -1;
+			__remove_wait_queue(&ctx->event_wqh, &ewq->wq);
+			break;
+		}
+
+		spin_unlock(&ctx->event_wqh.lock);
+
+		wake_up_poll(&ctx->fd_wqh, POLLIN);
+		schedule();
+
+		spin_lock(&ctx->event_wqh.lock);
+	}
+	__set_current_state(TASK_RUNNING);
+	spin_unlock(&ctx->event_wqh.lock);
+
+	/*
+	 * ctx may go away after this if the userfault pseudo fd is
+	 * already released.
+	 */
+
+	userfaultfd_ctx_put(ctx);
+	return ret;
+}
+
+static void userfaultfd_event_complete(struct userfaultfd_ctx *ctx,
+				       struct userfaultfd_wait_queue *ewq)
+{
+	ewq->msg.event = 0;
+	wake_up_locked(&ctx->event_wqh);
+	__remove_wait_queue(&ctx->event_wqh, &ewq->wq);
+}
+
 static int userfaultfd_release(struct inode *inode, struct file *file)
 {
 	struct userfaultfd_ctx *ctx = file->private_data;
@@ -507,6 +568,12 @@ static inline struct userfaultfd_wait_queue *find_userfault(
 	return find_userfault_in(&ctx->fault_pending_wqh);
 }
 
+static inline struct userfaultfd_wait_queue *find_userfault_evt(
+		struct userfaultfd_ctx *ctx)
+{
+	return find_userfault_in(&ctx->event_wqh);
+}
+
 static unsigned int userfaultfd_poll(struct file *file, poll_table *wait)
 {
 	struct userfaultfd_ctx *ctx = file->private_data;
@@ -538,6 +605,9 @@ static unsigned int userfaultfd_poll(struct file *file, poll_table *wait)
 		smp_mb();
 		if (waitqueue_active(&ctx->fault_pending_wqh))
 			ret = POLLIN;
+		else if (waitqueue_active(&ctx->event_wqh))
+			ret = POLLIN;
+
 		return ret;
 	default:
 		BUG();
@@ -601,6 +671,19 @@ static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait,
 			break;
 		}
 		spin_unlock(&ctx->fault_pending_wqh.lock);
+
+		spin_lock(&ctx->event_wqh.lock);
+		uwq = find_userfault_evt(ctx);
+		if (uwq) {
+			*msg = uwq->msg;
+
+			userfaultfd_event_complete(ctx, uwq);
+			spin_unlock(&ctx->event_wqh.lock);
+			ret = 0;
+			break;
+		}
+		spin_unlock(&ctx->event_wqh.lock);
+
 		if (signal_pending(current)) {
 			ret = -ERESTARTSYS;
 			break;
@@ -1133,6 +1216,14 @@ out:
 	return ret;
 }
 
+static inline unsigned int uffd_ctx_features(__u64 user_features)
+{
+	/*
+	 * For the current set of features the bits just coincide
+	 */
+	return (unsigned int)user_features;
+}
+
 /*
  * userland asks for a certain API version and we return which bits
  * and ioctl commands are implemented in this kernel for such API
@@ -1151,19 +1242,21 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx,
 	ret = -EFAULT;
 	if (copy_from_user(&uffdio_api, buf, sizeof(uffdio_api)))
 		goto out;
-	if (uffdio_api.api != UFFD_API || uffdio_api.features) {
+	if (uffdio_api.api != UFFD_API ||
+	    (uffdio_api.features & ~UFFD_API_FEATURES)) {
 		memset(&uffdio_api, 0, sizeof(uffdio_api));
 		if (copy_to_user(buf, &uffdio_api, sizeof(uffdio_api)))
 			goto out;
 		ret = -EINVAL;
 		goto out;
 	}
-	uffdio_api.features = UFFD_API_FEATURES;
+	uffdio_api.features &= UFFD_API_FEATURES;
 	uffdio_api.ioctls = UFFD_API_IOCTLS;
 	ret = -EFAULT;
 	if (copy_to_user(buf, &uffdio_api, sizeof(uffdio_api)))
 		goto out;
 	ctx->state = UFFD_STATE_RUNNING;
+	ctx->features = uffd_ctx_features(uffdio_api.features);
 	ret = 0;
 out:
 	return ret;
@@ -1250,6 +1343,7 @@ static void init_once_userfaultfd_ctx(void *mem)
 
 	init_waitqueue_head(&ctx->fault_pending_wqh);
 	init_waitqueue_head(&ctx->fault_wqh);
+	init_waitqueue_head(&ctx->event_wqh);
 	init_waitqueue_head(&ctx->fd_wqh);
 	seqcount_init(&ctx->refile_seq);
 }
@@ -1290,6 +1384,7 @@ static struct file *userfaultfd_file_create(int flags)
 
 	atomic_set(&ctx->refcount, 1);
 	ctx->flags = flags;
+	ctx->features = 0;
 	ctx->state = UFFD_STATE_WAIT_API;
 	ctx->released = false;
 	ctx->mm = current->mm;
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f52.google.com (mail-wm0-f52.google.com [74.125.82.52]) by kanga.kvack.org (Postfix) with ESMTP id 46B746B0260 for ; Sun, 20 Mar 2016 08:42:46 -0400 (EDT)
Received: by mail-wm0-f52.google.com with SMTP id l68so79372208wml.0 for ; Sun, 20 Mar 2016 05:42:46 -0700 (PDT)
Received: from e06smtp09.uk.ibm.com (e06smtp09.uk.ibm.com. [195.75.94.105]) by mx.google.com with ESMTPS id a3si9254907wmc.122.2016.03.20.05.42.44 for (version=TLS1_2 cipher=AES128-SHA bits=128/128); Sun, 20 Mar 2016 05:42:44 -0700 (PDT)
Received: from localhost by e06smtp09.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Sun, 20 Mar 2016 12:42:44 -0000
Received: from b06cxnps4074.portsmouth.uk.ibm.com (d06relay11.portsmouth.uk.ibm.com [9.149.109.196]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id F1EDB1B0804B for ; Sun, 20 Mar 2016 12:43:11 +0000 (GMT)
Received: from d06av06.portsmouth.uk.ibm.com (d06av06.portsmouth.uk.ibm.com [9.149.37.217]) by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id u2KCgfeW65863760 for ; Sun, 20 Mar 2016 12:42:41 GMT
Received: from d06av06.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av06.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id u2KCgeDA017295 for ; Sun, 20 Mar 2016 08:42:41 -0400
From: Mike Rapoport
Subject: [PATCH 3/5] uffd: Add fork() event
Date: Sun, 20 Mar 2016 14:42:19 +0200
Message-Id: <1458477741-6942-4-git-send-email-rapoport@il.ibm.com>
In-Reply-To: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com>
References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Andrea Arcangeli
Cc: Pavel Emelyanov, LKML, linux-mm@kvack.org, Mike Rapoport, Mike Rapoport

From: Pavel Emelyanov

When an mm with uffd-ed vmas fork()-s, the respective vmas notify
their uffds with an event that contains a descriptor for a new uffd.
This new descriptor can then be used to get events from the child and
populate its mm with data. Note that there can be different uffd-s
controlling different vmas within one mm, so first we should collect
all those uffds (and ctx-s) in a list and then notify them all one by
one, but only once per fork().

The context is created at fork() time, but the descriptor, file struct
and anon inode object are created at event read time. So some trickery
is added to userfaultfd_ctx_read() to handle the ctx queues' locking
vs file creation.

Another thing worth noticing is that the task that fork()-s waits for
the uffd event to get processed WITHOUT the mmap sem.

Signed-off-by: Pavel Emelyanov
Signed-off-by: Mike Rapoport
---
 fs/userfaultfd.c                 | 146 ++++++++++++++++++++++++++++++++++++++-
 include/linux/userfaultfd_k.h    |  12 ++++
 include/uapi/linux/userfaultfd.h |  13 ++--
 kernel/fork.c                    |  10 ++-
 4 files changed, 169 insertions(+), 12 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index c8e7039..565d8f2 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -65,6 +65,12 @@ struct userfaultfd_ctx {
 
 };
 
+struct userfaultfd_fork_ctx {
+	struct userfaultfd_ctx *orig;
+	struct userfaultfd_ctx *new;
+	struct list_head list;
+};
+
 struct userfaultfd_wait_queue {
 	struct uffd_msg msg;
 	wait_queue_t wq;
@@ -431,9 +437,8 @@ out:
 	return ret;
 }
 
-static int __maybe_unused userfaultfd_event_wait_completion(
-		struct userfaultfd_ctx *ctx,
-		struct userfaultfd_wait_queue *ewq)
+static int userfaultfd_event_wait_completion(struct userfaultfd_ctx *ctx,
+		struct userfaultfd_wait_queue *ewq)
 {
 	int ret = 0;
 
@@ -484,6 +489,79 @@ static void userfaultfd_event_complete(struct userfaultfd_ctx *ctx,
 	__remove_wait_queue(&ctx->event_wqh, &ewq->wq);
 }
 
+int dup_userfaultfd(struct vm_area_struct *vma, struct list_head *fcs)
+{
+	struct userfaultfd_ctx *ctx = NULL, *octx;
+	struct userfaultfd_fork_ctx *fctx;
+
+	octx = vma->vm_userfaultfd_ctx.ctx;
+	if (!octx || !(octx->features & UFFD_FEATURE_EVENT_FORK)) {
+		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
+		vma->vm_flags &= ~(VM_UFFD_WP | VM_UFFD_MISSING);
+		return 0;
+	}
+
+	list_for_each_entry(fctx, fcs, list)
+		if (fctx->orig == octx) {
+			ctx = fctx->new;
+			break;
+		}
+
+	if (!ctx) {
+		fctx = kmalloc(sizeof(*fctx), GFP_KERNEL);
+		if (!fctx)
+			return -ENOMEM;
+
+		ctx = kmem_cache_alloc(userfaultfd_ctx_cachep, GFP_KERNEL);
+		if (!ctx) {
+			kfree(fctx);
+			return -ENOMEM;
+		}
+
+		atomic_set(&ctx->refcount, 1);
+		ctx->flags = octx->flags;
+		ctx->state = UFFD_STATE_RUNNING;
+		ctx->features = octx->features;
+		ctx->released = false;
+		ctx->mm = vma->vm_mm;
+		atomic_inc(&ctx->mm->mm_users);
+
+		userfaultfd_ctx_get(octx);
+		fctx->orig = octx;
+		fctx->new = ctx;
+		list_add_tail(&fctx->list, fcs);
+	}
+
+	vma->vm_userfaultfd_ctx.ctx = ctx;
+	return 0;
+}
+
+static int dup_fctx(struct userfaultfd_fork_ctx *fctx)
+{
+	struct userfaultfd_ctx *ctx = fctx->orig;
+	struct userfaultfd_wait_queue ewq;
+
+	msg_init(&ewq.msg);
+
+	ewq.msg.event = UFFD_EVENT_FORK;
+	ewq.msg.arg.reserved.reserved1 = (__u64)fctx->new;
+
+	return userfaultfd_event_wait_completion(ctx, &ewq);
+}
+
+void dup_userfaultfd_complete(struct list_head *fcs)
+{
+	int ret = 0;
+	struct userfaultfd_fork_ctx *fctx, *n;
+
+	list_for_each_entry_safe(fctx, n, fcs, list) {
+		if (!ret)
+			ret = dup_fctx(fctx);
+		list_del(&fctx->list);
+		kfree(fctx);
+	}
+}
+
 static int userfaultfd_release(struct inode *inode, struct file *file)
 {
 	struct userfaultfd_ctx *ctx = file->private_data;
@@ -614,12 +692,49 @@ static unsigned int userfaultfd_poll(struct file *file, poll_table *wait)
 	}
 }
 
+static const struct file_operations userfaultfd_fops;
+
+static int resolve_userfault_fork(struct userfaultfd_ctx *ctx,
+				  struct userfaultfd_ctx *new,
+				  struct uffd_msg *msg)
+{
+	int fd;
+	struct file *file;
+	unsigned int flags = new->flags & UFFD_SHARED_FCNTL_FLAGS;
+
+	fd = get_unused_fd_flags(flags);
+	if (fd < 0)
+		return fd;
+
+	file = anon_inode_getfile("[userfaultfd]", &userfaultfd_fops, new,
+				  O_RDWR | flags);
+	if (IS_ERR(file)) {
+		put_unused_fd(fd);
+		return PTR_ERR(file);
+	}
+
+	fd_install(fd, file);
+	msg->arg.reserved.reserved1 = 0;
+	msg->arg.fork.ufd = fd;
+
+	return 0;
+}
+
 static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait,
 				    struct uffd_msg *msg)
 {
 	ssize_t ret;
 	DECLARE_WAITQUEUE(wait, current);
 	struct userfaultfd_wait_queue *uwq;
+	/*
+	 * Handling fork event requires sleeping operations, so
+	 * we drop the event_wqh lock, then do these ops, then
+	 * lock it back and wake up the waiter. While the lock is
+	 * dropped the ewq may go away so we keep track of it
+	 * carefully.
+	 */
+	LIST_HEAD(fork_event);
+	struct userfaultfd_ctx *fork_nctx = NULL;
 
 	/* always take the fd_wqh lock before the fault_pending_wqh lock */
 	spin_lock(&ctx->fd_wqh.lock);
@@ -677,6 +792,14 @@ static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait,
 		if (uwq) {
 			*msg = uwq->msg;
 
+			if (uwq->msg.event == UFFD_EVENT_FORK) {
+				fork_nctx = (struct userfaultfd_ctx *)uwq->msg.arg.reserved.reserved1;
+				list_move(&uwq->wq.task_list, &fork_event);
+				spin_unlock(&ctx->event_wqh.lock);
+				ret = 0;
+				break;
+			}
+
 			userfaultfd_event_complete(ctx, uwq);
 			spin_unlock(&ctx->event_wqh.lock);
 			ret = 0;
@@ -700,6 +823,23 @@ static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait,
 	__set_current_state(TASK_RUNNING);
 	spin_unlock(&ctx->fd_wqh.lock);
 
+	if (!ret && msg->event == UFFD_EVENT_FORK) {
+		ret = resolve_userfault_fork(ctx, fork_nctx, msg);
+
+		if (!ret) {
+			spin_lock(&ctx->event_wqh.lock);
+			if (!list_empty(&fork_event)) {
+				uwq = list_first_entry(&fork_event,
+						       typeof(*uwq),
+						       wq.task_list);
+				list_del(&uwq->wq.task_list);
+				__add_wait_queue(&ctx->event_wqh, &uwq->wq);
+				userfaultfd_event_complete(ctx, uwq);
+			}
+			spin_unlock(&ctx->event_wqh.lock);
+		}
+	}
+
 	return ret;
 }
 
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 587480a..0c7b723 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -53,6 +53,9 @@ static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 	return vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP);
 }
 
+extern int dup_userfaultfd(struct vm_area_struct *, struct list_head *);
+extern void dup_userfaultfd_complete(struct list_head *);
+
 #else /* CONFIG_USERFAULTFD */
 
 /* mm helpers */
@@ -80,6 +83,15 @@ static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 	return false;
 }
 
+static inline int dup_userfaultfd(struct vm_area_struct *, struct list_head *)
+{
+	return 0;
+}
+
+static inline void dup_userfaultfd_complete(struct list_head *)
+{
+}
+
 #endif /* CONFIG_USERFAULTFD */
 
 #endif /* _LINUX_USERFAULTFD_K_H */
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 9057d7a..d89eef6 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -14,10 +14,9 @@
 #define UFFD_API ((__u64)0xAA)
 /*
  * After implementing the respective features it will become:
- * #define UFFD_API_FEATURES (UFFD_FEATURE_PAGEFAULT_FLAG_WP | \
- *			      UFFD_FEATURE_EVENT_FORK)
+ * #define UFFD_API_FEATURES (UFFD_FEATURE_PAGEFAULT_FLAG_WP)
  */
-#define UFFD_API_FEATURES (0)
+#define UFFD_API_FEATURES (UFFD_FEATURE_EVENT_FORK)
 #define UFFD_API_IOCTLS \
 	((__u64)1 << _UFFDIO_REGISTER | \
 	 (__u64)1 << _UFFDIO_UNREGISTER | \
@@ -72,6 +71,10 @@ struct uffd_msg {
 		} pagefault;
 
 		struct {
+			__u32 ufd;
+		} fork;
+
+		struct {
 			/* unused reserved fields */
 			__u64 reserved1;
 			__u64 reserved2;
@@ -84,9 +87,7 @@ struct uffd_msg {
  * Start at 0x12 and not at 0 to be more strict against bugs.
  */
 #define UFFD_EVENT_PAGEFAULT 0x12
-#if 0 /* not available yet */
 #define UFFD_EVENT_FORK 0x13
-#endif
 
 /* flags for UFFD_EVENT_PAGEFAULT */
 #define UFFD_PAGEFAULT_FLAG_WRITE (1<<0) /* If this was a write fault */
@@ -107,8 +108,8 @@ struct uffdio_api {
 	 */
 #if 0 /* not available yet */
 #define UFFD_FEATURE_PAGEFAULT_FLAG_WP (1<<0)
-#define UFFD_FEATURE_EVENT_FORK (1<<1)
 #endif
+#define UFFD_FEATURE_EVENT_FORK (1<<1)
 	__u64 features;
 
 	__u64 ioctls;
diff --git a/kernel/fork.c b/kernel/fork.c
index accb722..0624762 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -55,6 +55,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -408,6 +409,7 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
 	struct rb_node **rb_link, *rb_parent;
 	int retval;
 	unsigned long charge;
+	LIST_HEAD(uf);
 
 	uprobe_start_dup_mmap();
 	down_write(&oldmm->mmap_sem);
@@ -461,12 +463,13 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
 		if (retval)
 			goto fail_nomem_policy;
 		tmp->vm_mm = mm;
+		retval = dup_userfaultfd(tmp, &uf);
+		if (retval)
+			goto fail_nomem_anon_vma_fork;
 		if (anon_vma_fork(tmp, mpnt))
 			goto fail_nomem_anon_vma_fork;
-		tmp->vm_flags &=
-			~(VM_LOCKED|VM_LOCKONFAULT|VM_UFFD_MISSING|VM_UFFD_WP);
+		tmp->vm_flags &= ~(VM_LOCKED | VM_LOCKONFAULT);
 		tmp->vm_next = tmp->vm_prev = NULL;
-		tmp->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
 		file = tmp->vm_file;
 		if (file) {
 			struct inode *inode = file_inode(file);
@@ -522,6 +525,7 @@ out:
 	up_write(&mm->mmap_sem);
 	flush_tlb_mm(oldmm);
 	up_write(&oldmm->mmap_sem);
+	dup_userfaultfd_complete(&uf);
 	uprobe_end_dup_mmap();
 	return retval;
 fail_nomem_anon_vma_fork:
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f46.google.com (mail-wm0-f46.google.com [74.125.82.46]) by kanga.kvack.org (Postfix) with ESMTP id 9935C6B0262 for ; Sun, 20 Mar 2016 08:42:48 -0400 (EDT) Received: by mail-wm0-f46.google.com with SMTP id l68so91599103wml.1 for ; Sun, 20 Mar 2016 05:42:48 -0700 (PDT) Received: from e06smtp16.uk.ibm.com (e06smtp16.uk.ibm.com. [195.75.94.112]) by mx.google.com with ESMTPS id k10si13064wjy.108.2016.03.20.05.42.44 for (version=TLS1_2 cipher=AES128-SHA bits=128/128); Sun, 20 Mar 2016 05:42:44 -0700 (PDT) Received: from localhost by e06smtp16.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Sun, 20 Mar 2016 12:42:44 -0000 Received: from b06cxnps4076.portsmouth.uk.ibm.com (d06relay13.portsmouth.uk.ibm.com [9.149.109.198]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id 9C8EA1B0805F for ; Sun, 20 Mar 2016 12:43:12 +0000 (GMT) Received: from d06av06.portsmouth.uk.ibm.com (d06av06.portsmouth.uk.ibm.com [9.149.37.217]) by b06cxnps4076.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id u2KCgfYw262540 for ; Sun, 20 Mar 2016 12:42:41 GMT Received: from d06av06.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av06.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id u2KCgf69017321 for ; Sun, 20 Mar 2016 08:42:41 -0400 From: Mike Rapoport Subject: [PATCH 4/5] uffd: Add mremap() event Date: Sun, 20 Mar 2016 14:42:20 +0200 Message-Id: <1458477741-6942-5-git-send-email-rapoport@il.ibm.com> In-Reply-To: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> Sender: owner-linux-mm@kvack.org List-ID: To: Andrea Arcangeli Cc: Pavel Emelyanov , LKML , linux-mm@kvack.org, Mike Rapoport , Mike Rapoport From: Pavel Emelyanov The event denotes that an area [start:end] moves to different location. 
Length change isn't reported as "new" addresses, if they appear on the uffd reader side they will not contain any data and the latter can just zeromap them. Waiting for the event ACK is also done outside of mmap sem, as for fork event. Signed-off-by: Pavel Emelyanov Signed-off-by: Mike Rapoport --- fs/userfaultfd.c | 37 +++++++++++++++++++++++++++++++++++++ include/linux/userfaultfd_k.h | 17 +++++++++++++++++ include/uapi/linux/userfaultfd.h | 10 +++++++++- mm/mremap.c | 17 ++++++++++++----- 4 files changed, 75 insertions(+), 6 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 565d8f2..a7771bd 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -562,6 +562,43 @@ void dup_userfaultfd_complete(struct list_head *fcs) } } +void mremap_userfaultfd_prep(struct vm_area_struct *vma, + struct vm_userfaultfd_ctx *vm_ctx) +{ + struct userfaultfd_ctx *ctx; + + ctx = vma->vm_userfaultfd_ctx.ctx; + if (ctx && (ctx->features & UFFD_FEATURE_EVENT_REMAP)) { + vm_ctx->ctx = ctx; + userfaultfd_ctx_get(ctx); + } +} + +void mremap_userfaultfd_complete(struct vm_userfaultfd_ctx vm_ctx, + unsigned long from, unsigned long to, + unsigned long len) +{ + struct userfaultfd_ctx *ctx = vm_ctx.ctx; + struct userfaultfd_wait_queue ewq; + + if (!ctx) + return; + + if (to & ~PAGE_MASK) { + userfaultfd_ctx_put(ctx); + return; + } + + msg_init(&ewq.msg); + + ewq.msg.event = UFFD_EVENT_REMAP; + ewq.msg.arg.remap.from = from; + ewq.msg.arg.remap.to = to; + ewq.msg.arg.remap.len = len; + + userfaultfd_event_wait_completion(ctx, &ewq); +} + static int userfaultfd_release(struct inode *inode, struct file *file) { struct userfaultfd_ctx *ctx = file->private_data; diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index 0c7b723..42ea277 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -56,6 +56,12 @@ static inline bool userfaultfd_armed(struct vm_area_struct *vma) extern int dup_userfaultfd(struct vm_area_struct *, struct 
list_head *); extern void dup_userfaultfd_complete(struct list_head *); +extern void mremap_userfaultfd_prep(struct vm_area_struct *, + struct vm_userfaultfd_ctx *); +extern void mremap_userfaultfd_complete(struct vm_userfaultfd_ctx, + unsigned long from, unsigned long to, + unsigned long len); + #else /* CONFIG_USERFAULTFD */ /* mm helpers */ @@ -92,6 +98,17 @@ static inline void dup_userfaultfd_complete(struct list_head *) { } +static inline void mremap_userfaultfd_prep(struct vm_area_struct *vma, + struct vm_userfaultfd_ctx *ctx) +{ +} + +static inline void mremap_userfaultfd_complete(struct vm_userfaultfd_ctx ctx, + unsigned long from, + unsigned long to, + unsigned long len) +{ +} #endif /* CONFIG_USERFAULTFD */ #endif /* _LINUX_USERFAULTFD_K_H */ diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h index d89eef6..46bbb6f 100644 --- a/include/uapi/linux/userfaultfd.h +++ b/include/uapi/linux/userfaultfd.h @@ -16,7 +16,7 @@ * After implementing the respective features it will become: * #define UFFD_API_FEATURES (UFFD_FEATURE_PAGEFAULT_FLAG_WP) */ -#define UFFD_API_FEATURES (UFFD_FEATURE_EVENT_FORK) +#define UFFD_API_FEATURES (UFFD_FEATURE_EVENT_FORK|UFFD_FEATURE_EVENT_REMAP) #define UFFD_API_IOCTLS \ ((__u64)1 << _UFFDIO_REGISTER | \ (__u64)1 << _UFFDIO_UNREGISTER | \ @@ -75,6 +75,12 @@ struct uffd_msg { } fork; struct { + __u64 from; + __u64 to; + __u64 len; + } remap; + + struct { /* unused reserved fields */ __u64 reserved1; __u64 reserved2; @@ -88,6 +94,7 @@ struct uffd_msg { */ #define UFFD_EVENT_PAGEFAULT 0x12 #define UFFD_EVENT_FORK 0x13 +#define UFFD_EVENT_REMAP 0x14 /* flags for UFFD_EVENT_PAGEFAULT */ #define UFFD_PAGEFAULT_FLAG_WRITE (1<<0) /* If this was a write fault */ @@ -110,6 +117,7 @@ struct uffdio_api { #define UFFD_FEATURE_PAGEFAULT_FLAG_WP (1<<0) #endif #define UFFD_FEATURE_EVENT_FORK (1<<1) +#define UFFD_FEATURE_EVENT_REMAP (1<<2) __u64 features; __u64 ioctls; diff --git a/mm/mremap.c b/mm/mremap.c index 
3fa0a467..3581f31 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include @@ -234,7 +235,8 @@ unsigned long move_page_tables(struct vm_area_struct *vma, static unsigned long move_vma(struct vm_area_struct *vma, unsigned long old_addr, unsigned long old_len, - unsigned long new_len, unsigned long new_addr, bool *locked) + unsigned long new_len, unsigned long new_addr, + bool *locked, struct vm_userfaultfd_ctx *uf) { struct mm_struct *mm = vma->vm_mm; struct vm_area_struct *new_vma; @@ -293,6 +295,7 @@ static unsigned long move_vma(struct vm_area_struct *vma, old_addr = new_addr; new_addr = err; } else { + mremap_userfaultfd_prep(new_vma, uf); arch_remap(mm, old_addr, old_addr + old_len, new_addr, new_addr + new_len); } @@ -397,7 +400,8 @@ static struct vm_area_struct *vma_to_resize(unsigned long addr, } static unsigned long mremap_to(unsigned long addr, unsigned long old_len, - unsigned long new_addr, unsigned long new_len, bool *locked) + unsigned long new_addr, unsigned long new_len, bool *locked, + struct vm_userfaultfd_ctx *uf) { struct mm_struct *mm = current->mm; struct vm_area_struct *vma; @@ -442,7 +446,7 @@ static unsigned long mremap_to(unsigned long addr, unsigned long old_len, if (offset_in_page(ret)) goto out1; - ret = move_vma(vma, addr, old_len, new_len, new_addr, locked); + ret = move_vma(vma, addr, old_len, new_len, new_addr, locked, uf); if (!(offset_in_page(ret))) goto out; out1: @@ -481,6 +485,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, unsigned long ret = -EINVAL; unsigned long charged = 0; bool locked = false; + struct vm_userfaultfd_ctx uf = NULL_VM_UFFD_CTX; if (flags & ~(MREMAP_FIXED | MREMAP_MAYMOVE)) return ret; @@ -506,7 +511,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, if (flags & MREMAP_FIXED) { ret = mremap_to(addr, old_len, new_addr, new_len, - &locked); + &locked, &uf); goto out; } @@ -575,7 +580,8 @@ 
SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, goto out; } - ret = move_vma(vma, addr, old_len, new_len, new_addr, &locked); + ret = move_vma(vma, addr, old_len, new_len, new_addr, + &locked, &uf); } out: if (offset_in_page(ret)) { @@ -585,5 +591,6 @@ out: up_write(&current->mm->mmap_sem); if (locked && new_len > old_len) mm_populate(new_addr + old_len, new_len - old_len); + mremap_userfaultfd_complete(uf, addr, new_addr, old_len); return ret; } -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f48.google.com (mail-wm0-f48.google.com [74.125.82.48]) by kanga.kvack.org (Postfix) with ESMTP id B81F76B0263 for ; Sun, 20 Mar 2016 08:42:50 -0400 (EDT) Received: by mail-wm0-f48.google.com with SMTP id l68so91599682wml.1 for ; Sun, 20 Mar 2016 05:42:50 -0700 (PDT) Received: from e06smtp06.uk.ibm.com (e06smtp06.uk.ibm.com. [195.75.94.102]) by mx.google.com with ESMTPS id l67si9265538wmg.76.2016.03.20.05.42.46 for (version=TLS1_2 cipher=AES128-SHA bits=128/128); Sun, 20 Mar 2016 05:42:46 -0700 (PDT) Received: from localhost by e06smtp06.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only!
Violators will be prosecuted for from ; Sun, 20 Mar 2016 12:42:45 -0000 Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id 7CDD91B0804B for ; Sun, 20 Mar 2016 12:43:13 +0000 (GMT) Received: from d06av06.portsmouth.uk.ibm.com (d06av06.portsmouth.uk.ibm.com [9.149.37.217]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id u2KCgg565701968 for ; Sun, 20 Mar 2016 12:42:42 GMT Received: from d06av06.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av06.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id u2KCggbj017352 for ; Sun, 20 Mar 2016 08:42:42 -0400 From: Mike Rapoport Subject: [PATCH 5/5] uffd: Add madvise() event for MADV_DONTNEED request Date: Sun, 20 Mar 2016 14:42:21 +0200 Message-Id: <1458477741-6942-6-git-send-email-rapoport@il.ibm.com> In-Reply-To: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> Sender: owner-linux-mm@kvack.org List-ID: To: Andrea Arcangeli Cc: Pavel Emelyanov , LKML , linux-mm@kvack.org, Mike Rapoport , Mike Rapoport From: Pavel Emelyanov If the page is punched out of the address space the uffd reader should know this and zeromap the respective area in case of the #PF event. 
Signed-off-by: Pavel Emelyanov Signed-off-by: Mike Rapoport --- fs/userfaultfd.c | 26 ++++++++++++++++++++++++++ include/linux/userfaultfd_k.h | 12 ++++++++++++ include/uapi/linux/userfaultfd.h | 9 ++++++++- mm/madvise.c | 2 ++ 4 files changed, 48 insertions(+), 1 deletion(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index a7771bd..e65ca84 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -599,6 +599,32 @@ void mremap_userfaultfd_complete(struct vm_userfaultfd_ctx vm_ctx, userfaultfd_event_wait_completion(ctx, &ewq); } +void madvise_userfault_dontneed(struct vm_area_struct *vma, + struct vm_area_struct **prev, + unsigned long start, unsigned long end) +{ + struct userfaultfd_ctx *ctx; + struct userfaultfd_wait_queue ewq; + + ctx = vma->vm_userfaultfd_ctx.ctx; + if (!ctx || !(ctx->features & UFFD_FEATURE_EVENT_MADVDONTNEED)) + return; + + userfaultfd_ctx_get(ctx); + *prev = NULL; /* We wait for ACK w/o the mmap semaphore */ + up_read(&vma->vm_mm->mmap_sem); + + msg_init(&ewq.msg); + + ewq.msg.event = UFFD_EVENT_MADVDONTNEED; + ewq.msg.arg.madv_dn.start = start; + ewq.msg.arg.madv_dn.end = end; + + userfaultfd_event_wait_completion(ctx, &ewq); + + down_read(&vma->vm_mm->mmap_sem); +} + static int userfaultfd_release(struct inode *inode, struct file *file) { struct userfaultfd_ctx *ctx = file->private_data; diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index 42ea277..7e22a3d 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -62,6 +62,11 @@ extern void mremap_userfaultfd_complete(struct vm_userfaultfd_ctx, unsigned long from, unsigned long to, unsigned long len); +extern void madvise_userfault_dontneed(struct vm_area_struct *vma, + struct vm_area_struct **prev, + unsigned long start, + unsigned long end); + #else /* CONFIG_USERFAULTFD */ /* mm helpers */ @@ -109,6 +114,13 @@ static inline void mremap_userfaultfd_complete(struct vm_userfaultfd_ctx ctx, unsigned long len) { } + +static inline void 
madvise_userfault_dontneed(struct vm_area_struct *vma, + struct vm_area_struct **prev, + unsigned long start, + unsigned long end) +{ +} #endif /* CONFIG_USERFAULTFD */ #endif /* _LINUX_USERFAULTFD_K_H */ diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h index 46bbb6f..cbcb3a5 100644 --- a/include/uapi/linux/userfaultfd.h +++ b/include/uapi/linux/userfaultfd.h @@ -16,7 +16,7 @@ * After implementing the respective features it will become: * #define UFFD_API_FEATURES (UFFD_FEATURE_PAGEFAULT_FLAG_WP) */ -#define UFFD_API_FEATURES (UFFD_FEATURE_EVENT_FORK|UFFD_FEATURE_EVENT_REMAP) +#define UFFD_API_FEATURES (UFFD_FEATURE_EVENT_FORK|UFFD_FEATURE_EVENT_REMAP|UFFD_FEATURE_EVENT_MADVDONTNEED) #define UFFD_API_IOCTLS \ ((__u64)1 << _UFFDIO_REGISTER | \ (__u64)1 << _UFFDIO_UNREGISTER | \ @@ -81,6 +81,11 @@ struct uffd_msg { } remap; struct { + __u64 start; + __u64 end; + } madv_dn; + + struct { /* unused reserved fields */ __u64 reserved1; __u64 reserved2; @@ -95,6 +100,7 @@ struct uffd_msg { #define UFFD_EVENT_PAGEFAULT 0x12 #define UFFD_EVENT_FORK 0x13 #define UFFD_EVENT_REMAP 0x14 +#define UFFD_EVENT_MADVDONTNEED 0x15 /* flags for UFFD_EVENT_PAGEFAULT */ #define UFFD_PAGEFAULT_FLAG_WRITE (1<<0) /* If this was a write fault */ @@ -118,6 +124,7 @@ struct uffdio_api { #endif #define UFFD_FEATURE_EVENT_FORK (1<<1) #define UFFD_FEATURE_EVENT_REMAP (1<<2) +#define UFFD_FEATURE_EVENT_MADVDONTNEED (1<<3) __u64 features; __u64 ioctls; diff --git a/mm/madvise.c b/mm/madvise.c index a011473..7b66d6b 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -10,6 +10,7 @@ #include #include #include +#include #include #include #include @@ -476,6 +477,7 @@ static long madvise_dontneed(struct vm_area_struct *vma, return -EINVAL; zap_page_range(vma, start, end - start, NULL); + madvise_userfault_dontneed(vma, prev, start, end); return 0; } -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. 
For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pf0-f180.google.com (mail-pf0-f180.google.com [209.85.192.180]) by kanga.kvack.org (Postfix) with ESMTP id 933546B0262 for ; Sun, 20 Mar 2016 08:54:52 -0400 (EDT) Received: by mail-pf0-f180.google.com with SMTP id x3so231140486pfb.1 for ; Sun, 20 Mar 2016 05:54:52 -0700 (PDT) Received: from mga04.intel.com (mga04.intel.com. [192.55.52.120]) by mx.google.com with ESMTP id h78si14750995pfh.148.2016.03.20.05.54.51 for ; Sun, 20 Mar 2016 05:54:51 -0700 (PDT) Date: Sun, 20 Mar 2016 20:53:28 +0800 From: kbuild test robot Subject: Re: [PATCH 3/5] uffd: Add fork() event Message-ID: <201603202029.DGStOszG%fengguang.wu@intel.com> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="CE+1k2dSO48ffgeK" Content-Disposition: inline In-Reply-To: <1458477741-6942-4-git-send-email-rapoport@il.ibm.com> Sender: owner-linux-mm@kvack.org List-ID: To: Mike Rapoport Cc: kbuild-all@01.org, Andrea Arcangeli , Pavel Emelyanov , LKML , linux-mm@kvack.org, Mike Rapoport --CE+1k2dSO48ffgeK Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hi Pavel, [auto build test ERROR on next-20160318] [also build test ERROR on v4.5] [cannot apply to v4.5-rc7 v4.5-rc6 v4.5-rc5] [if your patch is applied to the wrong git tree, please drop us a note to help improving the system] url: https://github.com/0day-ci/linux/commits/Mike-Rapoport/userfaultfd-extension-for-non-cooperative-uffd-usage/20160320-204520 config: i386-tinyconfig (attached as .config) reproduce: # save the attached .config to linux build tree make ARCH=i386 All errors (new ones prefixed by >>): In file included from kernel/fork.c:58:0: include/linux/userfaultfd_k.h: In function 'dup_userfaultfd': >> include/linux/userfaultfd_k.h:86:42: error: parameter name omitted static inline int dup_userfaultfd(struct vm_area_struct *, struct list_head *) ^ 
include/linux/userfaultfd_k.h:86:67: error: parameter name omitted static inline int dup_userfaultfd(struct vm_area_struct *, struct list_head *) ^ include/linux/userfaultfd_k.h: In function 'dup_userfaultfd_complete': include/linux/userfaultfd_k.h:91:52: error: parameter name omitted static inline void dup_userfaultfd_complete(struct list_head *) ^ vim +86 include/linux/userfaultfd_k.h 80 81 static inline bool userfaultfd_armed(struct vm_area_struct *vma) 82 { 83 return false; 84 } 85 > 86 static inline int dup_userfaultfd(struct vm_area_struct *, struct list_head *) 87 { 88 return 0; 89 } --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation --CE+1k2dSO48ffgeK Content-Type: application/octet-stream Content-Disposition: attachment; filename=".config.gz" Content-Transfer-Encoding: base64 [base64-encoded .config.gz attachment omitted] --CE+1k2dSO48ffgeK-- -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
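The errors in the report above come from the !CONFIG_USERFAULTFD stubs being written as function definitions without parameter names, which the compiler rejects. A minimal sketch of the likely fix follows, assuming the stubs keep the signatures quoted in the report. The parameter names `vma` and `fcs` and the forward-declared types are illustrative stand-ins chosen so the fragment compiles outside the kernel tree; they are not taken from the series.

```c
/*
 * Userspace stand-ins for the kernel types (assumption: only pointers
 * to them are needed, so forward declarations are sufficient here).
 */
struct vm_area_struct;
struct list_head;

/*
 * Stub for the !CONFIG_USERFAULTFD case. Naming every parameter turns
 * this into a valid function definition; the names are hypothetical.
 */
static inline int dup_userfaultfd(struct vm_area_struct *vma,
				  struct list_head *fcs)
{
	(void)vma;	/* unused in the stub */
	(void)fcs;	/* unused in the stub */
	return 0;
}

/* Same treatment for the completion stub. */
static inline void dup_userfaultfd_complete(struct list_head *fcs)
{
	(void)fcs;	/* unused in the stub */
}
```

The mremap and madvise stubs later in the series already name their parameters, so the same convention would apply to the fork stubs.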
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qg0-f44.google.com (mail-qg0-f44.google.com [209.85.192.44]) by kanga.kvack.org (Postfix) with ESMTP id 6873C6B0005 for ; Mon, 21 Mar 2016 05:54:13 -0400 (EDT) Received: by mail-qg0-f44.google.com with SMTP id u110so147808110qge.3 for ; Mon, 21 Mar 2016 02:54:13 -0700 (PDT) Received: from emea01-db3-obe.outbound.protection.outlook.com (mail-db3on0106.outbound.protection.outlook.com. [157.55.234.106]) by mx.google.com with ESMTPS id z23si9502884qka.91.2016.03.21.02.54.11 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Mon, 21 Mar 2016 02:54:12 -0700 (PDT) Subject: Re: [PATCH 0/5] userfaultfd: extension for non cooperative uffd usage References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> From: Pavel Emelyanov Message-ID: <56EFC4F2.2050104@virtuozzo.com> Date: Mon, 21 Mar 2016 12:54:58 +0300 MIME-Version: 1.0 In-Reply-To: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Mike Rapoport , Andrea Arcangeli Cc: LKML , linux-mm@kvack.org, Mike Rapoport On 03/20/2016 03:42 PM, Mike Rapoport wrote: > Hi, > > This set is to address the issues that appear in userfaultfd usage > scenarios when the task monitoring the uffd and the mm-owner do not > cooperate with each other on VM changes such as remaps, madvises and > fork()-s. > > The patches are essentially the same as in the previous respin [1], > they've just been rebased on the current tree. > > [1] http://thread.gmane.org/gmane.linux.kernel.mm/132662 Thanks, Mike! Acked-by: Pavel Emelyanov -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lb0-f181.google.com (mail-lb0-f181.google.com [209.85.217.181]) by kanga.kvack.org (Postfix) with ESMTP id 93F7F828F3 for ; Wed, 6 Apr 2016 02:14:04 -0400 (EDT) Received: by mail-lb0-f181.google.com with SMTP id qe11so23140951lbc.3 for ; Tue, 05 Apr 2016 23:14:04 -0700 (PDT) Received: from mail-lf0-x243.google.com (mail-lf0-x243.google.com. [2a00:1450:4010:c07::243]) by mx.google.com with ESMTPS id at10si515755lbc.4.2016.04.05.23.14.03 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 05 Apr 2016 23:14:03 -0700 (PDT) Received: by mail-lf0-x243.google.com with SMTP id o124so3344734lfb.2 for ; Tue, 05 Apr 2016 23:14:03 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: <56EFC4F2.2050104@virtuozzo.com> References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> <56EFC4F2.2050104@virtuozzo.com> Date: Wed, 6 Apr 2016 09:14:02 +0300 Message-ID: Subject: Re: [PATCH 0/5] userfaultfd: extension for non cooperative uffd usage From: Mike Rapoport Content-Type: text/plain; charset=UTF-8 Sender: owner-linux-mm@kvack.org List-ID: To: Pavel Emelyanov , Andrea Arcangeli Cc: Mike Rapoport , LKML , linux-mm@kvack.org On Mon, Mar 21, 2016 at 11:54 AM, Pavel Emelyanov wrote: > On 03/20/2016 03:42 PM, Mike Rapoport wrote: >> Hi, >> >> This set is to address the issues that appear in userfaultfd usage >> scenarios when the task monitoring the uffd and the mm-owner do not >> cooperate with each other on VM changes such as remaps, madvises and >> fork()-s. >> >> The patches are essentially the same as in the previous respin [1], >> they've just been rebased on the current tree. >> >> [1] http://thread.gmane.org/gmane.linux.kernel.mm/132662 > > Thanks, Mike! > > Acked-by: Pavel Emelyanov > Any updates/comments on this? -- Sincerely yours, Mike. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org.
For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pf0-f200.google.com (mail-pf0-f200.google.com [209.85.192.200]) by kanga.kvack.org (Postfix) with ESMTP id 20A396B026A for ; Wed, 20 Apr 2016 05:43:42 -0400 (EDT) Received: by mail-pf0-f200.google.com with SMTP id e190so77479669pfe.3 for ; Wed, 20 Apr 2016 02:43:42 -0700 (PDT) Received: from EUR01-DB5-obe.outbound.protection.outlook.com (mail-db5eur01on0128.outbound.protection.outlook.com. [104.47.2.128]) by mx.google.com with ESMTPS id f85si3439405pff.157.2016.04.20.02.43.40 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Wed, 20 Apr 2016 02:43:41 -0700 (PDT) Subject: Re: [PATCH 0/5] userfaultfd: extension for non cooperative uffd usage References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> From: Pavel Emelyanov Message-ID: <57174F90.7080109@virtuozzo.com> Date: Wed, 20 Apr 2016 12:44:48 +0300 MIME-Version: 1.0 In-Reply-To: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Mike Rapoport , Andrea Arcangeli Cc: LKML , linux-mm@kvack.org, Mike Rapoport On 03/20/2016 03:42 PM, Mike Rapoport wrote: > Hi, > > This set is to address the issues that appear in userfaultfd usage > scenarios when the task monitoring the uffd and the mm-owner do not > cooperate with each other on VM changes such as remaps, madvises and > fork()-s. > > The patches are essentially the same as in the previous respin [1], > they've just been rebased on the current tree. Hi, Andrea. Hopefully one day after LSFMM is a good time to try to get a bit of your attention to this set :) -- Pavel -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qg0-f70.google.com (mail-qg0-f70.google.com [209.85.192.70]) by kanga.kvack.org (Postfix) with ESMTP id D527C6B007E for ; Fri, 22 Apr 2016 12:06:00 -0400 (EDT) Received: by mail-qg0-f70.google.com with SMTP id b14so138267255qge.2 for ; Fri, 22 Apr 2016 09:06:00 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id a190si3707919qke.76.2016.04.22.09.05.59 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 22 Apr 2016 09:06:00 -0700 (PDT) Date: Fri, 22 Apr 2016 12:05:57 -0400 From: Andrea Arcangeli Subject: Re: [PATCH 0/5] userfaultfd: extension for non cooperative uffd usage Message-ID: <20160422160557.GB4282@redhat.com> References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> <57174F90.7080109@virtuozzo.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <57174F90.7080109@virtuozzo.com> Sender: owner-linux-mm@kvack.org List-ID: To: Pavel Emelyanov Cc: Mike Rapoport , LKML , linux-mm@kvack.org, Mike Rapoport Hello Pavel and Mike, On Wed, Apr 20, 2016 at 12:44:48PM +0300, Pavel Emelyanov wrote: > On 03/20/2016 03:42 PM, Mike Rapoport wrote: > > Hi, > > > > This set is to address the issues that appear in userfaultfd usage > > scenarios when the task monitoring the uffd and the mm-owner do not > > cooperate with each other on VM changes such as remaps, madvises and > > fork()-s. > > > > The patches are essentially the same as in the previous respin [1], > > they've just been rebased on the current tree. Thanks for rebasing and submitting these new features! > > Hi, Andrea. > > Hopefully one day after LSFMM is a good time to try to get a bit of > your attention to this set :) Yes, at first glance this patchset looks fine. In fact, I already merged it into my tree at the time of the last post.
I just haven't had time to review it in detail yet, as I did with the wrprotect tracking one; that is why I haven't answered yet, sorry. As I said, I already reviewed the wrprotect tracking feature in detail. It requires a few (non-trivial) fixes, and I was planning to fix that part first, because the developer who sent the first implementation a few months ago got busy with something else. Until those bugs get fixed I cannot ship it in my tree, nor send it on its way to -mm. The other main reason for the delay is that I got sidetracked by other issues. One is internal; the other notable one is the postcopy failure caused by the new THP refcounting introduced in 4.5 with THP enabled, which apparently is caused neither by the huge zeropage (tested with use_zero_page = 0) nor by MADV_DONTNEED. I'm also not convinced that the bug is only in the userfaultfd interaction with the new THP refcounting; perhaps it's something more generic that just happens to be reproduced more easily by the heavy postcopy load, which makes tracking it down an even higher priority. I'm afraid that until that regression is fixed, I'll have to concentrate on it. At least I found a way to reproduce it faster, so I'm optimistic it won't take long ;). Andrea -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755399AbcCTMmm (ORCPT ); Sun, 20 Mar 2016 08:42:42 -0400 Received: from e06smtp12.uk.ibm.com ([195.75.94.108]:56414 "EHLO e06smtp12.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755278AbcCTMme (ORCPT ); Sun, 20 Mar 2016 08:42:34 -0400 X-IBM-Helo: d06dlp03.portsmouth.uk.ibm.com X-IBM-MailFrom: rapoport@il.ibm.com X-IBM-RcptTo: linux-kernel@vger.kernel.org From: Mike Rapoport To: Andrea Arcangeli Cc: Pavel Emelyanov , LKML , linux-mm@kvack.org, Mike Rapoport , Mike Rapoport Subject: [PATCH 0/5] userfaultfd: extension for non cooperative uffd usage Date: Sun, 20 Mar 2016 14:42:16 +0200 Message-Id: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> X-Mailer: git-send-email 1.9.1 X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16032012-0009-0000-0000-000008D2038D Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, This set is to address the issues that appear in userfaultfd usage scenarios when the task monitoring the uffd and the mm-owner do not cooperate with each other on VM changes such as remaps, madvises and fork()-s. The patches are essentially the same as in the previous respin [1], they've just been rebased on the current tree.
[1] http://thread.gmane.org/gmane.linux.kernel.mm/132662 Pavel Emelyanov (5): uffd: Split the find_userfault() routine uffd: Add ability to report non-PF events from uffd descriptor uffd: Add fork() event uffd: Add mremap() event uffd: Add madvise() event for MADV_DONTNEED request fs/userfaultfd.c | 319 ++++++++++++++++++++++++++++++++++++++- include/linux/userfaultfd_k.h | 41 +++++ include/uapi/linux/userfaultfd.h | 28 +++- kernel/fork.c | 10 +- mm/madvise.c | 2 + mm/mremap.c | 17 ++- 6 files changed, 395 insertions(+), 22 deletions(-) -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755571AbcCTMmt (ORCPT ); Sun, 20 Mar 2016 08:42:49 -0400 Received: from e06smtp17.uk.ibm.com ([195.75.94.113]:45089 "EHLO e06smtp17.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755514AbcCTMmj (ORCPT ); Sun, 20 Mar 2016 08:42:39 -0400 X-IBM-Helo: d06dlp02.portsmouth.uk.ibm.com X-IBM-MailFrom: rapoport@il.ibm.com X-IBM-RcptTo: linux-kernel@vger.kernel.org From: Mike Rapoport To: Andrea Arcangeli Cc: Pavel Emelyanov , LKML , linux-mm@kvack.org, Mike Rapoport , Mike Rapoport Subject: [PATCH 1/5] uffd: Split the find_userfault() routine Date: Sun, 20 Mar 2016 14:42:17 +0200 Message-Id: <1458477741-6942-2-git-send-email-rapoport@il.ibm.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16032012-0005-0000-0000-00000CF30390 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Pavel Emelyanov I will need one to lookup for userfaultfd_wait_queue-s in different wait queue Signed-off-by: Pavel Emelyanov Signed-off-by: Mike Rapoport --- fs/userfaultfd.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git 
a/fs/userfaultfd.c b/fs/userfaultfd.c index 66cdb44..4f0b53d 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -483,25 +483,30 @@ static int userfaultfd_release(struct inode *inode, struct file *file) } /* fault_pending_wqh.lock must be hold by the caller */ -static inline struct userfaultfd_wait_queue *find_userfault( - struct userfaultfd_ctx *ctx) +static inline struct userfaultfd_wait_queue *find_userfault_in( + wait_queue_head_t *wqh) { wait_queue_t *wq; struct userfaultfd_wait_queue *uwq; - VM_BUG_ON(!spin_is_locked(&ctx->fault_pending_wqh.lock)); + VM_BUG_ON(!spin_is_locked(&wqh->lock)); uwq = NULL; - if (!waitqueue_active(&ctx->fault_pending_wqh)) + if (!waitqueue_active(wqh)) goto out; /* walk in reverse to provide FIFO behavior to read userfaults */ - wq = list_last_entry(&ctx->fault_pending_wqh.task_list, - typeof(*wq), task_list); + wq = list_last_entry(&wqh->task_list, typeof(*wq), task_list); uwq = container_of(wq, struct userfaultfd_wait_queue, wq); out: return uwq; } +static inline struct userfaultfd_wait_queue *find_userfault( + struct userfaultfd_ctx *ctx) +{ + return find_userfault_in(&ctx->fault_pending_wqh); +} + static unsigned int userfaultfd_poll(struct file *file, poll_table *wait) { struct userfaultfd_ctx *ctx = file->private_data; -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755646AbcCTMnE (ORCPT ); Sun, 20 Mar 2016 08:43:04 -0400 Received: from e06smtp17.uk.ibm.com ([195.75.94.113]:45092 "EHLO e06smtp17.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755420AbcCTMmp (ORCPT ); Sun, 20 Mar 2016 08:42:45 -0400 X-IBM-Helo: d06dlp01.portsmouth.uk.ibm.com X-IBM-MailFrom: rapoport@il.ibm.com X-IBM-RcptTo: linux-kernel@vger.kernel.org From: Mike Rapoport To: Andrea Arcangeli Cc: Pavel Emelyanov , LKML , linux-mm@kvack.org, Mike Rapoport , Mike Rapoport Subject: [PATCH 2/5] uffd: Add ability to report non-PF events from uffd 
descriptor Date: Sun, 20 Mar 2016 14:42:18 +0200 Message-Id: <1458477741-6942-3-git-send-email-rapoport@il.ibm.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16032012-0005-0000-0000-00000CF30392 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Pavel Emelyanov The custom events are queued in ctx->event_wqh so as not to disturb the fast-path PF queue-wait-wakeup functions. The events to be generated (other than PF-s) are requested in the UFFD_API ioctl with the uffdio_api.features bits. Those known to the kernel are then turned on and reported back to user-space. Signed-off-by: Pavel Emelyanov Signed-off-by: Mike Rapoport --- fs/userfaultfd.c | 99 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 97 insertions(+), 2 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 4f0b53d..c8e7039 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -12,6 +12,7 @@ * mm/ksm.c (mm hashing).
*/ +#include #include #include #include @@ -45,18 +46,23 @@ struct userfaultfd_ctx { wait_queue_head_t fault_wqh; /* waitqueue head for the pseudo fd to wakeup poll/read */ wait_queue_head_t fd_wqh; + /* waitqueue head for events */ + wait_queue_head_t event_wqh; /* a refile sequence protected by fault_pending_wqh lock */ struct seqcount refile_seq; /* pseudo fd refcounting */ atomic_t refcount; /* userfaultfd syscall flags */ unsigned int flags; + /* features requested from the userspace */ + unsigned int features; /* state machine */ enum userfaultfd_state state; /* released */ bool released; /* mm with one ore more vmas attached to this userfaultfd_ctx */ struct mm_struct *mm; + }; struct userfaultfd_wait_queue { @@ -135,6 +141,8 @@ static void userfaultfd_ctx_put(struct userfaultfd_ctx *ctx) VM_BUG_ON(waitqueue_active(&ctx->fault_pending_wqh)); VM_BUG_ON(spin_is_locked(&ctx->fault_wqh.lock)); VM_BUG_ON(waitqueue_active(&ctx->fault_wqh)); + VM_BUG_ON(spin_is_locked(&ctx->event_wqh.lock)); + VM_BUG_ON(waitqueue_active(&ctx->event_wqh)); VM_BUG_ON(spin_is_locked(&ctx->fd_wqh.lock)); VM_BUG_ON(waitqueue_active(&ctx->fd_wqh)); mmput(ctx->mm); @@ -423,6 +431,59 @@ out: return ret; } +static int __maybe_unused userfaultfd_event_wait_completion( + struct userfaultfd_ctx *ctx, + struct userfaultfd_wait_queue *ewq) +{ + int ret = 0; + + ewq->ctx = ctx; + init_waitqueue_entry(&ewq->wq, current); + + spin_lock(&ctx->event_wqh.lock); + /* + * After the __add_wait_queue the uwq is visible to userland + * through poll/read(). 
+ */ + __add_wait_queue(&ctx->event_wqh, &ewq->wq); + for (;;) { + set_current_state(TASK_KILLABLE); + if (ewq->msg.event == 0) + break; + if (ACCESS_ONCE(ctx->released) || + fatal_signal_pending(current)) { + ret = -1; + __remove_wait_queue(&ctx->event_wqh, &ewq->wq); + break; + } + + spin_unlock(&ctx->event_wqh.lock); + + wake_up_poll(&ctx->fd_wqh, POLLIN); + schedule(); + + spin_lock(&ctx->event_wqh.lock); + } + __set_current_state(TASK_RUNNING); + spin_unlock(&ctx->event_wqh.lock); + + /* + * ctx may go away after this if the userfault pseudo fd is + * already released. + */ + + userfaultfd_ctx_put(ctx); + return ret; +} + +static void userfaultfd_event_complete(struct userfaultfd_ctx *ctx, + struct userfaultfd_wait_queue *ewq) +{ + ewq->msg.event = 0; + wake_up_locked(&ctx->event_wqh); + __remove_wait_queue(&ctx->event_wqh, &ewq->wq); +} + static int userfaultfd_release(struct inode *inode, struct file *file) { struct userfaultfd_ctx *ctx = file->private_data; @@ -507,6 +568,12 @@ static inline struct userfaultfd_wait_queue *find_userfault( return find_userfault_in(&ctx->fault_pending_wqh); } +static inline struct userfaultfd_wait_queue *find_userfault_evt( + struct userfaultfd_ctx *ctx) +{ + return find_userfault_in(&ctx->event_wqh); +} + static unsigned int userfaultfd_poll(struct file *file, poll_table *wait) { struct userfaultfd_ctx *ctx = file->private_data; @@ -538,6 +605,9 @@ static unsigned int userfaultfd_poll(struct file *file, poll_table *wait) smp_mb(); if (waitqueue_active(&ctx->fault_pending_wqh)) ret = POLLIN; + else if (waitqueue_active(&ctx->event_wqh)) + ret = POLLIN; + return ret; default: BUG(); @@ -601,6 +671,19 @@ static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait, break; } spin_unlock(&ctx->fault_pending_wqh.lock); + + spin_lock(&ctx->event_wqh.lock); + uwq = find_userfault_evt(ctx); + if (uwq) { + *msg = uwq->msg; + + userfaultfd_event_complete(ctx, uwq); + spin_unlock(&ctx->event_wqh.lock); + ret = 0; + break; 
+ } + spin_unlock(&ctx->event_wqh.lock); + if (signal_pending(current)) { ret = -ERESTARTSYS; break; @@ -1133,6 +1216,14 @@ out: return ret; } +static inline unsigned int uffd_ctx_features(__u64 user_features) +{ + /* + * For the current set of features the bits just coincide + */ + return (unsigned int)user_features; +} + /* * userland asks for a certain API version and we return which bits * and ioctl commands are implemented in this kernel for such API @@ -1151,19 +1242,21 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx, ret = -EFAULT; if (copy_from_user(&uffdio_api, buf, sizeof(uffdio_api))) goto out; - if (uffdio_api.api != UFFD_API || uffdio_api.features) { + if (uffdio_api.api != UFFD_API || + (uffdio_api.features & ~UFFD_API_FEATURES)) { memset(&uffdio_api, 0, sizeof(uffdio_api)); if (copy_to_user(buf, &uffdio_api, sizeof(uffdio_api))) goto out; ret = -EINVAL; goto out; } - uffdio_api.features = UFFD_API_FEATURES; + uffdio_api.features &= UFFD_API_FEATURES; uffdio_api.ioctls = UFFD_API_IOCTLS; ret = -EFAULT; if (copy_to_user(buf, &uffdio_api, sizeof(uffdio_api))) goto out; ctx->state = UFFD_STATE_RUNNING; + ctx->features = uffd_ctx_features(uffdio_api.features); ret = 0; out: return ret; @@ -1250,6 +1343,7 @@ static void init_once_userfaultfd_ctx(void *mem) init_waitqueue_head(&ctx->fault_pending_wqh); init_waitqueue_head(&ctx->fault_wqh); + init_waitqueue_head(&ctx->event_wqh); init_waitqueue_head(&ctx->fd_wqh); seqcount_init(&ctx->refile_seq); } @@ -1290,6 +1384,7 @@ static struct file *userfaultfd_file_create(int flags) atomic_set(&ctx->refcount, 1); ctx->flags = flags; + ctx->features = 0; ctx->state = UFFD_STATE_WAIT_API; ctx->released = false; ctx->mm = current->mm; -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755706AbcCTMnO (ORCPT ); Sun, 20 Mar 2016 08:43:14 -0400 Received: from e06smtp12.uk.ibm.com ([195.75.94.108]:56441 "EHLO e06smtp12.uk.ibm.com" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755548AbcCTMmr (ORCPT ); Sun, 20 Mar 2016 08:42:47 -0400 X-IBM-Helo: d06dlp03.portsmouth.uk.ibm.com X-IBM-MailFrom: rapoport@il.ibm.com X-IBM-RcptTo: linux-kernel@vger.kernel.org From: Mike Rapoport To: Andrea Arcangeli Cc: Pavel Emelyanov , LKML , linux-mm@kvack.org, Mike Rapoport , Mike Rapoport Subject: [PATCH 5/5] uffd: Add madvise() event for MADV_DONTNEED request Date: Sun, 20 Mar 2016 14:42:21 +0200 Message-Id: <1458477741-6942-6-git-send-email-rapoport@il.ibm.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16032012-0009-0000-0000-000008D2039C Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Pavel Emelyanov If pages are punched out of the address space with MADV_DONTNEED, the uffd reader should be told about it so that it can zeromap the respective area if a #PF event later arrives for it.
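For illustration only, and not part of this patch: a minimal user-space sketch of how a uffd reader might remember ranges reported by the MADV_DONTNEED event and later decide between zeromapping and copying on a #PF. The demo_* names, the fixed 4096-byte page size and the 64-page region are assumptions made up for the sketch; only the start/end pair mirrors the madv_dn payload added below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u	/* assumed page size for the sketch */
#define DEMO_MAX_PAGES 64u	/* assumed size of the tracked region */

/* Hypothetical monitor-side state for one registered region. */
struct demo_region {
	uint64_t base;			/* start address of the region */
	bool dropped[DEMO_MAX_PAGES];	/* true: page was MADV_DONTNEED-ed */
};

/* Mirrors the start/end pair carried in msg.arg.madv_dn. */
struct demo_madv_dn {
	uint64_t start;
	uint64_t end;
};

static uint64_t demo_region_limit(const struct demo_region *r)
{
	return r->base + (uint64_t)DEMO_MAX_PAGES * DEMO_PAGE_SIZE;
}

static void demo_region_init(struct demo_region *r, uint64_t base)
{
	r->base = base;
	memset(r->dropped, 0, sizeof(r->dropped));
}

/* Record that [start, end) was dropped, clamped to the tracked region. */
static void demo_handle_madv_dn(struct demo_region *r,
				const struct demo_madv_dn *ev)
{
	uint64_t limit = demo_region_limit(r);
	uint64_t start = ev->start < r->base ? r->base : ev->start;
	uint64_t end = ev->end > limit ? limit : ev->end;
	uint64_t addr;

	assert(ev->start <= ev->end);
	for (addr = start; addr < end; addr += DEMO_PAGE_SIZE)
		r->dropped[(addr - r->base) / DEMO_PAGE_SIZE] = true;
}

/* On a later #PF: true means zeromap, false means copy the saved data. */
static bool demo_should_zeromap(const struct demo_region *r,
				uint64_t fault_addr)
{
	if (fault_addr < r->base || fault_addr >= demo_region_limit(r))
		return false;
	return r->dropped[(fault_addr - r->base) / DEMO_PAGE_SIZE];
}
```

A real reader would presumably answer the fault with UFFDIO_ZEROPAGE when demo_should_zeromap() returns true and with UFFDIO_COPY otherwise; the sketch assumes page-aligned start/end, as madvise() itself requires for the start address.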
Signed-off-by: Pavel Emelyanov Signed-off-by: Mike Rapoport --- fs/userfaultfd.c | 26 ++++++++++++++++++++++++++ include/linux/userfaultfd_k.h | 12 ++++++++++++ include/uapi/linux/userfaultfd.h | 9 ++++++++- mm/madvise.c | 2 ++ 4 files changed, 48 insertions(+), 1 deletion(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index a7771bd..e65ca84 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -599,6 +599,32 @@ void mremap_userfaultfd_complete(struct vm_userfaultfd_ctx vm_ctx, userfaultfd_event_wait_completion(ctx, &ewq); } +void madvise_userfault_dontneed(struct vm_area_struct *vma, + struct vm_area_struct **prev, + unsigned long start, unsigned long end) +{ + struct userfaultfd_ctx *ctx; + struct userfaultfd_wait_queue ewq; + + ctx = vma->vm_userfaultfd_ctx.ctx; + if (!ctx || !(ctx->features & UFFD_FEATURE_EVENT_MADVDONTNEED)) + return; + + userfaultfd_ctx_get(ctx); + *prev = NULL; /* We wait for ACK w/o the mmap semaphore */ + up_read(&vma->vm_mm->mmap_sem); + + msg_init(&ewq.msg); + + ewq.msg.event = UFFD_EVENT_MADVDONTNEED; + ewq.msg.arg.madv_dn.start = start; + ewq.msg.arg.madv_dn.end = end; + + userfaultfd_event_wait_completion(ctx, &ewq); + + down_read(&vma->vm_mm->mmap_sem); +} + static int userfaultfd_release(struct inode *inode, struct file *file) { struct userfaultfd_ctx *ctx = file->private_data; diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index 42ea277..7e22a3d 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -62,6 +62,11 @@ extern void mremap_userfaultfd_complete(struct vm_userfaultfd_ctx, unsigned long from, unsigned long to, unsigned long len); +extern void madvise_userfault_dontneed(struct vm_area_struct *vma, + struct vm_area_struct **prev, + unsigned long start, + unsigned long end); + #else /* CONFIG_USERFAULTFD */ /* mm helpers */ @@ -109,6 +114,13 @@ static inline void mremap_userfaultfd_complete(struct vm_userfaultfd_ctx ctx, unsigned long len) { } + +static inline void 
madvise_userfault_dontneed(struct vm_area_struct *vma, + struct vm_area_struct **prev, + unsigned long start, + unsigned long end) +{ +} #endif /* CONFIG_USERFAULTFD */ #endif /* _LINUX_USERFAULTFD_K_H */ diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h index 46bbb6f..cbcb3a5 100644 --- a/include/uapi/linux/userfaultfd.h +++ b/include/uapi/linux/userfaultfd.h @@ -16,7 +16,7 @@ * After implementing the respective features it will become: * #define UFFD_API_FEATURES (UFFD_FEATURE_PAGEFAULT_FLAG_WP) */ -#define UFFD_API_FEATURES (UFFD_FEATURE_EVENT_FORK|UFFD_FEATURE_EVENT_REMAP) +#define UFFD_API_FEATURES (UFFD_FEATURE_EVENT_FORK|UFFD_FEATURE_EVENT_REMAP|UFFD_FEATURE_EVENT_MADVDONTNEED) #define UFFD_API_IOCTLS \ ((__u64)1 << _UFFDIO_REGISTER | \ (__u64)1 << _UFFDIO_UNREGISTER | \ @@ -81,6 +81,11 @@ struct uffd_msg { } remap; struct { + __u64 start; + __u64 end; + } madv_dn; + + struct { /* unused reserved fields */ __u64 reserved1; __u64 reserved2; @@ -95,6 +100,7 @@ struct uffd_msg { #define UFFD_EVENT_PAGEFAULT 0x12 #define UFFD_EVENT_FORK 0x13 #define UFFD_EVENT_REMAP 0x14 +#define UFFD_EVENT_MADVDONTNEED 0x15 /* flags for UFFD_EVENT_PAGEFAULT */ #define UFFD_PAGEFAULT_FLAG_WRITE (1<<0) /* If this was a write fault */ @@ -118,6 +124,7 @@ struct uffdio_api { #endif #define UFFD_FEATURE_EVENT_FORK (1<<1) #define UFFD_FEATURE_EVENT_REMAP (1<<2) +#define UFFD_FEATURE_EVENT_MADVDONTNEED (1<<3) __u64 features; __u64 ioctls; diff --git a/mm/madvise.c b/mm/madvise.c index a011473..7b66d6b 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -10,6 +10,7 @@ #include #include #include +#include #include #include #include @@ -476,6 +477,7 @@ static long madvise_dontneed(struct vm_area_struct *vma, return -EINVAL; zap_page_range(vma, start, end - start, NULL); + madvise_userfault_dontneed(vma, prev, start, end); return 0; } -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via 
listexpand id S1755756AbcCTMnT (ORCPT ); Sun, 20 Mar 2016 08:43:19 -0400 Received: from e06smtp05.uk.ibm.com ([195.75.94.101]:46068 "EHLO e06smtp05.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755541AbcCTMmq (ORCPT ); Sun, 20 Mar 2016 08:42:46 -0400 X-IBM-Helo: d06dlp01.portsmouth.uk.ibm.com X-IBM-MailFrom: rapoport@il.ibm.com X-IBM-RcptTo: linux-kernel@vger.kernel.org From: Mike Rapoport To: Andrea Arcangeli Cc: Pavel Emelyanov , LKML , linux-mm@kvack.org, Mike Rapoport , Mike Rapoport Subject: [PATCH 4/5] uffd: Add mremap() event Date: Sun, 20 Mar 2016 14:42:20 +0200 Message-Id: <1458477741-6942-5-git-send-email-rapoport@il.ibm.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16032012-0021-0000-0000-0000085C035A Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Pavel Emelyanov The event denotes that an area [start:end] moves to a different location. A length change isn't reported: if "new" addresses appear on the uffd reader side, they will not contain any data and the reader can just zeromap them. As with the fork event, waiting for the event ACK is done outside of the mmap sem.
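For illustration only, and not part of this patch: a minimal user-space sketch of how a uffd reader might keep its bookkeeping in sync when it receives the remap event. The demo_* names and the src_off field are hypothetical; only the from/to/len triple mirrors the remap payload added below. As a simplifying assumption the sketch handles only a move of a whole tracked area.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the from/to/len triple carried in msg.arg.remap. */
struct demo_remap {
	uint64_t from;
	uint64_t to;
	uint64_t len;
};

/* Hypothetical monitor-side record of one tracked area. */
struct demo_area {
	uint64_t start;		/* current address in the monitored task */
	uint64_t len;		/* length of the area in bytes */
	uint64_t src_off;	/* offset of the area's data in the reader's source */
};

/*
 * Apply a remap event to a tracked area.  Returns 1 if the area moved,
 * 0 if the event does not describe a move of this whole area.
 */
static int demo_apply_remap(struct demo_area *a, const struct demo_remap *ev)
{
	assert(a && ev);
	if (ev->from != a->start || ev->len != a->len)
		return 0;
	a->start = ev->to;
	return 1;
}

/*
 * Translate a faulting address into an offset in the reader's source
 * data.  Returns UINT64_MAX if the address is outside the area.
 */
static uint64_t demo_fault_to_offset(const struct demo_area *a, uint64_t addr)
{
	assert(a);
	if (addr < a->start || addr - a->start >= a->len)
		return UINT64_MAX;
	return a->src_off + (addr - a->start);
}
```

After demo_apply_remap() succeeds, faults at the new location resolve to the same source offsets as before the move, while the old addresses are no longer considered part of the area.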
Signed-off-by: Pavel Emelyanov Signed-off-by: Mike Rapoport --- fs/userfaultfd.c | 37 +++++++++++++++++++++++++++++++++++++ include/linux/userfaultfd_k.h | 17 +++++++++++++++++ include/uapi/linux/userfaultfd.h | 10 +++++++++- mm/mremap.c | 17 ++++++++++++----- 4 files changed, 75 insertions(+), 6 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 565d8f2..a7771bd 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -562,6 +562,43 @@ void dup_userfaultfd_complete(struct list_head *fcs) } } +void mremap_userfaultfd_prep(struct vm_area_struct *vma, + struct vm_userfaultfd_ctx *vm_ctx) +{ + struct userfaultfd_ctx *ctx; + + ctx = vma->vm_userfaultfd_ctx.ctx; + if (ctx && (ctx->features & UFFD_FEATURE_EVENT_REMAP)) { + vm_ctx->ctx = ctx; + userfaultfd_ctx_get(ctx); + } +} + +void mremap_userfaultfd_complete(struct vm_userfaultfd_ctx vm_ctx, + unsigned long from, unsigned long to, + unsigned long len) +{ + struct userfaultfd_ctx *ctx = vm_ctx.ctx; + struct userfaultfd_wait_queue ewq; + + if (!ctx) + return; + + if (to & ~PAGE_MASK) { + userfaultfd_ctx_put(ctx); + return; + } + + msg_init(&ewq.msg); + + ewq.msg.event = UFFD_EVENT_REMAP; + ewq.msg.arg.remap.from = from; + ewq.msg.arg.remap.to = to; + ewq.msg.arg.remap.len = len; + + userfaultfd_event_wait_completion(ctx, &ewq); +} + static int userfaultfd_release(struct inode *inode, struct file *file) { struct userfaultfd_ctx *ctx = file->private_data; diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index 0c7b723..42ea277 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -56,6 +56,12 @@ static inline bool userfaultfd_armed(struct vm_area_struct *vma) extern int dup_userfaultfd(struct vm_area_struct *, struct list_head *); extern void dup_userfaultfd_complete(struct list_head *); +extern void mremap_userfaultfd_prep(struct vm_area_struct *, + struct vm_userfaultfd_ctx *); +extern void mremap_userfaultfd_complete(struct vm_userfaultfd_ctx, + 
unsigned long from, unsigned long to, + unsigned long len); + #else /* CONFIG_USERFAULTFD */ /* mm helpers */ @@ -92,6 +98,17 @@ static inline void dup_userfaultfd_complete(struct list_head *) { } +static inline void mremap_userfaultfd_prep(struct vm_area_struct *vma, + struct vm_userfaultfd_ctx *ctx) +{ +} + +static inline void mremap_userfaultfd_complete(struct vm_userfaultfd_ctx ctx, + unsigned long from, + unsigned long to, + unsigned long len) +{ +} #endif /* CONFIG_USERFAULTFD */ #endif /* _LINUX_USERFAULTFD_K_H */ diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h index d89eef6..46bbb6f 100644 --- a/include/uapi/linux/userfaultfd.h +++ b/include/uapi/linux/userfaultfd.h @@ -16,7 +16,7 @@ * After implementing the respective features it will become: * #define UFFD_API_FEATURES (UFFD_FEATURE_PAGEFAULT_FLAG_WP) */ -#define UFFD_API_FEATURES (UFFD_FEATURE_EVENT_FORK) +#define UFFD_API_FEATURES (UFFD_FEATURE_EVENT_FORK|UFFD_FEATURE_EVENT_REMAP) #define UFFD_API_IOCTLS \ ((__u64)1 << _UFFDIO_REGISTER | \ (__u64)1 << _UFFDIO_UNREGISTER | \ @@ -75,6 +75,12 @@ struct uffd_msg { } fork; struct { + __u64 from; + __u64 to; + __u64 len; + } remap; + + struct { /* unused reserved fields */ __u64 reserved1; __u64 reserved2; @@ -88,6 +94,7 @@ struct uffd_msg { */ #define UFFD_EVENT_PAGEFAULT 0x12 #define UFFD_EVENT_FORK 0x13 +#define UFFD_EVENT_REMAP 0x14 /* flags for UFFD_EVENT_PAGEFAULT */ #define UFFD_PAGEFAULT_FLAG_WRITE (1<<0) /* If this was a write fault */ @@ -110,6 +117,7 @@ struct uffdio_api { #define UFFD_FEATURE_PAGEFAULT_FLAG_WP (1<<0) #endif #define UFFD_FEATURE_EVENT_FORK (1<<1) +#define UFFD_FEATURE_EVENT_REMAP (1<<2) __u64 features; __u64 ioctls; diff --git a/mm/mremap.c b/mm/mremap.c index 3fa0a467..3581f31 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include @@ -234,7 +235,8 @@ unsigned long move_page_tables(struct vm_area_struct *vma, static unsigned long 
move_vma(struct vm_area_struct *vma, unsigned long old_addr, unsigned long old_len, - unsigned long new_len, unsigned long new_addr, bool *locked) + unsigned long new_len, unsigned long new_addr, + bool *locked, struct vm_userfaultfd_ctx *uf) { struct mm_struct *mm = vma->vm_mm; struct vm_area_struct *new_vma; @@ -293,6 +295,7 @@ static unsigned long move_vma(struct vm_area_struct *vma, old_addr = new_addr; new_addr = err; } else { + mremap_userfaultfd_prep(new_vma, uf); arch_remap(mm, old_addr, old_addr + old_len, new_addr, new_addr + new_len); } @@ -397,7 +400,8 @@ static struct vm_area_struct *vma_to_resize(unsigned long addr, } static unsigned long mremap_to(unsigned long addr, unsigned long old_len, - unsigned long new_addr, unsigned long new_len, bool *locked) + unsigned long new_addr, unsigned long new_len, bool *locked, + struct vm_userfaultfd_ctx *uf) { struct mm_struct *mm = current->mm; struct vm_area_struct *vma; @@ -442,7 +446,7 @@ static unsigned long mremap_to(unsigned long addr, unsigned long old_len, if (offset_in_page(ret)) goto out1; - ret = move_vma(vma, addr, old_len, new_len, new_addr, locked); + ret = move_vma(vma, addr, old_len, new_len, new_addr, locked, uf); if (!(offset_in_page(ret))) goto out; out1: @@ -481,6 +485,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, unsigned long ret = -EINVAL; unsigned long charged = 0; bool locked = false; + struct vm_userfaultfd_ctx uf = NULL_VM_UFFD_CTX; if (flags & ~(MREMAP_FIXED | MREMAP_MAYMOVE)) return ret; @@ -506,7 +511,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, if (flags & MREMAP_FIXED) { ret = mremap_to(addr, old_len, new_addr, new_len, - &locked); + &locked, &uf); goto out; } @@ -575,7 +580,8 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, goto out; } - ret = move_vma(vma, addr, old_len, new_len, new_addr, &locked); + ret = move_vma(vma, addr, old_len, new_len, new_addr, + &locked, &uf); } out: if 
(offset_in_page(ret)) { @@ -585,5 +591,6 @@ out: up_write(&current->mm->mmap_sem); if (locked && new_len > old_len) mm_populate(new_addr + old_len, new_len - old_len); + mremap_userfaultfd_complete(uf, addr, new_addr, old_len); return ret; } -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755832AbcCTMnY (ORCPT ); Sun, 20 Mar 2016 08:43:24 -0400 Received: from e06smtp08.uk.ibm.com ([195.75.94.104]:58829 "EHLO e06smtp08.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755544AbcCTMmq (ORCPT ); Sun, 20 Mar 2016 08:42:46 -0400 X-IBM-Helo: d06dlp01.portsmouth.uk.ibm.com X-IBM-MailFrom: rapoport@il.ibm.com X-IBM-RcptTo: linux-kernel@vger.kernel.org From: Mike Rapoport To: Andrea Arcangeli Cc: Pavel Emelyanov , LKML , linux-mm@kvack.org, Mike Rapoport , Mike Rapoport Subject: [PATCH 3/5] uffd: Add fork() event Date: Sun, 20 Mar 2016 14:42:19 +0200 Message-Id: <1458477741-6942-4-git-send-email-rapoport@il.ibm.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16032012-0033-0000-0000-00000713054D Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Pavel Emelyanov When the mm with uffd-ed vmas fork()-s the respective vmas notify their uffds with the event which contains a descriptor with new uffd. This new descriptor can then be used to get events from the child and populate its mm with data. Note, that there can be different uffd-s controlling different vmas within one mm, so first we should collect all those uffds (and ctx-s) in a list and then notify them all one by one but only once per fork(). The context is created at fork() time but the descriptor, file struct and anon inode object is created at event read time.
So some trickery is added to the userfaultfd_ctx_read() to handle the ctx queues' locking vs file creation. Another thing worth noticing is that the task that fork()-s waits for the uffd event to get processed WITHOUT the mmap sem. Signed-off-by: Pavel Emelyanov Signed-off-by: Mike Rapoport --- fs/userfaultfd.c | 146 ++++++++++++++++++++++++++++++++++++++- include/linux/userfaultfd_k.h | 12 ++++ include/uapi/linux/userfaultfd.h | 13 ++-- kernel/fork.c | 10 ++- 4 files changed, 169 insertions(+), 12 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index c8e7039..565d8f2 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -65,6 +65,12 @@ struct userfaultfd_ctx { }; +struct userfaultfd_fork_ctx { + struct userfaultfd_ctx *orig; + struct userfaultfd_ctx *new; + struct list_head list; +}; + struct userfaultfd_wait_queue { struct uffd_msg msg; wait_queue_t wq; @@ -431,9 +437,8 @@ out: return ret; } -static int __maybe_unused userfaultfd_event_wait_completion( - struct userfaultfd_ctx *ctx, - struct userfaultfd_wait_queue *ewq) +static int userfaultfd_event_wait_completion(struct userfaultfd_ctx *ctx, + struct userfaultfd_wait_queue *ewq) { int ret = 0; @@ -484,6 +489,79 @@ static void userfaultfd_event_complete(struct userfaultfd_ctx *ctx, __remove_wait_queue(&ctx->event_wqh, &ewq->wq); } +int dup_userfaultfd(struct vm_area_struct *vma, struct list_head *fcs) +{ + struct userfaultfd_ctx *ctx = NULL, *octx; + struct userfaultfd_fork_ctx *fctx; + + octx = vma->vm_userfaultfd_ctx.ctx; + if (!octx || !(octx->features & UFFD_FEATURE_EVENT_FORK)) { + vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX; + vma->vm_flags &= ~(VM_UFFD_WP | VM_UFFD_MISSING); + return 0; + } + + list_for_each_entry(fctx, fcs, list) + if (fctx->orig == octx) { + ctx = fctx->new; + break; + } + + if (!ctx) { + fctx = kmalloc(sizeof(*fctx), GFP_KERNEL); + if (!fctx) + return -ENOMEM; + + ctx = kmem_cache_alloc(userfaultfd_ctx_cachep, GFP_KERNEL); + if (!ctx) { + kfree(fctx); + return -ENOMEM; + 
} + + atomic_set(&ctx->refcount, 1); + ctx->flags = octx->flags; + ctx->state = UFFD_STATE_RUNNING; + ctx->features = octx->features; + ctx->released = false; + ctx->mm = vma->vm_mm; + atomic_inc(&ctx->mm->mm_users); + + userfaultfd_ctx_get(octx); + fctx->orig = octx; + fctx->new = ctx; + list_add_tail(&fctx->list, fcs); + } + + vma->vm_userfaultfd_ctx.ctx = ctx; + return 0; +} + +static int dup_fctx(struct userfaultfd_fork_ctx *fctx) +{ + struct userfaultfd_ctx *ctx = fctx->orig; + struct userfaultfd_wait_queue ewq; + + msg_init(&ewq.msg); + + ewq.msg.event = UFFD_EVENT_FORK; + ewq.msg.arg.reserved.reserved1 = (__u64)fctx->new; + + return userfaultfd_event_wait_completion(ctx, &ewq); +} + +void dup_userfaultfd_complete(struct list_head *fcs) +{ + int ret = 0; + struct userfaultfd_fork_ctx *fctx, *n; + + list_for_each_entry_safe(fctx, n, fcs, list) { + if (!ret) + ret = dup_fctx(fctx); + list_del(&fctx->list); + kfree(fctx); + } +} + static int userfaultfd_release(struct inode *inode, struct file *file) { struct userfaultfd_ctx *ctx = file->private_data; @@ -614,12 +692,49 @@ static unsigned int userfaultfd_poll(struct file *file, poll_table *wait) } } +static const struct file_operations userfaultfd_fops; + +static int resolve_userfault_fork(struct userfaultfd_ctx *ctx, + struct userfaultfd_ctx *new, + struct uffd_msg *msg) +{ + int fd; + struct file *file; + unsigned int flags = new->flags & UFFD_SHARED_FCNTL_FLAGS; + + fd = get_unused_fd_flags(flags); + if (fd < 0) + return fd; + + file = anon_inode_getfile("[userfaultfd]", &userfaultfd_fops, new, + O_RDWR | flags); + if (IS_ERR(file)) { + put_unused_fd(fd); + return PTR_ERR(file); + } + + fd_install(fd, file); + msg->arg.reserved.reserved1 = 0; + msg->arg.fork.ufd = fd; + + return 0; +} + static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait, struct uffd_msg *msg) { ssize_t ret; DECLARE_WAITQUEUE(wait, current); struct userfaultfd_wait_queue *uwq; + /* + * Handling fork event requires 
sleeping operations, so + * we drop the event_wqh lock, then do these ops, then + * lock it back and wake up the waiter. While the lock is + * dropped the ewq may go away so we keep track of it + * carefully. + */ + LIST_HEAD(fork_event); + struct userfaultfd_ctx *fork_nctx = NULL; /* always take the fd_wqh lock before the fault_pending_wqh lock */ spin_lock(&ctx->fd_wqh.lock); @@ -677,6 +792,14 @@ static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait, if (uwq) { *msg = uwq->msg; + if (uwq->msg.event == UFFD_EVENT_FORK) { + fork_nctx = (struct userfaultfd_ctx *)uwq->msg.arg.reserved.reserved1; + list_move(&uwq->wq.task_list, &fork_event); + spin_unlock(&ctx->event_wqh.lock); + ret = 0; + break; + } + userfaultfd_event_complete(ctx, uwq); spin_unlock(&ctx->event_wqh.lock); ret = 0; @@ -700,6 +823,23 @@ static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait, __set_current_state(TASK_RUNNING); spin_unlock(&ctx->fd_wqh.lock); + if (!ret && msg->event == UFFD_EVENT_FORK) { + ret = resolve_userfault_fork(ctx, fork_nctx, msg); + + if (!ret) { + spin_lock(&ctx->event_wqh.lock); + if (!list_empty(&fork_event)) { + uwq = list_first_entry(&fork_event, + typeof(*uwq), + wq.task_list); + list_del(&uwq->wq.task_list); + __add_wait_queue(&ctx->event_wqh, &uwq->wq); + userfaultfd_event_complete(ctx, uwq); + } + spin_unlock(&ctx->event_wqh.lock); + } + } + return ret; } diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index 587480a..0c7b723 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -53,6 +53,9 @@ static inline bool userfaultfd_armed(struct vm_area_struct *vma) return vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP); } +extern int dup_userfaultfd(struct vm_area_struct *, struct list_head *); +extern void dup_userfaultfd_complete(struct list_head *); + #else /* CONFIG_USERFAULTFD */ /* mm helpers */ @@ -80,6 +83,15 @@ static inline bool userfaultfd_armed(struct vm_area_struct 
*vma) return false; } +static inline int dup_userfaultfd(struct vm_area_struct *, struct list_head *) +{ + return 0; +} + +static inline void dup_userfaultfd_complete(struct list_head *) +{ +} + #endif /* CONFIG_USERFAULTFD */ #endif /* _LINUX_USERFAULTFD_K_H */ diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h index 9057d7a..d89eef6 100644 --- a/include/uapi/linux/userfaultfd.h +++ b/include/uapi/linux/userfaultfd.h @@ -14,10 +14,9 @@ #define UFFD_API ((__u64)0xAA) /* * After implementing the respective features it will become: - * #define UFFD_API_FEATURES (UFFD_FEATURE_PAGEFAULT_FLAG_WP | \ - * UFFD_FEATURE_EVENT_FORK) + * #define UFFD_API_FEATURES (UFFD_FEATURE_PAGEFAULT_FLAG_WP) */ -#define UFFD_API_FEATURES (0) +#define UFFD_API_FEATURES (UFFD_FEATURE_EVENT_FORK) #define UFFD_API_IOCTLS \ ((__u64)1 << _UFFDIO_REGISTER | \ (__u64)1 << _UFFDIO_UNREGISTER | \ @@ -72,6 +71,10 @@ struct uffd_msg { } pagefault; struct { + __u32 ufd; + } fork; + + struct { /* unused reserved fields */ __u64 reserved1; __u64 reserved2; @@ -84,9 +87,7 @@ struct uffd_msg { * Start at 0x12 and not at 0 to be more strict against bugs. 
*/ #define UFFD_EVENT_PAGEFAULT 0x12 -#if 0 /* not available yet */ #define UFFD_EVENT_FORK 0x13 -#endif /* flags for UFFD_EVENT_PAGEFAULT */ #define UFFD_PAGEFAULT_FLAG_WRITE (1<<0) /* If this was a write fault */ @@ -107,8 +108,8 @@ struct uffdio_api { */ #if 0 /* not available yet */ #define UFFD_FEATURE_PAGEFAULT_FLAG_WP (1<<0) -#define UFFD_FEATURE_EVENT_FORK (1<<1) #endif +#define UFFD_FEATURE_EVENT_FORK (1<<1) __u64 features; __u64 ioctls; diff --git a/kernel/fork.c b/kernel/fork.c index accb722..0624762 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -55,6 +55,7 @@ #include #include #include +#include #include #include #include @@ -408,6 +409,7 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm) struct rb_node **rb_link, *rb_parent; int retval; unsigned long charge; + LIST_HEAD(uf); uprobe_start_dup_mmap(); down_write(&oldmm->mmap_sem); @@ -461,12 +463,13 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm) if (retval) goto fail_nomem_policy; tmp->vm_mm = mm; + retval = dup_userfaultfd(tmp, &uf); + if (retval) + goto fail_nomem_anon_vma_fork; if (anon_vma_fork(tmp, mpnt)) goto fail_nomem_anon_vma_fork; - tmp->vm_flags &= - ~(VM_LOCKED|VM_LOCKONFAULT|VM_UFFD_MISSING|VM_UFFD_WP); + tmp->vm_flags &= ~(VM_LOCKED | VM_LOCKONFAULT); tmp->vm_next = tmp->vm_prev = NULL; - tmp->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX; file = tmp->vm_file; if (file) { struct inode *inode = file_inode(file); @@ -522,6 +525,7 @@ out: up_write(&mm->mmap_sem); flush_tlb_mm(oldmm); up_write(&oldmm->mmap_sem); + dup_userfaultfd_complete(&uf); uprobe_end_dup_mmap(); return retval; fail_nomem_anon_vma_fork: -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755491AbcCTMzX (ORCPT ); Sun, 20 Mar 2016 08:55:23 -0400 Received: from mga02.intel.com ([134.134.136.20]:50685 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755276AbcCTMzR (ORCPT ); 
Sun, 20 Mar 2016 08:55:17 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.24,366,1455004800"; d="gz'50?scan'50,208,50";a="937809600"
Date: Sun, 20 Mar 2016 20:53:28 +0800
From: kbuild test robot
To: Mike Rapoport
Cc: kbuild-all@01.org, Andrea Arcangeli, Pavel Emelyanov, LKML, linux-mm@kvack.org, Mike Rapoport, Mike Rapoport
Subject: Re: [PATCH 3/5] uffd: Add fork() event
Message-ID: <201603202029.DGStOszG%fengguang.wu@intel.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="CE+1k2dSO48ffgeK"
Content-Disposition: inline
In-Reply-To: <1458477741-6942-4-git-send-email-rapoport@il.ibm.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-SA-Exim-Connect-IP:
X-SA-Exim-Mail-From: fengguang.wu@intel.com
X-SA-Exim-Scanned: No (on bee); SAEximRunCond expanded to false
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

--CE+1k2dSO48ffgeK
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hi Pavel,

[auto build test ERROR on next-20160318]
[also build test ERROR on v4.5]
[cannot apply to v4.5-rc7 v4.5-rc6 v4.5-rc5]
[if your patch is applied to the wrong git tree, please drop us a note to help improving the system]

url:    https://github.com/0day-ci/linux/commits/Mike-Rapoport/userfaultfd-extension-for-non-cooperative-uffd-usage/20160320-204520
config: i386-tinyconfig (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386

All errors (new ones prefixed by >>):

   In file included from kernel/fork.c:58:0:
   include/linux/userfaultfd_k.h: In function 'dup_userfaultfd':
>> include/linux/userfaultfd_k.h:86:42: error: parameter name omitted
    static inline int dup_userfaultfd(struct vm_area_struct *, struct list_head *)
                                             ^
   include/linux/userfaultfd_k.h:86:67: error: parameter name omitted
    static inline int dup_userfaultfd(struct vm_area_struct *, struct list_head *)
                                                                      ^
   include/linux/userfaultfd_k.h: In function 'dup_userfaultfd_complete':
   include/linux/userfaultfd_k.h:91:52: error: parameter name omitted
    static inline void dup_userfaultfd_complete(struct list_head *)
                                                       ^

vim +86 include/linux/userfaultfd_k.h

    80
    81  static inline bool userfaultfd_armed(struct vm_area_struct *vma)
    82  {
    83          return false;
    84  }
    85
  > 86  static inline int dup_userfaultfd(struct vm_area_struct *, struct list_head *)
    87  {
    88          return 0;
    89  }

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

--CE+1k2dSO48ffgeK
Content-Type: application/octet-stream
Content-Disposition: attachment; filename=".config.gz"
Content-Transfer-Encoding: base64

[base64-encoded .config.gz payload omitted]

--CE+1k2dSO48ffgeK--

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754299AbcCUKJX (ORCPT ); Mon, 21 Mar 2016 06:09:23 -0400
Received: from mail-db3on0112.outbound.protection.outlook.com ([157.55.234.112]:24512 "EHLO emea01-db3-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752638AbcCUKJQ (ORCPT ); Mon, 21 Mar 2016 06:09:16 -0400
X-Greylist: delayed 903 seconds by postgrey-1.27 at vger.kernel.org; Mon, 21 Mar 2016 06:09:16 EDT
Authentication-Results: gmail.com; dkim=none (message not signed) header.d=none;gmail.com; dmarc=none action=none header.from=virtuozzo.com;
Subject: Re: [PATCH 0/5] userfaultfd: extension for non cooperative uffd usage
To: Mike Rapoport, Andrea Arcangeli
References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com>
CC: LKML, Mike Rapoport
From: Pavel Emelyanov
Message-ID: <56EFC4F2.2050104@virtuozzo.com>
Date: Mon, 21 Mar 2016 12:54:58 +0300
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Icedove/38.5.0
MIME-Version: 1.0
In-Reply-To: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
X-Originating-IP: [83.220.236.146]
X-ClientProxiedBy: AM2PR03CA0009.eurprd03.prod.outlook.com (25.160.207.19) To HE1PR08MB0459.eurprd08.prod.outlook.com (25.161.120.143)
X-MS-Office365-Filtering-Correlation-Id: 872def0f-694c-4e09-3c95-08d3516ebfba
X-OriginatorOrg: virtuozzo.com
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 21 Mar 2016 09:54:08.1934 (UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-Transport-CrossTenantHeadersStamped: HE1PR08MB0459
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/20/2016 03:42 PM, Mike Rapoport wrote:
> Hi,
>
> This set is to address the issues that appear in userfaultfd usage
> scenarios when the task monitoring the uffd and the mm-owner do not
> cooperate with each other on VM changes such as remaps, madvises and
> fork()-s.
>
> The patches are essentially the same as in the previous respin [1],
> they've just been rebased on the current tree.
>
> [1] http://thread.gmane.org/gmane.linux.kernel.mm/132662

Thanks, Mike!

Acked-by: Pavel Emelyanov

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754601AbcDFGOH (ORCPT ); Wed, 6 Apr 2016 02:14:07 -0400
Received: from mail-lf0-f67.google.com ([209.85.215.67]:33567 "EHLO mail-lf0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752721AbcDFGOF (ORCPT ); Wed, 6 Apr 2016 02:14:05 -0400
MIME-Version: 1.0
In-Reply-To: <56EFC4F2.2050104@virtuozzo.com>
References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> <56EFC4F2.2050104@virtuozzo.com>
Date: Wed, 6 Apr 2016 09:14:02 +0300
Message-ID:
Subject: Re: [PATCH 0/5] userfaultfd: extension for non cooperative uffd usage
From: Mike Rapoport
To: Pavel Emelyanov, Andrea Arcangeli
Cc: Mike Rapoport, LKML, linux-mm@kvack.org
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 21, 2016 at 11:54 AM, Pavel Emelyanov wrote:
> On 03/20/2016 03:42 PM, Mike Rapoport wrote:
>> Hi,
>>
>> This set is to address the issues that appear in userfaultfd usage
>> scenarios when the task monitoring the uffd and the mm-owner do not
>> cooperate with each other on VM changes such as remaps, madvises and
>> fork()-s.
>>
>> The patches are essentially the same as in the previous respin [1],
>> they've just been rebased on the current tree.
>>
>> [1] http://thread.gmane.org/gmane.linux.kernel.mm/132662
>
> Thanks, Mike!
>
> Acked-by: Pavel Emelyanov
>

Any updates/comments on this?

--
Sincerely yours,
Mike.

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S964815AbcDTJ6l (ORCPT ); Wed, 20 Apr 2016 05:58:41 -0400
Received: from mail-db3on0136.outbound.protection.outlook.com ([157.55.234.136]:57312 "EHLO emea01-db3-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S964786AbcDTJ6i (ORCPT ); Wed, 20 Apr 2016 05:58:38 -0400
Authentication-Results: gmail.com; dkim=none (message not signed) header.d=none;gmail.com; dmarc=none action=none header.from=virtuozzo.com;
Subject: Re: [PATCH 0/5] userfaultfd: extension for non cooperative uffd usage
To: Mike Rapoport, Andrea Arcangeli
References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com>
CC: LKML, Mike Rapoport
From: Pavel Emelyanov
Message-ID: <57174F90.7080109@virtuozzo.com>
Date: Wed, 20 Apr 2016 12:44:48 +0300
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Icedove/38.5.0
MIME-Version: 1.0
In-Reply-To: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
X-Originating-IP: [195.214.232.10]
X-ClientProxiedBy: DB5PR02CA0017.eurprd02.prod.outlook.com (10.161.237.27) To HE1PR08MB0460.eurprd08.prod.outlook.com (10.161.120.144)
X-MS-Office365-Filtering-Correlation-Id: cc3c84f8-aa51-4ca9-f672-08d3690040b5
X-OriginatorOrg: virtuozzo.com
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 20 Apr 2016 09:43:38.7517 (UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-Transport-CrossTenantHeadersStamped: HE1PR08MB0460
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/20/2016 03:42 PM, Mike Rapoport wrote:
> Hi,
>
> This set is to address the issues that appear in userfaultfd usage
> scenarios when the task monitoring the uffd and the mm-owner do not
> cooperate with each other on VM changes such as remaps, madvises and
> fork()-s.
>
> The patches are essentially the same as in the previous respin [1],
> they've just been rebased on the current tree.

Hi, Andrea.

Hopefully one day after LSFMM is a good time to try to get a bit of
your attention to this set :)

-- Pavel

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754383AbcDVQGI (ORCPT ); Fri, 22 Apr 2016 12:06:08 -0400
Received: from mx1.redhat.com ([209.132.183.28]:44819 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754257AbcDVQGF (ORCPT ); Fri, 22 Apr 2016 12:06:05 -0400
Date: Fri, 22 Apr 2016 12:05:57 -0400
From: Andrea Arcangeli
To: Pavel Emelyanov
Cc: Mike Rapoport, LKML, linux-mm@kvack.org, Mike Rapoport
Subject: Re: [PATCH 0/5] userfaultfd: extension for non cooperative uffd usage
Message-ID: <20160422160557.GB4282@redhat.com>
References: <1458477741-6942-1-git-send-email-rapoport@il.ibm.com> <57174F90.7080109@virtuozzo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <57174F90.7080109@virtuozzo.com>
User-Agent: Mutt/1.6.0 (2016-04-01)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello Pavel and Mike,

On Wed, Apr 20, 2016 at 12:44:48PM +0300, Pavel Emelyanov wrote:
> On 03/20/2016 03:42 PM, Mike Rapoport wrote:
> > Hi,
> >
> > This set is to address the issues that appear in userfaultfd usage
> > scenarios when the task monitoring the uffd and the mm-owner do not
> > cooperate with each other on VM changes such as remaps, madvises and
> > fork()-s.
> >
> > The patches are essentially the same as in the previous respin [1],
> > they've just been rebased on the current tree.

Thanks for rebasing and submitting these new features!

>
> Hi, Andrea.
>
> Hopefully one day after LSFMM is a good time to try to get a bit of
> your attention to this set :)

Yes, at first glance this patchset looks fine. In fact I already
merged it in my tree at the time of the last post. I just haven't had
much time to review it in detail yet, as I did with the wrprotect
tracking one; that is why I haven't answered yet, sorry.

As said, I already reviewed the wrprotect tracking feature in detail.
It requires a few (but non-trivial) fixes, and I was planning to fix
that part first, as the developer who sent the first implementation a
few months ago got busy with something else. Until those bugs get
fixed I cannot ship it in my tree, nor send it on the way to -mm.

The other main reason for the delay is that I got sidetracked by other
issues. One is internal; the other notable one is the failure in
postcopy caused by the new THP refcounting introduced in 4.5 with THP
enabled, which apparently is caused by neither the huge zeropage
(tested with use_zero_page = 0) nor MADV_DONTNEED. I'm also
unconvinced it's a bug only in the userfaultfd interaction with the
new THP refcounting; perhaps it's something more generic that just
happens to be reproduced more easily by the heavy postcopy load, which
makes it even higher priority to track down.

I'm afraid that until that regression is fixed, I'll have to
concentrate on it. At least I found a way to reproduce it faster, so
I'm optimistic it won't take long ;).

Andrea