From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:56336)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1X2Nl2-0003eg-IK
	for qemu-devel@nongnu.org; Wed, 02 Jul 2014 12:51:51 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from )
	id 1X2Nku-000802-LY
	for qemu-devel@nongnu.org; Wed, 02 Jul 2014 12:51:44 -0400
Received: from mx1.redhat.com ([209.132.183.28]:63624)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1X2Nku-0007zt-D5
	for qemu-devel@nongnu.org; Wed, 02 Jul 2014 12:51:36 -0400
From: Andrea Arcangeli
Date: Wed, 2 Jul 2014 18:50:15 +0200
Message-Id: <1404319816-30229-10-git-send-email-aarcange@redhat.com>
In-Reply-To: <1404319816-30229-1-git-send-email-aarcange@redhat.com>
References: <1404319816-30229-1-git-send-email-aarcange@redhat.com>
Subject: [Qemu-devel] [PATCH 09/10] userfaultfd: make userfaultfd_write non
 blocking
To: qemu-devel@nongnu.org, kvm@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Cc: Robert Love, Dave Hansen, Jan Kara, Neil Brown, Stefan Hajnoczi,
	Andrew Jones, KOSAKI Motohiro, Michel Lespinasse, Andrea Arcangeli,
	Taras Glek, Juan Quintela, Hugh Dickins, Isaku Yamahata, Mel Gorman,
	Android Kernel Team, Mel Gorman, "Dr. David Alan Gilbert",
	"Huangpeng (Peter)", Anthony Liguori, Mike Hommey, Keith Packard,
	Wenchao Xia, Minchan Kim, Dmitry Adamushko, Johannes Weiner,
	Paolo Bonzini, Andrew Morton

It is generally inefficient to ask for the wakeup of a userfault range
in which no userfault address was read earlier through userfaultfd_read
and is in turn waiting for a wakeup. However, it may come in handy to
wake up the same userfault range twice when multiple threads fault on
the same address.
But we should still return an error, so that an application which
assumes this occurrence can never happen will know it hit a bug. So
just return -ENOENT instead of blocking.

Signed-off-by: Andrea Arcangeli
---
 fs/userfaultfd.c | 34 +++++-----------------------------
 1 file changed, 5 insertions(+), 29 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 4902fa3..deed8cb 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -378,9 +378,7 @@ static ssize_t userfaultfd_write(struct file *file, const char __user *buf,
 				 size_t count, loff_t *ppos)
 {
 	struct userfaultfd_ctx *ctx = file->private_data;
-	ssize_t res;
 	__u64 range[2];
-	DECLARE_WAITQUEUE(wait, current);
 
 	if (ctx->state == USERFAULTFD_STATE_ASK_PROTOCOL) {
 		__u64 protocol;
@@ -408,34 +406,12 @@ static ssize_t userfaultfd_write(struct file *file, const char __user *buf,
 	if (range[0] >= range[1])
 		return -ERANGE;
 
-	spin_lock(&ctx->fd_wqh.lock);
-	__add_wait_queue(&ctx->fd_wqh, &wait);
-	for (;;) {
-		set_current_state(TASK_INTERRUPTIBLE);
-		/* always take the fd_wqh lock before the fault_wqh lock */
-		if (find_userfault(ctx, NULL, POLLOUT)) {
-			if (!wake_userfault(ctx, range)) {
-				res = sizeof(range);
-				break;
-			}
-		}
-		if (signal_pending(current)) {
-			res = -ERESTARTSYS;
-			break;
-		}
-		if (file->f_flags & O_NONBLOCK) {
-			res = -EAGAIN;
-			break;
-		}
-		spin_unlock(&ctx->fd_wqh.lock);
-		schedule();
-		spin_lock(&ctx->fd_wqh.lock);
-	}
-	__remove_wait_queue(&ctx->fd_wqh, &wait);
-	__set_current_state(TASK_RUNNING);
-	spin_unlock(&ctx->fd_wqh.lock);
+	/* always take the fd_wqh lock before the fault_wqh lock */
+	if (find_userfault(ctx, NULL, POLLOUT))
+		if (!wake_userfault(ctx, range))
+			return sizeof(range);
 
-	return res;
+	return -ENOENT;
 }
 
 #ifdef CONFIG_PROC_FS
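
For illustration only, here is a rough userspace sketch of how a caller
might handle the new return values. It is not part of the patch. It
assumes the fd has already completed the protocol handshake and that a
wakeup is requested by writing the two __u64 values of range[] in one
write(). The names wake_result, classify_wake and wake_range are
hypothetical and exist only in this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical outcome codes for this sketch; not defined by the patch. */
enum wake_result {
	WAKE_OK,        /* write() returned sizeof(range) */
	WAKE_NO_WAITER, /* -ENOENT: no userfault was waiting for a wakeup */
	WAKE_ERROR      /* any other failure (e.g. ERANGE); see errno */
};

/* Map a write() return value and the errno seen after it to an outcome. */
static enum wake_result classify_wake(ssize_t ret, int err)
{
	if (ret == (ssize_t)(2 * sizeof(uint64_t)))
		return WAKE_OK;
	if (ret < 0 && err == ENOENT)
		return WAKE_NO_WAITER;
	return WAKE_ERROR;
}

/*
 * Request a wakeup for the range {start, end} on ufd. With this patch
 * applied the call is expected to return immediately rather than sleep.
 */
static enum wake_result wake_range(int ufd, uint64_t start, uint64_t end)
{
	uint64_t range[2] = { start, end };
	ssize_t ret;

	/* The kernel side reads exactly two __u64 values. */
	assert(sizeof(range) == 16);

	ret = write(ufd, range, sizeof(range));
	return classify_wake(ret, errno);
}
```

An application that expects a second wakeup of the same range (several
threads faulting on one address) could treat WAKE_NO_WAITER as benign,
while one that believes it can never happen could treat it as a bug.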