From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:50237) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z7TQt-0002wg-Ft for qemu-devel@nongnu.org; Tue, 23 Jun 2015 15:00:32 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Z7TQo-000758-D4 for qemu-devel@nongnu.org; Tue, 23 Jun 2015 15:00:31 -0400 Received: from mga14.intel.com ([192.55.52.115]:27733) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z7TQo-0006yb-86 for qemu-devel@nongnu.org; Tue, 23 Jun 2015 15:00:26 -0400 Message-ID: <5589ACC3.3060401@intel.com> Date: Tue, 23 Jun 2015 12:00:19 -0700 From: Dave Hansen MIME-Version: 1.0 References: <1431624680-20153-1-git-send-email-aarcange@redhat.com> <1431624680-20153-11-git-send-email-aarcange@redhat.com> In-Reply-To: <1431624680-20153-11-git-send-email-aarcange@redhat.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH 10/23] userfaultfd: add new syscall to provide memory externalization List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Andrea Arcangeli , Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, qemu-devel@nongnu.org, kvm@vger.kernel.org, linux-api@vger.kernel.org Cc: zhang.zhanghailiang@huawei.com, Pavel Emelyanov , Johannes Weiner , Hugh Dickins , "Dr. David Alan Gilbert" , Sanidhya Kashyap , Andres Lagar-Cavilla , Mel Gorman , Paolo Bonzini , "Kirill A. Shutemov" , "Huangpeng (Peter)" , Andy Lutomirski , Linus Torvalds , Peter Feiner On 05/14/2015 10:31 AM, Andrea Arcangeli wrote: > +static int userfaultfd_wake_function(wait_queue_t *wq, unsigned mode, > + int wake_flags, void *key) > +{ > + struct userfaultfd_wake_range *range = key; > + int ret; > + struct userfaultfd_wait_queue *uwq; > + unsigned long start, len; > + > + uwq = container_of(wq, struct userfaultfd_wait_queue, wq); > + ret = 0; > + /* don't wake the pending ones to avoid reads to block */ > + if (uwq->pending && !ACCESS_ONCE(uwq->ctx->released)) > + goto out; > + /* len == 0 means wake all */ > + start = range->start; > + len = range->len; > + if (len && (start > uwq->address || start + len <= uwq->address)) > + goto out; > + ret = wake_up_state(wq->private, mode); > + if (ret) > + /* wake only once, autoremove behavior */ > + list_del_init(&wq->task_list); > +out: > + return ret; > +} ... > +static __always_inline int validate_range(struct mm_struct *mm, > + __u64 start, __u64 len) > +{ > + __u64 task_size = mm->task_size; > + > + if (start & ~PAGE_MASK) > + return -EINVAL; > + if (len & ~PAGE_MASK) > + return -EINVAL; > + if (!len) > + return -EINVAL; > + if (start < mmap_min_addr) > + return -EINVAL; > + if (start >= task_size) > + return -EINVAL; > + if (len > task_size - start) > + return -EINVAL; > + return 0; > +} Hey Andrea, Down in userfaultfd_wake_function(), it looks like you intended for a len=0 to mean "wake all". But the validate_range() that we do from userspace has a !len check in it, which keeps us from passing a len=0 in from userspace. Was that "wake all" for some internal use, or is the check too strict? I was trying to use the wake ioctl after an madvise() (as opposed to filling things in using a userfd copy).