From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Hansen
Subject: Re: [PATCH 10/23] userfaultfd: add new syscall to provide memory externalization
Date: Tue, 23 Jun 2015 12:00:19 -0700
Message-ID: <5589ACC3.3060401@intel.com>
References: <1431624680-20153-1-git-send-email-aarcange@redhat.com>
 <1431624680-20153-11-git-send-email-aarcange@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <1431624680-20153-11-git-send-email-aarcange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Sender: linux-api-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Andrea Arcangeli, Andrew Morton,
 linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
 qemu-devel-qX2TKyscuCcdnm+yROfE0A@public.gmane.org,
 kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: Pavel Emelyanov, Sanidhya Kashyap,
 zhang.zhanghailiang-hv44wF8Li93QT0dZR+AlfA@public.gmane.org,
 Linus Torvalds, "Kirill A. Shutemov", Andres Lagar-Cavilla,
 Paolo Bonzini, Rik van Riel, Mel Gorman, Andy Lutomirski,
 Hugh Dickins, Peter Feiner, "Dr. David Alan Gilbert",
 Johannes Weiner, "Huangpeng (Peter)"
List-Id: linux-api@vger.kernel.org

On 05/14/2015 10:31 AM, Andrea Arcangeli wrote:
> +static int userfaultfd_wake_function(wait_queue_t *wq, unsigned mode,
> +				     int wake_flags, void *key)
> +{
> +	struct userfaultfd_wake_range *range = key;
> +	int ret;
> +	struct userfaultfd_wait_queue *uwq;
> +	unsigned long start, len;
> +
> +	uwq = container_of(wq, struct userfaultfd_wait_queue, wq);
> +	ret = 0;
> +	/* don't wake the pending ones to avoid reads to block */
> +	if (uwq->pending && !ACCESS_ONCE(uwq->ctx->released))
> +		goto out;
> +	/* len == 0 means wake all */
> +	start = range->start;
> +	len = range->len;
> +	if (len && (start > uwq->address || start + len <= uwq->address))
> +		goto out;
> +	ret = wake_up_state(wq->private, mode);
> +	if (ret)
> +		/* wake only once, autoremove behavior */
> +		list_del_init(&wq->task_list);
> +out:
> +	return ret;
> +}
...
> +static __always_inline int validate_range(struct mm_struct *mm,
> +					  __u64 start, __u64 len)
> +{
> +	__u64 task_size = mm->task_size;
> +
> +	if (start & ~PAGE_MASK)
> +		return -EINVAL;
> +	if (len & ~PAGE_MASK)
> +		return -EINVAL;
> +	if (!len)
> +		return -EINVAL;
> +	if (start < mmap_min_addr)
> +		return -EINVAL;
> +	if (start >= task_size)
> +		return -EINVAL;
> +	if (len > task_size - start)
> +		return -EINVAL;
> +	return 0;
> +}

Hey Andrea,

Down in userfaultfd_wake_function(), it looks like you intended for a
len=0 to mean "wake all".  But the validate_range() that we do from
userspace has a !len check in it, which keeps us from passing a len=0 in
from userspace.

Was that "wake all" for some internal use, or is the check too strict?

I was trying to use the wake ioctl after an madvise() (as opposed to
filling things in using a userfd copy).
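
To make the mismatch concrete, here is a small userspace model of the two
checks quoted above.  It is only a sketch: the model_* function names and the
MODEL_* constants (4K pages, a 64K mmap_min_addr, a 47-bit task size) are
assumptions picked for illustration, not the kernel's definitions, and the
wake side models only the range test, not the pending/released logic.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; real ones depend on arch/config. */
#define MODEL_PAGE_SIZE     4096ULL
#define MODEL_PAGE_MASK     (~(MODEL_PAGE_SIZE - 1))
#define MODEL_MMAP_MIN_ADDR 65536ULL
#define MODEL_TASK_SIZE     (1ULL << 47)

/* Mirrors the sequence of checks in the quoted validate_range(). */
int model_validate_range(uint64_t start, uint64_t len)
{
	if (start & ~MODEL_PAGE_MASK)
		return -EINVAL;
	if (len & ~MODEL_PAGE_MASK)
		return -EINVAL;
	if (!len)
		return -EINVAL;	/* rejects len == 0 coming from userspace */
	if (start < MODEL_MMAP_MIN_ADDR)
		return -EINVAL;
	if (start >= MODEL_TASK_SIZE)
		return -EINVAL;
	if (len > MODEL_TASK_SIZE - start)
		return -EINVAL;
	return 0;
}

/*
 * Mirrors only the range test in the quoted userfaultfd_wake_function():
 * len == 0 matches every waiter, otherwise address must fall inside
 * [start, start + len).
 */
bool model_range_matches(uint64_t start, uint64_t len, uint64_t address)
{
	if (len && (start > address || start + len <= address))
		return false;
	return true;
}

/* Walks through the case raised in this mail. */
void model_selfcheck(void)
{
	uint64_t addr = 0x200000;	/* hypothetical faulting address */

	/* The wake side would treat len == 0 as "wake all"... */
	assert(model_range_matches(0, 0, addr));
	/* ...but the validation side never lets len == 0 through. */
	assert(model_validate_range(addr, 0) == -EINVAL);
	/* A one-page range passes validation and matches the waiter. */
	assert(model_validate_range(addr, MODEL_PAGE_SIZE) == 0);
	assert(model_range_matches(addr, MODEL_PAGE_SIZE, addr));
	/* A range starting one page later does not match it. */
	assert(!model_range_matches(addr + MODEL_PAGE_SIZE, MODEL_PAGE_SIZE, addr));
}
```

Calling model_selfcheck() from a test main() exercises the case in question:
under these assumptions, the len == 0 path in the wake function can only be
reached by a caller that bypasses the validation step.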