From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrea Arcangeli
Subject: Re: [PATCH 10/23] userfaultfd: add new syscall to provide memory externalization
Date: Tue, 23 Jun 2015 23:41:41 +0200
Message-ID: <20150623214141.GB4312@redhat.com>
References: <1431624680-20153-1-git-send-email-aarcange@redhat.com>
 <1431624680-20153-11-git-send-email-aarcange@redhat.com>
 <5589ACC3.3060401@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To: <5589ACC3.3060401-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Sender: linux-api-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Dave Hansen
Cc: Andrew Morton ,
 linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
 qemu-devel-qX2TKyscuCcdnm+yROfE0A@public.gmane.org,
 kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 Pavel Emelyanov , Sanidhya Kashyap ,
 zhang.zhanghailiang-hv44wF8Li93QT0dZR+AlfA@public.gmane.org,
 Linus Torvalds , "Kirill A. Shutemov" , Andres Lagar-Cavilla ,
 Paolo Bonzini , Rik van Riel , Mel Gorman , Andy Lutomirski ,
 Hugh Dickins , Peter Feiner , "Dr. David Alan Gilbert" ,
 Johannes Weiner , "Huangpeng (Peter)"
List-Id: linux-api@vger.kernel.org

Hi Dave,

On Tue, Jun 23, 2015 at 12:00:19PM -0700, Dave Hansen wrote:
> Down in userfaultfd_wake_function(), it looks like you intended for a
> len=0 to mean "wake all". But the validate_range() that we do from
> userspace has a !len check in it, which keeps us from passing a len=0 in
> from userspace.
> Was that "wake all" for some internal use, or is the check too strict?

It's for internal use, i.e. for userfaultfd_release, which has to wake
them all (after setting ctx->released) if the uffd is closed. It avoids
enlarging the structure by depending on the invariant that userland
cannot pass len=0.

If we accepted len=0 from userland as valid, I'd feel safer if it did
nothing, like in madvise. I doubt we want to expose this non-standard
kernel-internal behavior to userland.

> I was trying to use the wake ioctl after an madvise() (as opposed to
> filling things in using a userfd copy).

madvise will return 0 if len=0, mremap would return -EINVAL if new_len
is zero, and mmap also returns -EINVAL if len is 0: not all MM syscalls
are as permissive as madvise.

Can't you pass the same len you pass to madvise to UFFDIO_WAKE (or just
skip the call if the madvise len is zero)?

Thanks,
Andrea
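
To make the len=0 convention discussed in this message concrete, here is
a minimal userspace sketch in C. It is only a model under stated
assumptions, not the kernel code: struct wake_range and every model_*
function name are hypothetical, and alignment and bounds checks are
left out. It shows the two sides of the invariant (the userland-facing
check rejects len=0, while the internal wake path treats len=0 as "wake
all") plus the caller-side pattern of reusing the madvise length and
skipping the wake when it is zero.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model type; names follow the thread, not the kernel source. */
struct wake_range {
    uint64_t start;
    uint64_t len;   /* 0 is reserved internally for "wake all" */
};

/* Models the userland-facing check: a zero length is rejected. */
int model_validate_range(uint64_t start, uint64_t len)
{
    (void)start;    /* alignment and bounds checks are omitted in this sketch */
    if (!len)
        return -EINVAL;
    return 0;
}

/* Models the wake decision for one waiter blocked on fault_addr. */
bool model_should_wake(const struct wake_range *r, uint64_t fault_addr)
{
    if (r->len == 0)
        return true;    /* internal "wake all", as used when the uffd is released */
    /* Written as a subtraction so start + len cannot overflow. */
    return fault_addr >= r->start && fault_addr - r->start < r->len;
}

/* Models the ioctl entry: 1 if the waiter is woken, 0 if not, negative errno on error. */
int model_uffdio_wake(uint64_t start, uint64_t len, uint64_t fault_addr)
{
    int ret = model_validate_range(start, len);
    if (ret)
        return ret;
    struct wake_range r = { start, len };
    return model_should_wake(&r, fault_addr) ? 1 : 0;
}

/* Caller-side pattern from the reply: reuse the madvise length, skip if it is zero. */
int wake_after_madvise(uint64_t start, uint64_t len, uint64_t fault_addr)
{
    if (len == 0)
        return 0;       /* nothing was madvised, so there is nothing to wake */
    return model_uffdio_wake(start, len, fault_addr);
}

/* Small self-check exercising each case of the model. */
void model_self_check(void)
{
    struct wake_range all = { 0, 0 };
    assert(model_should_wake(&all, 0x7000));                  /* internal wake all */
    assert(model_uffdio_wake(0x1000, 0, 0x1000) == -EINVAL);  /* userland len=0 rejected */
    assert(model_uffdio_wake(0x1000, 0x2000, 0x2fff) == 1);   /* last byte in range */
    assert(model_uffdio_wake(0x1000, 0x2000, 0x3000) == 0);   /* one past the end */
    assert(wake_after_madvise(0x1000, 0, 0x1000) == 0);       /* skipped, no error */
}
```

Because the userland path can never produce len=0, the model (like the
design described above) needs no extra flag in the range structure to
express "wake all".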