From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-api@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andres Lagar-Cavilla <andreslc@google.com>,
	Dave Hansen <dave@sr71.net>, Paolo Bonzini <pbonzini@redhat.com>,
	Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>,
	Andy Lutomirski <luto@amacapital.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	Sasha Levin <sasha.levin@oracle.com>,
	Hugh Dickins <hughd@google.com>,
	Peter Feiner <pfeiner@google.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Christopher Covington <cov@codeaurora.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Android Kernel Team <kernel-team@android.com>,
	Robert Love <rlove@google.com>,
	Dmitry Adamushko <dmitry.adamushko@gmail.com>,
	Neil Brown <neilb@suse.de>, Mike Hommey <mh@glandium.org>,
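	Taras Glek <tglek@mozilla.com>, Jan Kara <jack@suse.cz>,
	KOSAKI Motohiro <kosaki.motohiro@gmail.com>,
	Michel Lespinasse <walken@google.com>,
	Minchan Kim <minchan@kernel.org>,
	Keith Packard <keithp@keithp.com>,
	"Huangpeng (Peter)" <peter.huangpeng@huawei.com>,
	Isaku Yamahata <yamahata@valinux.co.jp>,
	Anthony Liguori <anthony@codemonkey.ws>,
	Stefan Hajnoczi <stefanha@gmail.com>,
	Wenchao Xia <wenchaoqemu@gmail.com>,
	Andrew Jones <drjones@redhat.com>,
	Juan Quintela <quintela@redhat.com>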
Subject: Re: [PATCH 08/17] mm: madvise MADV_USERFAULT
Date: Tue, 7 Oct 2014 18:21:50 +0300
Message-ID: <20141007152150.GA989@node.dhcp.inet.fi>
In-Reply-To: <20141007132458.GZ2342@redhat.com>

On Tue, Oct 07, 2014 at 03:24:58PM +0200, Andrea Arcangeli wrote:
> Hi Kirill,
> 
> On Tue, Oct 07, 2014 at 01:36:45PM +0300, Kirill A. Shutemov wrote:
> > On Fri, Oct 03, 2014 at 07:07:58PM +0200, Andrea Arcangeli wrote:
> > > MADV_USERFAULT is a new madvise flag that will set VM_USERFAULT in the
> > > vma flags. Whenever VM_USERFAULT is set in an anonymous vma, if
> > > userland touches a still unmapped virtual address, a sigbus signal is
> > > sent instead of allocating a new page. The sigbus signal handler will
> > > then resolve the page fault in userland by calling the
> > > remap_anon_pages syscall.
> > 
> > Hm. I wonder whether this functionality really fits the madvise(2)
> > interface: as far as I understand it, madvise(2) provides a way to give
> > a *hint* to the kernel, which may or may not trigger an action on the
> > kernel side. I don't think an application will behave reasonably if the
> > kernel ignores the *advice* and allocates memory instead of sending
> > SIGBUS.
> > 
> > I would suggest considering some other interface for this
> > functionality: a new syscall or, perhaps, mprotect().
> 
> I didn't feel like adding PROT_USERFAULT to mprotect, which looks
> hardwired to just these flags:

PROT_NOALLOC, maybe?

> 
>        PROT_NONE  The memory cannot be accessed at all.
> 
>        PROT_READ  The memory can be read.
> 
>        PROT_WRITE The memory can be modified.
> 
>        PROT_EXEC  The memory can be executed.

To be complete: PROT_GROWSDOWN, PROT_GROWSUP and unused PROT_SEM.

> So here somebody should comment and choose between:
> 
> 1) set VM_USERFAULT with mprotect(PROT_USERFAULT) instead of
>    the current madvise(MADV_USERFAULT)
> 
> 2) drop MADV_USERFAULT and VM_USERFAULT and force the usage of the
>    userfaultfd protocol as the only way for userland to catch
>    userfaults (each userfaultfd must already register itself into its
>    own virtual memory ranges so it's a trivial change for userfaultfd
>    users that deletes just 1 or 2 lines of userland code, but it would
>    prevent other users from relying on the SIGBUS behavior with
>    info->si_addr=faultaddr)
> 
> 3) keep things as they are now: use MADV_USERFAULT for SIGBUS
>    userfaults, with optional intersection between the
>    vm_flags&VM_USERFAULT ranges and the userfaultfd registered ranges
>    with vma->vm_userfaultfd_ctx!=NULL to decide whether to engage the
>    userfaultfd protocol instead of the plain SIGBUS

4) new syscall?
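
For illustration only: the userland-resolved fault with
info->si_addr=faultaddr discussed above can be approximated with
interfaces that already exist. The sketch below does not use
MADV_USERFAULT or remap_anon_pages, which are only proposals in this
series. As an assumption it substitutes a PROT_NONE mapping, SIGSEGV
instead of SIGBUS, and mprotect() as the way to resolve the fault.
resolve_fault() and demo_userland_fault() are hypothetical names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t faults_handled; /* faults resolved so far */
static void *volatile last_fault_addr;       /* si_addr of the last fault */
static long page_size;

/* Resolve the fault in userland: record the address, then make the page
 * read/write so the faulting instruction can be restarted. mprotect() is
 * not on the POSIX async-signal-safe list; calling it here is assumed to
 * be acceptable for this Linux-only sketch. */
static void resolve_fault(int sig, siginfo_t *info, void *ucontext)
{
    (void)sig;
    (void)ucontext;
    uintptr_t page = (uintptr_t)info->si_addr & ~((uintptr_t)page_size - 1);

    last_fault_addr = info->si_addr;
    faults_handled++;
    if (mprotect((void *)page, (size_t)page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(1); /* cannot resolve: give up instead of faulting forever */
}

/* Hypothetical helper: touch one byte of a PROT_NONE page and check that
 * the handler saw exactly that address. Returns 1 on success, 0 on a
 * mismatch, -1 on setup failure. */
int demo_userland_fault(size_t offset)
{
    struct sigaction sa = {0}, old;
    volatile unsigned char *region;
    int ok;

    page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0 || offset >= (size_t)page_size)
        return -1;

    sa.sa_sigaction = resolve_fault;
    sa.sa_flags = SA_SIGINFO; /* needed to receive siginfo_t with si_addr */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    region = mmap(NULL, (size_t)page_size, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) {
        sigaction(SIGSEGV, &old, NULL);
        return -1;
    }

    faults_handled = 0;
    last_fault_addr = NULL;
    region[offset] = 42; /* faults once, then restarts after the handler */

    ok = faults_handled == 1 &&
         last_fault_addr == (void *)(region + offset) &&
         region[offset] == 42;

    munmap((void *)region, (size_t)page_size);
    sigaction(SIGSEGV, &old, NULL);
    return ok;
}
```

Calling demo_userland_fault(0) from a small main() is expected to
return 1 on Linux. The point of the sketch is only that an mprotect()
protection is enforced by the kernel, so the handler can rely on being
called.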
 
> I will update the code according to the feedback, so please comment.

I don't have a strong opinion on this. I just *feel* it doesn't fit advice
semantics.
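
A minimal sketch of what "advice semantics" means here, assuming a
non-destructive hint such as MADV_SEQUENTIAL: the program computes the
same result whether or not the kernel acts on the hint, so the return
value of madvise() can be ignored. checksum_with_hint() is a
hypothetical helper, not something from the patch set.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: fill an anonymous mapping, optionally pass a
 * read-ahead hint, and return a checksum of the contents.
 * Returns -1 on setup failure. */
long checksum_with_hint(int use_hint)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t len, i;
    unsigned char *buf;
    long sum = 0;

    if (ps <= 0)
        return -1;
    len = 16 * (size_t)ps;

    buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    for (i = 0; i < len; i++)
        buf[i] = (unsigned char)(i * 31u);

    /* Pure hint: the result is deliberately ignored, because correctness
     * must not depend on the kernel honoring it. */
    if (use_hint)
        (void)madvise(buf, len, MADV_SEQUENTIAL);

    for (i = 0; i < len; i++)
        sum += buf[i];

    munmap(buf, len);
    return sum;
}
```

A flag like the proposed MADV_USERFAULT would be different in kind: the
application's control flow would depend on the kernel acting on it.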

The only userspace interface I've designed has not stood the test of time.
I would listen to what the senior maintainers say. :)
 
-- 
 Kirill A. Shutemov

Thread overview: 303+ messages
2014-10-03 17:07 [PATCH 00/17] RFC: userfault v2 Andrea Arcangeli
2014-10-03 17:07 ` [PATCH 01/17] mm: gup: add FOLL_TRIED Andrea Arcangeli
2014-10-03 18:15   ` Linus Torvalds
2014-10-03 20:55     ` Paolo Bonzini
2014-10-03 17:07 ` [PATCH 02/17] mm: gup: add get_user_pages_locked and get_user_pages_unlocked Andrea Arcangeli
2014-10-03 17:07 ` [PATCH 03/17] mm: gup: use get_user_pages_unlocked within get_user_pages_fast Andrea Arcangeli
2014-10-03 17:07 ` [PATCH 04/17] mm: gup: make get_user_pages_fast and __get_user_pages_fast latency conscious Andrea Arcangeli
2014-10-03 18:23   ` Linus Torvalds
2014-10-06 14:14     ` Andrea Arcangeli
2014-10-03 17:07 ` [PATCH 05/17] mm: gup: use get_user_pages_fast and get_user_pages_unlocked Andrea Arcangeli
2014-10-03 17:07 ` [PATCH 06/17] kvm: Faults which trigger IO release the mmap_sem Andrea Arcangeli
2014-10-03 17:07 ` [PATCH 07/17] mm: madvise MADV_USERFAULT: prepare vm_flags to allow more than 32bits Andrea Arcangeli
2014-10-07  9:03   ` Kirill A. Shutemov
2014-11-06 20:08   ` Konstantin Khlebnikov
2014-10-03 17:07 ` [PATCH 08/17] mm: madvise MADV_USERFAULT Andrea Arcangeli
2014-10-03 23:13   ` Mike Hommey
2014-10-06 17:24     ` Andrea Arcangeli
2014-10-07 10:36   ` Kirill A. Shutemov
2014-10-07 10:46     ` Dr. David Alan Gilbert
2014-10-07 10:52       ` Kirill A. Shutemov
2014-10-07 11:01         ` Dr. David Alan Gilbert
2014-10-07 11:30           ` Kirill A. Shutemov
2014-10-07 13:24     ` Andrea Arcangeli
2014-10-07 15:21       ` Kirill A. Shutemov [this message]
2014-10-03 17:07 ` [PATCH 09/17] mm: PT lock: export double_pt_lock/unlock Andrea Arcangeli
2014-10-03 17:08 ` [PATCH 10/17] mm: rmap preparation for remap_anon_pages Andrea Arcangeli
2014-10-03 18:31   ` Linus Torvalds
2014-10-06  8:55     ` Dr. David Alan Gilbert
2014-10-06 16:41       ` Andrea Arcangeli
2014-10-07 12:47         ` Linus Torvalds
2014-10-07 14:19           ` Andrea Arcangeli
2014-10-07 15:52             ` Andrea Arcangeli
2014-10-07 15:54               ` Andy Lutomirski
2014-10-07 16:13               ` Peter Feiner
2014-10-07 16:56             ` Linus Torvalds
2014-10-07 17:07           ` Dr. David Alan Gilbert
2014-10-07 17:14             ` Paolo Bonzini
2014-10-07 17:25               ` Dr. David Alan Gilbert
2014-10-07 11:10   ` Kirill A. Shutemov
2014-10-07 13:37     ` Andrea Arcangeli
2014-10-03 17:08 ` [PATCH 11/17] mm: swp_entry_swapcount Andrea Arcangeli
2014-10-03 17:08 ` [PATCH 12/17] mm: sys_remap_anon_pages Andrea Arcangeli
2014-10-04 13:13   ` Andi Kleen
2014-10-06 17:00     ` Andrea Arcangeli
2014-10-03 17:08 ` [PATCH 13/17] waitqueue: add nr wake parameter to __wake_up_locked_key Andrea Arcangeli
2014-10-03 17:08 ` [PATCH 14/17] userfaultfd: add new syscall to provide memory externalization Andrea Arcangeli
2014-10-10  9:39   ` Thomas Martitz
2014-10-03 17:08 ` [PATCH 15/17] userfaultfd: make userfaultfd_write non blocking Andrea Arcangeli
2014-10-03 17:08 ` [PATCH 16/17] powerpc: add remap_anon_pages and userfaultfd Andrea Arcangeli
2014-10-03 17:08   ` Andrea Arcangeli
2014-10-03 17:08 ` [PATCH 17/17] userfaultfd: implement USERFAULTFD_RANGE_REGISTER|UNREGISTER Andrea Arcangeli
2014-10-03 17:08   ` [Qemu-devel] " Andrea Arcangeli
2014-10-03 17:08   ` Andrea Arcangeli
2014-10-03 17:08   ` Andrea Arcangeli
2014-10-03 17:08   ` Andrea Arcangeli
2014-10-27  9:32 ` [PATCH 00/17] RFC: userfault v2 zhanghailiang
2014-10-27  9:32   ` [Qemu-devel] " zhanghailiang
2014-10-27  9:32   ` zhanghailiang
2014-10-29 17:46   ` Andrea Arcangeli
2014-10-29 17:46     ` [Qemu-devel] " Andrea Arcangeli
2014-10-29 17:46     ` Andrea Arcangeli
2014-10-29 17:56     ` [Qemu-devel] " Peter Maydell
2014-10-29 17:56       ` Peter Maydell
2014-10-29 17:56       ` Peter Maydell
2014-11-21 20:14       ` Andrea Arcangeli
2014-11-21 20:14         ` Andrea Arcangeli
2014-11-21 20:14         ` Andrea Arcangeli
2014-11-21 23:05         ` Peter Maydell
2014-11-21 23:05           ` Peter Maydell
2014-11-21 23:05           ` Peter Maydell
2014-11-25 19:45           ` Andrea Arcangeli
2014-11-25 19:45             ` [Qemu-devel] " Andrea Arcangeli
2014-11-25 19:45             ` Andrea Arcangeli
2014-10-30 11:31     ` zhanghailiang
2014-10-30 11:31       ` [Qemu-devel] " zhanghailiang
2014-10-30 11:31       ` zhanghailiang
2014-10-30 12:49       ` Dr. David Alan Gilbert
2014-10-30 12:49         ` [Qemu-devel] " Dr. David Alan Gilbert
2014-10-30 12:49         ` Dr. David Alan Gilbert
2014-10-31  1:26         ` zhanghailiang
2014-10-31  1:26           ` [Qemu-devel] " zhanghailiang
2014-10-31  1:26           ` zhanghailiang
2014-11-19 18:49           ` Andrea Arcangeli
2014-11-19 18:49             ` [Qemu-devel] " Andrea Arcangeli
2014-11-19 18:49             ` Andrea Arcangeli
2014-11-20  2:54             ` zhanghailiang
2014-11-20  2:54               ` [Qemu-devel] " zhanghailiang
2014-11-20  2:54               ` zhanghailiang
2014-11-20 17:38               ` Andrea Arcangeli
2014-11-20 17:38                 ` [Qemu-devel] " Andrea Arcangeli
2014-11-20 17:38                 ` Andrea Arcangeli
2014-11-21  7:19                 ` zhanghailiang
2014-11-21  7:19                   ` [Qemu-devel] " zhanghailiang
2014-11-21  7:19                   ` zhanghailiang
2014-10-31  2:23       ` Peter Feiner
2014-10-31  2:23         ` [Qemu-devel] " Peter Feiner
2014-10-31  2:23         ` Peter Feiner
2014-10-31  3:29         ` zhanghailiang
2014-10-31  3:29           ` [Qemu-devel] " zhanghailiang
2014-10-31  3:29           ` zhanghailiang
2014-10-31  4:38           ` zhanghailiang
2014-10-31  4:38             ` [Qemu-devel] " zhanghailiang
2014-10-31  4:38             ` zhanghailiang
2014-10-31  5:17             ` Andres Lagar-Cavilla
2014-10-31  5:17               ` [Qemu-devel] " Andres Lagar-Cavilla
2014-10-31  5:17               ` Andres Lagar-Cavilla
2014-10-31  8:11               ` zhanghailiang
2014-10-31  8:11                 ` [Qemu-devel] " zhanghailiang
2014-10-31  8:11                 ` zhanghailiang
2014-10-31 19:39           ` Peter Feiner
2014-10-31 19:39             ` [Qemu-devel] " Peter Feiner
2014-10-31 19:39             ` Peter Feiner
2014-11-01  8:48             ` zhanghailiang
2014-11-01  8:48               ` [Qemu-devel] " zhanghailiang
2014-11-01  8:48               ` zhanghailiang
2014-11-20 17:29             ` Andrea Arcangeli
2014-11-20 17:29               ` [Qemu-devel] " Andrea Arcangeli
2014-11-20 17:29               ` Andrea Arcangeli
2014-11-12  7:18       ` zhanghailiang
2014-11-12  7:18         ` [Qemu-devel] " zhanghailiang
2014-11-12  7:18         ` zhanghailiang
