qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Andrea Arcangeli <aarcange@redhat.com>
To: Richard Weinberger <richard.weinberger@gmail.com>
Cc: kvm <kvm@vger.kernel.org>,
	qemu-devel@nongnu.org,
	Sanidhya Kashyap <sanidhya.gatech@gmail.com>,
	Dave Hansen <dave.hansen@intel.com>,
	zhang.zhanghailiang@huawei.com,
	Pavel Emelyanov <xemul@parallels.com>,
	Hugh Dickins <hughd@google.com>, Mel Gorman <mgorman@suse.de>,
	"Huangpeng (Peter)" <peter.huangpeng@huawei.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Andres Lagar-Cavilla <andreslc@google.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"open list:ABI/API" <linux-api@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Feiner <pfeiner@google.com>
Subject: Re: [Qemu-devel] [PATCH 00/23] userfaultfd v4
Date: Wed, 20 May 2015 16:17:30 +0200	[thread overview]
Message-ID: <20150520141730.GO19097@redhat.com> (raw)
In-Reply-To: <CAFLxGvwGGZH1bbMw+qReZFMK+dc6zoOTCNsuOMdp+xw_jPzPDg@mail.gmail.com>

Hello Richard,

On Tue, May 19, 2015 at 11:59:42PM +0200, Richard Weinberger wrote:
> On Tue, May 19, 2015 at 11:38 PM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
> > On Thu, 14 May 2015 19:30:57 +0200 Andrea Arcangeli <aarcange@redhat.com> wrote:
> >
> >> This is the latest userfaultfd patchset against mm-v4.1-rc3
> >> 2015-05-14-10:04.
> >
> > It would be useful to have some userfaultfd testcases in
> > tools/testing/selftests/.  Partly as an aid to arch maintainers when
> > enabling this.  And also as a standalone thing to give people a
> > practical way of exercising this interface.
> >
> > What are your thoughts on enabling userfaultfd for other architectures,
> > btw?  Are there good use cases, are people working on it, etc?
> 
> UML is using SIGSEGV for page faults.
> i.e. the UML processes receives a SIGSEGV, learns the faulting address
> from the mcontext
> and resolves the fault by installing a new mapping.
> 
> If userfaultfd is faster that the SIGSEGV notification it could speed
> up UML a bit.
> For UML I'm only interested in the notification, not the resolving
> part. The "missing"
> data is present, only a new mapping is needed. No copy of data.
> 
> Andrea, what do you think?

I think you need some kind of UFFDIO_MPROTECT ioctl that is the same
ioctl wrprotect tracking also needs. At the moment we focused the
future plans mostly on wrprotection tracking but it could be extended
to protnone tracking, either with the same feature flag as
wrprotection (with a generic UFFDIO_MPROTECT) or with two separate
feature flags and two separate ioctl.

Your pages are not missing, like in the postcopy live snapshotting
case the pages are not missing. The userfaultfd memory protection
ioctl will not modify the VMA, but it'll just selectively mark
pte/trans_huge_pmd wrprotected/protnone in order to get the faults. In
the case of postcopy live snapshotting a single ioctl call will mark
the entire guest address space readonly.

For live snapshotting the fault resolution is a no brainer: when you
get the fault the page is still readable and it just needs to be
copied off by the live snapshotting thread to a different location,
and then the UFFDIO_MPROTECT will be called again to make the page
writable and wake the blocked fault.

For the protnone, you need to modify the page before waking the
blocked userfault, you can't just remove the protnone or other threads
could modify it (if there are other threads). You'd need a further
ioctl to copy the page off to a different place by using its kernel
address (the userland address is not mapped) and copy it back to
overwrite the original page.

Alternatively once we extend the handle_userfault to tmpfs you could
map the page in two virtual mappings and track the faults in one
mapping (where the tracked app runs) and read/write the page contents
in the other mapping that isn't tracked by the userfault.

These are the first thoughts that comes to mind without knowing
exactly what you need to do after you get the fault address, and
without knowing exactly why you need to mark the region PROT_NONE.

There will be some complications in adding the wrprotection/protnone
feature: if faults could already happen when the wrprotect/protnone is
armed, the handle_userfault() could be invoked in a retry-fault, that
is not ok without allowing the userfault to return VM_FAULT_RETRY even
during a refault (i.e. FAULT_FLAG_TRIED set but FAULT_FLAG_ALLOW_RETRY
not set). The invariants of vma->vm_page_prot and pte/trans_huge_pmd
permissions must also not break anywhere. These are the two main
reasons why these features that requires to flip protection bits are
left implemented later and made visible later with uffdio_api.feature
flags and/or through uffdio_register.ioctl during UFFDIO_REGISTER.

  reply	other threads:[~2015-05-20 14:17 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-14 17:30 [Qemu-devel] [PATCH 00/23] userfaultfd v4 Andrea Arcangeli
2015-05-14 17:30 ` [Qemu-devel] [PATCH 01/23] userfaultfd: linux/Documentation/vm/userfaultfd.txt Andrea Arcangeli
2015-09-11  8:47   ` Michael Kerrisk (man-pages)
2015-12-04 15:50     ` Michael Kerrisk (man-pages)
2015-12-04 17:55       ` Andrea Arcangeli
2015-05-14 17:30 ` [Qemu-devel] [PATCH 02/23] userfaultfd: waitqueue: add nr wake parameter to __wake_up_locked_key Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 03/23] userfaultfd: uAPI Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 04/23] userfaultfd: linux/userfaultfd_k.h Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 05/23] userfaultfd: add vm_userfaultfd_ctx to the vm_area_struct Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 06/23] userfaultfd: add VM_UFFD_MISSING and VM_UFFD_WP Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 07/23] userfaultfd: call handle_userfault() for userfaultfd_missing() faults Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 08/23] userfaultfd: teach vma_merge to merge across vma->vm_userfaultfd_ctx Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 09/23] userfaultfd: prevent khugepaged to merge if userfaultfd is armed Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 10/23] userfaultfd: add new syscall to provide memory externalization Andrea Arcangeli
2015-05-14 17:49   ` Linus Torvalds
2015-05-15 16:04     ` Andrea Arcangeli
2015-05-15 18:22       ` Linus Torvalds
2015-06-23 19:00   ` Dave Hansen
2015-06-23 21:41     ` Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 11/23] userfaultfd: Rename uffd_api.bits into .features Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 12/23] userfaultfd: Rename uffd_api.bits into .features fixup Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 13/23] userfaultfd: change the read API to return a uffd_msg Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 14/23] userfaultfd: wake pending userfaults Andrea Arcangeli
2015-10-22 12:10   ` Peter Zijlstra
2015-10-22 13:20     ` Andrea Arcangeli
2015-10-22 13:38       ` Peter Zijlstra
2015-10-22 14:18         ` Andrea Arcangeli
2015-10-22 15:15           ` Peter Zijlstra
2015-10-22 15:30             ` Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 15/23] userfaultfd: optimize read() and poll() to be O(1) Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 16/23] userfaultfd: allocate the userfaultfd_ctx cacheline aligned Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 17/23] userfaultfd: solve the race between UFFDIO_COPY|ZEROPAGE and read Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 18/23] userfaultfd: buildsystem activation Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 19/23] userfaultfd: activate syscall Andrea Arcangeli
2015-08-11 10:07   ` Bharata B Rao
2015-08-11 13:48     ` Andrea Arcangeli
2015-08-12  5:23       ` Bharata B Rao
2015-09-08  6:08         ` Michael Ellerman
2015-09-08  6:39           ` Bharata B Rao
2015-09-08  7:14             ` Michael Ellerman
2015-09-08 10:40               ` Michael Ellerman
2015-09-08 12:28                 ` Dr. David Alan Gilbert
2015-09-08  8:59             ` Dr. David Alan Gilbert
2015-09-08 10:00               ` Bharata B Rao
2015-09-08 12:46                 ` Dr. David Alan Gilbert
2015-09-08 13:37                   ` Bharata B Rao
2015-09-08 14:13                     ` Dr. David Alan Gilbert
2015-09-10 12:24                       ` Bharata B Rao
2015-09-11 19:15                         ` Dr. David Alan Gilbert
2015-09-14 18:53                         ` Dr. David Alan Gilbert
2015-05-14 17:31 ` [Qemu-devel] [PATCH 20/23] userfaultfd: UFFDIO_COPY|UFFDIO_ZEROPAGE uAPI Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 21/23] userfaultfd: mcopy_atomic|mfill_zeropage: UFFDIO_COPY|UFFDIO_ZEROPAGE preparation Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 22/23] userfaultfd: avoid mmap_sem read recursion in mcopy_atomic Andrea Arcangeli
2015-05-22 20:18   ` Andrew Morton
2015-05-22 20:48     ` Andrea Arcangeli
2015-05-22 21:18       ` Andrew Morton
2015-05-23  1:04         ` Andrea Arcangeli
2015-05-14 17:31 ` [Qemu-devel] [PATCH 23/23] userfaultfd: UFFDIO_COPY and UFFDIO_ZEROPAGE Andrea Arcangeli
2015-05-18 14:24 ` [Qemu-devel] [PATCH 00/23] userfaultfd v4 Pavel Emelyanov
2015-05-19 21:38 ` Andrew Morton
2015-05-19 21:59   ` Richard Weinberger
2015-05-20 14:17     ` Andrea Arcangeli [this message]
2015-05-20 13:23   ` Andrea Arcangeli
2015-05-21 13:11 ` Kirill Smelkov
2015-05-21 15:52   ` Andrea Arcangeli
2015-05-22 16:35     ` Kirill Smelkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150520141730.GO19097@redhat.com \
    --to=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=andreslc@google.com \
    --cc=dave.hansen@intel.com \
    --cc=dgilbert@redhat.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=kirill@shutemov.name \
    --cc=kvm@vger.kernel.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@amacapital.net \
    --cc=mgorman@suse.de \
    --cc=pbonzini@redhat.com \
    --cc=peter.huangpeng@huawei.com \
    --cc=pfeiner@google.com \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.weinberger@gmail.com \
    --cc=sanidhya.gatech@gmail.com \
    --cc=torvalds@linux-foundation.org \
    --cc=xemul@parallels.com \
    --cc=zhang.zhanghailiang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).