From: Peter Feiner <pfeiner-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
To: Andrea Arcangeli <aarcange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Linus Torvalds
<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
"Dr. David Alan Gilbert"
<dgilbert-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
qemu-devel-qX2TKyscuCcdnm+yROfE0A@public.gmane.org,
KVM list <kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Linux Kernel Mailing List
<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
linux-mm <linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org>,
Linux API <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Andres Lagar-Cavilla
<andreslc-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Dave Hansen <dave-gkUM19QKKo4@public.gmane.org>,
Paolo Bonzini <pbonzini-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Rik van Riel <riel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Mel Gorman <mgorman-l3A5Bk7waGM@public.gmane.org>,
Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>,
Andrew Morton
<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
Sasha Levin <sasha.levin-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
Hugh Dickins <hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Christopher Covington
<cov-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>,
Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
Android Kernel Team
<kernel-team-z5hGa2qSFaRBDgjK7y7TUQ@public.gmane.org>,
Robert Love <rlove-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Dmitry Adamushko
<dmitry.adamushko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Neil Brown <neilb-l3A5Bk7waGM@public.gmane.org>,
Mike Hommey <mh-YmoObPS1fuhg9hUCZPvPmw@public.gmane.org>,
Taras
Subject: Re: [PATCH 10/17] mm: rmap preparation for remap_anon_pages
Date: Tue, 7 Oct 2014 09:13:20 -0700
Message-ID: <20141007161320.GA17858@google.com>
In-Reply-To: <20141007155247.GD2342-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
On Tue, Oct 07, 2014 at 05:52:47PM +0200, Andrea Arcangeli wrote:
> I probably grossly overestimated the benefits of resolving the
> userfault with a zerocopy page move, sorry. [...]
For posterity, I think it's worth noting that the most expensive aspect of a TLB
shootdown is the interprocessor interrupt necessary to flush other CPUs' TLBs.
On a many-core machine, copying 4K of data looks pretty cheap compared to
taking an interrupt and invalidating TLBs on many cores :-)
> [...] So if we entirely drop the
> zerocopy behavior and the TLB flush of the old page like you
> suggested, the way to keep the userfaultfd mechanism decoupled from
> the userfault resolution mechanism would be to implement an
> atomic-copy syscall. That would work for SIGBUS userfaults too without
> requiring a pseudofd then. It would be enough then to call
> mcopy_atomic(userfault_addr,tmp_addr,len) with the only constraints
> that len must be a multiple of PAGE_SIZE. Of course mcopy_atomic
> wouldn't page fault or call GUP into the destination address (it can't
> otherwise the in-flight partial copy would be visible to the process,
> breaking the atomicity of the copy), but it would fill in the
> pte/trans_huge_pmd with the same strict behavior that remap_anon_pages
> currently has (in turn it would by design bypass the VM_USERFAULT
> check and be ideal for resolving userfaults).
>
> mcopy_atomic could then be also extended to tmpfs and it would work
> without requiring the source page to be a tmpfs page too without
> having to convert page types on the fly.
>
> If I add mcopy_atomic, the patch in subject (10/17) can be dropped of
> course so it'd be even less intrusive than the current
> remap_anon_pages and it would require zero TLB flush during its
> runtime (it would just require an atomic copy).
I like this new approach. It will be good to have a single interface for
resolving anon and tmpfs userfaults.
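To make the proposed calling convention concrete, here is a minimal userspace
sketch. It assumes the mcopy_atomic(dst, src, len) argument order from the
quoted text; the function name mcopy_atomic_sketch and the 0 / -EINVAL return
convention are made up for illustration. It only models the one stated
constraint (len must be a multiple of PAGE_SIZE) and uses memcpy() as a
stand-in, so it does not provide the atomicity the real call would need.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the proposed mcopy_atomic(dst, src, len).
 *
 * Assumptions (not from the patch series): the function name, the
 * 0 / -EINVAL return convention, and doing the copy with memcpy().
 * A kernel implementation would fill a private page first and only then
 * install the pte/trans_huge_pmd, so a partial copy is never visible.
 * memcpy() gives no such guarantee.
 */
static int mcopy_atomic_sketch(void *dst, const void *src, size_t len)
{
	long page_size = sysconf(_SC_PAGESIZE);

	if (page_size <= 0)
		return -EINVAL;

	/* The only constraint stated in the proposal. */
	if (len % (size_t)page_size != 0)
		return -EINVAL;

	/* Non-atomic placeholder for the in-kernel copy + pte install. */
	memcpy(dst, src, len);
	return 0;
}
```

A fault handler would call it as mcopy_atomic_sketch(userfault_addr, tmp_addr,
len) once the page contents have arrived in tmp_addr.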
> So should I try to embed a mcopy_atomic inside userfault_write or can
> I expose it to userland as a standalone new syscall? Or should I do
> something different? Comments?
One interesting (ab)use of userfault_write would be that the faulting process
and the fault-handling process could be different, which would be necessary
for post-copy live migration in CRIU (http://criu.org).
Aside from the aesthetic difference, I can't think of any advantage in favor of
a syscall.
Peter