From: Andrea Arcangeli <aarcange@redhat.com>
To: Baptiste Reynal <b.reynal@virtualopensystems.com>
Cc: Hailiang Zhang <zhang.zhanghailiang@huawei.com>,
	peter.huangpeng@huawei.com, qemu list <qemu-devel@nongnu.org>,
	hanweidong@huawei.com, Juan Quintela <quintela@redhat.com>,
	dgilbert@redhat.com, Amit Shah <amit.shah@redhat.com>,
	Christian Pinto <c.pinto@virtualopensystems.com>
Subject: Re: [Qemu-devel] [RFC 00/13] Live memory snapshot based on userfaultfd
Date: Tue, 5 Jul 2016 16:59:04 +0200
Message-ID: <20160705145904.GA4513@redhat.com>
In-Reply-To: <CAN9JPjH_AQjHGbMMdSZofRp8nxcfgtkzpK6LZsd7bWyfGhJkCQ@mail.gmail.com>

Hello,

On Tue, Jul 05, 2016 at 11:57:31AM +0200, Baptiste Reynal wrote:
> Ok, if it is not on Andrea schedule I am willing to take the action,
> at least for ARM/ARM64 support.

A few days ago I released this update:

https://git.kernel.org/cgit/linux/kernel/git/andrea/aa.git/

git clone -b master --reference linux \
    git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git
cd aa
git fetch
git reset --hard origin/master

The branch will be constantly rebased so you will need to rebase or
reset on origin/master after a fetch to get the updates.


Features added:

1) WP support for anon (Shaohua, hugetlbfs has a FIXME)
2) non cooperative support (Pavel & Mike Rapoport)
3) hugetlbfs missing faults tracking (Mike Kravetz)

WP support and hugetlbfs required a couple of fixes; the
non-cooperative support is as submitted, but I wonder if we should have
a single non-cooperative feature flag.

I didn't advertise it yet because it's not well tested, and in fact I
don't expect the WP mode to work fully as it should.

However, the kernel should run stable: I fixed enough bugs that it
should not be possible to DoS or exploit the kernel with this
patchset applied (unlike the original code submissions, which had race
conditions and potentially kernel-crashing bugs).

The next thing I plan to work on is a bitflag in the swap entry for
the WP tracking so that WP tracking works correctly through swapins
without false positives. It'll work like soft-dirty. It's possible that
other cases are still not covered in the WP support.

THP should be covered now (the callback was missing in the original
submit but I fixed that). For KVM, it's not entirely clear why it didn't
work before, and it may require changes to the KVM code if this is not
enough. KVM should not use gup(write=1) for read faults on shadow
pagetables, so it has at least a chance to work.

I'm also considering using a reserved bitflag in the mapped/present
pte/trans_huge_pmds to track which virtual addresses have been
wrprotected. Without a reserved bitflag, fork() would inevitably lead
to WP userfault false positives. I'm not sure if it's required or if
it should be left up to userland to ensure that the pagetables don't
become wrprotected (i.e. use MADV_DONTFORK, as of course KVM already
does). First we have to solve the false positives through swap anyway;
the two should be orthogonal improvements.
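
As a minimal sketch of the userland side of that trade-off (not taken
from the patchset or from QEMU), the C fragment below marks an anonymous
mapping with MADV_DONTFORK, so that fork() does not give the child a copy
of the range and has no reason to wrprotect its pagetables for
copy-on-write. The helper names map_dontfork() and child_has_mapping()
are hypothetical and exist only for illustration; the second one uses
mincore(), which fails on unmapped ranges, to check that the child really
does not inherit the mapping.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `len` bytes of anonymous private memory and
 * exclude the range from fork(). Returns NULL on failure.
 */
void *map_dontfork(size_t len)
{
    assert(len > 0);

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* After this call a fork()ed child will not see the range at all. */
    if (madvise(p, len, MADV_DONTFORK) != 0) {
        munmap(p, len);
        return NULL;
    }
    return p;
}

/*
 * Hypothetical helper: fork() and report whether the child still has the
 * single page at `p` mapped. Returns 1 if mapped, 0 if not, -1 on error.
 */
int child_has_mapping(void *p, size_t page_size)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        unsigned char vec[1]; /* one byte per page; only one page is probed */
        /* mincore() fails with ENOMEM if the range is not mapped. */
        _exit(mincore(p, page_size, vec) == 0 ? 1 : 0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

On Linux, calling map_dontfork() with one page (sysconf(_SC_PAGESIZE))
and passing the result to child_has_mapping() is expected to report that
the child does not have the page mapped.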

If you could test the live snapshotting patchset on my kernel master
branch and report any issue or incremental fix against my branch, it'd
be great.

On my side I think I'll focus on testing by extending the testsuite
inside the kernel to exercise WP tracking too.

There are several other active users of the new userfaultfd features,
including JIT garbage collection (that previously used mprotect and
trapped SIGSEGV), distributed shared memory, SQL database robustness
in hugetlbfs holes and postcopy live migration of containers (though
live migrating a process that uses userfaultfd of its own inside a
container with the non-cooperative model isn't solved yet).

Thanks,
Andrea

