From: David Hildenbrand <david@redhat.com>
To: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>, qemu-devel@nongnu.org
Cc: Juan Quintela <quintela@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Markus Armbruster <armbru@redhat.com>,
Peter Xu <peterx@redhat.com>,
"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>, Den Lunev <den@openvz.org>
Subject: Re: [PATCH v13 0/5] UFFD write-tracking migration/snapshots
Date: Thu, 11 Feb 2021 20:21:34 +0100
Message-ID: <8efe21c6-475d-2538-01c1-659f9d44491e@redhat.com>
In-Reply-To: <5d01402e-273a-53cf-b78b-b4b7f50340bc@virtuozzo.com>

On 09.02.21 19:38, Andrey Gruzdev wrote:
> On 09.02.2021 15:37, David Hildenbrand wrote:
>> On 21.01.21 16:24, andrey.gruzdev--- via wrote:
>>> This patch series is a kind of 'rethinking' of Denis Plotnikov's
>>> ideas he's
>>> implemented in his series '[PATCH v0 0/4] migration: add background
>>> snapshot'.
>>>
>>> Currently the only way to make (external) live VM snapshot is using
>>> existing
>>> dirty page logging migration mechanism. The main problem is that it
>>> tends to
>>> produce a lot of page duplicates while running VM goes on updating
>>> already
>>> saved pages. That leads to the fact that vmstate image size is
>>> commonly several
>>> times bigger than the non-zero part of the virtual machine's RSS. Time
>>> required to
>>> converge RAM migration and the size of snapshot image severely depend
>>> on the
>>> guest memory write rate, sometimes resulting in unacceptably long
>>> snapshot
>>> creation time and huge image size.
>>>
>>> This series proposes a way to solve the aforementioned problems. This
>>> is done
>>> by using different RAM migration mechanism based on UFFD write
>>> protection
>>> management introduced in v5.7 kernel. The migration strategy is to
>>> 'freeze'
>>> guest RAM content using write-protection and iteratively release
>>> protection
>>> for memory ranges that have already been saved to the migration stream.
>>> At the same time we read in pending UFFD write fault events and save
>>> those
>>> pages out-of-order with higher priority.
>>>
>>
>> Hi,
>>
>> just stumbled over this, quick question:
>>
>> I recently played with UFFD_WP and noticed that write protection is
>> only effective on pages/ranges that already have pages populated (IOW:
>> !pte_none() in the kernel).
>>
>> In case memory was never populated (or was discarded using e.g.,
>> madvise(DONTNEED)), write-protection will be skipped silently and you
>> won't get WP events for applicable pages.
>>
>> So if someone writes to a yet unpopulated page ("zero"), you won't
>> get WP events.
>>
>> I can spot that you do a single uffd_change_protection() on the whole
>> RAMBlock.
>>
>> How are you handling that scenario, or why don't you have to handle
>> that scenario?
>>
> Hi David,
>
> I really wonder if such a problem exists.. If we are talking about a
> write to an unpopulated page, we should get first page fault on
> non-present page and populate it with protection bits from respective vma.
> For UFFD_WP vma's page will be populated non-writable. So we'll get
> another page fault on present but read-only page and go to handle_userfault.
>

Hi,

here is another fun issue.

Assume you

1. Have a populated page with some valuable content
2. Write-protect the page (UFFD-WP)
3. madvise(DONTNEED) that page
4. Write to the page

On write access, you won't get a WP event!

Instead, you will get a UFFD_EVENT_REMOVE during step 3. But you cannot
stop that event (no "don't wake"), so you cannot simply defer it as you
can with WP events.
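
To illustrate why step 3 is the dangerous one, here is a minimal sketch
in C. It deliberately leaves out userfaultfd (step 2) and only shows the
discard semantics for private anonymous memory on Linux: after
madvise(MADV_DONTNEED) the old content is gone and the next access sees
a fresh zero page. The function name first_byte_after_discard() and the
0x5a fill pattern are made up for this example; nothing here is taken
from the patch series.

```c
#define _DEFAULT_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Populate one private anonymous page, discard it, and read it back.
 * Returns the first byte seen after the discard: 0 if the old content
 * was zapped, 0x5a if it survived, -1 on error.
 */
static int first_byte_after_discard(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int value;

    if (page == MAP_FAILED) {
        return -1;
    }

    /* Step 1: populate the page with "valuable" content. */
    memset(page, 0x5a, len);

    /* Step 2 (UFFD write protection) is intentionally omitted here. */

    /* Step 3: discard the page; the old content is dropped right away. */
    if (madvise(page, len, MADV_DONTNEED) != 0) {
        munmap(page, len);
        return -1;
    }

    /* Step 4 would write; reading is enough to see what is left. */
    value = page[0];
    munmap(page, len);
    return value;
}
```

Calling first_byte_after_discard() on Linux should return 0, i.e. the
0x5a pattern is no longer there by the time anybody could try to save
it.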

So if the guest inflates the balloon (including balloon page migration
in Linux) or free-page-reporting reports a free page while snapshotting
is active, you won't be able to save the old content before it is
zapped, and your snapshot will miss pages with actual content.

Something similar would happen with virtio-mem when unplugging blocks;
however, it does not discard any pages while migration is active.

Snapshotting seems to be incompatible with concurrent discards via
virtio-balloon. You might want to inhibit ballooning while snapshotting
is active in
hw/virtio/virtio-balloon.c:virtio_balloon_inhibited(), just as we do for
postcopy.
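
A rough, self-contained model of that idea in C, only to make the
suggestion concrete. This is not the QEMU code: struct vm_state, its
fields, balloon_inhibited() and balloon_inflate() are hypothetical
names for this sketch, and the real check would have to query the
migration code for the actual state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state flags, standing in for queries to the migration code. */
struct vm_state {
    bool incoming_postcopy;    /* postcopy cannot cope with discards */
    bool background_snapshot;  /* UFFD-WP snapshot is in progress */
};

/* Discards are inhibited while any consumer cannot tolerate them. */
static bool balloon_inhibited(const struct vm_state *s)
{
    return s->incoming_postcopy || s->background_snapshot;
}

/*
 * Model of a balloon inflate request. Returns the number of pages that
 * would actually be discarded; 0 while discards are inhibited.
 */
static size_t balloon_inflate(const struct vm_state *s, size_t requested_pages)
{
    if (balloon_inhibited(s)) {
        /* Keep the pages so the snapshot can still save their content. */
        return 0;
    }
    /* A real device would discard the range here. */
    return requested_pages;
}
```

With background_snapshot set, balloon_inflate() leaves all pages alone;
once the snapshot has finished and the flag is cleared, inflation works
as before.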

--
Thanks,

David / dhildenb