* [PATCH v2 0/2] virtio-balloon: don't start free page hinting if postcopy is possible
@ 2021-07-08 9:53 David Hildenbrand
2021-07-08 9:53 ` [PATCH v2 1/2] " David Hildenbrand
2021-07-08 9:53 ` [PATCH v2 2/2] virtio-balloon: free page hinting cleanups David Hildenbrand
0 siblings, 2 replies; 4+ messages in thread
From: David Hildenbrand @ 2021-07-08 9:53 UTC (permalink / raw)
To: qemu-devel
Cc: Juan Quintela, Michael S. Tsirkin, David Hildenbrand,
Dr. David Alan Gilbert, Peter Xu, Wei Wang, Alexander Duyck,
Philippe Mathieu-Daudé
This is the follow up on [1]:
[PATCH v1 0/2] virtio-balloon: disallow postcopy with
VIRTIO_BALLOON_F_FREE_PAGE_HINT
Working on getting migration for virtio-mem completely right [2] I realized
that virtio-balloon with VIRTIO_BALLOON_F_FREE_PAGE_HINT paired with
postcopy might be shaky. Actually testing it, I directly found two issues,
one of both being far from trivial to fix.
Let's not start free page hinting if postcopy is possible.
v1 -> v2:
- "virtio-balloon: don't start free page hinting if postcopy is possible"
-- Instead of bailing out when starting postcopy, don't start free page
hinting if the postcopy capability is enabled and we might see the
switch to postcopy for the current migration (where we might mess with
the dirty bitmap)
- "virtio-balloon: free page hinting cleanups"
-- Added
[1] https://lkml.kernel.org/r/20210707140655.30982-1-david@redhat.com
[2] https://lkml.kernel.org/r/20210616162940.28630-1-david@redhat.com
David Hildenbrand (2):
virtio-balloon: don't start free page hinting if postcopy is possible
virtio-balloon: free page hinting cleanups
hw/virtio/virtio-balloon.c | 41 +++++++++++++++++++-------------------
1 file changed, 21 insertions(+), 20 deletions(-)
--
2.31.1
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v2 1/2] virtio-balloon: don't start free page hinting if postcopy is possible
2021-07-08 9:53 [PATCH v2 0/2] virtio-balloon: don't start free page hinting if postcopy is possible David Hildenbrand
@ 2021-07-08 9:53 ` David Hildenbrand
2021-07-08 18:51 ` Peter Xu
2021-07-08 9:53 ` [PATCH v2 2/2] virtio-balloon: free page hinting cleanups David Hildenbrand
1 sibling, 1 reply; 4+ messages in thread
From: David Hildenbrand @ 2021-07-08 9:53 UTC (permalink / raw)
To: qemu-devel
Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
Alexander Duyck, Dr. David Alan Gilbert, Juan Quintela, Wei Wang,
Peter Xu, Philippe Mathieu-Daudé
Postcopy never worked properly with 'free-page-hint=on', as there are
at least two issues:
1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE
and consequently won't release free pages back to the OS once
migration finishes.
The issue is that for postcopy, we won't do a final bitmap sync while
the guest is stopped on the source and
virtio_balloon_free_page_hint_notify() will only call
virtio_balloon_free_page_done() on the source during
PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
the destination.
2) Once the VM touches a page on the destination that has been excluded
from migration on the source via qemu_guest_free_page_hint() while
postcopy is active, that thread will stall until postcopy finishes
and all threads are woken up. (with older Linux kernels that won't
retry faults when woken up via userfaultfd, we might actually get a
SEGFAULT)
The issue is that the source will refuse to migrate any pages that
are not marked as dirty in the dirty bmap -- for example, because the
page might just have been sent. Consequently, the faulting thread will
stall, waiting for the page to be migrated -- which could take quite
a while and result in guest OS issues.
While we could fix 1) comparatively easily, 2) is harder to get right and
might require more involved RAM migration changes on source and destination
[1].
As it never worked properly, let's not start free page hinting in the
precopy notifier if the postcopy migration capability was enabled to fix
it easily. Capabilities cannot be enabled once migration is already
running.
Note 1: in the future we might either adjust migration code on the source
to track pages that have actually been sent or adjust
migration code on source and destination to eventually send
pages multiple times from the source and and deal with pages
that are sent multiple times on the destination.
Note 2: virtio-mem has similar issues, however, access to "unplugged"
memory by the guest is very rare and we would have to be very
lucky for it to happen during migration. The spec states
"The driver SHOULD NOT read from unplugged memory blocks ..."
and "The driver MUST NOT write to unplugged memory blocks".
virtio-mem will move away from virtio_balloon_free_page_done()
soon and handle this case explicitly on the destination.
[1] https://lkml.kernel.org/r/e79fd18c-aa62-c1d8-c7f3-ba3fc2c25fc8@redhat.com
Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Cc: qemu-stable@nongnu.org
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
hw/virtio/virtio-balloon.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 4b5d9e5e50..ae7867a8db 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -30,6 +30,7 @@
#include "trace.h"
#include "qemu/error-report.h"
#include "migration/misc.h"
+#include "migration/migration.h"
#include "hw/virtio/virtio-bus.h"
#include "hw/virtio/virtio-access.h"
@@ -662,6 +663,18 @@ virtio_balloon_free_page_hint_notify(NotifierWithReturn *n, void *data)
return 0;
}
+ /*
+ * Pages hinted via qemu_guest_free_page_hint() are cleared from the dirty
+ * bitmap and will not get migrated, especially also not when the postcopy
+ * destination starts using them and requests migration from the source; the
+ * faulting thread will stall until postcopy migration finishes and
+ * all threads are woken up. Let's not start free page hinting if postcopy
+ * is possible.
+ */
+ if (migrate_postcopy_ram()) {
+ return 0;
+ }
+
switch (pnd->reason) {
case PRECOPY_NOTIFY_BEFORE_BITMAP_SYNC:
virtio_balloon_free_page_stop(dev);
--
2.31.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH v2 2/2] virtio-balloon: free page hinting cleanups
2021-07-08 9:53 [PATCH v2 0/2] virtio-balloon: don't start free page hinting if postcopy is possible David Hildenbrand
2021-07-08 9:53 ` [PATCH v2 1/2] " David Hildenbrand
@ 2021-07-08 9:53 ` David Hildenbrand
1 sibling, 0 replies; 4+ messages in thread
From: David Hildenbrand @ 2021-07-08 9:53 UTC (permalink / raw)
To: qemu-devel
Cc: Juan Quintela, Michael S . Tsirkin, David Hildenbrand,
Dr. David Alan Gilbert, Alexander Duyck, Wei Wang, Peter Xu,
Philippe Mathieu-Daudé
Let's compress the code a bit to improve readability. We can drop the
vm_running check in virtio_balloon_free_page_start() as it's already
properly checked in the single caller.
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
hw/virtio/virtio-balloon.c | 28 ++++++++--------------------
1 file changed, 8 insertions(+), 20 deletions(-)
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index ae7867a8db..5a69dce35d 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -534,22 +534,18 @@ static bool get_free_page_hints(VirtIOBalloon *dev)
if (dev->free_page_hint_status == FREE_PAGE_HINT_S_REQUESTED &&
id == dev->free_page_hint_cmd_id) {
dev->free_page_hint_status = FREE_PAGE_HINT_S_START;
- } else {
+ } else if (dev->free_page_hint_status == FREE_PAGE_HINT_S_START) {
/*
* Stop the optimization only when it has started. This
* avoids a stale stop sign for the previous command.
*/
- if (dev->free_page_hint_status == FREE_PAGE_HINT_S_START) {
- dev->free_page_hint_status = FREE_PAGE_HINT_S_STOP;
- }
+ dev->free_page_hint_status = FREE_PAGE_HINT_S_STOP;
}
}
- if (elem->in_num) {
- if (dev->free_page_hint_status == FREE_PAGE_HINT_S_START) {
- qemu_guest_free_page_hint(elem->in_sg[0].iov_base,
- elem->in_sg[0].iov_len);
- }
+ if (elem->in_num && dev->free_page_hint_status == FREE_PAGE_HINT_S_START) {
+ qemu_guest_free_page_hint(elem->in_sg[0].iov_base,
+ elem->in_sg[0].iov_len);
}
out:
@@ -592,16 +588,10 @@ static void virtio_balloon_free_page_start(VirtIOBalloon *s)
{
VirtIODevice *vdev = VIRTIO_DEVICE(s);
- /* For the stop and copy phase, we don't need to start the optimization */
- if (!vdev->vm_running) {
- return;
- }
-
qemu_mutex_lock(&s->free_page_lock);
if (s->free_page_hint_cmd_id == UINT_MAX) {
- s->free_page_hint_cmd_id =
- VIRTIO_BALLOON_FREE_PAGE_HINT_CMD_ID_MIN;
+ s->free_page_hint_cmd_id = VIRTIO_BALLOON_FREE_PAGE_HINT_CMD_ID_MIN;
} else {
s->free_page_hint_cmd_id++;
}
@@ -649,8 +639,7 @@ static void virtio_balloon_free_page_done(VirtIOBalloon *s)
static int
virtio_balloon_free_page_hint_notify(NotifierWithReturn *n, void *data)
{
- VirtIOBalloon *dev = container_of(n, VirtIOBalloon,
- free_page_hint_notify);
+ VirtIOBalloon *dev = container_of(n, VirtIOBalloon, free_page_hint_notify);
VirtIODevice *vdev = VIRTIO_DEVICE(dev);
PrecopyNotifyData *pnd = data;
@@ -919,8 +908,7 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats);
- if (virtio_has_feature(s->host_features,
- VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
+ if (virtio_has_feature(s->host_features, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE,
virtio_balloon_handle_free_page_vq);
precopy_add_notifier(&s->free_page_hint_notify);
--
2.31.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v2 1/2] virtio-balloon: don't start free page hinting if postcopy is possible
2021-07-08 9:53 ` [PATCH v2 1/2] " David Hildenbrand
@ 2021-07-08 18:51 ` Peter Xu
0 siblings, 0 replies; 4+ messages in thread
From: Peter Xu @ 2021-07-08 18:51 UTC (permalink / raw)
To: David Hildenbrand
Cc: Michael S . Tsirkin, Juan Quintela, qemu-devel, Alexander Duyck,
qemu-stable, Wei Wang, Philippe Mathieu-Daudé,
Dr. David Alan Gilbert
On Thu, Jul 08, 2021 at 11:53:38AM +0200, David Hildenbrand wrote:
> Postcopy never worked properly with 'free-page-hint=on', as there are
> at least two issues:
>
> 1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE
> and consequently won't release free pages back to the OS once
> migration finishes.
>
> The issue is that for postcopy, we won't do a final bitmap sync while
> the guest is stopped on the source and
> virtio_balloon_free_page_hint_notify() will only call
> virtio_balloon_free_page_done() on the source during
> PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
> the destination.
>
> 2) Once the VM touches a page on the destination that has been excluded
> from migration on the source via qemu_guest_free_page_hint() while
> postcopy is active, that thread will stall until postcopy finishes
> and all threads are woken up. (with older Linux kernels that won't
> retry faults when woken up via userfaultfd, we might actually get a
> SEGFAULT)
>
> The issue is that the source will refuse to migrate any pages that
> are not marked as dirty in the dirty bmap -- for example, because the
> page might just have been sent. Consequently, the faulting thread will
> stall, waiting for the page to be migrated -- which could take quite
> a while and result in guest OS issues.
>
> While we could fix 1) comparatively easily, 2) is harder to get right and
> might require more involved RAM migration changes on source and destination
> [1].
>
> As it never worked properly, let's not start free page hinting in the
> precopy notifier if the postcopy migration capability was enabled to fix
> it easily. Capabilities cannot be enabled once migration is already
> running.
>
> Note 1: in the future we might either adjust migration code on the source
> to track pages that have actually been sent or adjust
> migration code on source and destination to eventually send
> pages multiple times from the source and and deal with pages
> that are sent multiple times on the destination.
>
> Note 2: virtio-mem has similar issues, however, access to "unplugged"
> memory by the guest is very rare and we would have to be very
> lucky for it to happen during migration. The spec states
> "The driver SHOULD NOT read from unplugged memory blocks ..."
> and "The driver MUST NOT write to unplugged memory blocks".
> virtio-mem will move away from virtio_balloon_free_page_done()
> soon and handle this case explicitly on the destination.
>
> [1] https://lkml.kernel.org/r/e79fd18c-aa62-c1d8-c7f3-ba3fc2c25fc8@redhat.com
>
> Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
> Cc: qemu-stable@nongnu.org
> Cc: Wei Wang <wei.w.wang@intel.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> Cc: Alexander Duyck <alexander.duyck@gmail.com>
> Cc: Juan Quintela <quintela@redhat.com>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Cc: Peter Xu <peterx@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Thanks, David.
--
Peter Xu
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2021-07-08 19:06 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-07-08 9:53 [PATCH v2 0/2] virtio-balloon: don't start free page hinting if postcopy is possible David Hildenbrand
2021-07-08 9:53 ` [PATCH v2 1/2] " David Hildenbrand
2021-07-08 18:51 ` Peter Xu
2021-07-08 9:53 ` [PATCH v2 2/2] virtio-balloon: free page hinting cleanups David Hildenbrand
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).