All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Roth <michael.roth@amd.com>
To: <qemu-devel@nongnu.org>
Cc: qemu-stable@nongnu.org, "David Hildenbrand" <david@redhat.com>,
	"Wei Wang" <wei.w.wang@intel.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>,
	"Alexander Duyck" <alexander.duyck@gmail.com>,
	"Juan Quintela" <quintela@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Peter Xu" <peterx@redhat.com>
Subject: [PATCH 01/47] virtio-balloon: don't start free page hinting if postcopy is possible
Date: Tue, 14 Dec 2021 18:00:39 -0600	[thread overview]
Message-ID: <20211215000125.378126-2-michael.roth@amd.com> (raw)
In-Reply-To: <20211215000125.378126-1-michael.roth@amd.com>

From: David Hildenbrand <david@redhat.com>

Postcopy never worked properly with 'free-page-hint=on', as there are
at least two issues:

1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE
   and consequently won't release free pages back to the OS once
   migration finishes.

   The issue is that for postcopy, we won't do a final bitmap sync while
   the guest is stopped on the source and
   virtio_balloon_free_page_hint_notify() will only call
   virtio_balloon_free_page_done() on the source during
   PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
   the destination.

2) Once the VM touches a page on the destination that has been excluded
   from migration on the source via qemu_guest_free_page_hint() while
   postcopy is active, that thread will stall until postcopy finishes
   and all threads are woken up. (with older Linux kernels that won't
   retry faults when woken up via userfaultfd, we might actually get a
   SEGFAULT)

   The issue is that the source will refuse to migrate any pages that
   are not marked as dirty in the dirty bmap -- for example, because the
   page might just have been sent. Consequently, the faulting thread will
   stall, waiting for the page to be migrated -- which could take quite
   a while and result in guest OS issues.

While we could fix 1) comparatively easily, 2) is harder to get right and
might require more involved RAM migration changes on source and destination
[1].

As it never worked properly, let's not start free page hinting in the
precopy notifier if the postcopy migration capability was enabled to fix
it easily. Capabilities cannot be enabled once migration is already
running.

Note 1: in the future we might either adjust migration code on the source
        to track pages that have actually been sent or adjust
        migration code on source and destination  to eventually send
        pages multiple times from the source and and deal with pages
        that are sent multiple times on the destination.

Note 2: virtio-mem has similar issues, however, access to "unplugged"
        memory by the guest is very rare and we would have to be very
        lucky for it to happen during migration. The spec states
        "The driver SHOULD NOT read from unplugged memory blocks ..."
        and "The driver MUST NOT write to unplugged memory blocks".
        virtio-mem will move away from virtio_balloon_free_page_done()
        soon and handle this case explicitly on the destination.

[1] https://lkml.kernel.org/r/e79fd18c-aa62-c1d8-c7f3-ba3fc2c25fc8@redhat.com

Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Cc: qemu-stable@nongnu.org
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210708095339.20274-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit fd51e54fa10221e5a8add894c38cc1cf199f4bc4)
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 hw/virtio/virtio-balloon.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 4b5d9e5e50..ae7867a8db 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -30,6 +30,7 @@
 #include "trace.h"
 #include "qemu/error-report.h"
 #include "migration/misc.h"
+#include "migration/migration.h"
 
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
@@ -662,6 +663,18 @@ virtio_balloon_free_page_hint_notify(NotifierWithReturn *n, void *data)
         return 0;
     }
 
+    /*
+     * Pages hinted via qemu_guest_free_page_hint() are cleared from the dirty
+     * bitmap and will not get migrated, especially also not when the postcopy
+     * destination starts using them and requests migration from the source; the
+     * faulting thread will stall until postcopy migration finishes and
+     * all threads are woken up. Let's not start free page hinting if postcopy
+     * is possible.
+     */
+    if (migrate_postcopy_ram()) {
+        return 0;
+    }
+
     switch (pnd->reason) {
     case PRECOPY_NOTIFY_BEFORE_BITMAP_SYNC:
         virtio_balloon_free_page_stop(dev);
-- 
2.25.1



  reply	other threads:[~2021-12-15  0:23 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-15  0:00 [PATCH 00/47] Patch Round-up for stable 6.1.1, freeze on 2021-12-21 Michael Roth
2021-12-15  0:00 ` Michael Roth [this message]
2021-12-15  0:00 ` [PATCH 02/47] target/arm: Don't skip M-profile reset entirely in user mode Michael Roth
2021-12-15  0:00 ` [PATCH 03/47] virtio-net: fix use after unmap/free for sg Michael Roth
2021-12-15  0:00 ` [PATCH 04/47] qemu-nbd: Change default cache mode to writeback Michael Roth
2021-12-15  0:00 ` [PATCH 05/47] hmp: Unbreak "change vnc" Michael Roth
2021-12-15  0:00 ` [PATCH 06/47] virtio-mem-pci: Fix memory leak when creating MEMORY_DEVICE_SIZE_CHANGE event Michael Roth
2021-12-15  0:00 ` [PATCH 07/47] uas: add stream number sanity checks Michael Roth
2021-12-15  0:00 ` [PATCH 08/47] vhost-user: fix duplicated notifier MR init Michael Roth
2021-12-15  0:00 ` [PATCH 09/47] libvhost-user: fix VHOST_USER_REM_MEM_REG skipping mmap_addr Michael Roth
2021-12-15  0:00 ` [PATCH 10/47] hw/display/artist: Fix bug in coordinate extraction in artist_vram_read() and artist_vram_write() Michael Roth
2021-12-15  0:00 ` [PATCH 11/47] i386/cpu: Remove AVX_VNNI feature from Cooperlake cpu model Michael Roth
2021-12-15  0:00 ` [PATCH 12/47] 9pfs: fix crash in v9fs_walk() Michael Roth
2021-12-15  0:00 ` [PATCH 13/47] plugins/execlog: removed unintended "s" at the end of log lines Michael Roth
2021-12-15  0:00 ` [PATCH 14/47] plugins: do not limit exported symbols if modules are active Michael Roth
2021-12-15  0:00 ` [PATCH 15/47] qemu-sockets: fix unix socket path copy (again) Michael Roth
2021-12-15  0:00 ` [PATCH 16/47] vhost-vsock: fix migration issue when seqpacket is supported Michael Roth
2021-12-15  0:00 ` [PATCH 17/47] hw/arm/virt: Rename default_bus_bypass_iommu Michael Roth
2021-12-15  0:00 ` [PATCH 18/47] hw/i386: " Michael Roth
2021-12-15  0:00 ` [PATCH 19/47] bios-tables-test: allow changes in DSDT ACPI tables for q35 Michael Roth
2021-12-15  0:00 ` [PATCH 20/47] hw/i386/acpi: fix conflicting IO address range for acpi pci hotplug in q35 Michael Roth
2021-12-15  0:00 ` [PATCH 21/47] bios-tables-test: Update ACPI DSDT table golden blobs for q35 Michael Roth
2021-12-15  0:01 ` [PATCH 22/47] block: introduce max_hw_iov for use in scsi-generic Michael Roth
2021-12-15  0:01 ` [PATCH 23/47] pci: fix PCI resource reserve capability on BE Michael Roth
2021-12-15  0:01 ` [PATCH 24/47] tests/acpi/bios-tables-test: add and allow changes to a new q35 DSDT table blob Michael Roth
2021-12-15  0:01 ` [PATCH 25/47] tests/acpi/pcihp: add unit tests for hotplug on multifunction bridges for q35 Michael Roth
2021-12-15  0:01 ` [PATCH 26/47] tests/acpi/bios-tables-test: update DSDT blob for multifunction bridge test Michael Roth
2021-12-15  0:01 ` [PATCH 27/47] block/file-posix: Fix return value translation for AIO discards Michael Roth
2021-12-15  0:01 ` [PATCH 28/47] Partially revert "build: -no-pie is no functional linker flag" Michael Roth
2021-12-15  0:01 ` [PATCH 29/47] target-i386: mmu: use pg_mode instead of HF_LMA_MASK Michael Roth
2021-12-15  0:01 ` [PATCH 30/47] target-i386: mmu: fix handling of noncanonical virtual addresses Michael Roth
2021-12-15  0:01 ` [PATCH 31/47] hw/scsi/scsi-disk: MODE_PAGE_ALLS not allowed in MODE SELECT commands Michael Roth
2021-12-15  0:01 ` [PATCH 32/47] hw: m68k: virt: Add compat machine for 6.1 Michael Roth
2021-12-15  0:01 ` [PATCH 33/47] rcu: Introduce force_rcu notifier Michael Roth
2021-12-15  0:01 ` [PATCH 34/47] accel/tcg: Register a " Michael Roth
2021-12-15  0:01 ` [PATCH 35/47] pcie: rename 'native-hotplug' to 'x-native-hotplug' Michael Roth
2021-12-15  0:01 ` [PATCH 36/47] virtio: use virtio accessor to access packed descriptor flags Michael Roth
2021-12-15  0:01 ` [PATCH 37/47] virtio: use virtio accessor to access packed event Michael Roth
2021-12-15  0:01 ` [PATCH 38/47] vfio: Fix memory leak of hostwin Michael Roth
2021-12-15  0:01 ` [PATCH 39/47] nbd/server: Don't complain on certain client disconnects Michael Roth
2021-12-15  0:01 ` [PATCH 40/47] hw/nvme: fix buffer overrun in nvme_changed_nslist (CVE-2021-3947) Michael Roth
2021-12-15  0:01 ` [PATCH 41/47] chardev/wctable: don't free the instance in wctablet_chr_finalize Michael Roth
2021-12-15  0:01 ` [PATCH 42/47] hw/block/fdc: Extract blk_create_empty_drive() Michael Roth
2021-12-15  0:01 ` [PATCH 43/47] hw/block/fdc: Kludge missing floppy drive to fix CVE-2021-20196 Michael Roth
2021-12-15  0:01 ` [PATCH 44/47] tests/qtest/fdc-test: Add a regression test for CVE-2021-20196 Michael Roth
2021-12-15  0:01 ` [PATCH 45/47] virtio-blk: Fix clean up of host notifiers for single MR transaction Michael Roth
2021-12-15  0:01 ` [PATCH 46/47] net: vmxnet3: validate configuration values during activate (CVE-2021-20203) Michael Roth
2021-12-15  0:01 ` [PATCH 47/47] e1000: fix tx re-entrancy problem Michael Roth
2021-12-15  9:20 ` [PATCH 00/47] Patch Round-up for stable 6.1.1, freeze on 2021-12-21 Daniel P. Berrangé
2021-12-15 13:17   ` Michael Roth via
2021-12-20 23:41 ` Michael Roth via

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211215000125.378126-2-michael.roth@amd.com \
    --to=michael.roth@amd.com \
    --cc=alexander.duyck@gmail.com \
    --cc=david@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=mst@redhat.com \
    --cc=peterx@redhat.com \
    --cc=philmd@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-stable@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=wei.w.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.