qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Steven Sistare <steven.sistare@oracle.com>
To: Euan Turner <euan.turner@nutanix.com>, qemu-devel@nongnu.org
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>, Peter Xu <peterx@redhat.com>,
	Fabiano Rosas <farosas@suse.de>
Subject: Re: [RFC V1 1/6] Revert "vhost-backend: remove vhost_kernel_reset_device()"
Date: Tue, 3 Sep 2024 15:55:03 -0400	[thread overview]
Message-ID: <fa95c40d-b5e5-41eb-bba7-7842bca2f73e@oracle.com> (raw)
In-Reply-To: <c7eaca75-1824-4f46-af97-9905c2a214ba@nutanix.com>

[-- Attachment #1: Type: text/plain, Size: 4400 bytes --]

On 9/3/2024 6:44 AM, Euan Turner wrote:
> Hi Steve,
> 
> On 30/08/2024 12:56, Steve Sistare wrote:
>> This reverts commit e6383293eb01928692047e617665a742cca87e23.
>> The reset function is needed for CPR.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>   hw/virtio/vhost-backend.c | 6 ++++++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
>> index 833804d..9b75141 100644
>> --- a/hw/virtio/vhost-backend.c
>> +++ b/hw/virtio/vhost-backend.c
>> @@ -221,6 +221,11 @@ static int vhost_kernel_set_owner(struct vhost_dev *dev)
>>       return vhost_kernel_call(dev, VHOST_SET_OWNER, NULL);
>>   }
>> +static int vhost_kernel_reset_device(struct vhost_dev *dev)
>> +{
>> +    return vhost_kernel_call(dev, VHOST_RESET_OWNER, NULL);
>> +}
>> +
> How does this series avoid falling foul of c0c4f147291f37765a5275aa24c3e1195468903b (which follows the commit reverted here)?
> 
> I've been playing around with this patch series a bit, in the context of cpr-transfer, and am seeing the issues highlighted in that c0c4... commit message:
> Since vhost-kernel now has a reset_device, this is called in virtio_reset as part of qemu_machine_creation_done. (I have the full backtrace if it's helpful). Subsequent ioctls then fail (with ownership errors) due to the RESET_OWNER:
> 
> 2024-09-02T15:40:56.860541Z qemu-kvm: vhost_set_vring_call failed 1
> 2024-09-02T15:40:56.860908Z qemu-kvm: vhost_set_vring_call failed 1
> 2024-09-02T15:40:56.861253Z qemu-kvm: vhost_set_mem_table failed: Operation not permitted (1)
> 2024-09-02T15:40:56.861586Z qemu-kvm: vhost_set_vring_call failed 1
> 2024-09-02T15:40:56.861831Z qemu-kvm: vhost_set_vring_call failed 1
> 2024-09-02T15:40:56.862199Z qemu-kvm: unable to start vhost net: 1: falling back on userspace virtio
> 
> For me the NIC then fails during the migration, although the migration as a whole appears to succeed. (At least, prior the the migration, I could ssh into the VM and ping out to 8.8.8.8, but then I lose the ssh connection during the migration, and cannot ssh back in again afterwards on the new QEMU).
> 
> Do you think this could be because of QEMU falling back from the vhost backend to use virtio?
> 
> It may be down to some misconfiguration on my part, here's the netdev command line I had for reference:
> On the source QEMU:
> 
> -netdev '{"type":"tap","fd":"39","vhost":true,"vhostfd":"40","id":"hostua-43bc0eaf-ff55-44e6-87ec-a4798f592db1"}' \
> -device '{"driver":"virtio-net-pci","rx_queue_size":256,"netdev":"hostua-43bc0eaf-ff55-44e6-87ec-a4798f592db1","id":"ua-43bc0eaf-ff55-44e6-87ec-a4798f592db1","mac":"50:6b:8d:0c:03:e0","bus":"pci.1","addr":"0x0"}' \
> 
> On the destination QEMU:
> -netdev '{"type":"tap","fd":"-1","vhostfd":"-1","id":"hostua-43bc0eaf-ff55-44e6-87ec-a4798f592db1"}' \
> -device '{"driver":"virtio-net-pci","rx_queue_size":256,"netdev":"hostua-43bc0eaf-ff55-44e6-87ec-a4798f592db1","id":"ua-43bc0eaf-ff55-44e6-87ec-a4798f592db1","mac":"50:6b:8d:0c:03:e0","bus":"pci.1","addr":"0x0"}' \
> 
>>   static int vhost_kernel_get_vq_index(struct vhost_dev *dev, int idx)
>>   {
>>       assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
>> @@ -345,6 +350,7 @@ const VhostOps kernel_ops = {
>>           .vhost_get_features = vhost_kernel_get_features,
>>           .vhost_set_backend_cap = vhost_kernel_set_backend_cap,
>>           .vhost_set_owner = vhost_kernel_set_owner,
>> +        .vhost_reset_device = vhost_kernel_reset_device,
>>           .vhost_get_vq_index = vhost_kernel_get_vq_index,
>>           .vhost_vsock_set_guest_cid = vhost_kernel_vsock_set_guest_cid,
>>           .vhost_vsock_set_running = vhost_kernel_vsock_set_running,
> 

The 6 patches in this series are only sufficient for cpr-exec mode, but I have
attached the 2 additional patches you need for cpr-transfer mode.  That mode
works fine for me with those patches. When I run a similar test, vhost_reset_device
is not called on target QEMU because vhost_started is false:

   qemu_machine_creation_done()
     virtio_reset()
       if (vdev->vhost_started && k->get_vhost)
         vhost_reset_device(k->get_vhost(vdev));

I don't know why you are seeing the vhost_set_vring_call failures.  I don't see those,
with or without the 2 additional patches.

- Steve

[-- Attachment #2: 0001-migration-cpr-setup-notifier.patch --]
[-- Type: text/plain, Size: 2136 bytes --]

From 4f719179546e9327cba3b74e86d2a403e36c5d71 Mon Sep 17 00:00:00 2001
From: Steve Sistare <steven.sistare@oracle.com>
Date: Tue, 20 Aug 2024 05:36:15 -0700
Subject: [PATCH 1/2] migration: cpr setup notifier

Add a cpr-setup notification point which is called before general setup.
It is needed for resetting vhost devices, as explained in the next patch.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 include/migration/misc.h | 7 ++++---
 migration/migration.c    | 5 +++++
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index bcb6f35..bd05fe2 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -60,6 +60,7 @@ bool migration_thread_is_self(void);
 bool migration_is_setup_or_active(void);
 
 typedef enum MigrationEventType {
+    MIG_EVENT_PRECOPY_CPR_SETUP,
     MIG_EVENT_PRECOPY_SETUP,
     MIG_EVENT_PRECOPY_DONE,
     MIG_EVENT_PRECOPY_FAILED,
@@ -71,9 +72,9 @@ typedef struct MigrationEvent {
 } MigrationEvent;
 
 /*
- * A MigrationNotifyFunc may return an error code and an Error object,
- * but only when @e->type is MIG_EVENT_PRECOPY_SETUP.  The code is an int
- * to allow for different failure modes and recovery actions.
+ * A MigrationNotifyFunc may return an error code and an Error object, but
+ * only when @e->type is MIG_EVENT_PRECOPY_SETUP or MIG_EVENT_PRECOPY_CPR_SETUP.
+ * The code is an int to allow for different failure modes and recovery actions.
  */
 typedef int (*MigrationNotifyFunc)(NotifierWithReturn *notify,
                                    MigrationEvent *e, Error **errp);
diff --git a/migration/migration.c b/migration/migration.c
index 15ac8f5..8bc1975 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2201,7 +2201,12 @@ void qmp_migrate(const char *uri, bool has_channels,
         stopped = true;
     }
 
+    if (migration_call_notifiers(s, MIG_EVENT_PRECOPY_CPR_SETUP, &local_err)) {
+        goto out;
+    }
+
     if (cpr_state_save(&local_err)) {
+        migration_call_notifiers(s, MIG_EVENT_PRECOPY_FAILED, NULL);
         goto out;
     }
 
-- 
1.8.3.1


[-- Attachment #3: 0002-vhost-cpr-transfer-support.patch --]
[-- Type: text/plain, Size: 3281 bytes --]

From 0f09781df215086f095e40b42662dd2f3378ca7f Mon Sep 17 00:00:00 2001
From: Steve Sistare <steven.sistare@oracle.com>
Date: Tue, 20 Aug 2024 05:48:08 -0700
Subject: [PATCH 2/2] vhost cpr-transfer support

For cpr-transfer mode, a vhost device must be reset before CPR state
is sent to new QEMU, because after state is sent, new QEMU initializes
the device and calls set_owner.  Install a cpr-setup notifier to do so.
This still works for cpr-exec mode.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 hw/virtio/vhost.c         | 17 +++++++++++++++--
 include/hw/virtio/vhost.h |  1 +
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 54c5ec7..31c72d1 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1513,16 +1513,24 @@ static int vhost_cpr_notifier(NotifierWithReturn *notifier,
     struct vhost_dev *dev;
     int r;
 
-    dev = container_of(notifier, struct vhost_dev, cpr_exec_notifier);
+    dev = (migrate_mode() == MIG_MODE_CPR_EXEC) ?
+        container_of(notifier, struct vhost_dev, cpr_exec_notifier) :
+        container_of(notifier, struct vhost_dev, cpr_transfer_notifier);
+
     if (dev->vhost_ops->backend_type != VHOST_BACKEND_TYPE_KERNEL) {
         return 0;
     }
 
-    if (e->type == MIG_EVENT_PRECOPY_DONE) {
+    if (e->type == MIG_EVENT_PRECOPY_CPR_SETUP) {
         r = dev->vhost_ops->vhost_reset_device(dev);
         if (r < 0) {
             VHOST_OPS_DEBUG(r, "vhost_reset_device failed");
         }
+    } else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
+        r = dev->vhost_ops->vhost_set_owner(dev);
+        if (r < 0) {
+            VHOST_OPS_DEBUG(r, "vhost_set_owner failed");
+        }
     }
     return 0;
 }
@@ -1538,6 +1546,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
     hdev->vdev = NULL;
     hdev->migration_blocker = NULL;
     hdev->cpr_exec_notifier.notify = NULL;
+    hdev->cpr_transfer_notifier.notify = NULL;
 
     r = vhost_set_backend_type(hdev, backend_type);
     assert(r >= 0);
@@ -1641,6 +1650,9 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
     migration_add_notifier_mode(&hdev->cpr_exec_notifier,
                                 vhost_cpr_notifier,
                                 MIG_MODE_CPR_EXEC);
+    migration_add_notifier_mode(&hdev->cpr_transfer_notifier,
+                                vhost_cpr_notifier,
+                                MIG_MODE_CPR_TRANSFER);
     QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
 
     /*
@@ -1698,6 +1710,7 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
     }
     migrate_del_blocker(&hdev->migration_blocker);
     migration_remove_notifier(&hdev->cpr_exec_notifier);
+    migration_remove_notifier(&hdev->cpr_transfer_notifier);
     g_free(hdev->mem);
     g_free(hdev->mem_sections);
     if (hdev->vhost_ops) {
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index e9aca11..ab7e874 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -133,6 +133,7 @@ struct vhost_dev {
     QLIST_HEAD(, vhost_iommu) iommu_list;
     IOMMUNotifier n;
     NotifierWithReturn cpr_exec_notifier;
+    NotifierWithReturn cpr_transfer_notifier;
     const VhostDevConfigOps *config_ops;
 };
 
-- 
1.8.3.1


  reply	other threads:[~2024-09-03 19:56 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-30 11:56 [RFC V1 0/6] Live Update: tap and vhost Steve Sistare
2024-08-30 11:56 ` [RFC V1 1/6] Revert "vhost-backend: remove vhost_kernel_reset_device()" Steve Sistare
2024-09-03 10:44   ` Euan Turner
2024-09-03 19:55     ` Steven Sistare [this message]
2024-08-30 11:56 ` [RFC V1 2/6] tap: common return label Steve Sistare
2024-08-30 11:56 ` [RFC V1 3/6] tap: fix net_init_tap() return code Steve Sistare
2024-08-30 11:56 ` [RFC V1 4/6] migration: cpr_get_fd_param helper Steve Sistare
2024-09-03 15:38   ` Fabiano Rosas
2024-08-30 11:56 ` [RFC V1 5/6] tap: cpr support Steve Sistare
2024-08-30 11:56 ` [RFC V1 6/6] tap: postload fix for cpr Steve Sistare

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fa95c40d-b5e5-41eb-bba7-7842bca2f73e@oracle.com \
    --to=steven.sistare@oracle.com \
    --cc=euan.turner@nutanix.com \
    --cc=farosas@suse.de \
    --cc=jasowang@redhat.com \
    --cc=mst@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).