From: Steven Sistare <steven.sistare@oracle.com>
To: Euan Turner <euan.turner@nutanix.com>, qemu-devel@nongnu.org
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
Jason Wang <jasowang@redhat.com>, Peter Xu <peterx@redhat.com>,
Fabiano Rosas <farosas@suse.de>
Subject: Re: [RFC V1 1/6] Revert "vhost-backend: remove vhost_kernel_reset_device()"
Date: Tue, 3 Sep 2024 15:55:03 -0400 [thread overview]
Message-ID: <fa95c40d-b5e5-41eb-bba7-7842bca2f73e@oracle.com> (raw)
In-Reply-To: <c7eaca75-1824-4f46-af97-9905c2a214ba@nutanix.com>
[-- Attachment #1: Type: text/plain, Size: 4400 bytes --]
On 9/3/2024 6:44 AM, Euan Turner wrote:
> Hi Steve,
>
> On 30/08/2024 12:56, Steve Sistare wrote:
>> This reverts commit e6383293eb01928692047e617665a742cca87e23.
>> The reset function is needed for CPR.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>> hw/virtio/vhost-backend.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
>> index 833804d..9b75141 100644
>> --- a/hw/virtio/vhost-backend.c
>> +++ b/hw/virtio/vhost-backend.c
>> @@ -221,6 +221,11 @@ static int vhost_kernel_set_owner(struct vhost_dev *dev)
>> return vhost_kernel_call(dev, VHOST_SET_OWNER, NULL);
>> }
>> +static int vhost_kernel_reset_device(struct vhost_dev *dev)
>> +{
>> + return vhost_kernel_call(dev, VHOST_RESET_OWNER, NULL);
>> +}
>> +
> How does this series avoid falling foul of c0c4f147291f37765a5275aa24c3e1195468903b (which follows the commit reverted here)?
>
> I've been playing around with this patch series a bit, in the context of cpr-transfer, and am seeing the issues highlighted in that c0c4... commit message:
> Since vhost-kernel now has a reset_device, this is called in virtio_reset as part of qemu_machine_creation_done. (I have the full backtrace if it's helpful). Subsequent ioctls then fail (with ownership errors) due to the RESET_OWNER:
>
> 2024-09-02T15:40:56.860541Z qemu-kvm: vhost_set_vring_call failed 1
> 2024-09-02T15:40:56.860908Z qemu-kvm: vhost_set_vring_call failed 1
> 2024-09-02T15:40:56.861253Z qemu-kvm: vhost_set_mem_table failed: Operation not permitted (1)
> 2024-09-02T15:40:56.861586Z qemu-kvm: vhost_set_vring_call failed 1
> 2024-09-02T15:40:56.861831Z qemu-kvm: vhost_set_vring_call failed 1
> 2024-09-02T15:40:56.862199Z qemu-kvm: unable to start vhost net: 1: falling back on userspace virtio
>
> For me the NIC then fails during the migration, although the migration as a whole appears to succeed. (At least, prior the the migration, I could ssh into the VM and ping out to 8.8.8.8, but then I lose the ssh connection during the migration, and cannot ssh back in again afterwards on the new QEMU).
>
> Do you think this could be because of QEMU falling back from the vhost backend to use virtio?
>
> It may be down to some misconfiguration on my part, here's the netdev command line I had for reference:
> On the source QEMU:
>
> -netdev '{"type":"tap","fd":"39","vhost":true,"vhostfd":"40","id":"hostua-43bc0eaf-ff55-44e6-87ec-a4798f592db1"}' \
> -device '{"driver":"virtio-net-pci","rx_queue_size":256,"netdev":"hostua-43bc0eaf-ff55-44e6-87ec-a4798f592db1","id":"ua-43bc0eaf-ff55-44e6-87ec-a4798f592db1","mac":"50:6b:8d:0c:03:e0","bus":"pci.1","addr":"0x0"}' \
>
> On the destination QEMU:
> -netdev '{"type":"tap","fd":"-1","vhostfd":"-1","id":"hostua-43bc0eaf-ff55-44e6-87ec-a4798f592db1"}' \
> -device '{"driver":"virtio-net-pci","rx_queue_size":256,"netdev":"hostua-43bc0eaf-ff55-44e6-87ec-a4798f592db1","id":"ua-43bc0eaf-ff55-44e6-87ec-a4798f592db1","mac":"50:6b:8d:0c:03:e0","bus":"pci.1","addr":"0x0"}' \
>
>> static int vhost_kernel_get_vq_index(struct vhost_dev *dev, int idx)
>> {
>> assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
>> @@ -345,6 +350,7 @@ const VhostOps kernel_ops = {
>> .vhost_get_features = vhost_kernel_get_features,
>> .vhost_set_backend_cap = vhost_kernel_set_backend_cap,
>> .vhost_set_owner = vhost_kernel_set_owner,
>> + .vhost_reset_device = vhost_kernel_reset_device,
>> .vhost_get_vq_index = vhost_kernel_get_vq_index,
>> .vhost_vsock_set_guest_cid = vhost_kernel_vsock_set_guest_cid,
>> .vhost_vsock_set_running = vhost_kernel_vsock_set_running,
>
The 6 patches in this series are only sufficient for cpr-exec mode, but I have
attached the 2 additional patches you need for cpr-transfer mode. That mode
works fine for me with those patches. When I run a similar test, vhost_reset_device
is not called on target QEMU because vhost_started is false:
qemu_machine_creation_done()
virtio_reset()
if (vdev->vhost_started && k->get_vhost)
vhost_reset_device(k->get_vhost(vdev));
I don't know why you are seeing the vhost_set_vring_call failures. I don't see those,
with or without the 2 additional patches.
- Steve
[-- Attachment #2: 0001-migration-cpr-setup-notifier.patch --]
[-- Type: text/plain, Size: 2136 bytes --]
From 4f719179546e9327cba3b74e86d2a403e36c5d71 Mon Sep 17 00:00:00 2001
From: Steve Sistare <steven.sistare@oracle.com>
Date: Tue, 20 Aug 2024 05:36:15 -0700
Subject: [PATCH 1/2] migration: cpr setup notifier
Add a cpr-setup notification point which is called before general setup.
It is needed for resetting vhost devices, as explained in the next patch.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
include/migration/misc.h | 7 ++++---
migration/migration.c | 5 +++++
2 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index bcb6f35..bd05fe2 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -60,6 +60,7 @@ bool migration_thread_is_self(void);
bool migration_is_setup_or_active(void);
typedef enum MigrationEventType {
+ MIG_EVENT_PRECOPY_CPR_SETUP,
MIG_EVENT_PRECOPY_SETUP,
MIG_EVENT_PRECOPY_DONE,
MIG_EVENT_PRECOPY_FAILED,
@@ -71,9 +72,9 @@ typedef struct MigrationEvent {
} MigrationEvent;
/*
- * A MigrationNotifyFunc may return an error code and an Error object,
- * but only when @e->type is MIG_EVENT_PRECOPY_SETUP. The code is an int
- * to allow for different failure modes and recovery actions.
+ * A MigrationNotifyFunc may return an error code and an Error object, but
+ * only when @e->type is MIG_EVENT_PRECOPY_SETUP or MIG_EVENT_PRECOPY_CPR_SETUP.
+ * The code is an int to allow for different failure modes and recovery actions.
*/
typedef int (*MigrationNotifyFunc)(NotifierWithReturn *notify,
MigrationEvent *e, Error **errp);
diff --git a/migration/migration.c b/migration/migration.c
index 15ac8f5..8bc1975 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2201,7 +2201,12 @@ void qmp_migrate(const char *uri, bool has_channels,
stopped = true;
}
+ if (migration_call_notifiers(s, MIG_EVENT_PRECOPY_CPR_SETUP, &local_err)) {
+ goto out;
+ }
+
if (cpr_state_save(&local_err)) {
+ migration_call_notifiers(s, MIG_EVENT_PRECOPY_FAILED, NULL);
goto out;
}
--
1.8.3.1
[-- Attachment #3: 0002-vhost-cpr-transfer-support.patch --]
[-- Type: text/plain, Size: 3281 bytes --]
From 0f09781df215086f095e40b42662dd2f3378ca7f Mon Sep 17 00:00:00 2001
From: Steve Sistare <steven.sistare@oracle.com>
Date: Tue, 20 Aug 2024 05:48:08 -0700
Subject: [PATCH 2/2] vhost cpr-transfer support
For cpr-transfer mode, a vhost device must be reset before CPR state
is sent to new QEMU, because after state is sent, new QEMU initializes
the device and calls set_owner. Install a cpr-setup notifier to do so.
This still works for cpr-exec mode.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
hw/virtio/vhost.c | 17 +++++++++++++++--
include/hw/virtio/vhost.h | 1 +
2 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 54c5ec7..31c72d1 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1513,16 +1513,24 @@ static int vhost_cpr_notifier(NotifierWithReturn *notifier,
struct vhost_dev *dev;
int r;
- dev = container_of(notifier, struct vhost_dev, cpr_exec_notifier);
+ dev = (migrate_mode() == MIG_MODE_CPR_EXEC) ?
+ container_of(notifier, struct vhost_dev, cpr_exec_notifier) :
+ container_of(notifier, struct vhost_dev, cpr_transfer_notifier);
+
if (dev->vhost_ops->backend_type != VHOST_BACKEND_TYPE_KERNEL) {
return 0;
}
- if (e->type == MIG_EVENT_PRECOPY_DONE) {
+ if (e->type == MIG_EVENT_PRECOPY_CPR_SETUP) {
r = dev->vhost_ops->vhost_reset_device(dev);
if (r < 0) {
VHOST_OPS_DEBUG(r, "vhost_reset_device failed");
}
+ } else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
+ r = dev->vhost_ops->vhost_set_owner(dev);
+ if (r < 0) {
+ VHOST_OPS_DEBUG(r, "vhost_set_owner failed");
+ }
}
return 0;
}
@@ -1538,6 +1546,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
hdev->vdev = NULL;
hdev->migration_blocker = NULL;
hdev->cpr_exec_notifier.notify = NULL;
+ hdev->cpr_transfer_notifier.notify = NULL;
r = vhost_set_backend_type(hdev, backend_type);
assert(r >= 0);
@@ -1641,6 +1650,9 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
migration_add_notifier_mode(&hdev->cpr_exec_notifier,
vhost_cpr_notifier,
MIG_MODE_CPR_EXEC);
+ migration_add_notifier_mode(&hdev->cpr_transfer_notifier,
+ vhost_cpr_notifier,
+ MIG_MODE_CPR_TRANSFER);
QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
/*
@@ -1698,6 +1710,7 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
}
migrate_del_blocker(&hdev->migration_blocker);
migration_remove_notifier(&hdev->cpr_exec_notifier);
+ migration_remove_notifier(&hdev->cpr_transfer_notifier);
g_free(hdev->mem);
g_free(hdev->mem_sections);
if (hdev->vhost_ops) {
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index e9aca11..ab7e874 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -133,6 +133,7 @@ struct vhost_dev {
QLIST_HEAD(, vhost_iommu) iommu_list;
IOMMUNotifier n;
NotifierWithReturn cpr_exec_notifier;
+ NotifierWithReturn cpr_transfer_notifier;
const VhostDevConfigOps *config_ops;
};
--
1.8.3.1
next prev parent reply other threads:[~2024-09-03 19:56 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-30 11:56 [RFC V1 0/6] Live Update: tap and vhost Steve Sistare
2024-08-30 11:56 ` [RFC V1 1/6] Revert "vhost-backend: remove vhost_kernel_reset_device()" Steve Sistare
2024-09-03 10:44 ` Euan Turner
2024-09-03 19:55 ` Steven Sistare [this message]
2024-08-30 11:56 ` [RFC V1 2/6] tap: common return label Steve Sistare
2024-08-30 11:56 ` [RFC V1 3/6] tap: fix net_init_tap() return code Steve Sistare
2024-08-30 11:56 ` [RFC V1 4/6] migration: cpr_get_fd_param helper Steve Sistare
2024-09-03 15:38 ` Fabiano Rosas
2024-08-30 11:56 ` [RFC V1 5/6] tap: cpr support Steve Sistare
2024-08-30 11:56 ` [RFC V1 6/6] tap: postload fix for cpr Steve Sistare
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=fa95c40d-b5e5-41eb-bba7-7842bca2f73e@oracle.com \
--to=steven.sistare@oracle.com \
--cc=euan.turner@nutanix.com \
--cc=farosas@suse.de \
--cc=jasowang@redhat.com \
--cc=mst@redhat.com \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).