From: Si-Wei Liu <si-wei.liu@oracle.com>
To: "Michael S. Tsirkin" <mst@redhat.com>,
Jonah Palmer <jonah.palmer@oracle.com>
Cc: qemu-devel@nongnu.org, eperezma@redhat.com, peterx@redhat.com,
jasowang@redhat.com, lvivier@redhat.com, dtatulea@nvidia.com,
leiyang@redhat.com, parav@mellanox.com, sgarzare@redhat.com,
lingshan.zhu@intel.com, boris.ostrovsky@oracle.com
Subject: Re: [PATCH v4 7/7] vdpa: move memory listener register to vhost_vdpa_init
Date: Thu, 15 May 2025 10:36:27 -0700
Message-ID: <a05df34b-bacd-46b6-b958-ec94076d7649@oracle.com>
In-Reply-To: <20250515014103-mutt-send-email-mst@kernel.org>
On 5/14/2025 10:42 PM, Michael S. Tsirkin wrote:
> On Wed, May 07, 2025 at 02:46:47PM -0400, Jonah Palmer wrote:
>> From: Eugenio Pérez <eperezma@redhat.com>
>>
>> Memory operations like pinning may take a lot of time at the
>> destination. Currently they are done after the source of the migration is
>> stopped and before the workload is resumed at the destination. This is a
>> period where neither traffic can flow nor the VM workload can continue
>> (downtime).
>>
>> We can do better, as we know the memory layout of the guest RAM at the
>> destination from the moment that all devices are initialized. So
>> moving that operation allows QEMU to communicate the maps to the kernel
>> while the workload is still running in the source, so Linux can start
>> mapping them.
>>
>> As a small drawback, there is a period during initialization in which
>> QEMU cannot respond to QMP etc. In some testing, this period is about
>> 0.2 seconds. This may be further reduced (or increased) depending on the
>> vdpa driver and the platform hardware, and it is dominated by the cost
>> of memory pinning.
>>
>> This matches the time that we move out of the so-called downtime window.
>> The downtime is measured by checking the trace timestamps from the moment
>> the source suspends the device to the moment the destination starts the
>> eighth and last virtqueue pair. For a 39G guest, it goes from ~2.2526
>> secs to 2.0949 secs.
>>
>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
>>
>> v3:
>> ---
> just know that everything beyond this line is not going into
> git commit log.
I guess that was the intent? It should be fine to leave them out of the
commit log, I think. These are interim notes capturing what changes were
made to fix specific bugs *in previously posted versions*.
(Having said that, please help edit the log and remove the "v3:" line,
which should come after the --- separator line, thx!)
-Siwei
>
>
>> Move memory listener unregistration from vhost_vdpa_reset_status to
>> vhost_vdpa_reset_device. By unregistering the listener here, we can
>> guarantee that every reset leaves the device in an expected state.
>> Also remove the duplicate call in vhost_vdpa_reset_status.
>>
>> Reported-by: Lei Yang <leiyang@redhat.com>
>> Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com>
>>
>> --
>> v2:
>> Move the memory listener registration to vhost_vdpa_set_owner function.
>> In case of hotplug the vdpa device, the memory is already set up, and
>> leaving memory listener register call in the init function made maps
>> occur before set owner call.
>>
>> To be 100% safe, let's put it right after set_owner call.
>>
>> Reported-by: Lei Yang <leiyang@redhat.com>
>> ---
>> hw/virtio/vhost-vdpa.c | 35 ++++++++++++++++++++++++++++-------
>> 1 file changed, 28 insertions(+), 7 deletions(-)
>>
>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>> index de834f2ebd..e20da95f30 100644
>> --- a/hw/virtio/vhost-vdpa.c
>> +++ b/hw/virtio/vhost-vdpa.c
>> @@ -894,8 +894,14 @@ static int vhost_vdpa_reset_device(struct vhost_dev *dev)
>>
>>      ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
>>      trace_vhost_vdpa_reset_device(dev);
>> +    if (ret) {
>> +        return ret;
>> +    }
>> +
>> +    memory_listener_unregister(&v->shared->listener);
>> +    v->shared->listener_registered = false;
>>      v->suspended = false;
>> -    return ret;
>> +    return 0;
>>  }
>>
>>  static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx)
>> @@ -1379,6 +1385,11 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>>                           "IOMMU and try again");
>>              return -1;
>>          }
>> +        if (v->shared->listener_registered &&
>> +            dev->vdev->dma_as != v->shared->listener.address_space) {
>> +            memory_listener_unregister(&v->shared->listener);
>> +            v->shared->listener_registered = false;
>> +        }
>>          if (!v->shared->listener_registered) {
>>              memory_listener_register(&v->shared->listener, dev->vdev->dma_as);
>>              v->shared->listener_registered = true;
>> @@ -1392,8 +1403,6 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>>
>>  static void vhost_vdpa_reset_status(struct vhost_dev *dev)
>>  {
>> -    struct vhost_vdpa *v = dev->opaque;
>> -
>>      if (!vhost_vdpa_last_dev(dev)) {
>>          return;
>>      }
>> @@ -1401,9 +1410,6 @@ static void vhost_vdpa_reset_status(struct vhost_dev *dev)
>>      vhost_vdpa_reset_device(dev);
>>      vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
>>                                 VIRTIO_CONFIG_S_DRIVER);
>> -    memory_listener_unregister(&v->shared->listener);
>> -    v->shared->listener_registered = false;
>> -
>>  }
>>
>>  static int vhost_vdpa_set_log_base(struct vhost_dev *dev, uint64_t base,
>> @@ -1537,12 +1543,27 @@ static int vhost_vdpa_get_features(struct vhost_dev *dev,
>>
>>  static int vhost_vdpa_set_owner(struct vhost_dev *dev)
>>  {
>> +    int r;
>> +    struct vhost_vdpa *v;
>> +
>>      if (!vhost_vdpa_first_dev(dev)) {
>>          return 0;
>>      }
>>
>>      trace_vhost_vdpa_set_owner(dev);
>> -    return vhost_vdpa_call(dev, VHOST_SET_OWNER, NULL);
>> +    r = vhost_vdpa_call(dev, VHOST_SET_OWNER, NULL);
>> +    if (unlikely(r < 0)) {
>> +        return r;
>> +    }
>> +
>> +    /*
>> +     * Being optimistic and listening address space memory. If the device
>> +     * uses vIOMMU, it is changed at vhost_vdpa_dev_start.
>> +     */
>> +    v = dev->opaque;
>> +    memory_listener_register(&v->shared->listener, &address_space_memory);
>> +    v->shared->listener_registered = true;
>> +    return 0;
>>  }
>>
>>  static int vhost_vdpa_vq_get_addr(struct vhost_dev *dev,
>> --
>> 2.43.5
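
For readers following the listener lifecycle in the quoted diff, here is a minimal standalone sketch of the intended state transitions. It is a toy model, not QEMU code: every Toy* type and toy_* function is a hypothetical stand-in, and the map_rounds counter only approximates the fact that registering a memory listener replays the existing memory sections. It shows three things: set_owner registers optimistically against system memory, dev_start re-registers only when the device's DMA address space differs, and reset_device unregisters only once the status reset has succeeded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; none of these are QEMU types or functions. */
typedef struct ToyAddressSpace {
    const char *name;
} ToyAddressSpace;

typedef struct ToyShared {
    const ToyAddressSpace *listener_as; /* NULL while unregistered */
    bool listener_registered;
    int map_rounds; /* how many times the maps were (re)built */
} ToyShared;

static void toy_register(ToyShared *s, const ToyAddressSpace *as)
{
    /* Registering twice without unregistering would be a bug. */
    assert(!s->listener_registered);
    s->listener_as = as;
    s->listener_registered = true;
    s->map_rounds++; /* assumption: one mapping pass per registration */
}

static void toy_unregister(ToyShared *s)
{
    s->listener_as = NULL;
    s->listener_registered = false;
}

/* Models set_owner: register early against system memory. */
static void toy_set_owner(ToyShared *s, const ToyAddressSpace *sysmem)
{
    toy_register(s, sysmem);
}

/* Models dev_start: switch address space only if the guess was wrong. */
static void toy_dev_start(ToyShared *s, const ToyAddressSpace *dma_as)
{
    if (s->listener_registered && s->listener_as != dma_as) {
        toy_unregister(s);
    }
    if (!s->listener_registered) {
        toy_register(s, dma_as);
    }
}

/* Models reset_device: a failed status reset leaves the listener alone. */
static int toy_reset_device(ToyShared *s, int set_status_ret)
{
    if (set_status_ret) {
        return set_status_ret;
    }
    toy_unregister(s);
    return 0;
}
```

In this model, a device without vIOMMU pays for a single mapping pass (at set_owner, i.e. before the downtime window), while a device whose DMA address space turns out to differ pays for a second pass at dev_start.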