qemu-devel.nongnu.org archive mirror
From: Jonah Palmer <jonah.palmer@oracle.com>
To: Si-Wei Liu <si-wei.liu@oracle.com>,
	"Michael S. Tsirkin" <mst@redhat.com>
Cc: qemu-devel@nongnu.org, eperezma@redhat.com, peterx@redhat.com,
	jasowang@redhat.com, lvivier@redhat.com, dtatulea@nvidia.com,
	leiyang@redhat.com, parav@mellanox.com, sgarzare@redhat.com,
	lingshan.zhu@intel.com, boris.ostrovsky@oracle.com
Subject: Re: [PATCH v4 7/7] vdpa: move memory listener register to vhost_vdpa_init
Date: Tue, 20 May 2025 09:23:35 -0400
Message-ID: <b59cfe6f-5ce2-4ded-9e6d-b189988152da@oracle.com>
In-Reply-To: <a05df34b-bacd-46b6-b958-ec94076d7649@oracle.com>


On 5/15/25 1:36 PM, Si-Wei Liu wrote:
>
>
> On 5/14/2025 10:42 PM, Michael S. Tsirkin wrote:
>> On Wed, May 07, 2025 at 02:46:47PM -0400, Jonah Palmer wrote:
>>> From: Eugenio Pérez <eperezma@redhat.com>
>>>
>>> Current memory operations like pinning may take a lot of time at the
>>> destination.  Currently they are done after the source of the migration
>>> is stopped, and before the workload is resumed at the destination.  This
>>> is a period where neither traffic can flow, nor the VM workload can
>>> continue (downtime).
>>>
>>> We can do better as we know the memory layout of the guest RAM at the
>>> destination from the moment that all devices are initialized.  So
>>> moving that operation allows QEMU to communicate the maps to the kernel
>>> while the workload is still running in the source, so Linux can start
>>> mapping them.
>>>
>>> As a small drawback, there is a time in the initialization where QEMU
>>> cannot respond to QMP etc.  By some testing, this time is about
>>> 0.2 seconds.  This may be further reduced (or increased) depending on
>>> the vdpa driver and the platform hardware, and it is dominated by the
>>> cost of memory pinning.
>>>
>>> This matches the time that we move out of the so-called downtime window.
>>> The downtime is measured by checking the trace timestamps from the
>>> moment the source suspends the device to the moment the destination
>>> starts the eighth and last virtqueue pair.  For a 39G guest, it goes
>>> from ~2.2526 secs to ~2.0949 secs.
>>>
>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
>>>
>>> v3:
>>> ---
>> just know that everything beyond this line is not going into
>> git commit log.
> I guess that was the intent? Should be fine without them in the commit
> log I think. These are interim notes capturing what changes were made to
> fix specific bugs *in previously posted versions*.
>
> (Having said that, please help edit the log and remove the "v3:" line,
> which should be after the --- separator line, thx!)
>
> -Siwei
Woops, will fix this. Sorry about that.
>
>>
>>
>>> Move memory listener unregistration from vhost_vdpa_reset_status to
>>> vhost_vdpa_reset_device. By unregistering the listener here, we can
>>> guarantee that every reset leaves the device in an expected state.
>>> Also remove the duplicate call in vhost_vdpa_reset_status.
>>>
>>> Reported-by: Lei Yang <leiyang@redhat.com>
>>> Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com>
>>>
>>> -- 
>>> v2:
>>> Move the memory listener registration to vhost_vdpa_set_owner function.
>>> In case of hotplug the vdpa device, the memory is already set up, and
>>> leaving memory listener register call in the init function made maps
>>> occur before set owner call.
>>>
>>> To be 100% safe, let's put it right after set_owner call.
>>>
>>> Reported-by: Lei Yang <leiyang@redhat.com>
>>> ---
>>>   hw/virtio/vhost-vdpa.c | 35 ++++++++++++++++++++++++++++-------
>>>   1 file changed, 28 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>> index de834f2ebd..e20da95f30 100644
>>> --- a/hw/virtio/vhost-vdpa.c
>>> +++ b/hw/virtio/vhost-vdpa.c
>>> @@ -894,8 +894,14 @@ static int vhost_vdpa_reset_device(struct vhost_dev *dev)
>>>
>>>       ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
>>>       trace_vhost_vdpa_reset_device(dev);
>>> +    if (ret) {
>>> +        return ret;
>>> +    }
>>> +
>>> +    memory_listener_unregister(&v->shared->listener);
>>> +    v->shared->listener_registered = false;
>>>       v->suspended = false;
>>> -    return ret;
>>> +    return 0;
>>>   }
>>>
>>>   static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx)
>>> @@ -1379,6 +1385,11 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>>>                            "IOMMU and try again");
>>>               return -1;
>>>           }
>>> +        if (v->shared->listener_registered &&
>>> +            dev->vdev->dma_as != v->shared->listener.address_space) {
>>> +            memory_listener_unregister(&v->shared->listener);
>>> +            v->shared->listener_registered = false;
>>> +        }
>>>           if (!v->shared->listener_registered) {
>>>               memory_listener_register(&v->shared->listener, dev->vdev->dma_as);
>>>               v->shared->listener_registered = true;
>>> @@ -1392,8 +1403,6 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>>>
>>>   static void vhost_vdpa_reset_status(struct vhost_dev *dev)
>>>   {
>>> -    struct vhost_vdpa *v = dev->opaque;
>>> -
>>>       if (!vhost_vdpa_last_dev(dev)) {
>>>           return;
>>>       }
>>> @@ -1401,9 +1410,6 @@ static void vhost_vdpa_reset_status(struct vhost_dev *dev)
>>>       vhost_vdpa_reset_device(dev);
>>>       vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
>>>                                  VIRTIO_CONFIG_S_DRIVER);
>>> -    memory_listener_unregister(&v->shared->listener);
>>> -    v->shared->listener_registered = false;
>>> -
>>>   }
>>>
>>>   static int vhost_vdpa_set_log_base(struct vhost_dev *dev, uint64_t base,
>>> @@ -1537,12 +1543,27 @@ static int vhost_vdpa_get_features(struct vhost_dev *dev,
>>>
>>>   static int vhost_vdpa_set_owner(struct vhost_dev *dev)
>>>   {
>>> +    int r;
>>> +    struct vhost_vdpa *v;
>>> +
>>>       if (!vhost_vdpa_first_dev(dev)) {
>>>           return 0;
>>>       }
>>>
>>>       trace_vhost_vdpa_set_owner(dev);
>>> -    return vhost_vdpa_call(dev, VHOST_SET_OWNER, NULL);
>>> +    r = vhost_vdpa_call(dev, VHOST_SET_OWNER, NULL);
>>> +    if (unlikely(r < 0)) {
>>> +        return r;
>>> +    }
>>> +
>>> +    /*
>>> +     * Being optimistic and listening address space memory. If the device
>>> +     * uses vIOMMU, it is changed at vhost_vdpa_dev_start.
>>> +     */
>>> +    v = dev->opaque;
>>> +    memory_listener_register(&v->shared->listener, &address_space_memory);
>>> +    v->shared->listener_registered = true;
>>> +    return 0;
>>>   }
>>>
>>>   static int vhost_vdpa_vq_get_addr(struct vhost_dev *dev,
>>> -- 
>>> 2.43.5
>
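
For anyone following the listener lifecycle in the quoted patch, here is a
minimal standalone sketch of the three paths it touches: optimistic
registration right after set_owner, re-registration at dev_start when the
DMA address space differs, and unregistration on a successful reset. It is
only an illustration and not the QEMU code: the Mock* types and mock_*
functions are hypothetical stand-ins for AddressSpace, MemoryListener, the
shared vhost_vdpa state and the real vhost_vdpa_* functions, and the
set_status_ret parameter stands in for the result of the
VHOST_VDPA_SET_STATUS call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's AddressSpace. */
typedef struct MockAddressSpace {
    const char *name;
} MockAddressSpace;

/* Hypothetical stand-in for MemoryListener: NULL means unregistered. */
typedef struct MockListener {
    MockAddressSpace *address_space;
} MockListener;

/* Hypothetical stand-in for the shared vhost-vdpa state. */
typedef struct MockShared {
    MockListener listener;
    bool listener_registered;
} MockShared;

/* Plays the role of the global address_space_memory. */
static MockAddressSpace mock_address_space_memory = { "memory" };

static void mock_listener_register(MockListener *l, MockAddressSpace *as)
{
    /* In this mock, registering an already registered listener is a bug. */
    assert(l->address_space == NULL);
    l->address_space = as;
}

static void mock_listener_unregister(MockListener *l)
{
    l->address_space = NULL;
}

/* set_owner path: optimistically listen on the system memory space. */
static void mock_set_owner(MockShared *s)
{
    mock_listener_register(&s->listener, &mock_address_space_memory);
    s->listener_registered = true;
}

/* dev_start path: switch address space only if the guess was wrong. */
static void mock_dev_start(MockShared *s, MockAddressSpace *dma_as)
{
    if (s->listener_registered && dma_as != s->listener.address_space) {
        mock_listener_unregister(&s->listener);
        s->listener_registered = false;
    }
    if (!s->listener_registered) {
        mock_listener_register(&s->listener, dma_as);
        s->listener_registered = true;
    }
}

/* reset path: drop the listener only when the status write succeeded. */
static int mock_reset_device(MockShared *s, int set_status_ret)
{
    if (set_status_ret) {
        return set_status_ret;
    }
    mock_listener_unregister(&s->listener);
    s->listener_registered = false;
    return 0;
}
```

With a zero-initialized MockShared, calling mock_set_owner() and then
mock_dev_start() with &mock_address_space_memory leaves the registration
untouched, which is the case where the early maps are reused. Passing a
different address space (the vIOMMU case) forces an unregister followed by
a fresh register, and a failed reset leaves the listener in place.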

