From: Si-Wei Liu <si-wei.liu@oracle.com>
To: Eugenio Perez Martin <eperezma@redhat.com>,
qemu-level <qemu-devel@nongnu.org>
Cc: Jason Wang <jasowang@redhat.com>,
Michael Tsirkin <mst@redhat.com>, Longpeng <longpeng2@huawei.com>,
"Gonglei (Arei)" <arei.gonglei@huawei.com>,
Eli Cohen <elic@nvidia.com>, Parav Pandit <parav@nvidia.com>,
Juan Quintela <quintela@redhat.com>,
David Gilbert <dgilbert@redhat.com>,
Dragos Tatulea <dtatulea@nvidia.com>
Subject: Re: Reducing vdpa migration downtime because of memory pin / maps
Date: Tue, 6 Jun 2023 15:44:29 -0700
Message-ID: <c59d2d67-d31a-b6e6-54c5-5b81c18d9547@oracle.com>
In-Reply-To: <CAJaqyWdV6pKP0SVZciMiu_HN86aJriZh0HBiwHNkO7+yErXnBA@mail.gmail.com>

Sorry for reviving this old thread; I missed the best timing to follow
up on this while I was on vacation. I have been working on this and
found some discrepancies, please see below.

On 4/5/23 04:37, Eugenio Perez Martin wrote:
> Hi!
>
> As mentioned in the last upstream virtio-networking meeting, one of
> the factors that adds more downtime to migration is the handling of
> the guest memory (pin, map, etc). At this moment this handling is
> bound to the virtio life cycle (DRIVER_OK, RESET). In that sense, the
> destination device waits until all the guest memory / state is
> migrated to start pinning all the memory.
>
> The proposal is to bind it to the char device life cycle (open vs
> close),
Hmmm, really? If it is tied to the char device life cycle, it won't
work for the next guest / qemu launched on the same vhost-vdpa device
node.
> so all the guest memory can be pinned for all the guest / qemu
> lifecycle.
I think tying pinning to the guest / qemu process life cycle makes more
sense. Essentially the pinning part needs to be decoupled from the
iotlb mapping abstraction layer, and can / should work as a standalone
uAPI, such that QEMU at the destination may launch and pin all of the
guest's memory as needed without having to start the device, while
awaiting any incoming migration request. The problem, though, is that
there's no existing vhost uAPI that could properly serve as the vehicle
for that. SET_OWNER / SET_MEM_TABLE / RESET_OWNER seem a remote fit.
Any objection against introducing a new but clean vhost uAPI for
pinning guest pages, subject to the guest's life cycle?
Another concern is the use_va stuff: originally it is tagged at the
device level and made static at the time of device instantiation, which
is fine. But the ones to come just find a new home in per-group or
per-vq level structs. It is hard to tell whether pinning is actually
needed for those latter use_va friends, as they are essentially tied to
the virtio life cycle or feature negotiation, while guest / QEMU starts
way earlier than that. Perhaps just ignore those sub-device level
use_va usages? Presumably !use_va at the device level is sufficient to
infer the need for pinning for the device?
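
For the device-level decision, what I have in mind is as simple as the
sketch below (relying on the existing use_va flag in struct
vdpa_device; the helper itself is made up):

/*
 * Minimal sketch, assuming the device-level use_va flag in struct
 * vdpa_device; the helper is hypothetical.
 */
#include <linux/vdpa.h>

static bool vdpa_needs_pinning(const struct vdpa_device *vdev)
{
	/*
	 * !use_va means the parent device operates on physical / IOVA
	 * addresses, so guest pages have to be pinned before the
	 * device can DMA into them.
	 */
	return !vdev->use_va;
}
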
Regards,
-Siwei
>
> This has two main problems:
> * At this moment the reset semantics forces the vdpa device to unmap
> all the memory. So this change needs a vhost vdpa feature flag.
> * This may increase the initialization time. Maybe we can delay it if
> qemu is not the destination of a LM. Anyway I think this should be
> done as an optimization on top.
>
> Any ideas or comments in this regard?
>
> Thanks!
>