From: "Eugenio Pérez" <eperezma@redhat.com>
To: qemu-devel@nongnu.org
Cc: Gautam Dawar <gdawar@xilinx.com>,
Jason Wang <jasowang@redhat.com>,
Zhu Lingshan <lingshan.zhu@intel.com>,
yin31149@gmail.com, Shannon Nelson <shannon.nelson@amd.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Dragos Tatulea <dtatulea@nvidia.com>,
Yajun Wu <yajunw@nvidia.com>, Juan Quintela <quintela@redhat.com>,
Laurent Vivier <lvivier@redhat.com>,
Stefano Garzarella <sgarzare@redhat.com>,
Parav Pandit <parav@mellanox.com>, Lei Yang <leiyang@redhat.com>,
si-wei.liu@oracle.com
Subject: [RFC PATCH v2 00/10] Map memory at destination .load_setup in vDPA-net migration
Date: Tue, 28 Nov 2023 11:42:53 +0100
Message-ID: <20231128104303.3314000-1-eperezma@redhat.com>

Memory operations like pinning may take a long time at the destination.
Currently they are done after the source of the migration is stopped and
before the workload is resumed at the destination, a period where neither
traffic can flow nor the VM workload can continue (downtime).

We can do better, as the destination knows the memory layout of the guest RAM
from the moment the migration starts. Moving the map operation to that moment
allows QEMU to communicate the maps to the kernel while the workload is still
running on the source, so Linux can start mapping them.

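For context, the maps sent ahead of time are the usual VHOST_IOTLB_UPDATE
messages written to the vhost-vdpa device fd. A minimal sketch of one such map
using the uAPI structs from linux/vhost_types.h (simplified, not the literal
QEMU code; the dma_map_sketch() name is made up):

    #include <errno.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <unistd.h>
    #include <linux/vhost.h>   /* struct vhost_msg_v2, VHOST_IOTLB_* */

    /* Sketch: ask the kernel to map one guest memory region through the
     * vhost-vdpa device fd. */
    static int dma_map_sketch(int device_fd, uint32_t asid, uint64_t iova,
                              uint64_t size, void *vaddr, bool readonly)
    {
        struct vhost_msg_v2 msg = {
            .type = VHOST_IOTLB_MSG_V2,
            .asid = asid,
            .iotlb.iova = iova,
            .iotlb.size = size,
            .iotlb.uaddr = (uint64_t)(uintptr_t)vaddr,
            .iotlb.perm = readonly ? VHOST_ACCESS_RO : VHOST_ACCESS_RW,
            .iotlb.type = VHOST_IOTLB_UPDATE,
        };

        /* The kernel pins the backing pages as a side effect of this
         * write; that pinning is the slow part this series moves out of
         * the downtime window. */
        if (write(device_fd, &msg, sizeof(msg)) != sizeof(msg)) {
            return -errno;
        }
        return 0;
    }
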
Also, the transfer of the guest memory may finish before the destination QEMU
has mapped all of it. In that case, the rest of the memory is mapped at the
same point as before this series, when the device is starting, so this series
can only be an improvement.

RFC TODO: we should be able to avoid completing the migration while some
memory is still unmapped, but I have not found how to do that yet. Suggestions
are welcome.

Note that further device setup at the end of the migration may alter the
guest memory layout. But, as with the previous point, many operations like
memory pinning are still done incrementally, so we save time anyway.

Only tested with vdpa_sim. I'm sending this before a full benchmark, as some
work like [1] can be based on it, and Si-Wei agreed to benchmark this series
with his experience.

This needs to be applied on top of [2], which performs some code
reorganization that allows mapping the memory without knowing the queue layout
the guest configures on the device.

Future directions on top of this series may include:
* Iterative migration of virtio-net devices, as it may reduce downtime per
  [1]. vhost-vdpa net can apply the configuration through CVQ at the
  destination while the source is still migrating.
* Move more things ahead of migration time, like DRIVER_OK.
* Check that the devices at the destination are valid, and cancel the
  migration if they are not.
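
To show where this hooks into the migration core: the destination side gets
its callback from a SaveVMHandlers .load_setup, which the migration code runs
as soon as it starts reading the incoming stream, well before the device is
started. A rough sketch of that shape (all names below, including the
vhost_net_load_setup() helper, are made up for illustration):

    #include "qemu/osdep.h"
    #include "hw/virtio/virtio-net.h"  /* VirtIONet */
    #include "migration/register.h"    /* SaveVMHandlers, register_savevm_live */

    /* Sketch: runs on the destination at incoming-migration setup time.
     * Names are illustrative, not the ones in the patches. */
    static int virtio_net_load_setup_sketch(QEMUFile *f, void *opaque)
    {
        VirtIONet *n = opaque;

        /* Ask the vhost-vdpa backend to start mapping guest memory now,
         * overlapping the kernel's pinning with the RAM transfer. */
        return vhost_net_load_setup(n); /* hypothetical helper */
    }

    static const SaveVMHandlers savevm_virtio_net_handlers = {
        .load_setup = virtio_net_load_setup_sketch,
    };

    /* Registered once per device, e.g. at realize time. */
    static void virtio_net_register_early_map(VirtIONet *n)
    {
        register_savevm_live("virtio-net-early-map", 0, 1,
                             &savevm_virtio_net_handlers, n);
    }

The actual series routes this through a new NetClient callback instead of a
direct helper (patches 08-10).
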
RFC v2:
* Delegate the map to another thread so it does not block QMP.
* Fix not allocating iova_tree if x-svq=on at the destination.
* Rebased on latest master.
* More cleanups of the current code, which might be split from this series
  too.

[1] https://lore.kernel.org/qemu-devel/6c8ebb97-d546-3f1c-4cdd-54e23a566f61@nvidia.com/T/
[2] https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg05331.html

Eugenio Pérez (10):
vdpa: do not set virtio status bits if unneeded
vdpa: make batch_begin_once early return
vdpa: merge _begin_batch into _batch_begin_once
vdpa: extract out _dma_end_batch from _listener_commit
vdpa: factor out stop path of vhost_vdpa_dev_start
vdpa: check for iova tree initialized at net_client_start
vdpa: set backend capabilities at vhost_vdpa_init
vdpa: add vhost_vdpa_load_setup
vdpa: add vhost_vdpa_net_load_setup NetClient callback
virtio_net: register incremental migration handlers

include/hw/virtio/vhost-vdpa.h | 25 ++++
include/net/net.h | 6 +
hw/net/virtio-net.c | 35 +++++
hw/virtio/vhost-vdpa.c | 257 +++++++++++++++++++++++++++------
net/vhost-vdpa.c | 37 ++++-
5 files changed, 312 insertions(+), 48 deletions(-)
--
2.39.3