From: Eugenio Perez Martin <eperezma@redhat.com>
To: Si-Wei Liu <si-wei.liu@oracle.com>
Cc: jasowang@redhat.com, mst@redhat.com, dtatulea@nvidia.com,
leiyang@redhat.com, yin31149@gmail.com,
boris.ostrovsky@oracle.com, jonah.palmer@oracle.com,
qemu-devel@nongnu.org
Subject: Re: [PATCH 00/40] vdpa-net: improve migration downtime through descriptor ASID and persistent IOTLB
Date: Mon, 11 Dec 2023 19:39:41 +0100
Message-ID: <CAJaqyWe8F5dLEAL5y1JAkyti--hk6BHbGyu8MSChYTj4ffQHdg@mail.gmail.com>
In-Reply-To: <1701970793-6865-1-git-send-email-si-wei.liu@oracle.com>
On Thu, Dec 7, 2023 at 7:50 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
> This patch series contains several enhancements to SVQ live migration downtime
> for vDPA-net hardware devices, specifically on mlx5_vdpa. Currently it is based
> off of Eugenio's RFC v2 .load_setup series [1] to utilize the shared facility
> and reduce friction in merging or duplicating code if at all possible.
>
> It's stacked up in a particular order as below, since each optimization on
> top depends on the ones below it. Here's a breakdown of what
> each part does:
>
> Patch #  |          Feature / optimization
> ---------V-------------------------------------------------------------------
> 35 - 40  | trace events
> 34       | migrate_cancel bug fix
> 21 - 33  | (Un)map batching at stop-n-copy to further optimize LM down time
> 11 - 20  | persistent IOTLB [3] to improve LM down time
> 02 - 10  | SVQ descriptor ASID [2] to optimize SVQ switching
> 01       | dependent linux headers
>          V
>
Hi Si-Wei,

Thanks for the series, I think it contains great additions to the
live migration solution!

It is pretty large though. Do you think it would be feasible to split
the fixes and the tracing patches out into a separate series? That
would allow the reviews to focus on the downtime reduction. I think I
acked all of them.

Maybe we can even create a third series for the vring ASID? I think we
should see a performance increase when using SVQ, so it is justified on
its own too.

> Let's first define 2 sources of downtime that this work is concerned with:
>
> * SVQ switching downtime (Downtime #1): downtime at the start of migration.
> Time spent on teardown and setup for SVQ mode switching, and this downtime
> is regarded as the maximum time for an individual vdpa-net device.
> No memory transfer is involved during SVQ switching, hence no .
>
> * LM downtime (Downtime #2): aggregated downtime for all vdpa-net devices on
> resource teardown and setup in the last stop-n-copy phase on source host.
>
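
To make sure I read the two definitions the same way, here is a minimal
sketch. It assumes that "maximum for an individual device" means the slowest
device and that "aggregated" means a plain sum over devices. The device names
and the per-device timings are made up for illustration; they are not
measurements from this series.

```python
# Hypothetical per-device teardown + setup times, in milliseconds.
# These values are illustrative only, not measured data.
svq_switch_ms = {"vdpa0": 1600, "vdpa1": 1700}
stop_copy_ms = {"vdpa0": 700, "vdpa1": 800}


def downtime_svq_switching(per_device_ms):
    """Downtime #1: taken as the slowest individual device."""
    return max(per_device_ms.values())


def downtime_lm(per_device_ms):
    """Downtime #2: taken as the sum over all devices (assumption)."""
    return sum(per_device_ms.values())


print("Downtime #1 (ms):", downtime_svq_switching(svq_switch_ms))
print("Downtime #2 (ms):", downtime_lm(stop_copy_ms))
```

If that reading is right, Downtime #2 grows with the number of vdpa-net
devices while Downtime #1 is bounded by the slowest one.
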
> With each part of the optimizations applied bottom up, the effective outcome
> in terms of downtime (in seconds) can be observed in this table:
>
>
>                     |    Downtime #1    |    Downtime #2
> --------------------+-------------------+-------------------
> Baseline QEMU       |     20s ~ 30s     |        20s
>                     |                   |
> Iterative map       |                   |
> at destination[1]   |        5s         |        20s
>                     |                   |
> SVQ descriptor      |                   |
> ASID [2]            |        2s         |        5s
>                     |                   |
>                     |                   |
> persistent IOTLB    |        2s         |        2s
> [3]                 |                   |
>                     |                   |
> (Un)map batching    |                   |
> at stop-n-copy      |       1.7s        |       1.5s
> before switchover   |                   |
>
> (VM config: 128GB mem, 2 mlx5_vdpa devices, each w/ 4 data vqs)
>
Thanks for all the profiling, it looks promising!
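
For a rough sense of scale, the figures in the table can be turned into
relative reductions against the baseline. The sketch below only reuses the
numbers quoted above; since the baseline for Downtime #1 is given as a range
(20s ~ 30s), it assumes the lower bound of 20s, so the real improvement may be
larger.

```python
# Downtime figures in seconds, copied from the quoted table.
# The Downtime #1 baseline is a range (20s ~ 30s); the lower bound is assumed.
rows = [
    ("Baseline QEMU", 20.0, 20.0),
    ("Iterative map at destination", 5.0, 20.0),
    ("SVQ descriptor ASID", 2.0, 5.0),
    ("persistent IOTLB", 2.0, 2.0),
    ("(Un)map batching", 1.7, 1.5),
]


def reduction(baseline, value):
    """Percentage reduction of value relative to baseline."""
    return 100.0 * (baseline - value) / baseline


base1, base2 = rows[0][1], rows[0][2]
for name, d1, d2 in rows:
    print(f"{name:30s} #1: {reduction(base1, d1):5.1f}%  "
          f"#2: {reduction(base2, d2):5.1f}%")
```

With the whole stack applied, that works out to about a 91.5% reduction for
Downtime #1 and 92.5% for Downtime #2 under that assumption.
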
> Please find the details regarding each enhancement on the commit log.
>
> Thanks,
> -Siwei
>
>
> [1] [RFC PATCH v2 00/10] Map memory at destination .load_setup in vDPA-net migration
> https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg05711.html
> [2] VHOST_BACKEND_F_DESC_ASID
> https://lore.kernel.org/virtualization/20231018171456.1624030-2-dtatulea@nvidia.com/
> [3] VHOST_BACKEND_F_IOTLB_PERSIST
> https://lore.kernel.org/virtualization/1698304480-18463-1-git-send-email-si-wei.liu@oracle.com/
>
> ---
>
> Si-Wei Liu (40):
> linux-headers: add vhost_types.h and vhost.h
> vdpa: add vhost_vdpa_get_vring_desc_group
> vdpa: probe descriptor group index for data vqs
> vdpa: piggyback desc_group index when probing isolated cvq
> vdpa: populate desc_group from net_vhost_vdpa_init
> vhost: make svq work with gpa without iova translation
> vdpa: move around vhost_vdpa_set_address_space_id
> vdpa: add back vhost_vdpa_net_first_nc_vdpa
> vdpa: no repeat setting shadow_data
> vdpa: assign svq descriptors a separate ASID when possible
> vdpa: factor out vhost_vdpa_last_dev
> vdpa: check map_thread_enabled before join maps thread
> vdpa: ref counting VhostVDPAShared
> vdpa: convert iova_tree to ref count based
> vdpa: add svq_switching and flush_map to header
> vdpa: indicate SVQ switching via flag
> vdpa: judge if map can be kept across reset
> vdpa: unregister listener on last dev cleanup
> vdpa: should avoid map flushing with persistent iotlb
> vdpa: avoid mapping flush across reset
> vdpa: vhost_vdpa_dma_batch_end_once rename
> vdpa: factor out vhost_vdpa_map_batch_begin
> vdpa: vhost_vdpa_dma_batch_begin_once rename
> vdpa: factor out vhost_vdpa_dma_batch_end
> vdpa: add asid to dma_batch_once API
> vdpa: return int for dma_batch_once API
> vdpa: add asid to all dma_batch call sites
> vdpa: support iotlb_batch_asid
> vdpa: expose API vhost_vdpa_dma_batch_once
> vdpa: batch map/unmap op per svq pair basis
> vdpa: batch map and unmap around cvq svq start/stop
> vdpa: factor out vhost_vdpa_net_get_nc_vdpa
> vdpa: batch multiple dma_unmap to a single call for vm stop
> vdpa: fix network breakage after cancelling migration
> vdpa: add vhost_vdpa_set_address_space_id trace
> vdpa: add vhost_vdpa_get_vring_base trace for svq mode
> vdpa: add vhost_vdpa_set_dev_vring_base trace for svq mode
> vdpa: add trace events for eval_flush
> vdpa: add trace events for vhost_vdpa_net_load_cmd
> vdpa: add trace event for vhost_vdpa_net_load_mq
>
> hw/virtio/trace-events | 9 +-
> hw/virtio/vhost-shadow-virtqueue.c | 35 ++-
> hw/virtio/vhost-vdpa.c | 156 +++++++---
> include/hw/virtio/vhost-vdpa.h | 16 +
> include/standard-headers/linux/vhost_types.h | 13 +
> linux-headers/linux/vhost.h | 9 +
> net/trace-events | 8 +
> net/vhost-vdpa.c | 434 ++++++++++++++++++++++-----
> 8 files changed, 558 insertions(+), 122 deletions(-)
>
> --
> 1.8.3.1
>