qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Jason Wang <jasowang@redhat.com>
To: Si-Wei Liu <si-wei.liu@oracle.com>
Cc: eperezma@redhat.com, mst@redhat.com, dtatulea@nvidia.com,
	 leiyang@redhat.com, yin31149@gmail.com,
	boris.ostrovsky@oracle.com,  jonah.palmer@oracle.com,
	qemu-devel@nongnu.org
Subject: Re: [PATCH 00/40] vdpa-net: improve migration downtime through descriptor ASID and persistent IOTLB
Date: Thu, 11 Jan 2024 16:21:39 +0800	[thread overview]
Message-ID: <CACGkMEvPpbCaAt_7PiKnTEEvB+YRVoscpVNaE-sVqZ1EKyRYdw@mail.gmail.com> (raw)
In-Reply-To: <1701970793-6865-1-git-send-email-si-wei.liu@oracle.com>

On Fri, Dec 8, 2023 at 2:50 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
> This patch series contain several enhancements to SVQ live migration downtime
> for vDPA-net hardware device, specifically on mlx5_vdpa. Currently it is based
> off of Eugenio's RFC v2 .load_setup series [1] to utilize the shared facility
> and reduce frictions in merging or duplicating code if at all possible.
>
> It's stacked up in particular order as below, as the optimization for one on
> the top has to depend on others on the bottom. Here's a breakdown for what
> each part does respectively:
>
> Patch #  |          Feature / optimization
> ---------V-------------------------------------------------------------------
> 35 - 40  | trace events
> 34       | migrate_cancel bug fix
> 21 - 33  | (Un)map batching at stop-n-copy to further optimize LM down time
> 11 - 20  | persistent IOTLB [3] to improve LM down time
> 02 - 10  | SVQ descriptor ASID [2] to optimize SVQ switching
> 01       | dependent linux headers
>          V
>
> Let's first define 2 sources of downtime that this work is concerned with:
>
> * SVQ switching downtime (Downtime #1): downtime at the start of migration.
>   Time spent on teardown and setup for SVQ mode switching, and this downtime
>   is regarded as the maxium time for an individual vdpa-net device.
>   No memory transfer is involved during SVQ switching, hence no .
>
> * LM downtime (Downtime #2): aggregated downtime for all vdpa-net devices on
>   resource teardown and setup in the last stop-n-copy phase on source host.
>
> With each part of the optimizations applied bottom up, the effective outcome
> in terms of down time (in seconds) performance can be observed in this table:
>
>
>                     |    Downtime #1    |    Downtime #2
> --------------------+-------------------+-------------------
> Baseline QEMU       |     20s ~ 30s     |        20s
>                     |                   |
> Iterative map       |                   |
> at destination[1]   |        5s         |        20s
>                     |                   |
> SVQ descriptor      |                   |
>     ASID [2]        |        2s         |         5s
>                     |                   |
>                     |                   |
> persistent IOTLB    |        2s         |         2s
>       [3]           |                   |
>                     |                   |
> (Un)map batching    |                   |
> at stop-n-copy      |      1.7s         |       1.5s
> before switchover   |                   |
>
> (VM config: 128GB mem, 2 mlx5_vdpa devices, each w/ 4 data vqs)

This looks promising!

But the series looks a little bit huge, can we split them into 2 or 3 series?

It helps to speed up the reviewing and merging.

Thanks

>
> Please find the details regarding each enhancement on the commit log.
>
> Thanks,
> -Siwei
>
>
> [1] [RFC PATCH v2 00/10] Map memory at destination .load_setup in vDPA-net migration
> https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg05711.html
> [2] VHOST_BACKEND_F_DESC_ASID
> https://lore.kernel.org/virtualization/20231018171456.1624030-2-dtatulea@nvidia.com/
> [3] VHOST_BACKEND_F_IOTLB_PERSIST
> https://lore.kernel.org/virtualization/1698304480-18463-1-git-send-email-si-wei.liu@oracle.com/
>
> ---
>
> Si-Wei Liu (40):
>   linux-headers: add vhost_types.h and vhost.h
>   vdpa: add vhost_vdpa_get_vring_desc_group
>   vdpa: probe descriptor group index for data vqs
>   vdpa: piggyback desc_group index when probing isolated cvq
>   vdpa: populate desc_group from net_vhost_vdpa_init
>   vhost: make svq work with gpa without iova translation
>   vdpa: move around vhost_vdpa_set_address_space_id
>   vdpa: add back vhost_vdpa_net_first_nc_vdpa
>   vdpa: no repeat setting shadow_data
>   vdpa: assign svq descriptors a separate ASID when possible
>   vdpa: factor out vhost_vdpa_last_dev
>   vdpa: check map_thread_enabled before join maps thread
>   vdpa: ref counting VhostVDPAShared
>   vdpa: convert iova_tree to ref count based
>   vdpa: add svq_switching and flush_map to header
>   vdpa: indicate SVQ switching via flag
>   vdpa: judge if map can be kept across reset
>   vdpa: unregister listener on last dev cleanup
>   vdpa: should avoid map flushing with persistent iotlb
>   vdpa: avoid mapping flush across reset
>   vdpa: vhost_vdpa_dma_batch_end_once rename
>   vdpa: factor out vhost_vdpa_map_batch_begin
>   vdpa: vhost_vdpa_dma_batch_begin_once rename
>   vdpa: factor out vhost_vdpa_dma_batch_end
>   vdpa: add asid to dma_batch_once API
>   vdpa: return int for dma_batch_once API
>   vdpa: add asid to all dma_batch call sites
>   vdpa: support iotlb_batch_asid
>   vdpa: expose API vhost_vdpa_dma_batch_once
>   vdpa: batch map/unmap op per svq pair basis
>   vdpa: batch map and unmap around cvq svq start/stop
>   vdpa: factor out vhost_vdpa_net_get_nc_vdpa
>   vdpa: batch multiple dma_unmap to a single call for vm stop
>   vdpa: fix network breakage after cancelling migration
>   vdpa: add vhost_vdpa_set_address_space_id trace
>   vdpa: add vhost_vdpa_get_vring_base trace for svq mode
>   vdpa: add vhost_vdpa_set_dev_vring_base trace for svq mode
>   vdpa: add trace events for eval_flush
>   vdpa: add trace events for vhost_vdpa_net_load_cmd
>   vdpa: add trace event for vhost_vdpa_net_load_mq
>
>  hw/virtio/trace-events                       |   9 +-
>  hw/virtio/vhost-shadow-virtqueue.c           |  35 ++-
>  hw/virtio/vhost-vdpa.c                       | 156 +++++++---
>  include/hw/virtio/vhost-vdpa.h               |  16 +
>  include/standard-headers/linux/vhost_types.h |  13 +
>  linux-headers/linux/vhost.h                  |   9 +
>  net/trace-events                             |   8 +
>  net/vhost-vdpa.c                             | 434 ++++++++++++++++++++++-----
>  8 files changed, 558 insertions(+), 122 deletions(-)
>
> --
> 1.8.3.1
>



      parent reply	other threads:[~2024-01-11  8:22 UTC|newest]

Thread overview: 102+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-07 17:39 [PATCH 00/40] vdpa-net: improve migration downtime through descriptor ASID and persistent IOTLB Si-Wei Liu
2023-12-07 17:39 ` [PATCH 01/40] linux-headers: add vhost_types.h and vhost.h Si-Wei Liu
2023-12-11  7:47   ` Eugenio Perez Martin
2024-01-11  3:32   ` Jason Wang
2023-12-07 17:39 ` [PATCH 02/40] vdpa: add vhost_vdpa_get_vring_desc_group Si-Wei Liu
2024-01-11  3:51   ` Jason Wang
2023-12-07 17:39 ` [PATCH 03/40] vdpa: probe descriptor group index for data vqs Si-Wei Liu
2023-12-11 18:49   ` Eugenio Perez Martin
2024-01-11  4:02   ` Jason Wang
2023-12-07 17:39 ` [PATCH 04/40] vdpa: piggyback desc_group index when probing isolated cvq Si-Wei Liu
2024-01-11  7:06   ` Jason Wang
2023-12-07 17:39 ` [PATCH 05/40] vdpa: populate desc_group from net_vhost_vdpa_init Si-Wei Liu
2023-12-11 10:46   ` Eugenio Perez Martin
2023-12-11 11:01     ` Eugenio Perez Martin
2024-01-11  7:09   ` Jason Wang
2023-12-07 17:39 ` [PATCH 06/40] vhost: make svq work with gpa without iova translation Si-Wei Liu
2023-12-11 11:17   ` Eugenio Perez Martin
2024-01-11  7:31   ` Jason Wang
2023-12-07 17:39 ` [PATCH 07/40] vdpa: move around vhost_vdpa_set_address_space_id Si-Wei Liu
2023-12-11 11:18   ` Eugenio Perez Martin
2024-01-11  7:33   ` Jason Wang
2023-12-07 17:39 ` [PATCH 08/40] vdpa: add back vhost_vdpa_net_first_nc_vdpa Si-Wei Liu
2023-12-11 11:19   ` Eugenio Perez Martin
2024-01-11  7:37   ` Jason Wang
2023-12-07 17:39 ` [PATCH 09/40] vdpa: no repeat setting shadow_data Si-Wei Liu
2023-12-11 11:21   ` Eugenio Perez Martin
2024-01-11  7:34   ` Jason Wang
2023-12-07 17:39 ` [PATCH 10/40] vdpa: assign svq descriptors a separate ASID when possible Si-Wei Liu
2023-12-11 13:35   ` Eugenio Perez Martin
2024-01-11  8:02   ` Jason Wang
2023-12-07 17:39 ` [PATCH 11/40] vdpa: factor out vhost_vdpa_last_dev Si-Wei Liu
2023-12-11 13:36   ` Eugenio Perez Martin
2024-01-11  8:03   ` Jason Wang
2023-12-07 17:39 ` [PATCH 12/40] vdpa: check map_thread_enabled before join maps thread Si-Wei Liu
2023-12-07 17:39 ` [PATCH 13/40] vdpa: ref counting VhostVDPAShared Si-Wei Liu
2024-01-11  8:12   ` Jason Wang
2023-12-07 17:39 ` [PATCH 14/40] vdpa: convert iova_tree to ref count based Si-Wei Liu
2023-12-11 17:21   ` Eugenio Perez Martin
2024-01-11  8:15   ` Jason Wang
2023-12-07 17:39 ` [PATCH 15/40] vdpa: add svq_switching and flush_map to header Si-Wei Liu
2024-01-11  8:16   ` Jason Wang
2023-12-07 17:39 ` [PATCH 16/40] vdpa: indicate SVQ switching via flag Si-Wei Liu
2024-01-11  8:17   ` Jason Wang
2023-12-07 17:39 ` [PATCH 17/40] vdpa: judge if map can be kept across reset Si-Wei Liu
2023-12-13  9:51   ` Eugenio Perez Martin
2024-01-11  8:24   ` Jason Wang
2023-12-07 17:39 ` [PATCH 18/40] vdpa: unregister listener on last dev cleanup Si-Wei Liu
2023-12-11 17:37   ` Eugenio Perez Martin
2024-01-11  8:26   ` Jason Wang
2023-12-07 17:39 ` [PATCH 19/40] vdpa: should avoid map flushing with persistent iotlb Si-Wei Liu
2024-01-11  8:28   ` Jason Wang
2023-12-07 17:39 ` [PATCH 20/40] vdpa: avoid mapping flush across reset Si-Wei Liu
2024-01-11  8:30   ` Jason Wang
2023-12-07 17:39 ` [PATCH 21/40] vdpa: vhost_vdpa_dma_batch_end_once rename Si-Wei Liu
2024-01-15  2:40   ` Jason Wang
2024-01-15  2:52     ` Jason Wang
2023-12-07 17:39 ` [PATCH 22/40] vdpa: factor out vhost_vdpa_map_batch_begin Si-Wei Liu
2024-01-15  3:02   ` Jason Wang
2023-12-07 17:39 ` [PATCH 23/40] vdpa: vhost_vdpa_dma_batch_begin_once rename Si-Wei Liu
2024-01-15  3:03   ` Jason Wang
2023-12-07 17:39 ` [PATCH 24/40] vdpa: factor out vhost_vdpa_dma_batch_end Si-Wei Liu
2024-01-15  3:05   ` Jason Wang
2023-12-07 17:39 ` [PATCH 25/40] vdpa: add asid to dma_batch_once API Si-Wei Liu
2023-12-13 15:42   ` Eugenio Perez Martin
2024-01-15  3:07   ` Jason Wang
2023-12-07 17:39 ` [PATCH 26/40] vdpa: return int for " Si-Wei Liu
2023-12-07 17:39 ` [PATCH 27/40] vdpa: add asid to all dma_batch call sites Si-Wei Liu
2023-12-07 17:39 ` [PATCH 28/40] vdpa: support iotlb_batch_asid Si-Wei Liu
2023-12-13 15:42   ` Eugenio Perez Martin
2024-01-15  3:19   ` Jason Wang
2023-12-07 17:39 ` [PATCH 29/40] vdpa: expose API vhost_vdpa_dma_batch_once Si-Wei Liu
2023-12-13 15:42   ` Eugenio Perez Martin
2024-01-15  3:32   ` Jason Wang
2023-12-07 17:39 ` [PATCH 30/40] vdpa: batch map/unmap op per svq pair basis Si-Wei Liu
2024-01-15  3:33   ` Jason Wang
2023-12-07 17:39 ` [PATCH 31/40] vdpa: batch map and unmap around cvq svq start/stop Si-Wei Liu
2024-01-15  3:34   ` Jason Wang
2023-12-07 17:39 ` [PATCH 32/40] vdpa: factor out vhost_vdpa_net_get_nc_vdpa Si-Wei Liu
2024-01-15  3:35   ` Jason Wang
2023-12-07 17:39 ` [PATCH 33/40] vdpa: batch multiple dma_unmap to a single call for vm stop Si-Wei Liu
2023-12-13 16:46   ` Eugenio Perez Martin
2024-01-15  3:47   ` Jason Wang
2023-12-07 17:39 ` [PATCH 34/40] vdpa: fix network breakage after cancelling migration Si-Wei Liu
2024-01-15  3:48   ` Jason Wang
2023-12-07 17:39 ` [PATCH 35/40] vdpa: add vhost_vdpa_set_address_space_id trace Si-Wei Liu
2023-12-11 18:13   ` Eugenio Perez Martin
2024-01-15  3:50   ` Jason Wang
2023-12-07 17:39 ` [PATCH 36/40] vdpa: add vhost_vdpa_get_vring_base trace for svq mode Si-Wei Liu
2023-12-11 18:14   ` Eugenio Perez Martin
2024-01-15  3:52   ` Jason Wang
2023-12-07 17:39 ` [PATCH 37/40] vdpa: add vhost_vdpa_set_dev_vring_base " Si-Wei Liu
2023-12-11 18:14   ` Eugenio Perez Martin
2024-01-15  3:53   ` Jason Wang
2023-12-07 17:39 ` [PATCH 38/40] vdpa: add trace events for eval_flush Si-Wei Liu
2024-01-15  3:57   ` Jason Wang
2023-12-07 17:39 ` [PATCH 39/40] vdpa: add trace events for vhost_vdpa_net_load_cmd Si-Wei Liu
2023-12-11 18:14   ` Eugenio Perez Martin
2023-12-07 17:39 ` [PATCH 40/40] vdpa: add trace event for vhost_vdpa_net_load_mq Si-Wei Liu
2023-12-11 18:15   ` Eugenio Perez Martin
2024-01-15  3:58   ` Jason Wang
2023-12-11 18:39 ` [PATCH 00/40] vdpa-net: improve migration downtime through descriptor ASID and persistent IOTLB Eugenio Perez Martin
2024-01-11  8:21 ` Jason Wang [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACGkMEvPpbCaAt_7PiKnTEEvB+YRVoscpVNaE-sVqZ1EKyRYdw@mail.gmail.com \
    --to=jasowang@redhat.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=dtatulea@nvidia.com \
    --cc=eperezma@redhat.com \
    --cc=jonah.palmer@oracle.com \
    --cc=leiyang@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=si-wei.liu@oracle.com \
    --cc=yin31149@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).