From: Peter Xu <peterx@redhat.com>
To: Yuan Liu <yuan1.liu@intel.com>
Cc: farosas@suse.de, qemu-devel@nongnu.org, hao.xiang@bytedance.com,
	bryan.zhang@bytedance.com, nanhai.zou@intel.com
Subject: Re: [PATCH v5 0/7] Live Migration With IAA
Date: Tue, 26 Mar 2024 16:30:00 -0400
Message-ID: <ZgMwSO_eRIgXZ24L@x1n>
In-Reply-To: <20240319164527.1873891-1-yuan1.liu@intel.com>

Hi, Yuan,

On Wed, Mar 20, 2024 at 12:45:20AM +0800, Yuan Liu wrote:
> 1. QPL will be used as an independent compression method like ZLIB and ZSTD,
>    QPL will force the use of the IAA accelerator and will not support software
>    compression. For a summary of issues compatible with Zlib, please refer to
>    docs/devel/migration/qpl-compression.rst

IIRC our previous discussion concluded that we should provide a software
fallback for the new QEMU paths, right?  Why did the decision change?  Again,
such a fallback can help us make sure qpl won't get broken easily by other
changes.

> 
> 2. Compression accelerator related patches are removed from this patch set and
>    will be added to the QAT patch set, we will submit separate patches to use
>    QAT to accelerate ZLIB and ZSTD.
> 
> 3. Advantages of using IAA accelerator include:
>    a. Compared with the non-compression method, it can improve downtime
>       performance without adding additional host resources (both CPU and
>       network).
>    b. Compared with using software compression methods (ZSTD/ZLIB), it can
>       provide high data compression ratio and save a lot of CPU resources
>       used for compression.
> 
> Test condition:
>   1. Host CPUs are based on Sapphire Rapids
>   2. VM type, 16 vCPU and 64G memory
>   3. The source and destination respectively use 4 IAA devices.
>   4. The workload in the VM
>     a. all vCPUs are idle state
>     b. 90% of the virtual machine's memory is used, use silesia to fill
>        the memory.
>        The introduction of silesia:
>        https://sun.aei.polsl.pl//~sdeor/index.php?page=silesia
>   5. Set "--mem-prealloc" boot parameter on the destination, this parameter
>      can make IAA performance better and related introduction is added here.
>      docs/devel/migration/qpl-compression.rst
>   6. Source migration configuration commands
>      a. migrate_set_capability multifd on
>      b. migrate_set_parameter multifd-channels 2/4/8
>      c. migrate_set_parameter downtime-limit 300
>      f. migrate_set_parameter max-bandwidth 100G/1G
>      d. migrate_set_parameter multifd-compression none/qpl/zstd
>   7. Destination migration configuration commands
>      a. migrate_set_capability multifd on
>      b. migrate_set_parameter multifd-channels 2/4/8
>      c. migrate_set_parameter multifd-compression none/qpl/zstd
> 
> Early migration result, each result is the average of three tests
> 
>  +--------+-------------+--------+--------+---------+----------+------|
>  |        | The number  |total   |downtime|network  |pages per | CPU  |
>  | None   | of channels |time(ms)|(ms)    |bandwidth|second    | Util |
>  | Comp   |             |        |        |(mbps)   |          |      |
>  |        +-------------+-----------------+---------+----------+------+
>  |Network |            2|    8571|      69|    58391|   1896525|  256%|

Is this the average bandwidth?  I'm surprised that you can hit ~59Gbps with
only 2 channels.  My previous experience is around ~1XGbps per channel, so
no more than 30Gbps for two channels.  Is it because of a faster processor?
Indeed, from the 4/8 results it doesn't look like increasing the number of
channels helped a lot, and the downtime even got worse.
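
To make the per-channel numbers concrete, here is a small sketch that divides
the reported bandwidth by the channel count for the three no-compression rows
at 100G, and cross-checks it against the "pages per second" column.  It
assumes 4 KiB guest pages and that "mbps" means 10^6 bits per second; neither
is stated in the table, so treat the output as a rough sanity check only.

```python
# Rows copied from the quoted "None Comp / BW:100G" table:
# (channels, network bandwidth in mbps, pages per second)
ROWS = [
    (2, 58391, 1896525),
    (4, 69736, 1865640),
    (8, 70562, 2174060),
]

PAGE_SIZE = 4096  # assumption: 4 KiB guest pages (not stated in the table)


def per_channel_mbps(channels, total_mbps):
    """Average bandwidth carried by each multifd channel."""
    return total_mbps / channels


def page_rate_mbps(pages_per_second, page_size=PAGE_SIZE):
    """Bandwidth implied by the page rate, assuming mbps = 10^6 bits/s."""
    return pages_per_second * page_size * 8 / 1e6


if __name__ == "__main__":
    for channels, mbps, pps in ROWS:
        print(
            f"{channels} channels: "
            f"{per_channel_mbps(channels, mbps):.0f} mbps per channel, "
            f"page rate implies {page_rate_mbps(pps):.0f} mbps"
        )
```

Under these assumptions the 2-channel row works out to roughly 29Gbps per
channel, which is what prompts the question above.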

What is the rationale behind the "downtime improvement" with the QPL
compressors?  IIUC in this 100Gbps case the bandwidth is never a
limitation, so I don't understand why adding the compression phase can
make the switchover faster.  I can expect many more pages to be sent in a
NIC-limited env like the one you described below with 1Gbps, but not when
the NIC has unlimited resources like here.
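
A rough way to see the puzzle, using a deliberately simplified model in which
the switchover time is just the remaining dirty pages divided by the page
rate.  The page rates and downtimes are the 2-channel, 100G figures from the
quoted tables; REMAINING_PAGES is a hypothetical value picked only for
illustration, and the model itself is an assumption, not how QEMU actually
computes downtime.

```python
def est_switchover_ms(remaining_pages, pages_per_second):
    """Time in ms to flush the remaining dirty pages at a given page rate."""
    return remaining_pages / pages_per_second * 1000.0


# Hypothetical number of dirty pages left at switchover (illustration only).
REMAINING_PAGES = 100_000

# "pages per second" for 2 channels at 100G, from the quoted tables.
RATES = {"none": 1896525, "qpl": 1732411}

# Measured downtime in ms for the same rows, from the quoted tables.
MEASURED_MS = {"none": 69, "qpl": 34}

if __name__ == "__main__":
    for name, rate in RATES.items():
        est = est_switchover_ms(REMAINING_PAGES, rate)
        print(
            f"{name}: model estimate {est:.1f} ms, "
            f"measured {MEASURED_MS[name]} ms"
        )
```

With the same amount of remaining data, this model predicts a slightly longer
switchover for qpl because its page rate is lower, while the measured
downtime goes the other way.  That gap is what the question above is about.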

>  |BW:100G +-------------+--------+--------+---------+----------+------+
>  |        |            4|    7180|      92|    69736|   1865640|  300%|
>  |        +-------------+--------+--------+---------+----------+------+
>  |        |            8|    7090|     121|    70562|   2174060|  307%|
>  +--------+-------------+--------+--------+---------+----------+------+
> 
>  +--------+-------------+--------+--------+---------+----------+------|
>  |        | The number  |total   |downtime|network  |pages per | CPU  |
>  | QPL    | of channels |time(ms)|(ms)    |bandwidth|second    | Util |
>  | Comp   |             |        |        |(mbps)   |          |      |
>  |        +-------------+-----------------+---------+----------+------+
>  |Network |            2|    8413|      34|    30067|   1732411|  230%|
>  |BW:100G +-------------+--------+--------+---------+----------+------+
>  |        |            4|    6559|      32|    38804|   1689954|  450%|
>  |        +-------------+--------+--------+---------+----------+------+
>  |        |            8|    6623|      37|    38745|   1566507|  790%|
>  +--------+-------------+--------+--------+---------+----------+------+
> 
>  +--------+-------------+--------+--------+---------+----------+------|
>  |        | The number  |total   |downtime|network  |pages per | CPU  |
>  | ZSTD   | of channels |time(ms)|(ms)    |bandwidth|second    | Util |
>  | Comp   |             |        |        |(mbps)   |          |      |
>  |        +-------------+-----------------+---------+----------+------+
>  |Network |            2|   95846|      24|     1800|    521829|  203%|
>  |BW:100G +-------------+--------+--------+---------+----------+------+
>  |        |            4|   49004|      24|     3529|    890532|  403%|
>  |        +-------------+--------+--------+---------+----------+------+
>  |        |            8|   25574|      32|     6782|   1762222|  800%|
>  +--------+-------------+--------+--------+---------+----------+------+
> 
> When network bandwidth resource is sufficient, QPL can improve downtime
> by 2x compared to no compression. In this scenario, with 4 channels, the
> IAA hardware resources are fully used, so adding more channels will not
> gain more benefits.
> 
>  
>  +--------+-------------+--------+--------+---------+----------+------|
>  |        | The number  |total   |downtime|network  |pages per | CPU  |
>  | None   | of channels |time(ms)|(ms)    |bandwidth|second    | Util |
>  | Comp   |             |        |        |(mbps)   |          |      |
>  |        +-------------+-----------------+---------+----------+------+
>  |Network |            2|   57758|      66|     8643|    264617|   34%|
>  |BW:  1G +-------------+--------+--------+---------+----------+------+
>  |        |            4|   57216|      58|     8726|    266773|   34%|
>  |        +-------------+--------+--------+---------+----------+------+
>  |        |            8|   56708|      53|     8804|    270223|   33%|
>  +--------+-------------+--------+--------+---------+----------+------+
> 
>  +--------+-------------+--------+--------+---------+----------+------|
>  |        | The number  |total   |downtime|network  |pages per | CPU  |
>  | QPL    | of channels |time(ms)|(ms)    |bandwidth|second    | Util |
>  | Comp   |             |        |        |(mbps)   |          |      |
>  |        +-------------+-----------------+---------+----------+------+
>  |Network |            2|   30129|      34|     8345|   2224761|   54%|
>  |BW:  1G +-------------+--------+--------+---------+----------+------+
>  |        |            4|   30317|      39|     8300|   2025220|   73%|
>  |        +-------------+--------+--------+---------+----------+------+
>  |        |            8|   29615|      35|     8514|   2250122|  131%|
>  +--------+-------------+--------+--------+---------+----------+------+
> 
>  +--------+-------------+--------+--------+---------+----------+------|
>  |        | The number  |total   |downtime|network  |pages per | CPU  |
>  | ZSTD   | of channels |time(ms)|(ms)    |bandwidth|second    | Util |
>  | Comp   |             |        |        |(mbps)   |          |      |
>  |        +-------------+-----------------+---------+----------+------+
>  |Network |            2|   95750|      24|     1802|    477236|  202%|
>  |BW:  1G +-------------+--------+--------+---------+----------+------+
>  |        |            4|   48907|      24|     3536|   1002142|  404%|
>  |        +-------------+--------+--------+---------+----------+------+
>  |        |            8|   25568|      32|     6783|   1696437|  800%|
>  +--------+-------------+--------+--------+---------+----------+------+
> 
> When network bandwidth resource is limited, the "page perf second" metric
> decreases for none compression, the success rate of migration will reduce.
> Comparison of QPL and ZSTD compression methods, QPL can save a lot of CPU
> resources used for compression.
> 
> v2:
>   - add support for multifd compression accelerator
>   - add support for the QPL accelerator in the multifd
>     compression accelerator
>   - fixed the issue that QPL was compiled into the migration
>     module by default
> 
> v3:
>   - use Meson instead of pkg-config to resolve QPL build
>     dependency issue
>   - fix coding style
>   - fix a CI issue for get_multifd_ops function in multifd.c file
> 
> v4:
>   - patch based on commit: da96ad4a6a Merge tag 'hw-misc-20240215' of
>     https://github.com/philmd/qemu into staging
>   - remove the compression accelerator implementation patches, the patches
>     will be placed in the QAT accelerator implementation.
>   - introduce QPL as a new compression method
>   - add QPL compression documentation
>   - add QPL compression migration test
>   - fix zlib/zstd compression level issue
> 
> v5:
>   - patch based on v9.0.0-rc0 (c62d54d0a8)
>   - use pkgconfig to check libaccel-config, libaccel-config is already
>     in many distributions.
>   - initialize the IOV of the sender by the specific compression method
>   - refine the coding style
>   - remove the zlib/zstd compression level not working patch, the issue
>     has been solved
> 
> Yuan Liu (7):
>   docs/migration: add qpl compression feature
>   migration/multifd: put IOV initialization into compression method
>   configure: add --enable-qpl build option
>   migration/multifd: add qpl compression method
>   migration/multifd: implement initialization of qpl compression
>   migration/multifd: implement qpl compression and decompression
>   tests/migration-test: add qpl compression test
> 
>  docs/devel/migration/features.rst        |   1 +
>  docs/devel/migration/qpl-compression.rst | 231 +++++++++++
>  hw/core/qdev-properties-system.c         |   2 +-
>  meson.build                              |  16 +
>  meson_options.txt                        |   2 +
>  migration/meson.build                    |   1 +
>  migration/multifd-qpl.c                  | 482 +++++++++++++++++++++++
>  migration/multifd-zlib.c                 |   4 +
>  migration/multifd-zstd.c                 |   6 +-
>  migration/multifd.c                      |   8 +-
>  migration/multifd.h                      |   1 +
>  qapi/migration.json                      |   7 +-
>  scripts/meson-buildoptions.sh            |   3 +
>  tests/qtest/migration-test.c             |  24 ++
>  14 files changed, 782 insertions(+), 6 deletions(-)
>  create mode 100644 docs/devel/migration/qpl-compression.rst
>  create mode 100644 migration/multifd-qpl.c
> 
> -- 
> 2.39.3
> 

-- 
Peter Xu



