qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] *** Implement using Intel QAT to offload ZLIB
@ 2023-12-31 20:57 Bryan Zhang
  2023-12-31 20:58 ` [PATCH 1/5] meson: Introduce 'qatzip' feature to the build system Bryan Zhang
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Bryan Zhang @ 2023-12-31 20:57 UTC (permalink / raw)
  To: qemu-devel, farosas, marcandre.lureau, peterx, quintela,
	peter.maydell, hao.xiang
  Cc: bryan.zhang

* Overview:

This patchset implements using Intel's QAT accelerator to offload ZLIB compression and decompression in the multifd live migration path.

* Background:

Intel's 4th generation Xeon processors support Intel's QuickAssist Technology (QAT), a hardware accelerator for cryptography and compression operations.

Intel has also released a software library, QATzip, that interacts with QAT and exposes an API for QAT-accelerated ZLIB compression and decompression.

This patchset introduces a new multifd compression method, `qatzip`, which uses QATzip to perform ZLIB compression and decompression.

* Implementation:

The bulk of this patchset is in `migration/multifd-qatzip.c`, which mirrors the other compression implementation files, `migration/multifd-zlib.c` and `migration/multifd-zstd.c`, by providing an implementation of the multifd send/recv methods using the API exposed by QATzip. This is fairly straightforward, as the multifd setup/prepare/teardown methods align closely with QATzip's methods for initialization/(de)compression/teardown.

The only major divergence from the other compression methods is that we use a non-streaming compression/decompression API, as opposed to streaming each page to the compression layer one at a time. This does not require any major code changes - by the time we want to call to the compression layer, we already have a batch of pages, so it is easy to copy them into a contiguous buffer. This decision is purely performance-based, as our initial QAT benchmark testing showed that QATzip's non-streaming API outperformed the streaming API.

* Performance:

** Setup:

We use two Intel 4th generation Xeon servers for testing.

Architecture:        x86_64
CPU(s):              192
Thread(s) per core:  2
Core(s) per socket:  48
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               143
Model name:          Intel(R) Xeon(R) Platinum 8457C
Stepping:            8
CPU MHz:             2538.624
CPU max MHz:         3800.0000
CPU min MHz:         800.0000

Each server has two QAT devices, and the network bandwidth between the two servers is 1Gbps.

We perform multifd live migration over TCP using a VM with 64GB memory. We prepared the machine's memory by powering it on, allocating a large amount of memory (63GB) as a single buffer, and filling the buffer with the repeated contents of the Silesia corpus[0]. This is in lieu of a more realistic memory snapshot, which proved troublesome to acquire.

We analyzed CPU usage by averaging the output of `top` every second during live migration. This is admittedly imprecise, but we feel that it accurately portrays the different degrees of CPU usage of varying compression methods.

We present the latency, throughput, and CPU usage results for all of the compression methods, with varying numbers of multifd threads (4, 8, and 16).

[0] The Silesia corpus can be accessed here: https://sun.aei.polsl.pl//~sdeor/index.php?page=silesia

** Results:

4 multifd threads:

    |---------------|---------------|----------------|---------|---------|
    |method         |time(sec)      |throughput(mbps)|send cpu%|recv cpu%|
    |---------------|---------------|----------------|---------|---------|
    |qatzip         |111.256        |916.03          | 29.08%  | 51.90   |
    |---------------|---------------|----------------|---------|---------|
    |zlib           |193.033        |562.16          |297.36%  |237.84   |
    |---------------|---------------|----------------|---------|---------|
    |zstd           |112.449        |920.67          |234.39%  |157.57   |
    |---------------|---------------|----------------|---------|---------|
    |none           |327.014        |933.41          |  9.50%  | 25.28   |
    |---------------|---------------|----------------|---------|---------|

8 multifd threads:

    |---------------|---------------|----------------|---------|---------|
    |method         |time(sec)      |throughput(mbps)|send cpu%|recv cpu%|
    |---------------|---------------|----------------|---------|---------|
    |qatzip         |111.349        |915.20          | 29.13%  | 59.63   |
    |---------------|---------------|----------------|---------|---------|
    |zlib           |149.378        |726.64          |516.24%  |400.46   |
    |---------------|---------------|----------------|---------|---------|
    |zstd           |111.942        |925.85          |345.75%  |170.74   |
    |---------------|---------------|----------------|---------|---------|
    |none           |327.417        |933.34          |  8.38%  | 27.72   |
    |---------------|---------------|----------------|---------|---------|

16 multifd threads:

    |---------------|---------------|----------------|---------|---------|
    |method         |time(sec)      |throughput(mbps)|send cpu%|recv cpu%|
    |---------------|---------------|----------------|---------|---------|
    |qatzip         |112.035        |908.96          | 29.93%  | 63.83%  |
    |---------------|---------------|----------------|---------|---------|
    |zlib           |118.730        |912.94          |914.14%  |621.59%  |
    |---------------|---------------|----------------|---------|---------|
    |zstd           |112.167        |924.78          |384.81%  |171.54%  |
    |---------------|---------------|----------------|---------|---------|
    |none           |327.728        |932.08          |  9.31%  | 29.89%  |
    |---------------|---------------|----------------|---------|---------|

** Observations:

Latency: In our test setting, live migration is mostly network-constrained, so compression performs relatively well in general. `qatzip` particularly shows a significant improvement over `zlib` with limited threads. With 4 multifd threads, `qatzip` shows a ~42% decrease in latency over `zlib`. In all scenarios, `qatzip` shows comparable performance with `zstd`.

Throughput: In all scenarios, nearly all compression methods reach nearly the entire network throughput of 1Gbps except for `zlib`, which appears to be CPU-bound with 4 and 8 threads, but reaches comparable throughput performance with the other methods at 16 threads.

CPU usage: In all scenarios, `qatzip` consumes a fraction of the CPU usage that `zlib` and `zstd` use. In the most limited case, with 4 multifd threads, `qatzip`'s sender CPU usage is ~10% that of `zlib`, and ~12% that of `zstd`, and its receiver CPU usage is ~22% that of `zlib`, and ~33% that of `zstd`. The magnitude of these savings increases as we increase to 8 and 16 threads.

* Future work:

- Comparing QAT offloading against other compression methods in environments that are not as network-constrained.
- Combining compression offloading with offloading using other Intel accelerators (e.g. using Intel's Data Streaming Accelerator to offload zero page checking, which is part of another related patchset currently under discussion, and to offload `memcpy()` operations on the receiver side).
- Reworking multifd logic to pipeline live migration work to improve device saturation.

* Testing:

This patchset adds an integration test for the new `qatzip` multifd compression method.

* Patchset:

This patchset was generated on top of commit 7425b627.

Bryan Zhang (5):
  meson: Introduce 'qatzip' feature to the build system.
  migration: Add compression level parameter for QATzip
  migration: Introduce unimplemented 'qatzip' compression method
  migration: Implement 'qatzip' methods using QAT
  migration: Add integration test for 'qatzip' compression method

 hw/core/qdev-properties-system.c |   6 +-
 meson.build                      |  10 +
 meson_options.txt                |   2 +
 migration/meson.build            |   1 +
 migration/migration-hmp-cmds.c   |   4 +
 migration/multifd-qatzip.c       | 369 +++++++++++++++++++++++++++++++
 migration/multifd.h              |   1 +
 migration/options.c              |  27 +++
 migration/options.h              |   1 +
 qapi/migration.json              |  24 +-
 scripts/meson-buildoptions.sh    |   3 +
 tests/qtest/meson.build          |   4 +
 tests/qtest/migration-test.c     |  37 ++++
 13 files changed, 486 insertions(+), 3 deletions(-)
 create mode 100644 migration/multifd-qatzip.c

-- 
2.30.2



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/5] meson: Introduce 'qatzip' feature to the build system.
  2023-12-31 20:57 [PATCH 0/5] *** Implement using Intel QAT to offload ZLIB Bryan Zhang
@ 2023-12-31 20:58 ` Bryan Zhang
  2023-12-31 20:58 ` [PATCH 2/5] migration: Add compression level parameter for QATzip Bryan Zhang
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: Bryan Zhang @ 2023-12-31 20:58 UTC (permalink / raw)
  To: qemu-devel, farosas, marcandre.lureau, peterx, quintela,
	peter.maydell, hao.xiang
  Cc: bryan.zhang

Add a 'qatzip' feature, which is automatically disabled, and which
depends on the QATzip library if enabled.

Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
---
 meson.build                   | 10 ++++++++++
 meson_options.txt             |  2 ++
 scripts/meson-buildoptions.sh |  3 +++
 3 files changed, 15 insertions(+)

diff --git a/meson.build b/meson.build
index 6c77d9687d..99050b1109 100644
--- a/meson.build
+++ b/meson.build
@@ -1041,6 +1041,14 @@ if not get_option('zstd').auto() or have_block
                     required: get_option('zstd'),
                     method: 'pkg-config')
 endif
+
+qatzip = not_found
+if get_option('qatzip').enabled()
+  qatzip = dependency('qatzip', version: '>=1.1.2',
+                      required: get_option('qatzip'),
+                      method: 'pkg-config')
+endif
+
 virgl = not_found
 
 have_vhost_user_gpu = have_tools and targetos == 'linux' and pixman.found()
@@ -2208,6 +2216,7 @@ config_host_data.set('CONFIG_MALLOC_TRIM', has_malloc_trim)
 config_host_data.set('CONFIG_STATX', has_statx)
 config_host_data.set('CONFIG_STATX_MNT_ID', has_statx_mnt_id)
 config_host_data.set('CONFIG_ZSTD', zstd.found())
+config_host_data.set('CONFIG_QATZIP', qatzip.found())
 config_host_data.set('CONFIG_FUSE', fuse.found())
 config_host_data.set('CONFIG_FUSE_LSEEK', fuse_lseek.found())
 config_host_data.set('CONFIG_SPICE_PROTOCOL', spice_protocol.found())
@@ -4379,6 +4388,7 @@ summary_info += {'snappy support':    snappy}
 summary_info += {'bzip2 support':     libbzip2}
 summary_info += {'lzfse support':     liblzfse}
 summary_info += {'zstd support':      zstd}
+summary_info += {'QATzip support':    qatzip}
 summary_info += {'NUMA host support': numa}
 summary_info += {'capstone':          capstone}
 summary_info += {'libpmem support':   libpmem}
diff --git a/meson_options.txt b/meson_options.txt
index c9baeda639..0f3b380f82 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -259,6 +259,8 @@ option('xkbcommon', type : 'feature', value : 'auto',
        description: 'xkbcommon support')
 option('zstd', type : 'feature', value : 'auto',
        description: 'zstd compression support')
+option('qatzip', type: 'feature', value: 'disabled',
+       description: 'QATzip compression support')
 option('fuse', type: 'feature', value: 'auto',
        description: 'FUSE block device export')
 option('fuse_lseek', type : 'feature', value : 'auto',
diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
index 680fa3f581..1afd373606 100644
--- a/scripts/meson-buildoptions.sh
+++ b/scripts/meson-buildoptions.sh
@@ -164,6 +164,7 @@ meson_options_help() {
   printf "%s\n" '  plugins         TCG plugins via shared library loading'
   printf "%s\n" '  png             PNG support with libpng'
   printf "%s\n" '  pvrdma          Enable PVRDMA support'
+  printf "%s\n" '  qatzip          QATzip compression support'
   printf "%s\n" '  qcow1           qcow1 image format support'
   printf "%s\n" '  qed             qed image format support'
   printf "%s\n" '  qga-vss         build QGA VSS support (broken with MinGW)'
@@ -430,6 +431,8 @@ _meson_option_parse() {
     --prefix=*) quote_sh "-Dprefix=$2" ;;
     --enable-pvrdma) printf "%s" -Dpvrdma=enabled ;;
     --disable-pvrdma) printf "%s" -Dpvrdma=disabled ;;
+    --enable-qatzip) printf "%s" -Dqatzip=enabled ;;
+    --disable-qatzip) printf "%s" -Dqatzip=disabled ;;
     --enable-qcow1) printf "%s" -Dqcow1=enabled ;;
     --disable-qcow1) printf "%s" -Dqcow1=disabled ;;
     --enable-qed) printf "%s" -Dqed=enabled ;;
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/5] migration: Add compression level parameter for QATzip
  2023-12-31 20:57 [PATCH 0/5] *** Implement using Intel QAT to offload ZLIB Bryan Zhang
  2023-12-31 20:58 ` [PATCH 1/5] meson: Introduce 'qatzip' feature to the build system Bryan Zhang
@ 2023-12-31 20:58 ` Bryan Zhang
  2023-12-31 20:58 ` [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method Bryan Zhang
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: Bryan Zhang @ 2023-12-31 20:58 UTC (permalink / raw)
  To: qemu-devel, farosas, marcandre.lureau, peterx, quintela,
	peter.maydell, hao.xiang
  Cc: bryan.zhang

Adds support for a parameter to specify QATzip compression level. This
is a preparatory commit for a subsequent commit that will actually use
QATzip compression.

Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
---
 migration/migration-hmp-cmds.c |  4 ++++
 migration/options.c            | 27 +++++++++++++++++++++++++++
 migration/options.h            |  1 +
 qapi/migration.json            | 19 ++++++++++++++++++-
 4 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 99710c8ffb..c3a8f1888d 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -628,6 +628,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
         p->has_multifd_zlib_level = true;
         visit_type_uint8(v, param, &p->multifd_zlib_level, &err);
         break;
+    case MIGRATION_PARAMETER_MULTIFD_QATZIP_LEVEL:
+        p->has_multifd_qatzip_level = true;
+        visit_type_uint8(v, param, &p->multifd_qatzip_level, &err);
+        break;
     case MIGRATION_PARAMETER_MULTIFD_ZSTD_LEVEL:
         p->has_multifd_zstd_level = true;
         visit_type_uint8(v, param, &p->multifd_zstd_level, &err);
diff --git a/migration/options.c b/migration/options.c
index 8d8ec73ad9..4a931effae 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -62,6 +62,12 @@
 #define DEFAULT_MIGRATE_MULTIFD_COMPRESSION MULTIFD_COMPRESSION_NONE
 /* 0: means nocompress, 1: best speed, ... 9: best compress ratio */
 #define DEFAULT_MIGRATE_MULTIFD_ZLIB_LEVEL 1
+/*
+ * 1: best speed, ... 9: best compress ratio
+ * There is some nuance here. Refer to QATzip documentation to understand
+ * the mapping of QATzip levels to standard deflate levels.
+ */
+#define DEFAULT_MIGRATE_MULTIFD_QATZIP_LEVEL 1
 /* 0: means nocompress, 1: best speed, ... 20: best compress ratio */
 #define DEFAULT_MIGRATE_MULTIFD_ZSTD_LEVEL 1
 
@@ -143,6 +149,9 @@ Property migration_properties[] = {
     DEFINE_PROP_UINT8("multifd-zlib-level", MigrationState,
                       parameters.multifd_zlib_level,
                       DEFAULT_MIGRATE_MULTIFD_ZLIB_LEVEL),
+    DEFINE_PROP_UINT8("multifd-qatzip-level", MigrationState,
+                      parameters.multifd_qatzip_level,
+                      DEFAULT_MIGRATE_MULTIFD_QATZIP_LEVEL),
     DEFINE_PROP_UINT8("multifd-zstd-level", MigrationState,
                       parameters.multifd_zstd_level,
                       DEFAULT_MIGRATE_MULTIFD_ZSTD_LEVEL),
@@ -859,6 +868,13 @@ int migrate_multifd_zlib_level(void)
     return s->parameters.multifd_zlib_level;
 }
 
+int migrate_multifd_qatzip_level(void)
+{
+    MigrationState *s = migrate_get_current();
+
+    return s->parameters.multifd_qatzip_level;
+}
+
 int migrate_multifd_zstd_level(void)
 {
     MigrationState *s = migrate_get_current();
@@ -981,6 +997,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
     params->multifd_compression = s->parameters.multifd_compression;
     params->has_multifd_zlib_level = true;
     params->multifd_zlib_level = s->parameters.multifd_zlib_level;
+    params->has_multifd_qatzip_level = true;
+    params->multifd_qatzip_level = s->parameters.multifd_qatzip_level;
     params->has_multifd_zstd_level = true;
     params->multifd_zstd_level = s->parameters.multifd_zstd_level;
     params->has_xbzrle_cache_size = true;
@@ -1036,6 +1054,7 @@ void migrate_params_init(MigrationParameters *params)
     params->has_multifd_channels = true;
     params->has_multifd_compression = true;
     params->has_multifd_zlib_level = true;
+    params->has_multifd_qatzip_level = true;
     params->has_multifd_zstd_level = true;
     params->has_xbzrle_cache_size = true;
     params->has_max_postcopy_bandwidth = true;
@@ -1145,6 +1164,14 @@ bool migrate_params_check(MigrationParameters *params, Error **errp)
         return false;
     }
 
+    if (params->has_multifd_qatzip_level &&
+        ((params->multifd_qatzip_level > 9) ||
+        (params->multifd_qatzip_level < 1))) {
+        error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "multifd_qatzip_level",
+                   "a value between 1 and 9");
+        return false;
+    }
+
     if (params->has_multifd_zstd_level &&
         (params->multifd_zstd_level > 20)) {
         error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "multifd_zstd_level",
diff --git a/migration/options.h b/migration/options.h
index 246c160aee..82707f34d0 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -87,6 +87,7 @@ MigMode migrate_mode(void);
 int migrate_multifd_channels(void);
 MultiFDCompression migrate_multifd_compression(void);
 int migrate_multifd_zlib_level(void);
+int migrate_multifd_qatzip_level(void);
 int migrate_multifd_zstd_level(void);
 uint8_t migrate_throttle_trigger_threshold(void);
 const char *migrate_tls_authz(void);
diff --git a/qapi/migration.json b/qapi/migration.json
index eb2f883513..6d5a4b0489 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -842,6 +842,11 @@
 #     speed, and 9 means best compression ratio which will consume
 #     more CPU. Defaults to 1. (Since 5.0)
 #
+# @multifd-qatzip-level: Set the compression level to be used in live
+#     migration. The level is an integer between 1 and 9, where 1 means
+#     the best compression speed, and 9 means the best compression
+#     ratio which will consume more CPU. Defaults to 1.
+#
 # @multifd-zstd-level: Set the compression level to be used in live
 #     migration, the compression level is an integer between 0 and 20,
 #     where 0 means no compression, 1 means the best compression
@@ -903,7 +908,7 @@
            'multifd-channels',
            'xbzrle-cache-size', 'max-postcopy-bandwidth',
            'max-cpu-throttle', 'multifd-compression',
-           'multifd-zlib-level', 'multifd-zstd-level',
+           'multifd-zlib-level', 'multifd-qatzip-level', 'multifd-zstd-level',
            'block-bitmap-mapping',
            { 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] },
            'vcpu-dirty-limit',
@@ -1030,6 +1035,11 @@
 #     speed, and 9 means best compression ratio which will consume
 #     more CPU. Defaults to 1. (Since 5.0)
 #
+# @multifd-qatzip-level: Set the compression level to be used in live
+#     migration. The level is an integer between 1 and 9, where 1 means
+#     the best compression speed, and 9 means the best compression
+#     ratio which will consume more CPU. Defaults to 1.
+#
 # @multifd-zstd-level: Set the compression level to be used in live
 #     migration, the compression level is an integer between 0 and 20,
 #     where 0 means no compression, 1 means the best compression
@@ -1110,6 +1120,7 @@
             '*max-cpu-throttle': 'uint8',
             '*multifd-compression': 'MultiFDCompression',
             '*multifd-zlib-level': 'uint8',
+            '*multifd-qatzip-level': 'uint8',
             '*multifd-zstd-level': 'uint8',
             '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ],
             '*x-vcpu-dirty-limit-period': { 'type': 'uint64',
@@ -1258,6 +1269,11 @@
 #     speed, and 9 means best compression ratio which will consume
 #     more CPU. Defaults to 1. (Since 5.0)
 #
+# @multifd-qatzip-level: Set the compression level to be used in live
+#     migration. The level is an integer between 1 and 9, where 1 means
+#     the best compression speed, and 9 means the best compression
+#     ratio which will consume more CPU. Defaults to 1.
+#
 # @multifd-zstd-level: Set the compression level to be used in live
 #     migration, the compression level is an integer between 0 and 20,
 #     where 0 means no compression, 1 means the best compression
@@ -1335,6 +1351,7 @@
             '*max-cpu-throttle': 'uint8',
             '*multifd-compression': 'MultiFDCompression',
             '*multifd-zlib-level': 'uint8',
+            '*multifd-qatzip-level': 'uint8',
             '*multifd-zstd-level': 'uint8',
             '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ],
             '*x-vcpu-dirty-limit-period': { 'type': 'uint64',
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method
  2023-12-31 20:57 [PATCH 0/5] *** Implement using Intel QAT to offload ZLIB Bryan Zhang
  2023-12-31 20:58 ` [PATCH 1/5] meson: Introduce 'qatzip' feature to the build system Bryan Zhang
  2023-12-31 20:58 ` [PATCH 2/5] migration: Add compression level parameter for QATzip Bryan Zhang
@ 2023-12-31 20:58 ` Bryan Zhang
  2024-01-05 20:07   ` Fabiano Rosas
  2023-12-31 20:58 ` [PATCH 4/5] migration: Implement 'qatzip' methods using QAT Bryan Zhang
  2023-12-31 20:58 ` [PATCH 5/5] migration: Add integration test for 'qatzip' compression method Bryan Zhang
  4 siblings, 1 reply; 21+ messages in thread
From: Bryan Zhang @ 2023-12-31 20:58 UTC (permalink / raw)
  To: qemu-devel, farosas, marcandre.lureau, peterx, quintela,
	peter.maydell, hao.xiang
  Cc: bryan.zhang

Adds support for 'qatzip' as an option for the multifd compression
method parameter, but copy-pastes the no-op logic to leave the actual
methods effectively unimplemented. This is in preparation of a
subsequent commit that will implement actually using QAT for compression
and decompression.

Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
---
 hw/core/qdev-properties-system.c |  6 ++-
 migration/meson.build            |  1 +
 migration/multifd-qatzip.c       | 81 ++++++++++++++++++++++++++++++++
 migration/multifd.h              |  1 +
 qapi/migration.json              |  5 +-
 5 files changed, 92 insertions(+), 2 deletions(-)
 create mode 100644 migration/multifd-qatzip.c

diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index 1a396521d5..d8e48dcb0e 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -658,7 +658,11 @@ const PropertyInfo qdev_prop_fdc_drive_type = {
 const PropertyInfo qdev_prop_multifd_compression = {
     .name = "MultiFDCompression",
     .description = "multifd_compression values, "
-                   "none/zlib/zstd",
+                   "none/zlib/zstd"
+#ifdef CONFIG_QATZIP
+                   "/qatzip"
+#endif
+                   ,
     .enum_table = &MultiFDCompression_lookup,
     .get = qdev_propinfo_get_enum,
     .set = qdev_propinfo_set_enum,
diff --git a/migration/meson.build b/migration/meson.build
index 92b1cc4297..e20f318379 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -40,6 +40,7 @@ if get_option('live_block_migration').allowed()
   system_ss.add(files('block.c'))
 endif
 system_ss.add(when: zstd, if_true: files('multifd-zstd.c'))
+system_ss.add(when: qatzip, if_true: files('multifd-qatzip.c'))
 
 specific_ss.add(when: 'CONFIG_SYSTEM_ONLY',
                 if_true: files('ram.c',
diff --git a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c
new file mode 100644
index 0000000000..1733bbddb7
--- /dev/null
+++ b/migration/multifd-qatzip.c
@@ -0,0 +1,81 @@
+/*
+ * Multifd QATzip compression implementation
+ *
+ * Copyright (c) Bytedance
+ *
+ * Authors:
+ *  Bryan Zhang <bryan.zhang@bytedance.com>
+ *  Hao Xiang   <hao.xiang@bytedance.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "exec/ramblock.h"
+#include "exec/target_page.h"
+#include "qapi/error.h"
+#include "migration.h"
+#include "options.h"
+#include "multifd.h"
+
+static int qatzip_send_setup(MultiFDSendParams *p, Error **errp)
+{
+    return 0;
+}
+
+static void qatzip_send_cleanup(MultiFDSendParams *p, Error **errp) {};
+
+static int qatzip_send_prepare(MultiFDSendParams *p, Error **errp)
+{
+    MultiFDPages_t *pages = p->pages;
+
+    for (int i = 0; i < p->normal_num; i++) {
+        p->iov[p->iovs_num].iov_base = pages->block->host + p->normal[i];
+        p->iov[p->iovs_num].iov_len = p->page_size;
+        p->iovs_num++;
+    }
+
+    p->next_packet_size = p->normal_num * p->page_size;
+    p->flags |= MULTIFD_FLAG_NOCOMP;
+    return 0;
+}
+
+static int qatzip_recv_setup(MultiFDRecvParams *p, Error **errp)
+{
+    return 0;
+}
+
+static void qatzip_recv_cleanup(MultiFDRecvParams *p) {};
+
+static int qatzip_recv_pages(MultiFDRecvParams *p, Error **errp)
+{
+    uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
+
+    if (flags != MULTIFD_FLAG_NOCOMP) {
+        error_setg(errp, "multifd %u: flags received %x flags expected %x",
+                   p->id, flags, MULTIFD_FLAG_NOCOMP);
+        return -1;
+    }
+    for (int i = 0; i < p->normal_num; i++) {
+        p->iov[i].iov_base = p->host + p->normal[i];
+        p->iov[i].iov_len = p->page_size;
+    }
+    return qio_channel_readv_all(p->c, p->iov, p->normal_num, errp);
+}
+
+static MultiFDMethods multifd_qatzip_ops = {
+    .send_setup = qatzip_send_setup,
+    .send_cleanup = qatzip_send_cleanup,
+    .send_prepare = qatzip_send_prepare,
+    .recv_setup = qatzip_recv_setup,
+    .recv_cleanup = qatzip_recv_cleanup,
+    .recv_pages = qatzip_recv_pages
+};
+
+static void multifd_qatzip_register(void)
+{
+    multifd_register_ops(MULTIFD_COMPRESSION_QATZIP, &multifd_qatzip_ops);
+}
+
+migration_init(multifd_qatzip_register);
diff --git a/migration/multifd.h b/migration/multifd.h
index a835643b48..5600f7fc82 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -33,6 +33,7 @@ int multifd_queue_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset);
 #define MULTIFD_FLAG_NOCOMP (0 << 1)
 #define MULTIFD_FLAG_ZLIB (1 << 1)
 #define MULTIFD_FLAG_ZSTD (2 << 1)
+#define MULTIFD_FLAG_QATZIP (3 << 1)
 
 /* This value needs to be a multiple of qemu_target_page_size() */
 #define MULTIFD_PACKET_SIZE (512 * 1024)
diff --git a/qapi/migration.json b/qapi/migration.json
index 6d5a4b0489..e3cc195aed 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -625,11 +625,14 @@
 #
 # @zstd: use zstd compression method.
 #
+# @qatzip: use qatzip compression method.
+#
 # Since: 5.0
 ##
 { 'enum': 'MultiFDCompression',
   'data': [ 'none', 'zlib',
-            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
+            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' },
+            { 'name': 'qatzip', 'if': 'CONFIG_QATZIP'} ] }
 
 ##
 # @MigMode:
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 4/5] migration: Implement 'qatzip' methods using QAT
  2023-12-31 20:57 [PATCH 0/5] *** Implement using Intel QAT to offload ZLIB Bryan Zhang
                   ` (2 preceding siblings ...)
  2023-12-31 20:58 ` [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method Bryan Zhang
@ 2023-12-31 20:58 ` Bryan Zhang
  2023-12-31 20:58 ` [PATCH 5/5] migration: Add integration test for 'qatzip' compression method Bryan Zhang
  4 siblings, 0 replies; 21+ messages in thread
From: Bryan Zhang @ 2023-12-31 20:58 UTC (permalink / raw)
  To: qemu-devel, farosas, marcandre.lureau, peterx, quintela,
	peter.maydell, hao.xiang
  Cc: bryan.zhang

Uses QAT to offload deflate compression in the 'qatzip' compression
method for multifd migration.

Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
---
 migration/multifd-qatzip.c | 314 +++++++++++++++++++++++++++++++++++--
 1 file changed, 301 insertions(+), 13 deletions(-)

diff --git a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c
index 1733bbddb7..3fb0bb5b27 100644
--- a/migration/multifd-qatzip.c
+++ b/migration/multifd-qatzip.c
@@ -18,50 +18,338 @@
 #include "migration.h"
 #include "options.h"
 #include "multifd.h"
+#include <qatzip.h>
 
+struct qatzip_data {
+    /*
+     * Unique session for use with QATzip API
+     */
+    QzSession_T sess;
+
+    /*
+     * For compression: Buffer for pages to compress
+     * For decompression: Buffer for data to decompress
+     */
+    uint8_t *in_buf;
+    uint32_t in_len;
+
+    /*
+     * For compression: Output buffer of compressed data
+     * For decompression: Output buffer of decompressed data
+     */
+    uint8_t *out_buf;
+    uint32_t out_len;
+};
+
+/**
+ * qatzip_send_setup: Set up QATzip session and private buffers.
+ *
+ * @param p    Multifd channel params
+ * @param errp Pointer to error, which will be set in case of error
+ * @return     0 on success, -1 on error (and *errp will be set)
+ */
 static int qatzip_send_setup(MultiFDSendParams *p, Error **errp)
 {
+    struct qatzip_data *q;
+    QzSessionParamsDeflate_T params;
+    const char *err_msg;
+    int ret;
+
+    q = g_new0(struct qatzip_data, 1);
+    p->data = q;
+
+    ret = qzInit(&q->sess, 0);
+    if (ret != QZ_OK && ret != QZ_DUPLICATE) {
+        err_msg = "qzInit failed";
+        goto err_free_q;
+    }
+
+    ret = qzGetDefaultsDeflate(&params);
+    if (ret != QZ_OK) {
+        err_msg = "qzGetDefaultsDeflate failed";
+        goto err_close;
+    }
+
+    /* Use maximum hardware buffer size to improve batching. */
+    params.common_params.hw_buff_sz = QZ_HW_BUFF_MAX_SZ;
+
+    /* Make sure to use configured QATzip compression level. */
+    params.common_params.comp_lvl = migrate_multifd_qatzip_level();
+
+    ret = qzSetupSessionDeflate(&q->sess, &params);
+    if (ret != QZ_OK && ret != QZ_DUPLICATE) {
+        err_msg = "qzSetupSessionDeflate failed";
+        goto err_close;
+    }
+
+    /* TODO Add support for larger packets. */
+    if (MULTIFD_PACKET_SIZE > UINT32_MAX) {
+        err_msg = "packet size too large for QAT";
+        goto err_close;
+    }
+
+    q->in_len = MULTIFD_PACKET_SIZE;
+    q->in_buf = g_try_malloc(q->in_len);
+    if (!q->in_buf) {
+        err_msg = "malloc failed";
+        goto err_close;
+    }
+
+    q->out_len = qzMaxCompressedLength(MULTIFD_PACKET_SIZE, &q->sess);
+    q->out_buf = g_try_malloc(q->out_len);
+    if (!q->out_buf) {
+        err_msg = "malloc failed";
+        goto err_free_inbuf;
+    }
+
     return 0;
+
+err_free_inbuf:
+    g_free(q->in_buf);
+err_close:
+    qzClose(&q->sess);
+err_free_q:
+    g_free(q);
+    error_setg(errp, "multifd %u: %s", p->id, err_msg);
+    return -1;
 }
 
-static void qatzip_send_cleanup(MultiFDSendParams *p, Error **errp) {};
+/**
+ * qatzip_send_cleanup: Tear down QATzip session and release private buffers.
+ *
+ * @param p    Multifd channel params
+ * @param errp Pointer to error, which will be set in case of error
+ * @return     None
+ */
+static void qatzip_send_cleanup(MultiFDSendParams *p, Error **errp)
+{
+    struct qatzip_data *q = p->data;
+    const char *err_msg;
+    int ret;
+
+    ret = qzTeardownSession(&q->sess);
+    if (ret != QZ_OK) {
+        err_msg = "qzTeardownSession failed";
+        goto err;
+    }
+
+    ret = qzClose(&q->sess);
+    if (ret != QZ_OK) {
+        err_msg = "qzClose failed";
+        goto err;
+    }
+
+    g_free(q->in_buf);
+    q->in_buf = NULL;
+    g_free(q->out_buf);
+    q->out_buf = NULL;
+    g_free(p->data);
+    p->data = NULL;
+    return;
+
+err:
+    error_setg(errp, "multifd %u: %s", p->id, err_msg);
+}
 
+/**
+ * qatzip_send_prepare: Compress pages and update IO channel info.
+ *
+ * @param p    Multifd channel params
+ * @param errp Pointer to error, which will be set in case of error
+ * @return     0 on success, -1 on error (and *errp will be set)
+ */
 static int qatzip_send_prepare(MultiFDSendParams *p, Error **errp)
 {
-    MultiFDPages_t *pages = p->pages;
+    struct qatzip_data *q = p->data;
+    int ret;
+    unsigned int in_len, out_len;
 
+    /* memcpy all the pages into one buffer. */
     for (int i = 0; i < p->normal_num; i++) {
-        p->iov[p->iovs_num].iov_base = pages->block->host + p->normal[i];
-        p->iov[p->iovs_num].iov_len = p->page_size;
-        p->iovs_num++;
+        memcpy(q->in_buf + (i * p->page_size),
+               p->pages->block->host + p->normal[i],
+               p->page_size);
+    }
+
+    in_len = p->normal_num * p->page_size;
+    if (in_len > q->in_len) {
+        error_setg(errp, "multifd %u: unexpectedly large input", p->id);
+        return -1;
+    }
+    out_len = q->out_len;
+
+    /*
+     * Unlike other multifd compression implementations, we use a non-streaming
+     * API and place all the data into one buffer, rather than sending each page
+     * to the compression API at a time. Based on initial benchmarks, the
+     * non-streaming API outperforms the streaming API. Plus, the logic in QEMU
+     * is friendly to using the non-streaming API anyway. If either of these
+     * statements becomes no longer true, we can revisit adding a streaming
+     * implementation.
+     */
+    ret = qzCompress(&q->sess, q->in_buf, &in_len, q->out_buf, &out_len, 1);
+    if (ret != QZ_OK) {
+        error_setg(errp, "multifd %u: QATzip returned %d instead of QZ_OK",
+                   p->id, ret);
+        return -1;
+    }
+    if (in_len != p->normal_num * p->page_size) {
+        error_setg(errp, "multifd %u: QATzip failed to compress all input",
+                   p->id);
+        return -1;
     }
 
-    p->next_packet_size = p->normal_num * p->page_size;
-    p->flags |= MULTIFD_FLAG_NOCOMP;
+    p->iov[p->iovs_num].iov_base = q->out_buf;
+    p->iov[p->iovs_num].iov_len = out_len;
+    p->iovs_num++;
+    p->next_packet_size = out_len;
+    p->flags |= MULTIFD_FLAG_QATZIP;
     return 0;
 }
 
+/**
+ * qatzip_recv_setup: Set up QATzip session and allocate private buffers.
+ *
+ * @param p    Multifd channel params
+ * @param errp Pointer to error, which will be set in case of error
+ * @return     0 on success, -1 on error (and *errp will be set)
+ */
 static int qatzip_recv_setup(MultiFDRecvParams *p, Error **errp)
 {
+    struct qatzip_data *q;
+    QzSessionParamsDeflate_T params;
+    const char *err_msg;
+    int ret;
+
+    q = g_new0(struct qatzip_data, 1);
+    p->data = q;
+
+    ret = qzInit(&q->sess, 0);
+    if (ret != QZ_OK && ret != QZ_DUPLICATE) {
+        err_msg = "qzInit failed";
+        goto err_free_q;
+    }
+
+    ret = qzGetDefaultsDeflate(&params);
+    if (ret != QZ_OK) {
+        err_msg = "qzGetDefaultsDeflate failed";
+        goto err_close;
+    }
+
+    /* Set maximum hardware buffer size for improved batching. */
+    params.common_params.hw_buff_sz = QZ_HW_BUFF_MAX_SZ;
+
+    /* Make sure to use configured QATzip compression level. */
+    params.common_params.comp_lvl = migrate_multifd_qatzip_level();
+
+    ret = qzSetupSessionDeflate(&q->sess, &params);
+    if (ret != QZ_OK && ret != QZ_DUPLICATE) {
+        err_msg = "qzSetupSessionDeflate failed";
+        goto err_close;
+    }
+
+    /*
+     * Mimic multifd-zlib, which reserves extra space for the incoming packet.
+     */
+    q->in_len = MULTIFD_PACKET_SIZE * 2;
+    q->in_buf = g_try_malloc(q->in_len);
+    if (!q->in_buf) {
+        err_msg = "malloc failed";
+        goto err_close;
+    }
+
+    q->out_len = MULTIFD_PACKET_SIZE;
+    q->out_buf = g_try_malloc(q->out_len);
+    if (!q->out_buf) {
+        err_msg = "malloc failed";
+        goto err_free_inbuf;
+    }
+
     return 0;
+
+err_free_inbuf:
+    g_free(q->in_buf);
+err_close:
+    qzClose(&q->sess);
+err_free_q:
+    g_free(q);
+    error_setg(errp, "multifd %u: %s", p->id, err_msg);
+    return -1;
 }
 
-static void qatzip_recv_cleanup(MultiFDRecvParams *p) {};
+/**
+ * qatzip_recv_cleanup: Tear down QATzip session and release private buffers.
+ *
+ * @param p    Multifd channel params
+ * @return     None
+ */
+static void qatzip_recv_cleanup(MultiFDRecvParams *p)
+{
+    struct qatzip_data *q = p->data;
+
+    /* Ignoring return values here due to function signature. */
+    qzTeardownSession(&q->sess);
+    qzClose(&q->sess);
+    g_free(q->in_buf);
+    g_free(q->out_buf);
+    g_free(p->data);
+}
 
+
+/**
+ * qatzip_recv_pages: Decompress pages and copy them to the appropriate
+ * locations.
+ *
+ * @param p    Multifd channel params
+ * @param errp Pointer to error, which will be set in case of error
+ * @return     0 on success, -1 on error (and *errp will be set)
+ */
 static int qatzip_recv_pages(MultiFDRecvParams *p, Error **errp)
 {
+    struct qatzip_data *q = p->data;
+    int ret;
+    unsigned int in_len, out_len;
+    uint32_t in_size = p->next_packet_size;
+    uint32_t expected_size = p->normal_num * p->page_size;
     uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
 
-    if (flags != MULTIFD_FLAG_NOCOMP) {
+    if (in_size > q->in_len) {
+        error_setg(errp, "multifd %u: received unexpectedly large packet",
+                   p->id);
+        return -1;
+    }
+
+    if (flags != MULTIFD_FLAG_QATZIP) {
         error_setg(errp, "multifd %u: flags received %x flags expected %x",
-                   p->id, flags, MULTIFD_FLAG_NOCOMP);
+                   p->id, flags, MULTIFD_FLAG_QATZIP);
+        return -1;
+    }
+
+    ret = qio_channel_read_all(p->c, (void *)q->in_buf, in_size, errp);
+    if (ret != 0) {
+        return ret;
+    }
+
+    in_len = in_size;
+    out_len = q->out_len;
+    ret = qzDecompress(&q->sess, q->in_buf, &in_len, q->out_buf, &out_len);
+    if (ret != QZ_OK) {
+        error_setg(errp, "multifd %u: qzDecompress failed", p->id);
+        return -1;
+    }
+    if (out_len != expected_size) {
+        error_setg(errp, "multifd %u: packet size received %u size expected %u",
+                   p->id, out_len, expected_size);
         return -1;
     }
+
+    /* Copy each page to its appropriate location. */
     for (int i = 0; i < p->normal_num; i++) {
-        p->iov[i].iov_base = p->host + p->normal[i];
-        p->iov[i].iov_len = p->page_size;
+        memcpy(p->host + p->normal[i],
+               q->out_buf + p->page_size * i,
+               p->page_size);
     }
-    return qio_channel_readv_all(p->c, p->iov, p->normal_num, errp);
+    return 0;
 }
 
 static MultiFDMethods multifd_qatzip_ops = {
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 5/5] migration: Add integration test for 'qatzip' compression method
  2023-12-31 20:57 [PATCH 0/5] *** Implement using Intel QAT to offload ZLIB Bryan Zhang
                   ` (3 preceding siblings ...)
  2023-12-31 20:58 ` [PATCH 4/5] migration: Implement 'qatzip' methods using QAT Bryan Zhang
@ 2023-12-31 20:58 ` Bryan Zhang
  2024-01-29  8:47   ` Peter Xu
  4 siblings, 1 reply; 21+ messages in thread
From: Bryan Zhang @ 2023-12-31 20:58 UTC (permalink / raw)
  To: qemu-devel, farosas, marcandre.lureau, peterx, quintela,
	peter.maydell, hao.xiang
  Cc: bryan.zhang

Adds an integration test for 'qatzip'.

Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
---
 tests/qtest/meson.build      |  4 ++++
 tests/qtest/migration-test.c | 37 ++++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 47dabf91d0..5931bd6418 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -302,6 +302,10 @@ if gnutls.found()
   endif
 endif
 
+if qatzip.found()
+  migration_files += [qatzip]
+endif
+
 qtests = {
   'bios-tables-test': [io, 'boot-sector.c', 'acpi-utils.c', 'tpm-emu.c'],
   'cdrom-test': files('boot-sector.c'),
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index d520c587f7..f51bc4056f 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -35,6 +35,10 @@
 # endif /* CONFIG_TASN1 */
 #endif /* CONFIG_GNUTLS */
 
+#ifdef CONFIG_QATZIP
+#include <qatzip.h>
+#endif /* CONFIG_QATZIP */
+
 /* For dirty ring test; so far only x86_64 is supported */
 #if defined(__linux__) && defined(HOST_X86_64)
 #include "linux/kvm.h"
@@ -2572,6 +2576,15 @@ test_migrate_precopy_tcp_multifd_zstd_start(QTestState *from,
 }
 #endif /* CONFIG_ZSTD */
 
+#ifdef CONFIG_QATZIP
+static void *
+test_migrate_precopy_tcp_multifd_qatzip_start(QTestState *from,
+                                              QTestState *to)
+{
+    return test_migrate_precopy_tcp_multifd_start_common(from, to, "qatzip");
+}
+#endif
+
 static void test_multifd_tcp_none(void)
 {
     MigrateCommon args = {
@@ -2607,6 +2620,17 @@ static void test_multifd_tcp_zstd(void)
 }
 #endif
 
+#ifdef CONFIG_QATZIP
+static void test_multifd_tcp_qatzip(void)
+{
+    MigrateCommon args = {
+        .listen_uri = "defer",
+        .start_hook = test_migrate_precopy_tcp_multifd_qatzip_start,
+    };
+    test_precopy_common(&args);
+}
+#endif
+
 #ifdef CONFIG_GNUTLS
 static void *
 test_migrate_multifd_tcp_tls_psk_start_match(QTestState *from,
@@ -3480,6 +3504,19 @@ int main(int argc, char **argv)
     qtest_add_func("/migration/multifd/tcp/plain/zstd",
                    test_multifd_tcp_zstd);
 #endif
+#ifdef CONFIG_QATZIP
+    /*
+     * Use QATzip's qzInit() function as a runtime hardware check.
+     * Ideally there might be a cleaner way to probe for the presence of QAT.
+     */
+    QzSession_T sess;
+    memset(&sess, 0, sizeof(QzSession_T));
+    if (qzInit(&sess, 0) == QZ_OK) {
+        qzClose(&sess);
+        qtest_add_func("/migration/multifd/tcp/plain/qatzip",
+                    test_multifd_tcp_qatzip);
+    }
+#endif
 #ifdef CONFIG_GNUTLS
     qtest_add_func("/migration/multifd/tcp/tls/psk/match",
                    test_multifd_tcp_tls_psk_match);
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method
  2023-12-31 20:58 ` [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method Bryan Zhang
@ 2024-01-05 20:07   ` Fabiano Rosas
  2024-01-05 23:52     ` [External] " Hao Xiang
  2024-01-08  3:17     ` Liu, Yuan1
  0 siblings, 2 replies; 21+ messages in thread
From: Fabiano Rosas @ 2024-01-05 20:07 UTC (permalink / raw)
  To: Bryan Zhang, qemu-devel, marcandre.lureau, peterx, quintela,
	peter.maydell, hao.xiang
  Cc: bryan.zhang, yuan1.liu, berrange

Bryan Zhang <bryan.zhang@bytedance.com> writes:

+cc Yuan Liu, Daniel Berrangé

> Adds support for 'qatzip' as an option for the multifd compression
> method parameter, but copy-pastes the no-op logic to leave the actual
> methods effectively unimplemented. This is in preparation of a
> subsequent commit that will implement actually using QAT for compression
> and decompression.
>
> Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
> Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
> ---
>  hw/core/qdev-properties-system.c |  6 ++-
>  migration/meson.build            |  1 +
>  migration/multifd-qatzip.c       | 81 ++++++++++++++++++++++++++++++++
>  migration/multifd.h              |  1 +
>  qapi/migration.json              |  5 +-
>  5 files changed, 92 insertions(+), 2 deletions(-)
>  create mode 100644 migration/multifd-qatzip.c
>
> diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
> index 1a396521d5..d8e48dcb0e 100644
> --- a/hw/core/qdev-properties-system.c
> +++ b/hw/core/qdev-properties-system.c
> @@ -658,7 +658,11 @@ const PropertyInfo qdev_prop_fdc_drive_type = {
>  const PropertyInfo qdev_prop_multifd_compression = {
>      .name = "MultiFDCompression",
>      .description = "multifd_compression values, "
> -                   "none/zlib/zstd",
> +                   "none/zlib/zstd"
> +#ifdef CONFIG_QATZIP
> +                   "/qatzip"
> +#endif
> +                   ,
>      .enum_table = &MultiFDCompression_lookup,
>      .get = qdev_propinfo_get_enum,
>      .set = qdev_propinfo_set_enum,
> diff --git a/migration/meson.build b/migration/meson.build
> index 92b1cc4297..e20f318379 100644
> --- a/migration/meson.build
> +++ b/migration/meson.build
> @@ -40,6 +40,7 @@ if get_option('live_block_migration').allowed()
>    system_ss.add(files('block.c'))
>  endif
>  system_ss.add(when: zstd, if_true: files('multifd-zstd.c'))
> +system_ss.add(when: qatzip, if_true: files('multifd-qatzip.c'))
>  
>  specific_ss.add(when: 'CONFIG_SYSTEM_ONLY',
>                  if_true: files('ram.c',
> diff --git a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c
> new file mode 100644
> index 0000000000..1733bbddb7
> --- /dev/null
> +++ b/migration/multifd-qatzip.c
> @@ -0,0 +1,81 @@
> +/*
> + * Multifd QATzip compression implementation
> + *
> + * Copyright (c) Bytedance
> + *
> + * Authors:
> + *  Bryan Zhang <bryan.zhang@bytedance.com>
> + *  Hao Xiang   <hao.xiang@bytedance.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "exec/ramblock.h"
> +#include "exec/target_page.h"
> +#include "qapi/error.h"
> +#include "migration.h"
> +#include "options.h"
> +#include "multifd.h"
> +
> +static int qatzip_send_setup(MultiFDSendParams *p, Error **errp)
> +{
> +    return 0;
> +}
> +
> +static void qatzip_send_cleanup(MultiFDSendParams *p, Error **errp) {};
> +
> +static int qatzip_send_prepare(MultiFDSendParams *p, Error **errp)
> +{
> +    MultiFDPages_t *pages = p->pages;
> +
> +    for (int i = 0; i < p->normal_num; i++) {
> +        p->iov[p->iovs_num].iov_base = pages->block->host + p->normal[i];
> +        p->iov[p->iovs_num].iov_len = p->page_size;
> +        p->iovs_num++;
> +    }
> +
> +    p->next_packet_size = p->normal_num * p->page_size;
> +    p->flags |= MULTIFD_FLAG_NOCOMP;
> +    return 0;
> +}
> +
> +static int qatzip_recv_setup(MultiFDRecvParams *p, Error **errp)
> +{
> +    return 0;
> +}
> +
> +static void qatzip_recv_cleanup(MultiFDRecvParams *p) {};
> +
> +static int qatzip_recv_pages(MultiFDRecvParams *p, Error **errp)
> +{
> +    uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
> +
> +    if (flags != MULTIFD_FLAG_NOCOMP) {
> +        error_setg(errp, "multifd %u: flags received %x flags expected %x",
> +                   p->id, flags, MULTIFD_FLAG_NOCOMP);
> +        return -1;
> +    }
> +    for (int i = 0; i < p->normal_num; i++) {
> +        p->iov[i].iov_base = p->host + p->normal[i];
> +        p->iov[i].iov_len = p->page_size;
> +    }
> +    return qio_channel_readv_all(p->c, p->iov, p->normal_num, errp);
> +}
> +
> +static MultiFDMethods multifd_qatzip_ops = {
> +    .send_setup = qatzip_send_setup,
> +    .send_cleanup = qatzip_send_cleanup,
> +    .send_prepare = qatzip_send_prepare,
> +    .recv_setup = qatzip_recv_setup,
> +    .recv_cleanup = qatzip_recv_cleanup,
> +    .recv_pages = qatzip_recv_pages
> +};
> +
> +static void multifd_qatzip_register(void)
> +{
> +    multifd_register_ops(MULTIFD_COMPRESSION_QATZIP, &multifd_qatzip_ops);
> +}
> +
> +migration_init(multifd_qatzip_register);
> diff --git a/migration/multifd.h b/migration/multifd.h
> index a835643b48..5600f7fc82 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -33,6 +33,7 @@ int multifd_queue_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset);
>  #define MULTIFD_FLAG_NOCOMP (0 << 1)
>  #define MULTIFD_FLAG_ZLIB (1 << 1)
>  #define MULTIFD_FLAG_ZSTD (2 << 1)
> +#define MULTIFD_FLAG_QATZIP (3 << 1)
>  
>  /* This value needs to be a multiple of qemu_target_page_size() */
>  #define MULTIFD_PACKET_SIZE (512 * 1024)
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 6d5a4b0489..e3cc195aed 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -625,11 +625,14 @@
>  #
>  # @zstd: use zstd compression method.
>  #
> +# @qatzip: use qatzip compression method.
> +#
>  # Since: 5.0
>  ##
>  { 'enum': 'MultiFDCompression',
>    'data': [ 'none', 'zlib',
> -            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
> +            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' },
> +            { 'name': 'qatzip', 'if': 'CONFIG_QATZIP'} ] }

In another thread adding support to another Intel accelerator (IAA) we
decided that it was better to select the offloading as an accelerator
method to multifd zlib rather than as an entirely new compression
format. Take a look at that discussion:
https://lore.kernel.org/r/ZTFCnqbbqlmsUkRC@redhat.com

As I understand it, QAT + QATzip would be compatible with both zlib and
IAA + QPL, so we'd add another accelerator method like this:

https://lore.kernel.org/r/20240103112851.908082-3-yuan1.liu@intel.com

All that, of course, assuming we even want to support both
accelerators. They're addressing the same problem after all. I wonder
how we'd choose a precedence, since both seem to be present in the same
processor family.




^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [External] Re: [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method
  2024-01-05 20:07   ` Fabiano Rosas
@ 2024-01-05 23:52     ` Hao Xiang
  2024-01-06  6:31       ` Hao Xiang
  2024-01-08  3:25       ` Liu, Yuan1
  2024-01-08  3:17     ` Liu, Yuan1
  1 sibling, 2 replies; 21+ messages in thread
From: Hao Xiang @ 2024-01-05 23:52 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: Bryan Zhang, qemu-devel, marcandre.lureau, peterx, quintela,
	peter.maydell, yuan1.liu, berrange

On Fri, Jan 5, 2024 at 12:07 PM Fabiano Rosas <farosas@suse.de> wrote:
>
> Bryan Zhang <bryan.zhang@bytedance.com> writes:
>
> +cc Yuan Liu, Daniel Berrangé
>
> > Adds support for 'qatzip' as an option for the multifd compression
> > method parameter, but copy-pastes the no-op logic to leave the actual
> > methods effectively unimplemented. This is in preparation of a
> > subsequent commit that will implement actually using QAT for compression
> > and decompression.
> >
> > Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
> > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
> > ---
> >  hw/core/qdev-properties-system.c |  6 ++-
> >  migration/meson.build            |  1 +
> >  migration/multifd-qatzip.c       | 81 ++++++++++++++++++++++++++++++++
> >  migration/multifd.h              |  1 +
> >  qapi/migration.json              |  5 +-
> >  5 files changed, 92 insertions(+), 2 deletions(-)
> >  create mode 100644 migration/multifd-qatzip.c
> >
> > diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
> > index 1a396521d5..d8e48dcb0e 100644
> > --- a/hw/core/qdev-properties-system.c
> > +++ b/hw/core/qdev-properties-system.c
> > @@ -658,7 +658,11 @@ const PropertyInfo qdev_prop_fdc_drive_type = {
> >  const PropertyInfo qdev_prop_multifd_compression = {
> >      .name = "MultiFDCompression",
> >      .description = "multifd_compression values, "
> > -                   "none/zlib/zstd",
> > +                   "none/zlib/zstd"
> > +#ifdef CONFIG_QATZIP
> > +                   "/qatzip"
> > +#endif
> > +                   ,
> >      .enum_table = &MultiFDCompression_lookup,
> >      .get = qdev_propinfo_get_enum,
> >      .set = qdev_propinfo_set_enum,
> > diff --git a/migration/meson.build b/migration/meson.build
> > index 92b1cc4297..e20f318379 100644
> > --- a/migration/meson.build
> > +++ b/migration/meson.build
> > @@ -40,6 +40,7 @@ if get_option('live_block_migration').allowed()
> >    system_ss.add(files('block.c'))
> >  endif
> >  system_ss.add(when: zstd, if_true: files('multifd-zstd.c'))
> > +system_ss.add(when: qatzip, if_true: files('multifd-qatzip.c'))
> >
> >  specific_ss.add(when: 'CONFIG_SYSTEM_ONLY',
> >                  if_true: files('ram.c',
> > diff --git a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c
> > new file mode 100644
> > index 0000000000..1733bbddb7
> > --- /dev/null
> > +++ b/migration/multifd-qatzip.c
> > @@ -0,0 +1,81 @@
> > +/*
> > + * Multifd QATzip compression implementation
> > + *
> > + * Copyright (c) Bytedance
> > + *
> > + * Authors:
> > + *  Bryan Zhang <bryan.zhang@bytedance.com>
> > + *  Hao Xiang   <hao.xiang@bytedance.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "exec/ramblock.h"
> > +#include "exec/target_page.h"
> > +#include "qapi/error.h"
> > +#include "migration.h"
> > +#include "options.h"
> > +#include "multifd.h"
> > +
> > +static int qatzip_send_setup(MultiFDSendParams *p, Error **errp)
> > +{
> > +    return 0;
> > +}
> > +
> > +static void qatzip_send_cleanup(MultiFDSendParams *p, Error **errp) {};
> > +
> > +static int qatzip_send_prepare(MultiFDSendParams *p, Error **errp)
> > +{
> > +    MultiFDPages_t *pages = p->pages;
> > +
> > +    for (int i = 0; i < p->normal_num; i++) {
> > +        p->iov[p->iovs_num].iov_base = pages->block->host + p->normal[i];
> > +        p->iov[p->iovs_num].iov_len = p->page_size;
> > +        p->iovs_num++;
> > +    }
> > +
> > +    p->next_packet_size = p->normal_num * p->page_size;
> > +    p->flags |= MULTIFD_FLAG_NOCOMP;
> > +    return 0;
> > +}
> > +
> > +static int qatzip_recv_setup(MultiFDRecvParams *p, Error **errp)
> > +{
> > +    return 0;
> > +}
> > +
> > +static void qatzip_recv_cleanup(MultiFDRecvParams *p) {};
> > +
> > +static int qatzip_recv_pages(MultiFDRecvParams *p, Error **errp)
> > +{
> > +    uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
> > +
> > +    if (flags != MULTIFD_FLAG_NOCOMP) {
> > +        error_setg(errp, "multifd %u: flags received %x flags expected %x",
> > +                   p->id, flags, MULTIFD_FLAG_NOCOMP);
> > +        return -1;
> > +    }
> > +    for (int i = 0; i < p->normal_num; i++) {
> > +        p->iov[i].iov_base = p->host + p->normal[i];
> > +        p->iov[i].iov_len = p->page_size;
> > +    }
> > +    return qio_channel_readv_all(p->c, p->iov, p->normal_num, errp);
> > +}
> > +
> > +static MultiFDMethods multifd_qatzip_ops = {
> > +    .send_setup = qatzip_send_setup,
> > +    .send_cleanup = qatzip_send_cleanup,
> > +    .send_prepare = qatzip_send_prepare,
> > +    .recv_setup = qatzip_recv_setup,
> > +    .recv_cleanup = qatzip_recv_cleanup,
> > +    .recv_pages = qatzip_recv_pages
> > +};
> > +
> > +static void multifd_qatzip_register(void)
> > +{
> > +    multifd_register_ops(MULTIFD_COMPRESSION_QATZIP, &multifd_qatzip_ops);
> > +}
> > +
> > +migration_init(multifd_qatzip_register);
> > diff --git a/migration/multifd.h b/migration/multifd.h
> > index a835643b48..5600f7fc82 100644
> > --- a/migration/multifd.h
> > +++ b/migration/multifd.h
> > @@ -33,6 +33,7 @@ int multifd_queue_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset);
> >  #define MULTIFD_FLAG_NOCOMP (0 << 1)
> >  #define MULTIFD_FLAG_ZLIB (1 << 1)
> >  #define MULTIFD_FLAG_ZSTD (2 << 1)
> > +#define MULTIFD_FLAG_QATZIP (3 << 1)
> >
> >  /* This value needs to be a multiple of qemu_target_page_size() */
> >  #define MULTIFD_PACKET_SIZE (512 * 1024)
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index 6d5a4b0489..e3cc195aed 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -625,11 +625,14 @@
> >  #
> >  # @zstd: use zstd compression method.
> >  #
> > +# @qatzip: use qatzip compression method.
> > +#
> >  # Since: 5.0
> >  ##
> >  { 'enum': 'MultiFDCompression',
> >    'data': [ 'none', 'zlib',
> > -            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
> > +            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' },
> > +            { 'name': 'qatzip', 'if': 'CONFIG_QATZIP'} ] }
>
> In another thread adding support to another Intel accelerator (IAA) we
> decided that it was better to select the offloading as an accelerator
> method to multifd zlib rather than as an entirely new compression
> format. Take a look at that discussion:
> https://lore.kernel.org/r/ZTFCnqbbqlmsUkRC@redhat.com

We had some early discussion with Intel folks (probably a different
team than the one with the above patchset). The understanding at the
time is that QAT is good at both bulk data compression and
decompression. IAA is good at decompression with smaller data size but
compression performance is not the best. In multifd, we are
compressing up to 128 4k pages at a time and potentially this can
increase by configuring the packet size, at the time we thought QAT
could be a better fit in the multifd live migration scenario. We would
like to hear more from Intel.

From our benchmark testing, with two QAT devices, we can get deflate
compression throughout to around 7GB/s with ~160% CPU. That's beating
the current software implementation (zlib and zstd) by a lot. We are
still tuning the configuration in QEMU live migration now.

>
> As I understand it, QAT + QATzip would be compatible with both zlib and
> IAA + QPL, so we'd add another accelerator method like this:
>
> https://lore.kernel.org/r/20240103112851.908082-3-yuan1.liu@intel.com
>

I quickly read over the IAA patchset and I saw this:

"However, due to some reasons, QPL is currently
not compatible with the existing Zlib method that Zlib compressed data
can be decompressed by QPl and vice versa."

The above probably means the current zlib software implementation and
IAA are not compatible.

For QAT, although, both Intel's QATzip and zlib are internally using
deflate, QATzip only supports deflate with a 4 byte header, deflate
wrapped by Gzip header and footer, or deflate wrapped by Intel® QAT
Gzip* extension header and footer. None of the headers can be
recognized by zlib software implementation is my understanding. So if
we want to make them compatible with zlib, the QATzip library needs to
support that.

> All that, of course, assuming we even want to support both
> accelerators. They're addressing the same problem after all. I wonder
> how we'd choose a precedence, since both seem to be present in the same
> processor family.
>
>

That's an interesting question :-) I think overall performance
(throughput and CPU overhead) should both be considered. IAA and QAT
accelerators don't present on all systems. We Bytedance choose to have
both on our platform when purchasing from Intel.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [External] Re: [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method
  2024-01-05 23:52     ` [External] " Hao Xiang
@ 2024-01-06  6:31       ` Hao Xiang
  2024-01-08 20:24         ` Fabiano Rosas
  2024-01-08  3:25       ` Liu, Yuan1
  1 sibling, 1 reply; 21+ messages in thread
From: Hao Xiang @ 2024-01-06  6:31 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: Bryan Zhang, qemu-devel, marcandre.lureau, peterx, quintela,
	peter.maydell, yuan1.liu, berrange

On Fri, Jan 5, 2024 at 3:52 PM Hao Xiang <hao.xiang@bytedance.com> wrote:
>
> On Fri, Jan 5, 2024 at 12:07 PM Fabiano Rosas <farosas@suse.de> wrote:
> >
> > Bryan Zhang <bryan.zhang@bytedance.com> writes:
> >
> > +cc Yuan Liu, Daniel Berrangé
> >
> > > Adds support for 'qatzip' as an option for the multifd compression
> > > method parameter, but copy-pastes the no-op logic to leave the actual
> > > methods effectively unimplemented. This is in preparation of a
> > > subsequent commit that will implement actually using QAT for compression
> > > and decompression.
> > >
> > > Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
> > > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
> > > ---
> > >  hw/core/qdev-properties-system.c |  6 ++-
> > >  migration/meson.build            |  1 +
> > >  migration/multifd-qatzip.c       | 81 ++++++++++++++++++++++++++++++++
> > >  migration/multifd.h              |  1 +
> > >  qapi/migration.json              |  5 +-
> > >  5 files changed, 92 insertions(+), 2 deletions(-)
> > >  create mode 100644 migration/multifd-qatzip.c
> > >
> > > diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
> > > index 1a396521d5..d8e48dcb0e 100644
> > > --- a/hw/core/qdev-properties-system.c
> > > +++ b/hw/core/qdev-properties-system.c
> > > @@ -658,7 +658,11 @@ const PropertyInfo qdev_prop_fdc_drive_type = {
> > >  const PropertyInfo qdev_prop_multifd_compression = {
> > >      .name = "MultiFDCompression",
> > >      .description = "multifd_compression values, "
> > > -                   "none/zlib/zstd",
> > > +                   "none/zlib/zstd"
> > > +#ifdef CONFIG_QATZIP
> > > +                   "/qatzip"
> > > +#endif
> > > +                   ,
> > >      .enum_table = &MultiFDCompression_lookup,
> > >      .get = qdev_propinfo_get_enum,
> > >      .set = qdev_propinfo_set_enum,
> > > diff --git a/migration/meson.build b/migration/meson.build
> > > index 92b1cc4297..e20f318379 100644
> > > --- a/migration/meson.build
> > > +++ b/migration/meson.build
> > > @@ -40,6 +40,7 @@ if get_option('live_block_migration').allowed()
> > >    system_ss.add(files('block.c'))
> > >  endif
> > >  system_ss.add(when: zstd, if_true: files('multifd-zstd.c'))
> > > +system_ss.add(when: qatzip, if_true: files('multifd-qatzip.c'))
> > >
> > >  specific_ss.add(when: 'CONFIG_SYSTEM_ONLY',
> > >                  if_true: files('ram.c',
> > > diff --git a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c
> > > new file mode 100644
> > > index 0000000000..1733bbddb7
> > > --- /dev/null
> > > +++ b/migration/multifd-qatzip.c
> > > @@ -0,0 +1,81 @@
> > > +/*
> > > + * Multifd QATzip compression implementation
> > > + *
> > > + * Copyright (c) Bytedance
> > > + *
> > > + * Authors:
> > > + *  Bryan Zhang <bryan.zhang@bytedance.com>
> > > + *  Hao Xiang   <hao.xiang@bytedance.com>
> > > + *
> > > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > > + * See the COPYING file in the top-level directory.
> > > + */
> > > +
> > > +#include "qemu/osdep.h"
> > > +#include "exec/ramblock.h"
> > > +#include "exec/target_page.h"
> > > +#include "qapi/error.h"
> > > +#include "migration.h"
> > > +#include "options.h"
> > > +#include "multifd.h"
> > > +
> > > +static int qatzip_send_setup(MultiFDSendParams *p, Error **errp)
> > > +{
> > > +    return 0;
> > > +}
> > > +
> > > +static void qatzip_send_cleanup(MultiFDSendParams *p, Error **errp) {};
> > > +
> > > +static int qatzip_send_prepare(MultiFDSendParams *p, Error **errp)
> > > +{
> > > +    MultiFDPages_t *pages = p->pages;
> > > +
> > > +    for (int i = 0; i < p->normal_num; i++) {
> > > +        p->iov[p->iovs_num].iov_base = pages->block->host + p->normal[i];
> > > +        p->iov[p->iovs_num].iov_len = p->page_size;
> > > +        p->iovs_num++;
> > > +    }
> > > +
> > > +    p->next_packet_size = p->normal_num * p->page_size;
> > > +    p->flags |= MULTIFD_FLAG_NOCOMP;
> > > +    return 0;
> > > +}
> > > +
> > > +static int qatzip_recv_setup(MultiFDRecvParams *p, Error **errp)
> > > +{
> > > +    return 0;
> > > +}
> > > +
> > > +static void qatzip_recv_cleanup(MultiFDRecvParams *p) {};
> > > +
> > > +static int qatzip_recv_pages(MultiFDRecvParams *p, Error **errp)
> > > +{
> > > +    uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
> > > +
> > > +    if (flags != MULTIFD_FLAG_NOCOMP) {
> > > +        error_setg(errp, "multifd %u: flags received %x flags expected %x",
> > > +                   p->id, flags, MULTIFD_FLAG_NOCOMP);
> > > +        return -1;
> > > +    }
> > > +    for (int i = 0; i < p->normal_num; i++) {
> > > +        p->iov[i].iov_base = p->host + p->normal[i];
> > > +        p->iov[i].iov_len = p->page_size;
> > > +    }
> > > +    return qio_channel_readv_all(p->c, p->iov, p->normal_num, errp);
> > > +}
> > > +
> > > +static MultiFDMethods multifd_qatzip_ops = {
> > > +    .send_setup = qatzip_send_setup,
> > > +    .send_cleanup = qatzip_send_cleanup,
> > > +    .send_prepare = qatzip_send_prepare,
> > > +    .recv_setup = qatzip_recv_setup,
> > > +    .recv_cleanup = qatzip_recv_cleanup,
> > > +    .recv_pages = qatzip_recv_pages
> > > +};
> > > +
> > > +static void multifd_qatzip_register(void)
> > > +{
> > > +    multifd_register_ops(MULTIFD_COMPRESSION_QATZIP, &multifd_qatzip_ops);
> > > +}
> > > +
> > > +migration_init(multifd_qatzip_register);
> > > diff --git a/migration/multifd.h b/migration/multifd.h
> > > index a835643b48..5600f7fc82 100644
> > > --- a/migration/multifd.h
> > > +++ b/migration/multifd.h
> > > @@ -33,6 +33,7 @@ int multifd_queue_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset);
> > >  #define MULTIFD_FLAG_NOCOMP (0 << 1)
> > >  #define MULTIFD_FLAG_ZLIB (1 << 1)
> > >  #define MULTIFD_FLAG_ZSTD (2 << 1)
> > > +#define MULTIFD_FLAG_QATZIP (3 << 1)
> > >
> > >  /* This value needs to be a multiple of qemu_target_page_size() */
> > >  #define MULTIFD_PACKET_SIZE (512 * 1024)
> > > diff --git a/qapi/migration.json b/qapi/migration.json
> > > index 6d5a4b0489..e3cc195aed 100644
> > > --- a/qapi/migration.json
> > > +++ b/qapi/migration.json
> > > @@ -625,11 +625,14 @@
> > >  #
> > >  # @zstd: use zstd compression method.
> > >  #
> > > +# @qatzip: use qatzip compression method.
> > > +#
> > >  # Since: 5.0
> > >  ##
> > >  { 'enum': 'MultiFDCompression',
> > >    'data': [ 'none', 'zlib',
> > > -            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
> > > +            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' },
> > > +            { 'name': 'qatzip', 'if': 'CONFIG_QATZIP'} ] }
> >
> > In another thread adding support to another Intel accelerator (IAA) we
> > decided that it was better to select the offloading as an accelerator
> > method to multifd zlib rather than as an entirely new compression
> > format. Take a look at that discussion:
> > https://lore.kernel.org/r/ZTFCnqbbqlmsUkRC@redhat.com
>
> We had some early discussion with Intel folks (probably a different
> team than the one with the above patchset). The understanding at the
> time is that QAT is good at both bulk data compression and
> decompression. IAA is good at decompression with smaller data size but
> compression performance is not the best. In multifd, we are
> compressing up to 128 4k pages at a time and potentially this can
> increase by configuring the packet size, at the time we thought QAT
> could be a better fit in the multifd live migration scenario. We would
> like to hear more from Intel.
>
> From our benchmark testing, with two QAT devices, we can get deflate
> compression throughout to around 7GB/s with ~160% CPU. That's beating
> the current software implementation (zlib and zstd) by a lot. We are
> still tuning the configuration in QEMU live migration now.
>
> >
> > As I understand it, QAT + QATzip would be compatible with both zlib and
> > IAA + QPL, so we'd add another accelerator method like this:
> >
> > https://lore.kernel.org/r/20240103112851.908082-3-yuan1.liu@intel.com
> >
>
> I quickly read over the IAA patchset and I saw this:
>
> "However, due to some reasons, QPL is currently
> not compatible with the existing Zlib method that Zlib compressed data
> can be decompressed by QPl and vice versa."
>
> The above probably means the current zlib software implementation and
> IAA are not compatible.
>
> For QAT, although, both Intel's QATzip and zlib are internally using
> deflate, QATzip only supports deflate with a 4 byte header, deflate
> wrapped by Gzip header and footer, or deflate wrapped by Intel® QAT
> Gzip* extension header and footer. None of the headers can be
> recognized by zlib software implementation is my understanding. So if
> we want to make them compatible with zlib, the QATzip library needs to
> support that.

I took a closer look at Intel's documentation here
https://github.com/intel/QATzip
QATzip does have a compression format QZ_DEFLATE_RAW, which uses the
deflate format but no header at all. This looks like the same as what
QEMU's current zlib implementation does - using the raw deflate
format. I can have a quick test to confirm that. Meanwhile, the
documentation mentioned that if using the raw deflate format,
decompression cannot be offloaded by QAT hardware. So there is a
trade-off here if we want to avoid creating a new compression format
in QEMU.

>
> > All that, of course, assuming we even want to support both
> > accelerators. They're addressing the same problem after all. I wonder
> > how we'd choose a precedence, since both seem to be present in the same
> > processor family.
> >
> >
>
> That's an interesting question :-) I think overall performance
> (throughput and CPU overhead) should both be considered. IAA and QAT
> accelerators don't present on all systems. We Bytedance choose to have
> both on our platform when purchasing from Intel.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method
  2024-01-05 20:07   ` Fabiano Rosas
  2024-01-05 23:52     ` [External] " Hao Xiang
@ 2024-01-08  3:17     ` Liu, Yuan1
  1 sibling, 0 replies; 21+ messages in thread
From: Liu, Yuan1 @ 2024-01-08  3:17 UTC (permalink / raw)
  To: Fabiano Rosas, Bryan Zhang, qemu-devel@nongnu.org,
	marcandre.lureau@redhat.com, peterx@redhat.com,
	quintela@redhat.com, peter.maydell@linaro.org,
	hao.xiang@bytedance.com
  Cc: bryan.zhang@bytedance.com, berrange@redhat.com, Zou, Nanhai

> -----Original Message-----
> From: Fabiano Rosas <farosas@suse.de>
> Sent: Saturday, January 6, 2024 4:07 AM
> To: Bryan Zhang <bryan.zhang@bytedance.com>; qemu-devel@nongnu.org;
> marcandre.lureau@redhat.com; peterx@redhat.com; quintela@redhat.com;
> peter.maydell@linaro.org; hao.xiang@bytedance.com
> Cc: bryan.zhang@bytedance.com; Liu, Yuan1 <yuan1.liu@intel.com>;
> berrange@redhat.com
> Subject: Re: [PATCH 3/5] migration: Introduce unimplemented 'qatzip'
> compression method
> 
> Bryan Zhang <bryan.zhang@bytedance.com> writes:
> 
> +cc Yuan Liu, Daniel Berrangé
> 
> > Adds support for 'qatzip' as an option for the multifd compression
> > method parameter, but copy-pastes the no-op logic to leave the actual
> > methods effectively unimplemented. This is in preparation of a
> > subsequent commit that will implement actually using QAT for
> > compression and decompression.
> >
> > Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
> > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
> > ---
> >  hw/core/qdev-properties-system.c |  6 ++-
> >  migration/meson.build            |  1 +
> >  migration/multifd-qatzip.c       | 81 ++++++++++++++++++++++++++++++++
> >  migration/multifd.h              |  1 +
> >  qapi/migration.json              |  5 +-
> >  5 files changed, 92 insertions(+), 2 deletions(-)  create mode 100644
> > migration/multifd-qatzip.c
> >
> > diff --git a/hw/core/qdev-properties-system.c
> > b/hw/core/qdev-properties-system.c
> > index 1a396521d5..d8e48dcb0e 100644
> > --- a/hw/core/qdev-properties-system.c
> > +++ b/hw/core/qdev-properties-system.c
> > @@ -658,7 +658,11 @@ const PropertyInfo qdev_prop_fdc_drive_type = {
> > const PropertyInfo qdev_prop_multifd_compression = {
> >      .name = "MultiFDCompression",
> >      .description = "multifd_compression values, "
> > -                   "none/zlib/zstd",
> > +                   "none/zlib/zstd"
> > +#ifdef CONFIG_QATZIP
> > +                   "/qatzip"
> > +#endif
> > +                   ,
> >      .enum_table = &MultiFDCompression_lookup,
> >      .get = qdev_propinfo_get_enum,
> >      .set = qdev_propinfo_set_enum,
> > diff --git a/migration/meson.build b/migration/meson.build index
> > 92b1cc4297..e20f318379 100644
> > --- a/migration/meson.build
> > +++ b/migration/meson.build
> > @@ -40,6 +40,7 @@ if get_option('live_block_migration').allowed()
> >    system_ss.add(files('block.c'))
> >  endif
> >  system_ss.add(when: zstd, if_true: files('multifd-zstd.c'))
> > +system_ss.add(when: qatzip, if_true: files('multifd-qatzip.c'))
> >
> >  specific_ss.add(when: 'CONFIG_SYSTEM_ONLY',
> >                  if_true: files('ram.c', diff --git
> > a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c new file
> > mode 100644 index 0000000000..1733bbddb7
> > --- /dev/null
> > +++ b/migration/multifd-qatzip.c
> > @@ -0,0 +1,81 @@
> > +/*
> > + * Multifd QATzip compression implementation
> > + *
> > + * Copyright (c) Bytedance
> > + *
> > + * Authors:
> > + *  Bryan Zhang <bryan.zhang@bytedance.com>
> > + *  Hao Xiang   <hao.xiang@bytedance.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or
> later.
> > + * See the COPYING file in the top-level directory.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "exec/ramblock.h"
> > +#include "exec/target_page.h"
> > +#include "qapi/error.h"
> > +#include "migration.h"
> > +#include "options.h"
> > +#include "multifd.h"
> > +
> > +static int qatzip_send_setup(MultiFDSendParams *p, Error **errp) {
> > +    return 0;
> > +}
> > +
> > +static void qatzip_send_cleanup(MultiFDSendParams *p, Error **errp)
> > +{};
> > +
> > +static int qatzip_send_prepare(MultiFDSendParams *p, Error **errp) {
> > +    MultiFDPages_t *pages = p->pages;
> > +
> > +    for (int i = 0; i < p->normal_num; i++) {
> > +        p->iov[p->iovs_num].iov_base = pages->block->host + p-
> >normal[i];
> > +        p->iov[p->iovs_num].iov_len = p->page_size;
> > +        p->iovs_num++;
> > +    }
> > +
> > +    p->next_packet_size = p->normal_num * p->page_size;
> > +    p->flags |= MULTIFD_FLAG_NOCOMP;
> > +    return 0;
> > +}
> > +
> > +static int qatzip_recv_setup(MultiFDRecvParams *p, Error **errp) {
> > +    return 0;
> > +}
> > +
> > +static void qatzip_recv_cleanup(MultiFDRecvParams *p) {};
> > +
> > +static int qatzip_recv_pages(MultiFDRecvParams *p, Error **errp) {
> > +    uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
> > +
> > +    if (flags != MULTIFD_FLAG_NOCOMP) {
> > +        error_setg(errp, "multifd %u: flags received %x flags
> expected %x",
> > +                   p->id, flags, MULTIFD_FLAG_NOCOMP);
> > +        return -1;
> > +    }
> > +    for (int i = 0; i < p->normal_num; i++) {
> > +        p->iov[i].iov_base = p->host + p->normal[i];
> > +        p->iov[i].iov_len = p->page_size;
> > +    }
> > +    return qio_channel_readv_all(p->c, p->iov, p->normal_num, errp);
> > +}
> > +
> > +static MultiFDMethods multifd_qatzip_ops = {
> > +    .send_setup = qatzip_send_setup,
> > +    .send_cleanup = qatzip_send_cleanup,
> > +    .send_prepare = qatzip_send_prepare,
> > +    .recv_setup = qatzip_recv_setup,
> > +    .recv_cleanup = qatzip_recv_cleanup,
> > +    .recv_pages = qatzip_recv_pages
> > +};
> > +
> > +static void multifd_qatzip_register(void) {
> > +    multifd_register_ops(MULTIFD_COMPRESSION_QATZIP,
> > +&multifd_qatzip_ops); }
> > +
> > +migration_init(multifd_qatzip_register);
> > diff --git a/migration/multifd.h b/migration/multifd.h index
> > a835643b48..5600f7fc82 100644
> > --- a/migration/multifd.h
> > +++ b/migration/multifd.h
> > @@ -33,6 +33,7 @@ int multifd_queue_page(QEMUFile *f, RAMBlock *block,
> > ram_addr_t offset);  #define MULTIFD_FLAG_NOCOMP (0 << 1)  #define
> > MULTIFD_FLAG_ZLIB (1 << 1)  #define MULTIFD_FLAG_ZSTD (2 << 1)
> > +#define MULTIFD_FLAG_QATZIP (3 << 1)
> >
> >  /* This value needs to be a multiple of qemu_target_page_size() */
> > #define MULTIFD_PACKET_SIZE (512 * 1024) diff --git
> > a/qapi/migration.json b/qapi/migration.json index
> > 6d5a4b0489..e3cc195aed 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -625,11 +625,14 @@
> >  #
> >  # @zstd: use zstd compression method.
> >  #
> > +# @qatzip: use qatzip compression method.
> > +#
> >  # Since: 5.0
> >  ##
> >  { 'enum': 'MultiFDCompression',
> >    'data': [ 'none', 'zlib',
> > -            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
> > +            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' },
> > +            { 'name': 'qatzip', 'if': 'CONFIG_QATZIP'} ] }
> 
> In another thread adding support to another Intel accelerator (IAA) we
> decided that it was better to select the offloading as an accelerator
> method to multifd zlib rather than as an entirely new compression format.
> Take a look at that discussion:
> https://lore.kernel.org/r/ZTFCnqbbqlmsUkRC@redhat.com
> 
> As I understand it, QAT + QATzip would be compatible with both zlib and
> IAA + QPL, so we'd add another accelerator method like this:
> 
> https://lore.kernel.org/r/20240103112851.908082-3-yuan1.liu@intel.com
> 
> All that, of course, assuming we even want to support both accelerators.
> They're addressing the same problem after all. I wonder how we'd choose a
> precedence, since both seem to be present in the same processor family.


I agree that IAA and QAT should run under the accelerator framework and be 
compatible with zlib.

Regarding the choice between QAT and IAA:

1. This decision can be determined by the end customer through live migration 
parameters. The Xeon products have different specifications, which will impact 
the presence and quantity of IAA and QAT devices.

2. If the customer's chosen product includes both IAA and QAT, the live migration
software can list the priority of accelerators supporting the zlib compression 
algorithm.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [External] Re: [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method
  2024-01-05 23:52     ` [External] " Hao Xiang
  2024-01-06  6:31       ` Hao Xiang
@ 2024-01-08  3:25       ` Liu, Yuan1
  2024-01-08 20:27         ` Fabiano Rosas
  1 sibling, 1 reply; 21+ messages in thread
From: Liu, Yuan1 @ 2024-01-08  3:25 UTC (permalink / raw)
  To: Hao Xiang, Fabiano Rosas
  Cc: Bryan Zhang, qemu-devel@nongnu.org, marcandre.lureau@redhat.com,
	peterx@redhat.com, quintela@redhat.com, peter.maydell@linaro.org,
	berrange@redhat.com

> -----Original Message-----
> From: Hao Xiang <hao.xiang@bytedance.com>
> Sent: Saturday, January 6, 2024 7:53 AM
> To: Fabiano Rosas <farosas@suse.de>
> Cc: Bryan Zhang <bryan.zhang@bytedance.com>; qemu-devel@nongnu.org;
> marcandre.lureau@redhat.com; peterx@redhat.com; quintela@redhat.com;
> peter.maydell@linaro.org; Liu, Yuan1 <yuan1.liu@intel.com>;
> berrange@redhat.com
> Subject: Re: [External] Re: [PATCH 3/5] migration: Introduce unimplemented
> 'qatzip' compression method
> 
> On Fri, Jan 5, 2024 at 12:07 PM Fabiano Rosas <farosas@suse.de> wrote:
> >
> > Bryan Zhang <bryan.zhang@bytedance.com> writes:
> >
> > +cc Yuan Liu, Daniel Berrangé
> >
> > > Adds support for 'qatzip' as an option for the multifd compression
> > > method parameter, but copy-pastes the no-op logic to leave the
> > > actual methods effectively unimplemented. This is in preparation of
> > > a subsequent commit that will implement actually using QAT for
> > > compression and decompression.
> > >
> > > Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
> > > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
> > > ---
> > >  hw/core/qdev-properties-system.c |  6 ++-
> > >  migration/meson.build            |  1 +
> > >  migration/multifd-qatzip.c       | 81
> ++++++++++++++++++++++++++++++++
> > >  migration/multifd.h              |  1 +
> > >  qapi/migration.json              |  5 +-
> > >  5 files changed, 92 insertions(+), 2 deletions(-)  create mode
> > > 100644 migration/multifd-qatzip.c
> > >
> > > diff --git a/hw/core/qdev-properties-system.c
> > > b/hw/core/qdev-properties-system.c
> > > index 1a396521d5..d8e48dcb0e 100644
> > > --- a/hw/core/qdev-properties-system.c
> > > +++ b/hw/core/qdev-properties-system.c
> > > @@ -658,7 +658,11 @@ const PropertyInfo qdev_prop_fdc_drive_type = {
> > > const PropertyInfo qdev_prop_multifd_compression = {
> > >      .name = "MultiFDCompression",
> > >      .description = "multifd_compression values, "
> > > -                   "none/zlib/zstd",
> > > +                   "none/zlib/zstd"
> > > +#ifdef CONFIG_QATZIP
> > > +                   "/qatzip"
> > > +#endif
> > > +                   ,
> > >      .enum_table = &MultiFDCompression_lookup,
> > >      .get = qdev_propinfo_get_enum,
> > >      .set = qdev_propinfo_set_enum,
> > > diff --git a/migration/meson.build b/migration/meson.build index
> > > 92b1cc4297..e20f318379 100644
> > > --- a/migration/meson.build
> > > +++ b/migration/meson.build
> > > @@ -40,6 +40,7 @@ if get_option('live_block_migration').allowed()
> > >    system_ss.add(files('block.c'))
> > >  endif
> > >  system_ss.add(when: zstd, if_true: files('multifd-zstd.c'))
> > > +system_ss.add(when: qatzip, if_true: files('multifd-qatzip.c'))
> > >
> > >  specific_ss.add(when: 'CONFIG_SYSTEM_ONLY',
> > >                  if_true: files('ram.c', diff --git
> > > a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c new file
> > > mode 100644 index 0000000000..1733bbddb7
> > > --- /dev/null
> > > +++ b/migration/multifd-qatzip.c
> > > @@ -0,0 +1,81 @@
> > > +/*
> > > + * Multifd QATzip compression implementation
> > > + *
> > > + * Copyright (c) Bytedance
> > > + *
> > > + * Authors:
> > > + *  Bryan Zhang <bryan.zhang@bytedance.com>
> > > + *  Hao Xiang   <hao.xiang@bytedance.com>
> > > + *
> > > + * This work is licensed under the terms of the GNU GPL, version 2 or
> later.
> > > + * See the COPYING file in the top-level directory.
> > > + */
> > > +
> > > +#include "qemu/osdep.h"
> > > +#include "exec/ramblock.h"
> > > +#include "exec/target_page.h"
> > > +#include "qapi/error.h"
> > > +#include "migration.h"
> > > +#include "options.h"
> > > +#include "multifd.h"
> > > +
> > > +static int qatzip_send_setup(MultiFDSendParams *p, Error **errp) {
> > > +    return 0;
> > > +}
> > > +
> > > +static void qatzip_send_cleanup(MultiFDSendParams *p, Error **errp)
> > > +{};
> > > +
> > > +static int qatzip_send_prepare(MultiFDSendParams *p, Error **errp)
> > > +{
> > > +    MultiFDPages_t *pages = p->pages;
> > > +
> > > +    for (int i = 0; i < p->normal_num; i++) {
> > > +        p->iov[p->iovs_num].iov_base = pages->block->host + p-
> >normal[i];
> > > +        p->iov[p->iovs_num].iov_len = p->page_size;
> > > +        p->iovs_num++;
> > > +    }
> > > +
> > > +    p->next_packet_size = p->normal_num * p->page_size;
> > > +    p->flags |= MULTIFD_FLAG_NOCOMP;
> > > +    return 0;
> > > +}
> > > +
> > > +static int qatzip_recv_setup(MultiFDRecvParams *p, Error **errp) {
> > > +    return 0;
> > > +}
> > > +
> > > +static void qatzip_recv_cleanup(MultiFDRecvParams *p) {};
> > > +
> > > +static int qatzip_recv_pages(MultiFDRecvParams *p, Error **errp) {
> > > +    uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
> > > +
> > > +    if (flags != MULTIFD_FLAG_NOCOMP) {
> > > +        error_setg(errp, "multifd %u: flags received %x flags
> expected %x",
> > > +                   p->id, flags, MULTIFD_FLAG_NOCOMP);
> > > +        return -1;
> > > +    }
> > > +    for (int i = 0; i < p->normal_num; i++) {
> > > +        p->iov[i].iov_base = p->host + p->normal[i];
> > > +        p->iov[i].iov_len = p->page_size;
> > > +    }
> > > +    return qio_channel_readv_all(p->c, p->iov, p->normal_num,
> > > +errp); }
> > > +
> > > +static MultiFDMethods multifd_qatzip_ops = {
> > > +    .send_setup = qatzip_send_setup,
> > > +    .send_cleanup = qatzip_send_cleanup,
> > > +    .send_prepare = qatzip_send_prepare,
> > > +    .recv_setup = qatzip_recv_setup,
> > > +    .recv_cleanup = qatzip_recv_cleanup,
> > > +    .recv_pages = qatzip_recv_pages };
> > > +
> > > +static void multifd_qatzip_register(void) {
> > > +    multifd_register_ops(MULTIFD_COMPRESSION_QATZIP,
> > > +&multifd_qatzip_ops); }
> > > +
> > > +migration_init(multifd_qatzip_register);
> > > diff --git a/migration/multifd.h b/migration/multifd.h index
> > > a835643b48..5600f7fc82 100644
> > > --- a/migration/multifd.h
> > > +++ b/migration/multifd.h
> > > @@ -33,6 +33,7 @@ int multifd_queue_page(QEMUFile *f, RAMBlock
> > > *block, ram_addr_t offset);  #define MULTIFD_FLAG_NOCOMP (0 << 1)
> > > #define MULTIFD_FLAG_ZLIB (1 << 1)  #define MULTIFD_FLAG_ZSTD (2 <<
> > > 1)
> > > +#define MULTIFD_FLAG_QATZIP (3 << 1)
> > >
> > >  /* This value needs to be a multiple of qemu_target_page_size() */
> > > #define MULTIFD_PACKET_SIZE (512 * 1024) diff --git
> > > a/qapi/migration.json b/qapi/migration.json index
> > > 6d5a4b0489..e3cc195aed 100644
> > > --- a/qapi/migration.json
> > > +++ b/qapi/migration.json
> > > @@ -625,11 +625,14 @@
> > >  #
> > >  # @zstd: use zstd compression method.
> > >  #
> > > +# @qatzip: use qatzip compression method.
> > > +#
> > >  # Since: 5.0
> > >  ##
> > >  { 'enum': 'MultiFDCompression',
> > >    'data': [ 'none', 'zlib',
> > > -            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
> > > +            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' },
> > > +            { 'name': 'qatzip', 'if': 'CONFIG_QATZIP'} ] }
> >
> > In another thread adding support to another Intel accelerator (IAA) we
> > decided that it was better to select the offloading as an accelerator
> > method to multifd zlib rather than as an entirely new compression
> > format. Take a look at that discussion:
> > https://lore.kernel.org/r/ZTFCnqbbqlmsUkRC@redhat.com
> 
> We had some early discussion with Intel folks (probably a different team
> than the one with the above patchset). The understanding at the time is
> that QAT is good at both bulk data compression and decompression. IAA is
> good at decompression with smaller data size but compression performance
> is not the best. In multifd, we are compressing up to 128 4k pages at a
> time and potentially this can increase by configuring the packet size, at
> the time we thought QAT could be a better fit in the multifd live
> migration scenario. We would like to hear more from Intel.
> 
> From our benchmark testing, with two QAT devices, we can get deflate
> compression throughout to around 7GB/s with ~160% CPU. That's beating the
> current software implementation (zlib and zstd) by a lot. We are still
> tuning the configuration in QEMU live migration now.
> 
> >
> > As I understand it, QAT + QATzip would be compatible with both zlib
> > and IAA + QPL, so we'd add another accelerator method like this:
> >
> > https://lore.kernel.org/r/20240103112851.908082-3-yuan1.liu@intel.com
> >
> 
> I quickly read over the IAA patchset and I saw this:
> 
> "However, due to some reasons, QPL is currently not compatible with the
> existing Zlib method that Zlib compressed data can be decompressed by QPl
> and vice versa."
> 
> The above probably means the current zlib software implementation and IAA
> are not compatible.
> 
> For QAT, although, both Intel's QATzip and zlib are internally using
> deflate, QATzip only supports deflate with a 4 byte header, deflate
> wrapped by Gzip header and footer, or deflate wrapped by Intel® QAT
> Gzip* extension header and footer. None of the headers can be recognized
> by zlib software implementation is my understanding. So if we want to make
> them compatible with zlib, the QATzip library needs to support that.

The QPL library currently cannot support the Z_SYNC_FLULSH operation of zlib steaming. 
This is the reason why it is not compatible with zlib.

> > All that, of course, assuming we even want to support both
> > accelerators. They're addressing the same problem after all. I wonder
> > how we'd choose a precedence, since both seem to be present in the
> > same processor family.
> >
> >
> 
> That's an interesting question :-) I think overall performance (throughput
> and CPU overhead) should both be considered. IAA and QAT accelerators
> don't present on all systems. We Bytedance choose to have both on our
> platform when purchasing from Intel.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [External] Re: [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method
  2024-01-06  6:31       ` Hao Xiang
@ 2024-01-08 20:24         ` Fabiano Rosas
  0 siblings, 0 replies; 21+ messages in thread
From: Fabiano Rosas @ 2024-01-08 20:24 UTC (permalink / raw)
  To: Hao Xiang
  Cc: Bryan Zhang, qemu-devel, marcandre.lureau, peterx, quintela,
	peter.maydell, yuan1.liu, berrange

Hao Xiang <hao.xiang@bytedance.com> writes:

> On Fri, Jan 5, 2024 at 3:52 PM Hao Xiang <hao.xiang@bytedance.com> wrote:
>>
>> On Fri, Jan 5, 2024 at 12:07 PM Fabiano Rosas <farosas@suse.de> wrote:
>> >
>> > Bryan Zhang <bryan.zhang@bytedance.com> writes:
>> >
>> > +cc Yuan Liu, Daniel Berrangé
>> >
>> > > Adds support for 'qatzip' as an option for the multifd compression
>> > > method parameter, but copy-pastes the no-op logic to leave the actual
>> > > methods effectively unimplemented. This is in preparation of a
>> > > subsequent commit that will implement actually using QAT for compression
>> > > and decompression.
>> > >
>> > > Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
>> > > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
>> > > ---
>> > >  hw/core/qdev-properties-system.c |  6 ++-
>> > >  migration/meson.build            |  1 +
>> > >  migration/multifd-qatzip.c       | 81 ++++++++++++++++++++++++++++++++
>> > >  migration/multifd.h              |  1 +
>> > >  qapi/migration.json              |  5 +-
>> > >  5 files changed, 92 insertions(+), 2 deletions(-)
>> > >  create mode 100644 migration/multifd-qatzip.c
>> > >
>> > > diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
>> > > index 1a396521d5..d8e48dcb0e 100644
>> > > --- a/hw/core/qdev-properties-system.c
>> > > +++ b/hw/core/qdev-properties-system.c
>> > > @@ -658,7 +658,11 @@ const PropertyInfo qdev_prop_fdc_drive_type = {
>> > >  const PropertyInfo qdev_prop_multifd_compression = {
>> > >      .name = "MultiFDCompression",
>> > >      .description = "multifd_compression values, "
>> > > -                   "none/zlib/zstd",
>> > > +                   "none/zlib/zstd"
>> > > +#ifdef CONFIG_QATZIP
>> > > +                   "/qatzip"
>> > > +#endif
>> > > +                   ,
>> > >      .enum_table = &MultiFDCompression_lookup,
>> > >      .get = qdev_propinfo_get_enum,
>> > >      .set = qdev_propinfo_set_enum,
>> > > diff --git a/migration/meson.build b/migration/meson.build
>> > > index 92b1cc4297..e20f318379 100644
>> > > --- a/migration/meson.build
>> > > +++ b/migration/meson.build
>> > > @@ -40,6 +40,7 @@ if get_option('live_block_migration').allowed()
>> > >    system_ss.add(files('block.c'))
>> > >  endif
>> > >  system_ss.add(when: zstd, if_true: files('multifd-zstd.c'))
>> > > +system_ss.add(when: qatzip, if_true: files('multifd-qatzip.c'))
>> > >
>> > >  specific_ss.add(when: 'CONFIG_SYSTEM_ONLY',
>> > >                  if_true: files('ram.c',
>> > > diff --git a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c
>> > > new file mode 100644
>> > > index 0000000000..1733bbddb7
>> > > --- /dev/null
>> > > +++ b/migration/multifd-qatzip.c
>> > > @@ -0,0 +1,81 @@
>> > > +/*
>> > > + * Multifd QATzip compression implementation
>> > > + *
>> > > + * Copyright (c) Bytedance
>> > > + *
>> > > + * Authors:
>> > > + *  Bryan Zhang <bryan.zhang@bytedance.com>
>> > > + *  Hao Xiang   <hao.xiang@bytedance.com>
>> > > + *
>> > > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>> > > + * See the COPYING file in the top-level directory.
>> > > + */
>> > > +
>> > > +#include "qemu/osdep.h"
>> > > +#include "exec/ramblock.h"
>> > > +#include "exec/target_page.h"
>> > > +#include "qapi/error.h"
>> > > +#include "migration.h"
>> > > +#include "options.h"
>> > > +#include "multifd.h"
>> > > +
>> > > +static int qatzip_send_setup(MultiFDSendParams *p, Error **errp)
>> > > +{
>> > > +    return 0;
>> > > +}
>> > > +
>> > > +static void qatzip_send_cleanup(MultiFDSendParams *p, Error **errp) {};
>> > > +
>> > > +static int qatzip_send_prepare(MultiFDSendParams *p, Error **errp)
>> > > +{
>> > > +    MultiFDPages_t *pages = p->pages;
>> > > +
>> > > +    for (int i = 0; i < p->normal_num; i++) {
>> > > +        p->iov[p->iovs_num].iov_base = pages->block->host + p->normal[i];
>> > > +        p->iov[p->iovs_num].iov_len = p->page_size;
>> > > +        p->iovs_num++;
>> > > +    }
>> > > +
>> > > +    p->next_packet_size = p->normal_num * p->page_size;
>> > > +    p->flags |= MULTIFD_FLAG_NOCOMP;
>> > > +    return 0;
>> > > +}
>> > > +
>> > > +static int qatzip_recv_setup(MultiFDRecvParams *p, Error **errp)
>> > > +{
>> > > +    return 0;
>> > > +}
>> > > +
>> > > +static void qatzip_recv_cleanup(MultiFDRecvParams *p) {};
>> > > +
>> > > +static int qatzip_recv_pages(MultiFDRecvParams *p, Error **errp)
>> > > +{
>> > > +    uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
>> > > +
>> > > +    if (flags != MULTIFD_FLAG_NOCOMP) {
>> > > +        error_setg(errp, "multifd %u: flags received %x flags expected %x",
>> > > +                   p->id, flags, MULTIFD_FLAG_NOCOMP);
>> > > +        return -1;
>> > > +    }
>> > > +    for (int i = 0; i < p->normal_num; i++) {
>> > > +        p->iov[i].iov_base = p->host + p->normal[i];
>> > > +        p->iov[i].iov_len = p->page_size;
>> > > +    }
>> > > +    return qio_channel_readv_all(p->c, p->iov, p->normal_num, errp);
>> > > +}
>> > > +
>> > > +static MultiFDMethods multifd_qatzip_ops = {
>> > > +    .send_setup = qatzip_send_setup,
>> > > +    .send_cleanup = qatzip_send_cleanup,
>> > > +    .send_prepare = qatzip_send_prepare,
>> > > +    .recv_setup = qatzip_recv_setup,
>> > > +    .recv_cleanup = qatzip_recv_cleanup,
>> > > +    .recv_pages = qatzip_recv_pages
>> > > +};
>> > > +
>> > > +static void multifd_qatzip_register(void)
>> > > +{
>> > > +    multifd_register_ops(MULTIFD_COMPRESSION_QATZIP, &multifd_qatzip_ops);
>> > > +}
>> > > +
>> > > +migration_init(multifd_qatzip_register);
>> > > diff --git a/migration/multifd.h b/migration/multifd.h
>> > > index a835643b48..5600f7fc82 100644
>> > > --- a/migration/multifd.h
>> > > +++ b/migration/multifd.h
>> > > @@ -33,6 +33,7 @@ int multifd_queue_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset);
>> > >  #define MULTIFD_FLAG_NOCOMP (0 << 1)
>> > >  #define MULTIFD_FLAG_ZLIB (1 << 1)
>> > >  #define MULTIFD_FLAG_ZSTD (2 << 1)
>> > > +#define MULTIFD_FLAG_QATZIP (3 << 1)
>> > >
>> > >  /* This value needs to be a multiple of qemu_target_page_size() */
>> > >  #define MULTIFD_PACKET_SIZE (512 * 1024)
>> > > diff --git a/qapi/migration.json b/qapi/migration.json
>> > > index 6d5a4b0489..e3cc195aed 100644
>> > > --- a/qapi/migration.json
>> > > +++ b/qapi/migration.json
>> > > @@ -625,11 +625,14 @@
>> > >  #
>> > >  # @zstd: use zstd compression method.
>> > >  #
>> > > +# @qatzip: use qatzip compression method.
>> > > +#
>> > >  # Since: 5.0
>> > >  ##
>> > >  { 'enum': 'MultiFDCompression',
>> > >    'data': [ 'none', 'zlib',
>> > > -            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
>> > > +            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' },
>> > > +            { 'name': 'qatzip', 'if': 'CONFIG_QATZIP'} ] }
>> >
>> > In another thread adding support to another Intel accelerator (IAA) we
>> > decided that it was better to select the offloading as an accelerator
>> > method to multifd zlib rather than as an entirely new compression
>> > format. Take a look at that discussion:
>> > https://lore.kernel.org/r/ZTFCnqbbqlmsUkRC@redhat.com
>>
>> We had some early discussion with Intel folks (probably a different
>> team than the one with the above patchset). The understanding at the
>> time is that QAT is good at both bulk data compression and
>> decompression. IAA is good at decompression with smaller data size but
>> compression performance is not the best. In multifd, we are
>> compressing up to 128 4k pages at a time and potentially this can
>> increase by configuring the packet size, at the time we thought QAT
>> could be a better fit in the multifd live migration scenario. We would
>> like to hear more from Intel.
>>
>> From our benchmark testing, with two QAT devices, we can get deflate
>> compression throughout to around 7GB/s with ~160% CPU. That's beating
>> the current software implementation (zlib and zstd) by a lot. We are
>> still tuning the configuration in QEMU live migration now.
>>
>> >
>> > As I understand it, QAT + QATzip would be compatible with both zlib and
>> > IAA + QPL, so we'd add another accelerator method like this:
>> >
>> > https://lore.kernel.org/r/20240103112851.908082-3-yuan1.liu@intel.com
>> >
>>
>> I quickly read over the IAA patchset and I saw this:
>>
>> "However, due to some reasons, QPL is currently
>> not compatible with the existing Zlib method that Zlib compressed data
>> can be decompressed by QPl and vice versa."
>>
>> The above probably means the current zlib software implementation and
>> IAA are not compatible.
>>
>> For QAT, although, both Intel's QATzip and zlib are internally using
>> deflate, QATzip only supports deflate with a 4 byte header, deflate
>> wrapped by Gzip header and footer, or deflate wrapped by Intel® QAT
>> Gzip* extension header and footer. None of the headers can be
>> recognized by zlib software implementation is my understanding. So if
>> we want to make them compatible with zlib, the QATzip library needs to
>> support that.
>
> I took a closer look at Intel's documentation here
> https://github.com/intel/QATzip
> QATzip does have a compression format QZ_DEFLATE_RAW, which uses the
> deflate format but no header at all. This looks like the same as what
> QEMU's current zlib implementation does - using the raw deflate
> format. I can have a quick test to confirm that. Meanwhile, the
> documentation mentioned that if using the raw deflate format,
> decompression cannot be offloaded by QAT hardware. So there is a
> trade-off here if we want to avoid creating a new compression format
> in QEMU.

The default zlib behavior is to include the zlib header and trailer. To
obtain a raw deflate stream we'd need to suppress the zlib wrapper. It
might be an option to change QEMU code to make it produce a stream
compatible with QAT provided we don't currently use any feature that
needs the header.

One immediate issue I see is that without the header zlib defaults to
the 32kb window size, which stops us from changing the window size to
cope with QPL's 4kb limitation.

>
>>
>> > All that, of course, assuming we even want to support both
>> > accelerators. They're addressing the same problem after all. I wonder
>> > how we'd choose a precedence, since both seem to be present in the same
>> > processor family.
>> >
>> >
>>
>> That's an interesting question :-) I think overall performance
>> (throughput and CPU overhead) should both be considered. IAA and QAT
>> accelerators don't present on all systems. We Bytedance choose to have
>> both on our platform when purchasing from Intel.

We might need someone with the proper hardware to run a benchmark using
both accelerators if we end up deciding to supporting both. Ideally we'd
have a precedence defined so we don't need to force the user to select
one of them when both are present.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [External] Re: [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method
  2024-01-08  3:25       ` Liu, Yuan1
@ 2024-01-08 20:27         ` Fabiano Rosas
  2024-01-09  2:26           ` Liu, Yuan1
  2024-01-11  6:39           ` Hao Xiang
  0 siblings, 2 replies; 21+ messages in thread
From: Fabiano Rosas @ 2024-01-08 20:27 UTC (permalink / raw)
  To: Liu, Yuan1, Hao Xiang
  Cc: Bryan Zhang, qemu-devel@nongnu.org, marcandre.lureau@redhat.com,
	peterx@redhat.com, quintela@redhat.com, peter.maydell@linaro.org,
	berrange@redhat.com

"Liu, Yuan1" <yuan1.liu@intel.com> writes:

>> -----Original Message-----
>> From: Hao Xiang <hao.xiang@bytedance.com>
>> Sent: Saturday, January 6, 2024 7:53 AM
>> To: Fabiano Rosas <farosas@suse.de>
>> Cc: Bryan Zhang <bryan.zhang@bytedance.com>; qemu-devel@nongnu.org;
>> marcandre.lureau@redhat.com; peterx@redhat.com; quintela@redhat.com;
>> peter.maydell@linaro.org; Liu, Yuan1 <yuan1.liu@intel.com>;
>> berrange@redhat.com
>> Subject: Re: [External] Re: [PATCH 3/5] migration: Introduce unimplemented
>> 'qatzip' compression method
>> 
>> On Fri, Jan 5, 2024 at 12:07 PM Fabiano Rosas <farosas@suse.de> wrote:
>> >
>> > Bryan Zhang <bryan.zhang@bytedance.com> writes:
>> >
>> > +cc Yuan Liu, Daniel Berrangé
>> >
>> > > Adds support for 'qatzip' as an option for the multifd compression
>> > > method parameter, but copy-pastes the no-op logic to leave the
>> > > actual methods effectively unimplemented. This is in preparation of
>> > > a subsequent commit that will implement actually using QAT for
>> > > compression and decompression.
>> > >
>> > > Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
>> > > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
>> > > ---
>> > >  hw/core/qdev-properties-system.c |  6 ++-
>> > >  migration/meson.build            |  1 +
>> > >  migration/multifd-qatzip.c       | 81
>> ++++++++++++++++++++++++++++++++
>> > >  migration/multifd.h              |  1 +
>> > >  qapi/migration.json              |  5 +-
>> > >  5 files changed, 92 insertions(+), 2 deletions(-)  create mode
>> > > 100644 migration/multifd-qatzip.c
>> > >
>> > > diff --git a/hw/core/qdev-properties-system.c
>> > > b/hw/core/qdev-properties-system.c
>> > > index 1a396521d5..d8e48dcb0e 100644
>> > > --- a/hw/core/qdev-properties-system.c
>> > > +++ b/hw/core/qdev-properties-system.c
>> > > @@ -658,7 +658,11 @@ const PropertyInfo qdev_prop_fdc_drive_type = {
>> > > const PropertyInfo qdev_prop_multifd_compression = {
>> > >      .name = "MultiFDCompression",
>> > >      .description = "multifd_compression values, "
>> > > -                   "none/zlib/zstd",
>> > > +                   "none/zlib/zstd"
>> > > +#ifdef CONFIG_QATZIP
>> > > +                   "/qatzip"
>> > > +#endif
>> > > +                   ,
>> > >      .enum_table = &MultiFDCompression_lookup,
>> > >      .get = qdev_propinfo_get_enum,
>> > >      .set = qdev_propinfo_set_enum,
>> > > diff --git a/migration/meson.build b/migration/meson.build index
>> > > 92b1cc4297..e20f318379 100644
>> > > --- a/migration/meson.build
>> > > +++ b/migration/meson.build
>> > > @@ -40,6 +40,7 @@ if get_option('live_block_migration').allowed()
>> > >    system_ss.add(files('block.c'))
>> > >  endif
>> > >  system_ss.add(when: zstd, if_true: files('multifd-zstd.c'))
>> > > +system_ss.add(when: qatzip, if_true: files('multifd-qatzip.c'))
>> > >
>> > >  specific_ss.add(when: 'CONFIG_SYSTEM_ONLY',
>> > >                  if_true: files('ram.c', diff --git
>> > > a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c new file
>> > > mode 100644 index 0000000000..1733bbddb7
>> > > --- /dev/null
>> > > +++ b/migration/multifd-qatzip.c
>> > > @@ -0,0 +1,81 @@
>> > > +/*
>> > > + * Multifd QATzip compression implementation
>> > > + *
>> > > + * Copyright (c) Bytedance
>> > > + *
>> > > + * Authors:
>> > > + *  Bryan Zhang <bryan.zhang@bytedance.com>
>> > > + *  Hao Xiang   <hao.xiang@bytedance.com>
>> > > + *
>> > > + * This work is licensed under the terms of the GNU GPL, version 2 or
>> later.
>> > > + * See the COPYING file in the top-level directory.
>> > > + */
>> > > +
>> > > +#include "qemu/osdep.h"
>> > > +#include "exec/ramblock.h"
>> > > +#include "exec/target_page.h"
>> > > +#include "qapi/error.h"
>> > > +#include "migration.h"
>> > > +#include "options.h"
>> > > +#include "multifd.h"
>> > > +
>> > > +static int qatzip_send_setup(MultiFDSendParams *p, Error **errp) {
>> > > +    return 0;
>> > > +}
>> > > +
>> > > +static void qatzip_send_cleanup(MultiFDSendParams *p, Error **errp)
>> > > +{};
>> > > +
>> > > +static int qatzip_send_prepare(MultiFDSendParams *p, Error **errp)
>> > > +{
>> > > +    MultiFDPages_t *pages = p->pages;
>> > > +
>> > > +    for (int i = 0; i < p->normal_num; i++) {
>> > > +        p->iov[p->iovs_num].iov_base = pages->block->host + p-
>> >normal[i];
>> > > +        p->iov[p->iovs_num].iov_len = p->page_size;
>> > > +        p->iovs_num++;
>> > > +    }
>> > > +
>> > > +    p->next_packet_size = p->normal_num * p->page_size;
>> > > +    p->flags |= MULTIFD_FLAG_NOCOMP;
>> > > +    return 0;
>> > > +}
>> > > +
>> > > +static int qatzip_recv_setup(MultiFDRecvParams *p, Error **errp) {
>> > > +    return 0;
>> > > +}
>> > > +
>> > > +static void qatzip_recv_cleanup(MultiFDRecvParams *p) {};
>> > > +
>> > > +static int qatzip_recv_pages(MultiFDRecvParams *p, Error **errp) {
>> > > +    uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
>> > > +
>> > > +    if (flags != MULTIFD_FLAG_NOCOMP) {
>> > > +        error_setg(errp, "multifd %u: flags received %x flags
>> expected %x",
>> > > +                   p->id, flags, MULTIFD_FLAG_NOCOMP);
>> > > +        return -1;
>> > > +    }
>> > > +    for (int i = 0; i < p->normal_num; i++) {
>> > > +        p->iov[i].iov_base = p->host + p->normal[i];
>> > > +        p->iov[i].iov_len = p->page_size;
>> > > +    }
>> > > +    return qio_channel_readv_all(p->c, p->iov, p->normal_num,
>> > > +errp); }
>> > > +
>> > > +static MultiFDMethods multifd_qatzip_ops = {
>> > > +    .send_setup = qatzip_send_setup,
>> > > +    .send_cleanup = qatzip_send_cleanup,
>> > > +    .send_prepare = qatzip_send_prepare,
>> > > +    .recv_setup = qatzip_recv_setup,
>> > > +    .recv_cleanup = qatzip_recv_cleanup,
>> > > +    .recv_pages = qatzip_recv_pages };
>> > > +
>> > > +static void multifd_qatzip_register(void) {
>> > > +    multifd_register_ops(MULTIFD_COMPRESSION_QATZIP,
>> > > +&multifd_qatzip_ops); }
>> > > +
>> > > +migration_init(multifd_qatzip_register);
>> > > diff --git a/migration/multifd.h b/migration/multifd.h index
>> > > a835643b48..5600f7fc82 100644
>> > > --- a/migration/multifd.h
>> > > +++ b/migration/multifd.h
>> > > @@ -33,6 +33,7 @@ int multifd_queue_page(QEMUFile *f, RAMBlock
>> > > *block, ram_addr_t offset);  #define MULTIFD_FLAG_NOCOMP (0 << 1)
>> > > #define MULTIFD_FLAG_ZLIB (1 << 1)  #define MULTIFD_FLAG_ZSTD (2 <<
>> > > 1)
>> > > +#define MULTIFD_FLAG_QATZIP (3 << 1)
>> > >
>> > >  /* This value needs to be a multiple of qemu_target_page_size() */
>> > > #define MULTIFD_PACKET_SIZE (512 * 1024) diff --git
>> > > a/qapi/migration.json b/qapi/migration.json index
>> > > 6d5a4b0489..e3cc195aed 100644
>> > > --- a/qapi/migration.json
>> > > +++ b/qapi/migration.json
>> > > @@ -625,11 +625,14 @@
>> > >  #
>> > >  # @zstd: use zstd compression method.
>> > >  #
>> > > +# @qatzip: use qatzip compression method.
>> > > +#
>> > >  # Since: 5.0
>> > >  ##
>> > >  { 'enum': 'MultiFDCompression',
>> > >    'data': [ 'none', 'zlib',
>> > > -            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
>> > > +            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' },
>> > > +            { 'name': 'qatzip', 'if': 'CONFIG_QATZIP'} ] }
>> >
>> > In another thread adding support to another Intel accelerator (IAA) we
>> > decided that it was better to select the offloading as an accelerator
>> > method to multifd zlib rather than as an entirely new compression
>> > format. Take a look at that discussion:
>> > https://lore.kernel.org/r/ZTFCnqbbqlmsUkRC@redhat.com
>> 
>> We had some early discussion with Intel folks (probably a different team
>> than the one with the above patchset). The understanding at the time is
>> that QAT is good at both bulk data compression and decompression. IAA is
>> good at decompression with smaller data size but compression performance
>> is not the best. In multifd, we are compressing up to 128 4k pages at a
>> time and potentially this can increase by configuring the packet size, at
>> the time we thought QAT could be a better fit in the multifd live
>> migration scenario. We would like to hear more from Intel.
>> 
>> From our benchmark testing, with two QAT devices, we can get deflate
>> compression throughout to around 7GB/s with ~160% CPU. That's beating the
>> current software implementation (zlib and zstd) by a lot. We are still
>> tuning the configuration in QEMU live migration now.
>> 
>> >
>> > As I understand it, QAT + QATzip would be compatible with both zlib
>> > and IAA + QPL, so we'd add another accelerator method like this:
>> >
>> > https://lore.kernel.org/r/20240103112851.908082-3-yuan1.liu@intel.com
>> >
>> 
>> I quickly read over the IAA patchset and I saw this:
>> 
>> "However, due to some reasons, QPL is currently not compatible with the
>> existing Zlib method that Zlib compressed data can be decompressed by QPl
>> and vice versa."
>> 
>> The above probably means the current zlib software implementation and IAA
>> are not compatible.
>> 
>> For QAT, although, both Intel's QATzip and zlib are internally using
>> deflate, QATzip only supports deflate with a 4 byte header, deflate
>> wrapped by Gzip header and footer, or deflate wrapped by Intel® QAT
>> Gzip* extension header and footer. None of the headers can be recognized
>> by zlib software implementation is my understanding. So if we want to make
>> them compatible with zlib, the QATzip library needs to support that.
>
> The QPL library currently cannot support the Z_SYNC_FLULSH operation of zlib steaming. 
> This is the reason why it is not compatible with zlib.
>

I had understood from previous discussion that we'd be able to at least
support compression with QPL and decompression with the existing
zlib-based code. Is that not correct? I was about to suggest
experimenting with the window size in the existing code to hopefully
solve the 4kb window size issue. If there are other limitations, then
that will not be enough.

Also, can you point to the source of that statement about Z_SYNC_FLUSH,
I couldn't find more information about it in the documentation.

>> > All that, of course, assuming we even want to support both
>> > accelerators. They're addressing the same problem after all. I wonder
>> > how we'd choose a precedence, since both seem to be present in the
>> > same processor family.
>> >
>> >
>> 
>> That's an interesting question :-) I think overall performance (throughput
>> and CPU overhead) should both be considered. IAA and QAT accelerators
>> don't present on all systems. We Bytedance choose to have both on our
>> platform when purchasing from Intel.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [External] Re: [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method
  2024-01-08 20:27         ` Fabiano Rosas
@ 2024-01-09  2:26           ` Liu, Yuan1
  2024-01-11  5:41             ` Hao Xiang
  2024-01-11  6:39           ` Hao Xiang
  1 sibling, 1 reply; 21+ messages in thread
From: Liu, Yuan1 @ 2024-01-09  2:26 UTC (permalink / raw)
  To: Fabiano Rosas, Hao Xiang
  Cc: Bryan Zhang, qemu-devel@nongnu.org, marcandre.lureau@redhat.com,
	peterx@redhat.com, quintela@redhat.com, peter.maydell@linaro.org,
	berrange@redhat.com, Zou, Nanhai

> -----Original Message-----
> From: Fabiano Rosas <farosas@suse.de>
> Sent: Tuesday, January 9, 2024 4:28 AM
> To: Liu, Yuan1 <yuan1.liu@intel.com>; Hao Xiang <hao.xiang@bytedance.com>
> Cc: Bryan Zhang <bryan.zhang@bytedance.com>; qemu-devel@nongnu.org;
> marcandre.lureau@redhat.com; peterx@redhat.com; quintela@redhat.com;
> peter.maydell@linaro.org; berrange@redhat.com
> Subject: RE: [External] Re: [PATCH 3/5] migration: Introduce unimplemented
> 'qatzip' compression method
> 
> "Liu, Yuan1" <yuan1.liu@intel.com> writes:
> 
> >> -----Original Message-----
> >> From: Hao Xiang <hao.xiang@bytedance.com>
> >> Sent: Saturday, January 6, 2024 7:53 AM
> >> To: Fabiano Rosas <farosas@suse.de>
> >> Cc: Bryan Zhang <bryan.zhang@bytedance.com>; qemu-devel@nongnu.org;
> >> marcandre.lureau@redhat.com; peterx@redhat.com; quintela@redhat.com;
> >> peter.maydell@linaro.org; Liu, Yuan1 <yuan1.liu@intel.com>;
> >> berrange@redhat.com
> >> Subject: Re: [External] Re: [PATCH 3/5] migration: Introduce
> >> unimplemented 'qatzip' compression method
> >>
> >> On Fri, Jan 5, 2024 at 12:07 PM Fabiano Rosas <farosas@suse.de> wrote:
> >> >
> >> > Bryan Zhang <bryan.zhang@bytedance.com> writes:
> >> >
> >> > +cc Yuan Liu, Daniel Berrangé
> >> >
> >> > > Adds support for 'qatzip' as an option for the multifd
> >> > > compression method parameter, but copy-pastes the no-op logic to
> >> > > leave the actual methods effectively unimplemented. This is in
> >> > > preparation of a subsequent commit that will implement actually
> >> > > using QAT for compression and decompression.
> >> > >
> >> > > Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
> >> > > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
> >> > > ---
> >> > >  hw/core/qdev-properties-system.c |  6 ++-
> >> > >  migration/meson.build            |  1 +
> >> > >  migration/multifd-qatzip.c       | 81
> >> ++++++++++++++++++++++++++++++++
> >> > >  migration/multifd.h              |  1 +
> >> > >  qapi/migration.json              |  5 +-
> >> > >  5 files changed, 92 insertions(+), 2 deletions(-)  create mode
> >> > > 100644 migration/multifd-qatzip.c
> >> > >
> >> > > diff --git a/hw/core/qdev-properties-system.c
> >> > > b/hw/core/qdev-properties-system.c
> >> > > index 1a396521d5..d8e48dcb0e 100644
> >> > > --- a/hw/core/qdev-properties-system.c
> >> > > +++ b/hw/core/qdev-properties-system.c
> >> > > @@ -658,7 +658,11 @@ const PropertyInfo qdev_prop_fdc_drive_type
> >> > > = { const PropertyInfo qdev_prop_multifd_compression = {
> >> > >      .name = "MultiFDCompression",
> >> > >      .description = "multifd_compression values, "
> >> > > -                   "none/zlib/zstd",
> >> > > +                   "none/zlib/zstd"
> >> > > +#ifdef CONFIG_QATZIP
> >> > > +                   "/qatzip"
> >> > > +#endif
> >> > > +                   ,
> >> > >      .enum_table = &MultiFDCompression_lookup,
> >> > >      .get = qdev_propinfo_get_enum,
> >> > >      .set = qdev_propinfo_set_enum, diff --git
> >> > > a/migration/meson.build b/migration/meson.build index
> >> > > 92b1cc4297..e20f318379 100644
> >> > > --- a/migration/meson.build
> >> > > +++ b/migration/meson.build
> >> > > @@ -40,6 +40,7 @@ if get_option('live_block_migration').allowed()
> >> > >    system_ss.add(files('block.c'))  endif
> >> > >  system_ss.add(when: zstd, if_true: files('multifd-zstd.c'))
> >> > > +system_ss.add(when: qatzip, if_true: files('multifd-qatzip.c'))
> >> > >
> >> > >  specific_ss.add(when: 'CONFIG_SYSTEM_ONLY',
> >> > >                  if_true: files('ram.c', diff --git
> >> > > a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c new file
> >> > > mode 100644 index 0000000000..1733bbddb7
> >> > > --- /dev/null
> >> > > +++ b/migration/multifd-qatzip.c
> >> > > @@ -0,0 +1,81 @@
> >> > > +/*
> >> > > + * Multifd QATzip compression implementation
> >> > > + *
> >> > > + * Copyright (c) Bytedance
> >> > > + *
> >> > > + * Authors:
> >> > > + *  Bryan Zhang <bryan.zhang@bytedance.com>
> >> > > + *  Hao Xiang   <hao.xiang@bytedance.com>
> >> > > + *
> >> > > + * This work is licensed under the terms of the GNU GPL, version 2
> or
> >> later.
> >> > > + * See the COPYING file in the top-level directory.
> >> > > + */
> >> > > +
> >> > > +#include "qemu/osdep.h"
> >> > > +#include "exec/ramblock.h"
> >> > > +#include "exec/target_page.h"
> >> > > +#include "qapi/error.h"
> >> > > +#include "migration.h"
> >> > > +#include "options.h"
> >> > > +#include "multifd.h"
> >> > > +
> >> > > +static int qatzip_send_setup(MultiFDSendParams *p, Error **errp) {
> >> > > +    return 0;
> >> > > +}
> >> > > +
> >> > > +static void qatzip_send_cleanup(MultiFDSendParams *p, Error
> **errp)
> >> > > +{};
> >> > > +
> >> > > +static int qatzip_send_prepare(MultiFDSendParams *p, Error **errp)
> >> > > +{
> >> > > +    MultiFDPages_t *pages = p->pages;
> >> > > +
> >> > > +    for (int i = 0; i < p->normal_num; i++) {
> >> > > +        p->iov[p->iovs_num].iov_base = pages->block->host + p-
> >> >normal[i];
> >> > > +        p->iov[p->iovs_num].iov_len = p->page_size;
> >> > > +        p->iovs_num++;
> >> > > +    }
> >> > > +
> >> > > +    p->next_packet_size = p->normal_num * p->page_size;
> >> > > +    p->flags |= MULTIFD_FLAG_NOCOMP;
> >> > > +    return 0;
> >> > > +}
> >> > > +
> >> > > +static int qatzip_recv_setup(MultiFDRecvParams *p, Error **errp) {
> >> > > +    return 0;
> >> > > +}
> >> > > +
> >> > > +static void qatzip_recv_cleanup(MultiFDRecvParams *p) {};
> >> > > +
> >> > > +static int qatzip_recv_pages(MultiFDRecvParams *p, Error **errp) {
> >> > > +    uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
> >> > > +
> >> > > +    if (flags != MULTIFD_FLAG_NOCOMP) {
> >> > > +        error_setg(errp, "multifd %u: flags received %x flags
> >> expected %x",
> >> > > +                   p->id, flags, MULTIFD_FLAG_NOCOMP);
> >> > > +        return -1;
> >> > > +    }
> >> > > +    for (int i = 0; i < p->normal_num; i++) {
> >> > > +        p->iov[i].iov_base = p->host + p->normal[i];
> >> > > +        p->iov[i].iov_len = p->page_size;
> >> > > +    }
> >> > > +    return qio_channel_readv_all(p->c, p->iov, p->normal_num,
> >> > > +errp); }
> >> > > +
> >> > > +static MultiFDMethods multifd_qatzip_ops = {
> >> > > +    .send_setup = qatzip_send_setup,
> >> > > +    .send_cleanup = qatzip_send_cleanup,
> >> > > +    .send_prepare = qatzip_send_prepare,
> >> > > +    .recv_setup = qatzip_recv_setup,
> >> > > +    .recv_cleanup = qatzip_recv_cleanup,
> >> > > +    .recv_pages = qatzip_recv_pages };
> >> > > +
> >> > > +static void multifd_qatzip_register(void) {
> >> > > +    multifd_register_ops(MULTIFD_COMPRESSION_QATZIP,
> >> > > +&multifd_qatzip_ops); }
> >> > > +
> >> > > +migration_init(multifd_qatzip_register);
> >> > > diff --git a/migration/multifd.h b/migration/multifd.h index
> >> > > a835643b48..5600f7fc82 100644
> >> > > --- a/migration/multifd.h
> >> > > +++ b/migration/multifd.h
> >> > > @@ -33,6 +33,7 @@ int multifd_queue_page(QEMUFile *f, RAMBlock
> >> > > *block, ram_addr_t offset);  #define MULTIFD_FLAG_NOCOMP (0 << 1)
> >> > > #define MULTIFD_FLAG_ZLIB (1 << 1)  #define MULTIFD_FLAG_ZSTD (2 <<
> >> > > 1)
> >> > > +#define MULTIFD_FLAG_QATZIP (3 << 1)
> >> > >
> >> > >  /* This value needs to be a multiple of qemu_target_page_size() */
> >> > > #define MULTIFD_PACKET_SIZE (512 * 1024) diff --git
> >> > > a/qapi/migration.json b/qapi/migration.json index
> >> > > 6d5a4b0489..e3cc195aed 100644
> >> > > --- a/qapi/migration.json
> >> > > +++ b/qapi/migration.json
> >> > > @@ -625,11 +625,14 @@
> >> > >  #
> >> > >  # @zstd: use zstd compression method.
> >> > >  #
> >> > > +# @qatzip: use qatzip compression method.
> >> > > +#
> >> > >  # Since: 5.0
> >> > >  ##
> >> > >  { 'enum': 'MultiFDCompression',
> >> > >    'data': [ 'none', 'zlib',
> >> > > -            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
> >> > > +            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' },
> >> > > +            { 'name': 'qatzip', 'if': 'CONFIG_QATZIP'} ] }
> >> >
> >> > In another thread adding support to another Intel accelerator (IAA)
> we
> >> > decided that it was better to select the offloading as an accelerator
> >> > method to multifd zlib rather than as an entirely new compression
> >> > format. Take a look at that discussion:
> >> > https://lore.kernel.org/r/ZTFCnqbbqlmsUkRC@redhat.com
> >>
> >> We had some early discussion with Intel folks (probably a different
> team
> >> than the one with the above patchset). The understanding at the time is
> >> that QAT is good at both bulk data compression and decompression. IAA
> is
> >> good at decompression with smaller data size but compression
> performance
> >> is not the best. In multifd, we are compressing up to 128 4k pages at a
> >> time and potentially this can increase by configuring the packet size,
> at
> >> the time we thought QAT could be a better fit in the multifd live
> >> migration scenario. We would like to hear more from Intel.
> >>
> >> From our benchmark testing, with two QAT devices, we can get deflate
> >> compression throughout to around 7GB/s with ~160% CPU. That's beating
> the
> >> current software implementation (zlib and zstd) by a lot. We are still
> >> tuning the configuration in QEMU live migration now.
> >>
> >> >
> >> > As I understand it, QAT + QATzip would be compatible with both zlib
> >> > and IAA + QPL, so we'd add another accelerator method like this:
> >> >
> >> > https://lore.kernel.org/r/20240103112851.908082-3-yuan1.liu@intel.com
> >> >
> >>
> >> I quickly read over the IAA patchset and I saw this:
> >>
> >> "However, due to some reasons, QPL is currently not compatible with the
> >> existing Zlib method that Zlib compressed data can be decompressed by
> QPl
> >> and vice versa."
> >>
> >> The above probably means the current zlib software implementation and
> IAA
> >> are not compatible.
> >>
> >> For QAT, although, both Intel's QATzip and zlib are internally using
> >> deflate, QATzip only supports deflate with a 4 byte header, deflate
> >> wrapped by Gzip header and footer, or deflate wrapped by Intel® QAT
> >> Gzip* extension header and footer. None of the headers can be
> recognized
> >> by zlib software implementation is my understanding. So if we want to
> make
> >> them compatible with zlib, the QATzip library needs to support that.
> >
> > The QPL library currently cannot support the Z_SYNC_FLULSH operation of
> zlib steaming.
> > This is the reason why it is not compatible with zlib.
> >
> 
> I had understood from previous discussion that we'd be able to at least
> support compression with QPL and decompression with the existing
> zlib-based code. Is that not correct? I was about to suggest
> experimenting with the window size in the existing code to hopefully
> solve the 4kb window size issue. If there are other limitations, then
> that will not be enough.
> 
> Also, can you point to the source of that statement about Z_SYNC_FLUSH,
> I couldn't find more information about it in the documentation.

static int zlib_send_prepare(MultiFDSendParams *p, Error **errp)
{
    struct zlib_data *z = p->data;
    z_stream *zs = &z->zs;
    …
    for (i = 0; i < p->normal_num; i++) {
        uint32_t available = z->zbuff_len - out_size;
        int flush = Z_NO_FLUSH;

        if (i == p->normal_num - 1) {
            flush = Z_SYNC_FLUSH;
        }
If I understand correctly, the current implementation of multifd zlib treats each multifd thread as a separate stream. This implementation adds zlib headers and footers only to the beginning and end of the data, and the data transmitted in between does not require headers and footers.

The current implementation for IAA supports compressing data indicated by p->normal_num as a single stream. Each compressed data segment has zlib headers and footers. Since Z_SYNC_FLUSH is not supported, this means IAA has to complete the compression for a stream at once and cannot output parts of a stream in advance. Therefore, the current IAA is not compatible with existing zlib. Currently, it seems that the QAT implementation follows a similar approach.

Regarding the reference to Z_SYNC_FLUSH, you can find it at https://www.zlib.net/manual.html:
“If the parameter flush is set to Z_SYNC_FLUSH, all pending output is flushed to the output buffer and the output is aligned on a byte boundary, so that the decompressor can get all input data available so far. (In particular avail_in is zero after the call if enough output space has been provided before the call.) Flushing may degrade compression for some compression algorithms and so it should be used only when necessary. This completes the current deflate block and follows it with an empty stored block that is three bits plus filler bits to the next byte, followed by four bytes (00 00 ff ff).”

Based on the information above, I suggest the following options:

1. Modify multifd zlib to perform stream compression each time with p->normal_num pages. If this modification makes IAA compatible with zlib, I will implement it in the next version as per https://lore.kernel.org/all/20240103112851.908082-4-yuan1.liu@intel.com/T/ and provide performance data. We will also verify the feasibility with QAT.

2. Use zlib compression without stream compression, meaning each page is independently compressed. The advantage is that accelerators can concurrently process more pages, and the current IAA and QAT can both be compatible. The downside is a loss in compression ratio, and the length of the data after compressing each page needs to be added to MultiFDPacket_t. If future compression functionality considers support only on accelerators, the compression ratio can be improved through compression levels or other features without additional host CPU overhead.

Additionally, the default window size is set to 4K, which should effectively support IAA hardware.

> >> > All that, of course, assuming we even want to support both
> >> > accelerators. They're addressing the same problem after all. I wonder
> >> > how we'd choose a precedence, since both seem to be present in the
> >> > same processor family.
> >> >
> >> >
> >>
> >> That's an interesting question :-) I think overall performance
> (throughput
> >> and CPU overhead) should both be considered. IAA and QAT accelerators
> >> don't present on all systems. We Bytedance choose to have both on our
> >> platform when purchasing from Intel.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [External] Re: [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method
  2024-01-09  2:26           ` Liu, Yuan1
@ 2024-01-11  5:41             ` Hao Xiang
  2024-01-13 14:29               ` Liu, Yuan1
  0 siblings, 1 reply; 21+ messages in thread
From: Hao Xiang @ 2024-01-11  5:41 UTC (permalink / raw)
  To: Liu, Yuan1
  Cc: Fabiano Rosas, Bryan Zhang, qemu-devel@nongnu.org,
	marcandre.lureau@redhat.com, peterx@redhat.com,
	quintela@redhat.com, peter.maydell@linaro.org,
	berrange@redhat.com, Zou, Nanhai

On Mon, Jan 8, 2024 at 6:26 PM Liu, Yuan1 <yuan1.liu@intel.com> wrote:
>
> > -----Original Message-----
> > From: Fabiano Rosas <farosas@suse.de>
> > Sent: Tuesday, January 9, 2024 4:28 AM
> > To: Liu, Yuan1 <yuan1.liu@intel.com>; Hao Xiang <hao.xiang@bytedance.com>
> > Cc: Bryan Zhang <bryan.zhang@bytedance.com>; qemu-devel@nongnu.org;
> > marcandre.lureau@redhat.com; peterx@redhat.com; quintela@redhat.com;
> > peter.maydell@linaro.org; berrange@redhat.com
> > Subject: RE: [External] Re: [PATCH 3/5] migration: Introduce unimplemented
> > 'qatzip' compression method
> >
> > "Liu, Yuan1" <yuan1.liu@intel.com> writes:
> >
> > >> -----Original Message-----
> > >> From: Hao Xiang <hao.xiang@bytedance.com>
> > >> Sent: Saturday, January 6, 2024 7:53 AM
> > >> To: Fabiano Rosas <farosas@suse.de>
> > >> Cc: Bryan Zhang <bryan.zhang@bytedance.com>; qemu-devel@nongnu.org;
> > >> marcandre.lureau@redhat.com; peterx@redhat.com; quintela@redhat.com;
> > >> peter.maydell@linaro.org; Liu, Yuan1 <yuan1.liu@intel.com>;
> > >> berrange@redhat.com
> > >> Subject: Re: [External] Re: [PATCH 3/5] migration: Introduce
> > >> unimplemented 'qatzip' compression method
> > >>
> > >> On Fri, Jan 5, 2024 at 12:07 PM Fabiano Rosas <farosas@suse.de> wrote:
> > >> >
> > >> > Bryan Zhang <bryan.zhang@bytedance.com> writes:
> > >> >
> > >> > +cc Yuan Liu, Daniel Berrangé
> > >> >
> > >> > > Adds support for 'qatzip' as an option for the multifd
> > >> > > compression method parameter, but copy-pastes the no-op logic to
> > >> > > leave the actual methods effectively unimplemented. This is in
> > >> > > preparation of a subsequent commit that will implement actually
> > >> > > using QAT for compression and decompression.
> > >> > >
> > >> > > Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
> > >> > > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
> > >> > > ---
> > >> > >  hw/core/qdev-properties-system.c |  6 ++-
> > >> > >  migration/meson.build            |  1 +
> > >> > >  migration/multifd-qatzip.c       | 81
> > >> ++++++++++++++++++++++++++++++++
> > >> > >  migration/multifd.h              |  1 +
> > >> > >  qapi/migration.json              |  5 +-
> > >> > >  5 files changed, 92 insertions(+), 2 deletions(-)  create mode
> > >> > > 100644 migration/multifd-qatzip.c
> > >> > >
> > >> > > diff --git a/hw/core/qdev-properties-system.c
> > >> > > b/hw/core/qdev-properties-system.c
> > >> > > index 1a396521d5..d8e48dcb0e 100644
> > >> > > --- a/hw/core/qdev-properties-system.c
> > >> > > +++ b/hw/core/qdev-properties-system.c
> > >> > > @@ -658,7 +658,11 @@ const PropertyInfo qdev_prop_fdc_drive_type
> > >> > > = { const PropertyInfo qdev_prop_multifd_compression = {
> > >> > >      .name = "MultiFDCompression",
> > >> > >      .description = "multifd_compression values, "
> > >> > > -                   "none/zlib/zstd",
> > >> > > +                   "none/zlib/zstd"
> > >> > > +#ifdef CONFIG_QATZIP
> > >> > > +                   "/qatzip"
> > >> > > +#endif
> > >> > > +                   ,
> > >> > >      .enum_table = &MultiFDCompression_lookup,
> > >> > >      .get = qdev_propinfo_get_enum,
> > >> > >      .set = qdev_propinfo_set_enum, diff --git
> > >> > > a/migration/meson.build b/migration/meson.build index
> > >> > > 92b1cc4297..e20f318379 100644
> > >> > > --- a/migration/meson.build
> > >> > > +++ b/migration/meson.build
> > >> > > @@ -40,6 +40,7 @@ if get_option('live_block_migration').allowed()
> > >> > >    system_ss.add(files('block.c'))  endif
> > >> > >  system_ss.add(when: zstd, if_true: files('multifd-zstd.c'))
> > >> > > +system_ss.add(when: qatzip, if_true: files('multifd-qatzip.c'))
> > >> > >
> > >> > >  specific_ss.add(when: 'CONFIG_SYSTEM_ONLY',
> > >> > >                  if_true: files('ram.c', diff --git
> > >> > > a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c new file
> > >> > > mode 100644 index 0000000000..1733bbddb7
> > >> > > --- /dev/null
> > >> > > +++ b/migration/multifd-qatzip.c
> > >> > > @@ -0,0 +1,81 @@
> > >> > > +/*
> > >> > > + * Multifd QATzip compression implementation
> > >> > > + *
> > >> > > + * Copyright (c) Bytedance
> > >> > > + *
> > >> > > + * Authors:
> > >> > > + *  Bryan Zhang <bryan.zhang@bytedance.com>
> > >> > > + *  Hao Xiang   <hao.xiang@bytedance.com>
> > >> > > + *
> > >> > > + * This work is licensed under the terms of the GNU GPL, version 2
> > or
> > >> later.
> > >> > > + * See the COPYING file in the top-level directory.
> > >> > > + */
> > >> > > +
> > >> > > +#include "qemu/osdep.h"
> > >> > > +#include "exec/ramblock.h"
> > >> > > +#include "exec/target_page.h"
> > >> > > +#include "qapi/error.h"
> > >> > > +#include "migration.h"
> > >> > > +#include "options.h"
> > >> > > +#include "multifd.h"
> > >> > > +
> > >> > > +static int qatzip_send_setup(MultiFDSendParams *p, Error **errp) {
> > >> > > +    return 0;
> > >> > > +}
> > >> > > +
> > >> > > +static void qatzip_send_cleanup(MultiFDSendParams *p, Error
> > **errp)
> > >> > > +{};
> > >> > > +
> > >> > > +static int qatzip_send_prepare(MultiFDSendParams *p, Error **errp)
> > >> > > +{
> > >> > > +    MultiFDPages_t *pages = p->pages;
> > >> > > +
> > >> > > +    for (int i = 0; i < p->normal_num; i++) {
> > >> > > +        p->iov[p->iovs_num].iov_base = pages->block->host + p-
> > >> >normal[i];
> > >> > > +        p->iov[p->iovs_num].iov_len = p->page_size;
> > >> > > +        p->iovs_num++;
> > >> > > +    }
> > >> > > +
> > >> > > +    p->next_packet_size = p->normal_num * p->page_size;
> > >> > > +    p->flags |= MULTIFD_FLAG_NOCOMP;
> > >> > > +    return 0;
> > >> > > +}
> > >> > > +
> > >> > > +static int qatzip_recv_setup(MultiFDRecvParams *p, Error **errp) {
> > >> > > +    return 0;
> > >> > > +}
> > >> > > +
> > >> > > +static void qatzip_recv_cleanup(MultiFDRecvParams *p) {};
> > >> > > +
> > >> > > +static int qatzip_recv_pages(MultiFDRecvParams *p, Error **errp) {
> > >> > > +    uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
> > >> > > +
> > >> > > +    if (flags != MULTIFD_FLAG_NOCOMP) {
> > >> > > +        error_setg(errp, "multifd %u: flags received %x flags
> > >> expected %x",
> > >> > > +                   p->id, flags, MULTIFD_FLAG_NOCOMP);
> > >> > > +        return -1;
> > >> > > +    }
> > >> > > +    for (int i = 0; i < p->normal_num; i++) {
> > >> > > +        p->iov[i].iov_base = p->host + p->normal[i];
> > >> > > +        p->iov[i].iov_len = p->page_size;
> > >> > > +    }
> > >> > > +    return qio_channel_readv_all(p->c, p->iov, p->normal_num,
> > >> > > +errp); }
> > >> > > +
> > >> > > +static MultiFDMethods multifd_qatzip_ops = {
> > >> > > +    .send_setup = qatzip_send_setup,
> > >> > > +    .send_cleanup = qatzip_send_cleanup,
> > >> > > +    .send_prepare = qatzip_send_prepare,
> > >> > > +    .recv_setup = qatzip_recv_setup,
> > >> > > +    .recv_cleanup = qatzip_recv_cleanup,
> > >> > > +    .recv_pages = qatzip_recv_pages };
> > >> > > +
> > >> > > +static void multifd_qatzip_register(void) {
> > >> > > +    multifd_register_ops(MULTIFD_COMPRESSION_QATZIP,
> > >> > > +&multifd_qatzip_ops); }
> > >> > > +
> > >> > > +migration_init(multifd_qatzip_register);
> > >> > > diff --git a/migration/multifd.h b/migration/multifd.h index
> > >> > > a835643b48..5600f7fc82 100644
> > >> > > --- a/migration/multifd.h
> > >> > > +++ b/migration/multifd.h
> > >> > > @@ -33,6 +33,7 @@ int multifd_queue_page(QEMUFile *f, RAMBlock
> > >> > > *block, ram_addr_t offset);  #define MULTIFD_FLAG_NOCOMP (0 << 1)
> > >> > > #define MULTIFD_FLAG_ZLIB (1 << 1)  #define MULTIFD_FLAG_ZSTD (2 <<
> > >> > > 1)
> > >> > > +#define MULTIFD_FLAG_QATZIP (3 << 1)
> > >> > >
> > >> > >  /* This value needs to be a multiple of qemu_target_page_size() */
> > >> > > #define MULTIFD_PACKET_SIZE (512 * 1024) diff --git
> > >> > > a/qapi/migration.json b/qapi/migration.json index
> > >> > > 6d5a4b0489..e3cc195aed 100644
> > >> > > --- a/qapi/migration.json
> > >> > > +++ b/qapi/migration.json
> > >> > > @@ -625,11 +625,14 @@
> > >> > >  #
> > >> > >  # @zstd: use zstd compression method.
> > >> > >  #
> > >> > > +# @qatzip: use qatzip compression method.
> > >> > > +#
> > >> > >  # Since: 5.0
> > >> > >  ##
> > >> > >  { 'enum': 'MultiFDCompression',
> > >> > >    'data': [ 'none', 'zlib',
> > >> > > -            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
> > >> > > +            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' },
> > >> > > +            { 'name': 'qatzip', 'if': 'CONFIG_QATZIP'} ] }
> > >> >
> > >> > In another thread adding support to another Intel accelerator (IAA)
> > we
> > >> > decided that it was better to select the offloading as an accelerator
> > >> > method to multifd zlib rather than as an entirely new compression
> > >> > format. Take a look at that discussion:
> > >> > https://lore.kernel.org/r/ZTFCnqbbqlmsUkRC@redhat.com
> > >>
> > >> We had some early discussion with Intel folks (probably a different
> > team
> > >> than the one with the above patchset). The understanding at the time is
> > >> that QAT is good at both bulk data compression and decompression. IAA
> > is
> > >> good at decompression with smaller data size but compression
> > performance
> > >> is not the best. In multifd, we are compressing up to 128 4k pages at a
> > >> time and potentially this can increase by configuring the packet size,
> > at
> > >> the time we thought QAT could be a better fit in the multifd live
> > >> migration scenario. We would like to hear more from Intel.
> > >>
> > >> From our benchmark testing, with two QAT devices, we can get deflate
> > >> compression throughout to around 7GB/s with ~160% CPU. That's beating
> > the
> > >> current software implementation (zlib and zstd) by a lot. We are still
> > >> tuning the configuration in QEMU live migration now.
> > >>
> > >> >
> > >> > As I understand it, QAT + QATzip would be compatible with both zlib
> > >> > and IAA + QPL, so we'd add another accelerator method like this:
> > >> >
> > >> > https://lore.kernel.org/r/20240103112851.908082-3-yuan1.liu@intel.com
> > >> >
> > >>
> > >> I quickly read over the IAA patchset and I saw this:
> > >>
> > >> "However, due to some reasons, QPL is currently not compatible with the
> > >> existing Zlib method that Zlib compressed data can be decompressed by
> > QPl
> > >> and vice versa."
> > >>
> > >> The above probably means the current zlib software implementation and
> > IAA
> > >> are not compatible.
> > >>
> > >> For QAT, although, both Intel's QATzip and zlib are internally using
> > >> deflate, QATzip only supports deflate with a 4 byte header, deflate
> > >> wrapped by Gzip header and footer, or deflate wrapped by Intel® QAT
> > >> Gzip* extension header and footer. None of the headers can be
> > recognized
> > >> by zlib software implementation is my understanding. So if we want to
> > make
> > >> them compatible with zlib, the QATzip library needs to support that.
> > >
> > > The QPL library currently cannot support the Z_SYNC_FLULSH operation of
> > zlib steaming.
> > > This is the reason why it is not compatible with zlib.
> > >
> >
> > I had understood from previous discussion that we'd be able to at least
> > support compression with QPL and decompression with the existing
> > zlib-based code. Is that not correct? I was about to suggest
> > experimenting with the window size in the existing code to hopefully
> > solve the 4kb window size issue. If there are other limitations, then
> > that will not be enough.
> >
> > Also, can you point to the source of that statement about Z_SYNC_FLUSH,
> > I couldn't find more information about it in the documentation.
>
> static int zlib_send_prepare(MultiFDSendParams *p, Error **errp)
> {
>     struct zlib_data *z = p->data;
>     z_stream *zs = &z->zs;
>     …
>     for (i = 0; i < p->normal_num; i++) {
>         uint32_t available = z->zbuff_len - out_size;
>         int flush = Z_NO_FLUSH;
>
>         if (i == p->normal_num - 1) {
>             flush = Z_SYNC_FLUSH;
>         }
> If I understand correctly, the current implementation of multifd zlib treats each multifd thread as a separate stream. This implementation adds zlib headers and footers only to the beginning and end of the data, and the data transmitted in between does not require headers and footers.

zlib_send_prepare() is just calling the deflate() repeatedly in a
loop. While compressing the last page, it sets the flag Z_SYNC_FLUSH,
which should dump all pending output to the buffer.

"This implementation adds zlib headers and footers only to the
beginning and end of the data,"
^ You mean putting a header/footer per multifd packet, correct?

>
> The current implementation for IAA supports compressing data indicated by p->normal_num as a single stream. Each compressed data segment has zlib headers and footers. Since Z_SYNC_FLUSH is not supported, this means IAA has to complete the compression for a stream at once and cannot output parts of a stream in advance. Therefore, the current IAA is not compatible with existing zlib. Currently, it seems that the QAT implementation follows a similar approach.

"Since Z_SYNC_FLUSH is not supported, this means IAA has to complete
the compression for a stream at once and cannot output parts of a
stream in advance."
Does IAA's deflate compression put a header/footer per page? Or per
multifd packet?

>
> Regarding the reference to Z_SYNC_FLUSH, you can find it at https://www.zlib.net/manual.html:
> “If the parameter flush is set to Z_SYNC_FLUSH, all pending output is flushed to the output buffer and the output is aligned on a byte boundary, so that the decompressor can get all input data available so far. (In particular avail_in is zero after the call if enough output space has been provided before the call.) Flushing may degrade compression for some compression algorithms and so it should be used only when necessary. This completes the current deflate block and follows it with an empty stored block that is three bits plus filler bits to the next byte, followed by four bytes (00 00 ff ff).”
>
> Based on the information above, I suggest the following options:
>
> 1. Modify multifd zlib to perform stream compression each time with p->normal_num pages. If this modification makes IAA compatible with zlib, I will implement it in the next version as per https://lore.kernel.org/all/20240103112851.908082-4-yuan1.liu@intel.com/T/ and provide performance data. We will also verify the feasibility with QAT.
>
> 2. Use zlib compression without stream compression, meaning each page is independently compressed. The advantage is that accelerators can concurrently process more pages, and the current IAA and QAT can both be compatible. The downside is a loss in compression ratio, and the length of the data after compressing each page needs to be added to MultiFDPacket_t. If future compression functionality considers support only on accelerators, the compression ratio can be improved through compression levels or other features without additional host CPU overhead.
>

We noticed a pretty significant performance difference in QAT deflate
while using the streaming version VS the non-streaming version. The
non-streaming version has better performance.
Sounds like the easiest way to have zlib/IAA/QAT to be capable of
compressing/decompressing interchangeably is to set the Z_SYNC_FLUSH
flag on all deflate() calls in zlib_send_prepare(). And in my opinion,
it is worth the trade-off. If my hardware doesn't have accelerators
and I want pure software based compression, I would choose zstd over
zlib.

> Additionally, the default window size is set to 4K, which should effectively support IAA hardware.
>
> > >> > All that, of course, assuming we even want to support both
> > >> > accelerators. They're addressing the same problem after all. I wonder
> > >> > how we'd choose a precedence, since both seem to be present in the
> > >> > same processor family.
> > >> >
> > >> >
> > >>
> > >> That's an interesting question :-) I think overall performance
> > (throughput
> > >> and CPU overhead) should both be considered. IAA and QAT accelerators
> > >> don't present on all systems. We Bytedance choose to have both on our
> > >> platform when purchasing from Intel.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [External] Re: [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method
  2024-01-08 20:27         ` Fabiano Rosas
  2024-01-09  2:26           ` Liu, Yuan1
@ 2024-01-11  6:39           ` Hao Xiang
  2024-01-13 14:10             ` Liu, Yuan1
  1 sibling, 1 reply; 21+ messages in thread
From: Hao Xiang @ 2024-01-11  6:39 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: Liu, Yuan1, Bryan Zhang, qemu-devel@nongnu.org,
	marcandre.lureau@redhat.com, peterx@redhat.com,
	quintela@redhat.com, peter.maydell@linaro.org,
	berrange@redhat.com

On Mon, Jan 8, 2024 at 12:28 PM Fabiano Rosas <farosas@suse.de> wrote:
>
> "Liu, Yuan1" <yuan1.liu@intel.com> writes:
>
> >> -----Original Message-----
> >> From: Hao Xiang <hao.xiang@bytedance.com>
> >> Sent: Saturday, January 6, 2024 7:53 AM
> >> To: Fabiano Rosas <farosas@suse.de>
> >> Cc: Bryan Zhang <bryan.zhang@bytedance.com>; qemu-devel@nongnu.org;
> >> marcandre.lureau@redhat.com; peterx@redhat.com; quintela@redhat.com;
> >> peter.maydell@linaro.org; Liu, Yuan1 <yuan1.liu@intel.com>;
> >> berrange@redhat.com
> >> Subject: Re: [External] Re: [PATCH 3/5] migration: Introduce unimplemented
> >> 'qatzip' compression method
> >>
> >> On Fri, Jan 5, 2024 at 12:07 PM Fabiano Rosas <farosas@suse.de> wrote:
> >> >
> >> > Bryan Zhang <bryan.zhang@bytedance.com> writes:
> >> >
> >> > +cc Yuan Liu, Daniel Berrangé
> >> >
> >> > > Adds support for 'qatzip' as an option for the multifd compression
> >> > > method parameter, but copy-pastes the no-op logic to leave the
> >> > > actual methods effectively unimplemented. This is in preparation of
> >> > > a subsequent commit that will implement actually using QAT for
> >> > > compression and decompression.
> >> > >
> >> > > Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
> >> > > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
> >> > > ---
> >> > >  hw/core/qdev-properties-system.c |  6 ++-
> >> > >  migration/meson.build            |  1 +
> >> > >  migration/multifd-qatzip.c       | 81
> >> ++++++++++++++++++++++++++++++++
> >> > >  migration/multifd.h              |  1 +
> >> > >  qapi/migration.json              |  5 +-
> >> > >  5 files changed, 92 insertions(+), 2 deletions(-)  create mode
> >> > > 100644 migration/multifd-qatzip.c
> >> > >
> >> > > diff --git a/hw/core/qdev-properties-system.c
> >> > > b/hw/core/qdev-properties-system.c
> >> > > index 1a396521d5..d8e48dcb0e 100644
> >> > > --- a/hw/core/qdev-properties-system.c
> >> > > +++ b/hw/core/qdev-properties-system.c
> >> > > @@ -658,7 +658,11 @@ const PropertyInfo qdev_prop_fdc_drive_type = {
> >> > > const PropertyInfo qdev_prop_multifd_compression = {
> >> > >      .name = "MultiFDCompression",
> >> > >      .description = "multifd_compression values, "
> >> > > -                   "none/zlib/zstd",
> >> > > +                   "none/zlib/zstd"
> >> > > +#ifdef CONFIG_QATZIP
> >> > > +                   "/qatzip"
> >> > > +#endif
> >> > > +                   ,
> >> > >      .enum_table = &MultiFDCompression_lookup,
> >> > >      .get = qdev_propinfo_get_enum,
> >> > >      .set = qdev_propinfo_set_enum,
> >> > > diff --git a/migration/meson.build b/migration/meson.build index
> >> > > 92b1cc4297..e20f318379 100644
> >> > > --- a/migration/meson.build
> >> > > +++ b/migration/meson.build
> >> > > @@ -40,6 +40,7 @@ if get_option('live_block_migration').allowed()
> >> > >    system_ss.add(files('block.c'))
> >> > >  endif
> >> > >  system_ss.add(when: zstd, if_true: files('multifd-zstd.c'))
> >> > > +system_ss.add(when: qatzip, if_true: files('multifd-qatzip.c'))
> >> > >
> >> > >  specific_ss.add(when: 'CONFIG_SYSTEM_ONLY',
> >> > >                  if_true: files('ram.c', diff --git
> >> > > a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c new file
> >> > > mode 100644 index 0000000000..1733bbddb7
> >> > > --- /dev/null
> >> > > +++ b/migration/multifd-qatzip.c
> >> > > @@ -0,0 +1,81 @@
> >> > > +/*
> >> > > + * Multifd QATzip compression implementation
> >> > > + *
> >> > > + * Copyright (c) Bytedance
> >> > > + *
> >> > > + * Authors:
> >> > > + *  Bryan Zhang <bryan.zhang@bytedance.com>
> >> > > + *  Hao Xiang   <hao.xiang@bytedance.com>
> >> > > + *
> >> > > + * This work is licensed under the terms of the GNU GPL, version 2 or
> >> later.
> >> > > + * See the COPYING file in the top-level directory.
> >> > > + */
> >> > > +
> >> > > +#include "qemu/osdep.h"
> >> > > +#include "exec/ramblock.h"
> >> > > +#include "exec/target_page.h"
> >> > > +#include "qapi/error.h"
> >> > > +#include "migration.h"
> >> > > +#include "options.h"
> >> > > +#include "multifd.h"
> >> > > +
> >> > > +static int qatzip_send_setup(MultiFDSendParams *p, Error **errp) {
> >> > > +    return 0;
> >> > > +}
> >> > > +
> >> > > +static void qatzip_send_cleanup(MultiFDSendParams *p, Error **errp)
> >> > > +{};
> >> > > +
> >> > > +static int qatzip_send_prepare(MultiFDSendParams *p, Error **errp)
> >> > > +{
> >> > > +    MultiFDPages_t *pages = p->pages;
> >> > > +
> >> > > +    for (int i = 0; i < p->normal_num; i++) {
> >> > > +        p->iov[p->iovs_num].iov_base = pages->block->host + p-
> >> >normal[i];
> >> > > +        p->iov[p->iovs_num].iov_len = p->page_size;
> >> > > +        p->iovs_num++;
> >> > > +    }
> >> > > +
> >> > > +    p->next_packet_size = p->normal_num * p->page_size;
> >> > > +    p->flags |= MULTIFD_FLAG_NOCOMP;
> >> > > +    return 0;
> >> > > +}
> >> > > +
> >> > > +static int qatzip_recv_setup(MultiFDRecvParams *p, Error **errp) {
> >> > > +    return 0;
> >> > > +}
> >> > > +
> >> > > +static void qatzip_recv_cleanup(MultiFDRecvParams *p) {};
> >> > > +
> >> > > +static int qatzip_recv_pages(MultiFDRecvParams *p, Error **errp) {
> >> > > +    uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
> >> > > +
> >> > > +    if (flags != MULTIFD_FLAG_NOCOMP) {
> >> > > +        error_setg(errp, "multifd %u: flags received %x flags
> >> expected %x",
> >> > > +                   p->id, flags, MULTIFD_FLAG_NOCOMP);
> >> > > +        return -1;
> >> > > +    }
> >> > > +    for (int i = 0; i < p->normal_num; i++) {
> >> > > +        p->iov[i].iov_base = p->host + p->normal[i];
> >> > > +        p->iov[i].iov_len = p->page_size;
> >> > > +    }
> >> > > +    return qio_channel_readv_all(p->c, p->iov, p->normal_num,
> >> > > +errp); }
> >> > > +
> >> > > +static MultiFDMethods multifd_qatzip_ops = {
> >> > > +    .send_setup = qatzip_send_setup,
> >> > > +    .send_cleanup = qatzip_send_cleanup,
> >> > > +    .send_prepare = qatzip_send_prepare,
> >> > > +    .recv_setup = qatzip_recv_setup,
> >> > > +    .recv_cleanup = qatzip_recv_cleanup,
> >> > > +    .recv_pages = qatzip_recv_pages };
> >> > > +
> >> > > +static void multifd_qatzip_register(void) {
> >> > > +    multifd_register_ops(MULTIFD_COMPRESSION_QATZIP,
> >> > > +&multifd_qatzip_ops); }
> >> > > +
> >> > > +migration_init(multifd_qatzip_register);
> >> > > diff --git a/migration/multifd.h b/migration/multifd.h index
> >> > > a835643b48..5600f7fc82 100644
> >> > > --- a/migration/multifd.h
> >> > > +++ b/migration/multifd.h
> >> > > @@ -33,6 +33,7 @@ int multifd_queue_page(QEMUFile *f, RAMBlock
> >> > > *block, ram_addr_t offset);  #define MULTIFD_FLAG_NOCOMP (0 << 1)
> >> > > #define MULTIFD_FLAG_ZLIB (1 << 1)  #define MULTIFD_FLAG_ZSTD (2 <<
> >> > > 1)
> >> > > +#define MULTIFD_FLAG_QATZIP (3 << 1)
> >> > >
> >> > >  /* This value needs to be a multiple of qemu_target_page_size() */
> >> > > #define MULTIFD_PACKET_SIZE (512 * 1024) diff --git
> >> > > a/qapi/migration.json b/qapi/migration.json index
> >> > > 6d5a4b0489..e3cc195aed 100644
> >> > > --- a/qapi/migration.json
> >> > > +++ b/qapi/migration.json
> >> > > @@ -625,11 +625,14 @@
> >> > >  #
> >> > >  # @zstd: use zstd compression method.
> >> > >  #
> >> > > +# @qatzip: use qatzip compression method.
> >> > > +#
> >> > >  # Since: 5.0
> >> > >  ##
> >> > >  { 'enum': 'MultiFDCompression',
> >> > >    'data': [ 'none', 'zlib',
> >> > > -            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
> >> > > +            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' },
> >> > > +            { 'name': 'qatzip', 'if': 'CONFIG_QATZIP'} ] }
> >> >
> >> > In another thread adding support to another Intel accelerator (IAA) we
> >> > decided that it was better to select the offloading as an accelerator
> >> > method to multifd zlib rather than as an entirely new compression
> >> > format. Take a look at that discussion:
> >> > https://lore.kernel.org/r/ZTFCnqbbqlmsUkRC@redhat.com
> >>
> >> We had some early discussion with Intel folks (probably a different team
> >> than the one with the above patchset). The understanding at the time is
> >> that QAT is good at both bulk data compression and decompression. IAA is
> >> good at decompression with smaller data size but compression performance
> >> is not the best. In multifd, we are compressing up to 128 4k pages at a
> >> time and potentially this can increase by configuring the packet size, at
> >> the time we thought QAT could be a better fit in the multifd live
> >> migration scenario. We would like to hear more from Intel.
> >>
> >> From our benchmark testing, with two QAT devices, we can get deflate
> >> compression throughout to around 7GB/s with ~160% CPU. That's beating the
> >> current software implementation (zlib and zstd) by a lot. We are still
> >> tuning the configuration in QEMU live migration now.
> >>
> >> >
> >> > As I understand it, QAT + QATzip would be compatible with both zlib
> >> > and IAA + QPL, so we'd add another accelerator method like this:
> >> >
> >> > https://lore.kernel.org/r/20240103112851.908082-3-yuan1.liu@intel.com
> >> >
> >>
> >> I quickly read over the IAA patchset and I saw this:
> >>
> >> "However, due to some reasons, QPL is currently not compatible with the
> >> existing Zlib method that Zlib compressed data can be decompressed by QPl
> >> and vice versa."
> >>
> >> The above probably means the current zlib software implementation and IAA
> >> are not compatible.
> >>
> >> For QAT, although, both Intel's QATzip and zlib are internally using
> >> deflate, QATzip only supports deflate with a 4 byte header, deflate
> >> wrapped by Gzip header and footer, or deflate wrapped by Intel® QAT
> >> Gzip* extension header and footer. None of the headers can be recognized
> >> by zlib software implementation is my understanding. So if we want to make
> >> them compatible with zlib, the QATzip library needs to support that.
> >
> > The QPL library currently cannot support the Z_SYNC_FLULSH operation of zlib steaming.
> > This is the reason why it is not compatible with zlib.
> >

When doing the QPL compression, if we set both QPL_FLAG_FIRST and
QPL_FLAG_LAST flags and run a single job for all pages in a multifd
packet, would it generate the same format as Z_SYNC_FLULSH in the
current zlib-based code?

>
> I had understood from previous discussion that we'd be able to at least
> support compression with QPL and decompression with the existing
> zlib-based code. Is that not correct? I was about to suggest
> experimenting with the window size in the existing code to hopefully
> solve the 4kb window size issue. If there are other limitations, then
> that will not be enough.

>
> Also, can you point to the source of that statement about Z_SYNC_FLUSH,
> I couldn't find more information about it in the documentation.
>
> >> > All that, of course, assuming we even want to support both
> >> > accelerators. They're addressing the same problem after all. I wonder
> >> > how we'd choose a precedence, since both seem to be present in the
> >> > same processor family.
> >> >
> >> >
> >>
> >> That's an interesting question :-) I think overall performance (throughput
> >> and CPU overhead) should both be considered. IAA and QAT accelerators
> >> don't present on all systems. We Bytedance choose to have both on our
> >> platform when purchasing from Intel.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [External] Re: [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method
  2024-01-11  6:39           ` Hao Xiang
@ 2024-01-13 14:10             ` Liu, Yuan1
  0 siblings, 0 replies; 21+ messages in thread
From: Liu, Yuan1 @ 2024-01-13 14:10 UTC (permalink / raw)
  To: Hao Xiang, Fabiano Rosas
  Cc: Bryan Zhang, qemu-devel@nongnu.org, marcandre.lureau@redhat.com,
	peterx@redhat.com, quintela@redhat.com, peter.maydell@linaro.org,
	berrange@redhat.com

> -----Original Message-----
> From: Hao Xiang <hao.xiang@bytedance.com>
> Sent: Thursday, January 11, 2024 2:40 PM
> To: Fabiano Rosas <farosas@suse.de>
> Cc: Liu, Yuan1 <yuan1.liu@intel.com>; Bryan Zhang
> <bryan.zhang@bytedance.com>; qemu-devel@nongnu.org;
> marcandre.lureau@redhat.com; peterx@redhat.com; quintela@redhat.com;
> peter.maydell@linaro.org; berrange@redhat.com
> Subject: Re: [External] Re: [PATCH 3/5] migration: Introduce unimplemented
> 'qatzip' compression method
> 
> On Mon, Jan 8, 2024 at 12:28 PM Fabiano Rosas <farosas@suse.de> wrote:
> >
> > "Liu, Yuan1" <yuan1.liu@intel.com> writes:
> >
> > >> -----Original Message-----
> > >> From: Hao Xiang <hao.xiang@bytedance.com>
> > >> Sent: Saturday, January 6, 2024 7:53 AM
> > >> To: Fabiano Rosas <farosas@suse.de>
> > >> Cc: Bryan Zhang <bryan.zhang@bytedance.com>; qemu-devel@nongnu.org;
> > >> marcandre.lureau@redhat.com; peterx@redhat.com;
> > >> quintela@redhat.com; peter.maydell@linaro.org; Liu, Yuan1
> > >> <yuan1.liu@intel.com>; berrange@redhat.com
> > >> Subject: Re: [External] Re: [PATCH 3/5] migration: Introduce
> > >> unimplemented 'qatzip' compression method
> > >>
> > >> On Fri, Jan 5, 2024 at 12:07 PM Fabiano Rosas <farosas@suse.de>
> wrote:
> > >> >
> > >> > Bryan Zhang <bryan.zhang@bytedance.com> writes:
> > >> >
> > >> > +cc Yuan Liu, Daniel Berrangé
> > >> >
> > >> > > Adds support for 'qatzip' as an option for the multifd
> > >> > > compression method parameter, but copy-pastes the no-op logic
> > >> > > to leave the actual methods effectively unimplemented. This is
> > >> > > in preparation of a subsequent commit that will implement
> > >> > > actually using QAT for compression and decompression.
> > >> > >
> > >> > > Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
> > >> > > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
> > >> > > ---
> > >> > >  hw/core/qdev-properties-system.c |  6 ++-
> > >> > >  migration/meson.build            |  1 +
> > >> > >  migration/multifd-qatzip.c       | 81
> > >> ++++++++++++++++++++++++++++++++
> > >> > >  migration/multifd.h              |  1 +
> > >> > >  qapi/migration.json              |  5 +-
> > >> > >  5 files changed, 92 insertions(+), 2 deletions(-)  create mode
> > >> > > 100644 migration/multifd-qatzip.c
> > >> > >
> > >> > > diff --git a/hw/core/qdev-properties-system.c
> > >> > > b/hw/core/qdev-properties-system.c
> > >> > > index 1a396521d5..d8e48dcb0e 100644
> > >> > > --- a/hw/core/qdev-properties-system.c
> > >> > > +++ b/hw/core/qdev-properties-system.c
> > >> > > @@ -658,7 +658,11 @@ const PropertyInfo
> > >> > > qdev_prop_fdc_drive_type = { const PropertyInfo
> qdev_prop_multifd_compression = {
> > >> > >      .name = "MultiFDCompression",
> > >> > >      .description = "multifd_compression values, "
> > >> > > -                   "none/zlib/zstd",
> > >> > > +                   "none/zlib/zstd"
> > >> > > +#ifdef CONFIG_QATZIP
> > >> > > +                   "/qatzip"
> > >> > > +#endif
> > >> > > +                   ,
> > >> > >      .enum_table = &MultiFDCompression_lookup,
> > >> > >      .get = qdev_propinfo_get_enum,
> > >> > >      .set = qdev_propinfo_set_enum, diff --git
> > >> > > a/migration/meson.build b/migration/meson.build index
> > >> > > 92b1cc4297..e20f318379 100644
> > >> > > --- a/migration/meson.build
> > >> > > +++ b/migration/meson.build
> > >> > > @@ -40,6 +40,7 @@ if get_option('live_block_migration').allowed()
> > >> > >    system_ss.add(files('block.c'))  endif
> > >> > >  system_ss.add(when: zstd, if_true: files('multifd-zstd.c'))
> > >> > > +system_ss.add(when: qatzip, if_true:
> > >> > > +files('multifd-qatzip.c'))
> > >> > >
> > >> > >  specific_ss.add(when: 'CONFIG_SYSTEM_ONLY',
> > >> > >                  if_true: files('ram.c', diff --git
> > >> > > a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c new
> file
> > >> > > mode 100644 index 0000000000..1733bbddb7
> > >> > > --- /dev/null
> > >> > > +++ b/migration/multifd-qatzip.c
> > >> > > @@ -0,0 +1,81 @@
> > >> > > +/*
> > >> > > + * Multifd QATzip compression implementation
> > >> > > + *
> > >> > > + * Copyright (c) Bytedance
> > >> > > + *
> > >> > > + * Authors:
> > >> > > + *  Bryan Zhang <bryan.zhang@bytedance.com>
> > >> > > + *  Hao Xiang   <hao.xiang@bytedance.com>
> > >> > > + *
> > >> > > + * This work is licensed under the terms of the GNU GPL, version
> 2 or
> > >> later.
> > >> > > + * See the COPYING file in the top-level directory.
> > >> > > + */
> > >> > > +
> > >> > > +#include "qemu/osdep.h"
> > >> > > +#include "exec/ramblock.h"
> > >> > > +#include "exec/target_page.h"
> > >> > > +#include "qapi/error.h"
> > >> > > +#include "migration.h"
> > >> > > +#include "options.h"
> > >> > > +#include "multifd.h"
> > >> > > +
> > >> > > +static int qatzip_send_setup(MultiFDSendParams *p, Error **errp)
> {
> > >> > > +    return 0;
> > >> > > +}
> > >> > > +
> > >> > > +static void qatzip_send_cleanup(MultiFDSendParams *p, Error
> **errp)
> > >> > > +{};
> > >> > > +
> > >> > > +static int qatzip_send_prepare(MultiFDSendParams *p, Error
> **errp)
> > >> > > +{
> > >> > > +    MultiFDPages_t *pages = p->pages;
> > >> > > +
> > >> > > +    for (int i = 0; i < p->normal_num; i++) {
> > >> > > +        p->iov[p->iovs_num].iov_base = pages->block->host + p-
> > >> >normal[i];
> > >> > > +        p->iov[p->iovs_num].iov_len = p->page_size;
> > >> > > +        p->iovs_num++;
> > >> > > +    }
> > >> > > +
> > >> > > +    p->next_packet_size = p->normal_num * p->page_size;
> > >> > > +    p->flags |= MULTIFD_FLAG_NOCOMP;
> > >> > > +    return 0;
> > >> > > +}
> > >> > > +
> > >> > > +static int qatzip_recv_setup(MultiFDRecvParams *p, Error **errp)
> {
> > >> > > +    return 0;
> > >> > > +}
> > >> > > +
> > >> > > +static void qatzip_recv_cleanup(MultiFDRecvParams *p) {};
> > >> > > +
> > >> > > +static int qatzip_recv_pages(MultiFDRecvParams *p, Error **errp)
> {
> > >> > > +    uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
> > >> > > +
> > >> > > +    if (flags != MULTIFD_FLAG_NOCOMP) {
> > >> > > +        error_setg(errp, "multifd %u: flags received %x flags
> > >> expected %x",
> > >> > > +                   p->id, flags, MULTIFD_FLAG_NOCOMP);
> > >> > > +        return -1;
> > >> > > +    }
> > >> > > +    for (int i = 0; i < p->normal_num; i++) {
> > >> > > +        p->iov[i].iov_base = p->host + p->normal[i];
> > >> > > +        p->iov[i].iov_len = p->page_size;
> > >> > > +    }
> > >> > > +    return qio_channel_readv_all(p->c, p->iov, p->normal_num,
> > >> > > +errp); }
> > >> > > +
> > >> > > +static MultiFDMethods multifd_qatzip_ops = {
> > >> > > +    .send_setup = qatzip_send_setup,
> > >> > > +    .send_cleanup = qatzip_send_cleanup,
> > >> > > +    .send_prepare = qatzip_send_prepare,
> > >> > > +    .recv_setup = qatzip_recv_setup,
> > >> > > +    .recv_cleanup = qatzip_recv_cleanup,
> > >> > > +    .recv_pages = qatzip_recv_pages };
> > >> > > +
> > >> > > +static void multifd_qatzip_register(void) {
> > >> > > +    multifd_register_ops(MULTIFD_COMPRESSION_QATZIP,
> > >> > > +&multifd_qatzip_ops); }
> > >> > > +
> > >> > > +migration_init(multifd_qatzip_register);
> > >> > > diff --git a/migration/multifd.h b/migration/multifd.h index
> > >> > > a835643b48..5600f7fc82 100644
> > >> > > --- a/migration/multifd.h
> > >> > > +++ b/migration/multifd.h
> > >> > > @@ -33,6 +33,7 @@ int multifd_queue_page(QEMUFile *f, RAMBlock
> > >> > > *block, ram_addr_t offset);  #define MULTIFD_FLAG_NOCOMP (0 << 1)
> > >> > > #define MULTIFD_FLAG_ZLIB (1 << 1)  #define MULTIFD_FLAG_ZSTD (2
> <<
> > >> > > 1)
> > >> > > +#define MULTIFD_FLAG_QATZIP (3 << 1)
> > >> > >
> > >> > >  /* This value needs to be a multiple of qemu_target_page_size()
> */
> > >> > > #define MULTIFD_PACKET_SIZE (512 * 1024) diff --git
> > >> > > a/qapi/migration.json b/qapi/migration.json index
> > >> > > 6d5a4b0489..e3cc195aed 100644
> > >> > > --- a/qapi/migration.json
> > >> > > +++ b/qapi/migration.json
> > >> > > @@ -625,11 +625,14 @@
> > >> > >  #
> > >> > >  # @zstd: use zstd compression method.
> > >> > >  #
> > >> > > +# @qatzip: use qatzip compression method.
> > >> > > +#
> > >> > >  # Since: 5.0
> > >> > >  ##
> > >> > >  { 'enum': 'MultiFDCompression',
> > >> > >    'data': [ 'none', 'zlib',
> > >> > > -            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
> > >> > > +            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' },
> > >> > > +            { 'name': 'qatzip', 'if': 'CONFIG_QATZIP'} ] }
> > >> >
> > >> > In another thread adding support to another Intel accelerator (IAA)
> we
> > >> > decided that it was better to select the offloading as an
> accelerator
> > >> > method to multifd zlib rather than as an entirely new compression
> > >> > format. Take a look at that discussion:
> > >> > https://lore.kernel.org/r/ZTFCnqbbqlmsUkRC@redhat.com
> > >>
> > >> We had some early discussion with Intel folks (probably a different
> team
> > >> than the one with the above patchset). The understanding at the time
> is
> > >> that QAT is good at both bulk data compression and decompression. IAA
> is
> > >> good at decompression with smaller data size but compression
> performance
> > >> is not the best. In multifd, we are compressing up to 128 4k pages at
> a
> > >> time and potentially this can increase by configuring the packet
> size, at
> > >> the time we thought QAT could be a better fit in the multifd live
> > >> migration scenario. We would like to hear more from Intel.
> > >>
> > >> From our benchmark testing, with two QAT devices, we can get deflate
> > >> compression throughout to around 7GB/s with ~160% CPU. That's beating
> the
> > >> current software implementation (zlib and zstd) by a lot. We are
> still
> > >> tuning the configuration in QEMU live migration now.
> > >>
> > >> >
> > >> > As I understand it, QAT + QATzip would be compatible with both zlib
> > >> > and IAA + QPL, so we'd add another accelerator method like this:
> > >> >
> > >> > https://lore.kernel.org/r/20240103112851.908082-3-
> yuan1.liu@intel.com
> > >> >
> > >>
> > >> I quickly read over the IAA patchset and I saw this:
> > >>
> > >> "However, due to some reasons, QPL is currently not compatible with
> the
> > >> existing Zlib method that Zlib compressed data can be decompressed by
> QPl
> > >> and vice versa."
> > >>
> > >> The above probably means the current zlib software implementation and
> IAA
> > >> are not compatible.
> > >>
> > >> For QAT, although, both Intel's QATzip and zlib are internally using
> > >> deflate, QATzip only supports deflate with a 4 byte header, deflate
> > >> wrapped by Gzip header and footer, or deflate wrapped by Intel® QAT
> > >> Gzip* extension header and footer. None of the headers can be
> recognized
> > >> by zlib software implementation is my understanding. So if we want to
> make
> > >> them compatible with zlib, the QATzip library needs to support that.
> > >
> > > The QPL library currently cannot support the Z_SYNC_FLULSH operation
> of zlib steaming.
> > > This is the reason why it is not compatible with zlib.
> > >
> 
> When doing the QPL compression, if we set both QPL_FLAG_FIRST and
> QPL_FLAG_LAST flags and run a single job for all pages in a multifd
> packet, would it generate the same format as Z_SYNC_FLULSH in the
> current zlib-based code?

Yes, When we want to use an IAA physics engine for data compression,
We can use a QPL job and set the QPL_FLAG_FIRST and QPL_FLAG_LAST flags, 
which will generate compressed data according to the deflate method.

Z_SYNC_FLUSH not only flushes the buffer, but also maintains data 
alignment and maintains the current compression state. 
QPL is currently unable to do this.

> >
> > I had understood from previous discussion that we'd be able to at least
> > support compression with QPL and decompression with the existing
> > zlib-based code. Is that not correct? I was about to suggest
> > experimenting with the window size in the existing code to hopefully
> > solve the 4kb window size issue. If there are other limitations, then
> > that will not be enough.
> 
> >
> > Also, can you point to the source of that statement about Z_SYNC_FLUSH,
> > I couldn't find more information about it in the documentation.
> >
> > >> > All that, of course, assuming we even want to support both
> > >> > accelerators. They're addressing the same problem after all. I
> wonder
> > >> > how we'd choose a precedence, since both seem to be present in the
> > >> > same processor family.
> > >> >
> > >> >
> > >>
> > >> That's an interesting question :-) I think overall performance
> (throughput
> > >> and CPU overhead) should both be considered. IAA and QAT accelerators
> > >> don't present on all systems. We Bytedance choose to have both on our
> > >> platform when purchasing from Intel.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [External] Re: [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method
  2024-01-11  5:41             ` Hao Xiang
@ 2024-01-13 14:29               ` Liu, Yuan1
  0 siblings, 0 replies; 21+ messages in thread
From: Liu, Yuan1 @ 2024-01-13 14:29 UTC (permalink / raw)
  To: Hao Xiang
  Cc: Fabiano Rosas, Bryan Zhang, qemu-devel@nongnu.org,
	marcandre.lureau@redhat.com, peterx@redhat.com,
	quintela@redhat.com, peter.maydell@linaro.org,
	berrange@redhat.com, Zou, Nanhai



> -----Original Message-----
> From: Hao Xiang <hao.xiang@bytedance.com>
> Sent: Thursday, January 11, 2024 1:42 PM
> To: Liu, Yuan1 <yuan1.liu@intel.com>
> Cc: Fabiano Rosas <farosas@suse.de>; Bryan Zhang
> <bryan.zhang@bytedance.com>; qemu-devel@nongnu.org;
> marcandre.lureau@redhat.com; peterx@redhat.com; quintela@redhat.com;
> peter.maydell@linaro.org; berrange@redhat.com; Zou, Nanhai
> <nanhai.zou@intel.com>
> Subject: Re: [External] Re: [PATCH 3/5] migration: Introduce unimplemented
> 'qatzip' compression method
> 
> On Mon, Jan 8, 2024 at 6:26 PM Liu, Yuan1 <yuan1.liu@intel.com> wrote:
> >
> > > -----Original Message-----
> > > From: Fabiano Rosas <farosas@suse.de>
> > > Sent: Tuesday, January 9, 2024 4:28 AM
> > > To: Liu, Yuan1 <yuan1.liu@intel.com>; Hao Xiang
> > > <hao.xiang@bytedance.com>
> > > Cc: Bryan Zhang <bryan.zhang@bytedance.com>; qemu-devel@nongnu.org;
> > > marcandre.lureau@redhat.com; peterx@redhat.com; quintela@redhat.com;
> > > peter.maydell@linaro.org; berrange@redhat.com
> > > Subject: RE: [External] Re: [PATCH 3/5] migration: Introduce
> > > unimplemented 'qatzip' compression method
> > >
> > > "Liu, Yuan1" <yuan1.liu@intel.com> writes:
> > >
> > > >> -----Original Message-----
> > > >> From: Hao Xiang <hao.xiang@bytedance.com>
> > > >> Sent: Saturday, January 6, 2024 7:53 AM
> > > >> To: Fabiano Rosas <farosas@suse.de>
> > > >> Cc: Bryan Zhang <bryan.zhang@bytedance.com>;
> > > >> qemu-devel@nongnu.org; marcandre.lureau@redhat.com;
> > > >> peterx@redhat.com; quintela@redhat.com; peter.maydell@linaro.org;
> > > >> Liu, Yuan1 <yuan1.liu@intel.com>; berrange@redhat.com
> > > >> Subject: Re: [External] Re: [PATCH 3/5] migration: Introduce
> > > >> unimplemented 'qatzip' compression method
> > > >>
> > > >> On Fri, Jan 5, 2024 at 12:07 PM Fabiano Rosas <farosas@suse.de>
> wrote:
> > > >> >
> > > >> > Bryan Zhang <bryan.zhang@bytedance.com> writes:
> > > >> >
> > > >> > +cc Yuan Liu, Daniel Berrangé
> > > >> >
> > > >> > > Adds support for 'qatzip' as an option for the multifd
> > > >> > > compression method parameter, but copy-pastes the no-op logic
> > > >> > > to leave the actual methods effectively unimplemented. This
> > > >> > > is in preparation of a subsequent commit that will implement
> > > >> > > actually using QAT for compression and decompression.
> > > >> > >
> > > >> > > Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
> > > >> > > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
> > > >> > > ---
> > > >> > >  hw/core/qdev-properties-system.c |  6 ++-
> > > >> > >  migration/meson.build            |  1 +
> > > >> > >  migration/multifd-qatzip.c       | 81
> > > >> ++++++++++++++++++++++++++++++++
> > > >> > >  migration/multifd.h              |  1 +
> > > >> > >  qapi/migration.json              |  5 +-
> > > >> > >  5 files changed, 92 insertions(+), 2 deletions(-)  create
> > > >> > > mode
> > > >> > > 100644 migration/multifd-qatzip.c
> > > >> > >
> > > >> > > diff --git a/hw/core/qdev-properties-system.c
> > > >> > > b/hw/core/qdev-properties-system.c
> > > >> > > index 1a396521d5..d8e48dcb0e 100644
> > > >> > > --- a/hw/core/qdev-properties-system.c
> > > >> > > +++ b/hw/core/qdev-properties-system.c
> > > >> > > @@ -658,7 +658,11 @@ const PropertyInfo
> > > >> > > qdev_prop_fdc_drive_type = { const PropertyInfo
> qdev_prop_multifd_compression = {
> > > >> > >      .name = "MultiFDCompression",
> > > >> > >      .description = "multifd_compression values, "
> > > >> > > -                   "none/zlib/zstd",
> > > >> > > +                   "none/zlib/zstd"
> > > >> > > +#ifdef CONFIG_QATZIP
> > > >> > > +                   "/qatzip"
> > > >> > > +#endif
> > > >> > > +                   ,
> > > >> > >      .enum_table = &MultiFDCompression_lookup,
> > > >> > >      .get = qdev_propinfo_get_enum,
> > > >> > >      .set = qdev_propinfo_set_enum, diff --git
> > > >> > > a/migration/meson.build b/migration/meson.build index
> > > >> > > 92b1cc4297..e20f318379 100644
> > > >> > > --- a/migration/meson.build
> > > >> > > +++ b/migration/meson.build
> > > >> > > @@ -40,6 +40,7 @@ if
> get_option('live_block_migration').allowed()
> > > >> > >    system_ss.add(files('block.c'))  endif
> > > >> > >  system_ss.add(when: zstd, if_true: files('multifd-zstd.c'))
> > > >> > > +system_ss.add(when: qatzip, if_true:
> > > >> > > +files('multifd-qatzip.c'))
> > > >> > >
> > > >> > >  specific_ss.add(when: 'CONFIG_SYSTEM_ONLY',
> > > >> > >                  if_true: files('ram.c', diff --git
> > > >> > > a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c new
> > > >> > > file mode 100644 index 0000000000..1733bbddb7
> > > >> > > --- /dev/null
> > > >> > > +++ b/migration/multifd-qatzip.c
> > > >> > > @@ -0,0 +1,81 @@
> > > >> > > +/*
> > > >> > > + * Multifd QATzip compression implementation
> > > >> > > + *
> > > >> > > + * Copyright (c) Bytedance
> > > >> > > + *
> > > >> > > + * Authors:
> > > >> > > + *  Bryan Zhang <bryan.zhang@bytedance.com>
> > > >> > > + *  Hao Xiang   <hao.xiang@bytedance.com>
> > > >> > > + *
> > > >> > > + * This work is licensed under the terms of the GNU GPL,
> > > >> > > +version 2
> > > or
> > > >> later.
> > > >> > > + * See the COPYING file in the top-level directory.
> > > >> > > + */
> > > >> > > +
> > > >> > > +#include "qemu/osdep.h"
> > > >> > > +#include "exec/ramblock.h"
> > > >> > > +#include "exec/target_page.h"
> > > >> > > +#include "qapi/error.h"
> > > >> > > +#include "migration.h"
> > > >> > > +#include "options.h"
> > > >> > > +#include "multifd.h"
> > > >> > > +
> > > >> > > +static int qatzip_send_setup(MultiFDSendParams *p, Error
> **errp) {
> > > >> > > +    return 0;
> > > >> > > +}
> > > >> > > +
> > > >> > > +static void qatzip_send_cleanup(MultiFDSendParams *p, Error
> > > **errp)
> > > >> > > +{};
> > > >> > > +
> > > >> > > +static int qatzip_send_prepare(MultiFDSendParams *p, Error
> > > >> > > +**errp) {
> > > >> > > +    MultiFDPages_t *pages = p->pages;
> > > >> > > +
> > > >> > > +    for (int i = 0; i < p->normal_num; i++) {
> > > >> > > +        p->iov[p->iovs_num].iov_base = pages->block->host +
> > > >> > > + p-
> > > >> >normal[i];
> > > >> > > +        p->iov[p->iovs_num].iov_len = p->page_size;
> > > >> > > +        p->iovs_num++;
> > > >> > > +    }
> > > >> > > +
> > > >> > > +    p->next_packet_size = p->normal_num * p->page_size;
> > > >> > > +    p->flags |= MULTIFD_FLAG_NOCOMP;
> > > >> > > +    return 0;
> > > >> > > +}
> > > >> > > +
> > > >> > > +static int qatzip_recv_setup(MultiFDRecvParams *p, Error
> **errp) {
> > > >> > > +    return 0;
> > > >> > > +}
> > > >> > > +
> > > >> > > +static void qatzip_recv_cleanup(MultiFDRecvParams *p) {};
> > > >> > > +
> > > >> > > +static int qatzip_recv_pages(MultiFDRecvParams *p, Error
> **errp) {
> > > >> > > +    uint32_t flags = p->flags &
> > > >> > > +MULTIFD_FLAG_COMPRESSION_MASK;
> > > >> > > +
> > > >> > > +    if (flags != MULTIFD_FLAG_NOCOMP) {
> > > >> > > +        error_setg(errp, "multifd %u: flags received %x
> > > >> > > + flags
> > > >> expected %x",
> > > >> > > +                   p->id, flags, MULTIFD_FLAG_NOCOMP);
> > > >> > > +        return -1;
> > > >> > > +    }
> > > >> > > +    for (int i = 0; i < p->normal_num; i++) {
> > > >> > > +        p->iov[i].iov_base = p->host + p->normal[i];
> > > >> > > +        p->iov[i].iov_len = p->page_size;
> > > >> > > +    }
> > > >> > > +    return qio_channel_readv_all(p->c, p->iov,
> > > >> > > +p->normal_num, errp); }
> > > >> > > +
> > > >> > > +static MultiFDMethods multifd_qatzip_ops = {
> > > >> > > +    .send_setup = qatzip_send_setup,
> > > >> > > +    .send_cleanup = qatzip_send_cleanup,
> > > >> > > +    .send_prepare = qatzip_send_prepare,
> > > >> > > +    .recv_setup = qatzip_recv_setup,
> > > >> > > +    .recv_cleanup = qatzip_recv_cleanup,
> > > >> > > +    .recv_pages = qatzip_recv_pages };
> > > >> > > +
> > > >> > > +static void multifd_qatzip_register(void) {
> > > >> > > +    multifd_register_ops(MULTIFD_COMPRESSION_QATZIP,
> > > >> > > +&multifd_qatzip_ops); }
> > > >> > > +
> > > >> > > +migration_init(multifd_qatzip_register);
> > > >> > > diff --git a/migration/multifd.h b/migration/multifd.h index
> > > >> > > a835643b48..5600f7fc82 100644
> > > >> > > --- a/migration/multifd.h
> > > >> > > +++ b/migration/multifd.h
> > > >> > > @@ -33,6 +33,7 @@ int multifd_queue_page(QEMUFile *f,
> > > >> > > RAMBlock *block, ram_addr_t offset);  #define
> > > >> > > MULTIFD_FLAG_NOCOMP (0 << 1) #define MULTIFD_FLAG_ZLIB (1 <<
> > > >> > > 1)  #define MULTIFD_FLAG_ZSTD (2 <<
> > > >> > > 1)
> > > >> > > +#define MULTIFD_FLAG_QATZIP (3 << 1)
> > > >> > >
> > > >> > >  /* This value needs to be a multiple of
> > > >> > > qemu_target_page_size() */ #define MULTIFD_PACKET_SIZE (512 *
> > > >> > > 1024) diff --git a/qapi/migration.json b/qapi/migration.json
> > > >> > > index 6d5a4b0489..e3cc195aed 100644
> > > >> > > --- a/qapi/migration.json
> > > >> > > +++ b/qapi/migration.json
> > > >> > > @@ -625,11 +625,14 @@
> > > >> > >  #
> > > >> > >  # @zstd: use zstd compression method.
> > > >> > >  #
> > > >> > > +# @qatzip: use qatzip compression method.
> > > >> > > +#
> > > >> > >  # Since: 5.0
> > > >> > >  ##
> > > >> > >  { 'enum': 'MultiFDCompression',
> > > >> > >    'data': [ 'none', 'zlib',
> > > >> > > -            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
> > > >> > > +            { 'name': 'zstd', 'if': 'CONFIG_ZSTD' },
> > > >> > > +            { 'name': 'qatzip', 'if': 'CONFIG_QATZIP'} ] }
> > > >> >
> > > >> > In another thread adding support to another Intel accelerator
> > > >> > (IAA)
> > > we
> > > >> > decided that it was better to select the offloading as an
> > > >> > accelerator method to multifd zlib rather than as an entirely
> > > >> > new compression format. Take a look at that discussion:
> > > >> > https://lore.kernel.org/r/ZTFCnqbbqlmsUkRC@redhat.com
> > > >>
> > > >> We had some early discussion with Intel folks (probably a
> > > >> different
> > > team
> > > >> than the one with the above patchset). The understanding at the
> > > >> time is that QAT is good at both bulk data compression and
> > > >> decompression. IAA
> > > is
> > > >> good at decompression with smaller data size but compression
> > > performance
> > > >> is not the best. In multifd, we are compressing up to 128 4k
> > > >> pages at a time and potentially this can increase by configuring
> > > >> the packet size,
> > > at
> > > >> the time we thought QAT could be a better fit in the multifd live
> > > >> migration scenario. We would like to hear more from Intel.
> > > >>
> > > >> From our benchmark testing, with two QAT devices, we can get
> > > >> deflate compression throughout to around 7GB/s with ~160% CPU.
> > > >> That's beating
> > > the
> > > >> current software implementation (zlib and zstd) by a lot. We are
> > > >> still tuning the configuration in QEMU live migration now.
> > > >>
> > > >> >
> > > >> > As I understand it, QAT + QATzip would be compatible with both
> > > >> > zlib and IAA + QPL, so we'd add another accelerator method like
> this:
> > > >> >
> > > >> > https://lore.kernel.org/r/20240103112851.908082-3-yuan1.liu@int
> > > >> > el.com
> > > >> >
> > > >>
> > > >> I quickly read over the IAA patchset and I saw this:
> > > >>
> > > >> "However, due to some reasons, QPL is currently not compatible
> > > >> with the existing Zlib method that Zlib compressed data can be
> > > >> decompressed by
> > > QPl
> > > >> and vice versa."
> > > >>
> > > >> The above probably means the current zlib software implementation
> > > >> and
> > > IAA
> > > >> are not compatible.
> > > >>
> > > >> For QAT, although, both Intel's QATzip and zlib are internally
> > > >> using deflate, QATzip only supports deflate with a 4 byte header,
> > > >> deflate wrapped by Gzip header and footer, or deflate wrapped by
> > > >> Intel® QAT
> > > >> Gzip* extension header and footer. None of the headers can be
> > > recognized
> > > >> by zlib software implementation is my understanding. So if we
> > > >> want to
> > > make
> > > >> them compatible with zlib, the QATzip library needs to support
> that.
> > > >
> > > > The QPL library currently cannot support the Z_SYNC_FLULSH
> > > > operation of
> > > zlib steaming.
> > > > This is the reason why it is not compatible with zlib.
> > > >
> > >
> > > I had understood from previous discussion that we'd be able to at
> > > least support compression with QPL and decompression with the
> > > existing zlib-based code. Is that not correct? I was about to
> > > suggest experimenting with the window size in the existing code to
> > > hopefully solve the 4kb window size issue. If there are other
> > > limitations, then that will not be enough.
> > >
> > > Also, can you point to the source of that statement about
> > > Z_SYNC_FLUSH, I couldn't find more information about it in the
> documentation.
> >
> > static int zlib_send_prepare(MultiFDSendParams *p, Error **errp) {
> >     struct zlib_data *z = p->data;
> >     z_stream *zs = &z->zs;
> >     …
> >     for (i = 0; i < p->normal_num; i++) {
> >         uint32_t available = z->zbuff_len - out_size;
> >         int flush = Z_NO_FLUSH;
> >
> >         if (i == p->normal_num - 1) {
> >             flush = Z_SYNC_FLUSH;
> >         }
> > If I understand correctly, the current implementation of multifd zlib
> treats each multifd thread as a separate stream. This implementation adds
> zlib headers and footers only to the beginning and end of the data, and
> the data transmitted in between does not require headers and footers.
> 
> zlib_send_prepare() is just calling the deflate() repeatedly in a loop.
> While compressing the last page, it sets the flag Z_SYNC_FLUSH, which
> should dump all pending output to the buffer.
> 
> "This implementation adds zlib headers and footers only to the beginning
> and end of the data,"
> ^ You mean putting a header/footer per multifd packet, correct?

Yes, the header information is constructed by DeflateInit, which is 
only done once for each multifd channel. If there is footer information, 
the footer information can be added with Z_FINISH of deflate

> >
> > The current implementation for IAA supports compressing data indicated
> by p->normal_num as a single stream. Each compressed data segment has zlib
> headers and footers. Since Z_SYNC_FLUSH is not supported, this means IAA
> has to complete the compression for a stream at once and cannot output
> parts of a stream in advance. Therefore, the current IAA is not compatible
> with existing zlib. Currently, it seems that the QAT implementation
> follows a similar approach.
> 
> "Since Z_SYNC_FLUSH is not supported, this means IAA has to complete the
> compression for a stream at once and cannot output parts of a stream in
> advance."
Yes, this is a limitation for compatibility with existing zlib implementations

> Does IAA's deflate compression put a header/footer per page? Or per
> multifd packet?
Yes, both of them can be done. QPL library supports ZLIB header/footer.

> >
> > Regarding the reference to Z_SYNC_FLUSH, you can find it at
> https://www.zlib.net/manual.html:
> > “If the parameter flush is set to Z_SYNC_FLUSH, all pending output is
> flushed to the output buffer and the output is aligned on a byte boundary,
> so that the decompressor can get all input data available so far. (In
> particular avail_in is zero after the call if enough output space has been
> provided before the call.) Flushing may degrade compression for some
> compression algorithms and so it should be used only when necessary. This
> completes the current deflate block and follows it with an empty stored
> block that is three bits plus filler bits to the next byte, followed by
> four bytes (00 00 ff ff).”
> >
> > Based on the information above, I suggest the following options:
> >
> > 1. Modify multifd zlib to perform stream compression each time with p-
> >normal_num pages. If this modification makes IAA compatible with zlib, I
> will implement it in the next version as per
> https://lore.kernel.org/all/20240103112851.908082-4-yuan1.liu@intel.com/T/
> and provide performance data. We will also verify the feasibility with
> QAT.
> >
> > 2. Use zlib compression without stream compression, meaning each page is
> independently compressed. The advantage is that accelerators can
> concurrently process more pages, and the current IAA and QAT can both be
> compatible. The downside is a loss in compression ratio, and the length of
> the data after compressing each page needs to be added to MultiFDPacket_t.
> If future compression functionality considers support only on
> accelerators, the compression ratio can be improved through compression
> levels or other features without additional host CPU overhead.
> >
> 
> We noticed a pretty significant performance difference in QAT deflate
> while using the streaming version VS the non-streaming version. The non-
> streaming version has better performance.
> Sounds like the easiest way to have zlib/IAA/QAT to be capable of
> compressing/decompressing interchangeably is to set the Z_SYNC_FLUSH flag
> on all deflate() calls in zlib_send_prepare(). And in my opinion, it is
> worth the trade-off. If my hardware doesn't have accelerators and I want
> pure software based compression, I would choose zstd over zlib.

like you said before, we can use the deflate function to compress 
individual pages or multifd packets. I am working on solving this 
issue and will provide compatibility results and early performance 
data in the next version

> > Additionally, the default window size is set to 4K, which should
> effectively support IAA hardware.
> >
> > > >> > All that, of course, assuming we even want to support both
> > > >> > accelerators. They're addressing the same problem after all. I
> > > >> > wonder how we'd choose a precedence, since both seem to be
> > > >> > present in the same processor family.
> > > >> >
> > > >> >
> > > >>
> > > >> That's an interesting question :-) I think overall performance
> > > (throughput
> > > >> and CPU overhead) should both be considered. IAA and QAT
> > > >> accelerators don't present on all systems. We Bytedance choose to
> > > >> have both on our platform when purchasing from Intel.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 5/5] migration: Add integration test for 'qatzip' compression method
  2023-12-31 20:58 ` [PATCH 5/5] migration: Add integration test for 'qatzip' compression method Bryan Zhang
@ 2024-01-29  8:47   ` Peter Xu
  2024-02-29  2:55     ` [External] " Bryan Zhang .
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Xu @ 2024-01-29  8:47 UTC (permalink / raw)
  To: Bryan Zhang
  Cc: qemu-devel, farosas, marcandre.lureau, quintela, peter.maydell,
	hao.xiang

On Sun, Dec 31, 2023 at 08:58:04PM +0000, Bryan Zhang wrote:
> Adds an integration test for 'qatzip'.

Please use "tests" as prefix of this patch.  It can be "tests/migration:",
"tests/migration-test:", etc.

> 
> Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
> Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>

[...]

>  test_migrate_multifd_tcp_tls_psk_start_match(QTestState *from,
> @@ -3480,6 +3504,19 @@ int main(int argc, char **argv)
>      qtest_add_func("/migration/multifd/tcp/plain/zstd",
>                     test_multifd_tcp_zstd);
>  #endif
> +#ifdef CONFIG_QATZIP
> +    /*
> +     * Use QATzip's qzInit() function as a runtime hardware check.
> +     * Ideally there might be a cleaner way to probe for the presence of QAT.
> +     */
> +    QzSession_T sess;
> +    memset(&sess, 0, sizeof(QzSession_T));
> +    if (qzInit(&sess, 0) == QZ_OK) {

Does "0" means it'll fail if no hardware is available, no matter whether
due to processor too old, or limited resources?

Would it make sense to test it even if only software fallbacks are
available?  IIUC the migration path will still be covered in software
fallbacks, so it may still makes sense to me.  It can be very likely that
most CIs will not cover the qatzip paths otherwise, because of either no
control over hardwares, or limited privileges of the CI user (does qat HWs
need special privilege normally?  I have no idea how QAT resource
management is done if there're only limited HW resources).

Besides, I believe all the data points you provided in the cover letter are
collected with 2 QAT HWs enabled?  I'm curious how's the performance of the
software fallback of qatzip comparing to existing algorithm: iiuc zstd is
mostly always preferred if sololy comparing to zlib?  where does qatzip
soft-mode stand in the picture?

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [External] Re: [PATCH 5/5] migration: Add integration test for 'qatzip' compression method
  2024-01-29  8:47   ` Peter Xu
@ 2024-02-29  2:55     ` Bryan Zhang .
  2024-02-29  3:27       ` Peter Xu
  0 siblings, 1 reply; 21+ messages in thread
From: Bryan Zhang . @ 2024-02-29  2:55 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, farosas, marcandre.lureau, quintela, peter.maydell,
	hao.xiang

[-- Attachment #1: Type: text/plain, Size: 3109 bytes --]

On Mon, Jan 29, 2024 at 12:53 AM Peter Xu <peterx@redhat.com> wrote:

> On Sun, Dec 31, 2023 at 08:58:04PM +0000, Bryan Zhang wrote:
> > Adds an integration test for 'qatzip'.
>
> Please use "tests" as prefix of this patch.  It can be "tests/migration:",
> "tests/migration-test:", etc.
>
> Will do.

> >
> > Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
> > Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
>
> [...]
>
> >  test_migrate_multifd_tcp_tls_psk_start_match(QTestState *from,
> > @@ -3480,6 +3504,19 @@ int main(int argc, char **argv)
> >      qtest_add_func("/migration/multifd/tcp/plain/zstd",
> >                     test_multifd_tcp_zstd);
> >  #endif
> > +#ifdef CONFIG_QATZIP
> > +    /*
> > +     * Use QATzip's qzInit() function as a runtime hardware check.
> > +     * Ideally there might be a cleaner way to probe for the presence
> of QAT.
> > +     */
> > +    QzSession_T sess;
> > +    memset(&sess, 0, sizeof(QzSession_T));
> > +    if (qzInit(&sess, 0) == QZ_OK) {
>
> Does "0" means it'll fail if no hardware is available, no matter whether
> due to processor too old, or limited resources?
>
> Would it make sense to test it even if only software fallbacks are
> available?  IIUC the migration path will still be covered in software
> fallbacks, so it may still makes sense to me.  It can be very likely that
> most CIs will not cover the qatzip paths otherwise, because of either no
> control over hardwares, or limited privileges of the CI user (does qat HWs
> need special privilege normally?  I have no idea how QAT resource
> management is done if there're only limited HW resources).
>
> Besides, I believe all the data points you provided in the cover letter are
> collected with 2 QAT HWs enabled?  I'm curious how's the performance of the
> software fallback of qatzip comparing to existing algorithm: iiuc zstd is
> mostly always preferred if sololy comparing to zlib?  where does qatzip
> soft-mode stand in the picture?


> Yes, as I understand it, 0 means the code will error if the hardware is
unavailable for whatever reason.

We can enable software fallback in the live migration path, which will also
enable using software to run the QATzip tests (and will conveniently allow
us to remove the awkward `qzInit` check in the test code). It also might be
a good idea since it would also avoid failure in case of, e.g., a transient
hardware problem?

Performance: The software fallback seems prohibitively slow. QATzip
fallback uses the built-in zlib implementation, but I ran a migration test
that normally takes zlib about 150 seconds and the QATzip fallback took
about 30 minutes before my SSH session disconnected :P

Also, a note: When enabling software fallback, QATzip and the QAT driver
will both repeatedly print a complaint to the QEMU monitor when trying to
compress without hardware. Is it bad form to introduce
seemingly-unsuppressable prints from dependencies, or is this acceptable?

>
> Thanks,
>
> --
> Peter Xu
>
> Thanks for your feedback.

--
Bryan Zhang

[-- Attachment #2: Type: text/html, Size: 4283 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [External] Re: [PATCH 5/5] migration: Add integration test for 'qatzip' compression method
  2024-02-29  2:55     ` [External] " Bryan Zhang .
@ 2024-02-29  3:27       ` Peter Xu
  0 siblings, 0 replies; 21+ messages in thread
From: Peter Xu @ 2024-02-29  3:27 UTC (permalink / raw)
  To: Bryan Zhang .
  Cc: qemu-devel, farosas, marcandre.lureau, quintela, peter.maydell,
	hao.xiang

On Wed, Feb 28, 2024 at 06:55:37PM -0800, Bryan Zhang . wrote:
> We can enable software fallback in the live migration path, which will also
> enable using software to run the QATzip tests (and will conveniently allow
> us to remove the awkward `qzInit` check in the test code). It also might be
> a good idea since it would also avoid failure in case of, e.g., a transient
> hardware problem?
> 
> Performance: The software fallback seems prohibitively slow. QATzip
> fallback uses the built-in zlib implementation, but I ran a migration test
> that normally takes zlib about 150 seconds and the QATzip fallback took
> about 30 minutes before my SSH session disconnected :P

When preparing the qatzip documentataion file, please mention this.  This
is definitely not expected by myself, we should suggest mostly never use
qatzip compatible (software) mode unless the admin is sure to have the
hardware support.

It should then only be used in qtest to cover the qatzip paths only.

> 
> Also, a note: When enabling software fallback, QATzip and the QAT driver
> will both repeatedly print a complaint to the QEMU monitor when trying to
> compress without hardware. Is it bad form to introduce
> seemingly-unsuppressable prints from dependencies, or is this acceptable?

It is definitely bad to keep printing the same message, no matter from
where.

Why it repeatedly do so?  Do you know why the library cannot make it
print_once()?

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2024-02-29  3:27 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-12-31 20:57 [PATCH 0/5] *** Implement using Intel QAT to offload ZLIB Bryan Zhang
2023-12-31 20:58 ` [PATCH 1/5] meson: Introduce 'qatzip' feature to the build system Bryan Zhang
2023-12-31 20:58 ` [PATCH 2/5] migration: Add compression level parameter for QATzip Bryan Zhang
2023-12-31 20:58 ` [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method Bryan Zhang
2024-01-05 20:07   ` Fabiano Rosas
2024-01-05 23:52     ` [External] " Hao Xiang
2024-01-06  6:31       ` Hao Xiang
2024-01-08 20:24         ` Fabiano Rosas
2024-01-08  3:25       ` Liu, Yuan1
2024-01-08 20:27         ` Fabiano Rosas
2024-01-09  2:26           ` Liu, Yuan1
2024-01-11  5:41             ` Hao Xiang
2024-01-13 14:29               ` Liu, Yuan1
2024-01-11  6:39           ` Hao Xiang
2024-01-13 14:10             ` Liu, Yuan1
2024-01-08  3:17     ` Liu, Yuan1
2023-12-31 20:58 ` [PATCH 4/5] migration: Implement 'qatzip' methods using QAT Bryan Zhang
2023-12-31 20:58 ` [PATCH 5/5] migration: Add integration test for 'qatzip' compression method Bryan Zhang
2024-01-29  8:47   ` Peter Xu
2024-02-29  2:55     ` [External] " Bryan Zhang .
2024-02-29  3:27       ` Peter Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).