From: Fabiano Rosas <farosas@suse.de>
To: "Yichen Wang" <yichen.wang@bytedance.com>,
"Dr. David Alan Gilbert" <dave@treblig.org>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Marc-André Lureau" <marcandre.lureau@redhat.com>,
"Daniel P. Berrangé" <berrange@redhat.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Peter Xu" <peterx@redhat.com>, "Eric Blake" <eblake@redhat.com>,
"Markus Armbruster" <armbru@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Cornelia Huck" <cohuck@redhat.com>,
qemu-devel@nongnu.org
Cc: Hao Xiang <hao.xiang@linux.dev>,
"Liu, Yuan1" <yuan1.liu@intel.com>,
Shivam Kumar <shivam.kumar1@nutanix.com>,
"Ho-Ren (Jack) Chuang" <horenchuang@bytedance.com>,
Yichen Wang <yichen.wang@bytedance.com>,
Bryan Zhang <bryan.zhang@bytedance.com>
Subject: Re: [PATCH v6 03/12] util/dsa: Implement DSA device start and stop logic.
Date: Wed, 16 Oct 2024 18:00:34 -0300
Message-ID: <87ttdb3gr1.fsf@suse.de>
In-Reply-To: <20241009234610.27039-4-yichen.wang@bytedance.com>

Yichen Wang <yichen.wang@bytedance.com> writes:
> From: Hao Xiang <hao.xiang@linux.dev>
>
> * DSA device open and close.
> * DSA group contains multiple DSA devices.
> * DSA group configure/start/stop/clean.
>
> Signed-off-by: Hao Xiang <hao.xiang@linux.dev>
> Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
> Signed-off-by: Yichen Wang <yichen.wang@bytedance.com>
> ---
> include/qemu/dsa.h | 103 +++++++++++++++++
> util/dsa.c | 282 +++++++++++++++++++++++++++++++++++++++++++++
> util/meson.build | 3 +
> 3 files changed, 388 insertions(+)
> create mode 100644 include/qemu/dsa.h
> create mode 100644 util/dsa.c
>
> diff --git a/include/qemu/dsa.h b/include/qemu/dsa.h
> new file mode 100644
> index 0000000000..501bb8c70d
> --- /dev/null
> +++ b/include/qemu/dsa.h
> @@ -0,0 +1,103 @@
> +/*
> + * Interface for using Intel Data Streaming Accelerator to offload certain
> + * background operations.
> + *
> + * Copyright (C) Bytedance Ltd.
> + *
> + * Authors:
> + * Hao Xiang <hao.xiang@bytedance.com>
> + * Yichen Wang <yichen.wang@bytedance.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef QEMU_DSA_H
> +#define QEMU_DSA_H
> +
> +#include "qemu/error-report.h"
> +#include "qemu/thread.h"
> +#include "qemu/queue.h"
> +
> +#ifdef CONFIG_DSA_OPT
> +
> +#pragma GCC push_options
> +#pragma GCC target("enqcmd")
> +
> +#include <linux/idxd.h>
> +#include "x86intrin.h"
> +
> +typedef struct {
> + void *work_queue;
> +} QemuDsaDevice;
> +
> +typedef QSIMPLEQ_HEAD(QemuDsaTaskQueue, QemuDsaBatchTask) QemuDsaTaskQueue;
> +
> +typedef struct {
> + QemuDsaDevice *dsa_devices;
> + int num_dsa_devices;
> + /* The index of the next DSA device to be used. */
> + uint32_t device_allocator_index;
> + bool running;
> + QemuMutex task_queue_lock;
> + QemuCond task_queue_cond;
> + QemuDsaTaskQueue task_queue;
> +} QemuDsaDeviceGroup;
> +
> +/**
> + * @brief Initializes DSA devices.
> + *
> + * @param dsa_parameter A list of DSA device path from migration parameter.
> + *
> + * @return int Zero if successful, otherwise non zero.
> + */
> +int qemu_dsa_init(const strList *dsa_parameter, Error **errp);
> +
> +/**
> + * @brief Start logic to enable using DSA.
> + */
> +void qemu_dsa_start(void);
> +
> +/**
> + * @brief Stop the device group and the completion thread.
> + */
> +void qemu_dsa_stop(void);
> +
> +/**
> + * @brief Clean up system resources created for DSA offloading.
> + */
> +void qemu_dsa_cleanup(void);
> +
> +/**
> + * @brief Check if DSA is running.
> + *
> + * @return True if DSA is running, otherwise false.
> + */
> +bool qemu_dsa_is_running(void);
> +
> +#else
> +
> +static inline bool qemu_dsa_is_running(void)
> +{
> + return false;
> +}
> +
> +static inline int qemu_dsa_init(const strList *dsa_parameter, Error **errp)
> +{
> + if (dsa_parameter != NULL && strlen(dsa_parameter) != 0) {
> + error_setg(errp, "DSA is not supported.");
> + return -1;
> + }
> +
> + return 0;
> +}
> +
> +static inline void qemu_dsa_start(void) {}
> +
> +static inline void qemu_dsa_stop(void) {}
> +
> +static inline void qemu_dsa_cleanup(void) {}
> +
> +#endif
> +
> +#endif
> diff --git a/util/dsa.c b/util/dsa.c
> new file mode 100644
> index 0000000000..54d0e20c29
> --- /dev/null
> +++ b/util/dsa.c
> @@ -0,0 +1,282 @@
> +/*
> + * Use Intel Data Streaming Accelerator to offload certain background
> + * operations.
> + *
> + * Copyright (C) Bytedance Ltd.
> + *
> + * Authors:
> + * Hao Xiang <hao.xiang@bytedance.com>
> + * Bryan Zhang <bryan.zhang@bytedance.com>
> + * Yichen Wang <yichen.wang@bytedance.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qemu/queue.h"
> +#include "qemu/memalign.h"
> +#include "qemu/lockable.h"
> +#include "qemu/cutils.h"
> +#include "qemu/dsa.h"
> +#include "qemu/bswap.h"
> +#include "qemu/error-report.h"
> +#include "qemu/rcu.h"
> +
> +#pragma GCC push_options
> +#pragma GCC target("enqcmd")
> +
> +#include <linux/idxd.h>
> +#include "x86intrin.h"
> +
> +#define DSA_WQ_PORTAL_SIZE 4096
> +#define MAX_DSA_DEVICES 16
> +
> +uint32_t max_retry_count;
> +static QemuDsaDeviceGroup dsa_group;
> +
> +
> +/**
> + * @brief This function opens a DSA device's work queue and
> + * maps the DSA device memory into the current process.
> + *
> + * @param dsa_wq_path A pointer to the DSA device work queue's file path.
> + * @return A pointer to the mapped memory, or MAP_FAILED on failure.
> + */
> +static void *
> +map_dsa_device(const char *dsa_wq_path)
> +{
> + void *dsa_device;
> + int fd;
> +
> + fd = open(dsa_wq_path, O_RDWR);
> + if (fd < 0) {
> + error_report("Open %s failed with errno = %d.",
> + dsa_wq_path, errno);
> + return MAP_FAILED;
> + }
> + dsa_device = mmap(NULL, DSA_WQ_PORTAL_SIZE, PROT_WRITE,
> + MAP_SHARED | MAP_POPULATE, fd, 0);
> + close(fd);
> + if (dsa_device == MAP_FAILED) {
> + error_report("mmap failed with errno = %d.", errno);
> + return MAP_FAILED;
> + }
> + return dsa_device;
> +}
> +
> +/**
> + * @brief Initializes a DSA device structure.
> + *
> + * @param instance A pointer to the DSA device.
> + * @param work_queue A pointer to the DSA work queue.
> + */
> +static void
> +dsa_device_init(QemuDsaDevice *instance,
> + void *dsa_work_queue)
> +{
> + instance->work_queue = dsa_work_queue;
> +}
> +
> +/**
> + * @brief Cleans up a DSA device structure.
> + *
> + * @param instance A pointer to the DSA device to cleanup.
> + */
> +static void
> +dsa_device_cleanup(QemuDsaDevice *instance)
> +{
> + if (instance->work_queue != MAP_FAILED) {
> + munmap(instance->work_queue, DSA_WQ_PORTAL_SIZE);
> + }
> +}
> +
> +/**
> + * @brief Initializes a DSA device group.
> + *
> + * @param group A pointer to the DSA device group.
> + * @param dsa_parameter A list of DSA device path from are separated by space
> + * character migration parameter. Multiple DSA device path.
> + *
> + * @return Zero if successful, non-zero otherwise.
> + */
> +static int
> +dsa_device_group_init(QemuDsaDeviceGroup *group,
> + const strList *dsa_parameter,
> + Error **errp)
> +{
> + if (dsa_parameter == NULL) {
> + error_setg(errp, "dsa device path is not supplied.");
> + return -1;
> + }
> +
> + int ret = 0;
> + const char *dsa_path[MAX_DSA_DEVICES];
> + int num_dsa_devices = 0;
> +
> + while (dsa_parameter) {
> + dsa_path[num_dsa_devices++] = dsa_parameter->value;
> + if (num_dsa_devices == MAX_DSA_DEVICES) {
> + break;
> + }
> + dsa_parameter = dsa_parameter->next;
> + }
> +
> + group->dsa_devices =
> + g_new0(QemuDsaDevice, num_dsa_devices);
> + group->num_dsa_devices = num_dsa_devices;
> + group->device_allocator_index = 0;
> +
> + group->running = false;
> + qemu_mutex_init(&group->task_queue_lock);
> + qemu_cond_init(&group->task_queue_cond);
> + QSIMPLEQ_INIT(&group->task_queue);
> +
> + void *dsa_wq = MAP_FAILED;
> + for (int i = 0; i < num_dsa_devices; i++) {
> + dsa_wq = map_dsa_device(dsa_path[i]);
> + if (dsa_wq == MAP_FAILED) {
> + error_setg(errp, "map_dsa_device failed MAP_FAILED.");
> + ret = -1;
> + goto exit;
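
If map_dsa_device() fails for the i-th device, the entries from i onward
never go through dsa_device_init(), so their work_queue stays NULL from
g_new0(). The loop in dsa_device_group_cleanup() still passes every entry
to dsa_device_cleanup(), and the != MAP_FAILED check there is true for
NULL, so munmap() ends up being called on entries that were never mapped.
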
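
One possible way to handle the failure path described above, sketched as a
standalone program rather than as a patch against util/dsa.c: mark every
slot as MAP_FAILED before mapping anything, so the existing != MAP_FAILED
check in the cleanup path skips slots that were never mapped. The names
Dev, map_dev(), group_init() and group_cleanup() are hypothetical
stand-ins, and an anonymous mmap() replaces the real work queue portal.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

#define PORTAL_SIZE 4096

/* Hypothetical stand-in for QemuDsaDevice. */
typedef struct {
    void *work_queue;
} Dev;

/* Hypothetical stand-in for map_dsa_device(): fails when ok is false. */
static void *map_dev(bool ok)
{
    if (!ok) {
        return MAP_FAILED;
    }
    return mmap(NULL, PORTAL_SIZE, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* Returns 0 on success, -1 as soon as one mapping fails. */
static int group_init(Dev *devs, int n, const bool *ok)
{
    assert(devs != NULL && n >= 0);

    /* Mark every slot as "not mapped" before attempting any mapping. */
    for (int i = 0; i < n; i++) {
        devs[i].work_queue = MAP_FAILED;
    }
    for (int i = 0; i < n; i++) {
        void *wq = map_dev(ok[i]);
        if (wq == MAP_FAILED) {
            return -1; /* slots i..n-1 remain MAP_FAILED */
        }
        devs[i].work_queue = wq;
    }
    return 0;
}

/* Unmaps only the slots that were mapped; returns how many that was. */
static int group_cleanup(Dev *devs, int n)
{
    int unmapped = 0;

    for (int i = 0; i < n; i++) {
        if (devs[i].work_queue != MAP_FAILED) {
            munmap(devs[i].work_queue, PORTAL_SIZE);
            devs[i].work_queue = MAP_FAILED;
            unmapped++;
        }
    }
    return unmapped;
}
```

With this ordering, a failure on the second of three devices leaves only
the first slot to unmap. An alternative with the same effect would be to
shrink num_dsa_devices to i on failure.
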
> + }
> + dsa_device_init(&dsa_group.dsa_devices[i], dsa_wq);
I think you mean &group->dsa_devices[i] here.
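
In this patch the only caller passes &dsa_group, so both spellings touch
the same array. The difference shows up as soon as the function is given
any other group. A minimal standalone sketch of that, where Group,
global_group, init_buggy() and init_fixed() are hypothetical names used
only for illustration:

```c
#include <stddef.h>

typedef struct {
    void *work_queue;
} Dev;

typedef struct {
    Dev *devs;
    int n;
} Group;

/* Plays the role of the file-scope dsa_group. */
static Group global_group;

/* Mirrors the patch: ignores the parameter and writes to the global. */
static void init_buggy(Group *group, int i, void *wq)
{
    (void)group;
    global_group.devs[i].work_queue = wq;
}

/* Suggested form: writes to the group that was passed in. */
static void init_fixed(Group *group, int i, void *wq)
{
    group->devs[i].work_queue = wq;
}
```

Calling init_buggy() with a local group leaves that group untouched and
modifies global_group instead, while init_fixed() updates only the group
it was given.
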
> + }
> +
> +exit:
> + return ret;
> +}
> +
> +/**
> + * @brief Starts a DSA device group.
> + *
> + * @param group A pointer to the DSA device group.
> + */
> +static void
> +dsa_device_group_start(QemuDsaDeviceGroup *group)
> +{
> + group->running = true;
> +}
> +
> +/**
> + * @brief Stops a DSA device group.
> + *
> + * @param group A pointer to the DSA device group.
> + */
> +__attribute__((unused))
> +static void
> +dsa_device_group_stop(QemuDsaDeviceGroup *group)
> +{
> + group->running = false;
> +}
> +
> +/**
> + * @brief Cleans up a DSA device group.
> + *
> + * @param group A pointer to the DSA device group.
> + */
> +static void
> +dsa_device_group_cleanup(QemuDsaDeviceGroup *group)
> +{
> + if (!group->dsa_devices) {
> + return;
> + }
> + for (int i = 0; i < group->num_dsa_devices; i++) {
> + dsa_device_cleanup(&group->dsa_devices[i]);
> + }
> + g_free(group->dsa_devices);
> + group->dsa_devices = NULL;
> +
> + qemu_mutex_destroy(&group->task_queue_lock);
> + qemu_cond_destroy(&group->task_queue_cond);
> +}
> +
> +/**
> + * @brief Returns the next available DSA device in the group.
> + *
> + * @param group A pointer to the DSA device group.
> + *
> + * @return struct QemuDsaDevice* A pointer to the next available DSA device
> + * in the group.
> + */
> +__attribute__((unused))
> +static QemuDsaDevice *
> +dsa_device_group_get_next_device(QemuDsaDeviceGroup *group)
> +{
> + if (group->num_dsa_devices == 0) {
> + return NULL;
> + }
> + uint32_t current = qatomic_fetch_inc(&group->device_allocator_index);
> + current %= group->num_dsa_devices;
> + return &group->dsa_devices[current];
> +}
> +
> +/**
> + * @brief Check if DSA is running.
> + *
> + * @return True if DSA is running, otherwise false.
> + */
> +bool qemu_dsa_is_running(void)
> +{
> + return false;
> +}
> +
> +static void
> +dsa_globals_init(void)
> +{
> + max_retry_count = UINT32_MAX;
> +}
> +
> +/**
> + * @brief Initializes DSA devices.
> + *
> + * @param dsa_parameter A list of DSA device path from migration parameter.
> + *
> + * @return int Zero if successful, otherwise non zero.
> + */
> +int qemu_dsa_init(const strList *dsa_parameter, Error **errp)
> +{
> + dsa_globals_init();
> +
> + return dsa_device_group_init(&dsa_group, dsa_parameter, errp);
> +}
> +
> +/**
> + * @brief Start logic to enable using DSA.
> + *
> + */
> +void qemu_dsa_start(void)
> +{
> + if (dsa_group.num_dsa_devices == 0) {
> + return;
> + }
> + if (dsa_group.running) {
> + return;
> + }
> + dsa_device_group_start(&dsa_group);
> +}
> +
> +/**
> + * @brief Stop the device group and the completion thread.
> + *
> + */
> +void qemu_dsa_stop(void)
> +{
> + QemuDsaDeviceGroup *group = &dsa_group;
> +
> + if (!group->running) {
> + return;
> + }
> +}
> +
> +/**
> + * @brief Clean up system resources created for DSA offloading.
> + *
> + */
> +void qemu_dsa_cleanup(void)
> +{
> + qemu_dsa_stop();
> + dsa_device_group_cleanup(&dsa_group);
> +}
> +
> diff --git a/util/meson.build b/util/meson.build
> index 5d8bef9891..3360f62923 100644
> --- a/util/meson.build
> +++ b/util/meson.build
> @@ -88,6 +88,9 @@ if have_block or have_ga
> endif
> if have_block
> util_ss.add(files('aio-wait.c'))
> + if config_host_data.get('CONFIG_DSA_OPT')
> + util_ss.add(files('dsa.c'))
> + endif
> util_ss.add(files('buffer.c'))
> util_ss.add(files('bufferiszero.c'))
> util_ss.add(files('hbitmap.c'))