From: Fabiano Rosas <farosas@suse.de>
To: "Yichen Wang" <yichen.wang@bytedance.com>,
"Peter Xu" <peterx@redhat.com>,
"Dr. David Alan Gilbert" <dave@treblig.org>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Marc-André Lureau" <marcandre.lureau@redhat.com>,
"Daniel P. Berrangé" <berrange@redhat.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Eric Blake" <eblake@redhat.com>,
"Markus Armbruster" <armbru@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Cornelia Huck" <cohuck@redhat.com>,
qemu-devel@nongnu.org
Cc: Hao Xiang <hao.xiang@linux.dev>,
"Liu, Yuan1" <yuan1.liu@intel.com>,
Shivam Kumar <shivam.kumar1@nutanix.com>,
"Ho-Ren (Jack) Chuang" <horenchuang@bytedance.com>,
Yichen Wang <yichen.wang@bytedance.com>,
Bryan Zhang <bryan.zhang@bytedance.com>
Subject: Re: [PATCH v7 03/12] util/dsa: Implement DSA device start and stop logic.
Date: Thu, 21 Nov 2024 11:11:03 -0300
Message-ID: <875xogsmmg.fsf@suse.de>
In-Reply-To: <20241114220132.27399-4-yichen.wang@bytedance.com>

Yichen Wang <yichen.wang@bytedance.com> writes:
> From: Hao Xiang <hao.xiang@linux.dev>
>
> * DSA device open and close.
> * DSA group contains multiple DSA devices.
> * DSA group configure/start/stop/clean.
>
> Signed-off-by: Hao Xiang <hao.xiang@linux.dev>
> Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
> Signed-off-by: Yichen Wang <yichen.wang@bytedance.com>
> ---
> include/qemu/dsa.h | 103 +++++++++++++++++
> util/dsa.c | 280 +++++++++++++++++++++++++++++++++++++++++++++
> util/meson.build | 3 +
> 3 files changed, 386 insertions(+)
> create mode 100644 include/qemu/dsa.h
> create mode 100644 util/dsa.c
>
> diff --git a/include/qemu/dsa.h b/include/qemu/dsa.h
> new file mode 100644
> index 0000000000..71686af28f
> --- /dev/null
> +++ b/include/qemu/dsa.h
> @@ -0,0 +1,103 @@
> +/*
> + * Interface for using Intel Data Streaming Accelerator to offload certain
> + * background operations.
> + *
> + * Copyright (C) Bytedance Ltd.
> + *
> + * Authors:
> + * Hao Xiang <hao.xiang@bytedance.com>
> + * Yichen Wang <yichen.wang@bytedance.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef QEMU_DSA_H
> +#define QEMU_DSA_H
> +
> +#include "qapi/error.h"
> +#include "qemu/thread.h"
> +#include "qemu/queue.h"
> +
> +#ifdef CONFIG_DSA_OPT
> +
> +#pragma GCC push_options
> +#pragma GCC target("enqcmd")
> +
> +#include <linux/idxd.h>
> +#include "x86intrin.h"
> +
> +typedef struct {
> + void *work_queue;
> +} QemuDsaDevice;
> +
> +typedef QSIMPLEQ_HEAD(QemuDsaTaskQueue, QemuDsaBatchTask) QemuDsaTaskQueue;
> +
> +typedef struct {
> + QemuDsaDevice *dsa_devices;
> + int num_dsa_devices;
> + /* The index of the next DSA device to be used. */
> + uint32_t device_allocator_index;
> + bool running;
> + QemuMutex task_queue_lock;
> + QemuCond task_queue_cond;
> + QemuDsaTaskQueue task_queue;
> +} QemuDsaDeviceGroup;
> +
> +/**
> + * @brief Initializes DSA devices.
> + *
> + * @param dsa_parameter A list of DSA device path from migration parameter.
> + *
> + * @return int Zero if successful, otherwise non zero.
> + */
> +int qemu_dsa_init(const strList *dsa_parameter, Error **errp);
> +
> +/**
> + * @brief Start logic to enable using DSA.
> + */
> +void qemu_dsa_start(void);
> +
> +/**
> + * @brief Stop the device group and the completion thread.
> + */
> +void qemu_dsa_stop(void);
> +
> +/**
> + * @brief Clean up system resources created for DSA offloading.
> + */
> +void qemu_dsa_cleanup(void);
> +
> +/**
> + * @brief Check if DSA is running.
> + *
> + * @return True if DSA is running, otherwise false.
> + */
> +bool qemu_dsa_is_running(void);
> +
> +#else
> +
> +static inline bool qemu_dsa_is_running(void)
> +{
> + return false;
> +}
> +
> +static inline int qemu_dsa_init(const strList *dsa_parameter, Error **errp)
> +{
> + if (dsa_parameter != NULL && strlen(dsa_parameter) != 0) {
> + error_setg(errp, "DSA is not supported.");
> + return -1;
> + }
> +
> + return 0;
> +}
> +
> +static inline void qemu_dsa_start(void) {}
> +
> +static inline void qemu_dsa_stop(void) {}
> +
> +static inline void qemu_dsa_cleanup(void) {}
> +
> +#endif
> +
> +#endif
> diff --git a/util/dsa.c b/util/dsa.c
> new file mode 100644
> index 0000000000..79dab5d62c
> --- /dev/null
> +++ b/util/dsa.c
> @@ -0,0 +1,280 @@
> +/*
> + * Use Intel Data Streaming Accelerator to offload certain background
> + * operations.
> + *
> + * Copyright (C) Bytedance Ltd.
> + *
> + * Authors:
> + * Hao Xiang <hao.xiang@bytedance.com>
> + * Bryan Zhang <bryan.zhang@bytedance.com>
> + * Yichen Wang <yichen.wang@bytedance.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qemu/queue.h"
> +#include "qemu/memalign.h"
> +#include "qemu/lockable.h"
> +#include "qemu/cutils.h"
> +#include "qemu/dsa.h"
> +#include "qemu/bswap.h"
> +#include "qemu/error-report.h"
> +#include "qemu/rcu.h"
> +
> +#pragma GCC push_options
> +#pragma GCC target("enqcmd")
> +
> +#include <linux/idxd.h>
> +#include "x86intrin.h"
> +
> +#define DSA_WQ_PORTAL_SIZE 4096
> +#define MAX_DSA_DEVICES 16
> +
> +uint32_t max_retry_count;
> +static QemuDsaDeviceGroup dsa_group;
> +
> +
> +/**
> + * @brief This function opens a DSA device's work queue and
> + * maps the DSA device memory into the current process.
> + *
> + * @param dsa_wq_path A pointer to the DSA device work queue's file path.
> + * @return A pointer to the mapped memory, or MAP_FAILED on failure.
> + */
> +static void *
> +map_dsa_device(const char *dsa_wq_path)
> +{
> + void *dsa_device;
> + int fd;
> +
> + fd = open(dsa_wq_path, O_RDWR);
> + if (fd < 0) {
> + error_report("Open %s failed with errno = %d.",
> + dsa_wq_path, errno);
> + return MAP_FAILED;
> + }
> + dsa_device = mmap(NULL, DSA_WQ_PORTAL_SIZE, PROT_WRITE,
> + MAP_SHARED | MAP_POPULATE, fd, 0);
> + close(fd);
> + if (dsa_device == MAP_FAILED) {
> + error_report("mmap failed with errno = %d.", errno);
> + return MAP_FAILED;
> + }
> + return dsa_device;
> +}
> +
> +/**
> + * @brief Initializes a DSA device structure.
> + *
> + * @param instance A pointer to the DSA device.
> + * @param work_queue A pointer to the DSA work queue.
> + */
> +static void
> +dsa_device_init(QemuDsaDevice *instance,
> + void *dsa_work_queue)
> +{
> + instance->work_queue = dsa_work_queue;
> +}
> +
> +/**
> + * @brief Cleans up a DSA device structure.
> + *
> + * @param instance A pointer to the DSA device to cleanup.
> + */
> +static void
> +dsa_device_cleanup(QemuDsaDevice *instance)
> +{
> + if (instance->work_queue != MAP_FAILED) {
> + munmap(instance->work_queue, DSA_WQ_PORTAL_SIZE);
> + }
> +}
> +
> +/**
> + * @brief Initializes a DSA device group.
> + *
> + * @param group A pointer to the DSA device group.
> + * @param dsa_parameter A list of DSA device path from are separated by space
> + * character migration parameter. Multiple DSA device path.
> + *
> + * @return Zero if successful, non-zero otherwise.
> + */
> +static int
> +dsa_device_group_init(QemuDsaDeviceGroup *group,
> + const strList *dsa_parameter,
> + Error **errp)
> +{
> + if (dsa_parameter == NULL) {
> + error_setg(errp, "dsa device path is not supplied.");
> + return -1;
> + }
> +
> + int ret = 0;
> + const char *dsa_path[MAX_DSA_DEVICES];
> + int num_dsa_devices = 0;
> +
> + while (dsa_parameter) {
> + dsa_path[num_dsa_devices++] = dsa_parameter->value;
> + if (num_dsa_devices == MAX_DSA_DEVICES) {
> + break;
> + }
> + dsa_parameter = dsa_parameter->next;
> + }
> +
> + group->dsa_devices =
> + g_new0(QemuDsaDevice, num_dsa_devices);
> + group->num_dsa_devices = num_dsa_devices;
> + group->device_allocator_index = 0;
> +
> + group->running = false;
> + qemu_mutex_init(&group->task_queue_lock);
> + qemu_cond_init(&group->task_queue_cond);
> + QSIMPLEQ_INIT(&group->task_queue);
> +
> + void *dsa_wq = MAP_FAILED;
> + for (int i = 0; i < num_dsa_devices; i++) {
> + dsa_wq = map_dsa_device(dsa_path[i]);
> + if (dsa_wq == MAP_FAILED) {
> + error_setg(errp, "map_dsa_device failed MAP_FAILED.");

This will assert if the mapping fails in more than one iteration, because
errp cannot be set twice. You'll have to test 'ret' outside of the loop,
before returning, and set the error there.
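
Something along these lines, as a standalone sketch rather than a drop-in
replacement. The Error type, sketch_error_setg() and sketch_map_device()
below are stand-ins invented for illustration (they are not QEMU's real
API); the stand-in setter only mimics the rule that *errp must still be
NULL when an error is set. The point is just the shape of the loop: record
the failure in 'ret', then set the error once after the loop.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object (illustration only). */
typedef struct Error {
    char msg[128];
} Error;

#define SKETCH_MAP_FAILED ((void *)-1)

/* Stand-in setter: like the real one, it refuses to overwrite an error. */
static void sketch_error_setg(Error **errp, const char *fmt, ...)
{
    va_list ap;
    Error *err;

    if (!errp) {
        return;
    }
    assert(*errp == NULL); /* setting errp twice is a programming error */
    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

/* Hypothetical mapper: any path starting with "bad" fails to map. */
static void *sketch_map_device(const char *path)
{
    static char fake_portal;

    if (strncmp(path, "bad", 3) == 0) {
        return SKETCH_MAP_FAILED;
    }
    return &fake_portal;
}

static int sketch_group_init(const char **paths, int n, void **wq,
                             Error **errp)
{
    int ret = 0;

    for (int i = 0; i < n; i++) {
        wq[i] = sketch_map_device(paths[i]);
        if (wq[i] == SKETCH_MAP_FAILED) {
            ret = -1; /* remember the failure, do not touch errp here */
        }
    }

    if (ret) {
        /* Set the error exactly once, however many devices failed. */
        sketch_error_setg(errp, "failed to map one or more DSA devices");
    }
    return ret;
}
```

With this shape a single error can no longer name every failing path. If
that matters, the loop could remember the first failing path and report
that one.
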
> + ret = -1;
> + }
> + dsa_device_init(&group->dsa_devices[i], dsa_wq);
> + }
> +
> + return ret;
> +}
> +
> +/**
> + * @brief Starts a DSA device group.
> + *
> + * @param group A pointer to the DSA device group.
> + */
> +static void
> +dsa_device_group_start(QemuDsaDeviceGroup *group)
> +{
> + group->running = true;
> +}
> +
> +/**
> + * @brief Stops a DSA device group.
> + *
> + * @param group A pointer to the DSA device group.
> + */
> +__attribute__((unused))
> +static void
> +dsa_device_group_stop(QemuDsaDeviceGroup *group)
> +{
> + group->running = false;
> +}
> +
> +/**
> + * @brief Cleans up a DSA device group.
> + *
> + * @param group A pointer to the DSA device group.
> + */
> +static void
> +dsa_device_group_cleanup(QemuDsaDeviceGroup *group)
> +{
> + if (!group->dsa_devices) {
> + return;
> + }
> + for (int i = 0; i < group->num_dsa_devices; i++) {
> + dsa_device_cleanup(&group->dsa_devices[i]);
> + }
> + g_free(group->dsa_devices);
> + group->dsa_devices = NULL;
> +
> + qemu_mutex_destroy(&group->task_queue_lock);
> + qemu_cond_destroy(&group->task_queue_cond);
> +}
> +
> +/**
> + * @brief Returns the next available DSA device in the group.
> + *
> + * @param group A pointer to the DSA device group.
> + *
> + * @return struct QemuDsaDevice* A pointer to the next available DSA device
> + * in the group.
> + */
> +__attribute__((unused))
> +static QemuDsaDevice *
> +dsa_device_group_get_next_device(QemuDsaDeviceGroup *group)
> +{
> + if (group->num_dsa_devices == 0) {
> + return NULL;
> + }
> + uint32_t current = qatomic_fetch_inc(&group->device_allocator_index);
> + current %= group->num_dsa_devices;
> + return &group->dsa_devices[current];
> +}
> +
> +/**
> + * @brief Check if DSA is running.
> + *
> + * @return True if DSA is running, otherwise false.
> + */
> +bool qemu_dsa_is_running(void)
> +{
> + return false;
> +}
> +
> +static void
> +dsa_globals_init(void)
> +{
> + max_retry_count = UINT32_MAX;
> +}
> +
> +/**
> + * @brief Initializes DSA devices.
> + *
> + * @param dsa_parameter A list of DSA device path from migration parameter.
> + *
> + * @return int Zero if successful, otherwise non zero.
> + */
> +int qemu_dsa_init(const strList *dsa_parameter, Error **errp)
> +{
> + dsa_globals_init();
> +
> + return dsa_device_group_init(&dsa_group, dsa_parameter, errp);
> +}
> +
> +/**
> + * @brief Start logic to enable using DSA.
> + *
> + */
> +void qemu_dsa_start(void)
> +{
> + if (dsa_group.num_dsa_devices == 0) {
> + return;
> + }
> + if (dsa_group.running) {
> + return;
> + }
> + dsa_device_group_start(&dsa_group);
> +}
> +
> +/**
> + * @brief Stop the device group and the completion thread.
> + *
> + */
> +void qemu_dsa_stop(void)
> +{
> + QemuDsaDeviceGroup *group = &dsa_group;
> +
> + if (!group->running) {
> + return;
> + }
> +}
> +
> +/**
> + * @brief Clean up system resources created for DSA offloading.
> + *
> + */
> +void qemu_dsa_cleanup(void)
> +{
> + qemu_dsa_stop();
> + dsa_device_group_cleanup(&dsa_group);
> +}
> +
> diff --git a/util/meson.build b/util/meson.build
> index 5d8bef9891..5ec2158f9e 100644
> --- a/util/meson.build
> +++ b/util/meson.build
> @@ -123,6 +123,9 @@ if cpu == 'aarch64'
> util_ss.add(files('cpuinfo-aarch64.c'))
> elif cpu in ['x86', 'x86_64']
> util_ss.add(files('cpuinfo-i386.c'))
> + if config_host_data.get('CONFIG_DSA_OPT')
> + util_ss.add(files('dsa.c'))
> + endif
> elif cpu == 'loongarch64'
> util_ss.add(files('cpuinfo-loongarch.c'))
> elif cpu in ['ppc', 'ppc64']