* [PATCH 0/2] io_uring r/w with metadata
@ 2025-07-23 17:04 Vincent Fu
2025-07-23 17:04 ` [PATCH 1/2] engines/io_uring: support " Vincent Fu
2025-07-23 17:04 ` [PATCH 2/2] t/io_uring_pi: test script for io_uring PI Vincent Fu
0 siblings, 2 replies; 7+ messages in thread
From: Vincent Fu @ 2025-07-23 17:04 UTC (permalink / raw)
To: anuj1072538, axboe, hch, joshi.k, anuj20.g, fio; +Cc: Vincent Fu
These two patches add support for io_uring read and write operations on
block devices with metadata. The first patch adds this support to the
io_uring ioengine, and the second adds a script to test this feature on
NVMe devices.
This code has been tested on the kernel below:
tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs-6.17.integrity
with the patch below applied
https://lore.kernel.org/linux-block/20250722120755.87501-1-anuj20.g@samsung.com/
This code has passed testing under the following settings:
QEMU NVMe device 512B data size; 8B, 16B, 64B metadata size; 16b Guard PI
QEMU NVMe device 4K data size; 16B, 64B metadata size; 64b Guard PI
I have not been able to run the following tests due to platform
issues:
QEMU NVMe device 4K data size; any metadata size; 16b Guard PI
(nvme format fails)
scsi_debug device
(module loaded with dif=1, 2, or 3, but PI support is not detected)
I would appreciate any help with the missing test platforms.
Vincent Fu (2):
engines/io_uring: support r/w with metadata
t/io_uring_pi: test script for io_uring PI
engines/io_uring.c | 241 ++++++++++++++++++++++++--
engines/nvme.c | 35 ++--
engines/nvme.h | 18 ++
io_u.h | 1 +
os/linux/io_uring.h | 15 ++
t/io_uring_pi.py | 408 ++++++++++++++++++++++++++++++++++++++++++++
6 files changed, 681 insertions(+), 37 deletions(-)
create mode 100644 t/io_uring_pi.py
--
2.47.2
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/2] engines/io_uring: support r/w with metadata
2025-07-23 17:04 [PATCH 0/2] io_uring r/w with metadata Vincent Fu
@ 2025-07-23 17:04 ` Vincent Fu
2025-07-23 17:23 ` Jens Axboe
2025-07-23 17:04 ` [PATCH 2/2] t/io_uring_pi: test script for io_uring PI Vincent Fu
1 sibling, 1 reply; 7+ messages in thread
From: Vincent Fu @ 2025-07-23 17:04 UTC (permalink / raw)
To: anuj1072538, axboe, hch, joshi.k, anuj20.g, fio; +Cc: Vincent Fu
Use the new FS_IOC_GETLBMD_CAP ioctl to query block devices for
metadata support. Then use the new io_uring read/write attribute
capability to send read and write commands with metadata.
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
---
engines/io_uring.c | 241 +++++++++++++++++++++++++++++++++++++++++---
engines/nvme.c | 35 +++----
engines/nvme.h | 18 ++++
io_u.h | 1 +
os/linux/io_uring.h | 15 +++
5 files changed, 273 insertions(+), 37 deletions(-)
diff --git a/engines/io_uring.c b/engines/io_uring.c
index 87018f84..180db5b5 100644
--- a/engines/io_uring.c
+++ b/engines/io_uring.c
@@ -30,6 +30,71 @@
#include <sys/stat.h>
+#ifndef IO_INTEGRITY_CHK_GUARD
+/* flags for integrity meta */
+#define IO_INTEGRITY_CHK_GUARD (1U << 0) /* enforce guard check */
+#define IO_INTEGRITY_CHK_REFTAG (1U << 1) /* enforce ref check */
+#define IO_INTEGRITY_CHK_APPTAG (1U << 2) /* enforce app check */
+#endif /* IO_INTEGRITY_CHK_GUARD */
+
+#ifndef FS_IOC_GETLBMD_CAP
+/* Protection info capability flags */
+#define LBMD_PI_CAP_INTEGRITY (1 << 0)
+#define LBMD_PI_CAP_REFTAG (1 << 1)
+
+/* Checksum types for Protection Information */
+#define LBMD_PI_CSUM_NONE 0
+#define LBMD_PI_CSUM_IP 1
+#define LBMD_PI_CSUM_CRC16_T10DIF 2
+#define LBMD_PI_CSUM_CRC64_NVME 4
+
+/*
+ * Logical block metadata capability descriptor
+ * If the device does not support metadata, all the fields will be zero.
+ * Applications must check lbmd_flags to determine whether metadata is
+ * supported or not.
+ */
+struct logical_block_metadata_cap {
+ /* Bitmask of logical block metadata capability flags */
+ __u32 lbmd_flags;
+ /*
+ * The amount of data described by each unit of logical block
+ * metadata
+ */
+ __u16 lbmd_interval;
+ /*
+ * Size in bytes of the logical block metadata associated with each
+ * interval
+ */
+ __u8 lbmd_size;
+ /*
+ * Size in bytes of the opaque block tag associated with each
+ * interval
+ */
+ __u8 lbmd_opaque_size;
+ /*
+ * Offset in bytes of the opaque block tag within the logical block
+ * metadata
+ */
+ __u8 lbmd_opaque_offset;
+ /* Size in bytes of the T10 PI tuple associated with each interval */
+ __u8 lbmd_pi_size;
+ /* Offset in bytes of T10 PI tuple within the logical block metadata */
+ __u8 lbmd_pi_offset;
+ /* T10 PI guard tag type */
+ __u8 lbmd_guard_tag_type;
+ /* Size in bytes of the T10 PI application tag */
+ __u8 lbmd_app_tag_size;
+ /* Size in bytes of the T10 PI reference tag */
+ __u8 lbmd_ref_tag_size;
+ /* Size in bytes of the T10 PI storage tag */
+ __u8 lbmd_storage_tag_size;
+ __u8 pad;
+};
+
+#define FS_IOC_GETLBMD_CAP _IOWR(0x15, 2, struct logical_block_metadata_cap)
+#endif /* FS_IOC_GETLBMD_CAP */
+
enum uring_cmd_type {
FIO_URING_CMD_NVME = 1,
};
@@ -73,6 +138,7 @@ struct ioring_data {
struct io_u **io_u_index;
char *md_buf;
+ char *pi_attr;
int *fds;
@@ -436,6 +502,27 @@ static int fio_ioring_prep(struct thread_data *td, struct io_u *io_u)
sqe->len = 1;
}
}
+ if (o->md_per_io_size) {
+ struct io_uring_attr_pi *pi_attr = io_u->pi_attr;
+ struct nvme_data *data = FILE_ENG_DATA(io_u->file);
+
+ sqe->attr_type_mask = IORING_RW_ATTR_FLAG_PI;
+ sqe->attr_ptr = (__u64)(uintptr_t)pi_attr;
+ pi_attr->addr = (__u64)(uintptr_t)io_u->mmap_data;
+ pi_attr->len = o->md_per_io_size;
+ pi_attr->app_tag = o->apptag;
+ pi_attr->flags = 0;
+ if (strstr(o->pi_chk, "GUARD") != NULL)
+ pi_attr->flags |= IO_INTEGRITY_CHK_GUARD;
+ if (strstr(o->pi_chk, "REFTAG") != NULL) {
+ __u64 slba = get_slba(data, io_u->offset);
+
+ pi_attr->flags |= IO_INTEGRITY_CHK_REFTAG;
+ pi_attr->seed = (__u32)slba;
+ }
+ if (strstr(o->pi_chk, "APPTAG") != NULL)
+ pi_attr->flags |= IO_INTEGRITY_CHK_APPTAG;
+ }
sqe->rw_flags = 0;
if (!td->o.odirect && o->uncached)
sqe->rw_flags |= RWF_DONTCACHE;
@@ -565,6 +652,7 @@ static int fio_ioring_cmd_prep(struct thread_data *td, struct io_u *io_u)
static struct io_u *fio_ioring_event(struct thread_data *td, int event)
{
struct ioring_data *ld = td->io_ops_data;
+ struct ioring_options *o = td->eo;
struct io_uring_cqe *cqe;
struct io_u *io_u;
unsigned index;
@@ -590,6 +678,19 @@ static struct io_u *fio_ioring_event(struct thread_data *td, int event)
io_u->error = -cqe->res;
else
io_u->resid = io_u->xfer_buflen - cqe->res;
+ return io_u;
+ }
+
+ if (o->md_per_io_size) {
+ struct nvme_data *data;
+ int ret;
+
+ data = FILE_ENG_DATA(io_u->file);
+ if (data->pi_type && (io_u->ddir == DDIR_READ) && !o->pi_act) {
+ ret = fio_nvme_pi_verify(data, io_u);
+ if (ret)
+ io_u->error = ret;
+ }
}
return io_u;
@@ -802,6 +903,21 @@ static enum fio_q_status fio_ioring_queue(struct thread_data *td,
o->cmd_type == FIO_URING_CMD_NVME)
fio_ioring_cmd_nvme_pi(td, io_u);
+ if (!strcmp(td->io_ops->name, "io_uring") && o->md_per_io_size) {
+ struct nvme_data *data = FILE_ENG_DATA(io_u->file);
+ struct nvme_cmd_ext_io_opts ext_opts = {0};
+
+ if (data->pi_type) {
+ if (o->pi_act)
+ ext_opts.io_flags |= NVME_IO_PRINFO_PRACT;
+
+ ext_opts.io_flags |= o->prchk;
+ ext_opts.apptag = o->apptag;
+ ext_opts.apptag_mask = o->apptag_mask;
+ }
+ fio_nvme_generate_guard(io_u, &ext_opts);
+ }
+
tail = *ring->tail;
ring->array[tail & ld->sq_ring_mask] = io_u->index;
atomic_store_release(ring->tail, tail + 1);
@@ -920,6 +1036,7 @@ static void fio_ioring_cleanup(struct thread_data *td)
fio_cmdprio_cleanup(&ld->cmdprio);
free(ld->io_u_index);
free(ld->md_buf);
+ free(ld->pi_attr);
free(ld->iovecs);
free(ld->fds);
free(ld->dsm);
@@ -1347,12 +1464,20 @@ static int fio_ioring_init(struct thread_data *td)
/* io_u index */
ld->io_u_index = calloc(td->o.iodepth, sizeof(struct io_u *));
+ if (!strcmp(td->io_ops->name, "io_uring") && o->md_per_io_size) {
+ if (o->apptag_mask != 0xffff) {
+ log_err("fio: io_uring with metadata requires an apptag_mask of 0xffff\n");
+ return 1;
+ }
+ }
+
/*
* metadata buffer for nvme command.
* We are only supporting iomem=malloc / mem=malloc as of now.
*/
- if (!strcmp(td->io_ops->name, "io_uring_cmd") &&
- (o->cmd_type == FIO_URING_CMD_NVME) && o->md_per_io_size) {
+ if ((!strcmp(td->io_ops->name, "io_uring_cmd") &&
+ (o->cmd_type == FIO_URING_CMD_NVME) && o->md_per_io_size) ||
+ (!strcmp(td->io_ops->name, "io_uring") && o->md_per_io_size)) {
md_size = (unsigned long long) o->md_per_io_size
* (unsigned long long) td->o.iodepth;
md_size += page_mask + td->o.mem_align;
@@ -1363,6 +1488,12 @@ static int fio_ioring_init(struct thread_data *td)
free(ld);
return 1;
}
+ ld->pi_attr = calloc(ld->iodepth, sizeof(struct io_uring_attr_pi));
+ if (!ld->pi_attr) {
+ free(ld->md_buf);
+ free(ld);
+ return 1;
+ }
}
parse_prchk_flags(o);
@@ -1429,22 +1560,23 @@ static int fio_ioring_io_u_init(struct thread_data *td, struct io_u *io_u)
struct ioring_data *ld = td->io_ops_data;
struct ioring_options *o = td->eo;
struct nvme_pi_data *pi_data;
- char *p;
+ char *p, *q;
ld->io_u_index[io_u->index] = io_u;
- if (!strcmp(td->io_ops->name, "io_uring_cmd")) {
- p = PTR_ALIGN(ld->md_buf, page_mask) + td->o.mem_align;
- p += o->md_per_io_size * io_u->index;
- io_u->mmap_data = p;
-
- if (!o->pi_act) {
- pi_data = calloc(1, sizeof(*pi_data));
- pi_data->io_flags |= o->prchk;
- pi_data->apptag_mask = o->apptag_mask;
- pi_data->apptag = o->apptag;
- io_u->engine_data = pi_data;
- }
+ p = PTR_ALIGN(ld->md_buf, page_mask) + td->o.mem_align;
+ p += o->md_per_io_size * io_u->index;
+ io_u->mmap_data = p;
+ q = ld->pi_attr;
+ q += (sizeof(struct io_uring_attr_pi) * io_u->index);
+ io_u->pi_attr = q;
+
+ if (!o->pi_act) {
+ pi_data = calloc(1, sizeof(*pi_data));
+ pi_data->io_flags |= o->prchk;
+ pi_data->apptag_mask = o->apptag_mask;
+ pi_data->apptag = o->apptag;
+ io_u->engine_data = pi_data;
}
return 0;
@@ -1463,11 +1595,90 @@ static void fio_ioring_io_u_free(struct thread_data *td, struct io_u *io_u)
}
}
+static int fio_get_pi_info(struct fio_file *f, __u64 *nlba, __u32 pi_act,
+ struct nvme_data *data)
+{
+ struct logical_block_metadata_cap md_cap;
+ int ret;
+ int fd, err = 0;
+
+ fd = open(f->file_name, O_RDONLY);
+ if (fd < 0)
+ return -errno;
+
+ ret = ioctl(fd, FS_IOC_GETLBMD_CAP, &md_cap);
+ if (ret < 0) {
+ err = -errno;
+ log_err("%s: failed to query protection information capabilities; error %d\n", f->file_name, errno);
+ goto out;
+ }
+
+ if (!(md_cap.lbmd_flags & LBMD_PI_CAP_INTEGRITY)) {
+ log_err("%s: Protection information not supported\n", f->file_name);
+ err = -ENOTSUP;
+ goto out;
+ }
+
+ /* Currently we don't support storage tags */
+ if (md_cap.lbmd_storage_tag_size) {
+ log_err("%s: Storage tag not supported\n", f->file_name);
+ err = -ENOTSUP;
+ goto out;
+ }
+
+ data->lba_size = md_cap.lbmd_interval;
+ data->lba_shift = ilog2(data->lba_size);
+ data->ms = md_cap.lbmd_size;
+ data->pi_size = md_cap.lbmd_pi_size;
+ data->pi_loc = !(md_cap.lbmd_pi_offset);
+
+ /* Assume Type 1 PI if reference tags supported */
+ if (md_cap.lbmd_flags & LBMD_PI_CAP_REFTAG)
+ data->pi_type = NVME_NS_DPS_PI_TYPE1;
+ else
+ data->pi_type = NVME_NS_DPS_PI_TYPE3;
+
+ switch (md_cap.lbmd_guard_tag_type) {
+ case LBMD_PI_CSUM_CRC16_T10DIF:
+ data->guard_type = NVME_NVM_NS_16B_GUARD;
+ break;
+ case LBMD_PI_CSUM_CRC64_NVME:
+ data->guard_type = NVME_NVM_NS_64B_GUARD;
+ break;
+ default:
+ log_err("%s: unsupported checksum type %d\n", f->file_name, md_cap.lbmd_guard_tag_type);
+ err = -ENOTSUP;
+ goto out;
+ }
+
+out:
+ close(fd);
+ return err;
+}
+
static int fio_ioring_open_file(struct thread_data *td, struct fio_file *f)
{
struct ioring_data *ld = td->io_ops_data;
struct ioring_options *o = td->eo;
+ if (o->md_per_io_size) {
+ struct nvme_data *data = NULL;
+ __u64 nlba = 0;
+ int ret;
+
+ data = FILE_ENG_DATA(f);
+ if (data == NULL) {
+ data = calloc(1, sizeof(struct nvme_data));
+ ret = fio_get_pi_info(f, &nlba, o->pi_act, data);
+ if (ret) {
+ free(data);
+ return ret;
+ }
+
+ FILE_SET_ENG_DATA(f, data);
+ }
+ }
+
if (!ld || !o->registerfiles)
return generic_open_file(td, f);
diff --git a/engines/nvme.c b/engines/nvme.c
index 37a31e2f..4b3d3860 100644
--- a/engines/nvme.c
+++ b/engines/nvme.c
@@ -8,22 +8,6 @@
#include "../crc/crc-t10dif.h"
#include "../crc/crc64.h"
-static inline __u64 get_slba(struct nvme_data *data, __u64 offset)
-{
- if (data->lba_ext)
- return offset / data->lba_ext;
-
- return offset >> data->lba_shift;
-}
-
-static inline __u32 get_nlb(struct nvme_data *data, __u64 len)
-{
- if (data->lba_ext)
- return len / data->lba_ext - 1;
-
- return (len >> data->lba_shift) - 1;
-}
-
static void fio_nvme_generate_pi_16b_guard(struct nvme_data *data,
struct io_u *io_u,
struct nvme_cmd_ext_io_opts *opts)
@@ -421,14 +405,9 @@ int fio_nvme_uring_cmd_prep(struct nvme_uring_cmd *cmd, struct io_u *io_u,
return 0;
}
-void fio_nvme_pi_fill(struct nvme_uring_cmd *cmd, struct io_u *io_u,
- struct nvme_cmd_ext_io_opts *opts)
+void fio_nvme_generate_guard(struct io_u *io_u, struct nvme_cmd_ext_io_opts *opts)
{
struct nvme_data *data = FILE_ENG_DATA(io_u->file);
- __u64 slba;
-
- slba = get_slba(data, io_u->offset);
- cmd->cdw12 |= opts->io_flags;
if (data->pi_type && !(opts->io_flags & NVME_IO_PRINFO_PRACT)) {
if (data->guard_type == NVME_NVM_NS_16B_GUARD)
@@ -436,6 +415,18 @@ void fio_nvme_pi_fill(struct nvme_uring_cmd *cmd, struct io_u *io_u,
else if (data->guard_type == NVME_NVM_NS_64B_GUARD)
fio_nvme_generate_pi_64b_guard(data, io_u, opts);
}
+}
+
+void fio_nvme_pi_fill(struct nvme_uring_cmd *cmd, struct io_u *io_u,
+ struct nvme_cmd_ext_io_opts *opts)
+{
+ struct nvme_data *data = FILE_ENG_DATA(io_u->file);
+ __u64 slba;
+
+ slba = get_slba(data, io_u->offset);
+ cmd->cdw12 |= opts->io_flags;
+
+ fio_nvme_generate_guard(io_u, opts);
switch (data->pi_type) {
case NVME_NS_DPS_PI_TYPE1:
diff --git a/engines/nvme.h b/engines/nvme.h
index 60b38d7f..4371eb5b 100644
--- a/engines/nvme.h
+++ b/engines/nvme.h
@@ -438,6 +438,8 @@ int fio_nvme_uring_cmd_prep(struct nvme_uring_cmd *cmd, struct io_u *io_u,
void fio_nvme_pi_fill(struct nvme_uring_cmd *cmd, struct io_u *io_u,
struct nvme_cmd_ext_io_opts *opts);
+void fio_nvme_generate_guard(struct io_u *io_u, struct nvme_cmd_ext_io_opts *opts);
+
int fio_nvme_pi_verify(struct nvme_data *data, struct io_u *io_u);
int fio_nvme_get_zoned_model(struct thread_data *td, struct fio_file *f,
@@ -476,4 +478,20 @@ static inline bool fio_nvme_pi_ref_escape(__u8 *reftag)
return memcmp(reftag, ref_esc, sizeof(ref_esc)) == 0;
}
+static inline __u64 get_slba(struct nvme_data *data, __u64 offset)
+{
+ if (data->lba_ext)
+ return offset / data->lba_ext;
+
+ return offset >> data->lba_shift;
+}
+
+static inline __u32 get_nlb(struct nvme_data *data, __u64 len)
+{
+ if (data->lba_ext)
+ return len / data->lba_ext - 1;
+
+ return (len >> data->lba_shift) - 1;
+}
+
#endif
diff --git a/io_u.h b/io_u.h
index 178c1229..2d20a2b2 100644
--- a/io_u.h
+++ b/io_u.h
@@ -145,6 +145,7 @@ struct io_u {
#endif
void *mmap_data;
};
+ void *pi_attr;
};
/*
diff --git a/os/linux/io_uring.h b/os/linux/io_uring.h
index b3876381..7b099902 100644
--- a/os/linux/io_uring.h
+++ b/os/linux/io_uring.h
@@ -70,6 +70,10 @@ struct io_uring_sqe {
__u64 addr3;
__u64 __pad2[1];
};
+ struct {
+ __u64 attr_ptr; /* pointer to attribute information */
+ __u64 attr_type_mask; /* bit mask of attributes */
+ };
/*
* If the ring is initialized with IORING_SETUP_SQE128, then
* this field is used for 80 bytes of arbitrary command data
@@ -78,6 +82,17 @@ struct io_uring_sqe {
};
};
+/* sqe->attr_type_mask flags */
+#define IORING_RW_ATTR_FLAG_PI (1U << 0)
+/* PI attribute information */
+struct io_uring_attr_pi {
+ __u16 flags;
+ __u16 app_tag;
+ __u32 len;
+ __u64 addr;
+ __u64 seed;
+ __u64 rsvd;
+};
enum {
IOSQE_FIXED_FILE_BIT,
IOSQE_IO_DRAIN_BIT,
--
2.47.2
* [PATCH 2/2] t/io_uring_pi: test script for io_uring PI
2025-07-23 17:04 [PATCH 0/2] io_uring r/w with metadata Vincent Fu
2025-07-23 17:04 ` [PATCH 1/2] engines/io_uring: support " Vincent Fu
@ 2025-07-23 17:04 ` Vincent Fu
1 sibling, 0 replies; 7+ messages in thread
From: Vincent Fu @ 2025-07-23 17:04 UTC (permalink / raw)
To: anuj1072538, axboe, hch, joshi.k, anuj20.g, fio; +Cc: Vincent Fu
Test io_uring metadata support on NVMe devices. The test is limited to
direct=1 with metadata in a separate buffer.
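For reference, here is a standalone sketch (the helper name is
illustrative) of how the script sizes each job: the metadata buffer must
cover the largest block in the bsrange, so md_per_io_size scales with
the upper block count:

```python
def derive_io_params(lbaf, bs_low, bs_high):
    """Derive fio bsrange and md_per_io_size for one LBA format.

    lbaf is a dict with 'ds' (data size in bytes) and 'ms' (metadata
    size in bytes per block), as reported by nvme id-ns.
    """
    blocksize = lbaf['ds']
    return {
        # I/O sizes range from bs_low to bs_high blocks
        'bsrange': f"{blocksize * bs_low}-{blocksize * bs_high}",
        # the metadata buffer must cover the largest possible I/O
        'md_per_io_size': lbaf['ms'] * bs_high,
    }

print(derive_io_params({'ds': 512, 'ms': 8}, 1, 16))
```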
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
---
t/io_uring_pi.py | 408 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 408 insertions(+)
create mode 100644 t/io_uring_pi.py
diff --git a/t/io_uring_pi.py b/t/io_uring_pi.py
new file mode 100644
index 00000000..bd92edfd
--- /dev/null
+++ b/t/io_uring_pi.py
@@ -0,0 +1,408 @@
+#!/usr/bin/env python3
+
+"""
+# io_uring_pi.py
+#
+# Test metadata support using the io_uring ioengine.
+#
+# USAGE
+# See python3 io_uring_pi.py --help
+#
+# EXAMPLES (THIS IS A DESTRUCTIVE TEST!!)
+# python3 t/io_uring_pi.py --dut /dev/nvme1n1 -f ./fio
+#
+# REQUIREMENTS
+# Python 3.6
+#
+"""
+
+import os
+import sys
+import json
+import time
+import locale
+import logging
+import argparse
+import itertools
+import subprocess
+from pathlib import Path
+from fiotestlib import FioJobCmdTest, run_fio_tests
+from fiotestcommon import SUCCESS_NONZERO
+
+
+NUMBER_IOS = 8192
+BS_LOW = 1
+BS_HIGH = 16
+
+class DifDixTest(FioJobCmdTest):
+ """
+ NVMe DIF/DIX test class.
+ """
+
+ def setup(self, parameters):
+ """Setup a test."""
+
+ fio_args = [
+ "--name=io_uring_pi",
+ "--ioengine=io_uring",
+ "--direct=1",
+ f"--filename={self.fio_opts['filename']}",
+ f"--rw={self.fio_opts['rw']}",
+ f"--bsrange={self.fio_opts['bsrange']}",
+ f"--output={os.path.basename(self.filenames['output'])}",
+ f"--md_per_io_size={self.fio_opts['md_per_io_size']}",
+ "--pi_act=0",
+ f"--pi_chk={self.fio_opts['pi_chk']}",
+ f"--apptag={self.fio_opts['apptag']}",
+ f"--apptag_mask={self.fio_opts['apptag_mask']}",
+ ]
+ for opt in ['fixedbufs', 'nonvectored', 'force_async', 'registerfiles',
+ 'sqthread_poll', 'sqthread_poll_cpu', 'hipri', 'nowait',
+ 'time_based', 'runtime', 'verify', 'io_size', 'offset', 'number_ios',
+ 'output-format']:
+ if opt in self.fio_opts:
+ option = f"--{opt}={self.fio_opts[opt]}"
+ fio_args.append(option)
+
+ super().setup(fio_args)
+
+
+TEST_LIST = [
+#
+# Write data with pi_act=0 and then read the data back
+#
+ {
+ # Write workload with variable IO sizes
+ # pi_act=0
+ "test_id": 101,
+ "fio_opts": {
+ "rw": 'write',
+ "number_ios": NUMBER_IOS,
+ "output-format": "json",
+ "apptag": "0x8888",
+ "apptag_mask": "0xFFFF",
+ "pi_act": 0,
+ },
+ "pi_chk": "GUARD,REFTAG,APPTAG",
+ "bs_low": BS_LOW,
+ "bs_high": BS_HIGH,
+ "test_class": DifDixTest,
+ },
+ {
+ # Read workload with fixed small IO size
+ # pi_act=0
+ "test_id": 102,
+ "fio_opts": {
+ "rw": 'read',
+ "number_ios": NUMBER_IOS,
+ "output-format": "json",
+ "apptag": "0x8888",
+ "apptag_mask": "0xFFFF",
+ "pi_act": 0,
+ },
+ "pi_chk": "GUARD,REFTAG,APPTAG",
+ "bs_low": BS_LOW,
+ "bs_high": BS_LOW,
+ "test_class": DifDixTest,
+ },
+ {
+ # Read workload with variable IO size
+ # pi_act=0
+ "test_id": 103,
+ "fio_opts": {
+ "rw": 'read',
+ "number_ios": NUMBER_IOS,
+ "output-format": "json",
+ "apptag": "0x8888",
+ "apptag_mask": "0xFFFF",
+ "pi_act": 0,
+ },
+ "pi_chk": "GUARD,REFTAG,APPTAG",
+ "bs_low": BS_LOW,
+ "bs_high": BS_HIGH,
+ "test_class": DifDixTest,
+ },
+ {
+ # Read workload with variable IO size
+ # trigger apptag mismatch error
+ # pi_act=0
+ "test_id": 104,
+ "fio_opts": {
+ "rw": 'read',
+ "number_ios": NUMBER_IOS,
+ "output-format": "json",
+ "apptag": "0xA888",
+ "apptag_mask": "0xFFFF",
+ "pi_act": 0,
+ },
+ "pi_chk": "GUARD,REFTAG,APPTAG",
+ "bs_low": BS_LOW,
+ "bs_high": BS_HIGH,
+ "success": SUCCESS_NONZERO,
+ "test_class": DifDixTest,
+ },
+ {
+ # Read workload with variable IO size
+ # fails because apptag mask must be 0xFFFF
+ # pi_act=0
+ "test_id": 105,
+ "fio_opts": {
+ "rw": 'read',
+ "number_ios": NUMBER_IOS,
+ "output-format": "json",
+ "apptag": "0xF888",
+ "apptag_mask": "0x0FFF",
+ "pi_act": 0,
+ },
+ "pi_chk": "GUARD,REFTAG,APPTAG",
+ "bs_low": BS_LOW,
+ "bs_high": BS_HIGH,
+ "success": SUCCESS_NONZERO,
+ "test_class": DifDixTest,
+ },
+]
+
+
+def get_lbafs(args):
+ """
+ Determine which LBA formats to use. Use either the ones specified on the
+ command line or if none are specified query the device and use all lba
+ formats with metadata.
+ """
+ lbaf_list = []
+ id_ns_cmd = f"sudo nvme id-ns --output-format=json {args.dut}".split(' ')
+ id_ns_output = subprocess.check_output(id_ns_cmd)
+ lbafs = json.loads(id_ns_output)['lbafs']
+ if args.lbaf:
+ for lbaf in args.lbaf:
+ lbaf_list.append({'lbaf': lbaf, 'ds': 2 ** lbafs[lbaf]['ds'],
+ 'ms': lbafs[lbaf]['ms'], })
+ if lbafs[lbaf]['ms'] == 0:
+ print(f'Error: lbaf {lbaf} has metadata size zero')
+ sys.exit(1)
+ else:
+ for lbaf_num, lbaf in enumerate(lbafs):
+ if lbaf['ms'] != 0:
+ lbaf_list.append({'lbaf': lbaf_num, 'ds': 2 ** lbaf['ds'],
+ 'ms': lbaf['ms'], })
+
+ return lbaf_list
+
+
+def get_guard_pi(lbaf_list, args):
+ """
+ Find out how many bits of guard protection information are associated with
+ each lbaf to be used. If this is not available assume 16-bit guard pi.
+ Also record the bytes of protection information associated with the number
+ of guard PI bits.
+ """
+ nvm_id_ns_cmd = f"sudo nvme nvm-id-ns --output-format=json {args.dut}".split(' ')
+ try:
+ nvm_id_ns_output = subprocess.check_output(nvm_id_ns_cmd)
+ except subprocess.CalledProcessError:
+ print(f"Non-zero return code from {' '.join(nvm_id_ns_cmd)}; " \
+ "assuming all lbafs use 16b Guard Protection Information")
+ for lbaf in lbaf_list:
+ lbaf['guard_pi_bits'] = 16
+ else:
+ elbafs = json.loads(nvm_id_ns_output)['elbafs']
+ for elbaf_num, elbaf in enumerate(elbafs):
+ for lbaf in lbaf_list:
+ if lbaf['lbaf'] == elbaf_num:
+ lbaf['guard_pi_bits'] = 16 << elbaf['pif']
+
+ # For 16b Guard Protection Information, the PI requires 8 bytes
+ # For 32b and 64b Guard PI, the PI requires 16 bytes
+ for lbaf in lbaf_list:
+ if lbaf['guard_pi_bits'] == 16:
+ lbaf['pi_bytes'] = 8
+ else:
+ lbaf['pi_bytes'] = 16
+
+
+def get_capabilities(args):
+ """
+ Determine what end-to-end data protection features the device supports.
+ """
+ caps = { 'pil': [], 'pitype': [], 'elba': [] }
+ id_ns_cmd = f"sudo nvme id-ns --output-format=json {args.dut}".split(' ')
+ id_ns_output = subprocess.check_output(id_ns_cmd)
+ id_ns_json = json.loads(id_ns_output)
+
+ mc = id_ns_json['mc']
+ if mc & 1:
+ caps['elba'].append(1)
+ if mc & 2:
+ caps['elba'].append(0)
+
+ dpc = id_ns_json['dpc']
+ if dpc & 1:
+ caps['pitype'].append(1)
+ if dpc & 2:
+ caps['pitype'].append(2)
+ if dpc & 4:
+ caps['pitype'].append(3)
+ if dpc & 8:
+ caps['pil'].append(1)
+ if dpc & 16:
+ caps['pil'].append(0)
+
+ for _, value in caps.items():
+ if len(value) == 0:
+ logging.error("One or more end-to-end data protection features unsupported: %s", caps)
+ sys.exit(-1)
+
+ return caps
+
+
+def format_device(args, lbaf, pitype, pil, elba):
+ """
+ Format device using specified lba format with specified pitype, pil, and
+ elba values.
+ """
+
+ format_cmd = f"sudo nvme format {args.dut} --lbaf={lbaf['lbaf']} " \
+ f"--pi={pitype} --pil={pil} --ms={elba} --force"
+ logging.debug("Format command: %s", format_cmd)
+ format_cmd = format_cmd.split(' ')
+ format_cmd_result = subprocess.run(format_cmd, capture_output=True, check=False,
+ encoding=locale.getpreferredencoding())
+
+ # Sometimes nvme-cli may format the device successfully but fail to
+ # rescan the namespaces after the format. Continue if this happens but
+ # abort if some other error occurs.
+ if format_cmd_result.returncode != 0:
+ if 'failed to rescan namespaces' not in format_cmd_result.stderr \
+ or 'Success formatting namespace' not in format_cmd_result.stdout:
+ logging.error(format_cmd_result.stdout)
+ logging.error(format_cmd_result.stderr)
+ print("Unable to format device; skipping this configuration")
+ return False
+
+ logging.debug(format_cmd_result.stdout)
+ return True
+
+
+def parse_args():
+ """Parse command-line arguments."""
+
+ parser = argparse.ArgumentParser()
+ parser.add_argument('-d', '--debug', help='Enable debug messages', action='store_true')
+ parser.add_argument('-f', '--fio', help='path to file executable (e.g., ./fio)')
+ parser.add_argument('-a', '--artifact-root', help='artifact root directory')
+ parser.add_argument('-s', '--skip', nargs='+', type=int,
+ help='list of test(s) to skip')
+ parser.add_argument('-o', '--run-only', nargs='+', type=int,
+ help='list of test(s) to run, skipping all others')
+ parser.add_argument('--dut', help='target device to test '
+ '(e.g., /dev/nvme1n1). WARNING: THIS IS A DESTRUCTIVE TEST', required=True)
+ parser.add_argument('-l', '--lbaf', nargs='+', type=int,
+ help='list of lba formats to test')
+ args = parser.parse_args()
+
+ return args
+
+
+def difdix_test(test_env, args, lbaf, pitype):
+ """
+ Adjust test arguments based on values of lbaf, and pitype. Then run
+ the tests.
+ """
+ for test in TEST_LIST:
+ test['force_skip'] = False
+
+ blocksize = lbaf['ds']
+ # Set fio blocksize parameter at runtime
+ test['fio_opts']['md_per_io_size'] = lbaf['ms'] * test['bs_high']
+
+ test['fio_opts']['bsrange'] = f"{blocksize * test['bs_low']}-{blocksize * test['bs_high']}"
+
+ # Set fio pi_chk parameter at runtime. If the device is formatted
+ # with Type 3 protection information, this means that the reference
+ # tag is not checked and I/O commands may throw an error if they
+ # are submitted with the REFTAG bit set in pi_chk. Make sure fio
+ # does not set pi_chk's REFTAG bit if the device is formatted with
+ # Type 3 PI.
+ if 'pi_chk' in test:
+ if pitype == 3 and 'REFTAG' in test['pi_chk']:
+ test['fio_opts']['pi_chk'] = test['pi_chk'].replace('REFTAG','')
+ logging.debug("Type 3 PI: dropping REFTAG bit")
+ else:
+ test['fio_opts']['pi_chk'] = test['pi_chk']
+
+ logging.debug("Test %d: pi_act=%d, bsrange=%s, md_per_io_size=%d", test['test_id'],
+ test['fio_opts']['pi_act'], test['fio_opts']['bsrange'],
+ test['fio_opts']['md_per_io_size'])
+
+ return run_fio_tests(TEST_LIST, test_env, args)
+
+
+def main():
+ args = parse_args()
+
+ if args.debug:
+ logging.basicConfig(level=logging.DEBUG)
+ else:
+ logging.basicConfig(level=logging.INFO)
+
+ artifact_root = args.artifact_root if args.artifact_root else \
+ f"io_uring_pi-test-{time.strftime('%Y%m%d-%H%M%S')}"
+ os.mkdir(artifact_root)
+ print(f"Artifact directory is {artifact_root}")
+
+ if args.fio:
+ fio_path = str(Path(args.fio).absolute())
+ else:
+ fio_path = os.path.join(os.path.dirname(__file__), '../fio')
+ print(f"fio path is {fio_path}")
+
+ lbaf_list = get_lbafs(args)
+ get_guard_pi(lbaf_list, args)
+ caps = get_capabilities(args)
+ print("Device capabilities:", caps)
+
+ for test in TEST_LIST:
+ test['fio_opts']['filename'] = args.dut
+ test_env = {
+ 'fio_path': fio_path,
+ 'fio_root': str(Path(__file__).absolute().parent.parent),
+ 'artifact_root': artifact_root,
+ 'basename': 'io_uring_pi',
+ }
+
+ total = { 'passed': 0, 'failed': 0, 'skipped': 0 }
+
+ try:
+ for lbaf, pil, pitype in itertools.product(lbaf_list, caps['pil'], caps['pitype']):
+ if lbaf['ms'] == 0:
+ continue
+
+ print("\n")
+ print("-" * 120)
+ print(f"lbaf: {lbaf}, pil: {pil}, pitype: {pitype}")
+ print("-" * 120)
+
+ if not format_device(args, lbaf, pitype, pil, 0):
+ print("Formatting failed")
+ continue
+
+ test_env['artifact_root'] = \
+ os.path.join(artifact_root, f"lbaf{lbaf['lbaf']}pil{pil}pitype{pitype}")
+ os.mkdir(test_env['artifact_root'])
+
+ passed, failed, skipped = difdix_test(test_env, args, lbaf, pitype)
+
+ total['passed'] += passed
+ total['failed'] += failed
+ total['skipped'] += skipped
+
+ except KeyboardInterrupt:
+ pass
+
+ print(f"\n\n{total['passed']} test(s) passed, {total['failed']} failed, " \
+ f"{total['skipped']} skipped")
+ sys.exit(total['failed'])
+
+
+if __name__ == '__main__':
+ main()
--
2.47.2
* Re: [PATCH 1/2] engines/io_uring: support r/w with metadata
2025-07-23 17:04 ` [PATCH 1/2] engines/io_uring: support " Vincent Fu
@ 2025-07-23 17:23 ` Jens Axboe
2025-07-23 17:37 ` Jens Axboe
0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2025-07-23 17:23 UTC (permalink / raw)
To: Vincent Fu, anuj1072538, hch, joshi.k, anuj20.g, fio; +Cc: Vincent Fu
On 7/23/25 11:04 AM, Vincent Fu wrote:
> @@ -436,6 +502,27 @@ static int fio_ioring_prep(struct thread_data *td, struct io_u *io_u)
> sqe->len = 1;
> }
> }
> + if (o->md_per_io_size) {
> + struct io_uring_attr_pi *pi_attr = io_u->pi_attr;
> + struct nvme_data *data = FILE_ENG_DATA(io_u->file);
> +
> + sqe->attr_type_mask = IORING_RW_ATTR_FLAG_PI;
> + sqe->attr_ptr = (__u64)(uintptr_t)pi_attr;
> + pi_attr->addr = (__u64)(uintptr_t)io_u->mmap_data;
> + pi_attr->len = o->md_per_io_size;
> + pi_attr->app_tag = o->apptag;
> + pi_attr->flags = 0;
> + if (strstr(o->pi_chk, "GUARD") != NULL)
> + pi_attr->flags |= IO_INTEGRITY_CHK_GUARD;
> + if (strstr(o->pi_chk, "REFTAG") != NULL) {
> + __u64 slba = get_slba(data, io_u->offset);
> +
> + pi_attr->flags |= IO_INTEGRITY_CHK_REFTAG;
> + pi_attr->seed = (__u32)slba;
> + }
> + if (strstr(o->pi_chk, "APPTAG") != NULL)
> + pi_attr->flags |= IO_INTEGRITY_CHK_APPTAG;
> + }
I'd move these out-of-line whenever possible. It both improves
readability and conveys that this isn't necessarily the normal/hot
path.
> @@ -590,6 +678,19 @@ static struct io_u *fio_ioring_event(struct thread_data *td, int event)
> io_u->error = -cqe->res;
> else
> io_u->resid = io_u->xfer_buflen - cqe->res;
> + return io_u;
> + }
> +
> + if (o->md_per_io_size) {
> + struct nvme_data *data;
> + int ret;
> +
> + data = FILE_ENG_DATA(io_u->file);
> + if (data->pi_type && (io_u->ddir == DDIR_READ) && !o->pi_act) {
> + ret = fio_nvme_pi_verify(data, io_u);
> + if (ret)
> + io_u->error = ret;
> + }
> }
Ditto
> + if (!strcmp(td->io_ops->name, "io_uring") && o->md_per_io_size) {
> + struct nvme_data *data = FILE_ENG_DATA(io_u->file);
> + struct nvme_cmd_ext_io_opts ext_opts = {0};
> +
> + if (data->pi_type) {
> + if (o->pi_act)
> + ext_opts.io_flags |= NVME_IO_PRINFO_PRACT;
> +
> + ext_opts.io_flags |= o->prchk;
> + ext_opts.apptag = o->apptag;
> + ext_opts.apptag_mask = o->apptag_mask;
> + }
> + fio_nvme_generate_guard(io_u, &ext_opts);
> + }
Ehh a strcmp() in the hot path?! First of all, that's a big no-no.
Secondly, if this really was required, you'd add something to put that
strcmp() in the slow path and flag it. Lastly, thankfully this should be
much better as:
if (td->io_ops == &ioengine_uring ...)
instead.
> @@ -1347,12 +1464,20 @@ static int fio_ioring_init(struct thread_data *td)
> /* io_u index */
> ld->io_u_index = calloc(td->o.iodepth, sizeof(struct io_u *));
>
> + if (!strcmp(td->io_ops->name, "io_uring") && o->md_per_io_size) {
> + if (o->apptag_mask != 0xffff) {
> + log_err("fio: io_uring with metadata requires an apptag_mask of 0xffff\n");
> + return 1;
> + }
> + }
Same, though this is at least slow path. But people copy-paste wonky
code like that all the time...
> * metadata buffer for nvme command.
> * We are only supporting iomem=malloc / mem=malloc as of now.
> */
> - if (!strcmp(td->io_ops->name, "io_uring_cmd") &&
> - (o->cmd_type == FIO_URING_CMD_NVME) && o->md_per_io_size) {
> + if ((!strcmp(td->io_ops->name, "io_uring_cmd") &&
> + (o->cmd_type == FIO_URING_CMD_NVME) && o->md_per_io_size) ||
> + (!strcmp(td->io_ops->name, "io_uring") && o->md_per_io_size)) {
> md_size = (unsigned long long) o->md_per_io_size
> * (unsigned long long) td->o.iodepth;
> md_size += page_mask + td->o.mem_align;
And I'm starting to see where this is coming from. If you are so
inclined, please do a cleanup patch first getting rid of these strcmp()
calls.
> static int fio_ioring_open_file(struct thread_data *td, struct fio_file *f)
> {
> struct ioring_data *ld = td->io_ops_data;
> struct ioring_options *o = td->eo;
>
> + if (o->md_per_io_size) {
> + struct nvme_data *data = NULL;
> + __u64 nlba = 0;
> + int ret;
> +
> + data = FILE_ENG_DATA(f);
> + if (data == NULL) {
> + data = calloc(1, sizeof(struct nvme_data));
> + ret = fio_get_pi_info(f, &nlba, o->pi_act, data);
> + if (ret) {
> + free(data);
> + return ret;
> + }
> +
> + FILE_SET_ENG_DATA(f, data);
> + }
> + }
Out of line please...
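A hypothetical sketch of pulling that setup out of line into its own helper (the helper name and stub types are illustrative; the real fio_get_pi_info() queries the device):

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for fio's types; the real ones are in engines/nvme.h and
 * fio.h, where FILE_ENG_DATA()/FILE_SET_ENG_DATA() wrap f->eng_data. */
struct nvme_data {
	int pi_type;
};

struct fio_file {
	void *eng_data;
};

/* Stub standing in for fio_get_pi_info(), which queries the device
 * for its protection-information configuration. */
static int fio_get_pi_info_stub(struct nvme_data *data)
{
	data->pi_type = 1;
	return 0;
}

/* The open-file PI setup moved out of line, so the open_file hook only
 * needs a single call when md_per_io_size is set. */
static int fio_ioring_init_pi_data(struct fio_file *f)
{
	struct nvme_data *data = f->eng_data;
	int ret;

	if (data)
		return 0;	/* already initialized for this file */

	data = calloc(1, sizeof(*data));
	if (!data)
		return 1;

	ret = fio_get_pi_info_stub(data);
	if (ret) {
		free(data);
		return ret;
	}

	f->eng_data = data;
	return 0;
}
```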
> diff --git a/engines/nvme.c b/engines/nvme.c
> index 37a31e2f..4b3d3860 100644
> --- a/engines/nvme.c
> +++ b/engines/nvme.c
> @@ -8,22 +8,6 @@
> #include "../crc/crc-t10dif.h"
> #include "../crc/crc64.h"
>
> -static inline __u64 get_slba(struct nvme_data *data, __u64 offset)
> -{
> - if (data->lba_ext)
> - return offset / data->lba_ext;
> -
> - return offset >> data->lba_shift;
> -}
> -
> -static inline __u32 get_nlb(struct nvme_data *data, __u64 len)
> -{
> - if (data->lba_ext)
> - return len / data->lba_ext - 1;
> -
> - return (len >> data->lba_shift) - 1;
> -}
> -
Various bits like this, please do a prep patch first moving things
around.
--
Jens Axboe
* Re: [PATCH 1/2] engines/io_uring: support r/w with metadata
2025-07-23 17:23 ` Jens Axboe
@ 2025-07-23 17:37 ` Jens Axboe
2025-07-23 18:28 ` Vincent Fu
0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2025-07-23 17:37 UTC (permalink / raw)
To: Vincent Fu, anuj1072538, hch, joshi.k, anuj20.g, fio; +Cc: Vincent Fu
>> + if (!strcmp(td->io_ops->name, "io_uring") && o->md_per_io_size) {
>> + struct nvme_data *data = FILE_ENG_DATA(io_u->file);
>> + struct nvme_cmd_ext_io_opts ext_opts = {0};
>> +
>> + if (data->pi_type) {
>> + if (o->pi_act)
>> + ext_opts.io_flags |= NVME_IO_PRINFO_PRACT;
>> +
>> + ext_opts.io_flags |= o->prchk;
>> + ext_opts.apptag = o->apptag;
>> + ext_opts.apptag_mask = o->apptag_mask;
>> + }
>> + fio_nvme_generate_guard(io_u, &ext_opts);
>> + }
>
> Ehh a strcmp() in the hot path?! First of all, that's a big no-no.
> Secondly, if this really was required, you'd add something to put that
> strcmp() in the slow path and flag it. Lastly, thankfully this should be
> much better as:
>
> if (td->io_ops == &ioengine_uring ...)
>
> instead.
Eh, I guess dynamically loaded engines would need special treatment.
I'll take a look. In any case, strcmp() is just too ugly to live; there
are actually quite a few of them in there, and the io_uring engine is
the only one that thinks this is necessary.
--
Jens Axboe
* Re: [PATCH 1/2] engines/io_uring: support r/w with metadata
2025-07-23 17:37 ` Jens Axboe
@ 2025-07-23 18:28 ` Vincent Fu
2025-07-23 19:23 ` Jens Axboe
0 siblings, 1 reply; 7+ messages in thread
From: Vincent Fu @ 2025-07-23 18:28 UTC (permalink / raw)
To: Jens Axboe, anuj1072538, hch, joshi.k, anuj20.g, fio; +Cc: Vincent Fu
On 7/23/25 1:37 PM, Jens Axboe wrote:
>>> + if (!strcmp(td->io_ops->name, "io_uring") && o->md_per_io_size) {
>>> + struct nvme_data *data = FILE_ENG_DATA(io_u->file);
>>> + struct nvme_cmd_ext_io_opts ext_opts = {0};
>>> +
>>> + if (data->pi_type) {
>>> + if (o->pi_act)
>>> + ext_opts.io_flags |= NVME_IO_PRINFO_PRACT;
>>> +
>>> + ext_opts.io_flags |= o->prchk;
>>> + ext_opts.apptag = o->apptag;
>>> + ext_opts.apptag_mask = o->apptag_mask;
>>> + }
>>> + fio_nvme_generate_guard(io_u, &ext_opts);
>>> + }
>>
>> Ehh a strcmp() in the hot path?! First of all, that's a big no-no.
>> Secondly, if this really was required, you'd add something to put that
>> strcmp() in the slow path and flag it. Lastly, thankfully this should be
>> much better as:
>>
>> if (td->io_ops == &ioengine_uring ...)
>>
>> instead.
>
> Eh, I guess dynamically loaded engines would need special treatment.
> I'll take a look. In any case, strcmp() is just too ugly to live;
> there are actually quite a few of them in there, and the io_uring
> engine is the only one that thinks this is necessary.
>
Thanks for the feedback. How about doing a strcmp at ioengine init time
and setting a flag that we can check in the hot path?
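A minimal sketch of that init-time approach (struct and field names here are illustrative, not fio's actual ones): do the strcmp() once at init, cache the answer, and test only a flag on the per-I/O path.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-thread engine data; fio's real struct ioring_data
 * in engines/io_uring.c has different contents. */
struct ioring_data {
	bool is_uring_engine;	/* resolved once at init time */
};

static void ioring_init_flags(struct ioring_data *ld, const char *engine_name)
{
	/* strcmp() is fine here: init runs once per thread, not per I/O. */
	ld->is_uring_engine = !strcmp(engine_name, "io_uring");
}

static bool needs_pi_setup(const struct ioring_data *ld,
			   unsigned int md_per_io_size)
{
	/* Hot path: a flag test plus an integer compare, no string walk. */
	return ld->is_uring_engine && md_per_io_size != 0;
}
```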
Vincent
* Re: [PATCH 1/2] engines/io_uring: support r/w with metadata
2025-07-23 18:28 ` Vincent Fu
@ 2025-07-23 19:23 ` Jens Axboe
0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2025-07-23 19:23 UTC (permalink / raw)
To: Vincent Fu, anuj1072538, hch, joshi.k, anuj20.g, fio; +Cc: Vincent Fu
On 7/23/25 12:28 PM, Vincent Fu wrote:
> On 7/23/25 1:37 PM, Jens Axboe wrote:
>>>> + if (!strcmp(td->io_ops->name, "io_uring") && o->md_per_io_size) {
>>>> + struct nvme_data *data = FILE_ENG_DATA(io_u->file);
>>>> + struct nvme_cmd_ext_io_opts ext_opts = {0};
>>>> +
>>>> + if (data->pi_type) {
>>>> + if (o->pi_act)
>>>> + ext_opts.io_flags |= NVME_IO_PRINFO_PRACT;
>>>> +
>>>> + ext_opts.io_flags |= o->prchk;
>>>> + ext_opts.apptag = o->apptag;
>>>> + ext_opts.apptag_mask = o->apptag_mask;
>>>> + }
>>>> + fio_nvme_generate_guard(io_u, &ext_opts);
>>>> + }
>>>
>>> Ehh a strcmp() in the hot path?! First of all, that's a big no-no.
>>> Secondly, if this really was required, you'd add something to put that
>>> strcmp() in the slow path and flag it. Lastly, thankfully this should be
>>> much better as:
>>>
>>> if (td->io_ops == &ioengine_uring ...)
>>>
>>> instead.
>>
>> Eh, I guess dynamically loaded engines would need special treatment.
>> I'll take a look. In any case, strcmp() is just too ugly to live;
>> there are actually quite a few of them in there, and the io_uring
>> engine is the only one that thinks this is necessary.
>>
>
> Thanks for the feedback. How about doing a strcmp at ioengine init
> time and setting a flag that we can check in the hot path?
You can do that if you want; that'd be fine too. I did an io_ops->prep
compare, which is a bit annoying in that it's a double deref to get
there. But at least it's not a strcmp! If you check the repo now, some
cleanups have been done too; the last couple of feature additions to
the io_uring engine have not really been up to snuff imho.
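A sketch of that function-pointer comparison (stand-in types; the prep handler here is just a stub standing in for the engine's real one): every instance of the same engine shares the same prep function, so comparing it identifies the engine even for dynamically loaded copies whose ops struct lives at a different address.

```c
#include <assert.h>
#include <stdbool.h>

struct io_u;
struct thread_data;

typedef int (*prep_fn)(struct thread_data *, struct io_u *);

/* Stand-in for fio's ioengine_ops; only the field we compare on. */
struct ioengine_ops {
	prep_fn prep;
};

/* Stub standing in for the io_uring engine's prep handler. */
static int fio_ioring_prep(struct thread_data *td, struct io_u *io_u)
{
	(void)td;
	(void)io_u;
	return 0;
}

struct thread_data {
	struct ioengine_ops *io_ops;
};

static bool is_uring_engine(const struct thread_data *td)
{
	/* Double deref (td->io_ops->prep), but still far cheaper than
	 * a strcmp() on every submission. */
	return td->io_ops->prep == fio_ioring_prep;
}
```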
--
Jens Axboe