* [PATCH v2 0/2] io_uring_cmd cleanup and fixes
[not found] <CGME20230515054350epcas5p4643d68a6f79cae867b88be678154329b@epcas5p4.samsung.com>
@ 2023-05-15 11:03 ` Ankit Kumar
[not found] ` <CGME20230515054351epcas5p1f2922cc8507e9e6a57bb0228914ff0bc@epcas5p1.samsung.com>
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Ankit Kumar @ 2023-05-15 11:03 UTC (permalink / raw)
To: axboe; +Cc: fio, vincentfu, joshi.k, Ankit Kumar
The NVM command set specification 1.0c supports 64 LBA formats.
The first patch adds support for that.
The io_uring_cmd ioengine assumes that logical block size is always
power of 2. But with namespace formats where metadata is transferred
at the end of logical block i.e. an extended logical block this is
not true.
This second patch adds the support for extended logical block sizes.
This however doesn't include support for protection info and metadata
transferred as separate buffer, we return error for those.
Changes from v1:
* Split the original patch into 2 as suggested by Vincent and Jens
* Address all review comments from Vincent
Ankit Kumar (2):
engines/nvme: support for 64 LBA formats
engines/io_uring: io_uring_cmd engine fixes
engines/io_uring.c | 30 ++++++++++++++++++---
engines/nvme.c | 66 ++++++++++++++++++++++++++++++++++++++--------
engines/nvme.h | 6 ++---
3 files changed, 84 insertions(+), 18 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v2 1/2] engines/nvme: support for 64 LBA formats
[not found] ` <CGME20230515054351epcas5p1f2922cc8507e9e6a57bb0228914ff0bc@epcas5p1.samsung.com>
@ 2023-05-15 11:03 ` Ankit Kumar
0 siblings, 0 replies; 5+ messages in thread
From: Ankit Kumar @ 2023-05-15 11:03 UTC (permalink / raw)
To: axboe; +Cc: fio, vincentfu, joshi.k, Ankit Kumar
The NVM command set specification 1.0c supports 64 LBA formats.
The 0-based nlbaf field specifies the number of LBA formats.
The flbas field is used to calculate the current LBA format, in which
bit 0-3 indicates lsb and bit 5-6 indicates msb of the format index.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
---
engines/nvme.c | 14 +++++++++++++-
engines/nvme.h | 3 +--
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/engines/nvme.c b/engines/nvme.c
index fd2161f3..96a5f064 100644
--- a/engines/nvme.c
+++ b/engines/nvme.c
@@ -99,6 +99,7 @@ int fio_nvme_get_info(struct fio_file *f, __u32 *nsid, __u32 *lba_sz,
struct nvme_id_ns ns;
int namespace_id;
int fd, err;
+ __u32 format_idx;
if (f->filetype != FIO_TYPE_CHAR) {
log_err("ioengine io_uring_cmd only works with nvme ns "
@@ -131,7 +132,18 @@ int fio_nvme_get_info(struct fio_file *f, __u32 *nsid, __u32 *lba_sz,
}
*nsid = namespace_id;
- *lba_sz = 1 << ns.lbaf[(ns.flbas & 0x0f)].ds;
+
+ /*
+ * 16 or 64 as maximum number of supported LBA formats.
+ * From flbas bit 0-3 indicates lsb and bit 5-6 indicates msb
+ * of the format index used to format the namespace.
+ */
+ if (ns.nlbaf < 16)
+ format_idx = ns.flbas & 0xf;
+ else
+ format_idx = (ns.flbas & 0xf) + (((ns.flbas >> 5) & 0x3) << 4);
+
+ *lba_sz = 1 << ns.lbaf[format_idx].ds;
*nlba = ns.nsze;
close(fd);
diff --git a/engines/nvme.h b/engines/nvme.h
index 408594d5..9d6288c8 100644
--- a/engines/nvme.h
+++ b/engines/nvme.h
@@ -134,8 +134,7 @@ struct nvme_id_ns {
__le16 endgid;
__u8 nguid[16];
__u8 eui64[8];
- struct nvme_lbaf lbaf[16];
- __u8 rsvd192[192];
+ struct nvme_lbaf lbaf[64];
__u8 vs[3712];
};
--
2.25.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH v2 2/2] engines/io_uring: io_uring_cmd engine fixes
[not found] ` <CGME20230515054352epcas5p15ceded12f770df07d842f324f1ef7857@epcas5p1.samsung.com>
@ 2023-05-15 11:03 ` Ankit Kumar
0 siblings, 0 replies; 5+ messages in thread
From: Ankit Kumar @ 2023-05-15 11:03 UTC (permalink / raw)
To: axboe; +Cc: fio, vincentfu, joshi.k, Ankit Kumar
The io_uring_cmd ioengine assumes that logical block size is always
power of 2. But with namespace formats where metadata is transferred
at the end of logical block i.e. an extended logical block this is
not true.
This patch calculates the correct extended logical block size and
uses division operation for start and number of logical block
calculation. The existing calculation for power of 2 logical block
size will remain same.
This also add checks to verify that block sizes are multiple of LBA size.
This current implementation however doesn't support protection info
and metadata transferred as separate buffer. Return error for those.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
---
engines/io_uring.c | 30 ++++++++++++++++++++++----
engines/nvme.c | 52 +++++++++++++++++++++++++++++++++++++---------
engines/nvme.h | 3 ++-
3 files changed, 70 insertions(+), 15 deletions(-)
diff --git a/engines/io_uring.c b/engines/io_uring.c
index f5ffe9f4..90e5a856 100644
--- a/engines/io_uring.c
+++ b/engines/io_uring.c
@@ -1177,22 +1177,40 @@ static int fio_ioring_cmd_open_file(struct thread_data *td, struct fio_file *f)
if (o->cmd_type == FIO_URING_CMD_NVME) {
struct nvme_data *data = NULL;
unsigned int nsid, lba_size = 0;
+ __u32 ms = 0;
__u64 nlba = 0;
int ret;
/* Store the namespace-id and lba size. */
data = FILE_ENG_DATA(f);
if (data == NULL) {
- ret = fio_nvme_get_info(f, &nsid, &lba_size, &nlba);
+ ret = fio_nvme_get_info(f, &nsid, &lba_size, &ms, &nlba);
if (ret)
return ret;
data = calloc(1, sizeof(struct nvme_data));
data->nsid = nsid;
- data->lba_shift = ilog2(lba_size);
+ if (ms)
+ data->lba_ext = lba_size + ms;
+ else
+ data->lba_shift = ilog2(lba_size);
FILE_SET_ENG_DATA(f, data);
}
+
+ lba_size = data->lba_ext ? data->lba_ext : (1 << data->lba_shift);
+
+ for_each_rw_ddir(ddir) {
+ if (td->o.min_bs[ddir] % lba_size ||
+ td->o.max_bs[ddir] % lba_size) {
+ if (data->lba_ext)
+ log_err("block size must be a multiple of "
+ "(LBA data size + Metadata size)\n");
+ else
+ log_err("block size must be a multiple of LBA data size\n");
+ return 1;
+ }
+ }
}
if (!ld || !o->registerfiles)
return generic_open_file(td, f);
@@ -1243,16 +1261,20 @@ static int fio_ioring_cmd_get_file_size(struct thread_data *td,
if (o->cmd_type == FIO_URING_CMD_NVME) {
struct nvme_data *data = NULL;
unsigned int nsid, lba_size = 0;
+ __u32 ms = 0;
__u64 nlba = 0;
int ret;
- ret = fio_nvme_get_info(f, &nsid, &lba_size, &nlba);
+ ret = fio_nvme_get_info(f, &nsid, &lba_size, &ms, &nlba);
if (ret)
return ret;
data = calloc(1, sizeof(struct nvme_data));
data->nsid = nsid;
- data->lba_shift = ilog2(lba_size);
+ if (ms)
+ data->lba_ext = lba_size + ms;
+ else
+ data->lba_shift = ilog2(lba_size);
f->real_file_size = lba_size * nlba;
fio_file_set_size_known(f);
diff --git a/engines/nvme.c b/engines/nvme.c
index 96a5f064..1047ade2 100644
--- a/engines/nvme.c
+++ b/engines/nvme.c
@@ -21,8 +21,13 @@ int fio_nvme_uring_cmd_prep(struct nvme_uring_cmd *cmd, struct io_u *io_u,
else
return -ENOTSUP;
- slba = io_u->offset >> data->lba_shift;
- nlb = (io_u->xfer_buflen >> data->lba_shift) - 1;
+ if (data->lba_ext) {
+ slba = io_u->offset / data->lba_ext;
+ nlb = (io_u->xfer_buflen / data->lba_ext) - 1;
+ } else {
+ slba = io_u->offset >> data->lba_shift;
+ nlb = (io_u->xfer_buflen >> data->lba_shift) - 1;
+ }
/* cdw10 and cdw11 represent starting lba */
cmd->cdw10 = slba & 0xffffffff;
@@ -65,8 +70,13 @@ int fio_nvme_trim(const struct thread_data *td, struct fio_file *f,
struct nvme_dsm_range dsm;
int ret;
- dsm.nlb = (len >> data->lba_shift);
- dsm.slba = (offset >> data->lba_shift);
+ if (data->lba_ext) {
+ dsm.nlb = len / data->lba_ext;
+ dsm.slba = offset / data->lba_ext;
+ } else {
+ dsm.nlb = len >> data->lba_shift;
+ dsm.slba = offset >> data->lba_shift;
+ }
ret = nvme_trim(f->fd, data->nsid, 1, sizeof(struct nvme_dsm_range),
&dsm);
@@ -94,7 +104,7 @@ static int nvme_identify(int fd, __u32 nsid, enum nvme_identify_cns cns,
}
int fio_nvme_get_info(struct fio_file *f, __u32 *nsid, __u32 *lba_sz,
- __u64 *nlba)
+ __u32 *ms, __u64 *nlba)
{
struct nvme_id_ns ns;
int namespace_id;
@@ -114,9 +124,8 @@ int fio_nvme_get_info(struct fio_file *f, __u32 *nsid, __u32 *lba_sz,
namespace_id = ioctl(fd, NVME_IOCTL_ID);
if (namespace_id < 0) {
err = -errno;
- log_err("failed to fetch namespace-id");
- close(fd);
- return err;
+ log_err("%s: failed to fetch namespace-id\n", f->file_name);
+ goto out;
}
/*
@@ -126,7 +135,8 @@ int fio_nvme_get_info(struct fio_file *f, __u32 *nsid, __u32 *lba_sz,
err = nvme_identify(fd, namespace_id, NVME_IDENTIFY_CNS_NS,
NVME_CSI_NVM, &ns);
if (err) {
- log_err("failed to fetch identify namespace\n");
+ log_err("%s: failed to fetch identify namespace\n",
+ f->file_name);
close(fd);
return err;
}
@@ -144,10 +154,32 @@ int fio_nvme_get_info(struct fio_file *f, __u32 *nsid, __u32 *lba_sz,
format_idx = (ns.flbas & 0xf) + (((ns.flbas >> 5) & 0x3) << 4);
*lba_sz = 1 << ns.lbaf[format_idx].ds;
+
+ /*
+ * Only extended LBA can be supported.
+ * Bit 4 for flbas indicates if metadata is transferred at the end of
+ * logical block creating an extended LBA.
+ */
+ *ms = le16_to_cpu(ns.lbaf[format_idx].ms);
+ if (*ms && !((ns.flbas >> 4) & 0x1)) {
+ log_err("%s: only extended logical block can be supported\n",
+ f->file_name);
+ err = -ENOTSUP;
+ goto out;
+ }
+
+ /* Check for end to end data protection support */
+ if (ns.dps & 0x3) {
+ log_err("%s: end to end data protection not supported\n",
+ f->file_name);
+ err = -ENOTSUP;
+ goto out;
+ }
*nlba = ns.nsze;
+out:
close(fd);
- return 0;
+ return err;
}
int fio_nvme_get_zoned_model(struct thread_data *td, struct fio_file *f,
diff --git a/engines/nvme.h b/engines/nvme.h
index 9d6288c8..f7cb820d 100644
--- a/engines/nvme.h
+++ b/engines/nvme.h
@@ -88,6 +88,7 @@ enum nvme_zns_zs {
struct nvme_data {
__u32 nsid;
__u32 lba_shift;
+ __u32 lba_ext;
};
struct nvme_lbaf {
@@ -222,7 +223,7 @@ int fio_nvme_iomgmt_ruhs(struct thread_data *td, struct fio_file *f,
struct nvme_fdp_ruh_status *ruhs, __u32 bytes);
int fio_nvme_get_info(struct fio_file *f, __u32 *nsid, __u32 *lba_sz,
- __u64 *nlba);
+ __u32 *ms, __u64 *nlba);
int fio_nvme_uring_cmd_prep(struct nvme_uring_cmd *cmd, struct io_u *io_u,
struct iovec *iov);
--
2.25.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v2 0/2] io_uring_cmd cleanup and fixes
2023-05-15 11:03 ` [PATCH v2 0/2] io_uring_cmd cleanup and fixes Ankit Kumar
[not found] ` <CGME20230515054351epcas5p1f2922cc8507e9e6a57bb0228914ff0bc@epcas5p1.samsung.com>
[not found] ` <CGME20230515054352epcas5p15ceded12f770df07d842f324f1ef7857@epcas5p1.samsung.com>
@ 2023-05-15 12:49 ` Vincent Fu
2023-05-16 12:37 ` Ankit Kumar
2 siblings, 1 reply; 5+ messages in thread
From: Vincent Fu @ 2023-05-15 12:49 UTC (permalink / raw)
To: Ankit Kumar, axboe; +Cc: fio, joshi.k
On 5/15/23 07:03, Ankit Kumar wrote:
> The NVM command set specification 1.0c supports 64 LBA formats.
> The first patch adds support for that.
>
> The io_uring_cmd ioengine assumes that logical block size is always
> power of 2. But with namespace formats where metadata is transferred
> at the end of logical block i.e. an extended logical block this is
> not true.
> This second patch adds the support for extended logical block sizes.
> This however doesn't include support for protection info and metadata
> transferred as separate buffer, we return error for those.
>
> Changes from v1:
> * Split the original patch into 2 as suggested by Vincent and Jens
> * Address all review comments from Vincent
>
> Ankit Kumar (2):
> engines/nvme: support for 64 LBA formats
> engines/io_uring: io_uring_cmd engine fixes
>
> engines/io_uring.c | 30 ++++++++++++++++++---
> engines/nvme.c | 66 ++++++++++++++++++++++++++++++++++++++--------
> engines/nvme.h | 6 ++---
> 3 files changed, 84 insertions(+), 18 deletions(-)
>
Applied with a change to the title of the second patch's commit message.
Thanks.
Vincent
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2 0/2] io_uring_cmd cleanup and fixes
2023-05-15 12:49 ` [PATCH v2 0/2] io_uring_cmd cleanup and fixes Vincent Fu
@ 2023-05-16 12:37 ` Ankit Kumar
0 siblings, 0 replies; 5+ messages in thread
From: Ankit Kumar @ 2023-05-16 12:37 UTC (permalink / raw)
To: Vincent Fu, axboe; +Cc: fio, joshi.k, anuj20.g
[-- Attachment #1: Type: text/plain, Size: 1467 bytes --]
On Mon, May 15, 2023 at 08:49:54AM -0400, Vincent Fu wrote:
> On 5/15/23 07:03, Ankit Kumar wrote:
> > The NVM command set specification 1.0c supports 64 LBA formats.
> > The first patch adds support for that.
> >
> > The io_uring_cmd ioengine assumes that logical block size is always
> > power of 2. But with namespace formats where metadata is transferred
> > at the end of logical block i.e. an extended logical block this is
> > not true.
> > This second patch adds the support for extended logical block sizes.
> > This however doesn't include support for protection info and metadata
> > transferred as separate buffer, we return error for those.
> >
> > Changes from v1:
> > * Split the original patch into 2 as suggested by Vincent and Jens
> > * Address all review comments from Vincent
> >
> > Ankit Kumar (2):
> > engines/nvme: support for 64 LBA formats
> > engines/io_uring: io_uring_cmd engine fixes
> >
> > engines/io_uring.c | 30 ++++++++++++++++++---
> > engines/nvme.c | 66 ++++++++++++++++++++++++++++++++++++++--------
> > engines/nvme.h | 6 ++---
> > 3 files changed, 84 insertions(+), 18 deletions(-)
> >
>
> Applied with a change to the title of the second patch's commit message.
> Thanks.
>
> Vincent
>
Thanks,
I was wondering if extended LBA changes are required for t/io_uring,
as its part of fast path, and any branching introduced because of
lba_shift and lba_ext may have an impact on performace.
Ankit
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2023-05-16 9:26 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <CGME20230515054350epcas5p4643d68a6f79cae867b88be678154329b@epcas5p4.samsung.com>
2023-05-15 11:03 ` [PATCH v2 0/2] io_uring_cmd cleanup and fixes Ankit Kumar
[not found] ` <CGME20230515054351epcas5p1f2922cc8507e9e6a57bb0228914ff0bc@epcas5p1.samsung.com>
2023-05-15 11:03 ` [PATCH v2 1/2] engines/nvme: support for 64 LBA formats Ankit Kumar
[not found] ` <CGME20230515054352epcas5p15ceded12f770df07d842f324f1ef7857@epcas5p1.samsung.com>
2023-05-15 11:03 ` [PATCH v2 2/2] engines/io_uring: io_uring_cmd engine fixes Ankit Kumar
2023-05-15 12:49 ` [PATCH v2 0/2] io_uring_cmd cleanup and fixes Vincent Fu
2023-05-16 12:37 ` Ankit Kumar
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).