All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] NVMe 1.4 Identify Namespace Support
@ 2019-06-10 21:06 Bart Van Assche
  2019-06-10 21:06 ` [PATCH v2 1/3] nvme: Introduce NVMe 1.4 Identify Namespace fields in struct nvme_id_ns Bart Van Assche
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Bart Van Assche @ 2019-06-10 21:06 UTC (permalink / raw)


Hi Christoph,

This patch series adds support for several of the new parameters introduced
in the NVMe 1.4 Identify Namespace command. Please consider these patches
for kernel version 5.3.

Thanks,

Bart.

Changes compared to v1:
- Added a patch for the NVMe target code that exports these parameters.
- Limited the physical block size to the value of AWUPF/NAWUPF as appropriate.

Bart Van Assche (3):
  nvme: Introduce NVMe 1.4 Identify Namespace fields in struct
    nvme_id_ns
  nvmet: Export NVMe namespace attributes
  nvme: Set physical block size and optimal I/O size according to NVMe
    1.4

 drivers/nvme/host/core.c          | 24 +++++++++++++++--
 drivers/nvme/host/nvme.h          |  1 +
 drivers/nvme/target/admin-cmd.c   |  5 ++++
 drivers/nvme/target/io-cmd-bdev.c | 43 +++++++++++++++++++++++++++++++
 drivers/nvme/target/io-cmd-file.c | 34 ++++++++++++++++++++++++
 drivers/nvme/target/nvmet.h       |  2 ++
 include/linux/nvme.h              | 12 ++++++---
 7 files changed, 116 insertions(+), 5 deletions(-)

-- 
2.22.0.rc3

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/3] nvme: Introduce NVMe 1.4 Identify Namespace fields in struct nvme_id_ns
  2019-06-10 21:06 [PATCH v2 0/3] NVMe 1.4 Identify Namespace Support Bart Van Assche
@ 2019-06-10 21:06 ` Bart Van Assche
  2019-06-10 21:06 ` [PATCH v2 2/3] nvmet: Export NVMe namespace attributes Bart Van Assche
  2019-06-10 21:06 ` [PATCH v2 3/3] nvme: Set physical block size and optimal I/O size according to NVMe 1.4 Bart Van Assche
  2 siblings, 0 replies; 13+ messages in thread
From: Bart Van Assche @ 2019-06-10 21:06 UTC (permalink / raw)


Several new fields have been introduced in version 1.4 of the NVMe spec
at offsets that were defined as reserved in version 1.3d of the NVMe
spec. Update the definition of the nvme_id_ns data structure such that
it is in sync with version 1.4 of the NVMe spec. This change preserves
backwards compatibility.

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni at wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen at oracle.com>
Reviewed-by: Keith Busch <kbusch at kernel.org>
Cc: Sagi Grimberg <sagi at grimberg.me>
Cc: Hannes Reinecke <hare at suse.com>
Signed-off-by: Bart Van Assche <bvanassche at acm.org>
---
 include/linux/nvme.h | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 8028adacaff3..2b5072ec4511 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -315,7 +315,7 @@ struct nvme_id_ns {
 	__u8			nmic;
 	__u8			rescap;
 	__u8			fpi;
-	__u8			rsvd33;
+	__u8			dlfeat;
 	__le16			nawun;
 	__le16			nawupf;
 	__le16			nacwu;
@@ -324,11 +324,17 @@ struct nvme_id_ns {
 	__le16			nabspf;
 	__le16			noiob;
 	__u8			nvmcap[16];
-	__u8			rsvd64[28];
+	__le16			npwg;
+	__le16			npwa;
+	__le16			npdg;
+	__le16			npda;
+	__le16			nows;
+	__u8			rsvd74[18];
 	__le32			anagrpid;
 	__u8			rsvd96[3];
 	__u8			nsattr;
-	__u8			rsvd100[4];
+	__le16			nvmsetid;
+	__le16			endgid;
 	__u8			nguid[16];
 	__u8			eui64[8];
 	struct nvme_lbaf	lbaf[16];
-- 
2.22.0.rc3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 2/3] nvmet: Export NVMe namespace attributes
  2019-06-10 21:06 [PATCH v2 0/3] NVMe 1.4 Identify Namespace Support Bart Van Assche
  2019-06-10 21:06 ` [PATCH v2 1/3] nvme: Introduce NVMe 1.4 Identify Namespace fields in struct nvme_id_ns Bart Van Assche
@ 2019-06-10 21:06 ` Bart Van Assche
  2019-06-14  4:42   ` Chaitanya Kulkarni
  2019-06-10 21:06 ` [PATCH v2 3/3] nvme: Set physical block size and optimal I/O size according to NVMe 1.4 Bart Van Assche
  2 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2019-06-10 21:06 UTC (permalink / raw)


Make the NVMe NAWUN, NAWUPF, NACWU, NPWG, NPWA, NPDG and NOWS attributes
available to initator systems.

Cc: Keith Busch <kbusch at kernel.org>
Cc: Sagi Grimberg <sagi at grimberg.me>
Cc: Martin K. Petersen <martin.petersen at oracle.com>
Cc: Hannes Reinecke <hare at suse.com>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni at wdc.com>
Signed-off-by: Bart Van Assche <bvanassche at acm.org>
---
 drivers/nvme/target/admin-cmd.c   |  5 ++++
 drivers/nvme/target/io-cmd-bdev.c | 43 +++++++++++++++++++++++++++++++
 drivers/nvme/target/io-cmd-file.c | 34 ++++++++++++++++++++++++
 drivers/nvme/target/nvmet.h       |  2 ++
 4 files changed, 84 insertions(+)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index 9f72d515fc4b..2558ed333d29 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -442,6 +442,11 @@ static void nvmet_execute_identify_ns(struct nvmet_req *req)
 		break;
         }
 
+	if (ns->file)
+		nvmet_file_set_limits(ns->file, id);
+	else if (ns->bdev)
+		nvmet_bdev_set_limits(ns->bdev, id);
+
 	/*
 	 * We just provide a single LBA format that matches what the
 	 * underlying device reports.
diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index 3efc52f9c309..967eadeb0d10 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -8,6 +8,49 @@
 #include <linux/module.h>
 #include "nvmet.h"
 
+/* Convert a 32-bit number to a 16-bit 0's based number */
+static __le16 to0based(uint32_t a)
+{
+	return cpu_to_le16(max(1U, min(1U << 16, a)) - 1);
+}
+
+void nvmet_bdev_set_limits(struct block_device *bdev, struct nvme_id_ns *id)
+{
+	const struct queue_limits *ql = &bdev_get_queue(bdev)->limits;
+	/* Number of physical blocks per logical block. */
+	const uint32_t ppl = ql->physical_block_size / ql->logical_block_size;
+	/* Physical blocks per logical block, 0's based. */
+	const __le16 ppl0b = to0based(ppl);
+
+	/*
+	 * For NVMe 1.2 and later, bit 1 indicates that the fields NAWUN,
+	 * NAWUPF, and NACWU are defined for this namespace and should be
+	 * used by the host for this namespace instead of the AWUN, AWUPF,
+	 * and ACWU fields in the Identify Controller data structure.
+	 */
+	id->nsfeat |= 1 << 1;
+	id->nawun = ppl0b;
+	id->nawupf = ppl0b;
+	id->nacwu = ppl0b;
+
+	/*
+	 * For NVMe 1.4 and later, bit 4 indicates that the fields NPWG,
+	 * NPWA, NPDG, NPDA, and NOWS are defined for this namespace and
+	 * should be used by the host for I/O optimization.
+	 */
+	id->nsfeat |= 1 << 4;
+	/* NPWG = Namespace Preferred Write Granularity. 0's based */
+	id->npwg = ppl0b;
+	/* NPWA = Namespace Preferred Write Alignment. 0's based */
+	id->npwa = id->npwg;
+	/* NPDG = Namespace Preferred Deallocate Granularity. 0's based */
+	id->npdg = to0based(ql->discard_granularity / ql->logical_block_size);
+	/* NPDG = Namespace Preferred Deallocate Alignment */
+	id->npda = id->npdg;
+	/* NOWS = Namespace Optimal Write Size */
+	id->nows = to0based(ql->io_opt / ql->logical_block_size);
+}
+
 int nvmet_bdev_ns_enable(struct nvmet_ns *ns)
 {
 	int ret;
diff --git a/drivers/nvme/target/io-cmd-file.c b/drivers/nvme/target/io-cmd-file.c
index 05453f5d1448..34a22f0ea374 100644
--- a/drivers/nvme/target/io-cmd-file.c
+++ b/drivers/nvme/target/io-cmd-file.c
@@ -13,6 +13,40 @@
 #define NVMET_MAX_MPOOL_BVEC		16
 #define NVMET_MIN_MPOOL_OBJ		16
 
+void nvmet_file_set_limits(struct file *file, struct nvme_id_ns *id)
+{
+	/* Physical blocks per logical block, 0's based. */
+	const __le16 ppl0b = cpu_to_le16(0);
+
+	/*
+	 * For NVMe 1.2 and later, bit 1 indicates that the fields NAWUN,
+	 * NAWUPF, and NACWU are defined for this namespace and should be
+	 * used by the host for this namespace instead of the AWUN, AWUPF,
+	 * and ACWU fields in the Identify Controller data structure.
+	 */
+	id->nsfeat |= 1 << 1;
+	id->nawun = ppl0b;
+	id->nawupf = ppl0b;
+	id->nacwu = ppl0b;
+
+	/*
+	 * For NVMe 1.4 and later, bit 4 indicates that the fields NPWG,
+	 * NPWA, NPDG, NPDA, and NOWS are defined for this namespace and
+	 * should be used by the host for I/O optimization.
+	 */
+	id->nsfeat |= 1 << 4;
+	/* NPWG = Namespace Preferred Write Granularity. 0's based */
+	id->npwg = ppl0b;
+	/* NPWA = Namespace Preferred Write Alignment. 0's based */
+	id->npwa = id->npwg;
+	/* NPDG = Namespace Preferred Deallocate Granularity. 0's based */
+	id->npdg = ppl0b;
+	/* NPDG = Namespace Preferred Deallocate Alignment */
+	id->npda = id->npdg;
+	/* NOWS = Namespace Optimal Write Size */
+	id->nows = ppl0b;
+}
+
 void nvmet_file_ns_disable(struct nvmet_ns *ns)
 {
 	if (ns->file) {
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index c25d88fc9dec..ed0362544932 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -363,6 +363,8 @@ u16 nvmet_set_feat_async_event(struct nvmet_req *req, u32 mask);
 void nvmet_execute_async_event(struct nvmet_req *req);
 
 u16 nvmet_parse_connect_cmd(struct nvmet_req *req);
+void nvmet_bdev_set_limits(struct block_device *bdev, struct nvme_id_ns *id);
+void nvmet_file_set_limits(struct file *file, struct nvme_id_ns *id);
 u16 nvmet_bdev_parse_io_cmd(struct nvmet_req *req);
 u16 nvmet_file_parse_io_cmd(struct nvmet_req *req);
 u16 nvmet_parse_admin_cmd(struct nvmet_req *req);
-- 
2.22.0.rc3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 3/3] nvme: Set physical block size and optimal I/O size according to NVMe 1.4
  2019-06-10 21:06 [PATCH v2 0/3] NVMe 1.4 Identify Namespace Support Bart Van Assche
  2019-06-10 21:06 ` [PATCH v2 1/3] nvme: Introduce NVMe 1.4 Identify Namespace fields in struct nvme_id_ns Bart Van Assche
  2019-06-10 21:06 ` [PATCH v2 2/3] nvmet: Export NVMe namespace attributes Bart Van Assche
@ 2019-06-10 21:06 ` Bart Van Assche
  2019-06-13  1:53   ` Martin K. Petersen
  2 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2019-06-10 21:06 UTC (permalink / raw)


>From the NVMe 1.4 spec:

NSFEAT bit 4 if set to 1: indicates that the fields NPWG, NPWA, NPDG, NPDA,
and NOWS are defined for this namespace and should be used by the host for
I/O optimization;
[ ... ]
Namespace Preferred Write Granularity (NPWG): This field indicates the
smallest recommended write granularity in logical blocks for this namespace.
This is a 0's based value. The size indicated should be less than or equal
to Maximum Data Transfer Size (MDTS) that is specified in units of minimum
memory page size. The value of this field may change if the namespace is
reformatted. The size should be a multiple of Namespace Preferred Write
Alignment (NPWA). Refer to section 8.25 for how this field is utilized to
improve performance and endurance.
[ ... ]
Each Write, Write Uncorrectable, or Write Zeroes commands should address a
multiple of Namespace Preferred Write Granularity (NPWG) (refer to Figure
245) and Stream Write Size (SWS) (refer to Figure 515) logical blocks (as
expressed in the NLB field), and the SLBA field of the command should be
aligned to Namespace Preferred Write Alignment (NPWA) (refer to Figure 245)
for best performance.

Cc: Keith Busch <kbusch at kernel.org>
Cc: Sagi Grimberg <sagi at grimberg.me>
Cc: Hannes Reinecke <hare at suse.com>
Signed-off-by: Bart Van Assche <bvanassche at acm.org>
---
 drivers/nvme/host/core.c | 24 ++++++++++++++++++++++--
 drivers/nvme/host/nvme.h |  1 +
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 1b7c2afd84cb..00bdab0e2d57 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1608,6 +1608,7 @@ static void nvme_update_disk_info(struct gendisk *disk,
 {
 	sector_t capacity = le64_to_cpu(id->nsze) << (ns->lba_shift - 9);
 	unsigned short bs = 1 << ns->lba_shift;
+	uint32_t nawupf, phys_bs, io_opt;
 
 	if (ns->lba_shift > PAGE_SHIFT) {
 		/* unsupported block size, set capacity to 0 later */
@@ -1616,9 +1617,27 @@ static void nvme_update_disk_info(struct gendisk *disk,
 	blk_mq_freeze_queue(disk->queue);
 	blk_integrity_unregister(disk);
 
+	nawupf = (1 + ns->ctrl->subsys->awupf) * bs;
+	if (id->nsfeat & (1 << 1))
+		nawupf = (1 + id->nawupf) * bs;
+	phys_bs = bs;
+	io_opt = bs;
+	if (id->nsfeat & (1 << 4)) {
+		/* NPWG = Namespace Preferred Write Granularity */
+		phys_bs *= 1 + le16_to_cpu(id->npwg);
+		/* NOWS = Namespace Optimal Write Size */
+		io_opt *= 1 + le16_to_cpu(id->nows);
+	}
+
 	blk_queue_logical_block_size(disk->queue, bs);
-	blk_queue_physical_block_size(disk->queue, bs);
-	blk_queue_io_min(disk->queue, bs);
+	/*
+	 * Linux filesystems assume writing a single physical block is
+	 * an atomic operation. Hence limit the physical block size to the
+	 * value of the Atomic Write Unit Power Fail parameter.
+	 */
+	blk_queue_physical_block_size(disk->queue, min(phys_bs, nawupf));
+	blk_queue_io_min(disk->queue, phys_bs);
+	blk_queue_io_opt(disk->queue, io_opt);
 
 	if (ns->ms && !ns->ext &&
 	    (ns->ctrl->ops->flags & NVME_F_METADATA_SUPPORTED))
@@ -2415,6 +2434,7 @@ static int nvme_init_subsystem(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
 	memcpy(subsys->firmware_rev, id->fr, sizeof(subsys->firmware_rev));
 	subsys->vendor_id = le16_to_cpu(id->vid);
 	subsys->cmic = id->cmic;
+	subsys->awupf = le16_to_cpu(id->awupf);
 #ifdef CONFIG_NVME_MULTIPATH
 	subsys->iopolicy = NVME_IOPOLICY_NUMA;
 #endif
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 55553d293a98..d93279fd5960 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -272,6 +272,7 @@ struct nvme_subsystem {
 	char			firmware_rev[8];
 	u8			cmic;
 	u16			vendor_id;
+	u16			awupf;	/* 0's based awupf value. */
 	struct ida		ns_ida;
 #ifdef CONFIG_NVME_MULTIPATH
 	enum nvme_iopolicy	iopolicy;
-- 
2.22.0.rc3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 3/3] nvme: Set physical block size and optimal I/O size according to NVMe 1.4
  2019-06-10 21:06 ` [PATCH v2 3/3] nvme: Set physical block size and optimal I/O size according to NVMe 1.4 Bart Van Assche
@ 2019-06-13  1:53   ` Martin K. Petersen
  2019-06-13 17:57     ` Bart Van Assche
  0 siblings, 1 reply; 13+ messages in thread
From: Martin K. Petersen @ 2019-06-13  1:53 UTC (permalink / raw)



Bart,

> +	nawupf = (1 + ns->ctrl->subsys->awupf) * bs;
> +	if (id->nsfeat & (1 << 1))
> +		nawupf = (1 + id->nawupf) * bs;

This tripped me up a bit. I would have preferred an else statement and
maybe a clarifying comment to make it obvious whether the value comes
from the controller or the namespace.

Also, unlike awupf, nawupf is not a 0-based value (0 means "use awupf"
and not 1 logical block).

And finally, I think it's confusing that you use nawupf for the variable
name post modification. In terms of naming, I think you'd be better off
to do s/nawupf/phys_bs/ or atomic_bs. And then rename your existing
phys_bs variable to io_min to match the existing block layer usage.

So something like:

        /* Use reported namespace write atomicity */
	if (id->nsfeat & (1 << 1) && id->nawupf != 0)
           phys_bs = id->nawupf * bs;
        else /* Fall back to reported controller write atomicity */
           phys_bs = (1 + ns->ctrl->subsys->awupf) * bs;

[...]

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 3/3] nvme: Set physical block size and optimal I/O size according to NVMe 1.4
  2019-06-13  1:53   ` Martin K. Petersen
@ 2019-06-13 17:57     ` Bart Van Assche
  2019-06-13 23:57       ` Martin K. Petersen
  0 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2019-06-13 17:57 UTC (permalink / raw)


On 6/12/19 6:53 PM, Martin K. Petersen wrote:
> 
> Bart,
> 
>> +	nawupf = (1 + ns->ctrl->subsys->awupf) * bs;
>> +	if (id->nsfeat & (1 << 1))
>> +		nawupf = (1 + id->nawupf) * bs;
> 
> This tripped me up a bit. I would have preferred an else statement and
> maybe a clarifying comment to make it obvious whether the value comes
> from the controller or the namespace.
> 
> Also, unlike awupf, nawupf is not a 0-based value (0 means "use awupf"
> and not 1 logical block).

Hi Martin,

I agree that NAWUPF == 0 means that AWUPF should be used instead. But
are you sure that NAWUPF is not 0's-based? From version 1.4 of the NVMe
specification:

Namespace Atomic Write Unit Power Fail (NAWUPF): This field indicates
the namespace specific size of the write operation guaranteed to be
written atomically to the NVM during a power fail or error condition.
A value of 0h indicates that the size for this namespace is the same
size as that reported in the AWUPF field of the Identify Controller data
structure. All other values specify a size in terms of logical blocks
using the same encoding as the AWUPF field. Refer to section 6.4.
[ ... ]
Atomic Write Unit Power Fail (AWUPF): [ ... ] This field is specified in
logical blocks and is a 0?s based value. The AWUPF value shall be less
than or equal to the AWUN value.

Since the spec says that NAWUPF and AWUPF use the same encoding, does
that mean that NAWUPF is also 0's based?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 3/3] nvme: Set physical block size and optimal I/O size according to NVMe 1.4
  2019-06-13 17:57     ` Bart Van Assche
@ 2019-06-13 23:57       ` Martin K. Petersen
  0 siblings, 0 replies; 13+ messages in thread
From: Martin K. Petersen @ 2019-06-13 23:57 UTC (permalink / raw)



Bart,

> Namespace Atomic Write Unit Power Fail (NAWUPF): This field indicates
> the namespace specific size of the write operation guaranteed to be
> written atomically to the NVM during a power fail or error condition.
> A value of 0h indicates that the size for this namespace is the same
> size as that reported in the AWUPF field of the Identify Controller data
> structure. All other values specify a size in terms of logical blocks
> using the same encoding as the AWUPF field.

There's some ambiguity there. But I think you're right that the encoding
statement qualifies it. This means that there is no explicit way of
expressing a NAWUPF of 1 logical block but that is OK since it is
implied.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 2/3] nvmet: Export NVMe namespace attributes
  2019-06-10 21:06 ` [PATCH v2 2/3] nvmet: Export NVMe namespace attributes Bart Van Assche
@ 2019-06-14  4:42   ` Chaitanya Kulkarni
  2019-06-14 15:31     ` Bart Van Assche
  0 siblings, 1 reply; 13+ messages in thread
From: Chaitanya Kulkarni @ 2019-06-14  4:42 UTC (permalink / raw)


On 6/10/19 2:06 PM, Bart Van Assche wrote:
> Make the NVMe NAWUN, NAWUPF, NACWU, NPWG, NPWA, NPDG and NOWS attributes
> available to initator systems.
> 
> Cc: Keith Busch <kbusch at kernel.org>
> Cc: Sagi Grimberg <sagi at grimberg.me>
> Cc: Martin K. Petersen <martin.petersen at oracle.com>
> Cc: Hannes Reinecke <hare at suse.com>
> Cc: Chaitanya Kulkarni <chaitanya.kulkarni at wdc.com>
> Signed-off-by: Bart Van Assche <bvanassche at acm.org>
> ---
>   drivers/nvme/target/admin-cmd.c   |  5 ++++
>   drivers/nvme/target/io-cmd-bdev.c | 43 +++++++++++++++++++++++++++++++
>   drivers/nvme/target/io-cmd-file.c | 34 ++++++++++++++++++++++++
>   drivers/nvme/target/nvmet.h       |  2 ++
>   4 files changed, 84 insertions(+)
> 
> diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
> index 9f72d515fc4b..2558ed333d29 100644
> --- a/drivers/nvme/target/admin-cmd.c
> +++ b/drivers/nvme/target/admin-cmd.c
> @@ -442,6 +442,11 @@ static void nvmet_execute_identify_ns(struct nvmet_req *req)
>   		break;
>           }
>   
> +	if (ns->file)
> +		nvmet_file_set_limits(ns->file, id);
> +	else if (ns->bdev)
> +		nvmet_bdev_set_limits(ns->bdev, id);
> +
>   	/*
>   	 * We just provide a single LBA format that matches what the
>   	 * underlying device reports.
> diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
> index 3efc52f9c309..967eadeb0d10 100644
> --- a/drivers/nvme/target/io-cmd-bdev.c
> +++ b/drivers/nvme/target/io-cmd-bdev.c
> @@ -8,6 +8,49 @@
>   #include <linux/module.h>
>   #include "nvmet.h"
>   
> +/* Convert a 32-bit number to a 16-bit 0's based number */
Since following function is only helper (and even though it is not in a 
fast patch can we make it inline ?
> +static __le16 to0based(uint32_t a)
The code in io-cmd-bdev.c uses u16, in the above line we are using 
uint32_t. Which one we should follow ? I think either way we should be 
consistent.

> +{
> +	return cpu_to_le16(max(1U, min(1U << 16, a)) - 1);
> +}
> +
> +void nvmet_bdev_set_limits(struct block_device *bdev, struct nvme_id_ns *id)
> +{
> +	const struct queue_limits *ql = &bdev_get_queue(bdev)->limits;
> +	/* Number of physical blocks per logical block. */
> +	const uint32_t ppl = ql->physical_block_size / ql->logical_block_size;
> +	/* Physical blocks per logical block, 0's based. */
> +	const __le16 ppl0b = to0based(ppl);
> +
> +	/*
> +	 * For NVMe 1.2 and later, bit 1 indicates that the fields NAWUN,
> +	 * NAWUPF, and NACWU are defined for this namespace and should be
> +	 * used by the host for this namespace instead of the AWUN, AWUPF,
> +	 * and ACWU fields in the Identify Controller data structure.
> +	 */
> +	id->nsfeat |= 1 << 1;
> +	id->nawun = ppl0b;
> +	id->nawupf = ppl0b;
> +	id->nacwu = ppl0b;
The above 4 assignments are similar to the assignments present in the
nvmet_file_set_limits() ? If so is it make sense to have a helper to 
reduce the code duplication ?
> +
> +	/*
> +	 * For NVMe 1.4 and later, bit 4 indicates that the fields NPWG,
> +	 * NPWA, NPDG, NPDA, and NOWS are defined for this namespace and
> +	 * should be used by the host for I/O optimization.
> +	 */
> +	id->nsfeat |= 1 << 4;
> +	/* NPWG = Namespace Preferred Write Granularity. 0's based */
> +	id->npwg = ppl0b;
> +	/* NPWA = Namespace Preferred Write Alignment. 0's based */
> +	id->npwa = id->npwg;
> +	/* NPDG = Namespace Preferred Deallocate Granularity. 0's based */
> +	id->npdg = to0based(ql->discard_granularity / ql->logical_block_size);
> +	/* NPDG = Namespace Preferred Deallocate Alignment */
> +	id->npda = id->npdg;
> +	/* NOWS = Namespace Optimal Write Size */
> +	id->nows = to0based(ql->io_opt / ql->logical_block_size);
> +}
> +
>   int nvmet_bdev_ns_enable(struct nvmet_ns *ns)
>   {
>   	int ret;
> diff --git a/drivers/nvme/target/io-cmd-file.c b/drivers/nvme/target/io-cmd-file.c
> index 05453f5d1448..34a22f0ea374 100644
> --- a/drivers/nvme/target/io-cmd-file.c
> +++ b/drivers/nvme/target/io-cmd-file.c
> @@ -13,6 +13,40 @@
>   #define NVMET_MAX_MPOOL_BVEC		16
>   #define NVMET_MIN_MPOOL_OBJ		16
>   
> +void nvmet_file_set_limits(struct file *file, struct nvme_id_ns *id)
> +{
> +	/* Physical blocks per logical block, 0's based. */
> +	const __le16 ppl0b = cpu_to_le16(0);
> +
> +	/*
> +	 * For NVMe 1.2 and later, bit 1 indicates that the fields NAWUN,
> +	 * NAWUPF, and NACWU are defined for this namespace and should be
> +	 * used by the host for this namespace instead of the AWUN, AWUPF,
> +	 * and ACWU fields in the Identify Controller data structure.
> +	 */
> +	id->nsfeat |= 1 << 1;
> +	id->nawun = ppl0b;
> +	id->nawupf = ppl0b;
> +	id->nacwu = ppl0b;

The above 3 assignments are where we are assigning value = 0, which will 
be the case any way since caller of this function is using kzalloc() to 
allocate nvme_id_ns *id(), isn't it ?

static void nvmet_execute_identify_ns(struct nvmet_req *req)
{
         struct nvmet_ns *ns;
         struct nvme_id_ns *id;
         u16 status = 0;

         if (le32_to_cpu(req->cmd->identify.nsid) == NVME_NSID_ALL) {
                 status = NVME_SC_INVALID_NS | NVME_SC_DNR;
                 goto out;
         }

         id = kzalloc(sizeof(*id), GFP_KERNEL);
         if (!id) {
                 status = NVME_SC_INTERNAL;
                 goto out;
         }

In that case can we get rid of the redundant code of 0 assignment ?
> +
> +	/*
> +	 * For NVMe 1.4 and later, bit 4 indicates that the fields NPWG,
> +	 * NPWA, NPDG, NPDA, and NOWS are defined for this namespace and
> +	 * should be used by the host for I/O optimization.
> +	 */
> +	id->nsfeat |= 1 << 4;
> +	/* NPWG = Namespace Preferred Write Granularity. 0's based */
> +	id->npwg = ppl0b;
> +	/* NPWA = Namespace Preferred Write Alignment. 0's based */
> +	id->npwa = id->npwg;
> +	/* NPDG = Namespace Preferred Deallocate Granularity. 0's based */
> +	id->npdg = ppl0b;
> +	/* NPDG = Namespace Preferred Deallocate Alignment */
> +	id->npda = id->npdg;
> +	/* NOWS = Namespace Optimal Write Size */
> +	id->nows = ppl0b;

Same here ppl0b assigning 0red values to the structure which is 
allocated using kzalloc().

> +}
> +
>   void nvmet_file_ns_disable(struct nvmet_ns *ns)
>   {
>   	if (ns->file) {
> diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
> index c25d88fc9dec..ed0362544932 100644
> --- a/drivers/nvme/target/nvmet.h
> +++ b/drivers/nvme/target/nvmet.h
> @@ -363,6 +363,8 @@ u16 nvmet_set_feat_async_event(struct nvmet_req *req, u32 mask);
>   void nvmet_execute_async_event(struct nvmet_req *req);
>   
>   u16 nvmet_parse_connect_cmd(struct nvmet_req *req);
> +void nvmet_bdev_set_limits(struct block_device *bdev, struct nvme_id_ns *id);
> +void nvmet_file_set_limits(struct file *file, struct nvme_id_ns *id);
>   u16 nvmet_bdev_parse_io_cmd(struct nvmet_req *req);
>   u16 nvmet_file_parse_io_cmd(struct nvmet_req *req);
>   u16 nvmet_parse_admin_cmd(struct nvmet_req *req);
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 2/3] nvmet: Export NVMe namespace attributes
  2019-06-14  4:42   ` Chaitanya Kulkarni
@ 2019-06-14 15:31     ` Bart Van Assche
  2019-06-14 17:18       ` Chaitanya Kulkarni
  0 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2019-06-14 15:31 UTC (permalink / raw)


On 6/13/19 9:42 PM, Chaitanya Kulkarni wrote:
> On 6/10/19 2:06 PM, Bart Van Assche wrote:
>> +/* Convert a 32-bit number to a 16-bit 0's based number */
>
> Since following function is only helper (and even though it is not in a
> fast patch can we make it inline ?

Do you mean declaring that function as inline or inlining the function? 
The usual approach for small helper functions in .c files is to not 
declare these inline and to let the compiler decide whether or not to 
inline these functions. I don't want to inline that function manually 
because that would reduce readability.

>> +static __le16 to0based(uint32_t a)
>
> The code in io-cmd-bdev.c uses u16, in the above line we are using
> uint32_t. Which one we should follow ? I think either way we should be
> consistent.

Are you sure the purpose of both functions is the same? Changing the 
argument type of this function from uint32_t into uint16_t or u16 is 
wrong. That would cause any argument larger than or equal to 1 << 16 to 
be truncated.

>> +	id->nsfeat |= 1 << 1;
>> +	id->nawun = ppl0b;
>> +	id->nawupf = ppl0b;
>> +	id->nacwu = ppl0b;
>
> The above 4 assignments are similar to the assignments present in the
> nvmet_file_set_limits() ? If so is it make sense to have a helper to
> reduce the code duplication ?

I don't think that that is a good idea.

>> +void nvmet_file_set_limits(struct file *file, struct nvme_id_ns *id)
>> +{
>> +	/* Physical blocks per logical block, 0's based. */
>> +	const __le16 ppl0b = cpu_to_le16(0);
>> +
>> +	/*
>> +	 * For NVMe 1.2 and later, bit 1 indicates that the fields NAWUN,
>> +	 * NAWUPF, and NACWU are defined for this namespace and should be
>> +	 * used by the host for this namespace instead of the AWUN, AWUPF,
>> +	 * and ACWU fields in the Identify Controller data structure.
>> +	 */
>> +	id->nsfeat |= 1 << 1;
>> +	id->nawun = ppl0b;
>> +	id->nawupf = ppl0b;
>> +	id->nacwu = ppl0b;
> 
> The above 3 assignments are where we are assigning value = 0, which will
> be the case any way since caller of this function is using kzalloc() to
> allocate nvme_id_ns *id(), isn't it ?

That change would reduce readability of the code.

The reason I wrote the code this way is because I was hoping that a 
filesystem expert would have a look at this code and tell me whether or 
not the physical block size is available in e.g. the superblock.

Bart.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 2/3] nvmet: Export NVMe namespace attributes
  2019-06-14 15:31     ` Bart Van Assche
@ 2019-06-14 17:18       ` Chaitanya Kulkarni
  2019-06-14 17:43         ` Bart Van Assche
  2019-06-14 23:56         ` Bart Van Assche
  0 siblings, 2 replies; 13+ messages in thread
From: Chaitanya Kulkarni @ 2019-06-14 17:18 UTC (permalink / raw)


Hi Bart,

Thanks for the reply. Please see my inline reply, some things I'm not clear
about so asking for explanation which will help me in future to write a
better code.
My goal is to write a better code so it will be easy for maintainers and
reviewers.

On 6/14/19 8:31 AM, Bart Van Assche wrote:
> On 6/13/19 9:42 PM, Chaitanya Kulkarni wrote:
>> On 6/10/19 2:06 PM, Bart Van Assche wrote:
>>> +/* Convert a 32-bit number to a 16-bit 0's based number */
>> Since following function is only helper (and even though it is not in a
>> fast patch can we make it inline ?
> Do you mean declaring that function as inline or inlining the function? 
Not inline the function since it is makes perfect sense to have this
helpers.
> The usual approach for small helper functions in .c files is to not 
> declare these inline and to let the compiler decide whether or not to 
> inline these functions. I don't want to inline that function manually 
> because that would reduce readability.
Okay.
>
>>> +static __le16 to0based(uint32_t a)
>> The code in io-cmd-bdev.c uses u16, in the above line we are using
>> uint32_t. Which one we should follow ? I think either way we should be
>> consistent.
> Are you sure the purpose of both functions is the same? Changing the 
> argument type of this function from uint32_t into uint16_t or u16 is 
> wrong. That would cause any argument larger than or equal to 1 << 16 to 
> be truncated.

Sorry for not being clear, what I meant is uintXXX_t vs uXX. e.g.
instead of uint16_t ->u16

uint32_t ->u32. My comment refereed to the consistency of the data type
and not

changing the size of the variable.

>
>>> +	id->nsfeat |= 1 << 1;
>>> +	id->nawun = ppl0b;
>>> +	id->nawupf = ppl0b;
>>> +	id->nacwu = ppl0b;
>> The above 4 assignments are similar to the assignments present in the
>> nvmet_file_set_limits() ? If so is it make sense to have a helper to
>> reduce the code duplication ?
> I don't think that that is a good idea.

It will be great learning experience for me if you can educate me about
why this

is not a good idea, can you please explain  ?

The rational behind asking for an explanation is that I've got a comment
on my code

about not having same copy of the code in the kernel which is right.

The situation is same here where this code is operating on same
structure doing the

same thing across file io-cmd-bdev.c and io-cmd-file.c.


My understanding is that if this patch gets applied then we will have
same copy of two

assignments in the target which are exactly doing the same thing which
calls for a nice

little helper function.

>
>>> +void nvmet_file_set_limits(struct file *file, struct nvme_id_ns *id)
>>> +{
>>> +	/* Physical blocks per logical block, 0's based. */
>>> +	const __le16 ppl0b = cpu_to_le16(0);
>>> +
>>> +	/*
>>> +	 * For NVMe 1.2 and later, bit 1 indicates that the fields NAWUN,
>>> +	 * NAWUPF, and NACWU are defined for this namespace and should be
>>> +	 * used by the host for this namespace instead of the AWUN, AWUPF,
>>> +	 * and ACWU fields in the Identify Controller data structure.
>>> +	 */
>>> +	id->nsfeat |= 1 << 1;
>>> +	id->nawun = ppl0b;
>>> +	id->nawupf = ppl0b;
>>> +	id->nacwu = ppl0b;
>> The above 3 assignments are where we are assigning value = 0, which will
>> be the case any way since caller of this function is using kzalloc() to
>> allocate nvme_id_ns *id(), isn't it ?
> That change would reduce readability of the code.
>
> The reason I wrote the code this way is because I was hoping that a 
> filesystem expert would have a look at this code and tell me whether or 
> not the physical block size is available in e.g. the superblock.

Okay, I understand now thanks again.

>
> Bart.
>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 2/3] nvmet: Export NVMe namespace attributes
  2019-06-14 17:18       ` Chaitanya Kulkarni
@ 2019-06-14 17:43         ` Bart Van Assche
  2019-06-14 17:53           ` Chaitanya Kulkarni
  2019-06-14 23:56         ` Bart Van Assche
  1 sibling, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2019-06-14 17:43 UTC (permalink / raw)


On 6/14/19 10:18 AM, Chaitanya Kulkarni wrote:
>>>> +	id->nsfeat |= 1 << 1;
>>>> +	id->nawun = ppl0b;
>>>> +	id->nawupf = ppl0b;
>>>> +	id->nacwu = ppl0b;
>>> The above 4 assignments are similar to the assignments present in the
>>> nvmet_file_set_limits() ? If so is it make sense to have a helper to
>>> reduce the code duplication ?
>> I don't think that that is a good idea.
> 
> It will be great learning experience for me if you can educate me about
> why this
> 
> is not a good idea, can you please explain  ?
> 
> The rational behind asking for an explanation is that I've got a comment
> on my code
> 
> about not having same copy of the code in the kernel which is right.
> 
> The situation is same here where this code is operating on same
> structure doing the
> 
> same thing across file io-cmd-bdev.c and io-cmd-file.c.
> 
> 
> My understanding is that if this patch gets applied then we will have
> same copy of two
> 
> assignments in the target which are exactly doing the same thing which
> calls for a nice
> 
> little helper function.

Hi Chaitanya,

There is no hard rule that code duplication should be avoided at any
price in the Linux kernel. For this patch I will leave it to the NVMe
maintainers to decide whether or not they want to move these assignments
into a helper function.

Regarding req_op_str(): I had never seen it before if the same
functionality is needed in two different source files in the same
subsystem that code got copy/pasted instead of calling a single function
from all code that needs the functionality. I had asked to avoid to
duplicate req_op_str() because of the following reasons:
- If a new request operation would be added having one req_op_str()
implementation is more convenient than having multiple.
- An operation like conversion from a request operation into a string
should not happen inside f2fs. f2fs should call a block layer function
to do that conversion.

Bart.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 2/3] nvmet: Export NVMe namespace attributes
  2019-06-14 17:43         ` Bart Van Assche
@ 2019-06-14 17:53           ` Chaitanya Kulkarni
  0 siblings, 0 replies; 13+ messages in thread
From: Chaitanya Kulkarni @ 2019-06-14 17:53 UTC (permalink / raw)


On 06/14/2019 10:43 AM, Bart Van Assche wrote:
> On 6/14/19 10:18 AM, Chaitanya Kulkarni wrote:
>>>>> +	id->nsfeat |= 1 << 1;
>>>>> +	id->nawun = ppl0b;
>>>>> +	id->nawupf = ppl0b;
>>>>> +	id->nacwu = ppl0b;
>>>> The above 4 assignments are similar to the assignments present in the
>>>> nvmet_file_set_limits() ? If so is it make sense to have a helper to
>>>> reduce the code duplication ?
>>> I don't think that that is a good idea.
>>
>> It will be great learning experience for me if you can educate me about
>> why this
>>
>> is not a good idea, can you please explain  ?
>>
>> The rational behind asking for an explanation is that I've got a comment
>> on my code
>>
>> about not having same copy of the code in the kernel which is right.
>>
>> The situation is same here where this code is operating on same
>> structure doing the
>>
>> same thing across file io-cmd-bdev.c and io-cmd-file.c.
>>
>>
>> My understanding is that if this patch gets applied then we will have
>> same copy of two
>>
>> assignments in the target which are exactly doing the same thing which
>> calls for a nice
>>
>> little helper function.
>
> Hi Chaitanya,
>
> There is no hard rule that code duplication should be avoided at any
> price in the Linux kernel. For this patch I will leave it to the NVMe
> maintainers to decide whether or not they want to move these assignments
> into a helper function.
>
I'll keep in mind from next time.
> Regarding req_op_str(): I had never seen it before if the same
> functionality is needed in two different source files in the same
> subsystem that code got copy/pasted instead of calling a single function
> from all code that needs the functionality. I had asked to avoid to
> duplicate req_op_str() because of the following reasons:
> - If a new request operation would be added having one req_op_str()
> implementation is more convenient than having multiple.
> - An operation like conversion from a request operation into a string
> should not happen inside f2fs. f2fs should call a block layer function
> to do that conversion.
This makes perfect sense, thanks for the great explanation Bart.
>
> Bart.
>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 2/3] nvmet: Export NVMe namespace attributes
  2019-06-14 17:18       ` Chaitanya Kulkarni
  2019-06-14 17:43         ` Bart Van Assche
@ 2019-06-14 23:56         ` Bart Van Assche
  1 sibling, 0 replies; 13+ messages in thread
From: Bart Van Assche @ 2019-06-14 23:56 UTC (permalink / raw)


On 6/14/19 10:18 AM, Chaitanya Kulkarni wrote:
> Sorry for not being clear, what I meant is uintXXX_t vs uXX. e.g.
> instead of uint16_t ->u16, uint32_t ->u32. My comment refereed to the
> consistency of the data type and not changing the size of the variable.

Hi Chaitanya,

I think the rule is that uint32_t etc. must be used in uapi header 
files. I'm not sure whether there is a strict rule for other kernel code 
than uapi header files. But since the NVMe code uses u32 etc., I will 
make this patch consistent with the rest of the NVMe code.

Bart.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-06-14 23:56 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-06-10 21:06 [PATCH v2 0/3] NVMe 1.4 Identify Namespace Support Bart Van Assche
2019-06-10 21:06 ` [PATCH v2 1/3] nvme: Introduce NVMe 1.4 Identify Namespace fields in struct nvme_id_ns Bart Van Assche
2019-06-10 21:06 ` [PATCH v2 2/3] nvmet: Export NVMe namespace attributes Bart Van Assche
2019-06-14  4:42   ` Chaitanya Kulkarni
2019-06-14 15:31     ` Bart Van Assche
2019-06-14 17:18       ` Chaitanya Kulkarni
2019-06-14 17:43         ` Bart Van Assche
2019-06-14 17:53           ` Chaitanya Kulkarni
2019-06-14 23:56         ` Bart Van Assche
2019-06-10 21:06 ` [PATCH v2 3/3] nvme: Set physical block size and optimal I/O size according to NVMe 1.4 Bart Van Assche
2019-06-13  1:53   ` Martin K. Petersen
2019-06-13 17:57     ` Bart Van Assche
2019-06-13 23:57       ` Martin K. Petersen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.