* [PATCH 1/4] nvme: simplify nvme_req_qid()
2019-07-27 18:41 [PATCH 0/4] nvme: update way to get qid in trace Minwoo Im
@ 2019-07-27 18:41 ` Minwoo Im
2019-07-27 18:41 ` [PATCH 2/4] nvme: check admin queue with ctrl->admin_q, not rq_disk Minwoo Im
` (2 subsequent siblings)
3 siblings, 0 replies; 9+ messages in thread
From: Minwoo Im @ 2019-07-27 18:41 UTC (permalink / raw)
blk_mq_unique_tag() just makes a unique tag which holds not only the
tag, but also the queue_num of the hctx. With this unique tag,
blk_mq_unique_tag_to_hwq is nothing but removing the tag from it.
Therefore the qid or the I/O queues can simply be:
(struct blk_mq_hw_ctx *)->queue_num + 1
Cc: Keith Busch <kbusch at kernel.org>
Cc: Jens Axboe <axboe at fb.com>
Cc: Christoph Hellwig <hch at lst.de>
Cc: Sagi Grimberg <sagi at grimberg.me>
Signed-off-by: Minwoo Im <minwoo.im.dev at gmail.com>
---
drivers/nvme/host/nvme.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 26b563f9985b..2e76198f5833 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -126,7 +126,7 @@ static inline u16 nvme_req_qid(struct request *req)
{
if (!req->rq_disk)
return 0;
- return blk_mq_unique_tag_to_hwq(blk_mq_unique_tag(req)) + 1;
+ return req->mq_hctx->queue_num + 1;
}
/* The below value is the specific amount of delay needed before checking
--
2.17.1
^ permalink raw reply related [flat|nested] 9+ messages in thread* [PATCH 2/4] nvme: check admin queue with ctrl->admin_q, not rq_disk
2019-07-27 18:41 [PATCH 0/4] nvme: update way to get qid in trace Minwoo Im
2019-07-27 18:41 ` [PATCH 1/4] nvme: simplify nvme_req_qid() Minwoo Im
@ 2019-07-27 18:41 ` Minwoo Im
2019-07-27 18:41 ` [PATCH 3/4] nvme: lightnvm: trace opcode name Minwoo Im
2019-07-27 18:41 ` [PATCH 4/4] nvme: lightnvm: trace opcode name of I/O commands for 2.0 Minwoo Im
3 siblings, 0 replies; 9+ messages in thread
From: Minwoo Im @ 2019-07-27 18:41 UTC (permalink / raw)
It used to figure out whether the given request is about the admin queue
or not by the req->rq_disk only. But, in case of lightnvm, the request
may not have rq_disk mapped so that we just can't return the proper qid
for the I/O NVM commands like Vector Chunk Read/Write.
This patch replaces the condition from rq_disk to check
nvme_ctrl->admin_q and the given request_queue from the request itself.
This patch also moved nvme_req_qid() right next to struct nvme_ctrl to
have it inside.
Cc: Keith Busch <kbusch at kernel.org>
Cc: Jens Axboe <axboe at fb.com>
Cc: Christoph Hellwig <hch at lst.de>
Cc: Sagi Grimberg <sagi at grimberg.me>
Cc: Matias Bj?rling <mb at lightnvm.io>
Cc: Javier Gonz?lez <javier at javigon.com>
Signed-off-by: Minwoo Im <minwoo.im.dev at gmail.com>
---
drivers/nvme/host/nvme.h | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 2e76198f5833..60c1abf8dce1 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -122,13 +122,6 @@ static inline struct nvme_request *nvme_req(struct request *req)
return blk_mq_rq_to_pdu(req);
}
-static inline u16 nvme_req_qid(struct request *req)
-{
- if (!req->rq_disk)
- return 0;
- return req->mq_hctx->queue_num + 1;
-}
-
/* The below value is the specific amount of delay needed before checking
* readiness in case of the PCI_DEVICE(0x1c58, 0x0003), which needs the
* NVME_QUIRK_DELAY_BEFORE_CHK_RDY quirk enabled. The value (in ms) was
@@ -260,6 +253,15 @@ struct nvme_ctrl {
struct nvme_fault_inject fault_inject;
};
+static inline u16 nvme_req_qid(struct request *req)
+{
+ struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
+
+ if (req->q == ctrl->admin_q)
+ return 0;
+ return req->mq_hctx->queue_num + 1;
+}
+
enum nvme_iopolicy {
NVME_IOPOLICY_NUMA,
NVME_IOPOLICY_RR,
--
2.17.1
^ permalink raw reply related [flat|nested] 9+ messages in thread* [PATCH 3/4] nvme: lightnvm: trace opcode name
2019-07-27 18:41 [PATCH 0/4] nvme: update way to get qid in trace Minwoo Im
2019-07-27 18:41 ` [PATCH 1/4] nvme: simplify nvme_req_qid() Minwoo Im
2019-07-27 18:41 ` [PATCH 2/4] nvme: check admin queue with ctrl->admin_q, not rq_disk Minwoo Im
@ 2019-07-27 18:41 ` Minwoo Im
2019-07-27 19:36 ` Matias Bjørling
2019-07-27 18:41 ` [PATCH 4/4] nvme: lightnvm: trace opcode name of I/O commands for 2.0 Minwoo Im
3 siblings, 1 reply; 9+ messages in thread
From: Minwoo Im @ 2019-07-27 18:41 UTC (permalink / raw)
This patch moved opcode enum values to nvme.h to make it support for
command trace with opcode name istead of raw opcode value.
Example of Vector Chunk Read:
... cmd=(0xe2 cdw10=00 00 00 00 00 00 00 00 00 ...
It would be like:
... cmd=(nvme_nvm_admin_identity cdw10=00 00 00 00 00 00 00 00 00 ...
Although OCSSD 1.2 has been deprecated, we have the codes for 1.2 so
that this kind of traces might be deprecated later time.
Cc: Keith Busch <kbusch at kernel.org>
Cc: Jens Axboe <axboe at fb.com>
Cc: Christoph Hellwig <hch at lst.de>
Cc: Sagi Grimberg <sagi at grimberg.me>
Cc: Matias Bj?rling <mb at lightnvm.io>
Cc: Javier Gonz?lez <javier at javigon.com>
Signed-off-by: Minwoo Im <minwoo.im.dev at gmail.com>
---
drivers/nvme/host/lightnvm.c | 6 ------
include/linux/nvme.h | 10 +++++++++-
2 files changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/nvme/host/lightnvm.c b/drivers/nvme/host/lightnvm.c
index ba009d4c9dfa..d06d0919c139 100644
--- a/drivers/nvme/host/lightnvm.c
+++ b/drivers/nvme/host/lightnvm.c
@@ -15,12 +15,6 @@
#include <linux/sched/sysctl.h>
#include <uapi/linux/lightnvm.h>
-enum nvme_nvm_admin_opcode {
- nvme_nvm_admin_identity = 0xe2,
- nvme_nvm_admin_get_bb_tbl = 0xf2,
- nvme_nvm_admin_set_bb_tbl = 0xf1,
-};
-
enum nvme_nvm_log_page {
NVME_NVM_LOG_REPORT_CHUNK = 0xca,
};
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 01aa6a6c241d..fddf4c776788 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -814,6 +814,11 @@ enum nvme_admin_opcode {
nvme_admin_security_send = 0x81,
nvme_admin_security_recv = 0x82,
nvme_admin_sanitize_nvm = 0x84,
+
+ /* OCSSD 1.2 */
+ nvme_nvm_admin_identity = 0xe2,
+ nvme_nvm_admin_get_bb_tbl = 0xf2,
+ nvme_nvm_admin_set_bb_tbl = 0xf1,
};
#define nvme_admin_opcode_name(opcode) { opcode, #opcode }
@@ -840,7 +845,10 @@ enum nvme_admin_opcode {
nvme_admin_opcode_name(nvme_admin_format_nvm), \
nvme_admin_opcode_name(nvme_admin_security_send), \
nvme_admin_opcode_name(nvme_admin_security_recv), \
- nvme_admin_opcode_name(nvme_admin_sanitize_nvm))
+ nvme_admin_opcode_name(nvme_admin_sanitize_nvm), \
+ nvme_admin_opcode_name(nvme_nvm_admin_identity), \
+ nvme_admin_opcode_name(nvme_nvm_admin_get_bb_tbl), \
+ nvme_admin_opcode_name(nvme_nvm_admin_set_bb_tbl))
enum {
NVME_QUEUE_PHYS_CONTIG = (1 << 0),
--
2.17.1
^ permalink raw reply related [flat|nested] 9+ messages in thread* [PATCH 3/4] nvme: lightnvm: trace opcode name
2019-07-27 18:41 ` [PATCH 3/4] nvme: lightnvm: trace opcode name Minwoo Im
@ 2019-07-27 19:36 ` Matias Bjørling
2019-07-28 3:12 ` Minwoo Im
0 siblings, 1 reply; 9+ messages in thread
From: Matias Bjørling @ 2019-07-27 19:36 UTC (permalink / raw)
On 7/27/19 8:41 PM, Minwoo Im wrote:
> This patch moved opcode enum values to nvme.h to make it support for
> command trace with opcode name istead of raw opcode value.
>
> Example of Vector Chunk Read:
> ... cmd=(0xe2 cdw10=00 00 00 00 00 00 00 00 00 ...
>
> It would be like:
> ... cmd=(nvme_nvm_admin_identity cdw10=00 00 00 00 00 00 00 00 00 ...
>
> Although OCSSD 1.2 has been deprecated, we have the codes for 1.2 so
> that this kind of traces might be deprecated later time.
>
> Cc: Keith Busch <kbusch at kernel.org>
> Cc: Jens Axboe <axboe at fb.com>
> Cc: Christoph Hellwig <hch at lst.de>
> Cc: Sagi Grimberg <sagi at grimberg.me>
> Cc: Matias Bj?rling <mb at lightnvm.io>
> Cc: Javier Gonz?lez <javier at javigon.com>
> Signed-off-by: Minwoo Im <minwoo.im.dev at gmail.com>
> ---
> drivers/nvme/host/lightnvm.c | 6 ------
> include/linux/nvme.h | 10 +++++++++-
> 2 files changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/nvme/host/lightnvm.c b/drivers/nvme/host/lightnvm.c
> index ba009d4c9dfa..d06d0919c139 100644
> --- a/drivers/nvme/host/lightnvm.c
> +++ b/drivers/nvme/host/lightnvm.c
> @@ -15,12 +15,6 @@
> #include <linux/sched/sysctl.h>
> #include <uapi/linux/lightnvm.h>
>
> -enum nvme_nvm_admin_opcode {
> - nvme_nvm_admin_identity = 0xe2,
> - nvme_nvm_admin_get_bb_tbl = 0xf2,
> - nvme_nvm_admin_set_bb_tbl = 0xf1,
> -};
> -
> enum nvme_nvm_log_page {
> NVME_NVM_LOG_REPORT_CHUNK = 0xca,
> };
> diff --git a/include/linux/nvme.h b/include/linux/nvme.h
> index 01aa6a6c241d..fddf4c776788 100644
> --- a/include/linux/nvme.h
> +++ b/include/linux/nvme.h
> @@ -814,6 +814,11 @@ enum nvme_admin_opcode {
> nvme_admin_security_send = 0x81,
> nvme_admin_security_recv = 0x82,
> nvme_admin_sanitize_nvm = 0x84,
> +
> + /* OCSSD 1.2 */
> + nvme_nvm_admin_identity = 0xe2,
> + nvme_nvm_admin_get_bb_tbl = 0xf2,
> + nvme_nvm_admin_set_bb_tbl = 0xf1,
> };
>
> #define nvme_admin_opcode_name(opcode) { opcode, #opcode }
> @@ -840,7 +845,10 @@ enum nvme_admin_opcode {
> nvme_admin_opcode_name(nvme_admin_format_nvm), \
> nvme_admin_opcode_name(nvme_admin_security_send), \
> nvme_admin_opcode_name(nvme_admin_security_recv), \
> - nvme_admin_opcode_name(nvme_admin_sanitize_nvm))
> + nvme_admin_opcode_name(nvme_admin_sanitize_nvm), \
> + nvme_admin_opcode_name(nvme_nvm_admin_identity), \
> + nvme_admin_opcode_name(nvme_nvm_admin_get_bb_tbl), \
> + nvme_admin_opcode_name(nvme_nvm_admin_set_bb_tbl))
>
> enum {
> NVME_QUEUE_PHYS_CONTIG = (1 << 0),
>
We should not pollute the standardilized NVMe code with vendor-specific
opcodes. The above code assumes generally that an opcode is associated
as an admin opcode - this is true for an OCSSD 1.2 drive, but not so for
any other vendor's drive.
Besides that, I don't think we should add anymore code for 1.2 since it
is deprecated.
^ permalink raw reply [flat|nested] 9+ messages in thread* [PATCH 3/4] nvme: lightnvm: trace opcode name
2019-07-27 19:36 ` Matias Bjørling
@ 2019-07-28 3:12 ` Minwoo Im
0 siblings, 0 replies; 9+ messages in thread
From: Minwoo Im @ 2019-07-28 3:12 UTC (permalink / raw)
Hi Matias,
> We should not pollute the standardilized NVMe code with vendor-specific
> opcodes. The above code assumes generally that an opcode is associated as an
> admin opcode - this is true for an OCSSD 1.2 drive, but not so for any other
> vendor's drive.
Agree with that. I was thinking that these are nothing but an opcodes
which could be not a pollusion, but you're right other vendors are not
suitable for these opcodes.
> Besides that, I don't think we should add anymore code for 1.2 since it is
> deprecated.
Okay, I think I need to think about how to trace the nvme commands for
the vendor-specific commands with minimal changes.
Thanks!
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 4/4] nvme: lightnvm: trace opcode name of I/O commands for 2.0
2019-07-27 18:41 [PATCH 0/4] nvme: update way to get qid in trace Minwoo Im
` (2 preceding siblings ...)
2019-07-27 18:41 ` [PATCH 3/4] nvme: lightnvm: trace opcode name Minwoo Im
@ 2019-07-27 18:41 ` Minwoo Im
2019-07-27 19:47 ` Matias Bjørling
3 siblings, 1 reply; 9+ messages in thread
From: Minwoo Im @ 2019-07-27 18:41 UTC (permalink / raw)
This patch adds opcode values which are defined in OCSSD 2.0 spec. This
will make it printed a name of opcode rather than the raw value.
Cc: Keith Busch <kbusch at kernel.org>
Cc: Jens Axboe <axboe at fb.com>
Cc: Christoph Hellwig <hch at lst.de>
Cc: Sagi Grimberg <sagi at grimberg.me>
Cc: Matias Bj?rling <mb at lightnvm.io>
Cc: Javier Gonz?lez <javier at javigon.com>
Signed-off-by: Minwoo Im <minwoo.im.dev at gmail.com>
---
include/linux/nvme.h | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index fddf4c776788..d1db459dbfc1 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -566,6 +566,12 @@ enum nvme_opcode {
nvme_cmd_resv_report = 0x0e,
nvme_cmd_resv_acquire = 0x11,
nvme_cmd_resv_release = 0x15,
+
+ /* OCSSD 2.0 */
+ nvme_nvm_cmd_vec_chunk_reset = 0x90,
+ nvme_nvm_cmd_vec_chunk_write = 0x91,
+ nvme_nvm_cmd_vec_chunk_read = 0x92,
+ nvme_nvm_cmd_vec_chunk_copy = 0x93,
};
#define nvme_opcode_name(opcode) { opcode, #opcode }
@@ -581,7 +587,11 @@ enum nvme_opcode {
nvme_opcode_name(nvme_cmd_resv_register), \
nvme_opcode_name(nvme_cmd_resv_report), \
nvme_opcode_name(nvme_cmd_resv_acquire), \
- nvme_opcode_name(nvme_cmd_resv_release))
+ nvme_opcode_name(nvme_cmd_resv_release), \
+ nvme_opcode_name(nvme_nvm_cmd_vec_chunk_reset), \
+ nvme_opcode_name(nvme_nvm_cmd_vec_chunk_write), \
+ nvme_opcode_name(nvme_nvm_cmd_vec_chunk_read), \
+ nvme_opcode_name(nvme_nvm_cmd_vec_chunk_copy))
/*
--
2.17.1
^ permalink raw reply related [flat|nested] 9+ messages in thread* [PATCH 4/4] nvme: lightnvm: trace opcode name of I/O commands for 2.0
2019-07-27 18:41 ` [PATCH 4/4] nvme: lightnvm: trace opcode name of I/O commands for 2.0 Minwoo Im
@ 2019-07-27 19:47 ` Matias Bjørling
2019-07-28 3:15 ` Minwoo Im
0 siblings, 1 reply; 9+ messages in thread
From: Matias Bjørling @ 2019-07-27 19:47 UTC (permalink / raw)
On 7/27/19 8:41 PM, Minwoo Im wrote:
> This patch adds opcode values which are defined in OCSSD 2.0 spec. This
> will make it printed a name of opcode rather than the raw value.
>
> Cc: Keith Busch <kbusch at kernel.org>
> Cc: Jens Axboe <axboe at fb.com>
> Cc: Christoph Hellwig <hch at lst.de>
> Cc: Sagi Grimberg <sagi at grimberg.me>
> Cc: Matias Bj?rling <mb at lightnvm.io>
> Cc: Javier Gonz?lez <javier at javigon.com>
> Signed-off-by: Minwoo Im <minwoo.im.dev at gmail.com>
> ---
> include/linux/nvme.h | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/nvme.h b/include/linux/nvme.h
> index fddf4c776788..d1db459dbfc1 100644
> --- a/include/linux/nvme.h
> +++ b/include/linux/nvme.h
> @@ -566,6 +566,12 @@ enum nvme_opcode {
> nvme_cmd_resv_report = 0x0e,
> nvme_cmd_resv_acquire = 0x11,
> nvme_cmd_resv_release = 0x15,
> +
> + /* OCSSD 2.0 */
> + nvme_nvm_cmd_vec_chunk_reset = 0x90,
> + nvme_nvm_cmd_vec_chunk_write = 0x91,
> + nvme_nvm_cmd_vec_chunk_read = 0x92,
> + nvme_nvm_cmd_vec_chunk_copy = 0x93,
> };
>
> #define nvme_opcode_name(opcode) { opcode, #opcode }
> @@ -581,7 +587,11 @@ enum nvme_opcode {
> nvme_opcode_name(nvme_cmd_resv_register), \
> nvme_opcode_name(nvme_cmd_resv_report), \
> nvme_opcode_name(nvme_cmd_resv_acquire), \
> - nvme_opcode_name(nvme_cmd_resv_release))
> + nvme_opcode_name(nvme_cmd_resv_release), \
> + nvme_opcode_name(nvme_nvm_cmd_vec_chunk_reset), \
> + nvme_opcode_name(nvme_nvm_cmd_vec_chunk_write), \
> + nvme_opcode_name(nvme_nvm_cmd_vec_chunk_read), \
> + nvme_opcode_name(nvme_nvm_cmd_vec_chunk_copy))
>
>
> /*
>
It can not be assumed that it is an nvme opcode for any other device
than an OCSSD 2.0 device (the command opcodes are allocated from the
vendor-specific range of the NVMe specification). The trace code should
be updated to take that into account before using the lookup value.
As a side note, I appreciate the work being put into supporting OCSSD,
but for broad adoption, the OCSSD interface is superseded by the Zoned
Namespaces (ZNS) technical proposal that is under standardization in the
NVMe workgroup. We do expect a lot of adoption in this area, and have
the kernel code ready when work is ratified (only at that point is the
TBD values allocated). I hope that the energy that is now being put into
OCSSD, can be put onto ZNS, as that is where we will see broad adoption
and long-term support in the kernel.
^ permalink raw reply [flat|nested] 9+ messages in thread* [PATCH 4/4] nvme: lightnvm: trace opcode name of I/O commands for 2.0
2019-07-27 19:47 ` Matias Bjørling
@ 2019-07-28 3:15 ` Minwoo Im
0 siblings, 0 replies; 9+ messages in thread
From: Minwoo Im @ 2019-07-28 3:15 UTC (permalink / raw)
Hi Matias,
> It can not be assumed that it is an nvme opcode for any other device than an
> OCSSD 2.0 device (the command opcodes are allocated from the vendor-specific
> range of the NVMe specification). The trace code should be updated to take
> that into account before using the lookup value.
Okay!
> As a side note, I appreciate the work being put into supporting OCSSD, but
> for broad adoption, the OCSSD interface is superseded by the Zoned
> Namespaces (ZNS) technical proposal that is under standardization in the
> NVMe workgroup. We do expect a lot of adoption in this area, and have the
> kernel code ready when work is ratified (only at that point is the TBD
> values allocated). I hope that the energy that is now being put into OCSSD,
> can be put onto ZNS, as that is where we will see broad adoption and
> long-term support in the kernel.
Sure, Once it gets ratified, that would be great to put the energy on
it. Thanks for the info, Matias.
Thanks!
^ permalink raw reply [flat|nested] 9+ messages in thread