* [PATCH v3 0/1] nvme : Add ioctl to query nvme device attributes
[not found] <CGME20221117141717eucas1p22722572ae0898b65a872658d5a4ef14e@eucas1p2.samsung.com>
@ 2022-11-17 14:13 ` Joel Granados
2022-11-17 14:13 ` [PATCH v3 1/1] nvme : Add ioctl to query nvme attributes Joel Granados
0 siblings, 1 reply; 6+ messages in thread
From: Joel Granados @ 2022-11-17 14:13 UTC (permalink / raw)
To: hch, sagi, chaitanyak, kbusch; +Cc: linux-nvme, gost.dev, Joel Granados
What?
In this patch I add an ioctl that provides nvme controller attributes to
the user that can open the char device file. We also provide them for the
block devices for convenience.
Why?
Applications with write permissions should not need to be privileged to
write to the device. With Kanchan's latest patch
(https://lore.kernel.org/linux-nvme/20221020070205.57366-1-joshi.k@samsung.com/)
the nvme IO and identify commands in passthru now follow device
permissions; however there are still some controller attributes like
minimal data transfer size (MDTS) which need a privileged user to be
queried.
How?
Added an ioctl (NVME_IOCTL_GET_ATTR) that fetches existing and new
attributes from nvme_ctrl struct. The new attributes are dmsl, vsl and wusl
from the I/O command set specific identify controller data structure in the
nvme spec.
I have made the call extensible by embedding the struct size as the first
member and using it as a version. I call copy_struct_from_user to make sure
that struct members that are unique are zeroed out. For this I followed
in the steps of the openat2 system call [1] and the extensible ioctl for
vfio [2]. Another interesting reference for this is here
https://lwn.net/Articles/830666/
Feedback is greatly appreciated :)
[1] https://github.com/torvalds/linux/commit/fddb5d430ad9
[2] https://github.com/torvalds/linux/blob/master/include/uapi/linux/vfio.h#L56
V3:
* Removed unneeded comments in nvme_ioctl.h
* Added a comment to the nvme_sectors_to_mps function
* Moved size checks to nvme_check_size in core.h
* Changed struct initialization to match what we use in nvme driver {} vs
{0}
V2:
* Changed comment from // to /**/
* Took a call out from an if condition and assigned it to ret var.
Joel Granados (1):
nvme : Add ioctl to query nvme attributes
drivers/nvme/host/core.c | 8 ++++-
drivers/nvme/host/ioctl.c | 61 +++++++++++++++++++++++++++++++++
drivers/nvme/host/nvme.h | 11 ++++++
include/uapi/linux/nvme_ioctl.h | 23 +++++++++++++
4 files changed, 102 insertions(+), 1 deletion(-)
--
2.30.2
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v3 1/1] nvme : Add ioctl to query nvme attributes
2022-11-17 14:13 ` [PATCH v3 0/1] nvme : Add ioctl to query nvme device attributes Joel Granados
@ 2022-11-17 14:13 ` Joel Granados
2022-11-18 22:35 ` Keith Busch
0 siblings, 1 reply; 6+ messages in thread
From: Joel Granados @ 2022-11-17 14:13 UTC (permalink / raw)
To: hch, sagi, chaitanyak, kbusch; +Cc: linux-nvme, gost.dev, Joel Granados
An ioctl (NVME_IOCTL_GET_ATTR) is added to query nvme controller
attributes needed to effectively write to the char device without needing
to be a privileged user. They are also provided on block devices for
convenience. The attributes here are meant to complement what you can
already get with the allowed identify commands.
New members are added to the nvme_ctrl struct: dmsl, vsl and wusl from
the I/O Command Set Specific Identify Controller Data Structure in the nvme
spec.
We add NVME_IOCTL_GET_ATTR_{V0SZ,CURZS} in order to make the ioctl
extensible. These serve as both size and version. The caller is expected to
pass the structure size as the first member of the struct. For example:
{...
struct nvme_get_attr nvme_get_attr = {0};
nvme_get_attr.argsz = sizeof(struct nvme_get_attr);
...}
Signed-off-by: Joel Granados <j.granados@samsung.com>
---
drivers/nvme/host/core.c | 8 ++++-
drivers/nvme/host/ioctl.c | 61 +++++++++++++++++++++++++++++++++
drivers/nvme/host/nvme.h | 11 ++++++
include/uapi/linux/nvme_ioctl.h | 23 +++++++++++++
4 files changed, 102 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f94b05c585cb..51c99302786b 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3061,9 +3061,12 @@ static int nvme_init_non_mdts_limits(struct nvme_ctrl *ctrl)
if (id->dmrl)
ctrl->max_discard_segments = id->dmrl;
- ctrl->dmrsl = le32_to_cpu(id->dmrsl);
if (id->wzsl)
ctrl->max_zeroes_sectors = nvme_mps_to_sectors(ctrl, id->wzsl);
+ ctrl->dmrsl = le32_to_cpu(id->dmrsl);
+ ctrl->dmsl = le64_to_cpu(id->dmsl);
+ ctrl->vsl = id->vsl;
+ ctrl->wusl = id->wusl;
free_data:
kfree(id);
@@ -5246,6 +5249,9 @@ static inline void _nvme_check_size(void)
BUILD_BUG_ON(sizeof(struct nvme_dbbuf) != 64);
BUILD_BUG_ON(sizeof(struct nvme_directive_cmd) != 64);
BUILD_BUG_ON(sizeof(struct nvme_feat_host_behavior) != 512);
+ BUILD_BUG_ON(sizeof(struct nvme_get_attr) < NVME_IOCTL_GET_ATTR_V0SZ);
+ BUILD_BUG_ON(sizeof(struct nvme_get_attr) != NVME_IOCTL_GET_ATTR_CURSZ);
+
}
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 9273db147872..daafea38d39b 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -53,6 +53,61 @@ static void __user *nvme_to_user_ptr(uintptr_t ptrval)
return (void __user *)ptrval;
}
+/*
+ * Reverse conversion from sectors to original NVMe values expressed in units
+ * of MPSMIN. Use ilog2 to change to power of 2 values and then reverse what
+ * was done in nvme_mps_to_sectors. 9 is the kernel sectors size and 12 is part
+ * of how mpsmin is defined in the NVMe spec (2^(12 + mpsmin))
+ */
+static inline u32 nvme_sectors_to_mps(struct nvme_ctrl *ctrl, u32 unit)
+{
+ return ilog2(unit) + 9 - 12 - NVME_CAP_MPSMIN(ctrl->cap);
+}
+
+static int nvme_ctrl_export_attrs(struct nvme_ctrl *ctrl,
+ struct nvme_get_attr __user *arg)
+{
+ struct nvme_get_attr nvme_get_attr = { };
+ __u32 usize, retsize;
+ int ret;
+
+ if (copy_from_user(&nvme_get_attr, arg, 2 * sizeof(__u32)))
+ return -EFAULT;
+
+ if (nvme_get_attr.flags != 0)
+ return -EINVAL;
+
+ if (nvme_get_attr.argsz < NVME_IOCTL_GET_ATTR_V0SZ)
+ return -EINVAL;
+ usize = nvme_get_attr.argsz;
+
+ /* Enforce backwards compatibility */
+ ret = copy_struct_from_user(&nvme_get_attr, NVME_IOCTL_GET_ATTR_CURSZ,
+ arg, usize);
+ if (ret)
+ return ret;
+
+ nvme_get_attr.argsz = NVME_IOCTL_GET_ATTR_CURSZ;
+ nvme_get_attr.mpsmin = NVME_CAP_MPSMIN(ctrl->cap);
+ nvme_get_attr.vsl = ctrl->vsl;
+ nvme_get_attr.wusl = ctrl->wusl;
+ nvme_get_attr.dmrl = ctrl->max_discard_segments;
+ nvme_get_attr.dmsl = ctrl->dmsl;
+ nvme_get_attr.dmrsl = ctrl->dmrsl;
+ nvme_get_attr.oncs = ctrl->oncs;
+ if (ctrl->max_zeroes_sectors > 0)
+ nvme_get_attr.wzsl = nvme_sectors_to_mps(ctrl, ctrl->max_zeroes_sectors);
+ if (ctrl->max_hw_sectors > 0)
+ nvme_get_attr.mdts = nvme_sectors_to_mps(ctrl, ctrl->max_hw_sectors);
+
+ retsize = min(usize, NVME_IOCTL_GET_ATTR_CURSZ);
+ ret = copy_to_user((struct nvme_get_attr __user *)arg, &nvme_get_attr, retsize);
+ if (ret)
+ return -EFAULT;
+
+ return 0;
+}
+
static void *nvme_add_user_metadata(struct request *req, void __user *ubuf,
unsigned len, u32 seed)
{
@@ -613,6 +668,8 @@ static int nvme_ctrl_ioctl(struct nvme_ctrl *ctrl, unsigned int cmd,
return nvme_user_cmd(ctrl, NULL, argp, mode);
case NVME_IOCTL_ADMIN64_CMD:
return nvme_user_cmd64(ctrl, NULL, argp, false, mode);
+ case NVME_IOCTL_GET_ATTR:
+ return nvme_ctrl_export_attrs(ctrl, argp);
default:
return sed_ioctl(ctrl->opal_dev, cmd, argp);
}
@@ -659,6 +716,8 @@ static int nvme_ns_ioctl(struct nvme_ns *ns, unsigned int cmd,
return nvme_user_cmd64(ns->ctrl, ns, argp, false, mode);
case NVME_IOCTL_IO64_CMD_VEC:
return nvme_user_cmd64(ns->ctrl, ns, argp, true, mode);
+ case NVME_IOCTL_GET_ATTR:
+ return nvme_ctrl_export_attrs(ns->ctrl, argp);
default:
return -ENOTTY;
}
@@ -950,6 +1009,8 @@ long nvme_dev_ioctl(struct file *file, unsigned int cmd,
return -EACCES;
nvme_queue_scan(ctrl);
return 0;
+ case NVME_IOCTL_GET_ATTR:
+ return nvme_ctrl_export_attrs(ctrl, argp);
default:
return -ENOTTY;
}
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index f9df10653f3c..0f36259a7992 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -292,6 +292,9 @@ struct nvme_ctrl {
u16 crdt[3];
u16 oncs;
u32 dmrsl;
+ u64 dmsl;
+ u8 vsl;
+ u8 wusl;
u16 oacs;
u16 sqsize;
u32 max_namespaces;
@@ -814,6 +817,7 @@ int __nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
union nvme_result *result, void *buffer, unsigned bufflen,
int qid, int at_head,
blk_mq_req_flags_t flags);
+
int nvme_set_features(struct nvme_ctrl *dev, unsigned int fid,
unsigned int dword11, void *buffer, size_t buflen,
u32 *result);
@@ -1071,4 +1075,11 @@ static inline const unsigned char *nvme_get_admin_opcode_str(u8 opcode)
}
#endif /* CONFIG_NVME_VERBOSE_ERRORS */
+/*
+ * List of all nvme ioctl controller attribute calls. Use size of nvme_get_attr
+ * as the version which enables us to use copy_struct_from_user
+ */
+#define NVME_IOCTL_GET_ATTR_V0SZ 32u
+#define NVME_IOCTL_GET_ATTR_CURSZ NVME_IOCTL_GET_ATTR_V0SZ
+
#endif /* _NVME_H */
diff --git a/include/uapi/linux/nvme_ioctl.h b/include/uapi/linux/nvme_ioctl.h
index 2f76cba67166..57b05d26f682 100644
--- a/include/uapi/linux/nvme_ioctl.h
+++ b/include/uapi/linux/nvme_ioctl.h
@@ -92,6 +92,28 @@ struct nvme_uring_cmd {
__u32 rsvd2;
};
+/*
+ * This struct is populated from the following nvme commands:
+ * [1] Identify Controller Data Structure
+ * [2] I/O Command Set Specific Identify Controller Data Structure for the
+ * NVME Command Set. Usually contained in struct nvme_id_ctrl_nvm
+ * [3] Controller capabilities on controller initialization
+ */
+struct nvme_get_attr {
+ __u32 argsz;
+ __u32 flags;
+
+ __u32 mpsmin;
+ __u8 vsl;
+ __u8 wzsl;
+ __u8 wusl;
+ __u8 dmrl;
+ __u64 dmsl;
+ __u32 dmrsl;
+ __u16 oncs;
+ __u8 mdts;
+};
+
#define nvme_admin_cmd nvme_passthru_cmd
#define NVME_IOCTL_ID _IO('N', 0x40)
@@ -104,6 +126,7 @@ struct nvme_uring_cmd {
#define NVME_IOCTL_ADMIN64_CMD _IOWR('N', 0x47, struct nvme_passthru_cmd64)
#define NVME_IOCTL_IO64_CMD _IOWR('N', 0x48, struct nvme_passthru_cmd64)
#define NVME_IOCTL_IO64_CMD_VEC _IOWR('N', 0x49, struct nvme_passthru_cmd64)
+#define NVME_IOCTL_GET_ATTR _IOWR('N', 0x50, struct nvme_get_attr)
/* io_uring async commands: */
#define NVME_URING_CMD_IO _IOWR('N', 0x80, struct nvme_uring_cmd)
--
2.30.2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v3 1/1] nvme : Add ioctl to query nvme attributes
2022-11-17 14:13 ` [PATCH v3 1/1] nvme : Add ioctl to query nvme attributes Joel Granados
@ 2022-11-18 22:35 ` Keith Busch
2022-11-19 0:04 ` Chaitanya Kulkarni
2022-11-21 15:51 ` Joel Granados
0 siblings, 2 replies; 6+ messages in thread
From: Keith Busch @ 2022-11-18 22:35 UTC (permalink / raw)
To: Joel Granados; +Cc: hch, sagi, chaitanyak, linux-nvme, gost.dev
On Thu, Nov 17, 2022 at 03:13:37PM +0100, Joel Granados wrote:
> +#define NVME_IOCTL_GET_ATTR _IOWR('N', 0x50, struct nvme_get_attr)
Isn't the ioctl's number derived from the sizeof the parameter? IOW, if
you add new fields to the struct in the future, the ioctl number will
change, breaking older applications that were using it. I don't think we
want to keep adding new ioctl's to support legacy users, so probably
padding that struct with plenty of reserved space is the way to go.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v3 1/1] nvme : Add ioctl to query nvme attributes
2022-11-18 22:35 ` Keith Busch
@ 2022-11-19 0:04 ` Chaitanya Kulkarni
2022-11-21 15:54 ` Joel Granados
2022-11-21 15:51 ` Joel Granados
1 sibling, 1 reply; 6+ messages in thread
From: Chaitanya Kulkarni @ 2022-11-19 0:04 UTC (permalink / raw)
To: Keith Busch, Joel Granados
Cc: hch@lst.de, sagi@grimberg.me, linux-nvme@lists.infradead.org,
gost.dev@samsung.com
On 11/18/22 14:35, Keith Busch wrote:
> On Thu, Nov 17, 2022 at 03:13:37PM +0100, Joel Granados wrote:
>> +#define NVME_IOCTL_GET_ATTR _IOWR('N', 0x50, struct nvme_get_attr)
>
> Isn't the ioctl's number derived from the sizeof the parameter? IOW, if
> you add new fields to the struct in the future, the ioctl number will
> change, breaking older applications that were using it. I don't think we
> want to keep adding new ioctl's to support legacy users, so probably
> padding that struct with plenty of reserved space is the way to go.
This is exactly I've asked explicitly to make the size of the structure
close to nvme command size (or even better to nvme identify controller
data structure size) so it will be future proof and we can add new
parameters.
Instead the rsvd field seems to be removed in this patch.
-ck
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v3 1/1] nvme : Add ioctl to query nvme attributes
2022-11-18 22:35 ` Keith Busch
2022-11-19 0:04 ` Chaitanya Kulkarni
@ 2022-11-21 15:51 ` Joel Granados
1 sibling, 0 replies; 6+ messages in thread
From: Joel Granados @ 2022-11-21 15:51 UTC (permalink / raw)
To: Keith Busch; +Cc: hch, sagi, chaitanyak, linux-nvme, gost.dev
[-- Attachment #1: Type: text/plain, Size: 969 bytes --]
On Fri, Nov 18, 2022 at 03:35:03PM -0700, Keith Busch wrote:
> On Thu, Nov 17, 2022 at 03:13:37PM +0100, Joel Granados wrote:
> > +#define NVME_IOCTL_GET_ATTR _IOWR('N', 0x50, struct nvme_get_attr)
>
> Isn't the ioctl's number derived from the sizeof the parameter? IOW, if
> you add new fields to the struct in the future, the ioctl number will
> change, breaking older applications that were using it. I don't think we
> want to keep adding new ioctl's to support legacy users, so probably
> padding that struct with plenty of reserved space is the way to go.
My bad.
That is NOT supposed to be an '_IOWR' it was meant to be an '_IO'. In
this way we do not have the size of the structure changing the ioctl
number and we handle the extensibility with the first two arguments of
the ioctl. In the same way that it is handled in
https://github.com/torvalds/linux/blob/master/include/uapi/linux/vfio.h.
I'll send another version fixing this.
Thx
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 659 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v3 1/1] nvme : Add ioctl to query nvme attributes
2022-11-19 0:04 ` Chaitanya Kulkarni
@ 2022-11-21 15:54 ` Joel Granados
0 siblings, 0 replies; 6+ messages in thread
From: Joel Granados @ 2022-11-21 15:54 UTC (permalink / raw)
To: Chaitanya Kulkarni
Cc: Keith Busch, hch@lst.de, sagi@grimberg.me,
linux-nvme@lists.infradead.org, gost.dev@samsung.com
[-- Attachment #1: Type: text/plain, Size: 1266 bytes --]
On Sat, Nov 19, 2022 at 12:04:43AM +0000, Chaitanya Kulkarni wrote:
> On 11/18/22 14:35, Keith Busch wrote:
> > On Thu, Nov 17, 2022 at 03:13:37PM +0100, Joel Granados wrote:
> >> +#define NVME_IOCTL_GET_ATTR _IOWR('N', 0x50, struct nvme_get_attr)
> >
> > Isn't the ioctl's number derived from the sizeof the parameter? IOW, if
> > you add new fields to the struct in the future, the ioctl number will
> > change, breaking older applications that were using it. I don't think we
> > want to keep adding new ioctl's to support legacy users, so probably
> > padding that struct with plenty of reserved space is the way to go.
>
> This is exactly I've asked explicitly to make the size of the structure
> close to nvme command size (or even better to nvme identify controller
> data structure size) so it will be future proof and we can add new
> parameters.
>
> Instead the rsvd field seems to be removed in this patch.
By including the size of the struct in the ioctl we do not need to
pad it. We can add all the struct members that we want in the future as
long as we do not remove any that already existed. In the same way that
it is implemented in https://github.com/torvalds/linux/blob/master/include/uapi/linux/vfio.h.
>
> -ck
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 659 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2022-11-21 15:54 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <CGME20221117141717eucas1p22722572ae0898b65a872658d5a4ef14e@eucas1p2.samsung.com>
2022-11-17 14:13 ` [PATCH v3 0/1] nvme : Add ioctl to query nvme device attributes Joel Granados
2022-11-17 14:13 ` [PATCH v3 1/1] nvme : Add ioctl to query nvme attributes Joel Granados
2022-11-18 22:35 ` Keith Busch
2022-11-19 0:04 ` Chaitanya Kulkarni
2022-11-21 15:54 ` Joel Granados
2022-11-21 15:51 ` Joel Granados
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox