* [PATCH RFC 0/8] nvme: Add Controller Data Queue to the nvme driver
@ 2025-07-14  9:15 Joel Granados
  2025-07-14  9:15 ` [PATCH RFC 1/8] nvme: Add CDQ command definitions for contiguous PRPs Joel Granados
                   ` (8 more replies)
  0 siblings, 9 replies; 12+ messages in thread
From: Joel Granados @ 2025-07-14  9:15 UTC (permalink / raw)
  To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg
  Cc: Klaus Jensen, linux-nvme, linux-kernel, Joel Granados

This series introduces support for Controller Data Queues (CDQs) in the
NVMe driver. CDQs allow an NVMe controller to post information to the
host through a single completion queue. This series adds data structures,
helpers, and the user interface required to create, read, and delete CDQs.

Motivation
==========
The main motivation is to enable Controller Data Queues as described in
the 2.2 revision of the NVMe base specification. This series places the
kernel as an intermediary between the NVMe controller producing CDQ
entries and the user space process consuming them. It is general enough
to cover different use cases that require controller-initiated
communication delivered outside the regular I/O traffic streams (LBA
tracking, for example).

What is done
============
* Added nvme_admin_cdq opcode and NVME_FEAT_CDQ feature flag
* Defined a new struct nvme_cdq command for create/delete operations
* Added a cdq_nvme_queue struct that holds the CDQ state
* Added an xarray to each nvme_ctrl that holds references to all of the
  controller's CDQs
* Added a new ioctl (NVME_IOCTL_ADMIN_CDQ) and argument struct
  (nvme_cdq_cmd) for CDQ creation
* Added helpers for consuming CDQs: nvme_cdq_{next,send_feature,traverse}
* Added helpers for CDQ admin: nvme_cdq_{free,alloc,create,delete}

In summary, this series implements creation, consumption, and cleanup of
Controller Data Queues, providing a file-descriptor based interface for
user space to read CDQ entries.

CDQ life cycle
==============
To create a CDQ, user space defines the number of entries, the entry
size, the location of the phase tag (8.1.6.2 NVMe base spec), the MOS
field (5.1.4 NVMe base spec) and, if necessary, the CQS field (5.1.4.1.1
NVMe base spec). All of these are passed through the
NVME_IOCTL_ADMIN_CDQ ioctl, which allocates the CDQ memory, connects the
controller to it, and returns the CDQ ID (defined by the controller) and
a CDQ file descriptor (CDQ FD).
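
As an illustration of the parameters listed above, here is a minimal
sketch of the sanity checks a caller could run before issuing the ioctl.
The struct example_cdq_args, its field names and widths, and the helper
example_cdq_args_valid() are hypothetical; they are not the nvme_cdq_cmd
layout from this series, which is defined in patch 7.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical argument block mirroring the parameters named in the
 * cover letter. Field names and widths are assumptions for illustration.
 */
struct example_cdq_args {
	uint32_t entry_nr;    /* number of entries in the CDQ */
	uint32_t entry_nbyte; /* size of one entry in bytes */
	uint32_t phase_byte;  /* byte offset of the phase tag in an entry */
	uint8_t  phase_bit;   /* bit index (0..7) of the phase tag */
	uint16_t mos;         /* MOS field, passed through unchanged */
	uint16_t cqs;         /* CQS field, only used by some queue types */
};

/*
 * Check that the geometry is usable and compute the buffer size that
 * would have to be allocated. Returns false on any inconsistency.
 */
static bool example_cdq_args_valid(const struct example_cdq_args *a,
				   size_t *buf_nbyte)
{
	assert(a && buf_nbyte);

	if (a->entry_nr == 0 || a->entry_nbyte == 0)
		return false;
	/* The phase tag must live inside the entry. */
	if (a->phase_byte >= a->entry_nbyte || a->phase_bit > 7)
		return false;
	/* Guard the multiplication against overflow. */
	if (a->entry_nr > SIZE_MAX / a->entry_nbyte)
		return false;

	*buf_nbyte = (size_t)a->entry_nr * a->entry_nbyte;
	return true;
}
```

For example, 64 entries of 32 bytes with the phase tag in the last byte
pass the check and yield a 2048-byte buffer, while a phase tag offset of
32 is rejected because it falls outside the entry.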

The CDQ FD is used to consume entries through the read system call. For
every "read", all available (new) entries are copied from the
internal kernel CDQ buffer to the user space buffer.
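
To make "available (new) entries" concrete, here is a minimal sketch of
a phase-tag based drain of a circular queue. It assumes the CDQ phase
tag behaves like the completion queue phase tag (the expected value
flips each time the consumer wraps around). The struct cdq_ring and both
functions are a hypothetical model, not the cdq_nvme_queue struct or the
nvme_cdq_{next,traverse} helpers from this series, and the sketch does
not tell the controller which entries have been consumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a CDQ ring buffer. */
struct cdq_ring {
	uint8_t *entries;   /* nents * entry_size bytes, written by the device */
	size_t   nents;
	size_t   entry_size;
	size_t   phase_byte; /* byte offset of the phase tag in an entry */
	uint8_t  phase_mask; /* bit that holds the phase tag */
	size_t   head;       /* next entry the consumer will inspect */
	bool     phase;      /* phase value that marks an entry as new */
};

/* An entry is new when its phase tag matches the expected phase. */
static bool cdq_entry_is_new(const struct cdq_ring *q)
{
	const uint8_t *e = q->entries + q->head * q->entry_size;

	return !!(e[q->phase_byte] & q->phase_mask) == q->phase;
}

/*
 * Copy up to max_entries new entries into dst and return how many were
 * copied. The expected phase is inverted every time head wraps to 0.
 */
static size_t cdq_ring_drain(struct cdq_ring *q, uint8_t *dst,
			     size_t max_entries)
{
	size_t n = 0;

	assert(q->nents > 0 && q->phase_byte < q->entry_size);

	while (n < max_entries && cdq_entry_is_new(q)) {
		memcpy(dst + n * q->entry_size,
		       q->entries + q->head * q->entry_size, q->entry_size);
		n++;
		if (++q->head == q->nents) {
			q->head = 0;
			q->phase = !q->phase;
		}
	}
	return n;
}
```

The queue memory starts zeroed and the expected phase starts at 1, so
nothing is reported until the producer writes an entry with the phase
bit set. After a full pass the expected phase becomes 0, which makes the
stale entries from the previous pass look old without clearing them.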

The CDQ ID, on the other hand, is meant for interactions that are
outside CDQ creation and consumption. In these cases the caller is
expected to send NVMe commands down through one of the already available
mechanisms (like the NVME_IOCTL_ADMIN_CMD ioctl).

CDQ data structures and memory are cleaned up when the release file
operation is called on the FD, which usually happens on the close system
call or when the user process is killed.
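
From user space, consumption then reduces to read() calls on the CDQ FD
followed by close(). The following is a minimal sketch of one such read,
assuming each read returns a whole number of fixed-size entries. The
names cdq_fd_consume(), cdq_entry_fn and cdq_count_entry() are
hypothetical helpers and not part of this series or of libvfn.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical per-entry callback type. */
typedef void (*cdq_entry_fn)(const uint8_t *entry, size_t entry_size,
			     void *ctx);

/* Example callback: count entries; ctx points to a size_t counter. */
static void cdq_count_entry(const uint8_t *entry, size_t entry_size,
			    void *ctx)
{
	(void)entry;
	(void)entry_size;
	(*(size_t *)ctx)++;
}

/*
 * Issue one read() on fd and hand every complete entry to fn.
 * Returns the number of entries seen, or -1 on a read error.
 */
static ssize_t cdq_fd_consume(int fd, size_t entry_size, uint8_t *buf,
			      size_t buf_len, cdq_entry_fn fn, void *ctx)
{
	ssize_t got;
	size_t n, i;

	assert(entry_size > 0 && buf_len >= entry_size);

	/* Only ask for a whole number of entries. */
	buf_len -= buf_len % entry_size;

	do {
		got = read(fd, buf, buf_len);
	} while (got < 0 && errno == EINTR);
	if (got < 0)
		return -1;

	n = (size_t)got / entry_size;
	for (i = 0; i < n; i++)
		fn(buf + i * entry_size, entry_size, ctx);
	return (ssize_t)n;
}
```

A caller would invoke cdq_fd_consume() with the FD returned by the
ioctl and the entry size used at creation, and call close() on the FD
when done, which is the point where the kernel side is released.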

Testing
=======
The User Data Migration Queue (5.1.4.1.1 NVMe base spec) implemented in
the QEMU NVMe device [1] is used for testing purposes. CDQ creation,
consumption and deletion are shown by calling a CDQ example in libvfn [2]
(a low level NVMe/PCIe library) from within QEMU. For brevity, I have
*not* included any of the testing commands, but I can provide them if
needed.

Questions
=========

Here are some questions that were on my mind.

1. I have used an ioctl for the CDQ creation. Any better alternatives?
2. The deletion is handled by closing the file descriptor. Should this
   be handled by the ioctl?

Any feedback, questions or comments are greatly appreciated.

Best

[1] https://github.com/SamsungDS/qemu/tree/nvme.tp4159
[2] https://github.com/Joelgranados/libvfn/blob/jag/cdq/examples/cdq.c

Signed-off-by: Joel Granados <joel.granados@kernel.org>
---
Joel Granados (8):
      nvme: Add CDQ command definitions for contiguous PRPs
      nvme: Add cdq data structure to nvme_ctrl
      nvme: Add file descriptor to read CDQs
      nvme: Add function to create a CDQ
      nvme: Add function to delete CDQ
      nvme: Add a release ops to cdq file ops
      nvme: Add Controller Data Queue (CDQ) ioctl command
      nvme: Connect CDQ ioctl to nvme driver

 drivers/nvme/host/core.c        | 253 ++++++++++++++++++++++++++++++++++++++++
 drivers/nvme/host/ioctl.c       |  47 +++++++-
 drivers/nvme/host/nvme.h        |  20 ++++
 include/linux/nvme.h            |  30 +++++
 include/uapi/linux/nvme_ioctl.h |  12 ++
 5 files changed, 361 insertions(+), 1 deletion(-)
---
base-commit: 0ff41df1cb268fc69e703a08a57ee14ae967d0ca
change-id: 20250624-jag-cdq-691ed7e68c1c

Best regards,
-- 
Joel Granados <joel.granados@kernel.org>
