* Re: [PATCH RFC] nvme-ioctl: propagate PRP1 from ioctl to admin cmd
       [not found] <ajlDjpjK_clMrnwx@ubuntu-server>
@ 2026-06-22 14:35 ` Keith Busch
  2026-06-22 14:56   ` David Epping
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2026-06-22 14:35 UTC (permalink / raw)
  To: David Epping
  Cc: linux-nvme, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
	Leon Romanovsky, Joachim Foerster

On Mon, Jun 22, 2026 at 04:15:42PM +0200, David Epping wrote:
> @@ -306,6 +306,7 @@ static int nvme_user_cmd(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
>  	c.common.nsid = cpu_to_le32(cmd.nsid);
>  	c.common.cdw2[0] = cpu_to_le32(cmd.cdw2);
>  	c.common.cdw2[1] = cpu_to_le32(cmd.cdw3);
> +	c.common.dptr.prp1 = cpu_to_le64(cmd.addr);

This is not correct: the user space virtual address isn't the device
DMA'able address. The driver already handles mapping the user address to
kernel space, then to dma, then sets the PRP accordingly.

^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: [PATCH RFC] nvme-ioctl: propagate PRP1 from ioctl to admin cmd
  2026-06-22 14:35 ` [PATCH RFC] nvme-ioctl: propagate PRP1 from ioctl to admin cmd Keith Busch
@ 2026-06-22 14:56   ` David Epping
  2026-06-22 15:15     ` Keith Busch
  0 siblings, 1 reply; 5+ messages in thread
From: David Epping @ 2026-06-22 14:56 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-nvme, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
	Leon Romanovsky, Joachim Foerster

On Mon, Jun 22, 2026 at 08:35:42AM -0600, Keith Busch wrote:
> On Mon, Jun 22, 2026 at 04:15:42PM +0200, David Epping wrote:
> > @@ -306,6 +306,7 @@ static int nvme_user_cmd(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
> >  	c.common.nsid = cpu_to_le32(cmd.nsid);
> >  	c.common.cdw2[0] = cpu_to_le32(cmd.cdw2);
> >  	c.common.cdw2[1] = cpu_to_le32(cmd.cdw3);
> > +	c.common.dptr.prp1 = cpu_to_le64(cmd.addr);
>
> This is not correct: the user space virtual address isn't the device
> DMA'able address. The driver already handles mapping the user address to
> kernel space, then to dma, then sets the PRP accordingly.

To clarify, the ioctl struct addr field is not filled with a memory buffer
address by the userspace, but a PCIe mapped BAR address plus offset.
It is obtained by the userspace application operating the FPGA vfio device
by reading from PCI config space via VFIO_PCI_CONFIG_REGION_INDEX.
So it is the address Linux assigned to that BAR (plus offset).

^ permalink raw reply	[flat|nested] 5+ messages in thread
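[Editor's sketch] The message above describes taking a BAR address from PCI config space. As a rough illustration of that step only, the following C fragment decodes a memory BAR from the two raw 32-bit register values. The names `bar_info` and `decode_bar` are hypothetical and come from neither VFIO nor the kernel; how the two dwords are read (for example with pread() on the VFIO config region) is assumed and not shown. The sketch also ignores whether the decoded value is usable as a DMA target, which depends on IOMMU and platform configuration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for a decoded BAR; not a VFIO or kernel type. */
struct bar_info {
	uint64_t base;       /* address bits with the flag bits masked off */
	bool is_io;          /* bit 0 set: I/O space BAR */
	bool is_64bit;       /* bits 2:1 == 10b: 64-bit memory BAR */
	bool prefetchable;   /* bit 3 set: prefetchable memory */
};

/*
 * Decode a BAR from its low dword and the following dword.
 * 'hi' is only used when the low dword marks the BAR as 64-bit.
 */
struct bar_info decode_bar(uint32_t lo, uint32_t hi)
{
	struct bar_info info = {0};

	if (lo & 0x1) {
		/* I/O BAR: the low two bits are flags. */
		info.is_io = true;
		info.base = lo & ~UINT32_C(0x3);
		return info;
	}

	/* Memory BAR: the low four bits are flags. */
	info.is_64bit = ((lo >> 1) & 0x3) == 0x2;
	info.prefetchable = (lo & 0x8) != 0;
	info.base = lo & ~UINT32_C(0xF);
	if (info.is_64bit)
		info.base |= (uint64_t)hi << 32;

	return info;
}
```

Adding an offset to `base` would give the kind of "BAR address plus offset" value the message refers to.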
* Re: [PATCH RFC] nvme-ioctl: propagate PRP1 from ioctl to admin cmd
  2026-06-22 14:56   ` David Epping
@ 2026-06-22 15:15     ` Keith Busch
  2026-06-23 10:34       ` David Epping
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2026-06-22 15:15 UTC (permalink / raw)
  To: David Epping
  Cc: linux-nvme, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
	Leon Romanovsky, Joachim Foerster

On Mon, Jun 22, 2026 at 04:56:22PM +0200, David Epping wrote:
> On Mon, Jun 22, 2026 at 08:35:42AM -0600, Keith Busch wrote:
> > On Mon, Jun 22, 2026 at 04:15:42PM +0200, David Epping wrote:
> > > @@ -306,6 +306,7 @@ static int nvme_user_cmd(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
> > >  	c.common.nsid = cpu_to_le32(cmd.nsid);
> > >  	c.common.cdw2[0] = cpu_to_le32(cmd.cdw2);
> > >  	c.common.cdw2[1] = cpu_to_le32(cmd.cdw3);
> > > +	c.common.dptr.prp1 = cpu_to_le64(cmd.addr);
> >
> > This is not correct: the user space virtual address isn't the device
> > DMA'able address. The driver already handles mapping the user address to
> > kernel space, then to dma, then sets the PRP accordingly.
>
> To clarify, the ioctl struct addr field is not filled with a memory buffer
> address by the userspace, but a PCIe mapped BAR address plus offset.
> It is obtained by the userspace application operating the FPGA vfio device
> by reading from PCI config space via VFIO_PCI_CONFIG_REGION_INDEX.
> So it is the address Linux assigned to that BAR (plus offset).

The driver and block layer should already handle PCIe addresses. You're
supposed to mmap it to user space first though, and pass that address in
instead. And you'd also need to set cmd.data_len to a non-zero value so
the driver doesn't skip setting up the data pointers.

Note, creating IO queues from user space, while not explicitly prevented
today, is not supported. The driver doesn't know you've done this so the
queue isn't properly handled on a controller reset.

^ permalink raw reply	[flat|nested] 5+ messages in thread
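[Editor's sketch] The advice in the message above (pass the mmap'ed user-space address in addr and set data_len to a non-zero value) might look roughly like this from the application side. This is a minimal sketch under stated assumptions, not code from the thread: `passthru_cmd_sketch` is a local mirror of struct nvme_passthru_cmd from <linux/nvme_ioctl.h> so that the fragment builds without kernel headers, and `fill_admin_cmd` is a hypothetical helper. Real code would include the uapi header, obtain `mapped_buf` from mmap() on the device region, and submit with ioctl(fd, NVME_IOCTL_ADMIN_CMD, &cmd).

```c
#include <stdint.h>
#include <string.h>

/*
 * Local mirror of struct nvme_passthru_cmd (assumption: same field order
 * and sizes as the uapi definition). Use <linux/nvme_ioctl.h> in real code.
 */
struct passthru_cmd_sketch {
	uint8_t  opcode;
	uint8_t  flags;
	uint16_t rsvd1;
	uint32_t nsid;
	uint32_t cdw2;
	uint32_t cdw3;
	uint64_t metadata;
	uint64_t addr;          /* user-space virtual address of the buffer */
	uint32_t metadata_len;
	uint32_t data_len;      /* must be non-zero for a data transfer */
	uint32_t cdw10;
	uint32_t cdw11;
	uint32_t cdw12;
	uint32_t cdw13;
	uint32_t cdw14;
	uint32_t cdw15;
	uint32_t timeout_ms;
	uint32_t result;
};

/*
 * Hypothetical helper: prepare an admin command whose data buffer is an
 * address already mapped into this process. Returns 0 on success and -1
 * if the buffer is missing or the length is zero, since a zero data_len
 * means no data pointers get set up.
 */
int fill_admin_cmd(struct passthru_cmd_sketch *cmd, uint8_t opcode,
		   void *mapped_buf, uint32_t len)
{
	if (!cmd || !mapped_buf || len == 0)
		return -1;

	memset(cmd, 0, sizeof(*cmd));
	cmd->opcode = opcode;
	cmd->addr = (uint64_t)(uintptr_t)mapped_buf;
	cmd->data_len = len;
	return 0;
}
```

The helper rejects a zero length up front so that the case the message warns about is caught before the command is submitted.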
* Re: [PATCH RFC] nvme-ioctl: propagate PRP1 from ioctl to admin cmd
  2026-06-22 15:15     ` Keith Busch
@ 2026-06-23 10:34       ` David Epping
  2026-06-23 12:19         ` Keith Busch
  0 siblings, 1 reply; 5+ messages in thread
From: David Epping @ 2026-06-23 10:34 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-nvme, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
	Leon Romanovsky, Joachim Foerster

On Mon, Jun 22, 2026 at 09:15:40AM -0600, Keith Busch wrote:
> On Mon, Jun 22, 2026 at 04:56:22PM +0200, David Epping wrote:
> > On Mon, Jun 22, 2026 at 08:35:42AM -0600, Keith Busch wrote:
> > > On Mon, Jun 22, 2026 at 04:15:42PM +0200, David Epping wrote:
> > > > @@ -306,6 +306,7 @@ static int nvme_user_cmd(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
> > > >  	c.common.nsid = cpu_to_le32(cmd.nsid);
> > > >  	c.common.cdw2[0] = cpu_to_le32(cmd.cdw2);
> > > >  	c.common.cdw2[1] = cpu_to_le32(cmd.cdw3);
> > > > +	c.common.dptr.prp1 = cpu_to_le64(cmd.addr);
> > >
> > > This is not correct: the user space virtual address isn't the device
> > > DMA'able address. The driver already handles mapping the user address to
> > > kernel space, then to dma, then sets the PRP accordingly.
> >
> > To clarify, the ioctl struct addr field is not filled with a memory buffer
> > address by the userspace, but a PCIe mapped BAR address plus offset.
> > It is obtained by the userspace application operating the FPGA vfio device
> > by reading from PCI config space via VFIO_PCI_CONFIG_REGION_INDEX.
> > So it is the address Linux assigned to that BAR (plus offset).
>
> The driver and block layer should already handle PCIe addresses. You're
> supposed to mmap it to user space first though, and pass that address in
> instead. And you'd also need to set cmd.data_len to a non-zero value so
> the driver doesn't skip setting up the data pointers.
>
> Note, creating IO queues from user space, while not explicitly prevented
> today, is not supported. The driver doesn't know you've done this so the
> queue isn't properly handled on a controller reset.
>

Hi Keith,

I understand that creating IO queues from user space is not supported by
the current driver. That's why we created patches for that a couple of
years ago and have ported them to newer kernels since.

My question is, and maybe I should have put this in my initial email
explicitly, is there interest in having such functionality in the upstream
Linux in-kernel NVMe driver? An interface and mechanism to request and
manage IO queues that are not used by the Linux NVMe driver to perform IO,
but handed to a separate entity for this purpose.

Of course an upstream implementation would have to take many more things
into account, like the reset you mentioned, and IOMMU setup, and much more.
But that's only worth looking at if there is upstream interest in it.

Thanks for your feedback,
David

^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: [PATCH RFC] nvme-ioctl: propagate PRP1 from ioctl to admin cmd
  2026-06-23 10:34       ` David Epping
@ 2026-06-23 12:19         ` Keith Busch
  0 siblings, 0 replies; 5+ messages in thread
From: Keith Busch @ 2026-06-23 12:19 UTC (permalink / raw)
  To: David Epping
  Cc: linux-nvme, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
	Leon Romanovsky, Joachim Foerster

On Tue, Jun 23, 2026 at 12:34:29PM +0200, David Epping wrote:
> My question is, and maybe I should have put this in my initial email
> explicitely, is there interest in having such functionality in the upstream
> Linux in-Kernel NVMe driver? An interface and mechanism to request and
> manage IO queues that are not used by the Linux NVMe driver to perform IO,
> but handed to a separate entity for this purpose.

Partitioning device resources to assign to special purposes should be
under a well defined framework. Unfortunately the only thing I know of
approaching this is SIOV. :)

Not sure how other maintainers and developers feel about it, but that's
the route I would go for this. It at least provides memory access on a
queue granularity and neatly separates the control plane.

^ permalink raw reply	[flat|nested] 5+ messages in thread