From: David Epping <david.epping@missinglinkelectronics.com>
To: 顾泽兵 <guzebing@bytedance.com>
Cc: linux-nvme@lists.infradead.org, Keith Busch <kbusch@kernel.org>,
Jens Axboe <axboe@kernel.dk>, Christoph Hellwig <hch@lst.de>,
Sagi Grimberg <sagi@grimberg.me>,
Leon Romanovsky <leon@kernel.org>,
Joachim Foerster <joachim.foerster@missinglinkelectronics.com>
Subject: Re: [PATCH RFC] nvme-ioctl: propagate PRP1 from ioctl to admin cmd
Date: Mon, 29 Jun 2026 15:02:01 +0200
Message-ID: <akJsyeCNp9QOD3Mp@ubuntu-server>
In-Reply-To: <4b210910448ef2227190f426e97614787d15d32b.5465e366.d014.4145.8c6e.48ae0643e67e@bytedance.com>
On Mon, Jun 29, 2026 at 05:05:51PM +0800, 顾泽兵 wrote:
> > The system setup where this patch has been used is as follows:
> > - P2P PCIe capable CPU (currently also IOMMU disabled)
> > - patched Linux in-Kernel NVMe driver for local PCIe NVMe SSDs
> > - FPGA accelerator implementing NVMe IO queue memory and IO queue handling,
> > exposed via PCIe BAR
> > - vfio-pcie Kernel driver plus vfio userspace FPGA driver / application
> > - The userspace application creates new NVMe IO queues at the SSD using the
> > patched admin ioctl and points them towards the FPGA BAR. It then informs
> > the FPGA about the SSD BAR address and IO queue ID. From then on the FPGA
> > can access the SSD storage entirely without software interaction.
>
> Hi David,
>
> I would like to ask for your insight on one point about the FPGA
> queue-offload setup described in the RFC. This is not about the PRP1
> ioctl change itself; I am personally interested in FPGA/NVMe datapath
> offload and would like to better understand how your setup handled this.
>
> For the I/O queues handled by the FPGA, how does the FPGA learn that the
> SSD has posted new CQEs?
>
> Did your implementation disable interrupts for those CQs and let the
> FPGA poll the CQ phase tag, or did you use MSI/MSI-X with the
> corresponding NVMe MSI-X vector targeting an FPGA BAR event register
> instead of the host interrupt controller?
>
> I also wonder how the I/O work was submitted to the FPGA in this model.
> Does the CPU still provide the FPGA with per-I/O information such as the
> data buffer address and the NVMe namespace/LBA range, while the FPGA
> then builds and submits the NVMe commands? Or is the FPGA able to derive
> most of that by itself after the initial queue setup?
>
> Thanks,
> Guzebing
>
Hi Guzebing,
the I/O queues managed by the FPGA are implemented as FPGA internal SRAM,
and thus the FPGA sees and performs every single queue memory access.
As you assumed, interrupts are disabled for these queues, and software
would call this polling, but for the FPGA it is instantaneous knowledge
about the access.
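
For readers who think about this from the software side, the phase-tag
check that a polling consumer would perform can be sketched as below. This
is only an illustration and not our FPGA logic: struct cq_state and
cq_poll_one() are hypothetical names, a little-endian host is assumed, and
the C11 fence stands in for whatever read barrier a real platform needs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* 16-byte NVMe completion queue entry (field layout per the NVMe spec). */
struct nvme_cqe {
    uint32_t result;     /* DW0: command specific */
    uint32_t rsvd;       /* DW1: reserved */
    uint16_t sq_head;    /* DW2: current SQ head pointer */
    uint16_t sq_id;      /* DW2: SQ identifier */
    uint16_t command_id; /* DW3: command identifier */
    uint16_t status;     /* DW3: bit 0 = phase tag, bits 15:1 = status */
};

/* Hypothetical consumer-side bookkeeping for one completion queue. */
struct cq_state {
    volatile struct nvme_cqe *entries; /* queue memory, zeroed at setup */
    uint16_t depth;                    /* number of entries in the queue */
    uint16_t head;                     /* next entry to inspect */
    uint8_t phase;                     /* expected phase tag, starts at 1 */
};

/* Returns true and copies the entry if a new CQE is present at head. */
static bool cq_poll_one(struct cq_state *cq, struct nvme_cqe *out)
{
    assert(cq->depth > 0);

    volatile struct nvme_cqe *e = &cq->entries[cq->head];
    uint16_t status = e->status;

    /* An entry is new only if its phase tag matches the expected phase. */
    if ((status & 1u) != cq->phase)
        return false;

    /* Read the remaining fields only after the phase tag was seen. */
    atomic_thread_fence(memory_order_acquire);
    out->result = e->result;
    out->rsvd = e->rsvd;
    out->sq_head = e->sq_head;
    out->sq_id = e->sq_id;
    out->command_id = e->command_id;
    out->status = status;

    /* Advance head; the expected phase flips on every wrap-around. */
    if (++cq->head == cq->depth) {
        cq->head = 0;
        cq->phase ^= 1u;
    }
    return true;
}
```

A software consumer would call cq_poll_one() in a loop and write the new
head to the CQ head doorbell afterwards. With the queue memory inside the
FPGA, the write itself is the event, so no such loop is needed.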
After the initial I/O queue setup, the FPGA operates completely
autonomously as far as NVMe is concerned.
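
As an illustration of the "interrupts disabled" part of that setup, the
command dwords of a Create I/O Completion Queue admin command could be
encoded roughly as follows. The bit positions follow the NVMe base
specification; the helper names are made up for this sketch, and how PRP1
(the queue base address, here the FPGA BAR) reaches the command through the
ioctl is the subject of the RFC patch and is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NVME_ADMIN_CREATE_CQ 0x05u       /* Create I/O Completion Queue */
#define NVME_CQ_PC           (1u << 0)   /* CDW11: physically contiguous */
#define NVME_CQ_IEN          (1u << 1)   /* CDW11: interrupts enabled */

/* CDW10: bits 31:16 = queue size (0's based), bits 15:0 = queue ID. */
static uint32_t create_cq_cdw10(uint16_t qid, uint16_t entries)
{
    assert(entries >= 2); /* NVMe queues hold at least two entries */
    return ((uint32_t)(entries - 1) << 16) | qid;
}

/* CDW11: bits 31:16 = interrupt vector, bit 1 = IEN, bit 0 = PC. */
static uint32_t create_cq_cdw11(bool irq_enabled, uint16_t vector)
{
    uint32_t cdw11 = NVME_CQ_PC;

    /* With IEN cleared the vector field is unused and left at zero. */
    if (irq_enabled)
        cdw11 |= NVME_CQ_IEN | ((uint32_t)vector << 16);
    return cdw11;
}
```

For a polled queue, create_cq_cdw11(false, 0) leaves IEN cleared, so the
SSD posts CQEs to the queue memory without raising any MSI/MSI-X.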
There is additional Linux userspace software controlling the operation
and telling the FPGA which linear range of LBAs it is allowed to access,
but that is not knowledge or enforcement at the NVMe driver/protocol level.
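
Conceptually, such a linear LBA window amounts to a bounds check like the
one below. This is a minimal sketch with hypothetical names (struct
lba_window, lba_window_allows()); it does not describe how our control
software or FPGA logic is actually implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the LBA range granted to the FPGA. */
struct lba_window {
    uint64_t start; /* first permitted LBA */
    uint64_t count; /* number of permitted LBAs */
};

/* True if [slba, slba + nlb) lies entirely inside the window. */
static bool lba_window_allows(const struct lba_window *w,
                              uint64_t slba, uint32_t nlb)
{
    assert(w);

    if (nlb == 0 || slba < w->start)
        return false;

    /* Compare offsets instead of end addresses to avoid overflow. */
    uint64_t off = slba - w->start;
    return off < w->count && nlb <= w->count - off;
}
```

Note that nlb here is a plain block count, not the 0's based NLB field of
an NVMe read/write command.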
As such, simultaneous access to the same LBAs from Linux is technically
possible, but does not make sense because of caching.
We use the FPGA to record data from external sources (FPGA attached
network interfaces, high-speed ADCs, ...) to a set of NVMe SSDs in RAID
configuration. Linux never gets to see this data (or even knows this is
happening). Only after the recording may Linux open and use the RAID
block device (we use mdraid structures). This mutually exclusive access
scheduling is managed by userspace software.
Best regards, David