From: Pavel Begunkov <asml.silence@gmail.com>
To: "Fang, Wilson" <wilson.fang@intel.com>,
"io-uring@vger.kernel.org" <io-uring@vger.kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Subject: Re: dma_buf support with io_uring
Date: Thu, 23 Jun 2022 11:35:02 +0100
Message-ID: <42611180-e6a0-e700-d0ac-b007d8307ea4@gmail.com>
In-Reply-To: <BY5PR11MB399055971B9A3902CC3A3121EFB59@BY5PR11MB3990.namprd11.prod.outlook.com>
On 6/23/22 07:17, Fang, Wilson wrote:
> Hi Jens,
>
> We are exploring a kernel-native mechanism to support peer-to-peer data transfer between an NVMe SSD and another device that supports dma_buf, connected to the same PCIe root complex.
> The NVMe SSD DMA engine requires a physical memory address, and there is no easy way to pass a non-system memory address through the VFS to the block device driver.
> One idea is to use io_uring together with the dma_buf mechanism, which the SSD's peer device supports.
Interesting, that aligns quite well with what we're doing, which is a
more generic way to do p2p, with some non-p2p optimisations along the way.
The approach we tried before is to let userspace register a dma-buf
fd inside io_uring as a registered buffer, prepare everything in advance
(e.g. the dmabuf attach), and then rw/send/etc. can use that.
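
To illustrate the "prepare in advance, then reuse" idea, here is a toy
userspace model in C. It is only a sketch and not kernel or liburing code:
every name in it (model_ring, model_register_dmabuf, model_rw_registered)
is hypothetical, and the "attach" is modelled by a flag and a counter
rather than a real dma_buf_attach() call. It shows the expensive step
running once at registration, while each request only does a table lookup
and a bounds check.

```c
/* Toy userspace model of a registered dma-buf table.
 * All names are hypothetical; nothing here is a real io_uring API. */
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_REGISTERED 8

struct model_dmabuf_reg {
    int dmabuf_fd;   /* fd handed in by userspace at registration time */
    int attached;    /* set once the (modelled) attach/map has been done */
    size_t len;      /* size of the registered region in bytes */
};

struct model_ring {
    struct model_dmabuf_reg bufs[MODEL_MAX_REGISTERED];
    unsigned attach_calls;   /* how many times the slow path ran */
};

/* Registration: do the expensive attach once.
 * Returns the slot index, or -1 on bad input or a full table. */
int model_register_dmabuf(struct model_ring *ring, int dmabuf_fd, size_t len)
{
    assert(ring != NULL);
    if (dmabuf_fd < 0 || len == 0)
        return -1;
    for (int i = 0; i < MODEL_MAX_REGISTERED; i++) {
        struct model_dmabuf_reg *reg = &ring->bufs[i];
        if (reg->attached)
            continue;
        reg->dmabuf_fd = dmabuf_fd;
        reg->len = len;
        reg->attached = 1;      /* stands in for attach + map */
        ring->attach_calls++;
        return i;
    }
    return -1;
}

/* Per-request path: look up the slot and check the range, nothing else.
 * Returns the number of bytes that would be transferred, or -1. */
long model_rw_registered(const struct model_ring *ring, int index,
                         size_t off, size_t nbytes)
{
    assert(ring != NULL);
    if (index < 0 || index >= MODEL_MAX_REGISTERED)
        return -1;
    const struct model_dmabuf_reg *reg = &ring->bufs[index];
    if (!reg->attached)
        return -1;
    if (off > reg->len || nbytes > reg->len - off)
        return -1;
    return (long)nbytes;
}
```

In this model a request names the buffer by index instead of by fd, so
nothing per-request needs to know how the mapping was set up.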
> The flow is as below:
> 1. Application passes the dma_buf fd to the kernel through liburing.
> 2. io_uring adds two new opcodes, IORING_OP_READ_DMA and IORING_OP_WRITE_DMA, to support read and write operations that DMA to/from the peer device memory.
> 3. If the dma_buf fd is valid, io_uring attaches the dma_buf and gets an sgl containing the physical memory addresses to be passed down to the block device driver.
> 4. The NVMe SSD DMA engine DMAs the data to/from those physical memory addresses.
>
> The roadblock we are facing is that the dma_buf_attach() and dma_buf_map_attachment() APIs expect the caller to provide a struct device *dev as an input parameter, pointing to the device that does the DMA (in this case the block/NVMe device that holds the source data).
> But since io_uring operates at the VFS layer, there is no straightforward way of finding the block/NVMe device object (struct device *) from the source file descriptor.
>
> Do you have any recommendations? Much appreciated!
To find the device pointer, we added an optional file operation
callback. I think that's much better than parsing it on the io_uring
side, especially since we need a guarantee that the device is the
only one that will be targeted and that it won't change (e.g. the network
stack may choose a device dynamically based on the target address).
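
A rough sketch of what an optional callback of this kind could look like,
written as a self-contained userspace model in C. The names
(model_file_ops, get_dma_device, model_resolve_dma_device) are made up for
illustration and are not taken from any posted patch. The point is only
that a file type which cannot promise a single fixed device leaves the
callback unset, and the caller then refuses up front instead of guessing.

```c
/* Userspace model of an optional "which device does the DMA?" callback.
 * All names are hypothetical; this is not the kernel's file_operations. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_device {
    const char *name;
};

struct model_file;

struct model_file_ops {
    /* Optional. Left NULL when the file cannot name one fixed DMA device. */
    struct model_device *(*get_dma_device)(struct model_file *file);
};

struct model_file {
    const struct model_file_ops *ops;
    struct model_device *dev;   /* backing device, if there is exactly one */
};

/* A block-like file always targets the same device, so it can answer. */
static struct model_device *blk_get_dma_device(struct model_file *file)
{
    return file->dev;
}

const struct model_file_ops model_blk_ops = {
    .get_dma_device = blk_get_dma_device,
};

/* A socket-like file may pick a device per destination, so it opts out. */
const struct model_file_ops model_sock_ops = {
    .get_dma_device = NULL,
};

/* Caller side: ask the file, do not try to work it out from outside.
 * Returns 0 and sets *out on success, or a negative errno value. */
int model_resolve_dma_device(struct model_file *file,
                             struct model_device **out)
{
    assert(file != NULL && file->ops != NULL && out != NULL);
    if (!file->ops->get_dma_device)
        return -EOPNOTSUPP;   /* no stable device guarantee */
    struct model_device *dev = file->ops->get_dma_device(file);
    if (!dev)
        return -ENODEV;
    *out = dev;
    return 0;
}
```

With this shape, registration would call the resolver once and keep the
result, relying on the callback's contract that the device does not
change afterwards.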
I think we have space to cooperate here :)
--
Pavel Begunkov
Thread overview: 3+ messages
[not found] <BY5PR11MB399005DAD1BB172B7A42586AEFB59@BY5PR11MB3990.namprd11.prod.outlook.com>
2022-06-23 6:17 ` dma_buf support with io_uring Fang, Wilson
2022-06-23 10:35 ` Pavel Begunkov [this message]
2022-07-13 5:41 ` Fang, Wilson