From: Christoph Hellwig <hch@lst.de>
To: Pavel Begunkov <asml.silence@gmail.com>
Cc: "Jens Axboe" <axboe@kernel.dk>, "Keith Busch" <kbusch@kernel.org>,
"Christoph Hellwig" <hch@lst.de>,
"Sagi Grimberg" <sagi@grimberg.me>,
"Alexander Viro" <viro@zeniv.linux.org.uk>,
"Christian Brauner" <brauner@kernel.org>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Sumit Semwal" <sumit.semwal@linaro.org>,
"Christian König" <christian.koenig@amd.com>,
linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org,
io-uring@vger.kernel.org, linux-media@vger.kernel.org,
dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org,
"Nitesh Shetty" <nj.shetty@samsung.com>,
"Kanchan Joshi" <joshi.k@samsung.com>,
"Anuj Gupta" <anuj20.g@samsung.com>,
"Tushar Gohad" <tushar.gohad@intel.com>,
"William Power" <william.power@intel.com>,
"Phil Cayton" <phil.cayton@intel.com>,
"Jason Gunthorpe" <jgg@nvidia.com>
Subject: Re: [PATCH v3 07/10] nvme-pci: implement dma_token backed requests
Date: Wed, 13 May 2026 10:38:17 +0200
Message-ID: <20260513083817.GC6461@lst.de>
In-Reply-To: <5cecb1157ab784f9f303a91449fdf11b03aa6002.1777475843.git.asml.silence@gmail.com>
FYI, I really want SGL support before this gets merged, but ignoring that
for now:
> +struct nvme_dmabuf_map {
> + struct io_dmabuf_map base;
> + dma_addr_t *dma_list;
> + struct sg_table *sgt;
> + unsigned nr_entries;
I'd make dma_list a variable-sized array at the end of the structure to avoid
an extra allocation and pointer dereference.
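A minimal userspace sketch of that layout, assuming stand-in definitions for
dma_addr_t and struct io_dmabuf_map (the real ones live in the kernel and in
this series). The helper names nvme_dmabuf_map_alloc() and
nvme_dmabuf_map_addr() are hypothetical and only illustrate the single
allocation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t dma_addr_t;	/* stand-in for the kernel type */

struct sg_table;		/* opaque here, only pointed to */

/* Stand-in for the generic map object the driver struct embeds. */
struct io_dmabuf_map {
	int placeholder;
};

struct nvme_dmabuf_map {
	struct io_dmabuf_map base;
	struct sg_table *sgt;
	unsigned int nr_entries;
	dma_addr_t dma_list[];	/* flexible array member, must be last */
};

/* One allocation covers the header plus nr_entries DMA addresses. */
static struct nvme_dmabuf_map *nvme_dmabuf_map_alloc(unsigned int nr_entries)
{
	struct nvme_dmabuf_map *map;

	map = calloc(1, sizeof(*map) + (size_t)nr_entries * sizeof(dma_addr_t));
	if (!map)
		return NULL;
	map->nr_entries = nr_entries;
	return map;
}

/* Bounds-checked read; the array is reached without a second pointer load. */
static dma_addr_t nvme_dmabuf_map_addr(const struct nvme_dmabuf_map *map,
				       unsigned int idx)
{
	assert(idx < map->nr_entries);
	return map->dma_list[idx];
}
```

In the driver itself the size would more likely be computed with
struct_size(map, dma_list, nr_entries) and passed to kzalloc(), which also
guards the multiplication against overflow.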
>
> +static void nvme_dmabuf_map_sync(struct nvme_dev *nvme_dev, struct request *req,
> + bool for_cpu)
> +{
> + int length = blk_rq_payload_bytes(req);
> + struct device *dev = nvme_dev->dev;
> + enum dma_data_direction dma_dir;
> + struct bio *bio = req->bio;
> + struct nvme_dmabuf_map *map;
> + dma_addr_t *dma_list;
> + int offset, map_idx;
> +
> + dma_dir = rq_data_dir(req) == READ ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> + map = container_of(bio->dmabuf_map, struct nvme_dmabuf_map, base);
> + dma_list = map->dma_list;
> +
> + offset = bio->bi_iter.bi_bvec_done;
> + map_idx = offset / NVME_CTRL_PAGE_SIZE;
> + length += offset & (NVME_CTRL_PAGE_SIZE - 1);
Please initialize the variable at declaration time and use or add proper
helpers to simplify this:
static inline struct nvme_dmabuf_map *
to_nvme_dmabuf_map(struct io_dmabuf_map *map)
{
return container_of(map, struct nvme_dmabuf_map, base);
}
....
enum dma_data_direction dma_dir = rq_dma_dir(req);
struct device *dev = nvme_dev->dev;
struct bio *bio = req->bio;
struct nvme_dmabuf_map *map = to_nvme_dmabuf_map(bio->bi_dmabuf_map);
dma_addr_t *dma_list = map->dma_list;
int offset = bio->bi_iter.bi_bvec_done;
int map_idx = offset / NVME_CTRL_PAGE_SIZE;
int length = blk_rq_payload_bytes(req) +
(offset & (NVME_CTRL_PAGE_SIZE - 1));
Also a lot of these ints sound like they should be unsigned.
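To make the arithmetic concrete, here is a small userspace sketch of the
index and length calculation using unsigned types. The struct dmabuf_span
type and dmabuf_span_calc() are hypothetical names, and a 4 KiB controller
page is assumed. Note that + binds tighter than & in C, so the in-page
remainder is computed in its own expression before being added:

```c
#include <assert.h>
#include <stddef.h>

#define NVME_CTRL_PAGE_SIZE 4096u	/* assumed 4 KiB controller page */

struct dmabuf_span {
	unsigned int map_idx;	/* first dma_list entry touched */
	unsigned int nr_pages;	/* controller pages covered by the request */
};

/*
 * offset is the byte offset into the mapping (bi_bvec_done in the patch),
 * payload_bytes is the request payload size.
 */
static struct dmabuf_span dmabuf_span_calc(unsigned int offset,
					   unsigned int payload_bytes)
{
	unsigned int in_page = offset & (NVME_CTRL_PAGE_SIZE - 1);
	unsigned int length = payload_bytes + in_page;
	struct dmabuf_span span = {
		.map_idx = offset / NVME_CTRL_PAGE_SIZE,
		.nr_pages = (length + NVME_CTRL_PAGE_SIZE - 1) /
			    NVME_CTRL_PAGE_SIZE,
	};

	/* A data-carrying request is expected to have a non-zero payload. */
	assert(payload_bytes > 0);
	return span;
}
```

With nr_pages known up front, the sync loop can count pages upward instead of
decrementing a signed length until it goes non-positive.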
> +
> + while (length > 0) {
> + u64 dma_addr = dma_list[map_idx++];
> +
> + if (for_cpu)
> + __dma_sync_single_for_cpu(dev, dma_addr,
> + NVME_CTRL_PAGE_SIZE, dma_dir);
> + else
> + __dma_sync_single_for_device(dev, dma_addr,
> + NVME_CTRL_PAGE_SIZE,
> + dma_dir);
> + length -= NVME_CTRL_PAGE_SIZE;
> + }
> +}
Nothing should be using these __dma_sync helpers, which are internal
details. Using them means you call into sync code that should be skipped
on most common server-class systems.
Also the for_cpu argument is a bit ugly. I'd rather have separate
routines as in the core dma-mapping code, even if that means a little bit
of code duplication.
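A rough userspace model of that split, presumably built on the
non-underscore dma_sync_single_for_cpu() and dma_sync_single_for_device()
wrappers. Everything below is a stand-in: the two dma_sync_* functions only
mirror the kernel signatures and count calls, struct device is a dummy, and
the nvme_dmabuf_sync_for_*() names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NVME_CTRL_PAGE_SIZE 4096u	/* assumed 4 KiB controller page */

typedef uint64_t dma_addr_t;		/* stand-in for the kernel type */
enum dma_data_direction { DMA_TO_DEVICE = 1, DMA_FROM_DEVICE = 2 };

/* Dummy device that only counts how often each sync was requested. */
struct device {
	unsigned int for_cpu_calls;
	unsigned int for_device_calls;
};

/* Stand-ins with the same argument order as the kernel's public wrappers. */
static void dma_sync_single_for_cpu(struct device *dev, dma_addr_t addr,
				    size_t size, enum dma_data_direction dir)
{
	(void)addr; (void)size; (void)dir;
	dev->for_cpu_calls++;
}

static void dma_sync_single_for_device(struct device *dev, dma_addr_t addr,
				       size_t size, enum dma_data_direction dir)
{
	(void)addr; (void)size; (void)dir;
	dev->for_device_calls++;
}

/* One routine per direction, no for_cpu flag; the loop is duplicated. */
static void nvme_dmabuf_sync_for_cpu(struct device *dev,
				     const dma_addr_t *dma_list,
				     unsigned int nr_pages,
				     enum dma_data_direction dir)
{
	assert(dma_list || nr_pages == 0);
	for (unsigned int i = 0; i < nr_pages; i++)
		dma_sync_single_for_cpu(dev, dma_list[i],
					NVME_CTRL_PAGE_SIZE, dir);
}

static void nvme_dmabuf_sync_for_device(struct device *dev,
					const dma_addr_t *dma_list,
					unsigned int nr_pages,
					enum dma_data_direction dir)
{
	assert(dma_list || nr_pages == 0);
	for (unsigned int i = 0; i < nr_pages; i++)
		dma_sync_single_for_device(dev, dma_list[i],
					   NVME_CTRL_PAGE_SIZE, dir);
}
```

The duplication is two short loops, and each call site then names the
direction it wants instead of passing a bare true or false.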
> +static blk_status_t nvme_rq_setup_dmabuf_map(struct request *req,
> + struct nvme_queue *nvmeq)
> +{
> + struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
> + int length = blk_rq_payload_bytes(req);
> + u64 dma_addr, prp1_dma, prp2_dma;
> + struct bio *bio = req->bio;
> + struct nvme_dmabuf_map *map;
> + dma_addr_t *dma_list;
> + dma_addr_t prp_dma;
> + __le64 *prp_list;
> + int i, map_idx;
> + int offset;
> +
> + nvme_dmabuf_map_sync(nvmeq->dev, req, false);
> +
> + map = container_of(bio->dmabuf_map, struct nvme_dmabuf_map, base);
> + dma_list = map->dma_list;
> +
> + offset = bio->bi_iter.bi_bvec_done;
> + map_idx = offset / NVME_CTRL_PAGE_SIZE;
> + offset &= (NVME_CTRL_PAGE_SIZE - 1);
> + prp1_dma = dma_list[map_idx++] + offset;
Same comments as for the sync helper above.
> + length -= (NVME_CTRL_PAGE_SIZE - offset);
> + if (length <= 0) {
> + prp2_dma = 0;
> + goto done;
> + }
> +
> + if (length <= NVME_CTRL_PAGE_SIZE) {
> + prp2_dma = dma_list[map_idx];
> + goto done;
> + }
> +
> + if (DIV_ROUND_UP(length, NVME_CTRL_PAGE_SIZE) <=
> + NVME_SMALL_POOL_SIZE / sizeof(__le64))
> + iod->flags |= IOD_SMALL_DESCRIPTOR;
> +
> + prp_list = dma_pool_alloc(nvme_dma_pool(nvmeq, iod), GFP_ATOMIC,
> + &prp_dma);
> + if (!prp_list)
> + return BLK_STS_RESOURCE;
> +
> + iod->descriptors[iod->nr_descriptors++] = prp_list;
> + prp2_dma = prp_dma;
And I really hate how this duplicates all the nasty PRP building logic,
although right now I don't have a good answer to that.
> +static inline bool nvme_rq_is_dmabuf_attached(struct request *req)
> +{
> + if (!IS_ENABLED(CONFIG_DMABUF_TOKEN))
> + return false;
> + return req->bio && bio_flagged(req->bio, BIO_DMABUF_MAP);
> +}
This is something that should go into the block layer.