From: Keith Busch <kbusch@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>, Sagi Grimberg <sagi@grimberg.me>,
	Chaitanya Kulkarni <kch@nvidia.com>,
	Kanchan Joshi <joshi.k@samsung.com>,
	Leon Romanovsky <leon@kernel.org>,
	Nitesh Shetty <nj.shetty@samsung.com>,
	Logan Gunthorpe <logang@deltatee.com>,
	linux-block@vger.kernel.org, linux-nvme@lists.infradead.org
Subject: Re: [PATCH 1/9] block: don't merge different kinds of P2P transfers in a single bio
Date: Wed, 11 Jun 2025 10:26:03 -0600
Message-ID: <aEmuG1dUDGuci7VW@kbusch-mbp>
In-Reply-To: <20250611034316.GA2869@lst.de>

On Wed, Jun 11, 2025 at 05:43:16AM +0200, Christoph Hellwig wrote:
> On Tue, Jun 10, 2025 at 09:37:30AM -0600, Keith Busch wrote:
> > I may be out of the loop here. Is this an optimization to make something
> > easier for the DMA layer?
>
> Yes. P2P that is based on a bus address (i.e. using a switch) uses
> a completely different way to DMA MAP than the normal IOMMU or
> direct mapping. So the optimization of collapsing all host physical
> addresses into an iova can't work once it is present.
>
> > I don't think there's any fundamental reason
> > why devices like nvme couldn't handle a command that uses memory mixed
> > among multiple devices and/or host memory, at least.
>
> Sure, devices don't even see if an IOVA is P2P or not, this is all
> host side.

Sorry for my ignorant questions here, but I'm not sure how this setup
(P2P transactions with switches and IOMMU enabled) actually works and
would like to understand better.

If I recall correctly, the PCIe ACS features will by default redirect
everything up to the root-complex when you have the IOMMU on. A device
can set its memory request TLP's Address Type field to have the switch
direct the transaction directly to a peer device instead, but how does
the nvme device know how to set its memory request's AT field?
There's nothing that says a command's addresses are untranslated IOVAs
vs translated peer addresses, right? Lacking some mechanism to specify
what kind of address the nvme controller is dealing with, wouldn't you
be forced to map peer addresses with the IOMMU, having P2P transactions
make a round trip through it only using mapped IOVAs?
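
As a rough illustration of the routing question above, here is a toy C
model of how a switch downstream port might route a peer-to-peer memory
request depending on its AT field. The AT encodings follow the PCIe
definitions (untranslated, translation request, translated); the
struct, enum and function names prefixed with toy_ are hypothetical and
exist only for this sketch. It assumes just two ACS controls (P2P
Request Redirect and Direct Translated P2P) and ignores egress control
and completion redirect, so it is not a description of real hardware or
of any kernel code.

```c
#include <stdbool.h>

/* PCIe Address Type (AT) field encodings for memory requests. */
enum tlp_at {
	TLP_AT_UNTRANSLATED	= 0x0,	/* default: address still needs translation */
	TLP_AT_TRANSLATION_REQ	= 0x1,	/* ATS translation request */
	TLP_AT_TRANSLATED	= 0x2,	/* address was already translated via ATS */
};

/* Hypothetical, simplified view of the ACS controls on a downstream port. */
struct toy_acs_cfg {
	bool p2p_request_redirect;	/* send peer requests upstream */
	bool direct_translated_p2p;	/* let translated requests go direct */
};

enum toy_route {
	TOY_ROUTE_DIRECT_TO_PEER,
	TOY_ROUTE_UPSTREAM,
};

/*
 * Decide where a peer-to-peer memory request goes in this toy model.
 * Assumption: Direct Translated P2P takes precedence over Request
 * Redirect for requests marked as translated.
 */
static enum toy_route toy_route_p2p_request(const struct toy_acs_cfg *cfg,
					    enum tlp_at at)
{
	if (cfg->direct_translated_p2p && at == TLP_AT_TRANSLATED)
		return TOY_ROUTE_DIRECT_TO_PEER;
	if (cfg->p2p_request_redirect)
		return TOY_ROUTE_UPSTREAM;
	return TOY_ROUTE_DIRECT_TO_PEER;
}
```

In this model, a device that only ever issues requests with the default
(untranslated) AT value always takes the upstream path whenever
p2p_request_redirect is set, which is the round trip through the IOMMU
that the question is about.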