From: Jason Gunthorpe <jgg@ziepe.ca>
To: Christoph Hellwig <hch@lst.de>
Cc: "Logan Gunthorpe" <logang@deltatee.com>,
linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org,
linux-nvdimm@lists.01.org, linux-block@vger.kernel.org,
"Stephen Bates" <sbates@raithlin.com>,
"Jens Axboe" <axboe@kernel.dk>,
"Keith Busch" <keith.busch@intel.com>,
"Sagi Grimberg" <sagi@grimberg.me>,
"Bjorn Helgaas" <bhelgaas@google.com>,
"Max Gurtovoy" <maxg@mellanox.com>,
"Dan Williams" <dan.j.williams@intel.com>,
"Jérôme Glisse" <jglisse@redhat.com>,
"Benjamin Herrenschmidt" <benh@kernel.crashing.org>
Subject: Re: [PATCH 06/12] IB/core: Add optional PCI P2P flag to rdma_rw_ctx_[init|destroy]()
Date: Mon, 8 Jan 2018 11:09:17 -0700 [thread overview]
Message-ID: <20180108180917.GF11348@ziepe.ca> (raw)
In-Reply-To: <20180108145901.GA10743@lst.de>
On Mon, Jan 08, 2018 at 03:59:01PM +0100, Christoph Hellwig wrote:
> On Thu, Jan 04, 2018 at 09:50:31PM -0700, Jason Gunthorpe wrote:
> > Well that argument applies equally to the RDMA RW API wrappers around
> > the DMA API. I think it is fine if sgl are defined to only have P2P or
> > not, and that debugging support seemed reasonable to me..
> >
> > > It's also very difficult to add similar functionality to dma_map_page seeing
> > > dma_unmap_page won't have any way to know what it's dealing with. It just
> > > seems confusing to support P2P in the SG version and not the page version.
> >
> > Well, this proposal is to support P2P in only some RDMA APIs and not
> > others, so it seems about as confusing to me..
>
> As usual we implement what actually has a consumer. On top of that the
> R/W API is the only core RDMA API that actually does DMA mapping for the
> ULP at the moment.
Well again the same can be said for dma_map_page vs dma_map_sg...
> For SENDs and everything else dma maps are done by the ULP (I'd like
> to eventually change that, though - e.g. sends through that are
> inline to the workqueue don't need a dma map to start with).
> That's because the initial design was to let the ULPs do the DMA
> mappings, which fundamentally is wrong. I've fixed it for the R/W
> API when adding it, but no one has started work on SENDs and atomics.
Well, you know why it is like this, and it is very complicated to
unwind - the HW driver does not have enough information during CQ
processing to properly do any unmaps, let alone serious error tear
down unmaps, so we'd need a bunch of new APIs developed first, like RW
did. :\
> > And on that topic, does this scheme work with HFI?
>
> No, and I guess we need an opt-out. HFI generally seems to be
> extremely weird.
This series needs some kind of fix so HFI, QIB, rxe, etc don't get
broken, and it shouldn't be 'fixed' at the RDMA level.
> > This is why P2P must fit in to the common DMA framework somehow, we
> > rely on these abstractions to work properly and fully in RDMA.
>
> Moving P2P up to common RDMA code isn't going to fix this. For that
> we need to stop preting that something that isn't DMA can abuse the
> dma mapping framework, and until then opt them out of behavior that
> assumes actual DMA like P2P.
It could, if we had a DMA op for p2p then the drivers that provide
their own ops can implement it appropriately or not at all.
Eg the correct implementation for rxe to support p2p memory is
probably somewhat straightfoward.
> > I think you should consider pushing this directly into the dma_ops
> > implementations. Add a p2p_supported flag to struct dma_map_ops, and
> > only if it is true can a caller pass a homogeneous SGL to ops->map_sg.
> > Only map_sg would be supported for P2P. Upgraded implementations can
> > call the helper function.
>
> If at all it should be in the dma_map* wrappers, but for that we'd need
> a good identifier. And it still would not solve the whole fake dma
> ops issue.
Very long term the IOMMUs under the ops will need to care about this,
so the wrapper is not an optimal place to put it - but I wouldn't
object if it gets it out of RDMA :)
Jason
next prev parent reply other threads:[~2018-01-08 18:09 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-04 19:01 [PATCH 00/11] Copy Offload in NVMe Fabrics with P2P PCI Memory Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 01/12] pci-p2p: Support peer to peer memory Logan Gunthorpe
2018-01-04 21:40 ` Bjorn Helgaas
2018-01-04 23:06 ` Logan Gunthorpe
2018-01-04 21:59 ` Bjorn Helgaas
2018-01-05 0:20 ` Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 02/12] pci-p2p: Add sysfs group to display p2pmem stats Logan Gunthorpe
2018-01-04 21:50 ` Bjorn Helgaas
2018-01-04 22:25 ` Jason Gunthorpe
2018-01-04 23:13 ` Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 03/12] pci-p2p: Add PCI p2pmem dma mappings to adjust the bus offset Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 04/12] pci-p2p: Clear ACS P2P flags for all client devices Logan Gunthorpe
2018-01-04 21:57 ` Bjorn Helgaas
2018-01-04 22:35 ` Alex Williamson
2018-01-05 0:00 ` Logan Gunthorpe
2018-01-05 1:09 ` Logan Gunthorpe
2018-01-05 3:33 ` Alex Williamson
2018-01-05 6:47 ` Jerome Glisse
2018-01-05 15:41 ` Alex Williamson
2018-01-05 17:10 ` Logan Gunthorpe
2018-01-05 17:18 ` Alex Williamson
2018-01-04 19:01 ` [PATCH 05/12] block: Introduce PCI P2P flags for request and request queue Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 06/12] IB/core: Add optional PCI P2P flag to rdma_rw_ctx_[init|destroy]() Logan Gunthorpe
2018-01-04 19:22 ` Jason Gunthorpe
2018-01-04 19:52 ` Logan Gunthorpe
2018-01-04 22:13 ` Jason Gunthorpe
2018-01-04 23:44 ` Logan Gunthorpe
2018-01-05 4:50 ` Jason Gunthorpe
2018-01-08 14:59 ` Christoph Hellwig
2018-01-08 18:09 ` Jason Gunthorpe [this message]
2018-01-08 18:17 ` Logan Gunthorpe
2018-01-08 18:29 ` Jason Gunthorpe
2018-01-08 18:34 ` Christoph Hellwig
2018-01-08 18:44 ` Logan Gunthorpe
2018-01-08 18:57 ` Christoph Hellwig
2018-01-08 19:05 ` Logan Gunthorpe
2018-01-09 16:47 ` Christoph Hellwig
2018-01-08 19:49 ` Jason Gunthorpe
2018-01-09 16:46 ` Christoph Hellwig
2018-01-09 17:10 ` Jason Gunthorpe
2018-01-08 19:01 ` Jason Gunthorpe
2018-01-09 16:55 ` Christoph Hellwig
2018-01-04 19:01 ` [PATCH 07/12] nvme-pci: clean up CMB initialization Logan Gunthorpe
2018-01-04 19:08 ` Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 08/12] nvme-pci: clean up SMBSZ bit definitions Logan Gunthorpe
2018-01-04 19:08 ` Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 09/12] nvme-pci: Use PCI p2pmem subsystem to manage the CMB Logan Gunthorpe
2018-01-05 15:30 ` Marta Rybczynska
2018-01-05 18:14 ` Logan Gunthorpe
2018-01-05 18:11 ` Keith Busch
2018-01-05 18:19 ` Logan Gunthorpe
2018-01-05 19:01 ` Keith Busch
2018-01-05 19:04 ` Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 10/12] nvme-pci: Add support for P2P memory in requests Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 11/12] nvme-pci: Add a quirk for a pseudo CMB Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 12/12] nvmet: Optionally use PCI P2P memory Logan Gunthorpe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180108180917.GF11348@ziepe.ca \
--to=jgg@ziepe.ca \
--cc=axboe@kernel.dk \
--cc=benh@kernel.crashing.org \
--cc=bhelgaas@google.com \
--cc=dan.j.williams@intel.com \
--cc=hch@lst.de \
--cc=jglisse@redhat.com \
--cc=keith.busch@intel.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nvdimm@lists.01.org \
--cc=linux-nvme@lists.infradead.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=logang@deltatee.com \
--cc=maxg@mellanox.com \
--cc=sagi@grimberg.me \
--cc=sbates@raithlin.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox