From mboxrd@z Thu Jan 1 00:00:00 1970 From: Benjamin Herrenschmidt Subject: Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory Date: Wed, 19 Apr 2017 11:20:06 +1000 Message-ID: <1492564806.25766.124.camel@kernel.crashing.org> References: <1492381396.25766.43.camel@kernel.crashing.org> <20170418164557.GA7181@obsidianresearch.com> <20170418190138.GH7181@obsidianresearch.com> <20170418210339.GA24257@obsidianresearch.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20170418210339.GA24257-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linux-nvdimm-bounces-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org Sender: "Linux-nvdimm" To: Jason Gunthorpe , Dan Williams Cc: Jens Axboe , "James E.J. Bottomley" , "Martin K. Petersen" , linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-pci-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Steve Wise , "linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" , linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org, Keith Busch , Jerome Glisse , Bjorn Helgaas , linux-scsi , linux-nvdimm , Max Gurtovoy , Christoph Hellwig List-Id: linux-rdma@vger.kernel.org On Tue, 2017-04-18 at 15:03 -0600, Jason Gunthorpe wrote: > I don't follow, when does get_dma_ops() return a p2p aware provider? > It has no way to know if the DMA is going to involve p2p, get_dma_ops > is called with the device initiating the DMA. > > So you'd always return the P2P shim on a system that has registered > P2P memory? > > Even so, how does this shim work? dma_ops are not really intended to > be stacked. How would we make unmap work, for instance? What happens > when the underlying iommu dma ops actually natively understands p2p > and doesn't want the shim? Good point. We only know on a per-page basis ... ugh. So we really need to change the arch main dma_ops. I'm not opposed to that. What we then need to do is have that main arch dma_map_sg, when it encounters a "device" page, call into a helper attached to the devmap to handle *that page*, providing sufficient context. That helper wouldn't perform the actual iommu mapping. It would simply return something along the lines of: - "use that alternate bus address and don't map in the iommu" - "use that alternate bus address and do map in the iommu" - "proceed as normal" - "fail" What do you think ? Cheers, Ben.