From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AB7B2ECAAA1 for ; Fri, 9 Sep 2022 13:24:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229585AbiIINYn (ORCPT ); Fri, 9 Sep 2022 09:24:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54806 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229631AbiIINYl (ORCPT ); Fri, 9 Sep 2022 09:24:41 -0400 Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 352D118375; Fri, 9 Sep 2022 06:24:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=In-Reply-To:Content-Type:MIME-Version :References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=iY0w1kh/xK+MCzo1rcnqvZqdmGxlHyuGkIUEG2fe/n0=; b=3wAl0TxsoTvU4WSk/aqxa6e9u/ 30XUqAlq7pk+SKpG5MSQFbT0MoWDiD1aYo7F1FJ3DrSUU/9qFFxdzWUyN2J66hZuW72TG3hUIDZdP P42VoYLJin+mebeMH4p6hwbaQ6n3DWx4lhsetBocd6lS+lVM+d40I3KLyzVAXt3CokaWpej/0paac cqzPcUqf/5srbDXkPIK36whHP4DZUpnSRlHspglNS5p77kTiy4b9zZUMLF3IIgBjcz6Br2jRoxYX+ vWGuxJBluz3V6oi4VjQPpmc7mmu3iWRQ00Aie4He+c8ZFXrNNAodqvzzBHDQ6SOErfZj+lNQgw+to rptERk0w==; Received: from hch by bombadil.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1oWdzn-00GJ4H-M7; Fri, 09 Sep 2022 13:24:35 +0000 Date: Fri, 9 Sep 2022 06:24:35 -0700 From: Christoph Hellwig To: Jason Gunthorpe Cc: Christoph Hellwig , Christian =?iso-8859-1?Q?K=F6nig?= , Alex Williamson , Cornelia Huck , dri-devel@lists.freedesktop.org, kvm@vger.kernel.org, linaro-mm-sig@lists.linaro.org, linux-media@vger.kernel.org, Sumit Semwal , Daniel Vetter , Leon Romanovsky , linux-rdma@vger.kernel.org, Maor Gottlieb , Oded Gabbay , Dan Williams Subject: Re: [PATCH v2 4/4] vfio/pci: Allow MMIO regions to be exported through dma-buf Message-ID: References: <4-v2-472615b3877e+28f7-vfio_dma_buf_jgg@nvidia.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org On Wed, Sep 07, 2022 at 01:12:52PM -0300, Jason Gunthorpe wrote: > The PCI offset is some embedded thing - I've never seen it in a server > platform. That's not actually true, e.g. some power system definitively had it, althiugh I don't know if the current ones do. But that's not that point. The offset is a configuration fully supported by Linux, and someone that just works by using the proper APIs. Doing some handwaiving about embedded only or bad design doesn't matter. There is a reason why we have these proper APIs and no one has any business bypassing them. > I also seem to remember that iommu and PCI offset don't play nice > together - so for the VFIO use case where the iommu is present I'm > pretty sure we can very safely assume 0 offset. That seems confirmed > by the fact that VFIO has never handled PCI offset in its own P2P path > and P2P works fine in VMs across a wide range of platforms. I think the offset is one of the reasons why IOVA windows can be reserved (and maybe also why ppc is so weird). > So, would you be OK with this series if I try to make a dma_map_p2p() > that resolves the offset issue? Well, if it also solves the other issue of invalid scatterlists leaking outside of drm we can think about it. > > > Last but not least I don't really see how the code would even work > > when an IOMMU is used, as dma_map_resource will return an IOVA that > > is only understood by the IOMMU itself, and not the other endpoint. > > I don't understand this. > > __iommu_dma_map() will put the given phys into the iommu_domain > associated with 'dev' and return the IOVA it picked. Yes, __iommu_dma_map creates an IOVA for the mapped remote BAR. That is the right thing if the I/O goes through the host bridge, but it is the wrong thing if the I/O goes through the switch - in that case the IOVA generated is not something that the endpoint that owns the BAR can even understand. Take a look at iommu_dma_map_sg and pci_p2pdma_map_segment to see how this is handled.