linux-pci.vger.kernel.org archive mirror
From: Matthew Wilcox <willy@infradead.org>
To: Robin Murphy <robin.murphy@arm.com>
Cc: "Leon Romanovsky" <leon@kernel.org>,
	"Christoph Hellwig" <hch@lst.de>,
	"Jason Gunthorpe" <jgg@ziepe.ca>, "Jens Axboe" <axboe@kernel.dk>,
	"Joerg Roedel" <joro@8bytes.org>, "Will Deacon" <will@kernel.org>,
	"Sagi Grimberg" <sagi@grimberg.me>,
	"Keith Busch" <kbusch@kernel.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Logan Gunthorpe" <logang@deltatee.com>,
	"Yishai Hadas" <yishaih@nvidia.com>,
	"Shameer Kolothum" <shameerali.kolothum.thodi@huawei.com>,
	"Kevin Tian" <kevin.tian@intel.com>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"Marek Szyprowski" <m.szyprowski@samsung.com>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Jonathan Corbet" <corbet@lwn.net>,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
	iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
	linux-pci@vger.kernel.org, kvm@vger.kernel.org,
	linux-mm@kvack.org, "Randy Dunlap" <rdunlap@infradead.org>
Subject: Re: [PATCH v7 00/17] Provide a new two step DMA mapping API
Date: Thu, 27 Mar 2025 17:56:16 +0000
Message-ID: <Z-WRQOYEvOWlI34w@casper.infradead.org>
In-Reply-To: <e024fe3d-bddf-4006-8535-656fd0a3fada@arm.com>

On Fri, Mar 21, 2025 at 04:05:22PM +0000, Robin Murphy wrote:
> > The main issue which we are trying to solve "abuse of SG lists for
> > things without struct page", is not going to disappear by itself.
> 
> What everyone seems to have missed is that while it is technically true that
> the streaming DMA API doesn't need a literal struct page, it still very much
> depends on something which having a struct page makes it sufficiently safe
> to assume: that what it's being given is valid kernel memory that it can do
> things like phys_to_virt() or kmap_atomic() on. A completely generic DMA
> mapping API which could do the right thing for any old PFN on any system
> would be a very hard thing to achieve, and I suspect even harder to do
> efficiently. And pushing the complexity into every caller to encourage and
> normalise drivers calling virt_to_phys() all over (_so_ many bugs there...)
> and pass magic flags to influence internal behaviour of the API
> implementation clearly isn't scalable. Don't think I haven't seen the other
> thread where Christian had the same concern that this "sounds like an
> absolutely horrible design."

Doing I/O to memory which does not have a struct page is the whole point
of this series (and many many more patches to come in the future).

This is very useful functionality to have.  Xen can do it, which is
advantageous for a hypervisor as it really doesn't use the struct page
for anything; that memory is assigned to the guest and the host only
needs the page in order to do I/O on behalf of the guest.

I first came up against this problem with the 3DXP project, which is now
dead, but there are other similar projects that involve giving each
machine in a cluster access to a large amount of shared memory, and
there's not really a good place to allocate the memmap from.
And the only reason to allocate memmap is so that we can do I/O to
this memory.

I'm sure there are other use cases.  Given that nVidia are so
interested in this, I would guess that at least one of them involves
a graphics card.

I don't think that phys_to_virt() is something that has ever been
guaranteed to work (HIGHMEM and so on).  I do think that we should
support kmap_local_phys() for these things -- there's no need to
have a struct page for that.

I haven't looked at the implementation, but I think we need to agree
that this is useful functionality to have, or this isn't going anywhere.

Thread overview: 41+ messages
2025-02-05 14:40 [PATCH v7 00/17] Provide a new two step DMA mapping API Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 01/17] PCI/P2PDMA: Refactor the p2pdma mapping helpers Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 02/17] dma-mapping: move the PCI P2PDMA mapping helpers to pci-p2pdma.h Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 03/17] iommu: generalize the batched sync after map interface Leon Romanovsky
2025-03-17  9:52   ` Niklas Schnelle
2025-03-17 13:44     ` Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 04/17] iommu: add kernel-doc for iommu_unmap and iommu_unmap_fast Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 05/17] dma-mapping: Provide an interface to allow allocate IOVA Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 06/17] iommu/dma: Factor out a iommu_dma_map_swiotlb helper Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 07/17] dma-mapping: Implement link/unlink ranges API Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 08/17] dma-mapping: add a dma_need_unmap helper Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 09/17] docs: core-api: document the IOVA-based API Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 10/17] mm/hmm: let users to tag specific PFN with DMA mapped bit Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 11/17] mm/hmm: provide generic DMA managing logic Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 12/17] RDMA/umem: Store ODP access mask information in PFN Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 13/17] RDMA/core: Convert UMEM ODP DMA mapping to caching IOVA and page linkage Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 14/17] RDMA/umem: Separate implicit ODP initialization from explicit ODP Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 15/17] vfio/mlx5: Explicitly use number of pages instead of allocated length Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 16/17] vfio/mlx5: Rewrite create mkey flow to allow better code reuse Leon Romanovsky
2025-02-05 14:40 ` [PATCH v7 17/17] vfio/mlx5: Enable the DMA link API Leon Romanovsky
2025-02-20 12:48 ` [PATCH v7 00/17] Provide a new two step DMA mapping API Leon Romanovsky
2025-02-28 19:54   ` Robin Murphy
2025-03-02  8:57     ` Leon Romanovsky
2025-03-21 16:05       ` Robin Murphy
2025-03-25 12:36         ` Jason Gunthorpe
2025-03-25 14:41           ` Leon Romanovsky
2025-04-01  1:09             ` Luis Chamberlain
2025-03-27 17:56         ` Matthew Wilcox [this message]
2025-03-12  9:28     ` Marek Szyprowski
2025-03-12 19:32       ` Leon Romanovsky
2025-03-14 10:52         ` Marek Szyprowski
2025-03-14 18:49           ` Leon Romanovsky
2025-03-19  8:30             ` Leon Romanovsky
2025-03-19 17:58           ` Jason Gunthorpe
2025-03-20 23:52             ` Marek Szyprowski
2025-03-22  0:41               ` Jason Gunthorpe
2025-03-28 14:18                 ` Marek Szyprowski
2025-03-31 19:10                   ` Jason Gunthorpe
2025-03-31 14:46                 ` Chuck Lever
2025-04-18  1:20                 ` Dan Williams
2025-03-21 13:52       ` Robin Murphy
