From: Jason Gunthorpe <jgg@nvidia.com>
To: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Leon Romanovsky <leon@kernel.org>,
Abdiel Janulgue <abdiel.janulgue@gmail.com>,
Alexander Potapenko <glider@google.com>,
Alex Gaynor <alex.gaynor@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Hellwig <hch@lst.de>,
Danilo Krummrich <dakr@kernel.org>,
iommu@lists.linux.dev, Jason Wang <jasowang@redhat.com>,
Jens Axboe <axboe@kernel.dk>, Joerg Roedel <joro@8bytes.org>,
Jonathan Corbet <corbet@lwn.net>, Juergen Gross <jgross@suse.com>,
kasan-dev@googlegroups.com, Keith Busch <kbusch@kernel.org>,
linux-block@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-nvme@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
linux-trace-kernel@vger.kernel.org,
Madhavan Srinivasan <maddy@linux.ibm.com>,
Masami Hiramatsu <mhiramat@kernel.org>,
Michael Ellerman <mpe@ellerman.id.au>,
"Michael S. Tsirkin" <mst@redhat.com>,
Miguel Ojeda <ojeda@kernel.org>,
Robin Murphy <robin.murphy@arm.com>,
rust-for-linux@vger.kernel.org, Sagi Grimberg <sagi@grimberg.me>,
Stefano Stabellini <sstabellini@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
virtualization@lists.linux.dev, Will Deacon <will@kernel.org>,
xen-devel@lists.xenproject.org
Subject: Re: [PATCH v4 00/16] dma-mapping: migrate to physical address-based API
Date: Fri, 5 Sep 2025 14:43:24 -0300
Message-ID: <20250905174324.GI616306@nvidia.com>
In-Reply-To: <7557f31e-1504-4f62-b00b-70e25bb793cb@samsung.com>
On Fri, Sep 05, 2025 at 06:20:51PM +0200, Marek Szyprowski wrote:
> I've checked the most advertised use case in
> https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/log/?h=dmabuf-vfio
> and I still don't see the reason why it cannot be based
> on the dma_map_resource() API. I'm aware of the little asymmetry of the
> client calls in such a case; indeed it is not pretty, but this should work
> even now:
>
> phys = phys_vec[i].paddr;
>
> if (is_mmio)
> dma_map_resource(phys, len, ...);
> else
> dma_map_page(phys_to_page(phys), offset_in_page(phys), ...);
>
> What did I miss?
I have a somewhat different answer than Leon..
The link path would need a resource variation too:
+		ret = dma_iova_link(attachment->dev, state,
+				    phys_vec[i].paddr, 0,
+				    phys_vec[i].len, dir, attrs);
+		if (ret)
+			goto err_unmap_dma;
+
+		mapped_len += phys_vec[i].len;
It is an existing bug that we don't properly handle all the details of
MMIO for dma_iova_link().
Since this is already a phys_addr_t, I wouldn't strongly argue against
that being done by adding ATTR_MMIO to dma_iova_link().
If you did that, then you'd still want a dma_(un)map_phys() helper
that handled ATTR_MMIO too. It could be an inline "if () resource else
page" wrapper like you say.
So API-wise I think we have the right design here.
I think the question you are asking is how much change to the
internals of the DMA API we want to make for ATTR_MMIO.
It is not zero, but there is some minimum that is less than what this
series does.
So, reason #1: much of this ATTR_MMIO work is needed anyhow. Being
consistent and unifying the dma_map_resource path with ATTR_MMIO should
improve the long-term maintainability of the code. We have already
uncovered paths where map_resource does not behave consistently with
map_page, and it is unclear whether these are bugs or deliberate.
Reason #2: we do actually want to get rid of struct page usage to help
advance Matthew's work. This means we want to build a clean,
struct-page-less path for IO: we can do phys to virt, or kmap a phys,
but none of phys to page, page to virt, or page to phys. Stopping at a
phys-based public API and then leaving all the phys-to-page/etc
conversions hidden inside is not enough.
This is why I was looking at the dma_ops path, to see just how much
page usage there is, and I found very little. So this dream is
achievable and with this series we are there for ARM64 and x86
environments.
> This patchset focuses only on the dma_map_page -> dma_map_phys rework.
> There are also other interfaces, like dma_alloc_pages(), and so far
> nothing has been proposed for them.
That's because they already have non-page alternatives.
Almost all places call dma_alloc_noncoherent():
static inline void *dma_alloc_noncoherent(struct device *dev, size_t size,
		dma_addr_t *dma_handle, enum dma_data_direction dir, gfp_t gfp)
{
	struct page *page = dma_alloc_pages(dev, size, dma_handle, dir, gfp);

	return page ? page_address(page) : NULL;
}
Which is KVA based.
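
A typical caller (illustrative names, error handling trimmed) only
ever sees the KVA and the dma_addr_t, never a struct page:

	void *buf;
	dma_addr_t dma;

	/* buf is a kernel virtual address, dma goes to the device */
	buf = dma_alloc_noncoherent(dev, SZ_4K, &dma, DMA_TO_DEVICE,
				    GFP_KERNEL);
	if (!buf)
		return -ENOMEM;

	/* ... fill buf, hand dma to the hardware ... */

	dma_free_noncoherent(dev, SZ_4K, buf, dma, DMA_TO_DEVICE);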
There is only one user I found of dma_alloc_pages() directly:
drivers/firewire/ohci.c: ctx->pages[i] = dma_alloc_pages(dev, PAGE_SIZE, &dma_addr,
And it deliberately uses page->private:
set_page_private(ctx->pages[i], dma_addr);
So it is correct to use the struct page API.
Some usages of dma_alloc_noncontiguous() can be implemented using the
dma_iova_link() flow, as drivers/vfio/pci/mlx5/cmd.c shows, by using
alloc_pages_bulk() for the allocator; the rough shape is sketched
below. We don't yet have a 'dma alloc link' operation though, and
there are only four users of dma_alloc_noncontiguous()..
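
Rough sketch of that flow, loosely modeled on the mlx5 code and
assuming the current dma_iova_* signatures; the function name and the
caller-provided pages[]/npages are illustrative, not the actual driver
code:

/*
 * Reserve one contiguous IOVA range, then link each (possibly
 * discontiguous) page of a caller-provided allocation into it.
 * 'pages'/'npages' are assumed to come from something like
 * alloc_pages_bulk().
 */
static int sketch_link_pages(struct device *dev, struct dma_iova_state *state,
			     struct page **pages, size_t npages,
			     enum dma_data_direction dir)
{
	size_t mapped = 0, i;
	int ret;

	if (!dma_iova_try_alloc(dev, state, 0, npages * PAGE_SIZE))
		return -ENOMEM;	/* caller falls back to dma_map_sg() etc */

	for (i = 0; i < npages; i++) {
		ret = dma_iova_link(dev, state, page_to_phys(pages[i]),
				    mapped, PAGE_SIZE, dir, 0);
		if (ret)
			goto err_destroy;
		mapped += PAGE_SIZE;
	}

	ret = dma_iova_sync(dev, state, 0, mapped);
	if (ret)
		goto err_destroy;
	return 0;

err_destroy:
	dma_iova_destroy(dev, state, mapped, dir, 0);
	return ret;
}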
Jason