From: Jason Gunthorpe <jgg@nvidia.com>
To: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Leon Romanovsky <leon@kernel.org>,
Abdiel Janulgue <abdiel.janulgue@gmail.com>,
Alexander Potapenko <glider@google.com>,
Alex Gaynor <alex.gaynor@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Hellwig <hch@lst.de>,
Danilo Krummrich <dakr@kernel.org>,
iommu@lists.linux.dev, Jason Wang <jasowang@redhat.com>,
Jens Axboe <axboe@kernel.dk>, Joerg Roedel <joro@8bytes.org>,
Jonathan Corbet <corbet@lwn.net>, Juergen Gross <jgross@suse.com>,
kasan-dev@googlegroups.com, Keith Busch <kbusch@kernel.org>,
linux-block@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-nvme@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
linux-trace-kernel@vger.kernel.org,
Madhavan Srinivasan <maddy@linux.ibm.com>,
Masami Hiramatsu <mhiramat@kernel.org>,
Michael Ellerman <mpe@ellerman.id.au>,
"Michael S. Tsirkin" <mst@redhat.com>,
Miguel Ojeda <ojeda@kernel.org>,
Robin Murphy <robin.murphy@arm.com>,
rust-for-linux@vger.kernel.org, Sagi Grimberg <sagi@grimberg.me>,
Stefano Stabellini <sstabellini@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
virtualization@lists.linux.dev, Will Deacon <will@kernel.org>,
xen-devel@lists.xenproject.org
Subject: Re: [PATCH v1 00/16] dma-mapping: migrate to physical address-based API
Date: Sat, 9 Aug 2025 10:34:54 -0300
Message-ID: <20250809133454.GP184255@nvidia.com>
In-Reply-To: <a154e058-c0e6-4208-9f52-57cec22eaf7d@samsung.com>

On Fri, Aug 08, 2025 at 08:51:08PM +0200, Marek Szyprowski wrote:

> First - basing the API on the phys_addr_t.
>
> Page based API had the advantage that it was really hard to abuse it and
> call for something that is not 'a normal RAM'.

This is not true anymore. Today we have ZONE_DEVICE as a struct page
type with a whole bunch of non-dram sub-types:

enum memory_type {
	/* 0 is reserved to catch uninitialized type fields */
	MEMORY_DEVICE_PRIVATE = 1,
	MEMORY_DEVICE_COHERENT,
	MEMORY_DEVICE_FS_DAX,
	MEMORY_DEVICE_GENERIC,
	MEMORY_DEVICE_PCI_P2PDMA,
};

Few of which are kmappable/page_to_virtable() in a way that is useful
for the DMA API.

DMA API sort of ignores all of this and relies on the caller to not
pass in an incorrect struct page. eg we rely on things like the block
stack to do the right stuff when a MEMORY_DEVICE_PCI_P2PDMA is present
in a bio_vec.

Which is not really fundamentally different from just using
phys_addr_t in the first place.

Sure, this was a stronger argument when this stuff was originally
written, before ZONE_DEVICE was invented.

> I initially thought that phys_addr_t based API will somehow simplify
> arch specific implementation, as some of them indeed rely on
> phys_addr_t internally, but I missed other things pointed by
> Robin. Do we have here any alternative?

I think it is less of a code simplification, more of a reduction in
conceptual load. When we can say directly there is no struct page type
anywhere in the DMA API layers then we only have to reason about
kmap/phys_to_virt compatibility.

This is also a weaker overall requirement than needing an actual
struct page, which allows optimizing other parts of the kernel. For
example, we aren't forced to create MEMORY_DEVICE_PCI_P2PDMA struct
pages just to use the DMA API.

Again, the more places in the kernel where we can get rid of struct
page, the smoother the road will be for the MM side struct page
restructuring. For example one of the bigger eventual goals here is to
make a bio_vec store phys_addr_t, not struct page pointers.

DMA API is not alone here, we have been de-struct-paging the kernel
for a long time now:

netdev: https://lore.kernel.org/linux-mm/20250609043225.77229-1-byungchul@sk.com/
slab: https://lore.kernel.org/linux-mm/20211201181510.18784-1-vbabka@suse.cz/
iommu: https://lore.kernel.org/all/0-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com/
page tables: https://lore.kernel.org/linux-mm/20230731170332.69404-1-vishal.moola@gmail.com/
zswap: https://lore.kernel.org/all/20241216150450.1228021-1-42.hyeyoo@gmail.com/

With a long term goal that struct page only exists for legacy code,
and is maybe entirely compiled out of modern server kernels.

> Second - making dma_map_phys() a single API to handle all cases.
>
> Do we really need such single function to handle all cases?

If we accept the direction to remove struct page then it makes little
sense to have a dma_map_ram(phys_addr) and dma_map_resource(phys_addr)
and force key callers (like block) to have more ifs - especially if
the conditional could become "free" inside the dma API (see below).
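
To make the call-site difference concrete, here is a rough userspace
sketch of the two shapes. Everything in it is made up for illustration:
the sketch_* functions, struct seg and SKETCH_ATTR_MMIO are stand-ins,
not the kernel's dma_map_phys()/dma_map_resource() or DMA_ATTR_MMIO, and
the "DMA address" returned is fake. It only shows where the conditional
lives in each design.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder attribute bit, standing in for DMA_ATTR_MMIO. */
#define SKETCH_ATTR_MMIO (1UL << 0)

/* Hypothetical segment descriptor: a physical address and a length. */
struct seg {
	uint64_t phys;
	size_t len;
};

/* Stand-ins for the two backends; they return a fake "DMA address". */
static uint64_t sketch_map_ram(uint64_t phys, size_t len)
{
	(void)len;
	return phys;
}

static uint64_t sketch_map_resource(uint64_t phys, size_t len)
{
	(void)len;
	return phys | (1ULL << 63);	/* tag MMIO results so they differ */
}

/* Unified entry point: the RAM vs MMIO decision lives inside the API. */
static uint64_t sketch_map_phys(uint64_t phys, size_t len, unsigned long attrs)
{
	if (attrs & SKETCH_ATTR_MMIO)
		return sketch_map_resource(phys, len);
	return sketch_map_ram(phys, len);
}

/* Split call chain: the caller carries the "if" at every mapping site. */
static void map_request_split(const struct seg *segs, size_t n, int is_mmio,
			      uint64_t *out)
{
	for (size_t i = 0; i < n; i++) {
		if (is_mmio)
			out[i] = sketch_map_resource(segs[i].phys, segs[i].len);
		else
			out[i] = sketch_map_ram(segs[i].phys, segs[i].len);
	}
}

/* Unified call chain: the caller computes attrs once and passes them down. */
static void map_request_unified(const struct seg *segs, size_t n,
				unsigned long attrs, uint64_t *out)
{
	for (size_t i = 0; i < n; i++)
		out[i] = sketch_map_phys(segs[i].phys, segs[i].len, attrs);
}
```

Both variants produce the same mappings; the difference is only that the
split form repeats the conditional in every caller, and would need the
same duplication for any link/unmap style entry points.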

Plus, if we keep the call chain split, then a "dma_link_resource" and
so on would now be needed as well.

> DMA_ATTR_MMIO for every typical DMA user? I know that branching is
> cheap, but this will probably increase code size for most of the typical
> users for no reason.

Well, having two call chains will increase the code size much more,
and 'resource' can't be compiled out. Arguably this unification should
reduce the .text size since many of the resource only functions go
away.

There are some branches, and I think the push toward re-using
DMA_ATTR_SKIP_CPU_SYNC was directly to try to reduce that branch
cost.

However, I think we should be looking for a design here that is "free"
on the fast no-swiotlb and non-cache-flush path. I think this can be
achieved by checking ATTR_MMIO only after seeing swiotlb is needed
(like today's P2P check). And we can probably freely fold it into
the existing sync check:

	if ((attrs & (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_MMIO)) == 0)

I saw Leon hasn't done these micro optimizations, but it seems like it
could work out.
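
Roughly, the ordering I have in mind looks like the model below. This is
a simplified userspace sketch, not the dma-direct code: the attribute
bit values are placeholders, map_phys_model() and enum map_outcome are
invented for illustration, and the model ignores details such as cache
maintenance on the bounce path.

```c
#include <stdbool.h>

/* Placeholder bit values for this sketch only, not the kernel's. */
#define DMA_ATTR_SKIP_CPU_SYNC (1UL << 0)
#define DMA_ATTR_MMIO          (1UL << 1)

enum map_outcome {
	MAP_DIRECT,		/* fast path: no bounce, no cache maintenance */
	MAP_DIRECT_SYNCED,	/* direct mapping plus CPU cache maintenance */
	MAP_BOUNCED,		/* went through a bounce buffer */
	MAP_ERROR,		/* MMIO cannot be bounced */
};

/* Hypothetical model of the mapping decision order. */
static enum map_outcome map_phys_model(bool needs_swiotlb, bool dev_coherent,
				       unsigned long attrs)
{
	if (needs_swiotlb) {
		/* Slow path only: the first place MMIO is examined by itself. */
		if (attrs & DMA_ATTR_MMIO)
			return MAP_ERROR;
		return MAP_BOUNCED;
	}

	/*
	 * One mask test covers both "caller asked to skip the sync" and
	 * "this is MMIO", so MMIO adds no extra branch here.
	 */
	if (!dev_coherent &&
	    (attrs & (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_MMIO)) == 0)
		return MAP_DIRECT_SYNCED;

	return MAP_DIRECT;
}
```

With a coherent device and no swiotlb, attrs is never inspected at all,
which is the sense in which the MMIO handling is "free" on the fast
path.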
Regards,
Jason