linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/8] update DMA API documentation
@ 2025-06-24 13:39 Petr Tesarik
  2025-06-24 13:39 ` [PATCH 1/8] docs: dma-api: use "DMA API" consistently throughout the document Petr Tesarik
                   ` (7 more replies)
  0 siblings, 8 replies; 30+ messages in thread
From: Petr Tesarik @ 2025-06-24 13:39 UTC (permalink / raw)
  To: Jonathan Corbet, Andrew Morton
  Cc: Marek Szyprowski, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT,
	Petr Tesarik

A few documentation updates:

* remove outdated and confusing parts
* reduce duplicates
* update streaming DMA API expectations

Petr Tesarik (8):
  docs: dma-api: use "DMA API" consistently throughout the document
  docs: dma-api: replace consistent with coherent
  docs: dma-api: remove remnants of PCI DMA API
  docs: dma-api: add a kernel-doc comment for dma_pool_zalloc()
  docs: dma-api: remove duplicate description of the DMA pool API
  docs: dma-api: clarify DMA addressing limitations
  docs: dma-api: update streaming DMA API physical address constraints
  docs: dma-api: clean up documentation of dma_map_sg()

 Documentation/core-api/dma-api-howto.rst |  36 ++---
 Documentation/core-api/dma-api.rst       | 173 +++++++----------------
 Documentation/core-api/mm-api.rst        |   4 +
 include/linux/dmapool.h                  |   8 ++
 mm/dmapool.c                             |   6 +-
 5 files changed, 85 insertions(+), 142 deletions(-)

-- 
2.49.0



* [PATCH 1/8] docs: dma-api: use "DMA API" consistently throughout the document
  2025-06-24 13:39 [PATCH 0/8] update DMA API documentation Petr Tesarik
@ 2025-06-24 13:39 ` Petr Tesarik
  2025-06-25  2:41   ` Randy Dunlap
  2025-06-24 13:39 ` [PATCH 2/8] docs: dma-api: replace consistent with coherent Petr Tesarik
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Petr Tesarik @ 2025-06-24 13:39 UTC (permalink / raw)
  To: Jonathan Corbet, Andrew Morton
  Cc: Marek Szyprowski, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT,
	Petr Tesarik

Make sure that all occurrences are spelled "DMA API" (all uppercase, no
hyphen, no underscore).

Signed-off-by: Petr Tesarik <ptesarik@suse.com>
---
 Documentation/core-api/dma-api.rst | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
index 2ad08517e626..97f42c15f5e4 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -13,10 +13,10 @@ machines.  Unless you know that your driver absolutely has to support
 non-consistent platforms (this is usually only legacy platforms) you
 should only use the API described in part I.
 
-Part I - dma_API
+Part I - DMA API
 ----------------
 
-To get the dma_API, you must #include <linux/dma-mapping.h>.  This
+To get the DMA API, you must #include <linux/dma-mapping.h>.  This
 provides dma_addr_t and the interfaces described below.
 
 A dma_addr_t can hold any valid DMA address for the platform.  It can be
@@ -76,7 +76,7 @@ may only be called with IRQs enabled.
 Part Ib - Using small DMA-coherent buffers
 ------------------------------------------
 
-To get this part of the dma_API, you must #include <linux/dmapool.h>
+To get this part of the DMA API, you must #include <linux/dmapool.h>
 
 Many drivers need lots of small DMA-coherent memory regions for DMA
 descriptors or I/O buffers.  Rather than allocating in units of a page
@@ -247,7 +247,7 @@ Maps a piece of processor virtual memory so it can be accessed by the
 device and returns the DMA address of the memory.
 
 The direction for both APIs may be converted freely by casting.
-However the dma_API uses a strongly typed enumerator for its
+However the DMA API uses a strongly typed enumerator for its
 direction:
 
 ======================= =============================================
@@ -775,19 +775,19 @@ memory or doing partial flushes.
 	of two for easy alignment.
 
 
-Part III - Debug drivers use of the DMA-API
+Part III - Debug drivers use of the DMA API
 -------------------------------------------
 
-The DMA-API as described above has some constraints. DMA addresses must be
+The DMA API as described above has some constraints. DMA addresses must be
 released with the corresponding function with the same size for example. With
 the advent of hardware IOMMUs it becomes more and more important that drivers
 do not violate those constraints. In the worst case such a violation can
 result in data corruption up to destroyed filesystems.
 
-To debug drivers and find bugs in the usage of the DMA-API checking code can
+To debug drivers and find bugs in the usage of the DMA API checking code can
 be compiled into the kernel which will tell the developer about those
 violations. If your architecture supports it you can select the "Enable
-debugging of DMA-API usage" option in your kernel configuration. Enabling this
+debugging of DMA API usage" option in your kernel configuration. Enabling this
 option has a performance impact. Do not enable it in production kernels.
 
 If you boot the resulting kernel will contain code which does some bookkeeping
@@ -826,7 +826,7 @@ example warning message may look like this::
 	<EOI> <4>---[ end trace f6435a98e2a38c0e ]---
 
 The driver developer can find the driver and the device including a stacktrace
-of the DMA-API call which caused this warning.
+of the DMA API call which caused this warning.
 
 Per default only the first error will result in a warning message. All other
 errors will only silently counted. This limitation exist to prevent the code
@@ -834,7 +834,7 @@ from flooding your kernel log. To support debugging a device driver this can
 be disabled via debugfs. See the debugfs interface documentation below for
 details.
 
-The debugfs directory for the DMA-API debugging code is called dma-api/. In
+The debugfs directory for the DMA API debugging code is called dma-api/. In
 this directory the following files can currently be found:
 
 =============================== ===============================================
@@ -882,7 +882,7 @@ dma-api/driver_filter		You can write a name of a driver into this file
 
 If you have this code compiled into your kernel it will be enabled by default.
 If you want to boot without the bookkeeping anyway you can provide
-'dma_debug=off' as a boot parameter. This will disable DMA-API debugging.
+'dma_debug=off' as a boot parameter. This will disable DMA API debugging.
 Notice that you can not enable it again at runtime. You have to reboot to do
 so.
 
-- 
2.49.0



* [PATCH 2/8] docs: dma-api: replace consistent with coherent
  2025-06-24 13:39 [PATCH 0/8] update DMA API documentation Petr Tesarik
  2025-06-24 13:39 ` [PATCH 1/8] docs: dma-api: use "DMA API" consistently throughout the document Petr Tesarik
@ 2025-06-24 13:39 ` Petr Tesarik
  2025-06-26  4:51   ` Petr Tesarik
  2025-06-24 13:39 ` [PATCH 3/8] docs: dma-api: remove remnants of PCI DMA API Petr Tesarik
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Petr Tesarik @ 2025-06-24 13:39 UTC (permalink / raw)
  To: Jonathan Corbet, Andrew Morton
  Cc: Marek Szyprowski, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT,
	Petr Tesarik

For consistency, always use the term "coherent" when talking about memory
that is not subject to CPU caching effects. The term "consistent" is a
relic of the long-removed pci_alloc_consistent() function.

Signed-off-by: Petr Tesarik <ptesarik@suse.com>
---
 Documentation/core-api/dma-api-howto.rst | 36 ++++++++++++------------
 Documentation/core-api/dma-api.rst       | 14 ++++-----
 mm/dmapool.c                             |  6 ++--
 3 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/Documentation/core-api/dma-api-howto.rst b/Documentation/core-api/dma-api-howto.rst
index 0bf31b6c4383..96fce2a9aa90 100644
--- a/Documentation/core-api/dma-api-howto.rst
+++ b/Documentation/core-api/dma-api-howto.rst
@@ -155,7 +155,7 @@ a device with limitations, it needs to be decreased.
 
 Special note about PCI: PCI-X specification requires PCI-X devices to support
 64-bit addressing (DAC) for all transactions.  And at least one platform (SGI
-SN2) requires 64-bit consistent allocations to operate correctly when the IO
+SN2) requires 64-bit coherent allocations to operate correctly when the IO
 bus is in PCI-X mode.
 
 For correct operation, you must set the DMA mask to inform the kernel about
@@ -174,7 +174,7 @@ used instead:
 
 		int dma_set_mask(struct device *dev, u64 mask);
 
-	The setup for consistent allocations is performed via a call
+	The setup for coherent allocations is performed via a call
 	to dma_set_coherent_mask()::
 
 		int dma_set_coherent_mask(struct device *dev, u64 mask);
@@ -241,7 +241,7 @@ it would look like this::
 
 The coherent mask will always be able to set the same or a smaller mask as
 the streaming mask. However for the rare case that a device driver only
-uses consistent allocations, one would have to check the return value from
+uses coherent allocations, one would have to check the return value from
 dma_set_coherent_mask().
 
 Finally, if your device can only drive the low 24-bits of
@@ -298,20 +298,20 @@ Types of DMA mappings
 
 There are two types of DMA mappings:
 
-- Consistent DMA mappings which are usually mapped at driver
+- Coherent DMA mappings which are usually mapped at driver
   initialization, unmapped at the end and for which the hardware should
   guarantee that the device and the CPU can access the data
   in parallel and will see updates made by each other without any
   explicit software flushing.
 
-  Think of "consistent" as "synchronous" or "coherent".
+  Think of "coherent" as "synchronous".
 
-  The current default is to return consistent memory in the low 32
+  The current default is to return coherent memory in the low 32
   bits of the DMA space.  However, for future compatibility you should
-  set the consistent mask even if this default is fine for your
+  set the coherent mask even if this default is fine for your
   driver.
 
-  Good examples of what to use consistent mappings for are:
+  Good examples of what to use coherent mappings for are:
 
 	- Network card DMA ring descriptors.
 	- SCSI adapter mailbox command data structures.
@@ -320,13 +320,13 @@ There are two types of DMA mappings:
 
   The invariant these examples all require is that any CPU store
   to memory is immediately visible to the device, and vice
-  versa.  Consistent mappings guarantee this.
+  versa.  Coherent mappings guarantee this.
 
   .. important::
 
-	     Consistent DMA memory does not preclude the usage of
+	     Coherent DMA memory does not preclude the usage of
 	     proper memory barriers.  The CPU may reorder stores to
-	     consistent memory just as it may normal memory.  Example:
+	     coherent memory just as it may normal memory.  Example:
 	     if it is important for the device to see the first word
 	     of a descriptor updated before the second, you must do
 	     something like::
@@ -365,10 +365,10 @@ Also, systems with caches that aren't DMA-coherent will work better
 when the underlying buffers don't share cache lines with other data.
 
 
-Using Consistent DMA mappings
-=============================
+Using Coherent DMA mappings
+===========================
 
-To allocate and map large (PAGE_SIZE or so) consistent DMA regions,
+To allocate and map large (PAGE_SIZE or so) coherent DMA regions,
 you should do::
 
 	dma_addr_t dma_handle;
@@ -385,10 +385,10 @@ __get_free_pages() (but takes size instead of a page order).  If your
 driver needs regions sized smaller than a page, you may prefer using
 the dma_pool interface, described below.
 
-The consistent DMA mapping interfaces, will by default return a DMA address
+The coherent DMA mapping interfaces, will by default return a DMA address
 which is 32-bit addressable.  Even if the device indicates (via the DMA mask)
-that it may address the upper 32-bits, consistent allocation will only
-return > 32-bit addresses for DMA if the consistent DMA mask has been
+that it may address the upper 32-bits, coherent allocation will only
+return > 32-bit addresses for DMA if the coherent DMA mask has been
 explicitly changed via dma_set_coherent_mask().  This is true of the
 dma_pool interface as well.
 
@@ -497,7 +497,7 @@ program address space.  Such platforms can and do report errors in the
 kernel logs when the DMA controller hardware detects violation of the
 permission setting.
 
-Only streaming mappings specify a direction, consistent mappings
+Only streaming mappings specify a direction, coherent mappings
 implicitly have a direction attribute setting of
 DMA_BIDIRECTIONAL.
 
diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
index 97f42c15f5e4..c0a2cc7d0b95 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -8,9 +8,9 @@ This document describes the DMA API.  For a more gentle introduction
 of the API (and actual examples), see Documentation/core-api/dma-api-howto.rst.
 
 This API is split into two pieces.  Part I describes the basic API.
-Part II describes extensions for supporting non-consistent memory
+Part II describes extensions for supporting non-coherent memory
 machines.  Unless you know that your driver absolutely has to support
-non-consistent platforms (this is usually only legacy platforms) you
+non-coherent platforms (this is usually only legacy platforms) you
 should only use the API described in part I.
 
 Part I - DMA API
@@ -33,13 +33,13 @@ Part Ia - Using large DMA-coherent buffers
 	dma_alloc_coherent(struct device *dev, size_t size,
 			   dma_addr_t *dma_handle, gfp_t flag)
 
-Consistent memory is memory for which a write by either the device or
+Coherent memory is memory for which a write by either the device or
 the processor can immediately be read by the processor or device
 without having to worry about caching effects.  (You may however need
 to make sure to flush the processor's write buffers before telling
 devices to read that memory.)
 
-This routine allocates a region of <size> bytes of consistent memory.
+This routine allocates a region of <size> bytes of coherent memory.
 
 It returns a pointer to the allocated region (in the processor's virtual
 address space) or NULL if the allocation failed.
@@ -48,9 +48,9 @@ It also returns a <dma_handle> which may be cast to an unsigned integer the
 same width as the bus and given to the device as the DMA address base of
 the region.
 
-Note: consistent memory can be expensive on some platforms, and the
+Note: coherent memory can be expensive on some platforms, and the
 minimum allocation length may be as big as a page, so you should
-consolidate your requests for consistent memory as much as possible.
+consolidate your requests for coherent memory as much as possible.
 The simplest way to do that is to use the dma_pool calls (see below).
 
 The flag parameter (dma_alloc_coherent() only) allows the caller to
@@ -64,7 +64,7 @@ the returned memory, like GFP_DMA).
 	dma_free_coherent(struct device *dev, size_t size, void *cpu_addr,
 			  dma_addr_t dma_handle)
 
-Free a region of consistent memory you previously allocated.  dev,
+Free a region of coherent memory you previously allocated.  dev,
 size and dma_handle must all be the same as those passed into
 dma_alloc_coherent().  cpu_addr must be the virtual address returned by
 the dma_alloc_coherent().
diff --git a/mm/dmapool.c b/mm/dmapool.c
index 5be8cc1c6529..5d8af6e29127 100644
--- a/mm/dmapool.c
+++ b/mm/dmapool.c
@@ -200,7 +200,7 @@ static void pool_block_push(struct dma_pool *pool, struct dma_block *block,
 
 
 /**
- * dma_pool_create_node - Creates a pool of consistent memory blocks, for dma.
+ * dma_pool_create_node - Creates a pool of coherent DMA memory blocks.
  * @name: name of pool, for diagnostics
  * @dev: device that will be doing the DMA
  * @size: size of the blocks in this pool.
@@ -210,7 +210,7 @@ static void pool_block_push(struct dma_pool *pool, struct dma_block *block,
  * Context: not in_interrupt()
  *
  * Given one of these pools, dma_pool_alloc()
- * may be used to allocate memory.  Such memory will all have "consistent"
+ * may be used to allocate memory.  Such memory will all have coherent
  * DMA mappings, accessible by the device and its driver without using
  * cache flushing primitives.  The actual size of blocks allocated may be
  * larger than requested because of alignment.
@@ -395,7 +395,7 @@ void dma_pool_destroy(struct dma_pool *pool)
 EXPORT_SYMBOL(dma_pool_destroy);
 
 /**
- * dma_pool_alloc - get a block of consistent memory
+ * dma_pool_alloc - get a block of coherent memory
  * @pool: dma pool that will produce the block
  * @mem_flags: GFP_* bitmask
  * @handle: pointer to dma address of block
-- 
2.49.0



* [PATCH 3/8] docs: dma-api: remove remnants of PCI DMA API
  2025-06-24 13:39 [PATCH 0/8] update DMA API documentation Petr Tesarik
  2025-06-24 13:39 ` [PATCH 1/8] docs: dma-api: use "DMA API" consistently throughout the document Petr Tesarik
  2025-06-24 13:39 ` [PATCH 2/8] docs: dma-api: replace consistent with coherent Petr Tesarik
@ 2025-06-24 13:39 ` Petr Tesarik
  2025-06-26  1:46   ` Bagas Sanjaya
  2025-06-24 13:39 ` [PATCH 4/8] docs: dma-api: add a kernel-doc comment for dma_pool_zalloc() Petr Tesarik
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Petr Tesarik @ 2025-06-24 13:39 UTC (permalink / raw)
  To: Jonathan Corbet, Andrew Morton
  Cc: Marek Szyprowski, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT,
	Petr Tesarik

The wording sometimes suggests there are multiple functions for an
operation. This was in fact the case before the PCI DMA API was removed,
but since there is only one API now, the documentation has become confusing.

To improve readability:

* Remove implicit references to the PCI DMA API (plurals, use of "both",
  etc.)

* Where possible, refer to an actual function rather than a more generic
  description of the operation.

Signed-off-by: Petr Tesarik <ptesarik@suse.com>
---
 Documentation/core-api/dma-api.rst | 25 ++++++++++---------------
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
index c0a2cc7d0b95..3e89e3b0ecfd 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -53,10 +53,9 @@ minimum allocation length may be as big as a page, so you should
 consolidate your requests for coherent memory as much as possible.
 The simplest way to do that is to use the dma_pool calls (see below).
 
-The flag parameter (dma_alloc_coherent() only) allows the caller to
-specify the ``GFP_`` flags (see kmalloc()) for the allocation (the
-implementation may choose to ignore flags that affect the location of
-the returned memory, like GFP_DMA).
+The flag parameter allows the caller to specify the ``GFP_`` flags (see
+kmalloc()) for the allocation (the implementation may ignore flags that affect
+the location of the returned memory, like GFP_DMA).
 
 ::
 
@@ -64,13 +63,12 @@ the returned memory, like GFP_DMA).
 	dma_free_coherent(struct device *dev, size_t size, void *cpu_addr,
 			  dma_addr_t dma_handle)
 
-Free a region of coherent memory you previously allocated.  dev,
-size and dma_handle must all be the same as those passed into
-dma_alloc_coherent().  cpu_addr must be the virtual address returned by
-the dma_alloc_coherent().
+Free a previously allocated region of coherent memory.  dev, size and dma_handle
+must all be the same as those passed into dma_alloc_coherent().  cpu_addr must
+be the virtual address returned by dma_alloc_coherent().
 
-Note that unlike their sibling allocation calls, these routines
-may only be called with IRQs enabled.
+Note that unlike the sibling allocation call, this routine may only be called
+with IRQs enabled.
 
 
 Part Ib - Using small DMA-coherent buffers
@@ -246,9 +244,7 @@ Part Id - Streaming DMA mappings
 Maps a piece of processor virtual memory so it can be accessed by the
 device and returns the DMA address of the memory.
 
-The direction for both APIs may be converted freely by casting.
-However the DMA API uses a strongly typed enumerator for its
-direction:
+The DMA API uses a strongly typed enumerator for its direction:
 
 ======================= =============================================
 DMA_NONE		no direction (used for debugging)
@@ -325,8 +321,7 @@ DMA_BIDIRECTIONAL	direction isn't known
 			 enum dma_data_direction direction)
 
 Unmaps the region previously mapped.  All the parameters passed in
-must be identical to those passed in (and returned) by the mapping
-API.
+must be identical to those passed to (and returned by) dma_map_single().
 
 ::
 
-- 
2.49.0



* [PATCH 4/8] docs: dma-api: add a kernel-doc comment for dma_pool_zalloc()
  2025-06-24 13:39 [PATCH 0/8] update DMA API documentation Petr Tesarik
                   ` (2 preceding siblings ...)
  2025-06-24 13:39 ` [PATCH 3/8] docs: dma-api: remove remnants of PCI DMA API Petr Tesarik
@ 2025-06-24 13:39 ` Petr Tesarik
  2025-06-24 13:39 ` [PATCH 5/8] docs: dma-api: remove duplicate description of the DMA pool API Petr Tesarik
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 30+ messages in thread
From: Petr Tesarik @ 2025-06-24 13:39 UTC (permalink / raw)
  To: Jonathan Corbet, Andrew Morton
  Cc: Marek Szyprowski, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT,
	Petr Tesarik

Document the dma_pool_zalloc() wrapper.

Signed-off-by: Petr Tesarik <ptesarik@suse.com>
---
 Documentation/core-api/mm-api.rst | 2 ++
 include/linux/dmapool.h           | 8 ++++++++
 2 files changed, 10 insertions(+)

diff --git a/Documentation/core-api/mm-api.rst b/Documentation/core-api/mm-api.rst
index af8151db88b2..a61766328ac0 100644
--- a/Documentation/core-api/mm-api.rst
+++ b/Documentation/core-api/mm-api.rst
@@ -97,6 +97,8 @@ DMA pools
 .. kernel-doc:: mm/dmapool.c
    :export:
 
+.. kernel-doc:: include/linux/dmapool.h
+
 More Memory Management Functions
 ================================
 
diff --git a/include/linux/dmapool.h b/include/linux/dmapool.h
index 06c4de602b2f..c0c7717d3ae7 100644
--- a/include/linux/dmapool.h
+++ b/include/linux/dmapool.h
@@ -60,6 +60,14 @@ static inline struct dma_pool *dma_pool_create(const char *name,
 				    NUMA_NO_NODE);
 }
 
+/**
+ * dma_pool_zalloc - Get a zero-initialized block of DMA coherent memory.
+ * @pool: dma pool that will produce the block
+ * @mem_flags: GFP_* bitmask
+ * @handle: pointer to dma address of block
+ *
+ * Same as @dma_pool_alloc, but the returned memory is zeroed.
+ */
 static inline void *dma_pool_zalloc(struct dma_pool *pool, gfp_t mem_flags,
 				    dma_addr_t *handle)
 {
-- 
2.49.0



* [PATCH 5/8] docs: dma-api: remove duplicate description of the DMA pool API
  2025-06-24 13:39 [PATCH 0/8] update DMA API documentation Petr Tesarik
                   ` (3 preceding siblings ...)
  2025-06-24 13:39 ` [PATCH 4/8] docs: dma-api: add a kernel-doc comment for dma_pool_zalloc() Petr Tesarik
@ 2025-06-24 13:39 ` Petr Tesarik
  2025-06-25  2:40   ` Randy Dunlap
  2025-06-24 13:39 ` [PATCH 6/8] docs: dma-api: clarify DMA addressing limitations Petr Tesarik
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Petr Tesarik @ 2025-06-24 13:39 UTC (permalink / raw)
  To: Jonathan Corbet, Andrew Morton
  Cc: Marek Szyprowski, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT,
	Petr Tesarik

The DMA pool API is documented in Memory Management APIs. Do not duplicate
it in the DMA API documentation.

Signed-off-by: Petr Tesarik <ptesarik@suse.com>
---
 Documentation/core-api/dma-api.rst | 62 +-----------------------------
 Documentation/core-api/mm-api.rst  |  2 +
 2 files changed, 4 insertions(+), 60 deletions(-)

diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
index 3e89e3b0ecfd..f7fddaf7510c 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -83,66 +83,8 @@ much like a struct kmem_cache, except that they use the DMA-coherent allocator,
 not __get_free_pages().  Also, they understand common hardware constraints
 for alignment, like queue heads needing to be aligned on N-byte boundaries.
 
-
-::
-
-	struct dma_pool *
-	dma_pool_create(const char *name, struct device *dev,
-			size_t size, size_t align, size_t alloc);
-
-dma_pool_create() initializes a pool of DMA-coherent buffers
-for use with a given device.  It must be called in a context which
-can sleep.
-
-The "name" is for diagnostics (like a struct kmem_cache name); dev and size
-are like what you'd pass to dma_alloc_coherent().  The device's hardware
-alignment requirement for this type of data is "align" (which is expressed
-in bytes, and must be a power of two).  If your device has no boundary
-crossing restrictions, pass 0 for alloc; passing 4096 says memory allocated
-from this pool must not cross 4KByte boundaries.
-
-::
-
-	void *
-	dma_pool_zalloc(struct dma_pool *pool, gfp_t mem_flags,
-		        dma_addr_t *handle)
-
-Wraps dma_pool_alloc() and also zeroes the returned memory if the
-allocation attempt succeeded.
-
-
-::
-
-	void *
-	dma_pool_alloc(struct dma_pool *pool, gfp_t gfp_flags,
-		       dma_addr_t *dma_handle);
-
-This allocates memory from the pool; the returned memory will meet the
-size and alignment requirements specified at creation time.  Pass
-GFP_ATOMIC to prevent blocking, or if it's permitted (not
-in_interrupt, not holding SMP locks), pass GFP_KERNEL to allow
-blocking.  Like dma_alloc_coherent(), this returns two values:  an
-address usable by the CPU, and the DMA address usable by the pool's
-device.
-
-::
-
-	void
-	dma_pool_free(struct dma_pool *pool, void *vaddr,
-		      dma_addr_t addr);
-
-This puts memory back into the pool.  The pool is what was passed to
-dma_pool_alloc(); the CPU (vaddr) and DMA addresses are what
-were returned when that routine allocated the memory being freed.
-
-::
-
-	void
-	dma_pool_destroy(struct dma_pool *pool);
-
-dma_pool_destroy() frees the resources of the pool.  It must be
-called in a context which can sleep.  Make sure you've freed all allocated
-memory back to the pool before you destroy it.
+See :ref:`Documentation/core-api/mm-api.rst <dma_pools>` for a detailed
+description of the DMA pools API.
 
 
 Part Ic - DMA addressing limitations
diff --git a/Documentation/core-api/mm-api.rst b/Documentation/core-api/mm-api.rst
index a61766328ac0..de0bab6e3fdd 100644
--- a/Documentation/core-api/mm-api.rst
+++ b/Documentation/core-api/mm-api.rst
@@ -91,6 +91,8 @@ Memory pools
 .. kernel-doc:: mm/mempool.c
    :export:
 
+.. _dma_pools:
+
 DMA pools
 =========
 
-- 
2.49.0



* [PATCH 6/8] docs: dma-api: clarify DMA addressing limitations
  2025-06-24 13:39 [PATCH 0/8] update DMA API documentation Petr Tesarik
                   ` (4 preceding siblings ...)
  2025-06-24 13:39 ` [PATCH 5/8] docs: dma-api: remove duplicate description of the DMA pool API Petr Tesarik
@ 2025-06-24 13:39 ` Petr Tesarik
  2025-06-26  1:47   ` Bagas Sanjaya
  2025-06-24 13:39 ` [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints Petr Tesarik
  2025-06-24 13:39 ` [PATCH 8/8] docs: dma-api: clean up documentation of dma_map_sg() Petr Tesarik
  7 siblings, 1 reply; 30+ messages in thread
From: Petr Tesarik @ 2025-06-24 13:39 UTC (permalink / raw)
  To: Jonathan Corbet, Andrew Morton
  Cc: Marek Szyprowski, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT,
	Petr Tesarik

Move the description of the DMA mask from the documentation of
dma_map_single() to Part Ic - DMA addressing limitations, and improve the
wording.

Explain when a mask setting function may fail, and do not repeat this
explanation for each individual function.

Clarify which device parameters are updated by each mask setting function.

Signed-off-by: Petr Tesarik <ptesarik@suse.com>
---
 Documentation/core-api/dma-api.rst | 35 +++++++++++++++---------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
index f7fddaf7510c..cd432996949c 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -90,13 +90,20 @@ description of the DMA pools API.
 Part Ic - DMA addressing limitations
 ------------------------------------
 
+DMA mask is a bit mask of the addressable region for the device. In other words,
+if applying the DMA mask (a bitwise AND operation) to the DMA address of a
+memory region does not clear any bits in the address, then the device can
+perform DMA to that memory region.
+
+All the below functions which set a DMA mask may fail if the requested mask
+cannot be used with the device, or if the device is not capable of doing DMA.
+
 ::
 
 	int
 	dma_set_mask_and_coherent(struct device *dev, u64 mask)
 
-Checks to see if the mask is possible and updates the device
-streaming and coherent DMA mask parameters if it is.
+Updates both streaming and coherent DMA masks.
 
 Returns: 0 if successful and a negative error if not.
 
@@ -105,8 +112,7 @@ Returns: 0 if successful and a negative error if not.
 	int
 	dma_set_mask(struct device *dev, u64 mask)
 
-Checks to see if the mask is possible and updates the device
-parameters if it is.
+Updates only the streaming DMA mask.
 
 Returns: 0 if successful and a negative error if not.
 
@@ -115,8 +121,7 @@ Returns: 0 if successful and a negative error if not.
 	int
 	dma_set_coherent_mask(struct device *dev, u64 mask)
 
-Checks to see if the mask is possible and updates the device
-parameters if it is.
+Updates only the coherent DMA mask.
 
 Returns: 0 if successful and a negative error if not.
 
@@ -171,7 +176,7 @@ transfer memory ownership.  Returns %false if those calls can be skipped.
 	unsigned long
 	dma_get_merge_boundary(struct device *dev);
 
-Returns the DMA merge boundary. If the device cannot merge any the DMA address
+Returns the DMA merge boundary. If the device cannot merge any DMA address
 segments, the function returns 0.
 
 Part Id - Streaming DMA mappings
@@ -205,16 +210,12 @@ DMA_BIDIRECTIONAL	direction isn't known
 	this API should be obtained from sources which guarantee it to be
 	physically contiguous (like kmalloc).
 
-	Further, the DMA address of the memory must be within the
-	dma_mask of the device (the dma_mask is a bit mask of the
-	addressable region for the device, i.e., if the DMA address of
-	the memory ANDed with the dma_mask is still equal to the DMA
-	address, then the device can perform DMA to the memory).  To
-	ensure that the memory allocated by kmalloc is within the dma_mask,
-	the driver may specify various platform-dependent flags to restrict
-	the DMA address range of the allocation (e.g., on x86, GFP_DMA
-	guarantees to be within the first 16MB of available DMA addresses,
-	as required by ISA devices).
+	Further, the DMA address of the memory must be within the dma_mask of
+	the device.  To ensure that the memory allocated by kmalloc is within
+	the dma_mask, the driver may specify various platform-dependent flags
+	to restrict the DMA address range of the allocation (e.g., on x86,
+	GFP_DMA guarantees to be within the first 16MB of available DMA
+	addresses, as required by ISA devices).
 
 	Note also that the above constraints on physical contiguity and
 	dma_mask may not apply if the platform has an IOMMU (a device which
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints
  2025-06-24 13:39 [PATCH 0/8] update DMA API documentation Petr Tesarik
                   ` (5 preceding siblings ...)
  2025-06-24 13:39 ` [PATCH 6/8] docs: dma-api: clarify DMA addressing limitations Petr Tesarik
@ 2025-06-24 13:39 ` Petr Tesarik
  2025-06-26  1:49   ` Bagas Sanjaya
  2025-06-24 13:39 ` [PATCH 8/8] docs: dma-api: clean up documentation of dma_map_sg() Petr Tesarik
  7 siblings, 1 reply; 30+ messages in thread
From: Petr Tesarik @ 2025-06-24 13:39 UTC (permalink / raw)
  To: Jonathan Corbet, Morton
  Cc: Marek Szyprowski, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT,
	Petr Tesarik

Clarify that SWIOTLB also allows the streaming DMA API to be used with any
physical address. Remove the requirement to use platform-dependent flags to
allocate buffers for dma_map_single().

Do not claim that platforms with an IOMMU may not require physically
contiguous buffers. Although the claim is generally correct, it is
misleading, because the current implementation of the streaming DMA API
explicitly rejects vmalloc addresses, whether or not an IOMMU is
present.

Signed-off-by: Petr Tesarik <ptesarik@suse.com>
---
 Documentation/core-api/dma-api.rst | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
index cd432996949c..65132ec88104 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -210,18 +210,12 @@ DMA_BIDIRECTIONAL	direction isn't known
 	this API should be obtained from sources which guarantee it to be
 	physically contiguous (like kmalloc).
 
-	Further, the DMA address of the memory must be within the dma_mask of
-	the device.  To ensure that the memory allocated by kmalloc is within
-	the dma_mask, the driver may specify various platform-dependent flags
-	to restrict the DMA address range of the allocation (e.g., on x86,
-	GFP_DMA guarantees to be within the first 16MB of available DMA
-	addresses, as required by ISA devices).
-
-	Note also that the above constraints on physical contiguity and
-	dma_mask may not apply if the platform has an IOMMU (a device which
-	maps an I/O DMA address to a physical memory address).  However, to be
-	portable, device driver writers may *not* assume that such an IOMMU
-	exists.
+	Mapping may also fail if the memory is not within the DMA mask of the
+	device.  However, this constraint does not apply if the platform has
+	an IOMMU (a device which maps an I/O DMA address to a physical memory
+	address), or the kernel is configured with SWIOTLB (bounce buffers).
+	It is reasonable to assume that at least one of these mechanisms
+	allows streaming DMA to any physical address.
 
 .. warning::
 
-- 
2.49.0



* [PATCH 8/8] docs: dma-api: clean up documentation of dma_map_sg()
  2025-06-24 13:39 [PATCH 0/8] update DMA API documentation Petr Tesarik
                   ` (6 preceding siblings ...)
  2025-06-24 13:39 ` [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints Petr Tesarik
@ 2025-06-24 13:39 ` Petr Tesarik
  2025-06-26  1:50   ` Bagas Sanjaya
  7 siblings, 1 reply; 30+ messages in thread
From: Petr Tesarik @ 2025-06-24 13:39 UTC (permalink / raw)
  To: Jonathan Corbet, Morton
  Cc: Marek Szyprowski, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT,
	Petr Tesarik

Describe in one sentence what the function does.

Do not repeat the example situations in which the returned number is lower
than the number of input segments.

Signed-off-by: Petr Tesarik <ptesarik@suse.com>
---
 Documentation/core-api/dma-api.rst | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
index 65132ec88104..f5aadb7f8626 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -308,10 +308,10 @@ action (e.g. reduce current DMA mapping usage or delay and try again later).
 	dma_map_sg(struct device *dev, struct scatterlist *sg,
 		   int nents, enum dma_data_direction direction)
 
-Returns: the number of DMA address segments mapped (this may be shorter
-than <nents> passed in if some elements of the scatter/gather list are
-physically or virtually adjacent and an IOMMU maps them with a single
-entry).
+Maps a scatter/gather list for DMA. Returns the number of DMA address segments
+mapped, which may be smaller than <nents> passed in if several consecutive
+sglist entries are merged (e.g. with an IOMMU, or if some adjacent segments
+just happen to be physically contiguous).
 
 Please note that the sg cannot be mapped again if it has been mapped once.
 The mapping process is allowed to destroy information in the sg.
@@ -335,9 +335,8 @@ With scatterlists, you use the resulting mapping like this::
 where nents is the number of entries in the sglist.
 
 The implementation is free to merge several consecutive sglist entries
-into one (e.g. with an IOMMU, or if several pages just happen to be
-physically contiguous) and returns the actual number of sg entries it
-mapped them to. On failure 0, is returned.
+into one.  The returned number is the actual number of sg entries it
+mapped them to. On failure, 0 is returned.
 
 Then you should loop count times (note: this can be less than nents times)
 and use sg_dma_address() and sg_dma_len() macros where you previously
-- 
2.49.0



* Re: [PATCH 5/8] docs: dma-api: remove duplicate description of the DMA pool API
  2025-06-24 13:39 ` [PATCH 5/8] docs: dma-api: remove duplicate description of the DMA pool API Petr Tesarik
@ 2025-06-25  2:40   ` Randy Dunlap
  2025-06-25  6:41     ` Petr Tesarik
  0 siblings, 1 reply; 30+ messages in thread
From: Randy Dunlap @ 2025-06-25  2:40 UTC (permalink / raw)
  To: Petr Tesarik, Jonathan Corbet, Morton
  Cc: Marek Szyprowski, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT

Hi,

On 6/24/25 6:39 AM, Petr Tesarik wrote:
> The DMA pool API is documented in Memory Management APIs. Do not duplicate
> it in DMA API documentation.
> 

This looks like it works (from just visual inspection), but I'm wondering
why not just move all DMA API interfaces to dma-api.rst and don't have any
in mm-api.rst... ?

Thanks.

> Signed-off-by: Petr Tesarik <ptesarik@suse.com>
> ---
>  Documentation/core-api/dma-api.rst | 62 +-----------------------------
>  Documentation/core-api/mm-api.rst  |  2 +
>  2 files changed, 4 insertions(+), 60 deletions(-)
> 
> diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
> index 3e89e3b0ecfd..f7fddaf7510c 100644
> --- a/Documentation/core-api/dma-api.rst
> +++ b/Documentation/core-api/dma-api.rst
> @@ -83,66 +83,8 @@ much like a struct kmem_cache, except that they use the DMA-coherent allocator,
>  not __get_free_pages().  Also, they understand common hardware constraints
>  for alignment, like queue heads needing to be aligned on N-byte boundaries.
>  
> -
> -::
> -
> -	struct dma_pool *
> -	dma_pool_create(const char *name, struct device *dev,
> -			size_t size, size_t align, size_t alloc);
> -
> -dma_pool_create() initializes a pool of DMA-coherent buffers
> -for use with a given device.  It must be called in a context which
> -can sleep.
> -
> -The "name" is for diagnostics (like a struct kmem_cache name); dev and size
> -are like what you'd pass to dma_alloc_coherent().  The device's hardware
> -alignment requirement for this type of data is "align" (which is expressed
> -in bytes, and must be a power of two).  If your device has no boundary
> -crossing restrictions, pass 0 for alloc; passing 4096 says memory allocated
> -from this pool must not cross 4KByte boundaries.
> -
> -::
> -
> -	void *
> -	dma_pool_zalloc(struct dma_pool *pool, gfp_t mem_flags,
> -		        dma_addr_t *handle)
> -
> -Wraps dma_pool_alloc() and also zeroes the returned memory if the
> -allocation attempt succeeded.
> -
> -
> -::
> -
> -	void *
> -	dma_pool_alloc(struct dma_pool *pool, gfp_t gfp_flags,
> -		       dma_addr_t *dma_handle);
> -
> -This allocates memory from the pool; the returned memory will meet the
> -size and alignment requirements specified at creation time.  Pass
> -GFP_ATOMIC to prevent blocking, or if it's permitted (not
> -in_interrupt, not holding SMP locks), pass GFP_KERNEL to allow
> -blocking.  Like dma_alloc_coherent(), this returns two values:  an
> -address usable by the CPU, and the DMA address usable by the pool's
> -device.
> -
> -::
> -
> -	void
> -	dma_pool_free(struct dma_pool *pool, void *vaddr,
> -		      dma_addr_t addr);
> -
> -This puts memory back into the pool.  The pool is what was passed to
> -dma_pool_alloc(); the CPU (vaddr) and DMA addresses are what
> -were returned when that routine allocated the memory being freed.
> -
> -::
> -
> -	void
> -	dma_pool_destroy(struct dma_pool *pool);
> -
> -dma_pool_destroy() frees the resources of the pool.  It must be
> -called in a context which can sleep.  Make sure you've freed all allocated
> -memory back to the pool before you destroy it.
> +See :ref:`Documentation/core-api/mm-api.rst <dma_pools>` for a detailed
> +description of the DMA pools API.
>  
>  
>  Part Ic - DMA addressing limitations
> diff --git a/Documentation/core-api/mm-api.rst b/Documentation/core-api/mm-api.rst
> index a61766328ac0..de0bab6e3fdd 100644
> --- a/Documentation/core-api/mm-api.rst
> +++ b/Documentation/core-api/mm-api.rst
> @@ -91,6 +91,8 @@ Memory pools
>  .. kernel-doc:: mm/mempool.c
>     :export:
>  
> +.. _dma_pools:
> +
>  DMA pools
>  =========
>  

-- 
~Randy



* Re: [PATCH 1/8] docs: dma-api: use "DMA API" consistently throughout the document
  2025-06-24 13:39 ` [PATCH 1/8] docs: dma-api: use "DMA API" consistently throughout the document Petr Tesarik
@ 2025-06-25  2:41   ` Randy Dunlap
  0 siblings, 0 replies; 30+ messages in thread
From: Randy Dunlap @ 2025-06-25  2:41 UTC (permalink / raw)
  To: Petr Tesarik, Jonathan Corbet, Morton
  Cc: Marek Szyprowski, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT

Hi Petr,

On 6/24/25 6:39 AM, Petr Tesarik wrote:
> Make sure that all occurrences are spelled "DMA API" (all uppercase, no
> hyphen, no underscore).
> 
> Signed-off-by: Petr Tesarik <ptesarik@suse.com>

LGTM. Thanks.

Reviewed-by: Randy Dunlap <rdunlap@infradead.org>

> ---
>  Documentation/core-api/dma-api.rst | 22 +++++++++++-----------
>  1 file changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
> index 2ad08517e626..97f42c15f5e4 100644
> --- a/Documentation/core-api/dma-api.rst
> +++ b/Documentation/core-api/dma-api.rst
> @@ -13,10 +13,10 @@ machines.  Unless you know that your driver absolutely has to support
>  non-consistent platforms (this is usually only legacy platforms) you
>  should only use the API described in part I.
>  
> -Part I - dma_API
> +Part I - DMA API
>  ----------------
>  
> -To get the dma_API, you must #include <linux/dma-mapping.h>.  This
> +To get the DMA API, you must #include <linux/dma-mapping.h>.  This
>  provides dma_addr_t and the interfaces described below.
>  
>  A dma_addr_t can hold any valid DMA address for the platform.  It can be
> @@ -76,7 +76,7 @@ may only be called with IRQs enabled.
>  Part Ib - Using small DMA-coherent buffers
>  ------------------------------------------
>  
> -To get this part of the dma_API, you must #include <linux/dmapool.h>
> +To get this part of the DMA API, you must #include <linux/dmapool.h>
>  
>  Many drivers need lots of small DMA-coherent memory regions for DMA
>  descriptors or I/O buffers.  Rather than allocating in units of a page
> @@ -247,7 +247,7 @@ Maps a piece of processor virtual memory so it can be accessed by the
>  device and returns the DMA address of the memory.
>  
>  The direction for both APIs may be converted freely by casting.
> -However the dma_API uses a strongly typed enumerator for its
> +However the DMA API uses a strongly typed enumerator for its
>  direction:
>  
>  ======================= =============================================
> @@ -775,19 +775,19 @@ memory or doing partial flushes.
>  	of two for easy alignment.
>  
>  
> -Part III - Debug drivers use of the DMA-API
> +Part III - Debug drivers use of the DMA API
>  -------------------------------------------
>  
> -The DMA-API as described above has some constraints. DMA addresses must be
> +The DMA API as described above has some constraints. DMA addresses must be
>  released with the corresponding function with the same size for example. With
>  the advent of hardware IOMMUs it becomes more and more important that drivers
>  do not violate those constraints. In the worst case such a violation can
>  result in data corruption up to destroyed filesystems.
>  
> -To debug drivers and find bugs in the usage of the DMA-API checking code can
> +To debug drivers and find bugs in the usage of the DMA API checking code can
>  be compiled into the kernel which will tell the developer about those
>  violations. If your architecture supports it you can select the "Enable
> -debugging of DMA-API usage" option in your kernel configuration. Enabling this
> +debugging of DMA API usage" option in your kernel configuration. Enabling this
>  option has a performance impact. Do not enable it in production kernels.
>  
>  If you boot the resulting kernel will contain code which does some bookkeeping
> @@ -826,7 +826,7 @@ example warning message may look like this::
>  	<EOI> <4>---[ end trace f6435a98e2a38c0e ]---
>  
>  The driver developer can find the driver and the device including a stacktrace
> -of the DMA-API call which caused this warning.
> +of the DMA API call which caused this warning.
>  
>  Per default only the first error will result in a warning message. All other
>  errors will only silently counted. This limitation exist to prevent the code
> @@ -834,7 +834,7 @@ from flooding your kernel log. To support debugging a device driver this can
>  be disabled via debugfs. See the debugfs interface documentation below for
>  details.
>  
> -The debugfs directory for the DMA-API debugging code is called dma-api/. In
> +The debugfs directory for the DMA API debugging code is called dma-api/. In
>  this directory the following files can currently be found:
>  
>  =============================== ===============================================
> @@ -882,7 +882,7 @@ dma-api/driver_filter		You can write a name of a driver into this file
>  
>  If you have this code compiled into your kernel it will be enabled by default.
>  If you want to boot without the bookkeeping anyway you can provide
> -'dma_debug=off' as a boot parameter. This will disable DMA-API debugging.
> +'dma_debug=off' as a boot parameter. This will disable DMA API debugging.
>  Notice that you can not enable it again at runtime. You have to reboot to do
>  so.
>  

-- 
~Randy


* Re: [PATCH 5/8] docs: dma-api: remove duplicate description of the DMA pool API
  2025-06-25  2:40   ` Randy Dunlap
@ 2025-06-25  6:41     ` Petr Tesarik
  0 siblings, 0 replies; 30+ messages in thread
From: Petr Tesarik @ 2025-06-25  6:41 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Jonathan Corbet, Morton, Marek Szyprowski, Leon Romanovsky,
	Keith Busch, Caleb Sander Mateos, Sagi Grimberg, Jens Axboe,
	John Garry, open list:DOCUMENTATION, open list,
	open list:MEMORY MANAGEMENT

On Tue, 24 Jun 2025 19:40:37 -0700
Randy Dunlap <rdunlap@infradead.org> wrote:

> Hi,
> 
> On 6/24/25 6:39 AM, Petr Tesarik wrote:
> > The DMA pool API is documented in Memory Management APIs. Do not duplicate
> > it in DMA API documentation.
> >   
> 
> This looks like it works (from just visual inspection), but I'm wondering
> why not just move all DMA API interfaces to dma-api.rst and don't have any
> in mm-api.rst... ?

That's also an option. As long as documentation is not repeated in more
than one place, I'm happy with the result. Now, seeing that it was you
who originally moved DMA pools from Drivers under Memory Management in
commit a80a438bd088 ("docbook: dmapool: fix fatal changed filename"), I
expect no complaints when I move it to dma-api.rst in v2.

Thanks for the idea!

Petr T


* Re: [PATCH 3/8] docs: dma-api: remove remnants of PCI DMA API
  2025-06-24 13:39 ` [PATCH 3/8] docs: dma-api: remove remnants of PCI DMA API Petr Tesarik
@ 2025-06-26  1:46   ` Bagas Sanjaya
  0 siblings, 0 replies; 30+ messages in thread
From: Bagas Sanjaya @ 2025-06-26  1:46 UTC (permalink / raw)
  To: Petr Tesarik, Jonathan Corbet, Morton
  Cc: Marek Szyprowski, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT

On Tue, Jun 24, 2025 at 03:39:18PM +0200, Petr Tesarik wrote:
> diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
> index c0a2cc7d0b95..3e89e3b0ecfd 100644
> --- a/Documentation/core-api/dma-api.rst
> +++ b/Documentation/core-api/dma-api.rst
> @@ -53,10 +53,9 @@ minimum allocation length may be as big as a page, so you should
>  consolidate your requests for coherent memory as much as possible.
>  The simplest way to do that is to use the dma_pool calls (see below).
>  
> -The flag parameter (dma_alloc_coherent() only) allows the caller to
> -specify the ``GFP_`` flags (see kmalloc()) for the allocation (the
> -implementation may choose to ignore flags that affect the location of
> -the returned memory, like GFP_DMA).
> +The flag parameter allows the caller to specify the ``GFP_`` flags (see
> +kmalloc()) for the allocation (the implementation may ignore flags that affect
> +the location of the returned memory, like GFP_DMA).
>  
>  ::
>  
> @@ -64,13 +63,12 @@ the returned memory, like GFP_DMA).
>  	dma_free_coherent(struct device *dev, size_t size, void *cpu_addr,
>  			  dma_addr_t dma_handle)
>  
> -Free a region of coherent memory you previously allocated.  dev,
> -size and dma_handle must all be the same as those passed into
> -dma_alloc_coherent().  cpu_addr must be the virtual address returned by
> -the dma_alloc_coherent().
> +Free a previously allocated region of coherent memory.  dev, size and dma_handle
> +must all be the same as those passed into dma_alloc_coherent().  cpu_addr must
> +be the virtual address returned by dma_alloc_coherent().
>  
> -Note that unlike their sibling allocation calls, these routines
> -may only be called with IRQs enabled.
> +Note that unlike the sibling allocation call, this routine may only be called
> +with IRQs enabled.
>  
>  
>  Part Ib - Using small DMA-coherent buffers
> @@ -246,9 +244,7 @@ Part Id - Streaming DMA mappings
>  Maps a piece of processor virtual memory so it can be accessed by the
>  device and returns the DMA address of the memory.
>  
> -The direction for both APIs may be converted freely by casting.
> -However the DMA API uses a strongly typed enumerator for its
> -direction:
> +The DMA API uses a strongly typed enumerator for its direction:
>  
>  ======================= =============================================
>  DMA_NONE		no direction (used for debugging)
> @@ -325,8 +321,7 @@ DMA_BIDIRECTIONAL	direction isn't known
>  			 enum dma_data_direction direction)
>  
>  Unmaps the region previously mapped.  All the parameters passed in
> -must be identical to those passed in (and returned) by the mapping
> -API.
> +must be identical to those passed to (and returned by) dma_map_single().
>  
>  ::
>  

LGTM, thanks!

Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>

-- 
An old man doll... just what I always wanted! - Clara



* Re: [PATCH 6/8] docs: dma-api: clarify DMA addressing limitations
  2025-06-24 13:39 ` [PATCH 6/8] docs: dma-api: clarify DMA addressing limitations Petr Tesarik
@ 2025-06-26  1:47   ` Bagas Sanjaya
  0 siblings, 0 replies; 30+ messages in thread
From: Bagas Sanjaya @ 2025-06-26  1:47 UTC (permalink / raw)
  To: Petr Tesarik, Jonathan Corbet, Morton
  Cc: Marek Szyprowski, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT

On Tue, Jun 24, 2025 at 03:39:21PM +0200, Petr Tesarik wrote:
> diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
> index f7fddaf7510c..cd432996949c 100644
> --- a/Documentation/core-api/dma-api.rst
> +++ b/Documentation/core-api/dma-api.rst
> @@ -90,13 +90,20 @@ description of the DMA pools API.
>  Part Ic - DMA addressing limitations
>  ------------------------------------
>  
> +DMA mask is a bit mask of the addressable region for the device. In other words,
> +if applying the DMA mask (a bitwise AND operation) to the DMA address of a
> +memory region does not clear any bits in the address, then the device can
> +perform DMA to that memory region.
> +
> +All the below functions which set a DMA mask may fail if the requested mask
> +cannot be used with the device, or if the device is not capable of doing DMA.
> +
>  ::
>  
>  	int
>  	dma_set_mask_and_coherent(struct device *dev, u64 mask)
>  
> -Checks to see if the mask is possible and updates the device
> -streaming and coherent DMA mask parameters if it is.
> +Updates both streaming and coherent DMA masks.
>  
>  Returns: 0 if successful and a negative error if not.
>  
> @@ -105,8 +112,7 @@ Returns: 0 if successful and a negative error if not.
>  	int
>  	dma_set_mask(struct device *dev, u64 mask)
>  
> -Checks to see if the mask is possible and updates the device
> -parameters if it is.
> +Updates only the streaming DMA mask.
>  
>  Returns: 0 if successful and a negative error if not.
>  
> @@ -115,8 +121,7 @@ Returns: 0 if successful and a negative error if not.
>  	int
>  	dma_set_coherent_mask(struct device *dev, u64 mask)
>  
> -Checks to see if the mask is possible and updates the device
> -parameters if it is.
> +Updates only the coherent DMA mask.
>  
>  Returns: 0 if successful and a negative error if not.
>  
> @@ -171,7 +176,7 @@ transfer memory ownership.  Returns %false if those calls can be skipped.
>  	unsigned long
>  	dma_get_merge_boundary(struct device *dev);
>  
> -Returns the DMA merge boundary. If the device cannot merge any the DMA address
> +Returns the DMA merge boundary. If the device cannot merge any DMA address
>  segments, the function returns 0.
>  
>  Part Id - Streaming DMA mappings
> @@ -205,16 +210,12 @@ DMA_BIDIRECTIONAL	direction isn't known
>  	this API should be obtained from sources which guarantee it to be
>  	physically contiguous (like kmalloc).
>  
> -	Further, the DMA address of the memory must be within the
> -	dma_mask of the device (the dma_mask is a bit mask of the
> -	addressable region for the device, i.e., if the DMA address of
> -	the memory ANDed with the dma_mask is still equal to the DMA
> -	address, then the device can perform DMA to the memory).  To
> -	ensure that the memory allocated by kmalloc is within the dma_mask,
> -	the driver may specify various platform-dependent flags to restrict
> -	the DMA address range of the allocation (e.g., on x86, GFP_DMA
> -	guarantees to be within the first 16MB of available DMA addresses,
> -	as required by ISA devices).
> +	Further, the DMA address of the memory must be within the dma_mask of
> +	the device.  To ensure that the memory allocated by kmalloc is within
> +	the dma_mask, the driver may specify various platform-dependent flags
> +	to restrict the DMA address range of the allocation (e.g., on x86,
> +	GFP_DMA guarantees to be within the first 16MB of available DMA
> +	addresses, as required by ISA devices).
>  
>  	Note also that the above constraints on physical contiguity and
>  	dma_mask may not apply if the platform has an IOMMU (a device which
 
LGTM, thanks!

Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>

-- 
An old man doll... just what I always wanted! - Clara



* Re: [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints
  2025-06-24 13:39 ` [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints Petr Tesarik
@ 2025-06-26  1:49   ` Bagas Sanjaya
  2025-06-26  5:06     ` Petr Tesarik
  0 siblings, 1 reply; 30+ messages in thread
From: Bagas Sanjaya @ 2025-06-26  1:49 UTC (permalink / raw)
  To: Petr Tesarik, Jonathan Corbet, Morton
  Cc: Marek Szyprowski, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT

On Tue, Jun 24, 2025 at 03:39:22PM +0200, Petr Tesarik wrote:
> diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
> index cd432996949c..65132ec88104 100644
> --- a/Documentation/core-api/dma-api.rst
> +++ b/Documentation/core-api/dma-api.rst
> @@ -210,18 +210,12 @@ DMA_BIDIRECTIONAL	direction isn't known
>  	this API should be obtained from sources which guarantee it to be
>  	physically contiguous (like kmalloc).
>  
> -	Further, the DMA address of the memory must be within the dma_mask of
> -	the device.  To ensure that the memory allocated by kmalloc is within
> -	the dma_mask, the driver may specify various platform-dependent flags
> -	to restrict the DMA address range of the allocation (e.g., on x86,
> -	GFP_DMA guarantees to be within the first 16MB of available DMA
> -	addresses, as required by ISA devices).
> -
> -	Note also that the above constraints on physical contiguity and
> -	dma_mask may not apply if the platform has an IOMMU (a device which
> -	maps an I/O DMA address to a physical memory address).  However, to be
> -	portable, device driver writers may *not* assume that such an IOMMU
> -	exists.
> +	Mapping may also fail if the memory is not within the DMA mask of the
> +	device.  However, this constraint does not apply if the platform has
> +	an IOMMU (a device which maps an I/O DMA address to a physical memory
> +	address), or the kernel is configured with SWIOTLB (bounce buffers).
> +	It is reasonable to assume that at least one of these mechanisms
> +	allows streaming DMA to any physical address.
>  
>  .. warning::
>  

LGTM, thanks!

Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>

-- 
An old man doll... just what I always wanted! - Clara



* Re: [PATCH 8/8] docs: dma-api: clean up documentation of dma_map_sg()
  2025-06-24 13:39 ` [PATCH 8/8] docs: dma-api: clean up documentation of dma_map_sg() Petr Tesarik
@ 2025-06-26  1:50   ` Bagas Sanjaya
  0 siblings, 0 replies; 30+ messages in thread
From: Bagas Sanjaya @ 2025-06-26  1:50 UTC (permalink / raw)
  To: Petr Tesarik, Jonathan Corbet, Morton
  Cc: Marek Szyprowski, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT

On Tue, Jun 24, 2025 at 03:39:23PM +0200, Petr Tesarik wrote:
> diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
> index 65132ec88104..f5aadb7f8626 100644
> --- a/Documentation/core-api/dma-api.rst
> +++ b/Documentation/core-api/dma-api.rst
> @@ -308,10 +308,10 @@ action (e.g. reduce current DMA mapping usage or delay and try again later).
>  	dma_map_sg(struct device *dev, struct scatterlist *sg,
>  		   int nents, enum dma_data_direction direction)
>  
> -Returns: the number of DMA address segments mapped (this may be shorter
> -than <nents> passed in if some elements of the scatter/gather list are
> -physically or virtually adjacent and an IOMMU maps them with a single
> -entry).
> +Maps a scatter/gather list for DMA. Returns the number of DMA address segments
> +mapped, which may be smaller than <nents> passed in if several consecutive
> +sglist entries are merged (e.g. with an IOMMU, or if some adjacent segments
> +just happen to be physically contiguous).
>  
>  Please note that the sg cannot be mapped again if it has been mapped once.
>  The mapping process is allowed to destroy information in the sg.
> @@ -335,9 +335,8 @@ With scatterlists, you use the resulting mapping like this::
>  where nents is the number of entries in the sglist.
>  
>  The implementation is free to merge several consecutive sglist entries
> -into one (e.g. with an IOMMU, or if several pages just happen to be
> -physically contiguous) and returns the actual number of sg entries it
> -mapped them to. On failure 0, is returned.
> +into one.  The returned number is the actual number of sg entries it
> +mapped them to. On failure, 0 is returned.
>  
>  Then you should loop count times (note: this can be less than nents times)
>  and use sg_dma_address() and sg_dma_len() macros where you previously

Looks good, thanks!

Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>

-- 
An old man doll... just what I always wanted! - Clara



* Re: [PATCH 2/8] docs: dma-api: replace consistent with coherent
  2025-06-24 13:39 ` [PATCH 2/8] docs: dma-api: replace consistent with coherent Petr Tesarik
@ 2025-06-26  4:51   ` Petr Tesarik
  2025-06-26  7:21     ` Marek Szyprowski
  0 siblings, 1 reply; 30+ messages in thread
From: Petr Tesarik @ 2025-06-26  4:51 UTC (permalink / raw)
  To: Marek Szyprowski, Robin Murphy, Christoph Hellwig
  Cc: Jonathan Corbet, Andrew Morton, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT,
	iommu

On Tue, 24 Jun 2025 15:39:17 +0200
Petr Tesarik <ptesarik@suse.com> wrote:

> For consistency, always use the term "coherent" when talking about memory
> that is not subject to CPU caching effects. The term "consistent" is a
> relic of a long-removed pci_alloc_consistent() function.

I realize that I'm not an authoritative source for this, but I forgot
to add more trusted maintainers to the recipient list.

Now, all you DMA experts, do you agree that the word "consistent"
should be finally eradicated from DMA API documentation?

Petr T

> Signed-off-by: Petr Tesarik <ptesarik@suse.com>
> ---
>  Documentation/core-api/dma-api-howto.rst | 36 ++++++++++++------------
>  Documentation/core-api/dma-api.rst       | 14 ++++-----
>  mm/dmapool.c                             |  6 ++--
>  3 files changed, 28 insertions(+), 28 deletions(-)
> 
> diff --git a/Documentation/core-api/dma-api-howto.rst b/Documentation/core-api/dma-api-howto.rst
> index 0bf31b6c4383..96fce2a9aa90 100644
> --- a/Documentation/core-api/dma-api-howto.rst
> +++ b/Documentation/core-api/dma-api-howto.rst
> @@ -155,7 +155,7 @@ a device with limitations, it needs to be decreased.
>  
>  Special note about PCI: PCI-X specification requires PCI-X devices to support
>  64-bit addressing (DAC) for all transactions.  And at least one platform (SGI
> -SN2) requires 64-bit consistent allocations to operate correctly when the IO
> +SN2) requires 64-bit coherent allocations to operate correctly when the IO
>  bus is in PCI-X mode.
>  
>  For correct operation, you must set the DMA mask to inform the kernel about
> @@ -174,7 +174,7 @@ used instead:
>  
>  		int dma_set_mask(struct device *dev, u64 mask);
>  
> -	The setup for consistent allocations is performed via a call
> +	The setup for coherent allocations is performed via a call
>  	to dma_set_coherent_mask()::
>  
>  		int dma_set_coherent_mask(struct device *dev, u64 mask);
> @@ -241,7 +241,7 @@ it would look like this::
>  
>  The coherent mask will always be able to set the same or a smaller mask as
>  the streaming mask. However for the rare case that a device driver only
> -uses consistent allocations, one would have to check the return value from
> +uses coherent allocations, one would have to check the return value from
>  dma_set_coherent_mask().
>  
>  Finally, if your device can only drive the low 24-bits of
> @@ -298,20 +298,20 @@ Types of DMA mappings
>  
>  There are two types of DMA mappings:
>  
> -- Consistent DMA mappings which are usually mapped at driver
> +- Coherent DMA mappings which are usually mapped at driver
>    initialization, unmapped at the end and for which the hardware should
>    guarantee that the device and the CPU can access the data
>    in parallel and will see updates made by each other without any
>    explicit software flushing.
>  
> -  Think of "consistent" as "synchronous" or "coherent".
> +  Think of "coherent" as "synchronous".
>  
> -  The current default is to return consistent memory in the low 32
> +  The current default is to return coherent memory in the low 32
>    bits of the DMA space.  However, for future compatibility you should
> -  set the consistent mask even if this default is fine for your
> +  set the coherent mask even if this default is fine for your
>    driver.
>  
> -  Good examples of what to use consistent mappings for are:
> +  Good examples of what to use coherent mappings for are:
>  
>  	- Network card DMA ring descriptors.
>  	- SCSI adapter mailbox command data structures.
> @@ -320,13 +320,13 @@ There are two types of DMA mappings:
>  
>    The invariant these examples all require is that any CPU store
>    to memory is immediately visible to the device, and vice
> -  versa.  Consistent mappings guarantee this.
> +  versa.  Coherent mappings guarantee this.
>  
>    .. important::
>  
> -	     Consistent DMA memory does not preclude the usage of
> +	     Coherent DMA memory does not preclude the usage of
>  	     proper memory barriers.  The CPU may reorder stores to
> -	     consistent memory just as it may normal memory.  Example:
> +	     coherent memory just as it may normal memory.  Example:
>  	     if it is important for the device to see the first word
>  	     of a descriptor updated before the second, you must do
>  	     something like::
> @@ -365,10 +365,10 @@ Also, systems with caches that aren't DMA-coherent will work better
>  when the underlying buffers don't share cache lines with other data.
>  
>  
> -Using Consistent DMA mappings
> -=============================
> +Using Coherent DMA mappings
> +===========================
>  
> -To allocate and map large (PAGE_SIZE or so) consistent DMA regions,
> +To allocate and map large (PAGE_SIZE or so) coherent DMA regions,
>  you should do::
>  
>  	dma_addr_t dma_handle;
> @@ -385,10 +385,10 @@ __get_free_pages() (but takes size instead of a page order).  If your
>  driver needs regions sized smaller than a page, you may prefer using
>  the dma_pool interface, described below.
>  
> -The consistent DMA mapping interfaces, will by default return a DMA address
> +The coherent DMA mapping interfaces, will by default return a DMA address
>  which is 32-bit addressable.  Even if the device indicates (via the DMA mask)
> -that it may address the upper 32-bits, consistent allocation will only
> -return > 32-bit addresses for DMA if the consistent DMA mask has been
> +that it may address the upper 32-bits, coherent allocation will only
> +return > 32-bit addresses for DMA if the coherent DMA mask has been
>  explicitly changed via dma_set_coherent_mask().  This is true of the
>  dma_pool interface as well.
>  
> @@ -497,7 +497,7 @@ program address space.  Such platforms can and do report errors in the
>  kernel logs when the DMA controller hardware detects violation of the
>  permission setting.
>  
> -Only streaming mappings specify a direction, consistent mappings
> +Only streaming mappings specify a direction, coherent mappings
>  implicitly have a direction attribute setting of
>  DMA_BIDIRECTIONAL.
>  
> diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
> index 97f42c15f5e4..c0a2cc7d0b95 100644
> --- a/Documentation/core-api/dma-api.rst
> +++ b/Documentation/core-api/dma-api.rst
> @@ -8,9 +8,9 @@ This document describes the DMA API.  For a more gentle introduction
>  of the API (and actual examples), see Documentation/core-api/dma-api-howto.rst.
>  
>  This API is split into two pieces.  Part I describes the basic API.
> -Part II describes extensions for supporting non-consistent memory
> +Part II describes extensions for supporting non-coherent memory
>  machines.  Unless you know that your driver absolutely has to support
> -non-consistent platforms (this is usually only legacy platforms) you
> +non-coherent platforms (this is usually only legacy platforms) you
>  should only use the API described in part I.
>  
>  Part I - DMA API
> @@ -33,13 +33,13 @@ Part Ia - Using large DMA-coherent buffers
>  	dma_alloc_coherent(struct device *dev, size_t size,
>  			   dma_addr_t *dma_handle, gfp_t flag)
>  
> -Consistent memory is memory for which a write by either the device or
> +Coherent memory is memory for which a write by either the device or
>  the processor can immediately be read by the processor or device
>  without having to worry about caching effects.  (You may however need
>  to make sure to flush the processor's write buffers before telling
>  devices to read that memory.)
>  
> -This routine allocates a region of <size> bytes of consistent memory.
> +This routine allocates a region of <size> bytes of coherent memory.
>  
>  It returns a pointer to the allocated region (in the processor's virtual
>  address space) or NULL if the allocation failed.
> @@ -48,9 +48,9 @@ It also returns a <dma_handle> which may be cast to an unsigned integer the
>  same width as the bus and given to the device as the DMA address base of
>  the region.
>  
> -Note: consistent memory can be expensive on some platforms, and the
> +Note: coherent memory can be expensive on some platforms, and the
>  minimum allocation length may be as big as a page, so you should
> -consolidate your requests for consistent memory as much as possible.
> +consolidate your requests for coherent memory as much as possible.
>  The simplest way to do that is to use the dma_pool calls (see below).
>  
>  The flag parameter (dma_alloc_coherent() only) allows the caller to
> @@ -64,7 +64,7 @@ the returned memory, like GFP_DMA).
>  	dma_free_coherent(struct device *dev, size_t size, void *cpu_addr,
>  			  dma_addr_t dma_handle)
>  
> -Free a region of consistent memory you previously allocated.  dev,
> +Free a region of coherent memory you previously allocated.  dev,
>  size and dma_handle must all be the same as those passed into
>  dma_alloc_coherent().  cpu_addr must be the virtual address returned by
>  the dma_alloc_coherent().
> diff --git a/mm/dmapool.c b/mm/dmapool.c
> index 5be8cc1c6529..5d8af6e29127 100644
> --- a/mm/dmapool.c
> +++ b/mm/dmapool.c
> @@ -200,7 +200,7 @@ static void pool_block_push(struct dma_pool *pool, struct dma_block *block,
>  
>  
>  /**
> - * dma_pool_create_node - Creates a pool of consistent memory blocks, for dma.
> + * dma_pool_create_node - Creates a pool of coherent DMA memory blocks.
>   * @name: name of pool, for diagnostics
>   * @dev: device that will be doing the DMA
>   * @size: size of the blocks in this pool.
> @@ -210,7 +210,7 @@ static void pool_block_push(struct dma_pool *pool, struct dma_block *block,
>   * Context: not in_interrupt()
>   *
>   * Given one of these pools, dma_pool_alloc()
> - * may be used to allocate memory.  Such memory will all have "consistent"
> + * may be used to allocate memory.  Such memory will all have coherent
>   * DMA mappings, accessible by the device and its driver without using
>   * cache flushing primitives.  The actual size of blocks allocated may be
>   * larger than requested because of alignment.
> @@ -395,7 +395,7 @@ void dma_pool_destroy(struct dma_pool *pool)
>  EXPORT_SYMBOL(dma_pool_destroy);
>  
>  /**
> - * dma_pool_alloc - get a block of consistent memory
> + * dma_pool_alloc - get a block of coherent memory
>   * @pool: dma pool that will produce the block
>   * @mem_flags: GFP_* bitmask
>   * @handle: pointer to dma address of block


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints
  2025-06-26  1:49   ` Bagas Sanjaya
@ 2025-06-26  5:06     ` Petr Tesarik
  2025-06-26  7:09       ` Marek Szyprowski
                         ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Petr Tesarik @ 2025-06-26  5:06 UTC (permalink / raw)
  To: Marek Szyprowski, Robin Murphy
  Cc: Bagas Sanjaya, Jonathan Corbet, Andrew Morton, Leon Romanovsky,
	Keith Busch, Caleb Sander Mateos, Sagi Grimberg, Jens Axboe,
	John Garry, open list:DOCUMENTATION, open list,
	open list:MEMORY MANAGEMENT, iommu

On Thu, 26 Jun 2025 08:49:17 +0700
Bagas Sanjaya <bagasdotme@gmail.com> wrote:

> On Tue, Jun 24, 2025 at 03:39:22PM +0200, Petr Tesarik wrote:
> > diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
> > index cd432996949c..65132ec88104 100644
> > --- a/Documentation/core-api/dma-api.rst
> > +++ b/Documentation/core-api/dma-api.rst
> > @@ -210,18 +210,12 @@ DMA_BIDIRECTIONAL	direction isn't known
> >  	this API should be obtained from sources which guarantee it to be
> >  	physically contiguous (like kmalloc).
> >  
> > -	Further, the DMA address of the memory must be within the dma_mask of
> > -	the device.  To ensure that the memory allocated by kmalloc is within
> > -	the dma_mask, the driver may specify various platform-dependent flags
> > -	to restrict the DMA address range of the allocation (e.g., on x86,
> > -	GFP_DMA guarantees to be within the first 16MB of available DMA
> > -	addresses, as required by ISA devices).
> > -
> > -	Note also that the above constraints on physical contiguity and
> > -	dma_mask may not apply if the platform has an IOMMU (a device which
> > -	maps an I/O DMA address to a physical memory address).  However, to be
> > -	portable, device driver writers may *not* assume that such an IOMMU
> > -	exists.
> > +	Mapping may also fail if the memory is not within the DMA mask of the
> > +	device.  However, this constraint does not apply if the platform has
> > +	an IOMMU (a device which maps an I/O DMA address to a physical memory
> > +	address), or the kernel is configured with SWIOTLB (bounce buffers).
> > +	It is reasonable to assume that at least one of these mechanisms
> > +	allows streaming DMA to any physical address.

Now I realize this last sentence may be contentious...

@Marek, @Robin Do you agree that device drivers should not be concerned
about the physical address of a buffer passed to the streaming DMA API?

I mean, are there any real-world systems with:
  * some RAM that is not DMA-addressable,
  * no IOMMU,
  * CONFIG_SWIOTLB is not set?

FWIW if _I_ received a bug report that a device driver fails to submit
I/O on such a system, I would politely explain to the reporter that their
kernel is misconfigured, and they should enable CONFIG_SWIOTLB.

Just my two cents,
Petr T

> >  .. warning::
> >    
> 
> LGTM, thanks!
> 
> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>

Thank you for the review, Bagas.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints
  2025-06-26  5:06     ` Petr Tesarik
@ 2025-06-26  7:09       ` Marek Szyprowski
  2025-06-26  8:25         ` Petr Tesarik
  2025-06-26  9:58       ` Robin Murphy
  2025-06-27 12:52       ` Christoph Hellwig
  2 siblings, 1 reply; 30+ messages in thread
From: Marek Szyprowski @ 2025-06-26  7:09 UTC (permalink / raw)
  To: Petr Tesarik, Robin Murphy
  Cc: Bagas Sanjaya, Jonathan Corbet, Andrew Morton, Leon Romanovsky,
	Keith Busch, Caleb Sander Mateos, Sagi Grimberg, Jens Axboe,
	John Garry, open list:DOCUMENTATION, open list,
	open list:MEMORY MANAGEMENT, iommu

On 26.06.2025 07:06, Petr Tesarik wrote:
> On Thu, 26 Jun 2025 08:49:17 +0700
> Bagas Sanjaya <bagasdotme@gmail.com> wrote:
>
>> On Tue, Jun 24, 2025 at 03:39:22PM +0200, Petr Tesarik wrote:
>>> diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
>>> index cd432996949c..65132ec88104 100644
>>> --- a/Documentation/core-api/dma-api.rst
>>> +++ b/Documentation/core-api/dma-api.rst
>>> @@ -210,18 +210,12 @@ DMA_BIDIRECTIONAL	direction isn't known
>>>   	this API should be obtained from sources which guarantee it to be
>>>   	physically contiguous (like kmalloc).
>>>   
>>> -	Further, the DMA address of the memory must be within the dma_mask of
>>> -	the device.  To ensure that the memory allocated by kmalloc is within
>>> -	the dma_mask, the driver may specify various platform-dependent flags
>>> -	to restrict the DMA address range of the allocation (e.g., on x86,
>>> -	GFP_DMA guarantees to be within the first 16MB of available DMA
>>> -	addresses, as required by ISA devices).
>>> -
>>> -	Note also that the above constraints on physical contiguity and
>>> -	dma_mask may not apply if the platform has an IOMMU (a device which
>>> -	maps an I/O DMA address to a physical memory address).  However, to be
>>> -	portable, device driver writers may *not* assume that such an IOMMU
>>> -	exists.
>>> +	Mapping may also fail if the memory is not within the DMA mask of the
>>> +	device.  However, this constraint does not apply if the platform has
>>> +	an IOMMU (a device which maps an I/O DMA address to a physical memory
>>> +	address), or the kernel is configured with SWIOTLB (bounce buffers).
>>> +	It is reasonable to assume that at least one of these mechanisms
>>> +	allows streaming DMA to any physical address.
> Now I realize this last sentence may be contentious...
>
> @Marek, @Robin Do you agree that device drivers should not be concerned
> about the physical address of a buffer passed to the streaming DMA API?
>
> I mean, are there any real-world systems with:
>    * some RAM that is not DMA-addressable,
>    * no IOMMU,
>    * CONFIG_SWIOTLB is not set?
>
> FWIW if _I_ received a bug report that a device driver fails to submit
> I/O on such a system, I would politely explain to the reporter that their
> kernel is misconfigured, and they should enable CONFIG_SWIOTLB.

What about the systems with legacy 16/24bit ZONE_DMA (i.e. ISA bus)? 
AFAIR they don't use SWIOTLB and probably they won't be able to use 
the streaming DMA API for all system RAM.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/8] docs: dma-api: replace consistent with coherent
  2025-06-26  4:51   ` Petr Tesarik
@ 2025-06-26  7:21     ` Marek Szyprowski
  0 siblings, 0 replies; 30+ messages in thread
From: Marek Szyprowski @ 2025-06-26  7:21 UTC (permalink / raw)
  To: Petr Tesarik, Robin Murphy, Christoph Hellwig
  Cc: Jonathan Corbet, Andrew Morton, Leon Romanovsky, Keith Busch,
	Caleb Sander Mateos, Sagi Grimberg, Jens Axboe, John Garry,
	open list:DOCUMENTATION, open list, open list:MEMORY MANAGEMENT,
	iommu

On 26.06.2025 06:51, Petr Tesarik wrote:
> On Tue, 24 Jun 2025 15:39:17 +0200
> Petr Tesarik <ptesarik@suse.com> wrote:
>
>> For consistency, always use the term "coherent" when talking about memory
>> that is not subject to CPU caching effects. The term "consistent" is a
>> relic of a long-removed pci_alloc_consistent() function.
> I realize that I'm not an authoritative source for this, but I forgot
> to add more trusted maintainers to the recipient list.
>
> Now, all you DMA experts, do you agree that the word "consistent"
> should be finally eradicated from DMA API documentation?

Well, this was always puzzling for me, why there are those 2 names used 
(especially in case of dma_alloc_coherent() vs. 
dma_alloc_nonconsistent()). I'm for unifying them to "coherent" as this 
is the term used more often.


> > ...

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints
  2025-06-26  7:09       ` Marek Szyprowski
@ 2025-06-26  8:25         ` Petr Tesarik
  0 siblings, 0 replies; 30+ messages in thread
From: Petr Tesarik @ 2025-06-26  8:25 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Robin Murphy, Bagas Sanjaya, Jonathan Corbet, Andrew Morton,
	Leon Romanovsky, Keith Busch, Caleb Sander Mateos, Sagi Grimberg,
	Jens Axboe, John Garry, open list:DOCUMENTATION, open list,
	open list:MEMORY MANAGEMENT, iommu

On Thu, 26 Jun 2025 09:09:34 +0200
Marek Szyprowski <m.szyprowski@samsung.com> wrote:

> On 26.06.2025 07:06, Petr Tesarik wrote:
> > On Thu, 26 Jun 2025 08:49:17 +0700
> > Bagas Sanjaya <bagasdotme@gmail.com> wrote:
> >  
> >> On Tue, Jun 24, 2025 at 03:39:22PM +0200, Petr Tesarik wrote:  
> >>> diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
> >>> index cd432996949c..65132ec88104 100644
> >>> --- a/Documentation/core-api/dma-api.rst
> >>> +++ b/Documentation/core-api/dma-api.rst
> >>> @@ -210,18 +210,12 @@ DMA_BIDIRECTIONAL	direction isn't known
> >>>   	this API should be obtained from sources which guarantee it to be
> >>>   	physically contiguous (like kmalloc).
> >>>   
> >>> -	Further, the DMA address of the memory must be within the dma_mask of
> >>> -	the device.  To ensure that the memory allocated by kmalloc is within
> >>> -	the dma_mask, the driver may specify various platform-dependent flags
> >>> -	to restrict the DMA address range of the allocation (e.g., on x86,
> >>> -	GFP_DMA guarantees to be within the first 16MB of available DMA
> >>> -	addresses, as required by ISA devices).
> >>> -
> >>> -	Note also that the above constraints on physical contiguity and
> >>> -	dma_mask may not apply if the platform has an IOMMU (a device which
> >>> -	maps an I/O DMA address to a physical memory address).  However, to be
> >>> -	portable, device driver writers may *not* assume that such an IOMMU
> >>> -	exists.
> >>> +	Mapping may also fail if the memory is not within the DMA mask of the
> >>> +	device.  However, this constraint does not apply if the platform has
> >>> +	an IOMMU (a device which maps an I/O DMA address to a physical memory
> >>> +	address), or the kernel is configured with SWIOTLB (bounce buffers).
> >>> +	It is reasonable to assume that at least one of these mechanisms
> >>> +	allows streaming DMA to any physical address.  
> > Now I realize this last sentence may be contentious...
> >
> > @Marek, @Robin Do you agree that device drivers should not be concerned
> > about the physical address of a buffer passed to the streaming DMA API?
> >
> > I mean, are there any real-world systems with:
> >    * some RAM that is not DMA-addressable,
> >    * no IOMMU,
> >    * CONFIG_SWIOTLB is not set?
> >
> > FWIW if _I_ received a bug report that a device driver fails to submit
> > I/O on such a system, I would politely explain to the reporter that their
> > kernel is misconfigured, and they should enable CONFIG_SWIOTLB.  
> 
> What about the systems with legacy 16/24bit ZONE_DMA (i.e. ISA bus)? 
> AFAIR they don't use SWIOTLB and probably they won't be able to use 
> the streaming DMA API for all system RAM.

ISA is probably dead, but yeah, there may still be some systems with
LPC, which inherits the same addressing limitations.

I haven't really tested, but I believe these systems should be able to
enable SWIOTLB. Is there a specific reason they can't use SWIOTLB?

But if there is doubt, I can probably test such configuration.

Petr T

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints
  2025-06-26  5:06     ` Petr Tesarik
  2025-06-26  7:09       ` Marek Szyprowski
@ 2025-06-26  9:58       ` Robin Murphy
  2025-06-26 13:48         ` Petr Tesarik
  2025-06-27 12:52       ` Christoph Hellwig
  2 siblings, 1 reply; 30+ messages in thread
From: Robin Murphy @ 2025-06-26  9:58 UTC (permalink / raw)
  To: Petr Tesarik, Marek Szyprowski
  Cc: Bagas Sanjaya, Jonathan Corbet, Andrew Morton, Leon Romanovsky,
	Keith Busch, Caleb Sander Mateos, Sagi Grimberg, Jens Axboe,
	John Garry, open list:DOCUMENTATION, open list,
	open list:MEMORY MANAGEMENT, iommu

On 2025-06-26 6:06 am, Petr Tesarik wrote:
> On Thu, 26 Jun 2025 08:49:17 +0700
> Bagas Sanjaya <bagasdotme@gmail.com> wrote:
> 
>> On Tue, Jun 24, 2025 at 03:39:22PM +0200, Petr Tesarik wrote:
>>> diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
>>> index cd432996949c..65132ec88104 100644
>>> --- a/Documentation/core-api/dma-api.rst
>>> +++ b/Documentation/core-api/dma-api.rst
>>> @@ -210,18 +210,12 @@ DMA_BIDIRECTIONAL	direction isn't known
>>>   	this API should be obtained from sources which guarantee it to be
>>>   	physically contiguous (like kmalloc).
>>>   
>>> -	Further, the DMA address of the memory must be within the dma_mask of
>>> -	the device.  To ensure that the memory allocated by kmalloc is within
>>> -	the dma_mask, the driver may specify various platform-dependent flags
>>> -	to restrict the DMA address range of the allocation (e.g., on x86,
>>> -	GFP_DMA guarantees to be within the first 16MB of available DMA
>>> -	addresses, as required by ISA devices).
>>> -
>>> -	Note also that the above constraints on physical contiguity and
>>> -	dma_mask may not apply if the platform has an IOMMU (a device which
>>> -	maps an I/O DMA address to a physical memory address).  However, to be
>>> -	portable, device driver writers may *not* assume that such an IOMMU
>>> -	exists.
>>> +	Mapping may also fail if the memory is not within the DMA mask of the
>>> +	device.  However, this constraint does not apply if the platform has
>>> +	an IOMMU (a device which maps an I/O DMA address to a physical memory
>>> +	address), or the kernel is configured with SWIOTLB (bounce buffers).
>>> +	It is reasonable to assume that at least one of these mechanisms
>>> +	allows streaming DMA to any physical address.
> 
> Now I realize this last sentence may be contentious...

The whole paragraph is wrong as written, not least because it is 
conflating two separate things: "any physical address" is objectively 
untrue, since SWIOTLB can only bounce from buffers within the 
kernel's linear/direct map, i.e. not highmem, not random memory 
carveouts, and definitely not PAs which are not RAM at all. 
Secondly, even if the source buffer *is* bounceable/mappable, there is 
still no guarantee at all that it can actually be made to appear at a 
DMA address within an arbitrary DMA mask. We aim for a general 
expectation that 32-bit DMA masks should be well-supported (but still 
not 100% guaranteed), but anything smaller can absolutely still have a 
high chance of failing, e.g. due to the SWIOTLB buffer being allocated 
too high or limited IOVA space.
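
That general expectation is why portable drivers usually negotiate the
mask at probe time, preferring 64 bits and falling back to the
well-supported 32-bit case. A minimal sketch of the common idiom (the
function name example_probe_dma is hypothetical; only the
dma_set_mask_and_coherent() calls are the real API):

```c
/* Sketch: request the widest DMA mask the platform accepts.
 * 64-bit is preferred; 32-bit is the best-supported fallback.
 * Anything smaller may well fail, as discussed above.
 */
static int example_probe_dma(struct device *dev)
{
	int ret;

	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
	if (ret)
		ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
	if (ret)
		dev_warn(dev, "no usable DMA configuration\n");
	return ret;
}
```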

> @Marek, @Robin Do you agree that device drivers should not be concerned
> about the physical address of a buffer passed to the streaming DMA API?
> 
> I mean, are there any real-world systems with:
>    * some RAM that is not DMA-addressable,
>    * no IOMMU,
>    * CONFIG_SWIOTLB is not set?

Yes, almost certainly, because "DMA-addressable" depends on individual 
devices. You can't stop a user from sticking, say, a Broadcom 43xx WiFi 
card into a PCI slot on an i.MX6 board with 2GB of RAM that *starts* 
just above its 31-bit DMA capability. People are still using AMD Seattle 
machines, where even though arm64 does have SWIOTLB it's essentially 
useless since RAM starts up around 40 bits IIRC (and although they do 
also have SMMUs for PCI, older firmware didn't advertise them).

> FWIW if _I_ received a bug report that a device driver fails to submit
> I/O on such a system, I would politely explain to the reporter that their
> kernel is misconfigured, and they should enable CONFIG_SWIOTLB.

It's not really that simple. SWIOTLB, ZONE_DMA, etc. require platform 
support, which end users can't just turn on if it's not there to begin with.

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints
  2025-06-26  9:58       ` Robin Murphy
@ 2025-06-26 13:48         ` Petr Tesarik
  2025-06-26 16:45           ` Robin Murphy
  0 siblings, 1 reply; 30+ messages in thread
From: Petr Tesarik @ 2025-06-26 13:48 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Marek Szyprowski, Bagas Sanjaya, Jonathan Corbet, Andrew Morton,
	Leon Romanovsky, Keith Busch, Caleb Sander Mateos, Sagi Grimberg,
	Jens Axboe, John Garry, open list:DOCUMENTATION, open list,
	open list:MEMORY MANAGEMENT, iommu

On Thu, 26 Jun 2025 10:58:00 +0100
Robin Murphy <robin.murphy@arm.com> wrote:

> On 2025-06-26 6:06 am, Petr Tesarik wrote:
> > On Thu, 26 Jun 2025 08:49:17 +0700
> > Bagas Sanjaya <bagasdotme@gmail.com> wrote:
> >   
> >> On Tue, Jun 24, 2025 at 03:39:22PM +0200, Petr Tesarik wrote:  
> >>> diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
> >>> index cd432996949c..65132ec88104 100644
> >>> --- a/Documentation/core-api/dma-api.rst
> >>> +++ b/Documentation/core-api/dma-api.rst
> >>> @@ -210,18 +210,12 @@ DMA_BIDIRECTIONAL	direction isn't known
> >>>   	this API should be obtained from sources which guarantee it to be
> >>>   	physically contiguous (like kmalloc).
> >>>   
> >>> -	Further, the DMA address of the memory must be within the dma_mask of
> >>> -	the device.  To ensure that the memory allocated by kmalloc is within
> >>> -	the dma_mask, the driver may specify various platform-dependent flags
> >>> -	to restrict the DMA address range of the allocation (e.g., on x86,
> >>> -	GFP_DMA guarantees to be within the first 16MB of available DMA
> >>> -	addresses, as required by ISA devices).
> >>> -
> >>> -	Note also that the above constraints on physical contiguity and
> >>> -	dma_mask may not apply if the platform has an IOMMU (a device which
> >>> -	maps an I/O DMA address to a physical memory address).  However, to be
> >>> -	portable, device driver writers may *not* assume that such an IOMMU
> >>> -	exists.
> >>> +	Mapping may also fail if the memory is not within the DMA mask of the
> >>> +	device.  However, this constraint does not apply if the platform has
> >>> +	an IOMMU (a device which maps an I/O DMA address to a physical memory
> >>> +	address), or the kernel is configured with SWIOTLB (bounce buffers).
> >>> +	It is reasonable to assume that at least one of these mechanisms
> >>> +	allows streaming DMA to any physical address.  
> > 
> > Now I realize this last sentence may be contentious...  
> 
> The whole paragraph is wrong as written, not least because it is 
> conflating two separate things: "any physical address" is objectively 
> untrue, since SWIOTLB can only bounce from buffers within the 
> kernel's linear/direct map, i.e. not highmem, not random memory 
> carveouts, and definitely not PAs which are not RAM at all. 

I see, saying "any" was indeed too strong.

> Secondly, even if the source buffer *is* bounceable/mappable, there is 
> still no guarantee at all that it can actually be made to appear at a 
> DMA address within an arbitrary DMA mask. We aim for a general 
> expectation that 32-bit DMA masks should be well-supported (but still 
> not 100% guaranteed), but anything smaller can absolutely still have a 
> high chance of failing, e.g. due to the SWIOTLB buffer being allocated 
> too high or limited IOVA space.

Of course this cannot be guaranteed. The function may always fail and
return DMA_MAPPING_ERROR. No doubts about it.
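
Checking for that failure is the caller's job on every mapping; a
minimal sketch (dev, buf and size are assumed to come from the driver's
context):

```c
/* Sketch: streaming mapping with the mandatory error check.
 * dma_map_single() can fail (e.g. bounce-buffer exhaustion);
 * the only portable response is to test dma_mapping_error().
 */
dma_addr_t dma_handle;

dma_handle = dma_map_single(dev, buf, size, DMA_TO_DEVICE);
if (dma_mapping_error(dev, dma_handle))
	return -ENOMEM;	/* or defer/retry as the driver sees fit */
```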

> > @Marek, @Robin Do you agree that device drivers should not be concerned
> > about the physical address of a buffer passed to the streaming DMA API?
> > 
> > I mean, are there any real-world systems with:
> >    * some RAM that is not DMA-addressable,
> >    * no IOMMU,
> >    * CONFIG_SWIOTLB is not set?  
> 
> Yes, almost certainly, because "DMA-addressable" depends on individual 
> devices. You can't stop a user from sticking, say, a Broadcom 43xx WiFi 
> card into a PCI slot on an i.MX6 board with 2GB of RAM that *starts* 
> just above its 31-bit DMA capability. People are still using AMD Seattle 
> machines, where even though arm64 does have SWIOTLB it's essentially 
> useless since RAM starts up around 40 bits IIRC (and although they do 
> also have SMMUs for PCI, older firmware didn't advertise them).

Some of these scenarios can never work properly because of hardware
limitations. There's nothing software can do about a bus master which
cannot address any RAM in the machine. I'm not trying to claim that an
operating system kernel can do magic and square the circle. If that's
how it sounded, then my wording needs to be improved.

IIUC the expected audience of this document are device driver authors.
They want a clear guidance on how they should allocate buffers for the
streaming DMA API. Now, it is my understanding that device drivers
should *not* have to care about the physical location of a buffer
passed to the streaming DMA API.

Even if a bus master implements less than 32 address bits in hardware,
I'm convinced that device drivers should not have to examine the system
to check if an IOMMU is available and try to guess whether a buffer
must be bounced, and how exactly the bounce buffer should be allocated.

If we can agree on this, I can iron out the details for a v2 of this
patch series.

> > FWIW if _I_ received a bug report that a device driver fails to submit
> > I/O on such a system, I would politely explain the reporter that their
> > kernel is misconfigured, and they should enable CONFIG_SWIOTLB.  
> 
> It's not really that simple. SWIOTLB, ZONE_DMA, etc. require platform 
> support, which end users can't just turn on if it's not there to begin with.

I know this very well. As you may not be aware, my ultimate goal is to
get rid of ZONE_DMA and instead enhance the buddy allocator to allow
allocations within an arbitrary physical address range, which will not
rely on platform support. But that's another story; for now, let's just
agree on how the DMA API is supposed to work.

Petr T

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints
  2025-06-26 13:48         ` Petr Tesarik
@ 2025-06-26 16:45           ` Robin Murphy
  2025-06-26 19:40             ` Petr Tesarik
  2025-06-27 12:55             ` Christoph Hellwig
  0 siblings, 2 replies; 30+ messages in thread
From: Robin Murphy @ 2025-06-26 16:45 UTC (permalink / raw)
  To: Petr Tesarik
  Cc: Marek Szyprowski, Bagas Sanjaya, Jonathan Corbet, Andrew Morton,
	Leon Romanovsky, Keith Busch, Caleb Sander Mateos, Sagi Grimberg,
	Jens Axboe, John Garry, open list:DOCUMENTATION, open list,
	open list:MEMORY MANAGEMENT, iommu

On 26/06/2025 2:48 pm, Petr Tesarik wrote:
> On Thu, 26 Jun 2025 10:58:00 +0100
> Robin Murphy <robin.murphy@arm.com> wrote:
> 
>> On 2025-06-26 6:06 am, Petr Tesarik wrote:
>>> On Thu, 26 Jun 2025 08:49:17 +0700
>>> Bagas Sanjaya <bagasdotme@gmail.com> wrote:
>>>    
>>>> On Tue, Jun 24, 2025 at 03:39:22PM +0200, Petr Tesarik wrote:
>>>>> diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
>>>>> index cd432996949c..65132ec88104 100644
>>>>> --- a/Documentation/core-api/dma-api.rst
>>>>> +++ b/Documentation/core-api/dma-api.rst
>>>>> @@ -210,18 +210,12 @@ DMA_BIDIRECTIONAL	direction isn't known
>>>>>    	this API should be obtained from sources which guarantee it to be
>>>>>    	physically contiguous (like kmalloc).
>>>>>    
>>>>> -	Further, the DMA address of the memory must be within the dma_mask of
>>>>> -	the device.  To ensure that the memory allocated by kmalloc is within
>>>>> -	the dma_mask, the driver may specify various platform-dependent flags
>>>>> -	to restrict the DMA address range of the allocation (e.g., on x86,
>>>>> -	GFP_DMA guarantees to be within the first 16MB of available DMA
>>>>> -	addresses, as required by ISA devices).
>>>>> -
>>>>> -	Note also that the above constraints on physical contiguity and
>>>>> -	dma_mask may not apply if the platform has an IOMMU (a device which
>>>>> -	maps an I/O DMA address to a physical memory address).  However, to be
>>>>> -	portable, device driver writers may *not* assume that such an IOMMU
>>>>> -	exists.
>>>>> +	Mapping may also fail if the memory is not within the DMA mask of the
>>>>> +	device.  However, this constraint does not apply if the platform has
>>>>> +	an IOMMU (a device which maps an I/O DMA address to a physical memory
>>>>> +	address), or the kernel is configured with SWIOTLB (bounce buffers).
>>>>> +	It is reasonable to assume that at least one of these mechanisms
>>>>> +	allows streaming DMA to any physical address.
>>>
>>> Now I realize this last sentence may be contentious...
>>
>> The whole paragraph is wrong as written, not least because it is
>> conflating two separate things: "any physical address" is objectively
>> untrue, since SWIOTLB can only bounce from buffers within the
>> kernel's linear/direct map, i.e. not highmem, not random memory
>> carveouts, and definitely not PAs which are not RAM at all.
> 
> I see, saying "any" was indeed too strong.
> 
>> Secondly, even if the source buffer *is* bounceable/mappable, there is
>> still no guarantee at all that it can actually be made to appear at a
>> DMA address within an arbitrary DMA mask. We aim for a general
>> expectation that 32-bit DMA masks should be well-supported (but still
>> not 100% guaranteed), but anything smaller can absolutely still have a
>> high chance of failing, e.g. due to the SWIOTLB buffer being allocated
>> too high or limited IOVA space.
> 
> Of course this cannot be guaranteed. The function may always fail and
> return DMA_MAPPING_ERROR. No doubts about it.
> 
>>> @Marek, @Robin Do you agree that device drivers should not be concerned
>>> about the physical address of a buffer passed to the streaming DMA API?
>>>
>>> I mean, are there any real-world systems with:
>>>     * some RAM that is not DMA-addressable,
>>>     * no IOMMU,
>>>     * CONFIG_SWIOTLB is not set?
>>
>> Yes, almost certainly, because "DMA-addressable" depends on individual
>> devices. You can't stop a user from sticking, say, a Broadcom 43xx WiFi
>> card into a PCI slot on an i.MX6 board with 2GB of RAM that *starts*
>> just above its 31-bit DMA capability. People are still using AMD Seattle
>> machines, where even though arm64 does have SWIOTLB it's essentially
>> useless since RAM starts up around 40 bits IIRC (and although they do
>> also have SMMUs for PCI, older firmware didn't advertise them).
> 
> Some of these scenarios can never work properly because of hardware
> limitations. There's nothing software can do about a bus master which
> cannot address any RAM in the machine. I'm not trying to claim that an
> operating system kernel can do magic and square the circle. If that's
> how it sounded, then my wording needs to be improved.
> 
> IIUC, the expected audience of this document is device driver authors.
> They want clear guidance on how they should allocate buffers for the
> streaming DMA API. Now, it is my understanding that device drivers
> should *not* have to care about the physical location of a buffer
> passed to the streaming DMA API.
> 
> Even if a bus master implements less than 32 address bits in hardware,
> I'm convinced that device drivers should not have to examine the system
> to check if an IOMMU is available and try to guess whether a buffer
> must be bounced, and how exactly the bounce buffer should be allocated.

It's never been suggested that drivers should do that; indeed trying to 
poke into and second-guess the DMA API implementation is generally even 
less OK than making blind assumptions about what it might do. The 
overall message here is essentially "if you want to do streaming DMA 
then you may need to be wary of where your memory comes from." We can't 
just throw that out and say "Yeah it's fine now, whatever you do the API 
will deal with it" because that simply isn't true as a general 
statement; drivers dealing with limited DMA masks *do* still need to be 
concerned with GFP_DMA (or even GFP_DMA32 might still be advisable in 
certain cases) if they want to have an appreciable chance of success. 
All that's different these days is that notion of "limited" generally 
meaning "32 bits or smaller".

> If we can agree on this, I can iron out the details for a v2 of this
> patch series.
> 
>>> FWIW if _I_ received a bug report that a device driver fails to submit
>>> I/O on such a system, I would politely explain to the reporter that their
>>> kernel is misconfigured, and they should enable CONFIG_SWIOTLB.
>>
>> It's not really that simple. SWIOTLB, ZONE_DMA, etc. require platform
>> support, which end users can't just turn on if it's not there to begin with.
> 
> I know this very well. As you may not be aware, my ultimate goal is to
> get rid of ZONE_DMA and instead enhance the buddy allocator to allow
> allocations within an arbitrary physical address range, which will not
> rely on platform support. But that's another story; for now, let's just
> agree on how the DMA API is supposed to work.

Indeed that might actually end up pushing things in the opposite 
direction, at least in some cases. Right now, a driver with, say, a 
40-bit DMA mask is usually better off not special-casing DMA buffers, 
and just making plain GFP_KERNEL allocations for everything (on the 
assumption that 64-bit systems with masses of memory *should* have 
SWIOTLB to cover things in the worst case), vs. artificially 
constraining its DMA buffers to GFP_DMA32 and having to deal with 
allocation failure more often. However with a more precise and flexible 
allocator, there's then a much stronger incentive for such drivers to 
explicitly mark *every* allocation that may be used for DMA, in order to 
get the optimal behaviour.

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints
  2025-06-26 16:45           ` Robin Murphy
@ 2025-06-26 19:40             ` Petr Tesarik
  2025-06-27 11:07               ` Robin Murphy
  2025-06-27 12:55             ` Christoph Hellwig
  1 sibling, 1 reply; 30+ messages in thread
From: Petr Tesarik @ 2025-06-26 19:40 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Marek Szyprowski, Bagas Sanjaya, Jonathan Corbet, Andrew Morton,
	Leon Romanovsky, Keith Busch, Caleb Sander Mateos, Sagi Grimberg,
	Jens Axboe, John Garry, open list:DOCUMENTATION, open list,
	open list:MEMORY MANAGEMENT, iommu

On Thu, 26 Jun 2025 17:45:18 +0100
Robin Murphy <robin.murphy@arm.com> wrote:

> On 26/06/2025 2:48 pm, Petr Tesarik wrote:
> > On Thu, 26 Jun 2025 10:58:00 +0100
> > Robin Murphy <robin.murphy@arm.com> wrote:
> >   
> >> On 2025-06-26 6:06 am, Petr Tesarik wrote:  
> >>> On Thu, 26 Jun 2025 08:49:17 +0700
> >>> Bagas Sanjaya <bagasdotme@gmail.com> wrote:
> >>>      
> >>>> On Tue, Jun 24, 2025 at 03:39:22PM +0200, Petr Tesarik wrote:  
> >>>>> diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
> >>>>> index cd432996949c..65132ec88104 100644
> >>>>> --- a/Documentation/core-api/dma-api.rst
> >>>>> +++ b/Documentation/core-api/dma-api.rst
> >>>>> @@ -210,18 +210,12 @@ DMA_BIDIRECTIONAL	direction isn't known
> >>>>>    	this API should be obtained from sources which guarantee it to be
> >>>>>    	physically contiguous (like kmalloc).
> >>>>>    
> >>>>> -	Further, the DMA address of the memory must be within the dma_mask of
> >>>>> -	the device.  To ensure that the memory allocated by kmalloc is within
> >>>>> -	the dma_mask, the driver may specify various platform-dependent flags
> >>>>> -	to restrict the DMA address range of the allocation (e.g., on x86,
> >>>>> -	GFP_DMA guarantees to be within the first 16MB of available DMA
> >>>>> -	addresses, as required by ISA devices).
> >>>>> -
> >>>>> -	Note also that the above constraints on physical contiguity and
> >>>>> -	dma_mask may not apply if the platform has an IOMMU (a device which
> >>>>> -	maps an I/O DMA address to a physical memory address).  However, to be
> >>>>> -	portable, device driver writers may *not* assume that such an IOMMU
> >>>>> -	exists.
> >>>>> +	Mapping may also fail if the memory is not within the DMA mask of the
> >>>>> +	device.  However, this constraint does not apply if the platform has
> >>>>> +	an IOMMU (a device which maps an I/O DMA address to a physical memory
> >>>>> +	address), or the kernel is configured with SWIOTLB (bounce buffers).
> >>>>> +	It is reasonable to assume that at least one of these mechanisms
> >>>>> +	allows streaming DMA to any physical address.  
> >>>
> >>> Now I realize this last sentence may be contentious...  
> >>
> >> The whole paragraph is wrong as written, not least because it is
> >> conflating two separate things: "any physical address" is objectively
> >> untrue, since SWIOTLB can only bounce from buffers within the
> >> kernel's linear/direct map, i.e. not highmem, not random memory
> >> carveouts, and definitely not PAs which are not RAM at all.  
> > 
> > I see, saying "any" was indeed too strong.
> >   
> >> Secondly, even if the source buffer *is* bounceable/mappable, there is
> >> still no guarantee at all that it can actually be made to appear at a
> >> DMA address within an arbitrary DMA mask. We aim for a general
> >> expectation that 32-bit DMA masks should be well-supported (but still
> >> not 100% guaranteed), but anything smaller can absolutely still have a
> >> high chance of failing, e.g. due to the SWIOTLB buffer being allocated
> >> too high or limited IOVA space.  
> > 
> > Of course this cannot be guaranteed. The function may always fail and
> > return DMA_MAPPING_ERROR. No doubts about it.
> >   
> >>> @Marek, @Robin Do you agree that device drivers should not be concerned
> >>> about the physical address of a buffer passed to the streaming DMA API?
> >>>
> >>> I mean, are there any real-world systems with:
> >>>     * some RAM that is not DMA-addressable,
> >>>     * no IOMMU,
> >>>     * CONFIG_SWIOTLB is not set?  
> >>
> >> Yes, almost certainly, because "DMA-addressable" depends on individual
> >> devices. You can't stop a user from sticking, say, a Broadcom 43xx WiFi
> >> card into a PCI slot on an i.MX6 board with 2GB of RAM that *starts*
> >> just above its 31-bit DMA capability. People are still using AMD Seattle
> >> machines, where even though arm64 does have SWIOTLB it's essentially
> >> useless since RAM starts up around 40 bits IIRC (and although they do
> >> also have SMMUs for PCI, older firmware didn't advertise them).  
> > 
> > Some of these scenarios can never work properly because of hardware
> > limitations. There's nothing software can do about a bus master which
> > cannot address any RAM in the machine. I'm not trying to claim that an
> > operating system kernel can do magic and square the circle. If that's
> > how it sounded, then my wording needs to be improved.
> > 
> > IIUC, the expected audience of this document is device driver authors.
> > They want clear guidance on how they should allocate buffers for the
> > streaming DMA API. Now, it is my understanding that device drivers
> > should *not* have to care about the physical location of a buffer
> > passed to the streaming DMA API.
> > 
> > Even if a bus master implements less than 32 address bits in hardware,
> > I'm convinced that device drivers should not have to examine the system
> > to check if an IOMMU is available and try to guess whether a buffer
> > must be bounced, and how exactly the bounce buffer should be allocated.  
> 
> It's never been suggested that drivers should do that; indeed trying to 
> poke into and second-guess the DMA API implementation is generally even 
> less OK than making blind assumptions about what it might do. The 
> overall message here is essentially "if you want to do streaming DMA 
> then you may need to be wary of where your memory comes from." We can't 
> just throw that out and say "Yeah it's fine now, whatever you do the API 
> will deal with it" because that simply isn't true as a general 
> statement; drivers dealing with limited DMA masks *do* still need to be 
> concerned with GFP_DMA (or even GFP_DMA32 might still be advisable in 
> certain cases) if they want to have an appreciable chance of success. 
> All that's different these days is that notion of "limited" generally 
> meaning "32 bits or smaller".

We're on the same page then. I'm going to write a better explanation of
how things work and what is expected of DMA API users.

Thank you very much for your feedback! I'm sure it will be greatly
appreciated by future generations of device driver authors.

> > If we can agree on this, I can iron out the details for a v2 of this
> > patch series.
> >   
> >>> FWIW if _I_ received a bug report that a device driver fails to submit
> >>> I/O on such a system, I would politely explain to the reporter that their
> >>> kernel is misconfigured, and they should enable CONFIG_SWIOTLB.  
> >>
> >> It's not really that simple. SWIOTLB, ZONE_DMA, etc. require platform
> >> support, which end users can't just turn on if it's not there to begin with.  
> > 
> > I know this very well. As you may not be aware, my ultimate goal is to
> > get rid of ZONE_DMA and instead enhance the buddy allocator to allow
> > allocations within an arbitrary physical address range, which will not
> > rely on platform support. But that's another story; for now, let's just
> > agree on how the DMA API is supposed to work.  
> 
> Indeed that might actually end up pushing things in the opposite 
> direction, at least in some cases. Right now, a driver with, say, a 
> 40-bit DMA mask is usually better off not special-casing DMA buffers, 
> and just making plain GFP_KERNEL allocations for everything (on the 
> assumption that 64-bit systems with masses of memory *should* have 
> SWIOTLB to cover things in the worst case), vs. artificially 
> constraining its DMA buffers to GFP_DMA32 and having to deal with 
> allocation failure more often. However with a more precise and flexible 
> allocator, there's then a much stronger incentive for such drivers to 
> explicitly mark *every* allocation that may be used for DMA, in order to 
> get the optimal behaviour.

I have a different opinion. Most buffers that are passed to the
streaming DMA API are I/O data (data read from/written to disk, or
received from/sent to network). For the write/send case, these pages
were previously allocated by user space, and at that point the kernel
had no clue that they would be later used for device I/O.

For example, consider this user-space sequence:

	buffer = malloc(BUFFER_SIZE);
	fill_in_data(buffer);
	res = write(fd, buffer, BUFFER_SIZE);

The write(2) syscall will try to do zero copy, and that's how the
buffer address is passed down to a device driver. If the buffer is not
directly accessible by the device, its content must be copied to a
different physical location. That should be done by SWIOTLB, not the
device driver. The last chance to choose a better placement for the buffer
was at malloc(3) time, but at that time the device driver was not
involved at all. Er, yes, we may want to provide an ioctl to allocate
a suitable buffer for a target device. I think DRM even had such an
ioctl once and then removed it, because it was not used in any released
userspace code...

In short, the device driver has no control over how these buffers were
allocated, and it's not fair to expect anything from the driver.

Sure, there are also control data structures, e.g. Tx/Rx rings, but
they are typically allocated during device initialization (or ndo_open)
using the coherent DMA API and reused for all subsequent I/O.
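
The coherent-API pattern for such control structures typically looks
like this (a sketch only; the ring size and names are made up for
illustration):

	#include <linux/dma-mapping.h>

	#define EXAMPLE_RING_BYTES 4096	/* hypothetical ring size */

	struct example_ring {
		void *desc;	/* CPU virtual address of descriptors */
		dma_addr_t dma;	/* device-visible address of the same */
	};

	/* Allocate the ring once, e.g. at probe or ndo_open time; the
	 * DMA layer picks memory the device can reach and keeps it
	 * coherent for the lifetime of the mapping.
	 */
	static int example_ring_init(struct device *dev,
				     struct example_ring *ring)
	{
		ring->desc = dma_alloc_coherent(dev, EXAMPLE_RING_BYTES,
						&ring->dma, GFP_KERNEL);
		return ring->desc ? 0 : -ENOMEM;
	}

	static void example_ring_free(struct device *dev,
				      struct example_ring *ring)
	{
		dma_free_coherent(dev, EXAMPLE_RING_BYTES,
				  ring->desc, ring->dma);
	}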

In summary, yes, it would be great if we could reduce bouncing, but
most of that work has already been done, and there's little left for
improvement. So, why am I working on a PAR (Physical Address Range)
Allocator? Certainly not to help users of the streaming DMA API. No,
but it should help dynamic SWIOTLB when the primary SWIOTLB is
allocated in an unsuitable physical location.

Petr T

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints
  2025-06-26 19:40             ` Petr Tesarik
@ 2025-06-27 11:07               ` Robin Murphy
  2025-06-27 11:32                 ` Petr Tesarik
  0 siblings, 1 reply; 30+ messages in thread
From: Robin Murphy @ 2025-06-27 11:07 UTC (permalink / raw)
  To: Petr Tesarik
  Cc: Marek Szyprowski, Bagas Sanjaya, Jonathan Corbet, Andrew Morton,
	Leon Romanovsky, Keith Busch, Caleb Sander Mateos, Sagi Grimberg,
	Jens Axboe, John Garry, open list:DOCUMENTATION, open list,
	open list:MEMORY MANAGEMENT, iommu

On 2025-06-26 8:40 pm, Petr Tesarik wrote:
> On Thu, 26 Jun 2025 17:45:18 +0100
> Robin Murphy <robin.murphy@arm.com> wrote:
> 
>> On 26/06/2025 2:48 pm, Petr Tesarik wrote:
>>> On Thu, 26 Jun 2025 10:58:00 +0100
>>> Robin Murphy <robin.murphy@arm.com> wrote:
>>>    
>>>> On 2025-06-26 6:06 am, Petr Tesarik wrote:
>>>>> On Thu, 26 Jun 2025 08:49:17 +0700
>>>>> Bagas Sanjaya <bagasdotme@gmail.com> wrote:
>>>>>       
>>>>>> On Tue, Jun 24, 2025 at 03:39:22PM +0200, Petr Tesarik wrote:
>>>>>>> diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
>>>>>>> index cd432996949c..65132ec88104 100644
>>>>>>> --- a/Documentation/core-api/dma-api.rst
>>>>>>> +++ b/Documentation/core-api/dma-api.rst
>>>>>>> @@ -210,18 +210,12 @@ DMA_BIDIRECTIONAL	direction isn't known
>>>>>>>     	this API should be obtained from sources which guarantee it to be
>>>>>>>     	physically contiguous (like kmalloc).
>>>>>>>     
>>>>>>> -	Further, the DMA address of the memory must be within the dma_mask of
>>>>>>> -	the device.  To ensure that the memory allocated by kmalloc is within
>>>>>>> -	the dma_mask, the driver may specify various platform-dependent flags
>>>>>>> -	to restrict the DMA address range of the allocation (e.g., on x86,
>>>>>>> -	GFP_DMA guarantees to be within the first 16MB of available DMA
>>>>>>> -	addresses, as required by ISA devices).
>>>>>>> -
>>>>>>> -	Note also that the above constraints on physical contiguity and
>>>>>>> -	dma_mask may not apply if the platform has an IOMMU (a device which
>>>>>>> -	maps an I/O DMA address to a physical memory address).  However, to be
>>>>>>> -	portable, device driver writers may *not* assume that such an IOMMU
>>>>>>> -	exists.
>>>>>>> +	Mapping may also fail if the memory is not within the DMA mask of the
>>>>>>> +	device.  However, this constraint does not apply if the platform has
>>>>>>> +	an IOMMU (a device which maps an I/O DMA address to a physical memory
>>>>>>> +	address), or the kernel is configured with SWIOTLB (bounce buffers).
>>>>>>> +	It is reasonable to assume that at least one of these mechanisms
>>>>>>> +	allows streaming DMA to any physical address.
>>>>>
>>>>> Now I realize this last sentence may be contentious...
>>>>
>>>> The whole paragraph is wrong as written, not least because it is
>>>> conflating two separate things: "any physical address" is objectively
>>>> untrue, since SWIOTLB can only bounce from buffers within the
>>>> kernel's linear/direct map, i.e. not highmem, not random memory
>>>> carveouts, and definitely not PAs which are not RAM at all.
>>>
>>> I see, saying "any" was indeed too strong.
>>>    
>>>> Secondly, even if the source buffer *is* bounceable/mappable, there is
>>>> still no guarantee at all that it can actually be made to appear at a
>>>> DMA address within an arbitrary DMA mask. We aim for a general
>>>> expectation that 32-bit DMA masks should be well-supported (but still
>>>> not 100% guaranteed), but anything smaller can absolutely still have a
>>>> high chance of failing, e.g. due to the SWIOTLB buffer being allocated
>>>> too high or limited IOVA space.
>>>
>>> Of course this cannot be guaranteed. The function may always fail and
>>> return DMA_MAPPING_ERROR. No doubts about it.
>>>    
>>>>> @Marek, @Robin Do you agree that device drivers should not be concerned
>>>>> about the physical address of a buffer passed to the streaming DMA API?
>>>>>
>>>>> I mean, are there any real-world systems with:
>>>>>      * some RAM that is not DMA-addressable,
>>>>>      * no IOMMU,
>>>>>      * CONFIG_SWIOTLB is not set?
>>>>
>>>> Yes, almost certainly, because "DMA-addressable" depends on individual
>>>> devices. You can't stop a user from sticking, say, a Broadcom 43xx WiFi
>>>> card into a PCI slot on an i.MX6 board with 2GB of RAM that *starts*
>>>> just above its 31-bit DMA capability. People are still using AMD Seattle
>>>> machines, where even though arm64 does have SWIOTLB it's essentially
>>>> useless since RAM starts up around 40 bits IIRC (and although they do
>>>> also have SMMUs for PCI, older firmware didn't advertise them).
>>>
>>> Some of these scenarios can never work properly because of hardware
>>> limitations. There's nothing software can do about a bus master which
>>> cannot address any RAM in the machine. I'm not trying to claim that an
>>> operating system kernel can do magic and square the circle. If that's
>>> how it sounded, then my wording needs to be improved.
>>>
>>> IIUC, the expected audience of this document is device driver authors.
>>> They want clear guidance on how they should allocate buffers for the
>>> streaming DMA API. Now, it is my understanding that device drivers
>>> should *not* have to care about the physical location of a buffer
>>> passed to the streaming DMA API.
>>>
>>> Even if a bus master implements less than 32 address bits in hardware,
>>> I'm convinced that device drivers should not have to examine the system
>>> to check if an IOMMU is available and try to guess whether a buffer
>>> must be bounced, and how exactly the bounce buffer should be allocated.
>>
>> It's never been suggested that drivers should do that; indeed trying to
>> poke into and second-guess the DMA API implementation is generally even
>> less OK than making blind assumptions about what it might do. The
>> overall message here is essentially "if you want to do streaming DMA
>> then you may need to be wary of where your memory comes from." We can't
>> just throw that out and say "Yeah it's fine now, whatever you do the API
>> will deal with it" because that simply isn't true as a general
>> statement; drivers dealing with limited DMA masks *do* still need to be
>> concerned with GFP_DMA (or even GFP_DMA32 might still be advisable in
>> certain cases) if they want to have an appreciable chance of success.
>> All that's different these days is that notion of "limited" generally
>> meaning "32 bits or smaller".
> 
> We're on the same page then. I'm going to make a better explanation of
> how things work and what is expected from DMA API users.
> 
> Thank you very much for your feedback! I'm sure it will be greatly
> appreciated by future generations of device driver authors.
> 
>>> If we can agree on this, I can iron out the details for a v2 of this
>>> patch series.
>>>    
>>>>> FWIW if _I_ received a bug report that a device driver fails to submit
>>>>> I/O on such a system, I would politely explain to the reporter that their
>>>>> kernel is misconfigured, and they should enable CONFIG_SWIOTLB.
>>>>
>>>> It's not really that simple. SWIOTLB, ZONE_DMA, etc. require platform
>>>> support, which end users can't just turn on if it's not there to begin with.
>>>
>>> I know this very well. As you may not be aware, my ultimate goal is to
>>> get rid of ZONE_DMA and instead enhance the buddy allocator to allow
>>> allocations within an arbitrary physical address range, which will not
>>> rely on platform support. But that's another story; for now, let's just
>>> agree on how the DMA API is supposed to work.
>>
>> Indeed that might actually end up pushing things in the opposite
>> direction, at least in some cases. Right now, a driver with, say, a
>> 40-bit DMA mask is usually better off not special-casing DMA buffers,
>> and just making plain GFP_KERNEL allocations for everything (on the
>> assumption that 64-bit systems with masses of memory *should* have
>> SWIOTLB to cover things in the worst case), vs. artificially
>> constraining its DMA buffers to GFP_DMA32 and having to deal with
>> allocation failure more often. However with a more precise and flexible
>> allocator, there's then a much stronger incentive for such drivers to
>> explicitly mark *every* allocation that may be used for DMA, in order to
>> get the optimal behaviour.
> 
> I have a different opinion. Most buffers that are passed to the
> streaming DMA API are I/O data (data read from/written to disk, or
> received from/sent to network). For the write/send case, these pages
> were previously allocated by user space, and at that point the kernel
> had no clue that they would be later used for device I/O.
> 
> For example, consider this user-space sequence:
> 
> 	buffer = malloc(BUFFER_SIZE);
> 	fill_in_data(buffer);
> 	res = write(fd, buffer, BUFFER_SIZE);
> 
> The write(2) syscall will try to do zero copy, and that's how the
> buffer address is passed down to a device driver. If the buffer is not
> directly accessible by the device, its content must be copied to a
> different physical location. That should be done by SWIOTLB, not the
> device driver. The last chance to choose a better placement for the buffer
> was at malloc(3) time, but at that time the device driver was not
> involved at all. Er, yes, we may want to provide an ioctl to allocate
> a suitable buffer for a target device. I think DRM even had such an
> ioctl once and then removed it, because it was not used in any released
> userspace code...
> 
> In short, the device driver has no control over how these buffers were
> allocated, and it's not fair to expect anything from the driver.

Indeed, for true zero-copy to existing userspace memory then there's not 
much anyone can change, hence "at least in some cases". However, there 
are an awful lot of drivers/subsystems which use streaming DMA on their 
own relatively short-lived kmalloc() allocations - the first example 
which always comes to mind is all the interfaces like SPI, I2C, UART, 
etc. which are either dmaengine clients or have their own DMA (and 
indeed some of which were historically trying to do it from temporary 
buffers on the stack). Heck, even alloc_skb() might end up being 
commonly used if this "ethernet" thing ever catches on...
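
The pattern described above, for one of these short-lived
driver-allocated buffers, might be sketched as follows (names and the
SPI-flavoured framing are hypothetical; the point is only the
kmalloc-map-unmap-free lifecycle):

	/* Hedged sketch: a short-lived, DMA-safe copy of a caller's
	 * data for a hypothetical SPI-style controller.  Stack buffers
	 * must never be mapped for DMA, hence the kmemdup().
	 */
	static int example_spi_write(struct device *dev,
				     const void *data, size_t len)
	{
		void *buf = kmemdup(data, len, GFP_KERNEL);
		dma_addr_t dma;
		int ret = 0;

		if (!buf)
			return -ENOMEM;

		dma = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
		if (dma_mapping_error(dev, dma)) {
			ret = -ENOMEM;
			goto out;
		}

		/* ... start the transfer, wait for completion ... */

		dma_unmap_single(dev, dma, len, DMA_TO_DEVICE);
	out:
		kfree(buf);
		return ret;
	}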

Thanks,
Robin.

> Sure, there are also control data structures, e.g. Tx/Rx rings, but
> they are typically allocated during device initialization (or ndo_open)
> using the coherent DMA API and reused for all subsequent I/O.
> 
> In summary, yes, it would be great if we could reduce bouncing, but
> most of that work has already been done, and there's little left for
> improvement. So, why am I working on a PAR (Physical Address Range)
> Allocator? Certainly not to help users of the streaming DMA API. No,
> but it should help dynamic SWIOTLB when the primary SWIOTLB is
> allocated in an unsuitable physical location.
> 
> Petr T


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints
  2025-06-27 11:07               ` Robin Murphy
@ 2025-06-27 11:32                 ` Petr Tesarik
  0 siblings, 0 replies; 30+ messages in thread
From: Petr Tesarik @ 2025-06-27 11:32 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Marek Szyprowski, Bagas Sanjaya, Jonathan Corbet, Andrew Morton,
	Leon Romanovsky, Keith Busch, Caleb Sander Mateos, Sagi Grimberg,
	Jens Axboe, John Garry, open list:DOCUMENTATION, open list,
	open list:MEMORY MANAGEMENT, iommu

On Fri, 27 Jun 2025 12:07:56 +0100
Robin Murphy <robin.murphy@arm.com> wrote:

> On 2025-06-26 8:40 pm, Petr Tesarik wrote:
> > On Thu, 26 Jun 2025 17:45:18 +0100
> > Robin Murphy <robin.murphy@arm.com> wrote:
> >   
> >> On 26/06/2025 2:48 pm, Petr Tesarik wrote:  
> >>> On Thu, 26 Jun 2025 10:58:00 +0100
> >>> Robin Murphy <robin.murphy@arm.com> wrote:
>[...]
> >>>> It's not really that simple. SWIOTLB, ZONE_DMA, etc. require platform
> >>>> support, which end users can't just turn on if it's not there to begin with.  
> >>>
> >>> I know this very well. As you may not be aware, my ultimate goal is to
> >>> get rid of ZONE_DMA and instead enhance the buddy allocator to allow
> >>> allocations within an arbitrary physical address range, which will not
> >>> rely on platform support. But that's another story; for now, let's just
> >>> agree on how the DMA API is supposed to work.  
> >>
> >> Indeed that might actually end up pushing things in the opposite
> >> direction, at least in some cases. Right now, a driver with, say, a
> >> 40-bit DMA mask is usually better off not special-casing DMA buffers,
> >> and just making plain GFP_KERNEL allocations for everything (on the
> >> assumption that 64-bit systems with masses of memory *should* have
> >> SWIOTLB to cover things in the worst case), vs. artificially
> >> constraining its DMA buffers to GFP_DMA32 and having to deal with
> >> allocation failure more often. However with a more precise and flexible
> >> allocator, there's then a much stronger incentive for such drivers to
> >> explicitly mark *every* allocation that may be used for DMA, in order to
> >> get the optimal behaviour.  
> > 
> > I have a different opinion. Most buffers that are passed to the
> > streaming DMA API are I/O data (data read from/written to disk, or
> > received from/sent to network). For the write/send case, these pages
> > were previously allocated by user space, and at that point the kernel
> > had no clue that they would be later used for device I/O.
> > 
> > For example, consider this user-space sequence:
> > 
> > 	buffer = malloc(BUFFER_SIZE);
> > 	fill_in_data(buffer);
> > 	res = write(fd, buffer, BUFFER_SIZE);
> > 
> > The write(2) syscall will try to do zero copy, and that's how the
> > buffer address is passed down to a device driver. If the buffer is not
> > directly accessible by the device, its content must be copied to a
> > different physical location. That should be done by SWIOTLB, not the
> > device driver. Last chance to chose a better placement for the buffer
> > was at malloc(3) time, but at that time the device driver was not
> > involved at all. Er, yes, we may want to provide an ioctl to allocate
> > not involved at all. Er, yes, we may want to provide an ioctl to
> > allocate a suitable buffer for a target device. I think DRM even had
> > such an ioctl once and then removed it, because it was not used in any
> > released userspace code...
> > 
> > In short, the device driver has no control of how these buffers were
> > allocated, and it's not fair to expect anything from the driver.  
> 
> Indeed, for true zero-copy to existing userspace memory then there's not 
> much anyone can change, hence "at least in some cases". However, there 
> are an awful lot of drivers/subsystems which use streaming DMA on their 
> own relatively short-lived kmalloc() allocations - the first example 
> which always comes to mind is all the interfaces like SPI, I2C, UART, 
> etc. which are either dmaengine clients or have their own DMA (and 
> indeed some of which were historically trying to do it from temporary 
> buffers on the stack). Heck, even alloc_skb() might end up being 
> commonly used if this "ethernet" thing ever catches on...

I have been looking around a bit already, and I didn't see an _awful_
lot of these short-lived allocations, but yes, I've found some, and
yes, most of them are in the subsystems you mentioned...

Anyway, thank you for your patience with reading my DMA API docs
update!

Petr T

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints
  2025-06-26  5:06     ` Petr Tesarik
  2025-06-26  7:09       ` Marek Szyprowski
  2025-06-26  9:58       ` Robin Murphy
@ 2025-06-27 12:52       ` Christoph Hellwig
  2 siblings, 0 replies; 30+ messages in thread
From: Christoph Hellwig @ 2025-06-27 12:52 UTC (permalink / raw)
  To: Petr Tesarik
  Cc: Marek Szyprowski, Robin, Bagas Sanjaya, Jonathan Corbet,
	Andrew Morton, Leon Romanovsky, Keith Busch, Caleb Sander Mateos,
	Sagi Grimberg, Jens Axboe, John Garry, open list:DOCUMENTATION,
	open list, open list:MEMORY MANAGEMENT, iommu

On Thu, Jun 26, 2025 at 07:06:02AM +0200, Petr Tesarik wrote:
> @Marek, @Robin Do you agree that device drivers should not be concerned
> about the physical address of a buffer passed to the streaming DMA API?
> 
> I mean, are there any real-world systems with:
>   * some RAM that is not DMA-addressable,
>   * no IOMMU,
>   * CONFIG_SWIOTLB is not set?
> 
> FWIW if _I_ received a bug report that a device driver fails to submit
> I/O on such a system, I would politely explain to the reporter that
> their kernel is misconfigured, and they should enable CONFIG_SWIOTLB.

Modulo the < 32-bit (or 31-bit for some systems) case, the
general idea was that the iommu API always works except for temporary
resource shortages.  Now if you configure without SWIOTLB on a system
that would otherwise need it, you gotta keep the pieces.  That's why
I've always argued against making that (and ZONE_DMA) user selectable,
but some arch maintainers insisted that they want that, breaking the
original guarantee.  Which is annoying as dma_map_* can't tell you
if the failure is permanent or transient.
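
To illustrate that last point, here is a minimal sketch of the pattern
in question (hypothetical driver code; dma_map_single() and
dma_mapping_error() are the real interfaces, the xyz_submit() wrapper
and its policy are made up):

```c
/* Sketch only: a hypothetical driver mapping a kmalloc() buffer for
 * streaming DMA.  The caller cannot distinguish a permanent failure
 * (buffer not device-addressable and no SWIOTLB) from a transient one
 * (SWIOTLB temporarily exhausted) -- both surface the same way.
 */
static int xyz_submit(struct device *dev, void *buf, size_t len)
{
	dma_addr_t addr;

	addr = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
	if (dma_mapping_error(dev, addr))
		return -ENOMEM;	/* permanent or transient? can't tell */

	/* ... hand "addr" to the hardware ... */

	dma_unmap_single(dev, addr, len, DMA_TO_DEVICE);
	return 0;
}
```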


* Re: [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints
  2025-06-26 16:45           ` Robin Murphy
  2025-06-26 19:40             ` Petr Tesarik
@ 2025-06-27 12:55             ` Christoph Hellwig
  2025-06-27 13:02               ` Petr Tesarik
  1 sibling, 1 reply; 30+ messages in thread
From: Christoph Hellwig @ 2025-06-27 12:55 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Petr Tesarik, Marek Szyprowski, Bagas Sanjaya, Jonathan Corbet,
	Andrew Morton, Leon Romanovsky, Keith Busch, Caleb Sander Mateos,
	Sagi Grimberg, Jens Axboe, John Garry, open list:DOCUMENTATION,
	open list, open list:MEMORY MANAGEMENT, iommu

On Thu, Jun 26, 2025 at 05:45:18PM +0100, Robin Murphy wrote:
> Indeed that might actually end up pushing things in the opposite direction,
> at least in some cases. Right now, a driver with, say, a 40-bit DMA mask is
> usually better off not special-casing DMA buffers, and just making plain
> GFP_KERNEL allocations for everything (on the assumption that 64-bit systems
> with masses of memory *should* have SWIOTLB to cover things in the worst
> case), vs. artificially constraining its DMA buffers to GFP_DMA32 and having
> to deal with allocation failure more often. However with a more precise and
> flexible allocator, there's then a much stronger incentive for such drivers
> to explicitly mark *every* allocation that may be used for DMA, in order to
> get the optimal behaviour.

It really should be using dma_alloc_pages to ensure it gets addressable
memory for these cases.  For sub-page allocations it could use dmapool,
but that's a little annoying because it does coherent allocations which
95% of the users don't actually need.
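
For context, a dma_alloc_pages() based allocation might look roughly
like this (a sketch only; the error-handling policy and usage comments
are illustrative, not taken from any existing driver):

```c
/* Sketch: allocating device-addressable, non-coherent pages with
 * dma_alloc_pages() instead of a plain GFP_KERNEL allocation.  The
 * returned pages respect the device's DMA mask, but device access
 * still needs the usual dma_sync_single_for_{cpu,device}() calls.
 */
struct page *page;
dma_addr_t dma;

page = dma_alloc_pages(dev, size, &dma, DMA_BIDIRECTIONAL, GFP_KERNEL);
if (!page)
	return -ENOMEM;

/* CPU uses page_address(page); the device uses "dma". */

dma_free_pages(dev, size, page, dma, DMA_BIDIRECTIONAL);
```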



* Re: [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints
  2025-06-27 12:55             ` Christoph Hellwig
@ 2025-06-27 13:02               ` Petr Tesarik
  0 siblings, 0 replies; 30+ messages in thread
From: Petr Tesarik @ 2025-06-27 13:02 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Robin Murphy, Marek Szyprowski, Bagas Sanjaya, Jonathan Corbet,
	Andrew Morton, Leon Romanovsky, Keith Busch, Caleb Sander Mateos,
	Sagi Grimberg, Jens Axboe, John Garry, open list:DOCUMENTATION,
	open list, open list:MEMORY MANAGEMENT, iommu

On Fri, 27 Jun 2025 05:55:09 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> On Thu, Jun 26, 2025 at 05:45:18PM +0100, Robin Murphy wrote:
> > Indeed that might actually end up pushing things in the opposite direction,
> > at least in some cases. Right now, a driver with, say, a 40-bit DMA mask is
> > usually better off not special-casing DMA buffers, and just making plain
> > GFP_KERNEL allocations for everything (on the assumption that 64-bit systems
> > with masses of memory *should* have SWIOTLB to cover things in the worst
> > case), vs. artificially constraining its DMA buffers to GFP_DMA32 and having
> > to deal with allocation failure more often. However with a more precise and
> > flexible allocator, there's then a much stronger incentive for such drivers
> > to explicitly mark *every* allocation that may be used for DMA, in order to
> > get the optimal behaviour.  
> 
> It really should be using dma_alloc_pages to ensure it gets addressable
> memory for these cases.  For sub-page allocations it could use dmapool,
> but that's a little annoying because it does coherent allocations which
> 95% of the users don't actually need.

Wow, thank you for this insight! There's one item on my TODO list:
convert SLAB_CACHE_DMA caches to dmapool. But now I see it would
introduce a regression (accessing DMA-coherent pages may be much
slower). I could implement a variant of dmapool which allocates normal
pages from a given physical address range, and it seems it would be
actually useful.
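
For reference, the existing dmapool interface that such a conversion
would build on looks like this (a sketch; the pool name and block size
are made up):

```c
/* Sketch: dma_pool provides sub-page allocations, but always from
 * DMA-coherent memory -- the property most users don't actually need,
 * as noted above.
 */
struct dma_pool *pool;
void *vaddr;
dma_addr_t handle;

pool = dma_pool_create("xyz-descs", dev, 64, 64, 0);
if (!pool)
	return -ENOMEM;

vaddr = dma_pool_alloc(pool, GFP_KERNEL, &handle);
/* ... use vaddr from the CPU, handle from the device ... */
dma_pool_free(pool, vaddr, handle);
dma_pool_destroy(pool);
```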

Petr T


end of thread, other threads:[~2025-06-27 13:02 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-06-24 13:39 [PATCH 0/8] update DMA API documentation Petr Tesarik
2025-06-24 13:39 ` [PATCH 1/8] docs: dma-api: use "DMA API" consistently throughout the document Petr Tesarik
2025-06-25  2:41   ` Randy Dunlap
2025-06-24 13:39 ` [PATCH 2/8] docs: dma-api: replace consistent with coherent Petr Tesarik
2025-06-26  4:51   ` Petr Tesarik
2025-06-26  7:21     ` Marek Szyprowski
2025-06-24 13:39 ` [PATCH 3/8] docs: dma-api: remove remnants of PCI DMA API Petr Tesarik
2025-06-26  1:46   ` Bagas Sanjaya
2025-06-24 13:39 ` [PATCH 4/8] docs: dma-api: add a kernel-doc comment for dma_pool_zalloc() Petr Tesarik
2025-06-24 13:39 ` [PATCH 5/8] docs: dma-api: remove duplicate description of the DMA pool API Petr Tesarik
2025-06-25  2:40   ` Randy Dunlap
2025-06-25  6:41     ` Petr Tesarik
2025-06-24 13:39 ` [PATCH 6/8] docs: dma-api: clarify DMA addressing limitations Petr Tesarik
2025-06-26  1:47   ` Bagas Sanjaya
2025-06-24 13:39 ` [PATCH 7/8] docs: dma-api: update streaming DMA API physical address constraints Petr Tesarik
2025-06-26  1:49   ` Bagas Sanjaya
2025-06-26  5:06     ` Petr Tesarik
2025-06-26  7:09       ` Marek Szyprowski
2025-06-26  8:25         ` Petr Tesarik
2025-06-26  9:58       ` Robin Murphy
2025-06-26 13:48         ` Petr Tesarik
2025-06-26 16:45           ` Robin Murphy
2025-06-26 19:40             ` Petr Tesarik
2025-06-27 11:07               ` Robin Murphy
2025-06-27 11:32                 ` Petr Tesarik
2025-06-27 12:55             ` Christoph Hellwig
2025-06-27 13:02               ` Petr Tesarik
2025-06-27 12:52       ` Christoph Hellwig
2025-06-24 13:39 ` [PATCH 8/8] docs: dma-api: clean up documentation of dma_map_sg() Petr Tesarik
2025-06-26  1:50   ` Bagas Sanjaya

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).