linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8
@ 2022-11-06 22:01 Catalin Marinas
  2022-11-06 22:01 ` [PATCH v3 01/13] mm/slab: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN Catalin Marinas
                   ` (13 more replies)
  0 siblings, 14 replies; 44+ messages in thread
From: Catalin Marinas @ 2022-11-06 22:01 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman
  Cc: Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

Hi,

This is the third attempt at reducing the kmalloc() minimum alignment on
arm64 below the ARCH_DMA_MINALIGN of 128. The first version was not
aggressive enough, only limiting ARCH_KMALLOC_MINALIGN to 64, while the
second version added an explicit __GFP_PACKED flag.

This third version reduces ARCH_KMALLOC_MINALIGN to 8 while defining
ARCH_DMA_MINALIGN for all platforms, using ARCH_DMA_MINALIGN instead of
ARCH_KMALLOC_MINALIGN wherever a static alignment is needed (structure
or member align attributes).
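
To illustrate the split (a minimal sketch of my own, not part of the
series; the structure and function names are made up):

#include <linux/slab.h>
#include <linux/types.h>

/* A member that a device writes into directly needs the static,
 * worst-case alignment, so annotate it with ARCH_DMA_MINALIGN. */
struct example_dev_buf {
	int len;
	u8 data[64] __aligned(ARCH_DMA_MINALIGN);
};

static void *example_alloc(void)
{
	/* Plain kmalloc() objects may now be only 8-byte aligned; small
	 * ones rely on the DMA API bouncing them when necessary. */
	return kmalloc(16, GFP_KERNEL);
}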

The first patch decouples the kmalloc() and DMA alignment, though this
only takes effect after the Kconfig entry is enabled by the last patch.

Patches 2 and 3 add bouncing via the swiotlb if any of the sizes are
small enough to have originated from an unaligned kmalloc() cache. I'm
not entirely sure whether my approach to the iommu bouncing is correct,
so I'm open to suggestions.

Patch 4 is a fallback in case there is no swiotlb buffer. Together with
patch 6, we can still get a smaller kmalloc() minalign of 64 (typical
cache line size) rather than 128 on arm64. If we improve the bouncing to
use the DMA coherent pool, this run-time __kmalloc_minalign() can go
away. Patch 5 is some cleanup following the refactoring in patch 4.

Patches 7-12 change some ARCH_KMALLOC_MINALIGN uses to
ARCH_DMA_MINALIGN. The crypto changes were previously rejected by
Herbert but I have kept them here until the crypto code is refactored.

The last patch enables the bouncing for arm64.

Thanks.

Catalin Marinas (13):
  mm/slab: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN
  dma-mapping: Force bouncing if the kmalloc() size is not
    cacheline-aligned
  iommu/dma: Force bouncing if the size is not cacheline-aligned
  mm/slab: Allow kmalloc() minimum alignment fallback to
    dma_get_cache_alignment()
  mm/slab: Simplify create_kmalloc_cache() args and make it static
  dma: Allow the smaller cache_line_size() returned by
    dma_get_cache_alignment()
  drivers/base: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/gpu: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/usb: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/spi: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/md: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  dma: arm64: Add CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC and enable it for
    arm64

 arch/arm64/Kconfig            |  2 ++
 drivers/base/devres.c         |  6 ++---
 drivers/gpu/drm/drm_managed.c |  6 ++---
 drivers/iommu/dma-iommu.c     | 12 ++++++---
 drivers/md/dm-crypt.c         |  2 +-
 drivers/spi/spidev.c          |  2 +-
 drivers/usb/core/buffer.c     |  8 +++---
 include/linux/crypto.h        |  2 +-
 include/linux/dma-map-ops.h   | 50 +++++++++++++++++++++++++++++++++++
 include/linux/dma-mapping.h   |  4 ++-
 include/linux/scatterlist.h   | 27 ++++++++++++++++---
 include/linux/slab.h          | 14 +++++++---
 kernel/dma/Kconfig            | 14 ++++++++++
 kernel/dma/direct.h           |  3 ++-
 mm/slab.c                     |  6 +----
 mm/slab.h                     |  5 ++--
 mm/slab_common.c              | 49 +++++++++++++++++++++++++++-------
 17 files changed, 169 insertions(+), 43 deletions(-)



* [PATCH v3 01/13] mm/slab: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN
  2022-11-06 22:01 [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Catalin Marinas
@ 2022-11-06 22:01 ` Catalin Marinas
  2022-11-06 22:01 ` [PATCH v3 02/13] dma-mapping: Force bouncing if the kmalloc() size is not cacheline-aligned Catalin Marinas
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 44+ messages in thread
From: Catalin Marinas @ 2022-11-06 22:01 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman
  Cc: Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

In preparation for supporting a kmalloc() minimum alignment smaller than
the arch DMA alignment, decouple the two definitions. This requires that
the DMA API bounces smaller kmalloc() allocations, hence the smaller
ARCH_KMALLOC_MINALIGN is only enabled if a new
CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC option is enabled (to be introduced
in a subsequent patch).

After this patch, ARCH_DMA_MINALIGN is expected to be used in static
alignment annotations and defined by an architecture to be the maximum
alignment for all supported configurations/SoCs in a single Image.
ARCH_KMALLOC_MINALIGN defaults to __alignof__(unsigned long long).

Since ARCH_DMA_MINALIGN is now always defined, adjust the #ifdef in
dma_get_cache_alignment() so that there is no change for architectures
not requiring a minimum DMA alignment.
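
For illustration, a compile-time sketch of the intended outcome (mine,
assuming arm64 with CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC=y; the function
name is made up):

#include <linux/build_bug.h>
#include <linux/slab.h>

static inline void minalign_sanity_check(void)
{
	/* The static DMA alignment stays at the SoC worst case (128). */
	BUILD_BUG_ON(ARCH_DMA_MINALIGN != 128);
	/* kmalloc() alignment drops to that of a 64-bit integer (8). */
	BUILD_BUG_ON(ARCH_KMALLOC_MINALIGN !=
		     __alignof__(unsigned long long));
}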

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robin Murphy <robin.murphy@arm.com>
---
 include/linux/dma-mapping.h |  2 +-
 include/linux/slab.h        | 14 +++++++++++---
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 0ee20b764000..3288a1339271 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -545,7 +545,7 @@ static inline int dma_set_min_align_mask(struct device *dev,
 
 static inline int dma_get_cache_alignment(void)
 {
-#ifdef ARCH_DMA_MINALIGN
+#ifdef ARCH_HAS_DMA_MINALIGN
 	return ARCH_DMA_MINALIGN;
 #endif
 	return 1;
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 90877fcde70b..b104d63e5456 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -221,12 +221,20 @@ void kmem_dump_obj(void *object);
  * alignment larger than the alignment of a 64-bit integer.
  * Setting ARCH_DMA_MINALIGN in arch headers allows that.
  */
-#if defined(ARCH_DMA_MINALIGN) && ARCH_DMA_MINALIGN > 8
+#ifdef ARCH_DMA_MINALIGN
+#define ARCH_HAS_DMA_MINALIGN
+#if ARCH_DMA_MINALIGN > 8 && !defined(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC)
 #define ARCH_KMALLOC_MINALIGN ARCH_DMA_MINALIGN
-#define KMALLOC_MIN_SIZE ARCH_DMA_MINALIGN
-#define KMALLOC_SHIFT_LOW ilog2(ARCH_DMA_MINALIGN)
+#endif
 #else
+#define ARCH_DMA_MINALIGN __alignof__(unsigned long long)
+#endif
+
+#ifndef ARCH_KMALLOC_MINALIGN
 #define ARCH_KMALLOC_MINALIGN __alignof__(unsigned long long)
+#else
+#define KMALLOC_MIN_SIZE ARCH_KMALLOC_MINALIGN
+#define KMALLOC_SHIFT_LOW ilog2(KMALLOC_MIN_SIZE)
 #endif
 
 /*


* [PATCH v3 02/13] dma-mapping: Force bouncing if the kmalloc() size is not cacheline-aligned
  2022-11-06 22:01 [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Catalin Marinas
  2022-11-06 22:01 ` [PATCH v3 01/13] mm/slab: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN Catalin Marinas
@ 2022-11-06 22:01 ` Catalin Marinas
  2022-11-07  9:43   ` Christoph Hellwig
  2022-11-06 22:01 ` [PATCH v3 03/13] iommu/dma: Force bouncing if the " Catalin Marinas
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 44+ messages in thread
From: Catalin Marinas @ 2022-11-06 22:01 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman
  Cc: Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

For direct DMA, if the size is small enough to have originated from a
kmalloc() cache below ARCH_DMA_MINALIGN, check its alignment against
cache_line_size() and bounce if necessary. For larger sizes, it is the
responsibility of the DMA API caller to ensure proper alignment.
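
For example (a hypothetical driver path; 'dev' is a placeholder and the
sizes assume a 128-byte cache_line_size()):

#include <linux/dma-mapping.h>
#include <linux/slab.h>

static dma_addr_t example_map_small(struct device *dev)
{
	/* A 32-byte object may share cache lines with its neighbours,
	 * so a non-coherent DMA_FROM_DEVICE mapping gets bounced. */
	void *buf = kmalloc(32, GFP_KERNEL);

	if (!buf)
		return DMA_MAPPING_ERROR;
	return dma_map_single(dev, buf, 32, DMA_FROM_DEVICE);
}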

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robin Murphy <robin.murphy@arm.com>
---
 include/linux/dma-map-ops.h | 27 +++++++++++++++++++++++++++
 kernel/dma/direct.h         |  3 ++-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index d678afeb8a13..785f7aa90f57 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -8,6 +8,7 @@
 
 #include <linux/dma-mapping.h>
 #include <linux/pgtable.h>
+#include <linux/slab.h>
 
 struct cma;
 
@@ -275,6 +276,32 @@ static inline bool dev_is_dma_coherent(struct device *dev)
 }
 #endif /* CONFIG_ARCH_HAS_DMA_COHERENCE_H */
 
+/*
+ * Check whether the given size, assuming it is for a kmalloc()'ed object, is
+ * safe for non-coherent DMA or needs bouncing.
+ */
+static inline bool dma_kmalloc_needs_bounce(struct device *dev, size_t size,
+					    enum dma_data_direction dir)
+{
+	/*
+	 * No need for bouncing if coherent DMA or the direction is
+	 * DMA_TO_DEVICE.
+	 */
+	if (!IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ||
+	    dir == DMA_TO_DEVICE || dev_is_dma_coherent(dev))
+		return false;
+
+	/*
+	 * Larger kmalloc() sizes are guaranteed to be aligned to
+	 * ARCH_DMA_MINALIGN.
+	 */
+	if (size >= 2 * ARCH_DMA_MINALIGN ||
+	    IS_ALIGNED(kmalloc_size_roundup(size), dma_get_cache_alignment()))
+		return false;
+
+	return true;
+}
+
 void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 		gfp_t gfp, unsigned long attrs);
 void arch_dma_free(struct device *dev, size_t size, void *cpu_addr,
diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h
index e38ffc5e6bdd..97ec892ea0b5 100644
--- a/kernel/dma/direct.h
+++ b/kernel/dma/direct.h
@@ -94,7 +94,8 @@ static inline dma_addr_t dma_direct_map_page(struct device *dev,
 		return swiotlb_map(dev, phys, size, dir, attrs);
 	}
 
-	if (unlikely(!dma_capable(dev, dma_addr, size, true))) {
+	if (unlikely(!dma_capable(dev, dma_addr, size, true)) ||
+	    dma_kmalloc_needs_bounce(dev, size, dir)) {
 		if (is_pci_p2pdma_page(page))
 			return DMA_MAPPING_ERROR;
 		if (is_swiotlb_active(dev))


* [PATCH v3 03/13] iommu/dma: Force bouncing if the size is not cacheline-aligned
  2022-11-06 22:01 [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Catalin Marinas
  2022-11-06 22:01 ` [PATCH v3 01/13] mm/slab: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN Catalin Marinas
  2022-11-06 22:01 ` [PATCH v3 02/13] dma-mapping: Force bouncing if the kmalloc() size is not cacheline-aligned Catalin Marinas
@ 2022-11-06 22:01 ` Catalin Marinas
  2022-11-07  9:46   ` Christoph Hellwig
  2022-11-14 23:23   ` Isaac Manjarres
  2022-11-06 22:01 ` [PATCH v3 04/13] mm/slab: Allow kmalloc() minimum alignment fallback to dma_get_cache_alignment() Catalin Marinas
                   ` (10 subsequent siblings)
  13 siblings, 2 replies; 44+ messages in thread
From: Catalin Marinas @ 2022-11-06 22:01 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman
  Cc: Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

Similarly to the direct DMA case, bounce small allocations as they may
have originated from a kmalloc() cache that is not safe for DMA. Unlike
direct DMA, iommu_dma_map_sg() cannot call iommu_dma_map_sg_swiotlb()
for all non-coherent devices as this would break some cases where the
iova is expected to be contiguous (dmabuf). Instead, scan the
scatterlist for any small sizes and only take the swiotlb path if any
element of the list needs bouncing (note that iommu_dma_map_page()
would still only bounce those buffers which are not DMA-aligned).

To avoid scanning the scatterlist on the 'sync' operations, introduce a
SG_DMA_BOUNCED flag set during the iommu_dma_map_sg() call (suggested by
Robin Murphy).
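
As an illustration of the per-list behaviour (my sketch; 'dev' and the
buffers are placeholders):

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

static int example_map_mixed(struct device *dev, void *big, void *small)
{
	struct scatterlist sgl[2];

	sg_init_table(sgl, 2);
	sg_set_buf(&sgl[0], big, PAGE_SIZE);	/* page-sized, fine on its own */
	sg_set_buf(&sgl[1], small, 16);		/* kmalloc-16, needs bouncing */

	/* A single small element sends the whole list down the swiotlb
	 * path and marks it SG_DMA_BOUNCED for the sync operations. */
	return dma_map_sg(dev, sgl, 2, DMA_FROM_DEVICE);
}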

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robin Murphy <robin.murphy@arm.com>
---

I'm not entirely sure about this approach, but here it is. It also
needs better testing.

 drivers/iommu/dma-iommu.c   | 12 ++++++++----
 include/linux/dma-map-ops.h | 23 +++++++++++++++++++++++
 include/linux/scatterlist.h | 27 ++++++++++++++++++++++++---
 3 files changed, 55 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 9297b741f5e8..8c80dffe0337 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -948,7 +948,7 @@ static void iommu_dma_sync_sg_for_cpu(struct device *dev,
 	struct scatterlist *sg;
 	int i;
 
-	if (dev_use_swiotlb(dev))
+	if (dev_use_swiotlb(dev) || sg_is_dma_bounced(sgl))
 		for_each_sg(sgl, sg, nelems, i)
 			iommu_dma_sync_single_for_cpu(dev, sg_dma_address(sg),
 						      sg->length, dir);
@@ -964,7 +964,7 @@ static void iommu_dma_sync_sg_for_device(struct device *dev,
 	struct scatterlist *sg;
 	int i;
 
-	if (dev_use_swiotlb(dev))
+	if (dev_use_swiotlb(dev) || sg_is_dma_bounced(sgl))
 		for_each_sg(sgl, sg, nelems, i)
 			iommu_dma_sync_single_for_device(dev,
 							 sg_dma_address(sg),
@@ -990,7 +990,8 @@ static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 	 * If both the physical buffer start address and size are
 	 * page aligned, we don't need to use a bounce page.
 	 */
-	if (dev_use_swiotlb(dev) && iova_offset(iovad, phys | size)) {
+	if ((dev_use_swiotlb(dev) && iova_offset(iovad, phys | size)) ||
+	    dma_kmalloc_needs_bounce(dev, size, dir)) {
 		void *padding_start;
 		size_t padding_size, aligned_size;
 
@@ -1202,7 +1203,10 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
 			goto out;
 	}
 
-	if (dev_use_swiotlb(dev))
+	if (dma_sg_kmalloc_needs_bounce(dev, sg, nents, dir))
+		sg_dma_mark_bounced(sg);
+
+	if (dev_use_swiotlb(dev) || sg_is_dma_bounced(sg))
 		return iommu_dma_map_sg_swiotlb(dev, sg, nents, dir, attrs);
 
 	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 785f7aa90f57..e747a46261d4 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -302,6 +302,29 @@ static inline bool dma_kmalloc_needs_bounce(struct device *dev, size_t size,
 	return true;
 }
 
+/*
+ * Return true if any of the scatterlist elements needs bouncing due to
+ * potentially originating from a small kmalloc() cache.
+ */
+static inline bool dma_sg_kmalloc_needs_bounce(struct device *dev,
+					       struct scatterlist *sg, int nents,
+					       enum dma_data_direction dir)
+{
+	struct scatterlist *s;
+	int i;
+
+	if (!IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ||
+	    dir == DMA_TO_DEVICE || dev_is_dma_coherent(dev))
+		return false;
+
+	for_each_sg(sg, s, nents, i) {
+		if (dma_kmalloc_needs_bounce(dev, s->length, dir))
+			return true;
+	}
+
+	return false;
+}
+
 void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 		gfp_t gfp, unsigned long attrs);
 void arch_dma_free(struct device *dev, size_t size, void *cpu_addr,
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 375a5e90d86a..f16cf040fe2c 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -16,7 +16,7 @@ struct scatterlist {
 #ifdef CONFIG_NEED_SG_DMA_LENGTH
 	unsigned int	dma_length;
 #endif
-#ifdef CONFIG_PCI_P2PDMA
+#if defined(CONFIG_PCI_P2PDMA) || defined(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC)
 	unsigned int    dma_flags;
 #endif
 };
@@ -248,6 +248,29 @@ static inline void sg_unmark_end(struct scatterlist *sg)
 	sg->page_link &= ~SG_END;
 }
 
+#define SG_DMA_BUS_ADDRESS	(1 << 0)
+#define SG_DMA_BOUNCED		(1 << 1)
+
+#ifdef CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC
+static inline bool sg_is_dma_bounced(struct scatterlist *sg)
+{
+	return sg->dma_flags & SG_DMA_BOUNCED;
+}
+
+static inline void sg_dma_mark_bounced(struct scatterlist *sg)
+{
+	sg->dma_flags |= SG_DMA_BOUNCED;
+}
+#else
+static inline bool sg_is_dma_bounced(struct scatterlist *sg)
+{
+	return false;
+}
+static inline void sg_dma_mark_bounced(struct scatterlist *sg)
+{
+}
+#endif
+
 /*
  * CONFIG_PCI_P2PDMA depends on CONFIG_64BIT which means there are 4 bytes
  * in struct scatterlist (assuming also CONFIG_NEED_SG_DMA_LENGTH is set).
@@ -256,8 +279,6 @@ static inline void sg_unmark_end(struct scatterlist *sg)
  */
 #ifdef CONFIG_PCI_P2PDMA
 
-#define SG_DMA_BUS_ADDRESS (1 << 0)
-
 /**
  * sg_dma_is_bus_address - Return whether a given segment was marked
  *			   as a bus address


* [PATCH v3 04/13] mm/slab: Allow kmalloc() minimum alignment fallback to dma_get_cache_alignment()
  2022-11-06 22:01 [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Catalin Marinas
                   ` (2 preceding siblings ...)
  2022-11-06 22:01 ` [PATCH v3 03/13] iommu/dma: Force bouncing if the " Catalin Marinas
@ 2022-11-06 22:01 ` Catalin Marinas
       [not found]   ` <202211070812.BhGKB0Hd-lkp@intel.com>
  2022-11-06 22:01 ` [PATCH v3 05/13] mm/slab: Simplify create_kmalloc_cache() args and make it static Catalin Marinas
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 44+ messages in thread
From: Catalin Marinas @ 2022-11-06 22:01 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman
  Cc: Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

For architectures with ARCH_DMA_MINALIGN larger than
ARCH_KMALLOC_MINALIGN, if no default swiotlb buffer is available, fall
back to a kmalloc() minimum alignment which is safe for DMA.
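
A run-time sketch of the effect (mine, assuming a 64-byte
cache_line_size() and no default swiotlb buffer; the function name is
made up):

#include <linux/align.h>
#include <linux/bug.h>
#include <linux/slab.h>

static void example_fallback(void)
{
	/* __kmalloc_minalign() returns 64 here, so even the smallest
	 * caches hand out 64-byte aligned objects and DMA stays safe
	 * without bouncing. */
	void *p = kmalloc(8, GFP_KERNEL);

	WARN_ON(p && !IS_ALIGNED((unsigned long)p, 64));
	kfree(p);
}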

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---

I'm not particularly fond of the slab_common.c code probing the default
swiotlb. In one incarnation I had a __weak arch_kmalloc_minalign()
overridden by the arch code but decided there wasn't anything arm64
specific in there.

 mm/slab.c        |  6 +-----
 mm/slab.h        |  2 ++
 mm/slab_common.c | 40 +++++++++++++++++++++++++++++++++++-----
 3 files changed, 38 insertions(+), 10 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 59c8e28f7b6a..6e31eb027ef6 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1241,11 +1241,7 @@ void __init kmem_cache_init(void)
 	 * Initialize the caches that provide memory for the  kmem_cache_node
 	 * structures first.  Without this, further allocations will bug.
 	 */
-	kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
-				kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
-				kmalloc_info[INDEX_NODE].size,
-				ARCH_KMALLOC_FLAGS, 0,
-				kmalloc_info[INDEX_NODE].size);
+	new_kmalloc_cache(INDEX_NODE, KMALLOC_NORMAL, ARCH_KMALLOC_FLAGS);
 	slab_state = PARTIAL_NODE;
 	setup_kmalloc_cache_index_table();
 
diff --git a/mm/slab.h b/mm/slab.h
index 0202a8c2f0d2..d0460e0f6760 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -288,6 +288,8 @@ int __kmem_cache_create(struct kmem_cache *, slab_flags_t flags);
 struct kmem_cache *create_kmalloc_cache(const char *name, unsigned int size,
 			slab_flags_t flags, unsigned int useroffset,
 			unsigned int usersize);
+void __init new_kmalloc_cache(int idx, enum kmalloc_cache_type type,
+			      slab_flags_t flags);
 extern void create_boot_cache(struct kmem_cache *, const char *name,
 			unsigned int size, slab_flags_t flags,
 			unsigned int useroffset, unsigned int usersize);
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 33b1886b06eb..b62f27c2dda7 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -17,6 +17,8 @@
 #include <linux/cpu.h>
 #include <linux/uaccess.h>
 #include <linux/seq_file.h>
+#include <linux/dma-mapping.h>
+#include <linux/swiotlb.h>
 #include <linux/proc_fs.h>
 #include <linux/debugfs.h>
 #include <linux/kasan.h>
@@ -852,9 +854,30 @@ void __init setup_kmalloc_cache_index_table(void)
 	}
 }
 
-static void __init
+static unsigned int __kmalloc_minalign(void)
+{
+	int cache_align = dma_get_cache_alignment();
+
+	/*
+	 * If CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC is not enabled,
+	 * ARCH_KMALLOC_MINALIGN matches ARCH_DMA_MINALIGN.
+	 */
+	if (!IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ||
+	    cache_align < ARCH_KMALLOC_MINALIGN || io_tlb_default_mem.nslabs)
+		return ARCH_KMALLOC_MINALIGN;
+
+	pr_info_once("No default DMA bounce buffer, increasing the kmalloc() minimum alignment to %d\n",
+		     cache_align);
+	return cache_align;
+}
+
+void __init
 new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
 {
+	unsigned int minalign = __kmalloc_minalign();
+	unsigned int aligned_size = kmalloc_info[idx].size;
+	int aligned_idx = idx;
+
 	if (type == KMALLOC_RECLAIM) {
 		flags |= SLAB_RECLAIM_ACCOUNT;
 	} else if (IS_ENABLED(CONFIG_MEMCG_KMEM) && (type == KMALLOC_CGROUP)) {
@@ -867,10 +890,17 @@ new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
 		flags |= SLAB_CACHE_DMA;
 	}
 
-	kmalloc_caches[type][idx] = create_kmalloc_cache(
-					kmalloc_info[idx].name[type],
-					kmalloc_info[idx].size, flags, 0,
-					kmalloc_info[idx].size);
+	if (minalign > ARCH_KMALLOC_MINALIGN) {
+		aligned_size = ALIGN(aligned_size, minalign);
+		aligned_idx = __kmalloc_index(aligned_size, false);
+	}
+
+	if (!kmalloc_caches[type][aligned_idx])
+		kmalloc_caches[type][aligned_idx] = create_kmalloc_cache(
+					kmalloc_info[aligned_idx].name[type],
+					aligned_size, flags, 0, aligned_size);
+	if (idx != aligned_idx)
+		kmalloc_caches[type][idx] = kmalloc_caches[type][aligned_idx];
 
 	/*
 	 * If CONFIG_MEMCG_KMEM is enabled, disable cache merging for


* [PATCH v3 05/13] mm/slab: Simplify create_kmalloc_cache() args and make it static
  2022-11-06 22:01 [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Catalin Marinas
                   ` (3 preceding siblings ...)
  2022-11-06 22:01 ` [PATCH v3 04/13] mm/slab: Allow kmalloc() minimum alignment fallback to dma_get_cache_alignment() Catalin Marinas
@ 2022-11-06 22:01 ` Catalin Marinas
  2022-11-06 22:01 ` [PATCH v3 06/13] dma: Allow the smaller cache_line_size() returned by dma_get_cache_alignment() Catalin Marinas
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 44+ messages in thread
From: Catalin Marinas @ 2022-11-06 22:01 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman
  Cc: Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

create_kmalloc_cache() is now only called from new_kmalloc_cache() in
the same file, so make it static. In addition, the useroffset argument
is always 0 while usersize is the same as size. Remove them.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 mm/slab.h        |  3 ---
 mm/slab_common.c | 11 +++++------
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/mm/slab.h b/mm/slab.h
index d0460e0f6760..93ba93e7b772 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -285,9 +285,6 @@ gfp_t kmalloc_fix_flags(gfp_t flags);
 /* Functions provided by the slab allocators */
 int __kmem_cache_create(struct kmem_cache *, slab_flags_t flags);
 
-struct kmem_cache *create_kmalloc_cache(const char *name, unsigned int size,
-			slab_flags_t flags, unsigned int useroffset,
-			unsigned int usersize);
 void __init new_kmalloc_cache(int idx, enum kmalloc_cache_type type,
 			      slab_flags_t flags);
 extern void create_boot_cache(struct kmem_cache *, const char *name,
diff --git a/mm/slab_common.c b/mm/slab_common.c
index b62f27c2dda7..3fe3f4ad1362 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -654,17 +654,16 @@ void __init create_boot_cache(struct kmem_cache *s, const char *name,
 	s->refcount = -1;	/* Exempt from merging for now */
 }
 
-struct kmem_cache *__init create_kmalloc_cache(const char *name,
-		unsigned int size, slab_flags_t flags,
-		unsigned int useroffset, unsigned int usersize)
+static struct kmem_cache *__init create_kmalloc_cache(const char *name,
+						      unsigned int size,
+						      slab_flags_t flags)
 {
 	struct kmem_cache *s = kmem_cache_zalloc(kmem_cache, GFP_NOWAIT);
 
 	if (!s)
 		panic("Out of memory when creating slab %s\n", name);
 
-	create_boot_cache(s, name, size, flags | SLAB_KMALLOC, useroffset,
-								usersize);
+	create_boot_cache(s, name, size, flags | SLAB_KMALLOC, 0, size);
 	kasan_cache_create_kmalloc(s);
 	list_add(&s->list, &slab_caches);
 	s->refcount = 1;
@@ -898,7 +897,7 @@ new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
 	if (!kmalloc_caches[type][aligned_idx])
 		kmalloc_caches[type][aligned_idx] = create_kmalloc_cache(
 					kmalloc_info[aligned_idx].name[type],
-					aligned_size, flags, 0, aligned_size);
+					aligned_size, flags);
 	if (idx != aligned_idx)
 		kmalloc_caches[type][idx] = kmalloc_caches[type][aligned_idx];
 


* [PATCH v3 06/13] dma: Allow the smaller cache_line_size() returned by dma_get_cache_alignment()
  2022-11-06 22:01 [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Catalin Marinas
                   ` (4 preceding siblings ...)
  2022-11-06 22:01 ` [PATCH v3 05/13] mm/slab: Simplify create_kmalloc_cache() args and make it static Catalin Marinas
@ 2022-11-06 22:01 ` Catalin Marinas
  2022-11-06 22:01 ` [PATCH v3 07/13] drivers/base: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN Catalin Marinas
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 44+ messages in thread
From: Catalin Marinas @ 2022-11-06 22:01 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman
  Cc: Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

On architectures like arm64, ARCH_DMA_MINALIGN is larger than the cache
line size in the majority of configurations. Allow an architecture to
opt in to dma_get_cache_alignment() returning such a smaller size and
select the option for arm64.
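
For instance (a sketch; the 64-byte figure is an assumption about a
typical arm64 SoC and the helper name is made up):

#include <linux/align.h>
#include <linux/dma-mapping.h>

static size_t example_dma_safe_size(size_t len)
{
	/* With ARCH_HAS_DMA_CACHE_LINE_SIZE selected, this returns the
	 * runtime cache_line_size() (often 64 on arm64) instead of the
	 * 128-byte ARCH_DMA_MINALIGN worst case. */
	return ALIGN(len, dma_get_cache_alignment());
}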

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
---

Is there any architecture where ARCH_DMA_MINALIGN is larger than
cache_line_size()? We could avoid another Kconfig entry.

 arch/arm64/Kconfig          | 1 +
 include/linux/dma-mapping.h | 2 ++
 kernel/dma/Kconfig          | 6 ++++++
 3 files changed, 9 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 505c8a1ccbe0..3991cb7b8a33 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -22,6 +22,7 @@ config ARM64
 	select ARCH_HAS_CURRENT_STACK_POINTER
 	select ARCH_HAS_DEBUG_VIRTUAL
 	select ARCH_HAS_DEBUG_VM_PGTABLE
+	select ARCH_HAS_DMA_CACHE_LINE_SIZE
 	select ARCH_HAS_DMA_PREP_COHERENT
 	select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
 	select ARCH_HAS_FAST_MULTIPLIER
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 3288a1339271..b29124341317 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -545,6 +545,8 @@ static inline int dma_set_min_align_mask(struct device *dev,
 
 static inline int dma_get_cache_alignment(void)
 {
+	if (IS_ENABLED(CONFIG_ARCH_HAS_DMA_CACHE_LINE_SIZE))
+		return cache_line_size();
 #ifdef ARCH_HAS_DMA_MINALIGN
 	return ARCH_DMA_MINALIGN;
 #endif
diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index 56866aaa2ae1..d6fab8e3cbae 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -76,6 +76,12 @@ config ARCH_HAS_DMA_PREP_COHERENT
 config ARCH_HAS_FORCE_DMA_UNENCRYPTED
 	bool
 
+config ARCH_HAS_DMA_CACHE_LINE_SIZE
+	bool
+	help
+	  Select if the architecture has non-coherent DMA and
+	  cache_line_size() is a safe alignment for DMA buffers.
+
 config SWIOTLB
 	bool
 	select NEED_DMA_MAP_STATE


* [PATCH v3 07/13] drivers/base: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  2022-11-06 22:01 [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Catalin Marinas
                   ` (5 preceding siblings ...)
  2022-11-06 22:01 ` [PATCH v3 06/13] dma: Allow the smaller cache_line_size() returned by dma_get_cache_alignment() Catalin Marinas
@ 2022-11-06 22:01 ` Catalin Marinas
  2022-11-06 22:01 ` [PATCH v3 08/13] drivers/gpu: " Catalin Marinas
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 44+ messages in thread
From: Catalin Marinas @ 2022-11-06 22:01 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman
  Cc: Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

ARCH_DMA_MINALIGN represents the minimum (static) alignment for safe DMA
operations while ARCH_KMALLOC_MINALIGN is the minimum alignment of
kmalloc() objects.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
---
 drivers/base/devres.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/base/devres.c b/drivers/base/devres.c
index 4ab2b50ee38f..5dbfa3c31af6 100644
--- a/drivers/base/devres.c
+++ b/drivers/base/devres.c
@@ -29,10 +29,10 @@ struct devres {
 	 * Some archs want to perform DMA into kmalloc caches
 	 * and need a guaranteed alignment larger than
 	 * the alignment of a 64-bit integer.
-	 * Thus we use ARCH_KMALLOC_MINALIGN here and get exactly the same
-	 * buffer alignment as if it was allocated by plain kmalloc().
+	 * Thus we use ARCH_DMA_MINALIGN for data[] which will force the same
+	 * alignment for struct devres when allocated by kmalloc().
 	 */
-	u8 __aligned(ARCH_KMALLOC_MINALIGN) data[];
+	u8 __aligned(ARCH_DMA_MINALIGN) data[];
 };
 
 struct devres_group {


* [PATCH v3 08/13] drivers/gpu: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  2022-11-06 22:01 [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Catalin Marinas
                   ` (6 preceding siblings ...)
  2022-11-06 22:01 ` [PATCH v3 07/13] drivers/base: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN Catalin Marinas
@ 2022-11-06 22:01 ` Catalin Marinas
  2022-11-06 22:01 ` [PATCH v3 09/13] drivers/usb: " Catalin Marinas
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 44+ messages in thread
From: Catalin Marinas @ 2022-11-06 22:01 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman
  Cc: Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

ARCH_DMA_MINALIGN represents the minimum (static) alignment for safe DMA
operations while ARCH_KMALLOC_MINALIGN is the minimum alignment of
kmalloc() objects.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/drm_managed.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_managed.c b/drivers/gpu/drm/drm_managed.c
index 4cf214de50c4..3a5802f60e65 100644
--- a/drivers/gpu/drm/drm_managed.c
+++ b/drivers/gpu/drm/drm_managed.c
@@ -49,10 +49,10 @@ struct drmres {
 	 * Some archs want to perform DMA into kmalloc caches
 	 * and need a guaranteed alignment larger than
 	 * the alignment of a 64-bit integer.
-	 * Thus we use ARCH_KMALLOC_MINALIGN here and get exactly the same
-	 * buffer alignment as if it was allocated by plain kmalloc().
+	 * Thus we use ARCH_DMA_MINALIGN for data[] which will force the same
+	 * alignment for struct drmres when allocated by kmalloc().
 	 */
-	u8 __aligned(ARCH_KMALLOC_MINALIGN) data[];
+	u8 __aligned(ARCH_DMA_MINALIGN) data[];
 };
 
 static void free_dr(struct drmres *dr)


* [PATCH v3 09/13] drivers/usb: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  2022-11-06 22:01 [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Catalin Marinas
                   ` (7 preceding siblings ...)
  2022-11-06 22:01 ` [PATCH v3 08/13] drivers/gpu: " Catalin Marinas
@ 2022-11-06 22:01 ` Catalin Marinas
  2022-11-06 22:01 ` [PATCH v3 10/13] drivers/spi: " Catalin Marinas
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 44+ messages in thread
From: Catalin Marinas @ 2022-11-06 22:01 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman
  Cc: Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

ARCH_DMA_MINALIGN represents the minimum (static) alignment for safe DMA
operations while ARCH_KMALLOC_MINALIGN is the minimum alignment of
kmalloc() objects.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/usb/core/buffer.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/core/buffer.c b/drivers/usb/core/buffer.c
index fbb087b728dc..e21d8d106977 100644
--- a/drivers/usb/core/buffer.c
+++ b/drivers/usb/core/buffer.c
@@ -34,13 +34,13 @@ void __init usb_init_pool_max(void)
 {
 	/*
 	 * The pool_max values must never be smaller than
-	 * ARCH_KMALLOC_MINALIGN.
+	 * ARCH_DMA_MINALIGN.
 	 */
-	if (ARCH_KMALLOC_MINALIGN <= 32)
+	if (ARCH_DMA_MINALIGN <= 32)
 		;			/* Original value is okay */
-	else if (ARCH_KMALLOC_MINALIGN <= 64)
+	else if (ARCH_DMA_MINALIGN <= 64)
 		pool_max[0] = 64;
-	else if (ARCH_KMALLOC_MINALIGN <= 128)
+	else if (ARCH_DMA_MINALIGN <= 128)
 		pool_max[0] = 0;	/* Don't use this pool */
 	else
 		BUILD_BUG();		/* We don't allow this */


* [PATCH v3 10/13] drivers/spi: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  2022-11-06 22:01 [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Catalin Marinas
                   ` (8 preceding siblings ...)
  2022-11-06 22:01 ` [PATCH v3 09/13] drivers/usb: " Catalin Marinas
@ 2022-11-06 22:01 ` Catalin Marinas
  2022-11-07 12:58   ` Mark Brown
  2022-11-06 22:01 ` [PATCH v3 11/13] crypto: " Catalin Marinas
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 44+ messages in thread
From: Catalin Marinas @ 2022-11-06 22:01 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman
  Cc: Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

ARCH_DMA_MINALIGN represents the minimum (static) alignment for safe DMA
operations while ARCH_KMALLOC_MINALIGN is the minimum alignment of
kmalloc() objects.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Brown <broonie@kernel.org>
---
 drivers/spi/spidev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/spi/spidev.c b/drivers/spi/spidev.c
index b2775d82d2d7..f74486240b1f 100644
--- a/drivers/spi/spidev.c
+++ b/drivers/spi/spidev.c
@@ -228,7 +228,7 @@ static int spidev_message(struct spidev_data *spidev,
 		/* Ensure that also following allocations from rx_buf/tx_buf will meet
 		 * DMA alignment requirements.
 		 */
-		unsigned int len_aligned = ALIGN(u_tmp->len, ARCH_KMALLOC_MINALIGN);
+		unsigned int len_aligned = ALIGN(u_tmp->len, ARCH_DMA_MINALIGN);
 
 		k_tmp->len = u_tmp->len;
 


* [PATCH v3 11/13] crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  2022-11-06 22:01 [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Catalin Marinas
                   ` (9 preceding siblings ...)
  2022-11-06 22:01 ` [PATCH v3 10/13] drivers/spi: " Catalin Marinas
@ 2022-11-06 22:01 ` Catalin Marinas
  2022-11-07  2:22   ` Herbert Xu
  2022-11-06 22:01 ` [PATCH v3 12/13] drivers/md: " Catalin Marinas
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 44+ messages in thread
From: Catalin Marinas @ 2022-11-06 22:01 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman
  Cc: Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

ARCH_DMA_MINALIGN represents the minimum (static) alignment for safe DMA
operations while ARCH_KMALLOC_MINALIGN is the minimum kmalloc()
alignment. This will ensure that the static alignment of various
structures or members of those structures (e.g. __ctx[] in struct
aead_request) is safe for DMA. Note that the size of such structures becomes
aligned to ARCH_DMA_MINALIGN and kmalloc() will honour such alignment,
so there is no confusion for the compiler.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ard Biesheuvel <ardb@kernel.org>
---

I know Herbert NAK'ed this patch but I'm still keeping it here
temporarily, until we agree on some refactoring of the crypto code. FTR,
I don't think there's anything wrong with this patch since kmalloc()
will return ARCH_DMA_MINALIGN-aligned objects if the size of such objects
is a multiple of ARCH_DMA_MINALIGN (side-effect of
CRYPTO_MINALIGN_ATTR).

 include/linux/crypto.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/crypto.h b/include/linux/crypto.h
index 2324ab6f1846..654b9c355575 100644
--- a/include/linux/crypto.h
+++ b/include/linux/crypto.h
@@ -167,7 +167,7 @@
  * maintenance for non-coherent DMA (cache invalidation in particular) does not
  * affect data that may be accessed by the CPU concurrently.
  */
-#define CRYPTO_MINALIGN ARCH_KMALLOC_MINALIGN
+#define CRYPTO_MINALIGN ARCH_DMA_MINALIGN
 
 #define CRYPTO_MINALIGN_ATTR __attribute__ ((__aligned__(CRYPTO_MINALIGN)))
 


* [PATCH v3 12/13] drivers/md: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  2022-11-06 22:01 [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Catalin Marinas
                   ` (10 preceding siblings ...)
  2022-11-06 22:01 ` [PATCH v3 11/13] crypto: " Catalin Marinas
@ 2022-11-06 22:01 ` Catalin Marinas
  2022-11-06 22:01 ` [PATCH v3 13/13] dma: arm64: Add CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC and enable it for arm64 Catalin Marinas
  2023-03-16 18:38 ` [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Isaac Manjarres
  13 siblings, 0 replies; 44+ messages in thread
From: Catalin Marinas @ 2022-11-06 22:01 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman
  Cc: Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

ARCH_DMA_MINALIGN represents the minimum (static) alignment for safe DMA
operations while ARCH_KMALLOC_MINALIGN is the minimum alignment of
kmalloc() objects.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@kernel.org>
---

This is somewhat related to the previous crypto patch. The dm_crypt_io
structure is supposed to be CRYPTO_MINALIGN-aligned.

 drivers/md/dm-crypt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 159c6806c19b..4f1bd5b348c3 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -3250,7 +3250,7 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 
 	cc->per_bio_data_size = ti->per_io_data_size =
 		ALIGN(sizeof(struct dm_crypt_io) + cc->dmreq_start + additional_req_size,
-		      ARCH_KMALLOC_MINALIGN);
+		      ARCH_DMA_MINALIGN);
 
 	ret = mempool_init(&cc->page_pool, BIO_MAX_VECS, crypt_page_alloc, crypt_page_free, cc);
 	if (ret) {


* [PATCH v3 13/13] dma: arm64: Add CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC and enable it for arm64
  2022-11-06 22:01 [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Catalin Marinas
                   ` (11 preceding siblings ...)
  2022-11-06 22:01 ` [PATCH v3 12/13] drivers/md: " Catalin Marinas
@ 2022-11-06 22:01 ` Catalin Marinas
  2022-11-07 13:03   ` Robin Murphy
  2023-03-16 18:38 ` [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Isaac Manjarres
  13 siblings, 1 reply; 44+ messages in thread
From: Catalin Marinas @ 2022-11-06 22:01 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman
  Cc: Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

With all the infrastructure in place for bouncing small kmalloc()
buffers, add the corresponding Kconfig entry and select it for arm64.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robin Murphy <robin.murphy@arm.com>
---
 arch/arm64/Kconfig | 1 +
 kernel/dma/Kconfig | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 3991cb7b8a33..f889cf16e6ab 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -100,6 +100,7 @@ config ARM64
 	select ARCH_WANT_FRAME_POINTERS
 	select ARCH_WANT_HUGE_PMD_SHARE if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
 	select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
+	select ARCH_WANT_KMALLOC_DMA_BOUNCE
 	select ARCH_WANT_LD_ORPHAN_WARN
 	select ARCH_WANTS_NO_INSTR
 	select ARCH_WANTS_THP_SWAP if ARM64_4K_PAGES
diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index d6fab8e3cbae..b56e76371023 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -86,6 +86,14 @@ config SWIOTLB
 	bool
 	select NEED_DMA_MAP_STATE
 
+config ARCH_WANT_KMALLOC_DMA_BOUNCE
+	bool
+
+config DMA_BOUNCE_UNALIGNED_KMALLOC
+	def_bool y
+	depends on ARCH_WANT_KMALLOC_DMA_BOUNCE
+	depends on SWIOTLB && !SLOB
+
 config DMA_RESTRICTED_POOL
 	bool "DMA Restricted Pool"
 	depends on OF && OF_RESERVED_MEM && SWIOTLB


* Re: [PATCH v3 11/13] crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  2022-11-06 22:01 ` [PATCH v3 11/13] crypto: " Catalin Marinas
@ 2022-11-07  2:22   ` Herbert Xu
  2022-11-07  9:05     ` Catalin Marinas
  0 siblings, 1 reply; 44+ messages in thread
From: Herbert Xu @ 2022-11-07  2:22 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

On Sun, Nov 06, 2022 at 10:01:41PM +0000, Catalin Marinas wrote:
> ARCH_DMA_MINALIGN represents the minimum (static) alignment for safe DMA
> operations while ARCH_KMALLOC_MINALIGN is the minimum kmalloc()
> alignment. This will ensure that the static alignment of various
> structures or members of those structures (e.g. __ctx[] in struct
> aead_request) is safe for DMA. Note that the size of such structures becomes
> aligned to ARCH_DMA_MINALIGN and kmalloc() will honour such alignment,
> so there is no confusion for the compiler.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> ---
> 
> I know Herbert NAK'ed this patch but I'm still keeping it here
> temporarily, until we agree on some refactoring of the crypto code. FTR,
> I don't think there's anything wrong with this patch since kmalloc()
> will return ARCH_DMA_MINALIGN-aligned objects if the size of such objects
> is a multiple of ARCH_DMA_MINALIGN (side-effect of
> CRYPTO_MINALIGN_ATTR).

As I said before changing CRYPTO_MINALIGN doesn't do anything and
that's why this patch is broken.

To get what you want the drivers have to be modified.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH v3 11/13] crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  2022-11-07  2:22   ` Herbert Xu
@ 2022-11-07  9:05     ` Catalin Marinas
  2022-11-07  9:12       ` Herbert Xu
  0 siblings, 1 reply; 44+ messages in thread
From: Catalin Marinas @ 2022-11-07  9:05 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

On Mon, Nov 07, 2022 at 10:22:18AM +0800, Herbert Xu wrote:
> On Sun, Nov 06, 2022 at 10:01:41PM +0000, Catalin Marinas wrote:
> > ARCH_DMA_MINALIGN represents the minimum (static) alignment for safe DMA
> > operations while ARCH_KMALLOC_MINALIGN is the minimum kmalloc()
> > alignment. This will ensure that the static alignment of various
> > structures or members of those structures (e.g. __ctx[] in struct
> > aead_request) is safe for DMA. Note that the size of such structures becomes
> > aligned to ARCH_DMA_MINALIGN and kmalloc() will honour such alignment,
> > so there is no confusion for the compiler.
> > 
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Herbert Xu <herbert@gondor.apana.org.au>
> > Cc: Ard Biesheuvel <ardb@kernel.org>
> > ---
> > 
> > I know Herbert NAK'ed this patch but I'm still keeping it here
> > temporarily, until we agree on some refactoring of the crypto code. FTR,
> > I don't think there's anything wrong with this patch since kmalloc()
> > will return ARCH_DMA_MINALIGN-aligned objects if the size of such objects
> > is a multiple of ARCH_DMA_MINALIGN (side-effect of
> > CRYPTO_MINALIGN_ATTR).
> 
> As I said before changing CRYPTO_MINALIGN doesn't do anything and
> that's why this patch is broken.

Well, it does ensure that the __alignof__ and sizeof of structures like
crypto_alg and aead_request are still 128 after this change. A kmalloc()
of a size that is a multiple of 128 returns a 128-byte aligned object.
So the aim is just to keep the current binary layout/alignment at 128 on
arm64. In theory, no functional change.

Of course, there are better ways to do it but I think the crypto code
should move away from ARCH_KMALLOC_MINALIGN and use something like
dma_get_cache_alignment() instead. The cra_alignmask should be specific
to the device and typically a small value (or 0 if no alignment is
required by the device). The DMA alignment is specific to the SoC and
CPU, so this should be handled elsewhere.

As I don't fully understand the crypto code, I had a naive attempt at
forcing a higher alignmask but it ended up in a kernel panic:

diff --git a/include/linux/crypto.h b/include/linux/crypto.h
index 2324ab6f1846..6dc84c504b52 100644
--- a/include/linux/crypto.h
+++ b/include/linux/crypto.h
@@ -13,6 +13,7 @@
 #define _LINUX_CRYPTO_H
 
 #include <linux/atomic.h>
+#include <linux/dma-mapping.h>
 #include <linux/kernel.h>
 #include <linux/list.h>
 #include <linux/bug.h>
@@ -696,7 +697,7 @@ static inline unsigned int crypto_tfm_alg_blocksize(struct crypto_tfm *tfm)
 
 static inline unsigned int crypto_tfm_alg_alignmask(struct crypto_tfm *tfm)
 {
-	return tfm->__crt_alg->cra_alignmask;
+	return tfm->__crt_alg->cra_alignmask | (dma_get_cache_alignment() - 1);
 }
 
 static inline u32 crypto_tfm_get_flags(struct crypto_tfm *tfm)

-- 
Catalin


* Re: [PATCH v3 11/13] crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  2022-11-07  9:05     ` Catalin Marinas
@ 2022-11-07  9:12       ` Herbert Xu
  2022-11-07  9:38         ` Catalin Marinas
  0 siblings, 1 reply; 44+ messages in thread
From: Herbert Xu @ 2022-11-07  9:12 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

On Mon, Nov 07, 2022 at 09:05:04AM +0000, Catalin Marinas wrote:
>
> Well, it does ensure that the __alignof__ and sizeof of structures like
> crypto_alg and aead_request are still 128 after this change. A kmalloc()
> of a size that is a multiple of 128 returns a 128-byte aligned object.
> So the aim is just to keep the current binary layout/alignment at 128 on
> arm64. In theory, no functional change.

Changing CRYPTO_MINALIGN to 128 does not cause structures that are
smaller than 128 bytes to magically become larger than 128 bytes.
All it does is declare to the compiler that it may assume that
these pointers are 128-byte aligned (which is obviously untrue
if kmalloc does not guarantee that).

So I don't see how this changes anything in practice.  Buffers
that required bouncing prior to your change will still require
bouncing.

If you're set on doing it this way then I can proceed with the
original patch-set to change the drivers.  I've just been putting
it off because it seems that you guys weren't quite decided on
which way to go.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH v3 04/13] mm/slab: Allow kmalloc() minimum alignment fallback to dma_get_cache_alignment()
       [not found]   ` <202211070812.BhGKB0Hd-lkp@intel.com>
@ 2022-11-07  9:22     ` Catalin Marinas
  0 siblings, 0 replies; 44+ messages in thread
From: Catalin Marinas @ 2022-11-07  9:22 UTC (permalink / raw)
  To: kernel test robot
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, oe-kbuild-all, LKML, Will Deacon,
	Marc Zyngier, Andrew Morton, Linux Memory Management List,
	Herbert Xu, Ard Biesheuvel, Isaac Manjarres, Saravana Kannan,
	Alasdair Kergon, Daniel Vetter, Joerg Roedel, Mark Brown,
	Mike Snitzer, Rafael J. Wysocki, Robin Murphy, iommu,
	linux-arm-kernel

On Mon, Nov 07, 2022 at 08:50:31AM +0800, kernel test robot wrote:
> url:    https://github.com/intel-lab-lkp/linux/commits/Catalin-Marinas/mm-dma-arm64-Reduce-ARCH_KMALLOC_MINALIGN-to-8/20221107-060303
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
> patch link:    https://lore.kernel.org/r/20221106220143.2129263-5-catalin.marinas%40arm.com
> patch subject: [PATCH v3 04/13] mm/slab: Allow kmalloc() minimum alignment fallback to dma_get_cache_alignment()
> config: parisc-randconfig-r003-20221106
> compiler: hppa-linux-gcc (GCC) 12.1.0
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # https://github.com/intel-lab-lkp/linux/commit/309bc52a1ed9665a1b9d32bcf094918ceb6af519
>         git remote add linux-review https://github.com/intel-lab-lkp/linux
>         git fetch --no-tags linux-review Catalin-Marinas/mm-dma-arm64-Reduce-ARCH_KMALLOC_MINALIGN-to-8/20221107-060303
>         git checkout 309bc52a1ed9665a1b9d32bcf094918ceb6af519
>         # save the config file
>         mkdir build_dir && cp config build_dir/.config
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=parisc SHELL=/bin/bash
> 
> If you fix the issue, kindly add following tag where applicable
> | Reported-by: kernel test robot <lkp@intel.com>
> 
> All errors (new ones prefixed by >>):
> 
>    mm/slab_common.c: In function '__kmalloc_minalign':
> >> mm/slab_common.c:866:52: error: 'io_tlb_default_mem' undeclared (first use in this function)
>      866 |             cache_align < ARCH_KMALLOC_MINALIGN || io_tlb_default_mem.nslabs)
>          |                                                    ^~~~~~~~~~~~~~~~~~
>    mm/slab_common.c:866:52: note: each undeclared identifier is reported only once for each function it appears in

Thanks for this. It looks like I didn't test the series with
CONFIG_SWIOTLB disabled.
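
The underlying issue is that the io_tlb_default_mem declaration is only
visible when CONFIG_SWIOTLB is enabled, so even an IS_ENABLED() guard would
not compile. A minimal sketch of one possible fix, assuming a stubbed
helper in <linux/swiotlb.h> (the helper name is illustrative, not the
actual fix):

#ifdef CONFIG_SWIOTLB
static inline bool swiotlb_buffer_allocated(void)
{
	/* a default bounce buffer exists and can be used */
	return io_tlb_default_mem.nslabs != 0;
}
#else
static inline bool swiotlb_buffer_allocated(void)
{
	return false;
}
#endif

__kmalloc_minalign() can then test swiotlb_buffer_allocated() instead of
touching io_tlb_default_mem directly, which compiles away cleanly on
!SWIOTLB configs like the parisc randconfig above.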

-- 
Catalin


* Re: [PATCH v3 11/13] crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  2022-11-07  9:12       ` Herbert Xu
@ 2022-11-07  9:38         ` Catalin Marinas
  0 siblings, 0 replies; 44+ messages in thread
From: Catalin Marinas @ 2022-11-07  9:38 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

On Mon, Nov 07, 2022 at 05:12:53PM +0800, Herbert Xu wrote:
> On Mon, Nov 07, 2022 at 09:05:04AM +0000, Catalin Marinas wrote:
> > Well, it does ensure that the __alignof__ and sizeof of structures like
> > crypto_alg and aead_request are still 128 after this change. A kmalloc()
> > of a size multiple of 128 returns a 128-byte aligned object. So the aim
> > is just to keep the current binary layout/alignment to 128 on arm64. In
> > theory, no functional change.
> 
> Changing CRYPTO_MINALIGN to 128 does not cause structures that are
> smaller than 128 bytes to magically become larger than 128 bytes.

For structures, it does (not arrays though):

#define __aligned(x)	__attribute__((__aligned__(x)))

struct align_test1 {
	char c;
	char __aligned(128) data[];
};

struct align_test2 {
	char c;
} __aligned(128);

char aligned_array[4] __aligned(128);

With the above, we have:

sizeof(align_test1) == 128; __alignof__(align_test1) == 128;
sizeof(align_test2) == 128; __alignof__(align_test2) == 128;
sizeof(aligned_array) == 4; __alignof__(aligned_array) == 128;
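
These can also be checked at build time; a self-contained sketch using the
definitions above (GCC/Clang):

_Static_assert(sizeof(struct align_test1) == 128, "padded up to the member alignment");
_Static_assert(__alignof__(struct align_test1) == 128, "inherited from the member");
_Static_assert(sizeof(struct align_test2) == 128, "padded up to the struct alignment");
_Static_assert(__alignof__(struct align_test2) == 128, "from the struct attribute");
_Static_assert(sizeof(aligned_array) == 4, "arrays are not padded");
_Static_assert(__alignof__(aligned_array) == 128, "but the object itself is aligned");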

> If you're set on doing it this way then I can proceed with the
> original patch-set to change the drivers.  I've just been putting
> it off because it seems that you guys weren't quite decided on
> which way to go.

Yes, reviving your patchset would help and that can be done
independently of this series as long as the crypto code starts using
dma_get_cache_alignment() and drops CRYPTO_MINALIGN_ATTR entirely. If at
the point of creating the mask the code knows whether the device is
coherent, it can even avoid any additional alignment (though still
honouring the cra_alignmask that a device requires). So such reworking
would be beneficial irrespective of this series.

It seems that swiotlb bouncing is the preferred route and least
intrusive but let's see the feedback on the other parts of the series.

Thanks.

-- 
Catalin


* Re: [PATCH v3 02/13] dma-mapping: Force bouncing if the kmalloc() size is not cacheline-aligned
  2022-11-06 22:01 ` [PATCH v3 02/13] dma-mapping: Force bouncing if the kmalloc() size is not cacheline-aligned Catalin Marinas
@ 2022-11-07  9:43   ` Christoph Hellwig
  0 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2022-11-07  9:43 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Isaac Manjarres, Saravana Kannan,
	Alasdair Kergon, Daniel Vetter, Joerg Roedel, Mark Brown,
	Mike Snitzer, Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

> +/*
> + * Check whether the given size, assuming it is for a kmalloc()'ed object, is
> + * safe for non-coherent DMA or needs bouncing.
> + */
> +static inline bool dma_kmalloc_needs_bounce(struct device *dev, size_t size,
> +					    enum dma_data_direction dir)
> +{
> +	/*
> +	 * No need for bouncing if coherent DMA or the direction is
> +	 * DMA_TO_DEVICE.
> +	 */
> +	if (!IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ||
> +	    dir == DMA_TO_DEVICE || dev_is_dma_coherent(dev))

Minor nit, but for clarity I'd prefer to split the general availability
checks from the direction one, i.e.:

	if (dev_is_dma_coherent(dev) ||
	    !IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC))
		return false;

	if (dir == DMA_TO_DEVICE)
		return false;
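
Applied to the whole helper, that would give something like the sketch
below; the final size test is an assumption about how the kmalloc() check
is implemented, not a quote from the patch:

static inline bool dma_kmalloc_needs_bounce(struct device *dev, size_t size,
					    enum dma_data_direction dir)
{
	/* coherent devices and disabled configs never need bouncing */
	if (dev_is_dma_coherent(dev) ||
	    !IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC))
		return false;

	/* DMA_TO_DEVICE only cleans the cache, so nothing can be corrupted */
	if (dir == DMA_TO_DEVICE)
		return false;

	/* bounce if the buffer may share a cache line with another object */
	return !IS_ALIGNED(size, dma_get_cache_alignment());
}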


* Re: [PATCH v3 03/13] iommu/dma: Force bouncing if the size is not cacheline-aligned
  2022-11-06 22:01 ` [PATCH v3 03/13] iommu/dma: Force bouncing if the " Catalin Marinas
@ 2022-11-07  9:46   ` Christoph Hellwig
  2022-11-07 10:54     ` Catalin Marinas
  2022-11-14 23:23   ` Isaac Manjarres
  1 sibling, 1 reply; 44+ messages in thread
From: Christoph Hellwig @ 2022-11-07  9:46 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Isaac Manjarres, Saravana Kannan,
	Alasdair Kergon, Daniel Vetter, Joerg Roedel, Mark Brown,
	Mike Snitzer, Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

> +static inline bool dma_sg_kmalloc_needs_bounce(struct device *dev,
> +					       struct scatterlist *sg, int nents,
> +					       enum dma_data_direction dir)
> +{
> +	struct scatterlist *s;
> +	int i;
> +
> +	if (!IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ||
> +	    dir == DMA_TO_DEVICE || dev_is_dma_coherent(dev))
> +		return false;

This part should be shared with dma-direct in a well documented helper.

> +	for_each_sg(sg, s, nents, i) {
> +		if (dma_kmalloc_needs_bounce(dev, s->length, dir))
> +			return true;
> +	}

And for this loop iteration I'd much prefer it to be out of line, and
also not available in a global helper.

But maybe someone can come up with a nice tweak to the dma-iommu
code to not require the extra sglist walk anyway.
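
In other words, something along these lines, with the common part factored
out (the helper names here are illustrative):

/* shared, documented helper, e.g. in dma-map-ops.h */
static inline bool dma_kmalloc_bounce_possible(struct device *dev,
					       enum dma_data_direction dir)
{
	return IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) &&
	       dir != DMA_TO_DEVICE && !dev_is_dma_coherent(dev);
}

/* out of line and local to dma-iommu.c */
static bool iommu_dma_sg_needs_bounce(struct device *dev,
				      struct scatterlist *sg, int nents,
				      enum dma_data_direction dir)
{
	struct scatterlist *s;
	int i;

	if (!dma_kmalloc_bounce_possible(dev, dir))
		return false;

	for_each_sg(sg, s, nents, i)
		if (dma_kmalloc_needs_bounce(dev, s->length, dir))
			return true;

	return false;
}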


* Re: [PATCH v3 03/13] iommu/dma: Force bouncing if the size is not cacheline-aligned
  2022-11-07  9:46   ` Christoph Hellwig
@ 2022-11-07 10:54     ` Catalin Marinas
  2022-11-07 13:26       ` Robin Murphy
  2022-11-08  7:50       ` Christoph Hellwig
  0 siblings, 2 replies; 44+ messages in thread
From: Catalin Marinas @ 2022-11-07 10:54 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Arnd Bergmann, Greg Kroah-Hartman, Will Deacon,
	Marc Zyngier, Andrew Morton, Herbert Xu, Ard Biesheuvel,
	Isaac Manjarres, Saravana Kannan, Alasdair Kergon, Daniel Vetter,
	Joerg Roedel, Mark Brown, Mike Snitzer, Rafael J. Wysocki,
	Robin Murphy, linux-mm, iommu, linux-arm-kernel

On Mon, Nov 07, 2022 at 10:46:03AM +0100, Christoph Hellwig wrote:
> > +static inline bool dma_sg_kmalloc_needs_bounce(struct device *dev,
> > +					       struct scatterlist *sg, int nents,
> > +					       enum dma_data_direction dir)
> > +{
> > +	struct scatterlist *s;
> > +	int i;
> > +
> > +	if (!IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ||
> > +	    dir == DMA_TO_DEVICE || dev_is_dma_coherent(dev))
> > +		return false;
> 
> This part should be shared with dma-direct in a well documented helper.
> 
> > +	for_each_sg(sg, s, nents, i) {
> > +		if (dma_kmalloc_needs_bounce(dev, s->length, dir))
> > +			return true;
> > +	}
> 
> And for this loop iteration I'd much prefer it to be out of line, and
> also not available in a global helper.
> 
> But maybe someone can come up with a nice tweak to the dma-iommu
> code to not require the extra sglist walk anyway.

An idea: we could add another member to struct scatterlist to track the
bounced address. We can then do the bouncing in a similar way to
iommu_dma_map_sg_swiotlb() but without the iova allocation. The latter
would be a common path for both the bounced and non-bounced cases.

-- 
Catalin


* Re: [PATCH v3 10/13] drivers/spi: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  2022-11-06 22:01 ` [PATCH v3 10/13] drivers/spi: " Catalin Marinas
@ 2022-11-07 12:58   ` Mark Brown
  0 siblings, 0 replies; 44+ messages in thread
From: Mark Brown @ 2022-11-07 12:58 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Isaac Manjarres, Saravana Kannan,
	Alasdair Kergon, Daniel Vetter, Joerg Roedel, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel


On Sun, Nov 06, 2022 at 10:01:40PM +0000, Catalin Marinas wrote:
> ARCH_DMA_MINALIGN represents the minimum (static) alignment for safe DMA
> operations while ARCH_KMALLOC_MINALIGN is the minimum kmalloc() objects
> alignment.

Acked-by: Mark Brown <broonie@kernel.org>


* Re: [PATCH v3 13/13] dma: arm64: Add CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC and enable it for arm64
  2022-11-06 22:01 ` [PATCH v3 13/13] dma: arm64: Add CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC and enable it for arm64 Catalin Marinas
@ 2022-11-07 13:03   ` Robin Murphy
  2022-11-07 14:38     ` Christoph Hellwig
  2022-11-08  9:52     ` Catalin Marinas
  0 siblings, 2 replies; 44+ messages in thread
From: Robin Murphy @ 2022-11-07 13:03 UTC (permalink / raw)
  To: Catalin Marinas, Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman
  Cc: Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu,
	Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, linux-mm, iommu, linux-arm-kernel

On 2022-11-06 22:01, Catalin Marinas wrote:
> With all the infrastructure in place for bouncing small kmalloc()
> buffers, add the corresponding Kconfig entry and select it for arm64.

AFAICS we're missing the crucial part to ensure that SWIOTLB is 
available even when max_pfn <= arm64_dma_phys_limit, which is very 
likely to be true on low-memory systems that care most about kmalloc 
wastage. The only way to override that currently is with 
"swiotlb=force", but bouncing *everything* is not desirable either.

Thanks,
Robin.

> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Robin Murphy <robin.murphy@arm.com>
> ---
>   arch/arm64/Kconfig | 1 +
>   kernel/dma/Kconfig | 8 ++++++++
>   2 files changed, 9 insertions(+)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 3991cb7b8a33..f889cf16e6ab 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -100,6 +100,7 @@ config ARM64
>   	select ARCH_WANT_FRAME_POINTERS
>   	select ARCH_WANT_HUGE_PMD_SHARE if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
>   	select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
> +	select ARCH_WANT_KMALLOC_DMA_BOUNCE
>   	select ARCH_WANT_LD_ORPHAN_WARN
>   	select ARCH_WANTS_NO_INSTR
>   	select ARCH_WANTS_THP_SWAP if ARM64_4K_PAGES
> diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
> index d6fab8e3cbae..b56e76371023 100644
> --- a/kernel/dma/Kconfig
> +++ b/kernel/dma/Kconfig
> @@ -86,6 +86,14 @@ config SWIOTLB
>   	bool
>   	select NEED_DMA_MAP_STATE
>   
> +config ARCH_WANT_KMALLOC_DMA_BOUNCE
> +	bool
> +
> +config DMA_BOUNCE_UNALIGNED_KMALLOC
> +	def_bool y
> +	depends on ARCH_WANT_KMALLOC_DMA_BOUNCE
> +	depends on SWIOTLB && !SLOB
> +
>   config DMA_RESTRICTED_POOL
>   	bool "DMA Restricted Pool"
>   	depends on OF && OF_RESERVED_MEM && SWIOTLB


* Re: [PATCH v3 03/13] iommu/dma: Force bouncing if the size is not cacheline-aligned
  2022-11-07 10:54     ` Catalin Marinas
@ 2022-11-07 13:26       ` Robin Murphy
  2022-11-08 10:51         ` Catalin Marinas
  2022-11-08  7:50       ` Christoph Hellwig
  1 sibling, 1 reply; 44+ messages in thread
From: Robin Murphy @ 2022-11-07 13:26 UTC (permalink / raw)
  To: Catalin Marinas, Christoph Hellwig
  Cc: Linus Torvalds, Arnd Bergmann, Greg Kroah-Hartman, Will Deacon,
	Marc Zyngier, Andrew Morton, Herbert Xu, Ard Biesheuvel,
	Isaac Manjarres, Saravana Kannan, Alasdair Kergon, Daniel Vetter,
	Joerg Roedel, Mark Brown, Mike Snitzer, Rafael J. Wysocki,
	linux-mm, iommu, linux-arm-kernel

On 2022-11-07 10:54, Catalin Marinas wrote:
> On Mon, Nov 07, 2022 at 10:46:03AM +0100, Christoph Hellwig wrote:
>>> +static inline bool dma_sg_kmalloc_needs_bounce(struct device *dev,
>>> +					       struct scatterlist *sg, int nents,
>>> +					       enum dma_data_direction dir)
>>> +{
>>> +	struct scatterlist *s;
>>> +	int i;
>>> +
>>> +	if (!IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ||
>>> +	    dir == DMA_TO_DEVICE || dev_is_dma_coherent(dev))
>>> +		return false;
>>
>> This part should be shared with dma-direct in a well documented helper.
>>
>>> +	for_each_sg(sg, s, nents, i) {
>>> +		if (dma_kmalloc_needs_bounce(dev, s->length, dir))
>>> +			return true;
>>> +	}
>>
>> And for this loop iteration I'd much prefer it to be out of line, and
>> also not available in a global helper.
>>
>> But maybe someone can come up with a nice tweak to the dma-iommu
>> code to not require the extra sglist walk anyway.
> 
> An idea: we could add another member to struct scatterlist to track the
> bounced address. We can then do the bouncing in a similar way to
> iommu_dma_map_sg_swiotlb() but without the iova allocation. The latter
> would be a common path for both the bounced and non-bounced cases.

FWIW I spent a little time looking at this as well; I'm pretty confident
it can be done without the extra walk if the iommu-dma bouncing is
completely refactored (and it might want a SWIOTLB helper to retrieve
the original page from a bounced address). That's going to be a bigger
job than I'll be able to finish this cycle, and I concluded that this
in-between approach wouldn't be worth posting for its own sake, but as
part of this series I think it's a reasonable compromise. What we have
here is effectively a pretty specialist config that trades DMA mapping
performance for memory efficiency, so trading a little more performance
initially for the sake of keeping it manageable seems fair to me.

The one thing I did get as far as writing up is the patch below, which
I'll share as an indirect review comment on this patch - feel free to
pick it up or squash it in if you think it's worthwhile.

Thanks,
Robin.

----->8-----
From: Robin Murphy <robin.murphy@arm.com>
Date: Wed, 2 Nov 2022 17:35:09 +0000
Subject: [PATCH] scatterlist: Add dedicated config for DMA flags

The DMA flags field will be useful for users beyond PCI P2P, so upgrade
to its own dedicated config option.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
  drivers/pci/Kconfig         | 1 +
  include/linux/scatterlist.h | 4 ++--
  kernel/dma/Kconfig          | 3 +++
  3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 55c028af4bd9..0303604d9de9 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -173,6 +173,7 @@ config PCI_P2PDMA
  	#
  	depends on 64BIT
  	select GENERIC_ALLOCATOR
+	select NEED_SG_DMA_FLAGS
  	help
 	  Enables drivers to do PCI peer-to-peer transactions to and from
  	  BARs that are exposed in other devices that are the part of
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 375a5e90d86a..87aaf8b5cdb4 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -16,7 +16,7 @@ struct scatterlist {
  #ifdef CONFIG_NEED_SG_DMA_LENGTH
  	unsigned int	dma_length;
  #endif
-#ifdef CONFIG_PCI_P2PDMA
+#ifdef CONFIG_NEED_SG_DMA_FLAGS
  	unsigned int    dma_flags;
  #endif
  };
@@ -249,7 +249,7 @@ static inline void sg_unmark_end(struct scatterlist *sg)
  }
  
  /*
- * CONFGI_PCI_P2PDMA depends on CONFIG_64BIT which means there is 4 bytes
+ * CONFIG_PCI_P2PDMA depends on CONFIG_64BIT which means there is 4 bytes
   * in struct scatterlist (assuming also CONFIG_NEED_SG_DMA_LENGTH is set).
   * Use this padding for DMA flags bits to indicate when a specific
   * dma address is a bus address.
diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index 56866aaa2ae1..48016c4f67ac 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -24,6 +24,9 @@ config DMA_OPS_BYPASS
  config ARCH_HAS_DMA_MAP_DIRECT
  	bool
  
+config NEED_SG_DMA_FLAGS
+	bool
+
  config NEED_SG_DMA_LENGTH
  	bool
  
-- 


* Re: [PATCH v3 13/13] dma: arm64: Add CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC and enable it for arm64
  2022-11-07 13:03   ` Robin Murphy
@ 2022-11-07 14:38     ` Christoph Hellwig
  2022-11-07 15:24       ` Robin Murphy
  2022-11-08  9:52     ` Catalin Marinas
  1 sibling, 1 reply; 44+ messages in thread
From: Christoph Hellwig @ 2022-11-07 14:38 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Catalin Marinas, Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Isaac Manjarres, Saravana Kannan,
	Alasdair Kergon, Daniel Vetter, Joerg Roedel, Mark Brown,
	Mike Snitzer, Rafael J. Wysocki, linux-mm, iommu,
	linux-arm-kernel

On Mon, Nov 07, 2022 at 01:03:31PM +0000, Robin Murphy wrote:
> On 2022-11-06 22:01, Catalin Marinas wrote:
>> With all the infrastructure in place for bouncing small kmalloc()
>> buffers, add the corresponding Kconfig entry and select it for arm64.
>
> AFAICS we're missing the crucial part to ensure that SWIOTLB is available 
> even when max_pfn <= arm64_dma_phys_limit, which is very likely to be true 
> on low-memory systems that care most about kmalloc wastage. The only way to 
> override that currently is with "swiotlb=force", but bouncing *everything* 
> is not desirable either.

FYI, one of the reasons for the swiotlb_init refactor that passes
flags and a boolean a while ago is that we can trivially just either
pass another flag or check a condition in swiotlb_init to allocate the
buffer.  There's actually another case for which we need the
unconditional allocation, and that is the bouncing for untrusted
external devices with dma-iommu.


* Re: [PATCH v3 13/13] dma: arm64: Add CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC and enable it for arm64
  2022-11-07 14:38     ` Christoph Hellwig
@ 2022-11-07 15:24       ` Robin Murphy
  0 siblings, 0 replies; 44+ messages in thread
From: Robin Murphy @ 2022-11-07 15:24 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Catalin Marinas, Linus Torvalds, Arnd Bergmann,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Isaac Manjarres, Saravana Kannan,
	Alasdair Kergon, Daniel Vetter, Joerg Roedel, Mark Brown,
	Mike Snitzer, Rafael J. Wysocki, linux-mm, iommu,
	linux-arm-kernel

On 2022-11-07 14:38, Christoph Hellwig wrote:
> On Mon, Nov 07, 2022 at 01:03:31PM +0000, Robin Murphy wrote:
>> On 2022-11-06 22:01, Catalin Marinas wrote:
>>> With all the infrastructure in place for bouncing small kmalloc()
>>> buffers, add the corresponding Kconfig entry and select it for arm64.
>>
>> AFAICS we're missing the crucial part to ensure that SWIOTLB is available
>> even when max_pfn <= arm64_dma_phys_limit, which is very likely to be true
>> on low-memory systems that care most about kmalloc wastage. The only way to
>> override that currently is with "swiotlb=force", but bouncing *everything*
>> is not desirable either.
> 
> FYI, one of the reasons for the swiotlb_init refactor that passes
> flags and a boolean a while ago is that we can trivially just either
> pass another flag or check a condition in swiotlb_init to allocate the
> buffer.  There's actually another case for which we need the
> unconditional allocation, and that is the bouncing for untrusted
> external devices with dma-iommu.

Right, I guess machines with Thunderbolt and all the firmware 
annotations but less than 4GB of RAM are unlikely to exist in the wild, 
so the untrusted bouncing logic has been getting lucky so far. There are 
however plenty of arm64 systems with small amounts of RAM and 
non-coherent USB so in this case someone's likely to fall over it pretty 
much right away. I know it's easy to add a new condition, but it still 
has to actually *be* added.

Thanks,
Robin.


* Re: [PATCH v3 03/13] iommu/dma: Force bouncing if the size is not cacheline-aligned
  2022-11-07 10:54     ` Catalin Marinas
  2022-11-07 13:26       ` Robin Murphy
@ 2022-11-08  7:50       ` Christoph Hellwig
  1 sibling, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2022-11-08  7:50 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Christoph Hellwig, Linus Torvalds, Arnd Bergmann,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Isaac Manjarres, Saravana Kannan,
	Alasdair Kergon, Daniel Vetter, Joerg Roedel, Mark Brown,
	Mike Snitzer, Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

On Mon, Nov 07, 2022 at 10:54:36AM +0000, Catalin Marinas wrote:
> An idea: we could add another member to struct scatterlist to track the
> bounced address. We can then do the bouncing in a similar way to
> iommu_dma_map_sg_swiotlb() but without the iova allocation. The latter
> would be a common path for both the bounced and non-bounced cases.

That would be a pretty massive memory overhead for an unusual case,
so I'd rather avoid it, especially given the long-term plan of doing
DMA mappings without a scatterlist.


* Re: [PATCH v3 13/13] dma: arm64: Add CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC and enable it for arm64
  2022-11-07 13:03   ` Robin Murphy
  2022-11-07 14:38     ` Christoph Hellwig
@ 2022-11-08  9:52     ` Catalin Marinas
  2022-11-08 10:03       ` Christoph Hellwig
  1 sibling, 1 reply; 44+ messages in thread
From: Catalin Marinas @ 2022-11-08  9:52 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Isaac Manjarres, Saravana Kannan,
	Alasdair Kergon, Daniel Vetter, Joerg Roedel, Mark Brown,
	Mike Snitzer, Rafael J. Wysocki, linux-mm, iommu,
	linux-arm-kernel

On Mon, Nov 07, 2022 at 01:03:31PM +0000, Robin Murphy wrote:
> On 2022-11-06 22:01, Catalin Marinas wrote:
> > With all the infrastructure in place for bouncing small kmalloc()
> > buffers, add the corresponding Kconfig entry and select it for arm64.
> 
> AFAICS we're missing the crucial part to ensure that SWIOTLB is available
> even when max_pfn <= arm64_dma_phys_limit, which is very likely to be true
> on low-memory systems that care most about kmalloc wastage.

This was a deliberate decision for this version. Patch 4 mitigates it a
bit by raising the kmalloc() minimum alignment to the cache line size
(typically 64). It's still an improvement over the current 128-byte
alignment.

Since it's hard to guess the optimal swiotlb buffer for such platforms,
I think a follow-up step would be to use the DMA coherent pool for
bouncing if no swiotlb buffer is available. At least the pool can grow
dynamically. Yet another option would be to increase the swiotlb buffer
at run-time but it has an overhead for is_swiotlb_buffer().

-- 
Catalin


* Re: [PATCH v3 13/13] dma: arm64: Add CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC and enable it for arm64
  2022-11-08  9:52     ` Catalin Marinas
@ 2022-11-08 10:03       ` Christoph Hellwig
  2022-11-30 18:48         ` Isaac Manjarres
  0 siblings, 1 reply; 44+ messages in thread
From: Christoph Hellwig @ 2022-11-08 10:03 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Robin Murphy, Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Isaac Manjarres, Saravana Kannan,
	Alasdair Kergon, Daniel Vetter, Joerg Roedel, Mark Brown,
	Mike Snitzer, Rafael J. Wysocki, linux-mm, iommu,
	linux-arm-kernel, agraf

On Tue, Nov 08, 2022 at 09:52:15AM +0000, Catalin Marinas wrote:
> Since it's hard to guess the optimal swiotlb buffer for such platforms,
> I think a follow-up step would be to use the DMA coherent pool for
> bouncing if no swiotlb buffer is available. At least the pool can grow
> dynamically. Yet another option would be to increase the swiotlb buffer
> at run-time but it has an overhead for is_swiotlb_buffer().

Alex said he wanted to look into growing the swiotlb buffer on demand
for other reason, so adding him to Cc to check if there has been any
progress on that.


* Re: [PATCH v3 03/13] iommu/dma: Force bouncing if the size is not cacheline-aligned
  2022-11-07 13:26       ` Robin Murphy
@ 2022-11-08 10:51         ` Catalin Marinas
  2022-11-08 11:40           ` Robin Murphy
  0 siblings, 1 reply; 44+ messages in thread
From: Catalin Marinas @ 2022-11-08 10:51 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Linus Torvalds, Arnd Bergmann,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Isaac Manjarres, Saravana Kannan,
	Alasdair Kergon, Daniel Vetter, Joerg Roedel, Mark Brown,
	Mike Snitzer, Rafael J. Wysocki, linux-mm, iommu,
	linux-arm-kernel

On Mon, Nov 07, 2022 at 01:26:21PM +0000, Robin Murphy wrote:
> On 2022-11-07 10:54, Catalin Marinas wrote:
> > On Mon, Nov 07, 2022 at 10:46:03AM +0100, Christoph Hellwig wrote:
> > > > +static inline bool dma_sg_kmalloc_needs_bounce(struct device *dev,
> > > > +					       struct scatterlist *sg, int nents,
> > > > +					       enum dma_data_direction dir)
> > > > +{
> > > > +	struct scatterlist *s;
> > > > +	int i;
> > > > +
> > > > +	if (!IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ||
> > > > +	    dir == DMA_TO_DEVICE || dev_is_dma_coherent(dev))
> > > > +		return false;
> > > 
> > > This part should be shared with dma-direct in a well documented helper.
> > > 
> > > > +	for_each_sg(sg, s, nents, i) {
> > > > +		if (dma_kmalloc_needs_bounce(dev, s->length, dir))
> > > > +			return true;
> > > > +	}
> > > 
> > > And for this loop iteration I'd much prefer it to be out of line, and
> > > also not available in a global helper.
> > > 
> > > But maybe someone can come up with a nice tweak to the dma-iommu
> > > code to not require the extra sglist walk anyway.
> > 
> > An idea: we could add another member to struct scatterlist to track the
> > bounced address. We can then do the bouncing in a similar way to
> > iommu_dma_map_sg_swiotlb() but without the iova allocation. The latter
> > would be a common path for both the bounced and non-bounced cases.
> 
> FWIW I spent a little time looking at this as well; I'm pretty confident
> it can be done without the extra walk if the iommu-dma bouncing is
> completely refactored (and it might want a SWIOTLB helper to retrieve
> the original page from a bounced address).

Doesn't sg_page() provide the original page already? Either way, the
swiotlb knows it as it needs to do the copying between buffers.

> That's going to be a bigger
> job than I'll be able to finish this cycle, and I concluded that this
> in-between approach wouldn't be worth posting for its own sake, but as
> part of this series I think it's a reasonable compromise.

I'll drop my hack once you have something. Happy to carry it as part of
this series.

> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> index 375a5e90d86a..87aaf8b5cdb4 100644
> --- a/include/linux/scatterlist.h
> +++ b/include/linux/scatterlist.h
> @@ -16,7 +16,7 @@ struct scatterlist {
>  #ifdef CONFIG_NEED_SG_DMA_LENGTH
>  	unsigned int	dma_length;
>  #endif
> -#ifdef CONFIG_PCI_P2PDMA
> +#ifdef CONFIG_NEED_SG_DMA_FLAGS
>  	unsigned int    dma_flags;
>  #endif

I initially had something similar but I decided it's overkill for a
patch that I expected to be NAK'ed.

I'll include your patch in my series in the meantime.

-- 
Catalin


* Re: [PATCH v3 03/13] iommu/dma: Force bouncing if the size is not cacheline-aligned
  2022-11-08 10:51         ` Catalin Marinas
@ 2022-11-08 11:40           ` Robin Murphy
  0 siblings, 0 replies; 44+ messages in thread
From: Robin Murphy @ 2022-11-08 11:40 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Christoph Hellwig, Linus Torvalds, Arnd Bergmann,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Isaac Manjarres, Saravana Kannan,
	Alasdair Kergon, Daniel Vetter, Joerg Roedel, Mark Brown,
	Mike Snitzer, Rafael J. Wysocki, linux-mm, iommu,
	linux-arm-kernel

On 2022-11-08 10:51, Catalin Marinas wrote:
> On Mon, Nov 07, 2022 at 01:26:21PM +0000, Robin Murphy wrote:
>> On 2022-11-07 10:54, Catalin Marinas wrote:
>>> On Mon, Nov 07, 2022 at 10:46:03AM +0100, Christoph Hellwig wrote:
>>>>> +static inline bool dma_sg_kmalloc_needs_bounce(struct device *dev,
>>>>> +					       struct scatterlist *sg, int nents,
>>>>> +					       enum dma_data_direction dir)
>>>>> +{
>>>>> +	struct scatterlist *s;
>>>>> +	int i;
>>>>> +
>>>>> +	if (!IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ||
>>>>> +	    dir == DMA_TO_DEVICE || dev_is_dma_coherent(dev))
>>>>> +		return false;
>>>>
>>>> This part should be shared with dma-direct in a well documented helper.
>>>>
>>>>> +	for_each_sg(sg, s, nents, i) {
>>>>> +		if (dma_kmalloc_needs_bounce(dev, s->length, dir))
>>>>> +			return true;
>>>>> +	}
>>>>
>>>> And for this loop iteration I'd much prefer it to be out of line, and
>>>> also not available in a global helper.
>>>>
>>>> But maybe someone can come up with a nice tweak to the dma-iommu
>>>> code to not require the extra sglist walk anyway.
>>>
>>> An idea: we could add another member to struct scatterlist to track the
>>> bounced address. We can then do the bouncing in a similar way to
>>> iommu_dma_map_sg_swiotlb() but without the iova allocation. The latter
>>> would be a common path for both the bounced and non-bounced cases.
>>
>> FWIW I spent a little time looking at this as well; I'm pretty confident
>> it can be done without the extra walk if the iommu-dma bouncing is
>> completely refactored (and it might want a SWIOTLB helper to retrieve
>> the original page from a bounced address).
> 
> Doesn't sg_page() provide the original page already? Either way, the
> swiotlb knows it as it needs to do the copying between buffers.

For the part where we temporarily rewrite the offsets and lengths to 
pass to iommu_map_sg(), we'd also have to swizzle any relevant page 
pointers so that that picks up the physical addresses of the bounce 
buffer slots rather than the original pages, but then we need to put 
them back straight afterwards. Since SWIOTLB keeps track of that 
internally, it'll be a lot neater and more efficient to simply ask for 
it than to allocate more temporary storage to remember it independently 
(like I did for that horrible erratum thing to keep it self-contained).

>> That's going to be a bigger
>> job than I'll be able to finish this cycle, and I concluded that this
>> in-between approach wouldn't be worth posting for its own sake, but as
>> part of this series I think it's a reasonable compromise.
> 
> I'll drop my hack once you have something. Happy to carry it as part of
> this series.

Cool, I can't promise how soon I'll get there, but like I said, if all
the other objections are worked out in the meantime, I have no issue with
landing this approach and improving on it later.

Thanks,
Robin.

>> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
>> index 375a5e90d86a..87aaf8b5cdb4 100644
>> --- a/include/linux/scatterlist.h
>> +++ b/include/linux/scatterlist.h
>> @@ -16,7 +16,7 @@ struct scatterlist {
>>   #ifdef CONFIG_NEED_SG_DMA_LENGTH
>>   	unsigned int	dma_length;
>>   #endif
>> -#ifdef CONFIG_PCI_P2PDMA
>> +#ifdef CONFIG_NEED_SG_DMA_FLAGS
>>   	unsigned int    dma_flags;
>>   #endif
> 
> I initially had something similar but I decided it's overkill for a
> patch that I expected to be NAK'ed.
> 
> I'll include your patch in my series in the meantime.
> 


* Re: [PATCH v3 03/13] iommu/dma: Force bouncing if the size is not cacheline-aligned
  2022-11-06 22:01 ` [PATCH v3 03/13] iommu/dma: Force bouncing if the " Catalin Marinas
  2022-11-07  9:46   ` Christoph Hellwig
@ 2022-11-14 23:23   ` Isaac Manjarres
  2022-11-15 11:48     ` Catalin Marinas
  1 sibling, 1 reply; 44+ messages in thread
From: Isaac Manjarres @ 2022-11-14 23:23 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

On Sun, Nov 06, 2022 at 10:01:33PM +0000, Catalin Marinas wrote:
> @@ -1202,7 +1203,10 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
>  			goto out;
>  	}
>  
> -	if (dev_use_swiotlb(dev))
> +	if (dma_sg_kmalloc_needs_bounce(dev, sg, nents, dir))
> +		sg_dma_mark_bounced(sg);
> +
> +	if (dev_use_swiotlb(dev) || sg_is_dma_bounced(sg))
>  		return iommu_dma_map_sg_swiotlb(dev, sg, nents, dir, attrs);

Shouldn't you add a similar check in the iommu_dma_unmap_sg() path to
free any SWIOTLB memory that may have been allocated to bounce a scatter gather
list?
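
Presumably the mirror-image check is needed there so that
iommu_dma_unmap_sg_swiotlb() gets to release the bounce slots; a sketch of
what I'd expect (the exact placement is a guess):

static void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
			       int nents, enum dma_data_direction dir,
			       unsigned long attrs)
{
	if (dev_use_swiotlb(dev) || sg_is_dma_bounced(sg)) {
		iommu_dma_unmap_sg_swiotlb(dev, sg, nents, dir, attrs);
		return;
	}

	/* ... existing non-bounced unmap path unchanged ... */
}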


* Re: [PATCH v3 03/13] iommu/dma: Force bouncing if the size is not cacheline-aligned
  2022-11-14 23:23   ` Isaac Manjarres
@ 2022-11-15 11:48     ` Catalin Marinas
  0 siblings, 0 replies; 44+ messages in thread
From: Catalin Marinas @ 2022-11-15 11:48 UTC (permalink / raw)
  To: Isaac Manjarres
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

On Mon, Nov 14, 2022 at 03:23:54PM -0800, Isaac Manjarres wrote:
> On Sun, Nov 06, 2022 at 10:01:33PM +0000, Catalin Marinas wrote:
> > @@ -1202,7 +1203,10 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
> >  			goto out;
> >  	}
> >  
> > -	if (dev_use_swiotlb(dev))
> > +	if (dma_sg_kmalloc_needs_bounce(dev, sg, nents, dir))
> > +		sg_dma_mark_bounced(sg);
> > +
> > +	if (dev_use_swiotlb(dev) || sg_is_dma_bounced(sg))
> >  		return iommu_dma_map_sg_swiotlb(dev, sg, nents, dir, attrs);
> 
> Shouldn't you add a similar check in the iommu_dma_unmap_sg() path to
> free any SWIOTLB memory that may have been allocated to bounce a scatter gather
> list?

Good point, not sure how I missed this. The sync'ing works fine as
iommu_dma_sync_sg_for_cpu() has the check but the swiotlb buffer won't
be freed.

Thanks.

-- 
Catalin


* Re: [PATCH v3 13/13] dma: arm64: Add CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC and enable it for arm64
  2022-11-08 10:03       ` Christoph Hellwig
@ 2022-11-30 18:48         ` Isaac Manjarres
  2022-11-30 23:32           ` Alexander Graf
  0 siblings, 1 reply; 44+ messages in thread
From: Isaac Manjarres @ 2022-11-30 18:48 UTC (permalink / raw)
  To: agraf
  Cc: Catalin Marinas, Robin Murphy, Linus Torvalds, Arnd Bergmann,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, linux-mm, iommu, linux-arm-kernel,
	Christoph Hellwig

On Tue, Nov 08, 2022 at 11:03:31AM +0100, Christoph Hellwig wrote:
> On Tue, Nov 08, 2022 at 09:52:15AM +0000, Catalin Marinas wrote:
> > Since it's hard to guess the optimal swiotlb buffer for such platforms,
> > I think a follow-up step would be to use the DMA coherent pool for
> > bouncing if no swiotlb buffer is available. At least the pool can grow
> > dynamically. Yet another option would be to increase the swiotlb buffer
> > at run-time but it has an overhead for is_swiotlb_buffer().
> 
> Alex said he wanted to look into growing the swiotlb buffer on demand
> for other reason, so adding him to Cc to check if there has been any
> progress on that.
Hi Alex,

Did you get a chance to look into this? If so, have you been able to
make progress on being able to grow the SWIOTLB buffer on demand?

Thanks,
Isaac


* Re: [PATCH v3 13/13] dma: arm64: Add CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC and enable it for arm64
  2022-11-30 18:48         ` Isaac Manjarres
@ 2022-11-30 23:32           ` Alexander Graf
  2023-04-20 11:51             ` Petr Tesařík
  0 siblings, 1 reply; 44+ messages in thread
From: Alexander Graf @ 2022-11-30 23:32 UTC (permalink / raw)
  To: Isaac Manjarres
  Cc: Catalin Marinas, Robin Murphy, Linus Torvalds, Arnd Bergmann,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, linux-mm, iommu, linux-arm-kernel,
	Christoph Hellwig

Hi Isaac,

On 30.11.22 19:48, Isaac Manjarres wrote:
> On Tue, Nov 08, 2022 at 11:03:31AM +0100, Christoph Hellwig wrote:
>> On Tue, Nov 08, 2022 at 09:52:15AM +0000, Catalin Marinas wrote:
>>> Since it's hard to guess the optimal swiotlb buffer for such platforms,
>>> I think a follow-up step would be to use the DMA coherent pool for
>>> bouncing if no swiotlb buffer is available. At least the pool can grow
>>> dynamically. Yet another option would be to increase the swiotlb buffer
>>> at run-time but it has an overhead for is_swiotlb_buffer().
>> Alex said he wanted to look into growing the swiotlb buffer on demand
>> for other reason, so adding him to Cc to check if there has been any
>> progress on that.
> Hi Alex,
>
> Did you get a chance to look into this? If so, have you been able to
> make progress on being able to grow the SWIOTLB buffer on demand?


I've been slightly under water and haven't been able to look at this yet 
:). It's on my list, but will probably be a while until I get to it. 
Would you be interested in having a first try?


Thanks,

Alex




* Re: [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8
  2022-11-06 22:01 [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Catalin Marinas
                   ` (12 preceding siblings ...)
  2022-11-06 22:01 ` [PATCH v3 13/13] dma: arm64: Add CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC and enable it for arm64 Catalin Marinas
@ 2023-03-16 18:38 ` Isaac Manjarres
  2023-04-19 16:06   ` Catalin Marinas
  13 siblings, 1 reply; 44+ messages in thread
From: Isaac Manjarres @ 2023-03-16 18:38 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

On Sun, Nov 06, 2022 at 10:01:30PM +0000, Catalin Marinas wrote:
> Patches 7-12 change some ARCH_KMALLOC_MINALIGN uses to
> ARCH_DMA_MINALIGN. The crypto changes have been rejected by Herbert
> previously but I still included them here until the crypto code is
> refactored.
Hi Catalin,

Herbert merged the changes to the crypto code that were required to be
able to safely lower the minimum alignment for kmalloc in [1].

Given this, I don't think there's anything blocking this series from
being merged. The requirement for SWIOTLB to get to the minimum
kmalloc alignment down to 8 bytes shouldn't prevent this series from
being merged, as the amount of memory that is allocated for SWIOTLB
can be configured through the commandline to minimize the impact of
having SWIOTLB memory. Additionally, even if no SWIOTLB is present,
this series still offers memory savings on a lot of ARM64 platforms
by using the cache line size as the minimum alignment for kmalloc.

Can you please rebase this series so that it can be merged?

Thanks,
Isaac

[1]: https://lore.kernel.org/all/Y4nDL50nToBbi4DS@gondor.apana.org.au/


* Re: [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8
  2023-03-16 18:38 ` [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Isaac Manjarres
@ 2023-04-19 16:06   ` Catalin Marinas
  2023-04-20  9:52     ` Petr Tesarik
  2023-05-15 19:09     ` Isaac Manjarres
  0 siblings, 2 replies; 44+ messages in thread
From: Catalin Marinas @ 2023-04-19 16:06 UTC (permalink / raw)
  To: Isaac Manjarres
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel, Petr Tesarik

On Thu, Mar 16, 2023 at 11:38:47AM -0700, Isaac Manjarres wrote:
> On Sun, Nov 06, 2022 at 10:01:30PM +0000, Catalin Marinas wrote:
> > Patches 7-12 change some ARCH_KMALLOC_MINALIGN uses to
> > ARCH_DMA_MINALIGN. The crypto changes have been rejected by Herbert
> > previously but I still included them here until the crypto code is
> > refactored.
> 
> Herbert merged the changes to the crypto code that were required to be
> able to safely lower the minimum alignment for kmalloc in [1].

Yes, I saw this.

> Given this, I don't think there's anything blocking this series from
> being merged. The requirement for SWIOTLB to get to the minimum
> kmalloc alignment down to 8 bytes shouldn't prevent this series from
> being merged, as the amount of memory that is allocated for SWIOTLB
> can be configured through the commandline to minimize the impact of
> having SWIOTLB memory. Additionally, even if no SWIOTLB is present,
> this series still offers memory savings on a lot of ARM64 platforms
> by using the cache line size as the minimum alignment for kmalloc.

Actually, there's some progress on the swiotlb front to allow dynamic
allocation. I haven't reviewed the series yet (I wasn't aware of it
until v2) but at a quick look, it limits the dynamic allocation to
bouncing buffers of at least a page size. Maybe this can be later
improved for buffers below ARCH_DMA_MINALIGN.

https://lore.kernel.org/r/cover.1681898595.git.petr.tesarik.ext@huawei.com

> Can you please rebase this series so that it can be merged?

I rebased it locally but the last stumbling block is sorting out the
iommu bouncing. I was hoping Robin Murphy can lend a hand but he's been
busy with other bits. I'll repost the series at 6.4-rc1.

-- 
Catalin


* Re: [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8
  2023-04-19 16:06   ` Catalin Marinas
@ 2023-04-20  9:52     ` Petr Tesarik
  2023-04-20 17:43       ` Catalin Marinas
  2023-05-15 19:09     ` Isaac Manjarres
  1 sibling, 1 reply; 44+ messages in thread
From: Petr Tesarik @ 2023-04-20  9:52 UTC (permalink / raw)
  To: Catalin Marinas, Isaac Manjarres
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

On 4/19/2023 6:06 PM, Catalin Marinas wrote:
> On Thu, Mar 16, 2023 at 11:38:47AM -0700, Isaac Manjarres wrote:
> [...]
>> Given this, I don't think there's anything blocking this series from
>> being merged. The requirement for SWIOTLB to get to the minimum
>> kmalloc alignment down to 8 bytes shouldn't prevent this series from
>> being merged, as the amount of memory that is allocated for SWIOTLB
>> can be configured through the commandline to minimize the impact of
>> having SWIOTLB memory. Additionally, even if no SWIOTLB is present,
>> this series still offers memory savings on a lot of ARM64 platforms
>> by using the cache line size as the minimum alignment for kmalloc.
> 
> Actually, there's some progress on the swiotlb front to allow dynamic
> allocation. I haven't reviewed the series yet (I wasn't aware of it
> until v2) but at a quick look, it limits the dynamic allocation to
> bouncing buffers of at least a page size. Maybe this can be later
> improved for buffers below ARCH_DMA_MINALIGN.

Indeed. My patch allocates dynamic bounce buffers with
dma_direct_alloc_pages() to keep things simple for now, but there is no
real reason against allocating less than a page with another suitable
allocator.

However, I'd be interested in what the use case is, so I can assess the
performance impact; it depends on the workload and, FYI, may not even
be negative. ;-)

Petr Tesarik



* Re: [PATCH v3 13/13] dma: arm64: Add CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC and enable it for arm64
  2022-11-30 23:32           ` Alexander Graf
@ 2023-04-20 11:51             ` Petr Tesařík
  0 siblings, 0 replies; 44+ messages in thread
From: Petr Tesařík @ 2023-04-20 11:51 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Isaac Manjarres, Catalin Marinas, Robin Murphy, Linus Torvalds,
	Arnd Bergmann, Greg Kroah-Hartman, Will Deacon, Marc Zyngier,
	Andrew Morton, Herbert Xu, Ard Biesheuvel, Saravana Kannan,
	Alasdair Kergon, Daniel Vetter, Joerg Roedel, Mark Brown,
	Mike Snitzer, Rafael J. Wysocki, linux-mm, iommu,
	linux-arm-kernel, Christoph Hellwig

Hi Alex!

Nice to meet you again...

On Thu, 1 Dec 2022 00:32:07 +0100
Alexander Graf <agraf@csgraf.de> wrote:

> Hi Isaac,
> 
> On 30.11.22 19:48, Isaac Manjarres wrote:
> > On Tue, Nov 08, 2022 at 11:03:31AM +0100, Christoph Hellwig wrote:  
> >> On Tue, Nov 08, 2022 at 09:52:15AM +0000, Catalin Marinas wrote:  
> >>> Since it's hard to guess the optimal swiotlb buffer for such platforms,
> >>> I think a follow-up step would be to use the DMA coherent pool for
> >>> bouncing if no swiotlb buffer is available. At least the pool can grow
> >>> dynamically. Yet another option would be to increase the swiotlb buffer
> >>> at run-time but it has an overhead for is_swiotlb_buffer().  
> >> Alex said he wanted to look into growing the swiotlb buffer on demand
> >> for other reason, so adding him to Cc to check if there has been any
> >> progress on that.  
> > Hi Alex,
> >
> > Did you get a chance to look into this? If so, have you been able to
> > make progress on being able to grow the SWIOTLB buffer on demand?  
> 
> 
> I've been slightly under water and haven't been able to look at this yet 
> :). It's on my list, but will probably be a while until I get to it. 
> Would you be interested in having a first try?

All right, I have just found this thread now after having sent my own
patch series to make SWIOTLB dynamic. I hope you don't mind. I didn't
want to "steal" the project from you.

Petr T


* Re: [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8
  2023-04-20  9:52     ` Petr Tesarik
@ 2023-04-20 17:43       ` Catalin Marinas
  0 siblings, 0 replies; 44+ messages in thread
From: Catalin Marinas @ 2023-04-20 17:43 UTC (permalink / raw)
  To: Petr Tesarik
  Cc: Isaac Manjarres, Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel

On Thu, Apr 20, 2023 at 11:52:00AM +0200, Petr Tesarik wrote:
> On 4/19/2023 6:06 PM, Catalin Marinas wrote:
> > On Thu, Mar 16, 2023 at 11:38:47AM -0700, Isaac Manjarres wrote:
> >[...]>> Given this, I don't think there's anything blocking this series from
> >> being merged. The requirement for SWIOTLB to get to the minimum
> >> kmalloc alignment down to 8 bytes shouldn't prevent this series from
> >> being merged, as the amount of memory that is allocated for SWIOTLB
> >> can be configured through the commandline to minimize the impact of
> >> having SWIOTLB memory. Additionally, even if no SWIOTLB is present,
> >> this series still offers memory savings on a lot of ARM64 platforms
> >> by using the cache line size as the minimum alignment for kmalloc.
> > 
> > Actually, there's some progress on the swiotlb front to allow dynamic
> > allocation. I haven't reviewed the series yet (I wasn't aware of it
> > until v2) but at a quick look, it limits the dynamic allocation to
> > bouncing buffers of at least a page size. Maybe this can be later
> > improved for buffers below ARCH_DMA_MINALIGN.
> 
> Indeed. My patch allocates dynamic bounce buffers with
> dma_direct_alloc_pages() to keep things simple for now, but there is no
> real reason against allocating less than a page with another suitable
> allocator.

I guess it could fall back to a suitably aligned kmalloc() for smaller
sizes.

> However, I'd be interested what the use case is, so I can assess the
> performance impact, which depends on workload, and FYI it may not even
> be negative. ;-)

On arm64 we have an ARCH_DMA_MINALIGN of 128 bytes as that's the largest
cache line size that you can find on a non-coherent platform. The
implication is that ARCH_KMALLOC_MINALIGN is also 128, so smaller
slab-{8,16,32,64,96,192} caches cannot be created, leading to some
memory wastage.

This series decouples the two static alignments so that we can have an
ARCH_KMALLOC_MINALIGN of 8 while keeping ARCH_DMA_MINALIGN as 128. The
problem is that there are some drivers that do a small kmalloc() (below
a cache line size; typically USB drivers) and expect DMA to such buffer
to work. If the cache line is shared with some unrelated data, either
the cache maintenance in the DMA API corrupts such data or the cache
dirtying overwrites inbound DMA data.

So, the solution is to bounce such small buffers if they end up in
functions like dma_map_single(). All we need is for the bounce buffer to
be aligned to the cache line size and honour the coherent mask (normally
ok with one of the GFP_DMA/DMA32 flags if required).
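
In dma-direct terms, the check slots into the mapping path roughly as
follows (a simplified sketch; error handling and the cache maintenance on
the direct path are elided):

static inline dma_addr_t dma_direct_map_page(struct device *dev,
		struct page *page, unsigned long offset, size_t size,
		enum dma_data_direction dir, unsigned long attrs)
{
	phys_addr_t phys = page_to_phys(page) + offset;

	/* bounce small kmalloc() buffers as well as non-addressable ones */
	if (is_swiotlb_force_bounce(dev) ||
	    dma_kmalloc_needs_bounce(dev, size, dir))
		return swiotlb_map(dev, phys, size, dir, attrs);

	/* ... the normal direct-mapping path ... */
	return phys_to_dma(dev, phys);
}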

The swiotlb buffer would solve this but there are some (mobile)
platforms where the vendor disables the bounce buffer to save memory.
Having a way to dynamically allocate it in those rare cases above would
be helpful.

-- 
Catalin


* Re: [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8
  2023-04-19 16:06   ` Catalin Marinas
  2023-04-20  9:52     ` Petr Tesarik
@ 2023-05-15 19:09     ` Isaac Manjarres
  2023-05-16 17:19       ` Catalin Marinas
  1 sibling, 1 reply; 44+ messages in thread
From: Isaac Manjarres @ 2023-05-15 19:09 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel, Petr Tesarik

On Wed, Apr 19, 2023 at 05:06:04PM +0100, Catalin Marinas wrote:
> I rebased it locally but the last stumbling block is sorting out the
> iommu bouncing. I was hoping Robin Murphy can lend a hand but he's been
> busy with other bits. I'll repost the series at 6.4-rc1.
Hey Catalin, just following up on this. I think it might be worthwhile
to split this series into two series:

Series 1: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN,
and use the cacheline size to determine the minimum kmalloc
alignment.

Series 2: Lower the minimum kmalloc alignment to 8 bytes by adding
support for using SWIOTLB to bounce unaligned kmalloc buffers for DMA
transactions.

Dividing the patches as such has the advantage of lowering the minimum
kmalloc alignment to 64 bytes on many ARM64 systems while the work for
lowering the minimum alignment to 8 bytes proceeds. This provides a
noticeable decrease in the slab memory footprint (e.g. I observed a 15
MB decrease in slab usage on a device I was using).

What are your thoughts on this?

--Isaac


* Re: [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8
  2023-05-15 19:09     ` Isaac Manjarres
@ 2023-05-16 17:19       ` Catalin Marinas
  2023-05-16 18:19         ` Isaac Manjarres
  0 siblings, 1 reply; 44+ messages in thread
From: Catalin Marinas @ 2023-05-16 17:19 UTC (permalink / raw)
  To: Isaac Manjarres
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel, Petr Tesarik

On Mon, May 15, 2023 at 12:09:12PM -0700, Isaac Manjarres wrote:
> On Wed, Apr 19, 2023 at 05:06:04PM +0100, Catalin Marinas wrote:
> > I rebased it locally but the last stumbling block is sorting out the
> > iommu bouncing. I was hoping Robin Murphy can lend a hand but he's been
> > busy with other bits. I'll repost the series at 6.4-rc1.
> Hey Catalin, just following up on this. I think it might be worthwhile
> to split this series into two series:
> 
> Series 1: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN,
> and use the cacheline size to determine the minimum kmalloc
> alignment.
> 
> Series 2: Lower the minimum kmalloc alignment to 8 bytes by adding
> support for using SWIOTLB to bounce unaligned kmalloc buffers for DMA
> transactions.
> 
> Dividing the patches as such has the advantage of lowering the minimum
> kmalloc alignment to 64 bytes on many ARM64 systems while the work for
> lowering the minimum alignment to 8 bytes proceeds. This provides a
> noticeable decrease in the slab memory footprint (e.g. I observed a 15
> MB decrease in slab usage on a device I was using).

I attempted "series 1" some time ago and the discussion led to the
combined approach (i.e. don't bother with limiting kmalloc minimum
alignment to cache_line_size() but instead bounce those small buffers).
In my series, I still have this fallback in case there's no swiotlb
buffer.
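
The fallback is roughly this (the helper is named after patch 4 of the
series but the body here is a simplified sketch; io_tlb_default_mem is
the global swiotlb descriptor):

static unsigned int __kmalloc_minalign(void)
{
	/* swiotlb is available, small buffers can be bounced */
	if (io_tlb_default_mem.nslabs)
		return ARCH_KMALLOC_MINALIGN;	/* 8 bytes */

	/* no bouncing possible, fall back to the cache line size */
	return dma_get_cache_alignment();	/* typically 64 on arm64 */
}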

I'll post a new series this week (including DMA bouncing) but I'll try
to move the bouncing towards the end of the series in case there are
more discussions around it, so that at least the first part can be
picked up.

-- 
Catalin


* Re: [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8
  2023-05-16 17:19       ` Catalin Marinas
@ 2023-05-16 18:19         ` Isaac Manjarres
  0 siblings, 0 replies; 44+ messages in thread
From: Isaac Manjarres @ 2023-05-16 18:19 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linus Torvalds, Arnd Bergmann, Christoph Hellwig,
	Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
	Herbert Xu, Ard Biesheuvel, Saravana Kannan, Alasdair Kergon,
	Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer,
	Rafael J. Wysocki, Robin Murphy, linux-mm, iommu,
	linux-arm-kernel, Petr Tesarik

On Tue, May 16, 2023 at 06:19:43PM +0100, Catalin Marinas wrote:
> On Mon, May 15, 2023 at 12:09:12PM -0700, Isaac Manjarres wrote:
> > Hey Catalin, just following up on this. I think it might be worthwhile
> > to split this series into two series:
> > 
> > Series 1: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN,
> > and use the cacheline size to determine the minimum kmalloc
> > alignment.
> > 
> > Series 2: Lower the minimum kmalloc alignment to 8 bytes by adding
> > support for using SWIOTLB to bounce unaligned kmalloc buffers for DMA
> > transactions.
> 
> I attempted "series 1" some time ago and the discussion led to the
> combined approach (i.e. don't bother with limiting kmalloc minimum
> alignment to cache_line_size() but instead bounce those small buffers).
> In my series, I still have this fallback in case there's no swiotlb
> buffer.

> I'll post a new series this week (including DMA bouncing) but I'll try
> to move the bouncing towards the end of the series in case there are
> more discussions around it, so that at least the first part can be
> picked up.

Thanks Catalin! I think restructuring the series as you're suggesting
makes sense. At least being able to pick up the first part of the series
would be great, since it will have a positive impact on the memory
footprint for a lot of devices. This also helps alleviate some of the
memory overhead for devices that move from a 32-bit ARM kernel to a
64-bit ARM kernel.

The second part can continue to be refined until the SWIOTLB and IOMMU bounce
buffering refactor is complete.

-Isaac


end of thread, other threads: [~2023-05-16 18:19 UTC | newest]

Thread overview: 44+ messages
2022-11-06 22:01 [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Catalin Marinas
2022-11-06 22:01 ` [PATCH v3 01/13] mm/slab: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN Catalin Marinas
2022-11-06 22:01 ` [PATCH v3 02/13] dma-mapping: Force bouncing if the kmalloc() size is not cacheline-aligned Catalin Marinas
2022-11-07  9:43   ` Christoph Hellwig
2022-11-06 22:01 ` [PATCH v3 03/13] iommu/dma: Force bouncing of the " Catalin Marinas
2022-11-07  9:46   ` Christoph Hellwig
2022-11-07 10:54     ` Catalin Marinas
2022-11-07 13:26       ` Robin Murphy
2022-11-08 10:51         ` Catalin Marinas
2022-11-08 11:40           ` Robin Murphy
2022-11-08  7:50       ` Christoph Hellwig
2022-11-14 23:23   ` Isaac Manjarres
2022-11-15 11:48     ` Catalin Marinas
2022-11-06 22:01 ` [PATCH v3 04/13] mm/slab: Allow kmalloc() minimum alignment fallback to dma_get_cache_alignment() Catalin Marinas
     [not found]   ` <202211070812.BhGKB0Hd-lkp@intel.com>
2022-11-07  9:22     ` Catalin Marinas
2022-11-06 22:01 ` [PATCH v3 05/13] mm/slab: Simplify create_kmalloc_cache() args and make it static Catalin Marinas
2022-11-06 22:01 ` [PATCH v3 06/13] dma: Allow the smaller cache_line_size() returned by dma_get_cache_alignment() Catalin Marinas
2022-11-06 22:01 ` [PATCH v3 07/13] drivers/base: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN Catalin Marinas
2022-11-06 22:01 ` [PATCH v3 08/13] drivers/gpu: " Catalin Marinas
2022-11-06 22:01 ` [PATCH v3 09/13] drivers/usb: " Catalin Marinas
2022-11-06 22:01 ` [PATCH v3 10/13] drivers/spi: " Catalin Marinas
2022-11-07 12:58   ` Mark Brown
2022-11-06 22:01 ` [PATCH v3 11/13] crypto: " Catalin Marinas
2022-11-07  2:22   ` Herbert Xu
2022-11-07  9:05     ` Catalin Marinas
2022-11-07  9:12       ` Herbert Xu
2022-11-07  9:38         ` Catalin Marinas
2022-11-06 22:01 ` [PATCH v3 12/13] drivers/md: " Catalin Marinas
2022-11-06 22:01 ` [PATCH v3 13/13] dma: arm64: Add CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC and enable it for arm64 Catalin Marinas
2022-11-07 13:03   ` Robin Murphy
2022-11-07 14:38     ` Christoph Hellwig
2022-11-07 15:24       ` Robin Murphy
2022-11-08  9:52     ` Catalin Marinas
2022-11-08 10:03       ` Christoph Hellwig
2022-11-30 18:48         ` Isaac Manjarres
2022-11-30 23:32           ` Alexander Graf
2023-04-20 11:51             ` Petr Tesařík
2023-03-16 18:38 ` [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Isaac Manjarres
2023-04-19 16:06   ` Catalin Marinas
2023-04-20  9:52     ` Petr Tesarik
2023-04-20 17:43       ` Catalin Marinas
2023-05-15 19:09     ` Isaac Manjarres
2023-05-16 17:19       ` Catalin Marinas
2023-05-16 18:19         ` Isaac Manjarres
