linux-input.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/9] Input: Fix insufficent DMA alignment.
@ 2022-11-27 14:41 Jonathan Cameron
  2022-11-27 14:41 ` [PATCH 1/9] Input: psxpad - Fix padding for DMA safe buffers Jonathan Cameron
                   ` (9 more replies)
  0 siblings, 10 replies; 20+ messages in thread
From: Jonathan Cameron @ 2022-11-27 14:41 UTC (permalink / raw)
  To: linux-input, Dmitry Torokhov
  Cc: Jonathan Cameron, Daniel Mack, Michael Hennerich,
	Tomohiro Yoshidomi, Javier Martinez Canillas, Linus Walleij,
	Benjamin Tissoires, Lauri Kasanen, Daniel Hung-yu Wu

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

This problem was discovered in IIO as a side effect of the discussions about
relaxing kmalloc alignment on arm64 and resulted in a series of large
patch sets.

https://lore.kernel.org/linux-iio/20220508175712.647246-1-jic23@kernel.org/

Unsurprisingly there are cases of it in other subsystems.

The short version of this is that there are a few known arm64 chips where
___cacheline_aligned enforces 64 byte alignment which is what we typically
want for performance optimization as the size of the L1 cache lines.
However, further out in the cache hierarchy we have caches with 128 byte
lines.  Those are the ones that matter for DMA safety.
So we need the larger alignment guarantees of ARCH_KMALLOC_MINALIGN which
in this case is 128 bytes.

There is one other use of ____cacheline_aligned in input:
joystick/iforce/iforce-usb.c

Whilst suspicious I'm not sure enough of the requirements of USB to
know if they are there for DMA safety or some other constraint.

Jonathan

Cc: Daniel Mack <daniel@zonque.org>
Cc: Michael Hennerich <michael.hennerich@analog.com>
Cc: Tomohiro Yoshidomi <sylph23k@gmail.com>
Cc: Javier Martinez Canillas <javier@dowhile0.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: Lauri Kasanen <cand@gmx.com>
Cc: Daniel Hung-yu Wu <hywu@google.com>

Jonathan Cameron (9):
  Input: psxpad - Fix padding for DMA safe buffers.
  Input: ad714x - Fix padding for DMA safe buffers.
  Input: ad7887 - Fix padding for DMA safe buffers.
  Input: ads7846 - Fix padding for DMA safe buffers.
  Input: cyttsp - Fix padding for DMA safe buffers.
  Input: surface3 - Fix padding for DMA safe buffers.
  Input: n64joy - Fix DMA buffer alignment.
  Input: atmel_captouch - Avoid suspect DMA buffer alignment.
  Input: elants - Fix suspect DMA buffer alignment

 drivers/input/joystick/n64joy.c          | 6 +++---
 drivers/input/joystick/psxpad-spi.c      | 4 ++--
 drivers/input/misc/ad714x.h              | 2 +-
 drivers/input/misc/atmel_captouch.c      | 2 +-
 drivers/input/touchscreen/ad7877.c       | 4 ++--
 drivers/input/touchscreen/ads7846.c      | 4 ++--
 drivers/input/touchscreen/cyttsp_core.h  | 2 +-
 drivers/input/touchscreen/elants_i2c.c   | 2 +-
 drivers/input/touchscreen/surface3_spi.c | 2 +-
 9 files changed, 14 insertions(+), 14 deletions(-)

-- 
2.38.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/9] Input: psxpad - Fix padding for DMA safe buffers.
  2022-11-27 14:41 [PATCH 0/9] Input: Fix insufficent DMA alignment Jonathan Cameron
@ 2022-11-27 14:41 ` Jonathan Cameron
  2022-11-27 14:41 ` [PATCH 2/9] Input: ad714x " Jonathan Cameron
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Jonathan Cameron @ 2022-11-27 14:41 UTC (permalink / raw)
  To: linux-input, Dmitry Torokhov; +Cc: Jonathan Cameron, Tomohiro Yoshidomi

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

On some architectures (e.g. arm64), ____cachline_aligned only aligns
to the cacheline size of the L1 cache size. L1_CACHE_BYTES in
arch64/include/asm/cache.h  Unfortunately DMA safety on these
architectures requires the buffer no share a last level cache cacheline
given by ARCH_DMA_MINALIGN which has a greater granularity.
ARCH_DMA_MINALIGN is not defined for all architectures, but when it is
defined it is used to set the size of ARCH_KMALLOC_MINALIGN
to allow DMA safe buffer allocations.

As such the correct alignment requirement is
__aligned(ARCH_KMALLOC_MINALIGN).
This has recently been fixed in other subsystems such as IIO.

Fixes: 8be193c7b1f4 ("Input: add support for PlayStation 1/2 joypads connected via SPI")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Tomohiro Yoshidomi <sylph23k@gmail.com>
---
 drivers/input/joystick/psxpad-spi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/input/joystick/psxpad-spi.c b/drivers/input/joystick/psxpad-spi.c
index a32656064f39..8098d205b58d 100644
--- a/drivers/input/joystick/psxpad-spi.c
+++ b/drivers/input/joystick/psxpad-spi.c
@@ -65,8 +65,8 @@ struct psxpad {
 	bool motor2enable;
 	u8 motor1level;
 	u8 motor2level;
-	u8 sendbuf[0x20] ____cacheline_aligned;
-	u8 response[sizeof(PSX_CMD_POLL)] ____cacheline_aligned;
+	u8 sendbuf[0x20] __aligned(ARCH_KMALLOC_MINALIGN);
+	u8 response[sizeof(PSX_CMD_POLL)] __aligned(ARCH_KMALLOC_MINALIGN);
 };
 
 static int psxpad_command(struct psxpad *pad, const u8 sendcmdlen)
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/9] Input: ad714x - Fix padding for DMA safe buffers.
  2022-11-27 14:41 [PATCH 0/9] Input: Fix insufficent DMA alignment Jonathan Cameron
  2022-11-27 14:41 ` [PATCH 1/9] Input: psxpad - Fix padding for DMA safe buffers Jonathan Cameron
@ 2022-11-27 14:41 ` Jonathan Cameron
  2022-11-28  7:14   ` Hennerich, Michael
  2022-11-27 14:41 ` [PATCH 3/9] Input: ad7887 " Jonathan Cameron
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 20+ messages in thread
From: Jonathan Cameron @ 2022-11-27 14:41 UTC (permalink / raw)
  To: linux-input, Dmitry Torokhov; +Cc: Jonathan Cameron, Michael Hennerich

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

On some architectures (e.g. arm64), ____cachline_aligned only aligns
to the cacheline size of the L1 cache size. L1_CACHE_BYTES in
arch64/include/asm/cache.h  Unfortunately DMA safety on these
architectures requires the buffer no share a last level cache cacheline
given by ARCH_DMA_MINALIGN which has a greater granularity.
ARCH_DMA_MINALIGN is not defined for all architectures, but when it is
defined it is used to set the size of ARCH_KMALLOC_MINALIGN
to allow DMA safe buffer allocations.

As such the correct alignment requirement is
__aligned(ARCH_KMALLOC_MINALIGN).
This has recently been fixed in other subsystems such as IIO.

Fixes tag is inprecise because there may not have been any architectures
where the two values were different at the time of the earlier fix.

Fixes: c0409feb8689 ("Input: ad714x - use DMA-safe buffers for spi_write()")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Michael Hennerich <michael.hennerich@analog.com>
---
 drivers/input/misc/ad714x.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/input/misc/ad714x.h b/drivers/input/misc/ad714x.h
index af847b5f0d0e..2b8b901183be 100644
--- a/drivers/input/misc/ad714x.h
+++ b/drivers/input/misc/ad714x.h
@@ -41,7 +41,7 @@ struct ad714x_chip {
 	unsigned product;
 	unsigned version;
 
-	__be16 xfer_buf[16] ____cacheline_aligned;
+	__be16 xfer_buf[16] __aligned(ARCH_KMALLOC_MINALIGN);
 
 };
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/9] Input: ad7887 - Fix padding for DMA safe buffers.
  2022-11-27 14:41 [PATCH 0/9] Input: Fix insufficent DMA alignment Jonathan Cameron
  2022-11-27 14:41 ` [PATCH 1/9] Input: psxpad - Fix padding for DMA safe buffers Jonathan Cameron
  2022-11-27 14:41 ` [PATCH 2/9] Input: ad714x " Jonathan Cameron
@ 2022-11-27 14:41 ` Jonathan Cameron
  2022-11-28  7:14   ` Hennerich, Michael
  2022-11-27 14:41 ` [PATCH 4/9] Input: ads7846 " Jonathan Cameron
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 20+ messages in thread
From: Jonathan Cameron @ 2022-11-27 14:41 UTC (permalink / raw)
  To: linux-input, Dmitry Torokhov; +Cc: Jonathan Cameron, Michael Hennerich

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

On some architectures (e.g. arm64), ____cachline_aligned only aligns
to the cacheline size of the L1 cache size. L1_CACHE_BYTES in
arch64/include/asm/cache.h  Unfortunately DMA safety on these
architectures requires the buffer no share a last level cache cacheline
given by ARCH_DMA_MINALIGN which has a greater granularity.
ARCH_DMA_MINALIGN is not defined for all architectures, but when it is
defined it is used to set the size of ARCH_KMALLOC_MINALIGN
to allow DMA safe buffer allocations.

As such the correct alignment requirement is
__aligned(ARCH_KMALLOC_MINALIGN).
This has recently been fixed in other subsystems such as IIO.

Fixes tag is imprecise as there may have been no architectures where the
two alignments differed at the time of that patch.

Fixes: 3843384a0554 ("Input: ad7877 - keep dma rx buffers in seperate cache lines")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Michael Hennerich <michael.hennerich@analog.com>
---
 drivers/input/touchscreen/ad7877.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c
index 08f5372f0bfd..eaf11dffb28e 100644
--- a/drivers/input/touchscreen/ad7877.c
+++ b/drivers/input/touchscreen/ad7877.c
@@ -150,7 +150,7 @@ struct ser_req {
 	 * DMA (thus cache coherency maintenance) requires the
 	 * transfer buffers to live in their own cache lines.
 	 */
-	u16 sample ____cacheline_aligned;
+	u16 sample __aligned(ARCH_KMALLOC_MINALIGN);
 };
 
 struct ad7877 {
@@ -189,7 +189,7 @@ struct ad7877 {
 	 * DMA (thus cache coherency maintenance) requires the
 	 * transfer buffers to live in their own cache lines.
 	 */
-	u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned;
+	u16 conversion_data[AD7877_NR_SENSE] __aligned(ARCH_KMALLOC_MINALIGN);
 };
 
 static bool gpio3;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/9] Input: ads7846 - Fix padding for DMA safe buffers.
  2022-11-27 14:41 [PATCH 0/9] Input: Fix insufficent DMA alignment Jonathan Cameron
                   ` (2 preceding siblings ...)
  2022-11-27 14:41 ` [PATCH 3/9] Input: ad7887 " Jonathan Cameron
@ 2022-11-27 14:41 ` Jonathan Cameron
  2022-11-27 14:41 ` [PATCH 5/9] Input: cyttsp " Jonathan Cameron
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Jonathan Cameron @ 2022-11-27 14:41 UTC (permalink / raw)
  To: linux-input, Dmitry Torokhov; +Cc: Jonathan Cameron, Daniel Mack

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

On some architectures (e.g. arm64), ____cachline_aligned only aligns
to the cacheline size of the L1 cache size. L1_CACHE_BYTES in
arch64/include/asm/cache.h  Unfortunately DMA safety on these
architectures requires the buffer no share a last level cache cacheline
given by ARCH_DMA_MINALIGN which has a greater granularity.
ARCH_DMA_MINALIGN is not defined for all architectures, but when it is
defined it is used to set the size of ARCH_KMALLOC_MINALIGN
to allow DMA safe buffer allocations.

As such the correct alignment requirement is
__aligned(ARCH_KMALLOC_MINALIGN).
This has recently been fixed in other subsystems such as IIO.

Fixes tag for this is complex as at the time of original introduction, it
is likely that there were no cases where the two alignments were different.

Fixes: 1dbe7dada2d0 ("Input: ads7846 - make transfer buffers DMA safe")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Daniel Mack <daniel@zonque.org>
---
 drivers/input/touchscreen/ads7846.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/input/touchscreen/ads7846.c b/drivers/input/touchscreen/ads7846.c
index bed68a68f330..074ca9f59788 100644
--- a/drivers/input/touchscreen/ads7846.c
+++ b/drivers/input/touchscreen/ads7846.c
@@ -337,7 +337,7 @@ struct ser_req {
 	 * DMA (thus cache coherency maintenance) requires the
 	 * transfer buffers to live in their own cache lines.
 	 */
-	__be16 sample ____cacheline_aligned;
+	__be16 sample __aligned(ARCH_KMALLOC_MINALIGN);
 };
 
 struct ads7845_ser_req {
@@ -348,7 +348,7 @@ struct ads7845_ser_req {
 	 * DMA (thus cache coherency maintenance) requires the
 	 * transfer buffers to live in their own cache lines.
 	 */
-	u8 sample[3] ____cacheline_aligned;
+	u8 sample[3] __aligned(ARCH_KMALLOC_MINALIGN);
 };
 
 static int ads7846_read12_ser(struct device *dev, unsigned command)
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 5/9] Input: cyttsp - Fix padding for DMA safe buffers.
  2022-11-27 14:41 [PATCH 0/9] Input: Fix insufficent DMA alignment Jonathan Cameron
                   ` (3 preceding siblings ...)
  2022-11-27 14:41 ` [PATCH 4/9] Input: ads7846 " Jonathan Cameron
@ 2022-11-27 14:41 ` Jonathan Cameron
  2022-11-27 14:41 ` [PATCH 6/9] Input: surface3 " Jonathan Cameron
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Jonathan Cameron @ 2022-11-27 14:41 UTC (permalink / raw)
  To: linux-input, Dmitry Torokhov
  Cc: Jonathan Cameron, Javier Martinez Canillas, Linus Walleij

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

On some architectures (e.g. arm64), ____cachline_aligned only aligns
to the cacheline size of the L1 cache size. L1_CACHE_BYTES in
arch64/include/asm/cache.h  Unfortunately DMA safety on these
architectures requires the buffer no share a last level cache cacheline
given by ARCH_DMA_MINALIGN which has a greater granularity.
ARCH_DMA_MINALIGN is not defined for all architectures, but when it is
defined it is used to set the size of ARCH_KMALLOC_MINALIGN
to allow DMA safe buffer allocations.

As such the correct alignment requirement is
__aligned(ARCH_KMALLOC_MINALIGN).
This has recently been fixed in other subsystems such as IIO.

Fixes tag for this is complex as at the time of original introduction, it
is likely that there were no cases where the two alignments were different.

Fixes: 4065d1e7b216 ("Input: add Cypress TTSP capacitive multi-touch screen support")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Javier Martinez Canillas <javier@dowhile0.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/input/touchscreen/cyttsp_core.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/input/touchscreen/cyttsp_core.h b/drivers/input/touchscreen/cyttsp_core.h
index 075509e695a2..e87cb323623c 100644
--- a/drivers/input/touchscreen/cyttsp_core.h
+++ b/drivers/input/touchscreen/cyttsp_core.h
@@ -131,7 +131,7 @@ struct cyttsp {
 	u8 lp_intrvl;
 	u8 *bl_keys;
 
-	u8 xfer_buf[] ____cacheline_aligned;
+	u8 xfer_buf[] __aligned(ARCH_KMALLOC_MINALIGN);
 };
 
 struct cyttsp *cyttsp_probe(const struct cyttsp_bus_ops *bus_ops,
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 6/9] Input: surface3 - Fix padding for DMA safe buffers.
  2022-11-27 14:41 [PATCH 0/9] Input: Fix insufficent DMA alignment Jonathan Cameron
                   ` (4 preceding siblings ...)
  2022-11-27 14:41 ` [PATCH 5/9] Input: cyttsp " Jonathan Cameron
@ 2022-11-27 14:41 ` Jonathan Cameron
  2022-11-27 16:26   ` Jonathan Cameron
  2022-11-27 14:41 ` [PATCH 7/9] Input: n64joy - Fix DMA buffer alignment Jonathan Cameron
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 20+ messages in thread
From: Jonathan Cameron @ 2022-11-27 14:41 UTC (permalink / raw)
  To: linux-input, Dmitry Torokhov; +Cc: Jonathan Cameron, Benjamin Tissoires

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

On some architectures (e.g. arm64), ____cachline_aligned only aligns
to the cacheline size of the L1 cache size. L1_CACHE_BYTES in
arch64/include/asm/cache.h  Unfortunately DMA safety on these
architectures requires the buffer no share a last level cache cacheline
given by ARCH_DMA_MINALIGN which has a greater granularity.
ARCH_DMA_MINALIGN is not defined for all architectures, but when it is
defined it is used to set the size of ARCH_KMALLOC_MINALIGN
to allow DMA safe buffer allocations.

As such the correct alignment requirement is
__aligned(ARCH_KMALLOC_MINALIGN).
This has recently been fixed in other subsystems such as IIO.

Presumably this part is little used on boards where this could actually
matter so this is mostly about removing code that might be coppied
elsewhere.

Fixes: 4feacbc24eea ("Input: add new driver for the Surface 3")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
---
 drivers/input/touchscreen/surface3_spi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/input/touchscreen/surface3_spi.c b/drivers/input/touchscreen/surface3_spi.c
index 1da23e5585a0..6c884fc2b332 100644
--- a/drivers/input/touchscreen/surface3_spi.c
+++ b/drivers/input/touchscreen/surface3_spi.c
@@ -32,7 +32,7 @@ struct surface3_ts_data {
 	struct input_dev *pen_input_dev;
 	int pen_tool;
 
-	u8 rd_buf[SURFACE3_PACKET_SIZE]		____cacheline_aligned;
+	u8 rd_buf[SURFACE3_PACKET_SIZE] __aligned(ARCH_KMALLOC_MINALIGN;
 };
 
 struct surface3_ts_data_finger {
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 7/9] Input: n64joy - Fix DMA buffer alignment.
  2022-11-27 14:41 [PATCH 0/9] Input: Fix insufficent DMA alignment Jonathan Cameron
                   ` (5 preceding siblings ...)
  2022-11-27 14:41 ` [PATCH 6/9] Input: surface3 " Jonathan Cameron
@ 2022-11-27 14:41 ` Jonathan Cameron
  2022-11-27 16:48   ` Lauri Kasanen
  2022-11-27 14:41 ` [PATCH 8/9] Input: atmel_captouch - Avoid suspect " Jonathan Cameron
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 20+ messages in thread
From: Jonathan Cameron @ 2022-11-27 14:41 UTC (permalink / raw)
  To: linux-input, Dmitry Torokhov; +Cc: Jonathan Cameron, Lauri Kasanen

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

The use of ____cacheline_aligned to ensure a buffer is DMA safe only
enforces the start of the buffer alignment. In this case, sufficient
alignment is already ensured by the use of kzalloc().
____cacheline_aligned does not ensure that no other members of the
structure are placed in the same cacheline after the end of the
buffer marked.  Thus to ensure a DMA safe buffer it must be at the end
of the structure.

Whilst here switch from ____cacheline_aligned to
__aligned(ARCH_KMALLOC_MINALIGN) as on some architectures, with variable
sized cacheline lines across their cache hierarchy, require this
greater alignment guarantee for DMA safety.  Make this change throughout
the driver as it reduces need for a reader to know about the particular
architecture.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Lauri Kasanen <cand@gmx.com>
--
Only partly compile tested as I don't have a mips toolchain set up.
Would be great to add the stubs to be able to build these drivers
with COMPILE_TEST.
---
 drivers/input/joystick/n64joy.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/input/joystick/n64joy.c b/drivers/input/joystick/n64joy.c
index 9dbca366613e..d8c50103c108 100644
--- a/drivers/input/joystick/n64joy.c
+++ b/drivers/input/joystick/n64joy.c
@@ -44,12 +44,12 @@ static const char *n64joy_phys[MAX_CONTROLLERS] = {
 };
 
 struct n64joy_priv {
-	u64 si_buf[8] ____cacheline_aligned;
 	struct timer_list timer;
 	struct mutex n64joy_mutex;
 	struct input_dev *n64joy_dev[MAX_CONTROLLERS];
 	u32 __iomem *reg_base;
 	u8 n64joy_opened;
+	u64 si_buf[8] __aligned(ARCH_KMALLOC_MINALIGN);
 };
 
 struct joydata {
@@ -129,7 +129,7 @@ static void n64joy_exec_pif(struct n64joy_priv *priv, const u64 in[8])
 	local_irq_restore(flags);
 }
 
-static const u64 polldata[] ____cacheline_aligned = {
+static const u64 polldata[] __aligned(ARCH_KMALLOC_MINALIGN) = {
 	0xff010401ffffffff,
 	0xff010401ffffffff,
 	0xff010401ffffffff,
@@ -222,7 +222,7 @@ static void n64joy_close(struct input_dev *dev)
 	mutex_unlock(&priv->n64joy_mutex);
 }
 
-static const u64 __initconst scandata[] ____cacheline_aligned = {
+static const u64 __initconst scandata[] __aligned(ARCH_KMALLOC_MINALIGN) = {
 	0xff010300ffffffff,
 	0xff010300ffffffff,
 	0xff010300ffffffff,
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 8/9] Input: atmel_captouch - Avoid suspect DMA buffer alignment.
  2022-11-27 14:41 [PATCH 0/9] Input: Fix insufficent DMA alignment Jonathan Cameron
                   ` (6 preceding siblings ...)
  2022-11-27 14:41 ` [PATCH 7/9] Input: n64joy - Fix DMA buffer alignment Jonathan Cameron
@ 2022-11-27 14:41 ` Jonathan Cameron
  2022-11-27 14:41 ` [PATCH 9/9] Input: elants - Fix " Jonathan Cameron
  2022-11-28 18:16 ` [PATCH 0/9] Input: Fix insufficent DMA alignment Dmitry Torokhov
  9 siblings, 0 replies; 20+ messages in thread
From: Jonathan Cameron @ 2022-11-27 14:41 UTC (permalink / raw)
  To: linux-input, Dmitry Torokhov; +Cc: Jonathan Cameron, Daniel Hung-yu Wu

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

On some architectures (e.g. arm64), ____cachline_aligned only aligns
to the cacheline size of the L1 cache size. L1_CACHE_BYTES in
arch64/include/asm/cache.h  Unfortunately DMA safety on these
architectures requires the buffer no share a last level cache cacheline
given by ARCH_DMA_MINALIGN which has a greater granularity.
ARCH_DMA_MINALIGN is not defined for all architectures, but when it is
defined it is used to set the size of ARCH_KMALLOC_MINALIGN
to allow DMA safe buffer allocations.

As such the correct alignment requirement is
__aligned(ARCH_KMALLOC_MINALIGN).
This has recently been fixed in other subsystems such as IIO.

This is probably not a fix as such, because likely this particular
device is only used on devices where the given alignment is sufficient.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Daniel Hung-yu Wu <hywu@google.com>
---
 drivers/input/misc/atmel_captouch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/input/misc/atmel_captouch.c b/drivers/input/misc/atmel_captouch.c
index 051aded6815a..9681c763356c 100644
--- a/drivers/input/misc/atmel_captouch.c
+++ b/drivers/input/misc/atmel_captouch.c
@@ -71,7 +71,7 @@ struct atmel_captouch_device {
 	u32 num_btn;
 	u32 keycodes[MAX_NUM_OF_BUTTONS];
 	u8 prev_btn;
-	u8 xfer_buf[8] ____cacheline_aligned;
+	u8 xfer_buf[8] __aligned(ARCH_KMALLOC_MINALIGN);
 };
 
 /*
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 9/9] Input: elants - Fix suspect DMA buffer alignment
  2022-11-27 14:41 [PATCH 0/9] Input: Fix insufficent DMA alignment Jonathan Cameron
                   ` (7 preceding siblings ...)
  2022-11-27 14:41 ` [PATCH 8/9] Input: atmel_captouch - Avoid suspect " Jonathan Cameron
@ 2022-11-27 14:41 ` Jonathan Cameron
  2022-11-28 18:16 ` [PATCH 0/9] Input: Fix insufficent DMA alignment Dmitry Torokhov
  9 siblings, 0 replies; 20+ messages in thread
From: Jonathan Cameron @ 2022-11-27 14:41 UTC (permalink / raw)
  To: linux-input, Dmitry Torokhov; +Cc: Jonathan Cameron, Daniel Hung-yu Wu

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

On some architectures (e.g. arm64), ____cachline_aligned only aligns
to the cacheline size of the L1 cache size. L1_CACHE_BYTES in
arch64/include/asm/cache.h  Unfortunately DMA safety on these
architectures requires the buffer no share a last level cache cacheline
given by ARCH_DMA_MINALIGN which has a greater granularity.
ARCH_DMA_MINALIGN is not defined for all architectures, but when it is
defined it is used to set the size of ARCH_KMALLOC_MINALIGN
to allow DMA safe buffer allocations.

As such the correct alignment requirement is
__aligned(ARCH_KMALLOC_MINALIGN).
This has recently been fixed in other subsystems such as IIO.

Fixes: 00f73f97527f ("Input: elants_i2c - use DMA safe i2c when possible")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Daniel Hung-yu Wu <hywu@google.com>
---
 drivers/input/touchscreen/elants_i2c.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/input/touchscreen/elants_i2c.c b/drivers/input/touchscreen/elants_i2c.c
index 879a4d984c90..7ad58518c651 100644
--- a/drivers/input/touchscreen/elants_i2c.c
+++ b/drivers/input/touchscreen/elants_i2c.c
@@ -184,7 +184,7 @@ struct elants_data {
 	bool keep_power_in_suspend;
 
 	/* Must be last to be used for DMA operations */
-	u8 buf[MAX_PACKET_SIZE] ____cacheline_aligned;
+	u8 buf[MAX_PACKET_SIZE] __aligned(ARCH_KMALLOC_MINALIGN);
 };
 
 static int elants_i2c_send(struct i2c_client *client,
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/9] Input: surface3 - Fix padding for DMA safe buffers.
  2022-11-27 14:41 ` [PATCH 6/9] Input: surface3 " Jonathan Cameron
@ 2022-11-27 16:26   ` Jonathan Cameron
  0 siblings, 0 replies; 20+ messages in thread
From: Jonathan Cameron @ 2022-11-27 16:26 UTC (permalink / raw)
  To: linux-input, Dmitry Torokhov; +Cc: Jonathan Cameron, Benjamin Tissoires

On Sun, 27 Nov 2022 14:41:13 +0000
Jonathan Cameron <jic23@kernel.org> wrote:

> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> On some architectures (e.g. arm64), ____cachline_aligned only aligns
> to the cacheline size of the L1 cache size. L1_CACHE_BYTES in
> arch64/include/asm/cache.h  Unfortunately DMA safety on these
> architectures requires the buffer no share a last level cache cacheline
> given by ARCH_DMA_MINALIGN which has a greater granularity.
> ARCH_DMA_MINALIGN is not defined for all architectures, but when it is
> defined it is used to set the size of ARCH_KMALLOC_MINALIGN
> to allow DMA safe buffer allocations.
> 
> As such the correct alignment requirement is
> __aligned(ARCH_KMALLOC_MINALIGN).
> This has recently been fixed in other subsystems such as IIO.
> 
> Presumably this part is little used on boards where this could actually
> matter so this is mostly about removing code that might be coppied
> elsewhere.
> 
> Fixes: 4feacbc24eea ("Input: add new driver for the Surface 3")
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>

Sigh. Sunday afternoon incompetence on my part...

Dmitry, if everything else is fine with this series I can resend this
or if you are feeling generous feel free to fix it up whilst applying ;)

Jonathan

> ---
>  drivers/input/touchscreen/surface3_spi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/input/touchscreen/surface3_spi.c b/drivers/input/touchscreen/surface3_spi.c
> index 1da23e5585a0..6c884fc2b332 100644
> --- a/drivers/input/touchscreen/surface3_spi.c
> +++ b/drivers/input/touchscreen/surface3_spi.c
> @@ -32,7 +32,7 @@ struct surface3_ts_data {
>  	struct input_dev *pen_input_dev;
>  	int pen_tool;
>  
> -	u8 rd_buf[SURFACE3_PACKET_SIZE]		____cacheline_aligned;
> +	u8 rd_buf[SURFACE3_PACKET_SIZE] __aligned(ARCH_KMALLOC_MINALIGN;

Missing bracket.

>  };
>  
>  struct surface3_ts_data_finger {


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 7/9] Input: n64joy - Fix DMA buffer alignment.
  2022-11-27 14:41 ` [PATCH 7/9] Input: n64joy - Fix DMA buffer alignment Jonathan Cameron
@ 2022-11-27 16:48   ` Lauri Kasanen
  2022-11-27 18:01     ` Jonathan Cameron
  2022-11-28 18:04     ` Dmitry Torokhov
  0 siblings, 2 replies; 20+ messages in thread
From: Lauri Kasanen @ 2022-11-27 16:48 UTC (permalink / raw)
  To: Jonathan Cameron; +Cc: linux-input, Dmitry Torokhov, Jonathan Cameron

On Sun, 27 Nov 2022 14:41:14 +0000
Jonathan Cameron <jic23@kernel.org> wrote:

> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> The use of ____cacheline_aligned to ensure a buffer is DMA safe only
> enforces the start of the buffer alignment. In this case, sufficient
> alignment is already ensured by the use of kzalloc().
> ____cacheline_aligned does not ensure that no other members of the
> structure are placed in the same cacheline after the end of the
> buffer marked.  Thus to ensure a DMA safe buffer it must be at the end
> of the structure.

This move is unnecessary, because the cacheline is 16 bytes and the
buffer is 64 bytes.

> Whilst here switch from ____cacheline_aligned to
> __aligned(ARCH_KMALLOC_MINALIGN) as on some architectures, with variable
> sized cacheline lines across their cache hierarchy, require this
> greater alignment guarantee for DMA safety.  Make this change throughout
> the driver as it reduces need for a reader to know about the particular
> architecture.

This change looks ok.

- Lauri

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 7/9] Input: n64joy - Fix DMA buffer alignment.
  2022-11-27 16:48   ` Lauri Kasanen
@ 2022-11-27 18:01     ` Jonathan Cameron
  2022-11-28  6:49       ` Lauri Kasanen
  2022-11-28 18:04     ` Dmitry Torokhov
  1 sibling, 1 reply; 20+ messages in thread
From: Jonathan Cameron @ 2022-11-27 18:01 UTC (permalink / raw)
  To: Lauri Kasanen; +Cc: linux-input, Dmitry Torokhov, Jonathan Cameron

On Sun, 27 Nov 2022 18:48:44 +0200
Lauri Kasanen <cand@gmx.com> wrote:

> On Sun, 27 Nov 2022 14:41:14 +0000
> Jonathan Cameron <jic23@kernel.org> wrote:
> 
> > From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > 
> > The use of ____cacheline_aligned to ensure a buffer is DMA safe only
> > enforces the start of the buffer alignment. In this case, sufficient
> > alignment is already ensured by the use of kzalloc().
> > ____cacheline_aligned does not ensure that no other members of the
> > structure are placed in the same cacheline after the end of the
> > buffer marked.  Thus to ensure a DMA safe buffer it must be at the end
> > of the structure.  
> 
> This move is unnecessary, because the cacheline is 16 bytes and the
> buffer is 64 bytes.

Ah.  That particular option hadn't occurred to me (and I'd failed to notice
how big the buffer is :( ).
The marking isn't needed at all then as the allocation is already
guaranteed to be sufficiently aligned. However, maybe that is a bit subtle
and having some sort of marking is useful.

Curious question though, why is the buffer so big?
Each struct joydata is 8 bytes I think, but the driver only accesses 4 of them.

Is the hardware putting garbage into the remaining 2 cachelines or is there
something subtle going on?

Or given my earlier success, maybe I'm misreading the code entirely.

Jonathan

> 
> > Whilst here switch from ____cacheline_aligned to
> > __aligned(ARCH_KMALLOC_MINALIGN) as on some architectures, with variable
> > sized cacheline lines across their cache hierarchy, require this
> > greater alignment guarantee for DMA safety.  Make this change throughout
> > the driver as it reduces need for a reader to know about the particular
> > architecture.  
> 
> This change looks ok.
> 
> - Lauri


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 7/9] Input: n64joy - Fix DMA buffer alignment.
  2022-11-27 18:01     ` Jonathan Cameron
@ 2022-11-28  6:49       ` Lauri Kasanen
  0 siblings, 0 replies; 20+ messages in thread
From: Lauri Kasanen @ 2022-11-28  6:49 UTC (permalink / raw)
  To: Jonathan Cameron; +Cc: linux-input, Dmitry Torokhov, Jonathan Cameron

On Sun, 27 Nov 2022 18:01:26 +0000
Jonathan Cameron <jic23@kernel.org> wrote:

> > This move is unnecessary, because the cacheline is 16 bytes and the
> > buffer is 64 bytes.
> 
> Ah.  That particular option hadn't occurred to me (and I'd failed to notice
> how big the buffer is :( ).
> The marking isn't needed at all then as the allocation is already
> guaranteed to be sufficiently aligned. However, maybe that is a bit subtle
> and having some sort of marking is useful.

You can replace the __cacheline annotation with a comment, that's
totally fine.

> Curious question though, why is the buffer so big?
> Each struct joydata is 8 bytes I think, but the driver only accesses 4 of them.
> 
> Is the hardware putting garbage into the remaining 2 cachelines or is there
> something subtle going on?

That chip operates on 64-byte units. I don't remember whether the
remaining area is garbage or the input bytes or something else.

- Lauri

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH 3/9] Input: ad7887 - Fix padding for DMA safe buffers.
  2022-11-27 14:41 ` [PATCH 3/9] Input: ad7887 " Jonathan Cameron
@ 2022-11-28  7:14   ` Hennerich, Michael
  0 siblings, 0 replies; 20+ messages in thread
From: Hennerich, Michael @ 2022-11-28  7:14 UTC (permalink / raw)
  To: Jonathan Cameron, linux-input@vger.kernel.org, Dmitry Torokhov
  Cc: Jonathan Cameron



> -----Original Message-----
> From: Jonathan Cameron <jic23@kernel.org>
> Sent: Sonntag, 27. November 2022 15:41
> To: linux-input@vger.kernel.org; Dmitry Torokhov
> <dmitry.torokhov@gmail.com>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>; Hennerich, Michael
> <Michael.Hennerich@analog.com>
> Subject: [PATCH 3/9] Input: ad7887 - Fix padding for DMA safe buffers.
> 
> 
> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> On some architectures (e.g. arm64), ____cachline_aligned only aligns to the
> cacheline size of the L1 cache size. L1_CACHE_BYTES in
> arch64/include/asm/cache.h  Unfortunately DMA safety on these architectures
> requires the buffer no share a last level cache cacheline given by
> ARCH_DMA_MINALIGN which has a greater granularity.
> ARCH_DMA_MINALIGN is not defined for all architectures, but when it is
> defined it is used to set the size of ARCH_KMALLOC_MINALIGN to allow DMA
> safe buffer allocations.
> 
> As such the correct alignment requirement is
> __aligned(ARCH_KMALLOC_MINALIGN).
> This has recently been fixed in other subsystems such as IIO.
> 
> Fixes tag is imprecise as there may have been no architectures where the two
> alignments differed at the time of that patch.
> 
> Fixes: 3843384a0554 ("Input: ad7877 - keep dma rx buffers in seperate cache
> lines")
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Michael Hennerich <michael.hennerich@analog.com>

Acked-by: Michael Hennerich <michael.hennerich@analog.com>

> ---
>  drivers/input/touchscreen/ad7877.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/input/touchscreen/ad7877.c
> b/drivers/input/touchscreen/ad7877.c
> index 08f5372f0bfd..eaf11dffb28e 100644
> --- a/drivers/input/touchscreen/ad7877.c
> +++ b/drivers/input/touchscreen/ad7877.c
> @@ -150,7 +150,7 @@ struct ser_req {
>  	 * DMA (thus cache coherency maintenance) requires the
>  	 * transfer buffers to live in their own cache lines.
>  	 */
> -	u16 sample ____cacheline_aligned;
> +	u16 sample __aligned(ARCH_KMALLOC_MINALIGN);
>  };
> 
>  struct ad7877 {
> @@ -189,7 +189,7 @@ struct ad7877 {
>  	 * DMA (thus cache coherency maintenance) requires the
>  	 * transfer buffers to live in their own cache lines.
>  	 */
> -	u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned;
> +	u16 conversion_data[AD7877_NR_SENSE]
> __aligned(ARCH_KMALLOC_MINALIGN);
>  };
> 
>  static bool gpio3;
> --
> 2.38.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH 2/9] Input: ad714x - Fix padding for DMA safe buffers.
  2022-11-27 14:41 ` [PATCH 2/9] Input: ad714x " Jonathan Cameron
@ 2022-11-28  7:14   ` Hennerich, Michael
  0 siblings, 0 replies; 20+ messages in thread
From: Hennerich, Michael @ 2022-11-28  7:14 UTC (permalink / raw)
  To: Jonathan Cameron, linux-input@vger.kernel.org, Dmitry Torokhov
  Cc: Jonathan Cameron



> -----Original Message-----
> From: Jonathan Cameron <jic23@kernel.org>
> Sent: Sonntag, 27. November 2022 15:41
> To: linux-input@vger.kernel.org; Dmitry Torokhov
> <dmitry.torokhov@gmail.com>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>; Hennerich, Michael
> <Michael.Hennerich@analog.com>
> Subject: [PATCH 2/9] Input: ad714x - Fix padding for DMA safe buffers.
> 
> 
> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> On some architectures (e.g. arm64), ____cachline_aligned only aligns to the
> cacheline size of the L1 cache size. L1_CACHE_BYTES in
> arch64/include/asm/cache.h  Unfortunately DMA safety on these architectures
> requires the buffer no share a last level cache cacheline given by
> ARCH_DMA_MINALIGN which has a greater granularity.
> ARCH_DMA_MINALIGN is not defined for all architectures, but when it is
> defined it is used to set the size of ARCH_KMALLOC_MINALIGN to allow DMA
> safe buffer allocations.
> 
> As such the correct alignment requirement is
> __aligned(ARCH_KMALLOC_MINALIGN).
> This has recently been fixed in other subsystems such as IIO.
> 
> Fixes tag is inprecise because there may not have been any architectures
> where the two values were different at the time of the earlier fix.
> 
> Fixes: c0409feb8689 ("Input: ad714x - use DMA-safe buffers for spi_write()")
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Michael Hennerich <michael.hennerich@analog.com>

Acked-by: Michael Hennerich <michael.hennerich@analog.com>

> ---
>  drivers/input/misc/ad714x.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/input/misc/ad714x.h b/drivers/input/misc/ad714x.h index
> af847b5f0d0e..2b8b901183be 100644
> --- a/drivers/input/misc/ad714x.h
> +++ b/drivers/input/misc/ad714x.h
> @@ -41,7 +41,7 @@ struct ad714x_chip {
>  	unsigned product;
>  	unsigned version;
> 
> -	__be16 xfer_buf[16] ____cacheline_aligned;
> +	__be16 xfer_buf[16] __aligned(ARCH_KMALLOC_MINALIGN);
> 
>  };
> 
> --
> 2.38.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 7/9] Input: n64joy - Fix DMA buffer alignment.
  2022-11-27 16:48   ` Lauri Kasanen
  2022-11-27 18:01     ` Jonathan Cameron
@ 2022-11-28 18:04     ` Dmitry Torokhov
  1 sibling, 0 replies; 20+ messages in thread
From: Dmitry Torokhov @ 2022-11-28 18:04 UTC (permalink / raw)
  To: Lauri Kasanen; +Cc: Jonathan Cameron, linux-input, Jonathan Cameron

On Sun, Nov 27, 2022 at 06:48:44PM +0200, Lauri Kasanen wrote:
> On Sun, 27 Nov 2022 14:41:14 +0000
> Jonathan Cameron <jic23@kernel.org> wrote:
> 
> > From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > 
> > The use of ____cacheline_aligned to ensure a buffer is DMA safe only
> > enforces the start of the buffer alignment. In this case, sufficient
> > alignment is already ensured by the use of kzalloc().
> > ____cacheline_aligned does not ensure that no other members of the
> > structure are placed in the same cacheline after the end of the
> > buffer marked.  Thus to ensure a DMA safe buffer it must be at the end
> > of the structure.
> 
> This move is unnecessary, because the cacheline is 16 bytes and the
> buffer is 64 bytes.

I think it is still worth moving it or alternatively adding a comment
why we believe the following member will not be sharing cacheline with
the buffer.

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/9] Input: Fix insufficent DMA alignment.
  2022-11-27 14:41 [PATCH 0/9] Input: Fix insufficent DMA alignment Jonathan Cameron
                   ` (8 preceding siblings ...)
  2022-11-27 14:41 ` [PATCH 9/9] Input: elants - Fix " Jonathan Cameron
@ 2022-11-28 18:16 ` Dmitry Torokhov
  2022-11-29  9:18   ` Jonathan Cameron
  9 siblings, 1 reply; 20+ messages in thread
From: Dmitry Torokhov @ 2022-11-28 18:16 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-input, Jonathan Cameron, Daniel Mack, Michael Hennerich,
	Tomohiro Yoshidomi, Javier Martinez Canillas, Linus Walleij,
	Benjamin Tissoires, Lauri Kasanen, Daniel Hung-yu Wu

Hi Jonathan,

On Sun, Nov 27, 2022 at 02:41:07PM +0000, Jonathan Cameron wrote:
> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> This problem was discovered in IIO as a side effect of the discussions about
> relaxing kmalloc alignment on arm64 and resulted in a series of large
> patch sets.
> 
> https://lore.kernel.org/linux-iio/20220508175712.647246-1-jic23@kernel.org/
> 
> Unsurprisingly there are cases of it in other subsystems.
> 
> The short version of this is that there are a few known arm64 chips where
> ___cacheline_aligned enforces 64 byte alignment which is what we typically
> want for performance optimization as the size of the L1 cache lines.
> However, further out in the cache hierarchy we have caches with 128 byte
> lines.  Those are the ones that matter for DMA safety.
> So we need the larger alignment guarantees of ARCH_KMALLOC_MINALIGN which
> in this case is 128 bytes.

I wonder if we could have something like ____dmasafe_aligned instead of
sprinkling ARCH_KMALLOC_MINALIGN around?

> 
> There is one other use of ____cacheline_aligned in input:
> joystick/iforce/iforce-usb.c
> 
> Whilst suspicious I'm not sure enough of the requirements of USB to
> know if they are there for DMA safety or some other constraint.

Yes, USB has requirements similar to SPI.

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/9] Input: Fix insufficent DMA alignment.
  2022-11-28 18:16 ` [PATCH 0/9] Input: Fix insufficent DMA alignment Dmitry Torokhov
@ 2022-11-29  9:18   ` Jonathan Cameron
  2022-11-30  1:23     ` Dmitry Torokhov
  0 siblings, 1 reply; 20+ messages in thread
From: Jonathan Cameron @ 2022-11-29  9:18 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Jonathan Cameron, linux-input, Daniel Mack, Michael Hennerich,
	Tomohiro Yoshidomi, Javier Martinez Canillas, Linus Walleij,
	Benjamin Tissoires, Lauri Kasanen, Daniel Hung-yu Wu,
	Catalin Marinas

On Mon, 28 Nov 2022 10:16:31 -0800
Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:

> Hi Jonathan,
> 
> On Sun, Nov 27, 2022 at 02:41:07PM +0000, Jonathan Cameron wrote:
> > From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > 
> > This problem was discovered in IIO as a side effect of the discussions about
> > relaxing kmalloc alignment on arm64 and resulted in a series of large
> > patch sets.
> > 
> > https://lore.kernel.org/linux-iio/20220508175712.647246-1-jic23@kernel.org/
> > 
> > Unsurprisingly there are cases of it in other subsystems.
> > 
> > The short version of this is that there are a few known arm64 chips where
> > ___cacheline_aligned enforces 64 byte alignment which is what we typically
> > want for performance optimization as the size of the L1 cache lines.
> > However, further out in the cache hierarchy we have caches with 128 byte
> > lines.  Those are the ones that matter for DMA safety.
> > So we need the larger alignment guarantees of ARCH_KMALLOC_MINALIGN which
> > in this case is 128 bytes.  
> 
> I wonder if we could have something like ____dmasafe_aligned instead of
> sprinkling ARCH_KMALLOC_MINALIGN around?

I agree in principle and eventually that will be ARCH_DMA_MINALIGN.
But it isn't useable yet for backports.
Catalin has worked on several approaches to reducing the alignment of kmalloc.
I may well be lagging on the current plan...
https://lore.kernel.org/linux-arm-kernel/20221106220143.2129263-1-catalin.marinas@arm.com/#r

The way crypto and IIO handled this was to add a local define
IIO_DMA_MINALIGN, CRYPTO_MINALIGN

Catalin, if you do forwards with making ARCH_DMA_MINALIGN global available, make sure
to include IIO_DMA_MINALIGN.  I'm fine with the approach you used for crypto of changing
the local define.  Note that in IIO this typically only bloats a single structure
per device instance, so it's not worth the complexity of dynamic alignment.
As we are marking structure elements with it, in all cases they are multiples of
IIO_DMA_MINALIGN in size, so everything is easy.

On the other side of things, we might be able to relax most of these alignment tricks
in IIO (and input) if swiotlb is the approach eventually used.  I'm personally not that
keen on that transition but meh, this stuff usually involves a serial bus, so the extra
bounce isnt' too painful. Note that we regularly do fun things like setting tx
buffer to be ARCH_KMALLOC_MINALIGN and the rx buffer to be that +n bytes on the basis
that a given serial bus master won't trash itself and hence this is safe. I think your
approach will bounce all of those which is somewhat unfortunate.  Other fun things
include DMA mapping a series of small buffers in one allocation for a series of serial
messages.  Again, that is guaranteed to be safe.

Jonathan






> 
> > 
> > There is one other use of ____cacheline_aligned in input:
> > joystick/iforce/iforce-usb.c
> > 
> > Whilst suspicious I'm not sure enough of the requirements of USB to
> > know if they are there for DMA safety or some other constraint.  
> 
> Yes, USB has requirements similar to SPI.
> 
> Thanks.
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/9] Input: Fix insufficent DMA alignment.
  2022-11-29  9:18   ` Jonathan Cameron
@ 2022-11-30  1:23     ` Dmitry Torokhov
  0 siblings, 0 replies; 20+ messages in thread
From: Dmitry Torokhov @ 2022-11-30  1:23 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Jonathan Cameron, linux-input, Daniel Mack, Michael Hennerich,
	Tomohiro Yoshidomi, Javier Martinez Canillas, Linus Walleij,
	Benjamin Tissoires, Lauri Kasanen, Daniel Hung-yu Wu,
	Catalin Marinas

On Tue, Nov 29, 2022 at 09:18:28AM +0000, Jonathan Cameron wrote:
> On Mon, 28 Nov 2022 10:16:31 -0800
> Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:
> 
> > Hi Jonathan,
> > 
> > On Sun, Nov 27, 2022 at 02:41:07PM +0000, Jonathan Cameron wrote:
> > > From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > 
> > > This problem was discovered in IIO as a side effect of the discussions about
> > > relaxing kmalloc alignment on arm64 and resulted in a series of large
> > > patch sets.
> > > 
> > > https://lore.kernel.org/linux-iio/20220508175712.647246-1-jic23@kernel.org/
> > > 
> > > Unsurprisingly there are cases of it in other subsystems.
> > > 
> > > The short version of this is that there are a few known arm64 chips where
> > > ___cacheline_aligned enforces 64 byte alignment which is what we typically
> > > want for performance optimization as the size of the L1 cache lines.
> > > However, further out in the cache hierarchy we have caches with 128 byte
> > > lines.  Those are the ones that matter for DMA safety.
> > > So we need the larger alignment guarantees of ARCH_KMALLOC_MINALIGN which
> > > in this case is 128 bytes.  
> > 
> > I wonder if we could have something like ____dmasafe_aligned instead of
> > sprinkling ARCH_KMALLOC_MINALIGN around?
> 
> I agree in principle and eventually that will be ARCH_DMA_MINALIGN.
> But it isn't useable yet for backports.

Sorry, I do not follow. Are talking about backports because the code is
broken in the mainline right now, or it will become broken when
Catalin's changes land? And even if it is broken right now why can't we
add

#define ____dmasafe_aligned __attribute__((__aligned__(ARCH_KMALLOC_MINALIGN)))

somewhere in include/linux/cache.h? Then you can tweak it as needed
independently of kmalloc alignment and without need to touch drivers
again and it should be easy to backport still.

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2022-11-30  1:24 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-11-27 14:41 [PATCH 0/9] Input: Fix insufficent DMA alignment Jonathan Cameron
2022-11-27 14:41 ` [PATCH 1/9] Input: psxpad - Fix padding for DMA safe buffers Jonathan Cameron
2022-11-27 14:41 ` [PATCH 2/9] Input: ad714x " Jonathan Cameron
2022-11-28  7:14   ` Hennerich, Michael
2022-11-27 14:41 ` [PATCH 3/9] Input: ad7887 " Jonathan Cameron
2022-11-28  7:14   ` Hennerich, Michael
2022-11-27 14:41 ` [PATCH 4/9] Input: ads7846 " Jonathan Cameron
2022-11-27 14:41 ` [PATCH 5/9] Input: cyttsp " Jonathan Cameron
2022-11-27 14:41 ` [PATCH 6/9] Input: surface3 " Jonathan Cameron
2022-11-27 16:26   ` Jonathan Cameron
2022-11-27 14:41 ` [PATCH 7/9] Input: n64joy - Fix DMA buffer alignment Jonathan Cameron
2022-11-27 16:48   ` Lauri Kasanen
2022-11-27 18:01     ` Jonathan Cameron
2022-11-28  6:49       ` Lauri Kasanen
2022-11-28 18:04     ` Dmitry Torokhov
2022-11-27 14:41 ` [PATCH 8/9] Input: atmel_captouch - Avoid suspect " Jonathan Cameron
2022-11-27 14:41 ` [PATCH 9/9] Input: elants - Fix " Jonathan Cameron
2022-11-28 18:16 ` [PATCH 0/9] Input: Fix insufficent DMA alignment Dmitry Torokhov
2022-11-29  9:18   ` Jonathan Cameron
2022-11-30  1:23     ` Dmitry Torokhov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).