[RFC v3 0/5] Align atomic storage

linux-m68k.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFC v3 0/5] Align atomic storage
@ 2025-10-07 22:19 Finn Thain
  2025-10-07 22:19 ` [RFC v3 3/5] atomic: Specify alignment for atomic_t and atomic64_t Finn Thain
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Finn Thain @ 2025-10-07 22:19 UTC (permalink / raw)
  To: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann,
	Peter Zijlstra, Will Deacon
  Cc: Andrew Morton, Arnd Bergmann, Boqun Feng, bpf, Jonathan Corbet,
	Eduard Zingerman, Geert Uytterhoeven, Hao Luo, John Fastabend,
	Jiri Olsa, KP Singh, Lance Yang, linux-arch, linux-doc,
	linux-kernel, linux-m68k, Mark Rutland, Martin KaFai Lau,
	Stanislav Fomichev, Song Liu, Yonghong Song

This series adds the __aligned attribute to atomic_t and atomic64_t
definitions in include/asm-generic.

It also adds Kconfig options to enable a new runtime warning to help
reveal misaligned atomic accesses on platforms which don't trap that.

Some people might assume scalars are aligned to 4-byte boundaries, while
others might assume natural alignment. Best not to encourage such
assumptions in the documentation.

Moreover, being that locks are performance sensitive, and being that
atomic operations tend to involve further assumptions, there seems to be
room for improvement here.

Pertinent to this discussion are the section "Memory Efficiency" in
Documentation/RCU/Design/Requirements/Requirements.rst
and the section "GUARANTEES" in Documentation/memory-barriers.txt

---
Changed since v2:
 - Specify natural alignment for atomic64_t.
 - CONFIG_DEBUG_ATOMIC checks for natural alignment again.
 - New patch to add weakened alignment check.
 - New patch for explicit alignment in BFP header.

---

Finn Thain (4):
  documentation: Discourage alignment assumptions
  bpf: Explicitly align bpf_res_spin_lock
  atomic: Specify alignment for atomic_t and atomic64_t
  atomic: Add option for weaker alignment check

Peter Zijlstra (1):
  atomic: Add alignment check to instrumented atomic operations

 .../core-api/unaligned-memory-access.rst      |  7 -------
 include/asm-generic/atomic64.h                |  2 +-
 include/asm-generic/rqspinlock.h              |  2 +-
 include/linux/instrumented.h                  | 12 ++++++++++++
 include/linux/types.h                         |  2 +-
 kernel/bpf/rqspinlock.c                       |  1 -
 lib/Kconfig.debug                             | 19 +++++++++++++++++++
 7 files changed, 34 insertions(+), 11 deletions(-)

-- 
2.49.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [RFC v3 1/5] documentation: Discourage alignment assumptions
  2025-10-07 22:19 [RFC v3 0/5] Align atomic storage Finn Thain
                   ` (2 preceding siblings ...)
  2025-10-07 22:19 ` [RFC v3 2/5] bpf: Explicitly align bpf_res_spin_lock Finn Thain
@ 2025-10-07 22:19 ` Finn Thain
  2025-10-14 10:23   ` David Laight
  2025-10-16 13:30   ` Arnd Bergmann
  2025-10-07 22:19 ` [RFC v3 4/5] atomic: Add alignment check to instrumented atomic operations Finn Thain
  4 siblings, 2 replies; 18+ messages in thread
From: Finn Thain @ 2025-10-07 22:19 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon
  Cc: Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland,
	Arnd Bergmann, linux-kernel, linux-arch, Geert Uytterhoeven,
	linux-m68k, linux-doc

Discourage assumptions that simply don't hold for all Linux ABIs.
Exceptions to the natural alignment rule for scalar types include
long long on i386 and sh.
---
 Documentation/core-api/unaligned-memory-access.rst | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/Documentation/core-api/unaligned-memory-access.rst b/Documentation/core-api/unaligned-memory-access.rst
index 5ceeb80eb539..1390ce2b7291 100644
--- a/Documentation/core-api/unaligned-memory-access.rst
+++ b/Documentation/core-api/unaligned-memory-access.rst
@@ -40,9 +40,6 @@ The rule mentioned above forms what we refer to as natural alignment:
 When accessing N bytes of memory, the base memory address must be evenly
 divisible by N, i.e. addr % N == 0.
 
-When writing code, assume the target architecture has natural alignment
-requirements.
-
 In reality, only a few architectures require natural alignment on all sizes
 of memory access. However, we must consider ALL supported architectures;
 writing code that satisfies natural alignment requirements is the easiest way
@@ -103,10 +100,6 @@ Therefore, for standard structure types you can always rely on the compiler
 to pad structures so that accesses to fields are suitably aligned (assuming
 you do not cast the field to a type of different length).
 
-Similarly, you can also rely on the compiler to align variables and function
-parameters to a naturally aligned scheme, based on the size of the type of
-the variable.
-
 At this point, it should be clear that accessing a single byte (u8 or char)
 will never cause an unaligned access, because all memory addresses are evenly
 divisible by one.
-- 
2.49.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC v3 2/5] bpf: Explicitly align bpf_res_spin_lock
  2025-10-07 22:19 [RFC v3 0/5] Align atomic storage Finn Thain
  2025-10-07 22:19 ` [RFC v3 3/5] atomic: Specify alignment for atomic_t and atomic64_t Finn Thain
  2025-10-07 22:19 ` [RFC v3 5/5] atomic: Add option for weaker alignment check Finn Thain
@ 2025-10-07 22:19 ` Finn Thain
  2025-10-09  2:10   ` Alexei Starovoitov
  2025-10-07 22:19 ` [RFC v3 1/5] documentation: Discourage alignment assumptions Finn Thain
  2025-10-07 22:19 ` [RFC v3 4/5] atomic: Add alignment check to instrumented atomic operations Finn Thain
  4 siblings, 1 reply; 18+ messages in thread
From: Finn Thain @ 2025-10-07 22:19 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko
  Cc: Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland,
	Arnd Bergmann, linux-kernel, linux-arch, Geert Uytterhoeven,
	linux-m68k, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, bpf

Align bpf_res_spin_lock to avoid a BUILD_BUG_ON() when the alignment
changes, as it will do on m68k when, in a subsequent patch, the minimum
alignment of the atomic_t member of struct rqspinlock gets increased.
Drop the BUILD_BUG_ON() as it is now redundant.

Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
---
 include/asm-generic/rqspinlock.h | 2 +-
 kernel/bpf/rqspinlock.c          | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/asm-generic/rqspinlock.h b/include/asm-generic/rqspinlock.h
index 6d4244d643df..eac2dcd31b83 100644
--- a/include/asm-generic/rqspinlock.h
+++ b/include/asm-generic/rqspinlock.h
@@ -28,7 +28,7 @@ struct rqspinlock {
  */
 struct bpf_res_spin_lock {
 	u32 val;
-};
+} __aligned(__alignof__(struct rqspinlock));
 
 struct qspinlock;
 #ifdef CONFIG_QUEUED_SPINLOCKS
diff --git a/kernel/bpf/rqspinlock.c b/kernel/bpf/rqspinlock.c
index 338305c8852c..a88a0e9749d7 100644
--- a/kernel/bpf/rqspinlock.c
+++ b/kernel/bpf/rqspinlock.c
@@ -671,7 +671,6 @@ __bpf_kfunc int bpf_res_spin_lock(struct bpf_res_spin_lock *lock)
 	int ret;
 
 	BUILD_BUG_ON(sizeof(rqspinlock_t) != sizeof(struct bpf_res_spin_lock));
-	BUILD_BUG_ON(__alignof__(rqspinlock_t) != __alignof__(struct bpf_res_spin_lock));
 
 	preempt_disable();
 	ret = res_spin_lock((rqspinlock_t *)lock);
-- 
2.49.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC v3 3/5] atomic: Specify alignment for atomic_t and atomic64_t
  2025-10-07 22:19 [RFC v3 0/5] Align atomic storage Finn Thain
@ 2025-10-07 22:19 ` Finn Thain
  2025-10-07 22:19 ` [RFC v3 5/5] atomic: Add option for weaker alignment check Finn Thain
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: Finn Thain @ 2025-10-07 22:19 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon
  Cc: Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland,
	Arnd Bergmann, linux-kernel, linux-arch, Geert Uytterhoeven,
	linux-m68k, Lance Yang

Some recent commits incorrectly assumed 4-byte alignment of locks.
That assumption fails on Linux/m68k (and, interestingly, would have
failed on Linux/cris also). Specify the minimum alignment of atomic
variables for fewer surprises and (hopefully) better performance.

On an m68k system with 14 MB of RAM, this patch reduces the available
memory by a couple of percent. On a 64 MB system, the cost is under 1%
but still significant. I don't know whether there is sufficient
performance gain to justify the memory cost; it still has to be measured.

Cc: Lance Yang <lance.yang@linux.dev>
Link: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc1_0g@mail.gmail.com/
---
Changed since v2:
 - Specify natural alignment for atomic64_t.
Changed since v1:
 - atomic64_t now gets an __aligned attribute too.
 - The 'Fixes' tag has been dropped because Lance sent a different fix
   for commit e711faaafbe5 ("hung_task: replace blocker_mutex with encoded
   blocker") that's suitable for -stable.
---
 include/asm-generic/atomic64.h | 2 +-
 include/linux/types.h          | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/atomic64.h b/include/asm-generic/atomic64.h
index 100d24b02e52..f22ccfc0df98 100644
--- a/include/asm-generic/atomic64.h
+++ b/include/asm-generic/atomic64.h
@@ -10,7 +10,7 @@
 #include <linux/types.h>
 
 typedef struct {
-	s64 counter;
+	s64 __aligned(sizeof(s64)) counter;
 } atomic64_t;
 
 #define ATOMIC64_INIT(i)	{ (i) }
diff --git a/include/linux/types.h b/include/linux/types.h
index 6dfdb8e8e4c3..a225a518c2c3 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -179,7 +179,7 @@ typedef phys_addr_t resource_size_t;
 typedef unsigned long irq_hw_number_t;
 
 typedef struct {
-	int counter;
+	int __aligned(sizeof(int)) counter;
 } atomic_t;
 
 #define ATOMIC_INIT(i) { (i) }
-- 
2.49.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC v3 4/5] atomic: Add alignment check to instrumented atomic operations
  2025-10-07 22:19 [RFC v3 0/5] Align atomic storage Finn Thain
                   ` (3 preceding siblings ...)
  2025-10-07 22:19 ` [RFC v3 1/5] documentation: Discourage alignment assumptions Finn Thain
@ 2025-10-07 22:19 ` Finn Thain
  4 siblings, 0 replies; 18+ messages in thread
From: Finn Thain @ 2025-10-07 22:19 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon
  Cc: Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland,
	Arnd Bergmann, linux-kernel, linux-arch, Geert Uytterhoeven,
	linux-m68k

From: Peter Zijlstra <peterz@infradead.org>

Add a Kconfig option for debug builds which logs a warning when an
instrumented atomic operation takes place that's misaligned.
Some platforms don't trap for this.

Link: https://lore.kernel.org/lkml/20250901093600.GF4067720@noisy.programming.kicks-ass.net/
---
Changed since v2:
 - Always check for natural alignment.
---
To make this useful on those architectures that don't naturally align
scalars (x86-32, m68k and sh come to mind), please also use
"[PATCH] atomic: skip alignment check for try_cmpxchg() old arg"
https://lore.kernel.org/all/20251006110740.468309-1-arnd@kernel.org/
---
 include/linux/instrumented.h |  4 ++++
 lib/Kconfig.debug            | 10 ++++++++++
 2 files changed, 14 insertions(+)

diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h
index 711a1f0d1a73..402a999a0d6b 100644
--- a/include/linux/instrumented.h
+++ b/include/linux/instrumented.h
@@ -7,6 +7,7 @@
 #ifndef _LINUX_INSTRUMENTED_H
 #define _LINUX_INSTRUMENTED_H
 
+#include <linux/bug.h>
 #include <linux/compiler.h>
 #include <linux/kasan-checks.h>
 #include <linux/kcsan-checks.h>
@@ -67,6 +68,7 @@ static __always_inline void instrument_atomic_read(const volatile void *v, size_
 {
 	kasan_check_read(v, size);
 	kcsan_check_atomic_read(v, size);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (size - 1)));
 }
 
 /**
@@ -81,6 +83,7 @@ static __always_inline void instrument_atomic_write(const volatile void *v, size
 {
 	kasan_check_write(v, size);
 	kcsan_check_atomic_write(v, size);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (size - 1)));
 }
 
 /**
@@ -95,6 +98,7 @@ static __always_inline void instrument_atomic_read_write(const volatile void *v,
 {
 	kasan_check_write(v, size);
 	kcsan_check_atomic_read_write(v, size);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (size - 1)));
 }
 
 /**
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index ebe33181b6e6..d82626b7d6be 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1363,6 +1363,16 @@ config DEBUG_PREEMPT
 	  depending on workload as it triggers debugging routines for each
 	  this_cpu operation. It should only be used for debugging purposes.
 
+config DEBUG_ATOMIC
+	bool "Debug atomic variables"
+	depends on DEBUG_KERNEL
+	help
+	  If you say Y here then the kernel will add a runtime alignment check
+	  to atomic accesses. Useful for architectures that do not have trap on
+	  mis-aligned access.
+
+	  This option has potentially significant overhead.
+
 menu "Lock Debugging (spinlocks, mutexes, etc...)"
 
 config LOCK_DEBUGGING_SUPPORT
-- 
2.49.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC v3 5/5] atomic: Add option for weaker alignment check
  2025-10-07 22:19 [RFC v3 0/5] Align atomic storage Finn Thain
  2025-10-07 22:19 ` [RFC v3 3/5] atomic: Specify alignment for atomic_t and atomic64_t Finn Thain
@ 2025-10-07 22:19 ` Finn Thain
  2025-10-07 22:19 ` [RFC v3 2/5] bpf: Explicitly align bpf_res_spin_lock Finn Thain
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: Finn Thain @ 2025-10-07 22:19 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon
  Cc: Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland,
	Arnd Bergmann, linux-kernel, linux-arch, Geert Uytterhoeven,
	linux-m68k

Add a new Kconfig symbol to make CONFIG_DEBUG_ATOMIC more useful on
those architectures which do not align dynamic allocations to
8-byte boundaries. Without this, CONFIG_DEBUG_ATOMIC produces excessive
WARN splats.
---
 include/linux/instrumented.h | 14 +++++++++++---
 lib/Kconfig.debug            |  9 +++++++++
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h
index 402a999a0d6b..3c9a570dbfa7 100644
--- a/include/linux/instrumented.h
+++ b/include/linux/instrumented.h
@@ -8,11 +8,13 @@
 #define _LINUX_INSTRUMENTED_H
 
 #include <linux/bug.h>
+#include <linux/cache.h>
 #include <linux/compiler.h>
 #include <linux/kasan-checks.h>
 #include <linux/kcsan-checks.h>
 #include <linux/kmsan-checks.h>
 #include <linux/types.h>
+#include <vdso/align.h>
 
 /**
  * instrument_read - instrument regular read access
@@ -68,7 +70,9 @@ static __always_inline void instrument_atomic_read(const volatile void *v, size_
 {
 	kasan_check_read(v, size);
 	kcsan_check_atomic_read(v, size);
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (size - 1)));
+	if (IS_ENABLED(CONFIG_DEBUG_ATOMIC))
+		WARN_ON_ONCE(v != PTR_ALIGN(v, IS_ENABLED(CONFIG_DEBUG_ATOMIC_LARGEST_ALIGN) ?
+					       __LARGEST_ALIGN : size));
 }
 
 /**
@@ -83,7 +87,9 @@ static __always_inline void instrument_atomic_write(const volatile void *v, size
 {
 	kasan_check_write(v, size);
 	kcsan_check_atomic_write(v, size);
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (size - 1)));
+	if (IS_ENABLED(CONFIG_DEBUG_ATOMIC))
+		WARN_ON_ONCE(v != PTR_ALIGN(v, IS_ENABLED(CONFIG_DEBUG_ATOMIC_LARGEST_ALIGN) ?
+					       __LARGEST_ALIGN : size));
 }
 
 /**
@@ -98,7 +104,9 @@ static __always_inline void instrument_atomic_read_write(const volatile void *v,
 {
 	kasan_check_write(v, size);
 	kcsan_check_atomic_read_write(v, size);
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (size - 1)));
+	if (IS_ENABLED(CONFIG_DEBUG_ATOMIC))
+		WARN_ON_ONCE(v != PTR_ALIGN(v, IS_ENABLED(CONFIG_DEBUG_ATOMIC_LARGEST_ALIGN) ?
+					       __LARGEST_ALIGN : size));
 }
 
 /**
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index d82626b7d6be..f81b8a9b9216 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1373,6 +1373,15 @@ config DEBUG_ATOMIC
 
 	  This option has potentially significant overhead.
 
+config DEBUG_ATOMIC_LARGEST_ALIGN
+	bool "Check alignment only up to __LARGEST_ALIGN"
+	depends on DEBUG_ATOMIC
+	help
+	  Say Y here if you require natural alignment of atomic accesses
+	  only up to long word boundaries. That is, char, short, int and long
+	  will be checked for natural alignment but anything larger checked
+	  only for alignment with a long word boundary.
+
 menu "Lock Debugging (spinlocks, mutexes, etc...)"
 
 config LOCK_DEBUGGING_SUPPORT
-- 
2.49.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [RFC v3 2/5] bpf: Explicitly align bpf_res_spin_lock
  2025-10-07 22:19 ` [RFC v3 2/5] bpf: Explicitly align bpf_res_spin_lock Finn Thain
@ 2025-10-09  2:10   ` Alexei Starovoitov
  2025-10-09  2:56     ` Finn Thain
  2025-10-09  7:02     ` Peter Zijlstra
  0 siblings, 2 replies; 18+ messages in thread
From: Alexei Starovoitov @ 2025-10-09  2:10 UTC (permalink / raw)
  To: Finn Thain
  Cc: Peter Zijlstra, Will Deacon, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Andrew Morton, Boqun Feng, Jonathan Corbet,
	Mark Rutland, Arnd Bergmann, LKML, linux-arch, Geert Uytterhoeven,
	linux-m68k, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, bpf

On Tue, Oct 7, 2025 at 4:50 PM Finn Thain <fthain@linux-m68k.org> wrote:
>
> Align bpf_res_spin_lock to avoid a BUILD_BUG_ON() when the alignment
> changes, as it will do on m68k when, in a subsequent patch, the minimum
> alignment of the atomic_t member of struct rqspinlock gets increased.
> Drop the BUILD_BUG_ON() as it is now redundant.
>
> Cc: Martin KaFai Lau <martin.lau@linux.dev>
> Cc: Eduard Zingerman <eddyz87@gmail.com>
> Cc: Song Liu <song@kernel.org>
> Cc: Yonghong Song <yonghong.song@linux.dev>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: KP Singh <kpsingh@kernel.org>
> Cc: Stanislav Fomichev <sdf@fomichev.me>
> Cc: Hao Luo <haoluo@google.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> ---
>  include/asm-generic/rqspinlock.h | 2 +-
>  kernel/bpf/rqspinlock.c          | 1 -
>  2 files changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/include/asm-generic/rqspinlock.h b/include/asm-generic/rqspinlock.h
> index 6d4244d643df..eac2dcd31b83 100644
> --- a/include/asm-generic/rqspinlock.h
> +++ b/include/asm-generic/rqspinlock.h
> @@ -28,7 +28,7 @@ struct rqspinlock {
>   */
>  struct bpf_res_spin_lock {
>         u32 val;
> -};
> +} __aligned(__alignof__(struct rqspinlock));

I don't follow.
In the next patch you do:

typedef struct {
- int counter;
+ int __aligned(sizeof(int)) counter;
} atomic_t;

so it was 4 and still 4 ?
Are you saying 'int' on m68k is not 4 byte aligned by default,
so you have to force 4 byte align?

>  struct qspinlock;
>  #ifdef CONFIG_QUEUED_SPINLOCKS
> diff --git a/kernel/bpf/rqspinlock.c b/kernel/bpf/rqspinlock.c
> index 338305c8852c..a88a0e9749d7 100644
> --- a/kernel/bpf/rqspinlock.c
> +++ b/kernel/bpf/rqspinlock.c
> @@ -671,7 +671,6 @@ __bpf_kfunc int bpf_res_spin_lock(struct bpf_res_spin_lock *lock)
>         int ret;
>
>         BUILD_BUG_ON(sizeof(rqspinlock_t) != sizeof(struct bpf_res_spin_lock));
> -       BUILD_BUG_ON(__alignof__(rqspinlock_t) != __alignof__(struct bpf_res_spin_lock));

Why delete it? Didn't you make them equal in the above hunk?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC v3 2/5] bpf: Explicitly align bpf_res_spin_lock
  2025-10-09  2:10   ` Alexei Starovoitov
@ 2025-10-09  2:56     ` Finn Thain
  2025-10-09  7:02     ` Peter Zijlstra
  1 sibling, 0 replies; 18+ messages in thread
From: Finn Thain @ 2025-10-09  2:56 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Peter Zijlstra, Will Deacon, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Andrew Morton, Boqun Feng, Jonathan Corbet,
	Mark Rutland, Arnd Bergmann, LKML, linux-arch, Geert Uytterhoeven,
	linux-m68k, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, bpf

[-- Attachment #1: Type: text/plain, Size: 2391 bytes --]


On Wed, 8 Oct 2025, Alexei Starovoitov wrote:

> On Tue, Oct 7, 2025 at 4:50 PM Finn Thain <fthain@linux-m68k.org> wrote:
> >
> > Align bpf_res_spin_lock to avoid a BUILD_BUG_ON() when the alignment
> > changes, as it will do on m68k when, in a subsequent patch, the minimum
> > alignment of the atomic_t member of struct rqspinlock gets increased.
> > Drop the BUILD_BUG_ON() as it is now redundant.
> >
> > Cc: Martin KaFai Lau <martin.lau@linux.dev>
> > Cc: Eduard Zingerman <eddyz87@gmail.com>
> > Cc: Song Liu <song@kernel.org>
> > Cc: Yonghong Song <yonghong.song@linux.dev>
> > Cc: John Fastabend <john.fastabend@gmail.com>
> > Cc: KP Singh <kpsingh@kernel.org>
> > Cc: Stanislav Fomichev <sdf@fomichev.me>
> > Cc: Hao Luo <haoluo@google.com>
> > Cc: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  include/asm-generic/rqspinlock.h | 2 +-
> >  kernel/bpf/rqspinlock.c          | 1 -
> >  2 files changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/include/asm-generic/rqspinlock.h b/include/asm-generic/rqspinlock.h
> > index 6d4244d643df..eac2dcd31b83 100644
> > --- a/include/asm-generic/rqspinlock.h
> > +++ b/include/asm-generic/rqspinlock.h
> > @@ -28,7 +28,7 @@ struct rqspinlock {
> >   */
> >  struct bpf_res_spin_lock {
> >         u32 val;
> > -};
> > +} __aligned(__alignof__(struct rqspinlock));
> 
> I don't follow.
> In the next patch you do:
> 
> typedef struct {
> - int counter;
> + int __aligned(sizeof(int)) counter;
> } atomic_t;
> 
> so it was 4 and still 4 ?
> Are you saying 'int' on m68k is not 4 byte aligned by default,
> so you have to force 4 byte align?
> 

Right. __alignof(int) == 2 on m68k.

> >  struct qspinlock;
> >  #ifdef CONFIG_QUEUED_SPINLOCKS
> > diff --git a/kernel/bpf/rqspinlock.c b/kernel/bpf/rqspinlock.c
> > index 338305c8852c..a88a0e9749d7 100644
> > --- a/kernel/bpf/rqspinlock.c
> > +++ b/kernel/bpf/rqspinlock.c
> > @@ -671,7 +671,6 @@ __bpf_kfunc int bpf_res_spin_lock(struct bpf_res_spin_lock *lock)
> >         int ret;
> >
> >         BUILD_BUG_ON(sizeof(rqspinlock_t) != sizeof(struct bpf_res_spin_lock));
> > -       BUILD_BUG_ON(__alignof__(rqspinlock_t) != __alignof__(struct bpf_res_spin_lock));
> 
> Why delete it? Didn't you make them equal in the above hunk?
> 

I deleted it because it's tautological. I think "do not repeat yourself" 
applies here.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC v3 2/5] bpf: Explicitly align bpf_res_spin_lock
  2025-10-09  2:10   ` Alexei Starovoitov
  2025-10-09  2:56     ` Finn Thain
@ 2025-10-09  7:02     ` Peter Zijlstra
  2025-10-09 15:17       ` Alexei Starovoitov
  1 sibling, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2025-10-09  7:02 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Finn Thain, Will Deacon, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Andrew Morton, Boqun Feng, Jonathan Corbet,
	Mark Rutland, Arnd Bergmann, LKML, linux-arch, Geert Uytterhoeven,
	linux-m68k, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, bpf

On Wed, Oct 08, 2025 at 07:10:13PM -0700, Alexei Starovoitov wrote:

> Are you saying 'int' on m68k is not 4 byte aligned by default,
> so you have to force 4 byte align?

This; m68k has u16 alignment, just to keep life interesting I suppose
:-)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC v3 2/5] bpf: Explicitly align bpf_res_spin_lock
  2025-10-09  7:02     ` Peter Zijlstra
@ 2025-10-09 15:17       ` Alexei Starovoitov
  2025-10-09 16:01         ` Arnd Bergmann
  0 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2025-10-09 15:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Finn Thain, Will Deacon, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Andrew Morton, Boqun Feng, Jonathan Corbet,
	Mark Rutland, Arnd Bergmann, LKML, linux-arch, Geert Uytterhoeven,
	linux-m68k, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, bpf

On Thu, Oct 9, 2025 at 12:02 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Oct 08, 2025 at 07:10:13PM -0700, Alexei Starovoitov wrote:
>
> > Are you saying 'int' on m68k is not 4 byte aligned by default,
> > so you have to force 4 byte align?
>
> This; m68k has u16 alignment, just to keep life interesting I suppose
> :-)

It's not "interesting". It adds burden to the rest of the kernel
for this architectural quirk.
Linus put the foot down for big-endian on arm64 and riscv.
We should do the same here.
x86 uses -mcmodel=kernel for 64-bit and -mregparm=3 for 32-bit.
m68k can do the same.
They can adjust the compiler to make 'int' 4 byte aligned under some
compiler flag. The kernel is built standalone, so it doesn't have
to conform to native calling convention or anything else.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC v3 2/5] bpf: Explicitly align bpf_res_spin_lock
  2025-10-09 15:17       ` Alexei Starovoitov
@ 2025-10-09 16:01         ` Arnd Bergmann
  2025-10-09 16:12           ` Alexei Starovoitov
  0 siblings, 1 reply; 18+ messages in thread
From: Arnd Bergmann @ 2025-10-09 16:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Peter Zijlstra
  Cc: Finn Thain, Will Deacon, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Andrew Morton, Boqun Feng, Jonathan Corbet,
	Mark Rutland, LKML, Linux-Arch, Geert Uytterhoeven, linux-m68k,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	bpf

On Thu, Oct 9, 2025, at 17:17, Alexei Starovoitov wrote:
> On Thu, Oct 9, 2025 at 12:02 AM Peter Zijlstra <peterz@infradead.org> wrote:
>>
>> On Wed, Oct 08, 2025 at 07:10:13PM -0700, Alexei Starovoitov wrote:
>>
>> > Are you saying 'int' on m68k is not 4 byte aligned by default,
>> > so you have to force 4 byte align?
>>
>> This; m68k has u16 alignment, just to keep life interesting I suppose
>> :-)
>
> It's not "interesting". It adds burden to the rest of the kernel
> for this architectural quirk.
> Linus put the foot down for big-endian on arm64 and riscv.
> We should do the same here.
> x86 uses -mcmodel=kernel for 64-bit and -mregparm=3 for 32-bit.
> m68k can do the same.
> They can adjust the compiler to make 'int' 4 byte aligned under some
> compiler flag. The kernel is built standalone, so it doesn't have
> to conform to native calling convention or anything else.

I agree that building the kernel with -malign-int makes a lot
of sense here, there is even a project to rebuild the entire
user space with the same flag.

However, changing either the kernel or userspace to build with
-malign-int also has its cost, since for ABI compatibility
reasons any include/uapi/*/*.h header that defines a structure
with a misaligned word needs a custom annotation in order to
still define the layout to be the same as before, and the
annotations do complicate the common headers.

See 
https://lore.kernel.org/all/534e8ff8-70cb-4b78-b0b4-f88645bd180a@app.fastmail.com/
for a list of structures that likely need to be annotated,
and the thread around it for more of the nasty details that
make this nontrivial.

       Arnd

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC v3 2/5] bpf: Explicitly align bpf_res_spin_lock
  2025-10-09 16:01         ` Arnd Bergmann
@ 2025-10-09 16:12           ` Alexei Starovoitov
  0 siblings, 0 replies; 18+ messages in thread
From: Alexei Starovoitov @ 2025-10-09 16:12 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Peter Zijlstra, Finn Thain, Will Deacon, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Andrew Morton, Boqun Feng,
	Jonathan Corbet, Mark Rutland, LKML, Linux-Arch,
	Geert Uytterhoeven, linux-m68k, Martin KaFai Lau,
	Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf

On Thu, Oct 9, 2025 at 9:02 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Thu, Oct 9, 2025, at 17:17, Alexei Starovoitov wrote:
> > On Thu, Oct 9, 2025 at 12:02 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >>
> >> On Wed, Oct 08, 2025 at 07:10:13PM -0700, Alexei Starovoitov wrote:
> >>
> >> > Are you saying 'int' on m68k is not 4 byte aligned by default,
> >> > so you have to force 4 byte align?
> >>
> >> This; m68k has u16 alignment, just to keep life interesting I suppose
> >> :-)
> >
> > It's not "interesting". It adds burden to the rest of the kernel
> > for this architectural quirk.
> > Linus put the foot down for big-endian on arm64 and riscv.
> > We should do the same here.
> > x86 uses -mcmodel=kernel for 64-bit and -mregparm=3 for 32-bit.
> > m68k can do the same.
> > They can adjust the compiler to make 'int' 4 byte aligned under some
> > compiler flag. The kernel is built standalone, so it doesn't have
> > to conform to native calling convention or anything else.
>
> I agree that building the kernel with -malign-int makes a lot
> of sense here, there is even a project to rebuild the entire
> user space with the same flag.
>
> However, changing either the kernel or userspace to build with
> -malign-int also has its cost, since for ABI compatibility
> reasons any include/uapi/*/*.h header that defines a structure
> with a misaligned word needs a custom annotation in order to
> still define the layout to be the same as before, and the
> annotations do complicate the common headers.
>
> See
> https://lore.kernel.org/all/534e8ff8-70cb-4b78-b0b4-f88645bd180a@app.fastmail.com/
> for a list of structures that likely need to be annotated,
> and the thread around it for more of the nasty details that
> make this nontrivial.

I see. So this is a lesser evil.

Acked-by: Alexei Starovoitov <ast@kernel.org>

for the patch then.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC v3 1/5] documentation: Discourage alignment assumptions
  2025-10-07 22:19 ` [RFC v3 1/5] documentation: Discourage alignment assumptions Finn Thain
@ 2025-10-14 10:23   ` David Laight
  2025-10-15  7:40     ` Finn Thain
  2025-10-16 13:30   ` Arnd Bergmann
  1 sibling, 1 reply; 18+ messages in thread
From: David Laight @ 2025-10-14 10:23 UTC (permalink / raw)
  To: Finn Thain
  Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng,
	Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel,
	linux-arch, Geert Uytterhoeven, linux-m68k, linux-doc

On Wed, 08 Oct 2025 09:19:20 +1100
Finn Thain <fthain@linux-m68k.org> wrote:

> Discourage assumptions that simply don't hold for all Linux ABIs.
> Exceptions to the natural alignment rule for scalar types include
> long long on i386 and sh.
> ---
>  Documentation/core-api/unaligned-memory-access.rst | 7 -------
>  1 file changed, 7 deletions(-)
> 
> diff --git a/Documentation/core-api/unaligned-memory-access.rst b/Documentation/core-api/unaligned-memory-access.rst
> index 5ceeb80eb539..1390ce2b7291 100644
> --- a/Documentation/core-api/unaligned-memory-access.rst
> +++ b/Documentation/core-api/unaligned-memory-access.rst
> @@ -40,9 +40,6 @@ The rule mentioned above forms what we refer to as natural alignment:
>  When accessing N bytes of memory, the base memory address must be evenly
>  divisible by N, i.e. addr % N == 0.
>  
> -When writing code, assume the target architecture has natural alignment
> -requirements.

I think I'd be more explicit, perhaps:
Note that not all architectures align 64bit items on 8 byte boundaries or
even 32bit items on 4 byte boundaries.

	David

> -
>  In reality, only a few architectures require natural alignment on all sizes
>  of memory access. However, we must consider ALL supported architectures;
>  writing code that satisfies natural alignment requirements is the easiest way
> @@ -103,10 +100,6 @@ Therefore, for standard structure types you can always rely on the compiler
>  to pad structures so that accesses to fields are suitably aligned (assuming
>  you do not cast the field to a type of different length).
>  
> -Similarly, you can also rely on the compiler to align variables and function
> -parameters to a naturally aligned scheme, based on the size of the type of
> -the variable.
> -
>  At this point, it should be clear that accessing a single byte (u8 or char)
>  will never cause an unaligned access, because all memory addresses are evenly
>  divisible by one.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC v3 1/5] documentation: Discourage alignment assumptions
  2025-10-14 10:23   ` David Laight
@ 2025-10-15  7:40     ` Finn Thain
  2025-10-15 13:53       ` David Laight
  0 siblings, 1 reply; 18+ messages in thread
From: Finn Thain @ 2025-10-15  7:40 UTC (permalink / raw)
  To: David Laight
  Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng,
	Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel,
	linux-arch, Geert Uytterhoeven, linux-m68k, linux-doc


On Tue, 14 Oct 2025, David Laight wrote:

> On Wed, 08 Oct 2025 09:19:20 +1100
> Finn Thain <fthain@linux-m68k.org> wrote:
> 
> > Discourage assumptions that simply don't hold for all Linux ABIs.
> > Exceptions to the natural alignment rule for scalar types include
> > long long on i386 and sh.
> > ---
> >  Documentation/core-api/unaligned-memory-access.rst | 7 -------
> >  1 file changed, 7 deletions(-)
> > 
> > diff --git a/Documentation/core-api/unaligned-memory-access.rst b/Documentation/core-api/unaligned-memory-access.rst
> > index 5ceeb80eb539..1390ce2b7291 100644
> > --- a/Documentation/core-api/unaligned-memory-access.rst
> > +++ b/Documentation/core-api/unaligned-memory-access.rst
> > @@ -40,9 +40,6 @@ The rule mentioned above forms what we refer to as natural alignment:
> >  When accessing N bytes of memory, the base memory address must be evenly
> >  divisible by N, i.e. addr % N == 0.
> >  
> > -When writing code, assume the target architecture has natural alignment
> > -requirements.
> 
> I think I'd be more explicit, perhaps:
> Note that not all architectures align 64bit items on 8 byte boundaries or
> even 32bit items on 4 byte boundaries.
> 

That's what the next para is alluding to...

> > In reality, only a few architectures require natural alignment on all sizes
> > of memory access. However, we must consider ALL supported architectures; 
> > writing code that satisfies natural alignment requirements is the easiest way 
> > to achieve full portability.

How about this?

"In reality, only a few architectures require natural alignment for all 
sizes of memory access. That is, not all architectures need 64-bit values 
to be aligned on 8-byte boundaries and 32-bit values on 4-byte boundaries. 
However, when writing code intended to achieve full portability, we must 
consider all supported architectures."

> > @@ -103,10 +100,6 @@ Therefore, for standard structure types you can always rely on the compiler
> >  to pad structures so that accesses to fields are suitably aligned (assuming
> >  you do not cast the field to a type of different length).
> >  
> > -Similarly, you can also rely on the compiler to align variables and function
> > -parameters to a naturally aligned scheme, based on the size of the type of
> > -the variable.
> > -
> >  At this point, it should be clear that accessing a single byte (u8 or char)
> >  will never cause an unaligned access, because all memory addresses are evenly
> >  divisible by one.
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC v3 1/5] documentation: Discourage alignment assumptions
  2025-10-15  7:40     ` Finn Thain
@ 2025-10-15 13:53       ` David Laight
  2025-10-16  6:53         ` Finn Thain
  0 siblings, 1 reply; 18+ messages in thread
From: David Laight @ 2025-10-15 13:53 UTC (permalink / raw)
  To: Finn Thain
  Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng,
	Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel,
	linux-arch, Geert Uytterhoeven, linux-m68k, linux-doc

On Wed, 15 Oct 2025 18:40:39 +1100 (AEDT)
Finn Thain <fthain@linux-m68k.org> wrote:

> On Tue, 14 Oct 2025, David Laight wrote:
> 
> > On Wed, 08 Oct 2025 09:19:20 +1100
> > Finn Thain <fthain@linux-m68k.org> wrote:
> >   
> > > Discourage assumptions that simply don't hold for all Linux ABIs.
> > > Exceptions to the natural alignment rule for scalar types include
> > > long long on i386 and sh.
> > > ---
> > >  Documentation/core-api/unaligned-memory-access.rst | 7 -------
> > >  1 file changed, 7 deletions(-)
> > > 
> > > diff --git a/Documentation/core-api/unaligned-memory-access.rst b/Documentation/core-api/unaligned-memory-access.rst
> > > index 5ceeb80eb539..1390ce2b7291 100644
> > > --- a/Documentation/core-api/unaligned-memory-access.rst
> > > +++ b/Documentation/core-api/unaligned-memory-access.rst
> > > @@ -40,9 +40,6 @@ The rule mentioned above forms what we refer to as natural alignment:
> > >  When accessing N bytes of memory, the base memory address must be evenly
> > >  divisible by N, i.e. addr % N == 0.
> > >  
> > > -When writing code, assume the target architecture has natural alignment
> > > -requirements.  
> > 
> > I think I'd be more explicit, perhaps:
> > Note that not all architectures align 64bit items on 8 byte boundaries or
> > even 32bit items on 4 byte boundaries.
> >   
> 
> That's what the next para is alluding to...
> 
> > > In reality, only a few architectures require natural alignment on all sizes
> > > of memory access. However, we must consider ALL supported architectures; 
> > > writing code that satisfies natural alignment requirements is the easiest way 
> > > to achieve full portability.  
> 
> How about this?
> 
> "In reality, only a few architectures require natural alignment for all 
> sizes of memory access. That is, not all architectures need 64-bit values 
> to be aligned on 8-byte boundaries and 32-bit values on 4-byte boundaries. 
> However, when writing code intended to achieve full portability, we must 
> consider all supported architectures."

There are several separate alignments:
- The alignment the cpu needs, for most x86 instructions this is 1 byte [1].
  Many RISC cpu require 'word' alignment (for some definition of 'word').
  A problematic case is data that crosses page boundaries.
- The alignment the compiler uses for structure members; returned by _Alignof().
  m68k only 16bit aligns 32bit values.
- The 'preferred' alignment returned by __alignof__().
  32bit x86 returns 8 for 64bit types even though the ABI only 4-byte aligns them.
- The 'natural' alignment based on the size of the item.
  I'd guess that 'complex double' (if supported) may only be 8 byte aligned.

What normally matters is the ABI alignment for structure members.
If you mark anything 'packed' the compiler will generate shifts and masks (etc)
to get working code.
Taking the address of an item in a packed structure generates a warning
for very good reason. 

[1] I've fallen foul of gcc deciding to 'vectorise' a loop and then having
it crash because the buffer address was misaligned.
Nasty because the code worked in initial testing and I expected the loop
(32bit adds of a buffer) to work fine even when misaligned.

	David

> 
> > > @@ -103,10 +100,6 @@ Therefore, for standard structure types you can always rely on the compiler
> > >  to pad structures so that accesses to fields are suitably aligned (assuming
> > >  you do not cast the field to a type of different length).
> > >  
> > > -Similarly, you can also rely on the compiler to align variables and function
> > > -parameters to a naturally aligned scheme, based on the size of the type of
> > > -the variable.
> > > -
> > >  At this point, it should be clear that accessing a single byte (u8 or char)
> > >  will never cause an unaligned access, because all memory addresses are evenly
> > >  divisible by one.  
> > 
> >   


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC v3 1/5] documentation: Discourage alignment assumptions
  2025-10-15 13:53       ` David Laight
@ 2025-10-16  6:53         ` Finn Thain
  0 siblings, 0 replies; 18+ messages in thread
From: Finn Thain @ 2025-10-16  6:53 UTC (permalink / raw)
  To: David Laight
  Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng,
	Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel,
	linux-arch, Geert Uytterhoeven, linux-m68k, linux-doc


On Wed, 15 Oct 2025, David Laight wrote:

> 
> There are several separate alignments:
> - The alignment the cpu needs, for most x86 instructions this is 1 byte [1].
>   Many RISC cpu require 'word' alignment (for some definition of 'word').
>   A problematic case is data that crosses page boundaries.
> - The alignment the compiler uses for structure members; returned by _Alignof().
>   m68k only 16bit aligns 32bit values.
> - The 'preferred' alignment returned by __alignof__().
>   32bit x86 returns 8 for 64bit types even though the ABI only 4-byte aligns them.
> - The 'natural' alignment based on the size of the item.
>   I'd guess that 'complex double' (if supported) may only be 8 byte aligned.
> 

Those distinctions could be useful in a discussion about memory 
efficiency. But this document is concerned with avoiding a performance 
penalty -- it's entirely unconcerned with over-alignment and memory waste.
Hence, "aligned" is used as shorthand for "naturally aligned".

The ambiguity in this document (and my proposed change) stems from using 
the word architecture to cover ABI, platform, CPU, ISA etc.
I can improve upon that.

> What normally matters is the ABI alignment for structure members.
> If you mark anything 'packed' the compiler will generate shifts and masks (etc)
> to get working code.
> Taking the address of an item in a packed structure generates a warning
> for very good reason. 
> 

I believe the problem with 'packed' is already covered in this document.

> [1] I've fallen foul of gcc deciding to 'vectorise' a loop and then having
> it crash because the buffer address was misaligned.
> Nasty because the code worked in initial testing and I expected the loop
> (32bit adds of a buffer) to work fine even when misaligned.
> 

I think that pitfall is already discussed also, along with a remedy.

There is also this,

    ... for standard structure types you can always rely on the compiler 
    to pad structures so that accesses to fields are suitably aligned 
    (assuming you do not cast the field to a type of different length).

So it seems to be fairly comprehensive but I may be missing something (?)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC v3 1/5] documentation: Discourage alignment assumptions
  2025-10-07 22:19 ` [RFC v3 1/5] documentation: Discourage alignment assumptions Finn Thain
  2025-10-14 10:23   ` David Laight
@ 2025-10-16 13:30   ` Arnd Bergmann
  2025-10-16 22:14     ` Finn Thain
  1 sibling, 1 reply; 18+ messages in thread
From: Arnd Bergmann @ 2025-10-16 13:30 UTC (permalink / raw)
  To: Finn Thain, Peter Zijlstra, Will Deacon
  Cc: Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland,
	linux-kernel, Linux-Arch, Geert Uytterhoeven, linux-m68k,
	linux-doc

 On Wed, Oct 8, 2025, at 00:19, Finn Thain wrote:
> Discourage assumptions that simply don't hold for all Linux ABIs.
> Exceptions to the natural alignment rule for scalar types include
> long long on i386 and sh.
> ---

I think both of the paragraphs you remove are still correct and I
would not remove them:

>  Documentation/core-api/unaligned-memory-access.rst | 7 -------
>  1 file changed, 7 deletions(-)
>
> diff --git a/Documentation/core-api/unaligned-memory-access.rst 
> b/Documentation/core-api/unaligned-memory-access.rst
> index 5ceeb80eb539..1390ce2b7291 100644
> --- a/Documentation/core-api/unaligned-memory-access.rst
> +++ b/Documentation/core-api/unaligned-memory-access.rst
> 
> -When writing code, assume the target architecture has natural alignment
> -requirements.
> -

It is clearly important to not intentionally misalign variables
because that breaks on hardware that requires aligned data.

Assuming natural alignment is the safe choice here, but you
could change 'architecture' to 'hardware' here if you
think that is otherwise ambiguous.

> -Similarly, you can also rely on the compiler to align variables and function
> -parameters to a naturally aligned scheme, based on the size of the type of
> -the variable.

This also seems to refer to something else: even on m68k
and i386, scalar stack and .data variables have natural
alignment even though the ABI does not require that.

It's probably a good idea to list the specific exceptions to
the struct layout rules in the previous paragraph, e.g.

[
 Fortunately, the compiler understands the alignment constraints, so in the
 above case it would insert 2 bytes of padding in between field1 and field2.
 Therefore, for standard structure types you can always rely on the compiler
-to pad structures so that accesses to fields are suitably aligned (assuming
-you do not cast the field to a type of different length).
+to pad structures so that accesses to fields are suitably aligned for
+the CPU hardware.
+On all 64-bit architectures, this means that all scalar struct members
+are naturally aligned. However, some 32-bit ABIs including i386
+only align 64-bit members on 32-bit offsets, and m68k uses at most
+16-bit alignment.
]

      Arnd

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC v3 1/5] documentation: Discourage alignment assumptions
  2025-10-16 13:30   ` Arnd Bergmann
@ 2025-10-16 22:14     ` Finn Thain
  0 siblings, 0 replies; 18+ messages in thread
From: Finn Thain @ 2025-10-16 22:14 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng,
	Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch,
	Geert Uytterhoeven, linux-m68k, linux-doc


On Thu, 16 Oct 2025, Arnd Bergmann wrote:

>  On Wed, Oct 8, 2025, at 00:19, Finn Thain wrote:
> > Discourage assumptions that simply don't hold for all Linux ABIs.
> > Exceptions to the natural alignment rule for scalar types include
> > long long on i386 and sh.
> > ---
> 
> I think both of the paragraphs you remove are still correct and I
> would not remove them:
> 

Yes -- correct in some obscure sense, but misleading at face value.
Hence this patch.

> >  Documentation/core-api/unaligned-memory-access.rst | 7 -------
> >  1 file changed, 7 deletions(-)
> >
> > diff --git a/Documentation/core-api/unaligned-memory-access.rst 
> > b/Documentation/core-api/unaligned-memory-access.rst
> > index 5ceeb80eb539..1390ce2b7291 100644
> > --- a/Documentation/core-api/unaligned-memory-access.rst
> > +++ b/Documentation/core-api/unaligned-memory-access.rst
> > 
> > -When writing code, assume the target architecture has natural alignment
> > -requirements.
> > -
> 
> It is clearly important to not intentionally misalign variables
> because that breaks on hardware that requires aligned data.
> 
> Assuming natural alignment is the safe choice here, but you
> could change 'architecture' to 'hardware' here if you
> think that is otherwise ambiguous.
> 

Do you know of any hardware that has "natural alignment requirements" in 
the completely unqualified sense in which that the claim is made? i.e. 
considering all of its native vector and scalar types.

That's a rhetorical question. I'm not trying to provide an exhaustive and 
up-to-date list of platform quirks. With this patch, I'm merely trying to 
discourage faulty assumptions.

> > -Similarly, you can also rely on the compiler to align variables and function
> > -parameters to a naturally aligned scheme, based on the size of the type of
> > -the variable.
> 
> This also seems to refer to something else: even on m68k and i386, 
> scalar stack and .data variables have natural alignment even though the 
> ABI does not require that.
> 

Then it is doubly misleading.

There is value in explaining what the compiler can and cannot be relied 
upon to deliver, but I think the myfunc() example already serves that 
purpose. Don't you agree?

> It's probably a good idea to list the specific exceptions to
> the struct layout rules in the previous paragraph, e.g.
> 
> [
>  Fortunately, the compiler understands the alignment constraints, so in the
>  above case it would insert 2 bytes of padding in between field1 and field2.
>  Therefore, for standard structure types you can always rely on the compiler
> -to pad structures so that accesses to fields are suitably aligned (assuming
> -you do not cast the field to a type of different length).
> +to pad structures so that accesses to fields are suitably aligned for
> +the CPU hardware.
> +On all 64-bit architectures, 

The poor reader: "I wonder what '64-bit architecture' means in this 
context..."

> this means that all scalar struct members
> +are naturally aligned. However, some 32-bit ABIs including i386
> +only align 64-bit members on 32-bit offsets, and m68k uses at most
> +16-bit alignment. ]

"... oh, okay, so anything that naturally aligns a 64-bit word is a 
"64-bit architecture". I get it. Oh, hang-on, that doesn't make sense... 
Why would any 32-bit architecture be expected to naturally align 64-bit 
members? What is he talking about? CONFIG_64BIT maybe??"

Moreover, you have digressed into a discussion of the ABI. The aim of this 
document is to avoid the performance penalty for unaligned accesses. 

Whereas, ABI traits would seem to be relevant to a discussion about memory 
efficiency, like I said to David.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2025-10-16 22:14 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-07 22:19 [RFC v3 0/5] Align atomic storage Finn Thain
2025-10-07 22:19 ` [RFC v3 3/5] atomic: Specify alignment for atomic_t and atomic64_t Finn Thain
2025-10-07 22:19 ` [RFC v3 5/5] atomic: Add option for weaker alignment check Finn Thain
2025-10-07 22:19 ` [RFC v3 2/5] bpf: Explicitly align bpf_res_spin_lock Finn Thain
2025-10-09  2:10   ` Alexei Starovoitov
2025-10-09  2:56     ` Finn Thain
2025-10-09  7:02     ` Peter Zijlstra
2025-10-09 15:17       ` Alexei Starovoitov
2025-10-09 16:01         ` Arnd Bergmann
2025-10-09 16:12           ` Alexei Starovoitov
2025-10-07 22:19 ` [RFC v3 1/5] documentation: Discourage alignment assumptions Finn Thain
2025-10-14 10:23   ` David Laight
2025-10-15  7:40     ` Finn Thain
2025-10-15 13:53       ` David Laight
2025-10-16  6:53         ` Finn Thain
2025-10-16 13:30   ` Arnd Bergmann
2025-10-16 22:14     ` Finn Thain
2025-10-07 22:19 ` [RFC v3 4/5] atomic: Add alignment check to instrumented atomic operations Finn Thain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).