* [RFC v2 0/3] Align atomic storage
@ 2025-09-14 0:45 Finn Thain
2025-09-14 0:45 ` [RFC v2 1/3] documentation: Discourage alignment assumptions Finn Thain
` (2 more replies)
0 siblings, 3 replies; 44+ messages in thread
From: Finn Thain @ 2025-09-14 0:45 UTC (permalink / raw)
To: Peter Zijlstra, Will Deacon
Cc: Andrew Morton, Arnd Bergmann, Boqun Feng, Jonathan Corbet,
Geert Uytterhoeven, linux-arch, linux-doc, linux-kernel,
linux-m68k, Mark Rutland
This series adds the __aligned attribute to atomic_t and atomic64_t.
It also adds a Kconfig option to enable a new runtime warning to help
reveal misaligned atomic accesses on platforms which don't trap that.
Some people might assume scalars are aligned to 4-byte boundaries, while
others might assume natural alignment. Best not to encourage such
assumptions.
Moreover, being that locks are performance sensitive, and being that
atomic operations tend to involve further assumptions, there seems to be
room for improvement here.
Pertinent to this discussion are the section "Memory Efficiency" in
Documentation/RCU/Design/Requirements/Requirements.rst
and the section "GUARANTEES" in Documentation/memory-barriers.txt
Finn Thain (2):
documentation: Discourage alignment assumptions
atomic: Specify alignment for atomic_t and atomic64_t
Peter Zijlstra (1):
atomic: Add alignment check to instrumented atomic operations
Documentation/core-api/unaligned-memory-access.rst | 7 -------
include/asm-generic/atomic64.h | 2 +-
include/linux/instrumented.h | 4 ++++
include/linux/types.h | 2 +-
lib/Kconfig.debug | 10 ++++++++++
5 files changed, 16 insertions(+), 9 deletions(-)
--
2.49.1
^ permalink raw reply [flat|nested] 44+ messages in thread* [RFC v2 1/3] documentation: Discourage alignment assumptions 2025-09-14 0:45 [RFC v2 0/3] Align atomic storage Finn Thain @ 2025-09-14 0:45 ` Finn Thain 2025-09-14 0:45 ` [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations Finn Thain 2025-09-14 0:45 ` [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t Finn Thain 2 siblings, 0 replies; 44+ messages in thread From: Finn Thain @ 2025-09-14 0:45 UTC (permalink / raw) To: Peter Zijlstra, Will Deacon, Jonathan Corbet Cc: Andrew Morton, Boqun Feng, Mark Rutland, Arnd Bergmann, linux-kernel, linux-arch, Geert Uytterhoeven, linux-m68k, linux-doc Discourage assumptions that simply don't hold for all Linux ABIs. Exceptions to the natural alignment rule for scalar types include long long on i386 and sh. --- Documentation/core-api/unaligned-memory-access.rst | 7 ------- 1 file changed, 7 deletions(-) diff --git a/Documentation/core-api/unaligned-memory-access.rst b/Documentation/core-api/unaligned-memory-access.rst index 5ceeb80eb539..1390ce2b7291 100644 --- a/Documentation/core-api/unaligned-memory-access.rst +++ b/Documentation/core-api/unaligned-memory-access.rst @@ -40,9 +40,6 @@ The rule mentioned above forms what we refer to as natural alignment: When accessing N bytes of memory, the base memory address must be evenly divisible by N, i.e. addr % N == 0. -When writing code, assume the target architecture has natural alignment -requirements. - In reality, only a few architectures require natural alignment on all sizes of memory access. However, we must consider ALL supported architectures; writing code that satisfies natural alignment requirements is the easiest way @@ -103,10 +100,6 @@ Therefore, for standard structure types you can always rely on the compiler to pad structures so that accesses to fields are suitably aligned (assuming you do not cast the field to a type of different length). -Similarly, you can also rely on the compiler to align variables and function -parameters to a naturally aligned scheme, based on the size of the type of -the variable. - At this point, it should be clear that accessing a single byte (u8 or char) will never cause an unaligned access, because all memory addresses are evenly divisible by one. -- 2.49.1 ^ permalink raw reply related [flat|nested] 44+ messages in thread
* [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations 2025-09-14 0:45 [RFC v2 0/3] Align atomic storage Finn Thain 2025-09-14 0:45 ` [RFC v2 1/3] documentation: Discourage alignment assumptions Finn Thain @ 2025-09-14 0:45 ` Finn Thain 2025-09-15 8:00 ` Peter Zijlstra 2025-09-14 0:45 ` [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t Finn Thain 2 siblings, 1 reply; 44+ messages in thread From: Finn Thain @ 2025-09-14 0:45 UTC (permalink / raw) To: Peter Zijlstra, Will Deacon Cc: Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel, linux-arch, Geert Uytterhoeven, linux-m68k From: Peter Zijlstra <peterz@infradead.org> Add a Kconfig option for debug builds which logs a warning when an instrumented atomic operation takes place at some location that isn't a long word boundary. Some platforms don't trap for this. Link: https://lore.kernel.org/lkml/20250901093600.GF4067720@noisy.programming.kicks-ass.net/ --- This patch differs slightly from Peter's code which checked for natural alignment. --- include/linux/instrumented.h | 4 ++++ lib/Kconfig.debug | 10 ++++++++++ 2 files changed, 14 insertions(+) diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h index 711a1f0d1a73..55f5685971a1 100644 --- a/include/linux/instrumented.h +++ b/include/linux/instrumented.h @@ -7,6 +7,7 @@ #ifndef _LINUX_INSTRUMENTED_H #define _LINUX_INSTRUMENTED_H +#include <linux/bug.h> #include <linux/compiler.h> #include <linux/kasan-checks.h> #include <linux/kcsan-checks.h> @@ -67,6 +68,7 @@ static __always_inline void instrument_atomic_read(const volatile void *v, size_ { kasan_check_read(v, size); kcsan_check_atomic_read(v, size); + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (sizeof(long) - 1))); } /** @@ -81,6 +83,7 @@ static __always_inline void instrument_atomic_write(const volatile void *v, size { kasan_check_write(v, size); kcsan_check_atomic_write(v, size); + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (sizeof(long) - 1))); } /** @@ -95,6 +98,7 @@ static __always_inline void instrument_atomic_read_write(const volatile void *v, { kasan_check_write(v, size); kcsan_check_atomic_read_write(v, size); + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (sizeof(long) - 1))); } /** diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index ebe33181b6e6..d82626b7d6be 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1363,6 +1363,16 @@ config DEBUG_PREEMPT depending on workload as it triggers debugging routines for each this_cpu operation. It should only be used for debugging purposes. +config DEBUG_ATOMIC + bool "Debug atomic variables" + depends on DEBUG_KERNEL + help + If you say Y here then the kernel will add a runtime alignment check + to atomic accesses. Useful for architectures that do not have trap on + mis-aligned access. + + This option has potentially significant overhead. + menu "Lock Debugging (spinlocks, mutexes, etc...)" config LOCK_DEBUGGING_SUPPORT -- 2.49.1 ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations 2025-09-14 0:45 ` [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations Finn Thain @ 2025-09-15 8:00 ` Peter Zijlstra 2025-09-15 9:38 ` Finn Thain 0 siblings, 1 reply; 44+ messages in thread From: Peter Zijlstra @ 2025-09-15 8:00 UTC (permalink / raw) To: Finn Thain Cc: Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel, linux-arch, Geert Uytterhoeven, linux-m68k On Sun, Sep 14, 2025 at 10:45:29AM +1000, Finn Thain wrote: > From: Peter Zijlstra <peterz@infradead.org> > > Add a Kconfig option for debug builds which logs a warning when an > instrumented atomic operation takes place at some location that isn't > a long word boundary. Some platforms don't trap for this. > > Link: https://lore.kernel.org/lkml/20250901093600.GF4067720@noisy.programming.kicks-ass.net/ > --- > This patch differs slightly from Peter's code which checked for natural > alignment. > --- > include/linux/instrumented.h | 4 ++++ > lib/Kconfig.debug | 10 ++++++++++ > 2 files changed, 14 insertions(+) > > diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h > index 711a1f0d1a73..55f5685971a1 100644 > --- a/include/linux/instrumented.h > +++ b/include/linux/instrumented.h > @@ -7,6 +7,7 @@ > #ifndef _LINUX_INSTRUMENTED_H > #define _LINUX_INSTRUMENTED_H > > +#include <linux/bug.h> > #include <linux/compiler.h> > #include <linux/kasan-checks.h> > #include <linux/kcsan-checks.h> > @@ -67,6 +68,7 @@ static __always_inline void instrument_atomic_read(const volatile void *v, size_ > { > kasan_check_read(v, size); > kcsan_check_atomic_read(v, size); > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (sizeof(long) - 1))); > } > > /** > @@ -81,6 +83,7 @@ static __always_inline void instrument_atomic_write(const volatile void *v, size > { > kasan_check_write(v, size); > kcsan_check_atomic_write(v, size); > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (sizeof(long) - 1))); > } > > /** > @@ -95,6 +98,7 @@ static __always_inline void instrument_atomic_read_write(const volatile void *v, > { > kasan_check_write(v, size); > kcsan_check_atomic_read_write(v, size); > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (sizeof(long) - 1))); > } Right, so why aren't we trusting the size argument? And instead mandating a possibly larger alignment? Note how things like test_and_set_bit() will use sizeof(long), while atomic_set() will use sizeof(*v), which, on LP64 architectures are very much not the same. The same with atomic_*() vs atomic_long_*() / atomic64_*(), they will have different alignment requirements. And then there is cmpxchg(), that can be u8 u16, u32 and u64 depending on the user. And then there is cmpxchg128(). I really don't see how using long here is correct. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations 2025-09-15 8:00 ` Peter Zijlstra @ 2025-09-15 9:38 ` Finn Thain 2025-09-15 10:06 ` Peter Zijlstra 0 siblings, 1 reply; 44+ messages in thread From: Finn Thain @ 2025-09-15 9:38 UTC (permalink / raw) To: Peter Zijlstra Cc: Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel, linux-arch, Geert Uytterhoeven, linux-m68k On Mon, 15 Sep 2025, Peter Zijlstra wrote: > On Sun, Sep 14, 2025 at 10:45:29AM +1000, Finn Thain wrote: > > From: Peter Zijlstra <peterz@infradead.org> > > > > Add a Kconfig option for debug builds which logs a warning when an > > instrumented atomic operation takes place at some location that isn't > > a long word boundary. Some platforms don't trap for this. > > > > Link: https://lore.kernel.org/lkml/20250901093600.GF4067720@noisy.programming.kicks-ass.net/ > > --- > > This patch differs slightly from Peter's code which checked for natural > > alignment. > > --- > > include/linux/instrumented.h | 4 ++++ > > lib/Kconfig.debug | 10 ++++++++++ > > 2 files changed, 14 insertions(+) > > > > diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h > > index 711a1f0d1a73..55f5685971a1 100644 > > --- a/include/linux/instrumented.h > > +++ b/include/linux/instrumented.h > > @@ -7,6 +7,7 @@ > > #ifndef _LINUX_INSTRUMENTED_H > > #define _LINUX_INSTRUMENTED_H > > > > +#include <linux/bug.h> > > #include <linux/compiler.h> > > #include <linux/kasan-checks.h> > > #include <linux/kcsan-checks.h> > > @@ -67,6 +68,7 @@ static __always_inline void instrument_atomic_read(const volatile void *v, size_ > > { > > kasan_check_read(v, size); > > kcsan_check_atomic_read(v, size); > > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (sizeof(long) - 1))); > > } > > > > /** > > @@ -81,6 +83,7 @@ static __always_inline void instrument_atomic_write(const volatile void *v, size > > { > > kasan_check_write(v, size); > > kcsan_check_atomic_write(v, size); > > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (sizeof(long) - 1))); > > } > > > > /** > > @@ -95,6 +98,7 @@ static __always_inline void instrument_atomic_read_write(const volatile void *v, > > { > > kasan_check_write(v, size); > > kcsan_check_atomic_read_write(v, size); > > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (sizeof(long) - 1))); > > } > > Right, so why aren't we trusting the size argument? And instead > mandating a possibly larger alignment? > It wasn't supposed to mandate a larger alignment in practice. I considered doing something like (unsigned long)v & (size - 1) & (sizeof(long) - 1) but decided that the extra overhead probably wouldn't be worthwhile, if in practice, no-one is doing atomic ops on shorts or chars. I will revisit this. When you do atomic operations on atomic_t or atomic64_t, (sizeof(long) - 1) probably doesn't make much sense. But atomic operations get used on scalar types (aside from atomic_t and atomic64_t) that don't have natural alignment. Please refer to the other thread about this: https://lore.kernel.org/all/ed1e0896-fd85-5101-e136-e4a5a37ca5ff@linux-m68k.org/ ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations 2025-09-15 9:38 ` Finn Thain @ 2025-09-15 10:06 ` Peter Zijlstra 2025-09-15 10:37 ` Finn Thain 0 siblings, 1 reply; 44+ messages in thread From: Peter Zijlstra @ 2025-09-15 10:06 UTC (permalink / raw) To: Finn Thain Cc: Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel, linux-arch, Geert Uytterhoeven, linux-m68k On Mon, Sep 15, 2025 at 07:38:52PM +1000, Finn Thain wrote: > > On Mon, 15 Sep 2025, Peter Zijlstra wrote: > > > On Sun, Sep 14, 2025 at 10:45:29AM +1000, Finn Thain wrote: > > > From: Peter Zijlstra <peterz@infradead.org> > > > > > > Add a Kconfig option for debug builds which logs a warning when an > > > instrumented atomic operation takes place at some location that isn't > > > a long word boundary. Some platforms don't trap for this. > > > > > > Link: https://lore.kernel.org/lkml/20250901093600.GF4067720@noisy.programming.kicks-ass.net/ > > > --- > > > This patch differs slightly from Peter's code which checked for natural > > > alignment. > > > --- > > > include/linux/instrumented.h | 4 ++++ > > > lib/Kconfig.debug | 10 ++++++++++ > > > 2 files changed, 14 insertions(+) > > > > > > diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h > > > index 711a1f0d1a73..55f5685971a1 100644 > > > --- a/include/linux/instrumented.h > > > +++ b/include/linux/instrumented.h > > > @@ -7,6 +7,7 @@ > > > #ifndef _LINUX_INSTRUMENTED_H > > > #define _LINUX_INSTRUMENTED_H > > > > > > +#include <linux/bug.h> > > > #include <linux/compiler.h> > > > #include <linux/kasan-checks.h> > > > #include <linux/kcsan-checks.h> > > > @@ -67,6 +68,7 @@ static __always_inline void instrument_atomic_read(const volatile void *v, size_ > > > { > > > kasan_check_read(v, size); > > > kcsan_check_atomic_read(v, size); > > > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (sizeof(long) - 1))); > > > } > > > > > > /** > > > @@ -81,6 +83,7 @@ static __always_inline void instrument_atomic_write(const volatile void *v, size > > > { > > > kasan_check_write(v, size); > > > kcsan_check_atomic_write(v, size); > > > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (sizeof(long) - 1))); > > > } > > > > > > /** > > > @@ -95,6 +98,7 @@ static __always_inline void instrument_atomic_read_write(const volatile void *v, > > > { > > > kasan_check_write(v, size); > > > kcsan_check_atomic_read_write(v, size); > > > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (sizeof(long) - 1))); > > > } > > > > Right, so why aren't we trusting the size argument? And instead > > mandating a possibly larger alignment? > > > > It wasn't supposed to mandate a larger alignment in practice. I considered > doing something like (unsigned long)v & (size - 1) & (sizeof(long) - 1) > but decided that the extra overhead probably wouldn't be worthwhile, if in > practice, no-one is doing atomic ops on shorts or chars. I will revisit > this. atomic_t is aligned at 4 bytes, you're now mandating it is aligned at 8 bytes (on LP64), this cannot be right. kernel/locking/qspinlock.c:xchg_tail() does xchg_relaxed(&lock->tail, ...) which is u16. Again, you cannot mandate 8 bytes here. > When you do atomic operations on atomic_t or atomic64_t, (sizeof(long) > - 1) probably doesn't make much sense. But atomic operations get used on > scalar types (aside from atomic_t and atomic64_t) that don't have natural > alignment. Please refer to the other thread about this: > https://lore.kernel.org/all/ed1e0896-fd85-5101-e136-e4a5a37ca5ff@linux-m68k.org/ Perhaps set ARCH_SLAB_MINALIGN ? ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations 2025-09-15 10:06 ` Peter Zijlstra @ 2025-09-15 10:37 ` Finn Thain 2025-09-15 11:20 ` Arnd Bergmann 0 siblings, 1 reply; 44+ messages in thread From: Finn Thain @ 2025-09-15 10:37 UTC (permalink / raw) To: Peter Zijlstra Cc: Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel, linux-arch, Geert Uytterhoeven, linux-m68k On Mon, 15 Sep 2025, Peter Zijlstra wrote: > On Mon, Sep 15, 2025 at 07:38:52PM +1000, Finn Thain wrote: > > > > On Mon, 15 Sep 2025, Peter Zijlstra wrote: > > > > > On Sun, Sep 14, 2025 at 10:45:29AM +1000, Finn Thain wrote: > > > > From: Peter Zijlstra <peterz@infradead.org> > > > > > > > > Add a Kconfig option for debug builds which logs a warning when an > > > > instrumented atomic operation takes place at some location that isn't > > > > a long word boundary. Some platforms don't trap for this. > > > > > > > > Link: https://lore.kernel.org/lkml/20250901093600.GF4067720@noisy.programming.kicks-ass.net/ > > > > --- > > > > This patch differs slightly from Peter's code which checked for natural > > > > alignment. > > > > --- > > > > include/linux/instrumented.h | 4 ++++ > > > > lib/Kconfig.debug | 10 ++++++++++ > > > > 2 files changed, 14 insertions(+) > > > > > > > > diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h > > > > index 711a1f0d1a73..55f5685971a1 100644 > > > > --- a/include/linux/instrumented.h > > > > +++ b/include/linux/instrumented.h > > > > @@ -7,6 +7,7 @@ > > > > #ifndef _LINUX_INSTRUMENTED_H > > > > #define _LINUX_INSTRUMENTED_H > > > > > > > > +#include <linux/bug.h> > > > > #include <linux/compiler.h> > > > > #include <linux/kasan-checks.h> > > > > #include <linux/kcsan-checks.h> > > > > @@ -67,6 +68,7 @@ static __always_inline void instrument_atomic_read(const volatile void *v, size_ > > > > { > > > > kasan_check_read(v, size); > > > > kcsan_check_atomic_read(v, size); > > > > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (sizeof(long) - 1))); > > > > } > > > > > > > > /** > > > > @@ -81,6 +83,7 @@ static __always_inline void instrument_atomic_write(const volatile void *v, size > > > > { > > > > kasan_check_write(v, size); > > > > kcsan_check_atomic_write(v, size); > > > > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (sizeof(long) - 1))); > > > > } > > > > > > > > /** > > > > @@ -95,6 +98,7 @@ static __always_inline void instrument_atomic_read_write(const volatile void *v, > > > > { > > > > kasan_check_write(v, size); > > > > kcsan_check_atomic_read_write(v, size); > > > > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (sizeof(long) - 1))); > > > > } > > > > > > Right, so why aren't we trusting the size argument? And instead > > > mandating a possibly larger alignment? > > > > > > > It wasn't supposed to mandate a larger alignment in practice. I considered > > doing something like (unsigned long)v & (size - 1) & (sizeof(long) - 1) > > but decided that the extra overhead probably wouldn't be worthwhile, if in > > practice, no-one is doing atomic ops on shorts or chars. I will revisit > > this. > > atomic_t is aligned at 4 bytes, you're now mandating it is aligned at 8 > bytes (on LP64), this cannot be right. > > kernel/locking/qspinlock.c:xchg_tail() does xchg_relaxed(&lock->tail, > ...) which is u16. Again, you cannot mandate 8 bytes here. > OK. I will change it back to your code (i.e. mandate natural alignment). > > When you do atomic operations on atomic_t or atomic64_t, (sizeof(long) > > - 1) probably doesn't make much sense. But atomic operations get used on > > scalar types (aside from atomic_t and atomic64_t) that don't have natural > > alignment. Please refer to the other thread about this: > > https://lore.kernel.org/all/ed1e0896-fd85-5101-e136-e4a5a37ca5ff@linux-m68k.org/ > > Perhaps set ARCH_SLAB_MINALIGN ? > That's not going to help much. The 850 byte offset of task_works into struct task_struct and the 418 byte offset of exit_state in struct task_struct are already misaligned. But that's all moot, if you intended that CONFIG_DEBUG_ATOMIC should complain about any deviation from natural alignment. I still don't have any performance measurements but I'm willing to assume there's a penalty for such deviation. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations 2025-09-15 10:37 ` Finn Thain @ 2025-09-15 11:20 ` Arnd Bergmann 2025-09-16 0:16 ` Finn Thain 0 siblings, 1 reply; 44+ messages in thread From: Arnd Bergmann @ 2025-09-15 11:20 UTC (permalink / raw) To: Finn Thain, Peter Zijlstra Cc: Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, Geert Uytterhoeven, linux-m68k On Mon, Sep 15, 2025, at 12:37, Finn Thain wrote: > On Mon, 15 Sep 2025, Peter Zijlstra wrote: >> >> > When you do atomic operations on atomic_t or atomic64_t, (sizeof(long) >> > - 1) probably doesn't make much sense. But atomic operations get used on >> > scalar types (aside from atomic_t and atomic64_t) that don't have natural >> > alignment. Please refer to the other thread about this: >> > https://lore.kernel.org/all/ed1e0896-fd85-5101-e136-e4a5a37ca5ff@linux-m68k.org/ >> >> Perhaps set ARCH_SLAB_MINALIGN ? >> > > That's not going to help much. The 850 byte offset of task_works into > struct task_struct and the 418 byte offset of exit_state in struct > task_struct are already misaligned. Has there been any progress on building m68k kernels with -mint-align? IIRC there are only a small number of uapi structures that need __packed annotations to maintain the existing syscall ABI. Arnd ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations 2025-09-15 11:20 ` Arnd Bergmann @ 2025-09-16 0:16 ` Finn Thain 2025-09-16 10:10 ` Geert Uytterhoeven 2025-09-16 12:37 ` Arnd Bergmann 0 siblings, 2 replies; 44+ messages in thread From: Finn Thain @ 2025-09-16 0:16 UTC (permalink / raw) To: Arnd Bergmann Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, Geert Uytterhoeven, linux-m68k On Mon, 15 Sep 2025, Arnd Bergmann wrote: > On Mon, Sep 15, 2025, at 12:37, Finn Thain wrote: > > On Mon, 15 Sep 2025, Peter Zijlstra wrote: > >> > >> > When you do atomic operations on atomic_t or atomic64_t, (sizeof(long) > >> > - 1) probably doesn't make much sense. But atomic operations get used on > >> > scalar types (aside from atomic_t and atomic64_t) that don't have natural > >> > alignment. Please refer to the other thread about this: > >> > https://lore.kernel.org/all/ed1e0896-fd85-5101-e136-e4a5a37ca5ff@linux-m68k.org/ > >> > >> Perhaps set ARCH_SLAB_MINALIGN ? > >> > > > > That's not going to help much. The 850 byte offset of task_works into > > struct task_struct and the 418 byte offset of exit_state in struct > > task_struct are already misaligned. > > Has there been any progress on building m68k kernels with -mint-align? Not that I know of. > IIRC there are only a small number of uapi structures that need > __packed annotations to maintain the existing syscall ABI. > Packing uapi structures (and adopting -malign-int) sounds easier than the alternative, which might be to align certain internal kernel struct members, on a case-by-case basis, where doing so could be shown to improve performance on some architecture or other (while keeping -mno-align-int). Well, it's easy to find all the structs that belong to the uapi, but it's not easy to find all the internal kernel structs that describe MMIO registers. For -malign-int, both kinds of structs are a problem. If better performance is to be had, my guess is that aligning atomic_t will get 80% of it (just an appeal to the Pareto principle, FWIW...) ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations 2025-09-16 0:16 ` Finn Thain @ 2025-09-16 10:10 ` Geert Uytterhoeven 2025-09-17 1:23 ` Finn Thain 2025-09-16 12:37 ` Arnd Bergmann 1 sibling, 1 reply; 44+ messages in thread From: Geert Uytterhoeven @ 2025-09-16 10:10 UTC (permalink / raw) To: Finn Thain Cc: Arnd Bergmann, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, John Paul Adrian Glaubitz Hi Finn, CC Adrian, On Tue, 16 Sept 2025 at 02:16, Finn Thain <fthain@linux-m68k.org> wrote: > On Mon, 15 Sep 2025, Arnd Bergmann wrote: > > On Mon, Sep 15, 2025, at 12:37, Finn Thain wrote: > > > On Mon, 15 Sep 2025, Peter Zijlstra wrote: > > >> > > >> > When you do atomic operations on atomic_t or atomic64_t, (sizeof(long) > > >> > - 1) probably doesn't make much sense. But atomic operations get used on > > >> > scalar types (aside from atomic_t and atomic64_t) that don't have natural > > >> > alignment. Please refer to the other thread about this: > > >> > https://lore.kernel.org/all/ed1e0896-fd85-5101-e136-e4a5a37ca5ff@linux-m68k.org/ > > >> > > >> Perhaps set ARCH_SLAB_MINALIGN ? > > > > > > That's not going to help much. The 850 byte offset of task_works into > > > struct task_struct and the 418 byte offset of exit_state in struct > > > task_struct are already misaligned. > > > > Has there been any progress on building m68k kernels with -mint-align? > > Not that I know of. > > > IIRC there are only a small number of uapi structures that need > > __packed annotations to maintain the existing syscall ABI. > > Packing uapi structures (and adopting -malign-int) sounds easier than the > alternative, which might be to align certain internal kernel struct > members, on a case-by-case basis, where doing so could be shown to improve > performance on some architecture or other (while keeping -mno-align-int). indeed. > Well, it's easy to find all the structs that belong to the uapi, but it's > not easy to find all the internal kernel structs that describe MMIO > registers. For -malign-int, both kinds of structs are a problem. For structures under arch/m68k/include/asm/, just create a single C file that calculates sizeof() of each structure, and compare the generated code with and without -malign-int. Any differences should be investigated, and attributed when needed. For structures inside m68k-specific drivers, do something similar inside those drivers ('git grep "struct\s*[a-zA-Z0-9_]*\s*{"' is your friend). Most Amiga-specific drivers should be fine, as they were used on APUS (PowerPC) before. I guess the same is true for some of the Mac-specific drivers that are shared with PowerPC. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations 2025-09-16 10:10 ` Geert Uytterhoeven @ 2025-09-17 1:23 ` Finn Thain 0 siblings, 0 replies; 44+ messages in thread From: Finn Thain @ 2025-09-17 1:23 UTC (permalink / raw) To: Geert Uytterhoeven Cc: Arnd Bergmann, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, John Paul Adrian Glaubitz On Tue, 16 Sep 2025, Geert Uytterhoeven wrote: > On Tue, 16 Sept 2025 at 02:16, Finn Thain <fthain@linux-m68k.org> wrote: > > On Mon, 15 Sep 2025, Arnd Bergmann wrote: > > > > > > Has there been any progress on building m68k kernels with > > > -mint-align? > > > > Not that I know of. > > > > > IIRC there are only a small number of uapi structures that need > > > __packed annotations to maintain the existing syscall ABI. > > > > Packing uapi structures (and adopting -malign-int) sounds easier than > > the alternative, which might be to align certain internal kernel > > struct members, on a case-by-case basis, where doing so could be shown > > to improve performance on some architecture or other (while keeping > > -mno-align-int). > > indeed. > Well, it "sounds easier". But I doubt that it really is easier. > > Well, it's easy to find all the structs that belong to the uapi, but > > it's not easy to find all the internal kernel structs that describe > > MMIO registers. For -malign-int, both kinds of structs are a problem. > > For structures under arch/m68k/include/asm/, just create a single C file > that calculates sizeof() of each structure, and compare the generated > code with and without -malign-int. Any differences should be > investigated, and attributed when needed. > > For structures inside m68k-specific drivers, do something similar inside > those drivers ('git grep "struct\s*[a-zA-Z0-9_]*\s*{"' is your friend). > There's something to be said for adding static_assert() checks for the structs that belong to all fixed interfaces. The patches to actually pack of struct members could take place after that. All of which is a lot of work, compared to specifying alignment for those core kernel data structures where doing so improves performance. This approach is platform neutral; it's not just m68k that benefits. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations 2025-09-16 0:16 ` Finn Thain 2025-09-16 10:10 ` Geert Uytterhoeven @ 2025-09-16 12:37 ` Arnd Bergmann 2025-09-16 21:38 ` Brad Boyer 2025-09-17 2:14 ` Finn Thain 1 sibling, 2 replies; 44+ messages in thread From: Arnd Bergmann @ 2025-09-16 12:37 UTC (permalink / raw) To: Finn Thain Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, Geert Uytterhoeven, linux-m68k On Tue, Sep 16, 2025, at 02:16, Finn Thain wrote: > On Mon, 15 Sep 2025, Arnd Bergmann wrote: > >> IIRC there are only a small number of uapi structures that need >> __packed annotations to maintain the existing syscall ABI. >> > > Packing uapi structures (and adopting -malign-int) sounds easier than the > alternative, which might be to align certain internal kernel struct > members, on a case-by-case basis, where doing so could be shown to improve > performance on some architecture or other (while keeping -mno-align-int). > > Well, it's easy to find all the structs that belong to the uapi, but it's > not easy to find all the internal kernel structs that describe MMIO > registers. For -malign-int, both kinds of structs are a problem. Right, especially since those structure definitions are more likely to be used on older drivers, there are probably some that do happen on m68k. On the other hand, any driver that is portable to non-m68k targets won't have this problem. I tried a trivial m68k defconfig build with "-Wpadded -malign-int" and found 3021 instances of structure padding, compared to 2271 without -malign-int. That is still something one could very reasonably go through with 'vim -q output', as almost all of them are obviously kernel-internal structures and not problematic. A quick manual scan only shows a number of uapi structures but very few drivers. There are a few hardware structures that are internally packed but in a structure that has gets an extra two bytes of harmless padding at the end (struct CUSTOM, struct amiga_hw_present, struct atari_hw_present, struct TT_DMA, struct ppc_regs). The only ones I could find that actually seemed suspicious are: arch/m68k/include/asm/atarihw.h:426:10: warning: padding struct to align 'dst_address' [-Wpadded] arch/m68k/include/asm/openprom.h:76:13: warning: padding struct to align 'boot_dev_ctrl' [-Wpadded] drivers/net/ethernet/i825xx/82596.c:250:1: warning: padding struct size to alignment boundary I'm sure there are a few more, but probably not a lot more. See https://pastebin.com/Z2bjnD0G for the full list. I've also tried annotating the ones that show up in defconfig, see diffstat below. Unfortunately I found no way of annotating a struct a whole to use the traditional padding rules: "#pragma pack(push, 2)" lowers the alignment of all struct members to two bytes, including those with an explicit alignment like __aligned_u64. A struct-wide __attribute__((packed, aligned(2))) seems even worse, as it makes all members inside of the struck tightly packed and only aligns the struct itself. This means all misaligned members have to be individually marked as __packed, unless someone comes up with another type of annotation. > If better performance is to be had, my guess is that aligning atomic_t > will get 80% of it (just an appeal to the Pareto principle, FWIW...) arch/m68k selects CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS for anything other than Dragonball, so I would not expect much performance difference at all, unless CASL on unaligned data ends up causing alignment traps as it does on most architectures. Arnd arch/m68k/atari/atakeyb.c | 2 +- arch/m68k/include/asm/amigahw.h | 4 +-- arch/m68k/include/asm/atarihw.h | 6 ++-- arch/m68k/include/asm/mvme147hw.h | 2 +- arch/m68k/include/asm/openprom.h | 2 +- arch/m68k/include/uapi/asm/ptrace.h | 2 +- arch/m68k/include/uapi/asm/sigcontext.h | 2 +- arch/m68k/include/uapi/asm/stat.h | 4 +-- drivers/net/ethernet/i825xx/82596.c | 4 +-- include/uapi/asm-generic/termios.h | 2 +- include/uapi/linux/acct.h | 2 +- include/uapi/linux/ax25.h | 6 ++-- include/uapi/linux/blktrace_api.h | 2 +- include/uapi/linux/btrfs.h | 2 +- include/uapi/linux/cdrom.h | 36 +++++++++++----------- include/uapi/linux/dlm.h | 2 +- include/uapi/linux/dqblk_xfs.h | 2 +- include/uapi/linux/ethtool.h | 14 ++++----- include/uapi/linux/fb.h | 6 ++-- include/uapi/linux/fd.h | 6 ++-- include/uapi/linux/filter.h | 2 +- include/uapi/linux/hdlc/ioctl.h | 6 ++-- include/uapi/linux/hiddev.h | 2 +- include/uapi/linux/i2c-dev.h | 2 +- include/uapi/linux/i2c.h | 2 +- include/uapi/linux/if.h | 2 +- include/uapi/linux/if_arcnet.h | 4 +-- include/uapi/linux/if_bonding.h | 2 +- include/uapi/linux/if_bridge.h | 14 ++++----- include/uapi/linux/if_link.h | 2 +- include/uapi/linux/if_plip.h | 2 +- include/uapi/linux/if_pppox.h | 2 +- include/uapi/linux/if_vlan.h | 2 +- include/uapi/linux/inet_diag.h | 2 +- include/uapi/linux/input.h | 4 +-- include/uapi/linux/ip6_tunnel.h | 4 +-- include/uapi/linux/kd.h | 2 +- include/uapi/linux/llc.h | 2 +- include/uapi/linux/loop.h | 2 +- include/uapi/linux/mctp.h | 4 +-- include/uapi/linux/mptcp.h | 2 +- include/uapi/linux/mroute.h | 4 +-- include/uapi/linux/mroute6.h | 6 ++-- include/uapi/linux/msdos_fs.h | 2 +- include/uapi/linux/msg.h | 6 ++-- include/uapi/linux/mtio.h | 2 +- include/uapi/linux/netfilter/ipset/ip_set.h | 4 +-- include/uapi/linux/netfilter/nf_nat.h | 2 +- include/uapi/linux/netfilter/x_tables.h | 8 ++--- include/uapi/linux/netfilter/xt_HMARK.h | 2 +- include/uapi/linux/netfilter/xt_TPROXY.h | 4 +-- include/uapi/linux/netfilter/xt_connbytes.h | 2 +- include/uapi/linux/netfilter/xt_connmark.h | 6 ++-- include/uapi/linux/netfilter/xt_conntrack.h | 4 +-- include/uapi/linux/netfilter/xt_esp.h | 2 +- include/uapi/linux/netfilter/xt_hashlimit.h | 6 ++-- include/uapi/linux/netfilter/xt_helper.h | 2 +- include/uapi/linux/netfilter/xt_ipcomp.h | 2 +- include/uapi/linux/netfilter/xt_iprange.h | 2 +- include/uapi/linux/netfilter/xt_l2tp.h | 2 +- include/uapi/linux/netfilter/xt_mac.h | 2 +- include/uapi/linux/netfilter/xt_mark.h | 2 +- include/uapi/linux/netfilter/xt_owner.h | 2 +- include/uapi/linux/netfilter/xt_realm.h | 2 +- include/uapi/linux/netfilter/xt_recent.h | 4 +-- include/uapi/linux/netfilter/xt_set.h | 4 +-- include/uapi/linux/netfilter/xt_statistic.h | 2 +- include/uapi/linux/netfilter/xt_tcpmss.h | 2 +- include/uapi/linux/netfilter/xt_tcpudp.h | 2 +- include/uapi/linux/netfilter/xt_time.h | 2 +- include/uapi/linux/netfilter/xt_u32.h | 6 ++-- include/uapi/linux/netfilter_arp/arp_tables.h | 2 +- include/uapi/linux/netfilter_arp/arpt_mangle.h | 2 +- include/uapi/linux/netfilter_bridge/ebt_802_3.h | 2 +- include/uapi/linux/netfilter_bridge/ebt_arp.h | 2 +- include/uapi/linux/netfilter_bridge/ebt_arpreply.h | 2 +- include/uapi/linux/netfilter_bridge/ebt_mark_m.h | 2 +- include/uapi/linux/netfilter_bridge/ebt_nat.h | 2 +- include/uapi/linux/netfilter_bridge/ebt_stp.h | 4 +-- include/uapi/linux/netfilter_bridge/ebt_vlan.h | 2 +- include/uapi/linux/netfilter_ipv4/ipt_ah.h | 2 +- include/uapi/linux/netfilter_ipv6/ip6_tables.h | 2 +- include/uapi/linux/netfilter_ipv6/ip6t_ah.h | 2 +- include/uapi/linux/netfilter_ipv6/ip6t_frag.h | 2 +- include/uapi/linux/netfilter_ipv6/ip6t_opts.h | 2 +- include/uapi/linux/netfilter_ipv6/ip6t_rt.h | 2 +- include/uapi/linux/netfilter_ipv6/ip6t_srh.h | 2 +- include/uapi/linux/netrom.h | 2 +- include/uapi/linux/nfs_idmap.h | 2 +- include/uapi/linux/nfs_mount.h | 2 +- include/uapi/linux/pkt_cls.h | 2 +- include/uapi/linux/rds.h | 4 +-- include/uapi/linux/rose.h | 8 ++--- include/uapi/linux/route.h | 2 +- include/uapi/linux/rtc.h | 2 +- include/uapi/linux/sctp.h | 18 +++++------ include/uapi/linux/sed-opal.h | 2 +- include/uapi/linux/sem.h | 2 +- include/uapi/linux/serial.h | 4 +-- include/uapi/linux/soundcard.h | 8 ++--- include/uapi/linux/taskstats.h | 2 +- include/uapi/linux/virtio_net.h | 4 +-- include/uapi/linux/wireless.h | 4 +-- include/uapi/linux/xfrm.h | 26 ++++++++-------- include/uapi/mtd/mtd-abi.h | 2 +- include/uapi/rdma/hfi/hfi1_ioctl.h | 2 +- include/uapi/rdma/ib_user_ioctl_verbs.h | 2 +- include/uapi/rdma/ib_user_mad.h | 2 +- 109 files changed, 205 insertions(+), 204 deletions(-) ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations 2025-09-16 12:37 ` Arnd Bergmann @ 2025-09-16 21:38 ` Brad Boyer 2025-09-17 16:54 ` Andreas Schwab 2025-09-17 2:14 ` Finn Thain 1 sibling, 1 reply; 44+ messages in thread From: Brad Boyer @ 2025-09-16 21:38 UTC (permalink / raw) To: Arnd Bergmann Cc: Finn Thain, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, Geert Uytterhoeven, linux-m68k On Tue, Sep 16, 2025 at 02:37:21PM +0200, Arnd Bergmann wrote: > arch/m68k selects CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS for anything > other than Dragonball, so I would not expect much performance difference > at all, unless CASL on unaligned data ends up causing alignment traps > as it does on most architectures. I believe it depends on the exact CPU. The 68060 user manual has a section called "Emulating CAS2 and CAS Misaligned" discussing the handling of such instances through software emulation. The 68040 user manual doesn't have any similar section. I haven't looked at the exception handlers in the kernel to see if such cases are being handled. The documentation says it results in an unimplemented integer exception, and the handler has to manually lock the bus while performing normal memory operations. Brad Boyer flar@allandria.com ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations 2025-09-16 21:38 ` Brad Boyer @ 2025-09-17 16:54 ` Andreas Schwab 0 siblings, 0 replies; 44+ messages in thread From: Andreas Schwab @ 2025-09-17 16:54 UTC (permalink / raw) To: Brad Boyer Cc: Arnd Bergmann, Finn Thain, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, Geert Uytterhoeven, linux-m68k On Sep 16 2025, Brad Boyer wrote: > I believe it depends on the exact CPU. The 68060 user manual has a > section called "Emulating CAS2 and CAS Misaligned" discussing the > handling of such instances through software emulation. The 68040 > user manual doesn't have any similar section. The 68040 still handles them in hardware. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different." ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations 2025-09-16 12:37 ` Arnd Bergmann 2025-09-16 21:38 ` Brad Boyer @ 2025-09-17 2:14 ` Finn Thain 2025-09-22 15:49 ` Arnd Bergmann 1 sibling, 1 reply; 44+ messages in thread From: Finn Thain @ 2025-09-17 2:14 UTC (permalink / raw) To: Arnd Bergmann Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, Geert Uytterhoeven, linux-m68k On Tue, 16 Sep 2025, Arnd Bergmann wrote: > On Tue, Sep 16, 2025, at 02:16, Finn Thain wrote: > > > Packing uapi structures (and adopting -malign-int) sounds easier than > > the alternative, which might be to align certain internal kernel > > struct members, on a case-by-case basis, where doing so could be shown > > to improve performance on some architecture or other (while keeping > > -mno-align-int). > > > > Well, it's easy to find all the structs that belong to the uapi, but > > it's not easy to find all the internal kernel structs that describe > > MMIO registers. For -malign-int, both kinds of structs are a problem. > > Right, especially since those structure definitions are more likely to > be used on older drivers, there are probably some that do happen on > m68k. On the other hand, any driver that is portable to non-m68k targets > won't have this problem. > Yes, but only inasmuchas drivers are completely portable. Any data structure accessed or declared using #if conditional code would defeat that kind of reasoning. > I tried a trivial m68k defconfig build with "-Wpadded -malign-int" and > found 3021 instances of structure padding, compared to 2271 without > -malign-int. That is still something one could very reasonably go > through with 'vim -q output', as almost all of them are obviously > kernel-internal structures and not problematic. > So, 3021 is the number of structs potentially needing to be checked? This is not the kind of upper bound I consider feasible for careful manual editing (it's worse than I thought). We need the compiler to help. > A quick manual scan only shows a number of uapi structures but very few > drivers. There are a few hardware structures that are internally packed > but in a structure that has gets an extra two bytes of harmless padding > at the end (struct CUSTOM, struct amiga_hw_present, struct > atari_hw_present, struct TT_DMA, struct ppc_regs). I wouldn't assume that padding at the end is inconsequential. I think the analysis requires comparing object code. E.g. add the same padding explicitly to see whether it alters object code under -mno-align-int. > The only ones I could find that actually seemed suspicious are: > > arch/m68k/include/asm/atarihw.h:426:10: warning: padding struct to align > 'dst_address' [-Wpadded] arch/m68k/include/asm/openprom.h:76:13: > warning: padding struct to align 'boot_dev_ctrl' [-Wpadded] > drivers/net/ethernet/i825xx/82596.c:250:1: warning: padding struct size > to alignment boundary > > I'm sure there are a few more, but probably not a lot more. See > https://pastebin.com/Z2bjnD0G for the full list. I've also tried > annotating the ones that show up in defconfig, see diffstat below. > > Unfortunately I found no way of annotating a struct a whole to use the > traditional padding rules: "#pragma pack(push, 2)" lowers the alignment > of all struct members to two bytes, including those with an explicit > alignment like __aligned_u64. A struct-wide __attribute__((packed, > aligned(2))) seems even worse, as it makes all members inside of the > struck tightly packed and only aligns the struct itself. > > This means all misaligned members have to be individually marked as > __packed, unless someone comes up with another type of annotation. > Right. The trick will be to annotate the affected struct members by automatic program transformation. > > If better performance is to be had, my guess is that aligning atomic_t > > will get 80% of it (just an appeal to the Pareto principle, FWIW...) > > arch/m68k selects CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS for anything > other than Dragonball, so I would not expect much performance difference > at all, unless CASL on unaligned data ends up causing alignment traps as > it does on most architectures. > I think the answer to the performance question would depend on both choice of workload and choice of architecture. I haven't done any measurements on this patch series yet. > Arnd > > arch/m68k/atari/atakeyb.c | 2 +- > arch/m68k/include/asm/amigahw.h | 4 +-- > arch/m68k/include/asm/atarihw.h | 6 ++-- > arch/m68k/include/asm/mvme147hw.h | 2 +- > arch/m68k/include/asm/openprom.h | 2 +- > arch/m68k/include/uapi/asm/ptrace.h | 2 +- > arch/m68k/include/uapi/asm/sigcontext.h | 2 +- > arch/m68k/include/uapi/asm/stat.h | 4 +-- > drivers/net/ethernet/i825xx/82596.c | 4 +-- > include/uapi/asm-generic/termios.h | 2 +- > include/uapi/linux/acct.h | 2 +- > include/uapi/linux/ax25.h | 6 ++-- > include/uapi/linux/blktrace_api.h | 2 +- > include/uapi/linux/btrfs.h | 2 +- > include/uapi/linux/cdrom.h | 36 +++++++++++----------- > include/uapi/linux/dlm.h | 2 +- > include/uapi/linux/dqblk_xfs.h | 2 +- > include/uapi/linux/ethtool.h | 14 ++++----- > include/uapi/linux/fb.h | 6 ++-- > include/uapi/linux/fd.h | 6 ++-- > include/uapi/linux/filter.h | 2 +- > include/uapi/linux/hdlc/ioctl.h | 6 ++-- > include/uapi/linux/hiddev.h | 2 +- > include/uapi/linux/i2c-dev.h | 2 +- > include/uapi/linux/i2c.h | 2 +- > include/uapi/linux/if.h | 2 +- > include/uapi/linux/if_arcnet.h | 4 +-- > include/uapi/linux/if_bonding.h | 2 +- > include/uapi/linux/if_bridge.h | 14 ++++----- > include/uapi/linux/if_link.h | 2 +- > include/uapi/linux/if_plip.h | 2 +- > include/uapi/linux/if_pppox.h | 2 +- > include/uapi/linux/if_vlan.h | 2 +- > include/uapi/linux/inet_diag.h | 2 +- > include/uapi/linux/input.h | 4 +-- > include/uapi/linux/ip6_tunnel.h | 4 +-- > include/uapi/linux/kd.h | 2 +- > include/uapi/linux/llc.h | 2 +- > include/uapi/linux/loop.h | 2 +- > include/uapi/linux/mctp.h | 4 +-- > include/uapi/linux/mptcp.h | 2 +- > include/uapi/linux/mroute.h | 4 +-- > include/uapi/linux/mroute6.h | 6 ++-- > include/uapi/linux/msdos_fs.h | 2 +- > include/uapi/linux/msg.h | 6 ++-- > include/uapi/linux/mtio.h | 2 +- > include/uapi/linux/netfilter/ipset/ip_set.h | 4 +-- > include/uapi/linux/netfilter/nf_nat.h | 2 +- > include/uapi/linux/netfilter/x_tables.h | 8 ++--- > include/uapi/linux/netfilter/xt_HMARK.h | 2 +- > include/uapi/linux/netfilter/xt_TPROXY.h | 4 +-- > include/uapi/linux/netfilter/xt_connbytes.h | 2 +- > include/uapi/linux/netfilter/xt_connmark.h | 6 ++-- > include/uapi/linux/netfilter/xt_conntrack.h | 4 +-- > include/uapi/linux/netfilter/xt_esp.h | 2 +- > include/uapi/linux/netfilter/xt_hashlimit.h | 6 ++-- > include/uapi/linux/netfilter/xt_helper.h | 2 +- > include/uapi/linux/netfilter/xt_ipcomp.h | 2 +- > include/uapi/linux/netfilter/xt_iprange.h | 2 +- > include/uapi/linux/netfilter/xt_l2tp.h | 2 +- > include/uapi/linux/netfilter/xt_mac.h | 2 +- > include/uapi/linux/netfilter/xt_mark.h | 2 +- > include/uapi/linux/netfilter/xt_owner.h | 2 +- > include/uapi/linux/netfilter/xt_realm.h | 2 +- > include/uapi/linux/netfilter/xt_recent.h | 4 +-- > include/uapi/linux/netfilter/xt_set.h | 4 +-- > include/uapi/linux/netfilter/xt_statistic.h | 2 +- > include/uapi/linux/netfilter/xt_tcpmss.h | 2 +- > include/uapi/linux/netfilter/xt_tcpudp.h | 2 +- > include/uapi/linux/netfilter/xt_time.h | 2 +- > include/uapi/linux/netfilter/xt_u32.h | 6 ++-- > include/uapi/linux/netfilter_arp/arp_tables.h | 2 +- > include/uapi/linux/netfilter_arp/arpt_mangle.h | 2 +- > include/uapi/linux/netfilter_bridge/ebt_802_3.h | 2 +- > include/uapi/linux/netfilter_bridge/ebt_arp.h | 2 +- > include/uapi/linux/netfilter_bridge/ebt_arpreply.h | 2 +- > include/uapi/linux/netfilter_bridge/ebt_mark_m.h | 2 +- > include/uapi/linux/netfilter_bridge/ebt_nat.h | 2 +- > include/uapi/linux/netfilter_bridge/ebt_stp.h | 4 +-- > include/uapi/linux/netfilter_bridge/ebt_vlan.h | 2 +- > include/uapi/linux/netfilter_ipv4/ipt_ah.h | 2 +- > include/uapi/linux/netfilter_ipv6/ip6_tables.h | 2 +- > include/uapi/linux/netfilter_ipv6/ip6t_ah.h | 2 +- > include/uapi/linux/netfilter_ipv6/ip6t_frag.h | 2 +- > include/uapi/linux/netfilter_ipv6/ip6t_opts.h | 2 +- > include/uapi/linux/netfilter_ipv6/ip6t_rt.h | 2 +- > include/uapi/linux/netfilter_ipv6/ip6t_srh.h | 2 +- > include/uapi/linux/netrom.h | 2 +- > include/uapi/linux/nfs_idmap.h | 2 +- > include/uapi/linux/nfs_mount.h | 2 +- > include/uapi/linux/pkt_cls.h | 2 +- > include/uapi/linux/rds.h | 4 +-- > include/uapi/linux/rose.h | 8 ++--- > include/uapi/linux/route.h | 2 +- > include/uapi/linux/rtc.h | 2 +- > include/uapi/linux/sctp.h | 18 +++++------ > include/uapi/linux/sed-opal.h | 2 +- > include/uapi/linux/sem.h | 2 +- > include/uapi/linux/serial.h | 4 +-- > include/uapi/linux/soundcard.h | 8 ++--- > include/uapi/linux/taskstats.h | 2 +- > include/uapi/linux/virtio_net.h | 4 +-- > include/uapi/linux/wireless.h | 4 +-- > include/uapi/linux/xfrm.h | 26 ++++++++-------- > include/uapi/mtd/mtd-abi.h | 2 +- > include/uapi/rdma/hfi/hfi1_ioctl.h | 2 +- > include/uapi/rdma/ib_user_ioctl_verbs.h | 2 +- > include/uapi/rdma/ib_user_mad.h | 2 +- > 109 files changed, 205 insertions(+), 204 deletions(-) > Thanks for sending these results. I was looking into doing something similar using pahole but found that difficult. When I #include'd all the headers of interest, the thing doesn't build. I abandoned that approach and concluded that the way forward was to get the compiler to do the analysis (perhaps with a plug-in), during a normal kernel build, rather than during compilation of some contrived mess containing all the headers and all the structs. It would be sufficient to have a plug-in to list the sites of all members potentially needing annnotation. The list could be manually pared down and the actual annotating could be scripted. Note that this list would be shorter than a -Wpadded list, because there's a bunch of padding that's not relevant. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations 2025-09-17 2:14 ` Finn Thain @ 2025-09-22 15:49 ` Arnd Bergmann 2025-09-23 6:39 ` Finn Thain 0 siblings, 1 reply; 44+ messages in thread From: Arnd Bergmann @ 2025-09-22 15:49 UTC (permalink / raw) To: Finn Thain Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, Geert Uytterhoeven, linux-m68k On Wed, Sep 17, 2025, at 04:14, Finn Thain wrote: > On Tue, 16 Sep 2025, Arnd Bergmann wrote: >> On Tue, Sep 16, 2025, at 02:16, Finn Thain wrote: >> >> Right, especially since those structure definitions are more likely to >> be used on older drivers, there are probably some that do happen on >> m68k. On the other hand, any driver that is portable to non-m68k targets >> won't have this problem. >> > > Yes, but only inasmuchas drivers are completely portable. Any data > structure accessed or declared using #if conditional code would defeat > that kind of reasoning. Sure, but those are extremely rare. If there is a structure definition, it's usually done in a way that describes the hardware and is already densely packed. There are only a handful of compile-time checks for CONFIG_M68K and __m68k__ in the kernel, and none of those are around data structure definitions. >> I tried a trivial m68k defconfig build with "-Wpadded -malign-int" and >> found 3021 instances of structure padding, compared to 2271 without >> -malign-int. That is still something one could very reasonably go >> through with 'vim -q output', as almost all of them are obviously >> kernel-internal structures and not problematic. >> > > So, 3021 is the number of structs potentially needing to be checked? This > is not the kind of upper bound I consider feasible for careful manual > editing (it's worse than I thought). We need the compiler to help. It's the number of add pads, so there are some structures that show up many times in that list. There would be significantly more than those 3021 if you include drivers that are not in defconfig. An 'allmodconfig' build brings the number up to 17000, but it's easy to quickly bring that number down again by excluding stuff that is obviously specific to a random Arm SoC. I would also exclude all of fs/* and most of net/* as being clearly irrelevant here. In a given file, there are likely many instances either in the same struct or a set of related structs, where all can be discarded after looking at the pattern in that file. See the end of this email for a count by directory. I think just doing the uapi headers is probably enough to avoid accidentally introducing a new ABI when either the kernel or userspace gets built with -malign-int, everything inside of the kernel is at most a bug that can be fixed later. >> A quick manual scan only shows a number of uapi structures but very few >> drivers. There are a few hardware structures that are internally packed >> but in a structure that has gets an extra two bytes of harmless padding >> at the end (struct CUSTOM, struct amiga_hw_present, struct >> atari_hw_present, struct TT_DMA, struct pcc_regs). > > I wouldn't assume that padding at the end is inconsequential. I think the > analysis requires comparing object code. E.g. add the same padding > explicitly to see whether it alters object code under -mno-align-int. There is clearly a difference in uapi headers, especially when it gets passed through an ioctl that encodes the size of the structure in the command code. For the structures I listed above, I can see no way it would actually make a difference. >> The only ones I could find that actually seemed suspicious are: >> >> arch/m68k/include/asm/atarihw.h:426:10: warning: padding struct to align >> 'dst_address' [-Wpadded] arch/m68k/include/asm/openprom.h:76:13: >> warning: padding struct to align 'boot_dev_ctrl' [-Wpadded] >> drivers/net/ethernet/i825xx/82596.c:250:1: warning: padding struct size >> to alignment boundary >> >> I'm sure there are a few more, but probably not a lot more. See >> https://pastebin.com/Z2bjnD0G for the full list. I've also tried >> annotating the ones that show up in defconfig, see diffstat below. >> >> Unfortunately I found no way of annotating a struct a whole to use the >> traditional padding rules: "#pragma pack(push, 2)" lowers the alignment >> of all struct members to two bytes, including those with an explicit >> alignment like __aligned_u64. A struct-wide __attribute__((packed, >> aligned(2))) seems even worse, as it makes all members inside of the >> struck tightly packed and only aligns the struct itself. >> >> This means all misaligned members have to be individually marked as >> __packed, unless someone comes up with another type of annotation. >> > > Right. The trick will be to annotate the affected struct members by > automatic program transformation. I don't think automated transformation is going to work well here, as you may not want the same approach in each case, depending on what the code is: - it may be enough to annotate a single member as packed in order to make the entire structure compatible - some structures may have lots of misaligned members, but no holes, so a global __attribute__((packed, aligned(2))) on that struct is cleaner - if there are holes, some strategic additions of explicit padding can be cleaner than annotating each misaligned member. - automation won't be able to tell whether a structure is ABI relevant or not - similarly, many files are not going to be interesting for m68k. E.g. if drivers/infiniband has an ABI that is different for -malign-int, that can likely be ignored because nobody cares in practice. > It would be sufficient to have a plug-in to list the sites of all members > potentially needing annnotation. The list could be manually pared down and > the actual annotating could be scripted. Note that this list would be > shorter than a -Wpadded list, because there's a bunch of padding that's > not relevant. Yes, that should work to rule out many instances and to find out exactly where packing attributes are required. Arnd 2 arch/m68k/atari 5 arch/m68k/include/asm 39 block 1 block/fs 11 block/partitions 34 crypto 6 crypto/asymmetric_keys 10 drivers/accessibility/speakup 11 drivers/android 12 drivers/android/tests 15 drivers/ata 8 drivers/auxdisplay 11 drivers/base 5 drivers/base/firmware_loader 2 drivers/base/firmware_loader/builtin 10 drivers/base/regmap 2 drivers/base/test 2 drivers/bcma 18 drivers/block 2 drivers/block/aoe 7 drivers/block/drbd 4 drivers/block/null_blk 7 drivers/block/rnbd 2 drivers/block/zram 55 drivers/bluetooth 4 drivers/bus 2 drivers/bus/mhi/ep 4 drivers/bus/mhi/host 3 drivers/cdx 6 drivers/cdx/controller 3 drivers/char 8 drivers/char/hw_random 35 drivers/char/ipmi 11 drivers/char/tpm 1 drivers/char/tpm/eventlog 3 drivers/char/tpm/st33zp24 1 drivers/char/xillybus 48 drivers/clk 5 drivers/clk/actions 8 drivers/clk/bcm 8 drivers/clk/imx 6 drivers/clk/ingenic 11 drivers/clk/mediatek 24 drivers/clk/qcom 2 drivers/clk/ralink 21 drivers/clk/renesas 6 drivers/clk/samsung 1 drivers/clk/sifive 9 drivers/clk/socfpga 11 drivers/clk/sophgo 3 drivers/clk/sprd 2 drivers/clk/starfive 17 drivers/clk/sunxi-ng 1 drivers/clk/ti 1 drivers/clk/versatile 6 drivers/clk/visconti 2 drivers/clk/xilinx 43 drivers/clocksource 1 drivers/comedi 48 drivers/comedi/drivers 6 drivers/comedi/drivers/tests 9 drivers/counter 37 drivers/crypto 3 drivers/crypto/amlogic 4 drivers/crypto/aspeed 28 drivers/crypto/caam 17 drivers/crypto/ccree 3 drivers/crypto/hisilicon/sec 7 drivers/crypto/inside-secure 5 drivers/crypto/inside-secure/eip93 2 drivers/crypto/intel/ixp4xx 4 drivers/crypto/intel/keembay 3 drivers/crypto/marvell/cesa 7 drivers/crypto/qce 4 drivers/crypto/starfive 6 drivers/crypto/tegra 7 drivers/crypto/virtio 2 drivers/crypto/xilinx 4 drivers/dax 2 drivers/devfreq 1 drivers/devfreq/event 48 drivers/dma 3 drivers/dma-buf 1 drivers/dma-buf/heaps 7 drivers/dma/amd/qdma 4 drivers/dma/dw 3 drivers/dma/dw-axi-dmac 2 drivers/dma/lgm 3 drivers/dma/mediatek 5 drivers/dma/qcom 1 drivers/dma/sf-pdma 4 drivers/dma/sh 12 drivers/dma/stm32 14 drivers/dma/ti 27 drivers/dma/xilinx 5 drivers/dpll/zl3073x 14 drivers/extcon 11 drivers/firewire 8 drivers/firmware 35 drivers/firmware/arm_scmi 10 drivers/firmware/arm_scmi/transports 6 drivers/firmware/arm_scmi/vendors/imx 1 drivers/firmware/cirrus 2 drivers/firmware/google 9 drivers/firmware/imx 1 drivers/firmware/microchip 2 drivers/firmware/samsung 11 drivers/fpga 10 drivers/fsi 2 drivers/gnss 47 drivers/gpio 8 drivers/gpio/pinctrl 4 drivers/gpu/drm 12 drivers/gpu/drm/arm 15 drivers/gpu/drm/arm/display/komeda 1 drivers/gpu/drm/arm/display/komeda/d71 2 drivers/gpu/drm/atmel-hlcdc 40 drivers/gpu/drm/bridge 9 drivers/gpu/drm/bridge/adv7511 8 drivers/gpu/drm/bridge/analogix 9 drivers/gpu/drm/bridge/cadence 4 drivers/gpu/drm/bridge/imx 11 drivers/gpu/drm/bridge/synopsys 1 drivers/gpu/drm/clients 4 drivers/gpu/drm/display 5 drivers/gpu/drm/etnaviv 15 drivers/gpu/drm/exynos 3 drivers/gpu/drm/gud 3 drivers/gpu/drm/hisilicon/kirin 1 drivers/gpu/drm/imx/dc 11 drivers/gpu/drm/imx/dcss 2 drivers/gpu/drm/imx/ipuv3 6 drivers/gpu/drm/ingenic 9 drivers/gpu/drm/kmb 4 drivers/gpu/drm/lima 7 drivers/gpu/drm/logicvc 2 drivers/gpu/drm/mcde 39 drivers/gpu/drm/mediatek 17 drivers/gpu/drm/meson 2 drivers/gpu/drm/mxsfb 8 drivers/gpu/drm/omapdrm 41 drivers/gpu/drm/omapdrm/dss 41 drivers/gpu/drm/panel 2 drivers/gpu/drm/pl111 8 drivers/gpu/drm/renesas/rcar-du 2 drivers/gpu/drm/renesas/rz-du 22 drivers/gpu/drm/rockchip 1 drivers/gpu/drm/scheduler/tests 4 drivers/gpu/drm/sitronix 2 drivers/gpu/drm/solomon 5 drivers/gpu/drm/sprd 11 drivers/gpu/drm/sti 3 drivers/gpu/drm/stm 13 drivers/gpu/drm/sun4i 1 drivers/gpu/drm/sysfb 23 drivers/gpu/drm/tegra 13 drivers/gpu/drm/tests 1 drivers/gpu/drm/tests/sysfb 10 drivers/gpu/drm/tidss 3 drivers/gpu/drm/tiny 2 drivers/gpu/drm/ttm 4 drivers/gpu/drm/ttm/tests 3 drivers/gpu/drm/v3d 8 drivers/gpu/drm/virtio 3 drivers/gpu/drm/vkms 4 drivers/gpu/drm/vkms/tests 9 drivers/gpu/drm/xlnx 13 drivers/gpu/host1x 11 drivers/gpu/host1x/hw 13 drivers/gpu/ipu-v3 10 drivers/greybus 135 drivers/hid 4 drivers/hid/i2c-hid 5 drivers/hid/usbhid 1 drivers/hte 271 drivers/hwmon 5 drivers/hwmon/occ 4 drivers/hwmon/peci 19 drivers/hwmon/pmbus 8 drivers/hwtracing/intel_th 1 drivers/hwtracing/stm 9 drivers/i2c 110 drivers/i2c/busses 4 drivers/i2c/muxes 1 drivers/i2c/muxes/pinctrl 15 drivers/i3c/master 5 drivers/i3c/master/mipi-i3c-hci 2 drivers/iio 82 drivers/iio/accel 248 drivers/iio/adc 6 drivers/iio/addac 7 drivers/iio/amplifiers 2 drivers/iio/cdc 25 drivers/iio/chemical 1 drivers/iio/common/cros_ec_sensors 2 drivers/iio/common/ms_sensors 5 drivers/iio/common/ssp_sensors 80 drivers/iio/dac 2 drivers/iio/dummy 19 drivers/iio/frequency 19 drivers/iio/gyro 11 drivers/iio/humidity 2 drivers/iio/humidity/common/ms_sensors 23 drivers/iio/imu 3 drivers/iio/imu/bmi160 4 drivers/iio/imu/bmi270 7 drivers/iio/imu/bmi323 4 drivers/iio/imu/bno055 7 drivers/iio/imu/inv_icm42600 7 drivers/iio/imu/inv_mpu6050 19 drivers/iio/imu/st_lsm6dsx 79 drivers/iio/light 24 drivers/iio/magnetometer 3 drivers/iio/orientation 3 drivers/iio/position 11 drivers/iio/potentiometer 1 drivers/iio/potentiostat 34 drivers/iio/pressure 2 drivers/iio/pressure/common/ms_sensors 26 drivers/iio/proximity 9 drivers/iio/resolver 19 drivers/iio/temperature 2 drivers/iio/temperature/common/ms_sensors 2 drivers/iio/trigger 32 drivers/infiniband/core 11 drivers/infiniband/sw/siw 4 drivers/infiniband/ulp/ipoib 5 drivers/infiniband/ulp/iser 6 drivers/infiniband/ulp/isert 11 drivers/infiniband/ulp/rtrs 10 drivers/infiniband/ulp/srp 5 drivers/infiniband/ulp/srpt 8 drivers/input 1 drivers/input/gameport 29 drivers/input/joystick 2 drivers/input/joystick/iforce 43 drivers/input/keyboard 57 drivers/input/misc 42 drivers/input/mouse 40 drivers/input/rmi4 8 drivers/input/serio 5 drivers/input/tablet 119 drivers/input/touchscreen 2 drivers/interconnect 1 drivers/interconnect/imx 5 drivers/iommu 15 drivers/iommu/iommufd 2 drivers/ipack/devices 6 drivers/irqchip 9 drivers/isdn/hardware/mISDN 4 drivers/isdn/mISDN 56 drivers/leds 2 drivers/leds/blink 9 drivers/leds/flash 9 drivers/leds/rgb 5 drivers/leds/trigger 19 drivers/mailbox 97 drivers/md 20 drivers/md/bcache 9 drivers/md/persistent-data 9 drivers/media/cec/core 3 drivers/media/cec/i2c 3 drivers/media/cec/platform/cec-gpio 3 drivers/media/cec/platform/tegra 8 drivers/media/cec/usb/extron-da-hd-4k-plus 4 drivers/media/cec/usb/pulse8 1 drivers/media/cec/usb/rainshadow 4 drivers/media/common 5 drivers/media/common/b2c2 5 drivers/media/common/siano 1 drivers/media/common/v4l2-tpg 4 drivers/media/common/videobuf2 6 drivers/media/dvb-core 381 drivers/media/dvb-frontends 30 drivers/media/dvb-frontends/cxd2880 55 drivers/media/dvb-frontends/drx39xyj 3 drivers/media/firewire 215 drivers/media/i2c 2 drivers/media/i2c/adv748x 19 drivers/media/i2c/ccs 1 drivers/media/i2c/cx25840 1 drivers/media/i2c/et8ek8 3 drivers/media/i2c/s5c73m3 1 drivers/media/mc 6 drivers/media/platform/allegro-dvt 2 drivers/media/platform/amlogic/c3/isp 1 drivers/media/platform/amlogic/c3/mipi-adapter 1 drivers/media/platform/amlogic/meson-ge2d 8 drivers/media/platform/amphion 2 drivers/media/platform/aspeed 4 drivers/media/platform/atmel 2 drivers/media/platform/cadence 7 drivers/media/platform/chips-media/coda 25 drivers/media/platform/chips-media/wave5 3 drivers/media/platform/imagination 2 drivers/media/platform/intel 2 drivers/media/platform/marvell 4 drivers/media/platform/mediatek/jpeg 4 drivers/media/platform/mediatek/mdp 9 drivers/media/platform/mediatek/mdp3 5 drivers/media/platform/mediatek/vcodec/common/decoder 3 drivers/media/platform/mediatek/vcodec/common/encoder 5 drivers/media/platform/mediatek/vcodec/decoder 28 drivers/media/platform/mediatek/vcodec/decoder/vdec 4 drivers/media/platform/mediatek/vcodec/encoder 6 drivers/media/platform/mediatek/vcodec/encoder/venc 2 drivers/media/platform/mediatek/vpu 8 drivers/media/platform/microchip 1 drivers/media/platform/nuvoton 2 drivers/media/platform/nvidia/tegra-vde 11 drivers/media/platform/nxp 2 drivers/media/platform/nxp/dw100 5 drivers/media/platform/nxp/imx-jpeg 20 drivers/media/platform/qcom/camss 3 drivers/media/platform/qcom/iris 17 drivers/media/platform/qcom/venus 11 drivers/media/platform/renesas 4 drivers/media/platform/renesas/rcar-vin 8 drivers/media/platform/renesas/rzg2l-cru 18 drivers/media/platform/renesas/vsp1 1 drivers/media/platform/rockchip/rga 5 drivers/media/platform/rockchip/rkisp1 5 drivers/media/platform/rockchip/rkvdec 5 drivers/media/platform/samsung/exynos-gsc 25 drivers/media/platform/samsung/exynos4-is 3 drivers/media/platform/samsung/s5p-jpeg 25 drivers/media/platform/samsung/s5p-mfc 7 drivers/media/platform/st/sti/bdisp 1 drivers/media/platform/st/sti/c8sectpfe 5 drivers/media/platform/st/sti/delta 8 drivers/media/platform/st/sti/hva 5 drivers/media/platform/st/stm32 1 drivers/media/platform/st/stm32/dma2d 5 drivers/media/platform/st/stm32/stm32-dcmipp 2 drivers/media/platform/sunxi/sun4i-csi 3 drivers/media/platform/synopsys/hdmirx 1 drivers/media/platform/ti/am437x 2 drivers/media/platform/ti/cal 5 drivers/media/platform/ti/davinci 1 drivers/media/platform/ti/j721e-csi2rx 3 drivers/media/platform/ti/omap 31 drivers/media/platform/ti/omap3isp 6 drivers/media/platform/ti/vpe 10 drivers/media/platform/verisilicon 5 drivers/media/platform/xilinx 16 drivers/media/radio 2 drivers/media/radio/si470x 56 drivers/media/rc 5 drivers/media/rc/img-ir 4 drivers/media/spi 2 drivers/media/test-drivers 1 drivers/media/test-drivers/vicodec 28 drivers/media/test-drivers/vidtv 3 drivers/media/test-drivers/vimc 27 drivers/media/test-drivers/vivid 93 drivers/media/tuners 2 drivers/media/usb/as102 6 drivers/media/usb/au0828 1 drivers/media/usb/b2c2 28 drivers/media/usb/cx231xx 24 drivers/media/usb/dvb-usb 34 drivers/media/usb/dvb-usb-v2 22 drivers/media/usb/em28xx 5 drivers/media/usb/go7007 53 drivers/media/usb/gspca 3 drivers/media/usb/gspca/gl860 2 drivers/media/usb/gspca/m5602 4 drivers/media/usb/gspca/stv06xx 6 drivers/media/usb/hdpvr 1 drivers/media/usb/msi2500 7 drivers/media/usb/pvrusb2 7 drivers/media/usb/pwc 2 drivers/media/usb/s2255 1 drivers/media/usb/siano 1 drivers/media/usb/stk1160 28 drivers/media/usb/uvc 1 drivers/media/v4l2-core 12 drivers/memory 1 drivers/memory/samsung 6 drivers/memory/tegra 13 drivers/memstick/core 2 drivers/memstick/host 18 drivers/mfd 23 drivers/misc 2 drivers/misc/altera-stapl 1 drivers/misc/amd-sbi 7 drivers/misc/eeprom 2 drivers/misc/lis3lv02d 2 drivers/misc/lkdtm 8 drivers/mmc/core 97 drivers/mmc/host 7 drivers/most 6 drivers/mtd 1 drivers/mtd/chips 12 drivers/mtd/devices 1 drivers/mtd/maps 1 drivers/mtd/nand 34 drivers/mtd/nand/raw 3 drivers/mtd/nand/raw/atmel 2 drivers/mtd/nand/raw/brcmnand 4 drivers/mtd/nand/raw/gpmi-nand 2 drivers/mtd/nand/raw/ingenic 1 drivers/mtd/parsers 8 drivers/mtd/spi-nor 1 drivers/mtd/spi-nor/controllers 9 drivers/mtd/ubi 37 drivers/net 3 drivers/net/arcnet 1 drivers/net/bonding 3 drivers/net/caif 7 drivers/net/can 2 drivers/net/can/c_can 1 drivers/net/can/cc770 2 drivers/net/can/flexcan 7 drivers/net/can/m_can 3 drivers/net/can/rcar 2 drivers/net/can/slcan 1 drivers/net/can/softing 3 drivers/net/can/spi 12 drivers/net/can/spi/mcp251xfd 12 drivers/net/can/usb 2 drivers/net/can/usb/etas_es58x 6 drivers/net/can/usb/kvaser_usb 4 drivers/net/can/usb/peak_usb 14 drivers/net/dsa 13 drivers/net/dsa/b53 5 drivers/net/dsa/hirschmann 17 drivers/net/dsa/microchip 20 drivers/net/dsa/mv88e6xxx 1 drivers/net/dsa/ocelot 6 drivers/net/dsa/qca 6 drivers/net/dsa/realtek 13 drivers/net/dsa/sja1105 3 drivers/net/ethernet 1 drivers/net/ethernet/3com 10 drivers/net/ethernet/8390 3 drivers/net/ethernet/adi 5 drivers/net/ethernet/airoha 5 drivers/net/ethernet/amd 5 drivers/net/ethernet/amd/xgbe 18 drivers/net/ethernet/apm/xgene 2 drivers/net/ethernet/apm/xgene-v2 1 drivers/net/ethernet/arc 3 drivers/net/ethernet/asix 5 drivers/net/ethernet/atheros 14 drivers/net/ethernet/broadcom 5 drivers/net/ethernet/broadcom/asp2 11 drivers/net/ethernet/broadcom/genet 4 drivers/net/ethernet/cadence 2 drivers/net/ethernet/calxeda 3 drivers/net/ethernet/cirrus 3 drivers/net/ethernet/cortina 4 drivers/net/ethernet/davicom 11 drivers/net/ethernet/engleder 2 drivers/net/ethernet/faraday 16 drivers/net/ethernet/freescale 9 drivers/net/ethernet/freescale/enetc 58 drivers/net/ethernet/freescale/fman 3 drivers/net/ethernet/fujitsu 1 drivers/net/ethernet/hisilicon 16 drivers/net/ethernet/hisilicon/hns 1 drivers/net/ethernet/i825xx 10 drivers/net/ethernet/marvell 17 drivers/net/ethernet/marvell/mvpp2 52 drivers/net/ethernet/marvell/prestera 25 drivers/net/ethernet/mediatek 2 drivers/net/ethernet/mellanox/mlxbf_gige 4 drivers/net/ethernet/mellanox/mlxfw 51 drivers/net/ethernet/mellanox/mlxsw 1 drivers/net/ethernet/mellanox/mlxsw/mlxfw 7 drivers/net/ethernet/micrel 4 drivers/net/ethernet/microchip 23 drivers/net/ethernet/microchip/lan966x 30 drivers/net/ethernet/microchip/sparx5 26 drivers/net/ethernet/microchip/sparx5/lan969x 6 drivers/net/ethernet/microchip/vcap 22 drivers/net/ethernet/microchip/vcap/sparx5 6 drivers/net/ethernet/mscc 1 drivers/net/ethernet/ni 4 drivers/net/ethernet/qualcomm 4 drivers/net/ethernet/qualcomm/emac 4 drivers/net/ethernet/qualcomm/rmnet 14 drivers/net/ethernet/renesas 5 drivers/net/ethernet/samsung/sxgbe 3 drivers/net/ethernet/smsc 6 drivers/net/ethernet/socionext 40 drivers/net/ethernet/stmicro/stmmac 3 drivers/net/ethernet/sunplus 3 drivers/net/ethernet/synopsys 4 drivers/net/ethernet/vertexcom 5 drivers/net/ethernet/via 8 drivers/net/ethernet/wiznet 10 drivers/net/ethernet/xilinx 11 drivers/net/hamradio 34 drivers/net/ieee802154 32 drivers/net/ipa 6 drivers/net/ipa/data 1 drivers/net/ipvlan 8 drivers/net/mctp 4 drivers/net/mdio 21 drivers/net/netdevsim 5 drivers/net/ovpn 3 drivers/net/pcs 58 drivers/net/phy 14 drivers/net/phy/mscc 4 drivers/net/phy/qcom 1 drivers/net/phy/realtek 2 drivers/net/plip 6 drivers/net/ppp 3 drivers/net/pse-pd 2 drivers/net/slip 1 drivers/net/team 34 drivers/net/usb 7 drivers/net/vxlan 13 drivers/net/wan 2 drivers/net/wan/framer/pef2256 7 drivers/net/wireguard 2 drivers/net/wireguard/selftest 9 drivers/net/wireless/ath 148 drivers/net/wireless/ath/ath10k 173 drivers/net/wireless/ath/ath11k 61 drivers/net/wireless/ath/ath6kl 100 drivers/net/wireless/ath/ath9k 28 drivers/net/wireless/ath/carl9170 53 drivers/net/wireless/ath/wcn36xx 8 drivers/net/wireless/atmel 70 drivers/net/wireless/broadcom/b43 27 drivers/net/wireless/broadcom/b43legacy 74 drivers/net/wireless/broadcom/brcm80211/brcmfmac 44 drivers/net/wireless/broadcom/brcm80211/brcmfmac/bca 2 drivers/net/wireless/broadcom/brcm80211/brcmfmac/bca/include 45 drivers/net/wireless/broadcom/brcm80211/brcmfmac/cyw 2 drivers/net/wireless/broadcom/brcm80211/brcmfmac/cyw/include 3 drivers/net/wireless/broadcom/brcm80211/brcmfmac/include 44 drivers/net/wireless/broadcom/brcm80211/brcmfmac/wcc 2 drivers/net/wireless/broadcom/brcm80211/brcmfmac/wcc/include 39 drivers/net/wireless/broadcom/brcm80211/brcmsmac 1 drivers/net/wireless/broadcom/brcm80211/brcmsmac/include 64 drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy 3 drivers/net/wireless/broadcom/brcm80211/brcmutil/include 7 drivers/net/wireless/intersil/p54 22 drivers/net/wireless/marvell/libertas 9 drivers/net/wireless/marvell/libertas_tf 147 drivers/net/wireless/marvell/mwifiex 69 drivers/net/wireless/mediatek/mt76 59 drivers/net/wireless/mediatek/mt76/mt7615 61 drivers/net/wireless/mediatek/mt76/mt76x0 57 drivers/net/wireless/mediatek/mt76/mt76x2 50 drivers/net/wireless/mediatek/mt76/mt7921 51 drivers/net/wireless/mediatek/mt76/mt7925 10 drivers/net/wireless/mediatek/mt7601u 34 drivers/net/wireless/microchip/wilc1000 10 drivers/net/wireless/purelifi/plfxlc 14 drivers/net/wireless/ralink/rt2x00 5 drivers/net/wireless/realtek/rtl818x/rtl8187 25 drivers/net/wireless/realtek/rtl8xxxu 106 drivers/net/wireless/realtek/rtlwifi 59 drivers/net/wireless/realtek/rtlwifi/btcoexist 102 drivers/net/wireless/realtek/rtlwifi/rtl8192c 1 drivers/net/wireless/realtek/rtlwifi/rtl8192c/rtl8192ce 104 drivers/net/wireless/realtek/rtlwifi/rtl8192cu 2 drivers/net/wireless/realtek/rtlwifi/rtl8192cu/rtl8192c 1 drivers/net/wireless/realtek/rtlwifi/rtl8192cu/rtl8192ce 102 drivers/net/wireless/realtek/rtlwifi/rtl8192d 97 drivers/net/wireless/realtek/rtlwifi/rtl8192du 1 drivers/net/wireless/realtek/rtlwifi/rtl8192du/rtl8192d 90 drivers/net/wireless/realtek/rtw88 224 drivers/net/wireless/realtek/rtw89 33 drivers/net/wireless/rsi 11 drivers/net/wireless/silabs/wfx 31 drivers/net/wireless/st/cw1200 11 drivers/net/wireless/ti/wl1251 2 drivers/net/wireless/ti/wl12xx 36 drivers/net/wireless/ti/wl12xx/wlcore 4 drivers/net/wireless/ti/wl18xx 36 drivers/net/wireless/ti/wl18xx/wlcore 40 drivers/net/wireless/ti/wlcore 13 drivers/net/wireless/virtual 11 drivers/net/wireless/zydas/zd1211rw 2 drivers/net/wwan 15 drivers/nfc 3 drivers/nfc/fdp 3 drivers/nfc/nfcmrvl 1 drivers/nfc/nxp-nci 5 drivers/nfc/pn533 2 drivers/nfc/pn544 4 drivers/nfc/s3fwrn5 9 drivers/nfc/st-nci 6 drivers/nfc/st21nfca 5 drivers/nfc/st95hf 2 drivers/nvme/common 33 drivers/nvme/host 36 drivers/nvme/target 16 drivers/nvme/target/host 22 drivers/nvme/target/target 7 drivers/nvmem 3 drivers/of 3 drivers/opp 5 drivers/pci 2 drivers/pcmcia 1 drivers/phy 5 drivers/phy/allwinner 1 drivers/phy/amlogic 6 drivers/phy/broadcom 9 drivers/phy/cadence 5 drivers/phy/freescale 2 drivers/phy/marvell 8 drivers/phy/mediatek 3 drivers/phy/microchip 2 drivers/phy/motorola 2 drivers/phy/mscc 16 drivers/phy/qualcomm 1 drivers/phy/ralink 6 drivers/phy/realtek 2 drivers/phy/renesas 19 drivers/phy/rockchip 4 drivers/phy/samsung 6 drivers/phy/socionext 4 drivers/phy/st 1 drivers/phy/tegra 7 drivers/phy/ti 2 drivers/phy/xilinx 24 drivers/pinctrl 2 drivers/pinctrl/actions 8 drivers/pinctrl/bcm 3 drivers/pinctrl/berlin 1 drivers/pinctrl/cirrus 6 drivers/pinctrl/freescale 21 drivers/pinctrl/mediatek 1 drivers/pinctrl/meson 1 drivers/pinctrl/nomadik 2 drivers/pinctrl/nuvoton 1 drivers/pinctrl/pxa 9 drivers/pinctrl/qcom 7 drivers/pinctrl/renesas 6 drivers/pinctrl/samsung 5 drivers/pinctrl/sophgo 4 drivers/pinctrl/spacemit 1 drivers/pinctrl/sprd 1 drivers/pinctrl/starfive 5 drivers/pinctrl/stm32 2 drivers/pinctrl/ti 1 drivers/pinctrl/uniphier 1 drivers/pinctrl/visconti 2 drivers/platform/arm64 16 drivers/platform/chrome 7 drivers/platform/cznic 2 drivers/platform/mellanox 6 drivers/platform/olpc 3 drivers/pnp 3 drivers/pnp/isapnp 2 drivers/power/reset 1 drivers/power/sequencing 74 drivers/power/supply 1 drivers/powercap 1 drivers/pps/clients 13 drivers/ptp 21 drivers/pwm 43 drivers/regulator 7 drivers/remoteproc 5 drivers/reset 1 drivers/reset/amlogic 2 drivers/reset/sti 7 drivers/rpmsg 52 drivers/rtc 43 drivers/scsi 3 drivers/scsi/device_handler 4 drivers/scsi/hisi_sas 2 drivers/scsi/libfc 5 drivers/scsi/pcmcia 2 drivers/siox 3 drivers/slimbus 1 drivers/soc/amlogic 3 drivers/soc/aspeed 4 drivers/soc/fsl/qe 3 drivers/soc/loongson 12 drivers/soc/mediatek 1 drivers/soc/nuvoton 25 drivers/soc/qcom 2 drivers/soc/samsung 4 drivers/soc/sunxi 4 drivers/soundwire 122 drivers/spi 4 drivers/spmi 3 drivers/ssb 6 drivers/staging/fbtft 3 drivers/staging/gpib/agilent_82357a 1 drivers/staging/gpib/cb7210 1 drivers/staging/gpib/gpio 14 drivers/staging/gpib/include 1 drivers/staging/gpib/ines 3 drivers/staging/gpib/lpvo_usb_gpib 4 drivers/staging/gpib/ni_usb 34 drivers/staging/greybus 1 drivers/staging/iio/adc 1 drivers/staging/iio/addac 4 drivers/staging/iio/frequency 2 drivers/staging/iio/impedance-analyzer 7 drivers/staging/media/deprecated/atmel 10 drivers/staging/media/imx 2 drivers/staging/media/max96712 4 drivers/staging/media/meson/vdec 4 drivers/staging/media/tegra-video 11 drivers/staging/most/dim2 1 drivers/staging/most/i2c 2 drivers/staging/most/net 1 drivers/staging/most/video 2 drivers/staging/octeon 63 drivers/staging/rtl8723bs/hal 138 drivers/staging/rtl8723bs/include 1 drivers/staging/vc04_services/bcm2835-audio 4 drivers/staging/vc04_services/bcm2835-audio/interface/vchiq_arm 1 drivers/staging/vc04_services/bcm2835-camera 3 drivers/staging/vc04_services/bcm2835-camera/vchiq-mmal 4 drivers/staging/vc04_services/interface/vchiq_arm 3 drivers/staging/vc04_services/vchiq-mmal 4 drivers/staging/vc04_services/vchiq-mmal/interface/vchiq_arm 14 drivers/target 5 drivers/target/iscsi 1 drivers/target/loopback 3 drivers/target/sbp 2 drivers/target/tcm_fc 2 drivers/target/tcm_remote 1 drivers/tee 9 drivers/thermal 4 drivers/thermal/mediatek 4 drivers/thermal/qcom 1 drivers/thermal/renesas 2 drivers/thermal/samsung 11 drivers/thermal/tegra 2 drivers/thermal/ti-soc-thermal 16 drivers/tty 7 drivers/tty/ipwireless 59 drivers/tty/serial 16 drivers/tty/serial/8250 2 drivers/ufs/core 3 drivers/ufs/host 6 drivers/usb/atm 3 drivers/usb/c67x00 10 drivers/usb/cdns3 11 drivers/usb/cdns3/host 9 drivers/usb/chipidea 14 drivers/usb/chipidea/host 10 drivers/usb/class 1 drivers/usb/common 6 drivers/usb/core 1 drivers/usb/core/misc 35 drivers/usb/dwc2 19 drivers/usb/dwc3 1 drivers/usb/dwc3/host 12 drivers/usb/fotg210 3 drivers/usb/gadget 103 drivers/usb/gadget/function 6 drivers/usb/gadget/legacy 49 drivers/usb/gadget/udc 8 drivers/usb/gadget/udc/aspeed-vhub 4 drivers/usb/gadget/udc/bdc 75 drivers/usb/host 1 drivers/usb/image 6 drivers/usb/isp1760 30 drivers/usb/misc 10 drivers/usb/misc/sisusbvga 2 drivers/usb/mon 7 drivers/usb/mtu3 28 drivers/usb/musb 2 drivers/usb/phy 3 drivers/usb/renesas_usbhs 2 drivers/usb/roles 66 drivers/usb/serial 23 drivers/usb/storage 1 drivers/usb/storage/scsi 12 drivers/usb/typec 2 drivers/usb/typec/altmodes 4 drivers/usb/typec/mux 17 drivers/usb/typec/tcpm 2 drivers/usb/typec/tcpm/qcom 2 drivers/usb/typec/tipd 10 drivers/usb/usbip 3 drivers/vdpa/vdpa_sim 2 drivers/vfio 1 drivers/vfio/cdx 2 drivers/vfio/platform 2 drivers/vfio/platform/reset 12 drivers/vhost 20 drivers/video/backlight 25 drivers/video/fbdev 6 drivers/video/fbdev/aty 5 drivers/video/fbdev/core 1 drivers/virt/coco/guest 19 drivers/virtio 1 drivers/w1 6 drivers/w1/masters 4 drivers/w1/slaves 34 drivers/watchdog 18 fs 1 fs/9p 2 fs/adfs 1 fs/affs 33 fs/afs 1 fs/autofs 150 fs/bcachefs 74 fs/btrfs 47 fs/btrfs/tests 3 fs/cachefiles 24 fs/ceph 2 fs/coda 3 fs/configfs 5 fs/crypto 1 fs/debugfs 10 fs/dlm 9 fs/ecryptfs 1 fs/efs 17 fs/erofs 9 fs/exfat 18 fs/ext4 27 fs/f2fs 11 fs/fat 4 fs/freevxfs 15 fs/fuse 4 fs/gfs2 4 fs/hfs 4 fs/hfsplus 4 fs/hpfs 3 fs/iomap 3 fs/isofs 8 fs/jffs2 12 fs/jfs 1 fs/lockd 1 fs/minix 6 fs/mm 1 fs/netfs 22 fs/nfs 8 fs/nfs/blocklayout 6 fs/nfs/filelayout 8 fs/nfs/flexfilelayout 1 fs/nfs_common/nfs 48 fs/nfsd 1 fs/nilfs2 2 fs/notify/fanotify 17 fs/ntfs3 1 fs/ntfs3/lib 18 fs/ocfs2 10 fs/ocfs2/cluster 11 fs/ocfs2/dlm 4 fs/ocfs2/dlm/cluster 6 fs/ocfs2/dlmfs 6 fs/orangefs 18 fs/overlayfs 4 fs/proc 2 fs/pstore 1 fs/qnx6 4 fs/quota 69 fs/smb/client 2 fs/smb/client/common/smbdirect 24 fs/smb/server 19 fs/smb/server/mgmt 3 fs/smb/server/mgmt/mgmt 2 fs/squashfs 22 fs/ubifs 7 fs/udf 5 fs/ufs 1 fs/verity 18 fs/xfs 31 fs/xfs/libxfs 32 fs/xfs/scrub 3 fs/zonefs 10 include/acpi 2 include/asm-generic 1 include/clocksource 37 include/crypto 10 include/crypto/internal 59 include/drm 9 include/drm/bridge 39 include/drm/display 10 include/drm/ttm 8 include/keys 3 include/kunit 910 include/linux 5 include/linux/amba 14 include/linux/bcma 3 include/linux/can 3 include/linux/can/platform 9 include/linux/cdx 32 include/linux/ceph 4 include/linux/clk 3 include/linux/comedi 2 include/linux/crush 2 include/linux/device 5 include/linux/dma 5 include/linux/dsa 1 include/linux/extcon 5 include/linux/firmware/cirrus 1 include/linux/fpga 1 include/linux/framer 11 include/linux/fsl 7 include/linux/gpio 10 include/linux/greybus 3 include/linux/hsi 9 include/linux/i3c 11 include/linux/iio 1 include/linux/iio/accel 5 include/linux/iio/adc 1 include/linux/iio/afe 14 include/linux/iio/common 2 include/linux/iio/dac 5 include/linux/iio/frequency 4 include/linux/iio/imu 3 include/linux/input 8 include/linux/lockd 1 include/linux/mdio 44 include/linux/mfd 1 include/linux/mfd/abx500 10 include/linux/mfd/arizona 1 include/linux/mfd/da9052 2 include/linux/mfd/da9063 1 include/linux/mfd/da9150 2 include/linux/mfd/madera 1 include/linux/mfd/mt6358 5 include/linux/mfd/samsung 5 include/linux/mfd/wm831x 7 include/linux/mfd/wm8994 38 include/linux/mmc 40 include/linux/mtd 13 include/linux/netfilter 6 include/linux/netfilter/ipset 1 include/linux/netfilter_arp 4 include/linux/netfilter_bridge 1 include/linux/netfilter_ipv4 3 include/linux/phy 4 include/linux/pinctrl 87 include/linux/platform_data 1 include/linux/platform_data/txx9 1 include/linux/platform_data/x86 8 include/linux/power 3 include/linux/pse-pd 18 include/linux/regulator 1 include/linux/remoteproc 1 include/linux/reset 2 include/linux/rtc 4 include/linux/sched 9 include/linux/soc/mediatek 4 include/linux/soc/qcom 8 include/linux/soc/ti 13 include/linux/soundwire 26 include/linux/spi 16 include/linux/ssb 17 include/linux/sunrpc 82 include/linux/usb 69 include/media 2 include/media/davinci 9 include/media/drv-intf 24 include/media/i2c 7 include/media/tpg 7 include/memory 627 include/net 10 include/net/9p 90 include/net/bluetooth 5 include/net/caif 6 include/net/libeth 64 include/net/netfilter 15 include/net/netfilter/../net/bridge 15 include/net/netns 20 include/net/nfc 1 include/net/page_pool 1 include/net/phonet 40 include/net/sctp 36 include/net/tc_act 29 include/pcmcia 101 include/rdma 89 include/scsi 2 include/soc/at91 7 include/soc/fsl/qe 2 include/soc/microchip 17 include/soc/mscc 5 include/soc/tegra 162 include/sound 31 include/target 32 include/target/iscsi 1 include/tools/lib/bpf 6 include/trace/events/../fs/dlm 15 include/trace/events/../net/bridge 17 include/trace/events/../sound/soc/sof 1 include/uapi/asm-generic 2 include/uapi/drm 216 include/uapi/linux 1 include/uapi/linux/caif 2 include/uapi/linux/can 4 include/uapi/linux/dvb 3 include/uapi/linux/hdlc 58 include/uapi/linux/netfilter 2 include/uapi/linux/netfilter/ipset 3 include/uapi/linux/netfilter_arp 6 include/uapi/linux/netfilter_bridge 2 include/uapi/linux/netfilter_ipv4 5 include/uapi/linux/netfilter_ipv6 3 include/uapi/linux/tc_act 2 include/uapi/linux/usb 3 include/uapi/misc 1 include/uapi/mtd 2 include/uapi/rdma 1 include/uapi/rdma/hfi 1 include/uapi/scsi 11 include/uapi/sound 1 include/uapi/video 28 include/ufs 32 include/video 2 init 22 io_uring 1 io_uring/fs 1 io_uring/kernel/futex 2 ipc 23 kernel 35 kernel/bpf 4 kernel/bpf/cgroup 1 kernel/bpf/tools/lib/bpf 6 kernel/cgroup 1 kernel/dma 3 kernel/futex 2 kernel/irq 5 kernel/locking 6 kernel/mm 2 kernel/module 1 kernel/printk 4 kernel/rcu 13 kernel/sched 9 kernel/time 30 lib 5 lib/842 1 lib/crc/tests 1 lib/crypto 1 lib/crypto/tests 4 lib/kunit 6 lib/mm 14 lib/tests 5 lib/xz 3 lib/zlib_deflate 4 lib/zstd/compress 1 lib/zstd/decompress 32 mm 9 mm/damon 2 net/6lowpan 2 net/8021q 6 net/9p 7 net/atm 15 net/atm/bridge 1 net/ax25 41 net/batman-adv 28 net/bluetooth 4 net/bluetooth/bnep 4 net/bluetooth/hidp 1 net/bluetooth/rfcomm 1 net/bpf 31 net/bridge 15 net/bridge/netfilter 8 net/caif 3 net/can 3 net/can/j1939 5 net/ceph 16 net/core 8 net/devlink 8 net/dsa 16 net/ethtool 1 net/ethtool/core 1 net/handshake 4 net/hsr 1 net/ieee802154 23 net/ipv4 6 net/ipv6 15 net/ipv6/bridge 3 net/ipv6/ila 1 net/key 5 net/l2tp 116 net/mac80211 102 net/mac80211/tests 3 net/mac802154 4 net/mpls 15 net/mptcp 33 net/ncsi 44 net/netfilter 15 net/netfilter/bridge 11 net/netfilter/ipset 2 net/netfilter/ipvs 1 net/netlabel 3 net/netlink 11 net/nfc 6 net/nfc/hci 4 net/nfc/nci 11 net/openvswitch 1 net/packet 1 net/qrtr 19 net/rds 4 net/rfkill 38 net/rxrpc 53 net/sched 2 net/sctp 1 net/shaper/core 38 net/smc 5 net/sunrpc 1 net/sunrpc/auth_gss 37 net/tipc 4 net/tls 2 net/unix 4 net/vmw_vsock 24 net/wireless 15 net/wireless/tests 97 net/wireless/tests/mac80211 10 net/xfrm 6 samples/vfio-mdev 6 security/apparmor/include 3 security/integrity/evm 1 security/integrity/ima 1 security/keys 7 security/landlock 4 security/selinux 12 security/selinux/include 10 security/selinux/ss 36 security/tomoyo 1 security/yama 3 sound/core 3 sound/core/oss 8 sound/core/seq 9 sound/core/seq/oss 9 sound/drivers 1 sound/drivers/vx 9 sound/firewire 8 sound/firewire/bebob 7 sound/firewire/bebob/. 9 sound/firewire/dice 8 sound/firewire/digi00x 9 sound/firewire/fireface 9 sound/firewire/fireface/. 8 sound/firewire/fireworks 8 sound/firewire/fireworks/. 9 sound/firewire/motu 8 sound/firewire/motu/. 16 sound/firewire/oxfw 9 sound/firewire/oxfw/. 11 sound/firewire/tascam 11 sound/firewire/tascam/. 31 sound/hda/codecs 14 sound/hda/codecs/cirrus 18 sound/hda/codecs/common 7 sound/hda/codecs/hdmi 15 sound/hda/codecs/realtek 2 sound/hda/codecs/realtek/side-codecs 2 sound/hda/codecs/side-codecs 19 sound/hda/common 2 sound/hda/core 2 sound/pci/ac97 2 sound/pcmcia/pdaudiocf 1 sound/soc/adi 2 sound/soc/amd 18 sound/soc/amd/sof/amd 2 sound/soc/apple 5 sound/soc/atmel 1 sound/soc/atmel/codecs 2 sound/soc/bcm 335 sound/soc/codecs 3 sound/soc/codecs/aw88395 2 sound/soc/dwc 25 sound/soc/fsl 6 sound/soc/fsl/codecs 1 sound/soc/generic 1 sound/soc/hisilicon 4 sound/soc/img 3 sound/soc/intel/boards/codecs 3 sound/soc/intel/keembay 1 sound/soc/jz4740 5 sound/soc/mediatek/common 5 sound/soc/mediatek/mt8186 3 sound/soc/mediatek/mt8186/codecs 3 sound/soc/mediatek/mt8186/common 5 sound/soc/mediatek/mt8188 5 sound/soc/mediatek/mt8188/codecs 3 sound/soc/mediatek/mt8188/common 4 sound/soc/mediatek/mt8195 1 sound/soc/mediatek/mt8195/codecs 3 sound/soc/mediatek/mt8195/common 9 sound/soc/mediatek/mt8365 3 sound/soc/mediatek/mt8365/common 2 sound/soc/meson 10 sound/soc/qcom 1 sound/soc/qcom/codecs 17 sound/soc/qcom/qdsp6 2 sound/soc/renesas 1 sound/soc/renesas/rcar 4 sound/soc/rockchip 2 sound/soc/rockchip/codecs 3 sound/soc/samsung 12 sound/soc/samsung/codecs 19 sound/soc/sof 14 sound/soc/sof/imx 17 sound/soc/sof/intel 17 sound/soc/sof/mediatek 17 sound/soc/sof/mediatek/mt8186 17 sound/soc/sof/mediatek/mt8195 11 sound/soc/sof/xtensa 2 sound/soc/sprd 4 sound/soc/starfive 4 sound/soc/stm 5 sound/soc/sunxi 2 sound/soc/tegra 16 sound/soc/ti 2 sound/soc/uniphier 10 sound/soc/usb 4 sound/soc/xilinx 1 sound/spi 43 sound/usb 10 sound/usb/6fire 7 sound/usb/caiaq 5 sound/usb/hiface 3 sound/usb/line6 5 sound/usb/misc 44 sound/usb/qcom 5 sound/usb/usx2y 1 sound/virtio ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations 2025-09-22 15:49 ` Arnd Bergmann @ 2025-09-23 6:39 ` Finn Thain 0 siblings, 0 replies; 44+ messages in thread From: Finn Thain @ 2025-09-23 6:39 UTC (permalink / raw) To: Arnd Bergmann Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, Geert Uytterhoeven, linux-m68k On Mon, 22 Sep 2025, Arnd Bergmann wrote: > > I don't think automated transformation is going to work well here, as > you may not want the same approach in each case, depending on what the > code is: > > - it may be enough to annotate a single member as packed in order to > make the entire structure compatible Right, and that's the only transformation I mentioned. > - some structures may have lots of misaligned members, but no holes, so > a global __attribute__((packed, aligned(2))) on that struct is cleaner That simplification should be amenable to further automation, if the first transformation is. > - if there are holes, some strategic additions of explicit padding can > be cleaner than annotating each misaligned member. It's a matter of taste. > - automation won't be able to tell whether a structure is ABI relevant > or not Right. That's why I said the list of struct members would need to be "manually pared down". But even that task may be too large to be feasible. > - similarly, many files are not going to be interesting for m68k. E.g. > if drivers/infiniband has an ABI that is different for -malign-int, > that can likely be ignored because nobody cares in practice. > Right. That's why I advocated running the plug-in on a "normal build" not a contrived mass of #includes and structs. ^ permalink raw reply [flat|nested] 44+ messages in thread
* [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-14 0:45 [RFC v2 0/3] Align atomic storage Finn Thain 2025-09-14 0:45 ` [RFC v2 1/3] documentation: Discourage alignment assumptions Finn Thain 2025-09-14 0:45 ` [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations Finn Thain @ 2025-09-14 0:45 ` Finn Thain 2025-09-15 7:13 ` Geert Uytterhoeven ` (2 more replies) 2 siblings, 3 replies; 44+ messages in thread From: Finn Thain @ 2025-09-14 0:45 UTC (permalink / raw) To: Peter Zijlstra, Will Deacon Cc: Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel, linux-arch, Geert Uytterhoeven, linux-m68k, Lance Yang Some recent commits incorrectly assumed 4-byte alignment of locks. That assumption fails on Linux/m68k (and, interestingly, would have failed on Linux/cris also). Specify the minimum alignment of atomic variables for fewer surprises and (hopefully) better performance. Consistent with i386, atomic64_t is not given natural alignment here. Cc: Lance Yang <lance.yang@linux.dev> Link: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc1_0g@mail.gmail.com/ --- Changed since v1: - atomic64_t now gets an __aligned attribute too. - The 'Fixes' tag has been dropped because Lance sent a different fix for commit e711faaafbe5 ("hung_task: replace blocker_mutex with encoded blocker") that's suitable for -stable. --- include/asm-generic/atomic64.h | 2 +- include/linux/types.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/include/asm-generic/atomic64.h b/include/asm-generic/atomic64.h index 100d24b02e52..7ae82ac17645 100644 --- a/include/asm-generic/atomic64.h +++ b/include/asm-generic/atomic64.h @@ -10,7 +10,7 @@ #include <linux/types.h> typedef struct { - s64 counter; + s64 counter __aligned(sizeof(long)); } atomic64_t; #define ATOMIC64_INIT(i) { (i) } diff --git a/include/linux/types.h b/include/linux/types.h index 6dfdb8e8e4c3..cd5b2b0f4b02 100644 --- a/include/linux/types.h +++ b/include/linux/types.h @@ -179,7 +179,7 @@ typedef phys_addr_t resource_size_t; typedef unsigned long irq_hw_number_t; typedef struct { - int counter; + int counter __aligned(sizeof(int)); } atomic_t; #define ATOMIC_INIT(i) { (i) } -- 2.49.1 ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-14 0:45 ` [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t Finn Thain @ 2025-09-15 7:13 ` Geert Uytterhoeven 2025-09-15 7:35 ` Arnd Bergmann 2025-09-22 7:06 ` Geert Uytterhoeven 2 siblings, 0 replies; 44+ messages in thread From: Geert Uytterhoeven @ 2025-09-15 7:13 UTC (permalink / raw) To: Finn Thain Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel, linux-arch, linux-m68k, Lance Yang Hi Finn, Thanks for your patch! On Sun, 14 Sept 2025 at 02:59, Finn Thain <fthain@linux-m68k.org> wrote: > Some recent commits incorrectly assumed 4-byte alignment of locks. > That assumption fails on Linux/m68k (and, interestingly, would have > failed on Linux/cris also). Specify the minimum alignment of atomic > variables for fewer surprises and (hopefully) better performance. > > Consistent with i386, atomic64_t is not given natural alignment here. You forgot to drop this line. > > Cc: Lance Yang <lance.yang@linux.dev> > Link: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc1_0g@mail.gmail.com/ > --- > Changed since v1: > - atomic64_t now gets an __aligned attribute too. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-14 0:45 ` [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t Finn Thain 2025-09-15 7:13 ` Geert Uytterhoeven @ 2025-09-15 7:35 ` Arnd Bergmann 2025-09-15 8:06 ` Peter Zijlstra 2025-09-15 9:26 ` Finn Thain 2025-09-22 7:06 ` Geert Uytterhoeven 2 siblings, 2 replies; 44+ messages in thread From: Arnd Bergmann @ 2025-09-15 7:35 UTC (permalink / raw) To: Finn Thain, Peter Zijlstra, Will Deacon Cc: Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, Geert Uytterhoeven, linux-m68k, Lance Yang On Sun, Sep 14, 2025, at 02:45, Finn Thain wrote: > index 100d24b02e52..7ae82ac17645 100644 > --- a/include/asm-generic/atomic64.h > +++ b/include/asm-generic/atomic64.h > @@ -10,7 +10,7 @@ > #include <linux/types.h> > > typedef struct { > - s64 counter; > + s64 counter __aligned(sizeof(long)); > } atomic64_t; Why is this not aligned to 8 bytes? I checked all supported architectures and found that arc, csky, m68k, microblaze, openrisc, sh and x86-32 use a smaller alignment by default, but arc and x86-32 override it to 8 bytes already. x86 changed it back in 2009 with commit bbf2a330d92c ("x86: atomic64: The atomic64_t data type should be 8 bytes aligned on 32-bit too"), and arc uses the same one. Changing csky, m68k, microblaze, openrisc and sh to use the same alignment as all others is probably less risky in the long run in case anything relies on that the same way that code expects native alignment on atomic_t. Arnd ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-15 7:35 ` Arnd Bergmann @ 2025-09-15 8:06 ` Peter Zijlstra 2025-09-15 9:26 ` Finn Thain 1 sibling, 0 replies; 44+ messages in thread From: Peter Zijlstra @ 2025-09-15 8:06 UTC (permalink / raw) To: Arnd Bergmann Cc: Finn Thain, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, Geert Uytterhoeven, linux-m68k, Lance Yang On Mon, Sep 15, 2025 at 09:35:08AM +0200, Arnd Bergmann wrote: > On Sun, Sep 14, 2025, at 02:45, Finn Thain wrote: > > index 100d24b02e52..7ae82ac17645 100644 > > --- a/include/asm-generic/atomic64.h > > +++ b/include/asm-generic/atomic64.h > > @@ -10,7 +10,7 @@ > > #include <linux/types.h> > > > > typedef struct { > > - s64 counter; > > + s64 counter __aligned(sizeof(long)); > > } atomic64_t; > > Why is this not aligned to 8 bytes? Yeah, please use natural alignment for all atomic types. > I checked all supported architectures > and found that arc, csky, m68k, microblaze, openrisc, sh and x86-32 > use a smaller alignment by default, but arc and x86-32 override it > to 8 bytes already. x86 changed it back in 2009 with commit > bbf2a330d92c ("x86: atomic64: The atomic64_t data type should be 8 > bytes aligned on 32-bit too"), and arc uses the same one. Right, so x86 as a whole will accept unaligned data for its atomics, but at terrible overhead. Modern x86 grew (optional) trap on unaligned, and we've made sure the kernel is 'clean'. With that, the i586 cmpxchg8b instruction very much wants 8 byte alignment. Similarly, the x86_64 cmpxchg16b instruction wants 16 bytes alignment. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-15 7:35 ` Arnd Bergmann 2025-09-15 8:06 ` Peter Zijlstra @ 2025-09-15 9:26 ` Finn Thain 2025-09-15 9:29 ` Arnd Bergmann 1 sibling, 1 reply; 44+ messages in thread From: Finn Thain @ 2025-09-15 9:26 UTC (permalink / raw) To: Arnd Bergmann Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, Geert Uytterhoeven, linux-m68k, Lance Yang On Mon, 15 Sep 2025, Arnd Bergmann wrote: > On Sun, Sep 14, 2025, at 02:45, Finn Thain wrote: > > index 100d24b02e52..7ae82ac17645 100644 > > --- a/include/asm-generic/atomic64.h > > +++ b/include/asm-generic/atomic64.h > > @@ -10,7 +10,7 @@ > > #include <linux/types.h> > > > > typedef struct { > > - s64 counter; > > + s64 counter __aligned(sizeof(long)); > > } atomic64_t; > > Why is this not aligned to 8 bytes? I checked all supported > architectures and found that arc, csky, m68k, microblaze, openrisc, sh > and x86-32 use a smaller alignment by default, but arc and x86-32 > override it to 8 bytes already. x86 changed it back in 2009 with commit > bbf2a330d92c ("x86: atomic64: The atomic64_t data type should be 8 bytes > aligned on 32-bit too"), and arc uses the same one. > Right, I forgot to check includes in arch/x86/include. (I had assumed this definition was relevant to that architecture, hence the sizeof(long), in order to stick to native alignment on x86-32.) > Changing csky, m68k, microblaze, openrisc and sh to use the same > alignment as all others is probably less risky in the long run in case > anything relies on that the same way that code expects native alignment > on atomic_t. > By "native alignment", do you mean "natural alignment" here? ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-15 9:26 ` Finn Thain @ 2025-09-15 9:29 ` Arnd Bergmann 0 siblings, 0 replies; 44+ messages in thread From: Arnd Bergmann @ 2025-09-15 9:29 UTC (permalink / raw) To: Finn Thain Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, Geert Uytterhoeven, linux-m68k, Lance Yang On Mon, Sep 15, 2025, at 11:26, Finn Thain wrote: > On Mon, 15 Sep 2025, Arnd Bergmann wrote: >> Why is this not aligned to 8 bytes? I checked all supported >> architectures and found that arc, csky, m68k, microblaze, openrisc, sh >> and x86-32 use a smaller alignment by default, but arc and x86-32 >> override it to 8 bytes already. x86 changed it back in 2009 with commit >> bbf2a330d92c ("x86: atomic64: The atomic64_t data type should be 8 bytes >> aligned on 32-bit too"), and arc uses the same one. >> > > Right, I forgot to check includes in arch/x86/include. (I had assumed this > definition was relevant to that architecture, hence the sizeof(long), in > order to stick to native alignment on x86-32.) Ok >> Changing csky, m68k, microblaze, openrisc and sh to use the same >> alignment as all others is probably less risky in the long run in case >> anything relies on that the same way that code expects native alignment >> on atomic_t. >> > > By "native alignment", do you mean "natural alignment" here? Yes, that's what I meant. Arnd ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-14 0:45 ` [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t Finn Thain 2025-09-15 7:13 ` Geert Uytterhoeven 2025-09-15 7:35 ` Arnd Bergmann @ 2025-09-22 7:06 ` Geert Uytterhoeven 2025-09-22 8:16 ` Finn Thain 2 siblings, 1 reply; 44+ messages in thread From: Geert Uytterhoeven @ 2025-09-22 7:06 UTC (permalink / raw) To: Finn Thain Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel, linux-arch, linux-m68k, Lance Yang Hi Finn, On Sun, 14 Sept 2025 at 02:59, Finn Thain <fthain@linux-m68k.org> wrote: > Some recent commits incorrectly assumed 4-byte alignment of locks. > That assumption fails on Linux/m68k (and, interestingly, would have > failed on Linux/cris also). Specify the minimum alignment of atomic > variables for fewer surprises and (hopefully) better performance. > > Consistent with i386, atomic64_t is not given natural alignment here. > > Cc: Lance Yang <lance.yang@linux.dev> > Link: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc1_0g@mail.gmail.com/ Thanks for your patch! > --- a/include/asm-generic/atomic64.h > +++ b/include/asm-generic/atomic64.h > @@ -10,7 +10,7 @@ > #include <linux/types.h> > > typedef struct { > - s64 counter; > + s64 counter __aligned(sizeof(long)); > } atomic64_t; > > #define ATOMIC64_INIT(i) { (i) } > diff --git a/include/linux/types.h b/include/linux/types.h > index 6dfdb8e8e4c3..cd5b2b0f4b02 100644 > --- a/include/linux/types.h > +++ b/include/linux/types.h > @@ -179,7 +179,7 @@ typedef phys_addr_t resource_size_t; > typedef unsigned long irq_hw_number_t; > > typedef struct { > - int counter; > + int counter __aligned(sizeof(int)); > } atomic_t; > > #define ATOMIC_INIT(i) { (i) } This triggers a failure in kernel/bpf/rqspinlock.c: kernel/bpf/rqspinlock.c: In function ‘bpf_res_spin_lock’: include/linux/compiler_types.h:572:45: error: call to ‘__compiletime_assert_397’ declared with attribute error: BUILD_BUG_ON failed: __alignof__(rqspinlock_t) != __alignof__(struct bpf_res_spin_lock) 572 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^ include/linux/compiler_types.h:553:25: note: in definition of macro ‘__compiletime_assert’ 553 | prefix ## suffix(); \ | ^~~~~~ include/linux/compiler_types.h:572:9: note: in expansion of macro ‘_compiletime_assert’ 572 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^~~~~~~~~~~~~~~~~~~ include/linux/build_bug.h:39:37: note: in expansion of macro ‘compiletime_assert’ 39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) | ^~~~~~~~~~~~~~~~~~ include/linux/build_bug.h:50:9: note: in expansion of macro ‘BUILD_BUG_ON_MSG’ 50 | BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) | ^~~~~~~~~~~~~~~~ kernel/bpf/rqspinlock.c:695:9: note: in expansion of macro ‘BUILD_BUG_ON’ 695 | BUILD_BUG_ON(__alignof__(rqspinlock_t) != __alignof__(struct bpf_res_spin_lock)); | ^~~~~~~~~~~~ I haven't investigated it yet. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-22 7:06 ` Geert Uytterhoeven @ 2025-09-22 8:16 ` Finn Thain 2025-09-22 9:29 ` Geert Uytterhoeven 2025-09-22 15:21 ` Arnd Bergmann 0 siblings, 2 replies; 44+ messages in thread From: Finn Thain @ 2025-09-22 8:16 UTC (permalink / raw) To: Geert Uytterhoeven Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel, linux-arch, linux-m68k, Lance Yang [-- Attachment #1: Type: text/plain, Size: 1851 bytes --] On Mon, 22 Sep 2025, Geert Uytterhoeven wrote: > > This triggers a failure in kernel/bpf/rqspinlock.c: > > kernel/bpf/rqspinlock.c: In function ‘bpf_res_spin_lock’: > include/linux/compiler_types.h:572:45: error: call to > ‘__compiletime_assert_397’ declared with attribute error: BUILD_BUG_ON > failed: __alignof__(rqspinlock_t) != __alignof__(struct > bpf_res_spin_lock) > 572 | _compiletime_assert(condition, msg, > __compiletime_assert_, __COUNTER__) > | ^ > include/linux/compiler_types.h:553:25: note: in definition of macro > ‘__compiletime_assert’ > 553 | prefix ## suffix(); > \ > | ^~~~~~ > include/linux/compiler_types.h:572:9: note: in expansion of macro > ‘_compiletime_assert’ > 572 | _compiletime_assert(condition, msg, > __compiletime_assert_, __COUNTER__) > | ^~~~~~~~~~~~~~~~~~~ > include/linux/build_bug.h:39:37: note: in expansion of macro > ‘compiletime_assert’ > 39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) > | ^~~~~~~~~~~~~~~~~~ > include/linux/build_bug.h:50:9: note: in expansion of macro ‘BUILD_BUG_ON_MSG’ > 50 | BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) > | ^~~~~~~~~~~~~~~~ > kernel/bpf/rqspinlock.c:695:9: note: in expansion of macro ‘BUILD_BUG_ON’ > 695 | BUILD_BUG_ON(__alignof__(rqspinlock_t) != > __alignof__(struct bpf_res_spin_lock)); > | ^~~~~~~~~~~~ > > I haven't investigated it yet. > Yes, I noticed that also, after I started building with defconfigs. I pushed a new patch to my github repo. https://github.com/fthain/linux/tree/atomic_t ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-22 8:16 ` Finn Thain @ 2025-09-22 9:29 ` Geert Uytterhoeven 2025-09-22 15:21 ` Arnd Bergmann 1 sibling, 0 replies; 44+ messages in thread From: Geert Uytterhoeven @ 2025-09-22 9:29 UTC (permalink / raw) To: Finn Thain Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel, linux-arch, linux-m68k, Lance Yang Hi Finn, On Mon, 22 Sept 2025 at 10:16, Finn Thain <fthain@linux-m68k.org> wrote: > On Mon, 22 Sep 2025, Geert Uytterhoeven wrote: > > This triggers a failure in kernel/bpf/rqspinlock.c: > > > > kernel/bpf/rqspinlock.c: In function ‘bpf_res_spin_lock’: > > include/linux/compiler_types.h:572:45: error: call to > > ‘__compiletime_assert_397’ declared with attribute error: BUILD_BUG_ON > > failed: __alignof__(rqspinlock_t) != __alignof__(struct > > bpf_res_spin_lock) > > 572 | _compiletime_assert(condition, msg, > > __compiletime_assert_, __COUNTER__) > > | ^ > > include/linux/compiler_types.h:553:25: note: in definition of macro > > ‘__compiletime_assert’ > > 553 | prefix ## suffix(); > > \ > > | ^~~~~~ > > include/linux/compiler_types.h:572:9: note: in expansion of macro > > ‘_compiletime_assert’ > > 572 | _compiletime_assert(condition, msg, > > __compiletime_assert_, __COUNTER__) > > | ^~~~~~~~~~~~~~~~~~~ > > include/linux/build_bug.h:39:37: note: in expansion of macro > > ‘compiletime_assert’ > > 39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) > > | ^~~~~~~~~~~~~~~~~~ > > include/linux/build_bug.h:50:9: note: in expansion of macro ‘BUILD_BUG_ON_MSG’ > > 50 | BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) > > | ^~~~~~~~~~~~~~~~ > > kernel/bpf/rqspinlock.c:695:9: note: in expansion of macro ‘BUILD_BUG_ON’ > > 695 | BUILD_BUG_ON(__alignof__(rqspinlock_t) != > > __alignof__(struct bpf_res_spin_lock)); > > | ^~~~~~~~~~~~ > > > > I haven't investigated it yet. > > Yes, I noticed that also, after I started building with defconfigs. > I pushed a new patch to my github repo. > > https://github.com/fthain/linux/tree/atomic_t Thanks, the updated version fixes the issue. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-22 8:16 ` Finn Thain 2025-09-22 9:29 ` Geert Uytterhoeven @ 2025-09-22 15:21 ` Arnd Bergmann 2025-09-23 6:28 ` Finn Thain 1 sibling, 1 reply; 44+ messages in thread From: Arnd Bergmann @ 2025-09-22 15:21 UTC (permalink / raw) To: Finn Thain, Geert Uytterhoeven Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang On Mon, Sep 22, 2025, at 10:16, Finn Thain wrote: > > Yes, I noticed that also, after I started building with defconfigs. > I pushed a new patch to my github repo. > > https://github.com/fthain/linux/tree/atomic_t I had a look at that repository, and I think the extra annotations you added on a lot of structures have the same mistake that I made when looking at annotating ABI structures for -malign-int earlier: Adding __aligned__((sizeof(long))) to a structure definition only changes the /outer/ alignment of the structure itself, so e.g. struct { short a; int b; } __attribute__((aligned(sizeof(long)))); creates a structure with 4-byte alignment and 8-byte size when building with -mno-align-int, but the padding is added after 'b', which is still misaligned. In order to align members the same way that -malign-int does, each unaligned member needs a separate attribute to force aligning it, like struct { short a; int b __attribute__((aligned(sizeof(int)))); }; This is different from how __attribute__((packed)) works, as that is always applied to each member of the struct if you add it to the end. Arnd ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-22 15:21 ` Arnd Bergmann @ 2025-09-23 6:28 ` Finn Thain 2025-09-23 6:41 ` Arnd Bergmann 2025-09-30 2:18 ` Finn Thain 0 siblings, 2 replies; 44+ messages in thread From: Finn Thain @ 2025-09-23 6:28 UTC (permalink / raw) To: Arnd Bergmann Cc: Geert Uytterhoeven, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang On Mon, 22 Sep 2025, Arnd Bergmann wrote: > On Mon, Sep 22, 2025, at 10:16, Finn Thain wrote: > > > > Yes, I noticed that also, after I started building with defconfigs. > > I pushed a new patch to my github repo. > > > > https://github.com/fthain/linux/tree/atomic_t > > I had a look at that repository, and I think the extra annotations you > added on a lot of structures have the same mistake that I made when > looking at annotating ABI structures for -malign-int earlier: > > Adding __aligned__((sizeof(long))) to a structure definition only > changes the /outer/ alignment of the structure itself, Right. I should have dropped those changes earlier. @@ -8,7 +8,7 @@ struct vfsmount; struct path { struct vfsmount *mnt; struct dentry *dentry; -} __randomize_layout; +} __aligned(sizeof(long)) __randomize_layout; There's no need: struct path contains a struct dentry, which contains a seqcount_spinlock_t, which contains a spinlock_t which contains an atomic_t member, which is explicitly aligned. Despite that, there's still some kmem cache or other allocator somewhere that has produced some misaligned path and dentry structures. So we get misaligned atomics somewhere in the VFS and TTY layers. I was unable to find those allocations. > so e.g. > > struct { > short a; > int b; > } __attribute__((aligned(sizeof(long)))); > > creates a structure with 4-byte alignment and 8-byte > size when building with -mno-align-int, but the padding > is added after 'b', which is still misaligned. > Sure but -mno-align-int isn't particularly relevant here as it's only for m68k. Besides, I'd like to avoid the mess that would create. The basic aim here is to naturally align both atomic_t and atomic64_t on all architectures (my testing is confined to m68k) and to try to find out what that would cost and what benefit it might bring. The patches you're reviewing were a hack to try to silence the WARN from CONFIG_DEBUG_ATOMIC, because that way I could try to assess the cost/benefit. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-23 6:28 ` Finn Thain @ 2025-09-23 6:41 ` Arnd Bergmann 2025-09-23 8:05 ` Finn Thain 2025-09-30 2:18 ` Finn Thain 1 sibling, 1 reply; 44+ messages in thread From: Arnd Bergmann @ 2025-09-23 6:41 UTC (permalink / raw) To: Finn Thain Cc: Geert Uytterhoeven, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang On Tue, Sep 23, 2025, at 08:28, Finn Thain wrote: > On Mon, 22 Sep 2025, Arnd Bergmann wrote: >> On Mon, Sep 22, 2025, at 10:16, Finn Thain wrote: > @@ -8,7 +8,7 @@ struct vfsmount; > struct path { > struct vfsmount *mnt; > struct dentry *dentry; > -} __randomize_layout; > +} __aligned(sizeof(long)) __randomize_layout; > > There's no need: struct path contains a struct dentry, which contains a > seqcount_spinlock_t, which contains a spinlock_t which contains an > atomic_t member, which is explicitly aligned. > > Despite that, there's still some kmem cache or other allocator somewhere > that has produced some misaligned path and dentry structures. So we get > misaligned atomics somewhere in the VFS and TTY layers. I was unable to > find those allocations. Ok, I see. Those would certainly be good to find. I would assume that all kmem caches have a sensible alignment on each architecture, but I think the definition in linux/slab.h actually ends up setting the minimum to 2 here: #ifndef ARCH_KMALLOC_MINALIGN #define ARCH_KMALLOC_MINALIGN __alignof__(unsigned long long) #elif ARCH_KMALLOC_MINALIGN > 8 #define KMALLOC_MIN_SIZE ARCH_KMALLOC_MINALIGN #define KMALLOC_SHIFT_LOW ilog2(KMALLOC_MIN_SIZE) #endif #ifndef ARCH_SLAB_MINALIGN #define ARCH_SLAB_MINALIGN __alignof__(unsigned long long) #endif Maybe we should just change __alignof__(unsigned long long) to a plain '8' here and make that the minimum alignment everywhere, same as the atomic64_t alignment change. Alternatively, we can keep the __alignof__ here in order to reduce padding on architectures with a default 4-byte alignment for __u64, but then override ARCH_SLAB_MINALIGN and ARCH_KMALLOC_MINALIGN on m68k to be '4' instead of '2'. Arnd ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-23 6:41 ` Arnd Bergmann @ 2025-09-23 8:05 ` Finn Thain 2025-09-23 19:11 ` Arnd Bergmann 0 siblings, 1 reply; 44+ messages in thread From: Finn Thain @ 2025-09-23 8:05 UTC (permalink / raw) To: Arnd Bergmann Cc: Geert Uytterhoeven, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang On Tue, 23 Sep 2025, Arnd Bergmann wrote: > On Tue, Sep 23, 2025, at 08:28, Finn Thain wrote: > > On Mon, 22 Sep 2025, Arnd Bergmann wrote: > >> On Mon, Sep 22, 2025, at 10:16, Finn Thain wrote: > > @@ -8,7 +8,7 @@ struct vfsmount; > > struct path { > > struct vfsmount *mnt; > > struct dentry *dentry; > > -} __randomize_layout; > > +} __aligned(sizeof(long)) __randomize_layout; > > > > There's no need: struct path contains a struct dentry, which contains a > > seqcount_spinlock_t, which contains a spinlock_t which contains an > > atomic_t member, which is explicitly aligned. > > > > Despite that, there's still some kmem cache or other allocator somewhere > > that has produced some misaligned path and dentry structures. So we get > > misaligned atomics somewhere in the VFS and TTY layers. I was unable to > > find those allocations. > > Ok, I see. Those would certainly be good to find. I would assume that > all kmem caches have a sensible alignment on each architecture, but I > think the definition in linux/slab.h actually ends up setting the > minimum to 2 here: > > #ifndef ARCH_KMALLOC_MINALIGN > #define ARCH_KMALLOC_MINALIGN __alignof__(unsigned long long) > #elif ARCH_KMALLOC_MINALIGN > 8 > #define KMALLOC_MIN_SIZE ARCH_KMALLOC_MINALIGN > #define KMALLOC_SHIFT_LOW ilog2(KMALLOC_MIN_SIZE) > #endif > > #ifndef ARCH_SLAB_MINALIGN > #define ARCH_SLAB_MINALIGN __alignof__(unsigned long long) > #endif > > Maybe we should just change __alignof__(unsigned long long) to a plain > '8' here and make that the minimum alignment everywhere, same as the > atomic64_t alignment change. > It would be better (less wasteful of memory) to specify the alignment parameter to kmem_cache_create() only at those call sites where it matters. > Alternatively, we can keep the __alignof__ here in order to reduce > padding on architectures with a default 4-byte alignment for __u64, but > then override ARCH_SLAB_MINALIGN and ARCH_KMALLOC_MINALIGN on m68k to be > '4' instead of '2'. > Raising that to 4 would probably have little or no effect (for m68k or any other arch). Back when I prepared the RFC patch series, I instrumented create_cache() in mm/slab_common.c, and those caches that were allocated at boot (for my usual minimal m68k .config) were already aligned to 4 bytes or 16. Also, increasing ARCH_SLAB_MINALIGN to 8 didn't solve the VFS/TTY layer problem I have with CONFIG_DEBUG_ATOMIC on m68k. So the culprit is not the obvious suspect (a kmem cache of objects with atomic64_t members). There's some other allocator at work and it's aligning objects to 2 bytes not 4. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-23 8:05 ` Finn Thain @ 2025-09-23 19:11 ` Arnd Bergmann 0 siblings, 0 replies; 44+ messages in thread From: Arnd Bergmann @ 2025-09-23 19:11 UTC (permalink / raw) To: Finn Thain Cc: Geert Uytterhoeven, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang On Tue, Sep 23, 2025, at 10:05, Finn Thain wrote: > On Tue, 23 Sep 2025, Arnd Bergmann wrote: >> On Tue, Sep 23, 2025, at 08:28, Finn Thain wrote: >> >> Maybe we should just change __alignof__(unsigned long long) to a plain >> '8' here and make that the minimum alignment everywhere, same as the >> atomic64_t alignment change. >> > > It would be better (less wasteful of memory) to specify the alignment > parameter to kmem_cache_create() only at those call sites where it > matters. I think that's the idea: __alignof__(unsigned long long) is the alignment that the ABI requires for allocating arbitrary data structures that may contain 64-bit integers, so this is already the minimum. Architectures that may have noncoherent DMA master devices often need to raise the ARCH_KMALLOC_MINALIGN to the cache line size but leave the ARCH_SLAB_MINALIGN at the ABI-required minimum value. Any caller that needs a higher alignment than the ABI minimum needs to specify that. >> Alternatively, we can keep the __alignof__ here in order to reduce >> padding on architectures with a default 4-byte alignment for __u64, but >> then override ARCH_SLAB_MINALIGN and ARCH_KMALLOC_MINALIGN on m68k to be >> '4' instead of '2'. >> > > Raising that to 4 would probably have little or no effect (for m68k or any > other arch). Back when I prepared the RFC patch series, I instrumented > create_cache() in mm/slab_common.c, and those caches that were allocated > at boot (for my usual minimal m68k .config) were already aligned to 4 > bytes or 16. Ok > Also, increasing ARCH_SLAB_MINALIGN to 8 didn't solve the VFS/TTY layer > problem I have with CONFIG_DEBUG_ATOMIC on m68k. So the culprit is not the > obvious suspect (a kmem cache of objects with atomic64_t members). There's > some other allocator at work and it's aligning objects to 2 bytes not 4. In that case, my guess would be something that concatenates structures in some variant of struct s1 { int a; short b; int rest[]; }; struct s2 { atomic_t a; }; struct s1 *obj1 = kmalloc(sizeof(struct s1) + sizeof(struct s2), GFP_KERNEL); struct s2 *obj2 = (void*)&obj1[1]; which causes problems because s2 has a larger alignment than s1. We've had problems with DMA to structures like this in the past. The more common construct that does struct s1 { short a; struct s2; }; should not produce misaligned members in the inner struct, unless the outer one has a __packed attribute. Unfortunately we disabled -Wpacked-not-aligned, which would otherwise catch that problem. Arnd ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-23 6:28 ` Finn Thain 2025-09-23 6:41 ` Arnd Bergmann @ 2025-09-30 2:18 ` Finn Thain 2025-09-30 6:35 ` Arnd Bergmann 2025-09-30 7:41 ` Geert Uytterhoeven 1 sibling, 2 replies; 44+ messages in thread From: Finn Thain @ 2025-09-30 2:18 UTC (permalink / raw) To: Arnd Bergmann Cc: Geert Uytterhoeven, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang On Tue, 23 Sep 2025, I wrote: > > ... there's still some kmem cache or other allocator somewhere that has > produced some misaligned path and dentry structures. So we get > misaligned atomics somewhere in the VFS and TTY layers. I was unable to > find those allocations. > It turned out that the problem wasn't dynamic allocations, it was a local variable in the core locking code (kernel/locking/rwsem.c): a misaligned long used with an atomic operation (cmpxchg). To get natural alignment for 64-bit quantities, I had to align other local variables as well, such as the one in ktime_get_real_ts64_mg() that's used with atomic64_try_cmpxchg(). The atomic_t branch in my github repo has the patches I wrote for that. To silence the misalignment WARN from CONFIG_DEBUG_ATOMIC, for 64-bit atomic operations, for my small m68k .config, it was also necesary to increase ARCH_SLAB_MINALIGN to 8. However, I'm not advocating a ARCH_SLAB_MINALIGN increase, as that wastes memory. I think it might be more useful to limit the alignment test for CONFIG_DEBUG_ATOMIC, as follows. diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h index 402a999a0d6b..cd569a87c0a8 100644 --- a/include/linux/instrumented.h +++ b/include/linux/instrumented.h @@ -68,7 +68,7 @@ static __always_inline void instrument_atomic_read(const volatile void *v, size_ { kasan_check_read(v, size); kcsan_check_atomic_read(v, size); - WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (size - 1))); + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (size - 1) & 3)); } /** @@ -83,7 +83,7 @@ static __always_inline void instrument_atomic_write(const volatile void *v, size { kasan_check_write(v, size); kcsan_check_atomic_write(v, size); - WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (size - 1))); + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (size - 1) & 3)); } /** @@ -98,7 +98,7 @@ static __always_inline void instrument_atomic_read_write(const volatile void *v, { kasan_check_write(v, size); kcsan_check_atomic_read_write(v, size); - WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (size - 1))); + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (size - 1) & 3)); } /** ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-30 2:18 ` Finn Thain @ 2025-09-30 6:35 ` Arnd Bergmann 2025-10-01 1:03 ` Finn Thain ` (2 more replies) 2025-09-30 7:41 ` Geert Uytterhoeven 1 sibling, 3 replies; 44+ messages in thread From: Arnd Bergmann @ 2025-09-30 6:35 UTC (permalink / raw) To: Finn Thain Cc: Geert Uytterhoeven, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang On Tue, Sep 30, 2025, at 04:18, Finn Thain wrote: > On Tue, 23 Sep 2025, I wrote: >> >> ... there's still some kmem cache or other allocator somewhere that has >> produced some misaligned path and dentry structures. So we get >> misaligned atomics somewhere in the VFS and TTY layers. I was unable to >> find those allocations. >> > > It turned out that the problem wasn't dynamic allocations, it was a local > variable in the core locking code (kernel/locking/rwsem.c): a misaligned > long used with an atomic operation (cmpxchg). To get natural alignment for > 64-bit quantities, I had to align other local variables as well, such as > the one in ktime_get_real_ts64_mg() that's used with > atomic64_try_cmpxchg(). The atomic_t branch in my github repo has the > patches I wrote for that. It looks like the variable you get the warning for is not even the atomic64_t but the 'old' argument to atomic64_try_cmpxchg(), at least in some of the cases you found if not all of them. I don't see where why there is a requirement to have that aligned at all, even if we do require the atomic64_t to be naturally aligned, and I would expect the same warning to hit on x86-32 and the other architectures with 4-byte alignment of u64 variable on stack and .data. > To silence the misalignment WARN from CONFIG_DEBUG_ATOMIC, for 64-bit > atomic operations, for my small m68k .config, it was also necesary to > increase ARCH_SLAB_MINALIGN to 8. However, I'm not advocating a > ARCH_SLAB_MINALIGN increase, as that wastes memory. Have you tried to quantify the memory waste here? I assume that most slab allocations are already 8-byte aligned, at least kmalloc() with size>4, while custom caches are usually done for larger structures where an extra average of 2 bytes per allocation may not be that bad. > diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h > index 402a999a0d6b..cd569a87c0a8 100644 > --- a/include/linux/instrumented.h > +++ b/include/linux/instrumented.h > @@ -68,7 +68,7 @@ static __always_inline void > instrument_atomic_read(const volatile void *v, size_ > { > kasan_check_read(v, size); > kcsan_check_atomic_read(v, size); > - WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & > (size - 1))); > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & > (size - 1) & 3)); > } What is the alignment of stack variables on m68k? E.g. if you have a function with two local variables, would that still be able to trigger the check? int f(atomic64_t *a) { u16 pad; u64 old; g(&pad); atomic64_try_cmpxchg(a, &old, 0); } Since there is nothing telling the compiler that the 'old' argument to atomic*_try_cmpcxchg() needs to be naturally aligned, maybe that check should be changed to only test for the ABI-guaranteed alignment? I think that would still be needed on x86-32. Arnd diff --git a/include/linux/atomic/atomic-instrumented.h b/include/linux/atomic/atomic-instrumented.h index 9409a6ddf3e0..e57763a889bd 100644 --- a/include/linux/atomic/atomic-instrumented.h +++ b/include/linux/atomic/atomic-instrumented.h @@ -1276,7 +1276,7 @@ atomic_try_cmpxchg(atomic_t *v, int *old, int new) { kcsan_mb(); instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_atomic_read_write(old, alignof(*old)); return raw_atomic_try_cmpxchg(v, old, new); } @@ -1298,7 +1298,7 @@ static __always_inline bool atomic_try_cmpxchg_acquire(atomic_t *v, int *old, int new) { instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_atomic_read_write(old, alignof(*old)); return raw_atomic_try_cmpxchg_acquire(v, old, new); } @@ -1321,7 +1321,7 @@ atomic_try_cmpxchg_release(atomic_t *v, int *old, int new) { kcsan_release(); instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_atomic_read_write(old, alignof(*old)); return raw_atomic_try_cmpxchg_release(v, old, new); } @@ -1343,7 +1343,7 @@ static __always_inline bool atomic_try_cmpxchg_relaxed(atomic_t *v, int *old, int new) { instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_atomic_read_write(old, alignof(*old)); return raw_atomic_try_cmpxchg_relaxed(v, old, new); } @@ -2854,7 +2854,7 @@ atomic64_try_cmpxchg(atomic64_t *v, s64 *old, s64 new) { kcsan_mb(); instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_atomic_read_write(old, alignof(*old)); return raw_atomic64_try_cmpxchg(v, old, new); } @@ -2876,7 +2876,7 @@ static __always_inline bool atomic64_try_cmpxchg_acquire(atomic64_t *v, s64 *old, s64 new) { instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_atomic_read_write(old, alignof(*old)); return raw_atomic64_try_cmpxchg_acquire(v, old, new); } @@ -2899,7 +2899,7 @@ atomic64_try_cmpxchg_release(atomic64_t *v, s64 *old, s64 new) { kcsan_release(); instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_atomic_read_write(old, alignof(*old)); return raw_atomic64_try_cmpxchg_release(v, old, new); } @@ -2921,7 +2921,7 @@ static __always_inline bool atomic64_try_cmpxchg_relaxed(atomic64_t *v, s64 *old, s64 new) { instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_atomic_read_write(old, alignof(*old)); return raw_atomic64_try_cmpxchg_relaxed(v, old, new); } @@ -4432,7 +4432,7 @@ atomic_long_try_cmpxchg(atomic_long_t *v, long *old, long new) { kcsan_mb(); instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_atomic_read_write(old, alignof(*old)); return raw_atomic_long_try_cmpxchg(v, old, new); } @@ -4454,7 +4454,7 @@ static __always_inline bool atomic_long_try_cmpxchg_acquire(atomic_long_t *v, long *old, long new) { instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_atomic_read_write(old, alignof(*old)); return raw_atomic_long_try_cmpxchg_acquire(v, old, new); } @@ -4477,7 +4477,7 @@ atomic_long_try_cmpxchg_release(atomic_long_t *v, long *old, long new) { kcsan_release(); instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_atomic_read_write(old, alignof(*old)); return raw_atomic_long_try_cmpxchg_release(v, old, new); } @@ -4499,7 +4499,7 @@ static __always_inline bool atomic_long_try_cmpxchg_relaxed(atomic_long_t *v, long *old, long new) { instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_atomic_read_write(old, alignof(*old)); return raw_atomic_long_try_cmpxchg_relaxed(v, old, new); } ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-30 6:35 ` Arnd Bergmann @ 2025-10-01 1:03 ` Finn Thain 2025-10-01 6:44 ` Arnd Bergmann 2025-10-06 9:25 ` Finn Thain 2025-10-06 9:37 ` Peter Zijlstra 2 siblings, 1 reply; 44+ messages in thread From: Finn Thain @ 2025-10-01 1:03 UTC (permalink / raw) To: Arnd Bergmann Cc: Geert Uytterhoeven, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang On Tue, 30 Sep 2025, Arnd Bergmann wrote: > On Tue, Sep 30, 2025, at 04:18, Finn Thain wrote: > > > > It turned out that the problem wasn't dynamic allocations, it was a > > local variable in the core locking code (kernel/locking/rwsem.c): a > > misaligned long used with an atomic operation (cmpxchg). To get > > natural alignment for 64-bit quantities, I had to align other local > > variables as well, such as the one in ktime_get_real_ts64_mg() that's > > used with atomic64_try_cmpxchg(). The atomic_t branch in my github > > repo has the patches I wrote for that. > > It looks like the variable you get the warning for is not even the > atomic64_t but the 'old' argument to atomic64_try_cmpxchg(), at least in > some of the cases you found if not all of them. > > I don't see where why there is a requirement to have that aligned at > all, even if we do require the atomic64_t to be naturally aligned, and I > would expect the same warning to hit on x86-32 and the other > architectures with 4-byte alignment of u64 variable on stack and .data. > Right -- there's only one memory operand in a CAS instruction on m68k, and there's only one pointer in the C implementation in asm-generic. > > To silence the misalignment WARN from CONFIG_DEBUG_ATOMIC, for 64-bit > > atomic operations, for my small m68k .config, it was also necesary to > > increase ARCH_SLAB_MINALIGN to 8. However, I'm not advocating a > > ARCH_SLAB_MINALIGN increase, as that wastes memory. > > Have you tried to quantify the memory waste here? I think it's entirely workload dependent. The memory efficiency question comes down to the misalignment distance as a proportion of the size of the allocation. > I assume that most slab allocations are already 8-byte aligned, at least > kmalloc() with size>4, while custom caches are usually done for larger > structures where an extra average of 2 bytes per allocation may not be > that bad. > > > diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h > > index 402a999a0d6b..cd569a87c0a8 100644 > > --- a/include/linux/instrumented.h > > +++ b/include/linux/instrumented.h > > @@ -68,7 +68,7 @@ static __always_inline void > > instrument_atomic_read(const volatile void *v, size_ > > { > > kasan_check_read(v, size); > > kcsan_check_atomic_read(v, size); > > - WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & > > (size - 1))); > > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & > > (size - 1) & 3)); > > } > > What is the alignment of stack variables on m68k? E.g. if you have a > function with two local variables, would that still be able to trigger > the check? > > int f(atomic64_t *a) > { > u16 pad; > u64 old; > > g(&pad); > atomic64_try_cmpxchg(a, &old, 0); > } > I assume so: int foo(void) { short s; long long ll; return alignof(ll); } # Compilation provided by Compiler Explorer at https://godbolt.org/ foo(): link.w %fp,#0 moveq #2,%d0 unlk %fp rts > Since there is nothing telling the compiler that the 'old' argument to > atomic*_try_cmpcxchg() needs to be naturally aligned, maybe that check > should be changed to only test for the ABI-guaranteed alignment? I think > that would still be needed on x86-32. > I don't know why we would check the alignment of the 'old' quantity. It's going to be loaded into a register before being used, right? ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-10-01 1:03 ` Finn Thain @ 2025-10-01 6:44 ` Arnd Bergmann 2025-10-06 9:25 ` Finn Thain 0 siblings, 1 reply; 44+ messages in thread From: Arnd Bergmann @ 2025-10-01 6:44 UTC (permalink / raw) To: Finn Thain Cc: Geert Uytterhoeven, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang On Wed, Oct 1, 2025, at 03:03, Finn Thain wrote: > On Tue, 30 Sep 2025, Arnd Bergmann wrote: >> What is the alignment of stack variables on m68k? E.g. if you have a >> function with two local variables, would that still be able to trigger >> the check? >> >> int f(atomic64_t *a) >> { >> u16 pad; >> u64 old; >> >> g(&pad); >> atomic64_try_cmpxchg(a, &old, 0); >> } >> > > I assume so: > > int foo(void) { > short s; > long long ll; > return alignof(ll); > } > > # Compilation provided by Compiler Explorer at https://godbolt.org/ > foo(): > link.w %fp,#0 > moveq #2,%d0 > unlk %fp > rts This just returns the guaranteed alignment of the 'long long' type based on -malign-int/-mno-align-int. Checking again I find that gcc's m68k-linux target aligns stack allocations to 4 bytes, though the m68k-unknown target apparently keeps the 2 byte alignment: https://gcc.gnu.org/legacy-ml/gcc-patches/2007-09/msg01572.html https://godbolt.org/z/48fGMj56W Surprisingly the godbolt.org link also shows a significant overhead when building the same code with -malign-int in the second tab. This is unrelated to the issue here, but I wonder if that is something to report to the gcc bug tracker if we ever get to building the kernel with -malign-int. Similarly, I noticed that clang does not support the -malign-int flag on m68k at all. >> Since there is nothing telling the compiler that the 'old' argument to >> atomic*_try_cmpcxchg() needs to be naturally aligned, maybe that check >> should be changed to only test for the ABI-guaranteed alignment? I think >> that would still be needed on x86-32. >> > > I don't know why we would check the alignment of the 'old' quantity. It's > going to be loaded into a register before being used, right? I was wondering about that as well, but checking for alignof(*old) probably can't hurt. The only architectures that actually have a custom arch_try_cmpxchg*() are s390 and x86 and those don't care about alignmnent of 'old', but it's possible that another architecture that can't handle unaligned load/store would add an inline asm implementation in the future and break if an alignment fixup happens in the middle of an ll/sc loop. Arnd ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-10-01 6:44 ` Arnd Bergmann @ 2025-10-06 9:25 ` Finn Thain 0 siblings, 0 replies; 44+ messages in thread From: Finn Thain @ 2025-10-06 9:25 UTC (permalink / raw) To: Arnd Bergmann Cc: Geert Uytterhoeven, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang On Wed, 1 Oct 2025, Arnd Bergmann wrote: > >> Since there is nothing telling the compiler that the 'old' argument > >> to atomic*_try_cmpcxchg() needs to be naturally aligned, maybe that > >> check should be changed to only test for the ABI-guaranteed > >> alignment? I think that would still be needed on x86-32. > >> > > > > I don't know why we would check the alignment of the 'old' quantity. > > It's going to be loaded into a register before being used, right? > > I was wondering about that as well, but checking for alignof(*old) > probably can't hurt. The only architectures that actually have a custom > arch_try_cmpxchg*() are s390 and x86 and those don't care about > alignmnent of 'old', but it's possible that another architecture that > can't handle unaligned load/store would add an inline asm implementation > in the future and break if an alignment fixup happens in the middle of > an ll/sc loop. > That hypothetical future requirement seems improbable to me. Moreover, would such an architecture have a need for CONFIG_DEBUG_ATOMIC? ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-30 6:35 ` Arnd Bergmann 2025-10-01 1:03 ` Finn Thain @ 2025-10-06 9:25 ` Finn Thain 2025-10-06 10:07 ` Arnd Bergmann 2025-10-06 9:37 ` Peter Zijlstra 2 siblings, 1 reply; 44+ messages in thread From: Finn Thain @ 2025-10-06 9:25 UTC (permalink / raw) To: Arnd Bergmann Cc: Geert Uytterhoeven, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang On Tue, 30 Sep 2025, Arnd Bergmann wrote: > On Tue, Sep 30, 2025, at 04:18, Finn Thain wrote: > > > It turned out that the problem wasn't dynamic allocations, it was a > > local variable in the core locking code (kernel/locking/rwsem.c): a > > misaligned long used with an atomic operation (cmpxchg). To get > > natural alignment for 64-bit quantities, I had to align other local > > variables as well, such as the one in ktime_get_real_ts64_mg() that's > > used with atomic64_try_cmpxchg(). The atomic_t branch in my github > > repo has the patches I wrote for that. > > It looks like the variable you get the warning for is not even the > atomic64_t but the 'old' argument to atomic64_try_cmpxchg(), at least in > some of the cases you found if not all of them. > > I don't see where why there is a requirement to have that aligned at > all, even if we do require the atomic64_t to be naturally aligned, and I > would expect the same warning to hit on x86-32 and the other > architectures with 4-byte alignment of u64 variable on stack and .data. > > ... > > Since there is nothing telling the compiler that the 'old' argument to > atomic*_try_cmpcxchg() needs to be naturally aligned, maybe that check > should be changed to only test for the ABI-guaranteed alignment? I think > that would still be needed on x86-32. > > Arnd > > diff --git a/include/linux/atomic/atomic-instrumented.h b/include/linux/atomic/atomic-instrumented.h > index 9409a6ddf3e0..e57763a889bd 100644 > --- a/include/linux/atomic/atomic-instrumented.h > +++ b/include/linux/atomic/atomic-instrumented.h > @@ -1276,7 +1276,7 @@ atomic_try_cmpxchg(atomic_t *v, int *old, int new) > { > kcsan_mb(); > instrument_atomic_read_write(v, sizeof(*v)); > - instrument_atomic_read_write(old, sizeof(*old)); > + instrument_atomic_read_write(old, alignof(*old)); > return raw_atomic_try_cmpxchg(v, old, new); > } > In the same file, we have: #define try_cmpxchg(ptr, oldp, ...) \ ({ \ typeof(ptr) __ai_ptr = (ptr); \ typeof(oldp) __ai_oldp = (oldp); \ kcsan_mb(); \ instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \ raw_try_cmpxchg(__ai_ptr, __ai_oldp, __VA_ARGS__); \ }) In try_cmpxchg(), unlike atomic_try_cmpxchg(), the 'old' parameter is checked by instrument_read_write() instead of instrument_atomic_read_write(), which suggests a different patch. This header is generated by a script so the change below would be more complicated in practice. diff --git a/include/linux/atomic/atomic-instrumented.h b/include/linux/atomic/atomic-instrumented.h index 9409a6ddf3e0..ce3890bcd903 100644 --- a/include/linux/atomic/atomic-instrumented.h +++ b/include/linux/atomic/atomic-instrumented.h @@ -1276,7 +1276,7 @@ atomic_try_cmpxchg(atomic_t *v, int *old, int new) { kcsan_mb(); instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_read_write(old, sizeof(*old)); return raw_atomic_try_cmpxchg(v, old, new); } @@ -1298,7 +1298,7 @@ static __always_inline bool atomic_try_cmpxchg_acquire(atomic_t *v, int *old, int new) { instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_read_write(old, sizeof(*old)); return raw_atomic_try_cmpxchg_acquire(v, old, new); } @@ -1321,7 +1321,7 @@ atomic_try_cmpxchg_release(atomic_t *v, int *old, int new) { kcsan_release(); instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_read_write(old, sizeof(*old)); return raw_atomic_try_cmpxchg_release(v, old, new); } @@ -1343,7 +1343,7 @@ static __always_inline bool atomic_try_cmpxchg_relaxed(atomic_t *v, int *old, int new) { instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_read_write(old, sizeof(*old)); return raw_atomic_try_cmpxchg_relaxed(v, old, new); } @@ -2854,7 +2854,7 @@ atomic64_try_cmpxchg(atomic64_t *v, s64 *old, s64 new) { kcsan_mb(); instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_read_write(old, sizeof(*old)); return raw_atomic64_try_cmpxchg(v, old, new); } @@ -2876,7 +2876,7 @@ static __always_inline bool atomic64_try_cmpxchg_acquire(atomic64_t *v, s64 *old, s64 new) { instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_read_write(old, sizeof(*old)); return raw_atomic64_try_cmpxchg_acquire(v, old, new); } @@ -2899,7 +2899,7 @@ atomic64_try_cmpxchg_release(atomic64_t *v, s64 *old, s64 new) { kcsan_release(); instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_read_write(old, sizeof(*old)); return raw_atomic64_try_cmpxchg_release(v, old, new); } @@ -2921,7 +2921,7 @@ static __always_inline bool atomic64_try_cmpxchg_relaxed(atomic64_t *v, s64 *old, s64 new) { instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_read_write(old, sizeof(*old)); return raw_atomic64_try_cmpxchg_relaxed(v, old, new); } @@ -4432,7 +4432,7 @@ atomic_long_try_cmpxchg(atomic_long_t *v, long *old, long new) { kcsan_mb(); instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_read_write(old, sizeof(*old)); return raw_atomic_long_try_cmpxchg(v, old, new); } @@ -4454,7 +4454,7 @@ static __always_inline bool atomic_long_try_cmpxchg_acquire(atomic_long_t *v, long *old, long new) { instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_read_write(old, sizeof(*old)); return raw_atomic_long_try_cmpxchg_acquire(v, old, new); } @@ -4477,7 +4477,7 @@ atomic_long_try_cmpxchg_release(atomic_long_t *v, long *old, long new) { kcsan_release(); instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_read_write(old, sizeof(*old)); return raw_atomic_long_try_cmpxchg_release(v, old, new); } @@ -4499,7 +4499,7 @@ static __always_inline bool atomic_long_try_cmpxchg_relaxed(atomic_long_t *v, long *old, long new) { instrument_atomic_read_write(v, sizeof(*v)); - instrument_atomic_read_write(old, sizeof(*old)); + instrument_read_write(old, sizeof(*old)); return raw_atomic_long_try_cmpxchg_relaxed(v, old, new); } ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-10-06 9:25 ` Finn Thain @ 2025-10-06 10:07 ` Arnd Bergmann 2025-10-06 10:22 ` Peter Zijlstra 0 siblings, 1 reply; 44+ messages in thread From: Arnd Bergmann @ 2025-10-06 10:07 UTC (permalink / raw) To: Finn Thain Cc: Geert Uytterhoeven, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang On Mon, Oct 6, 2025, at 11:25, Finn Thain wrote: > On Tue, 30 Sep 2025, Arnd Bergmann wrote: >> On Tue, Sep 30, 2025, at 04:18, Finn Thain wrote: >> >> Since there is nothing telling the compiler that the 'old' argument to >> atomic*_try_cmpcxchg() needs to be naturally aligned, maybe that check >> should be changed to only test for the ABI-guaranteed alignment? I think >> that would still be needed on x86-32. >> >> diff --git a/include/linux/atomic/atomic-instrumented.h b/include/linux/atomic/atomic-instrumented.h >> index 9409a6ddf3e0..e57763a889bd 100644 >> --- a/include/linux/atomic/atomic-instrumented.h >> +++ b/include/linux/atomic/atomic-instrumented.h >> @@ -1276,7 +1276,7 @@ atomic_try_cmpxchg(atomic_t *v, int *old, int new) >> { >> kcsan_mb(); >> instrument_atomic_read_write(v, sizeof(*v)); >> - instrument_atomic_read_write(old, sizeof(*old)); >> + instrument_atomic_read_write(old, alignof(*old)); >> return raw_atomic_try_cmpxchg(v, old, new); >> } >> > > In the same file, we have: > > #define try_cmpxchg(ptr, oldp, ...) \ > ({ \ > typeof(ptr) __ai_ptr = (ptr); \ > typeof(oldp) __ai_oldp = (oldp); \ > kcsan_mb(); \ > instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ > instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \ > raw_try_cmpxchg(__ai_ptr, __ai_oldp, __VA_ARGS__); \ > }) > > In try_cmpxchg(), unlike atomic_try_cmpxchg(), the 'old' parameter is > checked by instrument_read_write() instead of > instrument_atomic_read_write(), which suggests a different patch. Right, we still need the effect of instrument_read_write() for kasan/kcsan sanitizing with the correct size, only the alignment check from the instrument_atomic_read_write() is wrong here here. It looks like Mark Rutland fixed the try_cmpxchg() function this way in commit ec570320b09f ("locking/atomic: Correct (cmp)xchg() instrumentation"), but for some reason we still have the wrong annotation on the atomic_try_cmpxchg() helpers. > This header is generated by a script so the change below would be more > complicated in practice. > > diff --git a/include/linux/atomic/atomic-instrumented.h > b/include/linux/atomic/atomic-instrumented.h > index 9409a6ddf3e0..ce3890bcd903 100644 > --- a/include/linux/atomic/atomic-instrumented.h > +++ b/include/linux/atomic/atomic-instrumented.h > @@ -1276,7 +1276,7 @@ atomic_try_cmpxchg(atomic_t *v, int *old, int new) > { > kcsan_mb(); > instrument_atomic_read_write(v, sizeof(*v)); > - instrument_atomic_read_write(old, sizeof(*old)); > + instrument_read_write(old, sizeof(*old)); > return raw_atomic_try_cmpxchg(v, old, new); > } This looks good to me. Mark, I've tried to modify your script to produce that output, the output looks correct to me, but it makes the script more complex than I liked and I'm not sure that matching on "${type}"="p" is the best way to check for this. Maybe you can come up with a version that is clearer or more robust. Arnd diff --git a/scripts/atomic/gen-atomic-instrumented.sh b/scripts/atomic/gen-atomic-instrumented.sh index 592f3ec89b5f..9c1d53f81eb2 100755 --- a/scripts/atomic/gen-atomic-instrumented.sh +++ b/scripts/atomic/gen-atomic-instrumented.sh @@ -12,7 +12,7 @@ gen_param_check() local arg="$1"; shift local type="${arg%%:*}" local name="$(gen_param_name "${arg}")" - local rw="write" + local rw="atomic_write" case "${type#c}" in i) return;; @@ -20,14 +20,17 @@ gen_param_check() if [ ${type#c} != ${type} ]; then # We don't write to constant parameters. - rw="read" + rw="atomic_read" + elif [ "${type}" = "p" ] ; then + # The "old" argument in try_cmpxchg() gets accessed non-atomically + rw="read_write" elif [ "${meta}" != "s" ]; then # An atomic RMW: if this parameter is not a constant, and this atomic is # not just a 's'tore, this parameter is both read from and written to. - rw="read_write" + rw="atomic_read_write" fi - printf "\tinstrument_atomic_${rw}(${name}, sizeof(*${name}));\n" + printf "\tinstrument_${rw}(${name}, sizeof(*${name}));\n" } #gen_params_checks(meta, arg...) ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-10-06 10:07 ` Arnd Bergmann @ 2025-10-06 10:22 ` Peter Zijlstra 2025-10-06 11:09 ` Arnd Bergmann 0 siblings, 1 reply; 44+ messages in thread From: Peter Zijlstra @ 2025-10-06 10:22 UTC (permalink / raw) To: Arnd Bergmann Cc: Finn Thain, Geert Uytterhoeven, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang On Mon, Oct 06, 2025 at 12:07:24PM +0200, Arnd Bergmann wrote: > It looks like Mark Rutland fixed the try_cmpxchg() function this > way in commit ec570320b09f ("locking/atomic: Correct (cmp)xchg() > instrumentation"), but for some reason we still have the wrong > annotation on the atomic_try_cmpxchg() helpers. > Mark, I've tried to modify your script to produce that output, > the output looks correct to me, but it makes the script more > complex than I liked and I'm not sure that matching on > "${type}"="p" is the best way to check for this. I don't see anything wrong with this. The result looks good. > diff --git a/scripts/atomic/gen-atomic-instrumented.sh b/scripts/atomic/gen-atomic-instrumented.sh > index 592f3ec89b5f..9c1d53f81eb2 100755 > --- a/scripts/atomic/gen-atomic-instrumented.sh > +++ b/scripts/atomic/gen-atomic-instrumented.sh > @@ -12,7 +12,7 @@ gen_param_check() > local arg="$1"; shift > local type="${arg%%:*}" > local name="$(gen_param_name "${arg}")" > - local rw="write" > + local rw="atomic_write" > > case "${type#c}" in > i) return;; > @@ -20,14 +20,17 @@ gen_param_check() > > if [ ${type#c} != ${type} ]; then > # We don't write to constant parameters. > - rw="read" > + rw="atomic_read" > + elif [ "${type}" = "p" ] ; then > + # The "old" argument in try_cmpxchg() gets accessed non-atomically > + rw="read_write" > elif [ "${meta}" != "s" ]; then > # An atomic RMW: if this parameter is not a constant, and this atomic is > # not just a 's'tore, this parameter is both read from and written to. > - rw="read_write" > + rw="atomic_read_write" > fi > > - printf "\tinstrument_atomic_${rw}(${name}, sizeof(*${name}));\n" > + printf "\tinstrument_${rw}(${name}, sizeof(*${name}));\n" > } > > #gen_params_checks(meta, arg...) > ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-10-06 10:22 ` Peter Zijlstra @ 2025-10-06 11:09 ` Arnd Bergmann 0 siblings, 0 replies; 44+ messages in thread From: Arnd Bergmann @ 2025-10-06 11:09 UTC (permalink / raw) To: Peter Zijlstra Cc: Finn Thain, Geert Uytterhoeven, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang On Mon, Oct 6, 2025, at 12:22, Peter Zijlstra wrote: > On Mon, Oct 06, 2025 at 12:07:24PM +0200, Arnd Bergmann wrote: > >> It looks like Mark Rutland fixed the try_cmpxchg() function this >> way in commit ec570320b09f ("locking/atomic: Correct (cmp)xchg() >> instrumentation"), but for some reason we still have the wrong >> annotation on the atomic_try_cmpxchg() helpers. > >> Mark, I've tried to modify your script to produce that output, >> the output looks correct to me, but it makes the script more >> complex than I liked and I'm not sure that matching on >> "${type}"="p" is the best way to check for this. > > I don't see anything wrong with this. The result looks good. Thanks for having a look, I've sent it out as a proper patch now. Arnd ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-30 6:35 ` Arnd Bergmann 2025-10-01 1:03 ` Finn Thain 2025-10-06 9:25 ` Finn Thain @ 2025-10-06 9:37 ` Peter Zijlstra 2 siblings, 0 replies; 44+ messages in thread From: Peter Zijlstra @ 2025-10-06 9:37 UTC (permalink / raw) To: Arnd Bergmann Cc: Finn Thain, Geert Uytterhoeven, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang, elver On Tue, Sep 30, 2025 at 08:35:55AM +0200, Arnd Bergmann wrote: > Since there is nothing telling the compiler that > the 'old' argument to atomic*_try_cmpcxchg() needs to > be naturally aligned, maybe that check should be changed > to only test for the ABI-guaranteed alignment? I think > that would still be needed on x86-32. > > Arnd > > diff --git a/include/linux/atomic/atomic-instrumented.h b/include/linux/atomic/atomic-instrumented.h > index 9409a6ddf3e0..e57763a889bd 100644 > --- a/include/linux/atomic/atomic-instrumented.h > +++ b/include/linux/atomic/atomic-instrumented.h > @@ -1276,7 +1276,7 @@ atomic_try_cmpxchg(atomic_t *v, int *old, int new) > { > kcsan_mb(); > instrument_atomic_read_write(v, sizeof(*v)); > - instrument_atomic_read_write(old, sizeof(*old)); > + instrument_atomic_read_write(old, alignof(*old)); > return raw_atomic_try_cmpxchg(v, old, new); > } That's wrong. The argument there really is size, it tells KASAN how wide the access is. Arguably we could switch old to instrument_read_write() since the target is not actually an atomic. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-30 2:18 ` Finn Thain 2025-09-30 6:35 ` Arnd Bergmann @ 2025-09-30 7:41 ` Geert Uytterhoeven 2025-10-01 1:46 ` Finn Thain 1 sibling, 1 reply; 44+ messages in thread From: Geert Uytterhoeven @ 2025-09-30 7:41 UTC (permalink / raw) To: Finn Thain Cc: Arnd Bergmann, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang Hi Finn, On Tue, 30 Sept 2025 at 04:18, Finn Thain <fthain@linux-m68k.org> wrote: > On Tue, 23 Sep 2025, I wrote: > > ... there's still some kmem cache or other allocator somewhere that has > > produced some misaligned path and dentry structures. So we get > > misaligned atomics somewhere in the VFS and TTY layers. I was unable to > > find those allocations. > > It turned out that the problem wasn't dynamic allocations, it was a local > variable in the core locking code (kernel/locking/rwsem.c): a misaligned > long used with an atomic operation (cmpxchg). To get natural alignment for > 64-bit quantities, I had to align other local variables as well, such as > the one in ktime_get_real_ts64_mg() that's used with > atomic64_try_cmpxchg(). The atomic_t branch in my github repo has the > patches I wrote for that. So cmpxchg() and friends should not take a volatile void *, but (new) properly-aligned types, using the new _Generic()? > To silence the misalignment WARN from CONFIG_DEBUG_ATOMIC, for 64-bit > atomic operations, for my small m68k .config, it was also necesary to > increase ARCH_SLAB_MINALIGN to 8. However, I'm not advocating a Probably ARCH_SLAB_MINALIGN should be 4 on m68k. Somehow I thought that was already the case, but it is __alignof__(unsigned long long) = 2. > ARCH_SLAB_MINALIGN increase, as that wastes memory. I think it might be > more useful to limit the alignment test for CONFIG_DEBUG_ATOMIC, as > follows. Did you check what would be the actual impact of increasing it to 4 or 8? > --- a/include/linux/instrumented.h > +++ b/include/linux/instrumented.h > @@ -68,7 +68,7 @@ static __always_inline void instrument_atomic_read(const volatile void *v, size_ > { > kasan_check_read(v, size); > kcsan_check_atomic_read(v, size); > - WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (size - 1))); > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (size - 1) & 3)); I'd make that an arch-overridable define instead of hardcoded 3. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-09-30 7:41 ` Geert Uytterhoeven @ 2025-10-01 1:46 ` Finn Thain 2025-10-01 7:08 ` Geert Uytterhoeven 0 siblings, 1 reply; 44+ messages in thread From: Finn Thain @ 2025-10-01 1:46 UTC (permalink / raw) To: Geert Uytterhoeven Cc: Arnd Bergmann, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang On Tue, 30 Sep 2025, Geert Uytterhoeven wrote: > > To silence the misalignment WARN from CONFIG_DEBUG_ATOMIC, for 64-bit > > atomic operations, for my small m68k .config, it was also necesary to > > increase ARCH_SLAB_MINALIGN to 8. However, I'm not advocating a > > Probably ARCH_SLAB_MINALIGN should be 4 on m68k. Somehow I thought that > was already the case, but it is __alignof__(unsigned long long) = 2. > I agree -- setting ARCH_SLAB_MINALIGN to 4 would be better style, and may avoid surprises in future. Right now that won't have any effect because that value gets increased to sizeof(void *) by calculate_alignment() and gets increased to ARCH_KMALLOC_MINALIGN or ARCH_DMA_MINALIGN by __kmalloc_minalign(). > > ARCH_SLAB_MINALIGN increase, as that wastes memory. I think it might be > > more useful to limit the alignment test for CONFIG_DEBUG_ATOMIC, as > > follows. > > Did you check what would be the actual impact of increasing it to 4 or 8? > > > --- a/include/linux/instrumented.h > > +++ b/include/linux/instrumented.h > > @@ -68,7 +68,7 @@ static __always_inline void instrument_atomic_read(const volatile void *v, size_ > > { > > kasan_check_read(v, size); > > kcsan_check_atomic_read(v, size); > > - WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (size - 1))); > > + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & (size - 1) & 3)); > > I'd make that an arch-overridable define instead of hardcoded 3. > How about (sizeof(atomic_long_t) - 1)? Can you comment on this, Peter? ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t 2025-10-01 1:46 ` Finn Thain @ 2025-10-01 7:08 ` Geert Uytterhoeven 0 siblings, 0 replies; 44+ messages in thread From: Geert Uytterhoeven @ 2025-10-01 7:08 UTC (permalink / raw) To: Finn Thain Cc: Arnd Bergmann, Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch, linux-m68k, Lance Yang Hi Finn, On Wed, 1 Oct 2025 at 03:46, Finn Thain <fthain@linux-m68k.org> wrote: > On Tue, 30 Sep 2025, Geert Uytterhoeven wrote: > > > To silence the misalignment WARN from CONFIG_DEBUG_ATOMIC, for 64-bit > > > atomic operations, for my small m68k .config, it was also necesary to > > > increase ARCH_SLAB_MINALIGN to 8. However, I'm not advocating a > > > > Probably ARCH_SLAB_MINALIGN should be 4 on m68k. Somehow I thought that > > was already the case, but it is __alignof__(unsigned long long) = 2. > > I agree -- setting ARCH_SLAB_MINALIGN to 4 would be better style, and may > avoid surprises in future. Right now that won't have any effect because > that value gets increased to sizeof(void *) by calculate_alignment() and Ah, so there it happens... > gets increased to ARCH_KMALLOC_MINALIGN or ARCH_DMA_MINALIGN by > __kmalloc_minalign(). Thanks for checking! Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 44+ messages in thread
end of thread, other threads:[~2025-10-06 11:09 UTC | newest] Thread overview: 44+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-09-14 0:45 [RFC v2 0/3] Align atomic storage Finn Thain 2025-09-14 0:45 ` [RFC v2 1/3] documentation: Discourage alignment assumptions Finn Thain 2025-09-14 0:45 ` [RFC v2 3/3] atomic: Add alignment check to instrumented atomic operations Finn Thain 2025-09-15 8:00 ` Peter Zijlstra 2025-09-15 9:38 ` Finn Thain 2025-09-15 10:06 ` Peter Zijlstra 2025-09-15 10:37 ` Finn Thain 2025-09-15 11:20 ` Arnd Bergmann 2025-09-16 0:16 ` Finn Thain 2025-09-16 10:10 ` Geert Uytterhoeven 2025-09-17 1:23 ` Finn Thain 2025-09-16 12:37 ` Arnd Bergmann 2025-09-16 21:38 ` Brad Boyer 2025-09-17 16:54 ` Andreas Schwab 2025-09-17 2:14 ` Finn Thain 2025-09-22 15:49 ` Arnd Bergmann 2025-09-23 6:39 ` Finn Thain 2025-09-14 0:45 ` [RFC v2 2/3] atomic: Specify alignment for atomic_t and atomic64_t Finn Thain 2025-09-15 7:13 ` Geert Uytterhoeven 2025-09-15 7:35 ` Arnd Bergmann 2025-09-15 8:06 ` Peter Zijlstra 2025-09-15 9:26 ` Finn Thain 2025-09-15 9:29 ` Arnd Bergmann 2025-09-22 7:06 ` Geert Uytterhoeven 2025-09-22 8:16 ` Finn Thain 2025-09-22 9:29 ` Geert Uytterhoeven 2025-09-22 15:21 ` Arnd Bergmann 2025-09-23 6:28 ` Finn Thain 2025-09-23 6:41 ` Arnd Bergmann 2025-09-23 8:05 ` Finn Thain 2025-09-23 19:11 ` Arnd Bergmann 2025-09-30 2:18 ` Finn Thain 2025-09-30 6:35 ` Arnd Bergmann 2025-10-01 1:03 ` Finn Thain 2025-10-01 6:44 ` Arnd Bergmann 2025-10-06 9:25 ` Finn Thain 2025-10-06 9:25 ` Finn Thain 2025-10-06 10:07 ` Arnd Bergmann 2025-10-06 10:22 ` Peter Zijlstra 2025-10-06 11:09 ` Arnd Bergmann 2025-10-06 9:37 ` Peter Zijlstra 2025-09-30 7:41 ` Geert Uytterhoeven 2025-10-01 1:46 ` Finn Thain 2025-10-01 7:08 ` Geert Uytterhoeven
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).