* [RFC v3 0/5] Align atomic storage
@ 2025-10-07 22:19 Finn Thain
2025-10-07 22:19 ` [RFC v3 1/5] documentation: Discourage alignment assumptions Finn Thain
0 siblings, 1 reply; 8+ messages in thread
From: Finn Thain @ 2025-10-07 22:19 UTC (permalink / raw)
To: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann,
Peter Zijlstra, Will Deacon
Cc: Andrew Morton, Arnd Bergmann, Boqun Feng, bpf, Jonathan Corbet,
Eduard Zingerman, Geert Uytterhoeven, Hao Luo, John Fastabend,
Jiri Olsa, KP Singh, Lance Yang, linux-arch, linux-doc,
linux-kernel, linux-m68k, Mark Rutland, Martin KaFai Lau,
Stanislav Fomichev, Song Liu, Yonghong Song
This series adds the __aligned attribute to atomic_t and atomic64_t
definitions in include/asm-generic.
It also adds Kconfig options to enable a new runtime warning to help
reveal misaligned atomic accesses on platforms that don't trap them.
Some people might assume scalars are aligned to 4-byte boundaries, while
others might assume natural alignment. Best not to encourage such
assumptions in the documentation.
Moreover, since locks are performance sensitive and atomic operations
tend to involve further assumptions, there seems to be room for
improvement here.
Pertinent to this discussion are the section "Memory Efficiency" in
Documentation/RCU/Design/Requirements/Requirements.rst
and the section "GUARANTEES" in Documentation/memory-barriers.txt
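For readers following along, here is a minimal userspace sketch of the
idea, not the patches themselves. The demo_* names are hypothetical, the
GCC/Clang aligned attribute stands in for the kernel's __aligned, and the
assert() stands in for a runtime warning (the kernel would warn rather
than abort).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for atomic_t and atomic64_t (not the kernel types). */
typedef struct {
	int counter;
} __attribute__((aligned(sizeof(int)))) demo_atomic_t;

typedef struct {
	long long counter;
} __attribute__((aligned(sizeof(long long)))) demo_atomic64_t;

/* True when addr is a multiple of size; size must be a power of two. */
static inline bool demo_is_naturally_aligned(const void *addr, size_t size)
{
	return ((uintptr_t)addr & (size - 1)) == 0;
}

/* Read the counter, checking alignment first, as a debug option might. */
static inline long long demo_atomic64_read(const demo_atomic64_t *v)
{
	assert(demo_is_naturally_aligned(v, sizeof(v->counter)));
	return v->counter;
}

/* A container whose first member would otherwise allow a 4-byte offset. */
struct demo_container {
	char tag;
	demo_atomic64_t refs;
};

/* Offset of the 64-bit member; the type's attribute, not the ABI, sets it. */
static inline size_t demo_refs_offset(void)
{
	return offsetof(struct demo_container, refs);
}
```

With the attribute on the type, the member offset is a multiple of
sizeof(long long) even on an ABI that would otherwise place it on a
4-byte boundary.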
---
Changed since v2:
- Specify natural alignment for atomic64_t.
- CONFIG_DEBUG_ATOMIC checks for natural alignment again.
- New patch to add weakened alignment check.
- New patch for explicit alignment in BPF header.
---
Finn Thain (4):
documentation: Discourage alignment assumptions
bpf: Explicitly align bpf_res_spin_lock
atomic: Specify alignment for atomic_t and atomic64_t
atomic: Add option for weaker alignment check
Peter Zijlstra (1):
atomic: Add alignment check to instrumented atomic operations
.../core-api/unaligned-memory-access.rst | 7 -------
include/asm-generic/atomic64.h | 2 +-
include/asm-generic/rqspinlock.h | 2 +-
include/linux/instrumented.h | 12 ++++++++++++
include/linux/types.h | 2 +-
kernel/bpf/rqspinlock.c | 1 -
lib/Kconfig.debug | 19 +++++++++++++++++++
7 files changed, 34 insertions(+), 11 deletions(-)
--
2.49.1
* [RFC v3 1/5] documentation: Discourage alignment assumptions
2025-10-07 22:19 [RFC v3 0/5] Align atomic storage Finn Thain
@ 2025-10-07 22:19 ` Finn Thain
2025-10-14 10:23 ` David Laight
2025-10-16 13:30 ` Arnd Bergmann
0 siblings, 2 replies; 8+ messages in thread
From: Finn Thain @ 2025-10-07 22:19 UTC (permalink / raw)
To: Peter Zijlstra, Will Deacon
Cc: Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland,
Arnd Bergmann, linux-kernel, linux-arch, Geert Uytterhoeven,
linux-m68k, linux-doc
Discourage assumptions that simply don't hold for all Linux ABIs.
Exceptions to the natural alignment rule for scalar types include
long long on i386 and sh.
---
Documentation/core-api/unaligned-memory-access.rst | 7 -------
1 file changed, 7 deletions(-)
diff --git a/Documentation/core-api/unaligned-memory-access.rst b/Documentation/core-api/unaligned-memory-access.rst
index 5ceeb80eb539..1390ce2b7291 100644
--- a/Documentation/core-api/unaligned-memory-access.rst
+++ b/Documentation/core-api/unaligned-memory-access.rst
@@ -40,9 +40,6 @@ The rule mentioned above forms what we refer to as natural alignment:
When accessing N bytes of memory, the base memory address must be evenly
divisible by N, i.e. addr % N == 0.
-When writing code, assume the target architecture has natural alignment
-requirements.
-
In reality, only a few architectures require natural alignment on all sizes
of memory access. However, we must consider ALL supported architectures;
writing code that satisfies natural alignment requirements is the easiest way
@@ -103,10 +100,6 @@ Therefore, for standard structure types you can always rely on the compiler
to pad structures so that accesses to fields are suitably aligned (assuming
you do not cast the field to a type of different length).
-Similarly, you can also rely on the compiler to align variables and function
-parameters to a naturally aligned scheme, based on the size of the type of
-the variable.
-
At this point, it should be clear that accessing a single byte (u8 or char)
will never cause an unaligned access, because all memory addresses are evenly
divisible by one.
--
2.49.1
* Re: [RFC v3 1/5] documentation: Discourage alignment assumptions
2025-10-07 22:19 ` [RFC v3 1/5] documentation: Discourage alignment assumptions Finn Thain
@ 2025-10-14 10:23 ` David Laight
2025-10-15 7:40 ` Finn Thain
2025-10-16 13:30 ` Arnd Bergmann
1 sibling, 1 reply; 8+ messages in thread
From: David Laight @ 2025-10-14 10:23 UTC (permalink / raw)
To: Finn Thain
Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng,
Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel,
linux-arch, Geert Uytterhoeven, linux-m68k, linux-doc
On Wed, 08 Oct 2025 09:19:20 +1100
Finn Thain <fthain@linux-m68k.org> wrote:
> Discourage assumptions that simply don't hold for all Linux ABIs.
> Exceptions to the natural alignment rule for scalar types include
> long long on i386 and sh.
> ---
> Documentation/core-api/unaligned-memory-access.rst | 7 -------
> 1 file changed, 7 deletions(-)
>
> diff --git a/Documentation/core-api/unaligned-memory-access.rst b/Documentation/core-api/unaligned-memory-access.rst
> index 5ceeb80eb539..1390ce2b7291 100644
> --- a/Documentation/core-api/unaligned-memory-access.rst
> +++ b/Documentation/core-api/unaligned-memory-access.rst
> @@ -40,9 +40,6 @@ The rule mentioned above forms what we refer to as natural alignment:
> When accessing N bytes of memory, the base memory address must be evenly
> divisible by N, i.e. addr % N == 0.
>
> -When writing code, assume the target architecture has natural alignment
> -requirements.
I think I'd be more explicit, perhaps:
Note that not all architectures align 64bit items on 8 byte boundaries or
even 32bit items on 4 byte boundaries.
David
> -
> In reality, only a few architectures require natural alignment on all sizes
> of memory access. However, we must consider ALL supported architectures;
> writing code that satisfies natural alignment requirements is the easiest way
> @@ -103,10 +100,6 @@ Therefore, for standard structure types you can always rely on the compiler
> to pad structures so that accesses to fields are suitably aligned (assuming
> you do not cast the field to a type of different length).
>
> -Similarly, you can also rely on the compiler to align variables and function
> -parameters to a naturally aligned scheme, based on the size of the type of
> -the variable.
> -
> At this point, it should be clear that accessing a single byte (u8 or char)
> will never cause an unaligned access, because all memory addresses are evenly
> divisible by one.
* Re: [RFC v3 1/5] documentation: Discourage alignment assumptions
2025-10-14 10:23 ` David Laight
@ 2025-10-15 7:40 ` Finn Thain
2025-10-15 13:53 ` David Laight
0 siblings, 1 reply; 8+ messages in thread
From: Finn Thain @ 2025-10-15 7:40 UTC (permalink / raw)
To: David Laight
Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng,
Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel,
linux-arch, Geert Uytterhoeven, linux-m68k, linux-doc
On Tue, 14 Oct 2025, David Laight wrote:
> On Wed, 08 Oct 2025 09:19:20 +1100
> Finn Thain <fthain@linux-m68k.org> wrote:
>
> > Discourage assumptions that simply don't hold for all Linux ABIs.
> > Exceptions to the natural alignment rule for scalar types include
> > long long on i386 and sh.
> > ---
> > Documentation/core-api/unaligned-memory-access.rst | 7 -------
> > 1 file changed, 7 deletions(-)
> >
> > diff --git a/Documentation/core-api/unaligned-memory-access.rst b/Documentation/core-api/unaligned-memory-access.rst
> > index 5ceeb80eb539..1390ce2b7291 100644
> > --- a/Documentation/core-api/unaligned-memory-access.rst
> > +++ b/Documentation/core-api/unaligned-memory-access.rst
> > @@ -40,9 +40,6 @@ The rule mentioned above forms what we refer to as natural alignment:
> > When accessing N bytes of memory, the base memory address must be evenly
> > divisible by N, i.e. addr % N == 0.
> >
> > -When writing code, assume the target architecture has natural alignment
> > -requirements.
>
> I think I'd be more explicit, perhaps:
> Note that not all architectures align 64bit items on 8 byte boundaries or
> even 32bit items on 4 byte boundaries.
>
That's what the next para is alluding to...
> > In reality, only a few architectures require natural alignment on all sizes
> > of memory access. However, we must consider ALL supported architectures;
> > writing code that satisfies natural alignment requirements is the easiest way
> > to achieve full portability.
How about this?
"In reality, only a few architectures require natural alignment for all
sizes of memory access. That is, not all architectures need 64-bit values
to be aligned on 8-byte boundaries and 32-bit values on 4-byte boundaries.
However, when writing code intended to achieve full portability, we must
consider all supported architectures."
> > @@ -103,10 +100,6 @@ Therefore, for standard structure types you can always rely on the compiler
> > to pad structures so that accesses to fields are suitably aligned (assuming
> > you do not cast the field to a type of different length).
> >
> > -Similarly, you can also rely on the compiler to align variables and function
> > -parameters to a naturally aligned scheme, based on the size of the type of
> > -the variable.
> > -
> > At this point, it should be clear that accessing a single byte (u8 or char)
> > will never cause an unaligned access, because all memory addresses are evenly
> > divisible by one.
>
>
* Re: [RFC v3 1/5] documentation: Discourage alignment assumptions
2025-10-15 7:40 ` Finn Thain
@ 2025-10-15 13:53 ` David Laight
2025-10-16 6:53 ` Finn Thain
0 siblings, 1 reply; 8+ messages in thread
From: David Laight @ 2025-10-15 13:53 UTC (permalink / raw)
To: Finn Thain
Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng,
Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel,
linux-arch, Geert Uytterhoeven, linux-m68k, linux-doc
On Wed, 15 Oct 2025 18:40:39 +1100 (AEDT)
Finn Thain <fthain@linux-m68k.org> wrote:
> On Tue, 14 Oct 2025, David Laight wrote:
>
> > On Wed, 08 Oct 2025 09:19:20 +1100
> > Finn Thain <fthain@linux-m68k.org> wrote:
> >
> > > Discourage assumptions that simply don't hold for all Linux ABIs.
> > > Exceptions to the natural alignment rule for scalar types include
> > > long long on i386 and sh.
> > > ---
> > > Documentation/core-api/unaligned-memory-access.rst | 7 -------
> > > 1 file changed, 7 deletions(-)
> > >
> > > diff --git a/Documentation/core-api/unaligned-memory-access.rst b/Documentation/core-api/unaligned-memory-access.rst
> > > index 5ceeb80eb539..1390ce2b7291 100644
> > > --- a/Documentation/core-api/unaligned-memory-access.rst
> > > +++ b/Documentation/core-api/unaligned-memory-access.rst
> > > @@ -40,9 +40,6 @@ The rule mentioned above forms what we refer to as natural alignment:
> > > When accessing N bytes of memory, the base memory address must be evenly
> > > divisible by N, i.e. addr % N == 0.
> > >
> > > -When writing code, assume the target architecture has natural alignment
> > > -requirements.
> >
> > I think I'd be more explicit, perhaps:
> > Note that not all architectures align 64bit items on 8 byte boundaries or
> > even 32bit items on 4 byte boundaries.
> >
>
> That's what the next para is alluding to...
>
> > > In reality, only a few architectures require natural alignment on all sizes
> > > of memory access. However, we must consider ALL supported architectures;
> > > writing code that satisfies natural alignment requirements is the easiest way
> > > to achieve full portability.
>
> How about this?
>
> "In reality, only a few architectures require natural alignment for all
> sizes of memory access. That is, not all architectures need 64-bit values
> to be aligned on 8-byte boundaries and 32-bit values on 4-byte boundaries.
> However, when writing code intended to achieve full portability, we must
> consider all supported architectures."
There are several separate alignments:
- The alignment the cpu needs, for most x86 instructions this is 1 byte [1].
Many RISC cpu require 'word' alignment (for some definition of 'word').
A problematic case is data that crosses page boundaries.
- The alignment the compiler uses for structure members; returned by _Alignof().
m68k only 16bit aligns 32bit values.
- The 'preferred' alignment returned by __alignof__().
32bit x86 returns 8 for 64bit types even though the ABI only 4-byte aligns them.
- The 'natural' alignment based on the size of the item.
I'd guess that 'complex double' (if supported) may only be 8 byte aligned.
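The distinctions in the list above can be probed with a small sketch
like the following. It assumes GCC or Clang (for __alignof__), and the
demo_* names are hypothetical. The values differ per target, so no
output is claimed here beyond the relationships checked in
demo_check_consistency().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical probe: a 64-bit member preceded by a single byte. */
struct demo_probe {
	char pad;
	long long wide;
};

/* 'Natural' alignment of a scalar: simply its size. */
static inline size_t demo_natural_align(void)
{
	return sizeof(long long);
}

/* Alignment reported by the C11 operator. */
static inline size_t demo_c11_align(void)
{
	return _Alignof(long long);
}

/* Alignment reported by the GNU extension; may be larger on some targets. */
static inline size_t demo_gnu_align(void)
{
	return __alignof__(long long);
}

/* Where the compiler actually placed the member inside the struct. */
static inline size_t demo_member_offset(void)
{
	return offsetof(struct demo_probe, wide);
}

/* The size of a type is always a multiple of its alignment in C. */
static inline void demo_check_consistency(void)
{
	assert(demo_natural_align() % demo_c11_align() == 0);
}
```

Comparing demo_member_offset() with demo_natural_align() on a given
target shows whether that ABI naturally aligns 64-bit struct members.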
What normally matters is the ABI alignment for structure members.
If you mark anything 'packed' the compiler will generate shifts and masks (etc)
to get working code.
Taking the address of an item in a packed structure generates a warning
for very good reason.
[1] I've fallen foul of gcc deciding to 'vectorise' a loop and then having
it crash because the buffer address was misaligned.
Nasty because the code worked in initial testing and I expected the loop
(32bit adds of a buffer) to work fine even when misaligned.
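A sketch of one way to write such a loop so that it makes no alignment
assumption at all. This is not the code from the footnote, which is not
shown in this thread. The demo_* names are hypothetical, and memcpy() is
a userspace stand-in for the get_unaligned() helper that the kernel
document describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sum n 32-bit words starting at buf, which may have any alignment. */
static inline uint32_t demo_sum_u32(const unsigned char *buf, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++) {
		uint32_t word;

		/* memcpy tells the compiler nothing about buf's alignment. */
		memcpy(&word, buf + i * sizeof(word), sizeof(word));
		sum += word;
	}
	return sum;
}

/* Place four words at a deliberately odd address and sum them. */
static inline uint32_t demo_sum_misaligned(void)
{
	_Alignas(uint32_t) unsigned char storage[1 + 4 * sizeof(uint32_t)];
	const uint32_t values[4] = { 1, 2, 3, 4 };

	memcpy(storage + 1, values, sizeof(values));
	/* storage is 4-byte aligned, so storage + 1 cannot be. */
	assert(((uintptr_t)(storage + 1) % sizeof(uint32_t)) != 0);
	return demo_sum_u32(storage + 1, 4);
}
```

Because the loads go through memcpy(), the compiler remains free to
vectorise the loop but has to pick instructions that tolerate a
misaligned buffer.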
David
>
> > > @@ -103,10 +100,6 @@ Therefore, for standard structure types you can always rely on the compiler
> > > to pad structures so that accesses to fields are suitably aligned (assuming
> > > you do not cast the field to a type of different length).
> > >
> > > -Similarly, you can also rely on the compiler to align variables and function
> > > -parameters to a naturally aligned scheme, based on the size of the type of
> > > -the variable.
> > > -
> > > At this point, it should be clear that accessing a single byte (u8 or char)
> > > will never cause an unaligned access, because all memory addresses are evenly
> > > divisible by one.
> >
> >
* Re: [RFC v3 1/5] documentation: Discourage alignment assumptions
2025-10-15 13:53 ` David Laight
@ 2025-10-16 6:53 ` Finn Thain
0 siblings, 0 replies; 8+ messages in thread
From: Finn Thain @ 2025-10-16 6:53 UTC (permalink / raw)
To: David Laight
Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng,
Jonathan Corbet, Mark Rutland, Arnd Bergmann, linux-kernel,
linux-arch, Geert Uytterhoeven, linux-m68k, linux-doc
On Wed, 15 Oct 2025, David Laight wrote:
>
> There are several separate alignments:
> - The alignment the cpu needs, for most x86 instructions this is 1 byte [1].
> Many RISC cpu require 'word' alignment (for some definition of 'word').
> A problematic case is data that crosses page boundaries.
> - The alignment the compiler uses for structure members; returned by _Alignof().
> m68k only 16bit aligns 32bit values.
> - The 'preferred' alignment returned by __alignof__().
> 32bit x86 returns 8 for 64bit types even though the ABI only 4-byte aligns them.
> - The 'natural' alignment based on the size of the item.
> I'd guess that 'complex double' (if supported) may only be 8 byte aligned.
>
Those distinctions could be useful in a discussion about memory
efficiency. But this document is concerned with avoiding a performance
penalty -- it's entirely unconcerned with over-alignment and memory waste.
Hence, "aligned" is used as shorthand for "naturally aligned".
The ambiguity in this document (and my proposed change) stems from using
the word architecture to cover ABI, platform, CPU, ISA etc.
I can improve upon that.
> What normally matters is the ABI alignment for structure members.
> If you mark anything 'packed' the compiler will generate shifts and masks (etc)
> to get working code.
> Taking the address of an item in a packed structure generates a warning
> for very good reason.
>
I believe the problem with 'packed' is already covered in this document.
> [1] I've fallen foul of gcc deciding to 'vectorise' a loop and then having
> it crash because the buffer address was misaligned.
> Nasty because the code worked in initial testing and I expected the loop
> (32bit adds of a buffer) to work fine even when misaligned.
>
I think that pitfall is already discussed also, along with a remedy.
There is also this,
... for standard structure types you can always rely on the compiler
to pad structures so that accesses to fields are suitably aligned
(assuming you do not cast the field to a type of different length).
So it seems to be fairly comprehensive but I may be missing something (?)
* Re: [RFC v3 1/5] documentation: Discourage alignment assumptions
2025-10-07 22:19 ` [RFC v3 1/5] documentation: Discourage alignment assumptions Finn Thain
2025-10-14 10:23 ` David Laight
@ 2025-10-16 13:30 ` Arnd Bergmann
2025-10-16 22:14 ` Finn Thain
1 sibling, 1 reply; 8+ messages in thread
From: Arnd Bergmann @ 2025-10-16 13:30 UTC (permalink / raw)
To: Finn Thain, Peter Zijlstra, Will Deacon
Cc: Andrew Morton, Boqun Feng, Jonathan Corbet, Mark Rutland,
linux-kernel, Linux-Arch, Geert Uytterhoeven, linux-m68k,
linux-doc
On Wed, Oct 8, 2025, at 00:19, Finn Thain wrote:
> Discourage assumptions that simply don't hold for all Linux ABIs.
> Exceptions to the natural alignment rule for scalar types include
> long long on i386 and sh.
> ---
I think both of the paragraphs you remove are still correct and I
would not remove them:
> Documentation/core-api/unaligned-memory-access.rst | 7 -------
> 1 file changed, 7 deletions(-)
>
> diff --git a/Documentation/core-api/unaligned-memory-access.rst
> b/Documentation/core-api/unaligned-memory-access.rst
> index 5ceeb80eb539..1390ce2b7291 100644
> --- a/Documentation/core-api/unaligned-memory-access.rst
> +++ b/Documentation/core-api/unaligned-memory-access.rst
>
> -When writing code, assume the target architecture has natural alignment
> -requirements.
> -
It is clearly important to not intentionally misalign variables
because that breaks on hardware that requires aligned data.
Assuming natural alignment is the safe choice here, but you
could change 'architecture' to 'hardware' here if you
think that is otherwise ambiguous.
> -Similarly, you can also rely on the compiler to align variables and function
> -parameters to a naturally aligned scheme, based on the size of the type of
> -the variable.
This also seems to refer to something else: even on m68k
and i386, scalar stack and .data variables have natural
alignment even though the ABI does not require that.
It's probably a good idea to list the specific exceptions to
the struct layout rules in the previous paragraph, e.g.
[
Fortunately, the compiler understands the alignment constraints, so in the
above case it would insert 2 bytes of padding in between field1 and field2.
Therefore, for standard structure types you can always rely on the compiler
-to pad structures so that accesses to fields are suitably aligned (assuming
-you do not cast the field to a type of different length).
+to pad structures so that accesses to fields are suitably aligned for
+the CPU hardware.
+On all 64-bit architectures, this means that all scalar struct members
+are naturally aligned. However, some 32-bit ABIs including i386
+only align 64-bit members on 32-bit offsets, and m68k uses at most
+16-bit alignment.
]
Arnd
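The layout rules in the proposed text can be checked on any given target
with a sketch along these lines. The field1/field2 names follow the
quoted paragraph; their types, the second struct, and all demo_* names
are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed shape of the example: a 16-bit field followed by a 32-bit one. */
struct demo_foo {
	uint16_t field1;
	uint32_t field2;
};

/* Hypothetical second case: a 64-bit member after a 32-bit one. */
struct demo_bar {
	uint32_t narrow;
	uint64_t wide;
};

/*
 * Padding the compiler inserted between field1 and field2. On an ABI that
 * naturally aligns 32-bit members this is 2; under the 16-bit m68k rule
 * quoted above it would be 0.
 */
static inline size_t demo_padding_before_field2(void)
{
	return offsetof(struct demo_foo, field2) - sizeof(uint16_t);
}

/* Offset of the 64-bit member; 8 if naturally aligned, possibly 4 otherwise. */
static inline size_t demo_wide_offset(void)
{
	return offsetof(struct demo_bar, wide);
}

/* Whatever the ABI, the member offset honours the alignment it reports. */
static inline void demo_check_layout(void)
{
	assert(offsetof(struct demo_foo, field2) % _Alignof(uint32_t) == 0);
}
```

Code that depends on a particular answer from demo_wide_offset() is
relying on the ABI, which is the kind of assumption this thread is
about.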
* Re: [RFC v3 1/5] documentation: Discourage alignment assumptions
2025-10-16 13:30 ` Arnd Bergmann
@ 2025-10-16 22:14 ` Finn Thain
0 siblings, 0 replies; 8+ messages in thread
From: Finn Thain @ 2025-10-16 22:14 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Boqun Feng,
Jonathan Corbet, Mark Rutland, linux-kernel, Linux-Arch,
Geert Uytterhoeven, linux-m68k, linux-doc
On Thu, 16 Oct 2025, Arnd Bergmann wrote:
> On Wed, Oct 8, 2025, at 00:19, Finn Thain wrote:
> > Discourage assumptions that simply don't hold for all Linux ABIs.
> > Exceptions to the natural alignment rule for scalar types include
> > long long on i386 and sh.
> > ---
>
> I think both of the paragraphs you remove are still correct and I
> would not remove them:
>
Yes -- correct in some obscure sense, but misleading at face value.
Hence this patch.
> > Documentation/core-api/unaligned-memory-access.rst | 7 -------
> > 1 file changed, 7 deletions(-)
> >
> > diff --git a/Documentation/core-api/unaligned-memory-access.rst
> > b/Documentation/core-api/unaligned-memory-access.rst
> > index 5ceeb80eb539..1390ce2b7291 100644
> > --- a/Documentation/core-api/unaligned-memory-access.rst
> > +++ b/Documentation/core-api/unaligned-memory-access.rst
> >
> > -When writing code, assume the target architecture has natural alignment
> > -requirements.
> > -
>
> It is clearly important to not intentionally misalign variables
> because that breaks on hardware that requires aligned data.
>
> Assuming natural alignment is the safe choice here, but you
> could change 'architecture' to 'hardware' here if you
> think that is otherwise ambiguous.
>
Do you know of any hardware that has "natural alignment requirements" in
the completely unqualified sense in which the claim is made? i.e.
considering all of its native vector and scalar types.
That's a rhetorical question. I'm not trying to provide an exhaustive and
up-to-date list of platform quirks. With this patch, I'm merely trying to
discourage faulty assumptions.
> > -Similarly, you can also rely on the compiler to align variables and function
> > -parameters to a naturally aligned scheme, based on the size of the type of
> > -the variable.
>
> This also seems to refer to something else: even on m68k and i386,
> scalar stack and .data variables have natural alignment even though the
> ABI does not require that.
>
Then it is doubly misleading.
There is value in explaining what the compiler can and cannot be relied
upon to deliver, but I think the myfunc() example already serves that
purpose. Don't you agree?
> It's probably a good idea to list the specific exceptions to
> the struct layout rules in the previous paragraph, e.g.
>
> [
> Fortunately, the compiler understands the alignment constraints, so in the
> above case it would insert 2 bytes of padding in between field1 and field2.
> Therefore, for standard structure types you can always rely on the compiler
> -to pad structures so that accesses to fields are suitably aligned (assuming
> -you do not cast the field to a type of different length).
> +to pad structures so that accesses to fields are suitably aligned for
> +the CPU hardware.
> +On all 64-bit architectures,
The poor reader: "I wonder what '64-bit architecture' means in this
context..."
> this means that all scalar struct members
> +are naturally aligned. However, some 32-bit ABIs including i386
> +only align 64-bit members on 32-bit offsets, and m68k uses at most
> +16-bit alignment. ]
"... oh, okay, so anything that naturally aligns a 64-bit word is a
"64-bit architecture". I get it. Oh, hang-on, that doesn't make sense...
Why would any 32-bit architecture be expected to naturally align 64-bit
members? What is he talking about? CONFIG_64BIT maybe??"
Moreover, you have digressed into a discussion of the ABI. The aim of this
document is to avoid the performance penalty for unaligned accesses.
Whereas, ABI traits would seem to be relevant to a discussion about memory
efficiency, like I said to David.