From: Indu Bhagat <indu.bhagat@oracle.com>
To: Jens Remus <jremus@linux.ibm.com>,
Sterling Augustine <saugustine@google.com>,
Pavel Labath <labath@google.com>,
Andrii Nakryiko <andrii@kernel.org>,
Josh Poimboeuf <jpoimboe@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Serhei Makarov <smakarov@redhat.com>,
Binutils <binutils@sourceware.org>
Cc: "linux-toolchains@vger.kernel.org" <linux-toolchains@vger.kernel.org>
Subject: Unaligned access trade-offs for SFrame FRE layout
Date: Fri, 12 Sep 2025 10:34:42 -0700
Message-ID: <b7b139c6-1963-4ffc-a872-518010a50563@oracle.com>
TL;DR: After thinking about and experimenting a bit with the possible
approaches for avoiding unaligned accesses in the SFrame FRE layout (in
SFrame V3), I am not convinced that avoiding unaligned accesses for
performance is worth it. IMO, forsaking compactness to avoid unaligned
accesses is not a good trade-off for SFrame.
Problem Statement
On architectures such as x86_64, AArch64, and s390x, unaligned memory
accesses are handled transparently by the hardware but incur a
performance penalty. The objective of this analysis is to evaluate
whether these unaligned accesses can be eliminated from the SFrame FRE
layout and whether doing so provides a net performance benefit.
The central challenge is that any alternative must demonstrate a clear
performance improvement while avoiding significant size overhead.
Introducing "bloat" to the format to solve a potential performance issue
is a poor trade-off.
Source of unaligned accesses in SFrame FRE
- (#1) Access to the SFrame FRE start address (sfre_start_address).
- (#2) Access to the SFrame FRE stack offsets. This is the varlen data
trailing the SFrame FRE top-level members (sfre_start_address and FRE
info), usually interpreted as stack offsets.
(Note that in the SFrame specification, the SFrame Header and the SFrame
FDE (function descriptor entry) have aligned accesses.)
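To make the two sources concrete, here is a minimal C sketch of decoding
one packed FRE. It assumes one illustrative combination of field widths
(a 2-byte start address, a 1-byte FRE info, then 2-byte offsets); the
names fre_view, load_u16, load_s16 and fre_read are hypothetical and are
not libsframe API. The loads go through memcpy so that the C code stays
well-defined even though the addresses are not naturally aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded view of one FRE; not the libsframe type. */
struct fre_view {
    uint32_t start_address;
    uint8_t  info;
    int32_t  offsets[3];
    size_t   num_offsets;
};

/* Load host-endian 16-bit values from a possibly unaligned address. */
static uint16_t load_u16(const unsigned char *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

static int16_t load_s16(const unsigned char *p)
{
    int16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Decode an FRE laid out as: u16 start address, u8 info, then
   num_offsets signed 16-bit offsets.  Returns the bytes consumed. */
static size_t fre_read(const unsigned char *buf, size_t num_offsets,
                       struct fre_view *out)
{
    size_t pos = 0;

    assert(num_offsets <= 3);

    /* Source #1: FREs are packed back to back, so buf itself may sit
       at an odd address (this FRE is 3 + 2 * num_offsets bytes long,
       which puts the next one at an odd distance from this one). */
    out->start_address = load_u16(buf + pos);
    pos += 2;

    out->info = buf[pos];
    pos += 1;

    /* Source #2: the offsets start 3 bytes into the FRE, so they are
       misaligned whenever the FRE itself starts at an even address. */
    for (size_t i = 0; i < num_offsets; i++) {
        out->offsets[i] = load_s16(buf + pos);
        pos += 2;
    }
    out->num_offsets = num_offsets;
    return pos;
}
```

With two offsets such an FRE occupies 7 bytes, so in a run of these FREs
every other one starts at an odd address.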
I have updated the notes on the various approaches, along with the
evaluation notes for each, on the wiki page:
https://sourceware.org/binutils/wiki/sframe/sframev3todo#Avoid_unaligned_accesses
Summary of Approaches and Analysis/Notes
Unaligned accesses may mean lower performance, but the alternative we
pick must at least provide better performance. It is also important
that the chosen approach does not add bloat to the format. Avoiding
unaligned accesses at the expense of bloating up the format is not a
good idea IMO.
Approach 1a: Bucketed members
Pros: Negligible bloat.
Cons: 1. Writing out the FRE data is somewhat more involved, which
affects assemblers and linkers. 2. In the common case, accessing stack
offsets now needs more memory accesses per FRE. This approach will not
bring clear performance benefits, so the additional complexity in SFrame
readers and writers is not justified either.
Approach 1b: Bucketed members with Index
Cons: Significant bloat (~30%).
Approach 2: De-duplicated "stack offsets"
Pros: Will help reduce the size of SFrame sections.
Cons: 1. The SFrame FRE layout is designed to be flexible so that it can
serve the needs of new ABIs: the varlen data is interpreted as stack
offsets on x86_64 and AArch64, but that may not be the case for other
ABIs. De-duplicating unstructured data is not meaningful. 2. Writing out
the FRE data is considerably more involved, increasing complexity in the
toolchain.
Approach 3: Good old basic padding
Cons: Significant bloat (~22%). The performance win is arguable as well.
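As a rough illustration of where padding bytes come from, the C sketch
below compares the size of one FRE packed versus padded. The padding
scheme is an assumption made for this example (pad after the FRE info so
the offsets are naturally aligned, then pad the FRE end so the next FRE
starts aligned); it is not necessarily the scheme evaluated on the wiki,
and it does not reproduce the ~22% figure, which comes from measurements
on real SFrame sections. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (a power of two). */
static size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Packed FRE: start address, 1-byte info, then the offsets. */
static size_t fre_size_packed(size_t addr_size, size_t off_size,
                              size_t num_offsets)
{
    return addr_size + 1 + off_size * num_offsets;
}

/* Padded FRE under the assumed scheme.  The FRE is taken to start at
   an address aligned to the larger of the two field widths. */
static size_t fre_size_padded(size_t addr_size, size_t off_size,
                              size_t num_offsets)
{
    size_t max_align = addr_size > off_size ? addr_size : off_size;
    size_t pos = addr_size + 1;        /* start address + info byte */

    pos = align_up(pos, off_size);     /* padding before the offsets */
    pos += off_size * num_offsets;
    return align_up(pos, max_align);   /* padding before the next FRE */
}
```

For example, with a 2-byte start address and two 2-byte offsets,
fre_size_packed(2, 2, 2) is 7 bytes and fre_size_padded(2, 2, 2) is 8
bytes; with 4-byte offsets the same FRE goes from 11 to 12 bytes.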
IMO, none of these approaches provides a viable way forward. The
proposed methods either fail to deliver a clear performance gain or
introduce a significant size penalty or added complexity, which is an
unacceptable trade-off.
I would like to gather input from interested folks on this. Please
take a look and chime in. Other ideas are welcome.
Thanks