linux-toolchains.vger.kernel.org archive mirror
* Unaligned access trade-offs for SFrame FRE layout
@ 2025-09-12 17:34 Indu Bhagat
  2025-09-12 18:19 ` Segher Boessenkool
                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Indu Bhagat @ 2025-09-12 17:34 UTC (permalink / raw)
  To: Jens Remus, Sterling Augustine, Pavel Labath, Andrii Nakryiko,
	Josh Poimboeuf, Steven Rostedt, Serhei Makarov, Binutils
  Cc: linux-toolchains@vger.kernel.org

TL;DR: After thinking through and experimenting with the possible
approaches for avoiding unaligned accesses in the SFrame FRE layout (in
SFrame V3), I am not convinced that avoiding them for performance reasons
is worth it.  IMO, forsaking compactness to avoid unaligned accesses is
not a good trade-off for SFrame.

Problem Statement
On architectures such as x86_64, AArch64, and s390x, unaligned memory 
accesses are handled transparently by the hardware but incur a 
performance penalty. The objective of this analysis is to evaluate whether
these unaligned accesses can be eliminated from the SFrame FRE layout
and whether doing so provides a net performance benefit.

The central challenge is that any alternative must demonstrate a clear
performance improvement while avoiding significant size overhead. 
Introducing "bloat" to the format to solve a potential performance issue 
is a poor trade-off.

Sources of unaligned accesses in SFrame FRE
  - (#1) Access to the SFrame FRE start address (sfre_start_address).
  - (#2) Access to the SFrame FRE stack offsets.  This is the varlen data
    trailing the SFrame FRE top-level members (sfre_start_address and FRE
    info), usually interpreted as stack offsets.

(Note that in the SFrame specification, accesses to the SFrame Header and
the SFrame FDE (function descriptor entry) are aligned.)
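
To make the two sources concrete, here is a minimal sketch in C.  It
assumes a simplified, hypothetical FRE (a 2-byte start address, a 1-byte
info field, then a variable number of 1-byte offsets); the demo_* names
are illustrative and are not the libsframe definitions.  Because such
records are laid out back to back, the start address of a later FRE can
land on an odd byte position.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified FRE header: 2-byte start address plus 1-byte
   info.  This is an assumption for illustration, not the real layout.  */
enum { DEMO_FRE_HDR_SIZE = 3 };

/* Read a 16-bit field from any byte position.  memcpy avoids the
   undefined behaviour of dereferencing a misaligned pointer; compilers
   commonly lower it to a plain load on targets that permit that.  */
static uint16_t
demo_read_u16 (const uint8_t *p)
{
  uint16_t v;
  memcpy (&v, p, sizeof v);
  return v;
}

/* Walk N back-to-back demo FREs, where NOFFSETS[i] is the number of
   1-byte offsets trailing FRE i, and count the FREs whose start address
   field is not 2-byte aligned relative to the start of the data.  */
static unsigned
demo_count_unaligned (const unsigned *noffsets, size_t n)
{
  size_t pos = 0;
  unsigned count = 0;

  assert (noffsets != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (pos % sizeof (uint16_t) != 0)
        count++;
      pos += DEMO_FRE_HDR_SIZE + noffsets[i];
    }
  return count;
}
```

With trailing offset counts of {2, 2, 1}, the demo FREs start at byte
positions 0, 5 and 10, so the second one has an unaligned start address.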

Notes on the various approaches, along with their respective evaluations,
are kept up to date on the wiki page:
https://sourceware.org/binutils/wiki/sframe/sframev3todo#Avoid_unaligned_accesses

Summary of Approaches and Analysis/Notes
Unaligned accesses may mean lower performance, but the alternative we 
pick must at least provide better performance.  It is also important 
that the chosen approach does not add bloat to the format.  Avoiding 
unaligned accesses at the expense of bloating up the format is not a 
good idea IMO.

Approach 1a: Bucketed members
  Pros: Negligible bloat.
  Cons: 1. Writing out the FRE data is somewhat more involved; this
           affects assemblers and linkers.
        2. In the common case, accessing stack offsets now needs more
           memory accesses per FRE.
  This approach will not bring clear performance benefits, so the
  additional complexity in SFrame readers and writers is not justified
  either.

Approach 1b: Bucketed members with Index
  Cons: Significant bloat (~30%).

Approach 2: De-duplicated "stack offsets"
  Pros: Will help reduce the size of SFrame sections.
  Cons: 1. The SFrame FRE layout is designed to be flexible so that it can
           serve the needs of new ABIs.  The varlen data is interpreted as
           stack offsets on x86_64 and AArch64, but that may not be the
           case for other ABIs.  De-duplicating unstructured data is not
           meaningful.
        2. Writing out the FRE data is considerably more involved,
           increasing complexity in the toolchain.

Approach 3: Good old basic padding
  Cons: Significant bloat (~22%).  The performance win is arguable as well.
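
To illustrate where padding overhead comes from, here is a rough sketch
in C that compares the packed and padded sizes of a single FRE.  The
layout (start address, 1-byte info, then equally sized offsets) and the
demo_* helpers are assumptions made for illustration; the sketch does not
reproduce the ~22% figure, which depends on the real distribution of FRE
shapes.

```c
#include <assert.h>
#include <stddef.h>

/* Round N up to the next multiple of ALIGN, a power of two.  */
static size_t
demo_align_up (size_t n, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (n + align - 1) & ~(align - 1);
}

/* Packed size: start address, 1-byte info and offsets back to back.  */
static size_t
demo_packed_size (size_t addr_size, size_t off_size, unsigned noffsets)
{
  return addr_size + 1 + (size_t) noffsets * off_size;
}

/* Padded size: the offsets begin on an OFF_SIZE boundary, and the whole
   FRE is rounded up so that the next FRE starts suitably aligned for
   both its start address and its offsets.  */
static size_t
demo_padded_size (size_t addr_size, size_t off_size, unsigned noffsets)
{
  size_t hdr = demo_align_up (addr_size + 1, off_size);
  size_t align = addr_size > off_size ? addr_size : off_size;

  return demo_align_up (hdr + (size_t) noffsets * off_size, align);
}
```

For example, with a 2-byte start address and two 1-byte offsets, the
packed FRE takes 5 bytes and the padded one 6.  With a 1-byte start
address and 1-byte offsets, padding costs nothing.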

IMO, none of these approaches provides a viable way forward.  Each one
either fails to deliver a clear performance gain or introduces a
significant size or complexity penalty, which is an unacceptable
trade-off.

I would like to gather input from interested folks on this.  Please
take a look and chime in.  Other ideas are welcome.

Thanks


End of thread: ~2025-09-18 10:39 UTC

Thread overview: 26+ messages
2025-09-12 17:34 Unaligned access trade-offs for SFrame FRE layout Indu Bhagat
2025-09-12 18:19 ` Segher Boessenkool
2025-09-12 19:18 ` Steven Rostedt
2025-09-13  7:56   ` Indu Bhagat
2025-09-15 16:04     ` Steven Rostedt
     [not found]   ` <CAEG7qUxk_cZYv3X_VM6+ZGaVFAD-7jdPd3xA92xYHUAqyzb2Xw@mail.gmail.com>
2025-09-13  8:01     ` Indu Bhagat
2025-09-14 14:14 ` Jan Beulich
2025-09-14 14:39   ` Rainer Orth
2025-09-14 15:23     ` Jan Beulich
2025-09-14 16:18       ` Rainer Orth
2025-09-14 18:10         ` Jan Beulich
2025-09-15  5:42           ` Indu Bhagat
2025-09-15 16:07             ` Steven Rostedt
2025-09-15 17:22               ` Segher Boessenkool
2025-09-16  6:05               ` Fangrui Song
2025-09-16 15:58                 ` Steven Rostedt
2025-09-18 10:39                   ` Jens Remus
2025-09-16 16:03                 ` Indu Bhagat
2025-09-16 16:32                   ` Fangrui Song
2025-09-16 16:44                     ` Segher Boessenkool
2025-09-16 17:05                       ` Fangrui Song
2025-09-16 17:54                       ` Segher Boessenkool
2025-09-16 17:33                     ` Indu Bhagat
2025-09-17 21:12                 ` Steven Rostedt
2025-09-17 23:55                   ` Alan Modra
2025-09-15  9:08       ` Segher Boessenkool
