linux-toolchains.vger.kernel.org archive mirror
From: Segher Boessenkool <segher@kernel.crashing.org>
To: Fangrui Song <maskray@sourceware.org>
Cc: Indu Bhagat <indu.bhagat@oracle.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Jan Beulich <jbeulich@suse.com>,
	Rainer Orth <ro@cebitec.uni-bielefeld.de>,
	"linux-toolchains@vger.kernel.org"
	<linux-toolchains@vger.kernel.org>,
	Jens Remus <jremus@linux.ibm.com>,
	Sterling Augustine <saugustine@google.com>,
	Pavel Labath <labath@google.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	Josh Poimboeuf <jpoimboe@kernel.org>,
	Serhei Makarov <smakarov@redhat.com>,
	Binutils <binutils@sourceware.org>
Subject: Re: Unaligned access trade-offs for SFrame FRE layout
Date: Tue, 16 Sep 2025 12:54:24 -0500
Message-ID: <aMmkUNXT2Fi-0D1h@gate>
In-Reply-To: <aMmT6q1jERmMq-AR@gate>

On Tue, Sep 16, 2025 at 11:44:26AM -0500, Segher Boessenkool wrote:
> On Tue, Sep 16, 2025 at 09:32:30AM -0700, Fangrui Song wrote:
> > The read32le(p) function is either a standard read or a byte-swapped
> > read.
> 
> You should never overcomplicate things by doing byte-swaps.  Instead,
> just say what you mean:
> 
> u32 read32le(u8 *p)
> {
> 	return p[0] + 0x100*p[1] + 0x10000*p[2] + 0x1000000*p[3];
> }
> 
> or something like that.  The compiler can optimise such things just
> fine!  There is no need to go via extra indirections.

The following actually compiles to optimal code, both with -mbig and
with -mlittle:

===
typedef unsigned int u32;
typedef unsigned char u8;

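u32 read32le(u8 *p)
{
        /* Cast each byte to u32 before shifting: a bare p[3] promotes to
           (signed) int, and shifting that left by 24 can overflow.  */
        return (u32)p[0] | (u32)p[1]<<8 | (u32)p[2]<<16 | (u32)p[3]<<24;
}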
===

With -O2 -mbig:
        lwbrx 3,0,3      # 10   [c=8 l=4]  bswapsi2_load
        blr              # 18   [c=4 l=4]  simple_return
(on a BE system), and with -O2 -mlittle:
        lwz 3,0(3)       # 11   [c=8 l=4]  *movsi_internal1/3
        blr              # 19   [c=4 l=4]  simple_return
(I used -mcpu=power10 because a) why not, and b) for an ancient CPU
GCC takes more care not to do misaligned accesses.  Power8 is fine
already; 970 (aka Apple G5) isn't, for the LE accesses on a BE host.
And that is good, because such accesses will frequently trap, so on
average they are quite expensive if done as a single read.)
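
For anyone who wants to check this on their own target, here is a
minimal sketch (not taken from any patch in this thread) that compares
the shift-and-OR reader with a "plain load, then swap on BE" variant at
every offset of a small buffer, misaligned ones included.  The names
read32le_swap and check_read32le are hypothetical, and the
__BYTE_ORDER__ test and __builtin_bswap32 assume GCC or Clang:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t u32;
typedef uint8_t u8;

/* Shift-and-OR form: the result does not depend on host endianness. */
static u32 read32le(const u8 *p)
{
	return (u32)p[0] | (u32)p[1] << 8 | (u32)p[2] << 16 | (u32)p[3] << 24;
}

/*
 * Hypothetical "standard read or byte-swapped read" variant, for
 * comparison only.  memcpy keeps the load safe at any alignment.
 * Assumption: a compiler that does not define __BYTE_ORDER__ is
 * treated as targeting a little-endian host.
 */
static u32 read32le_swap(const u8 *p)
{
	u32 v;

	memcpy(&v, p, sizeof v);
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	v = __builtin_bswap32(v);
#endif
	return v;
}

/* Compare both readers at every offset; returns the number checked. */
static int check_read32le(void)
{
	static const u8 buf[8] = {
		0x00, 0x78, 0x56, 0x34, 0x12, 0xff, 0xff, 0xff
	};
	int n = 0;

	for (size_t off = 0; off + 4 <= sizeof buf; off++, n++)
		assert(read32le(buf + off) == read32le_swap(buf + off));

	/* Bytes 78 56 34 12 starting at an odd offset. */
	assert(read32le(buf + 1) == 0x12345678u);
	/* High byte 0xff exercises the <<24 term. */
	assert(read32le(buf + 4) == 0xffffff12u);
	return n;
}
```

Neither form dereferences a possibly misaligned u32 pointer, so the
compiler stays free to pick a single load or separate byte loads
depending on -mcpu.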


Segher
