linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: linux@arm.linux.org.uk (Russell King - ARM Linux)
To: linux-arm-kernel@lists.infradead.org
Subject: DWord alignment on ARMv7
Date: Fri, 4 Mar 2016 14:56:56 +0000	[thread overview]
Message-ID: <20160304145656.GC19428@n2100.arm.linux.org.uk> (raw)
In-Reply-To: <2494410.DJhrOHD4yE@wuerfel>

On Fri, Mar 04, 2016 at 03:41:58PM +0100, Arnd Bergmann wrote:
> On Friday 04 March 2016 13:46:51 Russell King - ARM Linux wrote:
> > On Fri, Mar 04, 2016 at 02:30:23PM +0100, Arnd Bergmann wrote:
> > > Ah, I thought it only required 32-bit alignment like ldm/stm, but it
> > > seems that it won't do that. However, an implementation like
> > > 
> > > unsigned long long get_unaligned_u64(void *p)
> > > {
> > >         unsigned long long upper, lower;
> > >         lower = *(unsigned long*)p;
> > >         upper = *(unsigned long*)(p+4);
> > > 
> > >         return lower | (upper << 32);
> > > }
> > > 
> > > does get compiled into
> > > 
> > > 00000000 <f>:
> > >    0:   e8900003        ldm     r0, {r0, r1}
> > >    4:   e12fff1e        bx      lr
> > 
> > I think it may be something of a bitch to work around, because the
> > compiler is going to do stuff like that behind your back.
> > 
> > The only way around that would be to bypass the compiler by using
> > asm(), but then you end up bypassing the instruction scheduling too.
> > That may not matter, as the resulting overhead may still be lower.
> 
> I think the compiler is correctly optimizing the code according to
> what we tell it about the alignment here:

It is, that's not what I was saying though - I wasn't saying whether or
not the compiler is correctly optimising the code (it is.)  I was saying
that we _don't_ want the compiler optimising the code in this way here -
that's a completely _different_ point.

> The implementation appears to be suboptimal for cross-endian loads
> though, as gcc-5.3 does not use the 'rev' instruction here but
> falls back on byte accesses. We can easily fix that by introducing
> one more generic implementation for the cross-endian accesses doing
> 
> static __always_inline void put_unaligned_be64(u64 val, void *p)
> {
>        __put_unaligned_cpu64((u64 __force)cpu_to_be64(val), p);
> }
> static __always_inline void put_unaligned_be32(u32 val, void *p)
> {
>        __put_unaligned_cpu32((u32 __force)cpu_to_be32(val), p);
> }
> static __always_inline void put_unaligned_be16(u16 val, void *p)
> {
>        __put_unaligned_cpu16((u16 __force)cpu_to_be16(val), p);
> }
> 
> which is better on ARM than the currently used
> 
> static inline void __put_unaligned_le16(u16 val, u8 *p)
> {
>         *p++ = val;
>         *p++ = val >> 8;
> }
> static inline void __put_unaligned_le32(u32 val, u8 *p)
> {
>         __put_unaligned_le16(val >> 16, p + 2);
>         __put_unaligned_le16(val, p);
> }
> static inline void __put_unaligned_le64(u64 val, u8 *p)
> {
>         __put_unaligned_le32(val >> 32, p + 4);
>         __put_unaligned_le32(val, p);
> }
> 
> because it will allow using ldr/str+rev for unaligned
> cross-endian accesses, but disallow ldm/stm+rev.

Looks like it's going to make the unaligned stuff even more of a rabbit
warren of include files.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

  reply	other threads:[~2016-03-04 14:56 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-03 22:27 DWord alignment on ARMv7 Marc Kleine-Budde
2016-03-03 23:54 ` Will Deacon
2016-03-04  8:01   ` btrfs_get_token_64() alignment problem on ARM (was: Re: DWord alignment on ARMv7) Marc Kleine-Budde
2016-03-04  9:16     ` David Sterba
2016-03-04 10:48   ` DWord alignment on ARMv7 Ard Biesheuvel
2016-03-04 11:02     ` Russell King - ARM Linux
2016-03-04 11:14       ` Ard Biesheuvel
2016-03-04 11:19         ` Marc Kleine-Budde
2016-03-04 11:26           ` Ard Biesheuvel
2016-03-04 11:38         ` Arnd Bergmann
2016-03-04 11:44           ` Ard Biesheuvel
2016-03-04 13:30             ` Arnd Bergmann
2016-03-04 13:33               ` Ard Biesheuvel
2016-03-04 13:46               ` Russell King - ARM Linux
2016-03-04 14:41                 ` Arnd Bergmann
2016-03-04 14:56                   ` Russell King - ARM Linux [this message]
2016-03-04 15:49                     ` Arnd Bergmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160304145656.GC19428@n2100.arm.linux.org.uk \
    --to=linux@arm.linux.org.uk \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).