public inbox for linux-arm-kernel@lists.infradead.org
 help / color / mirror / Atom feed
From: arnd@arndb.de (Arnd Bergmann)
To: linux-arm-kernel@lists.infradead.org
Subject: DWord alignment on ARMv7
Date: Fri, 04 Mar 2016 15:41:58 +0100	[thread overview]
Message-ID: <2494410.DJhrOHD4yE@wuerfel> (raw)
In-Reply-To: <20160304134651.GB19428@n2100.arm.linux.org.uk>

On Friday 04 March 2016 13:46:51 Russell King - ARM Linux wrote:
> On Fri, Mar 04, 2016 at 02:30:23PM +0100, Arnd Bergmann wrote:
> > Ah, I thought it only required 32-bit alignment like ldm/stm, but it
> > seems that it won't do that. However, an implementation like
> > 
> > unsigned long long get_unaligned_u64(void *p)
> > {
> >         unsigned long long upper, lower;
> >         lower = *(unsigned long*)p;
> >         upper = *(unsigned long*)(p+4);
> > 
> >         return lower | (upper << 32);
> > }
> > 
> > does get compiled into
> > 
> > 00000000 <f>:
> >    0:   e8900003        ldm     r0, {r0, r1}
> >    4:   e12fff1e        bx      lr
> 
> I think it may be something of a bitch to work around, because the
> compiler is going to do stuff like that behind your back.
> 
> The only way around that would be to bypass the compiler by using
> asm(), but then you end up bypassing the instruction scheduling too.
> That may not matter, as the resulting overhead may still be lower.

I think the compiler is correctly optimizing the code according to
what we tell it about the alignment here: the ABI explicitly specifies
that variables are naturally aligned, so whenever we dereference
a 32-bit pointer that is not __packed, it should assume it can safely
use any instruction that requires this alignment. This is essentially
what get_unaligned() is designed to work around, when we have
data that is known to be possibly misaligned.

Simply using the generic implementation from before 0567f5facbd
("asm-generic: allow generic unaligned access if the arch supports it")
solves the problem, as we explicitly tell the compiler about the
fact that data is unaligned and let it do the right thing (bytewise
access on ARMv5 or lower, normal access except ldm and ldlr on
later architectures).

The implementation appears to be suboptimal for cross-endian loads
though, as gcc-5.3 does not use the 'rev' instruction here but
falls back on byte accesses. We can easily fix that by introducing
one more generic implementation for the cross-endian accesses doing

static __always_inline void put_unaligned_be64(u64 val, void *p)
{
       __put_unaligned_cpu64((u64 __force)cpu_to_be64(val), p);
}
static __always_inline void put_unaligned_be32(u32 val, void *p)
{
       __put_unaligned_cpu32((u32 __force)cpu_to_be32(val), p);
}
static __always_inline void put_unaligned_be16(u16 val, void *p)
{
       __put_unaligned_cpu16((u16 __force)cpu_to_be16(val), p);
}

which is better on ARM than the currently used

static inline void __put_unaligned_le16(u16 val, u8 *p)
{
        *p++ = val;
        *p++ = val >> 8;
}
static inline void __put_unaligned_le32(u32 val, u8 *p)
{
        __put_unaligned_le16(val >> 16, p + 2);
        __put_unaligned_le16(val, p);
}
static inline void __put_unaligned_le64(u64 val, u8 *p)
{
        __put_unaligned_le32(val >> 32, p + 4);
        __put_unaligned_le32(val, p);
}

because it will allow using ldr/str+rev for unaligned
cross-endian accesses, but disallow ldm/stm+rev.

	Arnd

  reply	other threads:[~2016-03-04 14:41 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-03 22:27 DWord alignment on ARMv7 Marc Kleine-Budde
2016-03-03 23:54 ` Will Deacon
2016-03-04  8:01   ` btrfs_get_token_64() alignment problem on ARM (was: Re: DWord alignment on ARMv7) Marc Kleine-Budde
2016-03-04  9:16     ` David Sterba
2016-03-04 10:48   ` DWord alignment on ARMv7 Ard Biesheuvel
2016-03-04 11:02     ` Russell King - ARM Linux
2016-03-04 11:14       ` Ard Biesheuvel
2016-03-04 11:19         ` Marc Kleine-Budde
2016-03-04 11:26           ` Ard Biesheuvel
2016-03-04 11:38         ` Arnd Bergmann
2016-03-04 11:44           ` Ard Biesheuvel
2016-03-04 13:30             ` Arnd Bergmann
2016-03-04 13:33               ` Ard Biesheuvel
2016-03-04 13:46               ` Russell King - ARM Linux
2016-03-04 14:41                 ` Arnd Bergmann [this message]
2016-03-04 14:56                   ` Russell King - ARM Linux
2016-03-04 15:49                     ` Arnd Bergmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2494410.DJhrOHD4yE@wuerfel \
    --to=arnd@arndb.de \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox