* misaligned load/store in ppc32 memcpy
@ 2009-06-30 21:56 Kumar Gala
2009-07-02 1:32 ` Mark Nelson
2009-07-02 3:40 ` Paul Mackerras
0 siblings, 2 replies; 3+ messages in thread
From: Kumar Gala @ 2009-06-30 21:56 UTC (permalink / raw)
To: Mark Nelson; +Cc: linuxppc-dev@ozlabs.org list
* Re: misaligned load/store in ppc32 memcpy
2009-06-30 21:56 misaligned load/store in ppc32 memcpy Kumar Gala
@ 2009-07-02 1:32 ` Mark Nelson
2009-07-02 3:40 ` Paul Mackerras
1 sibling, 0 replies; 3+ messages in thread
From: Mark Nelson @ 2009-07-02 1:32 UTC (permalink / raw)
To: Kumar Gala; +Cc: Paul Mackerras, linuxppc-dev
Hi Kumar,
On Wednesday 01 July 2009 07:56:31 Kumar Gala wrote:
> Mark,
>
> Ben pointed me to you regarding my question if we should be expecting misaligned load/store operations in the ppc32 memcpy that exists in copy_32.S.
>
> (To be more specific, I'm seeing this behavior and wondering if we really should have memcpy avoid doing word size ld/st if the addresses aren't also aligned)
>
That's a good question; but because I don't really know very much about the
ppc32 cores that this memcpy routine was written and tested with (let alone
the newer embedded variations) I'll CC Paul and hopefully he can let us know :)
Thanks!
Mark
* Re: misaligned load/store in ppc32 memcpy
2009-06-30 21:56 misaligned load/store in ppc32 memcpy Kumar Gala
2009-07-02 1:32 ` Mark Nelson
@ 2009-07-02 3:40 ` Paul Mackerras
1 sibling, 0 replies; 3+ messages in thread
From: Paul Mackerras @ 2009-07-02 3:40 UTC (permalink / raw)
To: Kumar Gala; +Cc: Mark Nelson, linuxppc-dev@ozlabs.org list
Kumar Gala writes:
> Ben pointed me to you regarding my question if we should be expecting
> misaligned load/store operations in the ppc32 memcpy that exists in
> copy_32.S.
> (To be more specific, I'm seeing this behavior and wondering if we
> really should have memcpy avoid doing word size ld/st if the addresses
> aren't also aligned)
We align the destination to a word boundary using byte-by-byte copies,
then copy words using word loads and stores. The loads may be
misaligned, but they are still faster than doing aligned loads and
shuffling the bits around - or at least they were when the speed was
measured 10 years or so ago, which would have been on 750 or 74xx CPUs.
If the penalty for unaligned loads on Freescale embedded cores is high
enough that it's faster to shuffle the bits or to copy byte-by-byte
then we can have an alternative version for them.
Paul.
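For readers following along, the strategy Paul describes can be sketched in C roughly as follows. This is only an illustration of the idea (byte copies until the destination is word aligned, then word-sized loads and stores where the load may be misaligned); it is not the assembly in copy_32.S, and the name sketch_memcpy is hypothetical. The fixed-size memcpy() calls are an assumption made to keep the sketch well defined in portable C; compilers typically lower them to single word loads and stores on targets that allow unaligned access.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration only; not the kernel's copy_32.S routine. */
static void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Step 1: copy byte-by-byte until the destination is word aligned. */
    while (n > 0 && ((uintptr_t)d & (sizeof(uint32_t) - 1)) != 0) {
        *d++ = *s++;
        n--;
    }

    /*
     * Step 2: copy whole words. The destination is now aligned, but the
     * source may not be, so each load here is potentially misaligned.
     */
    while (n >= sizeof(uint32_t)) {
        uint32_t w;

        memcpy(&w, s, sizeof w);   /* word load, possibly misaligned */
        memcpy(d, &w, sizeof w);   /* word store to aligned address  */
        d += sizeof w;
        s += sizeof w;
        n -= sizeof w;
    }

    /* Step 3: copy any remaining tail bytes. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }

    return dst;
}
```

Whether step 2 is a win depends on what a misaligned word load costs on the core in question, which is exactly the open point in this thread.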