From: Eric Dumazet <dada1@cosmosbay.com>
To: David Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org, xemul@sw.ru, devel@openvz.org,
bridge@lists.osdl.org
Subject: Re: [BRIDGE] Unaligned access on IA64 when comparing ethernet addresses
Date: Thu, 19 Apr 2007 22:29:28 +0200
Message-ID: <4627D128.7060707@cosmosbay.com>
In-Reply-To: <20070419.130101.91442981.davem@davemloft.net>
David Miller wrote:
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Thu, 19 Apr 2007 16:14:23 +0200
>
>> On Wed, 18 Apr 2007 13:04:22 -0700 (PDT)
>> David Miller <davem@davemloft.net> wrote:
>>
>>> Although I don't think gcc does anything fancy since we don't
>>> use memcmp(). It's a tradeoff, we'd like to use unsigned long
>>> comparisons when both objects are aligned correctly but we also
>>> don't want it to use any more than one potentially mispredicted
>>> branch.
>> Again, memcmp() *cannot* be optimized, because its semantics are to compare bytes.
>>
>> memcpy() can take alignment into account if it is known at compile time; memcmp() cannot.
>>
>> http://lists.openwall.net/netdev/2007/03/13/31
>
> I was perhaps thinking about strlen() where I know several
> implementations work a word at a time even though it is
> a byte-based operation:
>
> --------------------
> #define LO_MAGIC 0x01010101
> #define HI_MAGIC 0x80808080
> ...
> sethi %hi(HI_MAGIC), %o4
> ...
> or %o4, %lo(HI_MAGIC), %o3
> ...
> sethi %hi(LO_MAGIC), %o4
> ...
> or %o4, %lo(LO_MAGIC), %o2
> ...
> 8:
> ld [%o0], %o5
> 2:
> sub %o5, %o2, %o4
> andcc %o4, %o3, %g0
> be,pt %icc, 8b
> add %o0, 4, %o0
> --------------------
>
> I figured some similar trick could be done with strcmp() and
> memcmp().
>
>
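
For reference, the word-at-a-time test in the quoted SPARC loop can be
sketched in C roughly as below. This is only an illustration: the names
has_zero_byte() and wordwise_strlen() are made up here, and the sketch uses
the exact form of the test, (w - LO_MAGIC) & ~w & HI_MAGIC, whereas the
quoted loop uses the cheaper (w - LO_MAGIC) & HI_MAGIC, which can also fire
on bytes above 0x80 and so needs a byte-wise check afterwards. The aligned
word load may read a few bytes past the terminating NUL; that stays inside
one aligned word but is not strictly portable C.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LO_MAGIC 0x01010101u
#define HI_MAGIC 0x80808080u

/* Nonzero if and only if at least one byte of w is 0x00. */
static inline uint32_t has_zero_byte(uint32_t w)
{
	return (w - LO_MAGIC) & ~w & HI_MAGIC;
}

/* Hypothetical helper: strlen() that scans four bytes per iteration. */
size_t wordwise_strlen(const char *s)
{
	const char *p = s;

	/* Byte loop until p is 4-byte aligned. */
	while ((uintptr_t)p % 4 != 0) {
		if (*p == '\0')
			return (size_t)(p - s);
		p++;
	}

	/* Word loop: skip every aligned word that holds no NUL byte. */
	for (;;) {
		uint32_t w;

		/* Assumption: reading the whole aligned word is acceptable. */
		memcpy(&w, p, sizeof(w));
		if (has_zero_byte(w))
			break;
		p += 4;
	}

	/* The NUL is somewhere in this word; find it byte by byte. */
	while (*p != '\0')
		p++;
	return (size_t)(p - s);
}
```

The zero-byte test does not depend on byte order, which is why it suits
strlen(): only the presence of a NUL matters, not which side of the word
it is on.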
Hmm, I was referring to IA64 (or the more widespread x86 arches), which are
little-endian AFAIK.
On big-endian machines, a compiler can indeed perform some word tricks for
memcmp() if it knows at compile time that both pointers are word aligned.
PowerPC example (xlc compiler):
#include <string.h>

/* Both pointers are known to be word aligned at compile time. */
int func(const unsigned int *a, const unsigned int *b)
{
	return memcmp(a, b, 6);
}
.func: # 0x00000000 (H.10.NO_SYMBOL)
l r5,0(r3)
l r0,0(r4)
cmp 0,r5,r0
bc BO_IF_NOT,CR0_EQ,__L2c
lhz r3,4(r3)
lhz r0,4(r4)
sf r0,r0,r3
sfze r3,r0
a r0,r3,r0
aze r3,r0
bcr BO_ALWAYS,CR0_LT
__L2c: # 0x0000002c (H.10.NO_SYMBOL+0x2c)
sf r0,r0,r5
sfze r3,r0
a r0,r3,r0
aze r3,r0
bcr BO_ALWAYS,CR0_LT
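
Why byte order matters here can be sketched in portable C. On a big-endian
machine the first byte in memory is the most significant one, so an unsigned
word comparison orders two buffers the same way a byte-wise comparison does.
On a little-endian machine a word comparison still answers "equal or not",
but not "which one is smaller". The helpers below (load_be32, load_le32,
cmp4_be, cmp4_le) are hypothetical names that emulate both byte orders on
any host; they are not taken from any compiler or library.

```c
#include <stdint.h>

/* Assemble 4 bytes the way a big-endian word load would see them. */
static inline uint32_t load_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Assemble 4 bytes the way a little-endian word load would see them. */
static inline uint32_t load_le32(const unsigned char *p)
{
	return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[1] << 8) | (uint32_t)p[0];
}

/* Returns -1, 0 or 1; the sign matches memcmp(a, b, 4). */
int cmp4_be(const unsigned char *a, const unsigned char *b)
{
	uint32_t x = load_be32(a), y = load_be32(b);

	return (x > y) - (x < y);
}

/* Returns -1, 0 or 1; only the zero/nonzero part matches memcmp(a, b, 4). */
int cmp4_le(const unsigned char *a, const unsigned char *b)
{
	uint32_t x = load_le32(a), y = load_le32(b);

	return (x > y) - (x < y);
}
```

With a = {0x01, 0x00, 0x00, 0x02} and b = {0x00, 0x00, 0x00, 0x03},
memcmp(a, b, 4) is positive and so is cmp4_be(a, b), while cmp4_le(a, b) is
negative.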
But to compare 6 bytes known to be aligned to even addresses, the current code
is just fine and portable. We *could* use arch/endian-specific tricks to save
one or two cycles, but who really wants that?
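
To make the portable approach concrete, here is a minimal userspace sketch
of a branch-free 6-byte equality test done as three 16-bit comparisons. The
name ether_addr_differs() is made up for this illustration and is not the
kernel helper. memcpy() is used for the loads so the sketch stays valid C
whatever the alignment; code that can rely on the even-address guarantee
would more likely read the 16-bit words directly.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: returns 0 if the two 6-byte addresses are equal,
 * 1 otherwise. It only answers equality, so byte order is irrelevant.
 */
static inline unsigned ether_addr_differs(const unsigned char *addr1,
					  const unsigned char *addr2)
{
	uint16_t a[3], b[3];

	/* Copy into 16-bit words; no alignment assumption is needed here. */
	memcpy(a, addr1, sizeof(a));
	memcpy(b, addr2, sizeof(b));

	/* XOR is zero for equal words; OR the three results together. */
	return ((a[0] ^ b[0]) | (a[1] ^ b[1]) | (a[2] ^ b[2])) != 0;
}
```

There is no data-dependent branch and nothing wider than a 16-bit access,
so the same code works on little-endian and big-endian machines and never
needs more than 2-byte alignment.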