* [PATCH net-next] ipv6: optimize ipv6 addresses compares
@ 2012-07-11 3:49 Eric Dumazet
2012-07-11 3:59 ` Joe Perches
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Eric Dumazet @ 2012-07-11 3:49 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Joe Perches
From: Eric Dumazet <edumazet@google.com>
On 64-bit arches with efficient unaligned accesses (e.g. x86_64) we can
use long words to reduce the number of instructions for free.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Joe Perches <joe@perches.com>
---
include/net/ipv6.h | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index aecf884..9ac5ded 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -302,10 +302,19 @@ static inline int
ipv6_masked_addr_cmp(const struct in6_addr *a1, const struct in6_addr *m,
const struct in6_addr *a2)
{
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
+ const unsigned long *ul1 = (const unsigned long *)a1;
+ const unsigned long *ulm = (const unsigned long *)m;
+ const unsigned long *ul2 = (const unsigned long *)a2;
+
+ return !!(((ul1[0] ^ ul2[0]) & ulm[0]) |
+ ((ul1[1] ^ ul2[1]) & ulm[1]));
+#else
return !!(((a1->s6_addr32[0] ^ a2->s6_addr32[0]) & m->s6_addr32[0]) |
((a1->s6_addr32[1] ^ a2->s6_addr32[1]) & m->s6_addr32[1]) |
((a1->s6_addr32[2] ^ a2->s6_addr32[2]) & m->s6_addr32[2]) |
((a1->s6_addr32[3] ^ a2->s6_addr32[3]) & m->s6_addr32[3]));
+#endif
}
static inline void ipv6_addr_prefix(struct in6_addr *pfx,
@@ -335,10 +344,17 @@ static inline void ipv6_addr_set(struct in6_addr *addr,
static inline bool ipv6_addr_equal(const struct in6_addr *a1,
const struct in6_addr *a2)
{
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
+ const unsigned long *ul1 = (const unsigned long *)a1;
+ const unsigned long *ul2 = (const unsigned long *)a2;
+
+ return ((ul1[0] ^ ul2[0]) | (ul1[1] ^ ul2[1])) == 0UL;
+#else
return ((a1->s6_addr32[0] ^ a2->s6_addr32[0]) |
(a1->s6_addr32[1] ^ a2->s6_addr32[1]) |
(a1->s6_addr32[2] ^ a2->s6_addr32[2]) |
(a1->s6_addr32[3] ^ a2->s6_addr32[3])) == 0;
+#endif
}
static inline bool __ipv6_prefix_equal(const __be32 *a1, const __be32 *a2,
@@ -391,8 +407,14 @@ bool ip6_frag_match(struct inet_frag_queue *q, void *a);
static inline bool ipv6_addr_any(const struct in6_addr *a)
{
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
+ const unsigned long *ul = (const unsigned long *)a;
+
+ return (ul[0] | ul[1]) == 0UL;
+#else
return (a->s6_addr32[0] | a->s6_addr32[1] |
a->s6_addr32[2] | a->s6_addr32[3]) == 0;
+#endif
}
static inline bool ipv6_addr_loopback(const struct in6_addr *a)
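A quick user-space sanity check of the idea (not part of the patch; struct
addr16 below is just a stand-in for in6_addr's 16 raw bytes, and the pointer
cast mirrors the kernel code): the two-word compare agrees with the four-word
compare it replaces on a 64-bit host.

/* User-space sketch, not part of the patch: compares the 4 x 32-bit form
 * with the 2 x 64-bit form on a 64-bit host. */
#include <stdio.h>
#include <stdint.h>

struct addr16 { uint32_t s6_addr32[4]; };

static int equal32(const struct addr16 *a1, const struct addr16 *a2)
{
	return ((a1->s6_addr32[0] ^ a2->s6_addr32[0]) |
		(a1->s6_addr32[1] ^ a2->s6_addr32[1]) |
		(a1->s6_addr32[2] ^ a2->s6_addr32[2]) |
		(a1->s6_addr32[3] ^ a2->s6_addr32[3])) == 0;
}

static int equal64(const struct addr16 *a1, const struct addr16 *a2)
{
	const unsigned long *ul1 = (const unsigned long *)a1;
	const unsigned long *ul2 = (const unsigned long *)a2;

	return ((ul1[0] ^ ul2[0]) | (ul1[1] ^ ul2[1])) == 0UL;
}

int main(void)
{
	struct addr16 a = { { 0x20010db8, 0, 0, 1 } };
	struct addr16 b = a;
	struct addr16 c = { { 0x20010db8, 0, 0, 2 } };

	printf("a==b: %d %d\n", equal32(&a, &b), equal64(&a, &b)); /* 1 1 */
	printf("a==c: %d %d\n", equal32(&a, &c), equal64(&a, &c)); /* 0 0 */
	return 0;
}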
* Re: [PATCH net-next] ipv6: optimize ipv6 addresses compares
2012-07-11 3:49 [PATCH net-next] ipv6: optimize ipv6 addresses compares Eric Dumazet
@ 2012-07-11 3:59 ` Joe Perches
2012-07-11 4:02 ` David Miller
2012-07-11 4:14 ` Joe Perches
2 siblings, 0 replies; 11+ messages in thread
From: Joe Perches @ 2012-07-11 3:59 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
On Wed, 2012-07-11 at 05:49 +0200, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> On 64-bit arches with efficient unaligned accesses (e.g. x86_64) we can
> use long words to reduce the number of instructions for free.
Thanks Eric. This looks very sensible.
* Re: [PATCH net-next] ipv6: optimize ipv6 addresses compares
2012-07-11 3:49 [PATCH net-next] ipv6: optimize ipv6 addresses compares Eric Dumazet
2012-07-11 3:59 ` Joe Perches
@ 2012-07-11 4:02 ` David Miller
2012-07-11 4:07 ` Eric Dumazet
2012-07-11 4:14 ` Joe Perches
2 siblings, 1 reply; 11+ messages in thread
From: David Miller @ 2012-07-11 4:02 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev, joe
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 11 Jul 2012 05:49:18 +0200
> From: Eric Dumazet <edumazet@google.com>
>
> On 64-bit arches with efficient unaligned accesses (e.g. x86_64) we can
> use long words to reduce the number of instructions for free.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Joe Perches <joe@perches.com>
Maybe we can even be sure that they are 64-bit aligned too?
If there's an embedded u64 in the in6_addr union, they really should
be.
It can't even be an issue in the protocol headers, because in the
socket demux we read the two 32-bit ipv4 addresses in the packet
header as one 64-bit chunk already.
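As a side note on the union point above, a small illustrative sketch
(hypothetical layout; whether in6_addr actually carries such a 64-bit member
is exactly the open question here) of how adding one would force 8-byte
alignment of the whole union:

#include <stdint.h>

struct in6_addr_sketch {
	union {
		uint8_t  u6_addr8[16];
		uint32_t u6_addr32[4];
		uint64_t u6_addr64[2];	/* hypothetical 64-bit member */
	} in6_u;
};

/* The union's alignment follows its most-aligned member, so the struct
 * would be naturally 8-byte aligned wherever it is declared directly. */
_Static_assert(_Alignof(struct in6_addr_sketch) == _Alignof(uint64_t),
	       "union alignment follows its most-aligned member");

Of course, as Eric notes below, that guarantee cannot be assumed once the
bytes live inside an ip6tables rule or a tunneled packet rather than a
properly declared struct.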
* Re: [PATCH net-next] ipv6: optimize ipv6 addresses compares
2012-07-11 4:02 ` David Miller
@ 2012-07-11 4:07 ` Eric Dumazet
2012-07-11 4:13 ` David Miller
0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2012-07-11 4:07 UTC (permalink / raw)
To: David Miller; +Cc: netdev, joe
On Tue, 2012-07-10 at 21:02 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 11 Jul 2012 05:49:18 +0200
>
> > From: Eric Dumazet <edumazet@google.com>
> >
> > On 64-bit arches with efficient unaligned accesses (e.g. x86_64) we can
> > use long words to reduce the number of instructions for free.
> >
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Cc: Joe Perches <joe@perches.com>
>
> Maybe we can even be sure that they are 64-bit aligned too?
>
> If there's an embedded u64 in the in6_addr union, they really should
> be.
>
> It can't even be an issue in the protocol headers, because in the
> socket demux we read the two 32-bit ipv4 addresses in the packet
> header as one 64-bit chunk already.
I don't think this 8-byte alignment is possible with ip6tables.
* Re: [PATCH net-next] ipv6: optimize ipv6 addresses compares
2012-07-11 4:07 ` Eric Dumazet
@ 2012-07-11 4:13 ` David Miller
2012-07-11 4:44 ` Eric Dumazet
0 siblings, 1 reply; 11+ messages in thread
From: David Miller @ 2012-07-11 4:13 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev, joe
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 11 Jul 2012 06:07:58 +0200
> I don't think this 8-byte alignment is possible with ip6tables.
Hmmm, wouldn't it make more sense to make ip6tables use a special
accessor than to penalize everyone?
* Re: [PATCH net-next] ipv6: optimize ipv6 addresses compares
2012-07-11 4:13 ` David Miller
@ 2012-07-11 4:44 ` Eric Dumazet
2012-07-11 4:53 ` Eric Dumazet
2012-07-11 5:44 ` David Miller
0 siblings, 2 replies; 11+ messages in thread
From: Eric Dumazet @ 2012-07-11 4:44 UTC (permalink / raw)
To: David Miller; +Cc: netdev, joe
On Tue, 2012-07-10 at 21:13 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 11 Jul 2012 06:07:58 +0200
>
> > I don't think this 8-byte alignment is possible with ip6tables.
>
> Hmmm, wouldn't it make more sense to make ip6tables use a special
> accessor than to penalize everyone?
But we cannot guarantee 64-bit alignment everywhere.
Think of tunnels, for example.
I don't see where in the demux code we have a 64-bit access?
* Re: [PATCH net-next] ipv6: optimize ipv6 addresses compares
2012-07-11 4:44 ` Eric Dumazet
@ 2012-07-11 4:53 ` Eric Dumazet
2012-07-11 5:44 ` David Miller
1 sibling, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2012-07-11 4:53 UTC (permalink / raw)
To: David Miller; +Cc: netdev, joe
On Wed, 2012-07-11 at 06:44 +0200, Eric Dumazet wrote:
> I don't see where in the demux code we have a 64-bit access?
I guess you meant the code in include/net/inet_hashtables.h?
INET_ADDR_COOKIE() loads the two 32-bit addresses into one 64-bit
register/variable, so no 64-bit alignment is required on the packet header
itself.
Then INET_MATCH does a *(u64 *)&(inet_sk(__sk)->inet_daddr).
This happens to work because skc_daddr & skc_rcv_saddr are at the
beginning of struct sock_common, and it is 8-byte aligned on 64-bit arches.
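The trick being described can be sketched outside the kernel roughly like
this (illustrative only; the names are made up, not the real
INET_ADDR_COOKIE/INET_MATCH helpers):

#include <stdint.h>
#include <string.h>

/* Two adjacent 32-bit addresses compared as a single 64-bit value.  The
 * struct side must be 8-byte aligned; the cookie side is built from two
 * ordinary 32-bit reads, so the packet header needs no special alignment. */
struct sock_addrs {
	uint32_t daddr;
	uint32_t rcv_saddr;
} __attribute__((aligned(8)));

static inline uint64_t make_cookie(uint32_t daddr, uint32_t rcv_saddr)
{
	uint32_t pair[2] = { daddr, rcv_saddr }; /* same order as the struct */
	uint64_t cookie;

	memcpy(&cookie, pair, sizeof(cookie));	 /* keeps the sketch aliasing-safe */
	return cookie;
}

static inline int addrs_match(const struct sock_addrs *sk, uint64_t cookie)
{
	uint64_t have;

	memcpy(&have, sk, sizeof(have));
	return have == cookie;	/* one 64-bit compare instead of two 32-bit ones */
}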
* Re: [PATCH net-next] ipv6: optimize ipv6 addresses compares
2012-07-11 4:44 ` Eric Dumazet
2012-07-11 4:53 ` Eric Dumazet
@ 2012-07-11 5:44 ` David Miller
1 sibling, 0 replies; 11+ messages in thread
From: David Miller @ 2012-07-11 5:44 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev, joe
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 11 Jul 2012 06:44:21 +0200
> I don't see where in the demux code we have a 64-bit access?
Oh I see, we only do it in the socket, sigh.
#define INET_MATCH(__sk, __net, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
(((__sk)->sk_hash == (__hash)) && net_eq(sock_net(__sk), (__net)) && \
((*((__addrpair *)&(inet_sk(__sk)->inet_daddr))) == (__cookie)) && \
((*((__portpair *)&(inet_sk(__sk)->inet_dport))) == (__ports)) && \
...
* Re: [PATCH net-next] ipv6: optimize ipv6 addresses compares
2012-07-11 3:49 [PATCH net-next] ipv6: optimize ipv6 addresses compares Eric Dumazet
2012-07-11 3:59 ` Joe Perches
2012-07-11 4:02 ` David Miller
@ 2012-07-11 4:14 ` Joe Perches
2012-07-11 5:05 ` Eric Dumazet
2 siblings, 1 reply; 11+ messages in thread
From: Joe Perches @ 2012-07-11 4:14 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
On Wed, 2012-07-11 at 05:49 +0200, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> On 64-bit arches with efficient unaligned accesses (e.g. x86_64) we can
> use long words to reduce the number of instructions for free.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Joe Perches <joe@perches.com>
> ---
> include/net/ipv6.h | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/include/net/ipv6.h b/include/net/ipv6.h
> index aecf884..9ac5ded 100644
> --- a/include/net/ipv6.h
> +++ b/include/net/ipv6.h
> @@ -302,10 +302,19 @@ static inline int
> ipv6_masked_addr_cmp(const struct in6_addr *a1, const struct in6_addr *m,
> const struct in6_addr *a2)
> {
> +#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
> + const unsigned long *ul1 = (const unsigned long *)a1;
> + const unsigned long *ulm = (const unsigned long *)m;
> + const unsigned long *ul2 = (const unsigned long *)a2;
> +
> + return !!(((ul1[0] ^ ul2[0]) & ulm[0]) |
> + ((ul1[1] ^ ul2[1]) & ulm[1]));
> +#else
> return !!(((a1->s6_addr32[0] ^ a2->s6_addr32[0]) & m->s6_addr32[0]) |
> ((a1->s6_addr32[1] ^ a2->s6_addr32[1]) & m->s6_addr32[1]) |
> ((a1->s6_addr32[2] ^ a2->s6_addr32[2]) & m->s6_addr32[2]) |
> ((a1->s6_addr32[3] ^ a2->s6_addr32[3]) & m->s6_addr32[3]));
> +#endif
> }
Come to think of it, this should probably be bool to
avoid anyone possibly using this in a sorting function.
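To spell out the concern (a user-space sketch, not proposed code;
masked_cmp() below simply mirrors the 32-bit fallback above): the function
only answers 0/1, never an ordering, so handing it to qsort() would be a bug
that a bool return type makes easier to spot.

#include <stdint.h>
#include <stdlib.h>

struct addr16 { uint32_t s6_addr32[4]; };

static const struct addr16 mask = { { 0xffffffff, 0xffffffff, 0, 0 } }; /* /64 */

static int masked_cmp(const struct addr16 *a1, const struct addr16 *m,
		      const struct addr16 *a2)
{
	return !!(((a1->s6_addr32[0] ^ a2->s6_addr32[0]) & m->s6_addr32[0]) |
		  ((a1->s6_addr32[1] ^ a2->s6_addr32[1]) & m->s6_addr32[1]) |
		  ((a1->s6_addr32[2] ^ a2->s6_addr32[2]) & m->s6_addr32[2]) |
		  ((a1->s6_addr32[3] ^ a2->s6_addr32[3]) & m->s6_addr32[3]));
}

/* WRONG: the 0/1 result never says which element sorts first, so qsort()
 * output is effectively arbitrary.  Returning bool documents that this is
 * a predicate, not a memcmp()-style comparator. */
static int bad_qsort_cmp(const void *pa, const void *pb)
{
	return masked_cmp(pa, &mask, pb);
}

int main(void)
{
	struct addr16 addrs[2] = {
		{ { 3, 0, 0, 0 } },
		{ { 1, 0, 0, 0 } },
	};

	/* Compiles either way; with bool the misuse stands out on review. */
	qsort(addrs, 2, sizeof(addrs[0]), bad_qsort_cmp);
	return 0;
}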
* Re: [PATCH net-next] ipv6: optimize ipv6 addresses compares
2012-07-11 4:14 ` Joe Perches
@ 2012-07-11 5:05 ` Eric Dumazet
2012-07-11 6:13 ` David Miller
0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2012-07-11 5:05 UTC (permalink / raw)
To: Joe Perches; +Cc: David Miller, netdev
From: Eric Dumazet <edumazet@google.com>
On Tue, 2012-07-10 at 21:14 -0700, Joe Perches wrote:
> Come to think of it, this should probably be bool to
> avoid anyone possibly using this in a sorting function.
Yes, this sounds reasonable, thanks.
[PATCH net-next v2] ipv6: optimize ipv6 addresses compares
On 64-bit arches with efficient unaligned accesses (e.g. x86_64) we can
use long words to reduce the number of instructions for free.
Joe Perches suggested changing ipv6_masked_addr_cmp() to return a bool
instead of an int, to make sure ipv6_masked_addr_cmp() cannot be used
in a sorting function.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Joe Perches <joe@perches.com>
---
include/net/ipv6.h | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index aecf884..d4261d4 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -298,14 +298,23 @@ static inline int ipv6_addr_cmp(const struct in6_addr *a1, const struct in6_addr
return memcmp(a1, a2, sizeof(struct in6_addr));
}
-static inline int
+static inline bool
ipv6_masked_addr_cmp(const struct in6_addr *a1, const struct in6_addr *m,
const struct in6_addr *a2)
{
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
+ const unsigned long *ul1 = (const unsigned long *)a1;
+ const unsigned long *ulm = (const unsigned long *)m;
+ const unsigned long *ul2 = (const unsigned long *)a2;
+
+ return !!(((ul1[0] ^ ul2[0]) & ulm[0]) |
+ ((ul1[1] ^ ul2[1]) & ulm[1]));
+#else
return !!(((a1->s6_addr32[0] ^ a2->s6_addr32[0]) & m->s6_addr32[0]) |
((a1->s6_addr32[1] ^ a2->s6_addr32[1]) & m->s6_addr32[1]) |
((a1->s6_addr32[2] ^ a2->s6_addr32[2]) & m->s6_addr32[2]) |
((a1->s6_addr32[3] ^ a2->s6_addr32[3]) & m->s6_addr32[3]));
+#endif
}
static inline void ipv6_addr_prefix(struct in6_addr *pfx,
@@ -335,10 +344,17 @@ static inline void ipv6_addr_set(struct in6_addr *addr,
static inline bool ipv6_addr_equal(const struct in6_addr *a1,
const struct in6_addr *a2)
{
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
+ const unsigned long *ul1 = (const unsigned long *)a1;
+ const unsigned long *ul2 = (const unsigned long *)a2;
+
+ return ((ul1[0] ^ ul2[0]) | (ul1[1] ^ ul2[1])) == 0UL;
+#else
return ((a1->s6_addr32[0] ^ a2->s6_addr32[0]) |
(a1->s6_addr32[1] ^ a2->s6_addr32[1]) |
(a1->s6_addr32[2] ^ a2->s6_addr32[2]) |
(a1->s6_addr32[3] ^ a2->s6_addr32[3])) == 0;
+#endif
}
static inline bool __ipv6_prefix_equal(const __be32 *a1, const __be32 *a2,
@@ -391,8 +407,14 @@ bool ip6_frag_match(struct inet_frag_queue *q, void *a);
static inline bool ipv6_addr_any(const struct in6_addr *a)
{
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
+ const unsigned long *ul = (const unsigned long *)a;
+
+ return (ul[0] | ul[1]) == 0UL;
+#else
return (a->s6_addr32[0] | a->s6_addr32[1] |
a->s6_addr32[2] | a->s6_addr32[3]) == 0;
+#endif
}
static inline bool ipv6_addr_loopback(const struct in6_addr *a)
* Re: [PATCH net-next] ipv6: optimize ipv6 addresses compares
2012-07-11 5:05 ` Eric Dumazet
@ 2012-07-11 6:13 ` David Miller
0 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2012-07-11 6:13 UTC (permalink / raw)
To: eric.dumazet; +Cc: joe, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 11 Jul 2012 07:05:57 +0200
> [PATCH net-next v2] ipv6: optimize ipv6 addresses compares
>
> On 64-bit arches with efficient unaligned accesses (e.g. x86_64) we can
> use long words to reduce the number of instructions for free.
>
> Joe Perches suggested changing ipv6_masked_addr_cmp() to return a bool
> instead of an int, to make sure ipv6_masked_addr_cmp() cannot be used
> in a sorting function.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Looks good, will apply, thanks.