From: James Greenhalgh <james.greenhalgh@arm.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>,
<netdev@vger.kernel.org>, <davem@davemloft.net>,
<luke.starrett@broadcom.com>, <catalin.marinas@arm.com>,
<nd@arm.com>
Subject: Re: [PATCH net-next] ipv6: Implement optimized IPv6 masked address comparison for ARM64
Date: Fri, 17 Mar 2017 12:22:52 +0000
Message-ID: <20170317122252.GA32449@arm.com>
In-Reply-To: <bfa8020c-bd12-8de1-5be5-a3408c293600@arm.com>
On Fri, Mar 17, 2017 at 12:00:42PM +0000, Robin Murphy wrote:
> On 17/03/17 04:42, Subash Abhinov Kasiviswanathan wrote:
> > Android devices use multiple ip[6]tables for statistics, UID matching
> > and other functionality. Perf output indicated that ip6_do_table
> > was taking a considerable amount of CPU and more than ip_do_table
> > for an equivalent rate. ipv6_masked_addr_cmp was chosen for
> > optimization as there are more instructions required than the
> > equivalent operation in ip_packet_match.
> >
> > Using 128 bit operations helps to reduce the number of instructions
> > for the match on an ARM64 system. This helps to improve UDPv6 DL
> > performance by 40Mbps (860Mbps -> 900Mbps) on a CPU limited system.
>
> After trying to have a look at the codegen difference it makes, I think
> I may have found why it's faster ;)
>
> ----------
> [root@space-channel-5 ~]# cat > ip.c
> #include <stdbool.h>
> #include <netinet/in.h>
>
> bool
> ipv6_masked_addr_cmp(const struct in6_addr *a1, const struct in6_addr *m,
> const struct in6_addr *a2)
> {
> const unsigned long *ul1 = (const unsigned long *)a1;
> const unsigned long *ulm = (const unsigned long *)m;
> const unsigned long *ul2 = (const unsigned long *)a2;
>
> return !!(((ul1[0] ^ ul2[0]) & ulm[0]) |
> ((ul1[1] ^ ul2[1]) & ulm[1]));
> }
>
> bool
> ipv6_masked_addr_cmp_new(const struct in6_addr *a1, const struct
> in6_addr *m,
> const struct in6_addr *a2)
> {
> const __uint128_t *ul1 = (const __uint128_t *)a1;
> const __uint128_t *ulm = (const __uint128_t *)m;
> const __uint128_t *ul2 = (const __uint128_t *)a1;
>
> return !!((*ul1 ^ *ul2) & *ulm);
> }
<snip>
> That's clearly not right - I'm not sure quite what undefined behaviour
> assumption convinces GCC to optimise the whole thing away
While the pointer casting is a bit ghastly, I don't actually think that
GCC is taking advantage of undefined behaviour here, rather it looks like
you have a simple typo on line 3:
> const __uint128_t *ul1 = (const __uint128_t *)a1;
> const __uint128_t *ulm = (const __uint128_t *)m;
> const __uint128_t *ul2 = (const __uint128_t *)a1;
ul2 = a2, surely?
As it is (stripping casts) you have a1 ^ a1, which will get you to 0
pretty quickly. Fixing that up for you:
  bool
  ipv6_masked_addr_cmp_new(const struct in6_addr *a1, const struct in6_addr *m,
                           const struct in6_addr *a2)
  {
    const __uint128_t *ul1 = (const __uint128_t *)a1;
    const __uint128_t *ulm = (const __uint128_t *)m;
    const __uint128_t *ul2 = (const __uint128_t *)a2;

    return !!((*ul1 ^ *ul2) & *ulm);
  }
$ gcc -O2
  ipv6_masked_addr_cmp_new:
        ldp     x4, x3, [x0]
        ldp     x5, x2, [x2]
        ldp     x0, x1, [x1]
        eor     x4, x4, x5
        eor     x2, x3, x2
        and     x0, x0, x4
        and     x1, x1, x2
        orr     x0, x0, x1
        cmp     x0, 0
        cset    w0, ne
        ret
Which at least looks like it might calculate something useful :-)
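For anyone who wants to check the point outside the kernel, here is a
minimal userspace sketch. It is not from the patch: struct addr128 and
the masked_cmp_* names are hypothetical stand-ins, memcpy is used in
place of the pointer casts, and unsigned __int128 is assumed to be
available (a GCC/Clang extension on 64-bit targets). It puts the
two-word version, the fixed 128-bit version and the a1-for-a2 typo
version side by side.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct in6_addr: 16 raw address bytes. */
struct addr128 {
	uint8_t b[16];
};

/* Two 64-bit words, mirroring the generic version quoted above. */
bool masked_cmp_64(const struct addr128 *a1, const struct addr128 *m,
		   const struct addr128 *a2)
{
	uint64_t x[2], k[2], y[2];

	memcpy(x, a1, sizeof(x));
	memcpy(k, m, sizeof(k));
	memcpy(y, a2, sizeof(y));
	return !!(((x[0] ^ y[0]) & k[0]) | ((x[1] ^ y[1]) & k[1]));
}

/* One 128-bit operation, with the second operand loaded from a2. */
bool masked_cmp_128(const struct addr128 *a1, const struct addr128 *m,
		    const struct addr128 *a2)
{
	unsigned __int128 x, k, y;

	memcpy(&x, a1, sizeof(x));
	memcpy(&k, m, sizeof(k));
	memcpy(&y, a2, sizeof(y));
	return !!((x ^ y) & k);
}

/* The typo: the second operand is loaded from a1, so x ^ y is always 0. */
bool masked_cmp_128_typo(const struct addr128 *a1, const struct addr128 *m,
			 const struct addr128 *a2)
{
	unsigned __int128 x, k, y;

	(void)a2;			/* never read, which is the bug */
	memcpy(&x, a1, sizeof(x));
	memcpy(&k, m, sizeof(k));
	memcpy(&y, a1, sizeof(y));
	return !!((x ^ y) & k);
}
```

With two addresses that differ only in the last byte, the first two
functions should agree for any mask (true under an all-ones mask, false
under a mask covering only the first four bytes), while the typo version
returns false whatever it is given.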
Cheers,
James