From: James Greenhalgh <james.greenhalgh@arm.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>,
	<netdev@vger.kernel.org>, <davem@davemloft.net>,
	<luke.starrett@broadcom.com>, <catalin.marinas@arm.com>,
	<nd@arm.com>
Subject: Re: [PATCH net-next] ipv6: Implement optimized IPv6 masked address comparison for ARM64
Date: Fri, 17 Mar 2017 12:22:52 +0000
Message-ID: <20170317122252.GA32449@arm.com>
In-Reply-To: <bfa8020c-bd12-8de1-5be5-a3408c293600@arm.com>

On Fri, Mar 17, 2017 at 12:00:42PM +0000, Robin Murphy wrote:
> On 17/03/17 04:42, Subash Abhinov Kasiviswanathan wrote:
> > Android devices use multiple ip[6]tables for statistics, UID matching
> > and other functionality. Perf output indicated that ip6_do_table
> > was taking a considerable amount of CPU and more than ip_do_table
> > for an equivalent rate. ipv6_masked_addr_cmp was chosen for
> > optimization as there are more instructions required than the
> > equivalent operation in ip_packet_match.
> > 
> > Using 128 bit operations helps to reduce the number of instructions
> > for the match on an ARM64 system. This helps to improve UDPv6 DL
> > performance by 40Mbps (860Mbps -> 900Mbps) on a CPU limited system.
> 
> After trying to have a look at the codegen difference it makes, I think
> I may have found why it's faster ;)
> 
> ----------
> [root@space-channel-5 ~]# cat > ip.c
> #include <stdbool.h>
> #include <netinet/in.h>
> 	
> bool
> ipv6_masked_addr_cmp(const struct in6_addr *a1, const struct in6_addr *m,
> 		     const struct in6_addr *a2)
> {
> 	const unsigned long *ul1 = (const unsigned long *)a1;
> 	const unsigned long *ulm = (const unsigned long *)m;
> 	const unsigned long *ul2 = (const unsigned long *)a2;
> 
> 	return !!(((ul1[0] ^ ul2[0]) & ulm[0]) |
> 		  ((ul1[1] ^ ul2[1]) & ulm[1]));
> }
> 
> bool
> ipv6_masked_addr_cmp_new(const struct in6_addr *a1, const struct in6_addr *m,
> 			 const struct in6_addr *a2)
> {
> 	const __uint128_t *ul1 = (const __uint128_t *)a1;
> 	const __uint128_t *ulm = (const __uint128_t *)m;
> 	const __uint128_t *ul2 = (const __uint128_t *)a1;
> 
> 	return !!((*ul1 ^ *ul2) & *ulm);
> }

<snip>

> That's clearly not right - I'm not sure quite what undefined behaviour
> assumption convinces GCC to optimise the whole thing away.

While the pointer casting is a bit ghastly, I don't actually think that
GCC is taking advantage of undefined behaviour here, rather it looks like
you have a simple typo on line 3:

> 	const __uint128_t *ul1 = (const __uint128_t *)a1;
> 	const __uint128_t *ulm = (const __uint128_t *)m;
> 	const __uint128_t *ul2 = (const __uint128_t *)a1;

ul2 = a2, surely?

As it is (stripping casts) you have a1 ^ a1, which will get you to 0
pretty quickly. Fixing that up for you:

  bool
  ipv6_masked_addr_cmp_new(const struct in6_addr *a1, const struct in6_addr *m,
  			 const struct in6_addr *a2)
  {
  	const __uint128_t *ul1 = (const __uint128_t *)a1;
  	const __uint128_t *ulm = (const __uint128_t *)m;
  	const __uint128_t *ul2 = (const __uint128_t *)a2;

  	return !!((*ul1 ^ *ul2) & *ulm);
  }

$ gcc -O2

  ipv6_masked_addr_cmp_new:
	ldp	x4, x3, [x0]
	ldp	x5, x2, [x2]
	ldp	x0, x1, [x1]
	eor	x4, x4, x5
	eor	x2, x3, x2
	and	x0, x0, x4
	and	x1, x1, x2
	orr	x0, x0, x1
	cmp	x0, 0
	cset	w0, ne
	ret

Which at least looks like it might calculate something useful :-)
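
For anyone who would rather check the effect of the typo by running
something than by reading assembly, here is a minimal userspace sketch.
It is not code from the patch: the names load128, masked_cmp_ref,
masked_cmp_typo, masked_cmp_fixed and self_check are hypothetical, the
loads go through memcpy rather than pointer casts to sidestep alignment
and aliasing questions, and it assumes a compiler that provides
__uint128_t (GCC or Clang on a 64-bit target).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical helper: copy the 16 address bytes into a 128-bit integer.
 * memcpy is used instead of the pointer casts from the thread. */
static __uint128_t load128(const struct in6_addr *a)
{
	__uint128_t v;

	memcpy(&v, a, sizeof(v));
	return v;
}

/* Byte-wise reference: true if a1 and a2 differ under mask m. */
static bool masked_cmp_ref(const struct in6_addr *a1,
			   const struct in6_addr *m,
			   const struct in6_addr *a2)
{
	unsigned int diff = 0;
	size_t i;

	for (i = 0; i < sizeof(a1->s6_addr); i++)
		diff |= (a1->s6_addr[i] ^ a2->s6_addr[i]) & m->s6_addr[i];
	return diff != 0;
}

/* Same shape as the typo: a1 is loaded twice, so the XOR is always 0. */
static bool masked_cmp_typo(const struct in6_addr *a1,
			    const struct in6_addr *m,
			    const struct in6_addr *a2)
{
	(void)a2;	/* unused, which is the bug */
	return !!((load128(a1) ^ load128(a1)) & load128(m));
}

/* Corrected 128-bit version: XOR a1 against a2, then apply the mask. */
static bool masked_cmp_fixed(const struct in6_addr *a1,
			     const struct in6_addr *m,
			     const struct in6_addr *a2)
{
	return !!((load128(a1) ^ load128(a2)) & load128(m));
}

/* Two cases: addresses differing outside the mask, then inside it. */
static void self_check(void)
{
	struct in6_addr a1, a2, m;

	memset(&a1, 0, sizeof(a1));
	memset(&a2, 0, sizeof(a2));
	memset(&m, 0, sizeof(m));
	a1.s6_addr[15] = 1;
	a2.s6_addr[15] = 2;
	memset(m.s6_addr, 0xff, 8);	/* mask covers the first 64 bits only */

	/* The differing byte is outside the mask: no mismatch reported. */
	assert(!masked_cmp_ref(&a1, &m, &a2));
	assert(!masked_cmp_fixed(&a1, &m, &a2));

	m.s6_addr[15] = 0xff;		/* mask now covers the differing byte */
	assert(masked_cmp_ref(&a1, &m, &a2));
	assert(masked_cmp_fixed(&a1, &m, &a2));
	assert(!masked_cmp_typo(&a1, &m, &a2));	/* typo never sees a2 */
}
```

Calling self_check() from a main() is enough to exercise it. The typo'd
variant returns false whatever a2 contains, which is consistent with the
compiler being able to fold that function down to a constant.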

Cheers,
James
