linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Scott Wood <scottwood@freescale.com>
To: LEROY Christophe <christophe.leroy@c-s.fr>
Cc: Paul Mackerras <paulus@samba.org>,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: powerpc32: rearrange instructions order in ip_fast_csum()
Date: Tue, 24 Mar 2015 20:22:48 -0500	[thread overview]
Message-ID: <20150325012248.GA7270@home.buserror.net> (raw)
In-Reply-To: <20150203113927.B909E1A5F15@localhost.localdomain>

On Tue, Feb 03, 2015 at 12:39:27PM +0100, LEROY Christophe wrote:
> On PPC_8xx, lwz has a 2 cycles latency, and branching also takes 2 cycles.
> As the size of the header is minimum 5 words, we can unroll the loop for the
> first words to reduce number of branching, and we can re-order the instructions
> to limit loading latency.

Please wrap commit messages at around 70 characters.

> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
>  arch/powerpc/lib/checksum_32.S | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/lib/checksum_32.S b/arch/powerpc/lib/checksum_32.S
> index 6d67e05..5500704 100644
> --- a/arch/powerpc/lib/checksum_32.S
> +++ b/arch/powerpc/lib/checksum_32.S
> @@ -26,13 +26,17 @@
>  _GLOBAL(ip_fast_csum)
>  	lwz	r0,0(r3)
>  	lwzu	r5,4(r3)
> -	addic.	r4,r4,-2
> +	addic.	r4,r4,-4
>  	addc	r0,r0,r5
>  	mtctr	r4
>  	blelr-
> -1:	lwzu	r4,4(r3)
> -	adde	r0,r0,r4
> +	lwzu	r5,4(r3)
> +	lwzu	r4,4(r3)

The blelr is pointless since len is guaranteed to be >= 5 (assuming that
comment is accurate), but now it's both pointless and in the wrong place,
since you haven't yet finished the four words that you subtracted from
r4.

How about keeping the blelr, without the -, moving it after the initial
words, and changing the number of inital words to 5?  Also maybe do all
the loads up front, since many PPC chips have a three cycle load latency
rather than two.

-Scott

  reply	other threads:[~2015-03-25  1:23 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-03 11:39 [PATCH] powerpc32: rearrange instructions order in ip_fast_csum() Christophe Leroy
2015-03-25  1:22 ` Scott Wood [this message]
2015-04-28 19:07   ` christophe leroy
  -- strict thread matches above, loose matches on Subject: below --
2014-09-19 13:57 Christophe Leroy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150325012248.GA7270@home.buserror.net \
    --to=scottwood@freescale.com \
    --cc=christophe.leroy@c-s.fr \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=paulus@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).