From: Segher Boessenkool <segher@kernel.crashing.org>
To: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 3/3] powerpc/lib: optimise PPC32 memcmp
Date: Thu, 24 May 2018 12:24:58 -0500
Message-ID: <20180524172458.GA17342@gate.crashing.org>
In-Reply-To: <bb7ef10a-4777-7635-3e54-830dd3eeb8bb@c-s.fr>

On Wed, May 23, 2018 at 09:47:32AM +0200, Christophe Leroy wrote:
> Currently, memcmp() compares two chunks of memory
> byte by byte.
> 
> This patch optimises the comparison by comparing word by word.
> 
> A small benchmark on an 8xx, comparing two chunks
> of 512 bytes 100000 times, gives:
> 
> Before: 5852274 TB ticks
> After:  1488638 TB ticks

> diff --git a/arch/powerpc/lib/string_32.S b/arch/powerpc/lib/string_32.S
> index 40a576d56ac7..542e6cecbcaf 100644
> --- a/arch/powerpc/lib/string_32.S
> +++ b/arch/powerpc/lib/string_32.S
> @@ -16,17 +16,45 @@
>  	.text
> 
>  _GLOBAL(memcmp)
> -	cmpwi	cr0, r5, 0
> -	beq-	2f
> -	mtctr	r5
> -	addi	r6,r3,-1
> -	addi	r4,r4,-1
> -1:	lbzu	r3,1(r6)
> -	lbzu	r0,1(r4)
> -	subf.	r3,r0,r3
> -	bdnzt	2,1b
> +	srawi.	r7, r5, 2		/* Divide len by 4 */
> +	mr	r6, r3
> +	beq-	3f
> +	mtctr	r7
> +	li	r7, 0
> +1:
> +#ifdef __LITTLE_ENDIAN__
> +	lwbrx	r3, r6, r7
> +	lwbrx	r0, r4, r7
> +#else
> +	lwzx	r3, r6, r7
> +	lwzx	r0, r4, r7
> +#endif

You don't test whether the pointers are word-aligned.  Does this work
with unaligned pointers, for example when a load crosses a page boundary
or a segment boundary?
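
To make the concern concrete, here is a minimal C sketch of the two
conditions in question.  It is not code from the patch: the helper names
are hypothetical, and the boundary size is a parameter because the page
or segment size depends on the platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if addr is a multiple of 4, else 0. */
static int is_word_aligned(uintptr_t addr)
{
	return (addr & 3u) == 0;
}

/*
 * Hypothetical helper: returns 1 if the len bytes starting at addr
 * straddle a multiple of boundary, else 0.
 * boundary must be a nonzero power of two (e.g. an assumed 4096-byte page).
 */
static int load_crosses_boundary(uintptr_t addr, size_t len,
				 uintptr_t boundary)
{
	assert(len != 0);
	assert(boundary != 0 && (boundary & (boundary - 1)) == 0);

	/* The first and last byte lie in different blocks iff the access crosses. */
	return (addr / boundary) != ((addr + len - 1) / boundary);
}
```

A word-aligned 4-byte load can never cross a power-of-two boundary of 4
bytes or more, so only the unaligned case needs this kind of scrutiny.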


Segher
