From: Segher Boessenkool <segher@kernel.crashing.org>
To: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 5/5] powerpc/lib: inline memcmp() for small constant sizes
Date: Thu, 17 May 2018 08:55:51 -0500
Message-ID: <20180517135551.GT17342@gate.crashing.org>
In-Reply-To: <8a6f90d882c8b60e5fa0826cd23dd70a92075659.1526553552.git.christophe.leroy@c-s.fr>

On Thu, May 17, 2018 at 12:49:58PM +0200, Christophe Leroy wrote:
> In my 8xx configuration, I get 208 calls to memcmp()
> Within those 208 calls, about half of them have constant sizes,
> 46 have a size of 8, 17 have a size of 16, only a few have a
> size over 16. Other fixed sizes are mostly 4, 6 and 10.
> 
> This patch inlines calls to memcmp() when size
> is constant and lower than or equal to 16
> 
> In my 8xx configuration, this reduces the number of calls
> to memcmp() from 208 to 123
> 
> The following table shows the number of TB timeticks to perform
> a constant size memcmp() before and after the patch depending on
> the size
> 
> 	Before	After	Improvement
> 01:	 7577	 5682	25%
> 02:	41668	 5682	86%
> 03:	51137	13258	74%
> 04:	45455	 5682	87%
> 05:	58713	13258	77%
> 06:	58712	13258	77%
> 07:	68183	20834	70%
> 08:	56819	15153	73%
> 09:	70077	28411	60%
> 10:	70077	28411	60%
> 11:	79546	35986	55%
> 12:	68182	28411	58%
> 13:	81440	35986	55%
> 14:	81440	39774	51%
> 15:	94697	43562	54%
> 16:	79546	37881	52%

Could you show results with a more recent GCC?  What version was this?

What is this really measuring?  I doubt it takes 7577 (or 5682) timebase
ticks to do a 1-byte memcmp, which is just 3 instructions after all.
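
To make the instruction count concrete, here is a minimal sketch in C of what inlining a small constant-size memcmp() could look like. It is not the code from the patch: the names cmp1(), cmp4(), load_be32() and small_memcmp() are hypothetical, and only sizes 1 and 4 are shown. The 1-byte case is two byte loads and a subtract, which on PowerPC would presumably be the three instructions referred to above. It assumes GCC's __builtin_constant_p(), which only folds to true when the call is inlined with optimisation enabled; otherwise the sketch falls back to the out-of-line memcmp().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration; not taken from the patch. */

/* 1-byte compare: load, load, subtract. */
static inline int cmp1(const void *p, const void *q)
{
	return *(const unsigned char *)p - *(const unsigned char *)q;
}

/*
 * Assemble 4 bytes as a big-endian word so that unsigned integer
 * order matches memcmp()'s byte-by-byte order on any host.
 */
static inline uint32_t load_be32(const void *p)
{
	const unsigned char *b = p;

	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* 4-byte compare: returns -1, 0 or 1. */
static inline int cmp4(const void *p, const void *q)
{
	uint32_t a = load_be32(p);
	uint32_t b = load_be32(q);

	return (a > b) - (a < b);
}

/*
 * Dispatch on a compile-time constant size; anything else goes to
 * the regular out-of-line memcmp().
 */
static inline int small_memcmp(const void *p, const void *q, size_t n)
{
	if (__builtin_constant_p(n)) {
		if (n == 1)
			return cmp1(p, q);
		if (n == 4)
			return cmp4(p, q);
	}
	return memcmp(p, q, n);
}
```

Like memcmp() itself, only the sign of the result is meaningful. A call such as small_memcmp(a, b, 4) gives the same sign whether or not the compiler takes the inlined path, so the fallback is safe when the size is not known at compile time.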


Segher
