public inbox for linux-kernel@vger.kernel.org
From: "Bob Pearson" <rpearson@systemfabricworks.com>
To: "'Joakim Tjernlund'" <joakim.tjernlund@transmode.se>
Cc: <akpm@linux-foundation.org>, <fzago@systemfabricworks.com>,
	"'George Spelvin'" <linux@horizon.com>,
	<linux-kernel@vger.kernel.org>
Subject: RE: [patch v3 7/7] crc32: final-cleanup.diff
Date: Wed, 10 Aug 2011 16:57:18 -0500
Message-ID: <008301cc57a8$79d6e330$6d84a990$@systemfabricworks.com>
In-Reply-To: <OFD6F63A43.07F043D9-ONC12578E8.00565019-C12578E8.0056DB9D@transmode.se>



> -----Original Message-----
> From: Joakim Tjernlund [mailto:joakim.tjernlund@transmode.se]
> Sent: Wednesday, August 10, 2011 10:49 AM
> To: Bob Pearson
> Cc: akpm@linux-foundation.org; fzago@systemfabricworks.com; 'George
> Spelvin'; linux-kernel@vger.kernel.org
> Subject: RE: [patch v3 7/7] crc32: final-cleanup.diff
> 
> "Bob Pearson" <rpearson@systemfabricworks.com> wrote on 2011/08/10
> 17:13:00:
> 
> > From: "Bob Pearson" <rpearson@systemfabricworks.com>
> > To: "'Joakim Tjernlund'" <joakim.tjernlund@transmode.se>
> > Cc: <akpm@linux-foundation.org>, <fzago@systemfabricworks.com>,
> "'George Spelvin'" <linux@horizon.com>, <linux-kernel@vger.kernel.org>
> > Date: 2011/08/10 17:13
> > Subject: RE: [patch v3 7/7] crc32: final-cleanup.diff
> >
> > OK. Can you post your current version of crc32.c? I'll try to merge them
> > together.
> 
> OK, here it comes again, preferably this should be the first patch
> in the series.
> 
> From f5268d74f1a81610820e92785397f1247946ce15 Mon Sep 17 00:00:00 2001
> From: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> Date: Fri, 5 Aug 2011 17:49:42 +0200
> Subject: [PATCH] crc32: Optimize inner loop.
> 
> By taking a pointer to each row of the crc table matrix,
> one can shorten the inner loop by a few insns on RISC
> archs like PowerPC.
> ---
>  lib/crc32.c |   21 +++++++++++----------
>  1 files changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/lib/crc32.c b/lib/crc32.c
> index 4855995..b06d1e7 100644
> --- a/lib/crc32.c
> +++ b/lib/crc32.c
> @@ -51,20 +51,21 @@ static inline u32
>  crc32_body(u32 crc, unsigned char const *buf, size_t len, const u32 (*tab)[256])
>  {
>  # ifdef __LITTLE_ENDIAN
> -#  define DO_CRC(x) crc = tab[0][(crc ^ (x)) & 255] ^ (crc >> 8)
> -#  define DO_CRC4 crc = tab[3][(crc) & 255] ^ \
> -		tab[2][(crc >> 8) & 255] ^ \
> -		tab[1][(crc >> 16) & 255] ^ \
> -		tab[0][(crc >> 24) & 255]
> +#  define DO_CRC(x) crc = t0[(crc ^ (x)) & 255] ^ (crc >> 8)
> +#  define DO_CRC4 crc = t3[(crc) & 255] ^ \
> +		t2[(crc >> 8) & 255] ^ \
> +		t1[(crc >> 16) & 255] ^ \
> +		t0[(crc >> 24) & 255]
>  # else
> -#  define DO_CRC(x) crc = tab[0][((crc >> 24) ^ (x)) & 255] ^ (crc << 8)
> -#  define DO_CRC4 crc = tab[0][(crc) & 255] ^ \
> -		tab[1][(crc >> 8) & 255] ^ \
> -		tab[2][(crc >> 16) & 255] ^ \
> -		tab[3][(crc >> 24) & 255]
> +#  define DO_CRC(x) crc = t0[((crc >> 24) ^ (x)) & 255] ^ (crc << 8)
> +#  define DO_CRC4 crc = t0[(crc) & 255] ^ \
> +		t1[(crc >> 8) & 255] ^  \
> +		t2[(crc >> 16) & 255] ^	\
> +		t3[(crc >> 24) & 255]
>  # endif
>  	const u32 *b;
>  	size_t    rem_len;
> +	const u32 *t0=tab[0], *t1=tab[1], *t2=tab[2], *t3=tab[3];
> 
>  	/* Align it */
>  	if (unlikely((long)buf & 3 && len)) {
> --
> 1.7.3.4
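
For readers without the kernel tree at hand, the idea in the patch can be sketched in standalone C. This is only an illustration, not the code in lib/crc32.c: the names crc32_init_le, crc32_le_2d, crc32_le_ptr and crc32_selfcheck are made up for this example, only the little-endian slice-by-4 case is shown, and the input word is assembled byte by byte so the sketch does not depend on host endianness or alignment. The two loops differ only in whether the table rows are indexed as tab[n][...] or through row pointers t0..t3 hoisted out of the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRC32_POLY_LE 0xedb88320u	/* reflected CRC-32 polynomial */

static uint32_t tab[4][256];

/* Build the four slice-by-4 tables (hypothetical helper). */
static void crc32_init_le(void)
{
	for (unsigned i = 0; i < 256; i++) {
		uint32_t crc = i;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRC32_POLY_LE : 0);
		tab[0][i] = crc;
	}
	for (unsigned i = 0; i < 256; i++) {
		uint32_t crc = tab[0][i];
		for (int j = 1; j < 4; j++) {
			crc = tab[0][crc & 255] ^ (crc >> 8);
			tab[j][i] = crc;
		}
	}
}

/* Read four bytes as a little-endian word, independent of the host. */
static uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Variant 1: index the 2D table directly in the inner loop. */
static uint32_t crc32_le_2d(uint32_t crc, const unsigned char *buf, size_t len)
{
	for (; len >= 4; buf += 4, len -= 4) {
		crc ^= load_le32(buf);
		crc = tab[3][crc & 255] ^ tab[2][(crc >> 8) & 255] ^
		      tab[1][(crc >> 16) & 255] ^ tab[0][(crc >> 24) & 255];
	}
	for (; len; buf++, len--)
		crc = tab[0][(crc ^ *buf) & 255] ^ (crc >> 8);
	return crc;
}

/* Variant 2: hoist one pointer per table row out of the loop. */
static uint32_t crc32_le_ptr(uint32_t crc, const unsigned char *buf, size_t len)
{
	const uint32_t *t0 = tab[0], *t1 = tab[1], *t2 = tab[2], *t3 = tab[3];

	for (; len >= 4; buf += 4, len -= 4) {
		crc ^= load_le32(buf);
		crc = t3[crc & 255] ^ t2[(crc >> 8) & 255] ^
		      t1[(crc >> 16) & 255] ^ t0[(crc >> 24) & 255];
	}
	for (; len; buf++, len--)
		crc = t0[(crc ^ *buf) & 255] ^ (crc >> 8);
	return crc;
}

/* Both variants must produce the standard CRC-32 check value. */
void crc32_selfcheck(void)
{
	static const unsigned char msg[] = "123456789";

	crc32_init_le();
	assert((crc32_le_2d(~0u, msg, 9) ^ ~0u) == 0xcbf43926u);
	assert((crc32_le_ptr(~0u, msg, 9) ^ ~0u) == 0xcbf43926u);
}
```

Both functions compute the same value; whether the pointer form is actually faster depends on the compiler and the architecture, which is what the measurements below are about.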

I tried this on X86_64 and Sparc 64. There is a very small improvement for
Intel and a significant improvement for Sparc. Here are the results based on
the current self test, which is a mix of crc32_le and crc32_be with random
offsets and lengths.
Results are 'best case', i.e. I picked the shortest time from a handful of
runs.

Arch    CPU          Freq    BITS  bytes   nsec    cycles/byte
______________________________________________________________
Current proposed patch
X86_64  Intel E5520  2.268G  64    225944  161294  1.619
X86_64  Intel E5520  2.268G  32    225944  267795  2.688
Sun     Sparc III+   900M    64    225944  757235  3.028
Sun     Sparc III+   900M    32    225944  935558  3.727
With pointers instead of 2D array references
X86_64  Intel E5520  2.268G  64    225944  157975  1.584
X86_64  Intel E5520  2.268G  32    225944  273366  2.744
Sun     Sparc III+   900M    64    225944  570724  2.273
Sun     Sparc III+   900M    32    225944  848897  3.381
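
For anyone who wants to compare their own runs, the cycles/byte column appears to be the elapsed time in nanoseconds multiplied by the clock frequency in GHz and divided by the number of bytes. That formula is inferred from the numbers in the table, not stated anywhere, and cycles_per_byte is just an illustrative helper name:

```c
/*
 * cycles/byte = nsec * freq_GHz / bytes
 * (nanoseconds times cycles-per-nanosecond gives cycles).
 * Inferred from the table above; not part of the crc32 self test.
 */
static double cycles_per_byte(double nsec, double freq_ghz, double bytes)
{
	return nsec * freq_ghz / bytes;
}
```

For example, cycles_per_byte(161294, 2.268, 225944) corresponds to the first X86_64 row, and cycles_per_byte(570724, 0.9, 225944) to the 64-bit Sparc row with row pointers.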

The change doesn't really help or hurt X86_64, but it significantly helps
Sparc, and you report gains for PPC, so it looks good.


