public inbox for linux-kernel@vger.kernel.org
From: "Bob Pearson" <rpearson@systemfabricworks.com>
To: "'George Spelvin'" <linux@horizon.com>,
	<akpm@linux-foundation.org>, <fzago@systemfabricworks.com>,
	<joakim.tjernlund@transmode.se>, <linux-kernel@vger.kernel.org>
Subject: RE: [patch v3 6/7] crc32: add-slicing-by-8.diff
Date: Tue, 9 Aug 2011 10:28:50 -0500
Message-ID: <011501cc56a9$0b1ca470$2155ed50$@systemfabricworks.com>
In-Reply-To: <20110809112114.3943.qmail@science.horizon.com>



> -----Original Message-----
> From: George Spelvin [mailto:linux@horizon.com]
> Sent: Tuesday, August 09, 2011 6:21 AM
> To: akpm@linux-foundation.org; fzago@systemfabricworks.com;
> joakim.tjernlund@transmode.se; linux-kernel@vger.kernel.org;
> linux@horizon.com; rpearson@systemfabricworks.com
> Subject: Re: [patch v3 6/7] crc32: add-slicing-by-8.diff
> 
> While writing up some documentation of this algorithm, I came up with
> a potential speedup.  Or, at least, realized why slicing by more than
> 4 is so much faster than slicing by 4 or less.
> 
> Note that the inner loop of the algorithm is as follows:
> 
> +#  define DO_CRC8a (tab[7][(q) & 255] ^ \
> +		tab[6][(q >> 8) & 255] ^ \
> +		tab[5][(q >> 16) & 255] ^ \
> +		tab[4][(q >> 24) & 255])
> +#  define DO_CRC8b (tab[3][(q) & 255] ^ \
> +		tab[2][(q >> 8) & 255] ^ \
> +		tab[1][(q >> 16) & 255] ^ \
> +		tab[0][(q >> 24) & 255])
> 
> +	for (--b; middle_len; --middle_len) {
> +		u32 q;
> +		q = crc ^ *++b;
> +		crc = DO_CRC8a;
> +		q = *++b;
> +		crc ^= DO_CRC8b;
>  	}
> 
> Note the data dependencies: DO_CRC8a depends on the
> previous crc, which depends on the previous DO_CRC8b.
> But DO_CRC8b does not depend on anything except the
> input data at *++b.

I think you've got it. That is exactly why it works. The hardware is pretty
good at finding the parallelism since this code runs at almost 2X the speed
of the CRC4 loop.

> 
> It would increase parallelism to schedule DO_CRC8b before DO_CRC8a,
> to start those loads before the previous crc value is available.
> 
> Maybe the compiler and/or processor can find this parallelism already,
> but if not, it might be useful to try reordering it:
> 
> #  define DO_CRC8a(x) (tab[7][(x) & 255] ^ \
> 		tab[6][((x) >> 8) & 255] ^ \
> 		tab[5][((x) >> 16) & 255] ^ \
> 		tab[4][((x) >> 24) & 255])
> #  define DO_CRC8b(x) (tab[3][(x) & 255] ^ \
> 		tab[2][((x) >> 8) & 255] ^ \
> 		tab[1][((x) >> 16) & 255] ^ \
> 		tab[0][((x) >> 24) & 255])
> 
> 	for ( ; middle_len; --middle_len, b += 2) {
> 		u32 q = DO_CRC8b(b[1]);
> 		crc ^= b[0];
> 		crc = q ^ DO_CRC8a(crc);
> 	}
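
One way to check that the reordered loop above is equivalent to the original
is a small userspace harness that runs both orderings against a plain
byte-at-a-time CRC. This is only a sketch under stated assumptions: it uses
the reflected polynomial 0xEDB88320, loads words as little-endian from bytes,
and builds its own tables. The names init_tables, load_le32, crc_bytewise,
crc_original, crc_reordered and check_orderings are made up for illustration
and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t tab[8][256];

/* Build slice-by-8 tables (assumed polynomial: reflected 0xEDB88320). */
static void init_tables(void)
{
	for (uint32_t i = 0; i < 256; i++) {
		uint32_t c = i;
		for (int k = 0; k < 8; k++)
			c = (c >> 1) ^ ((c & 1) ? 0xEDB88320u : 0);
		tab[0][i] = c;
	}
	for (int t = 1; t < 8; t++)
		for (uint32_t i = 0; i < 256; i++)
			tab[t][i] = (tab[t - 1][i] >> 8) ^
				    tab[0][tab[t - 1][i] & 255];
}

#define DO_CRC8a(x) (tab[7][(x) & 255] ^ \
		tab[6][((x) >> 8) & 255] ^ \
		tab[5][((x) >> 16) & 255] ^ \
		tab[4][((x) >> 24) & 255])
#define DO_CRC8b(x) (tab[3][(x) & 255] ^ \
		tab[2][((x) >> 8) & 255] ^ \
		tab[1][((x) >> 16) & 255] ^ \
		tab[0][((x) >> 24) & 255])

/* Little-endian load from bytes, so the result is host-independent. */
static uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Reference: one table lookup per input byte. */
static uint32_t crc_bytewise(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--)
		crc = tab[0][(crc ^ *p++) & 255] ^ (crc >> 8);
	return crc;
}

/* Original order: the crc-dependent half is computed first. */
static uint32_t crc_original(uint32_t crc, const unsigned char *p,
			     size_t blocks)
{
	for (; blocks; --blocks, p += 8) {
		uint32_t q = crc ^ load_le32(p);
		crc = DO_CRC8a(q);
		q = load_le32(p + 4);
		crc ^= DO_CRC8b(q);
	}
	return crc;
}

/* Reordered: the half that depends only on input data is started first. */
static uint32_t crc_reordered(uint32_t crc, const unsigned char *p,
			      size_t blocks)
{
	for (; blocks; --blocks, p += 8) {
		uint32_t w = load_le32(p + 4);
		uint32_t q = DO_CRC8b(w);
		crc ^= load_le32(p);
		crc = q ^ DO_CRC8a(crc);
	}
	return crc;
}

/* Run all three over the same 64-byte buffer; aborts on any mismatch. */
static uint32_t check_orderings(void)
{
	unsigned char buf[64];

	for (size_t i = 0; i < sizeof(buf); i++)
		buf[i] = (unsigned char)(i * 31 + 7);
	init_tables();

	uint32_t ref = crc_bytewise(~0u, buf, sizeof(buf));
	assert(crc_original(~0u, buf, sizeof(buf) / 8) == ref);
	assert(crc_reordered(~0u, buf, sizeof(buf) / 8) == ref);
	return ref;
}
```

Calling check_orderings() from a main() only confirms the two loops produce
the same value; whether the reordering is actually faster would still need
to be measured on the target hardware.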



Thread overview: 6+ messages
2011-08-09  5:27 [patch v3 6/7] crc32: add-slicing-by-8.diff Bob Pearson
2011-08-09 11:21 ` George Spelvin
2011-08-09 15:28   ` Bob Pearson [this message]
2011-08-09 17:21 ` Joakim Tjernlund
2011-08-09 20:52   ` Bob Pearson
2011-08-10  9:32     ` Joakim Tjernlund
