From: "Darrick J. Wong" <djwong@us.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	"Darrick J. Wong" <djwong@us.ibm.com>
Cc: Theodore Tso <tytso@mit.edu>,
	Joakim Tjernlund <joakim.tjernlund@transmode.se>,
	Bob Pearson <rpearson@systemfabricworks.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	linux-crypto <linux-crypto@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Mingming Cao <cmm@us.ibm.com>,
	linux-ext4@vger.kernel.org
Subject: [PATCH 09/14] Add two changes that improve the performance of x86 systems
Date: Mon, 28 Nov 2011 14:38:07 -0800
Message-ID: <20111128223807.28705.45122.stgit@elm3c44.beaverton.ibm.com>
In-Reply-To: <20111128223659.28705.56719.stgit@elm3c44.beaverton.ibm.com>

	1. Replace the main loop's count-down on len with an
	   incrementing counter. This improves the performance of
	   the selftest by about 5-6% on Nehalem CPUs. The apparent
	   reason is that the compiler can use the loop index
	   to perform an indexed memory access. The same change is
	   reported to make performance worse on PowerPC CPUs.
	2. Replace the rem_len loop with an incrementing counter.
	   This improves the performance of the selftest, which
	   hits this path more often than usual, by about 1-2% on
	   x86 CPUs. In actual workloads the length is most often
	   a multiple of 4 bytes, so this code is executed rarely,
	   if at all. Again, this change is reported to make
	   performance worse on PowerPC.

	Both changes are compiled only under CONFIG_X86; other
	architectures keep the existing count-down loops.
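
For readers comparing the two loop shapes, here is a minimal standalone
sketch. It is not the kernel's crc32_body(): it uses a simple bitwise,
byte-at-a-time CRC-32, and the names crc32_step, crc32_countdown and
crc32_countup are made up for this illustration. It also uses post-increment
instead of the patch's pre-increment so that no pointer is formed before the
start of the buffer. The only point is that counting len down and counting a
separate index up visit the same bytes and give the same CRC.

```c
#include <stddef.h>
#include <stdint.h>

/* One byte of a bitwise, reflected CRC-32 (polynomial 0xEDB88320).
 * Hypothetical helper for this sketch, much slower than table lookup. */
uint32_t crc32_step(uint32_t crc, unsigned char byte)
{
	int k;

	crc ^= byte;
	for (k = 0; k < 8; k++)
		crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	return crc;
}

/* Existing shape: len itself is the loop counter and counts down. */
uint32_t crc32_countdown(uint32_t crc, const unsigned char *buf, size_t len)
{
	const unsigned char *p = buf;

	for (; len; --len)
		crc = crc32_step(crc, *p++);
	return crc;
}

/* Shape used under CONFIG_X86 in the patch: a separate index counts up
 * and len is left unchanged. A compiler is free to turn the pointer
 * walk into an indexed access based on i. */
uint32_t crc32_countup(uint32_t crc, const unsigned char *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i;

	for (i = 0; i < len; i++)
		crc = crc32_step(crc, *p++);
	return crc;
}
```

Both functions take the running CRC as input and apply no final inversion, so
a caller wanting the conventional CRC-32 would pass 0xFFFFFFFF as the seed and
XOR the result with 0xFFFFFFFF. Whether one form is faster depends on the
compiler and CPU, which is what the measurements above are about; this sketch
makes no performance claim.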

Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>
---
 lib/crc32.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)


diff --git a/lib/crc32.c b/lib/crc32.c
index 6311712..2c8e8c0 100644
--- a/lib/crc32.c
+++ b/lib/crc32.c
@@ -66,6 +66,9 @@ crc32_body(u32 crc, unsigned char const *buf, size_t len, const u32 (*tab)[256])
 # endif
 	const u32 *b;
 	size_t rem_len;
+# ifdef CONFIG_X86
+	size_t i;
+# endif
 	const u32 *t0 = tab[0], *t1 = tab[1], *t2 = tab[2], *t3 = tab[3];
 	const u32 *t4 = tab[4], *t5 = tab[5], *t6 = tab[6], *t7 = tab[7];
 	u32 q;
@@ -86,7 +89,12 @@ crc32_body(u32 crc, unsigned char const *buf, size_t len, const u32 (*tab)[256])
 # endif
 
 	b = (const u32 *)buf;
+# ifdef CONFIG_X86
+	--b;
+	for (i = 0; i < len; i++) {
+# else
 	for (--b; len; --len) {
+# endif
 		q = crc ^ *++b; /* use pre increment for speed */
 # if CRC_LE_BITS == 32
 		crc = DO_CRC4;
@@ -100,9 +108,14 @@ crc32_body(u32 crc, unsigned char const *buf, size_t len, const u32 (*tab)[256])
 	/* And the last few bytes */
 	if (len) {
 		u8 *p = (u8 *)(b + 1) - 1;
+# ifdef CONFIG_X86
+		for (i = 0; i < len; i++)
+			DO_CRC(*++p); /* use pre increment for speed */
+# else
 		do {
 			DO_CRC(*++p); /* use pre increment for speed */
 		} while (--len);
+# endif
 	}
 	return crc;
 #undef DO_CRC

Thread overview: 17+ messages
2011-11-28 22:36 [PATCH v5.1 00/14] crc32c: Add faster algorithm and self-test code Darrick J. Wong
2011-11-28 22:37 ` [PATCH 01/14] removed two instances of trailing whitespaces Darrick J. Wong
2011-11-28 22:37 ` [PATCH 02/14] Moved a long comment from lib/crc32.c to Documentation/crc32.txt Darrick J. Wong
2011-11-28 22:37 ` [PATCH 03/14] Replaced the unit test provided in crc32.c, which doesn't have a Darrick J. Wong
2011-11-28 22:37 ` [PATCH 04/14] Replace 2D array references by pointer references in loops Darrick J. Wong
2011-11-28 22:37 ` [PATCH 05/14] Misc cleanup of lib/crc32.c and related files Darrick J. Wong
2011-11-28 22:37 ` [PATCH 06/14] crc32.c in its original version freely mixed u32, __le32 and __be32 types Darrick J. Wong
2011-11-28 22:37 ` [PATCH 07/14] crc32.c provides a choice of one of several algorithms for Darrick J. Wong
2011-11-28 22:37 ` [PATCH 08/14] add slicing-by-8 algorithm to the existing Darrick J. Wong
2011-11-28 22:38 ` [PATCH 09/14] Add two changes that improve the performance of x86 systems Darrick J. Wong [this message]
2011-11-28 22:38 ` [PATCH 10/14] Some final changes Darrick J. Wong
2011-11-28 22:38 ` [PATCH 11/14] crc32: Bolt on crc32c Darrick J. Wong
2011-11-28 22:38 ` [PATCH 12/14] crypto: crc32c should use library implementation Darrick J. Wong
2011-11-28 22:38 ` [PATCH 13/14] crc32: Add self-test code for crc32c Darrick J. Wong
2011-11-28 22:38 ` [PATCH 14/14] crc32: Select an algorithm via kconfig Darrick J. Wong
2011-11-30 22:29 ` [PATCH v5.1 00/14] crc32c: Add faster algorithm and self-test code Andrew Morton
2011-12-01 20:12   ` Darrick J. Wong
