From: "Darrick J. Wong" <djwong@us.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Theodore Tso <tytso@mit.edu>,
Joakim Tjernlund <joakim.tjernlund@transmode.se>,
Bob Pearson <rpearson@systemfabricworks.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
Andreas Dilger <adilger.kernel@dilger.ca>,
linux-crypto <linux-crypto@vger.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Mingming Cao <cmm@us.ibm.com>,
linux-ext4@vger.kernel.org,
Herbert Xu <herbert@gondor.apana.org.au>
Subject: [PATCH 09/14] crc32: Optimize loop counter for x86
Date: Thu, 01 Dec 2011 12:14:44 -0800
Message-ID: <20111201201444.5876.78886.stgit@elm3c44.beaverton.ibm.com>
In-Reply-To: <20111201201341.5876.83743.stgit@elm3c44.beaverton.ibm.com>
Add two changes that improve performance on x86 systems. Both are
guarded by CONFIG_X86, so other architectures keep the existing loops.

1. Replace the main loop's countdown with an incrementing counter.
   This improves the performance of the selftest by about 5-6% on
   Nehalem CPUs. The apparent reason is that the compiler can use
   the loop index to perform an indexed memory access. The same
   change is reported to make performance worse on PowerPC CPUs.

2. Replace the rem_len loop with an incrementing counter.
   This improves the performance of the selftest, which hits this
   loop more often than usual, by about 1-2% on x86 CPUs. In actual
   workloads the length is most often a multiple of 4 bytes, so this
   code is executed rarely, if at all. Again, this change is reported
   to make performance worse on PowerPC.

From: Bob Pearson <rpearson@systemfabricworks.com>
Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>
[djwong@us.ibm.com: Minor changelog tweaks]
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
---
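Note for readers: the two loop shapes in the changelog can be compared
outside the kernel with a small user-space sketch like the one below. It is
only an illustration and not part of the patch. mix_word(), mix_byte(),
load_word(), sum_countdown() and sum_countup() are hypothetical names, the
mixing step is a stand-in for the DO_CRC* table lookups rather than a real
CRC, and the count-up variant spells out the indexed access that the patch
itself leaves to the compiler (the patch keeps *++b and *++p).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the table lookups behind DO_CRC4/DO_CRC8. */
static uint32_t mix_word(uint32_t crc, uint32_t w)
{
	crc ^= w;
	crc = (crc << 5) | (crc >> 27);	/* rotate left by 5 */
	return crc * 0x9E3779B1u;
}

/* Hypothetical stand-in for the per-byte DO_CRC() step. */
static uint32_t mix_byte(uint32_t crc, uint8_t x)
{
	return mix_word(crc, x);
}

/* Load one 32-bit word without alignment or aliasing assumptions. */
static uint32_t load_word(const unsigned char *p)
{
	uint32_t w;

	memcpy(&w, p, sizeof(w));
	return w;
}

/* Generic form: len and rem_len count down to zero as the pointer advances. */
uint32_t sum_countdown(uint32_t crc, const unsigned char *buf, size_t len)
{
	size_t rem_len = len & 3;

	for (len >>= 2; len; --len, buf += 4)
		crc = mix_word(crc, load_word(buf));
	for (; rem_len; --rem_len)
		crc = mix_byte(crc, *buf++);
	return crc;
}

/* CONFIG_X86-style form: i counts up, so each access is base + index. */
uint32_t sum_countup(uint32_t crc, const unsigned char *buf, size_t len)
{
	size_t rem_len = len & 3;
	size_t words = len >> 2;
	const unsigned char *tail = buf + 4 * words;
	size_t i;

	for (i = 0; i < words; i++)
		crc = mix_word(crc, load_word(buf + 4 * i));
	for (i = 0; i < rem_len; i++)
		crc = mix_byte(crc, tail[i]);
	return crc;
}
```

Both functions should return the same value for any buffer and length, so
timing them over the same input is one way to check whether a given compiler
and CPU prefer the count-up form. The 5-6% and 1-2% figures above come from
the kernel selftest, not from this sketch.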
lib/crc32.c | 13 +++++++++++++
1 files changed, 13 insertions(+), 0 deletions(-)
diff --git a/lib/crc32.c b/lib/crc32.c
index 6311712..2c8e8c0 100644
--- a/lib/crc32.c
+++ b/lib/crc32.c
@@ -66,6 +66,9 @@ crc32_body(u32 crc, unsigned char const *buf, size_t len, const u32 (*tab)[256])
# endif
const u32 *b;
size_t rem_len;
+# ifdef CONFIG_X86
+ size_t i;
+# endif
const u32 *t0 = tab[0], *t1 = tab[1], *t2 = tab[2], *t3 = tab[3];
const u32 *t4 = tab[4], *t5 = tab[5], *t6 = tab[6], *t7 = tab[7];
u32 q;
@@ -86,7 +89,12 @@ crc32_body(u32 crc, unsigned char const *buf, size_t len, const u32 (*tab)[256])
# endif
b = (const u32 *)buf;
+# ifdef CONFIG_X86
+ --b;
+ for (i = 0; i < len; i++) {
+# else
for (--b; len; --len) {
+# endif
q = crc ^ *++b; /* use pre increment for speed */
# if CRC_LE_BITS == 32
crc = DO_CRC4;
@@ -100,9 +108,14 @@ crc32_body(u32 crc, unsigned char const *buf, size_t len, const u32 (*tab)[256])
/* And the last few bytes */
if (len) {
u8 *p = (u8 *)(b + 1) - 1;
+# ifdef CONFIG_X86
+ for (i = 0; i < len; i++)
+ DO_CRC(*++p); /* use pre increment for speed */
+# else
do {
DO_CRC(*++p); /* use pre increment for speed */
} while (--len);
+# endif
}
return crc;
#undef DO_CRC