[PATCH v2] libxfs: use crc32c slice-by-8 variant by default

From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Eric Sandeen <sandeen@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>, xfs <linux-xfs@vger.kernel.org>
Subject: [PATCH v2] libxfs: use crc32c slice-by-8 variant by default
Date: Mon, 15 May 2017 09:23:08 -0700
Message-ID: <20170515162308.GN4519@birch.djwong.org>

The crc32c code used in xfsprogs was copied directly from the Linux
kernel.  However, that code selects slice-by-4 by default, which isn't
the fastest variant; slice-by-8 is, at the cost of a larger lookup
table.  Explicitly select slice-by-8 in crc32defs.h, and make
crc32table.h and crc32selftest depend on crc32defs.h so that both are
rebuilt when the selection changes.  With this patch applied, I see
about a 10% drop in CPU time running xfs_repair.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
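For readers unfamiliar with the technique, here is a minimal standalone
sketch of slice-by-8 for CRC-32C.  It is not the code in libxfs/crc32.c;
the names sb8_table, sb8_init, crc32c_sb8 and crc32c_bitwise are
hypothetical, and the sketch assumes the common CRC-32C convention
(reflected polynomial 0x82F63B78, initial value and final XOR of all
ones), which differs from how the kernel helpers handle the seed.  It
shows where the 8KiB table comes from (8 slices of 256 32-bit entries)
and how eight input bytes are folded in per loop iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-reversed Castagnoli polynomial used by CRC-32C. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/* 8 slices x 256 entries x 4 bytes = 8 KiB of lookup table. */
static uint32_t sb8_table[8][256];
static int sb8_ready;

/* Bit-at-a-time reference, handy for cross-checking the fast path. */
static uint32_t crc32c_bitwise(uint32_t crc, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	crc = ~crc;
	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRC32C_POLY_REFLECTED : 0);
	}
	return ~crc;
}

/* Slice t holds the CRC of byte i followed by t zero bytes. */
static void sb8_init(void)
{
	for (uint32_t i = 0; i < 256; i++) {
		uint32_t crc = i;

		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRC32C_POLY_REFLECTED : 0);
		sb8_table[0][i] = crc;
	}
	for (uint32_t i = 0; i < 256; i++) {
		for (int t = 1; t < 8; t++) {
			uint32_t prev = sb8_table[t - 1][i];

			sb8_table[t][i] = (prev >> 8) ^ sb8_table[0][prev & 0xff];
		}
	}
	sb8_ready = 1;
}

static uint32_t crc32c_sb8(uint32_t crc, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	assert(buf != NULL || len == 0);
	if (!sb8_ready)
		sb8_init();

	crc = ~crc;
	/* Main loop: fold 8 input bytes per iteration, one lookup per byte. */
	while (len >= 8) {
		/* Assemble bytes explicitly so the result is endian-neutral. */
		uint32_t lo = crc ^ ((uint32_t)p[0] | (uint32_t)p[1] << 8 |
				     (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);

		crc = sb8_table[7][lo & 0xff] ^
		      sb8_table[6][(lo >> 8) & 0xff] ^
		      sb8_table[5][(lo >> 16) & 0xff] ^
		      sb8_table[4][lo >> 24] ^
		      sb8_table[3][p[4]] ^
		      sb8_table[2][p[5]] ^
		      sb8_table[1][p[6]] ^
		      sb8_table[0][p[7]];
		p += 8;
		len -= 8;
	}
	/* Tail: fewer than 8 bytes left, fall back to one byte at a time. */
	while (len--)
		crc = (crc >> 8) ^ sb8_table[0][(crc ^ *p++) & 0xff];
	return ~crc;
}
```

Calling crc32c_sb8(0, "123456789", 9) should return 0xE3069283, the
standard CRC-32C check value, and should agree with crc32c_bitwise() on
any input.  The eight lookups per iteration are independent of each
other, which is roughly why the larger table pays off on CPUs with
enough cache to hold it.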
 libxfs/Makefile    |    4 ++--
 libxfs/crc32defs.h |   34 ++++++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/libxfs/Makefile b/libxfs/Makefile
index 0f3759e..c5dc382 100644
--- a/libxfs/Makefile
+++ b/libxfs/Makefile
@@ -124,7 +124,7 @@ LDIRT = gen_crc32table crc32table.h crc32selftest
 
 default: crc32selftest ltdepend $(LTLIBRARY)
 
-crc32table.h: gen_crc32table.c
+crc32table.h: gen_crc32table.c crc32defs.h
 	@echo "    [CC]     gen_crc32table"
 	$(Q) $(BUILD_CC) $(BUILD_CFLAGS) -o gen_crc32table $<
 	@echo "    [GENERATE] $@"
@@ -135,7 +135,7 @@ crc32table.h: gen_crc32table.c
 # systems/architectures. Hence we make sure that xfsprogs will never use a
 # busted CRC calculation at build time and hence avoid putting bad CRCs down on
 # disk.
-crc32selftest: gen_crc32table.c crc32table.h crc32.c
+crc32selftest: gen_crc32table.c crc32table.h crc32.c crc32defs.h
 	@echo "    [TEST]    CRC32"
 	$(Q) $(BUILD_CC) $(BUILD_CFLAGS) -D CRC32_SELFTEST=1 crc32.c -o $@
 	$(Q) ./$@
diff --git a/libxfs/crc32defs.h b/libxfs/crc32defs.h
index 64cba2c..2999782 100644
--- a/libxfs/crc32defs.h
+++ b/libxfs/crc32defs.h
@@ -1,4 +1,38 @@
 /*
+ * Use slice-by-8, which is the fastest variant.
+ *
+ * Calculate checksum 8 bytes at a time with a clever slicing algorithm.
+ * This is the fastest algorithm, but comes with an 8KiB lookup table.
+ * Most modern processors have enough cache to hold this table without
+ * thrashing the cache.
+ *
+ * The Linux kernel uses this as the default implementation "unless you
+ * have a good reason not to".  The reason why Kconfig urges you to pick
+ * SLICEBY8 is because people challenged the assertion that we should
+ * always use slice by 8, so Darrick wrote a crc microbenchmark utility
+ * and ran it on as many machines as he could get his hands on to show
+ * that sb8 was the fastest.
+ *
+ * Every 64-bit machine (and most of the 32-bit ones too) saw the best
+ * results with sb8.  Any machine with more than 4K of cache saw better
+ * results.  The spreadsheet still exists today[1]; note that
+ * 'crc32-kern-le' corresponds to the slice by 4 algorithm which is the
+ * default unless CRC_LE_BITS is defined explicitly.
+ *
+ * FWIW, there are a handful of board defconfigs in the kernel that
+ * don't pick sliceby8.  These are all embedded 32-bit mips/ppc systems
+ * with very small cache sizes which experience cache thrashing with the
+ * slice by 8 algorithm, and therefore chose to pick defaults that are
+ * saner for their particular board configuration.  For nearly all of
+ * XFS' perceived userbase (which we assume are 32 and 64-bit machines
+ * with sufficiently large CPU cache and largeish storage devices) slice
+ * by 8 is the right choice.
+ *
+ * [1] https://goo.gl/0LSzsG ("crc32c_bench")
+ */
+#define CRC_LE_BITS 64
+
+/*
  * There are multiple 16-bit CRC polynomials in common use, but this is
  * *the* standard CRC-32 polynomial, first popularized by Ethernet.
  * x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x^1+x^0