From: Dave Chinner <david@fromorbit.com>
To: Nicholas Piggin <npiggin@gmail.com>
Cc: linux-xfs@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
Dave Chinner <dchinner@redhat.com>,
"Darrick J. Wong" <darrick.wong@oracle.com>
Subject: Re: [rfc] larger batches for crc32c
Date: Fri, 28 Oct 2016 15:29:29 +1100 [thread overview]
Message-ID: <20161028042929.GH22126@dastard> (raw)
In-Reply-To: <20161028131234.24a5cb6f@roar.ozlabs.ibm.com>
On Fri, Oct 28, 2016 at 01:12:34PM +1100, Nicholas Piggin wrote:
> On Fri, 28 Oct 2016 08:42:44 +1100
> Dave Chinner <david@fromorbit.com> wrote:
> > > As a rule, it helps the crc implementation if it can operate on as large a
> > > chunk as possible (alignment, startup overhead, etc). So I did a quick hack
> > > at getting XFS checksumming to feed crc32c() with larger chunks, by setting
> > > the existing crc to 0 before running over the entire buffer. Together with
> > > some small work on the powerpc crc implementation, crc drops below 0.1%.
> >
> > I wouldn't have expected reducing call numbers and small alignment
> > changes to make that amount of difference given the amount of data
> > we are actually checksumming. How much of that difference was due to
> > the improved CRC implementation?
>
> Sorry, I should have been more clear about what was happening. Not enough
> sleep. The larger sizes allows the vectorized crc implementation to kick in.
Ah, ok. So it never gets out of the slow, branchy lead in/lead out
code for the smaller chunks.Fair enough.
For the verify side it probably doesn't matter than much - the
latency of the initial memory fetches on the data to be verified are
likely to be the dominant factor for performance...
> > FWIW, can you provide some additional context by grabbing the log
> > stats that tell us the load on the log that is generating this
> > profile? A sample over a minute of a typical workload (with a
> > corresponding CPU profile) would probably be sufficient. You can get
> > them simply by zeroing the xfs stats via
> > /proc/sys/fs/xfs/stats_clear at the start of the sample period and
> > then dumping /proc/fs/xfs/stat at the end.
>
> Yeah I'll get some better information for you.
>
> > > I don't know if something like this would be acceptable? It's not pretty,
> > > but I didn't see an easier way.
> >
> > ISTR we made the choice not to do that to avoid potential problems
> > with potential race conditions and bugs (i.e. don't modify anything
> > in objects on read access) but I can't point you at anything
> > specific...
>
> Sounds pretty reasonable, especially for the verifiers. For the paths
> that create/update the checksums (including this log checksum), it seems
> like it should be less controversial.
Yup. For the paths that update the checksum we have to have an
exclusive lock on the object (and will always require that), so I
can't see a problem with changing the update code to use this
method.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2016-10-28 4:29 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-10-27 16:17 [rfc] larger batches for crc32c Nicholas Piggin
2016-10-27 18:29 ` Darrick J. Wong
2016-10-28 3:21 ` Nicholas Piggin
2016-10-27 21:42 ` Dave Chinner
2016-10-27 23:16 ` Dave Chinner
2016-10-28 2:12 ` Nicholas Piggin
2016-10-28 4:29 ` Dave Chinner [this message]
2016-10-28 5:02 ` Nicholas Piggin
2016-10-31 3:08 ` Dave Chinner
2016-11-01 3:39 ` Nicholas Piggin
2016-11-01 5:47 ` Dave Chinner
2016-11-02 2:18 ` [rfe]: finobt option separable from crc option? (was [rfc] larger batches for crc32c) L.A. Walsh
2016-11-03 8:29 ` Dave Chinner
2016-11-03 16:04 ` L.A. Walsh
2016-11-03 18:15 ` Eric Sandeen
2016-11-03 23:00 ` Dave Chinner
2016-11-04 6:56 ` L.A. Walsh
2016-11-04 17:37 ` Eric Sandeen
2016-11-04 0:12 ` [rfc] larger batches for crc32c Dave Chinner
2016-11-04 2:28 ` Nicholas Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161028042929.GH22126@dastard \
--to=david@fromorbit.com \
--cc=darrick.wong@oracle.com \
--cc=dchinner@redhat.com \
--cc=hch@lst.de \
--cc=linux-xfs@vger.kernel.org \
--cc=npiggin@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).