From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH v2 02/12] libxfs: track largest metadata LSN in use via verifiers
Date: Fri, 2 Oct 2015 12:16:34 +1000
Message-ID: <20151002021634.GJ3902@dastard>
In-Reply-To: <20151001203851.GA3349@bfoster.bfoster>
On Thu, Oct 01, 2015 at 04:38:51PM -0400, Brian Foster wrote:
> On Thu, Sep 24, 2015 at 08:36:25AM +1000, Dave Chinner wrote:
> > On Wed, Sep 23, 2015 at 09:18:31AM -0400, Brian Foster wrote:
> > > > This will have the same lock contention problems that the kernel
> > > > code would have had - my repair scalability tests regularly reach
> > > > over 1GB/s of metadata being prefetched through tens of threads, so
> > > > this is going to have a significant impact on performance in those
> > > > tests....
> ...
> >
> > > I'll have to think about this some more and see what's effective. I'd
> > > also like to quantify the effect the current locking has on performance
> > > if possible. Can you provide a brief description of your typical repair
> > > test that you would expect this to hurt? E.g., a large fs, many AGs,
> > > populated with fs_mark and repaired with many threads..? Any special
> > > storage configuration? Thanks.
> >
> > Just my usual 500TB fs_mark test...
>
> Thanks for the test information and sample results. I wasn't able to get
> close enough to the base numbers you mentioned on IRC with the spinning
> rust storage I have available. Instead, I tried running something
> similar using a large ramdisk as a backing store. I have a 500T sparse
> file formatted with XFS and populated with ~25m inodes, which uses
> roughly 16GB of the backing store (leaving another 16GB of usable RAM for the
> server).
Keep in mind when doing this sort of testing that a ramdisk does not
perform like normal storage - it does synchronous IO via memcpy()
rather than async IO via DMA, and hence has a very different CPU
cache footprint, concurrency and latency profile compared to normal IO.
I'm not saying your numbers are invalid, just making sure that you
remember that ramdisks, while fast, may not give results that are
representative of normal storage behaviour...
> I run xfs_repair[1] against that 500TB fs and see spikes of throughput up
> over 2GB/s and get repair result reports like the following:
<snip>
> So I'm not seeing much difference here with the max lsn tracking as it
> is implemented in this series.
Yup, same here. The min/max spread is the same for the 4.2.0 debian
unstable package and a locally built 4.2.0+lsn-tracking binary: both
varied between 2m55s and 3m05s, and both used roughly the same user
and system CPU. So we'll commit it as is ;)
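
For anyone following along at home, the tracking under discussion
boils down to something like the sketch below (illustrative names
only, not the code from the patch): every read verifier reports the
LSN stamped in the metadata it just checked, and the largest value
seen so far is kept under a single global lock so the log can later
be formatted ahead of it. The contention concern earlier in the
thread comes from every buffer the prefetch threads verify having to
take that one lock:

/*
 * Illustrative sketch only - names and structure are made up here,
 * not lifted from the patch.
 * Build with: cc -o lsn_sketch lsn_sketch.c -pthread
 */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

typedef int64_t xfs_lsn_t;              /* cycle:block, as in XFS */

static xfs_lsn_t        max_lsn_seen;   /* largest metadata LSN observed */
static pthread_mutex_t  max_lsn_lock = PTHREAD_MUTEX_INITIALIZER;

/* Imagine this being called from every read verifier. */
static void
track_max_lsn(xfs_lsn_t lsn)
{
        pthread_mutex_lock(&max_lsn_lock);
        if (lsn > max_lsn_seen)
                max_lsn_seen = lsn;
        pthread_mutex_unlock(&max_lsn_lock);
}

int
main(void)
{
        /* simulate a few verifier callbacks */
        track_max_lsn(((xfs_lsn_t)1 << 32) | 5);
        track_max_lsn(((xfs_lsn_t)2 << 32) | 1);
        track_max_lsn(((xfs_lsn_t)1 << 32) | 10);
        printf("max LSN seen: 0x%llx\n", (unsigned long long)max_lsn_seen);
        return 0;
}

Given both binaries came in at the same runtime above, that lock
clearly isn't hurting at these update rates.
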
> Out of curiosity, I ran a v3.2.2
> xfs_repair binary that happened to be installed on this host, got a much
> faster result than even the current master, and via perf diff discovered
> that the biggest difference between the runs was actual CRC calculation.
I think you might have drawn the incorrect conclusion from the
profile data...
> Based on that, I ran the same crc=0 test against the current code with
> the following results:
>
> Phase           Start           End             Duration
> Phase 1:        10/01 13:53:49  10/01 13:53:49
> Phase 2:        10/01 13:53:49  10/01 13:53:50  1 second
> Phase 3:        10/01 13:53:50  10/01 13:54:52  1 minute, 2 seconds
> Phase 4:        10/01 13:54:52  10/01 13:55:01  9 seconds
> Phase 5:        10/01 13:55:01  10/01 13:55:01
> Phase 6:        10/01 13:55:01  10/01 13:55:35  34 seconds
> Phase 7:        10/01 13:55:35  10/01 13:55:35
>
> ... so that knocks off another 15s or so from the test. Note that the
> lsn lock is irrelevant in the crc=0 case as there are no metadata LSNs,
> thus no verification occurs.
My results:
                      fsmark creation
mkfs (4.2.0)          files/s (avg)   wall time   repair
defaults              203209          4m57s       2m55s-3m05s
-m crc=0              188837          5m06s       2m07s-2m10s
At first glance, I've just replicated your results. But the PCP
monitoring I run all the time tells a different story. Both
xfs_repair runs are IO bound during phases 3 and 4, but the crc=0
run does a *lot less IO*.
The reason for this will be obvious when I tell you: inodes are 512
bytes when CRCs are enabled, 256 bytes when they aren't. Hence
there's twice as much metadata to be read when CRCs are enabled, and
when it is read at a fixed rate, it takes twice as long to read.
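
To put back-of-the-envelope numbers on that (purely illustrative,
using the ~25m inode count from your test fs):

#include <stdio.h>

int
main(void)
{
        unsigned long long inodes = 25000000ULL;  /* ~25m inodes, as in the test fs */

        /* v5 (crc=1) inodes are 512 bytes on disk, v4 (crc=0) inodes are 256 bytes */
        printf("crc=1: ~%llu MB of inode metadata to read\n", inodes * 512 / 1000000);
        printf("crc=0: ~%llu MB of inode metadata to read\n", inodes * 256 / 1000000);
        return 0;
}

That's roughly 12.8GB vs 6.4GB of inodes for repair to pull through
the prefetch path before a single CRC is ever calculated.
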
The trap here is that perf profiles tend to present
"percentage of CPU used by a function" rather than absolute CPU
consumed by the function across a test run. That's why you might
think that the CPU consumed by CRCs is responsible for the
difference in performance - it's the only really obvious difference
in the profile. What I think you'll find if you look deeper is that
the time spent in memcpy() deep in the ramdisk IO path has doubled
and that's where most of the additional time comes from.
Apples to apples comparison at the IO level on my test rig:
                      fsmark creation
mkfs                  files/s (avg)   wall time   repair
defaults              203209          4m57s       2m55s-3m05s
-m crc=0              188837          5m06s       2m07s-2m10s
-m crc=0 -i size=512  194331          5m00s       3m00s-3m10s
Yup, when doing IO-equivalent testing, CRCs have no measurable
wall time impact on xfs_repair runtime.
Performance analysis is tricky stuff.... ;)
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com