From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH v2 02/12] libxfs: track largest metadata LSN in use via verifiers
Date: Thu, 1 Oct 2015 16:38:51 -0400	[thread overview]
Message-ID: <20151001203851.GA3349@bfoster.bfoster> (raw)
In-Reply-To: <20150923223625.GR19114@dastard>

On Thu, Sep 24, 2015 at 08:36:25AM +1000, Dave Chinner wrote:
> On Wed, Sep 23, 2015 at 09:18:31AM -0400, Brian Foster wrote:
> > On Wed, Sep 23, 2015 at 01:44:06PM +1000, Dave Chinner wrote:
> > > On Fri, Sep 11, 2015 at 02:55:32PM -0400, Brian Foster wrote:
> > > > +
> > > > +	pthread_mutex_lock(&libxfs_max_lsn_lock);
> > > > +
> > > > +	max_cycle = CYCLE_LSN(libxfs_max_lsn);
> > > > +	max_block = BLOCK_LSN(libxfs_max_lsn);
> > > > +
> > > > +	if ((cycle > max_cycle) ||
> > > > +	    (cycle == max_cycle && block > max_block))
> > > > +		libxfs_max_lsn = lsn;
> 
> Actually, we have XFS_LSN_CMP(lsn1, lsn2) for this. i.e.
> 
> 	if (XFS_LSN_CMP(lsn, libxfs_max_lsn) > 0)
> 		libxfs_max_lsn = lsn;
> 
> > > > +
> > > > +	pthread_mutex_unlock(&libxfs_max_lsn_lock);
> > > 
> > > This will have the same lock contention problems that the kernel
> > > code would have had - my repair scalability tests regularly reach
> > > over 1GB/s of metadata being prefetched through tens of threads, so
> > > this is going to have a significant impact on performance in those
> > > tests....
> > > 
> > 
...
> 
> > I'll have to think about this some more and see what's effective. I'd
> > also like to quantify the effect the current locking has on performance
> > if possible. Can you provide a brief description of your typical repair
> > test that you would expect this to hurt? E.g., a large fs, many AGs,
> > populated with fs_mark and repaired with many threads..? Any special
> > storage configuration? Thanks.
> 
> Just my usual 500TB fs_mark test...
> 

Thanks for the test information and sample results. I wasn't able to get
close enough to the base numbers you mentioned on IRC with the spinning
rust storage I have available, so instead I ran something similar using a
large ramdisk as the backing store: a 500T sparse file formatted with XFS
and populated with ~25m inodes, which consumes roughly 16GB of the
backing store (leaving another 16GB of usable RAM for the server).

Running xfs_repair[1] against that 500TB fs, I see throughput spikes of
over 2GB/s and repair reports like the following:

Phase           Start           End             Duration
Phase 1:        10/01 13:03:44  10/01 13:03:45  1 second
Phase 2:        10/01 13:03:45  10/01 13:03:46  1 second
Phase 3:        10/01 13:03:46  10/01 13:05:01  1 minute, 15 seconds
Phase 4:        10/01 13:05:01  10/01 13:05:14  13 seconds
Phase 5:        10/01 13:05:14  10/01 13:05:15  1 second
Phase 6:        10/01 13:05:15  10/01 13:05:50  35 seconds
Phase 7:        10/01 13:05:50  10/01 13:05:50

The numbers don't change much on repeated runs. If I take a quick and
dirty average of the durations of phases 3, 4 and 6 and compare against
results from the for-next branch, the runtime degradation is on the
order of tenths of a second. Here's a for-next (i.e., no max LSN
tracking) run for reference:

Phase           Start           End             Duration
Phase 1:        10/01 13:19:53  10/01 13:19:53
Phase 2:        10/01 13:19:53  10/01 13:19:56  3 seconds
Phase 3:        10/01 13:19:56  10/01 13:21:11  1 minute, 15 seconds
Phase 4:        10/01 13:21:11  10/01 13:21:22  11 seconds
Phase 5:        10/01 13:21:22  10/01 13:21:23  1 second
Phase 6:        10/01 13:21:23  10/01 13:21:57  34 seconds
Phase 7:        10/01 13:21:57  10/01 13:21:57

So I'm not seeing much difference here with the max LSN tracking as
implemented in this series. Out of curiosity, I also ran a v3.2.2
xfs_repair binary that happened to be installed on this host. It was
much faster than even the current master, and perf diff showed that the
biggest difference between the runs was the actual CRC calculation.
Based on that, I ran the same test against a crc=0 filesystem with the
current code, with the following results:

Phase           Start           End             Duration
Phase 1:        10/01 13:53:49  10/01 13:53:49
Phase 2:        10/01 13:53:49  10/01 13:53:50  1 second
Phase 3:        10/01 13:53:50  10/01 13:54:52  1 minute, 2 seconds
Phase 4:        10/01 13:54:52  10/01 13:55:01  9 seconds
Phase 5:        10/01 13:55:01  10/01 13:55:01
Phase 6:        10/01 13:55:01  10/01 13:55:35  34 seconds
Phase 7:        10/01 13:55:35  10/01 13:55:35

... so that knocks off another 15s or so from the test. Note that the
LSN lock is irrelevant in the crc=0 case: without v5 superblocks there
are no metadata LSNs, so no LSN verification or tracking occurs.
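
For reference, the update path being exercised in the crc=1 runs above is
essentially the hunk quoted at the top of this mail with the XFS_LSN_CMP()
suggestion folded in. A rough sketch (the helper name below is made up for
illustration; the globals and the locking come from the patch, with the
initialization sketched here):

	pthread_mutex_t	libxfs_max_lsn_lock = PTHREAD_MUTEX_INITIALIZER;
	xfs_lsn_t	libxfs_max_lsn;

	/*
	 * Illustrative helper name only. Verifiers only pull an LSN out of
	 * v5 (CRC-enabled) metadata, so on a crc=0 fs this is never called
	 * and the lock is never taken.
	 */
	static void
	libxfs_track_max_lsn(xfs_lsn_t lsn)
	{
		pthread_mutex_lock(&libxfs_max_lsn_lock);

		if (XFS_LSN_CMP(lsn, libxfs_max_lsn) > 0)
			libxfs_max_lsn = lsn;

		pthread_mutex_unlock(&libxfs_max_lsn_lock);
	}

So the per-buffer cost is one mutex lock/unlock around an LSN comparison,
which is presumably why it's lost in the noise for this workload.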

All in all, I can't reproduce any tangible degradation due to the max
LSN lock, and I'd rather not prematurely optimize it if it's not a
contention point in practice. Thoughts? If you get a chance, care to
give this code a quick run in your xfs_repair test environment? If you
can reproduce something there, I can keep digging into what might be
different in my test.

Brian

[1] xfs_repair -o bhash=100101 -v -v -t 1 -f <file>

> $ cat ~/tests/fsmark-50-test-xfs.sh 
> #!/bin/bash
> 
> QUOTA=
> MKFSOPTS=
> NFILES=100000
> DEV=/dev/vdc
> LOGBSIZE=256k
> 
> while [ $# -gt 0 ]; do
>         case "$1" in
>         -q)     QUOTA="uquota,gquota,pquota" ;;
>         -N)     NFILES=$2 ; shift ;;
>         -d)     DEV=$2 ; shift ;;
>         -l)     LOGBSIZE=$2; shift ;;
>         --)     shift ; break ;;
>         esac
>         shift
> done
> MKFSOPTS="$MKFSOPTS $*"
> 
> echo QUOTA=$QUOTA
> echo MKFSOPTS=$MKFSOPTS
> echo DEV=$DEV
> 
> sudo umount /mnt/scratch > /dev/null 2>&1
> sudo mkfs.xfs -f $MKFSOPTS $DEV
> sudo mount -o nobarrier,logbsize=$LOGBSIZE,$QUOTA $DEV /mnt/scratch
> sudo chmod 777 /mnt/scratch
> cd /home/dave/src/fs_mark-3.3/
> sudo sh -c "echo 1 > /proc/sys/fs/xfs/stats_clear"
> time ./fs_mark  -D  10000  -S0  -n  $NFILES  -s  0  -L  32 \
>         -d  /mnt/scratch/0  -d  /mnt/scratch/1 \
>         -d  /mnt/scratch/2  -d  /mnt/scratch/3 \
>         -d  /mnt/scratch/4  -d  /mnt/scratch/5 \
>         -d  /mnt/scratch/6  -d  /mnt/scratch/7 \
>         -d  /mnt/scratch/8  -d  /mnt/scratch/9 \
>         -d  /mnt/scratch/10  -d  /mnt/scratch/11 \
>         -d  /mnt/scratch/12  -d  /mnt/scratch/13 \
>         -d  /mnt/scratch/14  -d  /mnt/scratch/15 \
>         | tee >(stats --trim-outliers | tail -1 1>&2)
> sync
> 
> echo Repair
> sudo umount /mnt/scratch
> time sudo xfs_repair -o bhash=100101 -v -v -t 1 $DEV
> time sudo mount -o nobarrier,logbsize=$LOGBSIZE,$QUOTA $DEV /mnt/scratch
> 
> echo bulkstat files
> 
> time (
>         sudo ~/src/xfstests-dev/src/bstat  -q /mnt/scratch 1024 | wc -l
> )
> 
> echo walking files
> ~/tests/walk-scratch.sh
> 
> echo removing files
> for f in /mnt/scratch/* ; do time rm -rf $f &  done
> wait
> 
> sudo umount /mnt/scratch
> $
> 
> -- 
> Dave Chinner
> david@fromorbit.com

