public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH v2 02/12] libxfs: track largest metadata LSN in use via verifiers
Date: Thu, 24 Sep 2015 08:36:25 +1000	[thread overview]
Message-ID: <20150923223625.GR19114@dastard> (raw)
In-Reply-To: <20150923131831.GB37210@bfoster.bfoster>

On Wed, Sep 23, 2015 at 09:18:31AM -0400, Brian Foster wrote:
> On Wed, Sep 23, 2015 at 01:44:06PM +1000, Dave Chinner wrote:
> > On Fri, Sep 11, 2015 at 02:55:32PM -0400, Brian Foster wrote:
> > > +
> > > +	pthread_mutex_lock(&libxfs_max_lsn_lock);
> > > +
> > > +	max_cycle = CYCLE_LSN(libxfs_max_lsn);
> > > +	max_block = BLOCK_LSN(libxfs_max_lsn);
> > > +
> > > +	if ((cycle > max_cycle) ||
> > > +	    (cycle == max_cycle && block > max_block))
> > > +		libxfs_max_lsn = lsn;

Actually, we have XFS_LSN_CMP(lsn1, lsn2) for this. i.e.

	if (XFS_LSN_CMP(lsn, libxfs_max_lsn) > 0)
		libxfs_max_lsn = lsn;
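
For reference, a self-contained sketch of what that comparison does (the
type and macro definitions here are hypothetical stand-ins, not the real
libxfs ones): an LSN packs the log cycle in the high 32 bits and the log
block in the low 32 bits, and XFS_LSN_CMP orders by cycle first, then block.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the libxfs definitions. */
typedef int64_t xfs_lsn_t;

#define CYCLE_LSN(lsn)	((uint32_t)((lsn) >> 32))
#define BLOCK_LSN(lsn)	((uint32_t)((lsn) & 0xffffffff))

/*
 * What XFS_LSN_CMP boils down to: order by cycle first, then by
 * block, returning <0, 0 or >0 like memcmp().
 */
static int xfs_lsn_cmp(xfs_lsn_t lsn1, xfs_lsn_t lsn2)
{
	if (CYCLE_LSN(lsn1) != CYCLE_LSN(lsn2))
		return CYCLE_LSN(lsn1) < CYCLE_LSN(lsn2) ? -1 : 1;
	if (BLOCK_LSN(lsn1) != BLOCK_LSN(lsn2))
		return BLOCK_LSN(lsn1) < BLOCK_LSN(lsn2) ? -1 : 1;
	return 0;
}
```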

> > > +
> > > +	pthread_mutex_unlock(&libxfs_max_lsn_lock);
> > 
> > This will have the same lock contention problems that the kernel
> > code would have had - my repair scalability tests regularly reach
> > over 1GB/s of metadata being prefetched through tens of threads, so
> > this is going to have a significant impact on performance in those
> > tests....
> > 
> 
> Hmm, Ok. I initially didn't have locking here (by accident) but
> reproduced some breakage for which the exact details escape me. I
> suspect it was a test and set race.
> 
> I suppose we could try doing something like what we plan to do on the
> kernel side in terms of a racy check followed by the locked check so the
> lock contention goes away once we find the max lsn. The context here is
> different, however, in that we're using a 64-bit LSN rather than
> independently updated cycle and block fields. It could also take a while
> to stabilize the max lsn depending on the fs. (There are some details on
> the kernel side I'd like to get worked out first as well).

It still has to work on 32 bit machines, where 64 bit operations are
not atomic....
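
As a sketch of that "racy check followed by locked check" idea (all names
here are hypothetical stand-ins for the libxfs globals): the unlocked
pre-check filters out the common case once the max stabilises, and the
value is re-checked under the mutex before updating to close the
test-and-set race. The caveat above applies: the unlocked 64-bit load can
tear on 32-bit machines, so a garbage high word could wrongly skip a
needed update; this pattern is only safe where 64-bit loads cannot tear.

```c
#include <pthread.h>
#include <stdint.h>

typedef int64_t xfs_lsn_t;

#define CYCLE_LSN(lsn)	((uint32_t)((lsn) >> 32))
#define BLOCK_LSN(lsn)	((uint32_t)((lsn) & 0xffffffff))

/* <0/0/>0 ordering by cycle, then block, as XFS_LSN_CMP does. */
static int lsn_cmp(xfs_lsn_t a, xfs_lsn_t b)
{
	if (CYCLE_LSN(a) != CYCLE_LSN(b))
		return CYCLE_LSN(a) < CYCLE_LSN(b) ? -1 : 1;
	if (BLOCK_LSN(a) != BLOCK_LSN(b))
		return BLOCK_LSN(a) < BLOCK_LSN(b) ? -1 : 1;
	return 0;
}

static pthread_mutex_t max_lsn_lock = PTHREAD_MUTEX_INITIALIZER;
static xfs_lsn_t max_lsn;	/* stand-in for libxfs_max_lsn */

static void track_max_lsn(xfs_lsn_t lsn)
{
	/*
	 * Racy pre-check: once max_lsn stabilises, most callers bail
	 * out here without touching the lock. Only valid where the
	 * 64-bit load cannot tear.
	 */
	if (lsn_cmp(lsn, max_lsn) <= 0)
		return;

	pthread_mutex_lock(&max_lsn_lock);
	/* Re-check under the lock to close the test-and-set race. */
	if (lsn_cmp(lsn, max_lsn) > 0)
		max_lsn = lsn;
	pthread_mutex_unlock(&max_lsn_lock);
}
```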

> Perhaps we could try to make this smarter... in the case where either
> the log has been zeroed or we've identified an LSN beyond the current
> log state, we only really need to track the max cycle value since a log
> format is imminent at that point. In the case where the fs is
> consistent, we could seed the starting value with the current log lsn,
> or track per-ag max LSNs without locking and compare them at the end,
> etc.

Yes, we should seed the initial "max lsn" with whatever we find in
the log, regardless of whether the fs is consistent or not...

Also, per-ag max lsn tracking would break the lock up - that will
bring scope down to the point where contention won't be a problem...
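
A minimal sketch of that per-AG split (the struct and function names are
hypothetical, not proposed libxfs API): each AG keeps its own maximum,
which needs no lock when a single thread processes that AG, and the
per-AG maxima are folded into one value at the end, seeded from whatever
the log scan found.

```c
#include <stdint.h>

typedef int64_t xfs_lsn_t;

#define CYCLE_LSN(lsn)	((uint32_t)((lsn) >> 32))
#define BLOCK_LSN(lsn)	((uint32_t)((lsn) & 0xffffffff))

/* <0/0/>0 ordering by cycle, then block, as XFS_LSN_CMP does. */
static int lsn_cmp(xfs_lsn_t a, xfs_lsn_t b)
{
	if (CYCLE_LSN(a) != CYCLE_LSN(b))
		return CYCLE_LSN(a) < CYCLE_LSN(b) ? -1 : 1;
	if (BLOCK_LSN(a) != BLOCK_LSN(b))
		return BLOCK_LSN(a) < BLOCK_LSN(b) ? -1 : 1;
	return 0;
}

/* Hypothetical per-AG slot: lock-free if one thread works per AG. */
struct ag_max_lsn {
	xfs_lsn_t	max_lsn;
};

/* Fold the per-AG maxima into one value once all AGs are done. */
static xfs_lsn_t fold_max_lsn(struct ag_max_lsn *ags, int agcount,
			      xfs_lsn_t seed)
{
	xfs_lsn_t max = seed;	/* seeded from the log scan */
	int i;

	for (i = 0; i < agcount; i++)
		if (lsn_cmp(ags[i].max_lsn, max) > 0)
			max = ags[i].max_lsn;
	return max;
}
```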

> I'll have to think about this some more and see what's effective. I'd
> also like to quantify the effect the current locking has on performance
> if possible. Can you provide a brief description of your typical repair
> test that you would expect this to hurt? E.g., a large fs, many AGs,
> populated with fs_mark and repaired with many threads..? Any special
> storage configuration? Thanks.

Just my usual 500TB fs_mark test...

$ cat ~/tests/fsmark-50-test-xfs.sh 
#!/bin/bash

QUOTA=
MKFSOPTS=
NFILES=100000
DEV=/dev/vdc
LOGBSIZE=256k

while [ $# -gt 0 ]; do
        case "$1" in
        -q)     QUOTA="uquota,gquota,pquota" ;;
        -N)     NFILES=$2 ; shift ;;
        -d)     DEV=$2 ; shift ;;
        -l)     LOGBSIZE=$2; shift ;;
        --)     shift ; break ;;
        esac
        shift
done
MKFSOPTS="$MKFSOPTS $*"

echo QUOTA=$QUOTA
echo MKFSOPTS=$MKFSOPTS
echo DEV=$DEV

sudo umount /mnt/scratch > /dev/null 2>&1
sudo mkfs.xfs -f $MKFSOPTS $DEV
sudo mount -o nobarrier,logbsize=$LOGBSIZE,$QUOTA $DEV /mnt/scratch
sudo chmod 777 /mnt/scratch
cd /home/dave/src/fs_mark-3.3/
sudo sh -c "echo 1 > /proc/sys/fs/xfs/stats_clear"
time ./fs_mark  -D  10000  -S0  -n  $NFILES  -s  0  -L  32 \
        -d  /mnt/scratch/0  -d  /mnt/scratch/1 \
        -d  /mnt/scratch/2  -d  /mnt/scratch/3 \
        -d  /mnt/scratch/4  -d  /mnt/scratch/5 \
        -d  /mnt/scratch/6  -d  /mnt/scratch/7 \
        -d  /mnt/scratch/8  -d  /mnt/scratch/9 \
        -d  /mnt/scratch/10  -d  /mnt/scratch/11 \
        -d  /mnt/scratch/12  -d  /mnt/scratch/13 \
        -d  /mnt/scratch/14  -d  /mnt/scratch/15 \
        | tee >(stats --trim-outliers | tail -1 1>&2)
sync

echo Repair
sudo umount /mnt/scratch
time sudo xfs_repair -o bhash=100101 -v -v -t 1 $DEV
time sudo mount -o nobarrier,logbsize=$LOGBSIZE,$QUOTA $DEV /mnt/scratch

echo bulkstat files

time (
        sudo ~/src/xfstests-dev/src/bstat  -q /mnt/scratch 1024 | wc -l
)

echo walking files
~/tests/walk-scratch.sh

echo removing files
for f in /mnt/scratch/* ; do time rm -rf $f &  done
wait

sudo umount /mnt/scratch
$

-- 
Dave Chinner
david@fromorbit.com


Thread overview: 23+ messages
2015-09-11 18:55 [PATCH v2 00/12] xfsprogs: format the log correctly on v5 supers Brian Foster
2015-09-11 18:55 ` [PATCH v2 01/12] libxfs: validate metadata LSNs against log on v5 superblocks Brian Foster
2015-09-11 18:55 ` [PATCH v2 02/12] libxfs: track largest metadata LSN in use via verifiers Brian Foster
2015-09-23  3:44   ` Dave Chinner
2015-09-23 13:18     ` Brian Foster
2015-09-23 22:36       ` Dave Chinner [this message]
2015-10-01 20:38         ` Brian Foster
2015-10-02  2:16           ` Dave Chinner
2015-10-02 11:33             ` Brian Foster
2015-09-11 18:55 ` [PATCH v2 03/12] libxfs: don't hardcode cycle 1 into unmount op header Brian Foster
2015-09-23  3:48   ` Dave Chinner
2015-09-23 13:22     ` Brian Foster
2015-09-24  0:37       ` Dave Chinner
2015-09-24 13:00         ` Brian Foster
2015-09-11 18:55 ` [PATCH v2 04/12] libxfs: pass lsn param to log clear and record header logging helpers Brian Foster
2015-09-11 18:55 ` [PATCH v2 05/12] libxfs: add ability to clear log to arbitrary log cycle Brian Foster
2015-09-11 18:55 ` [PATCH v2 06/12] libxlog: pull struct xlog out of xlog_is_dirty() Brian Foster
2015-09-11 18:55 ` [PATCH v2 07/12] xfs_repair: track log state throughout all recovery phases Brian Foster
2015-09-11 18:55 ` [PATCH v2 08/12] xfs_repair: process the log in no_modify mode Brian Foster
2015-09-11 18:55 ` [PATCH v2 09/12] xfs_repair: format the log with forward cycle number on v5 supers Brian Foster
2015-09-11 18:55 ` [PATCH v2 10/12] xfs_repair: don't clear the log by default Brian Foster
2015-09-11 18:55 ` [PATCH v2 11/12] xfs_db: do not reset current lsn from uuid command on v5 supers Brian Foster
2015-09-11 18:55 ` [PATCH v2 12/12] db/metadump: bump lsn when log is cleared " Brian Foster
