From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH v2 02/12] libxfs: track largest metadata LSN in use via verifiers
Date: Thu, 24 Sep 2015 08:36:25 +1000 [thread overview]
Message-ID: <20150923223625.GR19114@dastard> (raw)
In-Reply-To: <20150923131831.GB37210@bfoster.bfoster>
On Wed, Sep 23, 2015 at 09:18:31AM -0400, Brian Foster wrote:
> On Wed, Sep 23, 2015 at 01:44:06PM +1000, Dave Chinner wrote:
> > On Fri, Sep 11, 2015 at 02:55:32PM -0400, Brian Foster wrote:
> > > +
> > > + pthread_mutex_lock(&libxfs_max_lsn_lock);
> > > +
> > > + max_cycle = CYCLE_LSN(libxfs_max_lsn);
> > > + max_block = BLOCK_LSN(libxfs_max_lsn);
> > > +
> > > + if ((cycle > max_cycle) ||
> > > + (cycle == max_cycle && block > max_block))
> > > + libxfs_max_lsn = lsn;
Actually, we have XFS_LSN_CMP(lsn1, lsn2) for this. i.e.
if (XFS_LSN_CMP(lsn, libxfs_max_lsn) > 0)
libxfs_max_lsn = lsn;
> > > +
> > > + pthread_mutex_unlock(&libxfs_max_lsn_lock);
> >
> > This will have the same lock contention problems that the kernel
> > code would have had - my repair scalablity tests regularly reach
> > over 1GB/s of metadata being prefetched through tens of threads, so
> > this is going have a significant impact on performance in those
> > tests....
> >
>
> Hmm, Ok. I initially didn't have locking here (by accident) but
> reproduced some breakage for which the exact details escape me. I
> suspect it was a test and set race.
>
> I suppose we could try doing something like what we plan to do on the
> kernel side in terms of a racy check followed by the locked check so the
> lock contention goes away once we find the max lsn. The context here is
> different, however, in that we're using a 64-bit LSN rather than
> independently updated cycle and block fields. It could also take a while
> to stabilize the max lsn depending on the fs. (There are some details on
> the kernel side I'd like to get worked out first as well).
It still has to work on 32 bit machines, where 64 bit operations are
not atomic....
> Perhaps we could try to make this smarter... in the case where either
> the log has been zeroed or we've identified an LSN beyond the current
> log state, we only really need to track the max cycle value since a log
> format is imminent at that point. In the case where the fs is
> consistent, we could seed the starting value with the current log lsn,
> or track per-ag max LSNs without locking and compare them at at the end,
> etc.
Yes, we should seen the initial "max lsn" with whatever we find in
the log, regardless of whether the fs is consistent or not...
Also, per-ag max lsn tracking would break the lock up - that will
bring scope down to the point where contention won't be a problem...
> I'll have to think about this some more and see what's effective. I'd
> also like to quantify the effect the current locking has on performance
> if possible. Can you provide a brief description of your typical repair
> test that you would expect this to hurt? E.g., a large fs, many AGs,
> populated with fs_mark and repaired with many threads..? Any special
> storage configuration? Thanks.
Just my usual 500TB fs_mark test...
$ cat ~/tests/fsmark-50-test-xfs.sh
#!/bin/bash
QUOTA=
MKFSOPTS=
NFILES=100000
DEV=/dev/vdc
LOGBSIZE=256k
while [ $# -gt 0 ]; do
case "$1" in
-q) QUOTA="uquota,gquota,pquota" ;;
-N) NFILES=$2 ; shift ;;
-d) DEV=$2 ; shift ;;
-l) LOGBSIZE=$2; shift ;;
--) shift ; break ;;
esac
shift
done
MKFSOPTS="$MKFSOPTS $*"
echo QUOTA=$QUOTA
echo MKFSOPTS=$MKFSOPTS
echo DEV=$DEV
sudo umount /mnt/scratch > /dev/null 2>&1
sudo mkfs.xfs -f $MKFSOPTS $DEV
sudo mount -o nobarrier,logbsize=$LOGBSIZE,$QUOTA $DEV /mnt/scratch
sudo chmod 777 /mnt/scratch
cd /home/dave/src/fs_mark-3.3/
sudo sh -c "echo 1 > /proc/sys/fs/xfs/stats_clear"
time ./fs_mark -D 10000 -S0 -n $NFILES -s 0 -L 32 \
-d /mnt/scratch/0 -d /mnt/scratch/1 \
-d /mnt/scratch/2 -d /mnt/scratch/3 \
-d /mnt/scratch/4 -d /mnt/scratch/5 \
-d /mnt/scratch/6 -d /mnt/scratch/7 \
-d /mnt/scratch/8 -d /mnt/scratch/9 \
-d /mnt/scratch/10 -d /mnt/scratch/11 \
-d /mnt/scratch/12 -d /mnt/scratch/13 \
-d /mnt/scratch/14 -d /mnt/scratch/15 \
| tee >(stats --trim-outliers | tail -1 1>&2)
sync
echo Repair
sudo umount /mnt/scratch
time sudo xfs_repair -o bhash=100101 -v -v -t 1 $DEV
time sudo mount -o nobarrier,logbsize=$LOGBSIZE,$QUOTA $DEV /mnt/scratch
echo bulkstat files
time (
sudo ~/src/xfstests-dev/src/bstat -q /mnt/scratch 1024 | wc -l
)
echo walking files
~/tests/walk-scratch.sh
echo removing files
for f in /mnt/scratch/* ; do time rm -rf $f & done
wait
sudo umount /mnt/scratch
$
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2015-09-23 22:45 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-11 18:55 [PATCH v2 00/12] xfsprogs: format the log correctly on v5 supers Brian Foster
2015-09-11 18:55 ` [PATCH v2 01/12] libxfs: validate metadata LSNs against log on v5 superblocks Brian Foster
2015-09-11 18:55 ` [PATCH v2 02/12] libxfs: track largest metadata LSN in use via verifiers Brian Foster
2015-09-23 3:44 ` Dave Chinner
2015-09-23 13:18 ` Brian Foster
2015-09-23 22:36 ` Dave Chinner [this message]
2015-10-01 20:38 ` Brian Foster
2015-10-02 2:16 ` Dave Chinner
2015-10-02 11:33 ` Brian Foster
2015-09-11 18:55 ` [PATCH v2 03/12] libxfs: don't hardcode cycle 1 into unmount op header Brian Foster
2015-09-23 3:48 ` Dave Chinner
2015-09-23 13:22 ` Brian Foster
2015-09-24 0:37 ` Dave Chinner
2015-09-24 13:00 ` Brian Foster
2015-09-11 18:55 ` [PATCH v2 04/12] libxfs: pass lsn param to log clear and record header logging helpers Brian Foster
2015-09-11 18:55 ` [PATCH v2 05/12] libxfs: add ability to clear log to arbitrary log cycle Brian Foster
2015-09-11 18:55 ` [PATCH v2 06/12] libxlog: pull struct xlog out of xlog_is_dirty() Brian Foster
2015-09-11 18:55 ` [PATCH v2 07/12] xfs_repair: track log state throughout all recovery phases Brian Foster
2015-09-11 18:55 ` [PATCH v2 08/12] xfs_repair: process the log in no_modify mode Brian Foster
2015-09-11 18:55 ` [PATCH v2 09/12] xfs_repair: format the log with forward cycle number on v5 supers Brian Foster
2015-09-11 18:55 ` [PATCH v2 10/12] xfs_repair: don't clear the log by default Brian Foster
2015-09-11 18:55 ` [PATCH v2 11/12] xfs_db: do not reset current lsn from uuid command on v5 supers Brian Foster
2015-09-11 18:55 ` [PATCH v2 12/12] db/metadump: bump lsn when log is cleared " Brian Foster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150923223625.GR19114@dastard \
--to=david@fromorbit.com \
--cc=bfoster@redhat.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox