From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH v2 02/12] libxfs: track largest metadata LSN in use via verifiers
Date: Thu, 24 Sep 2015 08:36:25 +1000 [thread overview]
Message-ID: <20150923223625.GR19114@dastard> (raw)
In-Reply-To: <20150923131831.GB37210@bfoster.bfoster>
On Wed, Sep 23, 2015 at 09:18:31AM -0400, Brian Foster wrote:
> On Wed, Sep 23, 2015 at 01:44:06PM +1000, Dave Chinner wrote:
> > On Fri, Sep 11, 2015 at 02:55:32PM -0400, Brian Foster wrote:
> > > +
> > > + pthread_mutex_lock(&libxfs_max_lsn_lock);
> > > +
> > > + max_cycle = CYCLE_LSN(libxfs_max_lsn);
> > > + max_block = BLOCK_LSN(libxfs_max_lsn);
> > > +
> > > + if ((cycle > max_cycle) ||
> > > + (cycle == max_cycle && block > max_block))
> > > + libxfs_max_lsn = lsn;
Actually, we have XFS_LSN_CMP(lsn1, lsn2) for this. i.e.
if (XFS_LSN_CMP(lsn, libxfs_max_lsn) > 0)
libxfs_max_lsn = lsn;
> > > +
> > > + pthread_mutex_unlock(&libxfs_max_lsn_lock);
> >
> > This will have the same lock contention problems that the kernel
> > code would have had - my repair scalablity tests regularly reach
> > over 1GB/s of metadata being prefetched through tens of threads, so
> > this is going have a significant impact on performance in those
> > tests....
> >
>
> Hmm, Ok. I initially didn't have locking here (by accident) but
> reproduced some breakage for which the exact details escape me. I
> suspect it was a test and set race.
>
> I suppose we could try doing something like what we plan to do on the
> kernel side in terms of a racy check followed by the locked check so the
> lock contention goes away once we find the max lsn. The context here is
> different, however, in that we're using a 64-bit LSN rather than
> independently updated cycle and block fields. It could also take a while
> to stabilize the max lsn depending on the fs. (There are some details on
> the kernel side I'd like to get worked out first as well).
It still has to work on 32 bit machines, where 64 bit operations are
not atomic....
> Perhaps we could try to make this smarter... in the case where either
> the log has been zeroed or we've identified an LSN beyond the current
> log state, we only really need to track the max cycle value since a log
> format is imminent at that point. In the case where the fs is
> consistent, we could seed the starting value with the current log lsn,
> or track per-ag max LSNs without locking and compare them at at the end,
> etc.
Yes, we should seen the initial "max lsn" with whatever we find in
the log, regardless of whether the fs is consistent or not...
Also, per-ag max lsn tracking would break the lock up - that will
bring scope down to the point where contention won't be a problem...
> I'll have to think about this some more and see what's effective. I'd
> also like to quantify the effect the current locking has on performance
> if possible. Can you provide a brief description of your typical repair
> test that you would expect this to hurt? E.g., a large fs, many AGs,
> populated with fs_mark and repaired with many threads..? Any special
> storage configuration? Thanks.
Just my usual 500TB fs_mark test...
$ cat ~/tests/fsmark-50-test-xfs.sh
#!/bin/bash
QUOTA=
MKFSOPTS=
NFILES=100000
DEV=/dev/vdc
LOGBSIZE=256k
while [ $# -gt 0 ]; do
case "$1" in
-q) QUOTA="uquota,gquota,pquota" ;;
-N) NFILES=$2 ; shift ;;
-d) DEV=$2 ; shift ;;
-l) LOGBSIZE=$2; shift ;;
--) shift ; break ;;
esac
shift
done
MKFSOPTS="$MKFSOPTS $*"
echo QUOTA=$QUOTA
echo MKFSOPTS=$MKFSOPTS
echo DEV=$DEV
sudo umount /mnt/scratch > /dev/null 2>&1
sudo mkfs.xfs -f $MKFSOPTS $DEV
sudo mount -o nobarrier,logbsize=$LOGBSIZE,$QUOTA $DEV /mnt/scratch
sudo chmod 777 /mnt/scratch
cd /home/dave/src/fs_mark-3.3/
sudo sh -c "echo 1 > /proc/sys/fs/xfs/stats_clear"
time ./fs_mark -D 10000 -S0 -n $NFILES -s 0 -L 32 \
-d /mnt/scratch/0 -d /mnt/scratch/1 \
-d /mnt/scratch/2 -d /mnt/scratch/3 \
-d /mnt/scratch/4 -d /mnt/scratch/5 \
-d /mnt/scratch/6 -d /mnt/scratch/7 \
-d /mnt/scratch/8 -d /mnt/scratch/9 \
-d /mnt/scratch/10 -d /mnt/scratch/11 \
-d /mnt/scratch/12 -d /mnt/scratch/13 \
-d /mnt/scratch/14 -d /mnt/scratch/15 \
| tee >(stats --trim-outliers | tail -1 1>&2)
sync
echo Repair
sudo umount /mnt/scratch
time sudo xfs_repair -o bhash=100101 -v -v -t 1 $DEV
time sudo mount -o nobarrier,logbsize=$LOGBSIZE,$QUOTA $DEV /mnt/scratch
echo bulkstat files
time (
sudo ~/src/xfstests-dev/src/bstat -q /mnt/scratch 1024 | wc -l
)
echo walking files
~/tests/walk-scratch.sh
echo removing files
for f in /mnt/scratch/* ; do time rm -rf $f & done
wait
sudo umount /mnt/scratch
$
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2015-09-23 22:45 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-11 18:55 [PATCH v2 00/12] xfsprogs: format the log correctly on v5 supers Brian Foster
2015-09-11 18:55 ` [PATCH v2 01/12] libxfs: validate metadata LSNs against log on v5 superblocks Brian Foster
2015-09-11 18:55 ` [PATCH v2 02/12] libxfs: track largest metadata LSN in use via verifiers Brian Foster
2015-09-23 3:44 ` Dave Chinner
2015-09-23 13:18 ` Brian Foster
2015-09-23 22:36 ` Dave Chinner [this message]
2015-10-01 20:38 ` Brian Foster
2015-10-02 2:16 ` Dave Chinner
2015-10-02 11:33 ` Brian Foster
2015-09-11 18:55 ` [PATCH v2 03/12] libxfs: don't hardcode cycle 1 into unmount op header Brian Foster
2015-09-23 3:48 ` Dave Chinner
2015-09-23 13:22 ` Brian Foster
2015-09-24 0:37 ` Dave Chinner
2015-09-24 13:00 ` Brian Foster
2015-09-11 18:55 ` [PATCH v2 04/12] libxfs: pass lsn param to log clear and record header logging helpers Brian Foster
2015-09-11 18:55 ` [PATCH v2 05/12] libxfs: add ability to clear log to arbitrary log cycle Brian Foster
2015-09-11 18:55 ` [PATCH v2 06/12] libxlog: pull struct xlog out of xlog_is_dirty() Brian Foster
2015-09-11 18:55 ` [PATCH v2 07/12] xfs_repair: track log state throughout all recovery phases Brian Foster
2015-09-11 18:55 ` [PATCH v2 08/12] xfs_repair: process the log in no_modify mode Brian Foster
2015-09-11 18:55 ` [PATCH v2 09/12] xfs_repair: format the log with forward cycle number on v5 supers Brian Foster
2015-09-11 18:55 ` [PATCH v2 10/12] xfs_repair: don't clear the log by default Brian Foster
2015-09-11 18:55 ` [PATCH v2 11/12] xfs_db: do not reset current lsn from uuid command on v5 supers Brian Foster
2015-09-11 18:55 ` [PATCH v2 12/12] db/metadump: bump lsn when log is cleared " Brian Foster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150923223625.GR19114@dastard \
--to=david@fromorbit.com \
--cc=bfoster@redhat.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.