From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>, xfs@oss.sgi.com
Subject: Re: [PATCH v2 00/11] xfs: introduce the free inode btree
Date: Wed, 20 Nov 2013 09:17:48 +1100 [thread overview]
Message-ID: <20131119221748.GR11434@dastard> (raw)
In-Reply-To: <528BD853.8090900@redhat.com>

On Tue, Nov 19, 2013 at 04:29:55PM -0500, Brian Foster wrote:
> On 11/13/2013 04:10 PM, Dave Chinner wrote:
> ...
> >
> > The problem can be demonstrated with a single CPU and a single
> > spindle. Create a single AG filesystem of a 100GB, and populate it
> > with 10 million inodes.
> >
> > Time how long it takes to create another 10000 inodes in a new
> > directory. Measure CPU usage.
> >
> > Randomly delete 10,000 inodes from the original population to
> > sparsely populate the inobt with 10000 free inodes.
> >
> > Time how long it takes to create another 10000 inodes in a new
> > directory. Measure CPU usage.
> >
> > The difference in time and CPU will be directly related to the
> > additional time spent searching the inobt for free inodes...
> >
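
For anyone reproducing this, the "randomly delete 10,000 inodes" step
above is the only part that needs a little scripting; a minimal sketch
of the victim selection (counts and seed are illustrative, and the
mapping from index to file path depends on how the populate run named
its files):

```python
import random

def pick_victims(total_files, n_remove, seed=None):
    """Pick which of the pre-created files to delete, so the freed
    inodes end up scattered across the whole inode btree."""
    rng = random.Random(seed)
    # sample() picks without replacement, so exactly n_remove distinct
    # files (and hence inodes) are freed.
    return sorted(rng.sample(range(total_files), n_remove))

# Scatter 10,000 free inodes among 10 million allocated ones.
victims = pick_victims(10_000_000, 10_000, seed=42)
print(len(victims))   # → 10000
```

Each index would then be mapped back to a path from the populate run
(e.g. a hypothetical `file-%07d` naming scheme) and unlinked.
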
>
> Thanks for the suggestion, Dave. I've run some fs_mark tests along the
> lines of what is described here. I create 10m files, randomly remove
> ~10k from that dataset and measure the process of allocating 10k new
> inodes in both finobt and non-finobt scenarios (after a clean remount).
>
> The tests run from a 4xcpu VM with 4GB RAM and against an isolated SATA
> drive I had lying around (mapped directly via virtio). The drive is
> formatted with a single VG/LV and as follows with xfs:
>
> meta-data=/dev/mapper/testvg-testlv isize=512    agcount=1, agsize=26214400 blks
>          =                          sectsz=512   attr=2, projid32bit=1
>          =                          crc=1        finobt=0
> data     =                          bsize=4096   blocks=26214400, imaxpct=25
>          =                          sunit=0      swidth=0 blks
> naming   =version 2                 bsize=4096   ascii-ci=0 ftype=1
> log      =internal                  bsize=4096   blocks=12800, version=2
>          =                          sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                      extsz=4096   blocks=0, rtextents=0
>
> Once the fs has been prepared with a random set of free inodes, the
> following command is used to measure performance:
>
> fs_mark -k -S 0 -D 4 -L 10 -n 1000 -s 0 -d /mnt/testdir
>
> I've also collected some perf record data of these commands to compare
> CPU usage. I can make the full/raw data available if desirable. Snippets
> of the results are included below.
>
> --- non-finobt, agi freecount = 9961 after random removal
>
> - fs_mark
>
> FSUse%        Count         Size    Files/sec     App Overhead
>      5         1000            0       1020.1            10811
>      5         2000            0        361.4            19498
>      5         3000            0        230.1            12154
>      5         4000            0        166.7            12816
>      5         5000            0        129.7            27409
>      5         6000            0        105.7            13946
>      5         7000            0         87.6            31792
>      5         8000            0         77.8            14921
>      5         9000            0         67.3            15597
>      5        10000            0         62.4            15835

Yes, that's pretty much as I expected - exponential degradation due
to the increasing search radius from the parent directory location...

> --- finobt, agi freecount = 10137 after random removal
>
> - fs_mark
>
> FSUse%        Count         Size    Files/sec     App Overhead
>      5         1000            0       9210.0             8587
>      5         2000            0       5592.1            14933
>      5         3000            0       7095.4            11355
>      5         4000            0       5371.1            13613
>      5         5000            0       4919.3            14534
>      5         6000            0       4375.7            15813
>      5         7000            0       5011.3            15095
>      5         8000            0       4629.8            17902
>      5         9000            0       5622.9            12975
>      5        10000            0       5761.4            12203

And that shows little, if any degradation once we toss the first
1000 inodes from the result. Nice demonstration!

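The shape of that asymmetry can be illustrated with a toy model (not
the real btree code, and with the record counts scaled well down): scan
outward from the parent's inobt record until a record with a free inode
turns up, and count the records probed per allocation:

```python
import random

def inobt_search_costs(total_chunks, free_chunks, parent):
    """Toy model of the inobt search: widen the search radius from the
    parent's record until a record with a free inode is found, and
    count how many records were probed for each allocation."""
    free = set(free_chunks)
    per_alloc = []
    for _ in range(len(free_chunks)):
        probes, radius = 0, 0
        while True:
            left, right = parent - radius, parent + radius
            probes += 1 if radius == 0 else 2
            hit = left if left in free else right if right in free else None
            if hit is not None:
                free.remove(hit)   # the chunk's one free inode is consumed
                per_alloc.append(probes)
                break
            radius += 1
    return per_alloc

rng = random.Random(0)
total = 5000                               # inode chunk records in the AG
scattered = rng.sample(range(total), 500)  # chunks each holding one free inode
costs = inobt_search_costs(total, scattered, parent=total // 2)
# Early allocations find free inodes near the parent; later ones walk
# almost the whole tree -- the same decay as the non-finobt Files/sec
# column. A finobt lookup instead descends a tree containing only the
# 500 free records, so its per-allocation cost stays flat.
print(sum(costs[:50]), sum(costs[-50:]))
```
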
> Summarized, the results show a nice improvement for inode allocation
> into a set of inode chunks with random free inode availability. The 10k
> inode allocation reduces from ~90s to ~2s and CPU usage from XFS drops
> way down in the perf profile.
>
> I haven't extensively tested the following, but a quick 1 million inode
> allocation test on a fresh, single AG fs shows a slight degradation with
> the finobt enabled in terms of time to complete:
>
> fs_mark -k -S 0 -D 4 -L 10 -n 100000 -s 0 -d /mnt/bigdir
>
> - non-finobt
>
> real 1m35.349s
> user 0m4.555s
> sys 1m29.749s
>
> - finobt
>
> real 1m42.396s
> user 0m4.326s
> sys 1m37.152s

Given that you have multiple threads banging on the same AGI, and
the hold time for the AGI is going to be slightly longer due to
needing to update two btrees instead of one, this is to be expected.
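
The extra work per transaction is easy to see in a toy model (purely
illustrative bookkeeping, not the on-disk format or real AGI locking):
a newly allocated chunk is entirely free, so with finobt enabled it has
to be inserted into both trees while the AGI is held:

```python
import bisect

class ToyAG:
    """Toy per-AG state: the inobt indexes every inode chunk, while the
    finobt (when enabled) indexes only chunks with free inodes."""
    def __init__(self, finobt):
        self.inobt = []             # sorted chunk start inodes
        self.finobt = [] if finobt else None
        self.updates = 0            # btree record updates done under the AGI

    def alloc_chunk(self, startino):
        # A freshly allocated chunk is all-free, so it must appear in
        # both the inobt and (if present) the finobt.
        bisect.insort(self.inobt, startino)
        self.updates += 1
        if self.finobt is not None:
            bisect.insort(self.finobt, startino)
            self.updates += 1

with_finobt, without_finobt = ToyAG(True), ToyAG(False)
for startino in range(0, 64 * 1000, 64):   # 1000 chunks of 64 inodes
    with_finobt.alloc_chunk(startino)
    without_finobt.alloc_chunk(startino)
print(without_finobt.updates, with_finobt.updates)   # → 1000 2000
```

Twice the record updates per chunk allocation, hence the slightly
longer hold times and the small slowdown above.
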

However, if you are in a memory limited situation, there's a good
chance that the lower memory footprint of the buffer cache as a
result of the finobt based searches will make a difference to these
results. With 4GB of RAM and 1M inodes, you're not generating memory
pressure and so such effects won't be seen in performance results.
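
A back-of-envelope comparison shows why the footprint differs so much
(the 16-byte record size and 4 KiB blocks are illustrative constants,
ignoring block headers and interior nodes, and assuming the 10k freed
inodes land in distinct chunks):

```python
# Rough leaf-block counts for the two trees in the 10M-inode test.
INODES = 10_000_000
CHUNK = 64                        # inodes per btree record
REC_BYTES = 16                    # assumed record size
BLOCK = 4096
recs_per_block = BLOCK // REC_BYTES    # ignores block headers

inobt_recs = INODES // CHUNK           # every chunk has an inobt record
finobt_recs = 10_000                   # only chunks with free inodes
inobt_blocks = -(-inobt_recs // recs_per_block)    # ceiling division
finobt_blocks = -(-finobt_recs // recs_per_block)
print(inobt_blocks, finobt_blocks)     # → 611 40
```

Searching a 40-block tree keeps far fewer buffers pinned in cache than
walking a 611-block one, which is where the win under memory pressure
would come from.
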

As it is, the parallel fsmark tests I did on v1 of the patchset on a
fast SSD based filesystem (sparse 100TB filesystem) showed a small
improvement in performance with finobt enabled. Those tests spend
most of their time in memory pressure situations, so perhaps we're
actually seeing the difference here. However, I haven't tested the
current version yet, so take that with a grain of salt for the
moment.

Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com