public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>, xfs@oss.sgi.com
Subject: Re: [PATCH v2 00/11] xfs: introduce the free inode btree
Date: Wed, 20 Nov 2013 09:17:48 +1100	[thread overview]
Message-ID: <20131119221748.GR11434@dastard> (raw)
In-Reply-To: <528BD853.8090900@redhat.com>

On Tue, Nov 19, 2013 at 04:29:55PM -0500, Brian Foster wrote:
> On 11/13/2013 04:10 PM, Dave Chinner wrote:
> ...
> > 
> > The problem can be demonstrated with a single CPU and a single
> > spindle. Create a single AG filesystem of a 100GB, and populate it
> > with 10 million inodes.
> > 
> > Time how long it takes to create another 10000 inodes in a new
> > directory. Measure CPU usage.
> > 
> > Randomly delete 10,000 inodes from the original population to
> > sparsely populate the inobt with 10000 free inodes.
> > 
> > Time how long it takes to create another 10000 inodes in a new
> > directory. Measure CPU usage.
> > 
> > The difference in time and CPU will be directly related to the
> > additional time spent searching the inobt for free inodes...
> > 
> 
> Thanks for the suggestion, Dave. I've run some fs_mark tests along the
> lines of what is described here. I create 10m files, randomly remove
> ~10k from that dataset and measure the process of allocating 10k new
> inodes in both finobt and non-finobt scenarios (after a clean remount).
> 
> The tests run from a 4xcpu VM with 4GB RAM and against an isolated SATA
> drive I had lying around (mapped directly via virtio). The drive is
> formatted with a single VG/LV and as follows with xfs:
> 
> meta-data=/dev/mapper/testvg-testlv isize=512    agcount=1, agsize=26214400 blks
>          =                       sectsz=512   attr=2, projid32bit=1
>          =                       crc=1        finobt=0
> data     =                       bsize=4096   blocks=26214400, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
> log      =internal               bsize=4096   blocks=12800, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> 
> Once the fs has been prepared with a random set of free inodes, the
> following command is used to measure performance:
> 
> 	fs_mark -k -S 0 -D 4 -L 10 -n 1000 -s 0 -d /mnt/testdir
> 
> I've also collected some perf record data of these commands to compare
> CPU usage. I can make the full/raw data available if desirable. Snippets
> of the results are included below.
> 
> --- non-finobt, agi freecount = 9961 after random removal
> 
> - fs_mark
> 
> FSUse%        Count         Size    Files/sec     App Overhead
>      5         1000            0       1020.1            10811
>      5         2000            0        361.4            19498
>      5         3000            0        230.1            12154
>      5         4000            0        166.7            12816
>      5         5000            0        129.7            27409
>      5         6000            0        105.7            13946
>      5         7000            0         87.6            31792
>      5         8000            0         77.8            14921
>      5         9000            0         67.3            15597
>      5        10000            0         62.4            15835

Yes, that's pretty much as I expected - exponential degradation due
to the increasing search radius from the parent directory location...
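The effect is easy to see in a toy model (pure illustration, not the kernel algorithm; all numbers and names below are made up, scaled down from the 10M-inode test): inobt records cover 64-inode chunks, and with free inodes scattered thinly across a mostly-full population, an allocation that starts searching at the parent's record has to scan further and further outward as the nearby free inodes get consumed. A finobt-style index, by contrast, only contains the records that still have free inodes, so the search cost is independent of how sparse they are:

```python
import random

CHUNK = 64         # inodes per inobt record (an inode chunk)
RECORDS = 15_625   # ~1M inodes in a single AG (scaled-down model)
FREE = 1_000       # free inodes scattered at random after the removal

random.seed(42)
# free[i] = number of free inodes left in record i (0 = chunk is full)
free = [0] * RECORDS
for ino in random.sample(range(RECORDS * CHUNK), FREE):
    free[ino // CHUNK] += 1

# finobt-style index: only the records that still contain free inodes
finobt = {rec for rec, n in enumerate(free) if n}

def alloc_by_inobt_scan(parent):
    """Scan outward from the parent's record until a free inode is
    found; return the number of records examined (a CPU-cost proxy)."""
    for radius in range(RECORDS):
        for rec in {max(parent - radius, 0), min(parent + radius, RECORDS - 1)}:
            if free[rec]:
                free[rec] -= 1
                return 2 * radius + 1
    raise RuntimeError("no free inodes left")

parent = RECORDS // 2
costs = [alloc_by_inobt_scan(parent) for _ in range(500)]
print("inobt records scanned per alloc, first 100: avg %.0f" % (sum(costs[:100]) / 100))
print("inobt records scanned per alloc, last 100:  avg %.0f" % (sum(costs[-100:]) / 100))
print("records a finobt would index: %d of %d" % (len(finobt), RECORDS))
```

The scan cost per allocation grows steadily as the region around the parent is exhausted, which is the shape of the fs_mark degradation above; the finobt only has to index the ~6% of records that hold a free inode.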

> --- finobt, agi freecount = 10137 after random removal
> 
> - fs_mark
> 
> FSUse%        Count         Size    Files/sec     App Overhead
>      5         1000            0       9210.0             8587
>      5         2000            0       5592.1            14933
>      5         3000            0       7095.4            11355
>      5         4000            0       5371.1            13613
>      5         5000            0       4919.3            14534
>      5         6000            0       4375.7            15813
>      5         7000            0       5011.3            15095
>      5         8000            0       4629.8            17902
>      5         9000            0       5622.9            12975
>      5        10000            0       5761.4            12203

And that shows little, if any, degradation once we toss the first
1000 inodes from the result. Nice demonstration!

> Summarized, the results show a nice improvement for inode allocation
> into a set of inode chunks with random free inode availability. The 10k
> inode allocation reduces from ~90s to ~2s and CPU usage from XFS drops
> way down in the perf profile.
> 
> I haven't extensively tested the following, but a quick 1 million inode
> allocation test on a fresh, single AG fs shows a slight degradation with
> the finobt enabled in terms of time to complete:
> 
> 	fs_mark -k -S 0 -D 4 -L 10 -n 100000 -s 0 -d /mnt/bigdir
> 
> - non-finobt
> 
> real    1m35.349s
> user    0m4.555s
> sys     1m29.749s
> 
> - finobt
> 
> real    1m42.396s
> user    0m4.326s
> sys     1m37.152s

Given that you have multiple threads banging on the same AGI, and
the hold time for the AGI is going to be slightly longer due to
needing to update two btrees instead of one, this is to be expected.
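The extra bookkeeping can be sketched with a toy model (illustrative only, not the kernel code; the function names are hypothetical): under the AGI lock, every allocation and free now has to keep a second index consistent, inserting a finobt record when a full chunk gains its first free inode and deleting it when the last free inode in a chunk is used.

```python
# Toy model of the per-operation work done under the AGI lock.
# inobt tracks every inode chunk; finobt tracks only chunks with
# a nonzero free count.
inobt = {}    # chunk -> count of free inodes in that chunk
finobt = {}   # chunk -> same, but only while the count is > 0

def free_inode(chunk):
    """Freeing an inode updates the inobt record and, if the chunk was
    previously full, inserts a new finobt record (extra btree work)."""
    inobt[chunk] = inobt.get(chunk, 0) + 1
    finobt[chunk] = inobt[chunk]   # insert or update second index

def alloc_inode(chunk):
    """Allocating updates both trees; when the chunk's last free inode
    is consumed, the finobt record must also be deleted."""
    inobt[chunk] -= 1
    if inobt[chunk] == 0:
        del finobt[chunk]          # chunk full again: delete record
    else:
        finobt[chunk] = inobt[chunk]

free_inode(100)     # chunk 100 gains its first free inode: finobt insert
free_inode(100)
alloc_inode(100)    # both indexes updated; record stays in finobt
alloc_inode(100)    # last free inode used: finobt record deleted
print("inobt:", inobt, "finobt:", finobt)
```

Two structures updated per operation instead of one is roughly where the extra ~8% of system time in the run above would go.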

However, if you are in a memory limited situation, there's a good
chance that the lower memory footprint of the buffer cache as a
result of the finobt based searches will make a difference to these
results. With 4GB of RAM and 1M inodes, you're not generating memory
pressure and so such effects won't be seen in performance results.

As it is, the parallel fsmark tests I did on v1 of the patchset on a
fast SSD based filesystem (sparse 100TB filesystem) showed a small
improvement in performance with finobt enabled. Those tests spend
most of their time in memory pressure situations, so perhaps we're
actually seeing the difference here. However, I haven't tested the
current version yet, so take that with a grain of salt for the
moment.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


Thread overview: 21+ messages
2013-11-13 14:36 [PATCH v2 00/11] xfs: introduce the free inode btree Brian Foster
2013-11-13 14:36 ` [PATCH v2 01/11] xfs: refactor xfs_ialloc_btree.c to support multiple inobt numbers Brian Foster
2013-11-13 16:17   ` Christoph Hellwig
2013-11-13 14:36 ` [PATCH v2 02/11] xfs: reserve v5 superblock read-only compat. feature bit for finobt Brian Foster
2013-11-13 16:18   ` Christoph Hellwig
2013-11-13 14:36 ` [PATCH v2 03/11] xfs: support the XFS_BTNUM_FINOBT free inode btree type Brian Foster
2013-11-13 14:37 ` [PATCH v2 04/11] xfs: update inode allocation/free transaction reservations for finobt Brian Foster
2013-11-13 14:37 ` [PATCH v2 05/11] xfs: insert newly allocated inode chunks into the finobt Brian Foster
2013-11-13 14:37 ` [PATCH v2 06/11] xfs: use and update the finobt on inode allocation Brian Foster
2013-11-13 14:37 ` [PATCH v2 07/11] xfs: refactor xfs_difree() inobt bits into xfs_difree_inobt() helper Brian Foster
2013-11-13 14:37 ` [PATCH v2 08/11] xfs: update the finobt on inode free Brian Foster
2013-11-13 14:37 ` [PATCH v2 09/11] xfs: add finobt support to growfs Brian Foster
2013-11-13 14:37 ` [PATCH v2 10/11] xfs: report finobt status in fs geometry Brian Foster
2013-11-13 14:37 ` [PATCH v2 11/11] xfs: enable the finobt feature on v5 superblocks Brian Foster
2013-11-13 16:17 ` [PATCH v2 00/11] xfs: introduce the free inode btree Christoph Hellwig
2013-11-13 17:55   ` Brian Foster
2013-11-13 21:10     ` Dave Chinner
2013-11-19 21:29       ` Brian Foster
2013-11-19 22:17         ` Dave Chinner [this message]
2013-11-17 22:43 ` Michael L. Semon
2013-11-18 22:38   ` Michael L. Semon
