linux-fsdevel.vger.kernel.org archive mirror
* Re: performance of filesystem xattrs with Samba4
From: tridge @ 2004-11-22 13:02 UTC
  To: linux-kernel; +Cc: Andreas Dilger, linux-fsdevel

I've put up graphs of the first set of dbench3 results for various
filesystems at:

   http://samba.org/~tridge/xattr_results/

All the tests were run on a 2.6.10-rc2 kernel with the patch from
Andreas to add support to ext3 for large inodes. I needed to tweak the
patch for 2.6.10-rc2, but not by much. Full details on the setup are
in the README, and the scripts for reproducing the results yourself
(and the graphs) are in the same directory.

The results show that the ext3 large inode patch is extremely
worthwhile. Using a 256 byte inode on ext3 gained a factor of up to 7x
in performance, and only lost a very small amount when xattrs were not
used. It took ext3 from a very mediocre performance to being the clear
winner among current Linux journaled filesystems for performance when
xattrs are used. Eventually I think that larger inodes should become
the default.

Similarly on xfs, using the large inode option (512 bytes this time)
made a huge difference, gaining a factor of 6x in the best case. If
all versions of the xfs code can handle large inodes then I think it
would be good to change the default, especially as it seems to have
almost no cost when xattrs are not used.

Without xattrs, reiser3 did extremely well under heavier load, where it
is less of an in-memory test, just as Hans thought it
would. Unfortunately I wasn't able to try reiser4 in these runs due to
the lockups I reported earlier, but I look forward to trying it once
those are fixed.

Reiser3 was also the best "out of the box" journaled filesystem with
xattrs, but it was easily beaten by xfs and ext3 once large inodes
were enabled in those.

jfs wins the award for consistency. As I watched the results develop I
was tempted to just disable the jfs tests as it was so slow, but
eventually it overtook xfs at very large loads. Maybe if I run large
enough loads it will be the overall winner :)

The massive gap between ext2 and the other filesystems really shows
clearly how much we are paying for journaling. I haven't tried any
journal on external device or journal on nvram card tricks yet, but it
looks like those will be worth pursuing.

I'll leave the test script running overnight generating some more
results for even higher loads. I'll update the graphs in the morning.

Cheers, Tridge


* Re: performance of filesystem xattrs with Samba4
From: Andreas Dilger @ 2004-11-22 21:40 UTC
  To: tridge; +Cc: linux-kernel, linux-fsdevel


On Nov 23, 2004  00:02 +1100, tridge@samba.org wrote:
> I've put up graphs of the first set of dbench3 results for various
> filesystems at:
> 
>    http://samba.org/~tridge/xattr_results/
> 
> The results show that the ext3 large inode patch is extremely
> worthwhile. Using a 256 byte inode on ext3 gained a factor of up to 7x
> in performance, and only lost a very small amount when xattrs were not
> used. It took ext3 from a very mediocre performance to being the clear
> winner among current Linux journaled filesystems for performance when
> xattrs are used. Eventually I think that larger inodes should become
> the default.

For Lustre we tune the inode size at format time to allow the storing
of the "default" EA data within the larger inode.  Is this the case with
samba and 256-byte inodes (i.e. is your EA data all going to fit within
the extra 124 bytes of space for storing EAs)?  If you have to put any
of the commonly-used EA data into an external block the benefits are lost.
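
The fit question above can be roughed out as arithmetic. This is only a
sketch: the 128-byte base inode and the 4 bytes of extra fixed fields
reproduce the 124 bytes mentioned above, but the per-attribute overheads
(header, entry_overhead, align) are assumptions about the on-disk
layout, and the attribute names and sizes are hypothetical rather than
taken from Samba4.

```python
GOOD_OLD_INODE_SIZE = 128  # classic ext2/ext3 on-disk inode size


def in_inode_ea_space(inode_size, extra_isize=4):
    """Bytes left in a large inode for EAs after the fixed fields."""
    return inode_size - GOOD_OLD_INODE_SIZE - extra_isize


def ea_footprint(attrs, header=4, entry_overhead=16, align=4):
    """Rough in-inode size of attrs (name -> value bytes).

    header, entry_overhead and align are assumed layout costs.
    """
    def pad(n):
        # Round n up to the next multiple of align.
        return (n + align - 1) // align * align

    total = header
    for name, value in attrs.items():
        total += entry_overhead + pad(len(name)) + pad(len(value))
    return total


def fits_in_inode(attrs, inode_size=256):
    """True if the estimated EA footprint fits in the extra inode space."""
    return ea_footprint(attrs) <= in_inode_ea_space(inode_size)


if __name__ == "__main__":
    small = {"example_attr": b"\x00" * 40}   # hypothetical EA
    large = {"example_attr": b"\x00" * 200}  # hypothetical EA
    print(in_inode_ea_space(256), fits_in_inode(small), fits_in_inode(large))
```

If the estimate comes out above the available space, at least part of
the EA data would have to go to an external block, which is the case
where the benefit is lost.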

> The massive gap between ext2 and the other filesystems really shows
> clearly how much we are paying for journaling. I haven't tried any
> journal on external device or journal on nvram card tricks yet, but it
> looks like those will be worth pursuing.

One of the other things we do for Lustre right away is create the ext3
filesystem with larger journal sizes so that for the many-client cases
we do not get synchronous journal flushing if there are lots of active
threads.  This can make a huge difference in overall performance at
high loads.  Use "mke2fs -J size=400 ..." to create a 400MB journal
(assuming you have at least that much RAM and a large enough block
device, at least 4x the journal size just from a "don't waste space"
point of view).
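
As a small sketch of the sizing rule above, the following helper builds
the mke2fs argument list and refuses a journal larger than a quarter of
the device. The function name, the device path and the device size are
hypothetical; only the "-J size=" option (in megabytes) comes from the
command quoted above. It only constructs the arguments and does not run
anything.

```python
def mke2fs_journal_args(device, journal_mb, device_mb):
    """Return an argv list for mke2fs with an explicit journal size.

    Applies the "device at least 4x the journal" rule of thumb.
    """
    if device_mb < 4 * journal_mb:
        raise ValueError(
            f"{device}: {device_mb}MB is less than 4x a "
            f"{journal_mb}MB journal"
        )
    return ["mke2fs", "-J", f"size={journal_mb}", device]


if __name__ == "__main__":
    # /dev/sdX1 and the 20000MB size are placeholders, not a real setup.
    print(" ".join(mke2fs_journal_args("/dev/sdX1", 400, 20000)))
```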

The point is not necessarily that you need to write so much data at one
time, but that ext3 needs to reserve journal space for the worst-case
usage. So you get 40-100 threads each reserving the "worst case", which
"fills" the journal (causing new operations to block), and they finally
complete with only a small fraction of those reserved journal blocks
actually used.
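
A toy model of that reservation effect, with made-up numbers: it is not
how the journaling code actually accounts for space, and the per-thread
reservation and usage figures are purely hypothetical. It only shows
why a larger journal lets more threads hold worst-case reservations
before new operations have to block.

```python
def journal_blocks(journal_mb, block_size=4096):
    """Number of filesystem blocks in a journal of journal_mb megabytes."""
    return journal_mb * 1024 * 1024 // block_size


def threads_before_blocking(journal_mb, reserve_per_thread, block_size=4096):
    """How many worst-case reservations fit before the journal is 'full'."""
    return journal_blocks(journal_mb, block_size) // reserve_per_thread


def unused_fraction(reserve_per_thread, used_per_thread):
    """Share of each reservation that is never actually written."""
    return 1 - used_per_thread / reserve_per_thread


if __name__ == "__main__":
    reserve = 100  # hypothetical worst-case blocks reserved per operation
    used = 5       # hypothetical blocks actually written per operation
    for size_mb in (32, 400):
        n = threads_before_blocking(size_mb, reserve)
        print(f"{size_mb}MB journal: {n} concurrent reservations")
    print(f"unused share of each reservation: {unused_fraction(reserve, used):.0%}")
```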

Having an external journal device also generally gives you a large
journal (by default it is the full size of the block device specified)
so sometimes the effects of the large journal are confused with the
fact that it is external.  I haven't seen any perf numbers recently on
what kind of effect having an external journal has.  I highly doubt that
NVRAM cards are any better than a dedicated disk for the journal, since
journal IO is write-only (except during recovery) and virtually seek-free.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://members.shaw.ca/adilger/             http://members.shaw.ca/golinux/


