public inbox for linux-xfs@vger.kernel.org
From: Andrew Klaassen <ak@spinpro.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: External log size limitations
Date: Fri, 18 Feb 2011 10:26:37 -0500	[thread overview]
Message-ID: <4D5E8FAD.9080802@spinpro.com> (raw)
In-Reply-To: <20110217003233.GH13052@dastard>

Dave Chinner wrote:
> The limit is just under 2GB now - that document is a couple of years
> out of date - so if you are running on anything more recent than a
> ~2.6.27 kernel, 2GB logs should work fine.

Ah, good to know.

> Data write speed or metadata write speed? What sort of write
> patterns?

A couple of hundred nodes on a renderfarm doing mostly compositing with 
some 3D.  It's about 80/20 read/write.  On the current system that we're 
thinking of converting - an Exastore version 3 system - browsing the 
filesystem becomes ridiculously slow when write loads become moderate, 
which is why snappier metadata operations are attractive to us.

One thing I'm worried about, though, is moving from the Exastore's 64K 
block size to the 4K Linux block size limitation.  My quick calculation 
says that's going to reduce our throughput under random load (which is 
what a renderfarm becomes with a couple of hundred nodes) from about 
200MB/s to about 13MB/s with our 56x7200rpm disks.  It's too bad those 
large blocksize patches from a couple of years back didn't go through to 
make this worry moot.
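For what it's worth, here's the back-of-envelope math behind those numbers (the per-disk IOPS figure is my own assumption for 7200rpm drives, not a measurement):

```python
# Back-of-envelope check of the 64K-vs-4K throughput estimate above.
# Assumptions (mine): ~58 random IOPS per 7200rpm disk, and that random
# load keeps every spindle busy moving one filesystem block per seek.

DISKS = 56
IOPS_PER_DISK = 58          # rough figure for 7200rpm under random access

def random_throughput_mb_s(block_size_kb):
    """Aggregate MB/s if each random I/O moves one filesystem block."""
    total_iops = DISKS * IOPS_PER_DISK
    return total_iops * block_size_kb / 1024.0

print(random_throughput_mb_s(64))   # ~203 MB/s with 64K blocks
print(random_throughput_mb_s(4))    # ~12.7 MB/s with 4K blocks
```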

> Also, don't forget that data is not logged so increasing
> the log size won't change the speed of data writeback.

Yes, of course... that momentarily slipped my mind.

> As it is, 2GB is still not enough for preventing metadata writeback
> for minutes if that is what you are trying to do.  Even if you use
> the new delaylog mount option - which reduces log traffic by an
> order of magnitude for most non-synchronous workloads - log write
> rates can be upwards of 30MB/s under concurrent metadata intensive
> workloads....

Is there a rule-of-thumb to convert number of files being written to log 
write rates?  We push a lot of data through, but most of the files are a 
few megabytes in size instead of a few kilobytes.
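For my own planning I've been sketching the inverse of your 30MB/s figure, though the per-operation log cost I plugged in is pure guesswork on my part:

```python
# Rough inverse of the 30MB/s log-rate figure: how many metadata-heavy
# file operations per second would it take to generate that much log
# traffic?  The per-operation cost below is a guess, not a measurement.

LOG_BYTES_PER_FILE_OP = 4 * 1024   # assumed: ~4KB logged per create/unlink

def file_ops_for_log_rate(log_mb_per_s):
    """File operations per second needed to sustain a given log write rate."""
    return log_mb_per_s * 1024 * 1024 / LOG_BYTES_PER_FILE_OP

print(file_ops_for_log_rate(30))   # ~7680 ops/s to sustain 30MB/s of log I/O
```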

> If you want a log larger than 2GB, then there are a lot of code
> changes needed in both kernel and userspace as the log arithmetic is
> all done via 32 bit integers and a lot of it is byte based.

Good to know.

> As it is, there are significant scaling issues with logs of even 2GB
> in size - log replay can take tens of minutes when a log full of
> inode changes has to be replayed,

We've got a decent UPS, so unless we get kernel panics, those tens of 
minutes for an occasional unexpected hard shutdown should mean less lost 
production time than the drag of slower metadata operations all the time.

> filling a 2GB log means you'll
> probably have tens of gigabytes of dirty metadata in memory, so
> response to memory shortages can cause IO storms and severe
> interactivity problems, etc.

I assume that if we packed the server with 128GB of RAM we wouldn't have 
to worry about that as much.  But... short of that, would you have a 
rule of thumb for log size to memory size?  Could I expect reasonable 
performance with a 2GB log and 32GB in the server?  With 12GB in the server?

I know you'd have to mostly guess to make up a rule of thumb, but your 
guesses would be a lot better than mine.  :-)
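For instance, taking "tens of gigabytes" at face value, here's the crude ratio check I've been doing (the 15GB midpoint is just my guess at your figure):

```python
# Crude ratio check using the "tens of gigabytes of dirty metadata"
# estimate: what fraction of RAM could a full 2GB log's worth of dirty
# metadata pin?  The 15GB midpoint is a guess, not a measured number.

DIRTY_METADATA_GB = 15   # midpoint guess for "tens of gigabytes"

for ram_gb in (12, 32, 128):
    frac = DIRTY_METADATA_GB / ram_gb
    print(f"{ram_gb}GB RAM: dirty metadata could be {frac:.0%} of memory")
```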

> In general, I'm finding that a log size of around 512MB w/ delaylog
> gives the best tradeoff between scalability, performance, memory
> usage and relatively sane recovery times...

I'm excited about the delaylog and other improvements I'm seeing 
entering the kernel, but I'm worried about stability.  There seem to 
have been a lot of bugfix patches and panic reports since 2.6.35 for XFS 
to go along with the performance improvements, which makes me tempted to 
stick to 2.6.34 until the dust settles and the kinks are worked out.  If 
I put the new XFS code on the server, will it stay up for a year or more 
without any panics or crashes?

Thanks for your great feedback.  This is one of the things that makes 
open source awesome.

Andrew


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

Thread overview: 10+ messages
2011-02-16 18:54 External log size limitations Andrew Klaassen
2011-02-17  0:32 ` Dave Chinner
2011-02-18 15:26   ` Andrew Klaassen [this message]
2011-02-18 19:55     ` Stan Hoeppner
2011-02-18 20:31       ` Andrew Klaassen
2011-02-19  3:53         ` Stan Hoeppner
2011-02-19 10:02           ` Matthias Schniedermeyer
2011-02-19 20:33             ` Stan Hoeppner
2011-02-19 21:47               ` Emmanuel Florac
2011-02-20 21:14     ` Dave Chinner
