From: Helge Hafting <helge.hafting@aitel.hist.no>
To: "Dieter Stüken" <stueken@conterra.de>
Cc: linux-kernel@vger.kernel.org
Subject: Re: ext3 metadata performace
Date: Fri, 12 May 2006 08:35:41 +0200
Message-ID: <44642CBD.4000305@aitel.hist.no>
In-Reply-To: <4463461C.3070201@conterra.de>

Dieter Stüken wrote:

> After I switched from ext2 to ext3, I observed some severe
> performance degradation. Most discussion about this topic deals
> with tuning of data-IO performance. My problem, however, is related to
> metadata updates. When cloning (cp -al) or deleting directory trees, I
> find that about 7200 files are created/deleted per minute. It seems
> this is related to some ext3 strategy of waiting for each metadata
> update to be written to disk. Interestingly, this occurs with my new
> hw-raid controller (3ware 9500S), which even has a battery-buffered
> disk cache. Thus there is no need for synchronous IO anyway. If I
> disable the disk cache on my plain SATA disk using ext3, I also get
> this behavior.
>
> Would it make sense for ext3 to disable synchronous writes even
> for metadata (similar to the "data=writeback" option)? This means that
> ext3 won't protect the (meta)data currently being written. That
> protection is needed if running a database or an email server, where
> the process performing the IO must be sure the data is definitely on
> disk when it returns from the system call. In most cases, however, you
> choose ext3 to ensure the consistency of your file system after a
> crash, to avoid an fsck. If some files created just before the crash
> vanish, that does not hurt me too much.

Turning off synchronous writes like this won't work!
The battery-backed cache can help you in that you can consider
data "written" once it is transferred to that cache.  Metadata must still
go synchronously into the cache though, or you get a broken fs
if your machine ever crashes in the middle of a transaction. (That would
leave an update halfway in the battery-backed cache and halfway in main
memory; main memory then loses its half in the power cut / reboot.)
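
To get a feel for what synchronous metadata updates cost on a given
setup, here is a rough sketch. It is my own illustration, not a
reproduction of the cp -al case: it forces each file creation to stable
storage by calling fsync() on the directory, which is only one way to
provoke synchronous metadata writes. The function name create_files and
the file naming are made up for the example.

```python
import os
import tempfile
import time


def create_files(directory, count, sync_metadata):
    """Create `count` empty files; return files created per second.

    If sync_metadata is true, fsync() the directory after every create,
    so each new directory entry has to reach the device (or its cache)
    before the next file is created. Assumes a Linux-like system where
    a directory can be opened read-only and fsync()ed.
    """
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for i in range(count):
            path = os.path.join(directory, f"f{i}")
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
            os.close(fd)
            if sync_metadata:
                os.fsync(dir_fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(dir_fd)
    return count / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    for sync in (False, True):
        with tempfile.TemporaryDirectory() as d:
            rate = create_files(d, 500, sync)
            print(f"sync_metadata={sync}: {rate:.0f} files/s")
```

Run it in a directory on the file system you care about (the temporary
directory above may well live on tmpfs, where fsync costs nothing).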

The caching controller should report back to the Linux device driver
that "data is committed" as soon as it hits the cache - no need to
wait for it to actually hit the platters.  This can help performance with
bursty writes tremendously - but it won't help you with long-lasting writes
as you will then be limited by platter speed as soon as the battery cache
is completely full.
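
The burst versus sustained difference can be put into a back-of-the-
envelope model. This is only a sketch under simplifying assumptions:
the controller accepts data at a fixed bus rate, drains to the platters
at a fixed lower rate, and acknowledges a write as soon as it is in the
cache. The function name and all numbers are hypothetical, not 3ware
specifications.

```python
def time_to_write(total_mb, cache_mb, bus_mb_s, platter_mb_s):
    """Seconds until the controller has acknowledged total_mb of writes.

    While the cache has room, writes are acknowledged at bus speed and
    the cache fills at (bus - platter) MB/s. Once it is full, new writes
    are only accepted as fast as the platters drain the cache.
    """
    if bus_mb_s <= platter_mb_s:
        # The platters keep up, so the cache never fills.
        return total_mb / bus_mb_s
    fill_time = cache_mb / (bus_mb_s - platter_mb_s)
    absorbed_mb = bus_mb_s * fill_time  # data accepted before the cache is full
    if total_mb <= absorbed_mb:
        return total_mb / bus_mb_s      # burst fits: full bus speed
    # Remainder is limited by platter speed.
    return fill_time + (total_mb - absorbed_mb) / platter_mb_s


if __name__ == "__main__":
    # Hypothetical figures: 128 MB cache, 200 MB/s bus, 50 MB/s platters.
    for size_mb in (64, 1000):
        seconds = time_to_write(size_mb, 128, 200, 50)
        print(f"{size_mb} MB: {seconds:.2f} s, {size_mb / seconds:.0f} MB/s effective")
```

In this model a short burst sees the full bus rate, while a long write
converges towards the platter rate, which is the point made above.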

Helge Hafting



