Re: synchronous mounts - Stephen C. Tweedie

From: "Stephen C. Tweedie" <sct@redhat.com>
To: Jeff Garzik <jgarzik@mandrakesoft.com>
Cc: "Stephen C. Tweedie" <sct@redhat.com>,
	Andrew Morton <akpm@zip.com.au>,
	lkml <linux-kernel@vger.kernel.org>,
	Neil Brown <neilb@cse.unsw.edu.au>
Subject: Re: synchronous mounts
Date: Fri, 16 Nov 2001 12:28:55 +0000
Message-ID: <20011116122855.C2389@redhat.com>
In-Reply-To: <3BF376EC.EA9B03C8@zip.com.au> <20011115214525.C14221@redhat.com> <3BF45B9F.DEE1076B@mandrakesoft.com>
In-Reply-To: <3BF45B9F.DEE1076B@mandrakesoft.com>; from jgarzik@mandrakesoft.com on Thu, Nov 15, 2001 at 07:19:43PM -0500

Hi,

On Thu, Nov 15, 2001 at 07:19:43PM -0500, Jeff Garzik wrote:

> When working on something likely to crash, I always remount my
> filesystems 'sync' with the intention to have the kernel immediately
> sync to disk anything and everything it is coded to do.

The kernel has, in my memory, never behaved like that on sync mounts.
mount -o sync was always intended just to give people the BSD-style
sync metadata updates that some users expected.

The "mount" man page is wrong on this one.

> Since the
> kernel is responsible for flushing data to disk, it makes perfect sense
> to have an option to sync not only metadata but data to disk
> immediately, if the user desires such.

If you want to sync _everything_, it's at least 5 seeks per write
syscall when you're writing a new file: superblock, group descriptor,
block bitmap, inode, data, and potentially inode indirect.

There's no point doing all that, especially since some of that data is
redundant and will be rebuilt by e2fsck anyway after a crash.

Is it really such an important feature that we're willing to suffer a
factor-of-100 or more slowdown for it?

> Further, expecting all apps to fsync(2) files under the right
> circumstances is not reasonable.  There are "normal" circumstances where
> someone expects non-syncing behavior of "cat foo bar > foobar", and then
> there are extenuating circumstances where another expects the shell to
> sync that command immediately.  Should we rewrite cat/bash/apps to all
> fsync, depending on an option?  Should we expect people to modify all
> their shell scripts to include "/bin/sync" for those times when they
> want data-sync?  Such is not scalable at all.

Not-scalable is doing 5000 seeks to write a 4MB file.  
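
As a rough check of that figure, here is a minimal sketch of the
arithmetic. It assumes the file is written in 4KB write() calls (the
message does not state a write or block size) and takes the "at least
5 seeks per write" estimate from above at face value. The names
writes_needed and sync_seek_estimate are made up for illustration.

```c
#include <stddef.h>

/* From the message: superblock, group descriptor, block bitmap,
 * inode, data.  The possible indirect-block seek is not counted. */
#define SEEKS_PER_SYNC_WRITE 5UL

/* Number of write(2) calls needed to write file_bytes in chunks of
 * write_bytes, rounding up for a partial final chunk. */
static unsigned long writes_needed(unsigned long file_bytes,
                                   unsigned long write_bytes)
{
    return (file_bytes + write_bytes - 1) / write_bytes;
}

/* Estimated seeks if every write call syncs everything it touches. */
static unsigned long sync_seek_estimate(unsigned long file_bytes,
                                        unsigned long write_bytes)
{
    return writes_needed(file_bytes, write_bytes) * SEEKS_PER_SYNC_WRITE;
}
```

Under those assumptions, sync_seek_estimate(4UL * 1024 * 1024, 4096)
gives 1024 writes times 5 seeks, or 5120 seeks, which is in line with
the "5000 seeks" quoted above.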

The behaviour you are talking about now, "cat foo bar > foobar" and
expecting it to be intact on return, is *not the same thing*.  The
sync mount option is there to order metadata writes for predictable
recovery of the directory structure.  In the "cat" case, nobody cares
what the inode is like during the write.  All that is desired in that
example is fsync-on-close, and it is insane to implement
fsync-on-close by writing every single block of the file
synchronously.
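
For comparison, this is roughly what fsync-on-close looks like when an
application does it by hand. It is a userspace sketch only, not a
proposal for how a mount option would be implemented; the helper name
write_file_fsync_on_close is hypothetical, and EINTR handling is left
out for brevity.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to path with ordinary buffered
 * write(2) calls, then flush once with fsync(2) before close(2).
 * Returns 0 on success, -1 on any failure. */
static int write_file_fsync_on_close(const char *path,
                                     const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);   /* may be a short write */
        if (n <= 0) {
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* One flush for the whole file, instead of one per write. */
    if (fsync(fd) < 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

The individual writes go through the page cache as usual; only the
single fsync() before close() waits for the disk.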

At ALS, an ext3 user asked why ext3 performance was entirely unusable
under mount -o sync (he had a broken config which accidentally set an
ext3 mount synchronous), whereas ext2 was OK.  I only realised
afterwards that this was because of ext3's ordered data writes:
whereas ext2 was just syncing the inodes and indirect blocks on write,
ext3 was syncing the data too as part of the ordered data guarantees,
and performance was totally destroyed by the extra seeks.

"sync to keep the fs structures intact" and "sync to keep this file
intact" are two totally different things.  In the latter case, we only
care about the file contents as a whole, so fsync-on-close is far more
appropriate.  If we want that, let's add it as a new option, but I
don't see the benefit in making -o sync do all file data writes 100%
synchronously.
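
The closest per-file analogue of fully synchronous data writes is
opening the file with O_SYNC, so that each write(2) waits for the disk.
The sketch below is only an illustration of that pattern and does not
reproduce what the kernel does for a sync mount; the helper name
write_file_o_sync and its chunk parameter are hypothetical.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to path in pieces of at most
 * chunk bytes, with the file opened O_SYNC so that every write(2)
 * returns only once its data has been handed to the storage device.
 * Returns the number of write calls made, or -1 on any failure. */
static long write_file_o_sync(const char *path, const void *buf,
                              size_t len, size_t chunk)
{
    if (chunk == 0)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = buf;
    long calls = 0;
    while (len > 0) {
        size_t want = len < chunk ? len : chunk;
        ssize_t n = write(fd, p, want);  /* waits for the disk each time */
        if (n <= 0) {
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
        calls++;
    }

    if (close(fd) < 0)
        return -1;
    return calls;
}
```

The return value makes the cost visible: the number of synchronous
waits grows with the number of write calls, where fsync-on-close would
wait once per file.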

Cheers,
 Stephen

Thread overview: 13+ messages
2001-11-15  8:03 synchronous mounts Andrew Morton
2001-11-15 21:45 ` Stephen C. Tweedie
2001-11-15 22:02   ` Andrew Morton
2001-11-15 23:05     ` Stephen C. Tweedie
2001-11-16  0:19   ` Jeff Garzik
2001-11-16  8:33     ` Andrew Morton
2001-11-16 10:47       ` Matthias Andree
2001-11-16 12:28     ` Stephen C. Tweedie [this message]
2001-11-16 13:28       ` Jeff Garzik
2001-11-16 13:37       ` Jeff Garzik
2001-11-16 22:40       ` Matthias Andree
2001-11-16  3:07   ` Neil Brown
2001-11-17  7:13     ` Andrew Morton