Re: synchronous mounts

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Jeff Garzik <jgarzik@mandrakesoft.com>
To: "Stephen C. Tweedie" <sct@redhat.com>
Cc: Andrew Morton <akpm@zip.com.au>,
	lkml <linux-kernel@vger.kernel.org>,
	Neil Brown <neilb@cse.unsw.edu.au>
Subject: Re: synchronous mounts
Date: Fri, 16 Nov 2001 08:28:09 -0500	[thread overview]
Message-ID: <3BF51469.5EC478B7@mandrakesoft.com> (raw)
In-Reply-To: <3BF376EC.EA9B03C8@zip.com.au> <20011115214525.C14221@redhat.com> <3BF45B9F.DEE1076B@mandrakesoft.com> <20011116122855.C2389@redhat.com>

"Stephen C. Tweedie" wrote:
> On Thu, Nov 15, 2001 at 07:19:43PM -0500, Jeff Garzik wrote:
> > When working on something likely to crash, I always remount my
> > filesystems 'sync' with the intention to have the kernel immediately
> > sync to disk anything and everything it is coded to do.
> 
> The kernel has, in my memory, never behaved like that on sync mounts.
> mount -o sync was always intended just to give people the BSD-style
> sync metadata updates that some users expected.
> 
> The "mount" man page is wrong on this one.

No, that's always been a bug.  Occasionally it gets brought up on lkml
and people have talked about fixing it.


> > Since the
> > kernel is responsible to flushing data to disk, it makes perfect sense
> > to have an option to sync not only metadata but data to disk
> > immediately, if the user desires such.
> 
> If you want to sync _everything_, it's at least 5 seeks per write
> syscall when you're writing a new file: superblock, group descriptor,
> block bitmap, inode, data, and potentially inode indirect.
> 
> There's no point doing all that, especially since some of that data is
> redundant and will be rebuilt by e2fsck anyway after a crash.
> 
> Is it really such an important feature that we're willing to suffer a
> factor-of-100 or more slowdown for it?

mount -o dirsync, if you don't want the slowdown.

I can write a fast, incorrect implementation of 'sync' too.  Let's try
for a correct implementation.


> > Further, expecting all apps to fsync(2) files under the right
> > circumstances is not reasonable.  There are "normal" circumstances where
> > someone expects non-syncing behavior of "cat foo bar > foobar", and then
> > there are extentuating circumstances where another expects the shell to
> > sync that command immediately.  Should we rewrite cat/bash/apps to all
> > fsync, depending on an option?  Should we expect people to modify all
> > their shell scripts to include "/bin/sync" for those times when they
> > want data-sync?  Such is not scalable at all.
> 
> Not-scalable is doing 5000 seeks to write a 4MB file.
> 
> The behaviour you are talking about now, "cat foo bar > foobar" and
> expecting it to be intact on return, is *not the same thing*.  The
> sync mount option is there to order metadata writes for predictable
> recovery of the directory structure.  In the "cat" case, nobody cares
> what the inode is like during the write.  All that is desired in that
> example is fsync-on-close, and it is insane to implement
> fsync-on-close by writing every single block of the file
> synchronously.

The "sync" mount option's purpose is and should be this simple:
"All I/O to the file system should be done synchronously."

If you want different behavior, use a different option.

I still do not understand why you seem to think modifying tons of
programs and shell scripts is reasonable, in order to get "true" sync
behavior.


> At ALS, an ext3 user asked why ext3 performance was entirely unusable
> under mount -o sync (he had a broken config which accidentally set an
> ext3 mount synchronous), whereas ext2 was OK.  I only realised
> afterwards that this was because of ext3's ordered data writes:
> whereas ext2 was just syncing the inodes and indirect blocks on write,
> ext3 was syncing the data too as part of the ordered data guarantees,
> and performance was totally destroyed by the extra seeks.

Sure.  Seems perfectly normal and expected for an fs mounted 'sync'.


> "sync to keep the fs structures intact" and "sync to keep this file
> intact" are two totally different things.  In the latter case, we only
> care about the file contents as a whole, so fsync-on-close is far more
> appropriate.  If we want that, lets add it as a new option, but I
> don't see the benefit in making o- sync do all file data writes 100%
> synchronously.

The benefit is a correct implementation..

	Jeff


-- 
Jeff Garzik      | Only so many songs can be sung
Building 1024    | with two lips, two lungs, and one tongue.
MandrakeSoft     |         - nomeansno

next prev parent reply	other threads:[~2001-11-16 13:29 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2001-11-15  8:03 synchronous mounts Andrew Morton
2001-11-15 21:45 ` Stephen C. Tweedie
2001-11-15 22:02   ` Andrew Morton
2001-11-15 23:05     ` Stephen C. Tweedie
2001-11-16  0:19   ` Jeff Garzik
2001-11-16  8:33     ` Andrew Morton
2001-11-16 10:47       ` Matthias Andree
2001-11-16 12:28     ` Stephen C. Tweedie
2001-11-16 13:28       ` Jeff Garzik [this message]
2001-11-16 13:37       ` Jeff Garzik
2001-11-16 22:40       ` Matthias Andree
2001-11-16  3:07   ` Neil Brown
2001-11-17  7:13     ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3BF51469.5EC478B7@mandrakesoft.com \
    --to=jgarzik@mandrakesoft.com \
    --cc=akpm@zip.com.au \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neilb@cse.unsw.edu.au \
    --cc=sct@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.