From: Theodore Ts'o <tytso@mit.edu>
To: Richard Weinberger <richard@nod.at>
Cc: Johannes Schindelin <johannes.schindelin@gmx.de>,
git@vger.kernel.org, David Gstir <david@sigma-star.at>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: broken repo after power cut
Date: Mon, 22 Jun 2015 08:31:35 -0400
Message-ID: <20150622123135.GU29480@thunk.org>
In-Reply-To: <5587EF5F.90207@nod.at>
On Mon, Jun 22, 2015 at 01:19:59PM +0200, Richard Weinberger wrote:
>
> > The bottom line is that if you care about files being written, you
> > need to use fsync(). Should git use fsync() by default? Well, if you
> > are willing to accept that if your system crashes within a second or
> > so of your last git operation, you might need to run "git fsck" and
> > potentially recover from a busted repo, maybe speed is more important
> > for you (and git is known for its speed/performance, after all. :-)
I made a typo in the above. s/second/minute/. (Linux's writeback
timer is 30 seconds, but if the disk is busy it might take a bit
longer to get all of the data blocks written out to disk and
committed.)
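
For reference, the 30-second figure presumably corresponds to the
vm.dirty_expire_centisecs sysctl (default 3000), with
vm.dirty_writeback_centisecs (default 500) controlling how often the
flusher threads wake up. The sketch below is only an illustration of how
one might inspect those values on a Linux box; the helper names
read_centisecs() and print_knob() are made up for this example.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read one integer from a sysctl-style file such as
 * /proc/sys/vm/dirty_expire_centisecs. Returns -1 if the file cannot be
 * opened or does not start with a number.
 */
long read_centisecs(const char *path)
{
    FILE *f = fopen(path, "r");
    long value;

    if (!f)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Print a knob converted from centiseconds to seconds. */
void print_knob(const char *path)
{
    long cs = read_centisecs(path);

    if (cs < 0)
        printf("%s: unavailable\n", path);
    else
        printf("%s: %ld.%02ld s\n", path, cs / 100, cs % 100);
}
```

Calling print_knob("/proc/sys/vm/dirty_expire_centisecs") and
print_knob("/proc/sys/vm/dirty_writeback_centisecs") shows what a given
machine is configured with; distributions and administrators may have
changed the defaults.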
> I think core.fsyncObjectFiles documentation really needs an update.
> What about this one?
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 43bb53c..b08fa11 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -693,10 +693,16 @@ core.whitespace::
> core.fsyncObjectFiles::
> This boolean will enable 'fsync()' when writing object files.
> +
> -This is a total waste of time and effort on a filesystem that orders
> -data writes properly, but can be useful for filesystems that do not use
> -journalling (traditional UNIX filesystems) or that only journal metadata
> -and not file contents (OS X's HFS+, or Linux ext3 with "data=writeback").
> +For performance reasons git does not call 'fsync()' after writing object
> +files. This means that after a power cut your git repository can get
> +corrupted as not all data hit the storage media. Especially on modern
> +filesystems like ext4, xfs or btrfs this can happen very easily.
> +If you have to face power cuts and care about your data it is strongly
> +recommended to enable this setting.
> +Please note that git's behavior used to be safe on ext3 with data=ordered,
> +for any other filesystems or mount settings this is not the case as
> +POSIX clearly states that you have to call 'fsync()' to make sure that
> +all data is written.
My main complaint about this is that it's a bit Linux-centric. For
example, the fact that fsync(2) is needed to push data out of the
cache is also true for MacOS (and indeed all other Unix systems going
back three decades) as well as Windows. In fact, it's not so much a
matter of "POSIX says" as "POSIX documented", but since standards are
held in high esteem, it's sometimes a bit more convenient to use them
as an appeal to authority. :-)
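
To make the point concrete, here is a minimal sketch of the usual
"write, fsync, rename, fsync the directory" pattern an application can
use when it cares about a file surviving a crash. This is not git's
actual code; durable_write() and write_all() are hypothetical helpers,
the ".tmp" suffix is an arbitrary choice, and the directory fsync as
written assumes a Unix-like system where a directory can be opened
read-only and passed to fsync(2).

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write all of buf, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: create dir/name with the given contents so that,
 * once it returns 0, both the data and the directory entry have been
 * handed to stable storage via fsync(2).
 */
int durable_write(const char *dir, const char *name,
                  const char *buf, size_t len)
{
    char tmp[4096], dst[4096];
    int fd, dfd, saved;

    if (snprintf(tmp, sizeof tmp, "%s/%s.tmp", dir, name) >= (int)sizeof tmp ||
        snprintf(dst, sizeof dst, "%s/%s", dir, name) >= (int)sizeof dst) {
        errno = ENAMETOOLONG;
        return -1;
    }

    /* 1. Write the data to a temporary file in the same directory. */
    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* 2. Push the file's data out of the cache before it becomes visible. */
    if (write_all(fd, buf, len) < 0 || fsync(fd) < 0) {
        saved = errno;
        close(fd);
        unlink(tmp);
        errno = saved;
        return -1;
    }
    if (close(fd) < 0) {
        unlink(tmp);
        return -1;
    }

    /* 3. Atomically move the finished file to its final name. */
    if (rename(tmp, dst) < 0) {
        unlink(tmp);
        return -1;
    }

    /* 4. fsync the directory so the new name itself is persistent. */
    dfd = open(dir, O_RDONLY);
    if (dfd < 0)
        return -1;
    if (fsync(dfd) < 0) {
        saved = errno;
        close(dfd);
        errno = saved;
        return -1;
    }
    return close(dfd);
}
```

A caller would use it as durable_write("/some/dir", "object", data, len)
and treat any nonzero return as "not known to be on disk". The two
fsync() calls are where the cost comes from, which is exactly the
speed-versus-safety tradeoff discussed above.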
(Ext3's data=ordered behaviour is an outlier, and in fact, the reason
why it is mostly safe to skip fsync(2) calls when using ext3 data=ordered
was an accidental side effect of another problem which it was trying to
solve based on the relatively primitive way it handled block
allocation.)
Cheers,
- Ted