git.vger.kernel.org archive mirror
From: Richard Weinberger <richard@nod.at>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Johannes Schindelin <johannes.schindelin@gmx.de>,
	git@vger.kernel.org, David Gstir <david@sigma-star.at>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: broken repo after power cut
Date: Mon, 22 Jun 2015 13:19:59 +0200
Message-ID: <5587EF5F.90207@nod.at>
In-Reply-To: <20150622003551.GP29480@thunk.org>

On 22.06.2015 at 02:35, Theodore Ts'o wrote:
> On Sun, Jun 21, 2015 at 03:07:41PM +0200, Richard Weinberger wrote:
> 
>>> I was then shocked to learn that ext4 apparently has a default
>>> setting that allows it to truncate files upon power failure
>>> (something about a full journal vs a fast journal or some such)
> 
> s/ext4/all modern file systems/
> 
> POSIX makes **no guarantees** about what happens after a power failure
> unless you use fsync() --- which git does not do by default (see below).

Thanks for pointing this out.
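
To make the fsync() point concrete, here is a minimal C sketch of a
write that is flushed to stable storage before it is reported as
successful. The helper name write_durably is made up for illustration;
it is not code from git, and it only covers the file data, not the
directory entry.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to path and fsync() the file
 * before closing it. Returns 0 on success, -1 on error. */
int write_durably(const char *path, const void *buf, size_t len)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	const char *p = buf;
	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			close(fd);
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}

	/* Without this call POSIX gives no promise that the data
	 * survives a power cut, even though write() succeeded. */
	if (fsync(fd) < 0) {
		close(fd);
		return -1;
	}
	return close(fd);
}
```

A caller would treat a return value of -1 as "the data may not be on
disk" and react accordingly.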

> The bottom line is that if you care about files being written, you
> need to use fsync().  Should git use fsync() by default?  Well, if you
> are willing to accept that if your system crashes within a second or
> so of your last git operation, you might need to run "git fsck" and
> potentially recover from a busted repo, maybe speed is more important
> for you (and git is known for its speed/performance, after all. :-)
> 
> The actual state of the source tree would have been written using a
> text editor which tends to be paranoid about using fsync (at least, if
> you use a real editor like Emacs or Vi, as opposed to the toy notepad
> editors shipped with GNOME or KDE :-).  So as long as you know what
> you're doing, it's unlikely that you will actually lose any work.
> 
> Personally, I have core.fsyncobjectfiles set to yes in my .gitconfig.
> Part of this is because I have an SSD, so the speed hit really doesn't
> bother me, and needing to recover a corrupted git repository is a pain
> (although I have certainly done it in the past).
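
As a rough illustration of what such a setting trades off, the C sketch
below writes a file under a temporary name, optionally calls fsync(),
and then renames it into place. The variable fsync_object_files and the
function store_object are hypothetical stand-ins, and the ".tmp" naming
is an assumption; this is not git's actual implementation.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for the core.fsyncObjectFiles setting. */
int fsync_object_files = 1;

/* Sketch: write data to "<path>.tmp", optionally fsync() it, then
 * rename() it to path. Returns 0 on success, -1 on error. */
int store_object(const char *path, const void *data, size_t len)
{
	char tmp[4096];
	int n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;

	int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	const char *p = data;
	while (len > 0) {
		ssize_t w = write(fd, p, len);
		if (w < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			goto fail;
		}
		p += w;
		len -= (size_t)w;
	}

	/* If this is skipped, the rename below may reach the disk before
	 * the file contents do, leaving a truncated file after a crash. */
	if (fsync_object_files && fsync(fd) < 0)
		goto fail;
	if (close(fd) < 0 || rename(tmp, path) < 0) {
		unlink(tmp);
		return -1;
	}
	return 0;

fail:
	close(fd);
	unlink(tmp);
	return -1;
}
```

With fsync_object_files set to 0 the function is faster but gives up
the crash guarantee. A fully durable rename would additionally need an
fsync() on the containing directory, which is left out here for brevity.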

I think core.fsyncObjectFiles documentation really needs an update.
What about this one?

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 43bb53c..b08fa11 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -693,10 +693,16 @@ core.whitespace::
 core.fsyncObjectFiles::
 	This boolean will enable 'fsync()' when writing object files.
 +
-This is a total waste of time and effort on a filesystem that orders
-data writes properly, but can be useful for filesystems that do not use
-journalling (traditional UNIX filesystems) or that only journal metadata
-and not file contents (OS X's HFS+, or Linux ext3 with "data=writeback").
+For performance reasons git does not call 'fsync()' after writing object
+files. This means that after a power cut your git repository can get
+corrupted as not all data hit the storage media. Especially on modern
+filesystems like ext4, xfs or btrfs this can happen very easily.
+If you have to face power cuts and care about your data it is strongly
+recommended to enable this setting.
+Please note that git's behavior used to be safe on ext3 with data=ordered;
+for any other filesystem or mount setting this is not the case, as
+POSIX clearly states that you have to call 'fsync()' to make sure that
+all data is written.

 core.preloadIndex::
 	Enable parallel index preload for operations like 'git diff'

--
Thanks,
//richard

Thread overview: 8+ messages
2015-06-20 19:40 broken repo after power cut Richard Weinberger
2015-06-21 12:28 ` Johannes Schindelin
2015-06-21 13:07   ` Richard Weinberger
2015-06-21 13:59     ` Christoph Hellwig
2015-06-21 14:08       ` Richard Weinberger
2015-06-22  0:35     ` Theodore Ts'o
2015-06-22 11:19       ` Richard Weinberger [this message]
2015-06-22 12:31         ` Theodore Ts'o
