From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Christoph Hellwig <hch@lst.de>,
git@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] enable core.fsyncObjectFiles by default
Date: Wed, 17 Jan 2018 22:44:25 +0100 [thread overview]
Message-ID: <87h8rki2iu.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <xmqqd128s3wf.fsf@gitster.mtv.corp.google.com>
On Wed, Jan 17 2018, Junio C. Hamano jotted:
> Christoph Hellwig <hch@lst.de> writes:
>
>> fsync is required for data integrity as there is no gurantee that
>> data makes it to disk at any specified time without it. Even for
>> ext3 with data=ordered mode the file system will only commit all
>> data at some point in time that is not guaranteed.
>
> It comes from this one:
>
> commit aafe9fbaf4f1d1f27a6f6e3eb3e246fff81240ef
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Wed Jun 18 15:18:44 2008 -0700
>
> Add config option to enable 'fsync()' of object files
>
> As explained in the documentation[*] this is totally useless on
> filesystems that do ordered/journalled data writes, but it can be a
> useful safety feature on filesystems like HFS+ that only journal the
> metadata, not the actual file contents.
>
> It defaults to off, although we could presumably in theory some day
> auto-enable it on a per-filesystem basis.
>
> [*] Yes, I updated the docs for the thing. Hell really _has_ frozen
> over, and the four horsemen are probably just beyond the horizon.
> EVERYBODY PANIC!
>
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> index 0e25b2c92..9a1cec5c8 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -866,10 +866,8 @@ core.whitespace::
>> core.fsyncObjectFiles::
>> This boolean will enable 'fsync()' when writing object files.
>> +
>> -This is a total waste of time and effort on a filesystem that orders
>> -data writes properly, but can be useful for filesystems that do not use
>> -journalling (traditional UNIX filesystems) or that only journal metadata
>> -and not file contents (OS X's HFS+, or Linux ext3 with "data=writeback").
>> +This option is enabled by default and ensures actual data integrity
>> +by calling fsync after writing object files.
>
> I am somewhat sympathetic to the desire to flip the default to
> "safe" and allow those who know they are already safe to tweak the
> knob for performance, and it also makes sense to document that the
> default is "true" here. But I do not see the point of removing the
> four lines from this paragraph; the sole effect of the removal is to
> rob information from readers that they can use to decide if they
> want to disable the configuration, no?
[CC'd the author of the current behavior]
Some points/questions:
a) Is there some reliable way to test whether this is needed from
userspace? I'm thinking something like `git update-index
--test-untracked-cache` but for fsync().
b) On the filesystems that don't need this, what's the performance
impact?
I ran a small test myself on CentOS 7 (3.10) with ext4 data=ordered
on the tests I thought might do a lot of loose object writes:
$ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_MAKE_OPTS="NO_OPENSSL=Y CFLAGS=-O3 -j56" ./run origin/master fsync-on~ fsync-on p3400-rebase.sh p0007-write-cache.sh
[...]
Test fsync-on~ fsync-on
-------------------------------------------------------------------------------------------------------
3400.2: rebase on top of a lot of unrelated changes 1.45(1.30+0.17) 1.45(1.28+0.20) +0.0%
3400.4: rebase a lot of unrelated changes without split-index 4.34(3.71+0.66) 4.33(3.69+0.66) -0.2%
3400.6: rebase a lot of unrelated changes with split-index 3.38(2.94+0.47) 3.38(2.93+0.47) +0.0%
0007.2: write_locked_index 3 times (3214 files) 0.01(0.00+0.00) 0.01(0.00+0.00) +0.0%
No impact. However I did my own test of running the test suite 10%
times with/without this patch, and it runs 9% slower:
fsync-off: avg:21.59 21.50 21.50 21.52 21.53 21.54 21.57 21.59 21.61 21.63 21.95
fsync-on: avg:23.43 23.21 23.25 23.26 23.26 23.27 23.32 23.49 23.51 23.83 23.88
Test script at the end of this E-Mail.
c) What sort of guarantees in this regard do NFS-mounted filesystems
commonly make?
Test script:
use v5.10.0;
use strict;
use warnings;
use Time::HiRes qw(time);
use List::Util qw(sum);
use Data::Dumper;
my %time;
for my $ref (@ARGV) {
system "git checkout $ref";
system qq[make -j56 CFLAGS="-O3 -g" NO_OPENSSL=Y all];
for (1..10) {
my $t0 = -time();
system "(cd t && NO_SVN_TESTS=1 GIT_TEST_HTTPD=0 prove -j56 --state=slow,save t[0-9]*.sh)";
$t0 += time();
push @{$time{$ref}} => $t0;
}
}
for my $ref (sort keys %time) {
printf "%20s: avg:%.2f %s\n",
$ref,
sum(@{$time{$ref}})/@{$time{$ref}},
join(" ", map { sprintf "%.02f", $_ } sort { $a <=> $b } @{$time{$ref}});
}
next prev parent reply other threads:[~2018-01-17 21:44 UTC|newest]
Thread overview: 63+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-17 18:48 [PATCH] enable core.fsyncObjectFiles by default Christoph Hellwig
2018-01-17 19:04 ` Junio C Hamano
2018-01-17 19:35 ` Christoph Hellwig
2018-01-17 19:35 ` Christoph Hellwig
2018-01-17 20:05 ` Andreas Schwab
2018-01-17 19:37 ` Matthew Wilcox
2018-01-17 19:42 ` Christoph Hellwig
2018-01-17 21:44 ` Ævar Arnfjörð Bjarmason [this message]
2018-01-17 22:07 ` Linus Torvalds
2018-01-17 22:25 ` Linus Torvalds
2018-01-17 23:16 ` Ævar Arnfjörð Bjarmason
2018-01-17 23:42 ` Linus Torvalds
2018-01-17 23:52 ` Theodore Ts'o
2018-01-17 23:57 ` Linus Torvalds
2018-01-18 16:27 ` Christoph Hellwig
2018-01-19 19:08 ` Junio C Hamano
2018-01-20 22:14 ` Theodore Ts'o
2018-01-20 22:27 ` Junio C Hamano
2018-01-22 15:09 ` Ævar Arnfjörð Bjarmason
2018-01-22 18:09 ` Theodore Ts'o
2018-01-22 18:09 ` Theodore Ts'o
2018-01-23 0:47 ` Jeff King
2018-01-23 5:45 ` Theodore Ts'o
2018-01-23 5:45 ` Theodore Ts'o
2018-01-23 16:17 ` Jeff King
2018-01-23 0:25 ` Jeff King
2018-01-21 21:32 ` Chris Mason
2020-09-17 11:06 ` Ævar Arnfjörð Bjarmason
2020-09-17 11:28 ` [RFC PATCH 0/2] should core.fsyncObjectFiles fsync the dir entry + docs Ævar Arnfjörð Bjarmason
2020-09-17 11:28 ` [RFC PATCH 1/2] sha1-file: fsync() loose dir entry when core.fsyncObjectFiles Ævar Arnfjörð Bjarmason
2020-09-17 13:16 ` Jeff King
2020-09-17 15:09 ` Christoph Hellwig
2020-09-17 14:09 ` Christoph Hellwig
2020-09-17 14:55 ` Jeff King
2020-09-17 14:56 ` Christoph Hellwig
2020-09-17 15:37 ` Junio C Hamano
2020-09-17 17:12 ` Jeff King
2020-09-17 20:37 ` Taylor Blau
2020-09-22 10:42 ` Ævar Arnfjörð Bjarmason
2020-09-17 20:21 ` Johannes Sixt
2020-09-22 8:24 ` Ævar Arnfjörð Bjarmason
2020-11-19 11:38 ` Johannes Schindelin
2020-09-17 11:28 ` [RFC PATCH 2/2] core.fsyncObjectFiles: make the docs less flippant Ævar Arnfjörð Bjarmason
2020-09-17 14:12 ` Christoph Hellwig
2020-09-17 15:43 ` Junio C Hamano
2020-09-17 20:15 ` Johannes Sixt
2020-10-08 8:13 ` Johannes Schindelin
2020-10-08 15:57 ` Ævar Arnfjörð Bjarmason
2020-10-08 18:53 ` Junio C Hamano
2020-10-09 10:44 ` Johannes Schindelin
2020-09-17 19:21 ` Marc Branchaud
2020-09-17 14:14 ` [PATCH] enable core.fsyncObjectFiles by default Christoph Hellwig
2020-09-17 15:30 ` Junio C Hamano
2018-01-17 20:55 ` Jeff King
2018-01-17 21:10 ` Christoph Hellwig
-- strict thread matches above, loose matches on Subject: below --
2015-06-23 21:57 [PATCH] Enable " Stefan Beller
2015-06-23 22:21 ` Junio C Hamano
2015-06-23 23:29 ` Theodore Ts'o
2015-06-24 5:32 ` Junio C Hamano
2015-06-24 14:30 ` Theodore Ts'o
2015-06-24 1:07 ` Duy Nguyen
2015-06-24 3:37 ` Jeff King
2015-06-24 5:20 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87h8rki2iu.fsf@evledraar.gmail.com \
--to=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=hch@lst.de \
--cc=linux-fsdevel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.