From: Ric Wheeler <rwheeler@redhat.com>
To: Chris Mason <chris.mason@oracle.com>,
Jakob Unterwurzacher <jakobunt@gmail.com>,
linux-btrfs@vger.kernel.org
Subject: Re: Rename+crash behaviour of btrfs - nearly ext3!
Date: Tue, 18 May 2010 19:00:57 -0400
Message-ID: <4BF31C29.7080403@redhat.com>
In-Reply-To: <20100518131304.GX8635@think>
On 05/18/2010 09:13 AM, Chris Mason wrote:
> On Tue, May 18, 2010 at 02:03:49PM +0200, Jakob Unterwurzacher wrote:
>
>> On 18/05/10 02:59, Chris Mason wrote:
>>
>>>>> Ok, I upgraded to 2.6.34 final and switched to defconfig.
>>>>> I only did the rename test (i.e. no overwrite); the window is now
>>>>> 1.1s, both with vanilla and with the patch.
>>>>>
>>>> Thanks, so much for the easy fix. I'll take a look.
>>>>
>>> Ohhhhh, I read your initial email wrong, I'm sorry. The test we're
>>> failing, the rentest, doesn't overwrite one file with another. It is
>>> just creating a file and then renaming it.
>>>
>> Yes, the overwrite test goes perfectly fine.
>>
>>
>>> Btrfs is explicitly choosing not to sync the file in this case because
>>> the rename isn't replacing good old data with new unwritten data. The
>>> rename is taking new unwritten data and giving it a different name.
>>>
>>> Are there applications that rely on this?
>>>
>>> -chris
>>>
>> Well, dpkg (the Debian/Ubuntu package manager) did. Then ext4 became the
>> default fs in Ubuntu and massive breakage was reported [1]. Now dpkg is
>> fsync()ing everything and is about 2x slower than it was with ext3 [2].
>>
>> Btrfs is so close to getting it "right" that I wondered whether the new
>> file name hitting the disk could be delayed that one second for the data
>> to make it to disk first.
>>
>>
> The thing is that different apps have a different version of 'right'. Rename
> is atomically replacing one file with another, and I completely agree
> that when we have an established file on disk, we shouldn't replace it
> with something that is potentially garbage.
>
> But for the zeros case we have a file that isn't on disk and we're just
> giving it a new name. I can see a different class of applications
> getting upset about renames slowing the system down dramatically because
> they suddenly imply a lot of IO.
>
> I'm more than open to discussion on this one, but I don't see how:
>
> rm -f foo2
> dd if=/dev/zero of=foo bs=1M count=1000
> mv foo foo2
>
> Should be expected to write 1GB of data.
>
> -chris
>
Just to weigh in here, I think that you have the right behaviour
already. If an application wants to force this to sync the data to disk,
it should use fsync() after the rename.
Having applications depend on semantics that only ext3 provided is not an
excuse for making a rename take multiple seconds.
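To illustrate what "the application asks for durability itself" can look like, here is a minimal sketch in Python. It is not taken from dpkg or any other program mentioned in this thread; the helper name durable_replace is hypothetical. Note that the sketch calls fsync() before the rename, which is the ordering usually recommended so that the new name never points at unwritten data. Calling fsync() after the rename, as suggested above, also forces the data out but leaves a short window in which a crash can expose an empty file. The final fsync() on the directory is an extra step, assumed here for Linux, to make the new name itself durable.

```python
import os
import tempfile


def durable_replace(path: str, data: bytes) -> None:
    """Write data to a temporary file, flush it to disk, then rename it to path.

    Hypothetical helper for illustration only; assumes a POSIX system where
    rename() within one filesystem is atomic.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the target directory so the rename
    # stays within a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # drain Python's userspace buffer
            os.fsync(tmp.fileno())  # ask the kernel to write the data to disk
        os.rename(tmp_path, path)   # give the already-synced data its final name
    except BaseException:
        # Clean up the temporary file if the rename did not happen.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    # Sync the directory so the new directory entry is durable as well.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

An application that does not need the data on disk immediately can skip the fsync() calls and keep the rename cheap, which is the trade-off being argued for here.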
Thanks!
Ric