From: Anand Jain <Anand.Jain@oracle.com>
To: Mitch Harder <mitch.harder@sabayonlinux.org>
Cc: Berend Dekens <btrfs@cyberwizzard.nl>,
Arne Jansen <sensille@gmx.net>,
linux-btrfs@vger.kernel.org
Subject: Re: BTRFS and power loss ~= corruption?
Date: Thu, 25 Aug 2011 11:31:35 +0800
Message-ID: <4E55C217.4080203@oracle.com>
In-Reply-To: <CAKcLGm8wYKKZ-yEX3_A612MJPLND3DnM-7+N_hw3LUxwQe837w@mail.gmail.com>
We have some documentation on disk power failure and
corruption here:
https://btrfs.wiki.kernel.org/index.php/FAQ
See the second FAQ entry in the list.
Things would be a lot easier for the filesystem (in terms
of maintaining its consistency) if disks offered some kind
of atomic write (between disk cache and disk) for a given block size.
Anyway, setups that combine a disabled disk write cache with SSDs
are quite popular nowadays, and in terms of random synchronous
write performance they are awesome.
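To make the ordering problem from the thread below concrete, here is a
rough user-space sketch in Python. It is not how btrfs is implemented; it
only mimics the idea that lower-level blocks must be stable before the root
pointer that references them is published. The names write_block,
commit_root, cow_update and read_root are made up for this illustration,
and the directory fsync assumes a POSIX system such as Linux.

```python
import os
import tempfile


def write_block(dirpath, name, data):
    """Write a new block to a fresh file and ask the OS to flush it."""
    path = os.path.join(dirpath, name)
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        # Only as good as the hardware: a drive that acknowledges the
        # flush before the data is on stable media defeats this step.
        os.fsync(f.fileno())
    return path


def commit_root(dirpath, block_name):
    """Publish a new root pointer by writing a temp file and renaming it."""
    tmp = os.path.join(dirpath, "root.tmp")
    with open(tmp, "wb") as f:
        f.write(block_name.encode())
        f.flush()
        os.fsync(f.fileno())
    # os.replace swaps the pointer in one step: readers see old or new.
    os.replace(tmp, os.path.join(dirpath, "root"))
    # Flush the directory so the rename itself is durable (POSIX only).
    dfd = os.open(dirpath, os.O_RDONLY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)


def cow_update(dirpath, generation, payload):
    """Copy-on-write style update: new block first, root pointer last."""
    name = f"leaf-{generation}"
    write_block(dirpath, name, payload)  # step 1: lower level is flushed
    commit_root(dirpath, name)           # step 2: only now move the root
    return name


def read_root(dirpath):
    """Return the name of the block the root pointer currently references."""
    with open(os.path.join(dirpath, "root"), "rb") as f:
        return f.read().decode()


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    cow_update(workdir, 1, b"old contents")
    cow_update(workdir, 2, b"new contents")
    print(read_root(workdir))
```

If power is lost between step 1 and step 2, the root still points at the
old block and the new write simply disappears. If the device reorders the
two steps behind our back, the root can point at a block that never
reached the disk, which is the broken-tree case described below.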
HTH
Cheers, Anand
On 08/25/2011 01:06 AM, Mitch Harder wrote:
> On Wed, Aug 24, 2011 at 10:13 AM, Berend Dekens <btrfs@cyberwizzard.nl> wrote:
>> On 24/08/11 17:04, Arne Jansen wrote:
>>>
>>> On 24.08.2011 17:01, Berend Dekens wrote:
>>>>
>>>> On 24/08/11 15:31, Arne Jansen wrote:
>>>>>
>>>>> On 24.08.2011 15:11, Berend Dekens wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have followed the progress made in the btrfs filesystem over time and
>>>>>> while I have experimented with it a little in a VM, I have not yet used it
>>>>>> in a production machine.
>>>>>>
>>>>>> While the lack of a complete fsck was a major issue (I read the update
>>>>>> that the first working version is about to be released) I am still worried
>>>>>> about an issue I see popping up.
>>>>>>
>>>>>> How is it possible that a copy-on-write filesystem becomes corrupted if
>>>>>> a power failure occurs? I assume this means that even (hard) resetting a
>>>>>> computer can result in a corrupt filesystem.
>>>>>>
>>>>>> I thought the idea of COW was that whatever happens, you can always
>>>>>> mount in a semi-consistent state?
>>>>>>
>>>>>> As far as I can see, you wind up with this:
>>>>>> - No outstanding writes when the power goes down
>>>>>> - File write complete, tree structure is updated. Since everything is
>>>>>> hashed and duplicated, unless the update propagates to the highest level,
>>>>>> the write will simply disappear upon failure. While this might be rectified
>>>>>> with a fsck, there should be no problems mounting the filesystem (read-only
>>>>>> if need be)
>>>>>> - Writes are not completed on all disks/partitions at the same time.
>>>>>> The checksums will detect these errors and once again, the write disappears
>>>>>> unless it is salvaged by a fsck.
>>>>>>
>>>>>> Am I missing something? How come there seem to be plenty of people with a
>>>>>> corrupt btrfs after a power failure? And why haven't I experienced similar
>>>>>> issues where a filesystem becomes unmountable with say NTFS or Ext3/4?
>>>>>
>>>>> Problems arise when in your scenario writes from higher levels in the
>>>>> tree hit the disk earlier than updates on lower levels. In this case
>>>>> the tree is broken and the fs is unmountable.
>>>>> Of course btrfs takes care of the order it writes, but problems arise
>>>>> when the disk is lying about whether a write is stable on disk, i.e.
>>>>> about cache flushes or barriers.
>>>>
>>>> Ah, I see. So the issue is not with the software implementation at all
>>>> but only arises when hardware acknowledges flushes and barriers before they
>>>> actually complete?
>>>
>>> It doesn't mean there aren't any bugs left in the software stack ;)
>>
>> Naturally, but the fact that it's very likely that the corruption stories
>> I've been reading about are caused by misbehaving hardware set my mind at
>> ease about experimenting further with btrfs (although I will await the fsck
>> before attempting things in production).
>>>>
>>>> Is this a common problem of hard disks?
>>>
>>> Only of very cheap ones. USB enclosures might add to the problem, too.
>>> Also some SSDs are rumored to be bad in this regard.
>>> Another problem is layers between btrfs and the hardware, like
>>> encryption.
>>
>> I am - and will be - using btrfs straight on hard disks, no lvm, (soft)raid,
>> encryption or other layers.
>>
>> My hard drives are not that fancy (no 15k raptors here); I usually buy
>> hardware from the major suppliers (WD, Maxtor, Seagate, Hitachi etc). Also,
>> until the fast cache mode for SSDs in combination with rotating hardware
>> becomes stable, I'll stick to ordinary hard drives.
>>
>> Thank you for clarifying things.
>>
>
> I have to admit I've been beginning to wonder if we picked up a
> regression somewhere along the way with respect to corruptions after
> power outages.
>
> I'm lucky enough to have very unreliable power. Btrfs was always
> robust for me on power outages until recently. Now I've recently had
> two corrupted volumes on unclean shutdowns and power outages.