From: "Patrick H." <linux-raid@feystorm.net>
To: linux-raid@vger.kernel.org
Subject: Re: filesystem corruption
Date: Sun, 02 Jan 2011 22:05:06 -0700
Message-ID: <4D215902.9010308@feystorm.net>
In-Reply-To: <20110103155630.565341d0@notabene.brown>
Sent: Sun Jan 02 2011 21:56:30 GMT-0700 (Mountain Standard Time)
From: Neil Brown <neilb@suse.de>
To: Patrick H. <linux-raid@feystorm.net> linux-raid@vger.kernel.org
Subject: Re: filesystem corruption
> On Sun, 02 Jan 2011 21:06:52 -0700 "Patrick H." <linux-raid@feystorm.net>
> wrote:
>
>
>
>> That makes sense assuming that MD acknowledges the write once the data is
>> written to the data disks but not necessarily the parity disk, which is
>> what I gather you were saying is what happens. Is there any option that
>> can change the behavior so that md won't ack the write until it's been
>> committed to all disks (I'm guessing no since you didn't mention it)?
>> Also, does raid6 suffer this problem? Is it smart enough to use both
>> parity disks when calculating replacement, or will it just use one?
>>
>>
>
> md/raid5 doesn't acknowledge the write until both the data and the parity
> have been written. But that doesn't make any difference.
> If you schedule a number of interdependent writes (data and parity) and then
> allow some to complete but not all, then you have inconsistency.
> Recovery from losing a single device requires consistency of parity and data.
>
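The inconsistency described above can be reproduced with plain XOR parity.
The following is only a minimal sketch, not md's actual code: the block
contents, the two-data-disk layout and the helper xor_blocks() are all
hypothetical and chosen purely for illustration. It shows that when a data
write lands but its matching parity write does not, rebuilding a different
block in the same stripe yields wrong data.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks byte by byte (single-parity, RAID5-style)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# A consistent stripe: two data blocks and their parity (hypothetical contents).
d0_old = b"AAAA"
d1 = b"BBBB"
parity_old = xor_blocks(d0_old, d1)

# A write replaces d0, so a data write and a parity write are both scheduled.
d0_new = b"CCCC"
parity_new = xor_blocks(d0_new, d1)

# Crash in between: the data write reached its disk, the parity write did not.
on_disk_d0 = d0_new
on_disk_parity = parity_old

# Now the disk holding d1 is lost and d1 is rebuilt from the survivors.
rebuilt_d1 = xor_blocks(on_disk_d0, on_disk_parity)
print(rebuilt_d1 == d1)  # False: d1 was never written to, yet it is now wrong

# Had both scheduled writes completed, the rebuild would have been correct.
print(xor_blocks(d0_new, parity_new) == d1)  # True
```

Note that the block that ends up corrupted is one the interrupted write
never touched, which is why acknowledging the write late does not help.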
> RAID6 suffers equally from this problem. Even if it used both parity disks
> to recover (which it doesn't) how would that help? It would then have two
> possible values for the data and no way to know which was correct, and every
> possibility that both are incorrect. This would happen if a single data
> block was successfully written, but neither parity block was.
>
> The only way you can avoid this 'write hole' is by journalling in multiples
> of whole stripes. No current filesystems that I know of can do this as they
> journal in blocks, and the maximum block size is less than the minimum stripe
> size. So you would need journalling integrated with md/raid, or you would
> need a filesystem which was designed to understand this problem and write
> whole stripes at a time, always to an area of the device which did not
> contain live data.
>
> NeilBrown
Ok, thanks for the info.
I think I'll solve it by setting up 2 dedicated hosts for running the
array, neither of which exports any disks itself. This way, if a
master dies, all the raid disks are still there and can be picked up by
the other master.
-Patrick