From: Bill Davidsen <davidsen@tmr.com>
To: Doug Ledford <dledford@redhat.com>
Cc: Iustin Pop <iusty@k1024.org>, John Stoffel <john@stoffel.org>,
Justin Piszcz <jpiszcz@lucidpixels.com>,
linux-raid@vger.kernel.org
Subject: Re: Time to deprecate old RAID formats?
Date: Tue, 23 Oct 2007 19:09:40 -0400
Message-ID: <471E7F34.4000402@tmr.com>
In-Reply-To: <1192830129.1666.103.camel@firewall.xsintricity.com>
Doug Ledford wrote:
> On Fri, 2007-10-19 at 23:23 +0200, Iustin Pop wrote:
>
>> On Fri, Oct 19, 2007 at 02:39:47PM -0400, John Stoffel wrote:
>>
>>> And if putting the superblock at the end is problematic, why is it the
>>> default? Shouldn't version 1.1 be the default?
>>>
>> In my opinion, having the superblock *only* at the end (e.g. the 0.90
>> format) is the best option.
>>
>> It allows one to mount the disk separately (in case of RAID 1), if the
>> MD superblock is corrupt or you just want to get easily at the raw data.
>>
>
> Bad reasoning. It's the reason that the default is at the end of the
> device, but that was a bad decision made by Ingo long, long ago in a
> galaxy far, far away.
>
> The simple fact of the matter is there are only two types of raid devices
> for the purpose of this issue: those that fragment data (raid0/4/5/6/10)
> and those that don't (raid1, linear).
>
> For the purposes of this issue, there are only two states we care about:
> the raid array works or doesn't work.
>
> If the raid array works, then you *only* want the system to access the
> data via the raid array. If the raid array doesn't work, then for the
> fragmented case you *never* want the system to see any of the data from
> the raid array (such as an ext3 superblock) or a subsequent fsck could
> see a valid superblock and actually start a filesystem scan on the raw
> device, and end up hosing the filesystem beyond all repair after it hits
> the first chunk size break (although in practice this is usually a
> situation where fsck declares the filesystem so corrupt that it refuses
> to touch it, that's leaving an awful lot to chance, you really don't
> want fsck to *ever* see that superblock).
>
> If the raid array is raid1, then the raid array should *never* fail to
> start unless all disks are missing (in which case there is no raw device
> to access anyway). The very few failure types that will cause the raid
> array to not start automatically *and* still have an intact copy of the
> data usually happen when the raid array is perfectly healthy, in which
> case automatically finding a constituent device when the raid array
> failed to start is exactly the *wrong* thing to do (for instance, you
> enable SELinux on a machine and it hasn't been relabeled and the raid
> array fails to start because /dev/md<blah> can't be created because of
> an SELinux denial...all the raid1 members are still there, but if you
> touch a single one of them, then you run the risk of creating silent
> data corruption).
>
> It really boils down to this: for any reason that a raid array might
> fail to start, you *never* want to touch the underlying data until
> someone has taken manual measures to figure out why it didn't start and
> corrected the problem. Putting the superblock in front of the data does
> not prevent manual measures (such as recreating superblocks) from
> getting at the data. But, putting superblocks at the end leaves the
> door open for accidental access via constituent devices when you
> *really* don't want that to happen.
>
You didn't mention an ill-behaved application using the raw device
(e.g. a database) writing just a little more than it should and
destroying the superblock.
> So, no, the default should *not* be at the end of the device.
>
>
You make a convincing argument.
>> As to the people who complained exactly because of this feature, LVM has
>> two mechanisms to protect from accessing PVs on the raw disks (the
>> ignore raid components option and the filter - I always set filters when
>> using LVM on top of MD).
>>
>> regards,
>> iustin
>>
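
For anyone following the argument about where the superblock lives, here
is a rough sketch of the placement for each metadata version. The offsets
are my understanding of the md on-disk layout and are not stated anywhere
in this thread, so treat them as assumptions. The function names
(superblock_offset, data_starts_at_zero) are made up for illustration and
are not part of mdadm or the kernel.

```python
SECTOR = 512  # md does its placement arithmetic in 512-byte sectors


def superblock_offset(version: str, device_bytes: int) -> int:
    """Return the assumed byte offset of the md superblock on a member device."""
    if version == "0.90":
        # Assumed: 64 KiB-aligned, at least 64 KiB before the end of the device.
        reserved = 64 * 1024
        return (device_bytes & ~(reserved - 1)) - reserved
    if version == "1.0":
        # Assumed: at least 8 KiB before the end, rounded down to 4 KiB.
        sectors = device_bytes // SECTOR
        return ((sectors - 16) & ~7) * SECTOR
    if version == "1.1":
        # Assumed: at the very start of the device.
        return 0
    if version == "1.2":
        # Assumed: 4 KiB from the start of the device.
        return 4096
    raise ValueError(f"unknown metadata version: {version}")


def data_starts_at_zero(version: str) -> bool:
    """True when the array data begins at offset 0 of the member device,
    so a filesystem superblock is visible on the raw constituent device."""
    return version in ("0.90", "1.0")


if __name__ == "__main__":
    size = 1 << 30  # hypothetical 1 GiB member device
    for v in ("0.90", "1.0", "1.1", "1.2"):
        print(v, superblock_offset(v, size), data_starts_at_zero(v))
```

With the end-of-device formats the data starts at offset 0, which is
exactly what lets mount or fsck find a filesystem on a bare raid1 member.
With 1.1 and 1.2 the md superblock sits where a filesystem probe would
look first.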
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979