From: Bill Davidsen <davidsen@tmr.com>
To: Doug Ledford <dledford@redhat.com>
Cc: Iustin Pop <iusty@k1024.org>, John Stoffel <john@stoffel.org>,
Justin Piszcz <jpiszcz@lucidpixels.com>,
linux-raid@vger.kernel.org
Subject: Re: Time to deprecate old RAID formats?
Date: Tue, 23 Oct 2007 19:09:40 -0400
Message-ID: <471E7F34.4000402@tmr.com>
In-Reply-To: <1192830129.1666.103.camel@firewall.xsintricity.com>
Doug Ledford wrote:
> On Fri, 2007-10-19 at 23:23 +0200, Iustin Pop wrote:
>
>> On Fri, Oct 19, 2007 at 02:39:47PM -0400, John Stoffel wrote:
>>
>>> And if putting the superblock at the end is problematic, why is it the
>>> default? Shouldn't version 1.1 be the default?
>>>
>> In my opinion, having the superblock *only* at the end (e.g. the 0.90
>> format) is the best option.
>>
>> It allows one to mount the disk separately (in case of RAID 1), if the
>> MD superblock is corrupt or you just want to get easily at the raw data.
>>
>
> Bad reasoning. It's the reason that the default is at the end of the
> device, but that was a bad decision made by Ingo long, long ago in a
> galaxy far, far away.
>
> The simple fact of the matter is there are only two types of raid devices
> for the purpose of this issue: those that fragment data (raid0/4/5/6/10)
> and those that don't (raid1, linear).
>
> For the purposes of this issue, there are only two states we care about:
> the raid array works or doesn't work.
>
> If the raid array works, then you *only* want the system to access the
> data via the raid array. If the raid array doesn't work, then for the
> fragmented case you *never* want the system to see any of the data from
> the raid array (such as an ext3 superblock) or a subsequent fsck could
> see a valid superblock and actually start a filesystem scan on the raw
> device, and end up hosing the filesystem beyond all repair after it hits
> the first chunk size break (although in practice this is usually a
> situation where fsck declares the filesystem so corrupt that it refuses
> to touch it, that's leaving an awful lot to chance, you really don't
> want fsck to *ever* see that superblock).
>
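To make the "first chunk size break" point concrete, here is a minimal sketch of plain striping across equal-sized members. It assumes a simple raid0-style layout with the data starting at offset 0 of each member (as it does when the superblock lives at the end); the function names are made up for illustration and are not md's own.

```python
def array_to_member(logical_offset, chunk_size, n_members):
    """Map a byte offset in the array to (member index, byte offset on that member)."""
    chunk_index, within = divmod(logical_offset, chunk_size)
    member = chunk_index % n_members
    member_offset = (chunk_index // n_members) * chunk_size + within
    return member, member_offset


def member_to_array(member, member_offset, chunk_size, n_members):
    """Inverse mapping: which array offset a byte on a raw member really holds."""
    stripe, within = divmod(member_offset, chunk_size)
    return (stripe * n_members + member) * chunk_size + within


CHUNK = 64 * 1024

# Inside the first chunk, member 0 and the array agree byte for byte,
# so a filesystem superblock near the start looks valid on the raw member.
first_chunk = member_to_array(0, 1024, CHUNK, 2)

# Just past the first chunk they diverge: the raw member holds data that
# belongs a full chunk further into the array.
past_break = member_to_array(0, CHUNK, CHUNK, 2)
```

Anything that walks the raw member as if it were the filesystem is therefore reading the wrong blocks from the second chunk onward.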
> If the raid array is raid1, then the raid array should *never* fail to
> start unless all disks are missing (in which case there is no raw device
> to access anyway). The very few failure types that will cause the raid
> array to not start automatically *and* still have an intact copy of the
> data usually happen when the raid array is perfectly healthy, in which
> case automatically finding a constituent device when the raid array
> failed to start is exactly the *wrong* thing to do (for instance, you
> enable SELinux on a machine and it hasn't been relabeled and the raid
> array fails to start because /dev/md<blah> can't be created because of
> an SELinux denial...all the raid1 members are still there, but if you
> touch a single one of them, then you run the risk of creating silent
> data corruption).
>
> It really boils down to this: for any reason that a raid array might
> fail to start, you *never* want to touch the underlying data until
> someone has taken manual measures to figure out why it didn't start and
> corrected the problem. Putting the superblock in front of the data does
> not prevent manual measures (such as recreating superblocks) from
> getting at the data. But, putting superblocks at the end leaves the
> door open for accidental access via constituent devices when you
> *really* don't want that to happen.
>
You didn't mention an ill-behaved application using the raw device
(e.g. a database) writing just a little more than it should and
destroying the superblock.
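
For reference, a rough sketch of where each format puts its superblock, as I understand the md layout (64 KiB reserved and aligned at the end for 0.90; 8 KiB from the end, 4 KiB aligned, for 1.0; offset 0 for 1.1; 4 KiB from the start for 1.2). The helper names are illustrative only, and the numbers should be checked against the md documentation before relying on them.

```python
RESERVED_090 = 64 * 1024  # bytes reserved at the end of the device by 0.90


def sb_offset_090(dev_bytes):
    """Byte offset of a 0.90 superblock: last 64 KiB-aligned 64 KiB block."""
    return (dev_bytes & ~(RESERVED_090 - 1)) - RESERVED_090


def sb_offset_1x(dev_bytes, minor):
    """Byte offset of a version-1 superblock for minor version 0, 1 or 2."""
    if minor == 0:
        sectors = dev_bytes // 512 - 16  # at least 8 KiB from the end
        return (sectors & ~7) * 512      # rounded down to a 4 KiB boundary
    if minor == 1:
        return 0                         # very start of the device
    if minor == 2:
        return 4096                      # 4 KiB from the start
    raise ValueError("unknown minor version")


def write_hits_superblock_090(write_end, dev_bytes):
    """True if a raw write ending at write_end (exclusive) reaches the 0.90 superblock."""
    return write_end > sb_offset_090(dev_bytes)


ONE_GIB = 1 << 30
offsets = {
    "0.90": sb_offset_090(ONE_GIB),
    "1.0": sb_offset_1x(ONE_GIB, 0),
    "1.1": sb_offset_1x(ONE_GIB, 1),
    "1.2": sb_offset_1x(ONE_GIB, 2),
}
```

With the end-of-device formats, an application that believes it owns the whole member only has to overshoot by a few KiB to land on the superblock.
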
> So, no, the default should *not* be at the end of the device.
>
>
You make a convincing argument.
>> As to the people who complained exactly because of this feature, LVM has
>> two mechanisms to protect from accessing PVs on the raw disks (the
>> ignore raid components option and the filter - I always set filters when
>> using LVM on top of MD).
>>
>> regards,
>> iustin
>>
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979