From: Bill Davidsen <davidsen@tmr.com>
To: Doug Ledford <dledford@redhat.com>
Cc: Neil Brown <neilb@suse.de>, David Greaves <david@dgreaves.com>,
Jeff Garzik <jeff@garzik.org>, John Stoffel <john@stoffel.org>,
Justin Piszcz <jpiszcz@lucidpixels.com>,
linux-raid@vger.kernel.org
Subject: Re: Time to deprecate old RAID formats?
Date: Sun, 28 Oct 2007 20:44:06 -0400
Message-ID: <47252CD6.9010804@tmr.com>
In-Reply-To: <1193530713.10336.389.camel@firewall.xsintricity.com>
Doug Ledford wrote:
> On Sat, 2007-10-27 at 11:20 -0400, Bill Davidsen wrote:
>
>>> * When using lilo to boot from a raid device, it automatically installs
>>> itself to the mbr, not to the partition. This can not be changed. Only
>>> 0.90 and 1.0 superblock types are supported because lilo doesn't
>>> understand the offset to the beginning of the fs otherwise.
>>>
>>>
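A minimal sketch of why only the 0.90 and 1.0 formats leave the filesystem at the very start of the partition. The offsets follow the layout described in md(4) as I understand it (0.90 and 1.0 near the end of the device, 1.1 at the start, 1.2 at 4 KiB from the start); the function names are illustrative, not part of mdadm or the kernel.

```python
def md_superblock_sector(version: str, dev_sectors: int) -> int:
    """Sector (512-byte units) where the md superblock starts on a member device."""
    if version == "0.90":
        reserved = 128  # 64 KiB expressed in sectors
        # 64 KiB-aligned block, at least 64 KiB from the end of the device
        return (dev_sectors & ~(reserved - 1)) - reserved
    if version == "1.0":
        # at least 8 KiB from the end, aligned down to 4 KiB
        return (dev_sectors - 16) & ~7
    if version == "1.1":
        return 0  # very start of the device
    if version == "1.2":
        return 8  # 4 KiB from the start
    raise ValueError(f"unknown superblock version: {version}")


def fs_starts_at_partition_start(version: str) -> bool:
    """True when the superblock sits at the end, so the data begins at sector 0."""
    return version in ("0.90", "1.0")


if __name__ == "__main__":
    for v in ("0.90", "1.0", "1.1", "1.2"):
        print(v, md_superblock_sector(v, 2_000_001), fs_starts_at_partition_start(v))
```

A boot loader that knows nothing about md and simply reads the filesystem from sector 0 of the partition can therefore only cope with the two formats for which the second function returns True.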
>> I'm reasonably sure that's wrong. I used to set up dual boot machines by
>> putting LILO in the partition and making that the boot partition; by
>> changing the active partition flag I could just have the machine boot
>> Windows, to keep people from getting confused.
>>
>
> Yeah, someone else pointed this out too. The original patch to lilo
> *did* do as I suggest, so they must have improved on the patch later.
>
>
>>> * When using grub to boot from a raid device, only 0.90 and 1.0
>>> superblocks are supported[1] (because grub is ignorant of the raid and
>>> it requires the fs to start at the start of the partition). You can use
>>> either MBR or partition based installs of grub. However, partition
>>> based installs require that all bootable partitions be in exactly the
>>> same logical block address across all devices. This can be an
>>> extremely hazardous limitation in the event a drive dies and you have
>>> to replace it with a new drive as newer drives may not share the older
>>> drive's geometry and will require starting your boot partition in an odd
>>> location to make the logical block addresses match.
>>>
>>> * When using grub2, there is supposedly already support for raid/lvm
>>> devices. However, I do not know if this includes version 1.0, 1.1, or
>>> 1.2 superblocks. I intend to find that out today. If you tell grub2 to
>>> install to an md device, it searches out all constituent devices and
>>> installs to the MBR on each device[2]. This can't be changed (at least
>>> right now, probably not ever though).
>>>
>>>
>> That sounds like a good reason to avoid grub2, frankly. Software which
>> decides that it knows what to do better than the user isn't my
>> preference. If I wanted software which forces me to do things "their way"
>> I'd be running Windows.
>>
>
> It's not really all that unreasonable of a restriction. Most people
> aren't aware that when you put a boot sector at the beginning of a
> partition, you only have 512 bytes of space, so the boot loader that you
> put there is basically nothing more than code to read the remainder of
> the boot loader from the file system space. Now, traditionally, most
> boot loaders have had to hard code the block addresses of certain key
> components into these second stage boot loaders. If a user isn't aware
> of the fact that the boot loader does this at install time (or at kernel
> selection update time in the case of lilo), then they aren't aware that
> the files must reside at exactly the same logical block address on all
> devices. Without that knowledge, they can easily create an unbootable
> setup by having the various boot partitions in slightly different
> locations on the disks. And intelligent partition editors like parted
> can compound the problem because as they insulate the user from having
> to pick which partition number is used for what partition, etc., they
> can end up placing the various boot partitions in different areas of
> different drives. The requirement above is a means of making sure that
> users aren't surprised by a non-working setup. The whole principle of
> least surprise thing. Of course, if they keep that requirement, then I
> would expect it to be well documented so that people know this going
> into putting the boot loader in place, but I would argue that this is at
> least better than finding out when a drive dies that your system isn't
> bootable.
>
>
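The check described above, that every boot partition starts at the same logical block address on every member disk, can be sketched as follows. This is an illustration under assumptions, not something lilo or grub ships: the function names are hypothetical, and the helper relies on the Linux sysfs attribute /sys/block/<disk>/<partition>/start, which reports the start in 512-byte sectors.

```python
from pathlib import Path
from typing import Dict, List


def read_start_sector(disk: str, part: str) -> int:
    """Read a partition's start sector (512-byte units) from Linux sysfs."""
    return int(Path("/sys/block", disk, part, "start").read_text())


def mismatched_partitions(starts: Dict[str, int]) -> List[str]:
    """Names of partitions whose start sector differs from the first entry."""
    if not starts:
        return []
    reference = next(iter(starts.values()))
    return sorted(name for name, start in starts.items() if start != reference)


if __name__ == "__main__":
    # Example values only; on a real system they could come from read_start_sector().
    starts = {"sda1": 63, "sdb1": 63, "sdc1": 2048}
    bad = mismatched_partitions(starts)
    if bad:
        print("boot partitions at a different LBA:", ", ".join(bad))
```

An empty result means a boot sector with hard-coded block addresses would find its second stage at the same place on every disk; any name in the list marks a disk that would not boot on its own.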
>>> So, given the above situations, really, superblock format 1.2 is likely
>>> to never be needed. None of the shipping boot loaders work with 1.2
>>> regardless, and the boot loader under development won't install to the
>>> partition in the event of an md device and therefore doesn't need that
>>> 4k buffer that 1.2 provides.
>>>
>>>
>> Sounds right, although it may have other uses for clever people.
>>
>>> [1] Grub won't work with either 1.1 or 1.2 superblocks at the moment. A
>>> person could probably hack it to work, but since grub development has
>>> stopped in preference to the still under development grub2, they won't
>>> take the patches upstream unless they are bug fixes, not new features.
>>>
>>>
>> If the patches were available, "doesn't work with existing raid formats"
>> would probably qualify as a bug.
>>
>
> Possibly. I'm a bit overbooked on other work at the moment, but I may
> try to squeeze in some work on grub/grub2 to support version 1.1 or 1.2
> superblocks.
>
>
>>> [2] There are two ways to install to a master boot record. The first is
>>> to use the first 512 bytes *only* and hardcode the location of the
>>> remainder of the boot loader into those 512 bytes. The second way is to
>>> use the free space between the MBR and the start of the first partition
>>> to embed the remainder of the boot loader. When you point grub2 at an
>>> md device, they automatically only use the second method of boot loader
>>> installation. This gives them the freedom to be able to modify the
>>> second stage boot loader on a boot disk by boot disk basis. The
>>> downside to this is that they need lots of room after the MBR and before
>>> the first partition in order to put their core.img file in place. I
>>> *think*, and I'll know for sure later today, that the core.img file is
>>> generated during grub install from the list of optional modules you
>>> specify during setup. E.g., the pc module gives partition table support,
>>> the lvm module lvm support, etc. You list the modules you need, and
>>> grub then builds a core.img out of all those modules. The normal amount
>>> of space between the MBR and the first partition is (sectors_per_track -
>>> 1). For standard disk geometries (63 sectors per track), that basically
>>> leaves 62 sectors, or 31k of space. This might not be enough for your
>>> particular needs if
>>> you have a complex boot environment. In that case, you would need to
>>> bump at least the starting track of your first partition to make room
>>> for your boot loader. Unfortunately, how is a person to know how much
>>> room their setup needs until after they've installed and it's too late
>>> to bump the partition table start? They can't. So, that's another
>>> thing I think I will check out today, what the maximum size of grub2
>>> might be with all modules included, and what a common size might be.
>>>
>>>
>>>
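The space arithmetic in footnote [2] is easy to check by hand. The sketch below assumes 512-byte sectors and a first partition that starts on the second track; the function names and the 40000-byte image size are made up for illustration. With the conventional 63 sectors per track the gap is 62 sectors (31 KiB); a geometry with 255 sectors per track would give 254 sectors (127 KiB).

```python
SECTOR_BYTES = 512


def mbr_gap_bytes(sectors_per_track: int) -> int:
    """Bytes between the MBR (sector 0) and a first partition starting at track 1."""
    return (sectors_per_track - 1) * SECTOR_BYTES


def core_image_fits(core_img_bytes: int, sectors_per_track: int = 63) -> bool:
    """True when an embedded boot loader image of this size fits in the gap."""
    return core_img_bytes <= mbr_gap_bytes(sectors_per_track)


if __name__ == "__main__":
    for spt in (63, 255):
        gap = mbr_gap_bytes(spt)
        print(f"{spt} sectors/track: {gap} bytes ({gap // 1024} KiB)")
    # Hypothetical image size, not a measured core.img.
    print("40000-byte image fits at 63 sectors/track:", core_image_fits(40000))
```

Running the size check before partitioning is the only point at which the answer can still change the layout, which is the difficulty the footnote describes.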
>> Based on your description, it sounds as if grub2 may not have given
>> adequate thought to what users other than the authors might need (that
>> may be a premature conclusion). I have multiple installs on several of
>> my machines, and I assume that the grub2 for 32 and 64 bit will be
>> different. Thanks for the research.
>>
>
> No, not really. The grub command on the two is different, but they
> actually build the boot sector out of 16 bit non-protected mode code,
> just like DOS. So either one would build the same boot sector given the
> same config. And you can always use the same trick I've used in the
> past of creating a large /boot partition (say 250MB) and using that same
> partition as /boot in all of your installs. Then they share a single
> grub config (while the grub binaries are in the individual / partitions)
> and from the single grub instance you can boot to any of the installs,
> and a kernel update in any install updates that global grub
> config. The other option is to use separate /boot partitions and chain
> load the grub instances, but I find that clunky in comparison. Of
>
I just copy a stanza of the 64 bit grub file into the 32 bit grub file,
and that seems to work okay, the 32 bit boot mounts /mnt/boot64, and the
64 bit boot mounts /mnt/boot64 so I can just copy the data. I confess
that the 64 bit stuff has little use recently, nothing I'm doing runs
appreciably faster, and I know the 32 bit code is more used and
therefore likely to be better debugged. Note "likely" in that. ;-)
> course, in my case I also made /lib/modules its own partition and also
> shared it between all the installs so that I could manually edit the
> various kernel boot params to specify different root partitions and in
> so doing I could boot a RHEL5 kernel using a RHEL4 install and vice
> versa. But if you do that, you have to manually
> patch /etc/rc.d/rc.sysinit to mount the /lib/modules partition before
> ever trying to do anything with modules (and you have to mount it rw so
> they can do a depmod if needed), then remount it ro for the fsck, then
> it gets remounted rw again after the fs check. It was a pain in the ass
> to maintain because every update to initscripts would wipe out the patch
> and if you forgot to repatch the file, the system wouldn't boot and
> you'd have to boot into another install, mount the / partition of the
> broken install, patch the file, then it would work again in that
> install.
>
>
That sounds like *way* more complexity than appeals to me. I stand in
awe, but have no urge to join you.
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979