* Raid5 regression
@ 2011-05-03 14:33 Phillip Susi
2011-05-03 15:30 ` Phillip Susi
2011-05-03 20:39 ` Phillip Susi
0 siblings, 2 replies; 14+ messages in thread
From: Phillip Susi @ 2011-05-03 14:33 UTC (permalink / raw)
To: grub-devel
After upgrading an Ubuntu server from Maverick to Natty (grub2 version
1.99-rc1-13ubuntu3), the system will no longer boot. Grub complains
about the raid5 array:
error: Found two disks with the index 0 for RAID md0
error: Found two disks with the index 1 for RAID md0
error: Found two disks with the index 2 for RAID md0
error: Superfluous RAID member (4 found)
error: Unknown filesystem
This is a 4 disk raid5 array that mdadm still recognizes and looks fine,
with each disk having IDs 0 through 3 respectively. Despite the fact
that it complains about the raid, it still shows the LVM logical volumes
in the output of ls. The array is the sole LVM PV.
Bug filed at https://launchpad.net/bugs/776422. Any hints on how to
proceed with debugging this? Is there a way to check what modules are
built into the core image from the rescue shell?
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Raid5 regression
2011-05-03 14:33 Raid5 regression Phillip Susi
@ 2011-05-03 15:30 ` Phillip Susi
2011-05-03 20:39 ` Phillip Susi
1 sibling, 0 replies; 14+ messages in thread
From: Phillip Susi @ 2011-05-03 15:30 UTC (permalink / raw)
To: The development of GNU GRUB
Wasn't there a change made recently to support the new md metadata
formats? I wonder if that is causing grub to detect the raid superblock
both at the end of the physical disk, as well as within the partition,
causing it to see the members twice?
* Re: Raid5 regression
2011-05-03 14:33 Raid5 regression Phillip Susi
2011-05-03 15:30 ` Phillip Susi
@ 2011-05-03 20:39 ` Phillip Susi
2011-05-03 21:15 ` Jérôme Poulin
2011-05-04 11:23 ` Goswin von Brederlow
1 sibling, 2 replies; 14+ messages in thread
From: Phillip Susi @ 2011-05-03 20:39 UTC (permalink / raw)
To: The development of GNU GRUB
After enabling raid debug output and then loading mdraid09.mod, I can
see that it is scanning and detecting the superblock on both (hdx) and
(hdx,msdos1).
* Re: Raid5 regression
2011-05-03 20:39 ` Phillip Susi
@ 2011-05-03 21:15 ` Jérôme Poulin
2011-05-04 6:17 ` Vladimir 'φ-coder/phcoder' Serbinenko
2011-05-04 11:23 ` Goswin von Brederlow
1 sibling, 1 reply; 14+ messages in thread
From: Jérôme Poulin @ 2011-05-03 21:15 UTC (permalink / raw)
To: The development of GNU GRUB
On Tue, May 3, 2011 at 4:39 PM, Phillip Susi <psusi@cfl.rr.com> wrote:
> After enabling raid debug output and then loading mdraid09.mod, I can see
> that it is scanning and detecting the superblock on both (hdx) and
> (hdx,msdos1).
I've had this problem too. It is a bug that should be fixed; phcoder
told me he didn't know enough about mdraid to change the behavior of its
module. I say it needs to be changed to scan partitions first, then
the full disk.
See: http://www.mail-archive.com/grub-devel@gnu.org/msg16926.html
* Re: Raid5 regression
2011-05-03 21:15 ` Jérôme Poulin
@ 2011-05-04 6:17 ` Vladimir 'φ-coder/phcoder' Serbinenko
0 siblings, 0 replies; 14+ messages in thread
From: Vladimir 'φ-coder/phcoder' Serbinenko @ 2011-05-04 6:17 UTC (permalink / raw)
To: grub-devel
On 03.05.2011 23:15, Jérôme Poulin wrote:
> On Tue, May 3, 2011 at 4:39 PM, Phillip Susi <psusi@cfl.rr.com> wrote:
>> After enabling raid debug output and then loading mdraid09.mod, I can see
>> that it is scanning and detecting the superblock on both (hdx) and
>> (hdx,msdos1).
> I've had this problem too. It is a bug that should be fixed; phcoder
> told me he didn't know enough about mdraid to change the behavior of its
> module. I say it needs to be changed to scan partitions first, then
> the full disk.
This would only swap one problem for another (think about partitioned raid1)
> See: http://www.mail-archive.com/grub-devel@gnu.org/msg16926.html
--
Regards
Vladimir 'φ-coder/phcoder' Serbinenko
* Re: Raid5 regression
2011-05-03 20:39 ` Phillip Susi
2011-05-03 21:15 ` Jérôme Poulin
@ 2011-05-04 11:23 ` Goswin von Brederlow
2011-05-04 13:49 ` Phillip Susi
1 sibling, 1 reply; 14+ messages in thread
From: Goswin von Brederlow @ 2011-05-04 11:23 UTC (permalink / raw)
To: The development of GNU GRUB
Phillip Susi <psusi@cfl.rr.com> writes:
> After enabling raid debug output and then loading mdraid09.mod, I can
> see that it is scanning and detecting the superblock on both (hdx) and
> (hdx,msdos1).
That is a problem mdadm has (had?) too. The problem arises when the
alignment of the partition and its size causes the metadata of the
partition to be also valid for the disk (as in the partition ends
exactly at the end of the disk).
One of the reasons not to use metadata at the end of the disk.
MfG
Goswin
* Re: Raid5 regression
2011-05-04 11:23 ` Goswin von Brederlow
@ 2011-05-04 13:49 ` Phillip Susi
2011-05-04 15:09 ` Phillip Susi
0 siblings, 1 reply; 14+ messages in thread
From: Phillip Susi @ 2011-05-04 13:49 UTC (permalink / raw)
To: The development of GNU GRUB; +Cc: Goswin von Brederlow
On 5/4/2011 7:23 AM, Goswin von Brederlow wrote:
> That is a problem mdadm has (had?) too. The problem arises when the
> alignment of the partition and its size causes the metadata of the
> partition to be also valid for the disk (as in the partition ends
> exactly at the end of the disk).
>
> One of the reasons not to use metadata at the end of the disk.
I thought that could be the problem so after speaking with Collin on IRC
yesterday, I checked and the partition ends over 300 sectors before the
end of the disk, which is enough to prevent the superblock in the
partition from being mistaken for one on the end of the disk.
At least it should be, and it was under Maverick, but does not seem to
be anymore in Natty. Just to make sure there wasn't another actual
superblock at the end of the disk, I zeroed out the sectors from the end
of the partition to the end of the disk and it did not change anything.
It seems that somehow this new version of grub is now scanning more than
just the last whole 64k of the disk, though looking at the code, I don't
see how.
* Re: Raid5 regression
2011-05-04 13:49 ` Phillip Susi
@ 2011-05-04 15:09 ` Phillip Susi
2011-05-04 15:21 ` Vladimir 'φ-coder/phcoder' Serbinenko
0 siblings, 1 reply; 14+ messages in thread
From: Phillip Susi @ 2011-05-04 15:09 UTC (permalink / raw)
To: The development of GNU GRUB
Something has gone wrong with the disk size detection. Grub thinks the
size of the disk is 488396800 sectors instead of the 488397168 that
fdisk reports. This shorter size causes the partition to appear to run
right up to the end of the disk and thus, the raid superblock is found
both at the end of the disk and the end of the partition.
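The arithmetic can be checked with a short sketch. It assumes the placement rule Linux md uses for 0.90 metadata (round the device size down to a 64 KiB boundary, then step back 64 KiB). The two disk sizes come from this thread; the partition start and length are hypothetical, chosen so that the partition ends at sector 488396800, because the actual partition table was not posted.

```python
# Sketch: where an md 0.90 superblock is expected, in 512-byte sectors.
# Assumption: placement follows Linux md's rule for 0.90 metadata
# (round the device size down to a 64 KiB boundary, then step back 64 KiB).

MD_RESERVED_SECTORS = 64 * 1024 // 512  # 128 sectors = 64 KiB


def sb_offset_090(size_sectors: int) -> int:
    """Sector, relative to the device start, of the 0.90 superblock."""
    return (size_sectors & ~(MD_RESERVED_SECTORS - 1)) - MD_RESERVED_SECTORS


DISK_SIZE_FDISK = 488397168  # size reported by fdisk (from the thread)
DISK_SIZE_GRUB = 488396800   # size GRUB sees (from the thread)

# Hypothetical partition layout: not given in the thread.
PART_START = 2048
PART_LENGTH = DISK_SIZE_GRUB - PART_START  # partition ends at sector 488396800

# Absolute sector of the superblock that belongs to the partition.
part_sb = PART_START + sb_offset_090(PART_LENGTH)

for label, disk_size in (("fdisk", DISK_SIZE_FDISK), ("grub", DISK_SIZE_GRUB)):
    disk_sb = sb_offset_090(disk_size)
    print(f"{label}: whole-disk probe at {disk_sb}, "
          f"partition superblock at {part_sb}, "
          f"collision={disk_sb == part_sb}")
```

Under these assumptions the whole-disk probe only lands on the partition's superblock when the truncated size is used. With the full size there is a 368-sector gap after the partition, which moves the whole-disk probe past it.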
* Re: Raid5 regression
2011-05-04 15:09 ` Phillip Susi
@ 2011-05-04 15:21 ` Vladimir 'φ-coder/phcoder' Serbinenko
2011-05-04 18:37 ` Phillip Susi
0 siblings, 1 reply; 14+ messages in thread
From: Vladimir 'φ-coder/phcoder' Serbinenko @ 2011-05-04 15:21 UTC (permalink / raw)
To: grub-devel
On 04.05.2011 17:09, Phillip Susi wrote:
> Something has gone wrong with the disk size detection. Grub thinks
> the size of the disk is 488396800 sectors instead of the 488397168
> that fdisk reports. This shorter size causes the partition to appear
> to run right up to the end of the disk and thus, the raid superblock
> is found both at the end of the disk and the end of the partition.
>
Does it happen with grub-fstest? Some BIOSes are known to chop away
some sectors at the end of the disk.
--
Regards
Vladimir 'φ-coder/phcoder' Serbinenko
* Re: Raid5 regression
2011-05-04 15:21 ` Vladimir 'φ-coder/phcoder' Serbinenko
@ 2011-05-04 18:37 ` Phillip Susi
2011-05-04 19:01 ` Goswin von Brederlow
2011-05-04 20:23 ` Vladimir 'φ-coder/phcoder' Serbinenko
0 siblings, 2 replies; 14+ messages in thread
From: Phillip Susi @ 2011-05-04 18:37 UTC (permalink / raw)
To: The development of GNU GRUB
Cc: Vladimir 'φ-coder/phcoder' Serbinenko
On 5/4/2011 11:21 AM, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> Does it happen with grub-fstest ? Some BIOSes are known to chop away
> some sectors in the end of disk.
That seems to be what is going on. I went back to the Maverick version
of grub and it gets no complaints, until I set debug=raid. Then I can
see the same type of messages about the duplicate detection.
Comparing the two versions of raid.c, it looks like the old version just
made the debug print when it found the duplicate superblock, and then
carried on, replacing the previously found device. The new version of
insert_array() returns a grub_error().
So the net result is that even though it always was detecting the
superblock on both the whole disk and on the partition, it used to let
the one found on the partition supersede so everything worked, but now
it keeps the one on the whole disk and so things break.
How should this conflict be resolved? I would think that the partition
should take precedence like it used to.
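The difference between the two behaviours can be illustrated with a toy model. This is not GRUB's code: the function names, the dictionary used for the array, and the device names are all hypothetical. It only mirrors what is described above: the whole disk is probed before its partition, the old code let a later find replace an earlier one, and the new code reports an error and keeps the first.

```python
# Toy model of the two duplicate-handling policies described above.
# This is NOT GRUB's code: names and data structures are hypothetical.

class DuplicateMember(Exception):
    """Raised when a second device claims an already-filled member slot."""


def insert_member(array: dict, index: int, device: str, strict: bool) -> None:
    """Record `device` as member `index` of the array."""
    if index in array and strict:
        # Newer behaviour as described: report an error, keep the first find.
        raise DuplicateMember(f"Found two disks with the index {index}")
    # Older behaviour as described: at most a debug message, later find wins.
    array[index] = device


def scan(probe_order, strict: bool):
    """Insert every probed (index, device) pair and collect any errors."""
    array, errors = {}, []
    for index, device in probe_order:
        try:
            insert_member(array, index, device, strict)
        except DuplicateMember as exc:
            errors.append(str(exc))
    return array, errors


# Whole disk probed before its partition; both expose the same superblock.
probe_order = [(0, "(hd0)"), (0, "(hd0,msdos1)")]

old_array, old_errors = scan(probe_order, strict=False)
new_array, new_errors = scan(probe_order, strict=True)
print("old:", old_array, old_errors)
print("new:", new_array, new_errors)
```

In this model the old policy ends up with the partition as the member and no error, while the strict policy keeps the whole disk and reports the duplicate, which matches the symptom reported at the start of the thread.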
* Re: Raid5 regression
2011-05-04 18:37 ` Phillip Susi
@ 2011-05-04 19:01 ` Goswin von Brederlow
2011-05-04 20:23 ` Vladimir 'φ-coder/phcoder' Serbinenko
1 sibling, 0 replies; 14+ messages in thread
From: Goswin von Brederlow @ 2011-05-04 19:01 UTC (permalink / raw)
To: The development of GNU GRUB; +Cc: Vladimir 'φ-coder/phcoder' Serbinenko
Phillip Susi <psusi@cfl.rr.com> writes:
> On 5/4/2011 11:21 AM, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>> Does it happen with grub-fstest ? Some BIOSes are known to chop away
>> some sectors in the end of disk.
>
> That seems to be what is going on. I went back to the Maverick
> version of grub and it gets no complaints, until I set debug=raid.
> Then I can see the same type of messages about the duplicate detection.
>
> Comparing the two versions of raid.c, it looks like the old version
> just made the debug print when it found the duplicate superblock, and
> then carried on, replacing the previously found device. The new
> version of insert_array() returns a grub_error().
>
> So the net result is that even though it always was detecting the
> superblock on both the whole disk and on the partition, it used to let
> the one found on the partition supersede so everything worked, but now
> it keeps the one on the whole disk and so things break.
>
> How should this conflict be resolved? I would think that the
> partition should take precedence like it used to.
Unless the raid is partitioned and then the reverse is needed.
MfG
Goswin
* Re: Raid5 regression
2011-05-04 18:37 ` Phillip Susi
2011-05-04 19:01 ` Goswin von Brederlow
@ 2011-05-04 20:23 ` Vladimir 'φ-coder/phcoder' Serbinenko
2011-05-05 14:38 ` Phillip Susi
1 sibling, 1 reply; 14+ messages in thread
From: Vladimir 'φ-coder/phcoder' Serbinenko @ 2011-05-04 20:23 UTC (permalink / raw)
To: The development of GNU GRUB
> So the net result is that even though it always was detecting the
> superblock on both the whole disk and on the partition, it used to let
> the one found on the partition supersede so everything worked, but now
> it keeps the one on the whole disk and so things break.
>
Now it breaks less than previously. Previous behaviour resulted in NULL
dereference in some cases.
> How should this conflict be resolved? I would think that the
> partition should take precedence like it used to.
>
Ideally by using 1.1 metadata. It's in the end of the device but
contains enough data to unambiguously tell if it belongs to disk or
partition.
--
Regards
Vladimir 'φ-coder/phcoder' Serbinenko
* Re: Raid5 regression
2011-05-04 20:23 ` Vladimir 'φ-coder/phcoder' Serbinenko
@ 2011-05-05 14:38 ` Phillip Susi
2011-05-05 16:57 ` Goswin von Brederlow
0 siblings, 1 reply; 14+ messages in thread
From: Phillip Susi @ 2011-05-05 14:38 UTC (permalink / raw)
To: The development of GNU GRUB
Cc: Vladimir 'φ-coder/phcoder' Serbinenko
On 5/4/2011 4:23 PM, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> Ideally by using 1.1 metadata. It's in the end of the device but
> contains enough data to unambiguously tell if it belongs to disk or
> partition.
You mean 1.0? I guess I'll have to try that.
* Re: Raid5 regression
2011-05-05 14:38 ` Phillip Susi
@ 2011-05-05 16:57 ` Goswin von Brederlow
0 siblings, 0 replies; 14+ messages in thread
From: Goswin von Brederlow @ 2011-05-05 16:57 UTC (permalink / raw)
To: The development of GNU GRUB
Phillip Susi <psusi@cfl.rr.com> writes:
> On 5/4/2011 4:23 PM, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>> Ideally by using 1.1 metadata. It's in the end of the device but
>> contains enough data to unambiguously tell if it belongs to disk or
>> partition.
>
> You mean 1.0? I guess I'll have to try that.
I just use 1.2, 4k from the start. That is never ambiguous.
MfG
Goswin
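To compare the metadata versions mentioned in the last few messages, here is a rough sketch of where each superblock is expected, assuming the placement rules used by Linux md. The disk size is the one GRUB reported earlier in the thread; the partition layout is hypothetical. The sketch only compares positions. It does not model the offset that 1.x superblocks record about themselves, which is presumably the extra data that lets end-of-device 1.0 metadata be told apart.

```python
# Sketch: expected superblock sector for each md metadata version,
# relative to the start of the device (512-byte sectors).
# Assumption: these follow the placement rules used by Linux md.

def sb_sector(version: str, size_sectors: int) -> int:
    if version == "0.90":
        return (size_sectors & ~127) - 128  # last 64 KiB-aligned 64 KiB block
    if version == "1.0":
        return (size_sectors - 16) & ~7     # >= 8 KiB from the end, 4 KiB aligned
    if version == "1.1":
        return 0                            # start of the device
    if version == "1.2":
        return 8                            # 4 KiB from the start
    raise ValueError(f"unknown metadata version: {version}")


def whole_disk_sees_partition_sb(version, disk_size, part_start, part_length):
    """True if probing the whole disk lands on the partition's superblock."""
    disk_probe = sb_sector(version, disk_size)
    part_sb = part_start + sb_sector(version, part_length)
    return disk_probe == part_sb


# Hypothetical layout: the partition starts at sector 2048 and runs to the
# end of the disk as GRUB sees it (size taken from the thread).
DISK, START = 488396800, 2048
for version in ("0.90", "1.0", "1.1", "1.2"):
    hit = whole_disk_sees_partition_sb(version, DISK, START, DISK - START)
    print(version, "position coincides" if hit else "position differs")
```

With start-of-device metadata the two probe positions differ for any partition that does not begin at sector 0, which is why 1.2 avoids the ambiguity by position alone.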