* Re: Bug#597563: grub-common: grub-probe segfaults scanning lvm devices
[not found] ` <alpine.DEB.2.00.1101072241560.25170@cheetah.fastcat.org>
@ 2011-01-08 12:41 ` Vladimir 'φ-coder/phcoder' Serbinenko
2011-01-08 22:53 ` Matthew Gabeler-Lee
0 siblings, 1 reply; 11+ messages in thread
From: Vladimir 'φ-coder/phcoder' Serbinenko @ 2011-01-08 12:41 UTC
To: Matthew Gabeler-Lee; +Cc: 597563, linux-raid
As was recommended I forward the remaining part to linux-raid mailing list.
In short: on his system mdraid, raid5, 4 devices, metadata (presumably)
0.90, two devices have index 0.
If such a situation is valid, please advise me on how it should
be handled.
@Matthew: could you supply mdadm -Q output and the *last* 64K of every disk?
On 01/08/2011 05:08 AM, Matthew Gabeler-Lee wrote:
> On Fri, 7 Jan 2011, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>
>> I believe it to be a problem with raid5. Could you try the latest
>> upstream? If problem persists I would need following dumps:
>> dd if=/dev/sd[abcd]3 of=[abcd].img bs=1024 count=64
>> dd if=/dev/md2 of=2.img bs=1024 count=64
>> grub-fstest -c 4 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3 hex -l
>> '(md2)+128' > g2.img
>
> OK, built grub from latest bzr trunk.
>
> From my past workarounds, I effectively have a list of all the
> invocations of grub-probe that grub-install/grub-setup runs on my
> system. Most of those work fine now. The only thing that isn't fine
> is that most invocations spit out a message "error: found two disks
> with the number 0" but give a correct answer and exit successfully.
>
> If I run grub-probe with enough --verbose arguments, then that message
> gets this context:
>
> grub-core/disk/raid.c:699: Scanning for RAID devices on disk hd2
> grub-core/kern/disk.c:245: Opening `hd2'...
> ./grub-probe: info: the size of hd2 is 1465149168.
> error: found two disks with the number 0.
> grub-core/kern/disk.c:330: Closing `hd2'.
>
> So, it seems maybe you're right that there's something funky with the
> raid5. The outputs you requested are attached. The grub-fstest
> invocation complained that -l is not a valid option, I hope the output
> without it is still what you want / need. I included the full output
> of one of the complaining grub-probe invocations too, on the guess
> that it might be helpful.
>
> FWIW, the raid5 array in question has had every disk swapped at least
> once in its life span, including from growing from 3 to 4 disks, and
> from smaller to larger disks, not to mention one or two disk failures
> along the way.
>
--
Regards
Vladimir 'φ-coder/phcoder' Serbinenko
* Bug#597563: grub-common: grub-probe segfaults scanning lvm devices
2011-01-08 12:41 ` Bug#597563: grub-common: grub-probe segfaults scanning lvm devices Vladimir 'φ-coder/phcoder' Serbinenko
@ 2011-01-08 22:53 ` Matthew Gabeler-Lee
2011-01-08 23:34 ` Vladimir 'φ-coder/phcoder' Serbinenko
2011-01-09 20:55 ` NeilBrown
0 siblings, 2 replies; 11+ messages in thread
From: Matthew Gabeler-Lee @ 2011-01-08 22:53 UTC
To: Vladimir 'φ-coder/phcoder' Serbinenko; +Cc: 597563, linux-raid
On Sat, 8 Jan 2011, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> As was recommended I forward the remaining part to linux-raid mailing list.
> In short: on his system mdraid, raid5, 4 devices, metadata (presumably)
> 0.90, two devices have index 0.
> If such a situation is valid, please advise me on how it should
> be handled.
> @Matthew: could you supply mdadm -Q output and the *last* 64K of every disk?
$ sudo mdadm -QD /dev/md2
/dev/md2:
Version : 0.90
Creation Time : Mon Mar 27 14:04:18 2006
Raid Level : raid5
Array Size : 2185667136 (2084.41 GiB 2238.12 GB)
Used Dev Size : 728555712 (694.80 GiB 746.04 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 2
Persistence : Superblock is persistent
Update Time : Sat Jan 8 17:40:20 2011
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
UUID : 01e2f978:88d1f867:34e1e46c:f3c01470
Events : 0.32040050
    Number   Major   Minor   RaidDevice State
       0       8       35        0      active sync   /dev/sdc3
       1       8       19        1      active sync   /dev/sdb3
       2       8        3        2      active sync   /dev/sda3
       3       8       51        3      active sync   /dev/sdd3
The actual last 64k of all the partitions in the array is all zeros.
So is the 64k up to the end of the "Used Dev Size". What has some data
in it is the 64k after that. I hope that has the superblock data I
presume you're looking for. I.e. dd if=/dev/sdX3 bs=1024
skip=728555712 count=64.
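
The offset found here fits the usual placement rule for 0.90 metadata: the
superblock sits in the last 64 KiB block of the device that starts on a
64 KiB boundary. A minimal sketch of that arithmetic follows. The function
name is made up for illustration, and the 728555782 KiB partition size is the
one fdisk reports for /dev/sdb3 later in this thread (the other three disks
are assumed to be laid out the same way).

```shell
# Offset, in KiB, of a 0.90 superblock on a device of the given size in KiB:
# round the size down to a multiple of 64 KiB, then step back one 64 KiB block.
sb090_offset_kib() {
    echo $(( ($1 & ~63) - 64 ))
}

# /dev/sdb3: 728555782 KiB ("Blocks" column of fdisk, ignoring the trailing "+")
sb090_offset_kib 728555782    # prints 728555712
```

The result equals both the "Used Dev Size" reported by mdadm and the skip=
value in the dd command above, which is why the data shows up exactly there.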
--
-Matt
"Reality is that which, when you stop believing in it, doesn't go away".
-- Philip K. Dick
GPG pubkey fingerprint: A57F B354 FD30 A502 795B 9637 3EF1 3F22 A85E 2AD1
[-- Attachment #2: Type: APPLICATION/octet-stream, Size: 465 bytes --]
* Re: Bug#597563: grub-common: grub-probe segfaults scanning lvm devices
2011-01-08 22:53 ` Matthew Gabeler-Lee
@ 2011-01-08 23:34 ` Vladimir 'φ-coder/phcoder' Serbinenko
2011-01-08 23:38 ` Matthew Gabeler-Lee
2011-01-09 20:55 ` NeilBrown
1 sibling, 1 reply; 11+ messages in thread
From: Vladimir 'φ-coder/phcoder' Serbinenko @ 2011-01-08 23:34 UTC
To: Matthew Gabeler-Lee; +Cc: 597563, linux-raid
On 01/08/2011 11:53 PM, Matthew Gabeler-Lee wrote:
> On Sat, 8 Jan 2011, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>
>> As was recommended I forward the remaining part to linux-raid mailing
>> list.
>> In short: on his system mdraid, raid5, 4 devices, metadata (presumably)
>> 0.90, two devices have index 0.
>> If such a situation is valid, please advise me on how it should
>> be handled.
>> @Matthew: could you supply mdadm -Q output and the *last* 64K of every
>> disk?
>
Sorry, I've noticed that I've been looking in the wrong place all along.
md2 is fine. I suppose it's a problem with md0 (all mdraid arrays are
assembled at the beginning). Since md0 is raid1, its misassembly wouldn't
have any effect (we don't write to devices). mdstat lists both md0 and md1 as
having no duplicate indices. I would need info on md0 for this (now
minor) remaining bug.
> $ sudo mdadm -QD /dev/md2
> /dev/md2:
> Version : 0.90
> Creation Time : Mon Mar 27 14:04:18 2006
> Raid Level : raid5
> Array Size : 2185667136 (2084.41 GiB 2238.12 GB)
> Used Dev Size : 728555712 (694.80 GiB 746.04 GB)
> Raid Devices : 4
> Total Devices : 4
> Preferred Minor : 2
> Persistence : Superblock is persistent
>
> Update Time : Sat Jan 8 17:40:20 2011
> State : clean
> Active Devices : 4
> Working Devices : 4
> Failed Devices : 0
> Spare Devices : 0
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> UUID : 01e2f978:88d1f867:34e1e46c:f3c01470
> Events : 0.32040050
>
> Number Major Minor RaidDevice State
> 0 8 35 0 active sync /dev/sdc3
> 1 8 19 1 active sync /dev/sdb3
> 2 8 3 2 active sync /dev/sda3
> 3 8 51 3 active sync /dev/sdd3
>
> The actual last 64k of all the partitions in the array is all zeros.
> So is the 64k up to the end of the "Used Dev Size". What has some
> data in it is the 64k after that. I hope that has the superblock data
> I presume you're looking for. I.e. dd if=/dev/sdX3 bs=1024
> skip=728555712 count=64.
>
--
Regards
Vladimir 'φ-coder/phcoder' Serbinenko
* Re: Bug#597563: grub-common: grub-probe segfaults scanning lvm devices
2011-01-08 23:34 ` Vladimir 'φ-coder/phcoder' Serbinenko
@ 2011-01-08 23:38 ` Matthew Gabeler-Lee
2011-01-08 23:55 ` Vladimir 'φ-coder/phcoder' Serbinenko
0 siblings, 1 reply; 11+ messages in thread
From: Matthew Gabeler-Lee @ 2011-01-08 23:38 UTC
To: Vladimir 'φ-coder/phcoder' Serbinenko; +Cc: 597563, linux-raid
On 1/8/2011 18:34, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> Sorry, I've noticed that I've been looking in the wrong place all along.
> md2 is fine. I suppose it's a problem with md0 (all mdraid arrays are
> assembled at the beginning). Since md0 is raid1, its misassembly wouldn't
> have any effect (we don't write to devices). mdstat lists both md0 and md1 as
> having no duplicate indices. I would need info on md0 for this (now
> minor) remaining bug.
$ sudo mdadm -QD /dev/md0
/dev/md0:
Version : 0.90
Creation Time : Mon Mar 27 14:03:04 2006
Raid Level : raid1
Array Size : 2008000 (1961.27 MiB 2056.19 MB)
Used Dev Size : 2008000 (1961.27 MiB 2056.19 MB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Sat Jan 8 18:35:47 2011
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
UUID : 9364f7a2:d74695d5:7d8db3a0:3b5f9e48
Events : 0.10758124
    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       2       8        1        2      active sync   /dev/sda1
       3       8       49        3      active sync   /dev/sdd1
Output of "for i in a b c d ; do sudo dd if=/dev/sd${i}1
of=sd${i}1.last64k.img skip=$((2008000)) bs=1024 count=64 ; done" is attached.
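
A quick sanity check on dumps like these is to look for the md magic number
a92b4efc at the very start of each image. 0.90 metadata is written in host
byte order, so on a little-endian machine such as this one the first four
bytes read fc 4e 2b a9. The sketch below assumes exactly that; the function
name is hypothetical.

```shell
# Succeeds if the file starts with the md superblock magic a92b4efc as it
# appears on a little-endian host, i.e. the bytes fc 4e 2b a9.
has_md_magic() {
    first4=$(dd if="$1" bs=4 count=1 2>/dev/null | od -An -tx1 | tr -d ' \t\n')
    [ "$first4" = "fc4e2ba9" ]
}

# Example, using the file names produced by the loop above:
# for i in a b c d ; do
#     has_md_magic "sd${i}1.last64k.img" && echo "sd${i}1: md magic found"
# done
```

An image that fails the check was either taken from the wrong offset or comes
from a device that carries no 0.90 superblock at that position.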
--
-Matt
"Reality is that which, when you stop believing in it, doesn't go away".
-- Philip K. Dick
GPG pubkey fingerprint: A57F B354 FD30 A502 795B 9637 3EF1 3F22 A85E 2AD1
[-- Attachment #2: last64k.md0.tar.bz2 --]
[-- Type: application/octet-stream, Size: 455 bytes --]
* Re: Bug#597563: grub-common: grub-probe segfaults scanning lvm devices
2011-01-08 23:38 ` Matthew Gabeler-Lee
@ 2011-01-08 23:55 ` Vladimir 'φ-coder/phcoder' Serbinenko
2011-01-09 3:09 ` Matthew Gabeler-Lee
0 siblings, 1 reply; 11+ messages in thread
From: Vladimir 'φ-coder/phcoder' Serbinenko @ 2011-01-08 23:55 UTC
To: Matthew Gabeler-Lee; +Cc: 597563, linux-raid
On 01/09/2011 12:38 AM, Matthew Gabeler-Lee wrote:
> On 1/8/2011 18:34, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>> Sorry, I've noticed that I've been looking in the wrong place all along.
>> md2 is fine. I suppose it's a problem with md0 (all mdraid arrays are
>> assembled at the beginning). Since md0 is raid1, its misassembly wouldn't
>> have any effect (we don't write to devices). mdstat lists both md0 and md1 as
>> having no duplicate indices. I would need info on md0 for this (now
>> minor) remaining bug.
I have a hypothesis. Does the last partition on any device extend so far
that less than 64K is left between its end and the end of the device? If so,
GRUB sees the same metadata sector both at the end of the device and at
the end of the partition. With 1.x, the metadata sector contains its own
location on the device. With 0.90 I can see no such field. Is there a way to
check for such a condition with 0.90?
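
The condition described in this hypothesis can be tested with plain
arithmetic once the disk and partition geometry are known. The sketch below
assumes the usual 0.90 placement rule (size rounded down to a multiple of
128 sectors, minus 128 sectors); the function names and the 1000000-sector
disk are invented for illustration and do not describe any device in this
thread.

```shell
# Sector, relative to the start of a device, where a 0.90 superblock would
# sit on a device of the given size (512-byte sectors; 128 sectors = 64 KiB).
sb090_sector() {
    echo $(( ($1 & ~127) - 128 ))
}

# Succeeds if a partition's superblock location is the same absolute sector
# as the whole disk's. Arguments: disk_sectors part_start part_sectors
sb090_aliases() {
    disk_sb=$(sb090_sector "$1")
    part_sb=$(( $2 + $(sb090_sector "$3") ))
    [ "$disk_sb" -eq "$part_sb" ]
}

# Hypothetical 1000000-sector disk whose last partition runs to the end.
if sb090_aliases 1000000 2048 997952; then echo "start 2048: aliases"; fi
if ! sb090_aliases 1000000 63 999937; then echo "start 63: distinct"; fi
```

With these made-up numbers the partition starting at sector 2048 shares its
superblock sector with the whole disk, while the one starting at sector 63
does not, so alignment of the partition start matters as much as the gap at
the end.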
--
Regards
Vladimir 'φ-coder/phcoder' Serbinenko
* Re: Bug#597563: grub-common: grub-probe segfaults scanning lvm devices
2011-01-08 23:55 ` Vladimir 'φ-coder/phcoder' Serbinenko
@ 2011-01-09 3:09 ` Matthew Gabeler-Lee
0 siblings, 0 replies; 11+ messages in thread
From: Matthew Gabeler-Lee @ 2011-01-09 3:09 UTC
To: Vladimir 'φ-coder/phcoder' Serbinenko; +Cc: 597563, linux-raid
On 1/8/2011 18:55, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> I have a hypothesis. Does the last partition on any device extend so far
> that less than 64K is left between its end and the end of the device? If so,
> GRUB sees the same metadata sector both at the end of the device and at
> the end of the partition. With 1.x, the metadata sector contains its own
> location on the device. With 0.90 I can see no such field. Is there a way to
> check for such a condition with 0.90?
I created the last partition on each disk running as close to the end of the
disk as fdisk would allow. Since the partitions were created on cylinder
boundaries, however, it looks like the last partition ends 2MB before the
end of the physical device. Reading the data between the end of the last
partition and the end of the disk gives all zeros for three of the four
disks, but /dev/sdb gives what looks like it might have an mdraid superblock
in it:
$ sudo fdisk -lu /dev/sdb
Disk /dev/sdb: 750.2 GB, 750156374016 bytes
255 heads, 63 sectors/track, 91201 cylinders, total 1465149168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x994d834b
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *          63     4016249     2008093+  fd  Linux raid autodetect
/dev/sdb2         4016250     8032499     2008125   fd  Linux raid autodetect
/dev/sdb3         8032500  1465144064   728555782+  fd  Linux raid autodetect
$ sudo dd if=/dev/sdb bs=512 skip=1465144064 2>/dev/null | hd
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00260000 fc 4e 2b a9 00 00 00 00 5a 00 00 00 03 00 00 00 |.N+.....Z.......|
00260010 00 00 00 00 a2 f7 64 93 e8 36 28 44 01 00 00 00 |......d..6(D....|
00260020 80 f3 0e 00 03 00 00 00 02 00 00 00 00 00 00 00 |................|
00260030 00 00 00 00 d5 95 46 d7 a0 b3 8d 7d 48 9e 5f 3b |......F....}H._;|
00260040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00260080 6f 90 65 49 01 00 00 00 02 00 00 00 03 00 00 00 |o.eI............|
00260090 00 00 00 00 01 00 00 00 eb 34 81 5b 52 25 90 00 |.........4.[R%..|
002600a0 00 00 00 00 52 25 90 00 00 00 00 00 ff ff ff ff |....R%..........|
002600b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00260200 00 00 00 00 08 00 00 00 01 00 00 00 00 00 00 00 |................|
00260210 06 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00260220 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00260280 01 00 00 00 08 00 00 00 11 00 00 00 01 00 00 00 |................|
00260290 06 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
002602a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00260300 02 00 00 00 08 00 00 00 30 00 00 00 02 00 00 00 |........0.......|
00260310 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00260f80 02 00 00 00 08 00 00 00 30 00 00 00 02 00 00 00 |........0.......|
00260f90 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
0027e000
mdadm seems to confirm this. mdadm --examine /dev/sd[acd] gives "no md
superblock", but on /dev/sdb, I get this (note that the device size at least
is bogus, since the disk itself is only 750GB):
$ sudo mdadm --examine /dev/sdb
/dev/sdb:
Magic : a92b4efc
Version : 0.90.03
UUID : 9364f7a2:d74695d5:7d8db3a0:3b5f9e48
Creation Time : Mon Mar 27 14:03:04 2006
Raid Level : raid1
Used Dev Size : 979840 (957.04 MiB 1003.36 MB)
Array Size : 979840 (957.04 MiB 1003.36 MB)
Raid Devices : 2
Total Devices : 3
Preferred Minor : 0
Update Time : Thu Jan 8 00:34:39 2009
State : clean
Active Devices : 2
Working Devices : 3
Failed Devices : 0
Spare Devices : 1
Checksum : 5b8134eb - correct
Events : 9446738
      Number   Major   Minor   RaidDevice State
this     2       8       48        2      spare   /dev/sdd
   0     0       8        1        0      active sync   /dev/sda1
   1     1       8       17        1      active sync   /dev/sdb1
   2     2       8       48        2      spare   /dev/sdd
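
The first few words of the hex dump can be decoded by hand and compared with
this mdadm output. The sketch below assumes the field order of the kernel's
0.90 superblock structure (magic, major, minor, patch level, gvalid_words,
set_uuid0, ctime, level, size, nr_disks, raid_disks, ...) and little-endian
32-bit words; the helper name le32 is made up for illustration.

```shell
# Reassemble four bytes, given in on-disk (little-endian) order, into one
# hexadecimal word.
le32() {
    echo "$4$3$2$1"
}

# Bytes copied from the dump at offsets 0x260000 to 0x26002b.
echo "magic:      $(le32 fc 4e 2b a9)"
echo "minor ver:  $(( 0x$(le32 5a 00 00 00) ))"
echo "set_uuid0:  $(le32 a2 f7 64 93)"
echo "level:      $(( 0x$(le32 01 00 00 00) ))"
echo "size (KiB): $(( 0x$(le32 80 f3 0e 00) ))"
echo "nr_disks:   $(( 0x$(le32 03 00 00 00) ))"
echo "raid_disks: $(( 0x$(le32 02 00 00 00) ))"
```

The decoded values (magic a92b4efc, minor version 90, first UUID word
9364f7a2, raid level 1, size 979840, 3 total and 2 raid devices) agree with
what mdadm --examine prints for /dev/sdb.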
--
-Matt
"Reality is that which, when you stop believing in it, doesn't go away".
-- Philip K. Dick
GPG pubkey fingerprint: A57F B354 FD30 A502 795B 9637 3EF1 3F22 A85E 2AD1
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Bug#597563: grub-common: grub-probe segfaults scanning lvm devices
2011-01-08 22:53 ` Matthew Gabeler-Lee
2011-01-08 23:34 ` Vladimir 'φ-coder/phcoder' Serbinenko
@ 2011-01-09 20:55 ` NeilBrown
2011-01-09 21:32 ` Vladimir 'φ-coder/phcoder' Serbinenko
1 sibling, 1 reply; 11+ messages in thread
From: NeilBrown @ 2011-01-09 20:55 UTC
To: Matthew Gabeler-Lee
Cc: Vladimir 'φ-coder/phcoder' Serbinenko, 597563,
linux-raid
On Sat, 8 Jan 2011 17:53:07 -0500 (EST) Matthew Gabeler-Lee
<cheetah@cheetah.fastcat.org> wrote:
> On Sat, 8 Jan 2011, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>
> > As was recommended I forward the remaining part to linux-raid mailing list.
> > In short: on his system mdraid, raid5, 4 devices, metadata (presumably)
> > 0.90, two devices have index 0.
What do you mean by "two devices have index 0" ??? I could see nothing in any
of the posts you sent that could be interpreted that way.
NeilBrown
* Re: Bug#597563: grub-common: grub-probe segfaults scanning lvm devices
2011-01-09 20:55 ` NeilBrown
@ 2011-01-09 21:32 ` Vladimir 'φ-coder/phcoder' Serbinenko
2011-01-09 21:57 ` NeilBrown
0 siblings, 1 reply; 11+ messages in thread
From: Vladimir 'φ-coder/phcoder' Serbinenko @ 2011-01-09 21:32 UTC
To: NeilBrown; +Cc: Matthew Gabeler-Lee, 597563, linux-raid
On 01/09/2011 09:55 PM, NeilBrown wrote:
> On Sat, 8 Jan 2011 17:53:07 -0500 (EST) Matthew Gabeler-Lee
> <cheetah@cheetah.fastcat.org> wrote:
>
>
>> On Sat, 8 Jan 2011, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>>
>>
>>> As was recommended I forward the remaining part to linux-raid mailing list.
>>> In short: on his system mdraid, raid5, 4 devices, metadata (presumably)
>>> 0.90, two devices have index 0.
>>>
> What do you mean by "two devices have index 0" ??? I could see nothing in any
> of the posts you sent that could be interpreted that way.
>
>
Sorry, I forgot this part:
grub-core/disk/raid.c:699: Scanning for RAID devices on disk hd2
grub-core/kern/disk.c:245: Opening `hd2'...
./grub-probe: info: the size of hd2 is 1465149168.
error: found two disks with the number 0.
grub-core/kern/disk.c:330: Closing `hd2'.
The trouble comes from the following part:
$ sudo mdadm --examine /dev/sdb
/dev/sdb:
Magic : a92b4efc
Version : 0.90.03
UUID : 9364f7a2:d74695d5:7d8db3a0:3b5f9e48
Creation Time : Mon Mar 27 14:03:04 2006
Raid Level : raid1
Used Dev Size : 979840 (957.04 MiB 1003.36 MB)
Array Size : 979840 (957.04 MiB 1003.36 MB)
Raid Devices : 2
Total Devices : 3
Preferred Minor : 0
Update Time : Thu Jan 8 00:34:39 2009
State : clean
Active Devices : 2
Working Devices : 3
Failed Devices : 0
Spare Devices : 1
Checksum : 5b8134eb - correct
Events : 9446738
So sdb as a whole pretends to be part of the following array:
$ sudo mdadm -QD /dev/md0
/dev/md0:
Version : 0.90
Creation Time : Mon Mar 27 14:03:04 2006
Raid Level : raid1
Array Size : 2008000 (1961.27 MiB 2056.19 MB)
Used Dev Size : 2008000 (1961.27 MiB 2056.19 MB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Sat Jan 8 18:35:47 2011
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
UUID : 9364f7a2:d74695d5:7d8db3a0:3b5f9e48
Events : 0.10758124
    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       2       8        1        2      active sync   /dev/sda1
       3       8       49        3      active sync   /dev/sdd1
As you can see, there is a stale superblock approximately 2 years old.
I don't know if it's some kind of freak accident or operator error. If
it's the latter, then zero-filling over the stale superblock will probably
solve the problem.
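
The numbers posted earlier in the thread are consistent with this reading.
Assuming the usual 0.90 placement rule (device size rounded down to a
multiple of 128 sectors, minus 128 sectors), the location where a superblock
for the whole of /dev/sdb would be expected is exactly where the hex dump
shows one. The function name below is made up for illustration; the two
input values are copied from the fdisk output and the dd command in
Matthew's message.

```shell
# Sector where a 0.90 superblock would sit on a device of the given size
# (512-byte sectors; 128 sectors = 64 KiB).
sb090_sector() {
    echo $(( ($1 & ~127) - 128 ))
}

disk_sectors=1465149168    # "total ... sectors" from fdisk -lu /dev/sdb
dump_start=1465144064      # skip= value used for the dd | hd dump

sb=$(sb090_sector "$disk_sectors")
echo "whole-disk superblock sector: $sb"
printf 'offset inside the dump: 0x%08x\n' $(( ($sb - $dump_start) * 512 ))
```

This prints sector 1465148928 and offset 0x00260000, the offset at which the
magic bytes fc 4e 2b a9 appear in the dump.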
--
Regards
Vladimir 'φ-coder/phcoder' Serbinenko
* Re: Bug#597563: grub-common: grub-probe segfaults scanning lvm devices
2011-01-09 21:32 ` Vladimir 'φ-coder/phcoder' Serbinenko
@ 2011-01-09 21:57 ` NeilBrown
2011-01-09 22:13 ` Matthew Gabeler-Lee
0 siblings, 1 reply; 11+ messages in thread
From: NeilBrown @ 2011-01-09 21:57 UTC
To: Vladimir 'φ-coder/phcoder' Serbinenko
Cc: Matthew Gabeler-Lee, 597563, linux-raid
On Sun, 09 Jan 2011 22:32:01 +0100 Vladimir 'φ-coder/phcoder' Serbinenko
<phcoder@gmail.com> wrote:
> On 01/09/2011 09:55 PM, NeilBrown wrote:
> > On Sat, 8 Jan 2011 17:53:07 -0500 (EST) Matthew Gabeler-Lee
> > <cheetah@cheetah.fastcat.org> wrote:
> >
> >
> >> On Sat, 8 Jan 2011, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> >>
> >>
> >>> As was recommended I forward the remaining part to linux-raid mailing list.
> >>> In short: on his system mdraid, raid5, 4 devices, metadata (presumably)
> >>> 0.90, two devices have index 0.
> >>>
> > What do you mean by "two devices have index 0" ??? I could see nothing in any
> > of the posts you sent that could be interpreted that way.
> >
> >
> Sorry, I forgot this part:
> grub-core/disk/raid.c:699: Scanning for RAID devices on disk hd2
> grub-core/kern/disk.c:245: Opening `hd2'...
> ./grub-probe: info: the size of hd2 is 1465149168.
> error: found two disks with the number 0.
> grub-core/kern/disk.c:330: Closing `hd2'.
>
> The trouble comes from the following part:
> $ sudo mdadm --examine /dev/sdb
> /dev/sdb:
> Magic : a92b4efc
> Version : 0.90.03
> UUID : 9364f7a2:d74695d5:7d8db3a0:3b5f9e48
> Creation Time : Mon Mar 27 14:03:04 2006
> Raid Level : raid1
> Used Dev Size : 979840 (957.04 MiB 1003.36 MB)
> Array Size : 979840 (957.04 MiB 1003.36 MB)
> Raid Devices : 2
> Total Devices : 3
> Preferred Minor : 0
>
> Update Time : Thu Jan 8 00:34:39 2009
> State : clean
> Active Devices : 2
> Working Devices : 3
> Failed Devices : 0
> Spare Devices : 1
> Checksum : 5b8134eb - correct
> Events : 9446738
> So sdb as a whole pretends to be part of the following array:
> $ sudo mdadm -QD /dev/md0
> /dev/md0:
> Version : 0.90
> Creation Time : Mon Mar 27 14:03:04 2006
> Raid Level : raid1
> Array Size : 2008000 (1961.27 MiB 2056.19 MB)
> Used Dev Size : 2008000 (1961.27 MiB 2056.19 MB)
> Raid Devices : 4
> Total Devices : 4
> Preferred Minor : 0
> Persistence : Superblock is persistent
>
> Update Time : Sat Jan 8 18:35:47 2011
> State : clean
> Active Devices : 4
> Working Devices : 4
> Failed Devices : 0
> Spare Devices : 0
>
> UUID : 9364f7a2:d74695d5:7d8db3a0:3b5f9e48
> Events : 0.10758124
>
> Number Major Minor RaidDevice State
> 0 8 17 0 active sync /dev/sdb1
> 1 8 33 1 active sync /dev/sdc1
> 2 8 1 2 active sync /dev/sda1
> 3 8 49 3 active sync /dev/sdd1
>
>
> As you can see, there is a stale superblock approximately 2 years old.
> I don't know if it's some kind of freak accident or operator error. If
> it's the latter, then zero-filling over the stale superblock will probably
> solve the problem.
>
Simply running
mdadm --zero-superblock /dev/sdb
should fix it.
NeilBrown
* Re: Bug#597563: grub-common: grub-probe segfaults scanning lvm devices
2011-01-09 21:57 ` NeilBrown
@ 2011-01-09 22:13 ` Matthew Gabeler-Lee
2011-01-09 22:29 ` NeilBrown
0 siblings, 1 reply; 11+ messages in thread
From: Matthew Gabeler-Lee @ 2011-01-09 22:13 UTC
To: NeilBrown
Cc: Vladimir 'φ-coder/phcoder' Serbinenko, 597563,
linux-raid
On 1/9/2011 16:57, NeilBrown wrote:
> Simply running
> mdadm --zero-superblock /dev/sdb
>
> should fix it.
Well, that doesn't work very well: "mdadm: Couldn't open /dev/sdb for write
- not zeroing" ... strace reveals that mdadm is trying to open it O_EXCL,
which I presume is why it's not working ... I presume I'd have to reboot to
single user mode and stop the LVM and possibly MD stuff in order for that to
work, which might then require booting from a rescue cd to do it.
So I backed up the contents of the end of the disk in case I screwed up, and
then zeroed it with dd (nervous nervous). I double-checked things with
mdadm --examine to make sure I had cleared the stray superblock and not
damaged the one in sdb3, and that looks OK.
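
The exact dd commands used here were not posted. As a rough sketch only, the
helper below prints (it does not run) one backup command and one zeroing
command for the 4 KiB superblock at the whole-disk 0.90 location. The
function names and the backup file name are hypothetical, the placement rule
is the assumed 0.90 one, and anyone adapting this should verify the sector
against a dump of their own disk before writing anything.

```shell
# Sector where a 0.90 superblock would sit on a device of the given size
# (512-byte sectors; 128 sectors = 64 KiB).
sb090_sector() {
    echo $(( ($1 & ~127) - 128 ))
}

# Print, without executing, a backup and a zeroing command for the 8 sectors
# (4 KiB) holding the superblock. Arguments: device disk_sectors
print_sb090_cmds() {
    sb=$(sb090_sector "$2")
    echo "dd if=$1 of=sb090-backup.img bs=512 skip=$sb count=8"
    echo "dd if=/dev/zero of=$1 bs=512 seek=$sb count=8"
}

# Size taken from the fdisk output for /dev/sdb earlier in the thread.
print_sb090_cmds /dev/sdb 1465149168
```

Printing the commands first leaves room to compare the computed sector with
the hex dump and with the end of the last partition before anything is
overwritten.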
After doing that, the version of grub-probe that was crashing before appears
to work properly, and the trunk version of grub-probe no longer spits out
the warning/error. I then upgraded the debian package to the latest version
in testing (since I'd been using an old version where I could work around
the problems), and let it run the grub-install on all 4 disks, and that
proceeded without errors. Hooray :)
Thank you folks for your help solving this!
--
-Matt
"Reality is that which, when you stop believing in it, doesn't go away".
-- Philip K. Dick
GPG pubkey fingerprint: A57F B354 FD30 A502 795B 9637 3EF1 3F22 A85E 2AD1
* Re: Bug#597563: grub-common: grub-probe segfaults scanning lvm devices
2011-01-09 22:13 ` Matthew Gabeler-Lee
@ 2011-01-09 22:29 ` NeilBrown
0 siblings, 0 replies; 11+ messages in thread
From: NeilBrown @ 2011-01-09 22:29 UTC
To: Matthew Gabeler-Lee
Cc: Vladimir 'φ-coder/phcoder' Serbinenko, 597563,
linux-raid
On Sun, 09 Jan 2011 17:13:10 -0500 Matthew Gabeler-Lee <cheetah@fastcat.org>
wrote:
> On 1/9/2011 16:57, NeilBrown wrote:
> > Simply running
> > mdadm --zero-superblock /dev/sdb
> >
> > should fix it.
> Well, that doesn't work very well: "mdadm: Couldn't open /dev/sdb for write
> - not zeroing" ... strace reveals that mdadm is trying to open it O_EXCL,
> which I presume is why it's not working ... I presume I'd have to reboot to
> single user mode and stop the LVM and possibly MD stuff in order for that to
> work, which might then require booting from a rescue cd to do it.
Sorry, I forgot that the device would be in use.
In that case
mdadm --zero-superblock --force /dev/sdb
would have done the trick. But you found another way which worked just as
well.
NeilBrown