From: EJ Vincent <ej@ejane.org>
To: linux-raid@vger.kernel.org
Subject: Re: Upgrade from Ubuntu 10.04 to 12.04 broken raid6.
Date: Mon, 01 Oct 2012 13:14:26 -0400
Message-ID: <5069CF72.6050906@ejane.org>
In-Reply-To: <50698F32.1080001@turmel.org>
On 10/1/2012 8:40 AM, Phil Turmel wrote:
> Hi EJ,
>
> On 09/30/2012 07:23 PM, EJ Vincent wrote:
>> On 9/30/2012 4:28 PM, Phil Turmel wrote:
>>> Do you have *any* dmesg output from the old system? Or dmesg from the
>>> very first boot under 12.04? That might have enough information to
>>> shorten your search.
>>>
>>> In the future, you should record your setup by saving the output of
>>> "mdadm -D" on each array, "mdadm -E" on each member device, and the
>>> output of "ls -l /dev/disk/by-id/"
>>>
>>> Or try my documentation script "lsdrv". [1]
>>>
>>> HTH,
>>>
>>> Phil
>>>
>>> [1] http://github.com/pturmel/lsdrv
>> Hi Phil,
>>
>> Unfortunately I don't have any dmesg log from the old system or the
>> first boot under 12.04.
>>
>> Getting my system to boot at all under 12.04 was chaotic enough, with
>> the overly-aggressive /usr/share/initramfs-tools/scripts/mdadm-functions
>> ravaging my array and then dropping me to a busybox shell over and over
>> again. I didn't think to record the very first error.
> I'm not prepared to condemn the 12.04 initramfs--I really don't think it
> is a factor in this crisis. The critical part is the degraded reboot bug.
>
>> Here's an observation of mine, disks: /dev/sdb1, /dev/sdi1, and
>> /dev/sdj1 don't have the Raid level "-unknown-", neither are they
>> labeled as spares. They are in fact, labeled clean and appear
>> *different* from the others.
>>
>> Could these disks still contain my metadata from 10.04? I recall during
>> my installation of 12.04 I had anywhere from 1 to 3 disks unpowered, so
>> that I could drop in a SATA CD/DVDRW into the slot.
> Leaving disks unpowered sounds like a key factor in your crisis. Raid6
> can't operate with more than two missing, and won't assemble if any disk
> disappears between shutdown and the next boot. (Must be forced.)
>
> So your array would only partially assemble under 12.04 due to
> deliberately missing drives, then you rebooted with a kernel that has a
> problem with that scenario.
>
> The disks very likely do have useful metadata, but no disk has all of
> it. It might reduce the permutations you need to try. If you share
> more information about your system layout, some educated first guesses
> might be possible, too. The output of "mdadm -E" for every drive, and
> lsdrv for an overview.
>
>> I am downloading 10.04.4 LTS and will be ready to use it soon. I fear
>> having to do permutations-- 9! (factorial) would mean 362,880
>> combinations. *gasp*
> Phil
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Phil,
Here's the information you requested.
The server has 10 disks, a dedicated 500GB disk for the operating system
(which Ubuntu 10.04.4 has labeled /dev/sdd), and 9 x 2TB disks
(/dev/sd[a,b,c,e,f,g,h,i,j]):
Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
Disk /dev/sdd: 500.1 GB, 500107862016 bytes
Disk /dev/sde: 2000.4 GB, 2000398934016 bytes
Disk /dev/sdf: 2000.4 GB, 2000398934016 bytes
Disk /dev/sdg: 2000.4 GB, 2000398934016 bytes
Disk /dev/sdh: 2000.4 GB, 2000398934016 bytes
Disk /dev/sdi: 2000.4 GB, 2000398934016 bytes
Disk /dev/sdj: 2000.4 GB, 2000398934016 bytes
The devices are spread amongst an on-board SATA controller, MCP78S
GeForce AHCI, and two SiI 3124 PCI-X SATA controllers.
The layout is as follows: 5 disks are attached to the on-board
controller, 3 attached to one SiI 3124 controller, and 2 attached to the
other SiI 3124 controller.
I've loaded your lsdrv script. Here are the results:
PCI [pata_amd] 00:06.0 IDE interface: nVidia Corporation MCP78S [GeForce 8200] IDE (rev a1)
  scsi 0:x:x:x [Empty]
  scsi 1:x:x:x [Empty]
PCI [sata_sil24] 06:04.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)
  scsi 2:0:0:0 ATA ST2000DL003-9VT1
    sda 1.82t [8:0] Empty/Unknown
      sda1 1.82t [8:1] Empty/Unknown
  scsi 5:0:0:0 ATA ST2000DL003-9VT1
    sdb 1.82t [8:16] Empty/Unknown
      sdb1 1.82t [8:17] Empty/Unknown
  scsi 7:0:0:0 ATA ST2000DL003-9VT1
    sdc 1.82t [8:32] Empty/Unknown
      sdc1 1.82t [8:33] Empty/Unknown
  scsi 9:x:x:x [Empty]
PCI [ahci] 00:09.0 SATA controller: nVidia Corporation MCP78S [GeForce 8200] AHCI Controller (rev a2)
  scsi 3:0:0:0 ATA WDC WD5000AAKS-2
    sdd 465.76g [8:48] Empty/Unknown
      sdd1 237.00m [8:49] Empty/Unknown
        Mounted as /dev/sdd1 @ /boot
      sdd2 3.73g [8:50] Empty/Unknown
      sdd3 23.28g [8:51] Empty/Unknown
        Mounted as /dev/disk/by-uuid/65a128d3-3e2e-487a-a36b-11cbe5530429 @ /
      sdd4 438.52g [8:52] Empty/Unknown
  scsi 4:0:0:0 ATA ST2000DL003-9VT1
    sde 1.82t [8:64] Empty/Unknown
      sde1 1.82t [8:65] Empty/Unknown
  scsi 6:0:0:0 ATA ST32000542AS
    sdf 1.82t [8:80] Empty/Unknown
      sdf1 1.82t [8:81] Empty/Unknown
  scsi 8:0:0:0 ATA ST32000542AS
    sdg 1.82t [8:96] Empty/Unknown
      sdg1 1.82t [8:97] Empty/Unknown
  scsi 10:0:0:0 ATA ST2000DL003-9VT1
    sdh 1.82t [8:112] Empty/Unknown
      sdh1 1.82t [8:113] Empty/Unknown
  scsi 11:x:x:x [Empty]
PCI [sata_sil24] 08:04.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)
  scsi 12:0:0:0 ATA ST2000DL003-9VT1
    sdi 1.82t [8:128] Empty/Unknown
      sdi1 1.82t [8:129] Empty/Unknown
  scsi 13:0:0:0 ATA ST2000DL003-9VT1
    sdj 1.82t [8:144] Empty/Unknown
      sdj1 1.82t [8:145] Empty/Unknown
  scsi 14:x:x:x [Empty]
  scsi 15:x:x:x [Empty]
Here is what mdadm -E looks like for each member of the array, now under
Ubuntu 10.04.4:
# mdadm -E /dev/sda1
/dev/sda1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 321fc20c:997e9a1a:bb67ffde:9de489f5
Name : ruby:6 (local to host ruby)
Creation Time : Mon Apr 11 15:40:25 2011
Raid Level : -unknown-
Raid Devices : 0
Avail Dev Size : 3907026672 (1863.02 GiB 2000.40 GB)
Data Offset : 272 sectors
Super Offset : 8 sectors
State : active
Device UUID : 6190765b:200ff748:d50a75e3:597405c4
Update Time : Sun Sep 30 19:13:16 2012
Checksum : 37454049 - correct
Events : 1
Array Slot : 4 (empty, empty, failed, failed, empty, failed, empty,
failed, empty, failed, failed, empty, failed... <shortened for readability>)
Array State : 378 failed
# mdadm -E /dev/sdb1
/dev/sdb1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 321fc20c:997e9a1a:bb67ffde:9de489f5
Name : ruby:6 (local to host ruby)
Creation Time : Mon Apr 11 15:40:25 2011
Raid Level : -unknown-
Raid Devices : 0
Avail Dev Size : 3907026672 (1863.02 GiB 2000.40 GB)
Data Offset : 272 sectors
Super Offset : 8 sectors
State : active
Device UUID : 7d707598:a8881376:531ae0c6:aac82909
Update Time : Sun Sep 30 19:13:16 2012
Checksum : c9effdc2 - correct
Events : 1
Array Slot : 11 (empty, empty, failed, failed, empty, failed,
empty, failed, empty, failed, failed, empty, failed... <shortened for
readability>)
Array State : 378 failed
# mdadm -E /dev/sdc1
/dev/sdc1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 321fc20c:997e9a1a:bb67ffde:9de489f5
Name : ruby:6 (local to host ruby)
Creation Time : Mon Apr 11 15:40:25 2011
Raid Level : raid6
Raid Devices : 9
Avail Dev Size : 3907026672 (1863.02 GiB 2000.40 GB)
Array Size : 27349181440 (13041.11 GiB 14002.78 GB)
Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
Data Offset : 272 sectors
Super Offset : 8 sectors
State : clean
Device UUID : a6fd99b2:7bb75287:5d844ec5:822b6d8a
Update Time : Sun Sep 30 00:34:27 2012
Checksum : 760485cb - correct
Events : 2474296
Chunk Size : 512K
Array Slot : 7 (0, 1, failed, failed, 2, failed, 4, 5, 6, 7, 8, 3)
Array State : uuuuuUuuu 3 failed
# mdadm -E /dev/sde1
/dev/sde1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 321fc20c:997e9a1a:bb67ffde:9de489f5
Name : ruby:6 (local to host ruby)
Creation Time : Mon Apr 11 15:40:25 2011
Raid Level : -unknown-
Raid Devices : 0
Avail Dev Size : 3907026672 (1863.02 GiB 2000.40 GB)
Data Offset : 272 sectors
Super Offset : 8 sectors
State : active
Device UUID : 179691a0:fd201c2d:49c73803:409a0a9c
Update Time : Sun Sep 30 19:13:16 2012
Checksum : 584e3a3a - correct
Events : 1
Array Slot : 8 (empty, empty, failed, failed, empty, failed, empty,
failed, empty, failed, failed, empty, failed... <shortened for readability>)
Array State : 378 failed
# mdadm -E /dev/sdf1
/dev/sdf1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 321fc20c:997e9a1a:bb67ffde:9de489f5
Name : ruby:6 (local to host ruby)
Creation Time : Mon Apr 11 15:40:25 2011
Raid Level : -unknown-
Raid Devices : 0
Avail Dev Size : 3907026672 (1863.02 GiB 2000.40 GB)
Data Offset : 272 sectors
Super Offset : 8 sectors
State : active
Device UUID : f3f72549:8543972f:1f4a655d:fa9416bd
Update Time : Sun Sep 30 19:13:16 2012
Checksum : 7e963c27 - correct
Events : 1
Array Slot : 1 (empty, empty, failed, failed, empty, failed, empty,
failed, empty, failed, failed, empty, failed... <shortened for readability>)
Array State : 378 failed
# mdadm -E /dev/sdg1
/dev/sdg1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 321fc20c:997e9a1a:bb67ffde:9de489f5
Name : ruby:6 (local to host ruby)
Creation Time : Mon Apr 11 15:40:25 2011
Raid Level : -unknown-
Raid Devices : 0
Avail Dev Size : 3907026672 (1863.02 GiB 2000.40 GB)
Data Offset : 272 sectors
Super Offset : 8 sectors
State : active
Device UUID : 9c908e4b:ad7d8af8:ff5d2ab6:50b013e5
Update Time : Sun Sep 30 19:13:16 2012
Checksum : cab43e2e - correct
Events : 1
Array Slot : 0 (empty, empty, failed, failed, empty, failed, empty,
failed, empty, failed, failed, empty, failed... <shortened for readability>)
Array State : 378 failed
# mdadm -E /dev/sdh1
/dev/sdh1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 321fc20c:997e9a1a:bb67ffde:9de489f5
Name : ruby:6 (local to host ruby)
Creation Time : Mon Apr 11 15:40:25 2011
Raid Level : -unknown-
Raid Devices : 0
Avail Dev Size : 3907026672 (1863.02 GiB 2000.40 GB)
Data Offset : 272 sectors
Super Offset : 8 sectors
State : active
Device UUID : 321368f6:9f38bc16:76f787c3:4b3d398d
Update Time : Sun Sep 30 19:13:16 2012
Checksum : 4942a22e - correct
Events : 1
Array Slot : 6 (empty, empty, failed, failed, empty, failed, empty,
failed, empty, failed, failed, empty, failed... <shortened for readability>)
Array State : 378 failed
# mdadm -E /dev/sdi1
/dev/sdi1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 321fc20c:997e9a1a:bb67ffde:9de489f5
Name : ruby:6 (local to host ruby)
Creation Time : Mon Apr 11 15:40:25 2011
Raid Level : raid6
Raid Devices : 9
Avail Dev Size : 3907026672 (1863.02 GiB 2000.40 GB)
Array Size : 27349181440 (13041.11 GiB 14002.78 GB)
Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
Data Offset : 272 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 9d53248b:1db27ffc:a2a511c3:7176a7eb
Update Time : Sun Sep 30 00:34:27 2012
Checksum : 22b9429c - correct
Events : 2474296
Chunk Size : 512K
Array Slot : 10 (0, 1, failed, failed, 2, failed, 4, 5, 6, 7, 8, 3)
Array State : uuuuuuuuU 3 failed
# mdadm -E /dev/sdj1
/dev/sdj1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 321fc20c:997e9a1a:bb67ffde:9de489f5
Name : ruby:6 (local to host ruby)
Creation Time : Mon Apr 11 15:40:25 2011
Raid Level : raid6
Raid Devices : 9
Avail Dev Size : 3907026672 (1863.02 GiB 2000.40 GB)
Array Size : 27349181440 (13041.11 GiB 14002.78 GB)
Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
Data Offset : 272 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 880ed7fb:b9c673de:929d14c5:53f9b81d
Update Time : Sun Sep 30 00:34:27 2012
Checksum : a9748cf3 - correct
Events : 2474296
Chunk Size : 512K
Array Slot : 9 (0, 1, failed, failed, 2, failed, 4, 5, 6, 7, 8, 3)
Array State : uuuuuuuUu 3 failed
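To put a number on how much the three clean superblocks could narrow the
search, here is a rough Python sketch. It assumes that the "Array Slot"
number printed by this mdadm version is an index into the parenthesised
list that follows it, and that the entry at that index is the device's
role in the array. That reading agrees with the capital "U" position in
"Array State" for sdc1, sdi1, and sdj1, but it is my interpretation, not
something I have confirmed. The second mapping is only a guess: it
assumes the slot numbers on the six damaged superblocks survived
unchanged, which may well not be true. The function and variable names
are made up for this example.

```python
from math import factorial

# Slot list as printed on the three clean members (sdc1, sdi1, sdj1).
SLOT_LIST = "0, 1, failed, failed, 2, failed, 4, 5, 6, 7, 8, 3"

# "Array Slot" numbers copied from the mdadm -E output above.
CLEAN_SLOTS = {"sdc1": 7, "sdi1": 10, "sdj1": 9}
DAMAGED_SLOTS = {"sda1": 4, "sdb1": 11, "sde1": 8,
                 "sdf1": 1, "sdg1": 0, "sdh1": 6}


def roles_from_slots(slot_list, slots):
    """Map each device to the role found at its slot index (None if failed/empty)."""
    entries = [e.strip() for e in slot_list.split(",")]
    roles = {}
    for dev, slot in slots.items():
        entry = entries[slot]
        roles[dev] = int(entry) if entry.isdigit() else None
    return roles


# Roles backed by intact raid6 superblocks.
known = roles_from_slots(SLOT_LIST, CLEAN_SLOTS)

# ASSUMPTION: damaged superblocks kept their original slot numbers.
guess = roles_from_slots(SLOT_LIST, DAMAGED_SLOTS)

# Orderings left to try if only the three known roles are trusted.
remaining = factorial(9 - len(known))

print("known:", known)
print("guess:", guess)
print("orderings left:", remaining)
```

If only the three clean members are trusted, the search drops from 9! to
6! orderings. If the guessed mapping also holds, it covers every role
from 0 to 8 exactly once, which would make it a sensible first order to
test before trying anything else.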
I'd also be happy to supply a dump of 'lshw', which I believe is similar
to 'lsdrv', if that would be useful to you. The system is back on
10.04.4 LTS, and is using mdadm version 2.6.7.1.
Thanks for your continued input and assistance. Much appreciated.
-EJ