From: AndyLiebman@aol.com
To: linux-raid@vger.kernel.org
Subject: Linux Raid confused about one drive and two arrays
Date: Thu, 22 Jan 2004 09:34:43 EST
Message-ID: <6a.3aa594bc.2d413983@aol.com>
I have just encountered a very disturbing RAID problem. I hope somebody
understands what happened and can tell me how to fix it.
I have two RAID 5 arrays on my Linux machine -- md4 and md6. Each array
consists of 5 FireWire (1394a) drives -- one partition on each drive, 10 drives in
total. Because the device IDs on these drives can change, I always use mdadm
to create and manage my arrays based on UUIDs. I am using mdadm 1.3 on Mandrake
9.2 with Mandrake's 2.4.22-21 kernel.
After running these arrays successfully for two months -- rebooting my file
server every day -- one of my arrays came up in a degraded mode. It looks as if
the Linux RAID subsystem "thinks" one of my drives belongs to both arrays.
As you can see below, when I run mdadm -E on each of my ten FireWire drives,
mdadm tells me that for each of the drives in the md4 array (UUID group
62d8b91d:a2368783:6a78ca50:5793492f) there are 5 RAID devices and 6 total
devices with one failed. However, this array has only ever had 5 devices.
On the other hand, for most of the drives in the md6 array (UUID group
57f26496:25520b96:41757b62:f83fcb7b), mdadm tells me that there are 5 RAID
devices and 5 total devices with one failed.
However, when I run mdadm -E on the drive currently identified as /dev/sdh1
-- which also belongs to md6, the UUID group
57f26496:25520b96:41757b62:f83fcb7b -- mdadm tells me that sdh1 is part of an
array with 6 total devices, 5 RAID devices, one failed.
/dev/sdh1 is identified as device number 3 in the RAID with the UUID
57f26496:25520b96:41757b62:f83fcb7b. However, when I run mdadm -E on the other 4
drives that belong to md6, mdadm tells me that device number 3 is faulty.
My questions are:
How do I fix this problem?
Why did it occur?
How can I prevent it from occurring again?
Hope somebody can answer these questions today.
Here is all the output from starting up my arrays and running mdadm:
[root@localhost avidserver]# mdadm -Av /dev/md4 --uuid=62d8b91d:a2368783:6a78ca50:5793492f /dev/sd*
mdadm: looking for devices for /dev/md4
mdadm: /dev/sd is not a block device.
mdadm: /dev/sd has wrong uuid.
mdadm: no RAID superblock on /dev/sda
mdadm: /dev/sda has wrong uuid.
mdadm: /dev/sda1 is identified as a member of /dev/md4, slot 0.
mdadm: no RAID superblock on /dev/sdb
mdadm: /dev/sdb has wrong uuid.
mdadm: /dev/sdb1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdc
mdadm: /dev/sdc has wrong uuid.
mdadm: /dev/sdc1 is identified as a member of /dev/md4, slot 1.
mdadm: no RAID superblock on /dev/sdd
mdadm: /dev/sdd has wrong uuid.
mdadm: /dev/sdd1 has wrong uuid.
mdadm: no RAID superblock on /dev/sde
mdadm: /dev/sde has wrong uuid.
mdadm: /dev/sde1 is identified as a member of /dev/md4, slot 3.
mdadm: no RAID superblock on /dev/sdf
mdadm: /dev/sdf has wrong uuid.
mdadm: /dev/sdf1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdg
mdadm: /dev/sdg has wrong uuid.
mdadm: /dev/sdg1 is identified as a member of /dev/md4, slot 4.
mdadm: no RAID superblock on /dev/sdh
mdadm: /dev/sdh has wrong uuid.
mdadm: /dev/sdh1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdi
mdadm: /dev/sdi has wrong uuid.
mdadm: /dev/sdi1 is identified as a member of /dev/md4, slot 2.
mdadm: no RAID superblock on /dev/sdj
mdadm: /dev/sdj has wrong uuid.
mdadm: /dev/sdj1 has wrong uuid.
mdadm: added /dev/sdc1 to /dev/md4 as 1
mdadm: added /dev/sdi1 to /dev/md4 as 2
mdadm: added /dev/sde1 to /dev/md4 as 3
mdadm: added /dev/sdg1 to /dev/md4 as 4
mdadm: added /dev/sda1 to /dev/md4 as 0
mdadm: /dev/md4 has been started with 5 drives.
[root@localhost avidserver]# mdadm -Av /dev/md6 --uuid=57f26496:25520b96:41757b62:f83fcb7b /dev/sd*
mdadm: looking for devices for /dev/md6
mdadm: /dev/sd is not a block device.
mdadm: /dev/sd has wrong uuid.
mdadm: no RAID superblock on /dev/sda
mdadm: /dev/sda has wrong uuid.
mdadm: /dev/sda1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdb
mdadm: /dev/sdb has wrong uuid.
mdadm: /dev/sdb1 is identified as a member of /dev/md6, slot 0.
mdadm: no RAID superblock on /dev/sdc
mdadm: /dev/sdc has wrong uuid.
mdadm: /dev/sdc1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdd
mdadm: /dev/sdd has wrong uuid.
mdadm: /dev/sdd1 is identified as a member of /dev/md6, slot 1.
mdadm: no RAID superblock on /dev/sde
mdadm: /dev/sde has wrong uuid.
mdadm: /dev/sde1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdf
mdadm: /dev/sdf has wrong uuid.
mdadm: /dev/sdf1 is identified as a member of /dev/md6, slot 2.
mdadm: no RAID superblock on /dev/sdg
mdadm: /dev/sdg has wrong uuid.
mdadm: /dev/sdg1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdh
mdadm: /dev/sdh has wrong uuid.
mdadm: /dev/sdh1 is identified as a member of /dev/md6, slot 3.
mdadm: no RAID superblock on /dev/sdi
mdadm: /dev/sdi has wrong uuid.
mdadm: /dev/sdi1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdj
mdadm: /dev/sdj has wrong uuid.
mdadm: /dev/sdj1 is identified as a member of /dev/md6, slot 4.
mdadm: added /dev/sdd1 to /dev/md6 as 1
mdadm: added /dev/sdf1 to /dev/md6 as 2
mdadm: added /dev/sdh1 to /dev/md6 as 3
mdadm: added /dev/sdj1 to /dev/md6 as 4
mdadm: added /dev/sdb1 to /dev/md6 as 0
mdadm: /dev/md6 has been started with 4 drives (out of 5).
NOTE THAT mdadm identified sdh1 as being in slot 3 of md6, yet in the output
of cat /proc/mdstat the slot 3 drive in md6 is reported as missing.
[root@localhost avidserver]# cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
md6 : active raid5 scsi/host1/bus0/target1/lun0/part1[0]
scsi/host5/bus0/target1/lun0/part1[4] scsi/host3/bus0/target1/lun0/part1[2]
scsi/host2/bus0/target1/lun0/part1[1]
796566528 blocks level 5, 128k chunk, algorithm 2 [5/4] [UUU_U]
md4 : active raid5 scsi/host1/bus0/target0/lun0/part1[0]
scsi/host4/bus0/target0/lun0/part1[4] scsi/host3/bus0/target0/lun0/part1[3]
scsi/host5/bus0/target0/lun0/part1[2] scsi/host2/bus0/target0/lun0/part1[1]
480214528 blocks level 5, 128k chunk, algorithm 2 [5/5] [UUUUU]
unused devices: <none>
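The `[5/4] [UUU_U]` field in the md6 entry above encodes which slot is missing. As a minimal sketch (the function name `missing_slots` is hypothetical and not part of mdadm, and it assumes the status format shown in this output), the slot number can be read off like this:

```python
import re

def missing_slots(status_line):
    """Return the slot indexes shown as '_' in a /proc/mdstat status field.

    Assumes the format seen above, e.g. '[5/4] [UUU_U]'.
    """
    match = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", status_line)
    if match is None:
        raise ValueError("no [n/m] [UU_U] field found")
    # Each character is one slot: 'U' means up, '_' means missing.
    return [i for i, flag in enumerate(match.group(3)) if flag == "_"]

md6 = "796566528 blocks level 5, 128k chunk, algorithm 2 [5/4] [UUU_U]"
md4 = "480214528 blocks level 5, 128k chunk, algorithm 2 [5/5] [UUUUU]"
print(missing_slots(md6))  # [3]
print(missing_slots(md4))  # []
```

For md6 this reports slot 3, the same slot that mdadm assigned to /dev/sdh1 during assembly.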
[root@localhost avidserver]# mdadm -E /dev/sda1
/dev/sda1:
Magic : a92b4efc
Version : 00.90.00
UUID : 62d8b91d:a2368783:6a78ca50:5793492f
Creation Time : Fri Nov 22 09:13:16 2002
Raid Level : raid5
Device Size : 120053632 (114.49 GiB 122.93 GB)
Raid Devices : 5
Total Devices : 6
Preferred Minor : 4
Update Time : Thu Jan 22 08:42:49 2004
State : dirty, no-errors
Active Devices : 5
Working Devices : 5
Failed Devices : 1
Spare Devices : 0
Checksum : f55e948c - correct
Events : 0.146
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 0 8 1 0 active sync
/dev/scsi/host1/bus0/target0/lun0/part1
0 0 8 1 0 active sync
/dev/scsi/host1/bus0/target0/lun0/part1
1 1 8 33 1 active sync
/dev/scsi/host2/bus0/target0/lun0/part1
2 2 8 129 2 active sync
/dev/scsi/host5/bus0/target0/lun0/part1
3 3 8 65 3 active sync
/dev/scsi/host3/bus0/target0/lun0/part1
4 4 8 97 4 active sync
/dev/scsi/host4/bus0/target0/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sdb1
/dev/sdb1:
Magic : a92b4efc
Version : 00.90.00
UUID : 57f26496:25520b96:41757b62:f83fcb7b
Creation Time : Mon Nov 24 17:36:05 2003
Raid Level : raid5
Device Size : 199141632 (189.92 GiB 203.92 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 6
Update Time : Thu Jan 22 08:43:28 2004
State : dirty, no-errors
Active Devices : 4
Working Devices : 4
Failed Devices : 1
Spare Devices : 0
Checksum : ebd80d56 - correct
Events : 0.137
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 0 8 17 0 active sync
/dev/scsi/host1/bus0/target1/lun0/part1
0 0 8 17 0 active sync
/dev/scsi/host1/bus0/target1/lun0/part1
1 1 8 49 1 active sync
/dev/scsi/host2/bus0/target1/lun0/part1
2 2 8 81 2 active sync
/dev/scsi/host3/bus0/target1/lun0/part1
3 3 0 0 3 faulty removed
4 4 8 145 4 active sync
/dev/scsi/host5/bus0/target1/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sdc1
/dev/sdc1:
Magic : a92b4efc
Version : 00.90.00
UUID : 62d8b91d:a2368783:6a78ca50:5793492f
Creation Time : Fri Nov 22 09:13:16 2002
Raid Level : raid5
Device Size : 120053632 (114.49 GiB 122.93 GB)
Raid Devices : 5
Total Devices : 6
Preferred Minor : 4
Update Time : Thu Jan 22 08:42:49 2004
State : dirty, no-errors
Active Devices : 5
Working Devices : 5
Failed Devices : 1
Spare Devices : 0
Checksum : f55e94ae - correct
Events : 0.146
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 1 8 33 1 active sync
/dev/scsi/host2/bus0/target0/lun0/part1
0 0 8 1 0 active sync
/dev/scsi/host1/bus0/target0/lun0/part1
1 1 8 33 1 active sync
/dev/scsi/host2/bus0/target0/lun0/part1
2 2 8 129 2 active sync
/dev/scsi/host5/bus0/target0/lun0/part1
3 3 8 65 3 active sync
/dev/scsi/host3/bus0/target0/lun0/part1
4 4 8 97 4 active sync
/dev/scsi/host4/bus0/target0/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sdd1
/dev/sdd1:
Magic : a92b4efc
Version : 00.90.00
UUID : 57f26496:25520b96:41757b62:f83fcb7b
Creation Time : Mon Nov 24 17:36:05 2003
Raid Level : raid5
Device Size : 199141632 (189.92 GiB 203.92 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 6
Update Time : Thu Jan 22 08:43:28 2004
State : dirty, no-errors
Active Devices : 4
Working Devices : 4
Failed Devices : 1
Spare Devices : 0
Checksum : ebd80d78 - correct
Events : 0.137
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 1 8 49 1 active sync
/dev/scsi/host2/bus0/target1/lun0/part1
0 0 8 17 0 active sync
/dev/scsi/host1/bus0/target1/lun0/part1
1 1 8 49 1 active sync
/dev/scsi/host2/bus0/target1/lun0/part1
2 2 8 81 2 active sync
/dev/scsi/host3/bus0/target1/lun0/part1
3 3 0 0 3 faulty removed
4 4 8 145 4 active sync
/dev/scsi/host5/bus0/target1/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sde1
/dev/sde1:
Magic : a92b4efc
Version : 00.90.00
UUID : 62d8b91d:a2368783:6a78ca50:5793492f
Creation Time : Fri Nov 22 09:13:16 2002
Raid Level : raid5
Device Size : 120053632 (114.49 GiB 122.93 GB)
Raid Devices : 5
Total Devices : 6
Preferred Minor : 4
Update Time : Thu Jan 22 08:42:49 2004
State : dirty, no-errors
Active Devices : 5
Working Devices : 5
Failed Devices : 1
Spare Devices : 0
Checksum : f55e94d2 - correct
Events : 0.146
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 3 8 65 3 active sync
/dev/scsi/host3/bus0/target0/lun0/part1
0 0 8 1 0 active sync
/dev/scsi/host1/bus0/target0/lun0/part1
1 1 8 33 1 active sync
/dev/scsi/host2/bus0/target0/lun0/part1
2 2 8 129 2 active sync
/dev/scsi/host5/bus0/target0/lun0/part1
3 3 8 65 3 active sync
/dev/scsi/host3/bus0/target0/lun0/part1
4 4 8 97 4 active sync
/dev/scsi/host4/bus0/target0/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sdf1
/dev/sdf1:
Magic : a92b4efc
Version : 00.90.00
UUID : 57f26496:25520b96:41757b62:f83fcb7b
Creation Time : Mon Nov 24 17:36:05 2003
Raid Level : raid5
Device Size : 199141632 (189.92 GiB 203.92 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 6
Update Time : Thu Jan 22 08:43:28 2004
State : dirty, no-errors
Active Devices : 4
Working Devices : 4
Failed Devices : 1
Spare Devices : 0
Checksum : ebd80d9a - correct
Events : 0.137
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 2 8 81 2 active sync
/dev/scsi/host3/bus0/target1/lun0/part1
0 0 8 17 0 active sync
/dev/scsi/host1/bus0/target1/lun0/part1
1 1 8 49 1 active sync
/dev/scsi/host2/bus0/target1/lun0/part1
2 2 8 81 2 active sync
/dev/scsi/host3/bus0/target1/lun0/part1
3 3 0 0 3 faulty removed
4 4 8 145 4 active sync
/dev/scsi/host5/bus0/target1/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sdg1
/dev/sdg1:
Magic : a92b4efc
Version : 00.90.00
UUID : 62d8b91d:a2368783:6a78ca50:5793492f
Creation Time : Fri Nov 22 09:13:16 2002
Raid Level : raid5
Device Size : 120053632 (114.49 GiB 122.93 GB)
Raid Devices : 5
Total Devices : 6
Preferred Minor : 4
Update Time : Thu Jan 22 08:42:49 2004
State : dirty, no-errors
Active Devices : 5
Working Devices : 5
Failed Devices : 1
Spare Devices : 0
Checksum : f55e94f4 - correct
Events : 0.146
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 4 8 97 4 active sync
/dev/scsi/host4/bus0/target0/lun0/part1
0 0 8 1 0 active sync
/dev/scsi/host1/bus0/target0/lun0/part1
1 1 8 33 1 active sync
/dev/scsi/host2/bus0/target0/lun0/part1
2 2 8 129 2 active sync
/dev/scsi/host5/bus0/target0/lun0/part1
3 3 8 65 3 active sync
/dev/scsi/host3/bus0/target0/lun0/part1
4 4 8 97 4 active sync
/dev/scsi/host4/bus0/target0/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sdh1
/dev/sdh1:
Magic : a92b4efc
Version : 00.90.00
UUID : 57f26496:25520b96:41757b62:f83fcb7b
Creation Time : Mon Nov 24 17:36:05 2003
Raid Level : raid5
Device Size : 199141632 (189.92 GiB 203.92 GB)
Raid Devices : 5
Total Devices : 6
Preferred Minor : 6
Update Time : Thu Jan 15 08:18:48 2004
State : dirty, no-errors
Active Devices : 5
Working Devices : 5
Failed Devices : 1
Spare Devices : 0
Checksum : ebcecdda - correct
Events : 0.118
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 3 8 113 3 active sync
/dev/scsi/host4/bus0/target1/lun0/part1
0 0 8 17 0 active sync
/dev/scsi/host1/bus0/target1/lun0/part1
1 1 8 49 1 active sync
/dev/scsi/host2/bus0/target1/lun0/part1
2 2 8 81 2 active sync
/dev/scsi/host3/bus0/target1/lun0/part1
3 3 8 113 3 active sync
/dev/scsi/host4/bus0/target1/lun0/part1
4 4 8 145 4 active sync
/dev/scsi/host5/bus0/target1/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sdi1
/dev/sdi1:
Magic : a92b4efc
Version : 00.90.00
UUID : 62d8b91d:a2368783:6a78ca50:5793492f
Creation Time : Fri Nov 22 09:13:16 2002
Raid Level : raid5
Device Size : 120053632 (114.49 GiB 122.93 GB)
Raid Devices : 5
Total Devices : 6
Preferred Minor : 4
Update Time : Thu Jan 22 08:42:49 2004
State : dirty, no-errors
Active Devices : 5
Working Devices : 5
Failed Devices : 1
Spare Devices : 0
Checksum : f55e9510 - correct
Events : 0.146
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 2 8 129 2 active sync
/dev/scsi/host5/bus0/target0/lun0/part1
0 0 8 1 0 active sync
/dev/scsi/host1/bus0/target0/lun0/part1
1 1 8 33 1 active sync
/dev/scsi/host2/bus0/target0/lun0/part1
2 2 8 129 2 active sync
/dev/scsi/host5/bus0/target0/lun0/part1
3 3 8 65 3 active sync
/dev/scsi/host3/bus0/target0/lun0/part1
4 4 8 97 4 active sync
/dev/scsi/host4/bus0/target0/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sdj1
/dev/sdj1:
Magic : a92b4efc
Version : 00.90.00
UUID : 57f26496:25520b96:41757b62:f83fcb7b
Creation Time : Mon Nov 24 17:36:05 2003
Raid Level : raid5
Device Size : 199141632 (189.92 GiB 203.92 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 6
Update Time : Thu Jan 22 08:43:28 2004
State : dirty, no-errors
Active Devices : 4
Working Devices : 4
Failed Devices : 1
Spare Devices : 0
Checksum : ebd80dde - correct
Events : 0.137
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 4 8 145 4 active sync
/dev/scsi/host5/bus0/target1/lun0/part1
0 0 8 17 0 active sync
/dev/scsi/host1/bus0/target1/lun0/part1
1 1 8 49 1 active sync
/dev/scsi/host2/bus0/target1/lun0/part1
2 2 8 81 2 active sync
/dev/scsi/host3/bus0/target1/lun0/part1
3 3 0 0 3 faulty removed
4 4 8 145 4 active sync
/dev/scsi/host5/bus0/target1/lun0/part1
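In the reports above, /dev/sdh1 carries Events 0.118 and an Update Time of Jan 15, while the other four md6 members show Events 0.137 and Jan 22. Below is a minimal sketch for spotting that kind of mismatch automatically. The helper names `parse_examine` and `stale_members` are hypothetical, and the sketch assumes the `Key : value` text layout that mdadm -E prints for the 0.90 superblocks shown here, with Events read as two dot-separated integers.

```python
from collections import defaultdict

def parse_examine(text):
    """Collect the 'Key : value' lines of one mdadm -E report into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def stale_members(reports):
    """Return devices whose Events counter is below the highest in their UUID group.

    reports maps a device name to the dict produced by parse_examine().
    """
    by_uuid = defaultdict(dict)
    for dev, fields in reports.items():
        high, low = fields["Events"].split(".")
        by_uuid[fields["UUID"]][dev] = (int(high), int(low))
    stale = []
    for members in by_uuid.values():
        newest = max(members.values())
        stale.extend(dev for dev, ev in members.items() if ev < newest)
    return sorted(stale)

# Abbreviated samples using the UUID and Events values from the output above.
MD4 = "62d8b91d:a2368783:6a78ca50:5793492f"
MD6 = "57f26496:25520b96:41757b62:f83fcb7b"
samples = {
    "/dev/sda1": f"UUID : {MD4}\nEvents : 0.146",
    "/dev/sdc1": f"UUID : {MD4}\nEvents : 0.146",
    "/dev/sdb1": f"UUID : {MD6}\nEvents : 0.137",
    "/dev/sdh1": f"UUID : {MD6}\nEvents : 0.118",
    "/dev/sdj1": f"UUID : {MD6}\nEvents : 0.137",
}
reports = {dev: parse_examine(text) for dev, text in samples.items()}
print(stale_members(reports))  # ['/dev/sdh1']
```

A lagging Events counter only shows that a member's superblock stopped being updated before the others; it does not say why.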