From: AndyLiebman@aol.com
To: linux-raid@vger.kernel.org
Subject: Linux Raid confused about one drive and two arrays
Date: Thu, 22 Jan 2004 09:34:43 EST
Message-ID: <6a.3aa594bc.2d413983@aol.com>
I have just encountered a very disturbing RAID problem. I hope somebody
understands what happened and can tell me how to fix it.
I have two RAID 5 arrays on my Linux machine -- md4 and md6. Each array
consists of 5 FireWire (1394a) drives -- one partition on each drive, 10 drives in
total. Because the device IDs on these drives can change, I always use mdadm
to create and manage my arrays based on UUIDs. I am using mdadm 1.3 on Mandrake
9.2 with Mandrake's 2.4.22-21 kernel.
After running these arrays successfully for two months -- rebooting my file
server every day -- one of my arrays came up in a degraded mode. It looks as if
the Linux RAID subsystem "thinks" one of my drives belongs to both arrays.
As you can see below, when I run mdadm -E on each of my ten FireWire drives,
mdadm tells me that for each of the drives in the md4 array (UUID group
62d8b91d:a2368783:6a78ca50:5793492f) there are 5 raid devices and 6 total
devices, with one failed. However, this array has only ever had 5 devices.
On the other hand, for most of the drives in the md6 array (UUID group
57f26496:25520b96:41757b62:f83fcb7b), mdadm tells me that there are 5 raid
devices and 5 total devices, with one failed.
However, when I run mdadm -E on the drive currently identified as /dev/sdh1
-- which also belongs to md6, or the UUID group
57f26496:25520b96:41757b62:f83fcb7b -- mdadm tells me that sdh1 is part of an
array with 6 total devices, 5 raid devices, and one failed.
/dev/sdh1 is identified as device number 3 in the RAID with the UUID
57f26496:25520b96:41757b62:f83fcb7b. However, when I run mdadm -E on the other 4
drives that belong to md6, mdadm tells me that device number 3 is faulty.
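The disagreement between sdh1 and the other md6 members can be made explicit by
comparing the superblock fields side by side. The Python sketch below is only an
illustration: the dictionary holds values copied by hand from the mdadm -E
output further down in this message, and the names MD6_MEMBERS, parse_events and
stale_members are made up for this example. It flags any member whose event
counter is behind the highest one in the group, which generally means that
superblock was not written when the others were last updated.

```python
# Values copied by hand from the `mdadm -E` output in this message (md6 only).
MD6_MEMBERS = {
    "/dev/sdb1": {"events": "0.137", "total_devices": 5,
                  "update_time": "Thu Jan 22 08:43:28 2004"},
    "/dev/sdd1": {"events": "0.137", "total_devices": 5,
                  "update_time": "Thu Jan 22 08:43:28 2004"},
    "/dev/sdf1": {"events": "0.137", "total_devices": 5,
                  "update_time": "Thu Jan 22 08:43:28 2004"},
    "/dev/sdh1": {"events": "0.118", "total_devices": 6,
                  "update_time": "Thu Jan 15 08:18:48 2004"},
    "/dev/sdj1": {"events": "0.137", "total_devices": 5,
                  "update_time": "Thu Jan 22 08:43:28 2004"},
}


def parse_events(text):
    """Turn an event counter such as '0.137' into a comparable tuple (0, 137)."""
    hi, lo = text.split(".")
    return (int(hi), int(lo))


def stale_members(members):
    """Return the devices whose event counter is behind the newest in the group."""
    newest = max(parse_events(m["events"]) for m in members.values())
    return sorted(dev for dev, m in members.items()
                  if parse_events(m["events"]) < newest)


if __name__ == "__main__":
    for dev in stale_members(MD6_MEMBERS):
        info = MD6_MEMBERS[dev]
        print(dev, "is behind: events", info["events"],
              "last updated", info["update_time"])
```

With the values above, only /dev/sdh1 is reported: its superblock was last
updated on Jan 15, a week before the other four members.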
My questions are:
How do I fix this problem?
Why did it occur?
How can I prevent it from occurring again?
Hope somebody can answer these questions today.
Here is all the output from starting up my arrays and running mdadm:
[root@localhost avidserver]# mdadm -Av /dev/md4 --uuid=62d8b91d:a2368783:6a78ca50:5793492f /dev/sd*
mdadm: looking for devices for /dev/md4
mdadm: /dev/sd is not a block device.
mdadm: /dev/sd has wrong uuid.
mdadm: no RAID superblock on /dev/sda
mdadm: /dev/sda has wrong uuid.
mdadm: /dev/sda1 is identified as a member of /dev/md4, slot 0.
mdadm: no RAID superblock on /dev/sdb
mdadm: /dev/sdb has wrong uuid.
mdadm: /dev/sdb1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdc
mdadm: /dev/sdc has wrong uuid.
mdadm: /dev/sdc1 is identified as a member of /dev/md4, slot 1.
mdadm: no RAID superblock on /dev/sdd
mdadm: /dev/sdd has wrong uuid.
mdadm: /dev/sdd1 has wrong uuid.
mdadm: no RAID superblock on /dev/sde
mdadm: /dev/sde has wrong uuid.
mdadm: /dev/sde1 is identified as a member of /dev/md4, slot 3.
mdadm: no RAID superblock on /dev/sdf
mdadm: /dev/sdf has wrong uuid.
mdadm: /dev/sdf1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdg
mdadm: /dev/sdg has wrong uuid.
mdadm: /dev/sdg1 is identified as a member of /dev/md4, slot 4.
mdadm: no RAID superblock on /dev/sdh
mdadm: /dev/sdh has wrong uuid.
mdadm: /dev/sdh1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdi
mdadm: /dev/sdi has wrong uuid.
mdadm: /dev/sdi1 is identified as a member of /dev/md4, slot 2.
mdadm: no RAID superblock on /dev/sdj
mdadm: /dev/sdj has wrong uuid.
mdadm: /dev/sdj1 has wrong uuid.
mdadm: added /dev/sdc1 to /dev/md4 as 1
mdadm: added /dev/sdi1 to /dev/md4 as 2
mdadm: added /dev/sde1 to /dev/md4 as 3
mdadm: added /dev/sdg1 to /dev/md4 as 4
mdadm: added /dev/sda1 to /dev/md4 as 0
mdadm: /dev/md4 has been started with 5 drives.
[root@localhost avidserver]# mdadm -Av /dev/md6 --uuid=57f26496:25520b96:41757b62:f83fcb7b /dev/sd*
mdadm: looking for devices for /dev/md6
mdadm: /dev/sd is not a block device.
mdadm: /dev/sd has wrong uuid.
mdadm: no RAID superblock on /dev/sda
mdadm: /dev/sda has wrong uuid.
mdadm: /dev/sda1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdb
mdadm: /dev/sdb has wrong uuid.
mdadm: /dev/sdb1 is identified as a member of /dev/md6, slot 0.
mdadm: no RAID superblock on /dev/sdc
mdadm: /dev/sdc has wrong uuid.
mdadm: /dev/sdc1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdd
mdadm: /dev/sdd has wrong uuid.
mdadm: /dev/sdd1 is identified as a member of /dev/md6, slot 1.
mdadm: no RAID superblock on /dev/sde
mdadm: /dev/sde has wrong uuid.
mdadm: /dev/sde1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdf
mdadm: /dev/sdf has wrong uuid.
mdadm: /dev/sdf1 is identified as a member of /dev/md6, slot 2.
mdadm: no RAID superblock on /dev/sdg
mdadm: /dev/sdg has wrong uuid.
mdadm: /dev/sdg1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdh
mdadm: /dev/sdh has wrong uuid.
mdadm: /dev/sdh1 is identified as a member of /dev/md6, slot 3.
mdadm: no RAID superblock on /dev/sdi
mdadm: /dev/sdi has wrong uuid.
mdadm: /dev/sdi1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdj
mdadm: /dev/sdj has wrong uuid.
mdadm: /dev/sdj1 is identified as a member of /dev/md6, slot 4.
mdadm: added /dev/sdd1 to /dev/md6 as 1
mdadm: added /dev/sdf1 to /dev/md6 as 2
mdadm: added /dev/sdh1 to /dev/md6 as 3
mdadm: added /dev/sdj1 to /dev/md6 as 4
mdadm: added /dev/sdb1 to /dev/md6 as 0
mdadm: /dev/md6 has been started with 4 drives (out of 5).
NOTE THAT mdadm identified sdh1 as being in slot 3 on md6, yet under cat
/proc/mdstat the slot 3 drive in md6 is reported as missing.
[root@localhost avidserver]# cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
md6 : active raid5 scsi/host1/bus0/target1/lun0/part1[0]
scsi/host5/bus0/target1/lun0/part1[4] scsi/host3/bus0/target1/lun0/part1[2]
scsi/host2/bus0/target1/lun0/part1[1]
796566528 blocks level 5, 128k chunk, algorithm 2 [5/4] [UUU_U]
md4 : active raid5 scsi/host1/bus0/target0/lun0/part1[0]
scsi/host4/bus0/target0/lun0/part1[4] scsi/host3/bus0/target0/lun0/part1[3]
scsi/host5/bus0/target0/lun0/part1[2] scsi/host2/bus0/target0/lun0/part1[1]
480214528 blocks level 5, 128k chunk, algorithm 2 [5/5] [UUUUU]
unused devices: <none>
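The "[5/4] [UUU_U]" field on the md6 line already says which slot is missing.
As a rough sketch (the helper names missing_slots and raid5_blocks are
hypothetical, and the input strings are copied from the output above), the
following Python reads that field back and also checks that the reported block
count equals (raid devices - 1) x device size, as expected for RAID 5, using the
Device Size values that mdadm -E prints for the members:

```python
import re


def missing_slots(status):
    """Given text containing '[5/4] [UUU_U]', return the slots shown as '_'."""
    match = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", status)
    if match is None:
        raise ValueError("no [n/m] [UU_U] status found in: " + status)
    total = int(match.group(1))
    working = int(match.group(2))
    flags = match.group(3)
    slots = [i for i, flag in enumerate(flags) if flag == "_"]
    # Sanity check: the counts and the flag string should agree.
    if len(flags) != total or len(flags) - len(slots) != working:
        raise ValueError("inconsistent status: " + status)
    return slots


def raid5_blocks(raid_devices, device_size):
    """Usable size of a RAID 5 array: one device's worth of space holds parity."""
    return (raid_devices - 1) * device_size


if __name__ == "__main__":
    md6 = "796566528 blocks level 5, 128k chunk, algorithm 2 [5/4] [UUU_U]"
    md4 = "480214528 blocks level 5, 128k chunk, algorithm 2 [5/5] [UUUUU]"
    print(missing_slots(md6))          # [3]
    print(missing_slots(md4))          # []
    print(raid5_blocks(5, 199141632))  # 796566528
    print(raid5_blocks(5, 120053632))  # 480214528
```

Slot numbering starts at 0, so the "_" in the fourth position corresponds to
slot 3, the slot that mdadm assigned to sdh1 during assembly.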
[root@localhost avidserver]# mdadm -E /dev/sda1
/dev/sda1:
Magic : a92b4efc
Version : 00.90.00
UUID : 62d8b91d:a2368783:6a78ca50:5793492f
Creation Time : Fri Nov 22 09:13:16 2002
Raid Level : raid5
Device Size : 120053632 (114.49 GiB 122.93 GB)
Raid Devices : 5
Total Devices : 6
Preferred Minor : 4
Update Time : Thu Jan 22 08:42:49 2004
State : dirty, no-errors
Active Devices : 5
Working Devices : 5
Failed Devices : 1
Spare Devices : 0
Checksum : f55e948c - correct
Events : 0.146
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 0 8 1 0 active sync
/dev/scsi/host1/bus0/target0/lun0/part1
0 0 8 1 0 active sync
/dev/scsi/host1/bus0/target0/lun0/part1
1 1 8 33 1 active sync
/dev/scsi/host2/bus0/target0/lun0/part1
2 2 8 129 2 active sync
/dev/scsi/host5/bus0/target0/lun0/part1
3 3 8 65 3 active sync
/dev/scsi/host3/bus0/target0/lun0/part1
4 4 8 97 4 active sync
/dev/scsi/host4/bus0/target0/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sdb1
/dev/sdb1:
Magic : a92b4efc
Version : 00.90.00
UUID : 57f26496:25520b96:41757b62:f83fcb7b
Creation Time : Mon Nov 24 17:36:05 2003
Raid Level : raid5
Device Size : 199141632 (189.92 GiB 203.92 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 6
Update Time : Thu Jan 22 08:43:28 2004
State : dirty, no-errors
Active Devices : 4
Working Devices : 4
Failed Devices : 1
Spare Devices : 0
Checksum : ebd80d56 - correct
Events : 0.137
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 0 8 17 0 active sync
/dev/scsi/host1/bus0/target1/lun0/part1
0 0 8 17 0 active sync
/dev/scsi/host1/bus0/target1/lun0/part1
1 1 8 49 1 active sync
/dev/scsi/host2/bus0/target1/lun0/part1
2 2 8 81 2 active sync
/dev/scsi/host3/bus0/target1/lun0/part1
3 3 0 0 3 faulty removed
4 4 8 145 4 active sync
/dev/scsi/host5/bus0/target1/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sdc1
/dev/sdc1:
Magic : a92b4efc
Version : 00.90.00
UUID : 62d8b91d:a2368783:6a78ca50:5793492f
Creation Time : Fri Nov 22 09:13:16 2002
Raid Level : raid5
Device Size : 120053632 (114.49 GiB 122.93 GB)
Raid Devices : 5
Total Devices : 6
Preferred Minor : 4
Update Time : Thu Jan 22 08:42:49 2004
State : dirty, no-errors
Active Devices : 5
Working Devices : 5
Failed Devices : 1
Spare Devices : 0
Checksum : f55e94ae - correct
Events : 0.146
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 1 8 33 1 active sync
/dev/scsi/host2/bus0/target0/lun0/part1
0 0 8 1 0 active sync
/dev/scsi/host1/bus0/target0/lun0/part1
1 1 8 33 1 active sync
/dev/scsi/host2/bus0/target0/lun0/part1
2 2 8 129 2 active sync
/dev/scsi/host5/bus0/target0/lun0/part1
3 3 8 65 3 active sync
/dev/scsi/host3/bus0/target0/lun0/part1
4 4 8 97 4 active sync
/dev/scsi/host4/bus0/target0/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sdd1
/dev/sdd1:
Magic : a92b4efc
Version : 00.90.00
UUID : 57f26496:25520b96:41757b62:f83fcb7b
Creation Time : Mon Nov 24 17:36:05 2003
Raid Level : raid5
Device Size : 199141632 (189.92 GiB 203.92 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 6
Update Time : Thu Jan 22 08:43:28 2004
State : dirty, no-errors
Active Devices : 4
Working Devices : 4
Failed Devices : 1
Spare Devices : 0
Checksum : ebd80d78 - correct
Events : 0.137
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 1 8 49 1 active sync
/dev/scsi/host2/bus0/target1/lun0/part1
0 0 8 17 0 active sync
/dev/scsi/host1/bus0/target1/lun0/part1
1 1 8 49 1 active sync
/dev/scsi/host2/bus0/target1/lun0/part1
2 2 8 81 2 active sync
/dev/scsi/host3/bus0/target1/lun0/part1
3 3 0 0 3 faulty removed
4 4 8 145 4 active sync
/dev/scsi/host5/bus0/target1/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sde1
/dev/sde1:
Magic : a92b4efc
Version : 00.90.00
UUID : 62d8b91d:a2368783:6a78ca50:5793492f
Creation Time : Fri Nov 22 09:13:16 2002
Raid Level : raid5
Device Size : 120053632 (114.49 GiB 122.93 GB)
Raid Devices : 5
Total Devices : 6
Preferred Minor : 4
Update Time : Thu Jan 22 08:42:49 2004
State : dirty, no-errors
Active Devices : 5
Working Devices : 5
Failed Devices : 1
Spare Devices : 0
Checksum : f55e94d2 - correct
Events : 0.146
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 3 8 65 3 active sync
/dev/scsi/host3/bus0/target0/lun0/part1
0 0 8 1 0 active sync
/dev/scsi/host1/bus0/target0/lun0/part1
1 1 8 33 1 active sync
/dev/scsi/host2/bus0/target0/lun0/part1
2 2 8 129 2 active sync
/dev/scsi/host5/bus0/target0/lun0/part1
3 3 8 65 3 active sync
/dev/scsi/host3/bus0/target0/lun0/part1
4 4 8 97 4 active sync
/dev/scsi/host4/bus0/target0/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sdf1
/dev/sdf1:
Magic : a92b4efc
Version : 00.90.00
UUID : 57f26496:25520b96:41757b62:f83fcb7b
Creation Time : Mon Nov 24 17:36:05 2003
Raid Level : raid5
Device Size : 199141632 (189.92 GiB 203.92 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 6
Update Time : Thu Jan 22 08:43:28 2004
State : dirty, no-errors
Active Devices : 4
Working Devices : 4
Failed Devices : 1
Spare Devices : 0
Checksum : ebd80d9a - correct
Events : 0.137
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 2 8 81 2 active sync
/dev/scsi/host3/bus0/target1/lun0/part1
0 0 8 17 0 active sync
/dev/scsi/host1/bus0/target1/lun0/part1
1 1 8 49 1 active sync
/dev/scsi/host2/bus0/target1/lun0/part1
2 2 8 81 2 active sync
/dev/scsi/host3/bus0/target1/lun0/part1
3 3 0 0 3 faulty removed
4 4 8 145 4 active sync
/dev/scsi/host5/bus0/target1/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sdg1
/dev/sdg1:
Magic : a92b4efc
Version : 00.90.00
UUID : 62d8b91d:a2368783:6a78ca50:5793492f
Creation Time : Fri Nov 22 09:13:16 2002
Raid Level : raid5
Device Size : 120053632 (114.49 GiB 122.93 GB)
Raid Devices : 5
Total Devices : 6
Preferred Minor : 4
Update Time : Thu Jan 22 08:42:49 2004
State : dirty, no-errors
Active Devices : 5
Working Devices : 5
Failed Devices : 1
Spare Devices : 0
Checksum : f55e94f4 - correct
Events : 0.146
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 4 8 97 4 active sync
/dev/scsi/host4/bus0/target0/lun0/part1
0 0 8 1 0 active sync
/dev/scsi/host1/bus0/target0/lun0/part1
1 1 8 33 1 active sync
/dev/scsi/host2/bus0/target0/lun0/part1
2 2 8 129 2 active sync
/dev/scsi/host5/bus0/target0/lun0/part1
3 3 8 65 3 active sync
/dev/scsi/host3/bus0/target0/lun0/part1
4 4 8 97 4 active sync
/dev/scsi/host4/bus0/target0/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sdh1
/dev/sdh1:
Magic : a92b4efc
Version : 00.90.00
UUID : 57f26496:25520b96:41757b62:f83fcb7b
Creation Time : Mon Nov 24 17:36:05 2003
Raid Level : raid5
Device Size : 199141632 (189.92 GiB 203.92 GB)
Raid Devices : 5
Total Devices : 6
Preferred Minor : 6
Update Time : Thu Jan 15 08:18:48 2004
State : dirty, no-errors
Active Devices : 5
Working Devices : 5
Failed Devices : 1
Spare Devices : 0
Checksum : ebcecdda - correct
Events : 0.118
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 3 8 113 3 active sync
/dev/scsi/host4/bus0/target1/lun0/part1
0 0 8 17 0 active sync
/dev/scsi/host1/bus0/target1/lun0/part1
1 1 8 49 1 active sync
/dev/scsi/host2/bus0/target1/lun0/part1
2 2 8 81 2 active sync
/dev/scsi/host3/bus0/target1/lun0/part1
3 3 8 113 3 active sync
/dev/scsi/host4/bus0/target1/lun0/part1
4 4 8 145 4 active sync
/dev/scsi/host5/bus0/target1/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sdi1
/dev/sdi1:
Magic : a92b4efc
Version : 00.90.00
UUID : 62d8b91d:a2368783:6a78ca50:5793492f
Creation Time : Fri Nov 22 09:13:16 2002
Raid Level : raid5
Device Size : 120053632 (114.49 GiB 122.93 GB)
Raid Devices : 5
Total Devices : 6
Preferred Minor : 4
Update Time : Thu Jan 22 08:42:49 2004
State : dirty, no-errors
Active Devices : 5
Working Devices : 5
Failed Devices : 1
Spare Devices : 0
Checksum : f55e9510 - correct
Events : 0.146
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 2 8 129 2 active sync
/dev/scsi/host5/bus0/target0/lun0/part1
0 0 8 1 0 active sync
/dev/scsi/host1/bus0/target0/lun0/part1
1 1 8 33 1 active sync
/dev/scsi/host2/bus0/target0/lun0/part1
2 2 8 129 2 active sync
/dev/scsi/host5/bus0/target0/lun0/part1
3 3 8 65 3 active sync
/dev/scsi/host3/bus0/target0/lun0/part1
4 4 8 97 4 active sync
/dev/scsi/host4/bus0/target0/lun0/part1
[root@localhost avidserver]# mdadm -E /dev/sdj1
/dev/sdj1:
Magic : a92b4efc
Version : 00.90.00
UUID : 57f26496:25520b96:41757b62:f83fcb7b
Creation Time : Mon Nov 24 17:36:05 2003
Raid Level : raid5
Device Size : 199141632 (189.92 GiB 203.92 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 6
Update Time : Thu Jan 22 08:43:28 2004
State : dirty, no-errors
Active Devices : 4
Working Devices : 4
Failed Devices : 1
Spare Devices : 0
Checksum : ebd80dde - correct
Events : 0.137
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 4 8 145 4 active sync
/dev/scsi/host5/bus0/target1/lun0/part1
0 0 8 17 0 active sync
/dev/scsi/host1/bus0/target1/lun0/part1
1 1 8 49 1 active sync
/dev/scsi/host2/bus0/target1/lun0/part1
2 2 8 81 2 active sync
/dev/scsi/host3/bus0/target1/lun0/part1
3 3 0 0 3 faulty removed
4 4 8 145 4 active sync
/dev/scsi/host5/bus0/target1/lun0/part1
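Since the sdX names can move around, it may help to translate the Major/Minor
columns in the tables above into device names. The sketch below assumes the
classic Linux SCSI disk numbering (major 8, 16 minors per disk, with the
remainder being the partition number); the function name sd_name is invented for
this example and it only handles the first 16 disks.

```python
import string


def sd_name(major, minor):
    """Map a (major, minor) pair on SCSI disk major 8 to a /dev/sdXN name."""
    if major != 8:
        raise ValueError("only major 8 (sda..sdp) is handled in this sketch")
    if not 0 <= minor < 256:
        raise ValueError("minor out of range for major 8: %d" % minor)
    # Each disk owns 16 minors: 0 is the whole disk, 1..15 are partitions.
    disk, partition = divmod(minor, 16)
    name = "/dev/sd" + string.ascii_lowercase[disk]
    return name + (str(partition) if partition else "")


if __name__ == "__main__":
    # Slot -> (major, minor) as recorded in sdh1's own superblock for md6.
    md6_as_seen_by_sdh1 = {0: (8, 17), 1: (8, 49), 2: (8, 81),
                           3: (8, 113), 4: (8, 145)}
    for slot, (major, minor) in sorted(md6_as_seen_by_sdh1.items()):
        print(slot, sd_name(major, minor))
```

Under that assumption, 8:113 maps to /dev/sdh1, which matches the slot 3 entry
in sdh1's own superblock, while the other four members record slot 3 as 0:0
(faulty removed).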