linux-raid.vger.kernel.org archive mirror
* MD or MDADM bug?
From: David M. Strang @ 2005-09-01 21:26 UTC
  To: linux-raid

This is somewhat of a crosspost from my thread yesterday, but I think it
deserves its own thread at the moment. Some time ago, I had a device fail.
With the help of Neil, Tyler & others on the mailing list, and a few patches
to mdadm, I was able to recover. Using mdadm --remove & mdadm --add, I was
able to rebuild the bad disk in my array. Everything seemed fine; however,
when I rebooted and re-assembled the raid, it wouldn't take the disk that
had been re-added. I had to add it again and let it rebuild. About 3 weeks
ago, I lost power; the outage lasted longer than the UPS, and my system shut
down. Upon startup, once again, I had to re-add 'the disk' back to the array.
For some reason, if I remove a device and add it back, then stop and
re-assemble the array, it won't 'start' that disk.

Last night, I had a drive fail. With help from Michael & Forrest, I was able
to attempt to rebuild the array by hot-replacing the failed drive, with no
reboot needed to re-enable disk I/O to that position. I only had one spare
available; it was suspect, and it turns out it was bad. During the rebuild,
the disk started to have errors, and the array puked:

Aug 31 21:45:40 abyss kernel: raid5: Disk failure on sda, disabling device. Operation continuing on 26 devices
Aug 31 21:45:40 abyss kernel: raid5: Disk failure on sdb, disabling device. Operation continuing on 25 devices
Aug 31 21:45:40 abyss kernel: raid5: Disk failure on sdi, disabling device. Operation continuing on 18 devices
Aug 31 21:45:40 abyss kernel: raid5: Disk failure on sdj, disabling device. Operation continuing on 17 devices
Aug 31 21:45:40 abyss kernel: raid5: Disk failure on sdk, disabling device. Operation continuing on 16 devices
Aug 31 21:45:40 abyss kernel: raid5: Disk failure on sdl, disabling device. Operation continuing on 15 devices
Aug 31 21:45:40 abyss kernel: raid5: Disk failure on sdn, disabling device. Operation continuing on 14 devices
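
For anyone triaging a burst of messages like the ones above, a small script can list which devices were kicked out and how many the kernel reported as remaining. This is only a rough sketch: the regular expression is written against the message text quoted here, the function name parse_failures is made up for illustration, and the sample is a two-entry excerpt rather than the full log.

```python
import re

# Pattern written against the raid5 failure messages quoted in this mail;
# other kernel versions may word the message differently (assumption).
FAILURE_RE = re.compile(
    r"raid5: Disk failure on (?P<dev>\w+), disabling device\.\s*"
    r"Operation continuing on (?P<left>\d+) devices"
)

def parse_failures(log_text):
    """Return (device, devices_remaining) pairs in the order they appear."""
    # Mail clients may wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [(m.group("dev"), int(m.group("left")))
            for m in FAILURE_RE.finditer(flat)]

# Two entries copied from the log above, left in their wrapped form.
sample = """\
Aug 31 21:45:40 abyss kernel: raid5: Disk failure on sda, disabling device.
Operation continuing on 26 devices
Aug 31 21:45:40 abyss kernel: raid5: Disk failure on sdb, disabling device.
Operation continuing on 25 devices
"""

for dev, left in parse_failures(sample):
    print(f"{dev}: kernel reports {left} devices remaining")
```

Collapsing whitespace before matching means the same function works whether the log was pasted with one entry per line or wrapped by a mail client.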

All of these disks tested fine. This happened once before: simply forcing
the raid to re-assemble fixes the issue; then replace the bad disk and
re-sync it.

The problem is that my array is now at 26 of 28 disks. /dev/sdm *IS* bad; it
was removed and re-added, but the new drive is faulty. Disk /dev/sdaa, however,
is not bad, but since it was the 'original' disk that was hot removed /
added so long ago, it doesn't assemble into the raid. I'm really stuck: I
can't start the array, and obviously I can't rebuild the two 'bad' disks.
I asked about this once before and was told: no, you shouldn't have to
hot-add and resync each time after hot-adding a "new" device and the initial
rebuild finishes, unless there's another failure after that, or an unclean
shutdown.

What can I do? I don't believe this is working as intended.

I'm using mdadm 2.0-devel-3 on a Linux 2.6.11.12 kernel, with version-1 
superblocks.

-- David




