From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tudor Holton
Subject: Re: Spare disk not becoming active
Date: Thu, 20 Dec 2012 10:19:57 +1100
Message-ID: <50D24B9D.8000801@smartguide.com.au>
References: <50BBEC7E.7080200@smartguide.com.au>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <50BBEC7E.7080200@smartguide.com.au>
Sender: linux-raid-owner@vger.kernel.org
To: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

I don't mean to be rude, but it's been two weeks and my system is still
in this state. Bump, anyone?

A thorough search of the web (before I originally posted this to the
list) turned up no explanation of why this occurs, only that it has
happened a number of times. Most reports indicate that a complete stop
of the array followed by a reassemble fixes it, but I tried that and the
disk still returned to spare. Some reports describe my situation, but
none received a response that seems complete. Eventually the discussion
turns to wiping the disks and starting again. That seems a bit drastic,
and I'm concerned that *one* of the disks is faulty but not being
reported as such. I don't want to pick the wrong one to wipe the
superblock from. mdadm reports no errors, but SMART indicates there may
be a problem with the *active* disk, which is even more worrying because
without making the spare active I can't remove the active disk to test
it properly.

Any ideas?

Cheers,
Tudor.

On 03/12/12 11:04, Tudor Holton wrote:
> Hello,
>
> I'm having some trouble with an array that has become degraded.
>
> I have an array with this array state:
>
> md101 : active raid1 sdf1[0] sdb1[2](S)
>       1953511936 blocks [2/1] [U_]
>
>
> mdadm --detail says:
>
> /dev/md101:
>         Version : 0.90
>   Creation Time : Thu Jan 13 14:34:27 2011
>      Raid Level : raid1
>      Array Size : 1953511936 (1863.01 GiB 2000.40 GB)
>   Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 101
>     Persistence : Superblock is persistent
>
>     Update Time : Fri Nov 23 03:23:04 2012
>           State : clean, degraded
>  Active Devices : 1
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 1
>
>            UUID : 43e92a79:90295495:0a76e71e:56c99031 (local to host barney)
>          Events : 0.2127
>
>     Number   Major   Minor   RaidDevice State
>        0       8       81        0      active sync   /dev/sdf1
>        1       0        0        1      removed
>
>        2       8       17        -      spare   /dev/sdb1
>
>
> If I attempt to force the spare to become active it begins to recover:
> $ sudo mdadm -S /dev/md101
> mdadm: stopped /dev/md101
> $ sudo mdadm --assemble --force --no-degraded /dev/md101 /dev/sdf1 /dev/sdb1
> mdadm: /dev/md101 has been started with 1 drive (out of 2) and 1 spare.
> $ cat /proc/mdstat
> md101 : active raid1 sdf1[0] sdb1[2]
>       1953511936 blocks [2/1] [U_]
>       [>....................]  recovery =  0.0% (541440/1953511936) finish=420.8min speed=77348K/sec
>
> This runs for the allotted time but returns to the state of spare.
>
> Neither disk partition reports errors:
> $ cat /sys/block/md101/md/dev-sdf1/errors
> 0
> $ cat /sys/block/md101/md/dev-sdb1/errors
> 0
>
> Are there mdadm logs to find out why this is not recovering properly?
> How otherwise do I debug this?
>
> Cheers,
> Tudor.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
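
For anyone wanting to check for the same symptom, here is a minimal
sketch (an illustration only, not an mdadm feature) that reads
/proc/mdstat-style text and reports arrays that are degraded while still
holding a device marked (S). The function name find_stuck_spares is
hypothetical, and it assumes the mdstat layout quoted above: an array
header line followed by a "blocks" status line.

```shell
#!/bin/sh
# Sketch: flag md arrays that are degraded yet still hold a device marked (S).
# Reads /proc/mdstat-style text on stdin. The function name is hypothetical.
find_stuck_spares() {
    awk '
        # Array header line, e.g. "md101 : active raid1 sdf1[0] sdb1[2](S)"
        /^md[0-9]+ :/ {
            name = $1
            spares = ""
            for (i = 3; i <= NF; i++) {
                if ($i ~ /\(S\)$/) {
                    dev = $i
                    # Strip the "[N](S)" suffix to leave the bare device name
                    sub(/\[[0-9]+\]\(S\)$/, "", dev)
                    spares = spares (spares == "" ? "" : " ") dev
                }
            }
            next
        }
        # Status line, e.g. "1953511936 blocks [2/1] [U_]".
        # An underscore in the last field marks a missing member.
        name != "" && /blocks/ {
            if ($NF ~ /_/ && spares != "")
                print name ": degraded " $NF ", spare(s) not active: " spares
            name = ""
        }
    '
}
```

It can be run as "find_stuck_spares < /proc/mdstat". It only detects the
state and says nothing about why a recovery fell back to spare. The md
driver writes its messages to the kernel log, so the dmesg output from
around the end of the recovery is the likelier place to look for a cause.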