From: Robin Hill
To: Niccolò Belli
Cc: linux-raid@vger.kernel.org
Subject: Re: raid1 issue after disk failure: both disks of the array are still active
Date: Thu, 13 Sep 2012 11:34:32 +0100
Message-ID: <20120913103432.GA11764@cthulhu.home.robinhill.me.uk>
In-Reply-To: <5051AF17.8010501@linuxsystems.it>

On Thu Sep 13, 2012 at 12:01:59PM +0200, Niccolò Belli wrote:

> Hi,
> I have a raid1 array with two disks, distro is Squeeze amd64. /dev/sda
> is slowly dying, here is a snippet of "smartctl -a /dev/sda":
>
> 197 Current_Pending_Sector  0x0012  100  100  000  Old_age  Always   -  2
> 198 Offline_Uncorrectable   0x0030  100  100  000  Old_age  Offline  -  1
>
> The bad sector is in the second half-MB of the disk, in fact with "dd
> if=/dev/sda1 of=/dev/null bs=524228 count=1 skip=1" I get this output in
> /var/log/syslog:
>
> root@asterisk:~# dd if=/dev/sda1 of=/dev/null bs=524228 count=1 skip=1
> 0+1 records in
> 0+1 records out
> 430140 bytes (430 kB) copied, 11.7265 s, 36.7 kB/s
>
<- snip dmesg output ->
>
> *Why doesn't it fail the first hard disk of the array!!??*
>

Has anything actually attempted to read from that part of the array?
Even if so, it may just have happened to read from the working disk
anyway. md can only detect the error when it tries to read/write that
sector of that disk. Your best bet now is to do an array check:

    echo check > /sys/block/md0/md/sync_action

This will force a read of all disks in the array.
This should trigger the read error, causing an attempt to re-write the
faulty block, in turn causing the drive to remap the bad sector (assuming
the re-write fails). A check like this should also be scheduled to run
regularly for all arrays in order to pick up these sorts of issues before
they cause major problems during a rebuild.

Cheers,
    Robin
-- 
     ___
    ( ' }     |       Robin Hill                  |
   / / )      | Little Jim says ....              |
  // !!       |    "He fallen in de water !!"     |
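
The array check described above can be wrapped in a couple of small shell
functions so that it is easy to start a check and then look at the result.
This is only a sketch: the names md_start_check, md_report and MD_SYSFS are
made up for illustration, and it assumes the array exposes the usual
sync_action and mismatch_cnt files under /sys/block/<array>/md.

```shell
#!/bin/sh
# Sketch only: md_start_check, md_report and MD_SYSFS are hypothetical names.
# MD_SYSFS defaults to /sys/block; point it at a scratch directory to try
# the functions without touching a real array.

# Request a "check" pass on one array, e.g.: md_start_check md0 (needs root).
md_start_check() {
    md_dir="${MD_SYSFS:-/sys/block}/$1/md"
    if [ ! -w "$md_dir/sync_action" ]; then
        echo "cannot write $md_dir/sync_action" >&2
        return 1
    fi
    echo check > "$md_dir/sync_action"
}

# Print the current sync action and the mismatch counter for one array.
md_report() {
    md_dir="${MD_SYSFS:-/sys/block}/$1/md"
    echo "$1 sync_action=$(cat "$md_dir/sync_action") mismatch_cnt=$(cat "$md_dir/mismatch_cnt")"
}
```

As root, "md_start_check md0" followed by "md_report md0" would start the
check and show its state; progress is also visible in /proc/mdstat. For the
regular scheduling mentioned above, calling such a function from a periodic
root cron job is one option. Some distributions may already ship a scheduled
check with their mdadm package, so it is worth looking for one before adding
another.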