From mboxrd@z Thu Jan 1 00:00:00 1970
From: Nicolas Jungers
Subject: Re: Mdadm server eating drives
Date: Thu, 27 Jun 2013 19:13:08 +0200
Message-ID: <51CC72A4.4040508@jungers.net>
References: <51B896A2.9090105@websitemanagers.com.au> <51BA7B28.9030808@turmel.org> <51BB8A67.5000605@turmel.org> <51BB8B86.9050803@turmel.org>
Reply-To: Nicolas Jungers
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
Sender: linux-raid-owner@vger.kernel.org
To: Barrett Lewis
Cc: "linux-raid@vger.kernel.org"
List-Id: linux-raid.ids

On 06/27/2013 02:23 AM, Barrett Lewis wrote:
> Everything is going well, I am just trying to replace the parts that
> are on the way out.
> I ran a 'repair' and it came out with 5477 under
> /sys/block/md0/md/mismatch_cnt. Then a 'check' came out with 0.
>
> Then I went out and bought a couple WD Reds (I'm done with greens now
> that I know they lack ERC). I replaced one of the two drives Phil
> said was not ok, which had many reallocations (I can personally see
> those) in the smart status. I then ran another repair to be safe. It
> came up with 0 mismatches, but in the process /dev/sda started giving
> me tons (and tons and tons, rolled over dmesg) of these "failed
> command: READ FPDMA QUEUED status: { DRDY ERR } error: { UNC }"
> errors. sda hadn't been giving me problems before but I'll come back
> to it.
>
> The second disk Phil said was "not ok" was this one which showed
> "several pending errors".
> (original smart status) http://pastie.org/8040852
> I was going to replace it with my second spare Red, but the errors
> seem to have gone away.
> (current smart status) http://pastie.org/8084278
> Or maybe I am looking in the wrong place to find the pending errors
> (looking at "197 Current_Pending_Sector"). Is the drive currently in
> need of replacement? I'm not sure what I'm looking for.
>
> What about this one (sda), after it gave all of those errors during a
> repair?
> http://pastie.org/8084292
> I get the "5 Reallocated_Sector_Ct", but where do you find pending errors?
>
> What does it mean to get all these "failed command: READ FPDMA QUEUED
> status: { DRDY ERR } error: { UNC }" errors and the smart status seems
> to be fine even after a repair?

Have you considered that your SATA cables may be faulty? I have had
consistently bad experiences with "cheap" SATA cables, and I now use
only cables with latches.

I put "cheap" in quotes because price is not an absolute criterion; in
my experience, the quality of sourcing matters more.

Regards,
N.

>
> Thanks everyone, I'm learning a lot.
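
P.S. On the question of where to find pending errors: in the attribute
table printed by `smartctl -A`, the figure to read is the last column
(RAW_VALUE) of the "197 Current_Pending_Sector" row. For the cable
suspicion, attribute 199 (usually shown as UDMA_CRC_Error_Count) is a
counter that often rises with link problems, although not every drive
reports it. Below is a minimal sketch, not a definitive tool, that pulls
those raw values out of saved `smartctl -A` output. The sample table and
the helper name `raw_values` are my own illustration, and the numbers in
the sample are invented, not taken from the drives in this thread.

```python
import re

# Hypothetical sample of `smartctl -A` output. The values are invented
# for illustration and do not come from the drives discussed here.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   140    Pre-fail  Always       -       5
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       12
"""

# Attribute IDs of interest: reallocated, pending, and interface CRC errors.
WATCHED = {5, 197, 199}


def raw_values(text):
    """Return {attribute_id: (name, raw_value)} for the watched attributes."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and start with a numeric ID;
        # this skips the header row and any surrounding text.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        if attr_id in WATCHED:
            # Keep only the leading integer of the RAW_VALUE column.
            match = re.match(r"\d+", fields[9])
            if match:
                found[attr_id] = (fields[1], int(match.group()))
    return found


if __name__ == "__main__":
    for attr_id, (name, raw) in sorted(raw_values(SAMPLE).items()):
        print(f"{attr_id:3d} {name}: {raw}")
```

To try it on a real drive, save the output of `smartctl -A /dev/sda` to
a file and pass the file's contents to `raw_values` in place of SAMPLE.
A CRC count that keeps growing after the cable is reseated or swapped
would point at the link rather than at the platters.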