From: Chris Eddington <chrise@synplicity.com>
To: Bill Davidsen <davidsen@tmr.com>, David Greaves <david@dgreaves.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: Raid5 assemble after dual sata port failure
Date: Fri, 16 Nov 2007 22:31:30 -0800
Message-ID: <473E8AC2.9020701@synplicity.com>
In-Reply-To: <4737A5CC.8040105@tmr.com>
Yes, these are exactly the symptoms I've experienced. I was
losing a drive here and there every couple of months (mostly the last
two drives, sdc and sdd), which I thought were cable problems (I would
shut down, re-plug the cables and restart, and it would always work
after re-adding and rebuilding the 4th disk). But now my guess is that
the motherboard chipset is overheating (or maybe the drives). I have an
MSI K9N Platinum with an AMD/Nvidia chipset that has 4 RAID ports plus
2 RAID ports from a separate chip. The motherboard chipset comes with a
wimpy heatsink on it and it is very hot to the touch. I had been
planning to replace it but never got around to it.
I've been out of town this week, so I had someone image all three disks.
He used the Ghost disk imaging application. He said the third disk
reported media problems, and about 5% of the data was not recoverable
(sector errors). Using these three copied drives, the array comes up and
xfs_repair still reports a bunch of inode repairs as before, but the
results are a bit different, maybe even with fewer losses. Most
importantly, the hpa_sector errors no longer occur.
Key questions:
- I assume ddrescue will do a much better job of recovering from errors
when imaging a disk? My colleague used Ghost, which is just a copy tool.
I don't understand the capabilities of ddrescue on RAID partitions that
well.
- fdisk -l reports that all the drives are exactly the same size, with
exactly the same number of sectors (output below). I don't quite follow
the hpa_resize issue, but it appears the drives don't have hidden HPA
sectors - I guess? Note that sdc is the original drive, while sda, sdb,
and sdd are the imaged drives.
So what do you recommend I do first? Should I try xfs_repair on the
Ghost copy, or just re-copy the disks myself using ddrescue? Are there
special settings for ddrescue I should consider to verify/correct
potential HPA changes?
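
For reference, here is a minimal sketch of how a two-pass GNU ddrescue
run is commonly laid out, mainly to make the argument order explicit
(infile, then outfile, then mapfile). The target device /dev/sde, the
mapfile name sdc.map, the retry count, and the helper name
ddrescue_passes are all assumptions for illustration, not something
from this thread. The script only prints the commands; it does not run
them.

```python
import shlex


def ddrescue_passes(source, target, mapfile, retries=3):
    """Build the command lines for a typical two-pass GNU ddrescue copy.

    Argument order for ddrescue is: infile (failing disk), outfile
    (the copy), mapfile (progress log that lets a run be resumed).
    """
    # Pass 1: copy everything that reads cleanly, skipping the slow
    # scraping of bad areas (-n). -f is required to write to a device.
    first = ["ddrescue", "-f", "-n", source, target, mapfile]
    # Pass 2: go back over the bad areas with direct disc access (-d)
    # and a limited number of retry passes (-rN), reusing the mapfile.
    second = ["ddrescue", "-f", "-d", f"-r{retries}", source, target, mapfile]
    return [first, second]


if __name__ == "__main__":
    # /dev/sdc is the original drive mentioned above; /dev/sde and
    # sdc.map are hypothetical names.
    for cmd in ddrescue_passes("/dev/sdc", "/dev/sde", "sdc.map"):
        print(shlex.join(cmd))
```

Swapping source and target would overwrite the failing original, so it
is worth reading the printed commands twice before running anything.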
Thks,
Chris
Disk /dev/sda: 500.1 GB, 500107862016 bytes
/dev/sda1               1       60801   488384001   fd  Linux raid autodetect
Disk /dev/sdb: 500.1 GB, 500107862016 bytes
/dev/sdb1               1       60801   488384001   fd  Linux raid autodetect
Disk /dev/sdc: 500.1 GB, 500107862016 bytes
/dev/sdc1               1       60801   488384001   fd  Linux raid autodetect
Disk /dev/sdd: 500.1 GB, 500107862016 bytes
/dev/sdd1               1       60801   488384001   fd  Linux raid autodetect
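
As a sanity check on the HPA question, the numbers in the fdisk output
and in the quoted ata_hpa_resize lines can be compared directly. The
sketch below assumes 512-byte sectors, 1 KiB fdisk blocks, and a
partition starting at sector 63 (the usual offset for a partition that
starts on cylinder 1 with this style of fdisk output); none of those
three values is stated in the output itself.

```python
SECTOR_BYTES = 512            # assumed logical sector size
BLOCK_BYTES = 1024            # assumed fdisk block size (1 KiB)
PARTITION_START_SECTOR = 63   # assumed start of the first partition

# Values copied from the kernel log and the fdisk output in this mail.
kernel_sectors = 976773168    # "sectors" in ata_hpa_resize
hpa_sectors = 976773168       # "hpa_sectors" in ata_hpa_resize
fdisk_bytes = 500107862016    # disk size reported by fdisk -l
partition_blocks = 488384001  # size of the raid partition in blocks

# Disk size implied by the kernel's sector count.
disk_bytes = kernel_sectors * SECTOR_BYTES

# Sectors the kernel says are hidden behind an HPA (0 means none).
hidden_sectors = hpa_sectors - kernel_sectors

# First sector past the end of the partition.
partition_sectors = partition_blocks * BLOCK_BYTES // SECTOR_BYTES
partition_end_sector = PARTITION_START_SECTOR + partition_sectors

print("kernel size matches fdisk:", disk_bytes == fdisk_bytes)
print("hidden HPA sectors:", hidden_sectors)
print("partition ends inside disk:", partition_end_sector <= kernel_sectors)
```

Under those assumptions the kernel and fdisk agree on the disk size,
the two sector counts in the log are identical, and the partition ends
before the last visible sector, which is consistent with no HPA being
in play on that drive.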
Bill Davidsen wrote:
> David Greaves wrote:
>> Chris Eddington wrote:
>>
>>> Yes, there is some kind of media error message in dmesg, below. It is
>>> not random, it happens at exactly the same moments in each
>>> xfs_repair -n run.
>>> Nov 11 09:48:25 altair kernel: [37043.300691] res 51/40:00:01:00:00/00:00:00:00:00/e1 Emask 0x9 (media error)
>>> Nov 11 09:48:25 altair kernel: [37043.304326] ata4.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
>>> Nov 11 09:48:25 altair kernel: [37043.307672] ata4.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
>>>
>>
>> I'm not sure what an ata_hpa_resize error is...
>>
>
> HPA = Host Protected Area.
>
> By any chance is this disk partitioned such that the partition size
> includes the HPA? If it does, this sounds at least familiar, this
> mailing list post may get you started:
> http://osdir.com/ml/linux.ataraid/2005-09/msg00002.html
>
> In any case, run "fdisk -l" and look at the claimed total disk size
> and the end point of the last partition. The HPA is not included in
> the "disk size" so nothing should be trying to do so.
>> It probably explains the problems you've been having with the raid
>> not 'just
>> recovering' though.
>>
>> I saw this:
>> http://www.linuxquestions.org/questions/linux-kernel-70/sata-issues-568894/
>>
>>
>
> May be the same thing. Let us know what fdisk reports.
>>
>> What does smartctl say about your drive?
>>
>> IMO the spare drive is no longer useful for data recovery - you may
>> want to use
>> ddrescue to try and copy this drive to the spare drive.
>>
>> David
>> PS Don't get the ddrescue parameters the wrong way round if you go
>> that route...