From: Andrew Morton <akpm@linux-foundation.org>
To: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
linux-ide@vger.kernel.org, apiszcz@solarrain.com
Subject: Re: Kernel 2.6.23.9 / P35 Chipset + WD 750GB Drives (reset port)
Date: Thu, 6 Dec 2007 15:05:11 -0800
Message-ID: <20071206150511.e0dd0b07.akpm@linux-foundation.org>
In-Reply-To: <Pine.LNX.4.64.0712061737500.8523@p34.internal.lan>
On Thu, 6 Dec 2007 17:38:08 -0500 (EST)
Justin Piszcz <jpiszcz@lucidpixels.com> wrote:
>
>
> On Thu, 6 Dec 2007, Andrew Morton wrote:
>
> > On Sat, 1 Dec 2007 06:26:08 -0500 (EST)
> > Justin Piszcz <jpiszcz@lucidpixels.com> wrote:
> >
> >> I am putting a new machine together and I have dual raptor raid 1 for the
> >> root, which works just fine under all stress tests.
> >>
> >> Then I have the WD 750 GiB drive (not RE2, desktop ones for ~150-160 on
> >> sale nowadays):
> >>
> >> I ran the following:
> >>
> >> dd if=/dev/zero of=/dev/sdc
> >> dd if=/dev/zero of=/dev/sdd
> >> dd if=/dev/zero of=/dev/sde
> >>
> >> (as it is always a very good idea to do this with any new disk)
> >>
> >> And sometime along the way(?) (i had gone to sleep and let it run), this
> >> occurred:
> >>
> >> [42880.680144] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x4010000
> >> action 0x2 frozen
> >
> > Gee we're seeing a lot of these lately.
> >
> >> [42880.680231] ata3.00: irq_stat 0x00400040, connection status changed
> >> [42880.680290] ata3.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb
> >> 0x0 data 512 in
> >> [42880.680292] res 40/00:ac:d8:64:54/00:00:57:00:00/40 Emask 0x10
> >> (ATA bus error)
> >> [42881.841899] ata3: soft resetting port
> >> [42885.966320] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> >> [42915.919042] ata3.00: qc timeout (cmd 0xec)
> >> [42915.919094] ata3.00: failed to IDENTIFY (I/O error, err_mask=0x5)
> >> [42915.919149] ata3.00: revalidation failed (errno=-5)
> >> [42915.919206] ata3: failed to recover some devices, retrying in 5 secs
> >> [42920.912458] ata3: hard resetting port
> >> [42926.411363] ata3: port is slow to respond, please be patient (Status
> >> 0x80)
> >> [42930.943080] ata3: COMRESET failed (errno=-16)
> >> [42930.943130] ata3: hard resetting port
> >> [42931.399628] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> >> [42931.413523] ata3.00: configured for UDMA/133
> >> [42931.413586] ata3: EH pending after completion, repeating EH (cnt=4)
> >> [42931.413655] ata3: EH complete
> >> [42931.413719] sd 2:0:0:0: [sdc] 1465149168 512-byte hardware sectors
> >> (750156 MB)
> >> [42931.413809] sd 2:0:0:0: [sdc] Write Protect is off
> >> [42931.413856] sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> >> [42931.413867] sd 2:0:0:0: [sdc] Write cache: enabled, read cache:
> >> enabled, doesn't support DPO or FUA
> >>
> >> Usually when I see this sort of thing with another box I have full of
> >> raptors, it was due to a bad raptor and I never saw it again after I
> >> replaced the disk that it happened on, but that was using the Intel P965
> >> chipset.
> >>
> >> For this board, it is a Gigabyte GSP-P35-DS4 (Rev 2.0) and I have all of
> >> the drives (2 raptors, 3 750s connected to the Intel ICH9 Southbridge).
> >>
> >> I am going to do some further testing but does this indicate a bad drive?
> >> Bad cable? Bad connector?
> >>
> >> As you can see above, /dev/sdc stopped responding for a little bit and
> >> then the kernel reset the port.
> >>
> >> Why is this though? What is the likely root cause? Should I replace the
> >> drive? Obviously this is not normal and cannot be good at all, the idea
> >> is to put these drives in a RAID5 and if one is going to timeout that is
> >> going to cause the array to go degraded and thus be worthless in a raid5
> >> configuration.
> >>
> >> Can anyone offer any insight here?
> >
> > It would be interesting to try 2.6.21 or 2.6.22.
> >
>
> This was due to NCQ issues (disabling it fixed the problem).
>
I cannot locate any further email discussion on this topic.
Disabling NCQ at either compile time or runtime is not a "fix" and further
work should be done here to make the kernel run acceptably on that
hardware.
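
For reference, a minimal sketch of what "disabling NCQ at runtime" usually means on a libata system: writing a queue depth of 1 to the disk's sysfs `queue_depth` attribute. This is not taken from the thread, which does not say how NCQ was disabled. The function names and the `SYSFS_ROOT` override are hypothetical and exist only so the sketch can be tried against a scratch directory; `sdc` is the disk from the log above.

```shell
#!/usr/bin/env bash
# Sketch only: limit a disk to one outstanding command, which effectively
# turns NCQ off for that device. Must run as root against the real /sys.
# SYSFS_ROOT is a hypothetical override used for dry runs and testing.

set_queue_depth() {
    local dev="$1" depth="$2"
    local node="${SYSFS_ROOT:-/sys}/block/${dev}/device/queue_depth"

    # Bail out if the attribute is missing or not writable.
    if [ ! -w "$node" ]; then
        echo "cannot write $node" >&2
        return 1
    fi

    echo "$depth" > "$node"
    # Read the value back so the caller sees what is now in effect.
    cat "$node"
}

disable_ncq() {
    # A queue depth of 1 means no command queueing.
    set_queue_depth "$1" 1
}
```

Calling `disable_ncq sdc` as root prints the resulting queue depth. The setting does not survive a reboot. Newer kernels also document a `libata.force=noncq` boot parameter for the same purpose. Either way this only masks the problem, which is the point made above.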
Thread overview: 19+ messages
2007-12-01 11:26 Kernel 2.6.23.9 / P35 Chipset + WD 750GB Drives (reset port) Justin Piszcz
2007-12-01 12:13 ` Jan Engelhardt
2007-12-01 12:23 ` Justin Piszcz
2007-12-01 16:47 ` Janek Kozicki
2007-12-01 16:55 ` Justin Piszcz
2007-12-02 9:11 ` Justin Piszcz
2007-12-02 21:14 ` Janek Kozicki
2007-12-02 21:21 ` Justin Piszcz
2007-12-02 21:25 ` Michael Tokarev
2007-12-02 21:32 ` Justin Piszcz
2007-12-10 8:23 ` Tejun Heo
2007-12-01 18:44 ` Bill Davidsen
2007-12-10 8:14 ` Tejun Heo
2007-12-13 22:27 ` Bill Davidsen
2007-12-06 22:00 ` Andrew Morton
2007-12-06 22:38 ` Justin Piszcz
2007-12-06 23:05 ` Andrew Morton [this message]
[not found] <fa.hhS4g1h0uppt8Xx/ZZfNNQfAv1Q@ifi.uio.no>
2007-12-01 20:08 ` Robert Hancock
[not found] ` <fa.YIWyRfjQw18aIH2fKaze37Gwuzo@ifi.uio.no>
[not found] ` <fa.ib4H8TQ3raADIWdsEBy+eSL/1RU@ifi.uio.no>
[not found] ` <fa.S4u1AwoYnqrSuegcUaP78D3SFXQ@ifi.uio.no>
[not found] ` <fa.H1nTe/xQV/oyEMTHAkOjqgqu7jY@ifi.uio.no>
[not found] ` <fa.YpQ6xCPOijQOCKsLJr1SDINFURI@ifi.uio.no>
2007-12-05 1:26 ` Robert Hancock