From: Harri Olin <harri.olin@gmail.com>
To: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: Mark Lord <liml@rtr.ca>, linux-ide@vger.kernel.org
Subject: Re: sata_mv, io stucks
Date: Sun, 16 Nov 2008 01:47:19 +0200
Message-ID: <491F5F87.8060200@gmail.com>
In-Reply-To: <alpine.DEB.1.10.0811151843520.27937@p34.internal.lan>

Justin Piszcz wrote:
>
>
> On Sun, 16 Nov 2008, Harri Olin wrote:
>
>> Mark Lord wrote:
>>> Harri Olin wrote:
>>>> Mark Lord wrote:
>>>>>>> Two marvell controllers, 16 disks, software raid10, IO stucks on 
>>>>>>> different disks, kernel 2.6.26.5.
>>>>>>> With default ubuntu's 8.04 2.6.24 kernel the problem can not be 
>>>>>>> repeated
>>>>>>>
>>>>>>>
>>>>>>> [  289.851609] ata11.00: exception Emask 0x0 SAct 0x1 SErr 0x0 
>>>>>>> action 0x6 frozen
>>>>>>> [  289.851695] ata11.00: cmd 61/08:00:60:1e:bf/00:00:01:00:00/40 
>>>>>>> tag 0 ncq 4096 out
>>>>>>> [  289.851697]          res 40/00:00:00:00:00/00:00:00:00:00/00 
>>>>>>> Emask 0x4 (timeout)
>>>>>>> [  289.851774] ata11.00: status: { DRDY }
>>>>>>> [  289.851834] ata11: hard resetting link
>>>>>>> [  290.649259] ata11: SATA link up 3.0 Gbps (SStatus 123 
>>>>>>> SControl 300)
>>>>>>> [  290.749239] ata11.00: max_sectors limited to 256 for NCQ
>>>>>>> [  290.809189] ata11.00: max_sectors limited to 256 for NCQ
>>>>>>> [  290.809194] ata11.00: configured for UDMA/133
>>>>>>> [  290.809200] ata11: EH complete
>>>>>>> [  290.809242] sd 10:0:0:0: [sdk] 1953525168 512-byte hardware 
>>>>>>> sectors (1000205 MB)
>>>>>>> [  290.809258] sd 10:0:0:0: [sdk] Write Protect is off
>>>>>>> [  290.809263] sd 10:0:0:0: [sdk] Mode Sense: 00 3a 00 00
>>>>>>> [  290.809286] sd 10:0:0:0: [sdk] Write cache: enabled, read 
>>>>>>> cache: enabled, doesn't support DPO or FUA
>>>>> ...
>>>>>
>>>>> I've just returned here from a month holiday in Italy,
>>>>> and I'll have a look at this and other sata_mv issues
>>>>> next week or so.
>>>>
>>>> I ran git-bisect on it and it returned 
>>>> a3718c1f230240361ed92d3e53342df0ff7efa8c as first bad commit. Also 
>>>> verified by hand that patching it on working tree breaks it.
>>> Looking at later kernels (after the commit in question), I see that
>>> the code was further fixed to remove some possible races and stuff,
>>> but that's still just 2.6.26.5, which you guys see failures on.
>>>
>>> So here's some instrumentation to help us figure it out.
>>> Please apply and report back once it triggers again.
>>> Thanks.
>>
>> I have to take back that bisect, as just a couple of minutes ago it 
>> happened again, with the last 'good' kernel from the bisect. Only the 
>> frequency of stalls has dropped quite a lot. I also noticed that 
>> current kernels are much better too.
>> pre-..0ff7efa8c: only once after 6 hours of testing
>> post-..0ff7efa8c: one hd stalled while filesystem was mounting. 
>> Before boot was complete, 3 stalls. Also at shutdown kernel hung at 
>> Synchronizing SCSI cache for a while.
>> 2.6.27: once in 5 minutes or so on heavy load
>>
>> When some hd/port stalls, other ports still work fine.
>>
>> I applied your patch on 2.6.27.1, no results:
>>
>> ata14.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
>> ata14.00: cmd 61/08:00:3f:52:54/00:00:57:00:00/40 tag 0 ncq 4096 out
>>        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>> ata14.00: status: { DRDY }
>> ata14: hard resetting link
>> ata14: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>> ata14.00: max_sectors limited to 256 for NCQ
>> ata14.00: max_sectors limited to 256 for NCQ
>> ata14.00: configured for UDMA/133
>> ata14: EH complete
>> sd 13:0:0:0: [sdh] 1465149168 512-byte hardware sectors (750156 MB)
>> sd 13:0:0:0: [sdh] Write Protect is off
>> sd 13:0:0:0: [sdh] Mode Sense: 00 3a 00 00
>> sd 13:0:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't 
>> support DPO or FUA
>>
>> Do I have to enable something somewhere else too?
>>
>> I also compiled and patched the linux-2.6-stable tree from git, but it 
>> just panicked after a stall instead of recovering. I'm currently trying 
>> to reproduce that on a second computer where I can capture the panic.
>
> What type of disks are you using?
>
> Justin.
I have seen this happen on 3 different computers using WD5000ABYS, 
WD5000YS and WD7500AYYS hard disks. All have the same Supermicro 
controller. Stalls happen only on controller ports 0-3, never on ports 
4-7. Moving cables around doesn't help.
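
For anyone who wants to check the per-port pattern on their own logs, 
here is a minimal sketch (Python, not something posted in this thread) 
that tallies libata exception lines per ataN port from saved dmesg 
output. The function name count_exceptions is made up for illustration, 
and the mapping from ataN numbers to physical controller ports 0-7 
depends on probe order, so that mapping is an assumption you have to 
verify per machine.

```python
import re
from collections import Counter

# Matches libata exception lines such as:
#   ata14.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
# An optional "[  289.851609] " timestamp prefix is ignored by the search.
EXCEPTION_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?: exception Emask")


def count_exceptions(log_text):
    """Return a Counter mapping ataN port names to exception counts.

    Hypothetical helper: it only counts "exception Emask" lines and does
    not distinguish timeouts from other error classes.
    """
    return Counter(m.group(1) for m in EXCEPTION_RE.finditer(log_text))
```

Feed it the text of a saved log, for example 
count_exceptions(open("dmesg.txt").read()), where dmesg.txt is a 
placeholder file name, and compare the counts against the ports you 
believe are affected.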

-- 
Harri.





