linux-ide.vger.kernel.org archive mirror
From: Harri Olin <harri.olin@gmail.com>
To: Mark Lord <liml@rtr.ca>
Cc: linux-ide@vger.kernel.org
Subject: Re: sata_mv, io stucks
Date: Sun, 16 Nov 2008 01:41:54 +0200
Message-ID: <491F5E42.8010906@gmail.com>
In-Reply-To: <491F4096.9090701@rtr.ca>

Mark Lord wrote:
> Harri Olin wrote:
>> Mark Lord wrote:
>>>>> Two marvell controllers, 16 disks, software raid10, IO stucks on 
>>>>> different disks, kernel 2.6.26.5.
>>>>> With default ubuntu's 8.04 2.6.24 kernel the problem can not be 
>>>>> repeated
>>>>>
>>>>>
>>>>> [  289.851609] ata11.00: exception Emask 0x0 SAct 0x1 SErr 0x0 
>>>>> action 0x6 frozen
>>>>> [  289.851695] ata11.00: cmd 61/08:00:60:1e:bf/00:00:01:00:00/40 
>>>>> tag 0 ncq 4096 out
>>>>> [  289.851697]          res 40/00:00:00:00:00/00:00:00:00:00/00 
>>>>> Emask 0x4 (timeout)
>>>>> [  289.851774] ata11.00: status: { DRDY }
>>>>> [  289.851834] ata11: hard resetting link
>>>>> [  290.649259] ata11: SATA link up 3.0 Gbps (SStatus 123 SControl 
>>>>> 300)
>>>>> [  290.749239] ata11.00: max_sectors limited to 256 for NCQ
>>>>> [  290.809189] ata11.00: max_sectors limited to 256 for NCQ
>>>>> [  290.809194] ata11.00: configured for UDMA/133
>>>>> [  290.809200] ata11: EH complete
>>>>> [  290.809242] sd 10:0:0:0: [sdk] 1953525168 512-byte hardware 
>>>>> sectors (1000205 MB)
>>>>> [  290.809258] sd 10:0:0:0: [sdk] Write Protect is off
>>>>> [  290.809263] sd 10:0:0:0: [sdk] Mode Sense: 00 3a 00 00
>>>>> [  290.809286] sd 10:0:0:0: [sdk] Write cache: enabled, read 
>>>>> cache: enabled, doesn't support DPO or FUA
>>> ...
>>>
>>> I've just returned here from a month holiday in Italy,
>>> and I'll have a look at this and other sata_mv issues
>>> next week or so.
>>
>> I ran git-bisect on it and it returned 
>> a3718c1f230240361ed92d3e53342df0ff7efa8c as first bad commit. Also 
>> verified by hand that patching it on working tree breaks it.
> Looking at later kernels (after the commit in question), I see that
> the code was further fixed to remove some possible races and stuff,
> but that's still just 2.6.26.5, which you guys see failures on.
>
> So here's some instrumentation to help us figure it out.
> Please apply and report back once it triggers again.
> Thanks.

I have to take back that bisect: just a couple of minutes ago it
happened again with the last 'good' kernel from the bisect. Only the
frequency of stalls has dropped considerably. I also noticed that
current kernels are much better too.
pre-..0ff7efa8c: only once after 6 hours of testing
post-..0ff7efa8c: one hd stalled while the filesystem was mounting, and
there were 3 stalls before boot was complete. At shutdown the kernel
also hung at "Synchronizing SCSI cache" for a while.
2.6.27: about once every 5 minutes under heavy load

When some hd/port stalls, the other ports still work fine.
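
To compare stall frequency between kernels and to see which ports are
affected, the libata exception lines can be counted per device. This is
only a minimal sketch: it assumes the kernel log is available as plain
text (for example saved dmesg output) with each message on one line, and
the function name count_stalls is made up for illustration.

```python
import re
from collections import Counter

# Matches e.g. "ata14.00: exception Emask 0x0 SAct 0x1 ..." with or
# without a leading "[  289.851609]" timestamp.
EXCEPTION_RE = re.compile(r"\b(ata\d+\.\d+): exception Emask")

def count_stalls(log_text):
    """Return a Counter mapping device name (e.g. 'ata14.00') to the
    number of libata exception reports found in log_text."""
    counts = Counter()
    for line in log_text.splitlines():
        m = EXCEPTION_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts
```

Running it over the logs from each kernel under the same load would give
a rough per-port comparison instead of an impression.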

I applied your patch to 2.6.27.1, but got no results from it:

ata14.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
ata14.00: cmd 61/08:00:3f:52:54/00:00:57:00:00/40 tag 0 ncq 4096 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata14.00: status: { DRDY }
ata14: hard resetting link
ata14: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata14.00: max_sectors limited to 256 for NCQ
ata14.00: max_sectors limited to 256 for NCQ
ata14.00: configured for UDMA/133
ata14: EH complete
sd 13:0:0:0: [sdh] 1465149168 512-byte hardware sectors (750156 MB)
sd 13:0:0:0: [sdh] Write Protect is off
sd 13:0:0:0: [sdh] Mode Sense: 00 3a 00 00
sd 13:0:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't 
support DPO or FUA
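
For reference, the "cmd" line in such a report can be decoded to see
which sectors the timed-out write targeted. The sketch below assumes the
field order libata uses in its error-handler report
(command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:
hob_lbam:hob_lbah/device) and handles only the NCQ commands 0x60/0x61,
where the sector count lives in the feature registers and the tag in the
upper bits of nsect. The name decode_ncq_cmd is hypothetical.

```python
import re

# cmd CC/f:n:l0:l1:l2/hf:hn:l3:l4:l5/dev  (all fields two hex digits)
CMD_RE = re.compile(
    r"cmd ([0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/([0-9a-f]{2})"
)

def decode_ncq_cmd(line):
    """Decode a libata 'cmd' report line for READ/WRITE FPDMA QUEUED."""
    m = CMD_RE.search(line)
    if m is None:
        raise ValueError("no libata cmd taskfile found in line")
    command = int(m.group(1), 16)
    if command not in (0x60, 0x61):
        raise ValueError("not an NCQ read/write command")
    feature, nsect, lbal, lbam, lbah = (
        int(x, 16) for x in m.group(2).split(":"))
    hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in m.group(3).split(":"))
    # 48-bit LBA: HOB bytes are the upper three bytes.
    lba = (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24) \
        | (lbah << 16) | (lbam << 8) | lbal
    # For NCQ the sector count is in the feature registers
    # (a raw value of 0 would mean 65536 sectors; not handled here).
    sectors = (hob_feature << 8) | feature
    return {
        "write": command == 0x61,
        "lba": lba,
        "sectors": sectors,
        "tag": nsect >> 3,
    }
```

For the ata14.00 line above this gives an 8-sector write (8 * 512 = 4096
bytes, matching "ncq 4096 out") with tag 0 at LBA 0x5754523f.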

Do I have to enable something somewhere else too?

I also compiled and patched the linux-2.6-stable tree from git, but it
just panicked after a stall instead of recovering. I'm currently trying
to reproduce that on a second computer where I can capture the panic.

-- 
Harri.

