From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: bugme-daemon@bugzilla.kernel.org
Cc: linux-scsi@vger.kernel.org
Subject: Re: [Bug 11117] New: aic94xx doesn't sustain the load when more than 2 SAS drives are connected and actively used
Date: Mon, 28 Jul 2008 09:44:29 -0500
Message-ID: <1217256269.3503.73.camel@localhost.localdomain>
In-Reply-To: <bug-11117-11613@http.bugzilla.kernel.org/>
On Fri, 2008-07-18 at 08:37 -0700, bugme-daemon@bugzilla.kernel.org
wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=11117
>
> Summary: aic94xx doesn't sustain the load when more than 2 SAS
> drives are connected and actively used
[...]
> aic94xx: escb_tasklet_complete: REQ_TASK_ABORT, reason=0x6
> sas: command 0xffff8101d39733c0, task 0xffff8105e9e51240, timed out:
> EH_NOT_HANDLED
> sas: command 0xffff8104db3d1e40, task 0xffff8105ed10a6c0, timed out:
This is more or less a known problem with aic94xx. Its root cause is
that there are certain bus conditions the firmware requires help with.
REQ_TASK_ABORT is one of them (reason 0x6 means there was a protocol
error on the bus). What the card would like is for us to abort and
retransmit that command immediately (running abort). What we actually
do is to mark the command for abort by the error handler, halt all
in-progress commands and wake up the eh thread. This causes a nasty
hiccough in the data flow and runs into a potential snowball effect in
that if we get another REQ_TASK_ABORT on the retry of all the halted
commands (and there are quite a number of them), we have to do
everything over again (do this too often and the command will time out).
The fix is to alter the aic94xx code to do a running abort (as in do it
itself on the single command instead of halting everything and waking
the error handler). Unfortunately no-one's found the time to sit down
and code this up yet.
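To make the cost difference concrete, here is a rough user-space toy
model, not driver code. It only counts how many commands get sent again
under each strategy. Every name in it (sim_result, halt_and_wake_eh,
running_abort) is made up for illustration and none of them exists in
aic94xx or libsas; it also assumes every halted command is retried once
per REQ_TASK_ABORT, which is a simplification.

```c
#include <assert.h>

/* Hypothetical toy model; nothing here is taken from the aic94xx driver. */
struct sim_result {
	int reissued;   /* commands that had to be sent to the device again */
	int eh_wakeups; /* times the error handler thread was woken */
};

/*
 * Behaviour described above: each REQ_TASK_ABORT halts every in-progress
 * command and wakes the eh thread, and all halted commands are retried.
 */
static struct sim_result halt_and_wake_eh(int in_flight, int abort_events)
{
	struct sim_result r = { 0, 0 };

	assert(in_flight >= 0 && abort_events >= 0);
	for (int i = 0; i < abort_events; i++) {
		r.eh_wakeups++;
		r.reissued += in_flight; /* everything in flight is retried */
	}
	return r;
}

/*
 * Proposed behaviour: abort and retransmit only the command that hit the
 * protocol error; other commands keep running and eh is not involved.
 */
static struct sim_result running_abort(int in_flight, int abort_events)
{
	struct sim_result r = { 0, 0 };

	assert(in_flight >= 0 && abort_events >= 0);
	for (int i = 0; i < abort_events; i++)
		r.reissued += 1; /* only the affected command is retried */
	return r;
}
```

In this model, 32 commands in flight and 3 REQ_TASK_ABORT events give 96
retried commands and 3 eh wakeups for the first function, against 3
retried commands and no eh wakeups for the second.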
James