From: "Kevin P. Fleming" <kpfleming@backtobasicsmgmt.com>
To: Andrew Vasquez <andrew.vasquez@qlogic.com>
Cc: linux-scsi@vger.kernel.org
Subject: Re: Weird SCSI error, can anyone interpret?
Date: Fri, 05 Mar 2004 07:47:19 -0700
Message-ID: <404892F7.8010001@backtobasicsmgmt.com>
In-Reply-To: <20040305060748.GC9315@praka.local.home>

Andrew Vasquez wrote:

> Before the deluge of I/O error messages, does the messages file
> contain any useful bits of information, i.e. the driver posting
> messages about link failures, devices going away, etc.

Nope, there were no other SCSI-related messages before these (all the
way back to the last kernel boot, about 18 hours earlier).

> Did the driver not recognize the RAID box on the loop?

No, the loop appeared to come up but there were no devices present. 
Thanks for asking, I hadn't checked the log file to that level of detail 
yet. That's a pretty important sign that the problem was the RAID 
controller, given that the ISP2100 and the CMD-7220 are the only devices 
on the loop (direct cable between them).

> Now that's interesting...

Yes, that's why I think this may be a RAID controller problem (we've 
already had one of the original two boards die a year or so ago).

> Well given the information that was presented, I'd say it seems rather
> suspicious that the RAID- box needed a power-cycle to be restored into
> functioning state.  What type of I/O patterns were being run to the
> RAID box?  How many concurrent commands were being queued?

At that time of day there would have been almost zero activity. The
"flood" of error messages I referred to accumulated over the next 56
hours or so, from when the problem occurred until users started trying
to hit the server on Monday morning.

> Could you enable some additional debug in the driver and rerun your
> test?   Set DEBUG_QLA2100 to 1 in qla_settings.h and define
> QL_DEBUG_LEVEL_2 in qla_dbg.h.  Recompile the driver, then run your
> test.  If a failure occurs, send me the resultant /var/log/messages
> file and the output of the following command:
> 
> 	# cat /proc/scsi/qla2xxx/* 

I can try that this afternoon, but I can't promise when (or if) the
problem will occur again. I will also need to have the customer ready to
issue the cat command, but they are capable of that.
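
To make the capture step easier to hand to the customer, something along
these lines might do. It is only a sketch: the /proc/scsi/qla2xxx and
/var/log/messages paths come from the request above, while the function
name capture_qla_state, the output directory, and the file names inside
it are hypothetical choices made for illustration.

```shell
#!/bin/sh
# Sketch: snapshot qla2xxx driver state right after a failure.
# The two default input paths are the ones named in the thread; the
# function name, output directory, and output file names are assumptions.

capture_qla_state() {
    proc_dir="${1:-/proc/scsi/qla2xxx}"
    log_file="${2:-/var/log/messages}"
    out_dir="${3:-/tmp/qla-debug-$(date +%Y%m%d-%H%M%S)}"

    mkdir -p "$out_dir" || return 1

    # Same data as `cat /proc/scsi/qla2xxx/*`, with a header per file
    # so each host adapter's output can be told apart.
    for f in "$proc_dir"/*; do
        [ -f "$f" ] || continue
        printf '==== %s ====\n' "$f"
        cat "$f"
    done > "$out_dir/qla2xxx-proc.txt"

    # Keep a copy of the system log next to the driver state, if readable.
    [ -r "$log_file" ] && cp "$log_file" "$out_dir/messages"

    # Report where the snapshot was written.
    printf '%s\n' "$out_dir"
}
```

Sourcing the file and running capture_qla_state as root with no
arguments right after a failure would leave both files in one directory,
which could then be sent along in a single archive.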

Thanks for your help.
