From: Bernd Schubert <bs@q-leap.de>
To: James Bottomley <James.Bottomley@hansenpartnership.com>
Cc: linux-scsi@vger.kernel.org
Subject: Re: [PATCH 1/7] print eh activation
Date: Wed, 3 Dec 2008 16:52:58 +0100
Message-ID: <200812031652.58723.bs@q-leap.de>
In-Reply-To: <1228317390.5551.13.camel@localhost.localdomain>
On Wednesday 03 December 2008 16:16:30 James Bottomley wrote:
> On Wed, 2008-12-03 at 12:19 +0100, Bernd Schubert wrote:
> > On Wednesday 26 November 2008 19:47:02 James Bottomley wrote:
> > > On Wed, 2008-11-26 at 18:44 +0100, Bernd Schubert wrote:
> > > > Print activation of the scsi error handler to let the user know why
> > > > the error handler was activated. This information is essential to
> > > > diagnose hardware issues.
> > >
> > > But it can be turned on already with SCSI logging ... at least the
> > > activation message. I don't think we want this to be printed all the
> > > time, because the error handler can be activated in non-error
> > > situations for some HBAs (like sense collection for non-ACA emulating
> > > drivers).
> >
> > Sorry for the late reply, I didn't have access to my mail for a few
> > days.
> >
> > Actually I entirely disagree: activating the error handler should be an
> > exception, and as an exception it should print that it was activated and
> > also the reason why. Without this information we quite often see
> > something like this in our logs:
> >
> > [12165690.357905] mptscsih: ioc1: attempting task abort! (sc=ffff81012a957500)
> > [12165690.357966] sd 3:0:1:0:
> > [12165690.358018] command: cdb[0]=0x28: 28 00 37 10 e9 4f 00 00 08 00
> > [12165690.732712] mptbase: ioc1: IOCStatus(0x0048): SCSI Task Terminated
> > [12165690.733699] mptscsih: ioc1: task abort: SUCCESS (sc=ffff81012a957500)
> >
> > But this gives you no chance to see where it comes from. After adding
> > the additional printks from my patch, we recognized that the error
> > handler was activated mostly due to command timeouts. So increasing the
> > timeouts to >90s already solved 2/3rds of our problems. Please also see
> > patch nr. 6: the additional printks helped me recognize that it is
> > always the same SCSI command that fails.
>
> But surely what you're arguing for then, is a printk on command timeout?

I'm arguing that calling the error handler is a rare exception and that the
admin wants to know what has caused this exception. This is also nothing
you want to enable with scsi logging, since errors mostly happen after
weeks, when the system is already in production and nobody has scsi
logging active.

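For reference, here is a minimal sketch of how a SCSI logging mask could be
computed for /proc/sys/dev/scsi/logging_level. It assumes the usual layout of
three bits per facility as defined in the kernel's scsi_logging.h, and a
kernel built with CONFIG_SCSI_LOGGING. The helper name logging_level() is
hypothetical, and the shift table should be checked against the running
kernel's sources before use.

```python
# Assumed bit layout of the SCSI logging_level sysctl: each facility owns
# three bits, so each facility can be set to a level from 0 to 7.
SCSI_LOG_SHIFTS = {
    "error": 0,
    "timeout": 3,
    "scan": 6,
    "mlqueue": 9,
    "mlcomplete": 12,
    "llqueue": 15,
    "llcomplete": 18,
    "hlqueue": 21,
    "hlcomplete": 24,
    "ioctl": 27,
}


def logging_level(**levels):
    """Combine per-facility levels (0..7) into a single mask value."""
    value = 0
    for facility, level in levels.items():
        if facility not in SCSI_LOG_SHIFTS:
            raise ValueError("unknown facility: %s" % facility)
        if not 0 <= level <= 7:
            raise ValueError("level must be in 0..7: %r" % level)
        value |= level << SCSI_LOG_SHIFTS[facility]
    return value


if __name__ == "__main__":
    # Mask for error handling and timeout logging at level 3 each.
    print(logging_level(error=3, timeout=3))
```

The resulting number would have to be written to the sysctl before the
failure occurs, which is exactly the limitation described above.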
Another example for the timeouts patch:
sd 6:0:2:2: [sdk] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
sd 6:0:2:2: [sdk] CDB: Synchronize Cache(10): 35 00 00 00 00 00 00 00 00 00
sd 6:0:2:2: Activating scsi error recovery (1)
sd 6:0:2:2: trying to abort command
qla2xxx 0000:07:02.0: scsi(6:2:2): Abort command issued -- 1 36e2df2 20
So without the printk patch you would see many messages like these:
qla2xxx 0000:07:02.0: scsi(6:2:2): Abort command issued -- 1 36e2df2 20
qla2xxx 0000:07:02.0: scsi(6:2:2): Abort command issued -- 1 36e2df2 20
qla2xxx 0000:07:02.0: scsi(6:2:2): Abort command issued -- 1 36e2df2 20
qla2xxx 0000:07:02.0: scsi(6:2:2): Abort command issued -- 1 36e2df2 20
qla2xxx 0000:07:02.0: scsi(6:2:2): Abort command issued -- 1 36e2df2 20
But you wouldn't have any idea why, or which command, was aborted.
Eventually this will cause a severe failure of the qla2xxx driver, but you
would never figure out the underlying reason. Actually, I wouldn't mind
suppressing these driver messages, but the eh activation printks are
essential to understand what is going on.
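
As an illustration of why the activation message helps, here is a small
sketch that pairs each "Activating scsi error recovery" line with the last
CDB line logged for the same device. The message formats are taken from the
sample output above (the activation line comes from the proposed patch, not
from a mainline kernel); the function name eh_triggers() and the regular
expressions are assumptions made for this example only.

```python
import re

# Patterns modelled on the sample log lines quoted in this mail.
CDB = re.compile(r"sd (?P<dev>\d+:\d+:\d+:\d+): \[\w+\] CDB: (?P<name>[^:]+):")
ACTIVATION = re.compile(
    r"sd (?P<dev>\d+:\d+:\d+:\d+): Activating scsi error recovery"
)

SAMPLE = """\
sd 6:0:2:2: [sdk] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
sd 6:0:2:2: [sdk] CDB: Synchronize Cache(10): 35 00 00 00 00 00 00 00 00 00
sd 6:0:2:2: Activating scsi error recovery (1)
sd 6:0:2:2: trying to abort command
qla2xxx 0000:07:02.0: scsi(6:2:2): Abort command issued -- 1 36e2df2 20
"""


def eh_triggers(lines):
    """Return (device, command name) for every error handler activation."""
    last_cdb = {}   # device address -> name of the most recent CDB seen
    triggers = []
    for line in lines:
        match = CDB.search(line)
        if match:
            last_cdb[match["dev"]] = match["name"]
            continue
        match = ACTIVATION.search(line)
        if match:
            dev = match["dev"]
            triggers.append((dev, last_cdb.get(dev, "unknown")))
    return triggers


if __name__ == "__main__":
    for dev, command in eh_triggers(SAMPLE.splitlines()):
        print(dev, command)
```

With only the qla2xxx abort lines in the log, the same scan would find
nothing to pair, which is the diagnostic gap described above.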
>
> > In my opinion, if a driver needs the error handler for specific actions,
> > we should create another interface for that. Could you please point me to
> > such a non-ACA driver?
> > I also see only two callers of scsi_eh_scmd_add(), namely
> > scsi_times_out() and scsi_softirq_done(), and the additional printks
> > will be done only for these calls (since scmd is required to do the
> > printks).
>
> Mostly we converted the in-use drivers, but things like the parallel
> port drivers still use this mechanism.
Thanks, going to check these now.
Cheers,
Bernd
--
Bernd Schubert
Q-Leap Networks GmbH