Can we use SCSI error trace events to monitor SCSI hardware problems ?


* Can we use SCSI error trace events to monitor SCSI hardware problems ?
From: Junliang Li @ 2013-12-23  3:28 UTC (permalink / raw)
  To: hare; +Cc: linux-scsi

Hello, Hannes

I found your project "md_monitor" on GitHub. It supports MD arrays by
using the mdadm tool. But what about generic SCSI devices? There is a
"scsi_dispatch_cmd_error" tracepoint in the SCSI subsystem, and its
output can be read through the tracing files under
/sys/kernel/debug/tracing, so more of the work can be done in
userspace. I have now set up a tracepoint in "scsi_print_result" to
trace the SCSI command result. By reading the host status and sense
data, I can find out when something goes wrong while SCSI commands are
executing. Does this approach make sense, or is there a better way?
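
To make the userspace side concrete, here is a minimal sketch of how a
result word and sense buffer might be checked once they have been read
from the trace output. It assumes the usual layout of the 32-bit result
(status, message, host and driver bytes from low to high) and
fixed-format sense data (response code 0x70 or 0x71). The function
names and the "suspicious" rule are hypothetical, the host byte table
is only a subset, and descriptor-format sense data is not handled.

```python
# Subset of host byte values; names follow the kernel's DID_* constants.
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}


def split_result(result):
    """Split a 32-bit SCSI result word into its four bytes (assumed layout)."""
    return {
        "status": result & 0xFF,
        "msg": (result >> 8) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "driver": (result >> 24) & 0xFF,
    }


def decode_fixed_sense(sense):
    """Return (sense_key, asc, ascq) for fixed-format sense data, else None."""
    if len(sense) < 14 or (sense[0] & 0x7F) not in (0x70, 0x71):
        return None
    return sense[2] & 0x0F, sense[12], sense[13]


def is_suspicious(result, sense=b""):
    """Hypothetical rule: flag host errors and MEDIUM/HARDWARE ERROR sense keys."""
    if split_result(result)["host"] != 0:
        return True
    decoded = decode_fixed_sense(sense)
    # Sense key 0x3 is MEDIUM ERROR, 0x4 is HARDWARE ERROR.
    return decoded is not None and decoded[0] in (0x3, 0x4)


# Made-up example values, not captured from a real device.
example_sense = bytes([0x70, 0, 0x03, 0, 0, 0, 0, 0x0A, 0, 0, 0, 0, 0x11, 0x00])
host = split_result(0x00070000)["host"]
print(HOST_BYTE_NAMES.get(host, hex(host)), decode_fixed_sense(example_sense))
```

A real monitor would probably also want to look at timeouts and at how
often a device reports such errors, rather than reacting to a single
event.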

Thanks,
Junliang Li


* Re: Can we use SCSI error trace events to monitor SCSI hardware problems ?
From: Hannes Reinecke @ 2014-01-02 14:51 UTC (permalink / raw)
  To: Junliang Li; +Cc: linux-scsi

On 12/23/2013 04:28 AM, Junliang Li wrote:
> Hello, Hannes
>
> I found your project "md_monitor" on GitHub. It supports MD arrays by
> using the mdadm tool. But what about generic SCSI devices? There is a
> "scsi_dispatch_cmd_error" tracepoint in the SCSI subsystem, and its
> output can be read through the tracing files under
> /sys/kernel/debug/tracing, so more of the work can be done in
> userspace. I have now set up a tracepoint in "scsi_print_result" to
> trace the SCSI command result. By reading the host status and sense
> data, I can find out when something goes wrong while SCSI commands are
> executing. Does this approach make sense, or is there a better way?
>
Hmm. Not sure if that gives you what you want.

md_monitor was primarily designed to handle transient I/O errors
when running under md, and to re-add failed devices once the I/O
error condition was found to be resolved.
For DASD the entire functionality was implemented in md_monitor,
but when running on top of a SCSI device this is better handled
via multipathing, as most of the functionality is already
implemented there.

However, one of the goals of md_monitor was to guarantee a specific
response time, which isn't easily possible with SCSI devices.
With the updated SCSI EH it might be easier, but the code would
need to be updated and tested to work on SCSI devices, too.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
