* Device is not becoming offline on failed TUR
@ 2016-01-17 10:54 Lev Vainblat
  2016-01-19 12:46 ` Hannes Reinecke
From: Lev Vainblat @ 2016-01-17 10:54 UTC
  To: linux-scsi; +Cc: Yaron Presente, Alex Lyakas

Hello,

We have a virtual target that under some circumstances returns
HARDWARE_ERROR (4/44) on TUR. Previously, on the initiator side,
scsi_check_sense() returned TARGET_ERROR, which scsi_eh_tur()
converted to 1 (device not ready). scsi_eh_ready_devs() then called
scsi_eh_offline_sdevs() to move the device to the OFFLINE state.

This logic was changed in commit
http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=87f14e6
Now, on HARDWARE_ERROR, scsi_check_sense() only sets DID_TARGET_FAILURE
and returns SUCCESS. As a result the device remains "running", and
scsi_eh_flush_done_q() retries the command instead of finishing it. So
in this case the stuck command takes much longer to finish.
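
To make the two flows concrete, here is a toy model in C of the decision
path described above. It is only a sketch of my description, not kernel
code: every model_* name and the enum values are made up for
illustration, and the reset escalation steps that error handling runs
before offlining a device are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the EH disposition codes (not the kernel's values). */
enum model_disposition {
    MODEL_SUCCESS,
    MODEL_TARGET_ERROR
};

/* Sense key 4 is HARDWARE ERROR. */
#define MODEL_SENSE_KEY_HARDWARE_ERROR 0x4u

/*
 * Models the result of scsi_check_sense() for a given sense key, as
 * described in this message: TARGET_ERROR before the commit, SUCCESS
 * after it. All other sense keys are treated as SUCCESS to keep the
 * model small (an assumption of this sketch).
 */
static enum model_disposition model_check_sense(unsigned int sense_key,
                                                bool after_commit)
{
    if (sense_key == MODEL_SENSE_KEY_HARDWARE_ERROR)
        return after_commit ? MODEL_SUCCESS : MODEL_TARGET_ERROR;
    return MODEL_SUCCESS;
}

/* Models scsi_eh_tur(): 0 means "device ready", 1 means "device not ready". */
static int model_eh_tur(enum model_disposition disposition)
{
    switch (disposition) {
    case MODEL_SUCCESS:
        return 0;
    default:
        return 1;
    }
}

/*
 * Returns true if, in this model, the failed TUR ends with the device
 * being moved to OFFLINE.
 */
static bool model_device_goes_offline(unsigned int sense_key, bool after_commit)
{
    return model_eh_tur(model_check_sense(sense_key, after_commit)) != 0;
}
```

In this model, model_device_goes_offline(0x4, false) is true (old
behavior, device offlined) and model_device_goes_offline(0x4, true) is
false (new behavior, device stays "running").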

Am I missing something? Is it a bug or intentional new behavior?

Thanks,
    Lev. 



* Re: Device is not becoming offline on failed TUR
  2016-01-17 10:54 Device is not becoming offline on failed TUR Lev Vainblat
@ 2016-01-19 12:46 ` Hannes Reinecke
From: Hannes Reinecke @ 2016-01-19 12:46 UTC
  To: Lev Vainblat, linux-scsi; +Cc: Yaron Presente, Alex Lyakas

On 01/17/2016 11:54 AM, Lev Vainblat wrote:
> Hello,
> 
> We have a virtual target that under some circumstances returns
> HARDWARE_ERROR (4/44) on TUR. Previously, on the initiator side,
> scsi_check_sense() returned TARGET_ERROR, which scsi_eh_tur()
> converted to 1 (device not ready). scsi_eh_ready_devs() then called
> scsi_eh_offline_sdevs() to move the device to the OFFLINE state.
> 
> This logic was changed in commit
> http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=87f14e6
> Now, on HARDWARE_ERROR, scsi_check_sense() only sets
> DID_TARGET_FAILURE and returns SUCCESS. As a result the device
> remains "running", and scsi_eh_flush_done_q() retries the command
> instead of finishing it. So in this case the stuck command takes
> much longer to finish.
> 
> Am I missing something? Is it a bug or intentional new behavior?
> 
That is intentional.
We should only ever set the device to 'offline' if we cannot
communicate with it.
If we can (and that is the case here, as the drive returns a sense
code) the communication is okay, and the device should not be set to
offline.

It's up to the driver/calling application to correctly handle the
sense code.
As usual.
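
As a rough illustration of what "handling the sense code" can look like
on the application side, here is a minimal C sketch that pulls the sense
key, ASC and ASCQ out of a raw sense buffer (fixed or descriptor format)
and maps them to an action. The names struct sense_info, decode_sense(),
enum app_action and app_policy() are hypothetical, and the policy itself
is just one possible choice, not something the SCSI midlayer prescribes.
It assumes the "4/44" reported above means sense key 4 with ASC 0x44.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the decoded sense triple. */
struct sense_info {
    uint8_t key;   /* sense key, e.g. 0x4 = HARDWARE ERROR */
    uint8_t asc;   /* additional sense code */
    uint8_t ascq;  /* additional sense code qualifier */
};

/*
 * Decode sense key / ASC / ASCQ from raw sense data.
 * Response codes 0x70/0x71 use the fixed format, 0x72/0x73 the
 * descriptor format. Returns false if the buffer is too short or the
 * response code is not recognised.
 */
static bool decode_sense(const uint8_t *buf, size_t len, struct sense_info *out)
{
    if (buf == NULL || out == NULL || len < 1)
        return false;

    switch (buf[0] & 0x7f) {
    case 0x70:
    case 0x71:
        if (len < 14)
            return false;
        out->key = buf[2] & 0x0f;
        out->asc = buf[12];
        out->ascq = buf[13];
        return true;
    case 0x72:
    case 0x73:
        if (len < 4)
            return false;
        out->key = buf[1] & 0x0f;
        out->asc = buf[2];
        out->ascq = buf[3];
        return true;
    default:
        return false;
    }
}

/* Hypothetical actions an application might take. */
enum app_action { APP_CONTINUE, APP_RETRY_LATER, APP_FAIL_DEVICE };

/* Example policy only: treat HARDWARE ERROR as fatal, NOT READY as transient. */
static enum app_action app_policy(const struct sense_info *s)
{
    if (s->key == 0x4)
        return APP_FAIL_DEVICE;
    if (s->key == 0x2)
        return APP_RETRY_LATER;
    return APP_CONTINUE;
}
```

With such a policy the application can stop using the device as soon as
it sees the hardware error, without the midlayer having to set the
device offline.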

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
