linux-scsi.vger.kernel.org archive mirror
* [PATCH] scsi: fix hang in scsi error handling
@ 2015-07-15 12:47 Kevin Groeneveld
  2015-07-16 11:11 ` Hannes Reinecke
  0 siblings, 1 reply; 6+ messages in thread
From: Kevin Groeneveld @ 2015-07-15 12:47 UTC (permalink / raw)
  To: JBottomley
  Cc: linux-scsi, festevam, richard.zhu, arnd, linux, Kevin Groeneveld

With the following setup/steps I can consistently trigger the scsi host to
hang requiring a reboot:
1. iMX6Q processor with built in AHCI compatible SATA host
2. SATA port multiplier in CBS mode connected to iMX6Q
3. HDD connected to port multiplier
4. CDROM connected to port multiplier
5. trigger continuous I/O to HDD
6. repeatedly execute CDROM_DRIVE_STATUS ioctl on CDROM with no disc in
   drive

I don't think this issue is iMX6 specific but that is the only platform
I have duplicated the hang on.

To trigger the issue at least two CPU cores must be enabled and the HDD
access and CDROM ioctls must be happening concurrently. If I only enable
one CPU core the hang does not occur.
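
Continuous I/O to the HDD (step 5) can be generated with something as
simple as dd from the disk's block device.  As a minimal sketch (the
/dev/sda path is only an assumption for illustration, and page cache
effects are ignored):

/* keep the HDD busy by reading its block device in a loop */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	char buf[64 * 1024];
	int fd = open("/dev/sda", O_RDONLY);

	if (fd < 0) {
		perror("cannot open /dev/sda");
		return 1;
	}

	for (;;) {
		if (read(fd, buf, sizeof(buf)) <= 0)
			lseek(fd, 0, SEEK_SET);	/* restart at EOF or error */
	}
}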

The following C program can be used to trigger the CDROM ioctl:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/cdrom.h>

int main(int argc, char* argv[])
{
	int fd;

	fd = open("/dev/cdrom", O_RDONLY | O_NONBLOCK);
	if(fd < 0)
	{
		perror("cannot open /dev/cdrom");
		return fd;
	}

	for(;;)
	{
		ioctl(fd, CDROM_DRIVE_STATUS, 0);
		usleep(100 * 1000);
	}
}

When the hang occurs shost->host_busy == 2 and shost->host_failed == 1 in
the scsi_eh_wakeup function. However, this function only wakes the error
handler if host_busy == host_failed.

The patch changes the condition to test if host_busy >= host_failed and
updates the corresponding condition in scsi_error_handler. Without the
patch I can trigger the hang within seconds. With the patch I have not
duplicated the hang after hours of testing.

Signed-off-by: Kevin Groeneveld <kgroeneveld@lenbrook.com>
---
 drivers/scsi/scsi_error.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 106884a..853964b 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -61,7 +61,7 @@ static int scsi_try_to_abort_cmd(struct scsi_host_template *,
 /* called with shost->host_lock held */
 void scsi_eh_wakeup(struct Scsi_Host *shost)
 {
-	if (atomic_read(&shost->host_busy) == shost->host_failed) {
+	if (atomic_read(&shost->host_busy) >= shost->host_failed) {
 		trace_scsi_eh_wakeup(shost);
 		wake_up_process(shost->ehandler);
 		SCSI_LOG_ERROR_RECOVERY(5, shost_printk(KERN_INFO, shost,
@@ -2173,7 +2173,7 @@ int scsi_error_handler(void *data)
 	while (!kthread_should_stop()) {
 		set_current_state(TASK_INTERRUPTIBLE);
 		if ((shost->host_failed == 0 && shost->host_eh_scheduled == 0) ||
-		    shost->host_failed != atomic_read(&shost->host_busy)) {
+		    shost->host_failed > atomic_read(&shost->host_busy)) {
 			SCSI_LOG_ERROR_RECOVERY(1,
 				shost_printk(KERN_INFO, shost,
 					     "scsi_eh_%d: sleeping\n",
-- 
1.7.4.1



* Re: [PATCH] scsi: fix hang in scsi error handling
  2015-07-15 12:47 [PATCH] scsi: fix hang in scsi error handling Kevin Groeneveld
@ 2015-07-16 11:11 ` Hannes Reinecke
  2015-07-16 18:55   ` Kevin Groeneveld
  0 siblings, 1 reply; 6+ messages in thread
From: Hannes Reinecke @ 2015-07-16 11:11 UTC (permalink / raw)
  To: Kevin Groeneveld, JBottomley
  Cc: linux-scsi, festevam, richard.zhu, arnd, linux

On 07/15/2015 02:47 PM, Kevin Groeneveld wrote:
> With the following setup/steps I can consistently trigger the scsi host to
> hang requiring a reboot:
> 1. iMX6Q processor with built in AHCI compatible SATA host
> 2. SATA port multiplier in CBS mode connected to iMX6Q
> 3. HDD connected to port multiplier
> 4. CDROM connected to port multiplier
> 5. trigger continuous I/O to HDD
> 6. repeatedly execute CDROM_DRIVE_STATUS ioctl on CDROM with no disc in
>    drive
> 
> I don't think this issue is iMX6 specific but that is the only platform
> I have duplicated the hang on.
> 
> To trigger the issue at least two CPU cores must be enabled and the HDD
> access and CDROM ioctls must be happening concurrently. If I only enable
> one CPU core the hang does not occur.
> 
> The following C program can be used to trigger the CDROM ioctl:
> 
> #include <stdio.h>
> #include <fcntl.h>
> #include <linux/cdrom.h>
> 
> int main(int argc, char* argv[])
> {
> 	int fd;
> 
> 	fd = open("/dev/cdrom", O_RDONLY | O_NONBLOCK);
> 	if(fd < 0)
> 	{
> 		perror("cannot open /dev/cdrom");
> 		return fd;
> 	}
> 
> 	for(;;)
> 	{
> 		ioctl(fd, CDROM_DRIVE_STATUS, 0);
> 		usleep(100 * 1000);
> 	}
> }
> 
> When the hang occurs shost->host_busy == 2 and shost->host_failed == 1 in
> the scsi_eh_wakeup function. However this function only wakes the error
> handler if host_busy == host_failed.
> 
Which just means that one command is still outstanding, and we need
to wait for it to complete.
But see below...

> The patch changes the condition to test if host_busy >= host_failed and
> updates the corresponding condition in scsi_error_handler. Without the
> patch I can trigger the hang within seconds. With the patch I have not
> duplicated the hang after hours of testing.
> 
> Signed-off-by: Kevin Groeneveld <kgroeneveld@lenbrook.com>
> ---
>  drivers/scsi/scsi_error.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 106884a..853964b 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -61,7 +61,7 @@ static int scsi_try_to_abort_cmd(struct scsi_host_template *,
>  /* called with shost->host_lock held */
>  void scsi_eh_wakeup(struct Scsi_Host *shost)
>  {
> -	if (atomic_read(&shost->host_busy) == shost->host_failed) {
> +	if (atomic_read(&shost->host_busy) >= shost->host_failed) {
>  		trace_scsi_eh_wakeup(shost);
>  		wake_up_process(shost->ehandler);
>  		SCSI_LOG_ERROR_RECOVERY(5, shost_printk(KERN_INFO, shost,
> @@ -2173,7 +2173,7 @@ int scsi_error_handler(void *data)
>  	while (!kthread_should_stop()) {
>  		set_current_state(TASK_INTERRUPTIBLE);
>  		if ((shost->host_failed == 0 && shost->host_eh_scheduled == 0) ||
> -		    shost->host_failed != atomic_read(&shost->host_busy)) {
> +		    shost->host_failed > atomic_read(&shost->host_busy)) {
>  			SCSI_LOG_ERROR_RECOVERY(1,
>  				shost_printk(KERN_INFO, shost,
>  					     "scsi_eh_%d: sleeping\n",
> 
Hmm.
I am really not sure about this.

'host_busy' indicates the number of outstanding commands, and
'host_failed' is the number of commands which have failed (on the
ground that failed commands are considered outstanding, too).

So the first hunk would change the behaviour from
'start SCSI EH once all commands are completed or failed' to
'start SCSI EH for _any_ command if scsi_eh_wakeup is called'
(note that shost->host_failed might be '0'...).
Which doesn't sound right.
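
As a tiny stand-alone illustration (plain user-space C with invented
names, purely to make the difference between the two conditions
concrete for the counter values seen in this thread):

#include <stdio.h>

/* relaxed == 0 models the original '==' test,
 * relaxed == 1 models the patched '>=' test */
static const char *decide(int busy, int failed, int relaxed)
{
	int wake = relaxed ? (busy >= failed) : (busy == failed);
	return wake ? "wake EH" : "keep waiting";
}

int main(void)
{
	/* the reported hang: one command failed, one still in flight */
	printf("busy=2 failed=1: original=%s, patched=%s\n",
	       decide(2, 1, 0), decide(2, 1, 1));
	/* nothing has failed yet, one command merely outstanding */
	printf("busy=1 failed=0: original=%s, patched=%s\n",
	       decide(1, 0, 0), decide(1, 0, 1));
	return 0;
}

The second case shows the '_any_ command' behaviour: with the patched
test the error handler would be woken even though host_failed is 0.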

The second hunk seems to be okay, as in principle 'host_busy' could
have been decreased before the check is done (ie someone could have
called ->done on a failed command).
But even so this would point to an invalid command completion; as
soon as a command is marked as 'failed' control is back in the SCSI
midlayer, and no-one else should be tampering with it.

I guess this needs further debugging to get to the bottom of it.

Sorry, but:

NACK.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


* RE: [PATCH] scsi: fix hang in scsi error handling
  2015-07-16 11:11 ` Hannes Reinecke
@ 2015-07-16 18:55   ` Kevin Groeneveld
  2015-07-17  6:02     ` Hannes Reinecke
  2015-07-27 10:38     ` Hannes Reinecke
  0 siblings, 2 replies; 6+ messages in thread
From: Kevin Groeneveld @ 2015-07-16 18:55 UTC (permalink / raw)
  To: Hannes Reinecke, JBottomley@odin.com
  Cc: linux-scsi@vger.kernel.org, festevam@gmail.com,
	richard.zhu@freescale.com, arnd@arndb.de, linux@arm.linux.org.uk

> -----Original Message-----
> From: Hannes Reinecke [mailto:hare@suse.de]
> Sent: July-16-15 7:11 AM
> > When the hang occurs shost->host_busy == 2 and shost->host_failed == 1
> > in the scsi_eh_wakeup function. However this function only wakes the
> > error handler if host_busy == host_failed.
> >
> Which just means that one command is still outstanding, and we need to wait
> for it to complete.
> But see below...

So the root cause of the hang is maybe that the second command never
completes? Maybe host_failed being non zero is blocking something in the
port multiplier code?

> Hmm.
> I am really not sure about this.

I wasn't sure either; that is one reason why I posted the patch.

> 'host_busy' indicates the number of outstanding commands, and
> 'host_failed' is the number of commands which have failed (on the ground
> that failed commands are considered outstanding, too).
> 
> So the first hunk would change the behaviour from 'start SCSI EH once all
> commands are completed or failed' to 'start SCSI EH for _any_ command if
> scsi_eh_wakeup is called'
> (note that shost_failed might be '0'...).
> Which doesn't sound right.

So could the patch create any problems by starting the EH any time
scsi_eh_wakeup is called? Or is it just inefficient?

> I guess this needs further debugging to get to the bottom of it.

Any suggestions on things I could try?

The fact that the problem goes away when I only enable one CPU core makes
me think there is a race happening somewhere.


Kevin


* Re: [PATCH] scsi: fix hang in scsi error handling
  2015-07-16 18:55   ` Kevin Groeneveld
@ 2015-07-17  6:02     ` Hannes Reinecke
  2015-07-27 10:38     ` Hannes Reinecke
  1 sibling, 0 replies; 6+ messages in thread
From: Hannes Reinecke @ 2015-07-17  6:02 UTC (permalink / raw)
  To: Kevin Groeneveld, JBottomley@odin.com
  Cc: linux-scsi@vger.kernel.org, festevam@gmail.com,
	richard.zhu@freescale.com, arnd@arndb.de, linux@arm.linux.org.uk

On 07/16/2015 08:55 PM, Kevin Groeneveld wrote:
>> -----Original Message-----
>> From: Hannes Reinecke [mailto:hare@suse.de]
>> Sent: July-16-15 7:11 AM
>>> When the hang occurs shost->host_busy == 2 and shost->host_failed == 1
>>> in the scsi_eh_wakeup function. However this function only wakes the
>>> error handler if host_busy == host_failed.
>>>
>> Which just means that one command is still outstanding, and we need to wait
>> for it to complete.
>> But see below...
> 
> So the root cause of the hang is maybe that the second command never
> completes? Maybe host_failed being non zero is blocking something in the
> port multiplier code?
> 
Yes, I think that's one of the reasons. You really should
investigate what happens to the second command.
(It might well be that the second command is issued _before_ the
first command completes, effectively creating a livelock.)

>> Hmm.
>> I am really not sure about this.
> 
> I wasn't sure either, that is one reason why I posted the patch.
> 
>> 'host_busy' indicates the number of outstanding commands, and
>> 'host_failed' is the number of commands which have failed (on the ground
>> that failed commands are considered outstanding, too).
>>
>> So the first hunk would change the behaviour from 'start SCSI EH once all
>> commands are completed or failed' to 'start SCSI EH for _any_ command if
>> scsi_eh_wakeup is called'
>> (note that shost_failed might be '0'...).
>> Which doesn't sound right.
> 
> So could the patch create any problems by starting the EH any time
> scsi_eh_wakeup is called? Or is it is just inefficient?
> 
The patch will play havoc with the SCSI EH code, as by the time SCSI
EH is working on the list of failed commands the host is assumed to
be stopped. So there cannot be any out-of-band modifications to the
list of failed commands.
With your patch commands might fail _while SCSI EH is active_,
so the list of failed commands will be modified during SCSI EH.
As the SCSI EH code doesn't have any locks on that list things will
become very tangled after that.

>> I guess this needs further debugging to get to the bottom of it.
> 
> Any suggestions on things I could try?
> 
Enable SCSI logging (or scsi tracing) and figure out what happens to
the second command.

> The fact that the problem goes away when I only enable one CPU core makes
> me think there is a race happening somewhere.
> 
Yeah, most definitely. But I doubt it's in the error handler, it's
more likely somewhere else.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


* Re: [PATCH] scsi: fix hang in scsi error handling
  2015-07-16 18:55   ` Kevin Groeneveld
  2015-07-17  6:02     ` Hannes Reinecke
@ 2015-07-27 10:38     ` Hannes Reinecke
  2015-07-27 15:31       ` Kevin Groeneveld
  1 sibling, 1 reply; 6+ messages in thread
From: Hannes Reinecke @ 2015-07-27 10:38 UTC (permalink / raw)
  To: Kevin Groeneveld, JBottomley@odin.com
  Cc: linux-scsi@vger.kernel.org, festevam@gmail.com,
	richard.zhu@freescale.com, arnd@arndb.de, linux@arm.linux.org.uk

On 07/16/2015 08:55 PM, Kevin Groeneveld wrote:
>> -----Original Message-----
>> From: Hannes Reinecke [mailto:hare@suse.de]
>> Sent: July-16-15 7:11 AM
>>> When the hang occurs shost->host_busy == 2 and shost->host_failed == 1
>>> in the scsi_eh_wakeup function. However this function only wakes the
>>> error handler if host_busy == host_failed.
>>>
>> Which just means that one command is still outstanding, and we need to wait
>> for it to complete.
>> But see below...
> 
> So the root cause of the hang is maybe that the second command never
> completes? Maybe host_failed being non zero is blocking something in the
> port multiplier code?
> 
>> Hmm.
>> I am really not sure about this.
> 
> I wasn't sure either, that is one reason why I posted the patch.
> 
>> 'host_busy' indicates the number of outstanding commands, and
>> 'host_failed' is the number of commands which have failed (on the ground
>> that failed commands are considered outstanding, too).
>>
>> So the first hunk would change the behaviour from 'start SCSI EH once all
>> commands are completed or failed' to 'start SCSI EH for _any_ command if
>> scsi_eh_wakeup is called'
>> (note that shost_failed might be '0'...).
>> Which doesn't sound right.
> 
> So could the patch create any problems by starting the EH any time
> scsi_eh_wakeup is called? Or is it is just inefficient?
> 
SCSI EH _relies_ on the fact that no other commands are outstanding
on that SCSI host, hence the contents of eh_entry list won't change.
Your patch breaks this assumption, causing some I/O to be lost.

>> I guess this needs further debugging to get to the bottom of it.
> 
> Any suggestions on things I could try?
> 
> The fact that the problem goes away when I only enable one CPU core makes
> me think there is a race happening somewhere.
> 
Not sure here. However, you are effectively creating an endless loop
with your testcase, assuming that 'ioctl' finishes all I/O before
returning.
Which _actually_ is not a requirement; the I/O itself needs to be
finished by the time the ioctl returns (obviously), but the
_structures_ associated with the ioctl might linger on a bit longer
(delayed freeing and whatnot).
Yet this is a bit far-fetched, and definitely needs some more analysis.

For debugging I would suggest looking at the lifetime of each scsi
command, figuring out if by the time the ioctl returns the scsi
command is indeed freed up.
Also you might want to play around with the 'usleep' a bit; my
assumption is that at one point for a large enough wait the problem
goes away.
(And, incidentally, we might actually get more than one pending
command if the sleep is small enough; but this is just conjecture :-)
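
A minimal sketch for that experiment (an assumed variant of the test
program from the patch description, not something posted in this
thread) could take the delay in milliseconds from the command line:

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/cdrom.h>

int main(int argc, char *argv[])
{
	/* delay between ioctls in ms, default 100 as in the original;
	 * pass 0 to drop the sleep entirely */
	long delay_ms = (argc > 1) ? atol(argv[1]) : 100;
	int fd = open("/dev/cdrom", O_RDONLY | O_NONBLOCK);

	if (fd < 0) {
		perror("cannot open /dev/cdrom");
		return 1;
	}

	for (;;) {
		ioctl(fd, CDROM_DRIVE_STATUS, 0);
		if (delay_ms > 0)
			usleep(delay_ms * 1000);
	}
}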

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


* RE: [PATCH] scsi: fix hang in scsi error handling
  2015-07-27 10:38     ` Hannes Reinecke
@ 2015-07-27 15:31       ` Kevin Groeneveld
  0 siblings, 0 replies; 6+ messages in thread
From: Kevin Groeneveld @ 2015-07-27 15:31 UTC (permalink / raw)
  To: Hannes Reinecke, JBottomley@odin.com
  Cc: linux-scsi@vger.kernel.org, festevam@gmail.com,
	richard.zhu@freescale.com, arnd@arndb.de, linux@arm.linux.org.uk

> -----Original Message-----
> From: Hannes Reinecke [mailto:hare@suse.de]
> Sent: July-27-15 6:39 AM
> On 07/16/2015 08:55 PM, Kevin Groeneveld wrote:
> >> -----Original Message-----
> >> From: Hannes Reinecke [mailto:hare@suse.de]
> >> Sent: July-16-15 7:11 AM
> >>> When the hang occurs shost->host_busy == 2 and shost->host_failed ==
> >>> 1 in the scsi_eh_wakeup function. However this function only wakes
> >>> the error handler if host_busy == host_failed.
> >>>
> >> Which just means that one command is still outstanding, and we need
> >> to wait for it to complete.
> >> But see below...
> >
> > So the root cause of the hang is maybe that the second command never
> > completes? Maybe host_failed being non zero is blocking something in
> > the port multiplier code?
> >
> >> Hmm.
> >> I am really not sure about this.
> >
> > I wasn't sure either, that is one reason why I posted the patch.
> >
> >> 'host_busy' indicates the number of outstanding commands, and
> >> 'host_failed' is the number of commands which have failed (on the
> >> ground that failed commands are considered outstanding, too).
> >>
> >> So the first hunk would change the behaviour from 'start SCSI EH once
> >> all commands are completed or failed' to 'start SCSI EH for _any_
> >> command if scsi_eh_wakeup is called'
> >> (note that shost_failed might be '0'...).
> >> Which doesn't sound right.
> >
> > So could the patch create any problems by starting the EH any time
> > scsi_eh_wakeup is called? Or is it is just inefficient?
> >
> SCSI EH _relies_ on the fact that no other commands are outstanding on that
> SCSI host, hence the contents of eh_entry list won't change.
> Your patch breaks this assumption, causing some I/O to be lost.
> 
> >> I guess this needs further debugging to get to the bottom of it.
> >
> > Any suggestions on things I could try?
> >
> > The fact that the problem goes away when I only enable one CPU core
> > makes me think there is a race happening somewhere.
> >
> Not sure here. However, you are effectively creating an endless loop with
> your testcase, assuming that 'ioctl' finishes all I/O before returning.
> Which _actually_ is not a requirement; the I/O itself needs to be finished by
> the time the ioctl returns (obviously), but the _structures_ associated with
> the ioctl might linger on a bit longer (delayed freeing and whatnot).
> Yet this is a bit far-fetched, and definitely needs some more analysis.
> 
> For debugging I would suggest looking at the lifetime of each scsi command,
> figuring out if by the time the ioctl returns the scsi command is indeed freed
> up.

Thanks for the further feedback on this.

I haven't had a lot of time to debug this further.  Last week I tried
enabling SCSI logging as you suggested in your previous post.  I tried
many different combinations of setting /proc/sys/dev/scsi/logging_level
to enable different types and levels of debugging.  However, everything
I tried either made the problem impossible to trigger or produced
nothing useful in the log.

I was thinking of looking into the SCSI trace functionality to see if
that would give more useful results.

One thing I did notice, which may be a small clue, is the following
values each time after the hang:
/sys/class/scsi_device/0:0:0:0/device/device_busy = 1 (CD-ROM)
/sys/class/scsi_device/0:1:0:0/device/device_busy = 0 (HDD)

Before the hang the HDD busy value varies from 0 to 31.  After the
hang the HDD busy value is always 0.
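
(For reference, a minimal sketch of how these counters could be
watched once a second while reproducing; the sysfs paths are the ones
reported above and will differ on other systems:)

#include <stdio.h>
#include <unistd.h>

static void show(const char *path)
{
	char buf[32];
	FILE *f = fopen(path, "r");

	if (f && fgets(buf, sizeof(buf), f))
		printf("%s: %s", path, buf);
	if (f)
		fclose(f);
}

int main(void)
{
	for (;;) {
		show("/sys/class/scsi_device/0:0:0:0/device/device_busy");
		show("/sys/class/scsi_device/0:1:0:0/device/device_busy");
		sleep(1);
	}
}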

> Also you might want to play around with the 'usleep' a bit; my assumption is
> that at one point for a large enough wait the problem goes away.
> (And, incidentally, we might actually getting more than one pending
> commands if the sleep is small enough; but this is just conjecture :-)

I tried a 10-second usleep.  On the first attempt the third ioctl
never returned.  After a reboot and a second attempt the 10th ioctl
never returned.

I also tried getting rid of the usleep entirely.  If I avoid HDD
access at the same time I can get about 100 ioctl calls per second and
/sys/.../device_busy never seems to go above 1.  As soon as I access
the HDD all SCSI access hangs.


Kevin

