public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] 2.4.X Avoid excessive recursion when setting device offline
@ 2004-03-29 17:07 Justin T. Gibbs
  2004-03-29 17:26 ` Arjan van de Ven
  0 siblings, 1 reply; 6+ messages in thread
From: Justin T. Gibbs @ 2004-03-29 17:07 UTC (permalink / raw)
  To: linux-scsi

How to reproduce: Queue up lots of I/O to a device.  Set that device
		  offline.

Result: The SCSI mid-layer blows out the kernel stack due to the inherent
	recursion in its request function.  The repeating stack trace goes
	something like:

	scsi_request_fn
	__scsi_end_request
	scsi_release_command
	scsi_queue_next_request
	scsi_request_fn
	...

This patch prevents recursive calls through the scsi_request_fn
for the same queue:

===== scsi.h 1.9 vs edited =====
--- 1.9/drivers/scsi/scsi.h	Wed Jun 25 17:34:08 2003
+++ edited/scsi.h	Sun Mar 28 13:50:35 2004
@@ -609,6 +609,7 @@
 	unsigned starved:1;	/* unable to process commands because
 				   host busy */
 	unsigned no_start_on_add:1;	/* do not issue start on add */
+	unsigned running_request_fn:1;
 
 	// Flag to allow revalidate to succeed in sd_open
 	int allow_revalidate;
===== scsi_lib.c 1.17 vs edited =====
--- 1.17/drivers/scsi/scsi_lib.c	Fri Jul  4 11:12:45 2003
+++ edited/scsi_lib.c	Sun Mar 28 14:07:48 2004
@@ -850,12 +850,20 @@
 	if (!SDpnt) {
 		panic("Missing device");
 	}
+
 	SHpnt = SDpnt->host;
 
 	/*
 	 * To start with, we keep looping until the queue is empty, or until
-	 * the host is no longer able to accept any more requests.
+	 * the host is no longer able to accept any more requests.  We avoid
+	 * excessive recursion through the SCSI layer by recording that we
+	 * are running this queue and rejecting requests to run it again. This
+	 * avoids blowing out the stack when multiple commands are failed
+	 * during our loop.
 	 */
+	if (SDpnt->running_request_fn)
+		return;
+	SDpnt->running_request_fn = 1;
 	while (1 == 1) {
 		/*
 		 * Check this again - each time we loop through we will have
@@ -863,7 +871,7 @@
 		 * we need to check to see if the queue is plugged or not.
 		 */
 		if (SHpnt->in_recovery || q->plugged)
-			return;
+			break;
 
 		/*
 		 * If the device cannot accept another request, then quit.
@@ -1099,6 +1107,7 @@
 		 */
 		spin_lock_irq(&io_request_lock);
 	}
+	SDpnt->running_request_fn = 0;
 }
 
 /*


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] 2.4.X Avoid excessive recursion when setting device offline
  2004-03-29 17:07 [PATCH] 2.4.X Avoid excessive recursion when setting device offline Justin T. Gibbs
@ 2004-03-29 17:26 ` Arjan van de Ven
  2004-03-29 17:42   ` Justin T. Gibbs
                     ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Arjan van de Ven @ 2004-03-29 17:26 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: linux-scsi


[-- Attachment #1.1: Type: text/plain, Size: 1117 bytes --]

On Mon, 2004-03-29 at 19:07, Justin T. Gibbs wrote:
> How to reproduce: Queue up lots of I/O to a device.  Set that device
> 		  offline.
> 
> Result: The SCSI mid-layer blows out the kernel stack due to the inherent
> 	recursion in its request function.  The repeating stack trace goes
> 	something like:
> 
> 	scsi_request_fn
> 	__scsi_end_request
> 	scsi_release_command
> 	scsi_queue_next_request
> 	scsi_request_fn
> 	...
> 
> This patch prevents recursive calls through the scsi_request_fn
> for the same queue:

won't this achieve the same ?

diff -urN Linux/drivers/scsi/scsi_lib.c linux/drivers/scsi/scsi_lib.c
--- Linux/drivers/scsi/scsi_lib.c       2002-10-03 16:50:07.000000000
+0200
+++ linux/drivers/scsi/scsi_lib.c       2002-10-03 17:33:20.000000000
+0200
@@ -442,7 +442,7 @@
         * This will goose the queue request function at the end, so we
don't
         * need to worry about launching another command.
         */
-       scsi_release_command(SCpnt);
+       __scsi_release_command(SCpnt);
  
        if( frequeue ) {
                request_queue_t *q;


[-- Attachment #1.2: scsi.patch --]
[-- Type: text/x-patch, Size: 480 bytes --]

diff -urN Linux/drivers/scsi/scsi_lib.c linux/drivers/scsi/scsi_lib.c
--- Linux/drivers/scsi/scsi_lib.c	2002-10-03 16:50:07.000000000 +0200
+++ linux/drivers/scsi/scsi_lib.c	2002-10-03 17:33:20.000000000 +0200
@@ -442,7 +442,7 @@
 	 * This will goose the queue request function at the end, so we don't
 	 * need to worry about launching another command.
 	 */
-	scsi_release_command(SCpnt);
+	__scsi_release_command(SCpnt);
 
 	if( frequeue ) {
 		request_queue_t *q;

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] 2.4.X Avoid excessive recursion when setting device offline
  2004-03-29 17:26 ` Arjan van de Ven
@ 2004-03-29 17:42   ` Justin T. Gibbs
  2004-03-29 17:47   ` James Bottomley
  2004-03-30  7:03   ` Jens Axboe
  2 siblings, 0 replies; 6+ messages in thread
From: Justin T. Gibbs @ 2004-03-29 17:42 UTC (permalink / raw)
  To: arjanv; +Cc: linux-scsi

>> This patch prevents recursive calls through the scsi_request_fn
>> for the same queue:
> 
> won't this achieve the same ?

Yes, it appears it will and that the latest 2.4.X kernels already
have this change.  My only concern stems from the need to audit all
of the callers of __scsi_end_request to ensure they are setting
the frequeue flag correctly.

--
Justin


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] 2.4.X Avoid excessive recursion when setting device offline
  2004-03-29 17:26 ` Arjan van de Ven
  2004-03-29 17:42   ` Justin T. Gibbs
@ 2004-03-29 17:47   ` James Bottomley
  2004-03-29 17:52     ` Justin T. Gibbs
  2004-03-30  7:03   ` Jens Axboe
  2 siblings, 1 reply; 6+ messages in thread
From: James Bottomley @ 2004-03-29 17:47 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Justin T. Gibbs, SCSI Mailing List

On Mon, 2004-03-29 at 12:26, Arjan van de Ven wrote:
> won't this achieve the same ?
> 
> diff -urN Linux/drivers/scsi/scsi_lib.c linux/drivers/scsi/scsi_lib.c
> --- Linux/drivers/scsi/scsi_lib.c       2002-10-03 16:50:07.000000000
> +0200
> +++ linux/drivers/scsi/scsi_lib.c       2002-10-03 17:33:20.000000000
> -       scsi_release_command(SCpnt);
> +       __scsi_release_command(SCpnt);

Yes, this patch is in 2.4.25 already.  I seem to recall that's why it
was applied several years ago (to prevent recursion), but it's too long
ago for me to remember (or even bitkeeper, since it's also in scsi_lib.c
version 1.1).

Exactly which version of 2.4 is exhibiting this problem?

James



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] 2.4.X Avoid excessive recursion when setting device offline
  2004-03-29 17:47   ` James Bottomley
@ 2004-03-29 17:52     ` Justin T. Gibbs
  0 siblings, 0 replies; 6+ messages in thread
From: Justin T. Gibbs @ 2004-03-29 17:52 UTC (permalink / raw)
  To: James Bottomley, Arjan van de Ven; +Cc: SCSI Mailing List

> Yes, this patch is in 2.4.25 already.  I seem to recall that's why it
> was applied several years ago (to prevent recursion), but it's too long
> ago for me to remember (or even bitkeeper, since it's also in scsi_lib.c
> version 1.1).
> 
> Exactly which version of 2.4 is exhibiting this problem?

I was looking at RHEL3.  Sorry for the noise on the lists.
Arjan.  Please refer to bugzilla 119334.

--
Justin


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] 2.4.X Avoid excessive recursion when setting device offline
  2004-03-29 17:26 ` Arjan van de Ven
  2004-03-29 17:42   ` Justin T. Gibbs
  2004-03-29 17:47   ` James Bottomley
@ 2004-03-30  7:03   ` Jens Axboe
  2 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2004-03-30  7:03 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Justin T. Gibbs, linux-scsi

On Mon, Mar 29 2004, Arjan van de Ven wrote:
> On Mon, 2004-03-29 at 19:07, Justin T. Gibbs wrote:
> > How to reproduce: Queue up lots of I/O to a device.  Set that device
> > 		  offline.
> > 
> > Result: The SCSI mid-layer blows out the kernel stack due to the inherent
> > 	recursion in its request function.  The repeating stack trace goes
> > 	something like:
> > 
> > 	scsi_request_fn
> > 	__scsi_end_request
> > 	scsi_release_command
> > 	scsi_queue_next_request
> > 	scsi_request_fn
> > 	...
> > 
> > This patch prevents recursive calls through the scsi_request_fn
> > for the same queue:
> 
> won't this achieve the same ?
> 
> diff -urN Linux/drivers/scsi/scsi_lib.c linux/drivers/scsi/scsi_lib.c
> --- Linux/drivers/scsi/scsi_lib.c       2002-10-03 16:50:07.000000000
> +0200
> +++ linux/drivers/scsi/scsi_lib.c       2002-10-03 17:33:20.000000000
> +0200
> @@ -442,7 +442,7 @@
>          * This will goose the queue request function at the end, so we
> don't
>          * need to worry about launching another command.
>          */
> -       scsi_release_command(SCpnt);
> +       __scsi_release_command(SCpnt);
>   
>         if( frequeue ) {
>                 request_queue_t *q;
> 

> diff -urN Linux/drivers/scsi/scsi_lib.c linux/drivers/scsi/scsi_lib.c
> --- Linux/drivers/scsi/scsi_lib.c	2002-10-03 16:50:07.000000000 +0200
> +++ linux/drivers/scsi/scsi_lib.c	2002-10-03 17:33:20.000000000 +0200
> @@ -442,7 +442,7 @@
>  	 * This will goose the queue request function at the end, so we don't
>  	 * need to worry about launching another command.
>  	 */
> -	scsi_release_command(SCpnt);
> +	__scsi_release_command(SCpnt);
>  
>  	if( frequeue ) {
>  		request_queue_t *q;

That is better. If SCSI used blk_start_queue/stop_queue, it has
recursion detection as well (only allowing one reenter).

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2004-03-30  7:04 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-03-29 17:07 [PATCH] 2.4.X Avoid excessive recursion when setting device offline Justin T. Gibbs
2004-03-29 17:26 ` Arjan van de Ven
2004-03-29 17:42   ` Justin T. Gibbs
2004-03-29 17:47   ` James Bottomley
2004-03-29 17:52     ` Justin T. Gibbs
2004-03-30  7:03   ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox