* [PATCH] 2.4.X Avoid excessive recursion when setting device offline
@ 2004-03-29 17:07 Justin T. Gibbs
2004-03-29 17:26 ` Arjan van de Ven
0 siblings, 1 reply; 6+ messages in thread
From: Justin T. Gibbs @ 2004-03-29 17:07 UTC (permalink / raw)
To: linux-scsi
How to reproduce: Queue up lots of I/O to a device. Set that device
offline.
Result: The SCSI mid-layer blows out the kernel stack due to the inherent
recursion in its request function. The repeating stack trace goes
something like:
scsi_request_fn
__scsi_end_request
scsi_release_command
scsi_queue_next_request
scsi_request_fn
...
This patch prevents recursive calls through the scsi_request_fn
for the same queue:
===== scsi.h 1.9 vs edited =====
--- 1.9/drivers/scsi/scsi.h Wed Jun 25 17:34:08 2003
+++ edited/scsi.h Sun Mar 28 13:50:35 2004
@@ -609,6 +609,7 @@
unsigned starved:1; /* unable to process commands because
host busy */
unsigned no_start_on_add:1; /* do not issue start on add */
+ unsigned running_request_fn:1;
// Flag to allow revalidate to succeed in sd_open
int allow_revalidate;
===== scsi_lib.c 1.17 vs edited =====
--- 1.17/drivers/scsi/scsi_lib.c Fri Jul 4 11:12:45 2003
+++ edited/scsi_lib.c Sun Mar 28 14:07:48 2004
@@ -850,12 +850,20 @@
if (!SDpnt) {
panic("Missing device");
}
+
SHpnt = SDpnt->host;
/*
* To start with, we keep looping until the queue is empty, or until
- * the host is no longer able to accept any more requests.
+ * the host is no longer able to accept any more requests. We avoid
+ * excessive recursion through the SCSI layer by recording that we
+ * are running this queue and rejecting requests to run it again. This
+ * avoids blowing out the stack when multiple commands are failed
+ * during our loop.
*/
+ if (SDpnt->running_request_fn)
+ return;
+ SDpnt->running_request_fn = 1;
while (1 == 1) {
/*
* Check this again - each time we loop through we will have
@@ -863,7 +871,7 @@
* we need to check to see if the queue is plugged or not.
*/
if (SHpnt->in_recovery || q->plugged)
- return;
+ break;
/*
* If the device cannot accept another request, then quit.
@@ -1099,6 +1107,7 @@
*/
spin_lock_irq(&io_request_lock);
}
+ SDpnt->running_request_fn = 0;
}
/*
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] 2.4.X Avoid excessive recursion when setting device offline
2004-03-29 17:07 [PATCH] 2.4.X Avoid excessive recursion when setting device offline Justin T. Gibbs
@ 2004-03-29 17:26 ` Arjan van de Ven
2004-03-29 17:42 ` Justin T. Gibbs
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Arjan van de Ven @ 2004-03-29 17:26 UTC (permalink / raw)
To: Justin T. Gibbs; +Cc: linux-scsi
[-- Attachment #1.1: Type: text/plain, Size: 1117 bytes --]
On Mon, 2004-03-29 at 19:07, Justin T. Gibbs wrote:
> How to reproduce: Queue up lots of I/O to a device. Set that device
> offline.
>
> Result: The SCSI mid-layer blows out the kernel stack due to the inherent
> recursion in its request function. The repeating stack trace goes
> something like:
>
> scsi_request_fn
> __scsi_end_request
> scsi_release_command
> scsi_queue_next_request
> scsi_request_fn
> ...
>
> This patch prevents recursive calls through the scsi_request_fn
> for the same queue:
won't this achieve the same ?
diff -urN Linux/drivers/scsi/scsi_lib.c linux/drivers/scsi/scsi_lib.c
--- Linux/drivers/scsi/scsi_lib.c 2002-10-03 16:50:07.000000000
+0200
+++ linux/drivers/scsi/scsi_lib.c 2002-10-03 17:33:20.000000000
+0200
@@ -442,7 +442,7 @@
* This will goose the queue request function at the end, so we
don't
* need to worry about launching another command.
*/
- scsi_release_command(SCpnt);
+ __scsi_release_command(SCpnt);
if( frequeue ) {
request_queue_t *q;
[-- Attachment #1.2: scsi.patch --]
[-- Type: text/x-patch, Size: 480 bytes --]
diff -urN Linux/drivers/scsi/scsi_lib.c linux/drivers/scsi/scsi_lib.c
--- Linux/drivers/scsi/scsi_lib.c 2002-10-03 16:50:07.000000000 +0200
+++ linux/drivers/scsi/scsi_lib.c 2002-10-03 17:33:20.000000000 +0200
@@ -442,7 +442,7 @@
* This will goose the queue request function at the end, so we don't
* need to worry about launching another command.
*/
- scsi_release_command(SCpnt);
+ __scsi_release_command(SCpnt);
if( frequeue ) {
request_queue_t *q;
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] 2.4.X Avoid excessive recursion when setting device offline
2004-03-29 17:26 ` Arjan van de Ven
@ 2004-03-29 17:42 ` Justin T. Gibbs
2004-03-29 17:47 ` James Bottomley
2004-03-30 7:03 ` Jens Axboe
2 siblings, 0 replies; 6+ messages in thread
From: Justin T. Gibbs @ 2004-03-29 17:42 UTC (permalink / raw)
To: arjanv; +Cc: linux-scsi
>> This patch prevents recursive calls through the scsi_request_fn
>> for the same queue:
>
> won't this achieve the same ?
Yes, it appears it will and that the latest 2.4.X kernels already
have this change. My only concern stems from the need to audit all
of the callers of __scsi_end_request to ensure they are setting
the frequeue flag correctly.
--
Justin
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] 2.4.X Avoid excessive recursion when setting device offline
2004-03-29 17:26 ` Arjan van de Ven
2004-03-29 17:42 ` Justin T. Gibbs
@ 2004-03-29 17:47 ` James Bottomley
2004-03-29 17:52 ` Justin T. Gibbs
2004-03-30 7:03 ` Jens Axboe
2 siblings, 1 reply; 6+ messages in thread
From: James Bottomley @ 2004-03-29 17:47 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Justin T. Gibbs, SCSI Mailing List
On Mon, 2004-03-29 at 12:26, Arjan van de Ven wrote:
> won't this achieve the same ?
>
> diff -urN Linux/drivers/scsi/scsi_lib.c linux/drivers/scsi/scsi_lib.c
> --- Linux/drivers/scsi/scsi_lib.c 2002-10-03 16:50:07.000000000
> +0200
> +++ linux/drivers/scsi/scsi_lib.c 2002-10-03 17:33:20.000000000
> - scsi_release_command(SCpnt);
> + __scsi_release_command(SCpnt);
Yes, this patch is in 2.4.25 already. I seem to recall that's why it
was applied several years ago (to prevent recursion), but it's too long
ago for me to remember (or even bitkeeper, since it's also in scsi_lib.c
version 1.1).
Exactly which version of 2.4 is exhibiting this problem?
James
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] 2.4.X Avoid excessive recursion when setting device offline
2004-03-29 17:47 ` James Bottomley
@ 2004-03-29 17:52 ` Justin T. Gibbs
0 siblings, 0 replies; 6+ messages in thread
From: Justin T. Gibbs @ 2004-03-29 17:52 UTC (permalink / raw)
To: James Bottomley, Arjan van de Ven; +Cc: SCSI Mailing List
> Yes, this patch is in 2.4.25 already. I seem to recall that's why it
> was applied several years ago (to prevent recursion), but it's too long
> ago for me to remember (or even bitkeeper, since it's also in scsi_lib.c
> version 1.1).
>
> Exactly which version of 2.4 is exhibiting this problem?
I was looking at RHEL3. Sorry for the noise on the lists.
Arjan. Please refer to bugzilla 119334.
--
Justin
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] 2.4.X Avoid excessive recursion when setting device offline
2004-03-29 17:26 ` Arjan van de Ven
2004-03-29 17:42 ` Justin T. Gibbs
2004-03-29 17:47 ` James Bottomley
@ 2004-03-30 7:03 ` Jens Axboe
2 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2004-03-30 7:03 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Justin T. Gibbs, linux-scsi
On Mon, Mar 29 2004, Arjan van de Ven wrote:
> On Mon, 2004-03-29 at 19:07, Justin T. Gibbs wrote:
> > How to reproduce: Queue up lots of I/O to a device. Set that device
> > offline.
> >
> > Result: The SCSI mid-layer blows out the kernel stack due to the inherent
> > recursion in its request function. The repeating stack trace goes
> > something like:
> >
> > scsi_request_fn
> > __scsi_end_request
> > scsi_release_command
> > scsi_queue_next_request
> > scsi_request_fn
> > ...
> >
> > This patch prevents recursive calls through the scsi_request_fn
> > for the same queue:
>
> won't this achieve the same ?
>
> diff -urN Linux/drivers/scsi/scsi_lib.c linux/drivers/scsi/scsi_lib.c
> --- Linux/drivers/scsi/scsi_lib.c 2002-10-03 16:50:07.000000000
> +0200
> +++ linux/drivers/scsi/scsi_lib.c 2002-10-03 17:33:20.000000000
> +0200
> @@ -442,7 +442,7 @@
> * This will goose the queue request function at the end, so we
> don't
> * need to worry about launching another command.
> */
> - scsi_release_command(SCpnt);
> + __scsi_release_command(SCpnt);
>
> if( frequeue ) {
> request_queue_t *q;
>
> diff -urN Linux/drivers/scsi/scsi_lib.c linux/drivers/scsi/scsi_lib.c
> --- Linux/drivers/scsi/scsi_lib.c 2002-10-03 16:50:07.000000000 +0200
> +++ linux/drivers/scsi/scsi_lib.c 2002-10-03 17:33:20.000000000 +0200
> @@ -442,7 +442,7 @@
> * This will goose the queue request function at the end, so we don't
> * need to worry about launching another command.
> */
> - scsi_release_command(SCpnt);
> + __scsi_release_command(SCpnt);
>
> if( frequeue ) {
> request_queue_t *q;
That is better. If SCSI used blk_start_queue/stop_queue, it has
recursion detection as well (only allowing one reenter).
--
Jens Axboe
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2004-03-30 7:04 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-03-29 17:07 [PATCH] 2.4.X Avoid excessive recursion when setting device offline Justin T. Gibbs
2004-03-29 17:26 ` Arjan van de Ven
2004-03-29 17:42 ` Justin T. Gibbs
2004-03-29 17:47 ` James Bottomley
2004-03-29 17:52 ` Justin T. Gibbs
2004-03-30 7:03 ` Jens Axboe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox