linux-scsi.vger.kernel.org archive mirror
* stack overflow: scsi_request_fn/blk_run_queue ping-pong
@ 2006-08-08 16:05 Andreas Herrmann
From: Andreas Herrmann @ 2006-08-08 16:05 UTC (permalink / raw)
  To: Linux SCSI

Hi,

Recently I observed a kernel stack overflow caused by recursion
between scsi_request_fn and blk_run_queue.
(We were running error injection tests on s390 with zfcp, using 32 LUNs
and multiple paths.)

Calling sequence was:
  scsi_request_fn->scsi_dispatch_cmd->scsi_queue_insert->
  scsi_run_queue->blk_run_queue->scsi_request_fn
Recursion depth was about 18.

On each iteration the request_queue passed to blk_run_queue/scsi_request_fn
was different, because blk_run_queue was called while iterating over
shost->starved_list in scsi_run_queue:

	while (!list_empty(&shost->starved_list) &&
	       !shost->host_blocked && !shost->host_self_blocked &&
	       !((shost->can_queue > 0) &&
	         (shost->host_busy >= shost->can_queue))) {
		...
		sdev = list_entry(shost->starved_list.next,
				  struct scsi_device, starved_entry);
		...
		blk_run_queue(sdev->request_queue);
		...
	}

Because a different request_queue was passed to blk_run_queue each time,
the check for QUEUE_FLAG_REENTER in blk_run_queue did not prevent
the recursion.
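
To illustrate why a per-queue flag cannot bound this recursion, here is a
small userspace model. It is only a sketch, not kernel code: struct toy_queue,
struct toy_host and all toy_* functions are hypothetical stand-ins, and it
assumes that the elided part of the loop takes each entry off the starved
list before running its queue.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_QUEUES 64

/* Hypothetical stand-in for struct request_queue: only the flag is modeled. */
struct toy_queue {
	int reenter;		/* models QUEUE_FLAG_REENTER */
};

/* Hypothetical stand-in for struct Scsi_Host. */
struct toy_host {
	struct toy_queue *starved[TOY_MAX_QUEUES]; /* models shost->starved_list */
	size_t nr_starved;
	int depth;		/* current nesting of toy_request_fn */
	int max_depth;		/* deepest nesting seen so far */
};

static void toy_run_queue(struct toy_host *h, struct toy_queue *q);

/*
 * Models scsi_request_fn -> scsi_dispatch_cmd -> scsi_queue_insert ->
 * scsi_run_queue: the command is requeued and the starved list is walked.
 */
static void toy_request_fn(struct toy_host *h, struct toy_queue *q)
{
	(void)q;
	if (++h->depth > h->max_depth)
		h->max_depth = h->depth;
	/* Assumption: each entry is removed from the list before it is run. */
	while (h->nr_starved > 0)
		toy_run_queue(h, h->starved[--h->nr_starved]);
	h->depth--;
}

/* Models the reentry check: the flag guards only the queue it is set on. */
static void toy_run_queue(struct toy_host *h, struct toy_queue *q)
{
	if (q->reenter)
		return;		/* a real implementation would defer the run */
	q->reenter = 1;
	toy_request_fn(h, q);
	q->reenter = 0;
}

/* Run one entry queue against n distinct starved queues; return max depth. */
static int toy_max_depth(size_t n)
{
	static struct toy_queue queues[TOY_MAX_QUEUES];
	struct toy_queue entry = { 0 };
	struct toy_host h = { .nr_starved = n };
	size_t i;

	assert(n <= TOY_MAX_QUEUES);
	for (i = 0; i < n; i++) {
		queues[i].reenter = 0;
		h.starved[i] = &queues[i];
	}
	toy_run_queue(&h, &entry);
	return h.max_depth;
}
```

In this model every starved queue adds one nesting level, so
toy_max_depth(17) yields a depth of 18: the flag is never found set because
each nested call sees a queue that has not been entered yet.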

My explanation for this situation is as follows:
The shost was temporarily blocked and the starved_list filled up.
Subsequently a remote port was deleted, which set sdev_state
of some scsi device to SDEV_BLOCK. This in turn led to the above
recursion. (By the time the recursion started, the shost was no longer
blocked.)

Of course this recursion ends once the starved_list is empty.
But a stack overflow is always possible, depending on the number of
entries in starved_list.

A quick hack would be to do the following before calling
blk_run_queue in scsi_run_queue:

	if (test_bit(QUEUE_FLAG_REENTER, &q->queue_flags))
		set_bit(QUEUE_FLAG_REENTER, &sdev->request_queue->queue_flags);

	blk_run_queue(sdev->request_queue);
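
The effect of this hack can be sketched in a userspace model. This is only
an illustration under assumptions, not kernel code: all toy_* names are
hypothetical, entries are assumed to be removed from the starved list before
their queue is run, and the model does not cover who later clears the
propagated flag or reruns the deferred queues.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_QUEUES 64

/* Hypothetical stand-in for struct request_queue. */
struct toy_queue {
	int reenter;		/* models QUEUE_FLAG_REENTER */
};

/* Hypothetical stand-in for struct Scsi_Host plus bookkeeping. */
struct toy_host {
	struct toy_queue *starved[TOY_MAX_QUEUES]; /* models shost->starved_list */
	size_t nr_starved;
	int propagate;		/* 1 = apply the quick hack */
	int depth, max_depth;	/* nesting of toy_request_fn */
	int deferred;		/* runs skipped because the flag was set */
};

static void toy_run_queue(struct toy_host *h, struct toy_queue *q);

/* Models the path from scsi_request_fn down to the starved-list loop. */
static void toy_request_fn(struct toy_host *h, struct toy_queue *q)
{
	if (++h->depth > h->max_depth)
		h->max_depth = h->depth;
	while (h->nr_starved > 0) {
		struct toy_queue *next = h->starved[--h->nr_starved];

		/* The quick hack: copy the caller's flag to the next queue. */
		if (h->propagate && q->reenter)
			next->reenter = 1;
		toy_run_queue(h, next);
	}
	h->depth--;
}

/* Models the reentry check; a set flag means "do not recurse". */
static void toy_run_queue(struct toy_host *h, struct toy_queue *q)
{
	if (q->reenter) {
		h->deferred++;	/* the toy only counts the deferred run */
		return;
	}
	q->reenter = 1;
	toy_request_fn(h, q);
	q->reenter = 0;
}

/* Run one entry queue against n starved queues; return the max depth. */
static int toy_simulate(size_t n, int propagate, int *deferred)
{
	static struct toy_queue queues[TOY_MAX_QUEUES];
	struct toy_queue entry = { 0 };
	struct toy_host h = { .nr_starved = n, .propagate = propagate };
	size_t i;

	assert(n <= TOY_MAX_QUEUES);
	for (i = 0; i < n; i++) {
		queues[i].reenter = 0;
		h.starved[i] = &queues[i];
	}
	toy_run_queue(&h, &entry);
	if (deferred)
		*deferred = h.deferred;
	return h.max_depth;
}
```

With 17 starved queues, toy_simulate(17, 0, &d) nests 18 levels deep and
defers nothing, while toy_simulate(17, 1, &d) stays at depth 1 and defers
all 17 runs. The model only shows that the nesting stops; it says nothing
about whether the deferred queues get serviced in time.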

Any opinions on the problem and on the suggested fix?
If there are no objections I will prepare a patch containing this fix.


Regards,

Andreas
