linux-scsi.vger.kernel.org archive mirror
* [PATCH] [SCSI] sg: fix unkillable I/O wait deadlock with scsi-mq
@ 2015-02-11 16:31 Tony Battersby
From: Tony Battersby @ 2015-02-11 16:31 UTC
  To: linux-scsi, James E.J. Bottomley, Christoph Hellwig, Jens Axboe,
	Douglas Gilbert
  Cc: linux-kernel

When using the write()/read() interface for submitting commands, the
SCSI generic driver does not call blk_put_request() on a completed SCSI
command until userspace calls read() to get the command completion. 
Since scsi-mq uses a fixed number of preallocated requests, this makes
it possible for userspace to exhaust the entire preallocated supply of
requests, leading to deadlock with the user process stuck in a permanent
unkillable I/O wait in sg_write() -> ... -> blk_get_request() -> ... ->
bt_get().  Note that this deadlock can happen only if scsi-mq is
enabled.  Prevent the deadlock by calling blk_put_request() as soon as
the SCSI command completes instead of waiting for userspace to call read().

Cc: <stable@vger.kernel.org> # 3.17+
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
---

For inclusion in kernel 3.20.

I encountered this problem using mptsas (can_queue == 127) and 8 disks
connected via an expander.  I have a test program called cydiskbench
that spawns multiple threads, opens multiple /dev/sg* file descriptors,
and sends multiple disk read/write commands to each /dev/sg* file
descriptor.  I can vary the number of disks being tested and the command
queue depth per disk.  Whenever I chose test parameters such that
(n_disks * queue_depth_per_disk) > shost->can_queue, the test deadlocked
as described when scsi-mq was enabled but worked just fine with scsi-mq
disabled.
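
The arithmetic behind that condition can be illustrated with a small
userspace model.  This is only a sketch and not kernel code: struct
req_pool, pool_get(), pool_put() and submit_all() are hypothetical names
invented for the illustration, every command is assumed to complete
immediately, and a failed allocation stands in for the point where the
real blk_get_request() would sleep forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a fixed supply of preallocated requests. */
struct req_pool {
	int capacity;	/* stands in for shost->can_queue */
	int in_use;	/* requests currently allocated */
};

/* Take one request; false means the real code would block here. */
static bool pool_get(struct req_pool *p)
{
	if (p->in_use == p->capacity)
		return false;
	p->in_use++;
	return true;
}

/* Return one request to the pool. */
static void pool_put(struct req_pool *p)
{
	assert(p->in_use > 0);
	p->in_use--;
}

/*
 * Submit n_cmds commands back to back with no intervening read().
 * Each command is assumed to complete right after submission.
 * If free_on_completion is false, the request stays allocated until a
 * read() that never comes (the old behaviour); if true, it is released
 * at completion (the behaviour this patch introduces).
 * Returns how many commands were submitted before the pool ran dry.
 */
static int submit_all(int capacity, int n_cmds, bool free_on_completion)
{
	struct req_pool pool = { .capacity = capacity, .in_use = 0 };
	int submitted = 0;

	for (int i = 0; i < n_cmds; i++) {
		if (!pool_get(&pool))
			break;
		submitted++;
		if (free_on_completion)
			pool_put(&pool);
	}
	return submitted;
}
```

With capacity 127 and 8 disks at an assumed queue depth of 16 (128
commands), the model stalls after 127 submissions when requests are held
until read(), and submits all 128 when they are released at completion.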

I will send a separate patch to fix the same problem in the bsg driver.

--- linux-3.19.0/drivers/scsi/sg.c.orig	2015-02-08 21:54:22.000000000 -0500
+++ linux-3.19.0/drivers/scsi/sg.c	2015-02-09 17:40:00.000000000 -0500
@@ -1350,6 +1350,17 @@ sg_rq_end_io(struct request *rq, int upt
 	}
 	/* Rely on write phase to clean out srp status values, so no "else" */
 
+	/*
+	 * Free the request as soon as it is complete so that its resources
+	 * can be reused without waiting for userspace to read() the
+	 * result.  But keep the associated bio (if any) around until
+	 * blk_rq_unmap_user() can be called from user context.
+	 */
+	srp->rq = NULL;
+	if (rq->cmd != rq->__cmd)
+		kfree(rq->cmd);
+	__blk_put_request(rq->q, rq);
+
 	write_lock_irqsave(&sfp->rq_list_lock, iflags);
 	if (unlikely(srp->orphan)) {
 		if (sfp->keep_orphan)
@@ -1777,10 +1788,10 @@ sg_finish_rem_req(Sg_request *srp)
 	SCSI_LOG_TIMEOUT(4, sg_printk(KERN_INFO, sfp->parentdp,
 				      "sg_finish_rem_req: res_used=%d\n",
 				      (int) srp->res_used));
+	if (srp->bio)
+		ret = blk_rq_unmap_user(srp->bio);
+
 	if (srp->rq) {
-		if (srp->bio)
-			ret = blk_rq_unmap_user(srp->bio);
-
 		if (srp->rq->cmd != srp->rq->__cmd)
 			kfree(srp->rq->cmd);
 		blk_put_request(srp->rq);

