This patch ** is against Christoph's core-for-3.17 branch, which
includes scsi-mq V2. The purpose of releasing it now is to facilitate
testing of scsi-mq. The fio user space program seems to be the tool of
choice for such tests. With fio and four (pseudo) devices I have
observed 1.2 M IOPS on my equipment. Rob Elliott has better results
than that.

ChangeLog:
  - add host_lock option whose default value is 0 which removes the
    host_lock around all queued commands
  - accept delay=-1 (_hi_) or -2 which use a tasklet to invoke the
    scsi_done callback into the mid-layer. The default is still
    delay=1 which uses a timer to delay 1 jiffy
  - wire .change_queue_depth and .change_queue_type functions to
    better simulate queueing in a modern LLD
  - add SCSI_DEBUG_OPT_Q_NOISE (0x200) mask to only produce debug
    output associated with queue full, plus from .change_queue_depth
    and .change_queue_type functions
  - add SCSI_DEBUG_OPT_ALL_TSF (0x400) mask which reports all
    queued_arr fulls as TASK_SET_FULL, otherwise
    SCSI_MLQUEUE_HOST_BUSY is returned
  - add SCSI_DEBUG_OPT_RARE_TSF (0x800) mask which works together
    with the every_nth option (> 0) to count occurrences of
    num_in_q==queue_depth. When every_nth is reached the victim
    (a command) yields TASK SET FULL
  - clean up many debug messages.

IMO the scsi-mq queue_depth handling doesn't break but does not
converge promptly (and in some cases seems to be steady state
unstable) in the face of TASK_SET_FULL statuses. This can be seen
with this sequence:

  # cd /sys/bus/pseudo/drivers/scsi_debug
  # echo 0x600 > opts
    < then in another window 'tail -f /var/log/syslog'>
    < then in yet another window run a fio test against scsi_debug>
  # echo 200 > max_queue
    < if the tail shows nothing, try writing a lower value into max_queue>

max_queue starts at 576. Depending on the fio test, the new value
written into max_queue needed to prompt the TSF trip point will vary.
You will know it when you see it :-)

The PI and provisioning parts of scsi_debug were written by Martin
Petersen and some of their bitmaps may be racy without the host_lock.
I'm hoping that when Martin has time, he will re-visit that code.

** scsi_debug in the core-for-3.17 branch already has Hannes' 64 bit
   LUN changes and Akinobu Mita's "allow huge transfer length for
   read/write commands". So expect some noise if this patch is applied
   to other trees (such as vanilla lk 3.15.2). There is no real
   conflict but the patch command gets upset.

Signed-off-by: Douglas Gilbert
Tested-by: Robert Elliott
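
The three new opts masks in the ChangeLog are single bits, so the 0x600
written to opts in the sequence above is SCSI_DEBUG_OPT_Q_NOISE (0x200)
combined with SCSI_DEBUG_OPT_ALL_TSF (0x400). As a rough sketch only,
the decoding can be checked from a shell. The decode_opts helper is
hypothetical (it is not part of this patch) and it only knows the three
masks named here, ignoring any other bits in opts.

```shell
#!/bin/sh
# decode_opts: hypothetical helper, not part of the patch.
# Prints the names of the masks from this ChangeLog that are set in $1.
# Any other bits in the value are ignored.
decode_opts() {
    opts=$(( $1 ))          # accepts decimal or 0x-prefixed hex
    out=""
    if [ $(( opts & 0x200 )) -ne 0 ]; then
        out="$out SCSI_DEBUG_OPT_Q_NOISE"
    fi
    if [ $(( opts & 0x400 )) -ne 0 ]; then
        out="$out SCSI_DEBUG_OPT_ALL_TSF"
    fi
    if [ $(( opts & 0x800 )) -ne 0 ]; then
        out="$out SCSI_DEBUG_OPT_RARE_TSF"
    fi
    echo "${out# }"         # drop the leading separator space
}

decode_opts 0x600
```

Passing 0xa00 instead would select Q_NOISE together with RARE_TSF, which
per the ChangeLog only has an effect when every_nth is greater than 0.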