linux-scsi.vger.kernel.org archive mirror
* SCSI mid layer and high IOPS capable devices
@ 2012-12-11  0:00 scameron
  2012-12-11  8:21 ` Bart Van Assche
  2012-12-13 15:22 ` Bart Van Assche
  0 siblings, 2 replies; 21+ messages in thread
From: scameron @ 2012-12-11  0:00 UTC (permalink / raw)
  To: linux-scsi; +Cc: stephenmcameron, scameron, dab


Is there any work going on to improve performance of the SCSI layer
to better support devices capable of high IOPS?

I've been playing around with some flash-based devices and have
a block driver that uses the make_request interface (calls
blk_queue_make_request() rather than blk_init_queue()) and a
SCSI LLD variant of the same driver.  The block driver is similar
in design and performance to the nvme driver.
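For reference, the two queue-setup styles being compared look roughly like this (a sketch against the ~3.7-era block layer; the mydrv_* names and struct are hypothetical, not from the actual driver):

```c
/* make_request style: the driver sees bios directly; there is no
 * per-device request queue or queue_lock in the submission path, so
 * it can dispatch to a per-CPU submit queue immediately. */
static void mydrv_make_request(struct request_queue *q, struct bio *bio)
{
	/* pick the submitting cpu's queue, post the command,
	 * complete the bio from the matching reply queue later */
}

static int mydrv_setup_queue(struct mydrv_host *h)
{
	h->queue = blk_alloc_queue(GFP_KERNEL);
	if (!h->queue)
		return -ENOMEM;
	blk_queue_make_request(h->queue, mydrv_make_request);
	return 0;
}

/* A SCSI LLD, by contrast, never gets this choice: the midlayer sets
 * up the per-device queue itself, effectively
 * blk_init_queue(scsi_request_fn, ...), and scsi_request_fn() pulls
 * requests off that one queue under q->queue_lock. */
```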

If I compare the performance, the block driver gets about 3x the
performance of the SCSI LLD.  The SCSI LLD spends a lot of time
(according to perf) contending for locks in scsi_request_fn(),
presumably the host lock or the queue lock, or perhaps both.

All other things being equal, a SCSI LLD would be preferable to
me, but, with performance differing by a factor of around 3x, all
other things are definitely not equal.

I tried using scsi_debug with fake_rw, and also the scsi_ram driver
that was recently posted, to get some idea of the maximum IOPS that
can be pushed through the SCSI mid layer, and the numbers were a
little disappointing: I was getting around 150k IOPS with scsi_debug
with reads and writes faked, and around 3x that with the block
driver actually doing the I/O.
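For anyone who wants to reproduce the scsi_debug numbers, the setup is along these lines (the module parameters are real scsi_debug options; /dev/sdX and the fio job values are illustrative, not the exact ones used here):

```shell
# Load scsi_debug with reads/writes faked (no media access) and
# immediate command completion, so the measurement is almost pure
# midlayer overhead.
modprobe scsi_debug fake_rw=1 delay=0 dev_size_mb=1024

# Hammer the resulting disk with small random reads from several jobs.
fio --name=iops --filename=/dev/sdX --direct=1 --ioengine=libaio \
    --rw=randread --bs=4k --iodepth=32 --numjobs=8 --runtime=30 \
    --group_reporting
```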

Essentially, what I've been finding out is consistent with what's in
this slide deck: http://static.usenix.org/event/lsf08/tech/IO_Carlson_Accardi_SATA.pdf

The driver, like nvme, has a submit and reply queue per cpu.
I'm sort of guessing that funnelling all requests through a single
per-device request queue that only one cpu can touch at a time, as
the scsi mid layer does, is a big part of what's killing performance.
Looking through the scsi code, if I read it correctly, the assumption
of one request queue per device seems to be baked in all over the
place, so how one might go about improving the situation is not
really obvious to me.

Anyway, just wondering if anybody is looking into doing some
improvements in this area.

-- steve



Thread overview: 21+ messages
2012-12-11  0:00 SCSI mid layer and high IOPS capable devices scameron
2012-12-11  8:21 ` Bart Van Assche
2012-12-11 22:46   ` scameron
2012-12-13 11:40     ` Bart Van Assche
2012-12-13 18:03       ` scameron
2012-12-13 17:18         ` Bart Van Assche
2012-12-13 15:22 ` Bart Van Assche
2012-12-13 17:25   ` scameron
2012-12-13 16:47     ` Bart Van Assche
2012-12-13 16:49       ` Christoph Hellwig
2012-12-14  9:44         ` Bart Van Assche
2012-12-14 16:44           ` scameron
2012-12-14 16:15             ` Bart Van Assche
2012-12-14 19:55               ` scameron
2012-12-14 19:28                 ` Bart Van Assche
2012-12-14 21:06                   ` scameron
2012-12-15  9:40                     ` Bart Van Assche
2012-12-19 14:23                       ` Christoph Hellwig
2012-12-13 21:20       ` scameron
2012-12-14  0:22       ` Jack Wang
     [not found]         ` <CADzpL0TMT31yka98Zv0=53N4=pDZOc9+gacnvDWMbj+iZg4H5w@mail.gmail.com>
     [not found]           ` <006301cdd99c$35099b40$9f1cd1c0$@com>
     [not found]             ` <CADzpL0S5cfCRQftrxHij8KOjKj55psSJedmXLBQz1uQm_SC30A@mail.gmail.com>
2012-12-14  4:59               ` Jack Wang
