linux-scsi.vger.kernel.org archive mirror
From: scameron@beardog.cce.hp.com
To: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	linux-scsi@vger.kernel.org, stephenmcameron@gmail.com,
	dab@hp.com, scameron@beardog.cce.hp.com
Subject: Re: SCSI mid layer and high IOPS capable devices
Date: Fri, 14 Dec 2012 13:55:33 -0600
Message-ID: <20121214195533.GS20898@beardog.cce.hp.com>
In-Reply-To: <50CB50A9.1050202@acm.org>

On Fri, Dec 14, 2012 at 05:15:37PM +0100, Bart Van Assche wrote:
> On 12/14/12 17:44, scameron@beardog.cce.hp.com wrote:
> >I expect the flash devices re-order requests as well, simply because
> >to feed requests to the things at a sufficient rate, you have to pump
> >requests into them concurrently on multiple hardware queues -- a single
> >cpu jamming requests into them as fast as it can is still not fast enough
> >to keep them busy.  Consequently, they *can't* care about ordering, as the
> >relative order requests on different hardware queues are submitted into
> >them is not even really controlled, so the OS *can't* count on concurrent
> >requests not to be essentially "re-ordered", just because of the nature of
> >the way requests get into the device.
> 
> Why should a flash device have to reorder write requests ? These devices 
> typically use a log-structured file system internally.

It's not so much that they are re-ordered as that there is no controlled
ordering to begin with, because multiple cpus are submitting to multiple
hardware queues concurrently.  If you have 12 requests coming in on 12
cpus to 12 hardware queues, it is a race as to which request the device
processes first -- and this is fine: the hardware queues are independent
of one another and do not need to coordinate.  All of this exists to get
enough commands onto the device to actually keep it busy.  A single cpu
can't do it; the device is too fast.  If you have an ordering dependency
such that request A must complete before request B completes, then don't
submit A and B concurrently.  If you do, you cannot tell whether A or B
will arrive at the device first, because they may go into it via
different hardware queues.
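
One way to honor such a dependency is to hold B back until A's completion
has been seen, so the two are never in flight together.  The following is a
minimal sketch of that idea only, not code from any driver; struct request,
struct submit_log, rq_submit() and rq_complete() are hypothetical names
invented for illustration, and the "log" merely records the order in which
requests would be handed to the device.

```c
#include <assert.h>   /* only needed if you add checks such as the ones in the note below */
#include <stdbool.h>
#include <stddef.h>

#define MAX_LOG 16    /* hypothetical cap on recorded submissions */

/* Hypothetical request: "then" names a request that depends on this one. */
struct request {
    int id;
    bool completed;
    struct request *then;   /* must not be submitted until this one completes */
};

/* Records the order in which requests are handed to the (imaginary) device. */
struct submit_log {
    int ids[MAX_LOG];
    size_t count;
};

/* Stand-in for placing a request on some hardware queue. */
static void rq_submit(struct submit_log *log, struct request *rq)
{
    if (log->count < MAX_LOG)
        log->ids[log->count++] = rq->id;
}

/* Stand-in for the completion path: only now is the dependent request submitted. */
static void rq_complete(struct submit_log *log, struct request *rq)
{
    rq->completed = true;
    if (rq->then)
        rq_submit(log, rq->then);
}
```

With a = { .id = 1, .then = &b } and b = { .id = 2 }, calling rq_submit() on
a puts only request 1 in the log; request 2 appears only after rq_complete()
runs for a, so which hardware queue each one lands on no longer matters.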

Note, in case it isn't obvious, the hardware queues I'm talking about here
are not the struct scsi_device sdev->request_queue queues.  They are
typically ring buffers in host memory.  The device DMAs commands out of a
submit queue and DMAs responses into a completion queue.  Each ring has a
producer index (pi) and a consumer index (ci), one of which lives in host
memory and the other of which is a register on the device.  Which is which
depends on the direction of the queue:

  from device:  pi = host memory,     ci = device register
  to device:    pi = device register, ci = host memory
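
For readers who have not seen this kind of queue, here is a minimal sketch
of a ring with producer and consumer indexes.  It is a host-only model under
stated assumptions: RING_SIZE, struct ring and the ring_*() helpers are
hypothetical names, both indexes are plain struct members here, and no DMA,
device register access or memory barriers are shown.

```c
#include <assert.h>   /* only needed if you add checks such as the ones in the note below */
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u  /* hypothetical depth; a power of two so masking works */

struct ring {
    uint32_t slots[RING_SIZE]; /* stands in for command/response descriptors */
    uint32_t pi;               /* producer index: next slot to fill */
    uint32_t ci;               /* consumer index: next slot to drain */
};

/* The indexes run freely and are masked on use, so pi - ci is the fill level. */
static bool ring_full(const struct ring *r)  { return r->pi - r->ci == RING_SIZE; }
static bool ring_empty(const struct ring *r) { return r->pi == r->ci; }

/* Producer side: write the entry first, then publish it by advancing pi. */
static bool ring_produce(struct ring *r, uint32_t entry)
{
    if (ring_full(r))
        return false;
    r->slots[r->pi & (RING_SIZE - 1)] = entry;
    r->pi++;  /* assumption: on real hardware a barrier and a register write go here */
    return true;
}

/* Consumer side: read the entry, then release the slot by advancing ci. */
static bool ring_consume(struct ring *r, uint32_t *entry)
{
    if (ring_empty(r))
        return false;
    *entry = r->slots[r->ci & (RING_SIZE - 1)];
    r->ci++;
    return true;
}
```

On a zero-initialized ring, ring_consume() fails, RING_SIZE calls to
ring_produce() succeed and the next one fails, and entries come back out in
the order they went in.  That ordering holds within one ring only; nothing
relates the order of entries placed on two different rings.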

-- steve

