From: Suparna Bhattacharya <suparna@in.ibm.com>
To: James Bottomley <James.Bottomley@steeleye.com>
Cc: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org,
axboe@kernel.org, B.Zolnierkiewicz@elka.pw.edu.pl,
akpm@zip.com.au
Subject: Re: [PATCH] Bio Traversal Changes
Date: Tue, 6 Aug 2002 17:47:25 +0530 [thread overview]
Message-ID: <20020806174725.A2901@in.ibm.com> (raw)
In-Reply-To: <200208051547.g75FldT11138@localhost.localdomain>; from James.Bottomley@steeleye.com on Mon, Aug 05, 2002 at 10:47:39AM -0500
On Mon, Aug 05, 2002 at 10:47:39AM -0500, James Bottomley wrote:
> suparna@in.ibm.com said:
> > There is only one call to ->request_fn for the entire request, and the
> > drivers manages things underneath. The chunks are expected to complete
> > sequentially. In the situation where the request is restarted in the
> > event of an error (say), the submission pointers are rolled back to
> > the last (successfully) completed point before issuing the request
> > again.
>
> Yes, that's the way I thought it would operate.
>
> suparna@in.ibm.com said:
> > I must say that I initially did think that this could be extended to
> > the more generic case which you probably are referring to and that
> > such an approach could take away the need to split bios in certain
> > cases (i.e. when the i/o is destined for a single queue). Later it
> > appeared that trying to cover the case where each of these pieces
> > gets queued up and might complete out of order (requiring a tag to
> > correlate things on completion), would most likely boil down to
> > trying to maintain all the state that struct request does today.
>
> For this more generic case, most of our problems seem to be because the
> barrier has width: It actually belongs to an I/O request. If the barrier had
> zero width (i.e. it was simply a barrier in the stream with no I/O attached)
> then it would be much easier to preserve it correctly across this (or any
> other) type of bio splitting. It would also make it much more obvious to the
> implementing driver where the barrier was supposed to be in the I/O stream,
> and would allow more efficient "wait for completion" barrier implementations
> for drivers that couldn't enforce it any other way.
>
> > Would be nice (for me) to understand this in more detail. There might
> > be some possibilities. Any pointers that I can look up to get a
> > clearer idea ?
>
> The SCSI standards (www.t10.org) are the only real authoritative source (with
> even some explanation). However, I'll do my best to summarise.
>
> In SCSI, commands are allowed to disconnect, that is suspend temporarily while
> the device does other things. When the device implements tag command
> queueing, it is allowed to disconnect one command and subsequently reconnect
> (restart) a different one. In theory, this means that we can have multiple
> active I/Os at once. The way you signal to the scsi device that you want a
> barrier is to label one or more of the tags as "ordered" which means that the
> device must complete all I/O of tags prior to the ordered one before it and
> may not begin I/O of subsequent tags until the ordered tag has completed.
>
> looping a single request over a big bio means that the SCSI device sees the
> I/O as a discrete stream of tags. However, we lose throughput if we stall the
> queue waiting for this single bio to complete and we can't work out what the
> next tag is until the prior tag completes. In the non barrier case,
> everything will still be OK as long as the queue isn't stalled because we'll
> be getting throughput from other bios coming down.
>
> I think basically, I'd like to translate as much of the bio as I can into SCSI
> tags to improve throughput and each tag currently requires a struct request.
I didn't think of the possibility of serializing the chunks
of a single request, while letting other requests on the queue through
in the no barrier situation. That's a thought, though it might result
in non-optimal scans ... and in that sense affect the throughput.
But, now I see why the barrier case was the one you were mainly worried
about.
>
> > Does completion notification happen only when all the commands
> > covered by a single tag complete ? Otherwise, what is the ordering
> > amongst the multiple commands in question (do they complete in serial
> > order as well) ?
>
> Yes and no. You get a special completion code (INTERMEDIATE_TASK_COMPLETE)
> which says "I've finished this bit, give me the next part". You don't get a
> real SCSI completion until the last part of the linked task set completes.
> The task is linked sequentially, so it does complete in serial order.
Thanks for the explanation. I think I get the gist.
>
> However, Don't worry about the linked task stuff, it's a rather esoteric area
> of the SCSI standard (that allows a single tag to be used across multiple I/Os
> in very much the same way the bio splitting works) which, on mature
> reflection, probably isn't such a good idea to use since I'd be doubtful about
> how well it's implemented in the devices we have to deal with.
OK.
Regards
Suparna
>
> James
>
>
prev parent reply other threads:[~2002-08-06 12:14 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2002-08-02 12:35 [PATCH] Bio Traversal Changes Suparna Bhattacharya
2002-08-02 12:43 ` [PATCH] Bio Traversal Changes (Patch 1/4: biotr8-blk.diff) Suparna Bhattacharya
2002-08-02 12:46 ` [PATCH] Bio Traversal Changes (Patch 2/4: biotr8-blkusers.diff) Suparna Bhattacharya
2002-08-02 13:17 ` [PATCH] Bio Traversal Changes - (Patch 3/4 : biotr8-blkdrivers.diff) Suparna Bhattacharya
2002-08-02 13:20 ` [PATCH] Bio Traversal Changes (Patch 4/4: biotr8-doc.diff) Suparna Bhattacharya
2002-08-02 13:48 ` [PATCH] Bio Traversal Changes James Bottomley
2002-08-05 12:38 ` Suparna Bhattacharya
2002-08-05 15:47 ` James Bottomley
2002-08-06 12:17 ` Suparna Bhattacharya [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20020806174725.A2901@in.ibm.com \
--to=suparna@in.ibm.com \
--cc=B.Zolnierkiewicz@elka.pw.edu.pl \
--cc=James.Bottomley@steeleye.com \
--cc=akpm@zip.com.au \
--cc=axboe@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox