From: Suparna Bhattacharya <suparna@in.ibm.com>
To: James Bottomley <James.Bottomley@steeleye.com>
Cc: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org,
	axboe@kernel.org, B.Zolnierkiewicz@elka.pw.edu.pl,
	akpm@zip.com.au
Subject: Re: [PATCH] Bio Traversal Changes
Date: Tue, 6 Aug 2002 17:47:25 +0530
Message-ID: <20020806174725.A2901@in.ibm.com>
In-Reply-To: <200208051547.g75FldT11138@localhost.localdomain>; from James.Bottomley@steeleye.com on Mon, Aug 05, 2002 at 10:47:39AM -0500

On Mon, Aug 05, 2002 at 10:47:39AM -0500, James Bottomley wrote:
> suparna@in.ibm.com said:
> > There is only one call to ->request_fn for the entire request, and the
> > driver manages things underneath. The chunks are expected to complete
> > sequentially. In the situation where the request is restarted in the
> > event of an error (say), the submission pointers are rolled back to
> > the last (successfully) completed point before issuing the request
> > again. 
> 
> Yes, that's the way I thought it would operate.
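
As a rough illustration of the rollback described above, here is a toy
model (not code from the patch). The class and method names are made up
for this sketch; it only assumes that chunks complete strictly in order
and that a restart resumes from the last successfully completed chunk.

```python
class ChunkedRequest:
    """Toy model of one request issued to the device in sequential chunks."""

    def __init__(self, nr_chunks):
        self.nr_chunks = nr_chunks
        self.submitted = 0   # index of the next chunk to hand to the hardware
        self.completed = 0   # number of chunks known to have finished

    def submit_next(self):
        """Issue the next chunk and return its index."""
        if self.submitted >= self.nr_chunks:
            raise IndexError("nothing left to submit")
        idx = self.submitted
        self.submitted += 1
        return idx

    def complete_next(self):
        """Chunks complete in order, so completion just advances a counter."""
        if self.completed >= self.submitted:
            raise RuntimeError("completion without a matching submission")
        self.completed += 1

    def restart(self):
        """On error, roll the submission pointer back to the completed point."""
        self.submitted = self.completed

    def done(self):
        return self.completed == self.nr_chunks
```

If three chunks have been issued and only the first has completed when
an error hits, restart() makes the next submit_next() reissue chunk 1.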
> 
> suparna@in.ibm.com said:
> > I must say that I initially did think that this could be extended to
> > the more generic case which you probably are referring to and that
> > such an approach could take away the need to split bios in certain
> > cases (i.e. when the i/o is destined for a single queue). Later it
> > appeared that trying to cover the case where each of these pieces
> > gets queued up and might complete out of order (requiring a tag to
> > correlate things on completion), would most likely boil down to
> > trying to maintain all the state that struct request does today.
> 
> For this more generic case, most of our problems seem to be because the 
> barrier has width:  It actually belongs to an I/O request.  If the barrier had 
> zero width (i.e. it was simply a barrier in the stream with no I/O attached) 
> then it would be much easier to preserve it correctly across this (or any 
> other) type of bio splitting.  It would also make it much more obvious to the 
> implementing driver where the barrier was supposed to be in the I/O stream, 
> and would allow more efficient "wait for completion" barrier implementations 
> for drivers that couldn't enforce it any other way.
> 
> > Would be nice (for me) to understand this in more detail.  There might
> > be some possibilities. Any pointers that I can look up to get a
> > clearer idea?
> 
> The SCSI standards (www.t10.org) are the only real authoritative source (with 
> even some explanation).  However, I'll do my best to summarise.
> 
> In SCSI, commands are allowed to disconnect, that is, suspend temporarily while 
> the device does other things.  When the device implements tagged command 
> queueing, it is allowed to disconnect one command and subsequently reconnect 
> (restart) a different one.  In theory, this means that we can have multiple 
> active I/Os at once.  The way you signal to the SCSI device that you want a 
> barrier is to label one or more of the tags as "ordered", which means that the 
> device must complete all I/O for tags issued prior to the ordered one before 
> starting it, and may not begin I/O for subsequent tags until the ordered tag 
> has completed.
> 
> Looping a single request over a big bio means that the SCSI device sees the 
> I/O as a discrete stream of tags.  However, we lose throughput if we stall the 
> queue waiting for this single bio to complete and we can't work out what the 
> next tag is until the prior tag completes.  In the non barrier case, 
> everything will still be OK as long as the queue isn't stalled because we'll 
> be getting throughput from other bios coming down.
> 
> I think basically, I'd like to translate as much of the bio as I can into SCSI 
> tags to improve throughput and each tag currently requires a struct request.

I didn't think of the possibility of serializing the chunks
of a single request, while letting other requests on the queue through
in the no barrier situation. That's a thought, though it might result 
in non-optimal scans ... and in that sense affect the throughput.
But, now I see why the barrier case was the one you were mainly worried
about.
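
To make the ordered-tag rule concrete, here is a small sketch of the
constraint as James describes it. This is only a model of the ordering
rule, not driver or device code; may_start() and the "simple"/"ordered"
labels are names chosen for this illustration.

```python
def may_start(tags, i, completed):
    """Return True if the device may begin I/O for tag i.

    tags      -- list of "simple" / "ordered" labels, in issue order
    completed -- set of indices of tags whose I/O has finished
    """
    earlier = range(i)
    if tags[i] == "ordered":
        # An ordered tag waits for every tag issued before it.
        return all(j in completed for j in earlier)
    # Any other tag only waits for earlier ordered tags (the barriers).
    return all(j in completed for j in earlier if tags[j] == "ordered")


tags = ["simple", "simple", "ordered", "simple"]
print(may_start(tags, 1, set()))    # simple tags ahead of the barrier run freely
print(may_start(tags, 2, {0}))      # barrier still waiting on tag 1
print(may_start(tags, 3, {0, 1}))   # tag 3 is held until the barrier completes
```

With a barrier in the stream, nothing behind it can be started until
everything ahead of it has drained, which is where serializing the
chunks of one request would cost throughput.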

> 
> > Does completion notification happen only when all the commands
> > covered by a single tag complete? Otherwise, what is the ordering
> > amongst the multiple commands in question (do they complete in serial
> > order as well)?
> 
> Yes and no.  You get a special completion code (INTERMEDIATE_TASK_COMPLETE) 
> which says "I've finished this bit, give me the next part".  You don't get a 
> real SCSI completion until the last part of the linked task set completes.  
> The task is linked sequentially, so it does complete in serial order.

Thanks for the explanation. I think I get the gist.
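
My reading of the linked-task completion behaviour, as a toy sketch
only: each part but the last reports the intermediate code, and the
real completion arrives with the final part. The status name is the one
used in this thread; FINAL, run_linked_task() and the execute callback
are placeholders invented for the sketch.

```python
INTERMEDIATE = "INTERMEDIATE_TASK_COMPLETE"  # name as used in this thread
FINAL = "FINAL_COMPLETION"                   # placeholder for the real completion


def run_linked_task(parts, execute):
    """Run the parts of one linked task strictly in sequence.

    Returns the status reported after each part: the intermediate code
    for every part except the last, then the final completion.
    """
    statuses = []
    for n, part in enumerate(parts):
        execute(part)  # the next part is only fed once this one is done
        is_last = (n == len(parts) - 1)
        statuses.append(FINAL if is_last else INTERMEDIATE)
    return statuses
```

For three parts this reports two intermediate codes followed by one
final completion, in serial order.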

> 
> However, don't worry about the linked task stuff, it's a rather esoteric area 
> of the SCSI standard (that allows a single tag to be used across multiple I/Os 
> in very much the same way the bio splitting works) which, on mature 
> reflection, probably isn't such a good idea to use since I'd be doubtful about 
> how well it's implemented in the devices we have to deal with.

OK. 

Regards
Suparna

> 
> James
> 
> 
