linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Boaz Harrosh <bharrosh@panasas.com>
To: James Bottomley <James.Bottomley@SteelEye.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>,
	linux-scsi@vger.kernel.org, bhalevy@panasas.com,
	jens.axboe@oracle.com, hch@infradead.org,
	akpm@linux-foundation.org, michaelc@cs.wisc.edu
Subject: Re: [PATCH v2] add bidi support for block pc requests
Date: Wed, 09 May 2007 10:46:34 +0300	[thread overview]
Message-ID: <46417C5A.1090707@panasas.com> (raw)
In-Reply-To: <1178654497.3737.81.camel@mulgrave.il.steeleye.com>

James Bottomley wrote:
> I think you'll find that kzalloc comes directly out of a slab for this
> size of allocation anyway ... you mean you want to see a dedicated pool
> for this specific allocation?
Yes, As you said below so we can always send IO for "forward progress
of freeing memory". My test machine is a Linux cluster in front of a
pNFS over OSD. The HPC cluster is diskless. It will reach this
situation very fast.

> There's another problem in that it destroys our forward progress
> guarantee.  There's always a single reserve command for every HBA so
> that forward progress for freeing memory can always be made in the
> system even if the command slab is out and we have to reclaim memory
> through a HBA with no outstanding commands.  Allocating two commands per
> bidirectional request hoses that guarantee ... it could be fixed up by
> increasing the reserve pool to 2, but that's adding further unwanted
> complexity ...
> 
Thanks for catching it! I was afraid of that. If we stick with this solution
in the interim until we do what you suggested below, we will need to put one
more for bidi. It should not be a complicated pool thing, just a reserved one
for the bidi case.

> 
> There's actually a fourth option you haven't considered:
> 
> Roll all the required sglist definitions (request_bufflen,
> request_buffer, use_sg and sglist_len) into the sgtable pools.
> 
> We're getting very close to the point where someone gets to sweep
> through the drivers eliminating the now superfluous non-sg path in the
> queuecommand.  When that happens the only cases become no transfer or SG
> backed commands.  At this point we can do a consolidation of the struct
> scsi_cmnd fields.  This does represent the ideal time to sweep the sg
> list handling fields into the sgtable and simply have a single pointer
> to struct sgtable in the scsi_cmnd (== NULL is the signal for a no
> transfer command).
> 
This is a grate Idea. Let me see if I understand what you mean.
1. An sgtable is a single allocation with an sgtable header type
   at the begining and a veriable size array of struct scatterlist.
   something like:
   struct sgtable {
   	struct sgtable_header {
		unsigned sg_count, sglist_len, length;
		struct sgtable* next; //for Jens's big io
   	} hdr;
	struct scatterlist sglist[];
   }
   Slabs are put up for above sgtable of different sizes as
   done today. (Should they be sized on different ARCHs to
   align on page boundaries?)

2. The way we can do this in stages: Meaning put up code that has
   both sets of API, Transfer drivers one-by-one to new API, deprecate
   old API for a kernel cycle or two. Than submit last piece of
   code that removes the old API.
   It can be done. We just need to copy sgtable_header fields
   to the old fields, and let them stick around for a while.

3. The second bidi sgtable will hang on request->next_rq->special.

> James
> 
> 
If everyone agrees on something like above. I can do it right away.
It's a solution I wouldn't even dream of. Thanks

Boaz

  reply	other threads:[~2007-05-09  7:47 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-05-08  2:25 [PATCH v2] add bidi support for block pc requests FUJITA Tomonori
2007-05-08 18:53 ` Boaz Harrosh
2007-05-08 20:01   ` James Bottomley
2007-05-09  7:46     ` Boaz Harrosh [this message]
2007-05-09 10:52       ` FUJITA Tomonori
2007-05-09 13:58         ` Boaz Harrosh
2007-05-09 14:54           ` FUJITA Tomonori
2007-05-09 14:55           ` James Bottomley
2007-05-09 16:54             ` Boaz Harrosh
2007-05-10  6:53               ` FUJITA Tomonori
2007-05-10  7:45                 ` FUJITA Tomonori
2007-05-10 12:37                   ` Boaz Harrosh
2007-05-10 13:01                     ` FUJITA Tomonori
2007-05-10 15:10                     ` Douglas Gilbert
2007-05-10 15:48                       ` Boaz Harrosh
2007-05-11 16:26                       ` James Bottomley
2007-05-16 17:29       ` Boaz Harrosh
2007-05-16 17:53         ` Jens Axboe
2007-05-16 18:06           ` James Bottomley
2007-05-16 18:13             ` Jens Axboe
2007-05-17  8:46               ` Boaz Harrosh
2007-05-17  2:57           ` FUJITA Tomonori
2007-05-17  5:48             ` Jens Axboe
2007-05-17  5:59               ` FUJITA Tomonori
2007-05-17  8:49                 ` Boaz Harrosh
2007-05-17 11:12                   ` FUJITA Tomonori
2007-05-17 17:37                     ` Benny Halevy
2007-05-24 16:37                     ` Boaz Harrosh
2007-05-24 16:46                       ` Boaz Harrosh
2007-05-24 16:47                       ` FUJITA Tomonori
2007-05-24 16:59                         ` Boaz Harrosh
2007-05-17 11:29                   ` FUJITA Tomonori
2007-05-17 13:27                   ` James Bottomley
2007-05-17 14:00                     ` Boaz Harrosh
2007-05-17 14:11                       ` FUJITA Tomonori
2007-05-17 15:49                         ` Boaz Harrosh
2007-06-01 20:21                           ` Jeff Garzik
2007-06-03  7:45                             ` Boaz Harrosh
2007-06-03 13:17                               ` James Bottomley
2007-07-07 15:27                                 ` Jeff Garzik
2007-07-07 15:42                                   ` James Bottomley
2007-07-07 16:59                                     ` Jeff Garzik
2007-05-09 10:49     ` FUJITA Tomonori

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=46417C5A.1090707@panasas.com \
    --to=bharrosh@panasas.com \
    --cc=James.Bottomley@SteelEye.com \
    --cc=akpm@linux-foundation.org \
    --cc=bhalevy@panasas.com \
    --cc=fujita.tomonori@lab.ntt.co.jp \
    --cc=hch@infradead.org \
    --cc=jens.axboe@oracle.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=michaelc@cs.wisc.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).