SCSI Command scatter-gather number

public inbox for linux-scsi@vger.kernel.org

* SCSI Command scatter-gather number
@ 2002-07-30 21:07 Haofeng Kou
  2002-07-30 21:33 ` Kurt Garloff
  2002-07-30 21:43 ` Mike Anderson
  0 siblings, 2 replies; 9+ messages in thread
From: Haofeng Kou @ 2002-07-30 21:07 UTC (permalink / raw)
  To: linux-scsi

For the "struct scsi_cmnd", there is a member :
	unsigned short use_sg;
Which gives the Number of pieces of the scatter-gather.

How to limit its value, how to set the max value for the "use_sg"?


Thanks,


* Re: SCSI Command scatter-gather number
  2002-07-30 21:07 SCSI Command scatter-gather number Haofeng Kou
@ 2002-07-30 21:33 ` Kurt Garloff
  2002-07-30 21:43 ` Mike Anderson
  1 sibling, 0 replies; 9+ messages in thread
From: Kurt Garloff @ 2002-07-30 21:33 UTC (permalink / raw)
  To: Haofeng Kou; +Cc: Linux SCSI list


Hi Haofeng,

On Tue, Jul 30, 2002 at 02:07:49PM -0700, Haofeng Kou wrote:
> For the "struct scsi_cmnd", there is a member :
> 	unsigned short use_sg;
> Which gives the Number of pieces of the scatter-gather.
> 
> How to limit its value, how to set the max value for the "use_sg"?

For a low-level driver you mean?
Use the sg_tablesize field.

Regards,
-- 
Kurt Garloff                   <kurt@garloff.de>         [Eindhoven, NL]
Physics: Plasma simulations    <K.Garloff@TUE.NL>     [TU Eindhoven, NL]
Linux: SCSI, Security          <garloff@suse.de>    [SuSE Nuernberg, DE]
 (See mail header or public key servers for PGP2 and GPG public keys.)



* Re: SCSI Command scatter-gather number
  2002-07-30 21:07 SCSI Command scatter-gather number Haofeng Kou
  2002-07-30 21:33 ` Kurt Garloff
@ 2002-07-30 21:43 ` Mike Anderson
  2002-07-31  3:22   ` Douglas Gilbert
  1 sibling, 1 reply; 9+ messages in thread
From: Mike Anderson @ 2002-07-30 21:43 UTC (permalink / raw)
  To: Haofeng Kou; +Cc: linux-scsi

On 2.4 the use_sg value should be limited to the smaller of MAX_SEGMENTS
or the scsi host sg_tablesize.

This is checked in the block interface during __make_request. I believe if
you go through the sg interface the limiting value is only sg_tablesize,
but Doug is the best one to answer that.
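
As a rough illustration of the two limits described above, here is a
user-space sketch. The struct is a hypothetical stand-in for the real
Scsi_Host, and the MAX_SEGMENTS value of 128 is an assumption about the
2.4 block layer that should be checked against include/linux/blkdev.h
in the tree you are building.

```c
/* Assumed 2.4-era block layer cap on segments per request; verify
 * against include/linux/blkdev.h before relying on it. */
#define MAX_SEGMENTS 128

/* Hypothetical stand-in holding only the Scsi_Host field that matters here. */
struct mock_scsi_host {
    unsigned short sg_tablesize;
};

/* Upper bound on use_sg for a command that arrives via the block layer:
 * the smaller of MAX_SEGMENTS and the host's sg_tablesize. */
unsigned short block_path_sg_limit(const struct mock_scsi_host *host)
{
    return host->sg_tablesize < MAX_SEGMENTS ? host->sg_tablesize
                                             : MAX_SEGMENTS;
}

/* Upper bound on use_sg for a command that arrives via the sg driver:
 * only the host's sg_tablesize applies. */
unsigned short sg_path_sg_limit(const struct mock_scsi_host *host)
{
    return host->sg_tablesize;
}
```

So a low-level driver that advertises sg_tablesize = 255 would, under
these assumptions, see at most 128 segments from the block layer but up
to 255 from sg.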

-Mike

Haofeng Kou [haofengk@ptu.promise.com] wrote:
> For the "struct scsi_cmnd", there is a member :
> 	unsigned short use_sg;
> Which gives the Number of pieces of the scatter-gather.
> 
> How to limit its value, how to set the max value for the "use_sg"?
> 
> 
> Thanks,
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Michael Anderson
andmike@us.ibm.com



* Re: SCSI Command scatter-gather number
  2002-07-30 21:43 ` Mike Anderson
@ 2002-07-31  3:22   ` Douglas Gilbert
  2002-07-31  5:11     ` Doug Ledford
  0 siblings, 1 reply; 9+ messages in thread
From: Douglas Gilbert @ 2002-07-31  3:22 UTC (permalink / raw)
  To: Mike Anderson; +Cc: Haofeng Kou, linux-scsi

Mike Anderson wrote:
> 
> On 2.4 the use_sg value should be limited to the smaller of MAX_SEGMENTS
> or the scsi host sg_tablesize.
> 
> This is checked in the block interface during __make_request. I believe if
> you go through the sg interface the limiting value is only sg_tablesize,
> but Doug is the best one to answer that.

Mike,
Yes, the sg driver limits the number of elements in a scatter
gather list to Scsi_Host::sg_tablesize . I haven't seen these
values over 256 but I can't see anything enforcing that (and
the type of sg_tablesize is unsigned short). [Perhaps
Scsi_Cmnd::use_sg was an unsigned char at some time in the
past.]

The 2 LLDD I use the most are advansys and sym53c8xx and
they have fixed values of 255 and 96 respectively. Small
numbers only seem to be a problem with direct IO when large
transfers (with a single command) are attempted. Since
direct IO remaps a large data block from the user space
allocated by malloc() to a scatter gather list, it must
cope with the typical case in which every page is 
discontinuous. So in the case of the sym53c8xx on i386
the biggest direct IO transfer that can be done is
96 * 4KB == 384KB .
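
The arithmetic above can be written as a small helper. This is only a
sketch: the function name is made up for illustration, and it assumes a
page-aligned user buffer in which no two pages are physically adjacent,
so each scatter gather element covers exactly one page.

```c
#include <stddef.h>

/* Worst-case ceiling on a single direct IO transfer: one scatter gather
 * element per page, so the limit is sg_tablesize pages. */
size_t max_direct_io_bytes(unsigned short sg_tablesize, size_t page_size)
{
    return (size_t)sg_tablesize * page_size;
}
```

With sg_tablesize = 96 and 4 KB pages this gives 393216 bytes, the
384KB figure quoted for sym53c8xx on i386.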

Doug Gilbert

> -Mike
> 
> Haofeng Kou [haofengk@ptu.promise.com] wrote:
> > For the "struct scsi_cmnd", there is a member :
> >       unsigned short use_sg;
> > Which gives the Number of pieces of the scatter-gather.
> >
> > How to limit its value, how to set the max value for the "use_sg"?
> >
> >
> > Thanks,
> 
> --
> Michael Anderson
> andmike@us.ibm.com


* Re: SCSI Command scatter-gather number
  2002-07-31  3:22   ` Douglas Gilbert
@ 2002-07-31  5:11     ` Doug Ledford
  2002-07-31  5:26       ` Matthew Jacob
  2002-07-31  7:44       ` Jeremy Higdon
  0 siblings, 2 replies; 9+ messages in thread
From: Doug Ledford @ 2002-07-31  5:11 UTC (permalink / raw)
  To: Douglas Gilbert; +Cc: Mike Anderson, Haofeng Kou, linux-scsi

On Tue, Jul 30, 2002 at 11:22:39PM -0400, Douglas Gilbert wrote:
> Mike Anderson wrote:
> > 
> > On 2.4 the use_sg value should be limited to the smaller of MAX_SEGMENTS
> > or the scsi host sg_tablesize.
> > 
> > This is checked in the block interface during __make_request. I believe if
> > you go through the sg interface the limiting value is only sg_tablesize,
> > but Doug is the best one to answer that.
> 
> Mike,
> Yes, the sg driver limits the number of elements in a scatter
> gather list to Scsi_Host::sg_tablesize . I haven't seen these
> values over 256 but I can't see anything enforcing that (and
> the type of sg_tablesize is unsigned short). [Perhaps
> Scsi_Cmnd::use_sg was an unsigned char at some time in the
> past.]

Most LLDD don't want to use too large of a number because to be truly safe
you must preallocate enough memory for these sg structs at init time and
hold it forever.  You obviously can't allocate memory for a sg struct when
you are out of mem, it's a deadlock scenario.  So, 256 elements per
command is 2048 bytes of permanently allocated DMAable memory per command 
the host controller can queue (given the typical 8 byte, 32bit DMA 
address hardware sg struct, gets worse when you support 64bit 
addressing).  Support something like 256 commands on the card total and 
that's now 512K of memory just for sg structs and doesn't count the actual 
scsi command blocks that the cards also need dma memory for...
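
To make the memory cost concrete, a minimal sketch of the calculation
follows. The function names are hypothetical; the 8-byte entry size is
the "typical" 32-bit hardware sg struct mentioned above and will be
larger with 64-bit addressing.

```c
#include <stddef.h>

/* Bytes of DMAable memory reserved for one command's hardware sg table. */
size_t sg_bytes_per_command(size_t sg_tablesize, size_t hw_sg_entry_size)
{
    return sg_tablesize * hw_sg_entry_size;
}

/* Bytes reserved across every command the controller can have queued. */
size_t sg_bytes_total(size_t can_queue, size_t sg_tablesize,
                      size_t hw_sg_entry_size)
{
    return can_queue * sg_bytes_per_command(sg_tablesize, hw_sg_entry_size);
}
```

For 256 entries of 8 bytes this is 2048 bytes per command, and with 256
queued commands it is 524288 bytes (512K), matching the numbers above.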

> The 2 LLDD I use the most are advansys and sym53c8xx and
> they have fixed values of 255 and 96 respectively. Small
> numbers only seem to be a problem with direct IO when large
> transfers (with a single command) are attempted. Since
> direct IO remaps a large data block from the user space
> allocated by malloc() to a scatter gather list, it must
> cope with the typical case in which every page is 
> discontinuous. So in the case of the sym53c8xx on i386
> the biggest direct IO transfer that can be done is
> 96 * 4KB == 384KB .

There are some controllers out there that can't use a fixed size,
unfortunately.  Look at the crap in qlogicfc.c and qlogicisp.c in their
queuecommand() routines to see what I'm referring to.  That's something I
would like to do something about in my changes, but I haven't decided if
it's worth it.  What I would like to do actually is decouple the maximum
sg table size from the available sg table size by introducing the concept
of both a maximum single command sg table size and a sg pool limit.  This
way a low level driver can support a really large size sg table on a few
commands and still deal with other smaller commands by doing internal DMA
pool allocations.  However, this would have to be optional since pool
fragmentation issues and such would make *truly* supporting this somewhat
messy down in the LLDD.  The old way of doing sg limiting should still
work (which is easy, just set the pool size to can_queue * sg_tablesize
and forget about it).  Certain Qlogic cards need this to be treated like a
finite resource where each command uses some number of these up out of a
total count pool.

However, the way that the Qlogic driver needs to calculate both the
available sg table entries and the available command queue slots is
actually odd enough that no generic mechanism for supporting it in the mid
layer is possible.  Only the driver really knows how these finite
resources are used up, and when you know the manipulations the driver goes
through then you'll agree only the driver should know, I'm sure ;-)

The Qlogic interface has a command queue slot which can start a command
plus contain up to 4 sg entries.  Then, to get more sg entries, you must
use a daisy chained command queue slot, which may contain up to 7 more sg
segments.  In this way, a command with a 12 segment sg table needs to use
3 command queue slots to be sent out to the device.  So, while the mid
layer would increase the active command count by 1 (while checking it
against can_queue to keep from sending too many commands to the device)
and also assumes that the number of available sg table entries is constant
for the next command, we would have actually used up 3 command slots and
21 possible sg table entries.  The Qlogic drivers attempt to make up for
this by futzing with can_queue and sg_tablesize in their queuecommand()  
routines.  Switching to per queue locks breaks this futzing horribly and
results in all sorts of crap with these drivers.
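
A sketch of the slot accounting just described, using only the figures
given above (4 sg entries in the initial slot, 7 in each chained slot).
The names are invented for illustration and this is not taken from the
actual Qlogic drivers.

```c
#define SG_IN_FIRST_SLOT 4  /* sg entries carried by the command slot itself */
#define SG_PER_CONT_SLOT 7  /* sg entries carried by each chained slot */

/* Number of command queue slots consumed by a command with nseg segments. */
unsigned int queue_slots_needed(unsigned int nseg)
{
    unsigned int rest;

    if (nseg <= SG_IN_FIRST_SLOT)
        return 1;

    /* Round the remaining segments up to whole continuation slots. */
    rest = nseg - SG_IN_FIRST_SLOT;
    return 1 + (rest + SG_PER_CONT_SLOT - 1) / SG_PER_CONT_SLOT;
}
```

A 12 segment command therefore consumes 3 slots even though the mid
layer counts it as a single active command.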

Properly supporting this type of driver interface model is something on my
list of "want to accomplish" items, but I really haven't decided how to 
try and accomplish it yet.  It would, at a minimum, require support from 
the LLDD and it would likely require that scsi_request_fn() be prepared to
stick a command back on the request queue if it turns out that not all the 
resources it thought were available truly were.

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606


* Re: SCSI Command scatter-gather number
  2002-07-31  5:11     ` Doug Ledford
@ 2002-07-31  5:26       ` Matthew Jacob
  2002-07-31  7:44       ` Jeremy Higdon
  1 sibling, 0 replies; 9+ messages in thread
From: Matthew Jacob @ 2002-07-31  5:26 UTC (permalink / raw)
  To: Doug Ledford; +Cc: Douglas Gilbert, Mike Anderson, Haofeng Kou, linux-scsi

> 
> However, the way that the Qlogic driver needs to calculate both the
> available sg table entries and the available command queue slots is
> actually odd enough that no generic mechanism for supporting it in the mid
> layer is possible.  Only the driver really knows how these finite
> resources are used up, and when you know the manipulations the driver goes
> through then you'll agree only the driver should know, I'm sure ;-)

It's a combination of:

	how much request queue space is available*

	how many s/g elements you need to use (depending on length of
	command and whether you need to use A64 or A32 entries)

	how much SRAM is on the card in question and how much was taken
	up by firmware version XYZ loaded from flash or host

	how many commands are in process

	how many current active exception conditions are in use

Note that the request queue space just is there long enough to put down a
command and notify the f/w on the qlogic. When it can, it moves the request
into onboard SRAM before beginning execution.

Except for running out of queue space with extremely long s/g lists, the rest
is far too hard to really figure out. 

Luckily the QLogic f/w will synthesize a QFULL if you go over execution
throttle, or otherwise overrun internal resources. So it goes.

-matt

* 256 entries for 1020/1040 product, any unsigned 16 bit value for
Ultra2/Ultra3 or FC products.


* Re: SCSI Command scatter-gather number
  2002-07-31  5:11     ` Doug Ledford
  2002-07-31  5:26       ` Matthew Jacob
@ 2002-07-31  7:44       ` Jeremy Higdon
  2002-07-31 13:36         ` Doug Ledford
  1 sibling, 1 reply; 9+ messages in thread
From: Jeremy Higdon @ 2002-07-31  7:44 UTC (permalink / raw)
  To: Doug Ledford, Douglas Gilbert; +Cc: Mike Anderson, Haofeng Kou, linux-scsi

On Jul 31,  1:11am, Doug Ledford wrote:
> 
> Most LLDD don't want to use too large of a number because to be truly safe
> you must preallocate enough memory for these sg structs at init time and
> hold it forever.  You obviously can't allocate memory for a sg struct when
> you are out of mem, it's a deadlock scenario.  So, 256 elements per
> command is 2048 bytes of permanently allocated DMAable memory per command 
> the host controller can queue (given the typical 8 byte, 32bit DMA 
> address hardware sg struct, gets worse when you support 64bit 
> addressing).  Support something like 256 commands on the card total and 
> that's now 512K of memory just for sg structs and doesn't count the actual 
> scsi command blocks that the cards also need dma memory for...

Why not allocate 1 (or a small number) of giant s/g lists, and N
medium lists (where N is the max # of outstanding commands).  Then,
when you get a request that needs more s/g entries than the medium,
you allocate a new list.  If that fails, you use the preallocated
giant list when it's available, leaving the request on the lld's
internal queue?

I found that a good tradeoff between maximum request size and reasonable
memory usage.
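
The scheme above might look roughly like the following user-space
sketch. Everything here is hypothetical: the sizes, the type and
function names, and the allocator passed in as a function pointer (in a
driver it would be a non-sleeping kernel allocation). It assumes the
caller never asks for more than GIANT_SG_ENTRIES segments.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MEDIUM_SG_ENTRIES 32    /* hypothetical per-command list size */
#define GIANT_SG_ENTRIES  1024  /* hypothetical shared fallback list size */

struct sg_entry { unsigned int addr, len; };  /* 8-byte, 32-bit style entry */

enum sg_source { SG_MEDIUM, SG_DYNAMIC, SG_GIANT, SG_DEFER };

struct sg_pool {
    struct sg_entry giant[GIANT_SG_ENTRIES];  /* preallocated once */
    bool giant_busy;
};

struct cmd_slot {
    struct sg_entry medium[MEDIUM_SG_ENTRIES];  /* preallocated per command */
    struct sg_entry *sg;                        /* list chosen for this command */
    enum sg_source source;
};

/* Stand-in for an allocator that fails because memory is exhausted. */
void *alloc_under_pressure(size_t n) { (void)n; return NULL; }

/* Choose an sg list: medium if it fits, else a fresh allocation, else the
 * shared giant list, else leave the command queued (SG_DEFER). */
enum sg_source pick_sg_list(struct sg_pool *pool, struct cmd_slot *cmd,
                            unsigned int nseg, void *(*alloc)(size_t))
{
    cmd->sg = NULL;
    if (nseg <= MEDIUM_SG_ENTRIES) {
        cmd->sg = cmd->medium;
        cmd->source = SG_MEDIUM;
    } else if ((cmd->sg = alloc(nseg * sizeof(struct sg_entry))) != NULL) {
        cmd->source = SG_DYNAMIC;
    } else if (!pool->giant_busy) {
        pool->giant_busy = true;
        cmd->sg = pool->giant;
        cmd->source = SG_GIANT;
    } else {
        cmd->source = SG_DEFER;  /* retry when the giant list is released */
    }
    return cmd->source;
}

/* Give back whatever pick_sg_list() handed out. */
void release_sg_list(struct sg_pool *pool, struct cmd_slot *cmd)
{
    if (cmd->source == SG_DYNAMIC)
        free(cmd->sg);
    else if (cmd->source == SG_GIANT)
        pool->giant_busy = false;
    cmd->sg = NULL;
}
```

Only one oversized command at a time can make progress once allocation
starts failing, which is the price paid for the smaller fixed footprint.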

Similarly for the Qlogic example, you don't have to worry about throttling
to the lld, since it can just leave commands on its internal queue if
there isn't enough space in the request queue.  You can then issue the
queued commands when the next command completes, or you can set a timer
to check to see if there is new space in the request queue.

jeremy

> There are some controllers out there that can't use a fixed size,
> unfortunately.  Look at the crap in qlogicfc.c and qlogicisp.c in their
> queuecommand() routines to see what I'm referring to.  That's something I
> would like to do something about in my changes, but I haven't decided if
> it's worth it.  What I would like to do actually is decouple the maximum
> sg table size from the available sg table size by introducing the concept
> of both a maximum single command sg table size and a sg pool limit.  This
> way a low level driver can support a really large size sg table on a few
> commands and still deal with other smaller commands by doing internal DMA
> pool allocations.  However, this would have to be optional since pool
> fragmentation issues and such would make *truly* supporting this somewhat
> messy down in the LLDD.  The old way of doing sg limiting should still
> work (which is easy, just set the pool size to can_queue * sg_tablesize
> and forget about it).  Certain Qlogic cards need this to be treated like a
> finite resource where each command uses some number of these up out of a
> total count pool.
> 
> However, the way that the Qlogic driver needs to calculate both the
> available sg table entries and the available command queue slots is
> actually odd enough that no generic mechanism for supporting it in the mid
> layer is possible.  Only the driver really knows how these finite
> resources are used up, and when you know the manipulations the driver goes
> through then you'll agree only the driver should know, I'm sure ;-)
> 
> The Qlogic interface has a command queue slot which can start a command
> plus contain up to 4 sg entries.  Then, to get more sg entries, you must
> use a daisy chained command queue slot, which may contain up to 7 more sg
> segments.  In this way, a command with a 12 segment sg table needs to use
> 3 command queue slots to be sent out to the device.  So, while the mid
> layer would increase the active command count by 1 (while checking it
> against can_queue to keep from sending too many commands to the device)
> and also assumes that the number of available sg table entries is constant
> for the next command, we would have actually used up 3 command slots and
> 21 possible sg table entries.  The Qlogic drivers attempt to make up for
> this by futzing with can_queue and sg_tablesize in their queuecommand()  
> routines.  Switching to per queue locks breaks this futzing horribly and
> results in all sorts of crap with these drivers.
> 
> Properly supporting this type of driver interface model is something on my
> list of "want to accomplish" items, but I really haven't decided how to 
> try and accomplish it yet.  It would, at a minimum, require support from 
> the LLDD and it would likely require that scsi_request_fn() be prepared to
> stick a command back on the request queue if it turns out that not all the 
> resources it thought were available truly were.


* Re: SCSI Command scatter-gather number
  2002-07-31  7:44       ` Jeremy Higdon
@ 2002-07-31 13:36         ` Doug Ledford
  2002-08-01  1:16           ` Jeremy Higdon
  0 siblings, 1 reply; 9+ messages in thread
From: Doug Ledford @ 2002-07-31 13:36 UTC (permalink / raw)
  To: Jeremy Higdon; +Cc: Douglas Gilbert, Mike Anderson, Haofeng Kou, linux-scsi

On Wed, Jul 31, 2002 at 12:44:42AM -0700, Jeremy Higdon wrote:
> On Jul 31,  1:11am, Doug Ledford wrote:
> > 
> > Most LLDD don't want to use too large of a number because to be truly safe
> > you must preallocate enough memory for these sg structs at init time and
> > hold it forever.  You obviously can't allocate memory for a sg struct when
> > you are out of mem, it's a deadlock scenario.  So, 256 elements per
> > command is 2048 bytes of permanently allocated DMAable memory per command 
> > the host controller can queue (given the typical 8 byte, 32bit DMA 
> > address hardware sg struct, gets worse when you support 64bit 
> > addressing).  Support something like 256 commands on the card total and 
> > that's now 512K of memory just for sg structs and doesn't count the actual 
> > scsi command blocks that the cards also need dma memory for...
> 
> Why not allocate 1 (or a small number) of giant s/g lists, and N
> medium lists (where N is the max # of outstanding commands).  Then,
> when you get a request that needs more s/g entries than the medium,
> you allocate a new list.  If that fails, you use the preallocated
> giant list when it's available, leaving the request on the lld's
> internal queue?

For random use patterns it might be sufficient, but under lots of load 
scenarios that I've seen this wouldn't work too well.  It depends on 
whether or not you want your driver to be able to do the full streaming 
stuff that tends to blow up this type of allocation.

> I found that a good tradeoff between maximum request size and reasonable
> memory usage.
> 
> Similarly for the Qlogic example, you don't have to worry about throttling
> to the lld, since it can just leave commands on its internal queue if
> there isn't enough space in the request queue.  You can then issue the
> queued commands when the next command completes, or you can set a timer
> to check to see if there is new space in the request queue.

This is exactly what I want to avoid.  There should not be any reason that 
all the low level drivers implement their own queues and timers.  That 
code most definitely *is* perfectly generic enough that I would like to 
see the mid layer get it right and I would like to see low level drivers 
yanking out queue code left and right, and instead the low level drivers 
should not be required to do any more than tell the mid layer yes I can 
take this or no I'm too busy right now, hold on to this until I'm ready.
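
The division of labour argued for here could be sketched as below. This
is a toy user-space model, not the mid layer's actual interface: all
types and function names are hypothetical, and the low-level driver's
resources are reduced to a single count of free slots.

```c
#include <stdbool.h>
#include <stddef.h>

struct mock_cmd {
    unsigned int slots_needed;  /* known only to the low-level driver */
    struct mock_cmd *next;
};

struct mock_lld { unsigned int free_slots; };

struct mock_midlayer { struct mock_cmd *head; };  /* the single shared queue */

/* The low-level driver's whole job: accept the command or say "busy". */
bool lld_try_queue(struct mock_lld *lld, const struct mock_cmd *cmd)
{
    if (cmd->slots_needed > lld->free_slots)
        return false;
    lld->free_slots -= cmd->slots_needed;
    return true;
}

/* The mid layer keeps refused commands at the head of its own queue. */
unsigned int midlayer_dispatch(struct mock_midlayer *ml, struct mock_lld *lld)
{
    unsigned int sent = 0;

    while (ml->head != NULL && lld_try_queue(lld, ml->head)) {
        ml->head = ml->head->next;
        sent++;
    }
    return sent;
}

/* On completion the resources return and the mid layer retries. */
void lld_complete(struct mock_midlayer *ml, struct mock_lld *lld,
                  const struct mock_cmd *cmd)
{
    lld->free_slots += cmd->slots_needed;
    midlayer_dispatch(ml, lld);
}
```

In this model the driver holds no queue or timer of its own; a refused
command simply waits in the mid layer until a completion frees room.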

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606
  


* Re: SCSI Command scatter-gather number
  2002-07-31 13:36         ` Doug Ledford
@ 2002-08-01  1:16           ` Jeremy Higdon
  0 siblings, 0 replies; 9+ messages in thread
From: Jeremy Higdon @ 2002-08-01  1:16 UTC (permalink / raw)
  To: Doug Ledford; +Cc: Douglas Gilbert, Mike Anderson, Haofeng Kou, linux-scsi

On Jul 31,  9:36am, Doug Ledford wrote:
> 
> On Wed, Jul 31, 2002 at 12:44:42AM -0700, Jeremy Higdon wrote:
> > On Jul 31,  1:11am, Doug Ledford wrote:
> > > 
> > > Most LLDD don't want to use too large of a number because to be truly safe
> > > you must preallocate enough memory for these sg structs at init time and
> > > hold it forever.  You obviously can't allocate memory for a sg struct when
> > > you are out of mem, it's a deadlock scenario.  So, 256 elements per
> > > command is 2048 bytes of permanently allocated DMAable memory per command 
> > > the host controller can queue (given the typical 8 byte, 32bit DMA 
> > > address hardware sg struct, gets worse when you support 64bit 
> > > addressing).  Support something like 256 commands on the card total and 
> > > that's now 512K of memory just for sg structs and doesn't count the actual 
> > > scsi command blocks that the cards also need dma memory for...
> > 
> > Why not allocate 1 (or a small number) of giant s/g lists, and N
> > medium lists (where N is the max # of outstanding commands).  Then,
> > when you get a request that needs more s/g entries than the medium,
> > you allocate a new list.  If that fails, you use the preallocated
> > giant list when it's available, leaving the request on the lld's
> > internal queue?
> 
> For random use patterns it might be sufficient, but under lots of load 
> scenarios that I've seen this wouldn't work too well.  It depends on 
> whether or not you want your driver to be able to do the full streaming 
> stuff that tends to blow up this type of allocation.

1.  You can tune the medium size s/g list size to whatever it needs to
    be for your usage.  The point is that you don't have to restrict
    your maximum I/O request size when you restrict your memory footprint.

2.  Memory allocation overhead for s/g lists for huge I/O requests is
    small, unless the memory allocator is very inefficient.


> > I found that a good tradeoff between maximum request size and reasonable
> > memory usage.
> > 
> > > Similarly for the Qlogic example, you don't have to worry about throttling
> > to the lld, since it can just leave commands on its internal queue if
> > there isn't enough space in the request queue.  You can then issue the
> > queued commands when the next command completes, or you can set a timer
> > to check to see if there is new space in the request queue.
> 
> This is exactly what I want to avoid.  There should not be any reason that 
> all the low level drivers implement their own queues and timers.  That 
> code most definitely *is* perfectly generic enough that I would like to 
> see the mid layer get it right and I would like to see low level drivers 
> yanking out queue code left and right, and instead the low level drivers 
> should not be required to do any more than tell the mid layer yes I can 
> take this or no I'm too busy right now, hold on to this until I'm ready.


That has the same effect as what I was describing.  It just moves the
code to another place and adds an interface call (scsi_does_lld_have_room()).
When would you check for room next -- when a command completes or after
a certain amount of time passes?

jeremy

