* 16 commands per lun limitation bug?
@ 2010-02-10 20:19 scameron
2010-02-10 21:59 ` James Bottomley
0 siblings, 1 reply; 5+ messages in thread
From: scameron @ 2010-02-10 20:19 UTC (permalink / raw)
To: linux-scsi; +Cc: mike.miller, james.bottomley, scameron
We have seen the number of commands per lun sent to the
low-level scsi driver limited to 16 commands per lun
(seemingly artificially, well below our can_queue and
cmd_per_lun limits of 1020).
2.6.29 does not exhibit this bad behavior.
2.6.30, 2.6.31, 2.6.32 (2.6.32.1 through 2.6.32.8) do exhibit this bad behavior
2.6.31-rc1 does not exhibit this bad behavior
Testing was done with fio with this config file:
[global]
description=4096 byte random reads - ${CONFIG}
readwrite=randread
blocksize=4K
ioengine=libaio
softrandommap=1
direct=1
runtime=2000
time_based
#ramp_time=10
thread
filename=/dev/sdb
[iodepth=128]
stonewall
iodepth=128
When the problem is present, we only see 16 commands per lun, when
not present, we see 128 commands per lun.
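The effect can be pictured with a toy model (a sketch only, not kernel code; the function name and numbers are illustrative): whatever iodepth the application submits, the concurrency actually observed at the lun is clipped by any device-side dispatch cap the block layer imposes.

```python
# Toy model: the application submits `iodepth` requests at once, but the
# I/O scheduler only keeps `device_cap` of them in flight at the device.
def observed_queue_depth(iodepth: int, device_cap: int) -> int:
    """Concurrency seen at the lun is the smaller of the two limits."""
    return min(iodepth, device_cap)

# With no artificial cap (cap >= iodepth), all 128 requests are in flight:
print(observed_queue_depth(128, 1020))  # -> 128
# With a 16-deep device-side cap, only 16 are ever outstanding:
print(observed_queue_depth(128, 16))    # -> 16
```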
Anybody else seen this problem?  Anybody know what caused the
breakage between 2.6.29 and 2.6.30, or what fixed this between
2.6.32 and 2.6.33-rc1?
Hard for me to do a git bisect as the only hardware I have capable
of exercising the test case is Smart Array, and hpsa didn't go in
until 2.6.33-rc1. Would like to know if other hardware/drivers
encounter this problem as well.
-- steve
* Re: 16 commands per lun limitation bug?
  2010-02-10 21:59 ` James Bottomley
From: James Bottomley @ 2010-02-10 21:59 UTC (permalink / raw)
To: scameron; +Cc: linux-scsi, mike.miller

On Wed, 2010-02-10 at 14:19 -0600, scameron@beardog.cce.hp.com wrote:
> We have seen the number of commands per lun sent to the
> low-level scsi driver limited to 16 commands per lun
> (seemingly artificially, well below our can_queue and
> cmd_per_lun limits of 1020).
>
> 2.6.29 does not exhibit this bad behavior.
> 2.6.30, 2.6.31, 2.6.32 (2.6.32.1 through 2.6.32.8) do exhibit this bad behavior
> 2.6.31-rc1 does not exhibit this bad behavior

I can't think of any reason for this.  Best guess at the fix would be
the new queue full ramp up/ramp down code, but no clue as to what the
problem is ... no other drivers seem to have noticed the performance
problems this would likely cause ... and 2.6.32 is becoming the
standard enterprise kernel.

James
* RE: 16 commands per lun limitation bug?
  2010-02-10 22:19 ` Miller, Mike (OS Dev)
From: Miller, Mike (OS Dev) @ 2010-02-10 22:19 UTC (permalink / raw)
To: James Bottomley, scameron@beardog.cce.hp.com; +Cc: linux-scsi@vger.kernel.org

> -----Original Message-----
> From: James Bottomley [mailto:James.Bottomley@suse.de]
> Sent: Wednesday, February 10, 2010 3:59 PM
> To: scameron@beardog.cce.hp.com
> Cc: linux-scsi@vger.kernel.org; Miller, Mike (OS Dev)
> Subject: Re: 16 commands per lun limitation bug?
>
> On Wed, 2010-02-10 at 14:19 -0600, scameron@beardog.cce.hp.com wrote:
> > We have seen the number of commands per lun sent to the
> > low-level scsi driver limited to 16 commands per lun
> > (seemingly artificially, well below our can_queue and
> > cmd_per_lun limits of 1020).
> >
> > 2.6.29 does not exhibit this bad behavior.
> > 2.6.30, 2.6.31, 2.6.32 (2.6.32.1 through 2.6.32.8) do exhibit this bad behavior
> > 2.6.31-rc1 does not exhibit this bad behavior
>
> I can't think of any reason for this.  Best guess at the fix would be
> the new queue full ramp up/ramp down code, but no clue as to what the
> problem is ... no other drivers seem to have noticed the performance
> problems this would likely cause ... and 2.6.32 is becoming the
> standard enterprise kernel.

James,
I'm not sure there's much hardware out there that's capable of queuing
up so many commands without choking the disks. I _think_ in most cases
if you queue up, say, 64 commands on a single scsi disk that's just too
much. But with Smart Array we can queue up to 1024 commands (on most
controllers) and those are then distributed across all the drives in
the array(s). IOW, we're thinking not many people would have noticed
such a change.

Hope this makes sense.

-- mikem
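Mike's point about spreading the controller-wide queue across the array members can be sketched as simple arithmetic (the drive counts here are hypothetical; how Smart Array actually distributes commands depends on the RAID layout):

```python
def per_disk_depth(controller_queue: int, ndrives: int) -> int:
    """Average outstanding commands each physical drive sees when the
    controller spreads its queue evenly across the array members."""
    return controller_queue // ndrives

# 1024 controller-wide commands over a 16-drive array averages 64 per
# physical disk; double the drives and each disk's share halves:
print(per_disk_depth(1024, 16))  # -> 64
print(per_disk_depth(1024, 32))  # -> 32
```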
* RE: 16 commands per lun limitation bug?
  2010-02-10 22:22 ` James Bottomley
From: James Bottomley @ 2010-02-10 22:22 UTC (permalink / raw)
To: Miller, Mike (OS Dev)
Cc: scameron@beardog.cce.hp.com, linux-scsi@vger.kernel.org

On Wed, 2010-02-10 at 22:19 +0000, Miller, Mike (OS Dev) wrote:
> [...]
>
> James,
> I'm not sure there's much hardware out there that's capable of queuing
> up so many commands without choking the disks. I _think_ in most cases
> if you queue up, say, 64 commands on a single scsi disk that's just too
> much. But with Smart Array we can queue up to 1024 commands (on most
> controllers) and those are then distributed across all the drives in
> the array(s). IOW, we're thinking not many people would have noticed
> such a change.
>
> Hope this makes sense.

Fibre drivers to FC arrays would regard a depth of 16 as "fiddling small
change", to quote Hitchhikers. If any of the FC drivers got limited in
this regard, we'd see substantial enterprise performance drops ...
which I haven't actually heard about yet.

James
* Re: 16 commands per lun limitation bug?
  2010-02-11  2:27 ` scameron
From: scameron @ 2010-02-11 2:27 UTC (permalink / raw)
To: James Bottomley
Cc: Miller, Mike (OS Dev), linux-scsi@vger.kernel.org, axboe, jay.allison, scameron

On Wed, Feb 10, 2010 at 05:22:40PM -0500, James Bottomley wrote:
> [...]
>
> Fibre drivers to FC arrays would regard a depth of 16 as "fiddling small
> change", to quote Hitchhikers. If any of the FC drivers got limited in
> this regard, we'd see substantial enterprise performance drops ...
> which I haven't actually heard about yet.

Nevertheless, the problem is repeatable here on varied hardware, and I
figured out how to make git bisect work, patching in hpsa as I went.
It went through the several thousand commits between v2.6.29 and
v2.6.30 and eventually homed in on this one as the apparent cause of
the symptoms I'm seeing:

Script started on Wed 10 Feb 2010 08:10:03 PM CST
scameron@sixx:~/lx26/linux-2.6> git bisect good
2f5cb7381b737e24c8046fd4aeab571fb71315f5 is first bad commit
commit 2f5cb7381b737e24c8046fd4aeab571fb71315f5
Author: Jens Axboe <jens.axboe@oracle.com>
Date:   Tue Apr 7 08:51:19 2009 +0200

    cfq-iosched: change dispatch logic to deal with single requests at the time

    The IO scheduler core calls into the IO scheduler dispatch_request hook
    to move requests from the IO scheduler and into the driver dispatch
    list. It only does so when the dispatch list is empty. CFQ moves several
    requests to the dispatch list, which can cause higher latencies if we
    suddenly have to switch to some important sync IO. Change the logic to
    move one request at the time instead.

    This should almost be functionally equivalent to what we did before,
    except that we now honor 'quantum' as the maximum queue depth at the
    device side from any single cfqq. If there's just a single active
    cfqq, we allow up to 4 times the normal quantum.

    Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

:040000 040000 be8b0f16d8faba88fc04e877eaa9240ec5d00648 5fd6eb925f065f411ed0d387277e45ba722ea154 M	block
scameron@sixx:~/lx26/linux-2.6> exit
exit
Script done on Wed 10 Feb 2010 08:10:34 PM CST

Any thoughts, Jens?

BTW, snipped out of the above email was an fio config file I was using
to do the testing to produce the symptoms. It was 4k random reads,
libaio, iodepth of 128, one drive, /dev/sdb, hpsa driver.
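The quoted commit message maps neatly onto the observed 16-command cap, if one assumes CFQ's 'quantum' tunable was at its default of 4 on these kernels: a lone active cfqq (one fio job against one lun) is allowed 4 x quantum = 16 in-flight requests. A minimal sketch of that policy (heavily simplified; the real cfq-iosched.c tracks far more state):

```python
# Simplified model of the dispatch policy described in commit 2f5cb738.
# CFQ_QUANTUM = 4 is an assumption: cfq's default 'quantum' at the time.
CFQ_QUANTUM = 4

def may_dispatch(in_driver: int, active_queues: int) -> bool:
    """Dispatch one request at a time, honoring 'quantum' as the max
    device-side depth per cfqq; a lone cfqq may use 4x the quantum."""
    limit = 4 * CFQ_QUANTUM if active_queues == 1 else CFQ_QUANTUM
    return in_driver < limit

# A single busy queue is capped at 4 * 4 = 16 in-flight requests,
# no matter how large the submitted iodepth is:
print(may_dispatch(15, 1))  # -> True  (below the 16-request cap)
print(may_dispatch(16, 1))  # -> False (at the cap)
```

Under this model, raising the quantum (or having more than one competing cfqq share the device) changes the cap, which is consistent with the symptom appearing only once CFQ began dispatching single requests and enforcing quantum as a device-side depth.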