* 16 commands per lun limitation bug?
@ 2010-02-10 20:19 scameron
2010-02-10 21:59 ` James Bottomley
0 siblings, 1 reply; 5+ messages in thread
From: scameron @ 2010-02-10 20:19 UTC (permalink / raw)
To: linux-scsi; +Cc: mike.miller, james.bottomley, scameron
We have seen the number of commands per lun sent to the
low-level scsi driver limited to 16 commands per lun
(seemingly artificially, well below our can_queue and
cmd_per_lun limits of 1020).
2.6.29 does not exhibit this bad behavior.
2.6.30, 2.6.31, 2.6.32 (2.6.32.1 through 2.6.32.8) do exhibit this bad behavior
2.6.31-rc1 does not exhibit this bad behavior
Testing was done with fio with this config file:
[global]
description=4096 byte random reads - ${CONFIG}
readwrite=randread
blocksize=4K
ioengine=libaio
softrandommap=1
direct=1
runtime=2000
time_based
#ramp_time=10
thread
filename=/dev/sdb
[iodepth=128]
stonewall
iodepth=128
When the problem is present, we only see 16 commands per lun, when
not present, we see 128 commands per lun.
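The effect can be pictured with a toy model (a sketch only, not kernel code; the function name and numbers are illustrative): whatever iodepth the application submits, the concurrency actually observed at the lun is clipped by any device-side dispatch cap the block layer imposes.

```python
# Toy model: the application submits `iodepth` requests at once, but the
# I/O scheduler only keeps `device_cap` of them in flight at the device.
def observed_queue_depth(iodepth: int, device_cap: int) -> int:
    """Concurrency seen at the lun is the smaller of the two limits."""
    return min(iodepth, device_cap)

# With no artificial cap (cap >= iodepth), all 128 requests are in flight:
print(observed_queue_depth(128, 1020))  # -> 128
# With a 16-deep device-side cap, only 16 are ever outstanding:
print(observed_queue_depth(128, 16))    # -> 16
```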
Anybody else seen this problem?  Anybody know what caused the
breakage between 2.6.29 and 2.6.30, or what fixed this between
2.6.32 and 2.6.33-rc1?
Hard for me to do a git bisect as the only hardware I have capable
of exercising the test case is Smart Array, and hpsa didn't go in
until 2.6.33-rc1. Would like to know if other hardware/drivers
encounter this problem as well.
-- steve
* Re: 16 commands per lun limitation bug?
  2010-02-10 21:59 ` James Bottomley
From: James Bottomley @ 2010-02-10 21:59 UTC (permalink / raw)
To: scameron; +Cc: linux-scsi, mike.miller

On Wed, 2010-02-10 at 14:19 -0600, scameron@beardog.cce.hp.com wrote:
> We have seen the number of commands per lun sent to the
> low-level scsi driver limited to 16 commands per lun
> (seemingly artificially, well below our can_queue and
> cmd_per_lun limits of 1020).
>
> 2.6.29 does not exhibit this bad behavior.
> 2.6.30, 2.6.31, 2.6.32 (2.6.32.1 through 2.6.32.8) do exhibit this bad behavior
> 2.6.31-rc1 does not exhibit this bad behavior

I can't think of any reason for this.  Best guess at the fix would be
the new queue full ramp up/ramp down code, but no clue as to what the
problem is ... no other drivers seem to have noticed the performance
problems this would likely cause ... and 2.6.32 is becoming the
standard enterprise kernel.

James
* RE: 16 commands per lun limitation bug?
  2010-02-10 22:19 ` Miller, Mike (OS Dev)
From: Miller, Mike (OS Dev) @ 2010-02-10 22:19 UTC (permalink / raw)
To: James Bottomley, scameron@beardog.cce.hp.com; +Cc: linux-scsi@vger.kernel.org

> -----Original Message-----
> From: James Bottomley [mailto:James.Bottomley@suse.de]
> Sent: Wednesday, February 10, 2010 3:59 PM
> To: scameron@beardog.cce.hp.com
> Cc: linux-scsi@vger.kernel.org; Miller, Mike (OS Dev)
> Subject: Re: 16 commands per lun limitation bug?
>
> On Wed, 2010-02-10 at 14:19 -0600, scameron@beardog.cce.hp.com wrote:
> > We have seen the number of commands per lun sent to the
> > low-level scsi driver limited to 16 commands per lun
> > (seemingly artificially, well below our can_queue and
> > cmd_per_lun limits of 1020).
> >
> > 2.6.29 does not exhibit this bad behavior.
> > 2.6.30, 2.6.31, 2.6.32 (2.6.32.1 through 2.6.32.8) do exhibit this bad behavior
> > 2.6.31-rc1 does not exhibit this bad behavior
>
> I can't think of any reason for this.  Best guess at the fix would be
> the new queue full ramp up/ramp down code, but no clue as to what the
> problem is ... no other drivers seem to have noticed the performance
> problems this would likely cause ... and 2.6.32 is becoming the
> standard enterprise kernel.

James,
I'm not sure there's much hardware out there that's capable of queuing
up so many commands without choking the disks. I _think_ in most cases
if you queue up, say, 64 commands on a single scsi disk that's just too
much. But with Smart Array we can queue up to 1024 commands (on most
controllers) and those are then distributed across all the drives in
the array(s). IOW, we're thinking not many people would have noticed
such a change.

Hope this makes sense.

-- mikem
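Mike's point about spreading the controller-wide queue across the array members can be sketched as simple arithmetic (the drive counts here are hypothetical; how Smart Array actually distributes commands depends on the RAID layout):

```python
def per_disk_depth(controller_queue: int, ndrives: int) -> int:
    """Average outstanding commands each physical drive sees when the
    controller spreads its queue evenly across the array members."""
    return controller_queue // ndrives

# 1024 controller-wide commands over a 16-drive array averages 64 per
# physical disk; double the drives and each disk's share halves:
print(per_disk_depth(1024, 16))  # -> 64
print(per_disk_depth(1024, 32))  # -> 32
```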
* RE: 16 commands per lun limitation bug?
  2010-02-10 22:22 ` James Bottomley
From: James Bottomley @ 2010-02-10 22:22 UTC (permalink / raw)
To: Miller, Mike (OS Dev)
Cc: scameron@beardog.cce.hp.com, linux-scsi@vger.kernel.org

On Wed, 2010-02-10 at 22:19 +0000, Miller, Mike (OS Dev) wrote:
> [...]
>
> James,
> I'm not sure there's much hardware out there that's capable of queuing
> up so many commands without choking the disks. I _think_ in most cases
> if you queue up, say, 64 commands on a single scsi disk that's just too
> much. But with Smart Array we can queue up to 1024 commands (on most
> controllers) and those are then distributed across all the drives in
> the array(s). IOW, we're thinking not many people would have noticed
> such a change.
>
> Hope this makes sense.

Fibre drivers to FC arrays would regard a depth of 16 as "fiddling small
change", to quote Hitchhikers. If any of the FC drivers got limited in
this regard, we'd see substantial enterprise performance drops ...
which I haven't actually heard about yet.

James
* Re: 16 commands per lun limitation bug?
  2010-02-11  2:27 ` scameron
From: scameron @ 2010-02-11 2:27 UTC (permalink / raw)
To: James Bottomley
Cc: Miller, Mike (OS Dev), linux-scsi@vger.kernel.org, axboe, jay.allison, scameron

On Wed, Feb 10, 2010 at 05:22:40PM -0500, James Bottomley wrote:
> [...]
>
> Fibre drivers to FC arrays would regard a depth of 16 as "fiddling small
> change", to quote Hitchhikers. If any of the FC drivers got limited in
> this regard, we'd see substantial enterprise performance drops ...
> which I haven't actually heard about yet.

Nevertheless, the problem is repeatable here on varied hardware, and I
figured out how to make git bisect work, patching in hpsa as I went.
It went through the several thousand commits between v2.6.29 and
v2.6.30 and eventually homed in on this one as the apparent cause of
the symptoms I'm seeing:

Script started on Wed 10 Feb 2010 08:10:03 PM CST
scameron@sixx:~/lx26/linux-2.6> git bisect good
2f5cb7381b737e24c8046fd4aeab571fb71315f5 is first bad commit
commit 2f5cb7381b737e24c8046fd4aeab571fb71315f5
Author: Jens Axboe <jens.axboe@oracle.com>
Date:   Tue Apr 7 08:51:19 2009 +0200

    cfq-iosched: change dispatch logic to deal with single requests at the time

    The IO scheduler core calls into the IO scheduler dispatch_request hook
    to move requests from the IO scheduler and into the driver dispatch
    list. It only does so when the dispatch list is empty. CFQ moves several
    requests to the dispatch list, which can cause higher latencies if we
    suddenly have to switch to some important sync IO. Change the logic to
    move one request at the time instead.

    This should almost be functionally equivalent to what we did before,
    except that we now honor 'quantum' as the maximum queue depth at the
    device side from any single cfqq. If there's just a single active
    cfqq, we allow up to 4 times the normal quantum.

    Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

:040000 040000 be8b0f16d8faba88fc04e877eaa9240ec5d00648 5fd6eb925f065f411ed0d387277e45ba722ea154 M	block
scameron@sixx:~/lx26/linux-2.6> exit
exit
Script done on Wed 10 Feb 2010 08:10:34 PM CST

Any thoughts, Jens?

BTW, snipped out of the above email was an fio config file I was using
to do the testing to produce the symptoms. It was 4k random reads,
libaio, iodepth of 128, one drive, /dev/sdb, hpsa driver.
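The quoted commit message maps neatly onto the observed 16-command cap, if one assumes CFQ's 'quantum' tunable was at its default of 4 on these kernels: a lone active cfqq (one fio job against one lun) is allowed 4 x quantum = 16 in-flight requests. A minimal sketch of that policy (heavily simplified; the real cfq-iosched.c tracks far more state):

```python
# Simplified model of the dispatch policy described in commit 2f5cb738.
# CFQ_QUANTUM = 4 is an assumption: cfq's default 'quantum' at the time.
CFQ_QUANTUM = 4

def may_dispatch(in_driver: int, active_queues: int) -> bool:
    """Dispatch one request at a time, honoring 'quantum' as the max
    device-side depth per cfqq; a lone cfqq may use 4x the quantum."""
    limit = 4 * CFQ_QUANTUM if active_queues == 1 else CFQ_QUANTUM
    return in_driver < limit

# A single busy queue is capped at 4 * 4 = 16 in-flight requests,
# no matter how large the submitted iodepth is:
print(may_dispatch(15, 1))  # -> True  (below the 16-request cap)
print(may_dispatch(16, 1))  # -> False (at the cap)
```

Under this model, raising the quantum (or having more than one competing cfqq share the device) changes the cap, which is consistent with the symptom appearing only once CFQ began dispatching single requests and enforcing quantum as a device-side depth.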