* io_request_lock patch?
@ 2001-07-09 19:39 Jonathan Lahr
  2001-07-09 19:44 ` Jens Axboe
  0 siblings, 1 reply; 22+ messages in thread

From: Jonathan Lahr @ 2001-07-09 19:39 UTC
To: linux-kernel

I have heard that a patch to reduce io_request_lock contention by breaking it
into per-queue locks was released in the past. Does anyone know where I could
find this patch, if it exists?

Thanks,
Jonathan

-- 
Jonathan Lahr
IBM Linux Technology Center
Beaverton, Oregon
lahr@us.ibm.com
503-578-3385
* Re: io_request_lock patch?

From: Jens Axboe @ 2001-07-09 19:44 UTC
To: Jonathan Lahr; +Cc: linux-kernel

On Mon, Jul 09 2001, Jonathan Lahr wrote:
>
> I have heard that a patch to reduce io_request_lock contention by
> breaking it into per queue locks was released in the past. Does
> anyone know where I could find this patch if it exists?

I had a patch about a year ago that did it safely for the block layer and IDE
at least, and also for selected SCSI HBAs. Some of the latter variety are
pretty hard and/or tedious to fix up; Eric Y has done some work automating
this process almost completely. Until that is done, the general patch has no
chance of being integrated.

It's also interesting to take a look at _why_ there's contention on the
io_request_lock, and to fix those causes up first.

-- 
Jens Axboe
* Re: io_request_lock patch?

From: Jonathan Lahr @ 2001-07-10 19:49 UTC
To: Jens Axboe; +Cc: linux-kernel, linux-scsi

Jens Axboe [axboe@suse.de] wrote:
> I had a patch about a year ago that did it safely for the block layer
> and IDE at least, and also for selected SCSI hba's. Some of the latter
> variety are pretty hard and/or tedious to fixup, Eric Y has done some
> work automating this process almost completely. Until that is done, the
> general patch has no chance of being integrated.

I am investigating reducing io_request_lock contention in the shorter term,
if possible with smaller incremental modifications. So I am first trying to
discover any previous work that might have been done toward this purpose.

-- 
Jonathan Lahr
IBM Linux Technology Center
Beaverton, Oregon
lahr@us.ibm.com
503-578-3385
* Re: io_request_lock patch?

From: Eric Youngdale @ 2001-07-10 20:09 UTC
To: Jonathan Lahr, Jens Axboe; +Cc: linux-kernel, linux-scsi

The bit that I had automated was essentially to fix each and every low-level
SCSI driver such that each low-level driver would be responsible for its own
locking. At this point the patches and the tool are on hold; once the 2.5
kernel series gets underway, I can generate some fairly massive patchsets.

-Eric

----- Original Message -----
From: "Jonathan Lahr" <lahr@us.ibm.com>
To: "Jens Axboe" <axboe@suse.de>
Cc: <linux-kernel@vger.kernel.org>; <linux-scsi@vger.kernel.org>
Sent: Tuesday, July 10, 2001 3:49 PM
Subject: Re: io_request_lock patch?

> I am investigating reducing io_request_lock contention in the shorter term
> if possible with smaller incremental modifications. So I'm first trying to
> discover any previous work that might have been done toward this purpose.
* Re: io_request_lock patch?

From: Jens Axboe @ 2001-07-11 8:05 UTC
To: Eric Youngdale; +Cc: Jonathan Lahr, linux-kernel, linux-scsi

On Tue, Jul 10 2001, Eric Youngdale wrote:
> The bit that I had automated was to essentially fix each and every
> low-level SCSI driver such that each low-level driver would be
> responsible for it's own locking. At this point the patches and the tool
> are on hold - once the 2.5 kernel series gets underway, I can generate
> some fairly massive patchsets.

Cool, sounds good Eric. I'll probably go ahead and complete the initial
ll_rw_blk and IDE work, along with a few selected SCSI drivers, then leave
the grunt work to your tools.

-- 
Jens Axboe
* Re: io_request_lock patch?

From: Dipankar Sarma @ 2001-07-10 11:55 UTC
To: axboe; +Cc: linux-kernel

Hi Jens,

In article <20010709214453.U16505@suse.de> you wrote:
> It's also interesting to take a look at _why_ there's contention on the
> io_request_lock. And fix those up first.

Here are some lockmeter outputs for tiobench
(http://sourceforge.net/projects/tiobench), a simple benchmark that we tried
on an ext2 filesystem: 4 concurrent threads doing random/sequential
read/write on 10MB files on a 4-way PIII 700MHz machine with 1MB L2 cache.

SPINLOCKS        HOLD            WAIT
 UTIL   CON    MEAN(  MAX )   MEAN(  MAX )(% CPU)     TOTAL NOWAIT  SPIN RJECT  NAME

 2.9%  26.7%  7.4us( 706us)   72us( 920us)( 1.9%)   1557496  73.3% 26.7%    0%  io_request_lock
0.00%  34.9%  0.5us( 2.8us)   63us( 839us)(0.04%)     29478  65.1% 34.9%    0%  __get_request_wait+0x98
 2.6%   4.7%   17us( 706us)   69us( 740us)(0.13%)    617741  95.3%  4.7%    0%  __make_request+0x110
0.07%  60.2%  0.5us( 4.0us)   72us( 920us)( 1.7%)    610820  39.8% 60.2%    0%  blk_get_queue+0x10
0.09%   2.9%  6.6us(  55us)  102us( 746us)(0.01%)     55327  97.1%  2.9%    0%  do_aic7xxx_isr+0x24
0.00%   3.7%  0.3us(  22us)   29us( 569us)(0.00%)     22602  96.3%  3.7%    0%  generic_unplug_device+0x10
0.02%   4.9%  1.3us(  27us)   54us( 621us)(0.01%)     55382  95.1%  4.9%    0%  scsi_dispatch_cmd+0x12c
0.02%   1.3%  1.2us( 8.0us)   23us( 554us)(0.00%)     55382  98.7%  1.3%    0%  scsi_old_done+0x5b8
0.04%   3.2%  2.8us(  31us)  200us( 734us)(0.02%)     55382  96.8%  3.2%    0%  scsi_queue_next_request+0x18
0.02%   1.4%  1.1us( 7.8us)   46us( 638us)(0.00%)     55382  98.6%  1.4%    0%  scsi_request_fn+0x350

1557496 * 26.7% * 72us makes about 30 seconds of time spent waiting for the
io_request_lock. That is nearly one-third of the total system time (about 98
seconds). As the number of CPUs increases, this will likely worsen.
It also seems that __make_request() holds the lock for the largest amount of
time. This hold time isn't likely to change significantly with a per-queue
lock, but at least it will not delay queueing I/O requests to other devices.

Besides, I am not sure that blk_get_queue() really needs to grab the
io_request_lock. blk_dev[] entries aren't likely to be updated for an open
device, so it should be safe to look up the queue of an open device without
the lock. For mutual exclusion in the device-specific queue() function, it
might be better to leave it to the driver instead of forcing the mutual
exclusion. For example, a driver might want to use a reader/writer lock to
look up its device table for the queue. It also might make sense to have
separate mutual-exclusion mechanisms for the block-device and SCSI-device
level queues.

Thanks
Dipankar

-- 
Dipankar Sarma <dipankar@sequent.com>  Project: http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.
* Re: io_request_lock patch?

From: Mike Anderson @ 2001-07-10 23:05 UTC
To: Dipankar Sarma; +Cc: axboe, linux-kernel

The call to do_aic7xxx_isr suggests that you are running the aic7xxx_old.c
code. This driver uses the io_request_lock to protect internal data; the
newer aic driver has its own lock. This is related to the previous comments
by Jens and Eric about lower-level use of this lock.

I would also like to know why the request_freelist is going empty. Having
__get_request_wait called a lot would appear to be suboptimal.

-Mike

Dipankar Sarma [dipankar@sequent.com] wrote:
> Here are some lockmeter outputs for tiobench
> (http://sourceforge.net/projects/tiobench), a simple benchmark
> that we tried on ext2 filesystem.
> [lockmeter output snipped]
> 1557496*26.7%*72us makes it about 30 seconds of time waiting for
> io_request_lock. That is nearly one-third of the total system time
> (about 98 seconds). As number of CPUs increase, this will likely
> worsen.

-- 
Michael Anderson
mike.anderson@us.ibm.com
* Re: io_request_lock patch?

From: Jens Axboe @ 2001-07-11 7:15 UTC
To: Mike Anderson; +Cc: Dipankar Sarma, linux-kernel

On Tue, Jul 10 2001, Mike Anderson wrote:
> The call to do_aic7xxx_isr appears that you are running the aic7xxx_old.c
> code. This driver is using the io_request_lock to protect internal data.
> The newer aic driver has its own lock. This is related to previous
> comments by Jens and Eric about lower level use of this lock.

Yes.

> I would like to know why the request_freelist is going empty?

Well, it's a limited resource, so it's bound to go empty under load. In
fact, if you have plenty of RAM, this is what will end up starting I/O on
the queued buffers. With the batch freeing of request slots, this is OK.

> Having __get_request_wait being called alot would appear to be not
> optimal.

?? Care to explain the reasoning behind this statement.

-- 
Jens Axboe
* Re: io_request_lock patch?

From: Dipankar Sarma @ 2001-07-11 8:53 UTC
To: Mike Anderson; +Cc: axboe, linux-kernel

Hi Mike,

On Tue, Jul 10, 2001 at 04:05:12PM -0700, Mike Anderson wrote:
> The call to do_aic7xxx_isr appears that you are running the aic7xxx_old.c
> code. This driver is using the io_request_lock to protect internal data.
> The newer aic driver has its own lock.

There were some problems booting with the new aic7xxx driver and the 2.4.4
kernel. This may have been fixed in later kernels, so we will check this
again. Besides, I wasn't aware that the new aic7xxx driver uses a different
locking model; thanks for letting me know.

> I would like to know why the request_freelist is going empty? Having
> __get_request_wait being called alot would appear to be not optimal.

It is not unreasonable for request IOCB pools to go empty; the important
issue is at what rate. If a large portion of I/Os have to wait for request
structures to be freed, we may not be able to utilize the available hardware
bandwidth of the system optimally when we need, say, a large number of I/Os
per second. On the other hand, having a large number of request structures
available may not necessarily give you a large number of I/Os per second.
The thing to look at would be how well we are utilizing the queueing
capability of the hardware given a particular type of workload.

Thanks
Dipankar

-- 
Dipankar Sarma <dipankar@sequent.com>  Project: http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.
* Re: io_request_lock patch?

From: Jens Axboe @ 2001-07-11 8:53 UTC
To: Dipankar Sarma; +Cc: Mike Anderson, linux-kernel

On Wed, Jul 11 2001, Dipankar Sarma wrote:
> It is not unreasonable for request IOCB pools to go empty, the important
> issue is at what rate ? [...] The thing to look at would be - how well
> are we utilizing the queueing capablility of the hardware given a
> particular type of workload.

The queue lengths should always be long enough to keep the hardware busy, of
course. In addition, the bigger the queue, the bigger the chance of skipping
seeks due to reordering. But don't worry, I've scaled the queue lengths so
I'm pretty sure they are always on the safe side in size.

It's pretty easy to test for yourself if you want: just change
QUEUE_NR_REQUESTS in blkdev.h. It's currently 8192, and the request slots
are scaled down from this value. 8k will give you twice as many slots as you
have RAM in MB, i.e. 2048 on a 1-gig machine:

  block: queued sectors max/low 683554kB/552482kB, 2048 slots per queue

-- 
Jens Axboe
* Re: io_request_lock patch?

From: Dipankar Sarma @ 2001-07-11 14:02 UTC
To: Jens Axboe; +Cc: Mike Anderson, linux-kernel

On Wed, Jul 11, 2001 at 10:53:39AM +0200, Jens Axboe wrote:
> It's pretty easy to test for yourself if you want, just change
> QUEUE_NR_REQUESTS in blkdev.h. It's currently 8192, the request slots
> are scaled down from this value. 8k will give you twice the amount of
> slots that you have RAM in mb, ie 2048 on a 1gig machine.
>
> block: queued sectors max/low 683554kB/552482kB, 2048 slots per queue

Hmm. The tiobench run was done on a 1GB machine and we still ran out of
request slots. Will investigate.

Thanks
Dipankar

-- 
Dipankar Sarma <dipankar@sequent.com>  Project: http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.
* Re: io_request_lock patch?

From: Jens Axboe @ 2001-07-11 14:01 UTC
To: Dipankar Sarma; +Cc: Mike Anderson, linux-kernel

On Wed, Jul 11 2001, Dipankar Sarma wrote:
> Hmm.. The tiobench run was done on a 1GB machine and we still ran
> out of request slots. Will investigate.

Sure, that's to be expected; if we never ran out, we would be wasting
memory. My point is that you should rerun the same test with more request
slots, and I'd be surprised if you gained any significant performance on
that account. I never said that you'd never run out; that's of course not
true. In fact, running out is typically what starts the I/O on a 1GB
machine and bigger.

-- 
Jens Axboe
* Re: io_request_lock patch?

From: Dipankar Sarma @ 2001-07-11 14:55 UTC
To: Jens Axboe; +Cc: Mike Anderson, linux-kernel

Hi Jens,

On Wed, Jul 11, 2001 at 04:01:48PM +0200, Jens Axboe wrote:
> Sure, that's to be expected. If we never ran out we would be wasting
> memory. My point is that you should rerun the same test with more
> request slots -- and I'd be surprised if you gained any significant
> performance on that account. I never said that you'd never run out,
> that's of course not true. In fact, running out is what starts the I/O
> typically on a 1GB machine and bigger.

I agree that running out of slots doesn't necessarily mean anything; I was
just wondering about the queue lengths. On the face of it, 2048 requests
pending for a single disk seems like a fairly large queue length. Is this
all related to plugging of queues for disk-block optimization?

Thanks
Dipankar

-- 
Dipankar Sarma <dipankar@sequent.com>  Project: http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.
* Re: io_request_lock patch?

From: Jens Axboe @ 2001-07-11 19:16 UTC
To: Dipankar Sarma; +Cc: Mike Anderson, linux-kernel

On Wed, Jul 11 2001, Dipankar Sarma wrote:
> I agree that running out of slots don't necessarily mean anything. I
> was just wondering about the queue lengths. On the face of it, 2048
> requests pending for a single disk seems like a fairly large queue
> length. Is this all related to plugging of queues for disk block
> optimization ?

Plugging to a degree, although when you get queues this long you can get
nice merging even though the queue will probably never get replugged
(because it never empties). The merging is the most important step, and
also the insertion sort of course.

-- 
Jens Axboe
* Re: io_request_lock patch?

From: Mike Anderson @ 2001-07-11 16:02 UTC
To: Dipankar Sarma; +Cc: axboe, linux-kernel

Sorry for the slow fingers, just trying to catch up on this thread.

Dipankar Sarma [dipankar@sequent.com] wrote:
> It is not unreasonable for request IOCB pools to go empty, the important
> issue is at what rate ? [...] The thing to look at would be - how well
> are we utilizing the queueing capablility of the hardware given a
> particular type of workload.

Jens, I think Dipankar might have stated my concern about optimal
utilization of a pool of resources shared by all device queues in the last
sentence of the above paragraph.

My thought was that if one has enough I/O that cannot be merged on a queue,
you eat up request descriptors. If a request queue contains more requests
than can be put in flight by the lower level to a spindle, then this
resource might be better used by other request queues.

I might be missing something; I will look at the code some more. Thanks.

-- 
Michael Anderson
mike.anderson@us.ibm.com
* Re: io_request_lock patch?

From: Jens Axboe @ 2001-07-11 19:20 UTC
To: Mike Anderson; +Cc: Dipankar Sarma, linux-kernel

On Wed, Jul 11 2001, Mike Anderson wrote:
> My thought was that if one has enough IO that cannot be merged on a queue
> you eat up request descriptors. If a request queue contains more requests
> than can be put in flight by the lower level to a spindle than this
> resource might be better used by other request queues.

True. In theory it would be possible to do request-slot stealing from idle
queues; in fact it's doable without adding any additional overhead to
struct request. I did discuss this with [someone, forgot who] last year,
when the per-queue slots were introduced.

I'm not sure I want to do this, though. If you have lots of disks, then
yes, there will be some wastage if they are idle; IMO that's OK. What's not
OK, and what I do want to fix, is that slower devices get just as many
slots as a 15K disk, for instance. For, say, floppy or CDROM devices we
really don't need to waste that much RAM. This will change for 2.5, not
before.

-- 
Jens Axboe
* Re: io_request_lock patch?

From: Dipankar Sarma @ 2001-07-11 20:13 UTC
To: Jens Axboe; +Cc: mike.anderson, linux-kernel

On Wed, Jul 11, 2001 at 09:20:22PM +0200, Jens Axboe wrote:
> I'm not sure I want to do this though. If you have lots of disks, then
> yes there will be some wastage if they are idle. IMO that's ok. What's
> not ok and what I do want to fix is that slower devices get just as many
> slots as a 15K disk for instance. For, say, floppy or CDROM devices we
> really don't need to waste that much RAM. This will change for 2.5, not
> before.

Unless there is some serious evidence substantiating the need to steal
request slots from other devices to avoid starvation, it makes sense to
avoid it and go for a simpler scheme. I suspect that device-type-based slot
allocation should suffice.

Thanks
Dipankar

-- 
Dipankar Sarma <dipankar@sequent.com>  Project: http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.
* Re: io_request_lock patch? 2001-07-11 20:13 ` Dipankar Sarma @ 2001-07-11 20:17 ` Jens Axboe 2001-07-11 21:05 ` Mike Anderson 0 siblings, 1 reply; 22+ messages in thread From: Jens Axboe @ 2001-07-11 20:17 UTC (permalink / raw) To: Dipankar Sarma; +Cc: mike.anderson, linux-kernel On Thu, Jul 12 2001, Dipankar Sarma wrote: > On Wed, Jul 11, 2001 at 09:20:22PM +0200, Jens Axboe wrote: > > True. In theory it would be possible to do request slot stealing from > > idle queues, in fact it's doable without adding any additional overhead > > to struct request. I did discuss this with [someone, forgot who] last > > year, when the per-queue slots where introduced. > > > > I'm not sure I want to do this though. If you have lots of disks, then > > yes there will be some wastage if they are idle. IMO that's ok. What's > > not ok and what I do want to fix is that slower devices get just as many > > slots as a 15K disk for instance. For, say, floppy or CDROM devices we > > really don't need to waste that much RAM. This will change for 2.5, not > > before. > > Unless there is some serious evidence substantiating the need for > stealing request slots from other devices to avoid starvation, it > makes sense to avoid it and go for a simpler scheme. I suspect that device > type based slot allocation should just suffice. My point exactly. And typically, if you have lots of queues you have lots of RAM. A standard 128meg desktop machine does not waste a whole lot. -- Jens Axboe ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: io_request_lock patch? 2001-07-11 20:17 ` Jens Axboe @ 2001-07-11 21:05 ` Mike Anderson 0 siblings, 0 replies; 22+ messages in thread From: Mike Anderson @ 2001-07-11 21:05 UTC (permalink / raw) To: Jens Axboe; +Cc: Dipankar Sarma, linux-kernel Jens Axboe [axboe@suse.de] wrote: > On Thu, Jul 12 2001, Dipankar Sarma wrote: > > On Wed, Jul 11, 2001 at 09:20:22PM +0200, Jens Axboe wrote: > > > True. In theory it would be possible to do request slot stealing from > > > idle queues, in fact it's doable without adding any additional overhead > > > to struct request. I did discuss this with [someone, forgot who] last > > > year, when the per-queue slots where introduced. > > > > > > I'm not sure I want to do this though. If you have lots of disks, then > > > yes there will be some wastage if they are idle. IMO that's ok. What's > > > not ok and what I do want to fix is that slower devices get just as many > > > slots as a 15K disk for instance. For, say, floppy or CDROM devices we > > > really don't need to waste that much RAM. This will change for 2.5, not > > > before. > > > > Unless there is some serious evidence substantiating the need for > > stealing request slots from other devices to avoid starvation, it > > makes sense to avoid it and go for a simpler scheme. I suspect that device > > type based slot allocation should just suffice. > > My point exactly. And typically, if you have lots of queues you have > lots of RAM. A standard 128meg desktop machine does not waste a whole > lot. I would vote for the simpler approach of slots. :-). > > -- > Jens Axboe -- Michael Anderson mike.anderson@us.ibm.com ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: io_request_lock patch? 2001-07-10 11:55 Dipankar Sarma 2001-07-10 23:05 ` Mike Anderson @ 2001-07-11 7:19 ` Jens Axboe 2001-07-11 8:39 ` Dipankar Sarma 1 sibling, 1 reply; 22+ messages in thread From: Jens Axboe @ 2001-07-11 7:19 UTC (permalink / raw) To: Dipankar Sarma; +Cc: linux-kernel On Tue, Jul 10 2001, Dipankar Sarma wrote: > In article <20010709214453.U16505@suse.de> you wrote: > > On Mon, Jul 09 2001, Jonathan Lahr wrote: > > It's also interesting to take a look at _why_ there's contention on the > > io_request_lock. And fix those up first. > > > -- > > Jens Axboe > > Here are some lockmeter outputs for tiobench > (http://sourceforge.net/projects/tiobench), a simple benchmark > that we tried on ext2 filesystem. 4 concurrent threads doing > random/sequential read/write on 10MB files on a 4-way pIII 700MHz > machine with 1MB L2 cache - > > SPINLOCKS HOLD WAIT > UTIL CON MEAN( MAX ) MEAN( MAX )(% CPU) TOTAL NOWAIT SPIN RJECT NAME >
> 2.9% 26.7% 7.4us( 706us) 72us( 920us)( 1.9%) 1557496 73.3% 26.7% 0% io_request_lock
> 0.00% 34.9% 0.5us( 2.8us) 63us( 839us)(0.04%) 29478 65.1% 34.9% 0% __get_request_wait+0x98
> 2.6% 4.7% 17us( 706us) 69us( 740us)(0.13%) 617741 95.3% 4.7% 0% __make_request+0x110
> 0.07% 60.2% 0.5us( 4.0us) 72us( 920us)( 1.7%) 610820 39.8% 60.2% 0% blk_get_queue+0x10
> 0.09% 2.9% 6.6us( 55us) 102us( 746us)(0.01%) 55327 97.1% 2.9% 0% do_aic7xxx_isr+0x24
> 0.00% 3.7% 0.3us( 22us) 29us( 569us)(0.00%) 22602 96.3% 3.7% 0% generic_unplug_device+0x10
> 0.02% 4.9% 1.3us( 27us) 54us( 621us)(0.01%) 55382 95.1% 4.9% 0% scsi_dispatch_cmd+0x12c
> 0.02% 1.3% 1.2us( 8.0us) 23us( 554us)(0.00%) 55382 98.7% 1.3% 0% scsi_old_done+0x5b8
> 0.04% 3.2% 2.8us( 31us) 200us( 734us)(0.02%) 55382 96.8% 3.2% 0% scsi_queue_next_request+0x18
> 0.02% 1.4% 1.1us( 7.8us) 46us( 638us)(0.00%) 55382 98.6% 1.4% 0% scsi_request_fn+0x350
>
> 1557496*26.7%*72us makes it about 30 seconds of time waiting for > io_request_lock. That is nearly one-third of the total system time > (about 98 seconds). As number of CPUs increase, this will likely > worsen. Auch! That's pretty bad. > It also seems that __make_request() holds the lock for the largest > amount of time. This hold time isn't likely to change significantly __make_request -> elevator merge/insertion scan. This is what is taking all the time, not __make_request itself. With the bio-XX patches I have completely eliminated merge scans, so that can be done in O(1) time. For now insertion is still an O(N) scan; maybe that will change too [1]. > for a per-queue lock, but atleast it will not affect queueing i/o > requests to other devices. Besides, I am not sure if blk_get_queue() > really needs to grab the io_request_lock. blk_dev[] entries aren't Funny, this is one thing I've been looking at too. blk_get_queue _will_ die, don't worry. And yes, i.e. at open we can assign the queue. Or simply map it and protect it otherwise. It all ties in with being able to up or down a queue too; currently grabbing io_request_lock from blk_get_queue accomplishes exactly nothing and may as well be removed. If you do that, does it change the contention numbers significantly? [1] Appropriate data structures that both allow barriers and quick insertion sorts are needed. -- Jens Axboe ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: io_request_lock patch? 2001-07-11 7:19 ` Jens Axboe @ 2001-07-11 8:39 ` Dipankar Sarma 2001-07-11 8:47 ` Jens Axboe 0 siblings, 1 reply; 22+ messages in thread From: Dipankar Sarma @ 2001-07-11 8:39 UTC (permalink / raw) To: Jens Axboe; +Cc: linux-kernel Hi Jens, On Wed, Jul 11, 2001 at 09:19:00AM +0200, Jens Axboe wrote: > On Tue, Jul 10 2001, Dipankar Sarma wrote: > > In article <20010709214453.U16505@suse.de> you wrote: > > > On Mon, Jul 09 2001, Jonathan Lahr wrote: > > SPINLOCKS HOLD WAIT > > UTIL CON MEAN( MAX ) MEAN( MAX )(% CPU) TOTAL NOWAIT SPIN RJECT NAME > > > > 0.07% 60.2% 0.5us( 4.0us) 72us( 920us)( 1.7%) 610820 39.8% 60.2% 0% blk_get_queue+0x10 > > > > 1557496*26.7%*72us makes it about 30 seconds of time waiting for > > io_request_lock. That is nearly one-third of the total system time > > (about 98 seconds). As number of CPUs increase, this will likely > > worsen. > > Auch! That's pretty bad. > > > It also seems that __make_request() holds the lock for the largest > > amount of time. This hold time isn't likely to change significantly > > __make_request -> elevator merge/insertion scan. This is what is taking > all the time, not __make_request itself. With the bio-XX patches I have > completely eliminated merge scans, so that can be done in O(1) time. For > now insertion is still a O(N) scan, maybe that will change too [1]. I haven't got as far down as the elevator algorithm yet, but I would like to, at some point. In any case, my point was that because of the disk block sorting done during initial queueing, there is likely to be a slightly longer lock (per-queue or otherwise) hold time there compared to, say, dequeueing for dispatch to low-level drivers. Where can I get the bio patches from? > > > for a per-queue lock, but atleast it will not affect queueing i/o > > > requests to other devices. Besides, I am not sure if blk_get_queue() > > > really needs to grab the io_request_lock. blk_dev[] entries aren't > > Funny, this is one thing I've been looking at too. blk_get_queue _will_ > die, don't worry. And yes, ie at open we can assign the queue. Or simply > map it and protect it otherwise. It all ties in with being able to up or > down a queue too, currently grabbing io_request_lock from blk_get_queue > accomplishes exactly nothing and may as well be removed. If you do that, > does it change the contention numbers significantly? I haven't experimented with this yet, but theoretically speaking, yes, it should make a big difference. blk_get_queue() grabs the lock very often and holds it for a very short period of time on average, so it is the one that is affected most. Out of the 30 seconds of spin-wait for io_request_lock, blk_get_queue() seems to take up 610820*60.2%*72us = 26.5 seconds. I will get to this soon though. BTW, where can I get some of these lock-splitting patches from? I can do one myself for scsi+aic7xxx, but if some work already exists, I would like to start off with it. Thanks Dipankar -- Dipankar Sarma <dipankar@sequent.com> Project: http://lse.sourceforge.net Linux Technology Center, IBM Software Lab, Bangalore, India. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: io_request_lock patch? 2001-07-11 8:39 ` Dipankar Sarma @ 2001-07-11 8:47 ` Jens Axboe 0 siblings, 0 replies; 22+ messages in thread From: Jens Axboe @ 2001-07-11 8:47 UTC (permalink / raw) To: Dipankar Sarma; +Cc: linux-kernel On Wed, Jul 11 2001, Dipankar Sarma wrote: > > > It also seems that __make_request() holds the lock for the largest > > > amount of time. This hold time isn't likely to change significantly > > > > __make_request -> elevator merge/insertion scan. This is what is taking > > all the time, not __make_request itself. With the bio-XX patches I have > > completely eliminated merge scans, so that can be done in O(1) time. For > > now insertion is still a O(N) scan, maybe that will change too [1]. > > I haven't got as far down as the elevator algorithm yet, but I would > like to, at some point in time. In any case, my point was that because > of disk block sorting done during initial queueing, there is likely > to be a slightly longer lock (per-queue or otherwise) hold time there > compared to, say, dequeueing for dispatch to lowlevel drivers. That's exactly right. The merging/insertion is O(N) now, so queueing 100 buffers will take some time. Especially for bigger hardware, where we allocate a much bigger freelist (and potentially get very long queues). Dequeueing is O(1) always, and thus doesn't hold the lock for long. > Where can I get the bio patches from ? kernel.org/pub/linux/kernel/people/axboe/v2.5 wait for bio-14 final though. > > > for a per-queue lock, but atleast it will not affect queueing i/o > > > requests to other devices. Besides, I am not sure if blk_get_queue() > > > really needs to grab the io_request_lock. blk_dev[] entries aren't > > > > Funny, this is one thing I've been looking at too. blk_get_queue _will_ > > die, don't worry. And yes, ie at open we can assign the queue. Or simply > > map it and protect it otherwise. It all ties in with being able to up or > > down a queue too, currently grabbing io_request_lock from blk_get_queue > > accomplishes exactly nothing and may as well be removed. If you do that, > > does it change the contention numbers significantly? > > I haven't yet experimented with this yet, but theoritically speaking > yes, it should make a big difference. blk_get_queue() grabs the lock > very often and holds it for a very short period of time on average, > so it is the one that is affected most. Out of the 30 seconds of > spin-wait for io_request_lock, blk_get_queue() seems to take up > 610820*60.2%*72us = 26.5 seconds. I will get to this soon though. In that case, I'll make sure to rip it out immediately in the stock kernel too. You'll note that the bio-XX patches don't use it either, and haven't for some time. > BTW, where can I get some of these lock-splitting patches from ? I > can do one myself for scsi+aic7xxx, but if there already exist some > work, I would like to start off with them. I have some old patches somewhere, but no chance of them applying now. It's mostly a case of s/io_request_lock/q->queue_lock in some way for all cases, so it's probably just as easy if you do it yourself for testing. -- Jens Axboe ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2001-07-11 21:06 UTC | newest] Thread overview: 22+ messages -- 2001-07-09 19:39 io_request_lock patch? Jonathan Lahr 2001-07-09 19:44 ` Jens Axboe 2001-07-10 19:49 ` Jonathan Lahr 2001-07-10 20:09 ` Eric Youngdale 2001-07-11 8:05 ` Jens Axboe -- strict thread matches above, loose matches on Subject: below -- 2001-07-10 11:55 Dipankar Sarma 2001-07-10 23:05 ` Mike Anderson 2001-07-11 7:15 ` Jens Axboe 2001-07-11 8:53 ` Dipankar Sarma 2001-07-11 8:53 ` Jens Axboe 2001-07-11 14:02 ` Dipankar Sarma 2001-07-11 14:01 ` Jens Axboe 2001-07-11 14:55 ` Dipankar Sarma 2001-07-11 19:16 ` Jens Axboe 2001-07-11 16:02 ` Mike Anderson 2001-07-11 19:20 ` Jens Axboe 2001-07-11 20:13 ` Dipankar Sarma 2001-07-11 20:17 ` Jens Axboe 2001-07-11 21:05 ` Mike Anderson 2001-07-11 7:19 ` Jens Axboe 2001-07-11 8:39 ` Dipankar Sarma 2001-07-11 8:47 ` Jens Axboe