From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1760251AbYD3HO3 (ORCPT ); Wed, 30 Apr 2008 03:14:29 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1754967AbYD3HOU (ORCPT ); Wed, 30 Apr 2008 03:14:20 -0400 Received: from brick.kernel.dk ([87.55.233.238]:25674 "EHLO kernel.dk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754935AbYD3HOT (ORCPT ); Wed, 30 Apr 2008 03:14:19 -0400 Date: Wed, 30 Apr 2008 09:14:15 +0200 From: Jens Axboe To: Mike Anderson Cc: Mikulas Patocka , linux-kernel@vger.kernel.org, Alasdair Graeme Kergon Subject: Re: [PATCH] Optimize lock in queue unplugging Message-ID: <20080430071415.GM12774@kernel.dk> References: <20080429192556.GB12774@kernel.dk> <20080429202928.GB10641@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080429202928.GB10641@linux.vnet.ibm.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Apr 29 2008, Mike Anderson wrote: > Jens Axboe wrote: > > On Tue, Apr 29 2008, Mikulas Patocka wrote: > > > Hi > > > > > > Mike Anderson was doing an OLTP benchmark on a computer with 48 physical > > > disks mapped to one logical device via device mapper. > > > > > > He found that there was a slowdown on request_queue->lock in function > > > generic_unplug_device. The slowdown is caused by the fact that when some > > > code calls unplug on the device mapper, device mapper calls unplug on all > > > physical disks. These unplug calls take the lock, find that the queue is > > > already unplugged, release the lock and exit. > > > > > > With the below patch, performance of the benchmark was increased by 18% > > > (the whole OLTP application, not just block layer microbenchmarks). > > > > > > So I'm submitting this patch for upstream. I think the patch is correct, > > > because when more threads call simultaneously plug and unplug, it is > > > unspecified, if the queue is or isn't plugged (so the patch can't make > > > this worse). And the caller that plugged the queue should unplug it > > > anyway. (if it doesn't, there's 3ms timeout). > > > > Where were these unplug calls coming from? The block layer will > > generally only unplug when it is already unplugged, so if you are seeing > > so many unplug calls that the patch redues overhead by as much > > described, perhaps the callsite is buggy? > > I do not have direct access the the benchmark setup, but here is the data > I have received. > > The oprofile data was showing ll_rw_blk::generic_unplug_device() as a top > routine at 13% of the samples. Annotation of the samples shows hits on > spin_lock_irq(q->queue_lock). > > Here are some sample call traces: > > Call trace #1 > > kernel: [] generic_unplug_device+0x5d/0xc6 > kernel: [] :dm_mod:dm_table_unplug_all+0x33/0x41 > kernel: [] :dm_mod:dm_unplug_all+0x1d/0x28 > kernel: [] blk_backing_dev_unplug+0x56/0x5b > kernel: [] sync_buffer+0x36/0x3f > kernel: [] __wait_on_bit+0x40/0x6f > kernel: [] sync_buffer+0x0/0x3f > kernel: [] out_of_line_wait_on_bit+0x6c/0x78 > kernel: [] wake_bit_function+0x0/0x23 > kernel: [] :jbd:journal_commit_transaction+0x91f/0x1086 > kernel: [] lock_timer_base+0x1b/0x3c > kernel: [] :jbd:kjournald+0xc1/0x213 > kernel: [] autoremove_wake_function+0x0/0x2e > kernel: [] keventd_create_kthread+0x0/0x61 > kernel: [] :jbd:kjournald+0x0/0x213 > kernel: [] keventd_create_kthread+0x0/0x61 > kernel: [] kthread+0xfe/0x132 > kernel: [] child_rip+0xa/0x11 > kernel: [] keventd_create_kthread+0x0/0x61 > kernel: [] kthread+0x0/0x132 > kernel: [] child_rip+0x0/0x11 > > Call trace #2 > kernel: [] generic_unplug_device+0x5d/0xc6 > kernel: [] :dm_mod:dm_table_unplug_all+0x33/0x41 > kernel: [] :dm_mod:dm_unplug_all+0x1d/0x28 > kernel: [] blk_backing_dev_unplug+0x56/0x5b > kernel: [] __blockdev_direct_IO+0x889/0xaa2 > kernel: [] :ext3:ext3_direct_IO+0xf3/0x18b > kernel: [] :ext3:ext3_get_block+0x0/0xe3 > kernel: [] generic_file_direct_IO+0xbd/0xfb > kernel: [] generic_file_direct_write+0x60/0xf2 > kernel: [] __generic_file_aio_write_nolock+0x2b7/0x3b8 > kernel: [] generic_file_aio_write+0x65/0xc1 > kernel: [] :ext3:ext3_file_write+0x16/0x91 > kernel: [] do_sync_write+0xc7/0x104 > kernel: [] autoremove_wake_function+0x0/0x2e > kernel: [] free_msg+0x22/0x3c > kernel: [] vfs_write+0xce/0x174 > kernel: [] sys_pwrite64+0x50/0x70 > kernel: [] error_exit+0x0/0x84 > kernel: [] system_call+0x7e/0x83 So it's basically dm calling into blk_unplug() all the time, which doesn't check if the queue is plugged. The reason why I didn't like the initial patch is that ->unplug_fn() really should not be called unless the queue IS plugged. So how about this instead: http://git.kernel.dk/?p=linux-2.6-block.git;a=commit;h=c44993018887e82abd49023e92e8d8b6000e03ed That's a lot more appropriate, imho. -- Jens Axboe