From: Jens Axboe <axboe@kernel.dk>
To: Jeff Chua <jeff.chua.linux@gmail.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>,
Lai Jiangshan <laijs@cn.fujitsu.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Jan Kara <jack@suse.cz>, lkml <linux-kernel@vger.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: Recent kernel "mount" slow
Date: Tue, 27 Nov 2012 13:33:23 +0100
Message-ID: <50B4B313.3030707@kernel.dk>
In-Reply-To: <CAAJw_ZsYuLCwCTc6U=ELO_PvyjnVKQdkPke2enfzm6zOeWAqjA@mail.gmail.com>
On 2012-11-27 11:06, Jeff Chua wrote:
> On Tue, Nov 27, 2012 at 3:38 PM, Jens Axboe <axboe@kernel.dk> wrote:
>> On 2012-11-27 06:57, Jeff Chua wrote:
>>> On Sun, Nov 25, 2012 at 7:23 AM, Jeff Chua <jeff.chua.linux@gmail.com> wrote:
>>>> On Sun, Nov 25, 2012 at 5:09 AM, Mikulas Patocka <mpatocka@redhat.com> wrote:
>>>>> So it's better to slow down mount.
>>>>
>>>> I am quite proud of how Linux boot time compares against other OSes. Even
>>>> with 10 partitions, Linux can boot up in just a few seconds, but now
>>>> you're saying that we need to do this semaphore check at boot up. Doing
>>>> so adds an additional 4 seconds to boot up.
>>>
>>> By the way, I'm using a pretty fast SSD (Samsung PM830) and a fast CPU
>>> (2.8GHz). I wonder what kind of degradation this would cause for those
>>> on a slower hard disk or slower CPU, or whether it would be just the same.
>>
>> It'd likely be the same slow down time wise, but as a percentage it
>> would appear smaller on a slower disk.
>>
>> Could you please test Mikulas' suggestion of changing
>> synchronize_sched() in include/linux/percpu-rwsem.h to
>> synchronize_sched_expedited()?
>
> Tested. It seems as fast as before, but may be a "tick" slower. Just
> perception. I was getting pretty much 0.012s with everything reverted.
> With synchronize_sched_expedited(), it seems to be 0.012s ~ 0.013s.
> So, it's good.
Excellent
>> linux-next also has a re-write of the per-cpu rw sems, out of Andrew's
>> tree. It would be a good data point if you could test that, too.
>
> Tested. It's slower. 0.350s. But still faster than 0.500s without the patch.
Makes sense, it's 2 synchronize_sched() instead of 3. So it doesn't fix
the real issue, which is having to do synchronize_sched() in the first
place.
> # time mount /dev/sda1 /mnt; sync; sync; umount /mnt
>
>
> So, here's the comparison ...
>
> 0.500s 3.7.0-rc7
> 0.168s 3.7.0-rc2
> 0.012s 3.6.0
> 0.013s 3.7.0-rc7 + synchronize_sched_expedited()
> 0.350s 3.7.0-rc7 + Oleg's patch.
I wonder how many of those synchronize_sched() calls are due to setting
the block size to the value it already has. Does the below patch make a
difference?
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 1a1e5e3..f041c56 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -126,29 +126,28 @@ int set_blocksize(struct block_device *bdev, int size)
 	if (size < bdev_logical_block_size(bdev))
 		return -EINVAL;
-	/* Prevent starting I/O or mapping the device */
-	percpu_down_write(&bdev->bd_block_size_semaphore);
-
 	/* Check that the block device is not memory mapped */
 	mapping = bdev->bd_inode->i_mapping;
 	mutex_lock(&mapping->i_mmap_mutex);
 	if (mapping_mapped(mapping)) {
 		mutex_unlock(&mapping->i_mmap_mutex);
-		percpu_up_write(&bdev->bd_block_size_semaphore);
 		return -EBUSY;
 	}
 	mutex_unlock(&mapping->i_mmap_mutex);
 	/* Don't change the size if it is same as current */
 	if (bdev->bd_block_size != size) {
-		sync_blockdev(bdev);
-		bdev->bd_block_size = size;
-		bdev->bd_inode->i_blkbits = blksize_bits(size);
-		kill_bdev(bdev);
+		/* Prevent starting I/O */
+		percpu_down_write(&bdev->bd_block_size_semaphore);
+		if (bdev->bd_block_size != size) {
+			sync_blockdev(bdev);
+			bdev->bd_block_size = size;
+			bdev->bd_inode->i_blkbits = blksize_bits(size);
+			kill_bdev(bdev);
+		}
+		percpu_up_write(&bdev->bd_block_size_semaphore);
 	}
-	percpu_up_write(&bdev->bd_block_size_semaphore);
-
 	return 0;
 }
@@ -1649,14 +1648,12 @@ EXPORT_SYMBOL_GPL(blkdev_aio_write);
 static int blkdev_mmap(struct file *file, struct vm_area_struct *vma)
 {
+	struct address_space *mapping = file->f_mapping;
 	int ret;
-	struct block_device *bdev = I_BDEV(file->f_mapping->host);
-
-	percpu_down_read(&bdev->bd_block_size_semaphore);
+	mutex_lock(&mapping->i_mmap_mutex);
 	ret = generic_file_mmap(file, vma);
-
-	percpu_up_read(&bdev->bd_block_size_semaphore);
+	mutex_unlock(&mapping->i_mmap_mutex);
 	return ret;
 }
--
Jens Axboe