From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Mikulas Patocka <mpatocka@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Jeff Chua <jeff.chua.linux@gmail.com>,
Jens Axboe <axboe@kernel.dk>,
Lai Jiangshan <laijs@cn.fujitsu.com>, Jan Kara <jack@suse.cz>,
lkml <linux-kernel@vger.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Oleg Nesterov <oleg@redhat.com>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH 1/2] percpu-rwsem: use synchronize_sched_expedited
Date: Fri, 30 Nov 2012 05:42:13 -0800
Message-ID: <20121130134213.GI2474@linux.vnet.ibm.com>
In-Reply-To: <Pine.LNX.4.64.1211292149550.14890@file.rdu.redhat.com>
On Thu, Nov 29, 2012 at 10:00:53PM -0500, Mikulas Patocka wrote:
> On Thu, 29 Nov 2012, Andrew Morton wrote:
> > On Tue, 27 Nov 2012 22:59:52 -0500 (EST)
> > Mikulas Patocka <mpatocka@redhat.com> wrote:
> >
> > > percpu-rwsem: use synchronize_sched_expedited
> > >
> > > Use synchronize_sched_expedited() instead of synchronize_sched()
> > > to improve mount speed.
> > >
> > > This patch improves mount time from 0.500s to 0.013s.
> > >
> > > Note: if realtime people complain about the use of
> > > synchronize_sched_expedited() and synchronize_rcu_expedited(), I suggest
> > > that they introduce an option CONFIG_REALTIME or
> > > /proc/sys/kernel/realtime and turn off these *_expedited functions if
> > > the option is enabled (i.e. turn synchronize_sched_expedited into
> > > synchronize_sched and synchronize_rcu_expedited into synchronize_rcu).
> > >
> > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> >
> > So I read through this thread but I really didn't see a clear
> > description of why mount() got slower. The changelog for 4b05a1c74d1
> > is spectacularly awful :(
> >
> >
> > Apparently the slowdown occurred because a blockdev mount patch
> > 62ac665ff9fc07497ca524 ("blockdev: turn a rw semaphore into a percpu rw
> > semaphore") newly uses percpu rwsems, and percpu rwsems are slow on the
> > down_write() path.
> >
> > And using synchronize_sched_expedited() rather than synchronize_sched()
> > makes percpu_down_write() somewhat less slow. Correct?
>
> Yes.
>
> > Why is it OK to use synchronize_sched_expedited() here? If it's
> > faster, why can't we use synchronize_sched_expedited() everywhere and
> > zap synchronize_sched()?
>
> Because synchronize_sched_expedited sends interrupts to all processors and
> it is bad for realtime workloads.
>
> Peter Zijlstra once complained when I used synchronize_rcu_expedited in
> bdi_remove_from_list (but he left it there).
>
> I suggest that if it really hurts real time response for someone, let them
> introduce a switch to turn it into a non-expedited call.
Once Frederic's adaptive-ticks work reaches mainline, it will be possible
to avoid the IPIs to CPUs that are executing in user mode, in addition to
the current code's avoiding sending IPIs to CPUs that are idle. That said,
it will still be necessary to send IPIs to CPUs that are executing in
the kernel.

So things will get better, but won't be perfect. Sort of like this was
real life or something. ;-)

Thanx, Paul
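
As a rough illustration of the point about which CPUs still need an IPI,
here is a minimal userspace sketch. It is not kernel code and not how
synchronize_sched_expedited() is implemented: the cpu_state enum, the
ipi_targets() helper, and the skip_user flag are all hypothetical names
invented for this example. It only models the selection rule described
above: idle CPUs are skipped today, user-mode CPUs could additionally be
skipped with adaptive ticks, and CPUs running in the kernel always get
the IPI.

```c
/* Hypothetical per-CPU execution states, for illustration only. */
enum cpu_state { CPU_IDLE, CPU_USER, CPU_KERNEL };

/*
 * Return a bitmask of the CPUs that would still need an IPI.
 *
 * Idle CPUs are always skipped.  When skip_user is nonzero (modeling
 * the adaptive-ticks case), CPUs executing in user mode are skipped
 * too.  CPUs executing in the kernel are always included.
 *
 * Assumes ncpus does not exceed the number of bits in unsigned long.
 */
static unsigned long ipi_targets(const enum cpu_state *state, int ncpus,
				 int skip_user)
{
	unsigned long mask = 0;
	int cpu;

	for (cpu = 0; cpu < ncpus; cpu++) {
		if (state[cpu] == CPU_IDLE)
			continue;	/* already avoided by current code */
		if (skip_user && state[cpu] == CPU_USER)
			continue;	/* avoidable with adaptive ticks */
		mask |= 1UL << cpu;	/* in-kernel CPUs still need it */
	}
	return mask;
}
```

With four CPUs in the states kernel, idle, user, kernel, the helper
selects CPUs 0, 2 and 3 when skip_user is 0, and only CPUs 0 and 3 when
skip_user is 1. The in-kernel CPUs remain in the mask either way, which
is the residual cost for realtime workloads.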