From: Oleg Nesterov <oleg@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
Paul McKenney <paulmck@linux.vnet.ibm.com>,
Peter Zijlstra <peterz@infradead.org>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>,
Davidlohr Bueso <dave@stgolabs.net>,
Ingo Molnar <mingo@redhat.com>, Tejun Heo <tj@kernel.org>,
linux-kernel@vger.kernel.org
Subject: [PATCH 0/7] Add rcu_sync infrastructure to avoid _expedited() in percpu-rwsem
Date: Sun, 12 Jul 2015 01:35:35 +0200 [thread overview]
Message-ID: <20150711233535.GA829@redhat.com> (raw)
Hello,
Let me make another attempt to push rcu_sync and add a _simple_
improvment into percpu-rwsem. It already has another user (cgroups)
and I think it can have more. Peter has some use-cases. sb->s_writers
(which afaics is buggy btw) can be turned into percpu-rwsem too I think.
Linus, I am mostly trying to convince you. Nobody else objected so far.
Could you please comment?
Peter, if you agree with 5-7, can I add your Signed-off-by's ?
To me, the most annoying problem with percpu_rw_semaphore is
synchronize_sched_expedited() which is called twice by every
down_write/up_write. I think it would be really nice to avoid it.
Let's start with the simple test-case,
#!/bin/bash
perf probe -x /lib/libc.so.6 syscall
for i in {1..1000}; do
echo 1 >| /sys/kernel/debug/tracing/events/probe_libc/syscall/enable
echo 0 >| /sys/kernel/debug/tracing/events/probe_libc/syscall/enable
done
It needs ~ 13.5 seconds (2 CPUs, KVM). If we simply replace
synchronize_sched_expedited() with synchronize_sched() it takes
~ 67.5 seconds. This is not good.
With these patches it takes around 13.3 seconds again (a little
bit faster), and it doesn't use _expedited. synchronize_sched()
is called 1-2 (max 3) times in average. And now it does not
disturb the whole system.
And just in case, I also measured
for (i = 0; i < 1000000; ++i) {
percpu_down_write(&dup_mmap_sem);
percpu_up_write(&dup_mmap_sem);
}
and it runs more than 1.5 times faster (to remind, only 2 CPUs),
but this is not that interesting, I agree.
And note that the actual change in percpu-rwsem is really simple,
and imo it even makes the code simpler. (the last patch is off-
topic cleanup).
So the only complication is rcu_sync itself. But, rightly or not (I
am obviously biased), I believe this new rcu infrastructure is natural
and useful, and I think it can have more users too.
And. We can do more improvements in rcu_sync and percpu-rwsem, and
I don't only mean other optimizations from Peter. In particular, we
can extract the "wait for gp pass" from rcu_sync_enter() into another
helper, we can teach percpu_down_write() to allow multiple writers,
and more.
Oleg.
include/linux/percpu-rwsem.h | 3 +-
include/linux/rcusync.h | 57 +++++++++++++++
kernel/locking/percpu-rwsem.c | 78 ++++++---------------
kernel/rcu/Makefile | 2 +-
kernel/rcu/sync.c | 152 +++++++++++++++++++++++++++++++++++++++++
5 files changed, 235 insertions(+), 57 deletions(-)
next reply other threads:[~2015-07-11 23:37 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-11 23:35 Oleg Nesterov [this message]
2015-07-11 23:35 ` [PATCH 1/7] rcu: Create rcu_sync infrastructure Oleg Nesterov
2015-07-15 18:05 ` Paul E. McKenney
2015-07-15 18:15 ` Peter Zijlstra
2015-07-15 18:28 ` Paul E. McKenney
2015-07-15 19:08 ` Oleg Nesterov
2015-07-15 19:15 ` Paul E. McKenney
2015-07-15 18:41 ` Oleg Nesterov
2015-07-11 23:35 ` [PATCH 2/7] rcusync: Introduce struct rcu_sync_ops Oleg Nesterov
2015-07-11 23:35 ` [PATCH 3/7] rcusync: Add the CONFIG_PROVE_RCU checks Oleg Nesterov
2015-07-11 23:35 ` [PATCH 4/7] rcusync: Introduce rcu_sync_dtor() Oleg Nesterov
2015-07-11 23:36 ` [PATCH 5/7] percpu-rwsem: change it to rely on rss_sync infrastructure Oleg Nesterov
2015-07-15 18:15 ` Paul E. McKenney
2015-07-15 18:59 ` Oleg Nesterov
2015-07-11 23:36 ` [PATCH 6/7] percpu-rwsem: fix the comments outdated by rcu_sync Oleg Nesterov
2015-07-11 23:36 ` [PATCH 7/7] percpu-rwsem: cleanup the lockdep annotations in percpu_down_read() Oleg Nesterov
2015-07-11 23:47 ` [PATCH 0/7] Add rcu_sync infrastructure to avoid _expedited() in percpu-rwsem Linus Torvalds
2015-07-15 18:27 ` Paul E. McKenney
2015-07-15 19:36 ` Oleg Nesterov
2015-07-15 21:59 ` Paul E. McKenney
2015-07-17 23:29 ` Oleg Nesterov
2015-07-17 23:47 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150711233535.GA829@redhat.com \
--to=oleg@redhat.com \
--cc=daniel.wagner@bmw-carit.de \
--cc=dave@stgolabs.net \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=tj@kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.