From: Oleg Nesterov <oleg@redhat.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Daniel Wagner <daniel.wagner@bmw-carit.de>,
Davidlohr Bueso <dave@stgolabs.net>,
Ingo Molnar <mingo@redhat.com>, Tejun Heo <tj@kernel.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/7] Add rcu_sync infrastructure to avoid _expedited() in percpu-rwsem
Date: Wed, 15 Jul 2015 21:36:01 +0200 [thread overview]
Message-ID: <20150715193601.GA4386@redhat.com> (raw)
In-Reply-To: <20150715182713.GL3717@linux.vnet.ibm.com>
On 07/15, Paul E. McKenney wrote:
>
> On Sun, Jul 12, 2015 at 01:35:35AM +0200, Oleg Nesterov wrote:
>
> > Let's start with the simple test-case,
> >
> > #!/bin/bash
> >
> > perf probe -x /lib/libc.so.6 syscall
> >
> > for i in {1..1000}; do
> > echo 1 >| /sys/kernel/debug/tracing/events/probe_libc/syscall/enable
> > echo 0 >| /sys/kernel/debug/tracing/events/probe_libc/syscall/enable
> > done
> >
> > It needs ~ 13.5 seconds (2 CPUs, KVM). If we simply replace
> > synchronize_sched_expedited() with synchronize_sched() it takes
> > ~ 67.5 seconds. This is not good.
>
> Yep, even if you avoided the write-release grace period, you would
> still be looking at something like 40 seconds, which is 3x. Some
> might consider that to be a performance regression. ;-)
Yes ;)
> > And just in case, I also measured
> >
> > for (i = 0; i < 1000000; ++i) {
> > percpu_down_write(&dup_mmap_sem);
> > percpu_up_write(&dup_mmap_sem);
> > }
> >
> > and it runs more than 1.5 times faster (to remind, only 2 CPUs),
> > but this is not that interesting, I agree.
>
> Your trick avoiding the grace periods during a writer-to-writer handoff
> are cute, and they are helping a lot here.
Yes. And even the fact that a single writer doesn't need to sleep in
percpu_up_write()->synchronize_sched() looks good imo.
Yes, yes, we can remove it if we penalize the readers, but I'd like to
avoid this.
> Concurrent readers would
> have a tough time of it with this workload, though. They would all
> be serialized.
Sure. in this case it is not better than the normal rw_semaphore. Worse
actually.
> > And note that the actual change in percpu-rwsem is really simple,
> > and imo it even makes the code simpler. (the last patch is off-
> > topic cleanup).
> >
> > So the only complication is rcu_sync itself. But, rightly or not (I
> > am obviously biased), I believe this new rcu infrastructure is natural
> > and useful, and I think it can have more users too.
>
> I don't have an objection to it, even in its current form (I did
> review it long ago), but it does need to have a user!
Do you mean you need another user except percpu_rw_semaphore? I do
not see any right now...
Let me remind about sb_writers again. It actually has 3 rw_sem's
and I am trying to turn then into percpu_rw_semaphore's.
In this case freeze_super() will need 6 synchronize_sched_expedited().
This just looks ugly. But if we have rcu_sync primitives, all 3 sem's
in struct super_block can share the same "struct rcu_sync", and
freeze_super() will need only once synchronize_sched().
> > And. We can do more improvements in rcu_sync and percpu-rwsem, and
> > I don't only mean other optimizations from Peter. In particular, we
> > can extract the "wait for gp pass" from rcu_sync_enter() into another
> > helper, we can teach percpu_down_write() to allow multiple writers,
> > and more.
>
> As in a percpu_down_write() that allows up to (say) five concurrent
> write-holders?
Yes. Because this is what uprobes (and probably cgroups) actually needs.
It does not need the global lock. Just it needs to exclude the "readers"
(dup_mmap) globally.
And in fact the very first version I sent worked this way. Then I removed
this because a) this was a bit "unusual" for reviewers ;) and b) because
I raced with another commit which has already added the initial (and
sub-optimal) version of percpu_rw_semaphore.
Oleg.
next prev parent reply other threads:[~2015-07-15 19:37 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-11 23:35 [PATCH 0/7] Add rcu_sync infrastructure to avoid _expedited() in percpu-rwsem Oleg Nesterov
2015-07-11 23:35 ` [PATCH 1/7] rcu: Create rcu_sync infrastructure Oleg Nesterov
2015-07-15 18:05 ` Paul E. McKenney
2015-07-15 18:15 ` Peter Zijlstra
2015-07-15 18:28 ` Paul E. McKenney
2015-07-15 19:08 ` Oleg Nesterov
2015-07-15 19:15 ` Paul E. McKenney
2015-07-15 18:41 ` Oleg Nesterov
2015-07-11 23:35 ` [PATCH 2/7] rcusync: Introduce struct rcu_sync_ops Oleg Nesterov
2015-07-11 23:35 ` [PATCH 3/7] rcusync: Add the CONFIG_PROVE_RCU checks Oleg Nesterov
2015-07-11 23:35 ` [PATCH 4/7] rcusync: Introduce rcu_sync_dtor() Oleg Nesterov
2015-07-11 23:36 ` [PATCH 5/7] percpu-rwsem: change it to rely on rss_sync infrastructure Oleg Nesterov
2015-07-15 18:15 ` Paul E. McKenney
2015-07-15 18:59 ` Oleg Nesterov
2015-07-11 23:36 ` [PATCH 6/7] percpu-rwsem: fix the comments outdated by rcu_sync Oleg Nesterov
2015-07-11 23:36 ` [PATCH 7/7] percpu-rwsem: cleanup the lockdep annotations in percpu_down_read() Oleg Nesterov
2015-07-11 23:47 ` [PATCH 0/7] Add rcu_sync infrastructure to avoid _expedited() in percpu-rwsem Linus Torvalds
2015-07-15 18:27 ` Paul E. McKenney
2015-07-15 19:36 ` Oleg Nesterov [this message]
2015-07-15 21:59 ` Paul E. McKenney
2015-07-17 23:29 ` Oleg Nesterov
2015-07-17 23:47 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150715193601.GA4386@redhat.com \
--to=oleg@redhat.com \
--cc=daniel.wagner@bmw-carit.de \
--cc=dave@stgolabs.net \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=tj@kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.