From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Ingo Molnar <mingo@kernel.org>,
lkml <linux-kernel@vger.kernel.org>, Tejun Heo <tj@kernel.org>,
Dmitry Shmidt <dimitrysh@google.com>,
Rom Lemarchand <romlem@google.com>,
Colin Cross <ccross@google.com>, Todd Kjos <tkjos@google.com>
Subject: Re: [PATCH v2] locking/percpu-rwsem: Optimize readers and reduce global impact
Date: Wed, 10 Aug 2016 19:09:46 +0200
Message-ID: <20160810170946.GA11051@redhat.com>
In-Reply-To: <20160810075606.GK6879@twins.programming.kicks-ass.net>
On 08/10, Peter Zijlstra wrote:
>
> On Tue, Aug 09, 2016 at 04:47:38PM -0700, John Stultz wrote:
> > On Tue, Aug 9, 2016 at 2:51 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > Currently the percpu-rwsem switches to (global) atomic ops while a
> > > writer is waiting; which could be quite a while and slows down
> > > releasing the readers.
> > >
> > > This patch cures this problem by ordering the reader-state vs
> > > reader-count (see the comments in __percpu_down_read() and
> > > percpu_down_write()). This changes a global atomic op into a full
> > > memory barrier, which doesn't have the global cacheline contention.
> > >
> > > This also enables using the percpu-rwsem with rcu_sync disabled in order
> > > to bias the implementation differently, reducing the writer latency by
> > > adding some cost to readers.
> >
> > So this by itself doesn't help us much, but including the following
> > from Oleg does help quite a bit:
>
> Correct, Oleg was going to send his rcu_sync rework on top of this. But
> since its holiday season things might be tad delayed.

Yes. The patch is ready and even seems to work, but I still can't (re)write
the comments. I will try to finish tomorrow.

But we need something simple and backportable, so I am going to send the
patch below first. As you can see, this is your "sabotage" change; only the
new helper was renamed, plus the s/!GP_IDLE/GP_PASSED/ fix. And the only
reason I can't send it today is that somehow I can't write the changelog ;)

So I would be really happy if you sent this change instead of me. I am going
to keep "From: peterz" anyway.

Oleg.

diff --git a/include/linux/rcu_sync.h b/include/linux/rcu_sync.h
index a63a33e..ece7ed9 100644
--- a/include/linux/rcu_sync.h
+++ b/include/linux/rcu_sync.h
@@ -59,6 +59,7 @@ static inline bool rcu_sync_is_idle(struct rcu_sync *rsp)
 }
 
 extern void rcu_sync_init(struct rcu_sync *, enum rcu_sync_type);
+extern void rcu_sync_enter_start(struct rcu_sync *);
 extern void rcu_sync_enter(struct rcu_sync *);
 extern void rcu_sync_exit(struct rcu_sync *);
 extern void rcu_sync_dtor(struct rcu_sync *);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 75c0ff0..614f9cd 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -5609,6 +5609,8 @@ int __init cgroup_init(void)
 	BUG_ON(cgroup_init_cftypes(NULL, cgroup_dfl_base_files));
 	BUG_ON(cgroup_init_cftypes(NULL, cgroup_legacy_base_files));
 
+	rcu_sync_enter_start(&cgroup_threadgroup_rwsem.rss);
+
 	get_user_ns(init_cgroup_ns.user_ns);
 
 	mutex_lock(&cgroup_mutex);
diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index be922c9..c9b7bc8 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -83,6 +83,18 @@ void rcu_sync_init(struct rcu_sync *rsp, enum rcu_sync_type type)
 }
 
 /**
+ * Must be called after rcu_sync_init() and before first use.
+ *
+ * Ensures rcu_sync_is_idle() returns false and rcu_sync_{enter,exit}() pairs
+ * turn into NO-OPs.
+ */
+void rcu_sync_enter_start(struct rcu_sync *rsp)
+{
+	rsp->gp_count++;
+	rsp->gp_state = GP_PASSED;
+}
+
+/**
  * rcu_sync_enter() - Force readers onto slowpath
  * @rsp: Pointer to rcu_sync structure to use for synchronization
  *
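
For readers following the thread, here is a rough single-threaded userspace sketch of why the new helper turns later rcu_sync_{enter,exit}() pairs into no-ops. It is not the kernel implementation. The struct rcu_sync_model type, the model_*() functions and the grace_periods counter are hypothetical names invented for illustration. The sketch assumes that a grace period is only waited for when the state is GP_IDLE, and that the state only returns to GP_IDLE once gp_count drops to zero. Locking, wait queues and the deferred RCU callback used on the exit path are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative state names mirroring those mentioned in the patch. */
enum model_gp_state { GP_IDLE, GP_PENDING, GP_PASSED };

/* Hypothetical, simplified stand-in for struct rcu_sync. */
struct rcu_sync_model {
	enum model_gp_state gp_state;
	int gp_count;
	int grace_periods;	/* number of simulated grace-period waits */
};

static void model_init(struct rcu_sync_model *rsp)
{
	rsp->gp_state = GP_IDLE;
	rsp->gp_count = 0;
	rsp->grace_periods = 0;
}

/* Readers may take the fast path only while the state is idle. */
static bool model_is_idle(const struct rcu_sync_model *rsp)
{
	return rsp->gp_state == GP_IDLE;
}

/* Same two assignments as rcu_sync_enter_start() in the patch. */
static void model_enter_start(struct rcu_sync_model *rsp)
{
	rsp->gp_count++;
	rsp->gp_state = GP_PASSED;
}

/* Assumed behaviour: only a transition out of GP_IDLE costs a grace period. */
static void model_enter(struct rcu_sync_model *rsp)
{
	bool need_sync = (rsp->gp_state == GP_IDLE);

	rsp->gp_count++;
	if (need_sync) {
		rsp->gp_state = GP_PENDING;
		rsp->grace_periods++;	/* stands in for the expensive wait */
		rsp->gp_state = GP_PASSED;
	}
}

/* Assumed behaviour: go back to idle only when the last user leaves. */
static void model_exit(struct rcu_sync_model *rsp)
{
	if (--rsp->gp_count == 0)
		rsp->gp_state = GP_IDLE;
}
```

In this model, calling model_enter_start() once after model_init() leaves gp_count permanently above zero, so each later model_enter()/model_exit() pair neither waits for a grace period nor lets the state fall back to GP_IDLE. The price is that model_is_idle() never returns true again, which matches the stated intent of forcing readers onto the slow path.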