From: Oleg Nesterov <oleg@tv-sign.ru>
To: linux-kernel@vger.kernel.org
Cc: Manfred Spraul <manfred@colorfullife.com>,
Dipankar Sarma <dipankar@in.ibm.com>,
Andrew Morton <akpm@osdl.org>
Subject: [PATCH] rcu: speed up quiescent state detection
Date: Thu, 30 Dec 2004 19:16:05 +0300 [thread overview]
Message-ID: <41D429C5.F32D7161@tv-sign.ru> (raw)
On top of Manfred's 'rcu: simplify quiescent state detection', see
http://marc.theaimsgroup.com/?l=linux-kernel&m=110433412126392
Let's suppose that cpu is running idle thread or user level process,
and the grace period was started.
Afaics, currently we need 2 local timer interrupts to happen before
this cpu can end its grace period.
1st timer_interrupt:
__rcu_pending():
if (rdp->quiescbatch != rcp->cur)
return 1;
rcu_check_callbacks():
if (user || idle)
->passed_quiesc = 1;
tasklet_schedule(rcu_tasklet)
do_softirq():
__rcu_process_callbacks():
rcu_check_quiescent_state()
if (rdp->quiescbatch != rcp->cur) {
->qs_pending = 1;
->passed_quiesc = 0;
return;
}
Only the next interrupt will notice ->qs_pending in __rcu_pending() and
increment ->completed in rcu_check_quiescent_state().
I think it is better to set ->qs_pending = 1 directly in __rcu_pending():
int __rcu_pending()
{
if (qs_pending) return 1;
if (quiescbatch != rcp->cur) {
quiescbatch = rcp->cur;
passed_quiesc = 0;
barrier();
qs_pending = 1;
return 1;
}
... other checks ...
}
void rcu_check_quiescent_state()
{
if (!qs_pending) return;
barrier();
if (!passed_quiesc) return;
cpu_quiet();
qs_pending = 0;
}
This way grace period for that cpu will be completed after 1st interupt.
Note that when fork()->scheduler_tick()->rcu_pending() runs in process
context, it does this with interrupts disabled, so there is no race with
timer interrupt (Although i think that it is better to call rcu_pending()
directly from update_process_times() instead).
What do you think?
Oleg.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
--- 2.6.10/include/linux/rcupdate.h~ 2004-12-30 18:31:43.486279168 +0300
+++ 2.6.10/include/linux/rcupdate.h 2004-12-30 20:21:49.589998752 +0300
@@ -124,30 +124,7 @@ static inline void rcu_bh_qsctr_inc(int
rdp->passed_quiesc = 1;
}
-static inline int __rcu_pending(struct rcu_ctrlblk *rcp,
- struct rcu_data *rdp)
-{
- /* This cpu has pending rcu entries and the grace period
- * for them has completed.
- */
- if (rdp->curlist && !rcu_batch_before(rcp->completed, rdp->batch))
- return 1;
-
- /* This cpu has no pending entries, but there are new entries */
- if (!rdp->curlist && rdp->nxtlist)
- return 1;
-
- /* This cpu has finished callbacks to invoke */
- if (rdp->donelist)
- return 1;
-
- /* The rcu core waits for a quiescent state from the cpu */
- if (rdp->quiescbatch != rcp->cur || rdp->qs_pending)
- return 1;
-
- /* nothing to do */
- return 0;
-}
+extern int __rcu_pending(struct rcu_ctrlblk *rcp, struct rcu_data *rdp);
static inline int rcu_pending(int cpu)
{
--- 2.6.10/kernel/rcupdate.c~ 2004-12-30 19:45:36.290390512 +0300
+++ 2.6.10/kernel/rcupdate.c 2004-12-30 20:20:59.955544336 +0300
@@ -207,6 +207,39 @@ static void cpu_quiet(int cpu, struct rc
}
}
+int __rcu_pending(struct rcu_ctrlblk *rcp, struct rcu_data *rdp)
+{
+ /* The rcu core waits for a quiescent state from the cpu */
+ if (rdp->qs_pending)
+ return 1;
+
+ if (rdp->quiescbatch != rcp->cur) {
+ /* start new grace period: */
+ rdp->quiescbatch = rcp->cur;
+ rdp->passed_quiesc = 0;
+ barrier();
+ rdp->qs_pending = 1;
+ return 1;
+ }
+
+ /* This cpu has pending rcu entries and the grace period
+ * for them has completed.
+ */
+ if (rdp->curlist && !rcu_batch_before(rcp->completed, rdp->batch))
+ return 1;
+
+ /* This cpu has no pending entries, but there are new entries */
+ if (!rdp->curlist && rdp->nxtlist)
+ return 1;
+
+ /* This cpu has finished callbacks to invoke */
+ if (rdp->donelist)
+ return 1;
+
+ /* nothing to do */
+ return 0;
+}
+
/*
* Check if the cpu has gone through a quiescent state (say context
* switch). If so and if it already hasn't done so in this RCU
@@ -215,14 +248,6 @@ static void cpu_quiet(int cpu, struct rc
static void rcu_check_quiescent_state(struct rcu_ctrlblk *rcp,
struct rcu_state *rsp, struct rcu_data *rdp)
{
- if (rdp->quiescbatch != rcp->cur) {
- /* start new grace period: */
- rdp->qs_pending = 1;
- rdp->passed_quiesc = 0;
- rdp->quiescbatch = rcp->cur;
- return;
- }
-
/* Grace period already completed for this cpu?
* qs_pending is checked instead of the actual bitmap to avoid
* cacheline trashing.
@@ -234,9 +259,9 @@ static void rcu_check_quiescent_state(st
* Was there a quiescent state since the beginning of the grace
* period? If no, then exit and wait for the next call.
*/
+ barrier();
if (!rdp->passed_quiesc)
return;
- rdp->qs_pending = 0;
spin_lock(&rsp->lock);
/*
@@ -247,6 +272,8 @@ static void rcu_check_quiescent_state(st
cpu_quiet(rdp->cpu, rcp, rsp);
spin_unlock(&rsp->lock);
+
+ rdp->qs_pending = 0;
}
next reply other threads:[~2004-12-30 15:14 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-12-30 16:16 Oleg Nesterov [this message]
2004-12-30 15:36 ` [PATCH] rcu: speed up quiescent state detection Manfred Spraul
2004-12-30 17:22 ` Oleg Nesterov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=41D429C5.F32D7161@tv-sign.ru \
--to=oleg@tv-sign.ru \
--cc=akpm@osdl.org \
--cc=dipankar@in.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=manfred@colorfullife.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.