From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: "Zhang, Jun" <jun.zhang@intel.com>
Cc: "He, Bo" <bo.he@intel.com>, Steven Rostedt <rostedt@goodmis.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"josh@joshtriplett.org" <josh@joshtriplett.org>,
"mathieu.desnoyers@efficios.com" <mathieu.desnoyers@efficios.com>,
"jiangshanlai@gmail.com" <jiangshanlai@gmail.com>,
"Xiao, Jin" <jin.xiao@intel.com>,
"Zhang, Yanmin" <yanmin.zhang@intel.com>,
"Bai, Jie A" <jie.a.bai@intel.com>,
"Sun, Yi J" <yi.j.sun@intel.com>,
"Chang, Junxiao" <junxiao.chang@intel.com>,
"Mei, Paul" <paul.mei@intel.com>
Subject: Re: rcu_preempt caused oom
Date: Mon, 17 Dec 2018 21:34:26 -0800 [thread overview]
Message-ID: <20181218053426.GP4170@linux.ibm.com> (raw)
In-Reply-To: <88DC34334CA3444C85D647DBFA962C2735AD64A0@SHSMSX104.ccr.corp.intel.com>
On Tue, Dec 18, 2018 at 02:46:43AM +0000, Zhang, Jun wrote:
> Hello, paul
>
> In softirq context, and current is rcu_preempt-10, rcu_gp_kthread_wake don't wakeup rcu_preempt.
> Maybe next patch could fix it. Please help review.
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 0b760c1..98f5b40 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1697,7 +1697,7 @@ static bool rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
> */
> static void rcu_gp_kthread_wake(struct rcu_state *rsp)
> {
> - if (current == rsp->gp_kthread ||
> + if (((current == rsp->gp_kthread) && !in_softirq()) ||
Close, but not quite. Please see below.
> !READ_ONCE(rsp->gp_flags) ||
> !rsp->gp_kthread)
> return;
>
> [44932.311439, 0][ rcu_preempt] rcu_preempt-10 [001] .n.. 44929.401037: rcu_grace_period: rcu_preempt 19063548 reqwait
> ......
> [44932.311517, 0][ rcu_preempt] rcu_preempt-10 [001] d.s2 44929.402234: rcu_future_grace_period: rcu_preempt 19063548 19063552 0 0 3 Startleaf
> [44932.311536, 0][ rcu_preempt] rcu_preempt-10 [001] d.s2 44929.402237: rcu_future_grace_period: rcu_preempt 19063548 19063552 0 0 3 Startedroot
Good catch! If the rcu_preempt kthread had just entered the function
swait_event_idle_exclusive(), which had just called __swait_event_idle()
which had just called ___swait_event(), which had just gotten done
checking the "condition", then yes, the rcu_preempt kthread could
sleep forever. This is a very narrow race window, but that matches
your experience with its not happening often -- and my experience with
it not happening at all.
However, for this to happen, the wakeup must happen within a softirq
handler that executes upon return from an interrupt that interrupted
___swait_event() just after the "if (condition)". For this, we don't want
in_softirq() but rather in_serving_softirq(), as shown in the patch below.
The patch you have above could result in spurious wakeups, as it is
checking for bottom halves being disabled, not just executing within a
softirq handler. Which might be better than not having enough wakeups,
but let's please try for just the right number. ;-)
So could you please instead test the patch below?
And if it works, could I please have your Signed-off-by so that I can
queue it? My patch is quite clearly derived from yours, after all!
And you should get credit for finding the problem and arriving at an
approximate fix, after all.
Thanx, Paul
------------------------------------------------------------------------
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index e9392a9d6291..b9205b40b621 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1722,7 +1722,7 @@ static bool rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
*/
static void rcu_gp_kthread_wake(struct rcu_state *rsp)
{
- if (current == rsp->gp_kthread ||
+ if ((current == rsp->gp_kthread && !in_serving_softirq()) ||
!READ_ONCE(rsp->gp_flags) ||
!rsp->gp_kthread)
return;
next prev parent reply other threads:[~2018-12-18 5:34 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-29 8:49 rcu_preempt caused oom He, Bo
2018-11-29 13:06 ` Paul E. McKenney
2018-11-29 14:27 ` Paul E. McKenney
2018-11-30 8:03 ` He, Bo
2018-11-30 14:43 ` Paul E. McKenney
2018-11-30 15:16 ` Steven Rostedt
2018-11-30 15:18 ` He, Bo
2018-11-30 16:49 ` Paul E. McKenney
2018-12-03 7:44 ` He, Bo
2018-12-03 13:56 ` Paul E. McKenney
2018-12-04 7:50 ` He, Bo
2018-12-04 19:49 ` Paul E. McKenney
2018-12-05 8:42 ` He, Bo
2018-12-05 17:44 ` Paul E. McKenney
[not found] ` <CD6925E8781EFD4D8E11882D20FC406D52A16C46@SHSMSX104.ccr.corp.intel.com>
2018-12-06 17:38 ` Paul E. McKenney
[not found] ` <CD6925E8781EFD4D8E11882D20FC406D52A180C5@SHSMSX104.ccr.corp.intel.com>
2018-12-07 14:11 ` Paul E. McKenney
2018-12-09 19:56 ` Paul E. McKenney
2018-12-10 6:56 ` He, Bo
2018-12-11 0:38 ` Paul E. McKenney
2018-12-11 4:46 ` Paul E. McKenney
2018-12-11 5:29 ` He, Bo
2018-12-12 1:37 ` He, Bo
2018-12-12 2:24 ` Paul E. McKenney
[not found] ` <CD6925E8781EFD4D8E11882D20FC406D52A192C3@SHSMSX104.ccr.corp.intel.com>
2018-12-12 15:42 ` Paul E. McKenney
2018-12-12 21:03 ` Paul E. McKenney
2018-12-12 23:13 ` He, Bo
2018-12-13 0:12 ` Paul E. McKenney
2018-12-13 2:11 ` Zhang, Jun
2018-12-13 2:42 ` Paul E. McKenney
[not found] ` <88DC34334CA3444C85D647DBFA962C2735AD5F9E@SHSMSX104.ccr.corp.intel.com>
2018-12-13 4:40 ` Paul E. McKenney
[not found] ` <CD6925E8781EFD4D8E11882D20FC406D52A197EC@SHSMSX104.ccr.corp.intel.com>
2018-12-13 18:11 ` Paul E. McKenney
2018-12-14 1:30 ` He, Bo
2018-12-14 2:15 ` Paul E. McKenney
2018-12-14 2:40 ` He, Bo
2018-12-14 5:10 ` Paul E. McKenney
2018-12-14 5:38 ` Paul E. McKenney
2018-12-17 3:15 ` He, Bo
2018-12-17 4:26 ` Paul E. McKenney
[not found] ` <CD6925E8781EFD4D8E11882D20FC406D52A1A634@SHSMSX104.ccr.corp.intel.com>
2018-12-18 2:46 ` Zhang, Jun
2018-12-18 3:12 ` He, Bo
2018-12-18 5:34 ` Paul E. McKenney [this message]
2019-02-13 8:31 ` [tip:core/rcu] rcu: Prevent needless ->gp_seq_needed update in __note_gp_changes() tip-bot for Zhang, Jun
2019-02-13 8:30 ` [tip:core/rcu] rcu: Do RCU GP kthread self-wakeup from softirq and interrupt tip-bot for Zhang, Jun
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181218053426.GP4170@linux.ibm.com \
--to=paulmck@linux.ibm.com \
--cc=bo.he@intel.com \
--cc=jiangshanlai@gmail.com \
--cc=jie.a.bai@intel.com \
--cc=jin.xiao@intel.com \
--cc=josh@joshtriplett.org \
--cc=jun.zhang@intel.com \
--cc=junxiao.chang@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=paul.mei@intel.com \
--cc=rostedt@goodmis.org \
--cc=yanmin.zhang@intel.com \
--cc=yi.j.sun@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.