From: "Paul E. McKenney" <paulmck@kernel.org>
To: Joel Fernandes <joel@joelfernandes.org>
Cc: linux-kernel@vger.kernel.org, Davidlohr Bueso <dave@stgolabs.net>,
	Jonathan Corbet <corbet@lwn.net>,
	Josh Triplett <josh@joshtriplett.org>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	linux-doc@vger.kernel.org,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Mauro Carvalho Chehab <mchehab+samsung@kernel.org>,
	neeraju@codeaurora.org, peterz@infradead.org,
	Randy Dunlap <rdunlap@infradead.org>,
	rcu@vger.kernel.org, Steven Rostedt <rostedt@goodmis.org>,
	tglx@linutronix.de, vineethrp@gmail.com
Subject: Re: [PATCH v4 4/5] rcutorture: Force synchronizing of RCU flavor from hotplug notifier
Date: Mon, 10 Aug 2020 10:54:34 -0700
Message-ID: <20200810175434.GL4295@paulmck-ThinkPad-P72>
In-Reply-To: <20200810173109.GA2253395@google.com>

On Mon, Aug 10, 2020 at 01:31:09PM -0400, Joel Fernandes wrote:
> Hi Paul,
>
> On Mon, Aug 10, 2020 at 09:19:45AM -0700, Paul E. McKenney wrote:
> > On Fri, Aug 07, 2020 at 01:07:21PM -0400, Joel Fernandes (Google) wrote:
> > > RCU has had deadlocks in the past related to synchronizing in a hotplug
> > > notifier. Typically, this has occurred because timer callbacks did not get
> > > migrated before the CPU hotplug notifier requesting RCU's services is
> > > called. If RCU's grace period processing has a timer callback queued in
> > > the meanwhile, it may never get called causing RCU stalls.
> > >
> > > These issues have been fixed by removing such dependencies from grace
> > > period processing, however there are no testing scenarios for such
> > > cases.
> > >
> > > This commit therefore reuses rcutorture's existing hotplug notifier to
> > > invoke the flavor-specific synchronize callback. If anything locks up,
> > > we expect stall warnings and/or other splats.
> > >
> > > Obviously, we need not test for rcu_barrier from a notifier, since those
> > > are not allowed from notifiers. This fact is already detailed in the
> > > documentation as well.
> > >
> > > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> >
> > Given that rcutorture_booster_init() is invoked on the CPU in question
> > only after it is up and running, and that (if I remember correctly)
> > rcutorture_booster_cleanup() is invoked on the outgoing CPU before it
> > has really started going away, would this code really have caught that
> > timer/CPU-hotplug/RCU bug?
>
> You are right, it would not have caught that particular one because the timer
> callbacks would have been migrated by the time the rcutorture_booster_init()
> is called.
>
> I still thought it is a good idea anyway to test if the dynamic hotplug
> notifiers don't have these issues.
>
> Did you have a better idea on how to test the timer/hotplug/rcu bug?

My suggestion would be to place an rcutorture hook in all of the RCU
notifiers that support blocking and that have some possibility of making
this deadlock happen. There are some similar hooks in other parts of RCU.
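
For illustration only, here is a minimal userspace sketch of the hook
pattern described above. Every name in it (rcutorture_hp_hook,
rcutorture_set_hp_hook(), torture_sync_hook(), rcu_blocking_notifier())
is hypothetical and does not correspond to an existing kernel symbol.
fake_synchronize() only counts calls where real code would invoke
cur_ops->sync(), and a real implementation would presumably need
READ_ONCE()/WRITE_ONCE() or similar protection on the hook pointer.

```c
#include <stddef.h>

/* Hypothetical hook type: called from a blocking-capable RCU notifier. */
typedef void (*rcutorture_hp_hook_t)(unsigned int cpu);

/* NULL unless the torture test has installed a hook (assumed name). */
static rcutorture_hp_hook_t rcutorture_hp_hook;

/* Counts stand-in grace-period waits so the sketch can be observed. */
static unsigned int sync_calls;

/* Stand-in for cur_ops->sync(); real code would block for a grace period. */
static void fake_synchronize(void)
{
	sync_calls++;
}

/* Install the hook, or remove it by passing NULL. */
static void rcutorture_set_hp_hook(rcutorture_hp_hook_t hook)
{
	rcutorture_hp_hook = hook;
}

/* Hook body that the torture test would register. */
static void torture_sync_hook(unsigned int cpu)
{
	(void)cpu;	/* Unused here; real code might record the CPU. */
	fake_synchronize();
}

/* Stand-in for an RCU CPU-hotplug notifier that is allowed to block. */
static int rcu_blocking_notifier(unsigned int cpu)
{
	rcutorture_hp_hook_t hook = rcutorture_hp_hook;

	if (hook)
		hook(cpu);	/* Does work only while a hook is installed. */
	return 0;
}
```

With no hook installed the notifier costs only a pointer check, so the
synchronization runs only while a torture test has registered
torture_sync_hook().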

							Thanx, Paul

> thanks,
>
> - Joel
>
>
>
> > > kernel/rcu/rcutorture.c | 81 +++++++++++++++++++++--------------------
> > > 1 file changed, 42 insertions(+), 39 deletions(-)
> > >
> > > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > > index 92cb79620939..083b65e4877d 100644
> > > --- a/kernel/rcu/rcutorture.c
> > > +++ b/kernel/rcu/rcutorture.c
> > > @@ -1645,12 +1645,37 @@ rcu_torture_print_module_parms(struct rcu_torture_ops *cur_ops, const char *tag)
> > >  		 read_exit_delay, read_exit_burst);
> > >  }
> > >
> > > -static int rcutorture_booster_cleanup(unsigned int cpu)
> > > +static bool rcu_torture_can_boost(void)
> > > +{
> > > +	static int boost_warn_once;
> > > +	int prio;
> > > +
> > > +	if (!(test_boost == 1 && cur_ops->can_boost) && test_boost != 2)
> > > +		return false;
> > > +
> > > +	prio = rcu_get_gp_kthreads_prio();
> > > +	if (!prio)
> > > +		return false;
> > > +
> > > +	if (prio < 2) {
> > > +		if (boost_warn_once == 1)
> > > +			return false;
> > > +
> > > +		pr_alert("%s: WARN: RCU kthread priority too low to test boosting. Skipping RCU boost test. Try passing rcutree.kthread_prio > 1 on the kernel command line.\n", KBUILD_MODNAME);
> > > +		boost_warn_once = 1;
> > > +		return false;
> > > +	}
> > > +
> > > +	return true;
> > > +}
> > > +
> > > +static int rcutorture_hp_cleanup(unsigned int cpu)
> > >  {
> > >  	struct task_struct *t;
> > >
> > > -	if (boost_tasks[cpu] == NULL)
> > > +	if (!rcu_torture_can_boost() || boost_tasks[cpu] == NULL)
> > >  		return 0;
> > > +
> > >  	mutex_lock(&boost_mutex);
> > >  	t = boost_tasks[cpu];
> > >  	boost_tasks[cpu] = NULL;
> > > @@ -1662,11 +1687,14 @@ static int rcutorture_booster_cleanup(unsigned int cpu)
> > >  	return 0;
> > >  }
> > >
> > > -static int rcutorture_booster_init(unsigned int cpu)
> > > +static int rcutorture_hp_init(unsigned int cpu)
> > >  {
> > >  	int retval;
> > >
> > > -	if (boost_tasks[cpu] != NULL)
> > > +	/* Force synchronizing from hotplug notifier to ensure it is safe. */
> > > +	cur_ops->sync();
> > > +
> > > +	if (!rcu_torture_can_boost() || boost_tasks[cpu] != NULL)
> > >  		return 0; /* Already created, nothing more to do. */
> > >
> > >  	/* Don't allow time recalculation while creating a new task. */
> > > @@ -2336,30 +2364,6 @@ static void rcu_torture_barrier_cleanup(void)
> > >  	}
> > >  }
> > >
> > > -static bool rcu_torture_can_boost(void)
> > > -{
> > > -	static int boost_warn_once;
> > > -	int prio;
> > > -
> > > -	if (!(test_boost == 1 && cur_ops->can_boost) && test_boost != 2)
> > > -		return false;
> > > -
> > > -	prio = rcu_get_gp_kthreads_prio();
> > > -	if (!prio)
> > > -		return false;
> > > -
> > > -	if (prio < 2) {
> > > -		if (boost_warn_once == 1)
> > > -			return false;
> > > -
> > > -		pr_alert("%s: WARN: RCU kthread priority too low to test boosting. Skipping RCU boost test. Try passing rcutree.kthread_prio > 1 on the kernel command line.\n", KBUILD_MODNAME);
> > > -		boost_warn_once = 1;
> > > -		return false;
> > > -	}
> > > -
> > > -	return true;
> > > -}
> > > -
> > >  static bool read_exit_child_stop;
> > >  static bool read_exit_child_stopped;
> > >  static wait_queue_head_t read_exit_wq;
> > > @@ -2503,8 +2507,7 @@ rcu_torture_cleanup(void)
> > >  			 rcutorture_seq_diff(gp_seq, start_gp_seq));
> > >  	torture_stop_kthread(rcu_torture_stats, stats_task);
> > >  	torture_stop_kthread(rcu_torture_fqs, fqs_task);
> > > -	if (rcu_torture_can_boost())
> > > -		cpuhp_remove_state(rcutor_hp);
> > > +	cpuhp_remove_state(rcutor_hp);
> > >
> > >  	/*
> > >  	 * Wait for all RCU callbacks to fire, then do torture-type-specific
> > > @@ -2773,21 +2776,21 @@ rcu_torture_init(void)
> > >  		if (firsterr)
> > >  			goto unwind;
> > >  	}
> > > +
> > > +	firsterr = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "RCU_TORTURE",
> > > +				     rcutorture_hp_init,
> > > +				     rcutorture_hp_cleanup);
> > > +	if (firsterr < 0)
> > > +		goto unwind;
> > > +	rcutor_hp = firsterr;
> > > +
> > >  	if (test_boost_interval < 1)
> > >  		test_boost_interval = 1;
> > >  	if (test_boost_duration < 2)
> > >  		test_boost_duration = 2;
> > > -	if (rcu_torture_can_boost()) {
> > > -
> > > +	if (rcu_torture_can_boost())
> > >  		boost_starttime = jiffies + test_boost_interval * HZ;
> > >
> > > -		firsterr = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "RCU_TORTURE",
> > > -					     rcutorture_booster_init,
> > > -					     rcutorture_booster_cleanup);
> > > -		if (firsterr < 0)
> > > -			goto unwind;
> > > -		rcutor_hp = firsterr;
> > > -	}
> > >  	shutdown_jiffies = jiffies + shutdown_secs * HZ;
> > >  	firsterr = torture_shutdown_init(shutdown_secs, rcu_torture_cleanup);
> > >  	if (firsterr)
> > > --
> > > 2.28.0.236.gb10cc79966-goog
> > >