From: paulmck@kernel.org
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com,
mingo@kernel.org, jiangshanlai@gmail.com, dipankar@in.ibm.com,
akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
josh@joshtriplett.org, tglx@linutronix.de, peterz@infradead.org,
rostedt@goodmis.org, dhowells@redhat.com, edumazet@google.com,
fweisbec@gmail.com, oleg@redhat.com, joel@joelfernandes.org,
sfr@canb.auug.org.au, "Paul E. McKenney" <paulmck@kernel.org>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Daniel Borkmann <daniel@iogearbox.net>,
Jiri Olsa <jolsa@redhat.com>,
bpf@vger.kernel.org, "# 5 . 7 . x" <stable@vger.kernel.org>
Subject: [PATCH tip/core/rcu 6/8] rcu-tasks: Fix grace-period/unlock race in RCU Tasks Trace
Date: Thu, 17 Sep 2020 14:07:42 -0700 [thread overview]
Message-ID: <20200917210744.2995-6-paulmck@kernel.org> (raw)
In-Reply-To: <20200917210652.GA31242@paulmck-ThinkPad-P72>
From: "Paul E. McKenney" <paulmck@kernel.org>
The more intense grace-period processing resulting from the 50x RCU
Tasks Trace grace-period speedups exposed the following race condition:
o Task A running on CPU 0 executes rcu_read_lock_trace(),
entering a read-side critical section.
o When Task A eventually invokes rcu_read_unlock_trace()
to exit its read-side critical section, this function
notes that the ->trc_reader_special.s flag is zero and
and therefore invoke wil set ->trc_reader_nesting to zero
using WRITE_ONCE(). But before that happens...
o The RCU Tasks Trace grace-period kthread running on some other
CPU interrogates Task A, but this fails because this task is
currently running. This kthread therefore sends an IPI to CPU 0.
o CPU 0 receives the IPI, and thus invokes trc_read_check_handler().
Because Task A has not yet cleared its ->trc_reader_nesting
counter, this function sees that Task A is still within its
read-side critical section. This function therefore sets the
->trc_reader_nesting.b.need_qs flag, AKA the .need_qs flag.
Except that Task A has already checked the .need_qs flag, which
is part of the ->trc_reader_special.s flag. The .need_qs flag
therefore remains set until Task A's next rcu_read_unlock_trace().
o Task A now invokes synchronize_rcu_tasks_trace(), which cannot
start a new grace period until the current grace period completes.
And thus cannot return until after that time.
But Task A's .need_qs flag is still set, which prevents the current
grace period from completing. And because Task A is blocked, it
will never execute rcu_read_unlock_trace() until its call to
synchronize_rcu_tasks_trace() returns.
We are therefore deadlocked.
This race is improbable, but 80 hours of rcutorture made it happen twice.
The race was possible before the grace-period speedup, but roughly 50x
less probable. Several thousand hours of rcutorture would have been
necessary to have a reasonable chance of making this happen before this
50x speedup.
This commit therefore eliminates this deadlock by setting
->trc_reader_nesting to a large negative number before checking the
.need_qs and zeroing (or decrementing with respect to its initial
value) ->trc_reader_nesting. For its part, the IPI handler's
trc_read_check_handler() function adds a check for negative values,
deferring evaluation of the task in this case. Taken together, these
changes avoid this deadlock scenario.
Fixes: 276c410448db ("rcu-tasks: Split ->trc_reader_need_end")
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: <bpf@vger.kernel.org>
Cc: <stable@vger.kernel.org> # 5.7.x
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
include/linux/rcupdate_trace.h | 4 ++++
kernel/rcu/tasks.h | 6 ++++++
2 files changed, 10 insertions(+)
diff --git a/include/linux/rcupdate_trace.h b/include/linux/rcupdate_trace.h
index d9015aa..a6a6a3a 100644
--- a/include/linux/rcupdate_trace.h
+++ b/include/linux/rcupdate_trace.h
@@ -50,6 +50,7 @@ static inline void rcu_read_lock_trace(void)
struct task_struct *t = current;
WRITE_ONCE(t->trc_reader_nesting, READ_ONCE(t->trc_reader_nesting) + 1);
+ barrier();
if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB) &&
t->trc_reader_special.b.need_mb)
smp_mb(); // Pairs with update-side barriers
@@ -72,6 +73,9 @@ static inline void rcu_read_unlock_trace(void)
rcu_lock_release(&rcu_trace_lock_map);
nesting = READ_ONCE(t->trc_reader_nesting) - 1;
+ barrier(); // Critical section before disabling.
+ // Disable IPI-based setting of .need_qs.
+ WRITE_ONCE(t->trc_reader_nesting, INT_MIN);
if (likely(!READ_ONCE(t->trc_reader_special.s)) || nesting) {
WRITE_ONCE(t->trc_reader_nesting, nesting);
return; // We assume shallow reader nesting.
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index a0eaed5..e583a2d 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -830,6 +830,12 @@ static void trc_read_check_handler(void *t_in)
WRITE_ONCE(t->trc_reader_checked, true);
goto reset_ipi;
}
+ // If we are racing with an rcu_read_unlock_trace(), try again later.
+ if (unlikely(t->trc_reader_nesting < 0)) {
+ if (WARN_ON_ONCE(atomic_dec_and_test(&trc_n_readers_need_end)))
+ wake_up(&trc_wait);
+ goto reset_ipi;
+ }
WRITE_ONCE(t->trc_reader_checked, true);
// Get here if the task is in a read-side critical section. Set
--
2.9.5
next prev parent reply other threads:[~2020-09-17 21:08 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-10 20:19 [PATCH RFC tip/core/rcu 0/4] Accelerate RCU Tasks Trace updates Paul E. McKenney
2020-09-10 20:20 ` [PATCH RFC tip/core/rcu 1/4] rcu-tasks: Mark variables static paulmck
2020-09-10 20:20 ` [PATCH RFC tip/core/rcu 2/4] rcu-tasks: Use more aggressive polling for RCU Tasks Trace paulmck
2020-09-10 20:20 ` [PATCH RFC tip/core/rcu 3/4] rcu-tasks: Selectively enable more RCU Tasks Trace IPIs paulmck
2020-09-10 20:20 ` [PATCH RFC tip/core/rcu 4/4] rcu-tasks: Shorten per-grace-period sleep for RCU Tasks Trace paulmck
2020-09-11 3:18 ` Alexei Starovoitov
2020-09-11 4:37 ` Paul E. McKenney
2020-09-17 21:06 ` [PATCH RFC tip/core/rcu 0/4] Accelerate RCU Tasks Trace updates Paul E. McKenney
2020-09-17 21:07 ` [PATCH tip/core/rcu 1/8] rcu-tasks: Prevent complaints of unused show_rcu_tasks_classic_gp_kthread() paulmck
2020-09-17 21:07 ` [PATCH tip/core/rcu 2/8] rcu-tasks: Mark variables static paulmck
2020-09-17 21:07 ` [PATCH tip/core/rcu 3/8] rcu-tasks: Use more aggressive polling for RCU Tasks Trace paulmck
2020-09-17 21:07 ` [PATCH tip/core/rcu 4/8] rcu-tasks: Selectively enable more RCU Tasks Trace IPIs paulmck
2020-09-17 21:07 ` [PATCH tip/core/rcu 5/8] rcu-tasks: Shorten per-grace-period sleep for RCU Tasks Trace paulmck
2020-09-17 21:07 ` paulmck [this message]
2020-09-17 21:07 ` [PATCH tip/core/rcu 7/8] rcu-tasks: Fix low-probability task_struct leak paulmck
2020-09-17 21:07 ` [PATCH tip/core/rcu 8/8] rcu-tasks: Enclose task-list scan in rcu_read_lock() paulmck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200917210744.2995-6-paulmck@kernel.org \
--to=paulmck@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=alexei.starovoitov@gmail.com \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=dhowells@redhat.com \
--cc=dipankar@in.ibm.com \
--cc=edumazet@google.com \
--cc=fweisbec@gmail.com \
--cc=jiangshanlai@gmail.com \
--cc=joel@joelfernandes.org \
--cc=jolsa@redhat.com \
--cc=josh@joshtriplett.org \
--cc=kernel-team@fb.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mingo@kernel.org \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=rcu@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=sfr@canb.auug.org.au \
--cc=stable@vger.kernel.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox