From: Peter Zijlstra <peterz@infradead.org>
To: linux-kernel@vger.kernel.org, torvalds@linux-foundation.org,
manfred@colorfullife.com, dave@stgolabs.net,
paulmck@linux.vnet.ibm.com, will.deacon@arm.com
Cc: boqun.feng@gmail.com, Waiman.Long@hpe.com, tj@kernel.org,
pablo@netfilter.org, kaber@trash.net, davem@davemloft.net,
oleg@redhat.com, netfilter-devel@vger.kernel.org,
sasha.levin@oracle.com, hofrat@osadl.org,
"Peter Zijlstra (Intel)" <peterz@infradead.org>
Subject: [RFC][PATCH 2/3] locking: Annotate spin_unlock_wait() users
Date: Tue, 24 May 2016 16:27:25 +0200
Message-ID: <20160524143649.608476390@infradead.org>
In-Reply-To: 20160524142723.178148277@infradead.org
[-- Attachment #1: peterz-locking-fix-spin_unlock_wait.patch --]
[-- Type: text/plain, Size: 3030 bytes --]
spin_unlock_wait() has an unintuitive 'feature': it does not fully
serialize against the spin_unlock() we have waited on.

In particular, spin_unlock_wait() only provides a control dependency,
which is a LOAD->STORE order. Subsequent loads can therefore creep up
and observe state from before the waited-for unlock, so we do not
necessarily observe the full critical section.

We must employ smp_acquire__after_ctrl_dep() to upgrade the
LOAD->STORE order to LOAD->{LOAD,STORE}, a.k.a. load-ACQUIRE, and
thereby ensure we observe the full critical section we have waited on.

Many spin_unlock_wait() users were unaware of this issue and need
help.
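
For illustration only, the pattern can be sketched in userspace C11.
This is an analogy under stated assumptions, not the kernel
implementation: demo_lock, demo_payload, demo_writer(),
demo_unlock_wait() and demo_wait_and_read() are hypothetical names, the
relaxed spin loop stands in for spin_unlock_wait(), and
atomic_thread_fence(memory_order_acquire) stands in for
smp_acquire__after_ctrl_dep().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
static atomic_int demo_lock;	/* 1 = held, 0 = free */
static int demo_payload;	/* plain data written in the critical section */

/* Writer: finish the critical section, then unlock with RELEASE. */
static void *demo_writer(void *arg)
{
	(void)arg;
	demo_payload = 42;
	atomic_store_explicit(&demo_lock, 0, memory_order_release);
	return NULL;
}

/* Analogue of spin_unlock_wait(): relaxed loads until the lock looks free. */
static void demo_unlock_wait(void)
{
	while (atomic_load_explicit(&demo_lock, memory_order_relaxed))
		;
}

/* Wait for the writer's unlock, then read what its critical section wrote. */
static int demo_wait_and_read(void)
{
	pthread_t writer;
	int rc, seen;

	/* Start with the lock "held" on behalf of the writer thread. */
	demo_payload = 0;
	atomic_store_explicit(&demo_lock, 1, memory_order_relaxed);

	rc = pthread_create(&writer, NULL, demo_writer, NULL);
	assert(rc == 0);

	demo_unlock_wait();

	/*
	 * Without this fence the load of demo_payload below is not ordered
	 * after the relaxed load that saw the unlock. The acquire fence
	 * pairs with the writer's release store and orders later loads too.
	 */
	atomic_thread_fence(memory_order_acquire);
	seen = demo_payload;

	rc = pthread_join(writer, NULL);
	assert(rc == 0);
	return seen;
}
```

Calling demo_wait_and_read() from a program built with -pthread
returns the value stored inside the writer's critical section. Dropping
the fence turns the read of demo_payload into a data race in C11 terms,
which is the userspace counterpart of the missing LOAD->LOAD order
described above.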
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
drivers/ata/libata-eh.c | 4 +++-
kernel/exit.c | 14 ++++++++++++--
kernel/sched/completion.c | 7 +++++++
kernel/task_work.c | 2 +-
4 files changed, 23 insertions(+), 4 deletions(-)
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -703,8 +703,10 @@ void ata_scsi_cmd_error_handler(struct S
/* initialize eh_tries */
ap->eh_tries = ATA_EH_MAX_TRIES;
- } else
+ } else {
spin_unlock_wait(ap->lock);
+ smp_acquire__after_ctrl_dep();
+ }
}
EXPORT_SYMBOL(ata_scsi_cmd_error_handler);
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -776,11 +776,16 @@ void do_exit(long code)
exit_signals(tsk); /* sets PF_EXITING */
/*
- * tsk->flags are checked in the futex code to protect against
- * an exiting task cleaning up the robust pi futexes.
+ * Ensure that all new tsk->pi_lock acquisitions must observe
+ * PF_EXITING. Serializes against futex.c:attach_to_pi_owner().
*/
smp_mb();
raw_spin_unlock_wait(&tsk->pi_lock);
+ /*
+ * Ensure that we must observe the pi_state in exit_mm() ->
+ * mm_release() -> exit_pi_state_list().
+ */
+ smp_acquire__after_ctrl_dep();
if (unlikely(in_atomic())) {
pr_info("note: %s[%d] exited with preempt_count %d\n",
@@ -897,6 +902,11 @@ void do_exit(long code)
*/
smp_mb();
raw_spin_unlock_wait(&tsk->pi_lock);
+ /*
+ * Since there are no following loads the LOAD->LOAD order
+ * provided by smp_acquire__after_ctrl_dep() is not
+ * strictly required.
+ */
/* causes final put_task_struct in finish_task_switch(). */
tsk->state = TASK_DEAD;
--- a/kernel/sched/completion.c
+++ b/kernel/sched/completion.c
@@ -312,6 +312,13 @@ bool completion_done(struct completion *
*/
smp_rmb();
spin_unlock_wait(&x->wait.lock);
+ /*
+ * Even though we've observed 'done', this doesn't mean we can observe
+ * all stores prior to complete(), as the only RELEASE barrier on that
+ * path is provided by the spin_unlock().
+ */
+ smp_acquire__after_ctrl_dep();
+
return true;
}
EXPORT_SYMBOL(completion_done);
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -108,7 +108,7 @@ void task_work_run(void)
* fail, but it can play with *work and other entries.
*/
raw_spin_unlock_wait(&task->pi_lock);
- smp_mb();
+ smp_acquire__after_ctrl_dep();
do {
next = work->next;