netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: linux-kernel@vger.kernel.org, torvalds@linux-foundation.org,
	manfred@colorfullife.com, dave@stgolabs.net,
	paulmck@linux.vnet.ibm.com, will.deacon@arm.com
Cc: boqun.feng@gmail.com, Waiman.Long@hpe.com, tj@kernel.org,
	pablo@netfilter.org, kaber@trash.net, davem@davemloft.net,
	oleg@redhat.com, netfilter-devel@vger.kernel.org,
	sasha.levin@oracle.com, hofrat@osadl.org,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>
Subject: [RFC][PATCH 2/3] locking: Annotate spin_unlock_wait() users
Date: Tue, 24 May 2016 16:27:25 +0200	[thread overview]
Message-ID: <20160524143649.608476390@infradead.org> (raw)
In-Reply-To: 20160524142723.178148277@infradead.org

[-- Attachment #1: peterz-locking-fix-spin_unlock_wait.patch --]
[-- Type: text/plain, Size: 3030 bytes --]

spin_unlock_wait() has an unintuitive 'feature' in that it doesn't
fully serialize against the spin_unlock() we've waited on.

In particular, spin_unlock_wait() only provides a control dependency,
which is a LOAD->STORE order. This means subsequent loads can creep up
and observe state prior to the waited-for unlock. This means we don't
necessarily observe the full critical section.

We must employ smp_acquire__after_ctrl_dep() to upgrade the
LOAD->STORE to LOAD->{LOAD,STORE} aka. load-AQUIRE and thereby ensure
we observe the full critical section we've waited on.

Many spin_unlock_wait() users were unaware of this issue and need
help.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 drivers/ata/libata-eh.c   |    4 +++-
 kernel/exit.c             |   14 ++++++++++++--
 kernel/sched/completion.c |    7 +++++++
 kernel/task_work.c        |    2 +-
 4 files changed, 23 insertions(+), 4 deletions(-)

--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -703,8 +703,10 @@ void ata_scsi_cmd_error_handler(struct S
 
 		/* initialize eh_tries */
 		ap->eh_tries = ATA_EH_MAX_TRIES;
-	} else
+	} else {
 		spin_unlock_wait(ap->lock);
+		smp_acquire__after_ctrl_dep();
+	}
 
 }
 EXPORT_SYMBOL(ata_scsi_cmd_error_handler);
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -776,11 +776,16 @@ void do_exit(long code)
 
 	exit_signals(tsk);  /* sets PF_EXITING */
 	/*
-	 * tsk->flags are checked in the futex code to protect against
-	 * an exiting task cleaning up the robust pi futexes.
+	 * Ensure that all new tsk->pi_lock acquisitions must observe
+	 * PF_EXITING. Serializes against futex.c:attach_to_pi_owner().
 	 */
 	smp_mb();
 	raw_spin_unlock_wait(&tsk->pi_lock);
+	/*
+	 * Ensure that we must observe the pi_state in exit_mm() ->
+	 * mm_release() -> exit_pi_state_list().
+	 */
+	smp_acquire__after_ctrl_dep();
 
 	if (unlikely(in_atomic())) {
 		pr_info("note: %s[%d] exited with preempt_count %d\n",
@@ -897,6 +902,11 @@ void do_exit(long code)
 	 */
 	smp_mb();
 	raw_spin_unlock_wait(&tsk->pi_lock);
+	/*
+	 * Since there are no following loads the LOAD->LOAD order
+	 * provided by smp_acquire__after_ctrl_dep() is not
+	 * strictly required.
+	 */
 
 	/* causes final put_task_struct in finish_task_switch(). */
 	tsk->state = TASK_DEAD;
--- a/kernel/sched/completion.c
+++ b/kernel/sched/completion.c
@@ -312,6 +312,13 @@ bool completion_done(struct completion *
 	 */
 	smp_rmb();
 	spin_unlock_wait(&x->wait.lock);
+	/*
+	 * Even though we've observed 'done', this doesn't mean we can observe
+	 * all stores prior to complete(), as the only RELEASE barrier on that
+	 * path is provided by the spin_unlock().
+	 */
+	smp_acquire__after_ctrl_dep();
+
 	return true;
 }
 EXPORT_SYMBOL(completion_done);
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -108,7 +108,7 @@ void task_work_run(void)
 		 * fail, but it can play with *work and other entries.
 		 */
 		raw_spin_unlock_wait(&task->pi_lock);
-		smp_mb();
+		smp_acquire__after_ctrl_dep();
 
 		do {
 			next = work->next;

  parent reply	other threads:[~2016-05-24 14:27 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-24 14:27 [RFC][PATCH 0/3] spin_unlock_wait and assorted borkage Peter Zijlstra
2016-05-24 14:27 ` [RFC][PATCH 1/3] locking: Introduce smp_acquire__after_ctrl_dep Peter Zijlstra
     [not found]   ` <57451581.6000700@hpe.com>
2016-05-25  4:53     ` Paul E. McKenney
2016-05-25  5:39       ` Boqun Feng
2016-05-25 14:29         ` Paul E. McKenney
2016-05-25 15:20       ` Waiman Long
2016-05-25 15:57         ` Paul E. McKenney
2016-05-25 16:28           ` Peter Zijlstra
2016-05-25 16:54             ` Linus Torvalds
2016-05-25 18:59               ` Paul E. McKenney
2016-06-03  9:18           ` Vineet Gupta
2016-06-03  9:38             ` Peter Zijlstra
2016-06-03 12:08               ` Paul E. McKenney
2016-06-03 12:23                 ` Peter Zijlstra
2016-06-03 12:27                   ` Peter Zijlstra
2016-06-03 13:33                     ` Paul E. McKenney
2016-06-03 13:32                   ` Paul E. McKenney
2016-06-03 13:45                     ` Will Deacon
2016-06-04 15:29                       ` Paul E. McKenney
2016-06-06 17:28                         ` Paul E. McKenney
2016-06-07  7:15                           ` Peter Zijlstra
2016-06-07 12:41                             ` Hannes Frederic Sowa
2016-06-07 13:06                               ` Paul E. McKenney
2016-06-07 14:59                                 ` Hannes Frederic Sowa
2016-06-07 15:23                                   ` Paul E. McKenney
2016-06-07 17:48                                     ` Peter Zijlstra
2016-06-07 18:44                                       ` Paul E. McKenney
2016-06-07 18:01                                     ` Will Deacon
2016-06-07 18:44                                       ` Paul E. McKenney
2016-06-07 18:54                                       ` Paul E. McKenney
2016-06-07 18:37                                     ` Hannes Frederic Sowa
2016-05-24 14:27 ` Peter Zijlstra [this message]
2016-05-24 16:17   ` [RFC][PATCH 2/3] locking: Annotate spin_unlock_wait() users Linus Torvalds
2016-05-24 16:22     ` Tejun Heo
2016-05-24 16:58       ` Peter Zijlstra
2016-05-25 19:28         ` Tejun Heo
2016-05-24 16:57     ` Peter Zijlstra
2016-05-24 14:27 ` [RFC][PATCH 3/3] locking,netfilter: Fix nf_conntrack_lock() Peter Zijlstra
2016-05-24 14:42   ` Peter Zijlstra
     [not found]   ` <3e1671fc-be0f-bc95-4fbb-6bfc56e6c15b@colorfullife.com>
2016-05-26 13:54     ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160524143649.608476390@infradead.org \
    --to=peterz@infradead.org \
    --cc=Waiman.Long@hpe.com \
    --cc=boqun.feng@gmail.com \
    --cc=dave@stgolabs.net \
    --cc=davem@davemloft.net \
    --cc=hofrat@osadl.org \
    --cc=kaber@trash.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=manfred@colorfullife.com \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=pablo@netfilter.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=sasha.levin@oracle.com \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).