All of lore.kernel.org
 help / color / mirror / Atom feed
From: paulmck@linux.vnet.ibm.com (Paul E. McKenney)
Subject: [PATCH] nvme: fix flush dependency in delete controller flow
Date: Thu, 5 Apr 2018 17:18:11 -0700	[thread overview]
Message-ID: <20180406001811.GQ3948@linux.vnet.ibm.com> (raw)
In-Reply-To: <20180405195512.GA12678@lst.de>

On Thu, Apr 05, 2018@09:55:12PM +0200, Christoph Hellwig wrote:
> On Thu, Apr 05, 2018@09:08:52AM -0700, Paul E. McKenney wrote:
> > > But if this SRCU use case doesn't ever invoke call_srcu(), I
> > > -think- that shouldn't need to do the flush_delayed_work() calls in
> > > cleanup_srcu_struct().  I think.  It is a bit early out here, so I don't
> > > trust myself on this one just now, and need to take another look when
> > > fully awake.
> > > 
> > > But in the meantime, is it really the case that this SRCU use case never
> > > ever invokes call_srcu()?
> 
> We never use call_srcu().
> 
> > And I can do a bit better than this.  I could provide something like a
> > cleanup_srcu_struct_quiesced() that would avoid blocking if the caller had
> > quiesced it.  Here, to quiesce is to make sure that any synchronize_srcu()
> > or synchronize_srcu_expedited() calls have returned before the call to
> > cleanup_srcu_struct_quiesced(), and that any callbacks from call_srcu()
> > have completed.  Of course, srcu_barrier() could be used to wait for
> > the callbacks from earlier call_srcu().
> > 
> > Would something like this help?
> 
> Yes.  Note that blocking isn't even the problem, just the flush_work.
> But not blocking at all would be fine of course.
> 
> The pattern in nvme is that we have a struct nvme_ns_head, which has a
> list of struct nvme_ns structures hanging off it.  We use SRCU to protect
> lookups in said list.  We call synchronize_srcu after removing a nvme_ns
> from the list in struct nvme_ns_head and nowhere else.  A struct
> nvme_ns_head is freed (and we thus call cleanup_srcu_struct) once the
> reference count on it hits zero, which can only happen after the last
> nvme_ns has been removed from the list.
> 
> I suspect this might be a common patter in other parts of the kernel as
> well.

OK, how does the following (untested) patch look?

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index 33c1c698df09..91494d7e8e41 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -69,11 +69,45 @@ struct srcu_struct { };
 
 void call_srcu(struct srcu_struct *sp, struct rcu_head *head,
 		void (*func)(struct rcu_head *head));
-void cleanup_srcu_struct(struct srcu_struct *sp);
+void _cleanup_srcu_struct(struct srcu_struct *sp, bool quiesced);
 int __srcu_read_lock(struct srcu_struct *sp) __acquires(sp);
 void __srcu_read_unlock(struct srcu_struct *sp, int idx) __releases(sp);
 void synchronize_srcu(struct srcu_struct *sp);
 
+/**
+ * cleanup_srcu_struct - deconstruct a sleep-RCU structure
+ * @sp: structure to clean up.
+ *
+ * Must invoke this after you are finished using a given srcu_struct that
+ * was initialized via init_srcu_struct(), else you leak memory.
+ */
+static inline void cleanup_srcu_struct(struct srcu_struct *sp)
+{
+	_cleanup_srcu_struct(sp, false);
+}
+
+/**
+ * cleanup_srcu_struct_quiesced - deconstruct a quiesced sleep-RCU structure
+ * @sp: structure to clean up.
+ *
+ * Must invoke this after you are finished using a given srcu_struct that
+ * was initialized via init_srcu_struct(), else you leak memory.  Also,
+ * all grace-period processing must have completed.
+ *
+ * "Completed" means that the last synchronize_srcu() and
+ * synchronize_srcu_expedited() calls must have returned before the call
+ * to cleanup_srcu_struct_quiesced().  It also means that the callback
+ * from the last call_srcu() must have been invoked before the call to
+ * cleanup_srcu_struct_quiesced(), but you can use srcu_barrier() to help
+ * with this last.  Violating these rules will get you a WARN_ON() splat
+ * (with high probability, anyway), and will also cause the srcu_struct
+ * to be leaked.
+ */
+static inline void cleanup_srcu_struct_quiesced(struct srcu_struct *sp)
+{
+	_cleanup_srcu_struct(sp, true);
+}
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 
 /**
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 680c96d8c00f..f0e1d44459f8 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -593,7 +593,12 @@ static void srcu_torture_init(void)
 
 static void srcu_torture_cleanup(void)
 {
-	cleanup_srcu_struct(&srcu_ctld);
+	static DEFINE_TORTURE_RANDOM(rand);
+
+	if (torture_random(&rand) & 0x800)
+		cleanup_srcu_struct(&srcu_ctld);
+	else
+		cleanup_srcu_struct_quiesced(&srcu_ctld);
 	srcu_ctlp = &srcu_ctl; /* In case of a later rcutorture run. */
 }
 
diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c
index 76ac5f50b2c7..622792abe41a 100644
--- a/kernel/rcu/srcutiny.c
+++ b/kernel/rcu/srcutiny.c
@@ -86,16 +86,19 @@ EXPORT_SYMBOL_GPL(init_srcu_struct);
  * Must invoke this after you are finished using a given srcu_struct that
  * was initialized via init_srcu_struct(), else you leak memory.
  */
-void cleanup_srcu_struct(struct srcu_struct *sp)
+void _cleanup_srcu_struct(struct srcu_struct *sp, bool quiesced)
 {
 	WARN_ON(sp->srcu_lock_nesting[0] || sp->srcu_lock_nesting[1]);
-	flush_work(&sp->srcu_work);
+	if (quiesced)
+		WARN_ON(work_pending(&sp->srcu_work));
+	else
+		flush_work(&sp->srcu_work);
 	WARN_ON(sp->srcu_gp_running);
 	WARN_ON(sp->srcu_gp_waiting);
 	WARN_ON(sp->srcu_cb_head);
 	WARN_ON(&sp->srcu_cb_head != sp->srcu_cb_tail);
 }
-EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
+EXPORT_SYMBOL_GPL(_cleanup_srcu_struct);
 
 /*
  * Removes the count for the old reader from the appropriate element of
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index fb560fca9ef4..b4123d7a2cec 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -366,24 +366,28 @@ static unsigned long srcu_get_delay(struct srcu_struct *sp)
 	return SRCU_INTERVAL;
 }
 
-/**
- * cleanup_srcu_struct - deconstruct a sleep-RCU structure
- * @sp: structure to clean up.
- *
- * Must invoke this after you are finished using a given srcu_struct that
- * was initialized via init_srcu_struct(), else you leak memory.
- */
-void cleanup_srcu_struct(struct srcu_struct *sp)
+/* Helper for cleanup_srcu_struct() and cleanup_srcu_struct_quiesced(). */
+void _cleanup_srcu_struct(struct srcu_struct *sp, bool quiesced)
 {
 	int cpu;
 
 	if (WARN_ON(!srcu_get_delay(sp)))
-		return; /* Leakage unless caller handles error. */
+		return; /* Just leak it! */
 	if (WARN_ON(srcu_readers_active(sp)))
-		return; /* Leakage unless caller handles error. */
-	flush_delayed_work(&sp->work);
+		return; /* Just leak it! */
+	if (quiesced) {
+		if (WARN_ON(delayed_work_pending(&sp->work)))
+			return; /* Just leak it! */
+	} else {
+		flush_delayed_work(&sp->work);
+	}
 	for_each_possible_cpu(cpu)
-		flush_delayed_work(&per_cpu_ptr(sp->sda, cpu)->work);
+		if (quiesced) {
+			if (WARN_ON(delayed_work_pending(&per_cpu_ptr(sp->sda, cpu)->work)))
+				return; /* Just leak it! */
+		} else {
+			flush_delayed_work(&per_cpu_ptr(sp->sda, cpu)->work);
+		}
 	if (WARN_ON(rcu_seq_state(READ_ONCE(sp->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
 	    WARN_ON(srcu_readers_active(sp))) {
 		pr_info("%s: Active srcu_struct %p state: %d\n", __func__, sp, rcu_seq_state(READ_ONCE(sp->srcu_gp_seq)));
@@ -392,7 +396,7 @@ void cleanup_srcu_struct(struct srcu_struct *sp)
 	free_percpu(sp->sda);
 	sp->sda = NULL;
 }
-EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
+EXPORT_SYMBOL_GPL(_cleanup_srcu_struct);
 
 /*
  * Counts the new reader in the appropriate per-CPU element of the

  reply	other threads:[~2018-04-06  0:18 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-02 12:37 [PATCH] nvme: fix flush dependency in delete controller flow Nitzan Carmi
2018-04-05  8:54 ` Christoph Hellwig
2018-04-05 13:14   ` Paul E. McKenney
2018-04-05 16:08     ` Paul E. McKenney
2018-04-05 19:55       ` Christoph Hellwig
2018-04-06  0:18         ` Paul E. McKenney [this message]
2018-04-06  6:19           ` Christoph Hellwig
2018-04-06 16:30             ` Paul E. McKenney
2018-04-08  7:20               ` Nitzan Carmi
2018-04-08 16:48                 ` Paul E. McKenney
2018-04-09  6:39                   ` Nitzan Carmi
2018-04-09  6:50                     ` Christoph Hellwig
2018-04-09 14:50                       ` [PATCH] nvme: avoid " Nitzan Carmi
2018-04-09 16:48                         ` Paul E. McKenney
2018-04-09 16:58                           ` Max Gurtovoy
2018-04-09 18:22                             ` Paul E. McKenney
2018-04-10 10:47                               ` Max Gurtovoy
2018-04-10 17:01                                 ` Paul E. McKenney
2018-04-09 16:30                     ` [PATCH] nvme: fix " Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180406001811.GQ3948@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.