All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: axboe@kernel.dk, vgoyal@redhat.com
Cc: ctalbott@google.com, rni@google.com,
	linux-kernel@vger.kernel.org, Tejun Heo <tj@kernel.org>
Subject: [PATCH 09/13] block, cfq: fix cic lookup locking
Date: Tue, 25 Oct 2011 18:48:35 -0700	[thread overview]
Message-ID: <1319593719-19132-10-git-send-email-tj@kernel.org> (raw)
In-Reply-To: <1319593719-19132-1-git-send-email-tj@kernel.org>

* cfq_cic_lookup() may be called without queue_lock and multiple tasks
  can execute it simultaneously for the same shared ioc.  Nothing
  prevents them racing each other and trying to drop the same dead cic
  entry multiple times.

* smp_wmb() in cfq_exit_cic() doesn't really do anything and nothing
  prevents cfq_cic_lookup() seeing stale cic->key.  This usually
  doesn't blow up because by the time cic is exited, all requests have
  been drained and new requests are terminated before going through
  elevator.  However, it can still be triggered by plug merge path
  which doesn't grab queue_lock and thus can't check DEAD state
  reliably.

This patch updates lookup locking such that,

* Lookup is always performed under queue_lock.  This doesn't add any
  more locking.  The only issue is cfq_allow_merge() which can be
  called from plug merge path without holding any lock.  For now, this
  is worked around by using cic of the request to merge into, which is
  guaranteed to have the same ioc.  For longer term, I think it would
  be best to separate out plug merge method from regular one.

* Spurious ioc->lock locking around cic lookup hint assignment
  dropped.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
---
 block/cfq-iosched.c |   69 ++++++++++++++++++++++++++------------------------
 1 files changed, 36 insertions(+), 33 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 181a63d..e617b08 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1682,12 +1682,19 @@ static int cfq_allow_merge(struct request_queue *q, struct request *rq,
 		return false;
 
 	/*
-	 * Lookup the cfqq that this bio will be queued with. Allow
-	 * merge only if rq is queued there.
+	 * Lookup the cfqq that this bio will be queued with and allow
+	 * merge only if rq is queued there.  This function can be called
+	 * from plug merge without queue_lock.  In such cases, ioc of @rq
+	 * and %current are guaranteed to be equal.  Avoid lookup which
+	 * requires queue_lock by using @rq's cic.
 	 */
-	cic = cfq_cic_lookup(cfqd, current->io_context);
-	if (!cic)
-		return false;
+	if (current->io_context == RQ_CIC(rq)->ioc) {
+		cic = RQ_CIC(rq);
+	} else {
+		cic = cfq_cic_lookup(cfqd, current->io_context);
+		if (!cic)
+			return false;
+	}
 
 	cfqq = cic_to_cfqq(cic, cfq_bio_sync(bio));
 	return cfqq == RQ_CFQQ(rq);
@@ -2784,21 +2791,15 @@ static void cfq_exit_cic(struct cfq_io_context *cic)
 	struct io_context *ioc = cic->ioc;
 
 	list_del_init(&cic->queue_list);
+	cic->key = cfqd_dead_key(cfqd);
 
 	/*
-	 * Make sure dead mark is seen for dead queues
+	 * Both setting lookup hint to and clearing it from @cic are done
+	 * under queue_lock.  If it's not pointing to @cic now, it never
+	 * will.  Hint assignment itself can race safely.
 	 */
-	smp_wmb();
-	cic->key = cfqd_dead_key(cfqd);
-
-	rcu_read_lock();
-	if (rcu_dereference(ioc->ioc_data) == cic) {
-		rcu_read_unlock();
-		spin_lock(&ioc->lock);
+	if (rcu_dereference_raw(ioc->ioc_data) == cic)
 		rcu_assign_pointer(ioc->ioc_data, NULL);
-		spin_unlock(&ioc->lock);
-	} else
-		rcu_read_unlock();
 
 	if (cic->cfqq[BLK_RW_ASYNC]) {
 		cfq_exit_cfqq(cfqd, cic->cfqq[BLK_RW_ASYNC]);
@@ -3092,12 +3093,20 @@ cfq_drop_dead_cic(struct cfq_data *cfqd, struct io_context *ioc,
 	cfq_cic_free(cic);
 }
 
+/**
+ * cfq_cic_lookup - lookup cfq_io_context
+ * @cfqd: the associated cfq_data
+ * @ioc: the associated io_context
+ *
+ * Look up cfq_io_context associated with @cfqd - @ioc pair.  Must be
+ * called with queue_lock held.
+ */
 static struct cfq_io_context *
 cfq_cic_lookup(struct cfq_data *cfqd, struct io_context *ioc)
 {
 	struct cfq_io_context *cic;
-	unsigned long flags;
 
+	lockdep_assert_held(cfqd->queue->queue_lock);
 	if (unlikely(!ioc))
 		return NULL;
 
@@ -3107,28 +3116,22 @@ cfq_cic_lookup(struct cfq_data *cfqd, struct io_context *ioc)
 	 * we maintain a last-hit cache, to avoid browsing over the tree
 	 */
 	cic = rcu_dereference(ioc->ioc_data);
-	if (cic && cic->key == cfqd) {
-		rcu_read_unlock();
-		return cic;
-	}
+	if (cic && cic->key == cfqd)
+		goto out;
 
 	do {
 		cic = radix_tree_lookup(&ioc->radix_root, cfqd->queue->id);
-		rcu_read_unlock();
 		if (!cic)
 			break;
-		if (unlikely(cic->key != cfqd)) {
-			cfq_drop_dead_cic(cfqd, ioc, cic);
-			rcu_read_lock();
-			continue;
+		if (likely(cic->key == cfqd)) {
+			/* hint assignment itself can race safely */
+			rcu_assign_pointer(ioc->ioc_data, cic);
+			break;
 		}
-
-		spin_lock_irqsave(&ioc->lock, flags);
-		rcu_assign_pointer(ioc->ioc_data, cic);
-		spin_unlock_irqrestore(&ioc->lock, flags);
-		break;
+		cfq_drop_dead_cic(cfqd, ioc, cic);
 	} while (1);
-
+out:
+	rcu_read_unlock();
 	return cic;
 }
 
-- 
1.7.3.1


  parent reply	other threads:[~2011-10-26  1:49 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-26  1:48 [PATCHSET block:for-3.2/core] rescue cfq from RCU death sprial :) Tejun Heo
2011-10-26  1:48 ` [PATCH 01/13] ida: make ida_simple_get/put() IRQ safe Tejun Heo
2011-10-26  4:42   ` Rusty Russell
2011-10-26 20:28     ` Tejun Heo
2011-10-26  1:48 ` [PATCH 02/13] block, cfq: move cfqd->cic_index to q->id Tejun Heo
2011-10-26  1:48 ` [PATCH 03/13] block: misc ioc cleanups Tejun Heo
2011-10-26  1:48 ` [PATCH 04/13] block: make ioc get/put interface more conventional and fix race on alloction Tejun Heo
2011-10-26 16:01   ` Vivek Goyal
2011-10-26 19:29     ` Tejun Heo
2011-10-26 21:30   ` [PATCH UPDATED " Tejun Heo
2011-10-26  1:48 ` [PATCH 05/13] block: misc updates to blk_get_queue() Tejun Heo
2011-10-26  1:48 ` [PATCH 06/13] block, cfq: misc updates to cfq_io_context Tejun Heo
2011-10-27 15:39   ` Vivek Goyal
2011-10-27 16:24     ` Tejun Heo
2011-10-26  1:48 ` [PATCH 07/13] block, cfq: move ioc ioprio/cgroup changed handling to cic Tejun Heo
2011-10-26  1:48 ` [PATCH 08/13] block, cfq: fix race condition in cic creation path and tighten locking Tejun Heo
2011-10-26  1:48 ` Tejun Heo [this message]
2011-10-26  1:48 ` [PATCH 10/13] block, cfq: unlink cfq_io_context's immediately Tejun Heo
2011-10-26 19:48   ` Vivek Goyal
2011-10-26 19:55     ` Tejun Heo
2011-10-26 20:38       ` Vivek Goyal
2011-10-26 20:54         ` Tejun Heo
2011-10-26 21:31   ` [PATCH UPDATED " Tejun Heo
2011-10-27 14:31     ` Vivek Goyal
2011-10-27 16:17       ` Tejun Heo
2011-10-27 17:05         ` Vivek Goyal
2011-10-27 17:11           ` Tejun Heo
2011-10-26  1:48 ` [PATCH 11/13] block, cfq: remove delayed unlink Tejun Heo
2011-10-26  1:48 ` [PATCH 12/13] block, cfq: kill ioc_gone Tejun Heo
2011-10-26  1:48 ` [PATCH 13/13] block, cfq: kill cic->key Tejun Heo
2011-10-26 21:36 ` [PATCHSET block:for-3.2/core] rescue cfq from RCU death sprial :) Tejun Heo
2011-10-27 15:32   ` Vivek Goyal
2011-10-27 16:10     ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1319593719-19132-10-git-send-email-tj@kernel.org \
    --to=tj@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=ctalbott@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rni@google.com \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.