From: Mike Snitzer <snitzer@redhat.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Bart Van Assche <bart.vanassche@sandisk.com>,
	device-mapper development <dm-devel@redhat.com>,
	Aaro Koskinen <aaro.koskinen@nokia.com>
Subject: [PATCH] dm: fix free_rq_clone() NULL pointer when requeueing unmapped request
Date: Wed, 29 Apr 2015 14:53:45 -0400
Message-ID: <20150429185345.GA5975@redhat.com>
In-Reply-To: <20150429133433.GA23127@redhat.com>

On Wed, Apr 29 2015 at  9:34am -0400,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Wed, Apr 29 2015 at  9:20am -0400,
> Christoph Hellwig <hch@lst.de> wrote:
> 
> > On Tue, Apr 28, 2015 at 01:52:20PM +0200, Bart Van Assche wrote:
> > > Hello,
> > >
> > > Earlier today I started testing an SRP initiator patch series on top of 
> > > Linux kernel v4.1-rc1. Although that patch series works reliably on top of 
> > > kernel v4.0, a test during which I triggered scsi_remove_host() + relogin 
> > > (for p in /sys/class/srp_remote_ports/*; do echo 1 >$p/delete & done; wait; 
> > > srp_daemon -oaec) triggered the following kernel oops:
> > 
> > Can you try the patch below?  From my cursory reading of the dm code
> > it can have tio->clone allocated for a while before it sets up the ->q
> > pointer for it:
> > 
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index f8c7ca3..ee74764 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -1089,7 +1089,7 @@ static void free_rq_clone(struct request *clone)
> >  
> >  	blk_rq_unprep_clone(clone);
> >  
> > -	if (clone->q->mq_ops)
> > +	if (clone->q && clone->q->mq_ops)
> >  		tio->ti->type->release_clone_rq(clone);
> >  	else if (!md->queue->mq_ops)
> >  		/* request_fn queue stacked on request_fn queue(s) */
> 
> I'm seeing this same crash on the completion path (when using your
> tcm_loop script).  But for Bart's case his stacktrace included
> dm_requeue_unmapped_original_request() -- which if called from
> map_request() implies clone->q won't have been initialized given
> __multipath_map()'s code for setting up the old request_fn case.
> 
> Long story short: your fix is right for Bart's crash (but not the ones
> I'm seeing with tcm_loop) -- I'll get it queued up with a proper header
> attributed to you and cc'ing stable as needed.

Actually, here is the proper 4.1-only fix (Bart, please verify that this
works for you):

From: Mike Snitzer <snitzer@redhat.com>
Date: Wed, 29 Apr 2015 10:48:09 -0400
Subject: dm: fix free_rq_clone() NULL pointer when requeueing unmapped request

Commit 022333427a ("dm: optimize dm_mq_queue_rq to _not_ use kthread if
using pure blk-mq") mistakenly removed free_rq_clone()'s clone->q check
before testing clone->q->mq_ops.  It was an oversight to discontinue
that check for 1 of the 2 use-cases for free_rq_clone():
1) free_rq_clone() called when an unmapped original request is requeued
2) free_rq_clone() called in the request-based IO completion path

The clone->q check made sense for case #1 but not for #2.  However, we
cannot just reinstate the check as it'd mask a serious bug in the IO
completion case #2 -- no in-flight request should have an uninitialized
request_queue (basic block layer refcounting _should_ ensure this).

The NULL pointer seen for case #1 is detailed here:
https://www.redhat.com/archives/dm-devel/2015-April/msg00160.html

Fix this free_rq_clone() NULL pointer dereference by checking whether
the mapped_device's type is DM_TYPE_MQ_REQUEST_BASED (clone's queue is
blk-mq) rather than checking clone->q->mq_ops.  This avoids the need to
dereference clone->q; a WARN_ON_ONCE is added for completion case #2 to
let us know if an uninitialized clone request is being completed.

Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 3d34b5d..5998c26 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1031,16 +1031,24 @@ static void rq_completed(struct mapped_device *md, int rw, bool run_queue)
 	dm_put(md);
 }
 
-static void free_rq_clone(struct request *clone)
+static void free_rq_clone(struct request *clone, bool must_be_mapped)
 {
 	struct dm_rq_target_io *tio = clone->end_io_data;
 	struct mapped_device *md = tio->md;
 
-	if (clone->q->mq_ops)
+	WARN_ON_ONCE(must_be_mapped && !clone->q);
+
+	if (md->type == DM_TYPE_MQ_REQUEST_BASED)
+		/* stacked on blk-mq queue(s) */
 		tio->ti->type->release_clone_rq(clone);
 	else if (!md->queue->mq_ops)
 		/* request_fn queue stacked on request_fn queue(s) */
 		free_clone_request(md, clone);
+	/*
+	 * NOTE: for the blk-mq queue stacked on request_fn queue(s) case:
+	 * no need to call free_clone_request() because we leverage blk-mq by
+	 * allocating the clone at the end of the blk-mq pdu (see: clone_rq)
+	 */
 
 	if (!md->queue->mq_ops)
 		free_rq_tio(tio);
@@ -1071,7 +1079,7 @@ static void dm_end_request(struct request *clone, int error)
 			rq->sense_len = clone->sense_len;
 	}
 
-	free_rq_clone(clone);
+	free_rq_clone(clone, true);
 	if (!rq->q->mq_ops)
 		blk_end_request_all(rq, error);
 	else
@@ -1090,7 +1098,7 @@ static void dm_unprep_request(struct request *rq)
 	}
 
 	if (clone)
-		free_rq_clone(clone);
+		free_rq_clone(clone, false);
 }
 
 /*
-- 
2.3.2 (Apple Git-55)
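
For anyone following the branch logic in the patch above, here is a
minimal user-space sketch of the decision free_rq_clone() makes after
this change.  It is only a model: every name in it (queue_model,
clone_model, md_model, pick_free_action, warn_count) is a hypothetical
stand-in rather than kernel API, and a counter takes the place of
WARN_ON_ONCE.  The point it illustrates is that the branch is chosen
from the mapped_device's type, so a clone whose ->q is still NULL is
never dereferenced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
enum dm_type_model { MODEL_REQUEST_BASED, MODEL_MQ_REQUEST_BASED };

struct queue_model { bool has_mq_ops; };
struct clone_model { struct queue_model *q; };  /* q stays NULL until mapped */
struct md_model { enum dm_type_model type; struct queue_model *queue; };

enum free_action { RELEASE_VIA_TARGET, FREE_CLONE_REQUEST, NOTHING_TO_FREE };

/* Counts every hit; the real WARN_ON_ONCE would only report the first one. */
static int warn_count;

static enum free_action pick_free_action(const struct md_model *md,
                                         const struct clone_model *clone,
                                         bool must_be_mapped)
{
    /* Completion path: an unmapped clone here would indicate a serious bug. */
    if (must_be_mapped && clone->q == NULL)
        warn_count++;

    /* Decide from the device type; clone->q is never dereferenced. */
    if (md->type == MODEL_MQ_REQUEST_BASED)
        return RELEASE_VIA_TARGET;      /* stacked on blk-mq queue(s) */
    if (!md->queue->has_mq_ops)
        return FREE_CLONE_REQUEST;      /* request_fn on request_fn */
    return NOTHING_TO_FREE;             /* clone lives in the blk-mq pdu */
}

/* Case #1 from the commit message: requeue of a never-mapped clone. */
static void demo_requeue_of_unmapped_clone(void)
{
    struct queue_model dm_queue = { .has_mq_ops = false };
    struct md_model md = { .type = MODEL_REQUEST_BASED, .queue = &dm_queue };
    struct clone_model unmapped = { .q = NULL };

    assert(pick_free_action(&md, &unmapped, false) == FREE_CLONE_REQUEST);
}
```

Passing must_be_mapped=false corresponds to the dm_unprep_request()
caller, where a NULL ->q is expected; must_be_mapped=true corresponds
to dm_end_request(), where it is not.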
