* [PATCH] Kernel bug triggered in multipath
From: Hannes Reinecke @ 2014-03-14 11:13 UTC
To: James Bottomley
Cc: Mike Snitzer, Christoph Hellwig, linux-scsi, Hannes Reinecke
Starting multipath on a cciss device will cause a kernel
warning to be triggered. Problem is that we're using the
->queuedata field of the request_queue to dereference the
scsi device; however, for other (non-SCSI) devices this
points to a totally different structure.
So we should rather be using accessors here which make
sure we're only returning valid SCSI device structures.
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
drivers/scsi/device_handler/scsi_dh.c | 6 +++---
drivers/scsi/scsi_lib.c | 11 +++++++++++
include/scsi/scsi_device.h | 1 +
3 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/device_handler/scsi_dh.c b/drivers/scsi/device_handler/scsi_dh.c
index 33e422e..b89f204 100644
--- a/drivers/scsi/device_handler/scsi_dh.c
+++ b/drivers/scsi/device_handler/scsi_dh.c
@@ -388,7 +388,7 @@ int scsi_dh_activate(struct request_queue *q, activate_complete fn, void *data)
 	struct device *dev = NULL;
 
 	spin_lock_irqsave(q->queue_lock, flags);
-	sdev = q->queuedata;
+	sdev = scsi_device_from_queue(q);
 	if (!sdev) {
 		spin_unlock_irqrestore(q->queue_lock, flags);
 		err = SCSI_DH_NOSYS;
@@ -484,7 +484,7 @@ int scsi_dh_attach(struct request_queue *q, const char *name)
 		return -EINVAL;
 
 	spin_lock_irqsave(q->queue_lock, flags);
-	sdev = q->queuedata;
+	sdev = scsi_device_from_queue(q);
 	if (!sdev || !get_device(&sdev->sdev_gendev))
 		err = -ENODEV;
 	spin_unlock_irqrestore(q->queue_lock, flags);
@@ -513,7 +513,7 @@ void scsi_dh_detach(struct request_queue *q)
 	struct scsi_device_handler *scsi_dh = NULL;
 
 	spin_lock_irqsave(q->queue_lock, flags);
-	sdev = q->queuedata;
+	sdev = scsi_device_from_queue(q);
 	if (!sdev || !get_device(&sdev->sdev_gendev))
 		sdev = NULL;
 	spin_unlock_irqrestore(q->queue_lock, flags);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 7bd7f0d..12184a4 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1668,6 +1668,17 @@ out:
 	spin_lock_irq(q->queue_lock);
 }
 
+struct scsi_device *scsi_device_from_queue(struct request_queue *q)
+{
+	struct scsi_device *sdev = NULL;
+
+	if (q->request_fn == scsi_request_fn)
+		sdev = q->queuedata;
+
+	return sdev;
+}
+EXPORT_SYMBOL_GPL(scsi_device_from_queue);
+
 u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
 {
 	struct device *host_dev;
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index d65fbec..4d43642 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -325,6 +325,7 @@ extern void starget_for_each_device(struct scsi_target *, void *,
 extern void __starget_for_each_device(struct scsi_target *, void *,
 				      void (*fn)(struct scsi_device *,
 						 void *));
+extern struct scsi_device *scsi_device_from_queue(struct request_queue *);
 
 /* only exposed to implement shost_for_each_device */
 extern struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *,
--
1.7.12.4
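
For readers without a kernel tree at hand, the guard added in scsi_lib.c can be modelled in userspace. This is only a sketch under stated assumptions: the struct layouts are reduced stand-ins rather than the kernel definitions, `cciss_ctlr` and `cciss_request_fn` are hypothetical names for a non-SCSI driver's private data and strategy routine, and `demo_accessor()` is an illustrative helper that is not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

struct request_queue;
typedef void (request_fn_proc)(struct request_queue *q);

/* Reduced stand-in: only the two fields the accessor looks at. */
struct request_queue {
	request_fn_proc *request_fn;	/* strategy routine installed by the owner */
	void *queuedata;		/* owner-private pointer, type depends on owner */
};

struct scsi_device { int id; };
struct cciss_ctlr { char name[8]; };	/* hypothetical non-SCSI private data */

void scsi_request_fn(struct request_queue *q) { (void)q; }
void cciss_request_fn(struct request_queue *q) { (void)q; }	/* hypothetical */

/* Same logic as the patch: trust queuedata only if SCSI owns the queue. */
struct scsi_device *scsi_device_from_queue(struct request_queue *q)
{
	struct scsi_device *sdev = NULL;

	if (q->request_fn == scsi_request_fn)
		sdev = q->queuedata;

	return sdev;
}

/* Illustrative check: one SCSI-owned queue, one queue owned by another driver. */
int demo_accessor(void)
{
	struct scsi_device sd = { .id = 3 };
	struct cciss_ctlr hba = { .name = "cciss0" };
	struct request_queue scsi_q = { scsi_request_fn, &sd };
	struct request_queue cciss_q = { cciss_request_fn, &hba };

	assert(scsi_device_from_queue(&scsi_q) == &sd);
	assert(scsi_device_from_queue(&cciss_q) == NULL);
	return 0;
}
```

The point of comparing `request_fn` is that the queue itself carries no type tag for `queuedata`; the strategy routine is the one field that identifies the owner, so a foreign queue yields NULL and the callers in scsi_dh.c take their existing error paths.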
* Re: [PATCH] Kernel bug triggered in multipath
From: Christoph Hellwig @ 2014-03-14 11:15 UTC
To: Hannes Reinecke
Cc: James Bottomley, Mike Snitzer, Christoph Hellwig, linux-scsi
On Fri, Mar 14, 2014 at 12:13:52PM +0100, Hannes Reinecke wrote:
> Starting multipath on a cciss device will cause a kernel
> warning to be triggered. Problem is that we're using the
> ->queuedata field of the request_queue to dereference the
> scsi device; however, for other (non-SCSI) devices this
> points to a totally different structure.
> So we should rather be using accessors here which make
> sure we're only returning valid SCSI device structures.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
Looks reasonable to me as a short term fix. Long term we should stop
calling into scsi-specific code directly from the DM code.
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: Kernel bug triggered in multipath
From: Mike Snitzer @ 2014-03-14 16:21 UTC
To: Christoph Hellwig; +Cc: Hannes Reinecke, James Bottomley, linux-scsi, dm-devel
On Fri, Mar 14 2014 at 7:15am -0400,
Christoph Hellwig <hch@infradead.org> wrote:
> On Fri, Mar 14, 2014 at 12:13:52PM +0100, Hannes Reinecke wrote:
> > Starting multipath on a cciss device will cause a kernel
> > warning to be triggered. Problem is that we're using the
> > ->queuedata field of the request_queue to dereference the
> > scsi device; however, for other (non-SCSI) devices this
> > points to a totally different structure.
> > So we should rather be using accessors here which make
> > sure we're only returning valid SCSI device structures.
> >
> > Signed-off-by: Hannes Reinecke <hare@suse.de>
>
> Looks reasonable to me as a short term fix. Long term we should stop
> calling into scsi-specific code directly from the DM code.
DM multipath has a role in ensuring the desired scsi_dh is attached and
that it holds a reference on the attached scsi_dh.
I'm open to ideas of how dm-multipath could avoid having _any_ role here,
but it isn't so simple to avoid: dm-multipath does 3 things in this
area (ranging from lightest to heaviest relative to scsi_dh interface use):
1) get reference on scsi_dh that is already attached -- most widely used
now that the scsi_dh matching code has been improved to get correct
scsi_dh attached during scsi device scan
2) no scsi_dh was attached, but one should be -- really shouldn't happen
anymore
3) switch from the scsi_dh that was auto-attached by scsi_dh matching to
some user-specified override -- shouldn't be needed now but a user may
have a custom scsi_dh they've developed.
I have no problem with this patch, added safety-net and all, but
bottom line: if scsi_dh interfaces were being called against a DM
multipath request_queue that is a bug. In practice that never happens
in supported configurations. AFAICT, Hannes just stumbled upon it because
he was trying to get cciss working with dm-multipath.
Acked-by: Mike Snitzer <snitzer@redhat.com>
* Re: Kernel bug triggered in multipath
From: Mike Snitzer @ 2014-03-14 16:26 UTC
To: Christoph Hellwig; +Cc: Hannes Reinecke, James Bottomley, linux-scsi, dm-devel
On Fri, Mar 14 2014 at 12:21pm -0400,
Mike Snitzer <snitzer@redhat.com> wrote:
> I have no problem with this patch, added safety-net and all, but
> bottom line: if scsi_dh interfaces were being called against a DM
> multipath request_queue that is a bug.
Sorry, s/DM multipath request_queue/non-SCSI request_queue/
> In practice that never happens
> in supported configurations. AFAICT, Hannes just stumbled upon it because
> he was trying to get cciss working with dm-multipath.
>
> Acked-by: Mike Snitzer <snitzer@redhat.com>
* Re: Kernel bug triggered in multipath
From: Christoph Hellwig @ 2014-03-15 13:23 UTC
To: Mike Snitzer
Cc: Christoph Hellwig, Hannes Reinecke, James Bottomley, linux-scsi,
dm-devel
On Fri, Mar 14, 2014 at 12:26:15PM -0400, Mike Snitzer wrote:
> On Fri, Mar 14 2014 at 12:21pm -0400,
> Mike Snitzer <snitzer@redhat.com> wrote:
>
> > I have no problem with this patch, added safety-net and all, but
> > bottom line: if scsi_dh interfaces were being called against a DM
> > multipath request_queue that is a bug.
>
> Sorry, s/DM multipath request_queue/non-SCSI request_queue/
>
> > In practice that never happens
> > in supported configurations. AFAICT, Hannes just stumbled upon it because
> > he was trying to get cciss working with dm-multipath.
What prevents an admin from trying to configure a path selector for a
non-scsi device currently?
* Re: Kernel bug triggered in multipath
From: Christoph Hellwig @ 2014-03-15 13:22 UTC
To: Mike Snitzer
Cc: Christoph Hellwig, Hannes Reinecke, James Bottomley, linux-scsi,
dm-devel
On Fri, Mar 14, 2014 at 12:21:11PM -0400, Mike Snitzer wrote:
> DM multipath has a role in ensuring the desired scsi_dh is attached and
> that it holds a reference on the attached scsi_dh.
>
> I'm open to ideas of how dm-multipath could avoid having _any_ role here,
> but it isn't so simple to avoid: dm-multipath does 3 things in this
> area (ranging from lightest to heaviest relative to scsi_dh interface use):
> 1) get reference on scsi_dh that is already attached -- most widely used
> now that the scsi_dh matching code has been improved to get correct
> scsi_dh attached during scsi device scan
> 2) no scsi_dh was attached, but one should be -- really shouldn't happen
> anymore
> 3) switch from the scsi_dh that was auto-attached by scsi_dh matching to
> some user-specified override -- shouldn't be needed now but a user may
> have a custom scsi_dh they've developed.
What we currently have surely is a bit of a layering violation. I don't
think it's urgent or overly important to fix it now, but I see two ways
out:
a) move the handler registration to dm-multipath. This still leaves
the problem of figuring out if a handler supports a device, but at
least that problem is inside the handlers now.
b) move the path selectors to the block layer, and have the methods
provided indirect off the request_queue.
b) seems like the cleanest layering, although in the case of SCSI (the
only one that matters at the moment) that would give us a double
indirection.
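
To make option b) and the "double indirection" remark concrete, here is a rough userspace sketch. Everything in it is an assumption made for illustration: `struct path_ops`, `queue_activate_path()`, `struct scsi_dh_ops`, the `PH_*` return codes and the demo helpers are hypothetical names, not existing kernel interfaces, and the structs are reduced stand-ins.

```c
#include <assert.h>
#include <stddef.h>

enum { PH_OK = 0, PH_NOSYS = 1 };	/* hypothetical return codes */

struct request_queue;

/* Hypothetical per-queue operations table hung off the request_queue. */
struct path_ops {
	int (*activate)(struct request_queue *q);
};

struct request_queue {
	const struct path_ops *path_ops;	/* NULL: owner offers no path handling */
	void *queuedata;
};

/* What a multipath caller would use: no SCSI types, no look at queuedata. */
int queue_activate_path(struct request_queue *q)
{
	if (!q->path_ops || !q->path_ops->activate)
		return PH_NOSYS;
	return q->path_ops->activate(q);	/* first indirection */
}

/* SCSI side: only this code knows that queuedata is a scsi_device. */
struct scsi_device;
struct scsi_dh_ops {
	int (*activate)(struct scsi_device *sdev);
};
struct scsi_device {
	const struct scsi_dh_ops *dh;	/* attached device handler, may be NULL */
	int active;
};

int scsi_path_activate(struct request_queue *q)
{
	struct scsi_device *sdev = q->queuedata;

	if (!sdev->dh)
		return PH_NOSYS;
	return sdev->dh->activate(sdev);	/* second indirection */
}

const struct path_ops scsi_path_ops = { .activate = scsi_path_activate };

/* Illustrative handler and check. */
int demo_dh_activate(struct scsi_device *sdev) { sdev->active = 1; return PH_OK; }
const struct scsi_dh_ops demo_dh = { .activate = demo_dh_activate };

int demo_indirection(void)
{
	struct scsi_device sd = { .dh = &demo_dh, .active = 0 };
	struct request_queue scsi_q = { &scsi_path_ops, &sd };
	struct request_queue other_q = { NULL, NULL };

	assert(queue_activate_path(&scsi_q) == PH_OK && sd.active == 1);
	assert(queue_activate_path(&other_q) == PH_NOSYS);
	return 0;
}
```

In this arrangement a queue that does not belong to SCSI simply has no operations table, so the caller gets a clean "not supported" result without ever interpreting `queuedata`. The cost is the extra hop for SCSI: the queue-level method forwards to the device handler attached to the scsi_device.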