* [PATCH v3][SCSI] scsi_dh: propagate SCSI device deletion
[not found] ` <D8C50530D6022F40A817A35C40CC06A7064AFFD087@DUBX7MCDUB01.EMEA.DELL.COM>
@ 2010-12-16 19:57 ` Mike Snitzer
2010-12-17 15:46 ` Moger, Babu
0 siblings, 1 reply; 2+ messages in thread
From: Mike Snitzer @ 2010-12-16 19:57 UTC (permalink / raw)
To: linux-scsi; +Cc: Menny_Hamburger, Itay_Dar, Rob_Thomas, dm-devel
From: Menny Hamburger <Menny_Hamburger@Dell.com>
Currently, when scsi_dh_activate() returns with an error
(e.g. SCSI_DH_NOSYS) the activate_complete callback is not called and
the error is not propagated to DM mpath.
When a SCSI device attached to a device handler is deleted, userland
processes currently performing I/O on the device will have their I/O
hang forever.
- Set SCSI_DH_NOSYS error when the handler is in the process of being
deleted (e.g. the SCSI device is in a SDEV_CANCEL or SDEV_DEL state).
- Set SCSI_DH_DEV_OFFLINED error when device is in SDEV_OFFLINE state.
- Call the activate_complete callback function directly from
scsi_dh_activate if an error has been set (when either the scsi_dh
internal data has already been deleted or is in the process of being
deleted).
The patch was tested in an iSCSI environment, RDAC H/W handler and
multipath. In the following reproduction process, dd will I/O hang
forever and the only way to release it will be to reboot the machine:
1) Perform I/O on a multipath device:
dd if=/dev/dm-0 of=/dev/zero bs=8k count=1000000 &
2) Delete all slave SCSI devices contained in the mpath device:
I) In an iSCSI environment, the easiest way to do this is by
stopping iSCSI:
/etc/init.d/iscsi stop
II) Another way to delete the devices is by applying the following
bash scriptlet:
dm_devs=$(ls /sys/block/ | grep dm- | xargs)
for dm_dev in $dm_devs; do
devices=$(ls /sys/block/$dm_dev/slaves)
for device in $devices; do
echo 1 > /sys/block/$device/device/delete
done
done
NOTE: when DM mpath's fail_path uses blk_abort_queue this scsi_dh change
isn't strictly required. However, DM mpath's call to blk_abort_queue
will soon be reverted because it has proven to be unsafe due to a race
(between blk_abort_queue and scsi_request_fn) that can lead to list
corruption. Therefore we cannot rely on blk_abort_queue via fail_path,
but even if we could this scsi_dh change is still preferrable.
Signed-off-by: Menny Hamburger <Menny_Hamburger@Dell.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
drivers/scsi/device_handler/scsi_dh.c | 11 +++++++++--
1 files changed, 9 insertions(+), 2 deletions(-)
v1 and v2 were posted/discussed on dm-devel
v3: just tweaked the patch header a bit
diff --git a/drivers/scsi/device_handler/scsi_dh.c b/drivers/scsi/device_handler/scsi_dh.c
index 6fae3d2..b0c56f6 100644
--- a/drivers/scsi/device_handler/scsi_dh.c
+++ b/drivers/scsi/device_handler/scsi_dh.c
@@ -442,12 +442,19 @@ int scsi_dh_activate(struct request_queue *q, activate_complete fn, void *data)
sdev = q->queuedata;
if (sdev && sdev->scsi_dh_data)
scsi_dh = sdev->scsi_dh_data->scsi_dh;
- if (!scsi_dh || !get_device(&sdev->sdev_gendev))
+ if (!scsi_dh || !get_device(&sdev->sdev_gendev) ||
+ sdev->sdev_state == SDEV_CANCEL ||
+ sdev->sdev_state == SDEV_DEL)
err = SCSI_DH_NOSYS;
+ if (sdev->sdev_state == SDEV_OFFLINE)
+ err = SCSI_DH_DEV_OFFLINED;
spin_unlock_irqrestore(q->queue_lock, flags);
- if (err)
+ if (err) {
+ if (fn)
+ fn(data, err);
return err;
+ }
if (scsi_dh->activate)
err = scsi_dh->activate(sdev, fn, data);
^ permalink raw reply related [flat|nested] 2+ messages in thread
* RE: [PATCH v3][SCSI] scsi_dh: propagate SCSI device deletion
2010-12-16 19:57 ` [PATCH v3][SCSI] scsi_dh: propagate SCSI device deletion Mike Snitzer
@ 2010-12-17 15:46 ` Moger, Babu
0 siblings, 0 replies; 2+ messages in thread
From: Moger, Babu @ 2010-12-17 15:46 UTC (permalink / raw)
To: Mike Snitzer, linux-scsi@vger.kernel.org
Cc: Menny_Hamburger@dell.com, Itay_Dar@dell.com, Rob_Thomas@dell.com,
dm-devel@redhat.com
Patches look good..
Reviewed-by: Babu Moger <babu.moger@lsi.com>
> -----Original Message-----
> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
> owner@vger.kernel.org] On Behalf Of Mike Snitzer
> Sent: Thursday, December 16, 2010 1:57 PM
> To: linux-scsi@vger.kernel.org
> Cc: Menny_Hamburger@dell.com; Itay_Dar@dell.com; Rob_Thomas@dell.com; dm-
> devel@redhat.com
> Subject: [PATCH v3][SCSI] scsi_dh: propagate SCSI device deletion
>
> From: Menny Hamburger <Menny_Hamburger@Dell.com>
>
> Currently, when scsi_dh_activate() returns with an error
> (e.g. SCSI_DH_NOSYS) the activate_complete callback is not called and
> the error is not propagated to DM mpath.
>
> When a SCSI device attached to a device handler is deleted, userland
> processes currently performing I/O on the device will have their I/O
> hang forever.
>
> - Set SCSI_DH_NOSYS error when the handler is in the process of being
> deleted (e.g. the SCSI device is in a SDEV_CANCEL or SDEV_DEL state).
>
> - Set SCSI_DH_DEV_OFFLINED error when device is in SDEV_OFFLINE state.
>
> - Call the activate_complete callback function directly from
> scsi_dh_activate if an error has been set (when either the scsi_dh
> internal data has already been deleted or is in the process of being
> deleted).
>
> The patch was tested in an iSCSI environment, RDAC H/W handler and
> multipath. In the following reproduction process, dd will I/O hang
> forever and the only way to release it will be to reboot the machine:
> 1) Perform I/O on a multipath device:
> dd if=/dev/dm-0 of=/dev/zero bs=8k count=1000000 &
> 2) Delete all slave SCSI devices contained in the mpath device:
> I) In an iSCSI environment, the easiest way to do this is by
> stopping iSCSI:
> /etc/init.d/iscsi stop
> II) Another way to delete the devices is by applying the following
> bash scriptlet:
> dm_devs=$(ls /sys/block/ | grep dm- | xargs)
> for dm_dev in $dm_devs; do
> devices=$(ls /sys/block/$dm_dev/slaves)
> for device in $devices; do
> echo 1 > /sys/block/$device/device/delete
> done
> done
>
> NOTE: when DM mpath's fail_path uses blk_abort_queue this scsi_dh change
> isn't strictly required. However, DM mpath's call to blk_abort_queue
> will soon be reverted because it has proven to be unsafe due to a race
> (between blk_abort_queue and scsi_request_fn) that can lead to list
> corruption. Therefore we cannot rely on blk_abort_queue via fail_path,
> but even if we could this scsi_dh change is still preferrable.
>
> Signed-off-by: Menny Hamburger <Menny_Hamburger@Dell.com>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
> drivers/scsi/device_handler/scsi_dh.c | 11 +++++++++--
> 1 files changed, 9 insertions(+), 2 deletions(-)
>
> v1 and v2 were posted/discussed on dm-devel
> v3: just tweaked the patch header a bit
>
> diff --git a/drivers/scsi/device_handler/scsi_dh.c
> b/drivers/scsi/device_handler/scsi_dh.c
> index 6fae3d2..b0c56f6 100644
> --- a/drivers/scsi/device_handler/scsi_dh.c
> +++ b/drivers/scsi/device_handler/scsi_dh.c
> @@ -442,12 +442,19 @@ int scsi_dh_activate(struct request_queue *q,
> activate_complete fn, void *data)
> sdev = q->queuedata;
> if (sdev && sdev->scsi_dh_data)
> scsi_dh = sdev->scsi_dh_data->scsi_dh;
> - if (!scsi_dh || !get_device(&sdev->sdev_gendev))
> + if (!scsi_dh || !get_device(&sdev->sdev_gendev) ||
> + sdev->sdev_state == SDEV_CANCEL ||
> + sdev->sdev_state == SDEV_DEL)
> err = SCSI_DH_NOSYS;
> + if (sdev->sdev_state == SDEV_OFFLINE)
> + err = SCSI_DH_DEV_OFFLINED;
> spin_unlock_irqrestore(q->queue_lock, flags);
>
> - if (err)
> + if (err) {
> + if (fn)
> + fn(data, err);
> return err;
> + }
>
> if (scsi_dh->activate)
> err = scsi_dh->activate(sdev, fn, data);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2010-12-17 15:46 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <D8C50530D6022F40A817A35C40CC06A7064AFC9450@DUBX7MCDUB01.EMEA.DELL.COM>
[not found] ` <20101215160952.GA20869@redhat.com>
[not found] ` <D8C50530D6022F40A817A35C40CC06A7064AFFCA3E@DUBX7MCDUB01.EMEA.DELL.COM>
[not found] ` <20101216140209.GA23507@redhat.com>
[not found] ` <D8C50530D6022F40A817A35C40CC06A7064AFFCF09@DUBX7MCDUB01.EMEA.DELL.COM>
[not found] ` <20101216152930.GB23507@redhat.com>
[not found] ` <D8C50530D6022F40A817A35C40CC06A7064AFFD030@DUBX7MCDUB01.EMEA.DELL.COM>
[not found] ` <20101216162951.GE23507@redhat.com>
[not found] ` <D8C50530D6022F40A817A35C40CC06A7064AFFD087@DUBX7MCDUB01.EMEA.DELL.COM>
2010-12-16 19:57 ` [PATCH v3][SCSI] scsi_dh: propagate SCSI device deletion Mike Snitzer
2010-12-17 15:46 ` Moger, Babu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox