* [PATCH 1/1] scsi_dh: add ALUA notification for EMC Clariion devices
@ 2008-09-23 14:46 Levy_Jerome
2008-09-23 15:57 ` Matthew Wilcox
2008-10-13 17:41 ` [PATCH 1/1] scsi_dh: fix boot oops with EMC Clariion Levy_Jerome
0 siblings, 2 replies; 9+ messages in thread
From: Levy_Jerome @ 2008-09-23 14:46 UTC
To: linux-scsi
Adding sense code data decode and notification for optimal/non-optimal
path changeover on Clariion devices. Unfortunately in the read sense
code we can't do another inquiry, so we can't tell the user whether we
are on the optimal or non-optimal path, only that a change has occurred.
Signed-off-by: Jerry Levy <levy_jerome@emc.com>
---------------------
--- ./drivers/scsi/device_handler/scsi_dh_emc.original.c 2008-09-17 14:50:18.000000000 -0400
+++ ./drivers/scsi/device_handler/scsi_dh_emc.c 2008-09-17 14:54:38.000000000 -0400
@@ -435,6 +435,16 @@
return SUCCESS;
break;
case UNIT_ATTENTION:
+ if (sense_hdr->asc == 0x2A && sense_hdr->ascq == 0x06)
+ /*
+ * ALUA status has changed. Report to host,
+ * no further action required... jml
+ */
+ sdev_printk(KERN_NOTICE, sdev,
+ "%s: Asymmetric access state has changed.\n",
+ CLARIION_NAME);
+ return SUCCESS;
+ break;
if (sense_hdr->asc == 0x29 && sense_hdr->ascq == 0x00)
/*
* Unit Attention Code. This is the first IO
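
Editorial note on the hunk above: as posted, the new `if` has no braces, so in C it controls only the sdev_printk() statement (the comment in between does not count as a statement). The `return SUCCESS;` that follows would then run for every UNIT_ATTENTION, and the existing 0x29/0x00 check would never be reached. A minimal user-space sketch of the braced control flow follows. Every `model_` name is hypothetical, the notice is modeled as a counter rather than a real sdev_printk() call, and the result of the 0x29/0x00 branch is a placeholder because the thread does not show what the real branch returns.

```c
/* Hypothetical stand-ins for the kernel's return codes. */
enum model_result {
	MODEL_SUCCESS,          /* stands in for SUCCESS */
	MODEL_EXISTING_BRANCH,  /* placeholder for whatever the 0x29/0x00 branch returns */
	MODEL_UNHANDLED         /* fell through without a match */
};

/* Only the two fields the hunk looks at. */
struct model_sense_hdr {
	unsigned char asc;
	unsigned char ascq;
};

/* Counts how many times the "state has changed" notice would be logged. */
static int model_notices;

static enum model_result
model_unit_attention(const struct model_sense_hdr *sense_hdr)
{
	if (sense_hdr->asc == 0x2A && sense_hdr->ascq == 0x06) {
		/* ALUA state changed: log once and report success. */
		model_notices++;
		return MODEL_SUCCESS;
	}
	if (sense_hdr->asc == 0x29 && sense_hdr->ascq == 0x00) {
		/* With the braces above, this check is still reachable. */
		return MODEL_EXISTING_BRANCH;
	}
	return MODEL_UNHANDLED;
}
```

With the braces in place, only the 0x2A/0x06 case logs and returns early; other unit attentions continue to the checks that were already there.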
* Re: [PATCH 1/1] scsi_dh: add ALUA notification for EMC Clariion devices
2008-09-23 14:46 [PATCH 1/1] scsi_dh: add ALUA notification for EMC Clariion devices Levy_Jerome
@ 2008-09-23 15:57 ` Matthew Wilcox
2008-09-23 16:04 ` Levy_Jerome
2008-10-13 17:41 ` [PATCH 1/1] scsi_dh: fix boot oops with EMC Clariion Levy_Jerome
1 sibling, 1 reply; 9+ messages in thread
From: Matthew Wilcox @ 2008-09-23 15:57 UTC
To: Levy_Jerome; +Cc: linux-scsi
On Tue, Sep 23, 2008 at 10:46:18AM -0400, Levy_Jerome@emc.com wrote:
> Adding sense code data decode and notification for optimal/non-optimal
> path changeover on Clariion devices. Unfortunately in the read sense
> code we can't do another inquiry, so we can't tell the user whether we
> are on the optimal or non-optimal path, only that a change has occurred.
I'm not sure that printk is the optimal user notification here.
What if we had a uevent so that udev could take action, such as issuing
an inquiry?
What if we scheduled some work so we could issue an inquiry and take
appropriate action?
> case UNIT_ATTENTION:
> + if (sense_hdr->asc == 0x2A && sense_hdr->ascq == 0x06)
> + /*
> + * ALUA status has changed. Report to host,
> + * no further action required... jml
> + */
> + sdev_printk(KERN_NOTICE, sdev,
> + "%s: Asymmetric access state has changed.\n",
> + CLARIION_NAME);
> + return SUCCESS;
> + break;
> if (sense_hdr->asc == 0x29 && sense_hdr->ascq == 0x00)
--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* RE: [PATCH 1/1] scsi_dh: add ALUA notification for EMC Clariion devices
2008-09-23 15:57 ` Matthew Wilcox
@ 2008-09-23 16:04 ` Levy_Jerome
0 siblings, 0 replies; 9+ messages in thread
From: Levy_Jerome @ 2008-09-23 16:04 UTC
To: matthew; +Cc: linux-scsi
There's no real action to take -- the array will handle the condition
properly. This is just an informative message. It _would_ be nice if we
displayed the actual status, and I could do that in a uevent. Is it
reasonable to run one in the context of a sense-code routine or are we
better off keeping it fast and simple?
The other thing I was thinking of was merely setting a flag and letting
dm-multipath do the reporting later on. I'd be open to either
approach... or better ideas...
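
Editorial sketch of the "set a flag, report later" idea, which has the same shape as the scheduled-work suggestion: the sense path only records that a change happened, and a slower context picks it up afterwards. This is a user-space model with C11 atomics, not kernel code. Every `model_` name is hypothetical, and the slow path only counts reports where a real handler might issue an inquiry.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct model_path {
	atomic_bool alua_changed; /* set by the fast path, consumed by the slow path */
	int reports;              /* how many times the slow path reported a change */
};

/* Fast path: called while decoding sense data; does no I/O and cannot block. */
static void model_note_alua_change(struct model_path *p)
{
	atomic_store(&p->alua_changed, true);
}

/*
 * Slow path: run later from a context that is allowed to sleep.
 * Returns true if a change was pending; the flag is cleared atomically,
 * so several notifications before one poll collapse into a single report.
 */
static bool model_report_pending(struct model_path *p)
{
	if (!atomic_exchange(&p->alua_changed, false))
		return false;
	/* A real handler could query the device here and report the actual state. */
	p->reports++;
	return true;
}
```

The design choice being illustrated is only the split: the sense routine stays fast and simple, and the decision about how to report (message, uevent, or multipath-level reporting) moves to the later step.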
Jerry Levy, Engineer IV
EMC Global Services Technical Support, PREM ISG
AMER Host Systems Software - UNIX
* [PATCH 1/1] scsi_dh: fix boot oops with EMC Clariion
2008-09-23 14:46 [PATCH 1/1] scsi_dh: add ALUA notification for EMC Clariion devices Levy_Jerome
2008-09-23 15:57 ` Matthew Wilcox
@ 2008-10-13 17:41 ` Levy_Jerome
2008-10-17 17:34 ` James Bottomley
1 sibling, 1 reply; 9+ messages in thread
From: Levy_Jerome @ 2008-10-13 17:41 UTC
To: linux-scsi
Patch to fix intermittent but frequent oops occurring with multiple
paths to Clariion at boot.
Cause is uninitialized rq->flags variable which presents garbage from
clariion_activate.
Signed-off-by: Jerry Levy <levy_jerome@emc.com>
---------------------
--- drivers/scsi/device_handler/scsi_dh_emc.orig 2008-10-13 13:33:35.000000000 -0400
+++ drivers/scsi/device_handler/scsi_dh_emc.c 2008-10-09 16:20:15.000000000 -0400
@@ -283,6 +283,7 @@
memset(rq->cmd, 0, BLK_MAX_CDB);
rq->cmd[0] = cmd;
rq->cmd_len = COMMAND_SIZE(cmd);
+ rq->flags = 0;
switch (cmd) {
case MODE_SELECT:
* Re: [PATCH 1/1] scsi_dh: fix boot oops with EMC Clariion
2008-10-13 17:41 ` [PATCH 1/1] scsi_dh: fix boot oops with EMC Clariion Levy_Jerome
@ 2008-10-17 17:34 ` James Bottomley
2008-10-17 18:17 ` Levy_Jerome
0 siblings, 1 reply; 9+ messages in thread
From: James Bottomley @ 2008-10-17 17:34 UTC
To: Levy_Jerome; +Cc: linux-scsi
On Mon, 2008-10-13 at 13:41 -0400, Levy_Jerome@emc.com wrote:
> Patch to fix intermittent but frequent oops occurring with multiple
> paths to Clariion at boot.
> Cause is uninitialized rq->flags variable which presents garbage from
> clariion_activate.
There are several syntactic problems with the patch: It's against a
pretty old kernel (cmd has been a pointer for a while, so the memset
went away). It's also whitespace damaged (all tabs have become spaces).
However, I also don't quite understand why this is necessary. All calls
to blk_get_request() ultimately end up in blk_rq_init() which does a
memset(rq, 0, sizeof(*rq)) which should clear flags. How is it getting
bogus data?
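
Editorial sketch of the difference under discussion: clearing only the CDB leaves other fields with whatever the memory held before, while zeroing the whole request also clears the flags. This is a user-space model, not the block layer. Every `model_` name and the flag bit are hypothetical, and stale memory is simulated with an explicit poison pattern so the example never reads an uninitialized value.

```c
#include <string.h>

#define MODEL_MAX_CDB 16      /* stand-in for BLK_MAX_CDB */
#define MODEL_FLAG_BIT 0x1u   /* hypothetical flag OR-ed in by the handler */

struct model_request {
	unsigned char cmd[MODEL_MAX_CDB];
	unsigned int cmd_len;
	unsigned int flags;
};

/* Simulate a recycled allocation: every byte holds stale data. */
static void model_poison(struct model_request *rq)
{
	memset(rq, 0xA5, sizeof(*rq));
}

/* Pattern from the patch context: only the CDB is cleared before use. */
static unsigned int model_prepare_cdb_only(struct model_request *rq,
					   unsigned char opcode)
{
	memset(rq->cmd, 0, sizeof(rq->cmd));
	rq->cmd[0] = opcode;
	rq->cmd_len = 6;
	rq->flags |= MODEL_FLAG_BIT; /* stale bits in flags survive the OR */
	return rq->flags;
}

/* Generic pattern: the whole request is zeroed first, flags included. */
static unsigned int model_prepare_full_init(struct model_request *rq,
					    unsigned char opcode)
{
	memset(rq, 0, sizeof(*rq));
	rq->cmd[0] = opcode;
	rq->cmd_len = 6;
	rq->flags |= MODEL_FLAG_BIT; /* flags now holds exactly this bit */
	return rq->flags;
}
```

In this model the first variant returns the poison pattern combined with the flag bit, and the second returns only the flag bit, which is the behavior a whole-structure memset provides.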
James
> Signed-off-by: Jerry Levy <levy_jerome@emc.com>
> ---------------------
> --- drivers/scsi/device_handler/scsi_dh_emc.orig 2008-10-13 13:33:35.000000000 -0400
> +++ drivers/scsi/device_handler/scsi_dh_emc.c 2008-10-09 16:20:15.000000000 -0400
> @@ -283,6 +283,7 @@
> memset(rq->cmd, 0, BLK_MAX_CDB);
> rq->cmd[0] = cmd;
> rq->cmd_len = COMMAND_SIZE(cmd);
> + rq->flags = 0;
>
> switch (cmd) {
> case MODE_SELECT:
* RE: [PATCH 1/1] scsi_dh: fix boot oops with EMC Clariion
2008-10-17 17:34 ` James Bottomley
@ 2008-10-17 18:17 ` Levy_Jerome
2008-10-17 18:33 ` James Bottomley
0 siblings, 1 reply; 9+ messages in thread
From: Levy_Jerome @ 2008-10-17 18:17 UTC
To: James.Bottomley; +Cc: linux-scsi
The change was the addition of rq->flags = 0; the memset isn't mine.
Sorry about the whitespace -- I'm still a bit new at this.
As to why it's necessary, I've had boot-time oopses on two completely
different hosts -- one iSCSI, one FC -- which both resolved to exactly
the same code; bizarre values in rq->flags. The source seems to OR the
desired values in but never actually initializes rq->flags (the memset
initializes the CDB, not the flags variable), so I added the line to
do so. After testing the old module to confirm the oops still occurred
regularly, I installed the new code and have since (in over 100
reboots) been unable to reproduce the oops.
Jerry
* RE: [PATCH 1/1] scsi_dh: fix boot oops with EMC Clariion
2008-10-17 18:17 ` Levy_Jerome
@ 2008-10-17 18:33 ` James Bottomley
2008-10-17 18:35 ` Levy_Jerome
2008-10-17 18:37 ` Ric Wheeler
0 siblings, 2 replies; 9+ messages in thread
From: James Bottomley @ 2008-10-17 18:33 UTC
To: Levy_Jerome; +Cc: linux-scsi
On Fri, 2008-10-17 at 14:17 -0400, Levy_Jerome@emc.com wrote:
> The change was the addition of rq->flags = 0; the memset isn't mine.
> Sorry about the whitespace -- I'm still a bit new at this.
That's OK.
Documentation/SubmittingPatches
Documentation/email-clients.txt
contain useful information.
> As to why it's necessary, I've had boot-time oopses on two completely
> different hosts -- one iSCSI, one FC -- which both resolved to exactly
> the same code; bizarre values in rq->flags. The source seems to OR the
> desired values in but never actually initializes rq->flags (the memset
> initializes the CDB, not the flags variable), so I added the line to
> do so. After testing the old module to confirm the oops still occurred
> regularly, I installed the new code and have since (in over 100
> reboots) been unable to reproduce the oops.
No, my point is that this was fixed by a memset to zero of the request
in blk_rq_init() in 2.6.26 (so it fixed every other problem, not just
the one in dm_emc). I think the kernel you're testing is too old to see
the generic fix (based on what the diff contained).
James
* RE: [PATCH 1/1] scsi_dh: fix boot oops with EMC Clariion
2008-10-17 18:33 ` James Bottomley
@ 2008-10-17 18:35 ` Levy_Jerome
2008-10-17 18:37 ` Ric Wheeler
1 sibling, 0 replies; 9+ messages in thread
From: Levy_Jerome @ 2008-10-17 18:35 UTC
To: James.Bottomley; +Cc: linux-scsi
Ah! Fair enough. I was working off of .105, so I'll check for a newer
release. I've seen something very similar in scsi_dh_alua; perhaps that
is fixed too.
Thanks...
Jerry
* Re: [PATCH 1/1] scsi_dh: fix boot oops with EMC Clariion
2008-10-17 18:33 ` James Bottomley
2008-10-17 18:35 ` Levy_Jerome
@ 2008-10-17 18:37 ` Ric Wheeler
1 sibling, 0 replies; 9+ messages in thread
From: Ric Wheeler @ 2008-10-17 18:37 UTC
To: James Bottomley; +Cc: Levy_Jerome, linux-scsi, berthiaume_wayne
James Bottomley wrote:
> On Fri, 2008-10-17 at 14:17 -0400, Levy_Jerome@emc.com wrote:
>
>> The change was the addition of rq->flags = 0; the memset isn't mine.
>> Sorry about the whitespace -- I'm still a bit new at this.
>>
>
> That's OK.
>
> Documentation/SubmittingPatches
> Documentation/email-clients.txt
>
> contain useful information.
>
>
>> As to why it's necessary, I've had boot-time oopses on two completely
>> different hosts -- one iSCSI, one FC -- which both resolved to exactly
>> the same code; bizarre values in rq->flags. The source seems to OR the
>> desired values in but never actually initializes rq->flags (the memset
>> initializes the CDB, not the flags variable), so I added the line to
>> do so. After testing the old module to confirm the oops still occurred
>> regularly, I installed the new code and have since (in over 100
>> reboots) been unable to reproduce the oops.
>>
>
> No, my point is that this was fixed by a memset to zero of the request
> in blk_rq_init() in 2.6.26 (so it fixed every other problem, not just
> the one in dm_emc). I think the kernel you're testing is too old to see
> the generic fix (based on what the diff contained).
>
> James
>
Of course, if you see this with a vendor specific (older) kernel, you
can follow up with the vendor and log a bugzilla ticket for that kernel.
Ric
Thread overview: 9+ messages
2008-09-23 14:46 [PATCH 1/1] scsi_dh: add ALUA notification for EMC Clariion devices Levy_Jerome
2008-09-23 15:57 ` Matthew Wilcox
2008-09-23 16:04 ` Levy_Jerome
2008-10-13 17:41 ` [PATCH 1/1] scsi_dh: fix boot oops with EMC Clariion Levy_Jerome
2008-10-17 17:34 ` James Bottomley
2008-10-17 18:17 ` Levy_Jerome
2008-10-17 18:33 ` James Bottomley
2008-10-17 18:35 ` Levy_Jerome
2008-10-17 18:37 ` Ric Wheeler