* [PATCH 2/2] scsi : fixing the new host byte settings (DID_TARGET_FAILURE and DID_NEXUS_FAILURE)
@ 2012-01-23 18:43 Moger, Babu
0 siblings, 0 replies; 5+ messages in thread
From: Moger, Babu @ 2012-01-23 18:43 UTC (permalink / raw)
To: linux-scsi@vger.kernel.org
Cc: device-mapper development (dm-devel@redhat.com)
[-- Attachment #1.1: Type: text/plain, Size: 2631 bytes --]
This patch fixes the host byte settings DID_TARGET_FAILURE and DID_NEXUS_FAILURE.
The function __scsi_error_from_host_byte, tries to reset the host byte to DID_OK. But that
does not happen because of the OR operation.
Here is the flow.
scsi_softirq_done-> scsi_decide_disposition -> __scsi_error_from_host_byte
Let's take an example with DID_NEXUS_FAILURE. In scsi_decide_disposition, result will be set as
DID_NEXUS_FAILURE (=0x11). Then in __scsi_error_from_host_byte, when we do OR with
DID_OK. Purpose is to reset it back to DID_OK. But that does not happen. This patch fixes this issue.
Signed-off-by: Babu Moger <babu.moger@netapp.com>
---
diff -uprN -X linux-3.3-rc1/Documentation/dontdiff linux-3.3-rc1//drivers/scsi/scsi_error.c linux-3.3-rc1-new//drivers/scsi/scsi_error.c
--- linux-3.3-rc1//drivers/scsi/scsi_error.c 2012-01-19 17:04:48.000000000 -0600
+++ linux-3.3-rc1-new//drivers/scsi/scsi_error.c 2012-01-23 11:55:26.000000000 -0600
@@ -1540,7 +1540,7 @@ int scsi_decide_disposition(struct scsi_
* Need to modify host byte to signal a
* permanent target failure
*/
- scmd->result |= (DID_TARGET_FAILURE << 16);
+ set_host_byte(scmd, DID_TARGET_FAILURE);
rtn = SUCCESS;
}
/* if rtn == FAILED, we have no sense information;
@@ -1560,7 +1560,7 @@ int scsi_decide_disposition(struct scsi_
case RESERVATION_CONFLICT:
sdev_printk(KERN_INFO, scmd->device,
"reservation conflict\n");
- scmd->result |= (DID_NEXUS_FAILURE << 16);
+ set_host_byte(scmd, DID_NEXUS_FAILURE);
return SUCCESS; /* causes immediate i/o error */
default:
return FAILED;
diff -uprN -X linux-3.3-rc1/Documentation/dontdiff linux-3.3-rc1//drivers/scsi/scsi_lib.c linux-3.3-rc1-new//drivers/scsi/scsi_lib.c
--- linux-3.3-rc1//drivers/scsi/scsi_lib.c 2012-01-19 17:04:48.000000000 -0600
+++ linux-3.3-rc1-new//drivers/scsi/scsi_lib.c 2012-01-23 11:50:25.000000000 -0600
@@ -682,11 +682,11 @@ static int __scsi_error_from_host_byte(s
error = -ENOLINK;
break;
case DID_TARGET_FAILURE:
- cmd->result |= (DID_OK << 16);
+ set_host_byte(cmd, DID_OK);
error = -EREMOTEIO;
break;
case DID_NEXUS_FAILURE:
- cmd->result |= (DID_OK << 16);
+ set_host_byte(cmd, DID_OK);
error = -EBADE;
break;
default:
[-- Attachment #1.2: Type: text/html, Size: 31210 bytes --]
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 2/2] scsi : fixing the new host byte settings (DID_TARGET_FAILURE and DID_NEXUS_FAILURE)
@ 2012-01-24 20:38 Moger, Babu
2012-01-24 23:03 ` Mike Snitzer
0 siblings, 1 reply; 5+ messages in thread
From: Moger, Babu @ 2012-01-24 20:38 UTC (permalink / raw)
To: linux-scsi@vger.kernel.org
Cc: device-mapper development (dm-devel@redhat.com)
Resubmitting as my previous post had format issues and did not go linux-scsi.
This patch fixes the host byte settings DID_TARGET_FAILURE and DID_NEXUS_FAILURE.
The function __scsi_error_from_host_byte, tries to reset the host byte to DID_OK. But that
does not happen because of the OR operation.
Here is the flow.
scsi_softirq_done-> scsi_decide_disposition -> __scsi_error_from_host_byte
Let's take an example with DID_NEXUS_FAILURE. In scsi_decide_disposition, result will be set as
DID_NEXUS_FAILURE (=0x11). Then in __scsi_error_from_host_byte, when we do OR with
DID_OK. Purpose is to reset it back to DID_OK. But that does not happen. This patch fixes this issue.
Signed-off-by: Babu Moger <babu.moger@netapp.com>
---
diff -uprN -X linux-3.3-rc1/Documentation/dontdiff linux-3.3-rc1//drivers/scsi/scsi_error.c linux-3.3-rc1-new//drivers/scsi/scsi_error.c
--- linux-3.3-rc1//drivers/scsi/scsi_error.c 2012-01-19 17:04:48.000000000 -0600
+++ linux-3.3-rc1-new//drivers/scsi/scsi_error.c 2012-01-23 11:55:26.000000000 -0600
@@ -1540,7 +1540,7 @@ int scsi_decide_disposition(struct scsi_
* Need to modify host byte to signal a
* permanent target failure
*/
- scmd->result |= (DID_TARGET_FAILURE << 16);
+ set_host_byte(scmd, DID_TARGET_FAILURE);
rtn = SUCCESS;
}
/* if rtn == FAILED, we have no sense information;
@@ -1560,7 +1560,7 @@ int scsi_decide_disposition(struct scsi_
case RESERVATION_CONFLICT:
sdev_printk(KERN_INFO, scmd->device,
"reservation conflict\n");
- scmd->result |= (DID_NEXUS_FAILURE << 16);
+ set_host_byte(scmd, DID_NEXUS_FAILURE);
return SUCCESS; /* causes immediate i/o error */
default:
return FAILED;
diff -uprN -X linux-3.3-rc1/Documentation/dontdiff linux-3.3-rc1//drivers/scsi/scsi_lib.c linux-3.3-rc1-new//drivers/scsi/scsi_lib.c
--- linux-3.3-rc1//drivers/scsi/scsi_lib.c 2012-01-19 17:04:48.000000000 -0600
+++ linux-3.3-rc1-new//drivers/scsi/scsi_lib.c 2012-01-23 11:50:25.000000000 -0600
@@ -682,11 +682,11 @@ static int __scsi_error_from_host_byte(s
error = -ENOLINK;
break;
case DID_TARGET_FAILURE:
- cmd->result |= (DID_OK << 16);
+ set_host_byte(cmd, DID_OK);
error = -EREMOTEIO;
break;
case DID_NEXUS_FAILURE:
- cmd->result |= (DID_OK << 16);
+ set_host_byte(cmd, DID_OK);
error = -EBADE;
break;
default:
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] scsi : fixing the new host byte settings (DID_TARGET_FAILURE and DID_NEXUS_FAILURE)
2012-01-24 20:38 [PATCH 2/2] scsi : fixing the new host byte settings (DID_TARGET_FAILURE and DID_NEXUS_FAILURE) Moger, Babu
@ 2012-01-24 23:03 ` Mike Snitzer
2012-01-25 21:39 ` Moger, Babu
0 siblings, 1 reply; 5+ messages in thread
From: Mike Snitzer @ 2012-01-24 23:03 UTC (permalink / raw)
To: Moger, Babu
Cc: linux-scsi@vger.kernel.org,
device-mapper development (dm-devel@redhat.com)
Hi Babu,
Thanks for finding this.
On Tue, Jan 24 2012 at 3:38pm -0500,
Moger, Babu <Babu.Moger@netapp.com> wrote:
> Resubmitting as my previous post had format issues and did not go linux-scsi.
>
> This patch fixes the host byte settings DID_TARGET_FAILURE and DID_NEXUS_FAILURE.
> The function __scsi_error_from_host_byte, tries to reset the host byte to DID_OK. But that
> does not happen because of the OR operation.
>
> Here is the flow.
> scsi_softirq_done-> scsi_decide_disposition -> __scsi_error_from_host_byte
or more accurately:
scsi_softirq_done -> scsi_decide_disposition
scsi_softirq_done -> scsi_finish_command -> scsi_io_completion -> __scsi_error_from_host_byte
> Let's take an example with DID_NEXUS_FAILURE. In scsi_decide_disposition, result will be set as
> DID_NEXUS_FAILURE (=0x11). Then in __scsi_error_from_host_byte, when we do OR with
> DID_OK. Purpose is to reset it back to DID_OK. But that does not happen. This patch fixes this issue.
We clearly aren't properly resetting to DID_OK but I'm not seeing an
obvious "nasty bug" that is lurking due to this. Am I missing
something?
__scsi_error_from_host_byte() is setting error which is passed back up
via blk_end_request() and blk_end_request_all(). And in my previous
testing I know that corresponding errors are making it out to
dm-multipath (e.g. -EREMOTEIO).
Also, your patch header is missing the location where DID_OK is not
properly matched (because it wasn't set exclussively due to being
or'd). Looks like scsi_noretry_cmd() will be made more efficient
because it will match DID_OK immediately. Any other locations? Would
be good to call them out.
Mike
^ permalink raw reply [flat|nested] 5+ messages in thread
* RE: [PATCH 2/2] scsi : fixing the new host byte settings (DID_TARGET_FAILURE and DID_NEXUS_FAILURE)
2012-01-24 23:03 ` Mike Snitzer
@ 2012-01-25 21:39 ` Moger, Babu
2012-01-25 22:47 ` Mike Snitzer
0 siblings, 1 reply; 5+ messages in thread
From: Moger, Babu @ 2012-01-25 21:39 UTC (permalink / raw)
To: Mike Snitzer
Cc: linux-scsi@vger.kernel.org,
device-mapper development (dm-devel@redhat.com)
> -----Original Message-----
> From: Mike Snitzer [mailto:snitzer@redhat.com]
> Sent: Tuesday, January 24, 2012 5:03 PM
> To: Moger, Babu
> Cc: linux-scsi@vger.kernel.org; device-mapper development (dm-
> devel@redhat.com)
> Subject: Re: [PATCH 2/2] scsi : fixing the new host byte settings
> (DID_TARGET_FAILURE and DID_NEXUS_FAILURE)
>
> Hi Babu,
>
> Thanks for finding this.
>
> On Tue, Jan 24 2012 at 3:38pm -0500,
> Moger, Babu <Babu.Moger@netapp.com> wrote:
>
> > Resubmitting as my previous post had format issues and did not go linux-scsi.
> >
> > This patch fixes the host byte settings DID_TARGET_FAILURE and
> DID_NEXUS_FAILURE.
> > The function __scsi_error_from_host_byte, tries to reset the host byte to
> DID_OK. But that
> > does not happen because of the OR operation.
> >
> > Here is the flow.
> > scsi_softirq_done-> scsi_decide_disposition -> __scsi_error_from_host_byte
>
> or more accurately:
>
> scsi_softirq_done -> scsi_decide_disposition
> scsi_softirq_done -> scsi_finish_command -> scsi_io_completion ->
> __scsi_error_from_host_byte
>
> > Let's take an example with DID_NEXUS_FAILURE. In scsi_decide_disposition,
> result will be set as
> > DID_NEXUS_FAILURE (=0x11). Then in __scsi_error_from_host_byte, when we
> do OR with
> > DID_OK. Purpose is to reset it back to DID_OK. But that does not happen. This
> patch fixes this issue.
>
> We clearly aren't properly resetting to DID_OK but I'm not seeing an
> obvious "nasty bug" that is lurking due to this. Am I missing
> something?
Yes. It is causing some issues in our proprietary multipath driver. Normally, our assumption
is that host status overrides all other statuses. If host status is set to status other than DID_OK
then we normally ignore other statuses(like reading the check sense). We have worked this around.
My assumption is, most of the user Level code does the same thing. It might give wrong impression
about the kind of error.
One question.. Did the newlines wrapped in this patch also?
>
> __scsi_error_from_host_byte() is setting error which is passed back up
> via blk_end_request() and blk_end_request_all(). And in my previous
> testing I know that corresponding errors are making it out to
> dm-multipath (e.g. -EREMOTEIO).
>
> Also, your patch header is missing the location where DID_OK is not
> properly matched (because it wasn't set exclussively due to being
I am not sure what you meant here.
> or'd). Looks like scsi_noretry_cmd() will be made more efficient
> because it will match DID_OK immediately. Any other locations? Would
> be good to call them out.
>
> Mike
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] scsi : fixing the new host byte settings (DID_TARGET_FAILURE and DID_NEXUS_FAILURE)
2012-01-25 21:39 ` Moger, Babu
@ 2012-01-25 22:47 ` Mike Snitzer
0 siblings, 0 replies; 5+ messages in thread
From: Mike Snitzer @ 2012-01-25 22:47 UTC (permalink / raw)
To: device-mapper development; +Cc: linux-scsi@vger.kernel.org
On Wed, Jan 25 2012 at 4:39pm -0500,
Moger, Babu <Babu.Moger@netapp.com> wrote:
> > -----Original Message-----
> > From: Mike Snitzer [mailto:snitzer@redhat.com]
> > Sent: Tuesday, January 24, 2012 5:03 PM
> > To: Moger, Babu
> > Cc: linux-scsi@vger.kernel.org; device-mapper development (dm-
> > devel@redhat.com)
> > Subject: Re: [PATCH 2/2] scsi : fixing the new host byte settings
> > (DID_TARGET_FAILURE and DID_NEXUS_FAILURE)
> >
> > Hi Babu,
> >
> > Thanks for finding this.
> >
> > On Tue, Jan 24 2012 at 3:38pm -0500,
> > Moger, Babu <Babu.Moger@netapp.com> wrote:
> >
> > > Resubmitting as my previous post had format issues and did not go linux-scsi.
> > >
> > > This patch fixes the host byte settings DID_TARGET_FAILURE and
> > DID_NEXUS_FAILURE.
> > > The function __scsi_error_from_host_byte, tries to reset the host byte to
> > DID_OK. But that
> > > does not happen because of the OR operation.
> > >
> > > Here is the flow.
> > > scsi_softirq_done-> scsi_decide_disposition -> __scsi_error_from_host_byte
> >
> > or more accurately:
> >
> > scsi_softirq_done -> scsi_decide_disposition
> > scsi_softirq_done -> scsi_finish_command -> scsi_io_completion ->
> > __scsi_error_from_host_byte
> >
> > > Let's take an example with DID_NEXUS_FAILURE. In scsi_decide_disposition,
> > result will be set as
> > > DID_NEXUS_FAILURE (=0x11). Then in __scsi_error_from_host_byte, when we
> > do OR with
> > > DID_OK. Purpose is to reset it back to DID_OK. But that does not happen. This
> > patch fixes this issue.
> >
> > We clearly aren't properly resetting to DID_OK but I'm not seeing an
> > obvious "nasty bug" that is lurking due to this. Am I missing
> > something?
>
> Yes. It is causing some issues in our proprietary multipath driver. Normally, our assumption
> is that host status overrides all other statuses. If host status is set to status other than DID_OK
> then we normally ignore other statuses(like reading the check sense). We have worked this around.
> My assumption is, most of the user Level code does the same thing. It might give wrong impression
> about the kind of error.
>
> One question.. Did the newlines wrapped in this patch also?
Looks fine to me.
> > __scsi_error_from_host_byte() is setting error which is passed back up
> > via blk_end_request() and blk_end_request_all(). And in my previous
> > testing I know that corresponding errors are making it out to
> > dm-multipath (e.g. -EREMOTEIO).
> >
> > Also, your patch header is missing the location where DID_OK is not
> > properly matched (because it wasn't set exclussively due to being
>
> I am not sure what you meant here.
Well like I said, it is clear that scsi_noretry_cmd() won't match the
DID_OK case in the host_byte select statement. I was just wondering
where else this improperly set DID_OK was causing a DID_OK match to not
happen.
But the fact that this is impacting your proprietary multipath driver
basically answers my question (I was trying to understand what was
ultimately broken as a result of us improperly resetting to DID_OK).
All said:
Acked-by: Mike Snitzer <snitzer@redhat.com>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2012-01-25 22:47 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-01-24 20:38 [PATCH 2/2] scsi : fixing the new host byte settings (DID_TARGET_FAILURE and DID_NEXUS_FAILURE) Moger, Babu
2012-01-24 23:03 ` Mike Snitzer
2012-01-25 21:39 ` Moger, Babu
2012-01-25 22:47 ` Mike Snitzer
-- strict thread matches above, loose matches on Subject: below --
2012-01-23 18:43 Moger, Babu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).