[PATCH 1/1] RFC: scsi/dm-mpath: return -EACCES on reservation conflict

linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 1/1] RFC: scsi/dm-mpath: return -EACCES on reservation conflict
@ 2009-10-13  4:02 michaelc
  2009-10-13  4:03 ` Mike Christie
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: michaelc @ 2009-10-13  4:02 UTC (permalink / raw)
  To: linux-scsi, dm-devel; +Cc: Mike Christie

From: Mike Christie <michaelc@cs.wisc.edu>

This patch was made over this patch
http://marc.info/?l=linux-scsi&m=125417106125449&w=2

The basic problem is that we do not want dm-multipath to retry
this error, but the scsi layer returns -EIO or -EILSEQ, so
dm-multipath cannot distinguish between a reservation conflict
and other errors.

This problem was originally discussed here
http://www.linux-archive.org/device-mapper-development/180290-dm-mpath-scsi-persistent-reservation.html

I have considered adding new blk error values (I have sent pactches
for this before and can send updated ones if we want to go this route),
and even just using more -EXYZ values for scsi errors, but in the end I am
just not sure it ended up being worth it, so this patch just
handles the one error.

The problem with adding new blk errors is that it seems only dm-multipath
knows what it wants (have not seen anything from the FS or RAID people),
and I also do not know what every device is sending so I cannot completely
clean up cases like where a device returns a error (check condition
and sense) indicating a controller port is temporarily unavialable.
For example, I do not know if I am getting a ILLEGAL request for some
non retryable device error vs the controller is getting its FW updated
(for a non retryable device error case we do not want to fail the path
and just want to fail the IO, but for FW update we just want to fail
the path), so I have to treat those device errors like a transport error
and just fail the path.

So, I did another take just using lots of different -EXYZ values. See
this patch

for an example. The problem is still that the transport error
and generic error cases are the same so all I bought was the handling
of the reservation conflict.

And, that is how I ended up here where I am only handling the one
error I know for sure will cause problems with the infrastructure we have.
I am  not in love with this patch, so please give me any other
suggestions.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
---
 drivers/md/dm-mpath.c   |    2 +-
 drivers/scsi/scsi_lib.c |    4 ++++
 2 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 32d0b87..93e6ce5 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1214,7 +1214,7 @@ static int do_end_io(struct multipath *m, struct request *clone,
 	if (!error && !clone->errors)
 		return 0;	/* I/O complete */

-	if (error == -EOPNOTSUPP)
+	if (error == -EOPNOTSUPP || error == -EACCES)
 		return error;

 	if (mpio->pgpath)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 1086552..5635035 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -797,6 +797,10 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 		 * happens.
 		 */
 		action = ACTION_RETRY;
+	else if (status_byte(cmd->result) == RESERVATION_CONFLICT) {
+		error = -EACCES;
+		description = "Could not access device";
+		action = ACTION_FAIL;
 	} else if (sense_valid && !sense_deferred) {
 		switch (sshdr.sense_key) {
 		case UNIT_ATTENTION:
-- 
1.6.2.2

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/1] RFC: scsi/dm-mpath: return -EACCES on reservation conflict
  2009-10-13  4:02 [PATCH 1/1] RFC: scsi/dm-mpath: return -EACCES on reservation conflict michaelc
@ 2009-10-13  4:03 ` Mike Christie
  2009-10-13 10:07 ` Boaz Harrosh
  2010-12-17 16:59 ` Mike Snitzer
  2 siblings, 0 replies; 7+ messages in thread
From: Mike Christie @ 2009-10-13  4:03 UTC (permalink / raw)
  To: linux-scsi, dm-devel

michaelc@cs.wisc.edu wrote:
> So, I did another take just using lots of different -EXYZ values. See
> this patch

For got to add the link
http://www.kernel.org/pub/linux/kernel/people/mnc/multipath/dm-mpath-use-exyz.patch
> 
> for an example. The problem is still that the transport error
> and generic error cases are the same so all I bought was the handling
> of the reservation conflict.
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/1] RFC: scsi/dm-mpath: return -EACCES on reservation conflict
  2009-10-13  4:02 [PATCH 1/1] RFC: scsi/dm-mpath: return -EACCES on reservation conflict michaelc
  2009-10-13  4:03 ` Mike Christie
@ 2009-10-13 10:07 ` Boaz Harrosh
  2010-12-17 16:59 ` Mike Snitzer
  2 siblings, 0 replies; 7+ messages in thread
From: Boaz Harrosh @ 2009-10-13 10:07 UTC (permalink / raw)
  To: michaelc; +Cc: dm-devel, linux-scsi

On 10/13/2009 06:02 AM, michaelc@cs.wisc.edu wrote:
> From: Mike Christie <michaelc@cs.wisc.edu>
> 
> This patch was made over this patch
> http://marc.info/?l=linux-scsi&m=125417106125449&w=2
> 
> The basic problem is that we do not want dm-multipath to retry
> this error, but the scsi layer returns -EIO or -EILSEQ, so
> dm-multipath cannot distinguish between a reservation conflict
> and other errors.
> 
> This problem was originally discussed here
> http://www.linux-archive.org/device-mapper-development/180290-dm-mpath-scsi-persistent-reservation.html
> 
> I have considered adding new blk error values (I have sent pactches
> for this before and can send updated ones if we want to go this route),
> and even just using more -EXYZ values for scsi errors, but in the end I am
> just not sure it ended up being worth it, so this patch just
> handles the one error.
> 
> The problem with adding new blk errors is that it seems only dm-multipath
> knows what it wants (have not seen anything from the FS or RAID people),
> and I also do not know what every device is sending so I cannot completely
> clean up cases like where a device returns a error (check condition
> and sense) indicating a controller port is temporarily unavialable.
> For example, I do not know if I am getting a ILLEGAL request for some
> non retryable device error vs the controller is getting its FW updated
> (for a non retryable device error case we do not want to fail the path
> and just want to fail the IO, but for FW update we just want to fail
> the path), so I have to treat those device errors like a transport error
> and just fail the path.
> 
> So, I did another take just using lots of different -EXYZ values. See
> this patch
> 
> for an example. The problem is still that the transport error
> and generic error cases are the same so all I bought was the handling
> of the reservation conflict.
> 
> And, that is how I ended up here where I am only handling the one
> error I know for sure will cause problems with the infrastructure we have.
> I am  not in love with this patch, so please give me any other
> suggestions.
> 

Hi Mike.

I guess error handling is always a mess. It's fine to take it one at a time.
At the end we'll handle them all ;-)

Please do not use -EACCES it means something else. It means more like permissions
thing, as if, this user does not have access but someone else might. (Actually I've
used it in OSD when permissions are not sufficient for requested access)

I have some better suggestions perhaps, some taken from Posix/NFS world

#define	EISCONN		106	/* Transport endpoint is already connected */
#define	ECONNREFUSED	111	/* Connection refused */
#define	EREMOTEIO	121	/* Remote I/O error */


> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
> ---
>  drivers/md/dm-mpath.c   |    2 +-
>  drivers/scsi/scsi_lib.c |    4 ++++
>  2 files changed, 5 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index 32d0b87..93e6ce5 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -1214,7 +1214,7 @@ static int do_end_io(struct multipath *m, struct request *clone,
>  	if (!error && !clone->errors)
>  		return 0;	/* I/O complete */
>  
> -	if (error == -EOPNOTSUPP)
> +	if (error == -EOPNOTSUPP || error == -EACCES)
>  		return error;
>  
>  	if (mpio->pgpath)
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 1086552..5635035 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -797,6 +797,10 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
>  		 * happens.
>  		 */
>  		action = ACTION_RETRY;
> +	else if (status_byte(cmd->result) == RESERVATION_CONFLICT) {
> +		error = -EACCES;
> +		description = "Could not access device";

Please put: description = "scsi reservation conflict"

It is a well defined scsi error that means exactly what it is.

> +		action = ACTION_FAIL;
>  	} else if (sense_valid && !sense_deferred) {
>  		switch (sshdr.sense_key) {
>  		case UNIT_ATTENTION:

Thanks
Boaz

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/1] RFC: scsi/dm-mpath: return -EACCES on reservation conflict
  2009-10-13  4:02 [PATCH 1/1] RFC: scsi/dm-mpath: return -EACCES on reservation conflict michaelc
  2009-10-13  4:03 ` Mike Christie
  2009-10-13 10:07 ` Boaz Harrosh
@ 2010-12-17 16:59 ` Mike Snitzer
  2011-02-25 18:40   ` Eddie Williams
  2 siblings, 1 reply; 7+ messages in thread
From: Mike Snitzer @ 2010-12-17 16:59 UTC (permalink / raw)
  To: Mike Christie; +Cc: linux-scsi, device-mapper development, Hannes Reinecke

On Tue, Oct 13 2009 at 12:02am -0400,
michaelc@cs.wisc.edu <michaelc@cs.wisc.edu> wrote:

> From: Mike Christie <michaelc@cs.wisc.edu>
> 
> This patch was made over this patch
> http://marc.info/?l=linux-scsi&m=125417106125449&w=2
> 
> The basic problem is that we do not want dm-multipath to retry
> this error, but the scsi layer returns -EIO or -EILSEQ, so
> dm-multipath cannot distinguish between a reservation conflict
> and other errors.
> 
> This problem was originally discussed here
> http://www.linux-archive.org/device-mapper-development/180290-dm-mpath-scsi-persistent-reservation.html
> 
> I have considered adding new blk error values (I have sent pactches
> for this before and can send updated ones if we want to go this route),
> and even just using more -EXYZ values for scsi errors, but in the end I am
> just not sure it ended up being worth it, so this patch just
> handles the one error.
> 
> The problem with adding new blk errors is that it seems only dm-multipath
> knows what it wants (have not seen anything from the FS or RAID people),
> and I also do not know what every device is sending so I cannot completely
> clean up cases like where a device returns a error (check condition
> and sense) indicating a controller port is temporarily unavialable.
> For example, I do not know if I am getting a ILLEGAL request for some
> non retryable device error vs the controller is getting its FW updated
> (for a non retryable device error case we do not want to fail the path
> and just want to fail the IO, but for FW update we just want to fail
> the path), so I have to treat those device errors like a transport error
> and just fail the path.
> 
> So, I did another take just using lots of different -EXYZ values. See
> this patch
> 
> for an example. The problem is still that the transport error
> and generic error cases are the same so all I bought was the handling
> of the reservation conflict.
> 
> And, that is how I ended up here where I am only handling the one
> error I know for sure will cause problems with the infrastructure we have.
> I am  not in love with this patch, so please give me any other
> suggestions.
> 
> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
> ---
>  drivers/md/dm-mpath.c   |    2 +-
>  drivers/scsi/scsi_lib.c |    4 ++++
>  2 files changed, 5 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index 32d0b87..93e6ce5 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -1214,7 +1214,7 @@ static int do_end_io(struct multipath *m, struct request *clone,
>  	if (!error && !clone->errors)
>  		return 0;	/* I/O complete */
>  
> -	if (error == -EOPNOTSUPP)
> +	if (error == -EOPNOTSUPP || error == -EACCES)
>  		return error;
>  
>  	if (mpio->pgpath)
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 1086552..5635035 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -797,6 +797,10 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
>  		 * happens.
>  		 */
>  		action = ACTION_RETRY;
> +	else if (status_byte(cmd->result) == RESERVATION_CONFLICT) {
> +		error = -EACCES;
> +		description = "Could not access device";
> +		action = ACTION_FAIL;
>  	} else if (sense_valid && !sense_deferred) {
>  		switch (sshdr.sense_key) {
>  		case UNIT_ATTENTION:
> -- 
> 1.6.2.2

Hi Mike,

Just a reminder that Hannes has proposed a slightly different approach
(returning -EREMOTEIO instead of -EACCES).  Here is the version of
Hannes' patch that I reviewed/rebased/tweaked last week:
https://patchwork.kernel.org/patch/384612/

(NOTE: the scsi_decide_disposition() change relative to
RESERVATION_CONFLICT).

And here is the corresponding DM mpath change:
https://patchwork.kernel.org/patch/384602/

If you agree with this approach your ack would be appreciated.

Thanks,
Mike

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/1] RFC: scsi/dm-mpath: return -EACCES on reservation conflict
  2010-12-17 16:59 ` Mike Snitzer
@ 2011-02-25 18:40   ` Eddie Williams
  2011-02-25 18:43     ` Eddie Williams
  2011-02-26  2:39     ` Mike Snitzer
  0 siblings, 2 replies; 7+ messages in thread
From: Eddie Williams @ 2011-02-25 18:40 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Mike Christie, device-mapper development, linux-scsi


[-- Attachment #1.1: Type: text/plain, Size: 5793 bytes --]

I did not see a resolution on this and don't see any changes in
linux-2.6.38-rc4...

I would like to disagree that it is not useful or appropriate to retry an IO
that gets a RESERVATION CONFLICT on different paths to the storage.  The
issue is a possible window when a path is hot added to the system and before
a registration is added for the new path.  OK, this window is for SCSI-3
reservations not SCSI-2...

If an IO is sent down the new path in this window then it will get a
reservation conflict.  If the IO is retried on another path (that is
registered) then it will work.

So I think the reservation conflict should be retried on each path and then
failed if none work.  The key issue being we don't want to get into an
infinite loop here where we retry forever especially when another IO comes
does (like a read if the lock is a write lock or for many devices a test
unit ready) that will look like the path is fine so mark the path good
again.

The alternative to retrying would be to make sure that before an IO is
allowed down a new path that the path is registered.  This would mean that
instead of registering/reserving the paths by an application outside of
device mapper multipath that it would need to be done inside device mapper
so it knows what/when to register.

Eddie Williams
On Fri, Dec 17, 2010 at 11:59 AM, Mike Snitzer <snitzer@redhat.com> wrote:

> On Tue, Oct 13 2009 at 12:02am -0400,
> michaelc@cs.wisc.edu <michaelc@cs.wisc.edu> wrote:
>
> > From: Mike Christie <michaelc@cs.wisc.edu>
> >
> > This patch was made over this patch
> > http://marc.info/?l=linux-scsi&m=125417106125449&w=2
> >
> > The basic problem is that we do not want dm-multipath to retry
> > this error, but the scsi layer returns -EIO or -EILSEQ, so
> > dm-multipath cannot distinguish between a reservation conflict
> > and other errors.
> >
> > This problem was originally discussed here
> >
> http://www.linux-archive.org/device-mapper-development/180290-dm-mpath-scsi-persistent-reservation.html
> >
> > I have considered adding new blk error values (I have sent pactches
> > for this before and can send updated ones if we want to go this route),
> > and even just using more -EXYZ values for scsi errors, but in the end I
> am
> > just not sure it ended up being worth it, so this patch just
> > handles the one error.
> >
> > The problem with adding new blk errors is that it seems only dm-multipath
> > knows what it wants (have not seen anything from the FS or RAID people),
> > and I also do not know what every device is sending so I cannot
> completely
> > clean up cases like where a device returns a error (check condition
> > and sense) indicating a controller port is temporarily unavialable.
> > For example, I do not know if I am getting a ILLEGAL request for some
> > non retryable device error vs the controller is getting its FW updated
> > (for a non retryable device error case we do not want to fail the path
> > and just want to fail the IO, but for FW update we just want to fail
> > the path), so I have to treat those device errors like a transport error
> > and just fail the path.
> >
> > So, I did another take just using lots of different -EXYZ values. See
> > this patch
> >
> > for an example. The problem is still that the transport error
> > and generic error cases are the same so all I bought was the handling
> > of the reservation conflict.
> >
> > And, that is how I ended up here where I am only handling the one
> > error I know for sure will cause problems with the infrastructure we
> have.
> > I am  not in love with this patch, so please give me any other
> > suggestions.
> >
> > Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
> > ---
> >  drivers/md/dm-mpath.c   |    2 +-
> >  drivers/scsi/scsi_lib.c |    4 ++++
> >  2 files changed, 5 insertions(+), 1 deletions(-)
> >
> > diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> > index 32d0b87..93e6ce5 100644
> > --- a/drivers/md/dm-mpath.c
> > +++ b/drivers/md/dm-mpath.c
> > @@ -1214,7 +1214,7 @@ static int do_end_io(struct multipath *m, struct
> request *clone,
> >       if (!error && !clone->errors)
> >               return 0;       /* I/O complete */
> >
> > -     if (error == -EOPNOTSUPP)
> > +     if (error == -EOPNOTSUPP || error == -EACCES)
> >               return error;
> >
> >       if (mpio->pgpath)
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index 1086552..5635035 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -797,6 +797,10 @@ void scsi_io_completion(struct scsi_cmnd *cmd,
> unsigned int good_bytes)
> >                * happens.
> >                */
> >               action = ACTION_RETRY;
> > +     else if (status_byte(cmd->result) == RESERVATION_CONFLICT) {
> > +             error = -EACCES;
> > +             description = "Could not access device";
> > +             action = ACTION_FAIL;
> >       } else if (sense_valid && !sense_deferred) {
> >               switch (sshdr.sense_key) {
> >               case UNIT_ATTENTION:
> > --
> > 1.6.2.2
>
> Hi Mike,
>
> Just a reminder that Hannes has proposed a slightly different approach
> (returning -EREMOTEIO instead of -EACCES).  Here is the version of
> Hannes' patch that I reviewed/rebased/tweaked last week:
> https://patchwork.kernel.org/patch/384612/
>
> (NOTE: the scsi_decide_disposition() change relative to
> RESERVATION_CONFLICT).
>
> And here is the corresponding DM mpath change:
> https://patchwork.kernel.org/patch/384602/
>
> If you agree with this approach your ack would be appreciated.
>
> Thanks,
> Mike
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

[-- Attachment #1.2: Type: text/html, Size: 7401 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/1] RFC: scsi/dm-mpath: return -EACCES on reservation conflict
  2011-02-25 18:40   ` Eddie Williams
@ 2011-02-25 18:43     ` Eddie Williams
  2011-02-26  2:39     ` Mike Snitzer
  1 sibling, 0 replies; 7+ messages in thread
From: Eddie Williams @ 2011-02-25 18:43 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Mike Christie, linux-scsi, device-mapper development,
	Hannes Reinecke

oops, resend as plain text

On Fri, Feb 25, 2011 at 1:40 PM, Eddie Williams
<Eddie.Williams@us.sios.com> wrote:
>
> I did not see a resolution on this and don't see any changes in linux-2.6.38-rc4...
>
> I would like to disagree that it is not useful or appropriate to retry an IO that gets a RESERVATION CONFLICT on different paths to the storage.  The issue is a possible window when a path is hot added to the system and before a registration is added for the new path.  OK, this window is for SCSI-3 reservations not SCSI-2...
>
> If an IO is sent down the new path in this window then it will get a reservation conflict.  If the IO is retried on another path (that is registered) then it will work.
>
> So I think the reservation conflict should be retried on each path and then failed if none work.  The key issue being we don't want to get into an infinite loop here where we retry forever especially when another IO comes does (like a read if the lock is a write lock or for many devices a test unit ready) that will look like the path is fine so mark the path good again.
>
> The alternative to retrying would be to make sure that before an IO is allowed down a new path that the path is registered.  This would mean that instead of registering/reserving the paths by an application outside of device mapper multipath that it would need to be done inside device mapper so it knows what/when to register.
>
> Eddie Williams
> On Fri, Dec 17, 2010 at 11:59 AM, Mike Snitzer <snitzer@redhat.com> wrote:
>>
>> On Tue, Oct 13 2009 at 12:02am -0400,
>> michaelc@cs.wisc.edu <michaelc@cs.wisc.edu> wrote:
>>
>> > From: Mike Christie <michaelc@cs.wisc.edu>
>> >
>> > This patch was made over this patch
>> > http://marc.info/?l=linux-scsi&m=125417106125449&w=2
>> >
>> > The basic problem is that we do not want dm-multipath to retry
>> > this error, but the scsi layer returns -EIO or -EILSEQ, so
>> > dm-multipath cannot distinguish between a reservation conflict
>> > and other errors.
>> >
>> > This problem was originally discussed here
>> > http://www.linux-archive.org/device-mapper-development/180290-dm-mpath-scsi-persistent-reservation.html
>> >
>> > I have considered adding new blk error values (I have sent pactches
>> > for this before and can send updated ones if we want to go this route),
>> > and even just using more -EXYZ values for scsi errors, but in the end I am
>> > just not sure it ended up being worth it, so this patch just
>> > handles the one error.
>> >
>> > The problem with adding new blk errors is that it seems only dm-multipath
>> > knows what it wants (have not seen anything from the FS or RAID people),
>> > and I also do not know what every device is sending so I cannot completely
>> > clean up cases like where a device returns a error (check condition
>> > and sense) indicating a controller port is temporarily unavialable.
>> > For example, I do not know if I am getting a ILLEGAL request for some
>> > non retryable device error vs the controller is getting its FW updated
>> > (for a non retryable device error case we do not want to fail the path
>> > and just want to fail the IO, but for FW update we just want to fail
>> > the path), so I have to treat those device errors like a transport error
>> > and just fail the path.
>> >
>> > So, I did another take just using lots of different -EXYZ values. See
>> > this patch
>> >
>> > for an example. The problem is still that the transport error
>> > and generic error cases are the same so all I bought was the handling
>> > of the reservation conflict.
>> >
>> > And, that is how I ended up here where I am only handling the one
>> > error I know for sure will cause problems with the infrastructure we have.
>> > I am  not in love with this patch, so please give me any other
>> > suggestions.
>> >
>> > Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
>> > ---
>> >  drivers/md/dm-mpath.c   |    2 +-
>> >  drivers/scsi/scsi_lib.c |    4 ++++
>> >  2 files changed, 5 insertions(+), 1 deletions(-)
>> >
>> > diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
>> > index 32d0b87..93e6ce5 100644
>> > --- a/drivers/md/dm-mpath.c
>> > +++ b/drivers/md/dm-mpath.c
>> > @@ -1214,7 +1214,7 @@ static int do_end_io(struct multipath *m, struct request *clone,
>> >       if (!error && !clone->errors)
>> >               return 0;       /* I/O complete */
>> >
>> > -     if (error == -EOPNOTSUPP)
>> > +     if (error == -EOPNOTSUPP || error == -EACCES)
>> >               return error;
>> >
>> >       if (mpio->pgpath)
>> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> > index 1086552..5635035 100644
>> > --- a/drivers/scsi/scsi_lib.c
>> > +++ b/drivers/scsi/scsi_lib.c
>> > @@ -797,6 +797,10 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
>> >                * happens.
>> >                */
>> >               action = ACTION_RETRY;
>> > +     else if (status_byte(cmd->result) == RESERVATION_CONFLICT) {
>> > +             error = -EACCES;
>> > +             description = "Could not access device";
>> > +             action = ACTION_FAIL;
>> >       } else if (sense_valid && !sense_deferred) {
>> >               switch (sshdr.sense_key) {
>> >               case UNIT_ATTENTION:
>> > --
>> > 1.6.2.2
>>
>> Hi Mike,
>>
>> Just a reminder that Hannes has proposed a slightly different approach
>> (returning -EREMOTEIO instead of -EACCES).  Here is the version of
>> Hannes' patch that I reviewed/rebased/tweaked last week:
>> https://patchwork.kernel.org/patch/384612/
>>
>> (NOTE: the scsi_decide_disposition() change relative to
>> RESERVATION_CONFLICT).
>>
>> And here is the corresponding DM mpath change:
>> https://patchwork.kernel.org/patch/384602/
>>
>> If you agree with this approach your ack would be appreciated.
>>
>> Thanks,
>> Mike
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/1] RFC: scsi/dm-mpath: return -EACCES on reservation conflict
  2011-02-25 18:40   ` Eddie Williams
  2011-02-25 18:43     ` Eddie Williams
@ 2011-02-26  2:39     ` Mike Snitzer
  1 sibling, 0 replies; 7+ messages in thread
From: Mike Snitzer @ 2011-02-26  2:39 UTC (permalink / raw)
  To: Eddie Williams
  Cc: Mike Christie, linux-scsi, device-mapper development,
	Hannes Reinecke

On Fri, Feb 25 2011 at  1:40pm -0500,
Eddie Williams <Eddie.Williams@us.sios.com> wrote:

> I did not see a resolution on this and don't see any changes in
> linux-2.6.38-rc4...

None of the improved SCSI error differentiation made it in 2.6.38.

But it was discussed and resolved on linux-scsi, see:
http://marc.info/?l=linux-scsi&m=129527914627980&w=2

The final commit, which reflects the above discussion, was staged in
James' scsi-misc-2.6 for 2.6.39 inclusion.  It includes special
processing to properly account for RESERVATION CONFLICT (differentiates
SCSI errors further and returns EBADE), see:
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commit;h=63583cca745f440167bf27877182dc13e19d4bcf

>From that commit's header:
"EBADE: I/O error restricted to the I_T_L nexus"
...
"I/O errors restricted to the I_T_L nexus might be retried
on another nexus / path, but they should _not_ be queued
if no paths are available."

Also, the mpath change (also queued for 2.6.39) includes proper handling
of EBADE:
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commit;h=751b2a7d623ead9e55f751a6087efeab454b5659


> I would like to disagree that it is not useful or appropriate to retry an IO
> that gets a RESERVATION CONFLICT on different paths to the storage.  The
> issue is a possible window when a path is hot added to the system and before
> a registration is added for the new path.  OK, this window is for SCSI-3
> reservations not SCSI-2...
> 
> If an IO is sent down the new path in this window then it will get a
> reservation conflict.  If the IO is retried on another path (that is
> registered) then it will work.
> 
> So I think the reservation conflict should be retried on each path and then
> failed if none work.  The key issue being we don't want to get into an
> infinite loop here where we retry forever especially when another IO comes
> does (like a read if the lock is a write lock or for many devices a test
> unit ready) that will look like the path is fine so mark the path good
> again.
> 
> The alternative to retrying would be to make sure that before an IO is
> allowed down a new path that the path is registered.  This would mean that
> instead of registering/reserving the paths by an application outside of
> device mapper multipath that it would need to be done inside device mapper
> so it knows what/when to register.
> 
> Eddie Williams
> On Fri, Dec 17, 2010 at 11:59 AM, Mike Snitzer <snitzer@redhat.com> wrote:
> 
> > On Tue, Oct 13 2009 at 12:02am -0400,
> > michaelc@cs.wisc.edu <michaelc@cs.wisc.edu> wrote:
> >
> > > From: Mike Christie <michaelc@cs.wisc.edu>
> > >
> > > This patch was made over this patch
> > > http://marc.info/?l=linux-scsi&m=125417106125449&w=2
> > >
> > > The basic problem is that we do not want dm-multipath to retry
> > > this error, but the scsi layer returns -EIO or -EILSEQ, so
> > > dm-multipath cannot distinguish between a reservation conflict
> > > and other errors.
> > >
> > > This problem was originally discussed here
> > >
> > http://www.linux-archive.org/device-mapper-development/180290-dm-mpath-scsi-persistent-reservation.html
> > >
> > > I have considered adding new blk error values (I have sent pactches
> > > for this before and can send updated ones if we want to go this route),
> > > and even just using more -EXYZ values for scsi errors, but in the end I
> > am
> > > just not sure it ended up being worth it, so this patch just
> > > handles the one error.
> > >
> > > The problem with adding new blk errors is that it seems only dm-multipath
> > > knows what it wants (have not seen anything from the FS or RAID people),
> > > and I also do not know what every device is sending so I cannot
> > completely
> > > clean up cases like where a device returns a error (check condition
> > > and sense) indicating a controller port is temporarily unavialable.
> > > For example, I do not know if I am getting a ILLEGAL request for some
> > > non retryable device error vs the controller is getting its FW updated
> > > (for a non retryable device error case we do not want to fail the path
> > > and just want to fail the IO, but for FW update we just want to fail
> > > the path), so I have to treat those device errors like a transport error
> > > and just fail the path.
> > >
> > > So, I did another take just using lots of different -EXYZ values. See
> > > this patch
> > >
> > > for an example. The problem is still that the transport error
> > > and generic error cases are the same so all I bought was the handling
> > > of the reservation conflict.
> > >
> > > And, that is how I ended up here where I am only handling the one
> > > error I know for sure will cause problems with the infrastructure we
> > have.
> > > I am  not in love with this patch, so please give me any other
> > > suggestions.
> > >
> > > Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
> > > ---
> > >  drivers/md/dm-mpath.c   |    2 +-
> > >  drivers/scsi/scsi_lib.c |    4 ++++
> > >  2 files changed, 5 insertions(+), 1 deletions(-)
> > >
> > > diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> > > index 32d0b87..93e6ce5 100644
> > > --- a/drivers/md/dm-mpath.c
> > > +++ b/drivers/md/dm-mpath.c
> > > @@ -1214,7 +1214,7 @@ static int do_end_io(struct multipath *m, struct
> > request *clone,
> > >       if (!error && !clone->errors)
> > >               return 0;       /* I/O complete */
> > >
> > > -     if (error == -EOPNOTSUPP)
> > > +     if (error == -EOPNOTSUPP || error == -EACCES)
> > >               return error;
> > >
> > >       if (mpio->pgpath)
> > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > > index 1086552..5635035 100644
> > > --- a/drivers/scsi/scsi_lib.c
> > > +++ b/drivers/scsi/scsi_lib.c
> > > @@ -797,6 +797,10 @@ void scsi_io_completion(struct scsi_cmnd *cmd,
> > unsigned int good_bytes)
> > >                * happens.
> > >                */
> > >               action = ACTION_RETRY;
> > > +     else if (status_byte(cmd->result) == RESERVATION_CONFLICT) {
> > > +             error = -EACCES;
> > > +             description = "Could not access device";
> > > +             action = ACTION_FAIL;
> > >       } else if (sense_valid && !sense_deferred) {
> > >               switch (sshdr.sense_key) {
> > >               case UNIT_ATTENTION:
> > > --
> > > 1.6.2.2
> >
> > Hi Mike,
> >
> > Just a reminder that Hannes has proposed a slightly different approach
> > (returning -EREMOTEIO instead of -EACCES).  Here is the version of
> > Hannes' patch that I reviewed/rebased/tweaked last week:
> > https://patchwork.kernel.org/patch/384612/
> >
> > (NOTE: the scsi_decide_disposition() change relative to
> > RESERVATION_CONFLICT).
> >
> > And here is the corresponding DM mpath change:
> > https://patchwork.kernel.org/patch/384602/
> >
> > If you agree with this approach your ack would be appreciated.
> >
> > Thanks,
> > Mike
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-02-26  2:39 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-13  4:02 [PATCH 1/1] RFC: scsi/dm-mpath: return -EACCES on reservation conflict michaelc
2009-10-13  4:03 ` Mike Christie
2009-10-13 10:07 ` Boaz Harrosh
2010-12-17 16:59 ` Mike Snitzer
2011-02-25 18:40   ` Eddie Williams
2011-02-25 18:43     ` Eddie Williams
2011-02-26  2:39     ` Mike Snitzer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).