* [PATCH] scsi_dh_alua: add missing transitioning state support @ 2010-08-17 19:05 Mike Snitzer 2010-08-17 19:23 ` Nicholas A. Bellinger 0 siblings, 1 reply; 14+ messages in thread From: Mike Snitzer @ 2010-08-17 19:05 UTC (permalink / raw) To: James Bottomley; +Cc: Mike Christie, linux-scsi Handle transitioning in the prep_fn. Handle transitioning in alua_rtpg's implicit alua code too. These gaps were identified during controller failover testing of an ALUA array. Signed-off-by: Mike Snitzer <snitzer@redhat.com> --- drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++--- 1 files changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c index 1a970a7..c1eedc5 100644 --- a/drivers/scsi/device_handler/scsi_dh_alua.c +++ b/drivers/scsi/device_handler/scsi_dh_alua.c @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h) h->state == TPGS_STATE_STANDBY) /* Useable path if active */ err = SCSI_DH_OK; + else if (h->state == TPGS_STATE_TRANSITIONING) + /* State transition, retry */ + goto retry; else /* Path unuseable for unavailable/offline */ err = SCSI_DH_DEV_OFFLINED; @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req) struct alua_dh_data *h = get_alua_data(sdev); int ret = BLKPREP_OK; - if (h->state != TPGS_STATE_OPTIMIZED && - h->state != TPGS_STATE_NONOPTIMIZED) { + if (h->state == TPGS_STATE_TRANSITIONING) + ret = BLKPREP_DEFER; + else if (h->state != TPGS_STATE_OPTIMIZED && + h->state != TPGS_STATE_NONOPTIMIZED) { ret = BLKPREP_KILL; req->cmd_flags |= REQ_QUIET; } return ret; - } static const struct scsi_dh_devlist alua_dev_list[] = { ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] scsi_dh_alua: add missing transitioning state support 2010-08-17 19:05 [PATCH] scsi_dh_alua: add missing transitioning state support Mike Snitzer @ 2010-08-17 19:23 ` Nicholas A. Bellinger 2010-08-30 9:36 ` Hannes Reinecke 0 siblings, 1 reply; 14+ messages in thread From: Nicholas A. Bellinger @ 2010-08-17 19:23 UTC (permalink / raw) To: Mike Snitzer; +Cc: James Bottomley, Mike Christie, linux-scsi On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote: > Handle transitioning in the prep_fn. > Handle transitioning in alua_rtpg's implicit alua code too. > > These gaps were identified during controller failover testing of an > ALUA array. > > Signed-off-by: Mike Snitzer <snitzer@redhat.com> > --- > drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++--- > 1 files changed, 7 insertions(+), 3 deletions(-) > > diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c > index 1a970a7..c1eedc5 100644 > --- a/drivers/scsi/device_handler/scsi_dh_alua.c > +++ b/drivers/scsi/device_handler/scsi_dh_alua.c > @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h) > h->state == TPGS_STATE_STANDBY) > /* Useable path if active */ > err = SCSI_DH_OK; > + else if (h->state == TPGS_STATE_TRANSITIONING) > + /* State transition, retry */ > + goto retry; > else > /* Path unuseable for unavailable/offline */ > err = SCSI_DH_DEV_OFFLINED; > @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req) > struct alua_dh_data *h = get_alua_data(sdev); > int ret = BLKPREP_OK; > > - if (h->state != TPGS_STATE_OPTIMIZED && > - h->state != TPGS_STATE_NONOPTIMIZED) { > + if (h->state == TPGS_STATE_TRANSITIONING) > + ret = BLKPREP_DEFER; > + else if (h->state != TPGS_STATE_OPTIMIZED && > + h->state != TPGS_STATE_NONOPTIMIZED) { > ret = BLKPREP_KILL; > req->cmd_flags |= REQ_QUIET; > } > return ret; > - > } > Makes sense to me.. Acked-by: Nicholas A. Bellinger <nab@linux-iscsi.org> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] scsi_dh_alua: add missing transitioning state support 2010-08-17 19:23 ` Nicholas A. Bellinger @ 2010-08-30 9:36 ` Hannes Reinecke 2010-08-31 15:11 ` Mike Snitzer 0 siblings, 1 reply; 14+ messages in thread From: Hannes Reinecke @ 2010-08-30 9:36 UTC (permalink / raw) To: Nicholas A. Bellinger Cc: Mike Snitzer, James Bottomley, Mike Christie, linux-scsi Nicholas A. Bellinger wrote: > On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote: >> Handle transitioning in the prep_fn. >> Handle transitioning in alua_rtpg's implicit alua code too. >> >> These gaps were identified during controller failover testing of an >> ALUA array. >> >> Signed-off-by: Mike Snitzer <snitzer@redhat.com> >> --- >> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++--- >> 1 files changed, 7 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c >> index 1a970a7..c1eedc5 100644 >> --- a/drivers/scsi/device_handler/scsi_dh_alua.c >> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c >> @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h) >> h->state == TPGS_STATE_STANDBY) >> /* Useable path if active */ >> err = SCSI_DH_OK; >> + else if (h->state == TPGS_STATE_TRANSITIONING) >> + /* State transition, retry */ >> + goto retry; >> else >> /* Path unuseable for unavailable/offline */ >> err = SCSI_DH_DEV_OFFLINED; >> @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req) >> struct alua_dh_data *h = get_alua_data(sdev); >> int ret = BLKPREP_OK; >> >> - if (h->state != TPGS_STATE_OPTIMIZED && >> - h->state != TPGS_STATE_NONOPTIMIZED) { >> + if (h->state == TPGS_STATE_TRANSITIONING) >> + ret = BLKPREP_DEFER; >> + else if (h->state != TPGS_STATE_OPTIMIZED && >> + h->state != TPGS_STATE_NONOPTIMIZED) { >> ret = BLKPREP_KILL; >> req->cmd_flags |= REQ_QUIET; >> } >> return ret; >> - >> } >> > > Makes sense to me.. > > Acked-by: Nicholas A. Bellinger <nab@linux-iscsi.org> > Not so fast. There are two problems with this approach: The path is retried indefinitely. Arrays are _supposed_ to be in 'transitioning' only temporary; however, if the array is stuck due to a fw error we're stuck in 'defer', too. Secondly this path fails with 'directio' multipath checker. Remember that 'directio' is using 'fs' requests, not block-pc ones. Hence for all I/O the prep_fn() callback is evaluated, which will return 'DEFER' here once the path is in transitioning. And the state is never updated as RTPG is never called. I'm currently preparing a patch which addressed these situations, too. Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Markus Rex, HRB 16746 (AG Nürnberg) -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support 2010-08-30 9:36 ` Hannes Reinecke @ 2010-08-31 15:11 ` Mike Snitzer 2010-09-20 15:35 ` Mike Snitzer 0 siblings, 1 reply; 14+ messages in thread From: Mike Snitzer @ 2010-08-31 15:11 UTC (permalink / raw) To: Hannes Reinecke Cc: Nicholas A. Bellinger, James Bottomley, Mike Christie, linux-scsi On Mon, Aug 30 2010 at 5:36am -0400, Hannes Reinecke <hare@suse.de> wrote: > Nicholas A. Bellinger wrote: > > On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote: > >> Handle transitioning in the prep_fn. > >> Handle transitioning in alua_rtpg's implicit alua code too. > >> > >> These gaps were identified during controller failover testing of an > >> ALUA array. > >> > >> Signed-off-by: Mike Snitzer <snitzer@redhat.com> > >> --- > >> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++--- > >> 1 files changed, 7 insertions(+), 3 deletions(-) > >> > >> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c > >> index 1a970a7..c1eedc5 100644 > >> --- a/drivers/scsi/device_handler/scsi_dh_alua.c > >> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c > >> @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h) > >> h->state == TPGS_STATE_STANDBY) > >> /* Useable path if active */ > >> err = SCSI_DH_OK; > >> + else if (h->state == TPGS_STATE_TRANSITIONING) > >> + /* State transition, retry */ > >> + goto retry; > >> else > >> /* Path unuseable for unavailable/offline */ > >> err = SCSI_DH_DEV_OFFLINED; > >> @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req) > >> struct alua_dh_data *h = get_alua_data(sdev); > >> int ret = BLKPREP_OK; > >> > >> - if (h->state != TPGS_STATE_OPTIMIZED && > >> - h->state != TPGS_STATE_NONOPTIMIZED) { > >> + if (h->state == TPGS_STATE_TRANSITIONING) > >> + ret = BLKPREP_DEFER; > >> + else if (h->state != TPGS_STATE_OPTIMIZED && > >> + h->state != TPGS_STATE_NONOPTIMIZED) { > >> ret = BLKPREP_KILL; > >> req->cmd_flags |= REQ_QUIET; > >> } > >> return ret; > >> - > >> } > >> > > > > Makes sense to me.. > > > > Acked-by: Nicholas A. Bellinger <nab@linux-iscsi.org> > > > Not so fast. There are two problems with this approach: > > The path is retried indefinitely. Arrays are _supposed_ to be in 'transitioning' > only temporary; however, if the array is stuck due to a fw error we're stuck in 'defer', > too. And what is the problem with that? The IO will eventually time out. > Secondly this path fails with 'directio' multipath checker. Remember that 'directio' > is using 'fs' requests, not block-pc ones. Hence for all I/O the prep_fn() callback > is evaluated, which will return 'DEFER' here once the path is in transitioning. > And the state is never updated as RTPG is never called. Testing ALUA with directio path checker did not result in such immutable state in the few instances that TPGS_STATE_TRANSITIONING was seen in alua_prep_fn. > I'm currently preparing a patch which addressed these situations, too. OK, please share. Thanks, Mike ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support 2010-08-31 15:11 ` Mike Snitzer @ 2010-09-20 15:35 ` Mike Snitzer 2010-09-21 2:27 ` Mike Christie 0 siblings, 1 reply; 14+ messages in thread From: Mike Snitzer @ 2010-09-20 15:35 UTC (permalink / raw) To: Hannes Reinecke Cc: Nicholas A. Bellinger, James Bottomley, Mike Christie, linux-scsi Hi Hannes, On Tue, Aug 31 2010 at 11:11am -0400, Mike Snitzer <snitzer@redhat.com> wrote: > On Mon, Aug 30 2010 at 5:36am -0400, > Hannes Reinecke <hare@suse.de> wrote: > > > Nicholas A. Bellinger wrote: > > > On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote: > > >> Handle transitioning in the prep_fn. > > >> Handle transitioning in alua_rtpg's implicit alua code too. > > >> > > >> These gaps were identified during controller failover testing of an > > >> ALUA array. > > >> > > >> Signed-off-by: Mike Snitzer <snitzer@redhat.com> > > >> --- > > >> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++--- > > >> 1 files changed, 7 insertions(+), 3 deletions(-) > > >> > > >> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c > > >> index 1a970a7..c1eedc5 100644 > > >> --- a/drivers/scsi/device_handler/scsi_dh_alua.c > > >> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c > > >> @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h) > > >> h->state == TPGS_STATE_STANDBY) > > >> /* Useable path if active */ > > >> err = SCSI_DH_OK; > > >> + else if (h->state == TPGS_STATE_TRANSITIONING) > > >> + /* State transition, retry */ > > >> + goto retry; > > >> else > > >> /* Path unuseable for unavailable/offline */ > > >> err = SCSI_DH_DEV_OFFLINED; > > >> @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req) > > >> struct alua_dh_data *h = get_alua_data(sdev); > > >> int ret = BLKPREP_OK; > > >> > > >> - if (h->state != TPGS_STATE_OPTIMIZED && > > >> - h->state != TPGS_STATE_NONOPTIMIZED) { > > >> + if (h->state == TPGS_STATE_TRANSITIONING) > > >> + ret = BLKPREP_DEFER; > > >> + else if (h->state != TPGS_STATE_OPTIMIZED && > > >> + h->state != TPGS_STATE_NONOPTIMIZED) { > > >> ret = BLKPREP_KILL; > > >> req->cmd_flags |= REQ_QUIET; > > >> } > > >> return ret; > > >> - > > >> } > > >> > > > > > > Makes sense to me.. > > > > > > Acked-by: Nicholas A. Bellinger <nab@linux-iscsi.org> > > > > > Not so fast. There are two problems with this approach: > > > > The path is retried indefinitely. Arrays are _supposed_ to be in 'transitioning' > > only temporary; however, if the array is stuck due to a fw error we're stuck in 'defer', > > too. > > And what is the problem with that? The IO will eventually time out. To restate as a question: even though we'll retry in alua_rtpg(); shouldn't the SCSI command eventually time out (via scsi_attempt_requeue_command)? Note that my proposed change to alua_rtpg() just adds a TPGS_STATE_TRANSITIONING handler for implicit ALUA -- explicit ALUA already has such a handler. Allowing implicit ALUA to fall through (as we do currently) causes a return alua_rtpg() of SCSI_DH_DEV_OFFLINED -- which isn't the correct state. > > Secondly this path fails with 'directio' multipath checker. Remember that 'directio' > > is using 'fs' requests, not block-pc ones. Hence for all I/O the prep_fn() callback > > is evaluated, which will return 'DEFER' here once the path is in transitioning. > > And the state is never updated as RTPG is never called. > > Testing ALUA with directio path checker did not result in such immutable > state in the few instances that TPGS_STATE_TRANSITIONING was seen in > alua_prep_fn. I had another look and I see what you're saying. Thanks for catching this! > > I'm currently preparing a patch which addressed these situations, too. > > OK, please share. Do you have that patch you were preparing? I look forward to seeing your solution to this. Thanks, Mike ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support 2010-09-20 15:35 ` Mike Snitzer @ 2010-09-21 2:27 ` Mike Christie 2010-09-21 2:28 ` Mike Christie 2010-09-21 19:33 ` Mike Snitzer 0 siblings, 2 replies; 14+ messages in thread From: Mike Christie @ 2010-09-21 2:27 UTC (permalink / raw) To: Mike Snitzer Cc: Hannes Reinecke, Nicholas A. Bellinger, James Bottomley, linux-scsi On 09/20/2010 10:35 AM, Mike Snitzer wrote: > Hi Hannes, > > On Tue, Aug 31 2010 at 11:11am -0400, > Mike Snitzer<snitzer@redhat.com> wrote: > >> On Mon, Aug 30 2010 at 5:36am -0400, >> Hannes Reinecke<hare@suse.de> wrote: >> >>> Nicholas A. Bellinger wrote: >>>> On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote: >>>>> Handle transitioning in the prep_fn. >>>>> Handle transitioning in alua_rtpg's implicit alua code too. >>>>> >>>>> These gaps were identified during controller failover testing of an >>>>> ALUA array. >>>>> >>>>> Signed-off-by: Mike Snitzer<snitzer@redhat.com> >>>>> --- >>>>> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++--- >>>>> 1 files changed, 7 insertions(+), 3 deletions(-) >>>>> >>>>> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c >>>>> index 1a970a7..c1eedc5 100644 >>>>> --- a/drivers/scsi/device_handler/scsi_dh_alua.c >>>>> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c >>>>> @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h) >>>>> h->state == TPGS_STATE_STANDBY) >>>>> /* Useable path if active */ >>>>> err = SCSI_DH_OK; >>>>> + else if (h->state == TPGS_STATE_TRANSITIONING) >>>>> + /* State transition, retry */ >>>>> + goto retry; >>>>> else >>>>> /* Path unuseable for unavailable/offline */ >>>>> err = SCSI_DH_DEV_OFFLINED; >>>>> @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req) >>>>> struct alua_dh_data *h = get_alua_data(sdev); >>>>> int ret = BLKPREP_OK; >>>>> >>>>> - if (h->state != TPGS_STATE_OPTIMIZED&& >>>>> - h->state != TPGS_STATE_NONOPTIMIZED) { >>>>> + if (h->state == TPGS_STATE_TRANSITIONING) >>>>> + ret = BLKPREP_DEFER; >>>>> + else if (h->state != TPGS_STATE_OPTIMIZED&& >>>>> + h->state != TPGS_STATE_NONOPTIMIZED) { >>>>> ret = BLKPREP_KILL; >>>>> req->cmd_flags |= REQ_QUIET; >>>>> } >>>>> return ret; >>>>> - >>>>> } >>>>> >>>> >>>> Makes sense to me.. >>>> >>>> Acked-by: Nicholas A. Bellinger<nab@linux-iscsi.org> >>>> >>> Not so fast. There are two problems with this approach: >>> >>> The path is retried indefinitely. Arrays are _supposed_ to be in 'transitioning' >>> only temporary; however, if the array is stuck due to a fw error we're stuck in 'defer', >>> too. >> >> And what is the problem with that? The IO will eventually time out. > > To restate as a question: even though we'll retry in alua_rtpg(); > shouldn't the SCSI command eventually time out (via > scsi_attempt_requeue_command)? That function is only in RHEL. Requests that are prepd and sent to the scsi layer and driver would eventually timeout in scsi_softirq_done in upstream. alua_prep_fn prevents the IO from getting sent to the scsi layer so we do not hit the check in scsi_softirq_done though. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support 2010-09-21 2:27 ` Mike Christie @ 2010-09-21 2:28 ` Mike Christie 2010-09-21 19:33 ` Mike Snitzer 1 sibling, 0 replies; 14+ messages in thread From: Mike Christie @ 2010-09-21 2:28 UTC (permalink / raw) To: Mike Snitzer Cc: Hannes Reinecke, Nicholas A. Bellinger, James Bottomley, linux-scsi On 09/20/2010 09:27 PM, Mike Christie wrote: > On 09/20/2010 10:35 AM, Mike Snitzer wrote: >> Hi Hannes, >> >> On Tue, Aug 31 2010 at 11:11am -0400, >> Mike Snitzer<snitzer@redhat.com> wrote: >> >>> On Mon, Aug 30 2010 at 5:36am -0400, >>> Hannes Reinecke<hare@suse.de> wrote: >>> >>>> Nicholas A. Bellinger wrote: >>>>> On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote: >>>>>> Handle transitioning in the prep_fn. >>>>>> Handle transitioning in alua_rtpg's implicit alua code too. >>>>>> >>>>>> These gaps were identified during controller failover testing of an >>>>>> ALUA array. >>>>>> >>>>>> Signed-off-by: Mike Snitzer<snitzer@redhat.com> >>>>>> --- >>>>>> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++--- >>>>>> 1 files changed, 7 insertions(+), 3 deletions(-) >>>>>> >>>>>> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c >>>>>> b/drivers/scsi/device_handler/scsi_dh_alua.c >>>>>> index 1a970a7..c1eedc5 100644 >>>>>> --- a/drivers/scsi/device_handler/scsi_dh_alua.c >>>>>> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c >>>>>> @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, >>>>>> struct alua_dh_data *h) >>>>>> h->state == TPGS_STATE_STANDBY) >>>>>> /* Useable path if active */ >>>>>> err = SCSI_DH_OK; >>>>>> + else if (h->state == TPGS_STATE_TRANSITIONING) >>>>>> + /* State transition, retry */ >>>>>> + goto retry; >>>>>> else >>>>>> /* Path unuseable for unavailable/offline */ >>>>>> err = SCSI_DH_DEV_OFFLINED; >>>>>> @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device >>>>>> *sdev, struct request *req) >>>>>> struct alua_dh_data *h = get_alua_data(sdev); >>>>>> int ret = BLKPREP_OK; >>>>>> >>>>>> - if (h->state != TPGS_STATE_OPTIMIZED&& >>>>>> - h->state != TPGS_STATE_NONOPTIMIZED) { >>>>>> + if (h->state == TPGS_STATE_TRANSITIONING) >>>>>> + ret = BLKPREP_DEFER; >>>>>> + else if (h->state != TPGS_STATE_OPTIMIZED&& >>>>>> + h->state != TPGS_STATE_NONOPTIMIZED) { >>>>>> ret = BLKPREP_KILL; >>>>>> req->cmd_flags |= REQ_QUIET; >>>>>> } >>>>>> return ret; >>>>>> - >>>>>> } >>>>>> >>>>> >>>>> Makes sense to me.. >>>>> >>>>> Acked-by: Nicholas A. Bellinger<nab@linux-iscsi.org> >>>>> >>>> Not so fast. There are two problems with this approach: >>>> >>>> The path is retried indefinitely. Arrays are _supposed_ to be in >>>> 'transitioning' >>>> only temporary; however, if the array is stuck due to a fw error >>>> we're stuck in 'defer', >>>> too. >>> >>> And what is the problem with that? The IO will eventually time out. >> >> To restate as a question: even though we'll retry in alua_rtpg(); >> shouldn't the SCSI command eventually time out (via >> scsi_attempt_requeue_command)? > > That function is only in RHEL. Requests that are prepd and sent to the > scsi layer and driver would eventually timeout in scsi_softirq_done in > upstream. Did we want to move this check to the block layer btw? If so it could solve the problem. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support 2010-09-21 2:27 ` Mike Christie 2010-09-21 2:28 ` Mike Christie @ 2010-09-21 19:33 ` Mike Snitzer 2010-09-21 21:14 ` Mike Christie 1 sibling, 1 reply; 14+ messages in thread From: Mike Snitzer @ 2010-09-21 19:33 UTC (permalink / raw) To: Mike Christie Cc: Hannes Reinecke, Nicholas A. Bellinger, James Bottomley, linux-scsi On Mon, Sep 20 2010 at 10:27pm -0400, Mike Christie <michaelc@cs.wisc.edu> wrote: > On 09/20/2010 10:35 AM, Mike Snitzer wrote: > >Hi Hannes, > > > >On Tue, Aug 31 2010 at 11:11am -0400, > >Mike Snitzer<snitzer@redhat.com> wrote: > > > >>On Mon, Aug 30 2010 at 5:36am -0400, > >>Hannes Reinecke<hare@suse.de> wrote: > >> > >>>Nicholas A. Bellinger wrote: > >>>>On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote: > >>>>>Handle transitioning in the prep_fn. > >>>>>Handle transitioning in alua_rtpg's implicit alua code too. > >>>>> > >>>>>These gaps were identified during controller failover testing of an > >>>>>ALUA array. > >>>>> > >>>>>Signed-off-by: Mike Snitzer<snitzer@redhat.com> > >>>>>--- > >>>>> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++--- > >>>>> 1 files changed, 7 insertions(+), 3 deletions(-) > >>>>> > >>>>>diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c > >>>>>index 1a970a7..c1eedc5 100644 > >>>>>--- a/drivers/scsi/device_handler/scsi_dh_alua.c > >>>>>+++ b/drivers/scsi/device_handler/scsi_dh_alua.c > >>>>>@@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h) > >>>>> h->state == TPGS_STATE_STANDBY) > >>>>> /* Useable path if active */ > >>>>> err = SCSI_DH_OK; > >>>>>+ else if (h->state == TPGS_STATE_TRANSITIONING) > >>>>>+ /* State transition, retry */ > >>>>>+ goto retry; > >>>>> else > >>>>> /* Path unuseable for unavailable/offline */ > >>>>> err = SCSI_DH_DEV_OFFLINED; > >>>>>@@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req) > >>>>> struct alua_dh_data *h = get_alua_data(sdev); > >>>>> int ret = BLKPREP_OK; > >>>>> > >>>>>- if (h->state != TPGS_STATE_OPTIMIZED&& > >>>>>- h->state != TPGS_STATE_NONOPTIMIZED) { > >>>>>+ if (h->state == TPGS_STATE_TRANSITIONING) > >>>>>+ ret = BLKPREP_DEFER; > >>>>>+ else if (h->state != TPGS_STATE_OPTIMIZED&& > >>>>>+ h->state != TPGS_STATE_NONOPTIMIZED) { > >>>>> ret = BLKPREP_KILL; > >>>>> req->cmd_flags |= REQ_QUIET; > >>>>> } > >>>>> return ret; > >>>>>- > >>>>> } > >>>>> > >>>> > >>>>Makes sense to me.. > >>>> > >>>>Acked-by: Nicholas A. Bellinger<nab@linux-iscsi.org> > >>>> > >>>Not so fast. There are two problems with this approach: > >>> > >>>The path is retried indefinitely. Arrays are _supposed_ to be in 'transitioning' > >>>only temporary; however, if the array is stuck due to a fw error we're stuck in 'defer', > >>>too. > >> > >>And what is the problem with that? The IO will eventually time out. > > > >To restate as a question: even though we'll retry in alua_rtpg(); > >shouldn't the SCSI command eventually time out (via > >scsi_attempt_requeue_command)? > > That function is only in RHEL. Requests that are prepd and sent to > the scsi layer and driver would eventually timeout in > scsi_softirq_done in upstream. > > alua_prep_fn prevents the IO from getting sent to the scsi layer so > we do not hit the check in scsi_softirq_done though. That is only the case if alua_prep_fn were to return BLKPREP_DEFER right? Taking a step back: 1) the proposed alua_prep_fn BLKPREP_DEFER change is not correct and is no longer being proposed... 2) the patch also modified alua_rtpg() so implicit ALUA would retry (just like explicit ALUA currently does) if TPGS_STATE_TRANSITIONING - so why should we avoid retry for implicit but do it for explicit? ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support 2010-09-21 19:33 ` Mike Snitzer @ 2010-09-21 21:14 ` Mike Christie 2010-09-22 10:13 ` Hannes Reinecke 0 siblings, 1 reply; 14+ messages in thread From: Mike Christie @ 2010-09-21 21:14 UTC (permalink / raw) To: Mike Snitzer Cc: Hannes Reinecke, Nicholas A. Bellinger, James Bottomley, linux-scsi On 09/21/2010 02:33 PM, Mike Snitzer wrote: > On Mon, Sep 20 2010 at 10:27pm -0400, > Mike Christie<michaelc@cs.wisc.edu> wrote: > >> On 09/20/2010 10:35 AM, Mike Snitzer wrote: >>> Hi Hannes, >>> >>> On Tue, Aug 31 2010 at 11:11am -0400, >>> Mike Snitzer<snitzer@redhat.com> wrote: >>> >>>> On Mon, Aug 30 2010 at 5:36am -0400, >>>> Hannes Reinecke<hare@suse.de> wrote: >>>> >>>>> Nicholas A. Bellinger wrote: >>>>>> On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote: >>>>>>> Handle transitioning in the prep_fn. >>>>>>> Handle transitioning in alua_rtpg's implicit alua code too. >>>>>>> >>>>>>> These gaps were identified during controller failover testing of an >>>>>>> ALUA array. >>>>>>> >>>>>>> Signed-off-by: Mike Snitzer<snitzer@redhat.com> >>>>>>> --- >>>>>>> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++--- >>>>>>> 1 files changed, 7 insertions(+), 3 deletions(-) >>>>>>> >>>>>>> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c >>>>>>> index 1a970a7..c1eedc5 100644 >>>>>>> --- a/drivers/scsi/device_handler/scsi_dh_alua.c >>>>>>> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c >>>>>>> @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h) >>>>>>> h->state == TPGS_STATE_STANDBY) >>>>>>> /* Useable path if active */ >>>>>>> err = SCSI_DH_OK; >>>>>>> + else if (h->state == TPGS_STATE_TRANSITIONING) >>>>>>> + /* State transition, retry */ >>>>>>> + goto retry; >>>>>>> else >>>>>>> /* Path unuseable for unavailable/offline */ >>>>>>> err = SCSI_DH_DEV_OFFLINED; >>>>>>> @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req) >>>>>>> struct alua_dh_data *h = get_alua_data(sdev); >>>>>>> int ret = BLKPREP_OK; >>>>>>> >>>>>>> - if (h->state != TPGS_STATE_OPTIMIZED&& >>>>>>> - h->state != TPGS_STATE_NONOPTIMIZED) { >>>>>>> + if (h->state == TPGS_STATE_TRANSITIONING) >>>>>>> + ret = BLKPREP_DEFER; >>>>>>> + else if (h->state != TPGS_STATE_OPTIMIZED&& >>>>>>> + h->state != TPGS_STATE_NONOPTIMIZED) { >>>>>>> ret = BLKPREP_KILL; >>>>>>> req->cmd_flags |= REQ_QUIET; >>>>>>> } >>>>>>> return ret; >>>>>>> - >>>>>>> } >>>>>>> >>>>>> >>>>>> Makes sense to me.. >>>>>> >>>>>> Acked-by: Nicholas A. Bellinger<nab@linux-iscsi.org> >>>>>> >>>>> Not so fast. There are two problems with this approach: >>>>> >>>>> The path is retried indefinitely. Arrays are _supposed_ to be in 'transitioning' >>>>> only temporary; however, if the array is stuck due to a fw error we're stuck in 'defer', >>>>> too. >>>> >>>> And what is the problem with that? The IO will eventually time out. >>> >>> To restate as a question: even though we'll retry in alua_rtpg(); >>> shouldn't the SCSI command eventually time out (via >>> scsi_attempt_requeue_command)? >> >> That function is only in RHEL. Requests that are prepd and sent to >> the scsi layer and driver would eventually timeout in >> scsi_softirq_done in upstream. >> >> alua_prep_fn prevents the IO from getting sent to the scsi layer so >> we do not hit the check in scsi_softirq_done though. > > That is only the case if alua_prep_fn were to return BLKPREP_DEFER > right? Yeah. I misread the email above. Just to clarify.. For the alua case, we will always retry due the prep_fn issue I mentioned. For the alua_rtpg() retry case you were discussing we will also always retry, because the timer check in scsi_attempt_requeue_command/scsi_softirq_done will break us from retrying forever in that execution of the request started by the submit_rtpg call. However, the added goto retry will would just end up starting a another execution. To handle Hannes comment I think you would want to add some retry/timer checks in alua_rtpg to prevent that from retrying forever. > > 2) the patch also modified alua_rtpg() so implicit ALUA would retry > (just like explicit ALUA currently does) if TPGS_STATE_TRANSITIONING > - so why should we avoid retry for implicit but do it for explicit? Leaving that for Hannes. I cannot think of a reason. Probably just did not do it. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support 2010-09-21 21:14 ` Mike Christie @ 2010-09-22 10:13 ` Hannes Reinecke 2010-09-22 12:29 ` Mike Snitzer 0 siblings, 1 reply; 14+ messages in thread From: Hannes Reinecke @ 2010-09-22 10:13 UTC (permalink / raw) To: Mike Christie Cc: Mike Snitzer, Nicholas A. Bellinger, James Bottomley, linux-scsi [-- Attachment #1: Type: text/plain, Size: 4863 bytes --] Mike Christie wrote: > On 09/21/2010 02:33 PM, Mike Snitzer wrote: >> On Mon, Sep 20 2010 at 10:27pm -0400, >> Mike Christie<michaelc@cs.wisc.edu> wrote: >> >>> On 09/20/2010 10:35 AM, Mike Snitzer wrote: >>>> Hi Hannes, >>>> >>>> On Tue, Aug 31 2010 at 11:11am -0400, >>>> Mike Snitzer<snitzer@redhat.com> wrote: >>>> >>>>> On Mon, Aug 30 2010 at 5:36am -0400, >>>>> Hannes Reinecke<hare@suse.de> wrote: >>>>> >>>>>> Nicholas A. Bellinger wrote: >>>>>>> On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote: >>>>>>>> Handle transitioning in the prep_fn. >>>>>>>> Handle transitioning in alua_rtpg's implicit alua code too. >>>>>>>> >>>>>>>> These gaps were identified during controller failover testing of an >>>>>>>> ALUA array. >>>>>>>> >>>>>>>> Signed-off-by: Mike Snitzer<snitzer@redhat.com> >>>>>>>> --- >>>>>>>> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++--- >>>>>>>> 1 files changed, 7 insertions(+), 3 deletions(-) >>>>>>>> >>>>>>>> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c >>>>>>>> b/drivers/scsi/device_handler/scsi_dh_alua.c >>>>>>>> index 1a970a7..c1eedc5 100644 >>>>>>>> --- a/drivers/scsi/device_handler/scsi_dh_alua.c >>>>>>>> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c >>>>>>>> @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device >>>>>>>> *sdev, struct alua_dh_data *h) >>>>>>>> h->state == TPGS_STATE_STANDBY) >>>>>>>> /* Useable path if active */ >>>>>>>> err = SCSI_DH_OK; >>>>>>>> + else if (h->state == TPGS_STATE_TRANSITIONING) >>>>>>>> + /* State transition, retry */ >>>>>>>> + goto retry; >>>>>>>> else >>>>>>>> /* Path unuseable for unavailable/offline */ >>>>>>>> err = SCSI_DH_DEV_OFFLINED; >>>>>>>> @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device >>>>>>>> *sdev, struct request *req) >>>>>>>> struct alua_dh_data *h = get_alua_data(sdev); >>>>>>>> int ret = BLKPREP_OK; >>>>>>>> >>>>>>>> - if (h->state != TPGS_STATE_OPTIMIZED&& >>>>>>>> - h->state != TPGS_STATE_NONOPTIMIZED) { >>>>>>>> + if (h->state == TPGS_STATE_TRANSITIONING) >>>>>>>> + ret = BLKPREP_DEFER; >>>>>>>> + else if (h->state != TPGS_STATE_OPTIMIZED&& >>>>>>>> + h->state != TPGS_STATE_NONOPTIMIZED) { >>>>>>>> ret = BLKPREP_KILL; >>>>>>>> req->cmd_flags |= REQ_QUIET; >>>>>>>> } >>>>>>>> return ret; >>>>>>>> - >>>>>>>> } >>>>>>>> >>>>>>> >>>>>>> Makes sense to me.. >>>>>>> >>>>>>> Acked-by: Nicholas A. Bellinger<nab@linux-iscsi.org> >>>>>>> >>>>>> Not so fast. There are two problems with this approach: >>>>>> >>>>>> The path is retried indefinitely. Arrays are _supposed_ to be in >>>>>> 'transitioning' >>>>>> only temporary; however, if the array is stuck due to a fw error >>>>>> we're stuck in 'defer', >>>>>> too. >>>>> >>>>> And what is the problem with that? The IO will eventually time out. >>>> >>>> To restate as a question: even though we'll retry in alua_rtpg(); >>>> shouldn't the SCSI command eventually time out (via >>>> scsi_attempt_requeue_command)? >>> >>> That function is only in RHEL. Requests that are prepd and sent to >>> the scsi layer and driver would eventually timeout in >>> scsi_softirq_done in upstream. >>> >>> alua_prep_fn prevents the IO from getting sent to the scsi layer so >>> we do not hit the check in scsi_softirq_done though. >> >> That is only the case if alua_prep_fn were to return BLKPREP_DEFER >> right? > > Yeah. I misread the email above. Just to clarify.. > > For the alua case, we will always retry due the prep_fn issue I mentioned. > > For the alua_rtpg() retry case you were discussing we will also always > retry, because the timer check in > scsi_attempt_requeue_command/scsi_softirq_done will break us from > retrying forever in that execution of the request started by the > submit_rtpg call. However, the added goto retry will would just end up > starting a another execution. To handle Hannes comment I think you would > want to add some retry/timer checks in alua_rtpg to prevent that from > retrying forever. > >> >> 2) the patch also modified alua_rtpg() so implicit ALUA would retry >> (just like explicit ALUA currently does) if TPGS_STATE_TRANSITIONING >> - so why should we avoid retry for implicit but do it for explicit? > > Leaving that for Hannes. I cannot think of a reason. Probably just did > not do it. Finally I got around to answering this. I've attached a patch which I made the other day which seems to work reasonably well. Looks better from my side, so if you agree I'll be sending it upstream properly. Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Markus Rex, HRB 16746 (AG Nürnberg) [-- Attachment #2: scsi-dh-alua-handle-all-states --] [-- Type: text/plain, Size: 4696 bytes --] >From d3f02c90db3e3177309b78726d082e17dd772ee2 Mon Sep 17 00:00:00 2001 From: Hannes Reinecke <hare@suse.de> Date: Wed, 22 Sep 2010 12:09:07 +0200 Subject: [PATCH] scsi_dh_alua: Handle all states correctly For ALUA we should be handling all states, independent of whether is explicit or implicit. For 'Transitioning' we should be retry for a certain amount of time; after that an error should be returned. Signed-off-by: Hannes Reinecke <hare@suse.de> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c index 1a970a7..c6f57e3 100644 --- a/drivers/scsi/device_handler/scsi_dh_alua.c +++ b/drivers/scsi/device_handler/scsi_dh_alua.c @@ -31,6 +31,7 @@ #define TPGS_STATE_NONOPTIMIZED 0x1 #define TPGS_STATE_STANDBY 0x2 #define TPGS_STATE_UNAVAILABLE 0x3 +#define TPGS_STATE_LBA_DEPENDENT 0x4 #define TPGS_STATE_OFFLINE 0xe #define TPGS_STATE_TRANSITIONING 0xf @@ -39,6 +40,7 @@ #define TPGS_SUPPORT_NONOPTIMIZED 0x02 #define TPGS_SUPPORT_STANDBY 0x04 #define TPGS_SUPPORT_UNAVAILABLE 0x08 +#define TPGS_SUPPORT_LBA_DEPENDENT 0x10 #define TPGS_SUPPORT_OFFLINE 0x40 #define TPGS_SUPPORT_TRANSITION 0x80 @@ -460,6 +462,8 @@ static char print_alua_state(int state) return 'S'; case TPGS_STATE_UNAVAILABLE: return 'U'; + case TPGS_STATE_LBA_DEPENDENT: + return 'L'; case TPGS_STATE_OFFLINE: return 'O'; case TPGS_STATE_TRANSITIONING: @@ -542,7 +546,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h) int len, k, off, valid_states = 0; char *ucp; unsigned err; + unsigned long expiry, interval = 10; + expiry = round_jiffies_up(jiffies + ALUA_FAILOVER_TIMEOUT); retry: err = submit_rtpg(sdev, h); @@ -553,7 +559,7 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h) return SCSI_DH_IO; err = alua_check_sense(sdev, &sense_hdr); - if (err == ADD_TO_MLQUEUE) + if (err == ADD_TO_MLQUEUE && time_before(jiffies, expiry)) goto retry; sdev_printk(KERN_INFO, sdev, "%s: rtpg sense code %02x/%02x/%02x\n", @@ -587,38 +593,36 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h) } sdev_printk(KERN_INFO, sdev, - "%s: port group %02x state %c supports %c%c%c%c%c%c\n", + "%s: port group %02x state %c supports %c%c%c%c%c%c%c\n", ALUA_DH_NAME, h->group_id, print_alua_state(h->state), valid_states&TPGS_SUPPORT_TRANSITION?'T':'t', valid_states&TPGS_SUPPORT_OFFLINE?'O':'o', + valid_states&TPGS_SUPPORT_LBA_DEPENDENT?'L':'l', valid_states&TPGS_SUPPORT_UNAVAILABLE?'U':'u', valid_states&TPGS_SUPPORT_STANDBY?'S':'s', valid_states&TPGS_SUPPORT_NONOPTIMIZED?'N':'n', valid_states&TPGS_SUPPORT_OPTIMIZED?'A':'a'); - if (h->tpgs & TPGS_MODE_EXPLICIT) { - switch (h->state) { - case TPGS_STATE_TRANSITIONING: + switch (h->state) { + case TPGS_STATE_TRANSITIONING: + if (time_before(jiffies, expiry)) { /* State transition, retry */ + interval *= 10; + msleep(interval); goto retry; - break; - case TPGS_STATE_OFFLINE: - /* Path is offline, fail */ - err = SCSI_DH_DEV_OFFLINED; - break; - default: - break; } - } else { - /* Only Implicit ALUA support */ - if (h->state == TPGS_STATE_OPTIMIZED || - h->state == TPGS_STATE_NONOPTIMIZED || - h->state == TPGS_STATE_STANDBY) - /* Useable path if active */ - err = SCSI_DH_OK; - else - /* Path unuseable for unavailable/offline */ - err = SCSI_DH_DEV_OFFLINED; + /* Transitioning time exceeded */ + err = SCSI_DH_RETRY; + break; + case TPGS_STATE_OFFLINE: + case TPGS_STATE_UNAVAILABLE: + /* Path unuseable for unavailable/offline */ + err = SCSI_DH_DEV_OFFLINED; + break; + default: + /* Useable path if active */ + err = SCSI_DH_OK; + break; } return err; } @@ -672,7 +676,9 @@ static int alua_activate(struct scsi_device *sdev, goto out; } - if (h->tpgs & TPGS_MODE_EXPLICIT && h->state != TPGS_STATE_OPTIMIZED) { + if (h->tpgs & TPGS_MODE_EXPLICIT && + h->state != TPGS_STATE_OPTIMIZED && + h->state != TPGS_STATE_LBA_DEPENDENT) { h->callback_fn = fn; h->callback_data = data; err = submit_stpg(h); @@ -698,8 +704,11 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req) struct alua_dh_data *h = get_alua_data(sdev); int ret = BLKPREP_OK; - if (h->state != TPGS_STATE_OPTIMIZED && - h->state != TPGS_STATE_NONOPTIMIZED) { + if (h->state == TPGS_STATE_TRANSITIONING) + ret = BLKPREP_DEFER; + else if (h->state != TPGS_STATE_OPTIMIZED && + h->state != TPGS_STATE_NONOPTIMIZED && + h->state != TPGS_STATE_LBA_DEPENDENT) { ret = BLKPREP_KILL; req->cmd_flags |= REQ_QUIET; } ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support 2010-09-22 10:13 ` Hannes Reinecke @ 2010-09-22 12:29 ` Mike Snitzer 2010-09-23 7:15 ` Hannes Reinecke 0 siblings, 1 reply; 14+ messages in thread From: Mike Snitzer @ 2010-09-22 12:29 UTC (permalink / raw) To: Hannes Reinecke Cc: Mike Christie, Nicholas A. Bellinger, James Bottomley, linux-scsi On Wed, Sep 22 2010 at 6:13am -0400, Hannes Reinecke <hare@suse.de> wrote: > Mike Christie wrote: > > On 09/21/2010 02:33 PM, Mike Snitzer wrote: > >> 2) the patch also modified alua_rtpg() so implicit ALUA would retry > >> (just like explicit ALUA currently does) if TPGS_STATE_TRANSITIONING > >> - so why should we avoid retry for implicit but do it for explicit? > > > > Leaving that for Hannes. I cannot think of a reason. Probably just did > > not do it. > > Finally I got around to answering this. > > I've attached a patch which I made the other day which seems to work > reasonably well. > Looks better from my side, so if you agree I'll be sending it > upstream properly. Looks good for covering this "2)" change above. But "1)" change you had concern with (in my previous patch) was that alua_prep_fn would return BLKPREP_DEFER if TPGS_STATE_TRANSITIONING. This was bad in that FS requests (like the directio path checker) would always call ->prep_fn and that if TPGS_STATE_TRANSITIONING the RTPG state would never be reevaluated.. leaving us stuck in BLKPREP_DEFER. Seems I'm missing a new flow that proves this is no longer a concern. Please advise, thanks. Mike > From d3f02c90db3e3177309b78726d082e17dd772ee2 Mon Sep 17 00:00:00 2001 > From: Hannes Reinecke <hare@suse.de> > Date: Wed, 22 Sep 2010 12:09:07 +0200 > Subject: [PATCH] scsi_dh_alua: Handle all states correctly > > For ALUA we should be handling all states, independent of whether > is explicit or implicit. For 'Transitioning' we should be retry > for a certain amount of time; after that an error should be > returned. > > Signed-off-by: Hannes Reinecke <hare@suse.de> > > diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c > index 1a970a7..c6f57e3 100644 > --- a/drivers/scsi/device_handler/scsi_dh_alua.c > +++ b/drivers/scsi/device_handler/scsi_dh_alua.c ... > @@ -698,8 +704,11 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req) > struct alua_dh_data *h = get_alua_data(sdev); > int ret = BLKPREP_OK; > > - if (h->state != TPGS_STATE_OPTIMIZED && > - h->state != TPGS_STATE_NONOPTIMIZED) { > + if (h->state == TPGS_STATE_TRANSITIONING) > + ret = BLKPREP_DEFER; > + else if (h->state != TPGS_STATE_OPTIMIZED && > + h->state != TPGS_STATE_NONOPTIMIZED && > + h->state != TPGS_STATE_LBA_DEPENDENT) { > ret = BLKPREP_KILL; > req->cmd_flags |= REQ_QUIET; > } ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support 2010-09-22 12:29 ` Mike Snitzer @ 2010-09-23 7:15 ` Hannes Reinecke 2010-09-23 13:44 ` Mike Snitzer 0 siblings, 1 reply; 14+ messages in thread From: Hannes Reinecke @ 2010-09-23 7:15 UTC (permalink / raw) To: Mike Snitzer Cc: Mike Christie, Nicholas A. Bellinger, James Bottomley, linux-scsi [-- Attachment #1: Type: text/plain, Size: 1827 bytes --] Mike Snitzer wrote: > > On Wed, Sep 22 2010 at 6:13am -0400, > Hannes Reinecke <hare@suse.de> wrote: > >> Mike Christie wrote: >>> On 09/21/2010 02:33 PM, Mike Snitzer wrote: >>>> 2) the patch also modified alua_rtpg() so implicit ALUA would retry >>>> (just like explicit ALUA currently does) if TPGS_STATE_TRANSITIONING >>>> - so why should we avoid retry for implicit but do it for explicit? >>> Leaving that for Hannes. I cannot think of a reason. Probably just did >>> not do it. >> Finally I got around to answering this. >> >> I've attached a patch which I made the other day which seems to work >> reasonably well. >> Looks better from my side, so if you agree I'll be sending it >> upstream properly. > > Looks good for covering this "2)" change above. > > But "1)" change you had concern with (in my previous patch) was that > alua_prep_fn would return BLKPREP_DEFER if TPGS_STATE_TRANSITIONING. > This was bad in that FS requests (like the directio path checker) would > always call ->prep_fn and that if TPGS_STATE_TRANSITIONING the RTPG > state would never be reevaluated.. leaving us stuck in BLKPREP_DEFER. > > Seems I'm missing a new flow that proves this is no longer a concern. > The device_handler will keep retrying any path in 'transitioning'. After ALUA_FAILOVER_TIMEOUT the device handler will either have updated the port state to something other than 'transitioning' or returned with 'SCSI_DH_RETRY'. But yes, you are right; we should be setting the port to something else than 'transitioning' then. Probably 'standby' is a good choice here. New patch attached. Thanks for the review. Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Markus Rex, HRB 16746 (AG Nürnberg) [-- Attachment #2: scsi-dh-alua-handle-all-states --] [-- Type: text/plain, Size: 4832 bytes --] >From db02ba3c33a4a1dbfcdc4de1b73e378f7b3d8925 Mon Sep 17 00:00:00 2001 From: Hannes Reinecke <hare@suse.de> Date: Wed, 22 Sep 2010 12:09:07 +0200 Subject: [PATCH] scsi_dh_alua: Handle all states correctly For ALUA we should be handling all states, independent of whether is explicit or implicit. For 'Transitioning' we should be retry for a certain amount of time; after that we're setting the port to 'Standby' and return SCSI_DH_RETRY to signal upper layers a retry is in order here. Signed-off-by: Hannes Reinecke <hare@suse.de> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c index 1a970a7..c7e11ed 100644 --- a/drivers/scsi/device_handler/scsi_dh_alua.c +++ b/drivers/scsi/device_handler/scsi_dh_alua.c @@ -31,6 +31,7 @@ #define TPGS_STATE_NONOPTIMIZED 0x1 #define TPGS_STATE_STANDBY 0x2 #define TPGS_STATE_UNAVAILABLE 0x3 +#define TPGS_STATE_LBA_DEPENDENT 0x4 #define TPGS_STATE_OFFLINE 0xe #define TPGS_STATE_TRANSITIONING 0xf @@ -39,6 +40,7 @@ #define TPGS_SUPPORT_NONOPTIMIZED 0x02 #define TPGS_SUPPORT_STANDBY 0x04 #define TPGS_SUPPORT_UNAVAILABLE 0x08 +#define TPGS_SUPPORT_LBA_DEPENDENT 0x10 #define TPGS_SUPPORT_OFFLINE 0x40 #define TPGS_SUPPORT_TRANSITION 0x80 @@ -460,6 +462,8 @@ static char print_alua_state(int state) return 'S'; case TPGS_STATE_UNAVAILABLE: return 'U'; + case TPGS_STATE_LBA_DEPENDENT: + return 'L'; case TPGS_STATE_OFFLINE: return 'O'; case TPGS_STATE_TRANSITIONING: @@ -542,7 +546,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h) int len, k, off, valid_states = 0; char *ucp; unsigned err; + unsigned long expiry, interval = 10; + expiry = round_jiffies_up(jiffies + ALUA_FAILOVER_TIMEOUT); retry: err = submit_rtpg(sdev, h); @@ -553,7 +559,7 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h) return SCSI_DH_IO; err = alua_check_sense(sdev, &sense_hdr); - if (err == ADD_TO_MLQUEUE) + if (err == ADD_TO_MLQUEUE && time_before(jiffies, expiry)) goto retry; sdev_printk(KERN_INFO, sdev, "%s: rtpg sense code %02x/%02x/%02x\n", @@ -587,38 +593,37 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h) } sdev_printk(KERN_INFO, sdev, - "%s: port group %02x state %c supports %c%c%c%c%c%c\n", + "%s: port group %02x state %c supports %c%c%c%c%c%c%c\n", ALUA_DH_NAME, h->group_id, print_alua_state(h->state), valid_states&TPGS_SUPPORT_TRANSITION?'T':'t', valid_states&TPGS_SUPPORT_OFFLINE?'O':'o', + valid_states&TPGS_SUPPORT_LBA_DEPENDENT?'L':'l', valid_states&TPGS_SUPPORT_UNAVAILABLE?'U':'u', valid_states&TPGS_SUPPORT_STANDBY?'S':'s', valid_states&TPGS_SUPPORT_NONOPTIMIZED?'N':'n', valid_states&TPGS_SUPPORT_OPTIMIZED?'A':'a'); - if (h->tpgs & TPGS_MODE_EXPLICIT) { - switch (h->state) { - case TPGS_STATE_TRANSITIONING: + switch (h->state) { + case TPGS_STATE_TRANSITIONING: + if (time_before(jiffies, expiry)) { /* State transition, retry */ + interval *= 10; + msleep(interval); goto retry; - break; - case TPGS_STATE_OFFLINE: - /* Path is offline, fail */ - err = SCSI_DH_DEV_OFFLINED; - break; - default: - break; } - } else { - /* Only Implicit ALUA support */ - if (h->state == TPGS_STATE_OPTIMIZED || - h->state == TPGS_STATE_NONOPTIMIZED || - h->state == TPGS_STATE_STANDBY) - /* Useable path if active */ - err = SCSI_DH_OK; - else - /* Path unuseable for unavailable/offline */ - err = SCSI_DH_DEV_OFFLINED; + /* Transitioning time exceeded, set port to standby */ + err = SCSI_DH_RETRY; + h->state = TPGS_STATE_STANDBY; + break; + case TPGS_STATE_OFFLINE: + case TPGS_STATE_UNAVAILABLE: + /* Path unuseable for unavailable/offline */ + err = SCSI_DH_DEV_OFFLINED; + break; + default: + /* Useable path if active */ + err = SCSI_DH_OK; + break; } return err; } @@ -672,7 +677,9 @@ static int alua_activate(struct scsi_device *sdev, goto out; } - if (h->tpgs & TPGS_MODE_EXPLICIT && h->state != TPGS_STATE_OPTIMIZED) { + if (h->tpgs & TPGS_MODE_EXPLICIT && + h->state != TPGS_STATE_OPTIMIZED && + h->state != TPGS_STATE_LBA_DEPENDENT) { h->callback_fn = fn; h->callback_data = data; err = submit_stpg(h); @@ -698,8 +705,11 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req) struct alua_dh_data *h = get_alua_data(sdev); int ret = BLKPREP_OK; - if (h->state != TPGS_STATE_OPTIMIZED && - h->state != TPGS_STATE_NONOPTIMIZED) { + if (h->state == TPGS_STATE_TRANSITIONING) + ret = BLKPREP_DEFER; + else if (h->state != TPGS_STATE_OPTIMIZED && + h->state != TPGS_STATE_NONOPTIMIZED && + h->state != TPGS_STATE_LBA_DEPENDENT) { ret = BLKPREP_KILL; req->cmd_flags |= REQ_QUIET; } ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support 2010-09-23 7:15 ` Hannes Reinecke @ 2010-09-23 13:44 ` Mike Snitzer 2010-09-23 18:53 ` Mike Snitzer 0 siblings, 1 reply; 14+ messages in thread From: Mike Snitzer @ 2010-09-23 13:44 UTC (permalink / raw) To: Hannes Reinecke Cc: Mike Christie, Nicholas A. Bellinger, James Bottomley, linux-scsi On Thu, Sep 23 2010 at 3:15am -0400, Hannes Reinecke <hare@suse.de> wrote: > The device_handler will keep retrying any path in 'transitioning'. > After ALUA_FAILOVER_TIMEOUT the device handler will either have > updated the port state to something other than 'transitioning' or > returned with 'SCSI_DH_RETRY'. > But yes, you are right; we should be setting the port to something > else than 'transitioning' then. Probably 'standby' is a good choice > here. > New patch attached. > > Thanks for the review. > > Cheers, > > Hannes > -- > Dr. Hannes Reinecke zSeries & Storage > hare@suse.de +49 911 74053 688 > SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg > GF: Markus Rex, HRB 16746 (AG Nürnberg) > From db02ba3c33a4a1dbfcdc4de1b73e378f7b3d8925 Mon Sep 17 00:00:00 2001 > From: Hannes Reinecke <hare@suse.de> > Date: Wed, 22 Sep 2010 12:09:07 +0200 > Subject: [PATCH] scsi_dh_alua: Handle all states correctly > > For ALUA we should be handling all states, independent of whether > is explicit or implicit. For 'Transitioning' we should be retry > for a certain amount of time; after that we're setting the port > to 'Standby' and return SCSI_DH_RETRY to signal upper layers > a retry is in order here. > > Signed-off-by: Hannes Reinecke <hare@suse.de> Acked-by: Mike Snitzer <snitzer@redhat.com> Thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support 2010-09-23 13:44 ` Mike Snitzer @ 2010-09-23 18:53 ` Mike Snitzer 0 siblings, 0 replies; 14+ messages in thread From: Mike Snitzer @ 2010-09-23 18:53 UTC (permalink / raw) To: Hannes Reinecke Cc: Mike Christie, Nicholas A. Bellinger, James Bottomley, linux-scsi On Thu, Sep 23 2010 at 9:44am -0400, Mike Snitzer <snitzer@redhat.com> wrote: > > From db02ba3c33a4a1dbfcdc4de1b73e378f7b3d8925 Mon Sep 17 00:00:00 2001 > > From: Hannes Reinecke <hare@suse.de> > > Date: Wed, 22 Sep 2010 12:09:07 +0200 > > Subject: [PATCH] scsi_dh_alua: Handle all states correctly > > > > For ALUA we should be handling all states, independent of whether > > is explicit or implicit. For 'Transitioning' we should be retry > > for a certain amount of time; after that we're setting the port > > to 'Standby' and return SCSI_DH_RETRY to signal upper layers > > a retry is in order here. > > > > Signed-off-by: Hannes Reinecke <hare@suse.de> > > Acked-by: Mike Snitzer <snitzer@redhat.com> BTW, we're missing the include for the msleep() introduced in this patch: #include <linux/delay.h> James, any chance you'd add that? Or should Hannes send a revised patch? Mike ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2010-09-23 18:53 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-08-17 19:05 [PATCH] scsi_dh_alua: add missing transitioning state support Mike Snitzer 2010-08-17 19:23 ` Nicholas A. Bellinger 2010-08-30 9:36 ` Hannes Reinecke 2010-08-31 15:11 ` Mike Snitzer 2010-09-20 15:35 ` Mike Snitzer 2010-09-21 2:27 ` Mike Christie 2010-09-21 2:28 ` Mike Christie 2010-09-21 19:33 ` Mike Snitzer 2010-09-21 21:14 ` Mike Christie 2010-09-22 10:13 ` Hannes Reinecke 2010-09-22 12:29 ` Mike Snitzer 2010-09-23 7:15 ` Hannes Reinecke 2010-09-23 13:44 ` Mike Snitzer 2010-09-23 18:53 ` Mike Snitzer
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).