public inbox for linux-scsi@vger.kernel.org
From: Chandra Seetharaman <sekharan@us.ibm.com>
To: Grant Grundler <grundler@google.com>
Cc: "Moger, Babu" <Babu.Moger@lsi.com>,
	"dm-devel@redhat.com" <dm-devel@redhat.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"Chauhan, Vijay" <Vijay.Chauhan@lsi.com>
Subject: Re: [PATCH] dm mpath: Try recover from I/O failure by re-initializing  the PG if device is running on one path
Date: Wed, 22 Apr 2009 12:29:49 -0700
Message-ID: <1240428589.19442.14.camel@chandra-ubuntu>
In-Reply-To: <da824cf30904221041h40b87096yb4cb2784f31c5270@mail.gmail.com>


On Wed, 2009-04-22 at 10:41 -0700, Grant Grundler wrote:
> On Mon, Apr 20, 2009 at 11:05 AM, Moger, Babu <Babu.Moger@lsi.com> wrote:
> > This patch introduces the mechanism to recover from I/O failures by re-initializing the path if the device is running on only one path.
> >
> > Problem: Device mapper fails the path for every I/O error.
> > It does not care about the type of error.
> 
> This is the fundamental problem.  Different layers of the block IO
> path have to agree on how to handle each possible type of error that
> can be returned. I don't know where to find such an agreement and
> think an implementation that does discriminate is needed.
> 
> > Certain errors can be recovered from by re-initializing the path. I saw this problem while testing the rdac device handler: I/O errors occur when LUN ownership changes. When LUN ownership changes, the device returns a check condition with sense 0x05/0x94/0x01 (SK/ASC/ASCQ, meaning LUN ownership changed). Currently, device mapper fails the path for this error, which eventually leads to an I/O error. We don't want to see an I/O error for this reason.
> 
> 1) This patch isn't discriminating between transport, media, or other
> device errors. Wouldn't it make sense to discriminate?

Yes, it would. But currently we do not have that.
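
To make the idea of discriminating concrete, here is a minimal userspace
sketch of what classifying a failed command by its sense data could look
like. Everything named here (io_error_class, sense_triple, classify_sense)
is hypothetical and is not an existing kernel or dm-mpath interface. The
0x05/0x94/0x01 triple is the one quoted above from the rdac testing; the
sense keys 0x03 (MEDIUM ERROR) and 0x04 (HARDWARE ERROR) are the standard
SCSI values. Transport errors are usually reported without sense data, so
they are deliberately left out of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error classes, for illustration only (not kernel API). */
enum io_error_class {
	IO_ERR_OWNERSHIP_CHANGED,	/* may be recoverable by re-initializing the path */
	IO_ERR_MEDIA,			/* sense key 0x03, MEDIUM ERROR */
	IO_ERR_DEVICE,			/* sense key 0x04, HARDWARE ERROR */
	IO_ERR_OTHER,			/* anything this sketch does not recognize */
};

/* Sense key / additional sense code / qualifier from a check condition. */
struct sense_triple {
	uint8_t sk;
	uint8_t asc;
	uint8_t ascq;
};

/* Map a sense triple to one of the hypothetical classes above. */
enum io_error_class classify_sense(const struct sense_triple *s)
{
	assert(s != NULL);

	/* Vendor-specific triple reported in this thread for LUN ownership change. */
	if (s->sk == 0x05 && s->asc == 0x94 && s->ascq == 0x01)
		return IO_ERR_OWNERSHIP_CHANGED;
	if (s->sk == 0x03)
		return IO_ERR_MEDIA;
	if (s->sk == 0x04)
		return IO_ERR_DEVICE;
	return IO_ERR_OTHER;
}
```

With something along these lines, only IO_ERR_OWNERSHIP_CHANGED would be a
candidate for re-initializing the path instead of failing it.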

> "LUN ownership changed" sounds like one of the events that a
> multi-initiator environment would want to be notified about, and
> perhaps even act on (renegotiate access to
> 
> 2) Will this result in resetting a SATA device?
> I ask because device reset may result in data loss due to WCE enabled.
> I just don't know the higher parts of the block SW stack and how
> errors flow up the stack.

The device is not hung; the I/O will come back after a while.

BTW, activate doesn't do a reset. It just sends a command to the
controller (in the LSI rdac case, a mode select).

> 
> thanks,
> grant
> 
> >
> > The patch sets the pg_init_required flag if the device is running on a single path; process_queued_ios then re-initializes the path if required. I have tested this patch with the LSI rdac handler.
> >
> > Signed-off-by: Babu Moger <babu.moger@lsi.com>
> > ---
> >
> > --- linux-2.6.30-rc2/drivers/md/dm-mpath.c.orig 2009-04-17 16:49:33.000000000 -0500
> > +++ linux-2.6.30-rc2/drivers/md/dm-mpath.c      2009-04-17 17:09:51.000000000 -0500
> > @@ -1152,6 +1152,15 @@ static int do_end_io(struct multipath *m
> >                return error;
> >
> >        spin_lock_irqsave(&m->lock, flags);
> > +       /*
> > +        * If this is the only path left, then lets try to
> > +        * re-initialize the PG one last time..
> > +        */
> > +       if (m->nr_valid_paths == 1 && m->hw_handler_name) {
> > +               m->pg_init_required = 1;
> > +               spin_unlock_irqrestore(&m->lock, flags);
> > +               goto requeue;
> > +       }
> >        if (!m->nr_valid_paths) {
> >                if (__must_push_back(m)) {
> >                        spin_unlock_irqrestore(&m->lock, flags);
> >
> >
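
For readers following the hunk above, the decision it adds can be modelled
in userspace roughly as below. This is only a sketch: struct mpath_model,
enum end_io_action and last_path_check are hypothetical stand-ins, the
struct carries only the three fields the hunk touches, and the m->lock
spinlock handling is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the struct multipath fields used by the hunk. */
struct mpath_model {
	unsigned nr_valid_paths;	/* paths currently considered usable */
	const char *hw_handler_name;	/* NULL when no hardware handler is set */
	unsigned pg_init_required;	/* request re-initialization of the PG */
};

/* Hypothetical result type: requeue the I/O or continue with existing handling. */
enum end_io_action {
	ACTION_REQUEUE,
	ACTION_CONTINUE,
};

/*
 * Mirrors the added check: if exactly one valid path is left and a
 * hardware handler is configured, flag the PG for re-initialization
 * and requeue the I/O.
 */
enum end_io_action last_path_check(struct mpath_model *m)
{
	assert(m != NULL);

	if (m->nr_valid_paths == 1 && m->hw_handler_name) {
		m->pg_init_required = 1;
		return ACTION_REQUEUE;
	}
	return ACTION_CONTINUE;
}
```

Note that the check does not look at the error itself, so in this model any
failure on the last path with a handler configured takes the requeue branch.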

