dm-devel.redhat.com archive mirror
* multipath errors
@ 2013-09-10 11:29 Amitai Alkalay
  2013-09-23  8:09 ` Fwd: " Amitai Alkalay
  0 siblings, 1 reply; 4+ messages in thread
From: Amitai Alkalay @ 2013-09-10 11:29 UTC (permalink / raw)
  To: dm-devel



hi,

I sometimes see cases where io failed after some path failover, although
there are other valid paths.

I seem to get a lot of the following errors during a path removal
(failover):

Aug  1 08:59:37 lg641 multipathd: 3514f0c532d80000a: failed in domap for removal of path sdcy
Aug  1 08:59:37 lg641 multipathd: uevent trigger error
Aug  1 08:59:37 lg641 multipathd: sdt: remove path (uevent)
Aug  1 08:59:37 lg641 kernel: device-mapper: table: 253:5: multipath: error getting device
Aug  1 08:59:37 lg641 kernel: device-mapper: ioctl: error adding target to table
Aug  1 08:59:37 lg641 multipathd: 3514f0c532d800001: load table [0 21474836480 multipath 0 0 1 1 queue-length 0 12 1 66:240 1 66:80 1 70:192 1 71:96 1 67:176 1 67:112 1 71:224 1 128:128 1 8:16 1 69:16 1 69:32 1 8:32 1]
Aug  1 08:59:37 lg641 multipathd: sdt: path removed from map 3514f0c532d800001f

All the other paths are there, and still multipath decided to fail the io
with no apparent reason.

I would appreciate any comment about:

1. How can this happen.
2. How can I increase the log level to understand multipath decisions.
3. Why do I always see the errors regarding adding target to table.
The only explanation I can think of is that multipath temporarily bypassed
the other paths (maybe it got busy several times and gave up).

I'm using device-mapper-multipath-0.4.9-64.el6.x86_64.

Thanks a lot,
Amitai


* Fwd: multipath errors
  2013-09-10 11:29 multipath errors Amitai Alkalay
@ 2013-09-23  8:09 ` Amitai Alkalay
  2013-09-27 20:11   ` Benjamin Marzinski
  0 siblings, 1 reply; 4+ messages in thread
From: Amitai Alkalay @ 2013-09-23  8:09 UTC (permalink / raw)
  To: dm-devel



Hi,
Sorry for bumping, but I would appreciate any help on this matter.

Thanks,
Amitai



* Re: Fwd: multipath errors
  2013-09-23  8:09 ` Fwd: " Amitai Alkalay
@ 2013-09-27 20:11   ` Benjamin Marzinski
  2013-11-14  0:25     ` Stewart, Sean
  0 siblings, 1 reply; 4+ messages in thread
From: Benjamin Marzinski @ 2013-09-27 20:11 UTC (permalink / raw)
  To: device-mapper development

On Mon, Sep 23, 2013 at 11:09:18AM +0300, Amitai Alkalay wrote:
>    Hi,
>    Sorry for bumping, but I will appreciate any help on this matter..
>    Thanks,
>    Amitai
> 
>    ---------- Forwarded message ----------
>    From: Amitai Alkalay <amitai.alkalay.work@gmail.com>
>    Date: Tue, Sep 10, 2013 at 2:29 PM
>    Subject: multipath errors
>    To: dm-devel@redhat.com
> 
>    hi,
>    I sometimes see cases where io failed after some path failover, although
>    there are other valid paths.
>    I seem to get a lot of the following errors during a path removal
>    (failover):
> 
>  Aug  1 08:59:37 lg641 multipathd: 3514f0c532d80000a: failed in domap for removal of path sdcy
>  Aug  1 08:59:37 lg641 multipathd: uevent trigger error
>  Aug  1 08:59:37 lg641 multipathd: sdt: remove path (uevent)
>  Aug  1 08:59:37 lg641 kernel: device-mapper: table: 253:5: multipath: error getting device
>  Aug  1 08:59:37 lg641 kernel: device-mapper: ioctl: error adding target to table
>  Aug  1 08:59:37 lg641 multipathd: 3514f0c532d800001: load table [0 21474836480 multipath 0 0 1 1 queue-length 0 12 1 66:240 1 66:80 1 70:192 1 71:96 1 67:176 1 67:112 1 71:224 1 128:128 1 8:16 1 69:16 1 69:32 1 8:32 1]
>  Aug  1 08:59:37 lg641 multipathd: sdt: path removed from map 3514f0c532d800001f
> 
>    All the other paths are there, and still multipath decided to fail the io
>    with no apparent reason.
> 
>    I would appreciate any comment about:
> 
>    1. How can this happen.

It shouldn't.  The kernel should check through all of the paths before
failing.  Sometimes some actions on storage arrays temporarily bring all
paths down, but even in that case, you should see messages in the logs
of multipath trying all the paths before it fails.

>    2. How can I increase the log level to understand multipath decisions.

In /etc/multipath.conf:

defaults {
	...
	verbosity 3
}

This will add a lot of extra logging.  Make sure that your logging isn't
rate limited, or you will miss messages exactly when it's most important
to see them.


>    3. Why do I always see the errors regarding adding target to table.
>    The only thing I can think about, that multipath temporarily
>    bypassed the other paths (maybe it got busy several times and gave up).
>    I'm using device-mapper-multipath-0.4.9-64.el6.x86_64.

These messages:

Aug  1 08:59:37 lg641 kernel: device-mapper: table: 253:5: multipath: error getting device
Aug  1 08:59:37 lg641 kernel: device-mapper: ioctl: error adding target to table

usually mean that the device is already in use.  They shouldn't be in
relation to the device you are removing.  Do you get them when you
create the device as well?

Another possibility is that the device has the wrong permissions. For
instance, this happens whenever multipath tries to get a read-only
device.  Again, this doesn't seem like it could be referring to the
device that is being removed.  Unfortunately, the kernel doesn't give
any indication which path device is failing, or why.  That should
probably get fixed.
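
Since the kernel message does not name the failing path, one way to narrow it down is to take the table that multipathd logged and check each path device by hand. What follows is only a minimal sketch, assuming the table follows the usual dm-multipath layout (feature args, hardware-handler args, then priority groups with a selector and a path list) and that the host exposes /sys/dev/block/MAJOR:MINOR entries. The function names `path_devices` and `missing_devices` are made up for illustration; they are not part of multipath-tools.

```python
import os


def path_devices(table):
    """Return the major:minor path devices listed in a dm-multipath table line."""
    tok = table.split()
    if len(tok) < 3 or tok[2] != "multipath":
        raise ValueError("not a multipath table")
    i = 3
    i += 1 + int(tok[i])          # skip feature count and feature args
    i += 1 + int(tok[i])          # skip hardware-handler count and its args
    ngroups = int(tok[i])
    i += 2                        # skip group count and initial group number
    devices = []
    for _ in range(ngroups):
        i += 2 + int(tok[i + 1])  # skip selector name, its arg count, its args
        npaths, nargs = int(tok[i]), int(tok[i + 1])
        i += 2
        for _ in range(npaths):
            devices.append(tok[i])
            i += 1 + nargs        # skip the device and its per-path args
    return devices


def missing_devices(devices, sysfs_root="/sys"):
    """Return the devices that have no entry under <sysfs_root>/dev/block."""
    return [d for d in devices
            if not os.path.exists(os.path.join(sysfs_root, "dev", "block", d))]


# Table copied from the "load table" line in the report above.
TABLE = ("0 21474836480 multipath 0 0 1 1 queue-length 0 12 1 66:240 1 66:80 1 "
         "70:192 1 71:96 1 67:176 1 67:112 1 71:224 1 128:128 1 8:16 1 69:16 1 "
         "69:32 1 8:32 1")

if __name__ == "__main__":
    devs = path_devices(TABLE)
    print(len(devs), "paths in table")
    for dev in missing_devices(devs):
        print("no block device for", dev)
```

A device that shows up as missing here at the time of the error would be a candidate for the path the kernel could not get. A device that is present may still be failing for another reason (busy, read-only, offline), so this only rules out one case.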

Are you seeing IO errors for the multipath device in the messages?
Can you post those?

Could you post all of the log messages around the failure (I assume
there is a kernel message saying that an IO failed), along with
the multipath -l listing of the device both when no paths are failed,
and immediately after the error happens?
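
For collecting the messages around the failure, something along these lines may help. It is a rough sketch only: the pattern, the 20-line window, and the /var/log/messages path are assumptions chosen to match the log excerpt in this thread, and `context` is a hypothetical helper name.

```python
import re

# Matches the kernel and multipathd errors quoted in this thread (an assumption;
# extend the pattern to cover whatever IO error lines your kernel prints).
PATTERN = re.compile(r"device-mapper: (table|ioctl): .*error|failed in domap")


def context(lines, pattern=PATTERN, before=20, after=20):
    """Return every line within the given window around a matching line."""
    keep = set()
    for n, line in enumerate(lines):
        if pattern.search(line):
            keep.update(range(max(0, n - before),
                              min(len(lines), n + after + 1)))
    return [lines[n] for n in sorted(keep)]


if __name__ == "__main__":
    with open("/var/log/messages") as fh:
        for line in context(fh.read().splitlines()):
            print(line)
```

Overlapping windows are merged, so a burst of errors comes out as one contiguous excerpt rather than repeated lines.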

Also, it would be interesting to know if setting something like
"no_path_retry 5" would avoid the issue.  There's still a bug if multipath
isn't trying all the paths, but this would narrow down where to look.

-Ben

>    Thanks a lot,
>    Amitai

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


* Re: Fwd: multipath errors
  2013-09-27 20:11   ` Benjamin Marzinski
@ 2013-11-14  0:25     ` Stewart, Sean
  0 siblings, 0 replies; 4+ messages in thread
From: Stewart, Sean @ 2013-11-14  0:25 UTC (permalink / raw)
  To: device-mapper development

On Fri, 2013-09-27 at 15:11 -0500, Benjamin Marzinski wrote:
> On Mon, Sep 23, 2013 at 11:09:18AM +0300, Amitai Alkalay wrote:
> >    Hi,
> >    Sorry for bumping, but I will appreciate any help on this matter..
> >    Thanks,
> >    Amitai
> > 
> >    ---------- Forwarded message ----------
> >    From: Amitai Alkalay <amitai.alkalay.work@gmail.com>
> >    Date: Tue, Sep 10, 2013 at 2:29 PM
> >    Subject: multipath errors
> >    To: dm-devel@redhat.com
> > 
> >    hi,
> >    I sometimes see cases where io failed after some path failover, although
> >    there are other valid paths.
> >    I seem to get a lot of the following errors during a path removal
> >    (failover):
> > 
> >  Aug  1 08:59:37 lg641 multipathd: 3514f0c532d80000a: failed in domap for removal of path sdcy
> >  Aug  1 08:59:37 lg641 multipathd: uevent trigger error
> >  Aug  1 08:59:37 lg641 multipathd: sdt: remove path (uevent)
> >  Aug  1 08:59:37 lg641 kernel: device-mapper: table: 253:5: multipath: error getting device
> >  Aug  1 08:59:37 lg641 kernel: device-mapper: ioctl: error adding target to table
> >  Aug  1 08:59:37 lg641 multipathd: 3514f0c532d800001: load table [0 21474836480 multipath 0 0 1 1 queue-length 0 12 1 66:240 1 66:80 1 70:192 1 71:96 1 67:176 1 67:112 1 71:224 1 128:128 1 8:16 1 69:16 1 69:32 1 8:32 1]
> >  Aug  1 08:59:37 lg641 multipathd: sdt: path removed from map 3514f0c532d800001f
> > 
> >    All the other paths are there, and still multipath decided to fail the io
> >    with no apparent reason.
> > 
> >    I would appreciate any comment about:
> > 
> >    1. How can this happen.
> 
> It shouldn't.  The kernel should check through all of the paths before
> failing.  Sometimes some actions on storage arrays temporarily bring all
> paths down, but even in that case, you should see messages in the logs
> of multipath trying all the paths before it fails.
> 
> >    2. How can I increase the log level to understand multipath decisions.
> 
> in /etc/multipath.conf
> 
> defaults {
> 	...
> 	verbosity 3
> }
> 
> This will add a lot of extra logging.  Make sure that your logging isn't
> rate limited, or you will miss messages exactly when it's most important
> to see them.
> 
> 
> >    3. Why do I always see the errors regarding adding target to table.
> >    The only thing I can think about, that multipath temporarily
> >    bypassed the other paths (maybe it got busy several times and gave up).
> >    I'm using device-mapper-multipath-0.4.9-64.el6.x86_64.
> 
> These messages:
> 
> Aug  1 08:59:37 lg641 kernel: device-mapper: table: 253:5: multipath: error getting device
> Aug  1 08:59:37 lg641 kernel: device-mapper: ioctl: error adding target to table
> 
> usually mean that the device is already in use.  They shouldn't be in
> relation to the device you are removing.  Do you get them when you
> create the device as well?

Other than this, I've seen these messages occur because another path
in the map is offline, has already been deleted (with the removal event
not yet having reached userspace), or is otherwise unavailable.  In these
cases, I usually don't see I/O errors unless all the paths are gone.  One
offlined device can prevent paths from being added to or removed from the
map.

The messages file should give a clue if one of these is the case.
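
Besides the messages file, the SCSI state of each path can be read from sysfs. This is a small sketch, assuming the paths are sd devices whose state is exposed at /sys/block/sdX/device/state; the helper names `scsi_states` and `not_running` are invented for this example.

```python
import glob
import os


def scsi_states(sysfs_root="/sys"):
    """Map each sd device name to the contents of its SCSI state attribute."""
    states = {}
    pattern = os.path.join(sysfs_root, "block", "sd*", "device", "state")
    for state_file in glob.glob(pattern):
        name = state_file.split(os.sep)[-3]   # .../block/<name>/device/state
        with open(state_file) as fh:
            states[name] = fh.read().strip()
    return states


def not_running(states):
    """Return only the devices whose state is something other than 'running'."""
    return {name: state for name, state in states.items() if state != "running"}


if __name__ == "__main__":
    for name, state in sorted(not_running(scsi_states()).items()):
        print(name, state)
```

Any device printed here (for example one in the "offline" state) is worth comparing against the paths in the map that failed to reload.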

- Sean




