From: Alan Kasindorf <akasindorf@mail.communityconnect.com>
To: dm-devel@redhat.com
Subject: multipath-tools-0.4.4 on 3par unknown path failure issue
Date: Wed, 10 Aug 2005 17:49:40 -0400
Message-ID: <42FA7674.4070201@mail.communityconnect.com>

Hey,

I have ~10 machines running multipath-tools-0.4.4 on RHEL ES 4.1 (latest 
everything). The machines mount multipathed volumes from an EMC CLARiiON 
and a 3PAR SAN device, over the same fabric.

At some random point in time today, one of the machines lost one of its 
four 3par mounts. All other mounts worked fine. This has happened once 
or twice before as well, but we rebooted before I had time to inspect 
the issue.

multipath -v3 -l showed this status for the affected map:

params = 1 queue_if_no_path 0 1 1 round-robin 0 2 1 8:64 1000 8:176 1000
status = 1 3 0 1 1 E 0 2 0 8:64 F 3574 8:176 F 3574
exports (350002ac0005b02a4)
[size=150 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [enabled][first]
  \_ 5:0:0:3 sde  8:64    [ready ][failed]
  \_ 6:0:1:3 sdl  8:176   [ready ][failed]
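
For anyone reading the "status =" line above, here is a minimal Python sketch that splits it into its fields. The field layout (counted feature arguments, counted hardware handler arguments, number of groups, current group, then per-group and per-path entries) is an assumption inferred from this one sample and may differ on other kernel versions. The function name and dictionary keys are hypothetical, chosen for illustration only.

```python
def parse_multipath_status(status):
    """Split a dm-multipath status string into groups and paths.

    Assumed layout (inferred from the sample in this message):
      <n> <n feature args> <m> <m hw handler args>
      <num groups> <current group>
      per group: <state> <k> <k selector args> <num paths> <args per path>
      per path:  <major:minor> <A|F> <fail count> <selector args...>
    """
    tokens = status.split()
    pos = 0

    def take(n=1):
        # Consume the next n tokens, failing loudly if the line is short.
        nonlocal pos
        chunk = tokens[pos:pos + n]
        if len(chunk) != n:
            raise ValueError("status line ended early")
        pos += n
        return chunk

    feature_args = take(int(take()[0]))
    hw_handler_args = take(int(take()[0]))
    num_groups = int(take()[0])
    current_group = int(take()[0])

    groups = []
    for _ in range(num_groups):
        group_state = take()[0]
        take(int(take()[0]))          # group-level selector status args
        num_paths = int(take()[0])
        per_path_args = int(take()[0])
        paths = []
        for _ in range(num_paths):
            dev, path_state, fail_count = take(3)
            take(per_path_args)       # per-path selector args, unused here
            paths.append({"dev": dev,
                          "state": path_state,
                          "fail_count": int(fail_count)})
        groups.append({"state": group_state, "paths": paths})

    return {"feature_args": feature_args,
            "hw_handler_args": hw_handler_args,
            "current_group": current_group,
            "groups": groups}


if __name__ == "__main__":
    sample = "1 3 0 1 1 E 0 2 0 8:64 F 3574 8:176 F 3574"
    for group in parse_multipath_status(sample)["groups"]:
        for path in group["paths"]:
            print(path["dev"], path["state"], path["fail_count"])
```

Under that reading, both 8:64 and 8:176 are in state F with a fail count of 3574 each, which fits the repeated "Failing path" messages in the log.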

This was being spammed into /var/log/messages once every five seconds 
(the multipathd polling interval):

Aug 10 15:35:43 cc42-86 multipathd: 8:64: tur checker reports path is up
Aug 10 15:35:43 cc42-86 multipathd: devmap event (8163) on exports
Aug 10 15:35:43 cc42-86 kernel: device-mapper: dm-multipath: Failing path 8:176.
Aug 10 15:35:43 cc42-86 kernel: device-mapper: dm-multipath: Failing path 8:64.
Aug 10 15:35:43 cc42-86 multipathd: 8:176: tur checker reports path is up
Aug 10 15:35:43 cc42-86 kernel: cdrom: open failed.
Aug 10 15:35:43 cc42-86 kernel: device-mapper: dm-multipath: Failing path 8:176.
Aug 10 15:35:43 cc42-86 kernel: device-mapper: dm-multipath: Failing path 8:64.
Aug 10 15:35:43 cc42-86 kernel: cdrom: open failed.
Aug 10 15:35:43 cc42-86 multipathd: open(/dev/hdc) failed
Aug 10 15:35:43 cc42-86 multipathd: mark 8:64 as failed
Aug 10 15:35:43 cc42-86 multipathd: mark 8:176 as failed
Aug 10 15:35:43 cc42-86 multipathd: devmap event (8164) on exports
Aug 10 15:35:43 cc42-86 kernel: cdrom: open failed.
Aug 10 15:35:43 cc42-86 multipathd: open(/dev/hdc) failed
Aug 10 15:35:43 cc42-86 kernel: cdrom: open failed.
Aug 10 15:35:43 cc42-86 multipathd: open(/dev/hdc) failed

The tur checker sees each path as up, the kernel fails it again, ad 
infinitum.
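
A rough way to spot this checker/kernel disagreement in a larger log is sketched below in Python. It assumes one syslog record per line (no mail-client wrapping) and the exact message wording shown in the excerpt above. The function name is hypothetical, and matching on the same one-second timestamp is a simplification, not something multipathd itself does.

```python
import re
from collections import defaultdict

# Message patterns copied from the log excerpt in this message.
CHECKER_UP = re.compile(r"multipathd: (\d+:\d+): tur checker reports path is up")
KERNEL_FAIL = re.compile(r"dm-multipath: Failing path (\d+:\d+)")


def find_disagreements(lines):
    """Return sorted (timestamp, dev) pairs where the tur checker reported
    a path up and the kernel failed the same path within the same second."""
    up = defaultdict(set)      # timestamp -> devices the checker called up
    failed = defaultdict(set)  # timestamp -> devices the kernel failed
    for line in lines:
        # Syslog timestamp is the first three fields: "Aug 10 15:35:43".
        stamp = " ".join(line.split()[:3])
        match = CHECKER_UP.search(line)
        if match:
            up[stamp].add(match.group(1))
        match = KERNEL_FAIL.search(line)
        if match:
            failed[stamp].add(match.group(1))
    return sorted((stamp, dev)
                  for stamp in up
                  for dev in up[stamp] & failed[stamp])


if __name__ == "__main__":
    with open("/var/log/messages") as log:
        for stamp, dev in find_disagreements(log):
            print(stamp, dev, "checker says up, kernel failed it")
```

Run against the excerpt above, this would flag both 8:64 and 8:176 at 15:35:43.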

Nothing I tried could elicit a more detailed error about why this was 
happening. The mount on top of it is a normal ext3 mount, and wasn't 
being accessed at the time of the failure as far as I know.

I switched off the queue_if_no_path option globally in the 
multipath.conf file. Immediately the ext3 journal failed out, and 
multipath brought both paths back as active:

exports (350002ac0005b02a4)
[size=150 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active][first]
  \_ 5:0:0:3 sde  8:64    [ready ][active]
  \_ 6:0:1:3 sdl  8:176   [ready ][active]

I was able to fsck the device and remount it without issue or reboot 
after that. Since then, I've left the queue option disabled to see if 
the problem creeps back.

My multipath.conf is basically the default file, plus some WWN-to-alias 
mappings, the EMC device info, and (until now) the queue_if_no_path 
option enabled. The problem is on the 3PAR, however. Only one of the 
four 3PAR mounts on the machine was having issues.

Is this known at all? Is there anything else I can provide so that we 
can figure out why this happened? I had been running multipath-tools for 
two months on a test box and never encountered this problem. It has only 
snuck up as we've started deploying it on more machines for 
pre-production. All of the servers are identical: RHEL ES 4.1, same 
qla2300 fibre channel cards, same CPUs, etc.

We also encountered the EMC ghost LUN issue (discussed on here once), 
which is especially bad if queue_if_no_path is enabled. It sometimes 
causes a kernel panic and brings the machine down :(

Any assistance on the first or second issue would be appreciated!

Thanks,
-Alan
