From: Dushyanth Harinath <dushyanth.h@directi.com>
To: sekharan@linux.vnet.ibm.com,
	device-mapper development <dm-devel@redhat.com>
Subject: Re: Multipath failover issues
Date: Tue, 17 Mar 2009 18:00:00 +0530
Message-ID: <49BF97C8.1030406@directi.com>
In-Reply-To: <1237243253.309.13.camel@chandra-ubuntu>

Hi,

>> 257 Critical 2009-03-11 10:38:43 ALERT:Redundant Controller Failure
>> Detected (Slot B)
>>
>> I also found additional logs in /var/log/messages which I did not 
>> check earlier.
>>
>> Mar 11 10:32:46 multipathd: sdc: readsector0 checker reports path is down
>> Mar 11 10:32:46 multipathd: checker failed path 8:32 in map infortrend01
>> Mar 11 10:32:46 multipathd: infortrend01: remaining active paths: 1
>> Mar 11 10:32:46 multipathd: sdd: readsector0 checker reports path is down
>> Mar 11 10:32:46 multipathd: checker failed path 8:48 in map infortrend01
> 
> Does this timing correspond to when you turned off the controller ?

This is when the controller failed. The controller shutdown happened 
much later.

>> I am assuming it must have been busy for a few seconds during the 
>> switchover and the multipath config doesn't wait long enough for the 
>> switchover to complete.
> 
> Answer to your previous question would help here :)
> 
> Set no_path_retry to "queue", which would queue the I/Os when "all" the
> paths fail.

I am not sure I can do this either. Aren't we creating the illusion 
that the storage subsystem is fine by queuing requests when the 
subsystem is actually gone? What exactly happens to the queued I/O, and 
surely there must be some limit on the queue as well?
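
To make the question concrete: as far as I understand multipath.conf, 
no_path_retry also accepts a number instead of "queue", in which case 
I/O is queued only for that many path-checker intervals after all paths 
fail and is then failed. A minimal sketch, assuming a 5 second 
polling_interval; both values below are illustrative and not taken from 
my running config:

```
# /etc/multipath.conf (illustrative values, not my actual settings)
defaults {
        polling_interval  5     # seconds between path checks
        no_path_retry     12    # queue for about 12 checks (roughly 60s), then fail I/O
}
```

With "queue" the I/O would be held indefinitely, and with "fail" it 
would be failed immediately, so a bounded number looks like a middle 
ground if the switchover completes within a known time.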

> If the behavior seen above was caused by the storage and will be
> rectified in an acceptable (to the user) time, then this parameter
> setting would solve your problem.

I am checking this with Infortrend.

> BTW, have you seen I/O being sent successfully to the LUN (on both paths
> - you can use iostat to check it) before you failed the controller? (I
> am trying to see if your config settings are proper).

I am doing a post-mortem of the redundant controller failure here :). I 
dug out what was done after the controller failure.

* The primary controller failed and failover to the secondary did not work.
* Multipath failed both paths and ext3 went read-only.
* Postgres crashed.
* When they logged in and ran "multipath -v2 -ll", they saw both paths 
active. I cannot find any multipath log entries showing the paths 
reinstated before 11:50, which was after the controller shutdown and 
power cycle.
* The filesystem was mounted again (without fsck) and the database started 
(this answers your question about I/O to the LUNs, I think).
* Postgres recovered, was shut down immediately, and /data was unmounted.
* After this, the controllers on the Infortrend were shut down and the 
device was power cycled.
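
For anyone repeating the log search, here is a rough Python sketch of 
how the failure and reinstatement entries could be pulled out of 
/var/log/messages. The "checker failed path" pattern matches the lines 
quoted above; the "reinstated" pattern is my assumption about how 
multipathd words recovery messages and may need adjusting for your 
version. The function name path_events is made up for this example.

```python
import re

# Matches lines such as:
#   "Mar 11 10:32:46 multipathd: checker failed path 8:32 in map infortrend01"
FAILED = re.compile(r"multipathd: checker failed path (\d+:\d+) in map (\S+)")
# Assumed wording for recovery lines; adjust to what your multipathd logs.
REINSTATED = re.compile(r"multipathd: (\d+:\d+): reinstated")


def path_events(lines):
    """Yield (timestamp, path, event) for each failure or reinstatement line."""
    for line in lines:
        timestamp = line[:15]  # syslog prefix, e.g. "Mar 11 10:32:46"
        match = FAILED.search(line)
        if match:
            yield timestamp, match.group(1), "failed"
            continue
        match = REINSTATED.search(line)
        if match:
            yield timestamp, match.group(1), "reinstated"
```

Feeding it the lines of /var/log/messages and printing the tuples gives 
a per-path timeline that can be compared against the controller's own 
event log.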

PS: I am digging up the complete multipath logs instead of posting 
snippets here. I will add them to a pastebin and send the link over.

TIA
Dushyanth
