* md multipath and failover
@ 2003-09-10 22:55 Doug Griswold
2003-09-11 7:43 ` Lars Marowsky-Bree
0 siblings, 1 reply; 5+ messages in thread
From: Doug Griswold @ 2003-09-10 22:55 UTC (permalink / raw)
To: linux-raid
Alright, I got md/multipath to fail over with two Emulex HBAs on Red Hat
Advanced Server 2.1, kernel version 2.4.9-27enterprise. My next
problem is that it took five minutes to fail over after I pulled one
of the Fibre Channel cables. Where could this timeout come from? I
have tried several times, and each time it takes five minutes to fail over.
It also won't fail back. Any ideas?
* Re: md multipath and failover
2003-09-10 22:55 Doug Griswold
@ 2003-09-11 7:43 ` Lars Marowsky-Bree
0 siblings, 0 replies; 5+ messages in thread
From: Lars Marowsky-Bree @ 2003-09-11 7:43 UTC (permalink / raw)
To: Doug Griswold, linux-raid
On 2003-09-10T18:55:41,
Doug Griswold <griswld@cio.sc.gov> said:
> Alright I got md/multipath to failover with 2 emulex hba's on red hat
> advanced server 2.1 kernel version 2.4.9-27enterprise. My next new
> problem is that it took five minutes to fail over from when I yanked one
> of the fibre channel cables. Where could this timeout come from? I
> have tried several times but each time it takes 5 minutes to failover.
> Also it won't failback. Any ideas out there.
Plain md multipath can't fail back automatically. Opinions
on whether automatic failback is a good idea differ ;-)
The timeout is the time it takes the damn (sorry) Linux kernel SCSI
layers to give up retrying and pass the error code up to md for
handling. You may be able to shorten it by tuning some Emulex driver parameters.
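To illustrate how retries can stretch a short per-command timeout into
minutes, here is a rough sketch. It is not taken from the kernel source:
the function name is made up, and the 60-second timeout and 4 retries are
hypothetical values chosen only because they happen to add up to five
minutes. The real values depend on the kernel and the HBA driver.

```python
def worst_case_failover_seconds(command_timeout_s, retries):
    """Upper bound on the delay before an error reaches md.

    Assumes (for illustration only) that a failed command is retried
    serially and that every attempt runs into the full timeout.
    """
    attempts = retries + 1  # the first try plus each retry
    return command_timeout_s * attempts


# Hypothetical numbers, not measured on 2.4.9-27enterprise.
print(worst_case_failover_seconds(command_timeout_s=60, retries=4))
```

The point is only that lowering one timeout (for example in the HBA
driver) does not help much if another layer still retries several times.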
If you want load balancing for md multipath, you could try
my patch at ftp://ftp.suse.com/pub/people/lmb/md-mp/ (against 2.4). It adds
some features to md and also makes it quite a bit more robust.
Sincerely,
Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering ever tried. ever failed. no matter.
SuSE Labs try again. fail again. fail better.
Research & Development, SuSE Linux AG -- Samuel Beckett
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: md multipath and failover
@ 2003-09-11 11:07 Doug Griswold
0 siblings, 0 replies; 5+ messages in thread
From: Doug Griswold @ 2003-09-11 11:07 UTC (permalink / raw)
To: Doug Griswold, lmb, linux-raid
Thanks for the info. My next question: I have set the link-down
timeout in the Emulex driver to 15 seconds, but the paths still
take 5 minutes to fail over. Is there a way to get the SCSI timeouts
down to 30 seconds? Can I pass anything to the SCSI module at boot?
If I applied your patch that provides load balancing, then I would not
have to worry about this issue: since both paths would already be
working, we wouldn't have the timeout issue. Does your patch pick a
path back up once it becomes available again after being lost?
Thanks for the info.
Doug
* Re: md multipath and failover
@ 2003-09-11 17:30 Doug Griswold
2003-09-18 9:12 ` Lars Marowsky-Bree
0 siblings, 1 reply; 5+ messages in thread
From: Doug Griswold @ 2003-09-11 17:30 UTC (permalink / raw)
To: lmb; +Cc: linux-raid
Here is an update on my situation. I downloaded the md-mp patch and
tried to apply it to the Red Hat 2.4.9-27enterprise kernel source,
but it did not apply cleanly. I then downloaded the 2.4.22
kernel, and the patch applied successfully. The box now fails over in a
matter of seconds instead of minutes. It still doesn't fail back, but
that's no big deal. My next question is: how can I apply this patch to
the Red Hat supplied 2.4.9-27enterprise kernel so that I still have
Red Hat support on this kernel? Also, is there a way to get failback
with this patch applied?
Thanks,
Doug
* Re: md multipath and failover
2003-09-11 17:30 md multipath and failover Doug Griswold
@ 2003-09-18 9:12 ` Lars Marowsky-Bree
0 siblings, 0 replies; 5+ messages in thread
From: Lars Marowsky-Bree @ 2003-09-18 9:12 UTC (permalink / raw)
To: Doug Griswold; +Cc: linux-raid
On 2003-09-11T13:30:41,
Doug Griswold <griswld@cio.sc.gov> said:
> deal. My next question is how can I apply this patch to the red hat
> supplied kernel 2.4.9-27 enterprise so I will still have red hat support
> on this kernel?
No idea. I doubt RH will support a patched kernel anyway, even if you
started from their base source. I suggest you contact your support
representative about this.
> Also is there away with this patch applied to get failback?
You have to do it in user space, i.e. monitor the paths periodically and
restore them once you are satisfied that they are working again.
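A minimal sketch of such a user-space monitor, under several assumptions
that are not from this thread: that faulty members appear with an "(F)"
suffix on the array's line in /proc/mdstat, that mdadm is installed and
its --remove/--add manage-mode options are used to restore a path, and
that the array is md0. All function names are made up for illustration.

```python
import re
import subprocess

# Assumed member syntax in /proc/mdstat: "sda1[0]" healthy, "sda1[0](F)" faulty.
MEMBER_RE = re.compile(r"(\w+)\[\d+\](\(F\))?")


def failed_members(mdstat_text, array="md0"):
    """Return the member names marked (F) on the given array's line."""
    for line in mdstat_text.splitlines():
        if line.startswith(array + " :"):
            return [name for name, flag in MEMBER_RE.findall(line) if flag]
    return []


def path_is_readable(device):
    """Probe a path by reading one 512-byte sector; any OSError means down."""
    try:
        with open(device, "rb") as handle:
            return len(handle.read(512)) == 512
    except OSError:
        return False


def restore_commands(array_dev, member_dev):
    """mdadm calls that drop the faulty slot and then re-add the member."""
    return [
        ["mdadm", array_dev, "--remove", member_dev],
        ["mdadm", array_dev, "--add", member_dev],
    ]


def failback_once(mdstat_text, array="md0",
                  probe=path_is_readable, run=subprocess.check_call):
    """Re-add every faulty member whose path answers the probe again."""
    restored = []
    for name in failed_members(mdstat_text, array):
        member_dev = "/dev/" + name
        if probe(member_dev):
            for command in restore_commands("/dev/" + array, member_dev):
                run(command)
            restored.append(member_dev)
    return restored
```

In practice this would be called from cron or from a loop with a sleep,
passing in the current contents of /proc/mdstat. The buffered read used
as a probe here may be answered from the page cache, so a real monitor
would need a check that actually reaches the device, and probably
several successful probes in a row before restoring a path.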
Sincerely,
Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering ever tried. ever failed. no matter.
SuSE Labs try again. fail again. fail better.
Research & Development, SuSE Linux AG -- Samuel Beckett