* Has replicated mount failover been implemented?
@ 2007-10-11 6:28 John Simon
2007-10-11 6:59 ` Ian Kent
2007-10-11 18:50 ` John Simon
0 siblings, 2 replies; 8+ messages in thread
From: John Simon @ 2007-10-11 6:28 UTC (permalink / raw)
To: autofs
We are in the process of switching from Sun blades to
Linux blades for our compute farm and are running into
issues with automount not failing over when an NFS
server goes down. I am wondering if it is a
configuration issue, if I am using a version of autofs
that doesn't support failover or if it hasn't been
implemented yet.
Currently I am running SLES 10 with autofs 4.1.4. Here
are my configs. The server will pick the next host in
line at mount time, but if the failure happens after the
mount, the mount does NOT fail over. Any information
would be greatly appreciated.

/etc/auto_appl:

  test -rsize=32768,wsize=32768,nfsvers=3,tcp,retrans=5,timeo=600 test-ap01,test-ap02,test-ap03:/export/appl/test

/etc/auto.master:

  /appl /etc/auto_appl -rw,intr,nosuid,nobrowse
  /apps /etc/auto_apps -ro,intr,nosuid,nobrowse
  /home /etc/auto_home -rw,intr,nosuid,nobrowse

/etc/sysconfig/autofs:

  AUTOFS_OPTIONS="--timeout 3600"
* Re: Has replicated mount failover been implemented?
From: Ian Kent @ 2007-10-11 6:59 UTC (permalink / raw)
To: John Simon; +Cc: autofs
On Wed, 2007-10-10 at 23:28 -0700, John Simon wrote:
> We are in the process of switching from Sun blades to
> Linux blades for our compute farm and are running into
> issues with automount not failing over when an NFS
> server goes down. I am wondering if it is a
> configuration issue, if I am using a version of autofs
> that doesn't support failover or if it hasn't been
> implemented yet.
>
> Currently I am running SLES 10 with autofs 4.1.4. Here
> are my configs. The server will pick the next host in
> line at mount time, but if the failure happens after the
> mount, the mount does NOT fail over. Any information
> would be greatly appreciated.
Failover of active NFS mounts isn't something that autofs can do. When I
say that I don't mean it hasn't been implemented, I mean, as far as I
can see, it's not possible for autofs to do it. I've thought about it
quite a bit and I just can't see a way to implement it.
Once autofs has selected a server and performed the mount it has no
knowledge of what is happening in the mount itself. All that autofs
could do (if it was implemented in the NFS client) is to pass an ordered
list of servers to mount so it can then pass that to the kernel. This is
essentially what the Solaris automount does.
This has to be supported in the kernel NFS client and it's not.
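
To make the distinction concrete, here is a minimal sketch (an illustration only, not how autofs is actually coded). The names `pick_server` and `is_alive` are hypothetical, and the host names are taken from the map entry in the original post. Real autofs selection may weigh other factors as well; the point is only that the choice is made once, when the mount is triggered, and nothing revisits it afterwards.

```python
from typing import Callable, Iterable, Optional


def pick_server(servers: Iterable[str],
                is_alive: Callable[[str], bool]) -> Optional[str]:
    """Return the first server, in list order, that answers the probe.

    This models only the mount-time choice; it is called once when the
    mount is triggered and never again while the mount stays active.
    """
    for host in servers:
        if is_alive(host):
            return host
    return None


if __name__ == "__main__":
    replicas = ["test-ap01", "test-ap02", "test-ap03"]
    down = {"test-ap01"}  # pretend the first replica is unreachable
    chosen = pick_server(replicas, lambda host: host not in down)
    print("mounting from", chosen)
    # If the chosen server dies after this point, nothing here notices:
    # the kernel NFS client keeps retrying the one server it was given.
```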
Ian
* Re: Has replicated mount failover been implemented?
From: John Simon @ 2007-10-11 18:50 UTC (permalink / raw)
To: autofs
Since client-side failover is not currently possible
with Linux autofs, does anyone have any recommendations
for minimizing server-side impact when an HA-NFS
server fails over? Right now what happens with HA-NFS
is that all the clients retry so much they basically end
up DoS'ing the NFS server and causing it to fail over
again and again. We have hundreds of clients.
--- John Simon <tzzhc4@yahoo.com> wrote:
> We are in the process of switching from Sun blades
> to
> Linux blades for our compute farm and are running
> into
> issues with automount not failing over when an NFS
> server goes down. I am wondering if it is a
> configuration issue, if I am using a version of
> autofs
> that doesn't support failover or if it hasn't been
> implemented yet.
* Re: Has replicated mount failover been implemented?
From: Jeff Moyer @ 2007-10-11 19:21 UTC (permalink / raw)
To: John Simon; +Cc: autofs
John Simon <tzzhc4@yahoo.com> writes:
> Since client-side failover is not currently possible
> with Linux autofs, does anyone have any recommendations
> for minimizing server-side impact when an HA-NFS
> server fails over? Right now what happens with HA-NFS
> is that all the clients retry so much they basically end
> up DoS'ing the NFS server and causing it to fail over
> again and again. We have hundreds of clients.
See the retry, retrans, and timeo nfs mount options.
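
As a rough illustration of how these options interact, here is a small sketch based on my reading of nfs(5): `timeo` is given in tenths of a second, and over TCP the client backs off linearly, adding `timeo` to the wait after each retransmission up to a cap of 600 seconds. The function name `tcp_retry_waits` is made up for this example, older kernels may behave differently, and the `retry` option (which, per nfs(5), governs how long mount itself keeps retrying) is not modelled.

```python
from typing import List


def tcp_retry_waits(timeo: int, retrans: int,
                    cap_seconds: float = 600.0) -> List[float]:
    """Seconds waited before each retry of one request (linear backoff).

    timeo is in tenths of a second, exactly as written in the mount
    options; retrans is the number of retries to model.
    """
    base = timeo / 10.0
    return [min(base * n, cap_seconds) for n in range(1, retrans + 1)]


if __name__ == "__main__":
    # Values from the map entry in the original post.
    waits = tcp_retry_waits(timeo=600, retrans=5)
    print("wait before each retry (s):", waits)
    print("time until the last retry is sent (s):", sum(waits))
```

Plugging in different values gives a feel for how many requests each client would send during a failover window of a given length.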
Cheers,
Jeff
* Re: Has replicated mount failover been implemented?
From: Peter Staubach @ 2007-10-11 19:56 UTC (permalink / raw)
To: John Simon; +Cc: autofs
John Simon wrote:
> Since client-side failover is not currently possible
> with Linux autofs, does anyone have any recommendations
> for minimizing server-side impact when an HA-NFS
> server fails over? Right now what happens with HA-NFS
> is that all the clients retry so much they basically end
> up DoS'ing the NFS server and causing it to fail over
> again and again. We have hundreds of clients.
I don't think that I understand what the client side failover
of Solaris was being used for in this configuration. If the
server is truly HA, then shouldn't the NFS service be able to
failover from one server to the next with minimal interruption
on the clients?
The Solaris client side failover required relooking up all
file handles which referred to the dead server, so it wasn't
cheap either.
Thanx...
ps
* Re: Has replicated mount failover been implemented?
From: Todd Denniston @ 2007-10-11 22:39 UTC (permalink / raw)
To: John Simon; +Cc: autofs
John Simon wrote, On 10/11/2007 01:50 PM:
> Since client-side failover is not currently possible
> with Linux autofs, does anyone have any recommendations
> for minimizing server-side impact when an HA-NFS
> server fails over? Right now what happens with HA-NFS
> is that all the clients retry so much they basically end
> up DoS'ing the NFS server and causing it to fail over
> again and again. We have hundreds of clients.
>
>
Do you mean an HA Linux[1] server?
You might try running nslookup or dig in a for loop over all of your client
machines. If that loop takes very long (more than 30-60 seconds to look up
every client in your domain), then you may want to maintain a full /etc/hosts
table on the server. When I had some problems with name service taking a
while to respond[3], HA would give up on nfs getting started and fall back to
the other machine (repeatedly), and that was when all the client machines
were shut down.
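
The timing check could be sketched like this. It is only a sketch: the function name `time_lookups` is invented here, and it uses the system resolver through `socket.gethostbyname` rather than nslookup or dig, so it measures what processes on the server would see (including any existing /etc/hosts entries) rather than DNS alone.

```python
import socket
import sys
import time


def time_lookups(hosts, resolve=socket.gethostbyname, clock=time.monotonic):
    """Resolve every host once; return (elapsed_seconds, failed_hosts)."""
    failed = []
    start = clock()
    for host in hosts:
        try:
            resolve(host)
        except (OSError, UnicodeError):
            # socket.gaierror is a subclass of OSError.
            failed.append(host)
    return clock() - start, failed


if __name__ == "__main__":
    # Usage (hypothetical file name): python time_lookups.py blade01 blade02 ...
    elapsed, failed = time_lookups(sys.argv[1:])
    print(f"looked up {len(sys.argv) - 1} hosts in {elapsed:.1f}s")
    if failed:
        print("could not resolve:", ", ".join(failed))
```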
Unfortunately, you will need to maintain this hosts table, so I would suggest
that, like me, you write a script that builds the looked-up information and
lets you know when it differs from the current /etc/hosts file.
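
A minimal sketch of the comparison step follows (not the script mentioned above, which is not shown in this thread). The names `parse_hosts` and `hosts_drift` are hypothetical, and the addresses in the usage block are documentation-range examples. It assumes the looked-up addresses come from a direct DNS query such as dig, since the system resolver typically consults /etc/hosts itself and would hide any drift.

```python
def parse_hosts(text):
    """Map each name in hosts-file text to its address (first entry wins)."""
    table = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, then split
        if len(fields) < 2:
            continue
        address, names = fields[0], fields[1:]
        for name in names:
            table.setdefault(name, address)
    return table


def hosts_drift(hosts_text, looked_up):
    """Return {name: (address_in_file_or_None, looked_up_address)} for mismatches."""
    current = parse_hosts(hosts_text)
    return {name: (current.get(name), address)
            for name, address in looked_up.items()
            if current.get(name) != address}


if __name__ == "__main__":
    sample_hosts = (
        "192.0.2.11  blade01   # example addresses only\n"
        "192.0.2.12  blade02\n"
    )
    from_dns = {"blade01": "192.0.2.11",
                "blade02": "192.0.2.99",
                "blade03": "192.0.2.13"}
    for name, (old, new) in sorted(hosts_drift(sample_hosts, from_dns).items()):
        print(f"{name}: hosts file has {old}, DNS says {new}")
```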
Assuming that the failover again and again is caused by HA getting impatient
with 'service nfs start', you might ask on the HA list[2] if there is a
timeout value you could increase to give it a better chance of coming up in
your environment (been a _long_ time since I messed with HA's timeouts and
that was for ver 1.2.X).
Jeff's timeo nfs mount option might help too.
[1] http://www.linux-ha.org/ (which IIRC works on Solaris too)
[2] http://lists.linux-ha.org/mailman/listinfo/linux-ha
[3] It WAS very sick DNS hardware.
>
> --- John Simon <tzzhc4@yahoo.com> wrote:
>
>> We are in the process of switching from Sun blades
>> to
>> Linux blades for our compute farm and are running
>> into
>> issues with automount not failing over when an NFS
>> server goes down. I am wondering if it is a
>> configuration issue, if I am using a version of
>> autofs
>> that doesn't support failover or if it hasn't been
>> implemented yet.
>
>
>
--
Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane)
Harnessing the Power of Technology for the Warfighter
* Re: Has replicated mount failover been implemented?
From: John Simon @ 2007-10-12 3:18 UTC (permalink / raw)
To: Peter Staubach; +Cc: autofs
We have two pairs of VCS HA-NFS servers (one in each
data center), so each entry in auto_appl is the VIP
for the HA pair. Should one portion of the cluster in
the data center local to the compute engines fail, it
will just fail over to the other cluster node and
client-side failover would not be used. If, however, the
entire cluster fails or is unavailable for whatever
reason, the duplicate VCS HA-NFS server in the other
data center resumes serving data, albeit at a slightly
slower rate due to latency over the MAN. This
site-to-site failover is where we currently depend on
Solaris client-side failover.
--- Peter Staubach <staubach@redhat.com> wrote:
> I don't think that I understand what the client side
> failover
> of Solaris was being used for in this configuration.
> If the
> server is truly HA, then shouldn't the NFS service
> be able to
> failover from one server to the next with minimal
> interruption
> on the clients?
>
> The Solaris client side failover required relooking
> up all
> file handles which referred to the dead server, so
> it wasn't
> cheap either.
>
> Thanx...
>
> ps
>
* Re: Has replicated mount failover been implemented?
From: Peter Staubach @ 2007-10-12 11:35 UTC (permalink / raw)
To: John Simon; +Cc: autofs
John Simon wrote:
> We have two pairs of VCS HA-NFS servers (one in each
> data center), so each entry in auto_appl is the VIP
> for the HA pair. Should one portion of the cluster in
> the data center local to the compute engines fail it
> will just fail over to the other cluster node and
> client-side failover would not be used. If however the
> entire cluster fails or is unavailable for whatever
> reason the duplicate VCS HA-NFS server in the other
> data center resumes serving data, albeit at a slightly
> slower rate due to latency over the MAN. This site to
> site failover is where we currently depend on Solaris
> client-side failover.
Ahhh. Okay, thanx!
ps