From: Ian Kent <raven@themaw.net>
To: "Breitman, Jason" <Jason.Breitman@blackrock.com>
Cc: "'autofs@linux.kernel.org'" <autofs@linux.kernel.org>
Subject: Re: Recovering from the loss of a NFS Server
Date: Sun, 13 Mar 2011 17:17:52 +0800
Message-ID: <1300007872.2906.17.camel@perseus>
In-Reply-To: <4A961A51CBE84C429C56E5734477BD1F2DBB789448@EXCHAMRS03.na.blkint.com>

On Sat, 2011-03-12 at 23:27 -0500, Breitman, Jason wrote:
> OS
> 	Linux hostname 2.6.18-238.el5 #1 SMP Sun Dec 19 14:22:44 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
> 
> autofs package
> 	autofs-5.0.1-0.rc2.148.bz579312.1.el5
> 
> Mount options
> 	$ cat /etc/auto.master
> 	# Master map for automounter
> 	#
> 	/home             auto_home               -hard,intr,retry=10
> 
> 	$ cat /etc/sysconfig/autofs
> 	TIMEOUT=86400 - we have a long TIMEOUT to avoid mount storms.
> 
> What am I trying to do?
> 	Prior to a disaster recovery test, my home directory will be mounted from my-nfs-server.domainname:/home/jbreitma.
> 	At this point my-nfs-server.domainname points to 1.1.1.1.
> 	There are active reads and writes to my home directory.
> 	Let's say I have a subdirectory called htdocs and am running Apache.
> 
> 	Now we are cut off from 1.1.1.1 because the Data Center where 1.1.1.1 lives is no longer accessible.
> 	We simulate this with an ACL.
> 	We now repoint my-nfs-server.domainname to 2.2.2.2.
> 
> 	The NFS Clients where /home/jbreitma is mounted are now confused.
> 
> 	What is my best course of action?
> 		umount -l /home/jbreitma
> 		/etc/init.d/autofs restart
> 		fuser -k /home/jbreitma
> 		kill -USR1 `pgrep automount`
> 		etc ...

That's about all you can do.
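
As a rough sketch only, the first two steps from the list above can be
wrapped in a small function, in the order the question later reports as
sometimes working. The function name recover_automount and the DRY_RUN
variable are made up for illustration; with DRY_RUN set, the commands
are printed instead of run. This is not guaranteed to recover the
mount, for the reasons given below.

```shell
#!/bin/sh
# Hypothetical helper: lazy-unmount a stale autofs-managed NFS mount,
# then restart autofs. Name and DRY_RUN switch are assumptions.
recover_automount() {
    mnt=$1
    # When DRY_RUN is set and non-empty, prefix each command with
    # "echo" so it is printed rather than executed.
    run=${DRY_RUN:+echo}

    # Detach the mount from the namespace; busy references linger.
    $run umount -l "$mnt" || return 1

    # Restart the automounter so the next access triggers a new mount.
    $run /etc/init.d/autofs restart
}
```

For example, "DRY_RUN=1 recover_automount /home/jbreitma" just prints
the two commands so they can be checked before being run as root.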

The "umount -l" has its own set of problems.
In particular, any process whose working directory is inside the mount
must do a "cd ." (I believe that will work) to recover from the changed
mount; otherwise getcwd(3) will fail and /proc/<pid>/cwd will point to
"/" instead of a valid working directory.
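
To see which processes would be affected, the /proc/<pid>/cwd links can
be read before the lazy umount is done (afterwards, as noted, they no
longer show the original path). A minimal sketch, assuming a Linux
/proc and a shell with readlink available; the function name
pids_with_cwd_under is made up for illustration, and without root it
only sees your own processes.

```shell
#!/bin/sh
# Hypothetical helper: print the PID of every process whose current
# working directory is DIR or lies beneath it.
pids_with_cwd_under() {
    dir=${1%/}    # drop one trailing slash, if any
    for link in /proc/[0-9]*/cwd; do
        # Unreadable links (other users' processes, exited PIDs) are skipped.
        target=$(readlink "$link" 2>/dev/null) || continue
        case "$target" in
            "$dir"|"$dir"/*)
                pid=${link#/proc/}
                echo "${pid%/cwd}"
                ;;
        esac
    done
}
```

For example, "pids_with_cwd_under /home/jbreitma" lists the processes
that would need the "cd ." (or a restart) after the umount.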

Also, there is pretty much no way to get the RPC layer to give up on
those outstanding I/Os, which will cause ongoing problems.

> 
> 	How do I recover from this situation?	

There's not much you can do for read/write mounts, and even read-only
failover hasn't been implemented in the Linux kernel NFS client.

> 	I am open to a new approach if that is required.

The only way I think high-availability NFS can work today is when the
backend deals with the change, such as in clustered environments.

> 
> 
> I have had some success with umount -l /home/jbreitma followed by
> a /etc/init.d/autofs restart, but this does not always work.
> It specifically fails when active writes and/or reads are occurring
> to /home/jbreitma.
> 		
> 
> Jason Breitman
> A&T-Tech-GTI
> Jason.Breitman@blackrock.com
> BlackRock
