* Recovering from the loss of a NFS Server
@ 2011-03-13  4:27 Breitman, Jason
  2011-03-13  9:17 ` Ian Kent
  0 siblings, 1 reply; 2+ messages in thread
From: Breitman, Jason @ 2011-03-13  4:27 UTC
  To: 'autofs@linux.kernel.org'

OS
	Linux hostname 2.6.18-238.el5 #1 SMP Sun Dec 19 14:22:44 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

autofs package
	autofs-5.0.1-0.rc2.148.bz579312.1.el5

Mount options
	$ cat /etc/auto.master
	# Master map for automounter
	#
	/home             auto_home               -hard,intr,retry=10

	$ cat /etc/sysconfig/autofs
	TIMEOUT=86400 - we have a long TIMEOUT to avoid mount storms.

What am I trying to do?
	Prior to a disaster recovery test, my home directory will be mounted from my-nfs-server.domainname:/home/jbreitma.
	At this point my-nfs-server.domainname points to 1.1.1.1.
	There are active reads and writes to my home directory.
	Let's say I have a subdirectory called htdocs and am running Apache.

	Now we are cut off from 1.1.1.1 because the data center where 1.1.1.1 lives is no longer accessible.
	We simulate this with an ACL.
	We now repoint my-nfs-server.domainname to 2.2.2.2.

	The NFS Clients where /home/jbreitma is mounted are now confused.
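
A mounted NFS file system generally keeps talking to the address that was resolved at mount time, so repointing the DNS name does not by itself move existing mounts. The following is a minimal sketch for checking which address a mount is still bound to. It assumes the client kernel lists an addr= option for NFS entries in /proc/mounts; the function name mounted_nfs_addr is hypothetical.

```shell
#!/bin/sh
# mounted_nfs_addr MOUNTPOINT [MOUNTS_FILE]
# Print the server address recorded in the addr= mount option of the NFS
# file system mounted on MOUNTPOINT. MOUNTS_FILE defaults to /proc/mounts.
# Assumption: the kernel exposes addr= in the option list (fourth field).
mounted_nfs_addr() {
    awk -v mp="$1" '
        # Match only nfs/nfs4 entries, not the autofs entry for the parent.
        $2 == mp && $3 ~ /^nfs/ {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) {
                if (opts[i] ~ /^addr=/) {
                    sub(/^addr=/, "", opts[i])
                    print opts[i]
                }
            }
        }
    ' "${2:-/proc/mounts}"
}
```

Comparing the output of `mounted_nfs_addr /home/jbreitma` with the first field of `getent hosts my-nfs-server.domainname` would show whether the mount still refers to the old address (1.1.1.1 in this scenario) after the name has been repointed.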

	What is my best course of action?
		umount -l /home/jbreitma
		/etc/init.d/autofs restart
		fuser -k /home/jbreitma
		kill -USR1 `pgrep automount`
		etc ...

	How do I recover from this situation?	
	I am open to a new approach if that is required.


I have had some success with umount -l /home/jbreitma followed by /etc/init.d/autofs restart, but this does not always work.
It fails specifically when active reads and/or writes to /home/jbreitma are in progress.
		

Jason Breitman
A&T-Tech-GTI
Jason.Breitman@blackrock.com
BlackRock

THIS MESSAGE AND ANY ATTACHMENTS ARE CONFIDENTIAL, PROPRIETARY, AND MAY BE PRIVILEGED.  If this message was misdirected, BlackRock, Inc. and its subsidiaries, ("BlackRock") does not waive any confidentiality or privilege.  If you are not the intended recipient, please notify us immediately and destroy the message without disclosing its contents to anyone.  Any distribution, use or copying of this e-mail or the information it contains by other than an intended recipient is unauthorized.  The views and opinions expressed in this e-mail message are the author's own and may not reflect the views and opinions of BlackRock, unless the author is authorized by BlackRock to express such views or opinions on its behalf.  All email sent to or from this address is subject to electronic storage and review by BlackRock.  Although BlackRock operates anti-virus programs, it does not accept responsibility for any damage whatsoever caused by viruses being passed.


* Re: Recovering from the loss of a NFS Server
  2011-03-13  4:27 Recovering from the loss of a NFS Server Breitman, Jason
@ 2011-03-13  9:17 ` Ian Kent
  0 siblings, 0 replies; 2+ messages in thread
From: Ian Kent @ 2011-03-13  9:17 UTC
  To: Breitman, Jason; +Cc: 'autofs@linux.kernel.org'

On Sat, 2011-03-12 at 23:27 -0500, Breitman, Jason wrote:
> OS
> 	Linux hostname 2.6.18-238.el5 #1 SMP Sun Dec 19 14:22:44 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
> 
> autofs package
> 	autofs-5.0.1-0.rc2.148.bz579312.1.el5
> 
> Mount options
> 	$ cat /etc/auto.master
> 	# Master map for automounter
> 	#
> 	/home             auto_home               -hard,intr,retry=10
> 
> 	$ cat /etc/sysconfig/autofs
> 	TIMEOUT=86400 - we have a long TIMEOUT to avoid mount storms.
> 
> What am I trying to do?
> 	Prior to a disaster recovery test, my home directory will be mounted from my-nfs-server.domainname:/home/jbreitma.
> 	At this point my-nfs-server.domainname points to 1.1.1.1.
> 	There are active reads and writes to my home directory.
> 	Let's say I have a subdirectory called htdocs and am running Apache.
> 
> 	Now we are cut off from 1.1.1.1 because the data center where 1.1.1.1 lives is no longer accessible.
> 	We simulate this with an ACL.
> 	We now repoint my-nfs-server.domainname to 2.2.2.2.
> 
> 	The NFS Clients where /home/jbreitma is mounted are now confused.
> 
> 	What is my best course of action?
> 		umount -l /home/jbreitma
> 		/etc/init.d/autofs restart
> 		fuser -k /home/jbreitma
> 		kill -USR1 `pgrep automount`
> 		etc ...

That's about all you can do.

The "umount -l" has its own set of problems.
In particular, any process whose working directory is inside the mount
must do a "cd ." (I believe that will work) to recover from the changed
mount; otherwise getcwd(3) will fail and /proc/<pid>/cwd will point to
"/" instead of a valid working directory.
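
To see which processes would be affected by this, the working directories can be read from /proc before the lazy unmount is done (afterwards, as described above, the cwd link is no longer expected to be useful). This is a minimal sketch, not part of autofs; the function name procs_with_cwd_under is hypothetical, and it only sees processes whose /proc/<pid>/cwd link the caller is allowed to read.

```shell
#!/bin/sh
# procs_with_cwd_under DIR
# Print the PID of every process whose working directory is DIR or lies
# below it. Only the /proc/<pid>/cwd symlink is read; the directory itself
# is not accessed, so an unreachable server is not contacted for this check.
procs_with_cwd_under() {
    dir=${1%/}                      # drop one trailing slash, if present
    for link in /proc/[0-9]*/cwd; do
        # Unreadable links (other users' processes, exited PIDs) are skipped.
        cwd=$(readlink "$link" 2>/dev/null) || continue
        case "$cwd" in
            "$dir"|"$dir"/*)
                pid=${link#/proc/}
                echo "${pid%/cwd}"
                ;;
        esac
    done
}
```

For the scenario in this thread, `procs_with_cwd_under /home/jbreitma` would list the processes that need a "cd ." (or a restart) once the mount has been replaced.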

Also, there is pretty much no way to get the RPC layer to give up on
those outstanding I/Os, which will cause ongoing problems.
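
For reference, the commands quoted in this thread could be strung together roughly as follows. This is only a sketch: the ordering (kill users, lazy unmount, restart autofs) is an assumption rather than a recommended procedure, the names run and recover_mount are hypothetical, and given the caveats above it may still fail while I/O to the old server is outstanding. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: execute the command, or just print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# recover_mount MOUNTPOINT: the sequence discussed in the thread.
recover_mount() {
    mp=$1
    run fuser -k "$mp"               # kill processes using the directory
    run umount -l "$mp"              # lazily detach the stale mount
    run /etc/init.d/autofs restart   # restart automount; next access remounts
}
```

Calling `recover_mount /home/jbreitma` with the default settings prints the three commands in order; setting DRY_RUN=0 would actually run them, which needs root.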

> 
> 	How do I recover from this situation?	

There's not much you can do for read/write mounts, and even read-only
failover hasn't been implemented within the Linux kernel NFS client.

> 	I am open to a new approach if that is required.

The only way I think high availability NFS can work today is when the
back end deals with the change, such as in clustered environments.

> 
> 
> I have had some success with umount -l /home/jbreitma followed by
> a /etc/init.d/autofs restart, but this does not always work.
> It fails specifically when active reads and/or writes are occurring
> to /home/jbreitma.
> 		
> 
> Jason Breitman
> A&T-Tech-GTI
> Jason.Breitman@blackrock.com
> BlackRock
> 
> 
> 
> _______________________________________________
> autofs mailing list
> autofs@linux.kernel.org
> http://linux.kernel.org/mailman/listinfo/autofs
