Killing process in D state on mount to dead NFS server.

linux-nfs.vger.kernel.org archive mirror

* Killing process in D state on mount to dead NFS server.
@ 2014-07-31 18:00 Ben Greear
  2014-07-31 19:49 ` Malahal Naineni
  2014-07-31 20:42 ` NeilBrown
  0 siblings, 2 replies; 16+ messages in thread
From: Ben Greear @ 2014-07-31 18:00 UTC (permalink / raw)
  To: linux-nfs@vger.kernel.org

So, this has been asked all over the interweb for years and years, but
the best answer I can find is to reboot the system or create a fake NFS
server somewhere with the same IP as the gone-away NFS server.

The problem is:

I have some mounts to an NFS server that no longer exists (crashed/powered down).

I have some processes stuck trying to write to files open on these mounts.

I want to kill the process and unmount.

umount -l will make the mount go away, sort of.  But process is still hung.
umount -f complains:
  umount2:  Device or resource busy
  umount.nfs: /mnt/foo: device is busy

kill -9 does not work on process.

Aside from bringing a fake NFS server back up on the same IP, is there any
other way to get these mounts unmounted and the processes killed without
rebooting?

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Killing process in D state on mount to dead NFS server.
  2014-07-31 18:00 Killing process in D state on mount to dead NFS server Ben Greear
@ 2014-07-31 19:49 ` Malahal Naineni
  2014-07-31 19:52   ` Ben Greear
  2014-07-31 20:42 ` NeilBrown
  1 sibling, 1 reply; 16+ messages in thread
From: Malahal Naineni @ 2014-07-31 19:49 UTC (permalink / raw)
  To: Ben Greear; +Cc: linux-nfs@vger.kernel.org

Ben Greear [greearb@candelatech.com] wrote:
> So, this has been asked all over the interweb for years and years, but
> the best answer I can find is to reboot the system or create a fake NFS
> server somewhere with the same IP as the gone-away NFS server.
> 
> The problem is:
> 
> I have some mounts to an NFS server that no longer exists (crashed/powered down).
> 
> I have some processes stuck trying to write to files open on these mounts.
> 
> I want to kill the process and unmount.
> 
> umount -l will make the mount go away, sort of.  But process is still hung.
> umount -f complains:
>   umount2:  Device or resource busy
>   umount.nfs: /mnt/foo: device is busy
> 
> kill -9 does not work on process.
> 
> 
> Aside from bringing a fake NFS server back up on the same IP, is there any
> other way to get these mounts unmounted and the processes killed without
> rebooting?

You don't need a fake NFS server, you just need a fake or real server
with that IP address.  A popular way is to alias that IP on the NFS
client itself.

See the second popular answer below:
http://stackoverflow.com/questions/40317/force-unmount-of-nfs-mounted-directory
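
A minimal sketch of the aliasing idea, assuming the iproute2 `ip` tool is
available on the client. The address 192.0.2.10 and the device `lo` are
placeholders, not values from this thread, and the helper name
alias_commands is hypothetical. The code only builds and prints the
commands; running them needs root and is left to the operator.

```python
import ipaddress


def alias_commands(server_ip, device="lo"):
    """Build (add, remove) argv lists that alias server_ip onto a local device."""
    addr = ipaddress.ip_address(server_ip)      # raises ValueError on bad input
    prefix = 32 if addr.version == 4 else 128   # a single address, not a subnet
    cidr = "%s/%d" % (addr, prefix)
    add = ["ip", "addr", "add", cidr, "dev", device]
    remove = ["ip", "addr", "del", cidr, "dev", device]
    return add, remove


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address standing in for the dead server.
    for argv in alias_commands("192.0.2.10"):
        print(" ".join(argv))
```

The remove command is returned alongside the add command so the alias can
be dropped again once the mounts are gone.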

Regards, Malahal.



* Re: Killing process in D state on mount to dead NFS server.
  2014-07-31 19:49 ` Malahal Naineni
@ 2014-07-31 19:52   ` Ben Greear
  0 siblings, 0 replies; 16+ messages in thread
From: Ben Greear @ 2014-07-31 19:52 UTC (permalink / raw)
  To: linux-nfs@vger.kernel.org

On 07/31/2014 12:49 PM, Malahal Naineni wrote:
> Ben Greear [greearb@candelatech.com] wrote:
>> So, this has been asked all over the interweb for years and years, but
>> the best answer I can find is to reboot the system or create a fake NFS
>> server somewhere with the same IP as the gone-away NFS server.
>>
>> The problem is:
>>
>> I have some mounts to an NFS server that no longer exists (crashed/powered down).
>>
>> I have some processes stuck trying to write to files open on these mounts.
>>
>> I want to kill the process and unmount.
>>
>> umount -l will make the mount go away, sort of.  But process is still hung.
>> umount -f complains:
>>   umount2:  Device or resource busy
>>   umount.nfs: /mnt/foo: device is busy
>>
>> kill -9 does not work on process.
>>
>>
>> Aside from bringing a fake NFS server back up on the same IP, is there any
>> other way to get these mounts unmounted and the processes killed without
>> rebooting?
> 
> You don't need a fake NFS server, you just need a fake or real server
> with that IP address.  A popular way is to alias that IP on the NFS
> client itself.
> 
> See the second popular answer below:
> http://stackoverflow.com/questions/40317/force-unmount-of-nfs-mounted-directory

In my case, routing is set up so that the NFS traffic always exits the system, so
doing a local IP that matches the server is not an option.  It also seems like a
horrible hack that should have a better solution :P

Thanks,
Ben

> 
> Regards, Malahal.


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: Killing process in D state on mount to dead NFS server.
  2014-07-31 18:00 Killing process in D state on mount to dead NFS server Ben Greear
  2014-07-31 19:49 ` Malahal Naineni
@ 2014-07-31 20:42 ` NeilBrown
  2014-07-31 21:20   ` Ben Greear
  1 sibling, 1 reply; 16+ messages in thread
From: NeilBrown @ 2014-07-31 20:42 UTC (permalink / raw)
  To: Ben Greear; +Cc: linux-nfs@vger.kernel.org

On Thu, 31 Jul 2014 11:00:35 -0700 Ben Greear <greearb@candelatech.com> wrote:

> So, this has been asked all over the interweb for years and years, but
> the best answer I can find is to reboot the system or create a fake NFS
> server somewhere with the same IP as the gone-away NFS server.
> 
> The problem is:
> 
> I have some mounts to an NFS server that no longer exists (crashed/powered down).
> 
> I have some processes stuck trying to write to files open on these mounts.
> 
> I want to kill the process and unmount.
> 
> umount -l will make the mount go away, sort of.  But process is still hung.
> umount -f complains:
>   umount2:  Device or resource busy
>   umount.nfs: /mnt/foo: device is busy
> 
> kill -9 does not work on process.

Kill -1 should work (since about 2.6.25 or so).
If it doesn't please report the kernel version and 
 cat /proc/$PID/stack

for some processes that cannot be killed.
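
To collect that information for every stuck task at once, something along
these lines may help. It is only a sketch: it assumes a Linux /proc layout,
the function names are made up for illustration, and reading
/proc/<pid>/stack normally requires root.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST ')' instead of on whitespace.
    head, _, tail = stat_line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    state = tail.split()[0]
    return int(pid_text), comm, state


def read_stack(pid, proc_root="/proc"):
    """Return the kernel stack of pid, or a note explaining why it is missing."""
    try:
        with open(os.path.join(proc_root, str(pid), "stack")) as fh:
            return fh.read()
    except OSError as exc:
        return "(stack unavailable: %s)" % exc


def d_state_processes(proc_root="/proc"):
    """List (pid, comm) for every task in uninterruptible sleep (state D)."""
    found = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return found
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # the task exited or the line was unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print("%d (%s)" % (pid, comm))
        print(read_stack(pid))
```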

NeilBrown


* Re: Killing process in D state on mount to dead NFS server.
  2014-07-31 20:42 ` NeilBrown
@ 2014-07-31 21:20   ` Ben Greear
  2014-07-31 21:50     ` Killing process in D state on mount to dead NFS server. (when process is in fsync) NeilBrown
  2014-08-13 15:42     ` Killing process in D state on mount to dead NFS server Ben Greear
  0 siblings, 2 replies; 16+ messages in thread
From: Ben Greear @ 2014-07-31 21:20 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs@vger.kernel.org

On 07/31/2014 01:42 PM, NeilBrown wrote:
> On Thu, 31 Jul 2014 11:00:35 -0700 Ben Greear <greearb@candelatech.com> wrote:
> 
>> So, this has been asked all over the interweb for years and years, but the best answer I can find is to reboot the system or create a fake NFS server
>> somewhere with the same IP as the gone-away NFS server.
>> 
>> The problem is:
>> 
>> I have some mounts to an NFS server that no longer exists (crashed/powered down).
>> 
>> I have some processes stuck trying to write to files open on these mounts.
>> 
>> I want to kill the process and unmount.
>> 
>> umount -l will make the mount go away, sort of.  But process is still hung.
>> umount -f complains:
>>   umount2:  Device or resource busy
>>   umount.nfs: /mnt/foo: device is busy
>> 
>> kill -9 does not work on process.
> 
> Kill -1 should work (since about 2.6.25 or so).

That is -[ONE], right?  Assuming so, it did not work for me.

Kernel is 3.14.4+, with some extra patches, but probably nothing that
influences this particular behaviour.

[root@lf1005-14010010 ~]# cat /proc/3805/stack
[<ffffffff811371ba>] sleep_on_page+0x9/0xd
[<ffffffff8113738e>] wait_on_page_bit+0x71/0x78
[<ffffffff8113769a>] filemap_fdatawait_range+0xa2/0x16d
[<ffffffff8113780e>] filemap_write_and_wait_range+0x3b/0x77
[<ffffffffa0f04734>] nfs_file_fsync+0x37/0x83 [nfs]
[<ffffffff811a8d32>] vfs_fsync_range+0x19/0x1b
[<ffffffff811a8d4b>] vfs_fsync+0x17/0x19
[<ffffffffa0f05305>] nfs_file_flush+0x6b/0x6f [nfs]
[<ffffffff81183e46>] filp_close+0x3f/0x71
[<ffffffff8119c8ae>] __close_fd+0x80/0x98
[<ffffffff81183de5>] SyS_close+0x1c/0x3e
[<ffffffff815c55f9>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
[root@lf1005-14010010 ~]# kill -1 3805
[root@lf1005-14010010 ~]# cat /proc/3805/stack
[<ffffffff811371ba>] sleep_on_page+0x9/0xd
[<ffffffff8113738e>] wait_on_page_bit+0x71/0x78
[<ffffffff8113769a>] filemap_fdatawait_range+0xa2/0x16d
[<ffffffff8113780e>] filemap_write_and_wait_range+0x3b/0x77
[<ffffffffa0f04734>] nfs_file_fsync+0x37/0x83 [nfs]
[<ffffffff811a8d32>] vfs_fsync_range+0x19/0x1b
[<ffffffff811a8d4b>] vfs_fsync+0x17/0x19
[<ffffffffa0f05305>] nfs_file_flush+0x6b/0x6f [nfs]
[<ffffffff81183e46>] filp_close+0x3f/0x71
[<ffffffff8119c8ae>] __close_fd+0x80/0x98
[<ffffffff81183de5>] SyS_close+0x1c/0x3e
[<ffffffff815c55f9>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: Killing process in D state on mount to dead NFS server. (when process is in fsync)
  2014-07-31 21:20   ` Ben Greear
@ 2014-07-31 21:50     ` NeilBrown
  2014-08-01 12:47       ` Jan Kara
  2014-08-02  1:21       ` Jeff Layton
  2014-08-13 15:42     ` Killing process in D state on mount to dead NFS server Ben Greear
  1 sibling, 2 replies; 16+ messages in thread
From: NeilBrown @ 2014-07-31 21:50 UTC (permalink / raw)
  To: Ben Greear, Andrew Morton
  Cc: linux-nfs@vger.kernel.org, linux-kernel, linux-mm, linux-fsdevel

On Thu, 31 Jul 2014 14:20:07 -0700 Ben Greear <greearb@candelatech.com> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 07/31/2014 01:42 PM, NeilBrown wrote:
> > On Thu, 31 Jul 2014 11:00:35 -0700 Ben Greear <greearb@candelatech.com> wrote:
> > 
> >> So, this has been asked all over the interweb for years and years, but the best answer I can find is to reboot the system or create a fake NFS server
> >> somewhere with the same IP as the gone-away NFS server.
> >> 
> >> The problem is:
> >> 
> >> I have some mounts to an NFS server that no longer exists (crashed/powered down).
> >> 
> >> I have some processes stuck trying to write to files open on these mounts.
> >> 
> >> I want to kill the process and unmount.
> >> 
> >> umount -l will make the mount go away, sort of.  But process is still hung.
> >> umount -f complains:
> >>   umount2:  Device or resource busy
> >>   umount.nfs: /mnt/foo: device is busy
> >> 
> >> kill -9 does not work on process.
> > 
> > Kill -1 should work (since about 2.6.25 or so).
> 
> That is -[ONE], right?  Assuming so, it did not work for me.

No, it was "-9" .... sorry, I really shouldn't be let out without my proof
reader.

However the 'stack' is sufficient to see what is going on.

The problem is that it is blocked inside the "VM" well away from NFS and
there is no way for NFS to say "give up and go home".
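
For readers who want to check a trace the same way, here is a rough sketch
that reads a /proc/<pid>/stack dump and lists the frames sitting above the
innermost [nfs] frame. The regular expression is an assumption based on the
format of the trace in this thread, and parse_frames and frames_inside are
hypothetical helper names.

```python
import re

# Matches lines such as "[<ffffffffa0f04734>] nfs_file_fsync+0x37/0x83 [nfs]".
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)

SAMPLE = """\
[<ffffffff811371ba>] sleep_on_page+0x9/0xd
[<ffffffff8113738e>] wait_on_page_bit+0x71/0x78
[<ffffffff8113769a>] filemap_fdatawait_range+0xa2/0x16d
[<ffffffff8113780e>] filemap_write_and_wait_range+0x3b/0x77
[<ffffffffa0f04734>] nfs_file_fsync+0x37/0x83 [nfs]
[<ffffffff811a8d32>] vfs_fsync_range+0x19/0x1b
[<ffffffff811a8d4b>] vfs_fsync+0x17/0x19
[<ffffffffa0f05305>] nfs_file_flush+0x6b/0x6f [nfs]
[<ffffffff81183e46>] filp_close+0x3f/0x71
[<ffffffff8119c8ae>] __close_fd+0x80/0x98
[<ffffffff81183de5>] SyS_close+0x1c/0x3e
[<ffffffff815c55f9>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
"""


def parse_frames(stack_text):
    """Return [(function, module_or_None), ...], innermost frame first."""
    frames = []
    for line in stack_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group(1), match.group(2)))
    return frames


def frames_inside(frames, module="nfs"):
    """Names of the frames entered after the task last left `module` code."""
    for index, (_, mod) in enumerate(frames):
        if mod == module:
            return [name for name, _ in frames[:index]]
    return []


if __name__ == "__main__":
    for name in frames_inside(parse_frames(SAMPLE)):
        print(name)
```

On the trace from this thread, the frames above nfs_file_fsync are all
generic page-cache wait functions, which is the "well away from NFS"
situation described above.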

I'd suggest that is a bug.   I cannot see any justification for fsync to not
be killable.
It wouldn't be too hard to create a patch to make it so.
It would be a little harder to examine all call paths and create a
convincing case that the patch was safe.
It might be a herculean task to convince others that it was the right thing
to do.... so let's start with that one.

Hi Linux-mm and fs-devel people.  What do people think of making "fsync" and
variants "KILLABLE" ??

I probably only need a little bit of encouragement to write a patch....

Thanks,
NeilBrown


* Re: Killing process in D state on mount to dead NFS server. (when process is in fsync)
  2014-07-31 21:50     ` Killing process in D state on mount to dead NFS server. (when process is in fsync) NeilBrown
@ 2014-08-01 12:47       ` Jan Kara
  2014-08-02  1:21       ` Jeff Layton
  1 sibling, 0 replies; 16+ messages in thread
From: Jan Kara @ 2014-08-01 12:47 UTC (permalink / raw)
  To: NeilBrown
  Cc: Ben Greear, Andrew Morton, linux-nfs@vger.kernel.org,
	linux-kernel, linux-mm, linux-fsdevel

On Fri 01-08-14 07:50:53, NeilBrown wrote:
> On Thu, 31 Jul 2014 14:20:07 -0700 Ben Greear <greearb@candelatech.com> wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> > 
> > On 07/31/2014 01:42 PM, NeilBrown wrote:
> > > On Thu, 31 Jul 2014 11:00:35 -0700 Ben Greear <greearb@candelatech.com> wrote:
> > > 
> > >> So, this has been asked all over the interweb for years and years, but the best answer I can find is to reboot the system or create a fake NFS server
> > >> somewhere with the same IP as the gone-away NFS server.
> > >> 
> > >> The problem is:
> > >> 
> > >> I have some mounts to an NFS server that no longer exists (crashed/powered down).
> > >> 
> > >> I have some processes stuck trying to write to files open on these mounts.
> > >> 
> > >> I want to kill the process and unmount.
> > >> 
> > >> umount -l will make the mount go away, sort of.  But process is still hung.
> > >> umount -f complains:
> > >>   umount2:  Device or resource busy
> > >>   umount.nfs: /mnt/foo: device is busy
> > >> 
> > >> kill -9 does not work on process.
> > > 
> > > Kill -1 should work (since about 2.6.25 or so).
> > 
> > That is -[ONE], right?  Assuming so, it did not work for me.
> 
> No, it was "-9" .... sorry, I really shouldn't be let out without my proof
> reader.
> 
> However the 'stack' is sufficient to see what is going on.
> 
> The problem is that it is blocked inside the "VM" well away from NFS and
> there is no way for NFS to say "give up and go home".
> 
> I'd suggest that is a bug.   I cannot see any justification for fsync to not
> be killable.
> It wouldn't be too hard to create a patch to make it so.
> It would be a little harder to examine all call paths and create a
> convincing case that the patch was safe.
> It might be a herculean task to convince others that it was the right thing
> to do.... so let's start with that one.
> 
> Hi Linux-mm and fs-devel people.  What do people think of making "fsync" and
> variants "KILLABLE" ??
  Sounds useful to me and I don't see how it could break some
application...

								Honza

-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: Killing process in D state on mount to dead NFS server. (when process is in fsync)
  2014-07-31 21:50     ` Killing process in D state on mount to dead NFS server. (when process is in fsync) NeilBrown
  2014-08-01 12:47       ` Jan Kara
@ 2014-08-02  1:21       ` Jeff Layton
  2014-08-02  1:50         ` Roger Heflin
  2014-08-02  2:55         ` Trond Myklebust
  1 sibling, 2 replies; 16+ messages in thread
From: Jeff Layton @ 2014-08-02  1:21 UTC (permalink / raw)
  To: NeilBrown
  Cc: Ben Greear, Andrew Morton, linux-nfs@vger.kernel.org,
	linux-kernel, linux-mm, linux-fsdevel

On Fri, 1 Aug 2014 07:50:53 +1000
NeilBrown <neilb@suse.de> wrote:

> On Thu, 31 Jul 2014 14:20:07 -0700 Ben Greear <greearb@candelatech.com> wrote:
> 
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> > 
> > On 07/31/2014 01:42 PM, NeilBrown wrote:
> > > On Thu, 31 Jul 2014 11:00:35 -0700 Ben Greear <greearb@candelatech.com> wrote:
> > > 
> > >> So, this has been asked all over the interweb for years and years, but the best answer I can find is to reboot the system or create a fake NFS server
> > >> somewhere with the same IP as the gone-away NFS server.
> > >> 
> > >> The problem is:
> > >> 
> > >> I have some mounts to an NFS server that no longer exists (crashed/powered down).
> > >> 
> > >> I have some processes stuck trying to write to files open on these mounts.
> > >> 
> > >> I want to kill the process and unmount.
> > >> 
> > >> umount -l will make the mount go away, sort of.  But process is still hung.
> > >> umount -f complains:
> > >>   umount2:  Device or resource busy
> > >>   umount.nfs: /mnt/foo: device is busy
> > >> 
> > >> kill -9 does not work on process.
> > > 
> > > Kill -1 should work (since about 2.6.25 or so).
> > 
> > That is -[ONE], right?  Assuming so, it did not work for me.
> 
> No, it was "-9" .... sorry, I really shouldn't be let out without my proof
> reader.
> 
> However the 'stack' is sufficient to see what is going on.
> 
> The problem is that it is blocked inside the "VM" well away from NFS and
> there is no way for NFS to say "give up and go home".
> 
> I'd suggest that is a bug.   I cannot see any justification for fsync to not
> be killable.
> It wouldn't be too hard to create a patch to make it so.
> It would be a little harder to examine all call paths and create a
> convincing case that the patch was safe.
> It might be a herculean task to convince others that it was the right thing
> to do.... so let's start with that one.
> 
> Hi Linux-mm and fs-devel people.  What do people think of making "fsync" and
> variants "KILLABLE" ??
> 
> I probably only need a little bit of encouragement to write a patch....
> 
> Thanks,
> NeilBrown
> 


It would be good to fix this in some fashion once and for all, and the
wait_on_page_writeback wait is a major source of pain for a lot of
people.

So to summarize...

The problem in a nutshell is that Ben has some cached writes to the
NFS server, but the server has gone away (presumably forever). The
question is -- how do we communicate to the kernel that that server
isn't coming back and that those dirty pages should be invalidated so
that we can umount the filesystem?

Allowing fsync/close to be killable sounds reasonable to me as at least
a partial solution. Both close(2) and fsync(2) are allowed to return
EINTR according to the POSIX spec. Allowing a kill -9 there seems
like it should be fine, and maybe we ought to even consider letting it
be susceptible to lesser signals.

That still leaves some open questions though...

Is that enough to fix it? You'd still have the dirty pages lingering
around, right? Would a umount -f presumably work at that point?

-- 
Jeff Layton <jlayton@poochiereds.net>


* Re: Killing process in D state on mount to dead NFS server. (when process is in fsync)
  2014-08-02  1:21       ` Jeff Layton
@ 2014-08-02  1:50         ` Roger Heflin
  2014-08-02  2:07           ` Jeff Layton
  2014-08-02  2:55         ` Trond Myklebust
  1 sibling, 1 reply; 16+ messages in thread
From: Roger Heflin @ 2014-08-02  1:50 UTC (permalink / raw)
  To: Jeff Layton
  Cc: NeilBrown, Ben Greear, Andrew Morton, linux-nfs@vger.kernel.org,
	Kernel development list, linux-mm, linux-fsdevel

Doesn't NFS have an intr flag to allow kill -9 to work?  Whenever I
have had that set, it has appeared to work after about 30 seconds or
so.  Without it, kill -9 does not work when the NFS server is
missing.



On Fri, Aug 1, 2014 at 8:21 PM, Jeff Layton <jlayton@poochiereds.net> wrote:
> On Fri, 1 Aug 2014 07:50:53 +1000
> NeilBrown <neilb@suse.de> wrote:
>
>> On Thu, 31 Jul 2014 14:20:07 -0700 Ben Greear <greearb@candelatech.com> wrote:
>>
>> > -----BEGIN PGP SIGNED MESSAGE-----
>> > Hash: SHA1
>> >
>> > On 07/31/2014 01:42 PM, NeilBrown wrote:
>> > > On Thu, 31 Jul 2014 11:00:35 -0700 Ben Greear <greearb@candelatech.com> wrote:
>> > >
>> > >> So, this has been asked all over the interweb for years and years, but the best answer I can find is to reboot the system or create a fake NFS server
>> > >> somewhere with the same IP as the gone-away NFS server.
>> > >>
>> > >> The problem is:
>> > >>
>> > >> I have some mounts to an NFS server that no longer exists (crashed/powered down).
>> > >>
>> > >> I have some processes stuck trying to write to files open on these mounts.
>> > >>
>> > >> I want to kill the process and unmount.
>> > >>
>> > >> umount -l will make the mount go a way, sort of.  But process is still hung. umount -f complains: umount2:  Device or resource busy umount.nfs: /mnt/foo:
>> > >> device is busy
>> > >>
>> > >> kill -9 does not work on process.
>> > >
>> > > Kill -1 should work (since about 2.6.25 or so).
>> >
>> > That is -[ONE], right?  Assuming so, it did not work for me.
>>
>> No, it was "-9" .... sorry, I really shouldn't be let out without my proof
>> reader.
>>
>> However the 'stack' is sufficient to see what is going on.
>>
>> The problem is that it is blocked inside the "VM" well away from NFS and
>> there is no way for NFS to say "give up and go home".
>>
>> I'd suggest that is a bug.   I cannot see any justification for fsync to not
>> be killable.
>> It wouldn't be too hard to create a patch to make it so.
>> It would be a little harder to examine all call paths and create a
>> convincing case that the patch was safe.
>> It might be herculean task to convince others that it was the right thing
>> to do.... so let's start with that one.
>>
>> Hi Linux-mm and fs-devel people.  What do people think of making "fsync" and
>> variants "KILLABLE" ??
>>
>> I probably only need a little bit of encouragement to write a patch....
>>
>> Thanks,
>> NeilBrown
>>
>
>
> It would be good to fix this in some fashion once and for all, and the
> wait_on_page_writeback wait is a major source of pain for a lot of
> people.
>
> So to summarize...
>
> The problem in a nutshell is that Ben has some cached writes to the
> NFS server, but the server has gone away (presumably forever). The
> question is -- how do we communicate to the kernel that that server
> isn't coming back and that those dirty pages should be invalidated so
> that we can umount the filesystem?
>
> Allowing fsync/close to be killable sounds reasonable to me as at least
> a partial solution. Both close(2) and fsync(2) are allowed to return
> EINTR according to the POSIX spec. Allowing a kill -9 there seems
> like it should be fine, and maybe we ought to even consider letting it
> be susceptible to lesser signals.
>
> That still leaves some open questions though...
>
> Is that enough to fix it? You'd still have the dirty pages lingering
> around, right? Would a umount -f presumably work at that point?
>
>> >
>> > Kernel is 3.14.4+, with some of extra patches, but probably nothing that
>> > influences this particular behaviour.
>> >
>> > [root@lf1005-14010010 ~]# cat /proc/3805/stack
>> > [<ffffffff811371ba>] sleep_on_page+0x9/0xd
>> > [<ffffffff8113738e>] wait_on_page_bit+0x71/0x78
>> > [<ffffffff8113769a>] filemap_fdatawait_range+0xa2/0x16d
>> > [<ffffffff8113780e>] filemap_write_and_wait_range+0x3b/0x77
>> > [<ffffffffa0f04734>] nfs_file_fsync+0x37/0x83 [nfs]
>> > [<ffffffff811a8d32>] vfs_fsync_range+0x19/0x1b
>> > [<ffffffff811a8d4b>] vfs_fsync+0x17/0x19
>> > [<ffffffffa0f05305>] nfs_file_flush+0x6b/0x6f [nfs]
>> > [<ffffffff81183e46>] filp_close+0x3f/0x71
>> > [<ffffffff8119c8ae>] __close_fd+0x80/0x98
>> > [<ffffffff81183de5>] SyS_close+0x1c/0x3e
>> > [<ffffffff815c55f9>] system_call_fastpath+0x16/0x1b
>> > [<ffffffffffffffff>] 0xffffffffffffffff
>> > [root@lf1005-14010010 ~]# kill -1 3805
>> > [root@lf1005-14010010 ~]# cat /proc/3805/stack
>> > [<ffffffff811371ba>] sleep_on_page+0x9/0xd
>> > [<ffffffff8113738e>] wait_on_page_bit+0x71/0x78
>> > [<ffffffff8113769a>] filemap_fdatawait_range+0xa2/0x16d
>> > [<ffffffff8113780e>] filemap_write_and_wait_range+0x3b/0x77
>> > [<ffffffffa0f04734>] nfs_file_fsync+0x37/0x83 [nfs]
>> > [<ffffffff811a8d32>] vfs_fsync_range+0x19/0x1b
>> > [<ffffffff811a8d4b>] vfs_fsync+0x17/0x19
>> > [<ffffffffa0f05305>] nfs_file_flush+0x6b/0x6f [nfs]
>> > [<ffffffff81183e46>] filp_close+0x3f/0x71
>> > [<ffffffff8119c8ae>] __close_fd+0x80/0x98
>> > [<ffffffff81183de5>] SyS_close+0x1c/0x3e
>> > [<ffffffff815c55f9>] system_call_fastpath+0x16/0x1b
>> > [<ffffffffffffffff>] 0xffffffffffffffff
>> >
>> > Thanks,
>> > Ben
>> >
>> > > If it doesn't please report the kernel version and cat /proc/$PID/stack
>> > >
>> > > for some processes that cannot be killed.
>> > >
>> > > NeilBrown
>> > >
>> > >>
>> > >>
>> > >> Aside from bringing a fake NFS server back up on the same IP, is there any other way to get these mounts unmounted and the processes killed without
>> > >> rebooting?
>> > >>
>> > >> Thanks, Ben
>> > >>
>> > >
>> >
>> >
>> > - --
>> > Ben Greear <greearb@candelatech.com>
>> > Candela Technologies Inc  http://www.candelatech.com
>> >
>> > -----BEGIN PGP SIGNATURE-----
>> > Version: GnuPG v1.4.13 (GNU/Linux)
>> > Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>> >
>> > iQEcBAEBAgAGBQJT2rLiAAoJELbHqkYeJT4OqPgH/0taKW6Be90c1mETZf9yeqZF
>> > YMLZk8XC2wloEd9nVz//mXREmiu18Hc+5p7Upd4Os21J2P4PBMGV6P/9DMxxehwH
>> > YX1HKha0EoAsbO5ILQhbLf83cRXAPEpvJPgYHrq6xjlKB8Q8OxxND37rY7kl19Zz
>> > sdAw6GiqHICF3Hq1ATa/jvixMluDnhER9Dln3wOdAGzmmuFYqpTsV4EwzbKKqInJ
>> > 6C15q+cq/9aYh6usN6z2qJhbHgqM9EWcPL6jOrCwX4PbC1XjKHekpFN0t9oKQClx
>> > qSPuweMQ7fP4IBd2Ke8L/QlyOVblAKSE7t+NdrjfzLmYPzyHTyfLABR/BI053to=
>> > =/9FJ
>> > -----END PGP SIGNATURE-----
>>
>
>
> --
> Jeff Layton <jlayton@poochiereds.net>


* Re: Killing process in D state on mount to dead NFS server. (when process is in fsync)
  2014-08-02  1:50         ` Roger Heflin
@ 2014-08-02  2:07           ` Jeff Layton
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff Layton @ 2014-08-02  2:07 UTC (permalink / raw)
  To: Roger Heflin
  Cc: NeilBrown, Ben Greear, Andrew Morton, linux-nfs@vger.kernel.org,
	Kernel development list, linux-mm, linux-fsdevel

On Fri, 1 Aug 2014 20:50:13 -0500
Roger Heflin <rogerheflin@gmail.com> wrote:

> Doesn't NFS have an intr flag to allow kill -9 to work?   Whenever I
> have had that set it has appeared to work after about 30 seconds or
> so...without that kill -9 does not work when the nfs server is
> missing.
> 
> 

Not anymore. That mount option has been deprecated (and ignored) for
years. The code in the RPC engine will generally give up in the face of
fatal signals. In this case though, we're in uninterruptible sleep in
the bowels of the writeback code.

The problem here is not really specific to NFS per se -- it just
happens to be the filesystem that most people notice it on.
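
To check whether a task is in this uninterruptible state, the sketch below scans /proc for processes in state D and prints each one's kernel stack, the same data as the `cat /proc/$PID/stack` output quoted in this thread. It is only a minimal sketch, not a tool anyone in the thread used: the names `parse_stat_state` and `find_d_state` are illustrative, and reading /proc/<pid>/stack normally requires root.

```python
import os


def parse_stat_state(stat_line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line.

    The comm field is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the LAST ')' rather than on spaces.
    """
    left, _, right = stat_line.rpartition(")")
    pid_str, _, comm = left.partition("(")
    state = right.split()[0]
    return int(pid_str), comm, state


def find_d_state(proc_root="/proc"):
    """Yield (pid, comm) for every task currently in state D."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat"), errors="replace") as f:
                pid, comm, state = parse_stat_state(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        if state == "D":
            yield pid, comm


if __name__ == "__main__":
    for pid, comm in find_d_state():
        print(f"{pid} ({comm}) is in uninterruptible sleep")
        try:
            with open(f"/proc/{pid}/stack") as f:
                print(f.read())
        except OSError as exc:
            print(f"  cannot read kernel stack: {exc}")
```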

> 
> On Fri, Aug 1, 2014 at 8:21 PM, Jeff Layton <jlayton@poochiereds.net> wrote:
> > On Fri, 1 Aug 2014 07:50:53 +1000
> > NeilBrown <neilb@suse.de> wrote:
> >
> >> On Thu, 31 Jul 2014 14:20:07 -0700 Ben Greear <greearb@candelatech.com> wrote:
> >>
> >> > -----BEGIN PGP SIGNED MESSAGE-----
> >> > Hash: SHA1
> >> >
> >> > On 07/31/2014 01:42 PM, NeilBrown wrote:
> >> > > On Thu, 31 Jul 2014 11:00:35 -0700 Ben Greear <greearb@candelatech.com> wrote:
> >> > >
> >> > >> So, this has been asked all over the interweb for years and years, but the best answer I can find is to reboot the system or create a fake NFS server
> >> > >> somewhere with the same IP as the gone-away NFS server.
> >> > >>
> >> > >> The problem is:
> >> > >>
> >> > >> I have some mounts to an NFS server that no longer exists (crashed/powered down).
> >> > >>
> >> > >> I have some processes stuck trying to write to files open on these mounts.
> >> > >>
> >> > >> I want to kill the process and unmount.
> >> > >>
> >> > >> umount -l will make the mount go away, sort of.  But process is still hung.
> >> > >> umount -f complains:
> >> > >>   umount2:  Device or resource busy
> >> > >>   umount.nfs: /mnt/foo: device is busy
> >> > >>
> >> > >> kill -9 does not work on process.
> >> > >
> >> > > Kill -1 should work (since about 2.6.25 or so).
> >> >
> >> > That is -[ONE], right?  Assuming so, it did not work for me.
> >>
> >> No, it was "-9" .... sorry, I really shouldn't be let out without my proof
> >> reader.
> >>
> >> However the 'stack' is sufficient to see what is going on.
> >>
> >> The problem is that it is blocked inside the "VM" well away from NFS and
> >> there is no way for NFS to say "give up and go home".
> >>
> >> I'd suggest that is a bug.   I cannot see any justification for fsync to not
> >> be killable.
> >> It wouldn't be too hard to create a patch to make it so.
> >> It would be a little harder to examine all call paths and create a
> >> convincing case that the patch was safe.
> >> It might be a herculean task to convince others that it was the right thing
> >> to do.... so let's start with that one.
> >>
> >> Hi Linux-mm and fs-devel people.  What do people think of making "fsync" and
> >> variants "KILLABLE" ??
> >>
> >> I probably only need a little bit of encouragement to write a patch....
> >>
> >> Thanks,
> >> NeilBrown
> >>
> >
> >
> > It would be good to fix this in some fashion once and for all, and the
> > wait_on_page_writeback wait is a major source of pain for a lot of
> > people.
> >
> > So to summarize...
> >
> > The problem in a nutshell is that Ben has some cached writes to the
> > NFS server, but the server has gone away (presumably forever). The
> > question is -- how do we communicate to the kernel that that server
> > isn't coming back and that those dirty pages should be invalidated so
> > that we can umount the filesystem?
> >
> > Allowing fsync/close to be killable sounds reasonable to me as at least
> > a partial solution. Both close(2) and fsync(2) are allowed to return
> > EINTR according to the POSIX spec. Allowing a kill -9 there seems
> > like it should be fine, and maybe we ought to even consider letting it
> > be susceptible to lesser signals.
> >
> > That still leaves some open questions though...
> >
> > Is that enough to fix it? You'd still have the dirty pages lingering
> > around, right? Would a umount -f presumably work at that point?
> >
> >> >
> >> > Kernel is 3.14.4+, with some extra patches, but probably nothing that
> >> > influences this particular behaviour.
> >> >
> >> > [root@lf1005-14010010 ~]# cat /proc/3805/stack
> >> > [<ffffffff811371ba>] sleep_on_page+0x9/0xd
> >> > [<ffffffff8113738e>] wait_on_page_bit+0x71/0x78
> >> > [<ffffffff8113769a>] filemap_fdatawait_range+0xa2/0x16d
> >> > [<ffffffff8113780e>] filemap_write_and_wait_range+0x3b/0x77
> >> > [<ffffffffa0f04734>] nfs_file_fsync+0x37/0x83 [nfs]
> >> > [<ffffffff811a8d32>] vfs_fsync_range+0x19/0x1b
> >> > [<ffffffff811a8d4b>] vfs_fsync+0x17/0x19
> >> > [<ffffffffa0f05305>] nfs_file_flush+0x6b/0x6f [nfs]
> >> > [<ffffffff81183e46>] filp_close+0x3f/0x71
> >> > [<ffffffff8119c8ae>] __close_fd+0x80/0x98
> >> > [<ffffffff81183de5>] SyS_close+0x1c/0x3e
> >> > [<ffffffff815c55f9>] system_call_fastpath+0x16/0x1b
> >> > [<ffffffffffffffff>] 0xffffffffffffffff
> >> > [root@lf1005-14010010 ~]# kill -1 3805
> >> > [root@lf1005-14010010 ~]# cat /proc/3805/stack
> >> > [<ffffffff811371ba>] sleep_on_page+0x9/0xd
> >> > [<ffffffff8113738e>] wait_on_page_bit+0x71/0x78
> >> > [<ffffffff8113769a>] filemap_fdatawait_range+0xa2/0x16d
> >> > [<ffffffff8113780e>] filemap_write_and_wait_range+0x3b/0x77
> >> > [<ffffffffa0f04734>] nfs_file_fsync+0x37/0x83 [nfs]
> >> > [<ffffffff811a8d32>] vfs_fsync_range+0x19/0x1b
> >> > [<ffffffff811a8d4b>] vfs_fsync+0x17/0x19
> >> > [<ffffffffa0f05305>] nfs_file_flush+0x6b/0x6f [nfs]
> >> > [<ffffffff81183e46>] filp_close+0x3f/0x71
> >> > [<ffffffff8119c8ae>] __close_fd+0x80/0x98
> >> > [<ffffffff81183de5>] SyS_close+0x1c/0x3e
> >> > [<ffffffff815c55f9>] system_call_fastpath+0x16/0x1b
> >> > [<ffffffffffffffff>] 0xffffffffffffffff
> >> >
> >> > Thanks,
> >> > Ben
> >> >
> >> > > If it doesn't please report the kernel version and cat /proc/$PID/stack
> >> > >
> >> > > for some processes that cannot be killed.
> >> > >
> >> > > NeilBrown
> >> > >
> >> > >>
> >> > >>
> >> > >> Aside from bringing a fake NFS server back up on the same IP, is there any other way to get these mounts unmounted and the processes killed without
> >> > >> rebooting?
> >> > >>
> >> > >> Thanks, Ben
> >> > >>
> >> > >
> >> >
> >> >
> >> > - --
> >> > Ben Greear <greearb@candelatech.com>
> >> > Candela Technologies Inc  http://www.candelatech.com
> >> >
> >> > -----BEGIN PGP SIGNATURE-----
> >> > Version: GnuPG v1.4.13 (GNU/Linux)
> >> > Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
> >> >
> >> > iQEcBAEBAgAGBQJT2rLiAAoJELbHqkYeJT4OqPgH/0taKW6Be90c1mETZf9yeqZF
> >> > YMLZk8XC2wloEd9nVz//mXREmiu18Hc+5p7Upd4Os21J2P4PBMGV6P/9DMxxehwH
> >> > YX1HKha0EoAsbO5ILQhbLf83cRXAPEpvJPgYHrq6xjlKB8Q8OxxND37rY7kl19Zz
> >> > sdAw6GiqHICF3Hq1ATa/jvixMluDnhER9Dln3wOdAGzmmuFYqpTsV4EwzbKKqInJ
> >> > 6C15q+cq/9aYh6usN6z2qJhbHgqM9EWcPL6jOrCwX4PbC1XjKHekpFN0t9oKQClx
> >> > qSPuweMQ7fP4IBd2Ke8L/QlyOVblAKSE7t+NdrjfzLmYPzyHTyfLABR/BI053to=
> >> > =/9FJ
> >> > -----END PGP SIGNATURE-----
> >>
> >
> >
> > --
> > Jeff Layton <jlayton@poochiereds.net>


-- 
Jeff Layton <jlayton@poochiereds.net>


* Re: Killing process in D state on mount to dead NFS server. (when process is in fsync)
  2014-08-02  1:21       ` Jeff Layton
  2014-08-02  1:50         ` Roger Heflin
@ 2014-08-02  2:55         ` Trond Myklebust
  2014-08-02  3:19           ` NeilBrown
  1 sibling, 1 reply; 16+ messages in thread
From: Trond Myklebust @ 2014-08-02  2:55 UTC (permalink / raw)
  To: Jeff Layton
  Cc: NeilBrown, Ben Greear, Andrew Morton, linux-nfs@vger.kernel.org,
	linux-kernel, linux-mm, linux-fsdevel

On Fri, Aug 1, 2014 at 9:21 PM, Jeff Layton <jlayton@poochiereds.net> wrote:
> On Fri, 1 Aug 2014 07:50:53 +1000
> NeilBrown <neilb@suse.de> wrote:
>
>> On Thu, 31 Jul 2014 14:20:07 -0700 Ben Greear <greearb@candelatech.com> wrote:
>>
>> > -----BEGIN PGP SIGNED MESSAGE-----
>> > Hash: SHA1
>> >
>> > On 07/31/2014 01:42 PM, NeilBrown wrote:
>> > > On Thu, 31 Jul 2014 11:00:35 -0700 Ben Greear <greearb@candelatech.com> wrote:
>> > >
>> > >> So, this has been asked all over the interweb for years and years, but the best answer I can find is to reboot the system or create a fake NFS server
>> > >> somewhere with the same IP as the gone-away NFS server.
>> > >>
>> > >> The problem is:
>> > >>
>> > >> I have some mounts to an NFS server that no longer exists (crashed/powered down).
>> > >>
>> > >> I have some processes stuck trying to write to files open on these mounts.
>> > >>
>> > >> I want to kill the process and unmount.
>> > >>
>> > >> umount -l will make the mount go away, sort of.  But process is still hung.
>> > >> umount -f complains:
>> > >>   umount2:  Device or resource busy
>> > >>   umount.nfs: /mnt/foo: device is busy
>> > >>
>> > >> kill -9 does not work on process.
>> > >
>> > > Kill -1 should work (since about 2.6.25 or so).
>> >
>> > That is -[ONE], right?  Assuming so, it did not work for me.
>>
>> No, it was "-9" .... sorry, I really shouldn't be let out without my proof
>> reader.
>>
>> However the 'stack' is sufficient to see what is going on.
>>
>> The problem is that it is blocked inside the "VM" well away from NFS and
>> there is no way for NFS to say "give up and go home".
>>
>> I'd suggest that is a bug.   I cannot see any justification for fsync to not
>> be killable.
>> It wouldn't be too hard to create a patch to make it so.
>> It would be a little harder to examine all call paths and create a
>> convincing case that the patch was safe.
>> It might be a herculean task to convince others that it was the right thing
>> to do.... so let's start with that one.
>>
>> Hi Linux-mm and fs-devel people.  What do people think of making "fsync" and
>> variants "KILLABLE" ??
>>
>> I probably only need a little bit of encouragement to write a patch....
>>
>> Thanks,
>> NeilBrown
>>
>
>
> It would be good to fix this in some fashion once and for all, and the
> wait_on_page_writeback wait is a major source of pain for a lot of
> people.
>
> So to summarize...
>
> The problem in a nutshell is that Ben has some cached writes to the
> NFS server, but the server has gone away (presumably forever). The
> question is -- how do we communicate to the kernel that that server
> isn't coming back and that those dirty pages should be invalidated so
> that we can umount the filesystem?
>
> Allowing fsync/close to be killable sounds reasonable to me as at least
> a partial solution. Both close(2) and fsync(2) are allowed to return
> EINTR according to the POSIX spec. Allowing a kill -9 there seems
> like it should be fine, and maybe we ought to even consider letting it
> be susceptible to lesser signals.
>
> That still leaves some open questions though...
>
> Is that enough to fix it? You'd still have the dirty pages lingering
> around, right? Would a umount -f presumably work at that point?

'umount -f' will kill any outstanding RPC calls that are causing the
mount to hang, but doesn't do anything to change page states or NFS
file/lock states.

Cheers
  Trond


* Re: Killing process in D state on mount to dead NFS server. (when process is in fsync)
  2014-08-02  2:55         ` Trond Myklebust
@ 2014-08-02  3:19           ` NeilBrown
  2014-08-02  3:44             ` Trond Myklebust
  0 siblings, 1 reply; 16+ messages in thread
From: NeilBrown @ 2014-08-02  3:19 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Jeff Layton, Ben Greear, Andrew Morton, linux-nfs@vger.kernel.org,
	linux-kernel, linux-mm, linux-fsdevel


On Fri, 1 Aug 2014 22:55:42 -0400 Trond Myklebust <trondmy@gmail.com> wrote:

> > That still leaves some open questions though...
> >
> > Is that enough to fix it? You'd still have the dirty pages lingering
> > around, right? Would a umount -f presumably work at that point?
> 
> 'umount -f' will kill any outstanding RPC calls that are causing the
> mount to hang, but doesn't do anything to change page states or NFS
> file/lock states.

Should it though?

       MNT_FORCE (since Linux 2.1.116)
              Force  unmount  even  if busy.  This can cause data loss.  (Only
              for NFS mounts.)

Given that data loss is explicitly permitted, I suspect it should.

Can we make MNT_FORCE on NFS not only abort outstanding RPC calls, but
fail all subsequent RPC calls?  That might make it really useful.   You
wouldn't even need to "kill -9" then.

NeilBrown
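
The MNT_FORCE flag quoted above is the flag that `umount -f` hands to the umount2(2) system call. As a rough illustration of that call as it exists today (not of the behaviour change being proposed), the sketch below invokes umount2 through ctypes. The helper name `force_umount` is hypothetical, the flag value 1 is taken from the Linux <sys/mount.h> header, and the call needs root to succeed.

```python
import ctypes
import errno
import os

MNT_FORCE = 1  # value of MNT_FORCE in Linux <sys/mount.h>


def force_umount(target):
    """Call umount2(target, MNT_FORCE); return 0 on success, else the errno."""
    # CDLL(None) resolves symbols from the running process, which includes libc.
    libc = ctypes.CDLL(None, use_errno=True)
    libc.umount2.argtypes = [ctypes.c_char_p, ctypes.c_int]
    libc.umount2.restype = ctypes.c_int
    if libc.umount2(os.fsencode(target), MNT_FORCE) == 0:
        return 0
    return ctypes.get_errno()


if __name__ == "__main__":
    err = force_umount("/mnt/foo")  # mount point from the original report
    if err:
        print("umount2 failed:", errno.errorcode.get(err, err), os.strerror(err))
    else:
        print("unmounted")
```

In the situation Ben reported, this call is the one that came back with "Device or resource busy" (EBUSY).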



* Re: Killing process in D state on mount to dead NFS server. (when process is in fsync)
  2014-08-02  3:19           ` NeilBrown
@ 2014-08-02  3:44             ` Trond Myklebust
  0 siblings, 0 replies; 16+ messages in thread
From: Trond Myklebust @ 2014-08-02  3:44 UTC (permalink / raw)
  To: NeilBrown
  Cc: Jeff Layton, Ben Greear, Andrew Morton, linux-nfs@vger.kernel.org,
	linux-kernel, linux-mm, linux-fsdevel

On Fri, Aug 1, 2014 at 11:19 PM, NeilBrown <neilb@suse.de> wrote:
> On Fri, 1 Aug 2014 22:55:42 -0400 Trond Myklebust <trondmy@gmail.com> wrote:
>
>> > That still leaves some open questions though...
>> >
>> > Is that enough to fix it? You'd still have the dirty pages lingering
>> > around, right? Would a umount -f presumably work at that point?
>>
>> 'umount -f' will kill any outstanding RPC calls that are causing the
>> mount to hang, but doesn't do anything to change page states or NFS
>> file/lock states.
>
> Should it though?
>
>        MNT_FORCE (since Linux 2.1.116)
>               Force  unmount  even  if busy.  This can cause data loss.  (Only
>               for NFS mounts.)
>
> Given that data loss is explicitly permitted, I suspect it should.
>
> Can we make MNT_FORCE on NFS not only abort outstanding RPC calls, but
> fail all subsequent RPC calls?  That might make it really useful.   You
> wouldn't even need to "kill -9" then.

Yes, but if the umount fails due to other conditions (for example an
application happens to still have a file open on that volume) then
that could leave you with a persistent messy situation on your hands.

Cheers
  Trond
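
One way to check for the condition described above, an application that still has a file open on the volume, is to walk the /proc/<pid>/fd symlinks. A minimal sketch follows. The function name `pids_with_open_files` and the `proc_root` parameter are illustrative, and it only inspects open file descriptors, not working directories or memory mappings. It relies on readlink alone, so it should not need to contact the dead server itself.

```python
import os


def pids_with_open_files(mount_point, proc_root="/proc"):
    """Return {pid: [paths]} for descriptors that resolve under mount_point."""
    base = mount_point.rstrip("/")
    prefix = base + "/"
    found = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed while we were scanning
            if target == base or target.startswith(prefix):
                found.setdefault(int(entry), []).append(target)
    return found


if __name__ == "__main__":
    for pid, paths in sorted(pids_with_open_files("/mnt/foo").items()):
        print(pid, ", ".join(paths))
```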


* Re: Killing process in D state on mount to dead NFS server.
  2014-07-31 21:20   ` Ben Greear
  2014-07-31 21:50     ` Killing process in D state on mount to dead NFS server. (when process is in fsync) NeilBrown
@ 2014-08-13 15:42     ` Ben Greear
  2014-08-13 21:18       ` NeilBrown
  1 sibling, 1 reply; 16+ messages in thread
From: Ben Greear @ 2014-08-13 15:42 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs@vger.kernel.org

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello!

Did you get a chance to look at the stacks below?

Thanks,
Ben


On 07/31/2014 02:20 PM, Ben Greear wrote:
> On 07/31/2014 01:42 PM, NeilBrown wrote:
>> On Thu, 31 Jul 2014 11:00:35 -0700 Ben Greear <greearb@candelatech.com> wrote:
> 
>>> So, this has been asked all over the interweb for years and years, but the best answer I can find is to reboot the system or create a fake NFS server 
>>> somewhere with the same IP as the gone-away NFS server.
>>> 
>>> The problem is:
>>> 
>>> I have some mounts to an NFS server that no longer exists (crashed/powered down).
>>> 
>>> I have some processes stuck trying to write to files open on these mounts.
>>> 
>>> I want to kill the process and unmount.
>>> 
>>> umount -l will make the mount go away, sort of.  But process is still hung.
>>> umount -f complains:
>>>   umount2:  Device or resource busy
>>>   umount.nfs: /mnt/foo: device is busy
>>> 
>>> kill -9 does not work on process.
> 
>> Kill -1 should work (since about 2.6.25 or so).
> 
> That is -[ONE], right?  Assuming so, it did not work for me.
> 
> Kernel is 3.14.4+, with some extra patches, but probably nothing that influences this particular behaviour.
> 
> [root@lf1005-14010010 ~]# cat /proc/3805/stack
> [<ffffffff811371ba>] sleep_on_page+0x9/0xd
> [<ffffffff8113738e>] wait_on_page_bit+0x71/0x78
> [<ffffffff8113769a>] filemap_fdatawait_range+0xa2/0x16d
> [<ffffffff8113780e>] filemap_write_and_wait_range+0x3b/0x77
> [<ffffffffa0f04734>] nfs_file_fsync+0x37/0x83 [nfs]
> [<ffffffff811a8d32>] vfs_fsync_range+0x19/0x1b
> [<ffffffff811a8d4b>] vfs_fsync+0x17/0x19
> [<ffffffffa0f05305>] nfs_file_flush+0x6b/0x6f [nfs]
> [<ffffffff81183e46>] filp_close+0x3f/0x71
> [<ffffffff8119c8ae>] __close_fd+0x80/0x98
> [<ffffffff81183de5>] SyS_close+0x1c/0x3e
> [<ffffffff815c55f9>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
> [root@lf1005-14010010 ~]# kill -1 3805
> [root@lf1005-14010010 ~]# cat /proc/3805/stack
> [<ffffffff811371ba>] sleep_on_page+0x9/0xd
> [<ffffffff8113738e>] wait_on_page_bit+0x71/0x78
> [<ffffffff8113769a>] filemap_fdatawait_range+0xa2/0x16d
> [<ffffffff8113780e>] filemap_write_and_wait_range+0x3b/0x77
> [<ffffffffa0f04734>] nfs_file_fsync+0x37/0x83 [nfs]
> [<ffffffff811a8d32>] vfs_fsync_range+0x19/0x1b
> [<ffffffff811a8d4b>] vfs_fsync+0x17/0x19
> [<ffffffffa0f05305>] nfs_file_flush+0x6b/0x6f [nfs]
> [<ffffffff81183e46>] filp_close+0x3f/0x71
> [<ffffffff8119c8ae>] __close_fd+0x80/0x98
> [<ffffffff81183de5>] SyS_close+0x1c/0x3e
> [<ffffffff815c55f9>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> Thanks, Ben
> 
>> If it doesn't please report the kernel version and cat /proc/$PID/stack
> 
>> for some processes that cannot be killed.
> 
>> NeilBrown
> 
>>> 
>>> 
>>> Aside from bringing a fake NFS server back up on the same IP, is there any other way to get these mounts unmounted and the processes killed without 
>>> rebooting?
>>> 
>>> Thanks, Ben
>>> 
> 
> 
> 
> 

- -- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJT64dqAAoJELbHqkYeJT4OHC0IAIRB2A8v5msRhXrdd+ybvkwD
NcOSYOhSsxCHIS5BR5CNLg89zipRuocVCbdLRdtbse8nspMq8PAiQJt3YOkGwzos
ifcsgxouMUKfmLcFHtJ0maIkWMPIrttPvHJuw67gt7LbHLPsFjlrdrKPv6aGa95m
7mCkY/bRniiJYCxrCqixzQpuWfIyVal6FPGtmpydTVh6lq0y05vDEVB8lP5xGyes
w+I/vJkGf9ddTIDasYJbLwUXECbN3makJxmHNAZf4slQMB5FNNnpeTOqL17u62cY
F/do8m/zxzztibTZqjKHIhHGDw/huTyQWfRsQ0AA9Exu8/RZKhJlL2EeYlFJWJQ=
=hNGY
-----END PGP SIGNATURE-----


* Re: Killing process in D state on mount to dead NFS server.
  2014-08-13 15:42     ` Killing process in D state on mount to dead NFS server Ben Greear
@ 2014-08-13 21:18       ` NeilBrown
  2014-08-13 21:22         ` Ben Greear
  0 siblings, 1 reply; 16+ messages in thread
From: NeilBrown @ 2014-08-13 21:18 UTC (permalink / raw)
  To: Ben Greear; +Cc: linux-nfs@vger.kernel.org


On Wed, 13 Aug 2014 08:42:34 -0700 Ben Greear <greearb@candelatech.com> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hello!
> 
> Did you get a chance to look at the stacks below?

Yes I did, and I replied on Date: Fri, 1 Aug 2014 07:50:53 +1000

The problem is that "fsync" and related functions are not killable.
I think it is generally agreed that this is a bug, and that a fix would
probably be accepted.
I started working on one the other day but haven't got very far yet (lots of
other things to work on).

NeilBrown
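
The diagnosis above rests on reading the innermost frames of /proc/$PID/stack. A small sketch of that check follows: it parses a stack dump in the format Ben posted and reports whether the task is sleeping in the page-writeback wait reached through an fsync path. The frame names are the ones from the 3.14-era trace in this thread and will differ on other kernel versions; the function names are illustrative.

```python
import re

# Frame names taken from the trace posted in this thread (kernel 3.14.4+).
WRITEBACK_WAIT_FRAMES = {"sleep_on_page", "wait_on_page_bit", "filemap_fdatawait_range"}
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def stack_symbols(stack_text):
    """Return the symbol names from a /proc/<pid>/stack dump, innermost first."""
    return FRAME_RE.findall(stack_text)


def stuck_in_fsync_writeback(stack_text):
    """True if the task sleeps in page-writeback wait reached via an fsync path."""
    symbols = stack_symbols(stack_text)
    waiting = bool(symbols) and symbols[0] in WRITEBACK_WAIT_FRAMES
    via_fsync = any("fsync" in name for name in symbols)
    return waiting and via_fsync


if __name__ == "__main__":
    sample = """\
[<ffffffff811371ba>] sleep_on_page+0x9/0xd
[<ffffffff8113738e>] wait_on_page_bit+0x71/0x78
[<ffffffff8113769a>] filemap_fdatawait_range+0xa2/0x16d
[<ffffffff8113780e>] filemap_write_and_wait_range+0x3b/0x77
[<ffffffffa0f04734>] nfs_file_fsync+0x37/0x83 [nfs]
[<ffffffff811a8d32>] vfs_fsync_range+0x19/0x1b
[<ffffffff811a8d4b>] vfs_fsync+0x17/0x19
"""
    print(stuck_in_fsync_writeback(sample))
```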

> 
> Thanks,
> Ben
> 
> 
> On 07/31/2014 02:20 PM, Ben Greear wrote:
> > On 07/31/2014 01:42 PM, NeilBrown wrote:
> >> On Thu, 31 Jul 2014 11:00:35 -0700 Ben Greear <greearb@candelatech.com> wrote:
> > 
> >>> So, this has been asked all over the interweb for years and years, but the best answer I can find is to reboot the system or create a fake NFS server 
> >>> somewhere with the same IP as the gone-away NFS server.
> >>> 
> >>> The problem is:
> >>> 
> >>> I have some mounts to an NFS server that no longer exists (crashed/powered down).
> >>> 
> >>> I have some processes stuck trying to write to files open on these mounts.
> >>> 
> >>> I want to kill the process and unmount.
> >>> 
> >>> umount -l will make the mount go away, sort of.  But process is still hung.
> >>> umount -f complains:
> >>>   umount2:  Device or resource busy
> >>>   umount.nfs: /mnt/foo: device is busy
> >>> 
> >>> kill -9 does not work on process.
> > 
> >> Kill -1 should work (since about 2.6.25 or so).
> > 
> > That is -[ONE], right?  Assuming so, it did not work for me.
> > 
> > Kernel is 3.14.4+, with some extra patches, but probably nothing that influences this particular behaviour.
> > 
> > [root@lf1005-14010010 ~]# cat /proc/3805/stack
> > [<ffffffff811371ba>] sleep_on_page+0x9/0xd
> > [<ffffffff8113738e>] wait_on_page_bit+0x71/0x78
> > [<ffffffff8113769a>] filemap_fdatawait_range+0xa2/0x16d
> > [<ffffffff8113780e>] filemap_write_and_wait_range+0x3b/0x77
> > [<ffffffffa0f04734>] nfs_file_fsync+0x37/0x83 [nfs]
> > [<ffffffff811a8d32>] vfs_fsync_range+0x19/0x1b
> > [<ffffffff811a8d4b>] vfs_fsync+0x17/0x19
> > [<ffffffffa0f05305>] nfs_file_flush+0x6b/0x6f [nfs]
> > [<ffffffff81183e46>] filp_close+0x3f/0x71
> > [<ffffffff8119c8ae>] __close_fd+0x80/0x98
> > [<ffffffff81183de5>] SyS_close+0x1c/0x3e
> > [<ffffffff815c55f9>] system_call_fastpath+0x16/0x1b
> > [<ffffffffffffffff>] 0xffffffffffffffff
> > [root@lf1005-14010010 ~]# kill -1 3805
> > [root@lf1005-14010010 ~]# cat /proc/3805/stack
> > [<ffffffff811371ba>] sleep_on_page+0x9/0xd
> > [<ffffffff8113738e>] wait_on_page_bit+0x71/0x78
> > [<ffffffff8113769a>] filemap_fdatawait_range+0xa2/0x16d
> > [<ffffffff8113780e>] filemap_write_and_wait_range+0x3b/0x77
> > [<ffffffffa0f04734>] nfs_file_fsync+0x37/0x83 [nfs]
> > [<ffffffff811a8d32>] vfs_fsync_range+0x19/0x1b
> > [<ffffffff811a8d4b>] vfs_fsync+0x17/0x19
> > [<ffffffffa0f05305>] nfs_file_flush+0x6b/0x6f [nfs]
> > [<ffffffff81183e46>] filp_close+0x3f/0x71
> > [<ffffffff8119c8ae>] __close_fd+0x80/0x98
> > [<ffffffff81183de5>] SyS_close+0x1c/0x3e
> > [<ffffffff815c55f9>] system_call_fastpath+0x16/0x1b
> > [<ffffffffffffffff>] 0xffffffffffffffff
> > 
> > Thanks, Ben
> > 
> >> If it doesn't please report the kernel version and cat /proc/$PID/stack
> > 
> >> for some processes that cannot be killed.
> > 
> >> NeilBrown
> > 
> >>> 
> >>> 
> >>> Aside from bringing a fake NFS server back up on the same IP, is there any other way to get these mounts unmounted and the processes killed without 
> >>> rebooting?
> >>> 
> >>> Thanks, Ben
> >>> 
> > 
> > 
> > 
> > 
> 
> - -- 
> Ben Greear <greearb@candelatech.com>
> Candela Technologies Inc  http://www.candelatech.com
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.13 (GNU/Linux)
> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
> 
> iQEcBAEBAgAGBQJT64dqAAoJELbHqkYeJT4OHC0IAIRB2A8v5msRhXrdd+ybvkwD
> NcOSYOhSsxCHIS5BR5CNLg89zipRuocVCbdLRdtbse8nspMq8PAiQJt3YOkGwzos
> ifcsgxouMUKfmLcFHtJ0maIkWMPIrttPvHJuw67gt7LbHLPsFjlrdrKPv6aGa95m
> 7mCkY/bRniiJYCxrCqixzQpuWfIyVal6FPGtmpydTVh6lq0y05vDEVB8lP5xGyes
> w+I/vJkGf9ddTIDasYJbLwUXECbN3makJxmHNAZf4slQMB5FNNnpeTOqL17u62cY
> F/do8m/zxzztibTZqjKHIhHGDw/huTyQWfRsQ0AA9Exu8/RZKhJlL2EeYlFJWJQ=
> =hNGY
> -----END PGP SIGNATURE-----




* Re: Killing process in D state on mount to dead NFS server.
  2014-08-13 21:18       ` NeilBrown
@ 2014-08-13 21:22         ` Ben Greear
  0 siblings, 0 replies; 16+ messages in thread
From: Ben Greear @ 2014-08-13 21:22 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs@vger.kernel.org

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/13/2014 02:18 PM, NeilBrown wrote:
> On Wed, 13 Aug 2014 08:42:34 -0700 Ben Greear <greearb@candelatech.com> wrote:
> 
> Hello!
> 
> Did you get a chance to look at the stacks below?
> 
>> Yes I did, and I replied on Date: Fri, 1 Aug 2014 07:50:53 +1000

Hmm, I don't seem to have received that email (or I
managed to lose it quite thoroughly), but no worries.

>> The problem is that "fsync" and related functions are not killable. I think it is generally agreed that this is a bug, and that a fix would probably be
>> accepted. I started working on one the other day but haven't got very far yet (lots of other things to work on).

Ok, thanks for the effort so far, and if you do get a patch cooked up,
I will be happy to test it.

Thanks,
Ben

- -- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJT69cpAAoJELbHqkYeJT4OL34H/jxncIyX5sD9zm1DmiAJZA/g
nOpyk2603aMMj3qH91svxv8FCwoXpqnyV77vWUSKInkw+E1sWRsxuAZoKbxNjN4U
yCpfUlFc1IrDPSVV0Ax7oFs/DbOJRLrtrh3BW/0zBVZfw3RRYKEtR3LGAKtt+VNI
5RiwqFes/6tJXj6h8vF4WYBJ31J31mqWYh4GcJ2B2w3lYRX4CnTfAuJ75KRfTX3k
LESaS2w+7B9Ta42Xl9QU6SWsisdxEhYA0R5TbAl1MICjU7RvmLEyeKcuG3/DpPjJ
TjYoLR5SBcNqCBKab+L+j8jFPta8pE9LR/WWP0PfC/NWgw7+dy2VUFF2se7wSeY=
=GsQw
-----END PGP SIGNATURE-----


Thread overview: 16+ messages
2014-07-31 18:00 Killing process in D state on mount to dead NFS server Ben Greear
2014-07-31 19:49 ` Malahal Naineni
2014-07-31 19:52   ` Ben Greear
2014-07-31 20:42 ` NeilBrown
2014-07-31 21:20   ` Ben Greear
2014-07-31 21:50     ` Killing process in D state on mount to dead NFS server. (when process is in fsync) NeilBrown
2014-08-01 12:47       ` Jan Kara
2014-08-02  1:21       ` Jeff Layton
2014-08-02  1:50         ` Roger Heflin
2014-08-02  2:07           ` Jeff Layton
2014-08-02  2:55         ` Trond Myklebust
2014-08-02  3:19           ` NeilBrown
2014-08-02  3:44             ` Trond Myklebust
2014-08-13 15:42     ` Killing process in D state on mount to dead NFS server Ben Greear
2014-08-13 21:18       ` NeilBrown
2014-08-13 21:22         ` Ben Greear