* NFSD automatically releases all states when underlying file system is unmounted
@ 2025-03-19 18:22 Dai Ngo
2025-03-19 18:28 ` Chuck Lever
2025-03-19 21:46 ` NeilBrown
0 siblings, 2 replies; 20+ messages in thread
From: Dai Ngo @ 2025-03-19 18:22 UTC (permalink / raw)
To: Chuck Lever, Jeff Layton, Neil Brown, Olga Kornievskaia,
Tom Talpey
Cc: Linux NFS Mailing List
Hi,
Currently when the local file system needs to be unmounted for maintenance
the admin needs to make sure all the NFS clients have stopped using any files
on the NFS shares before the umount(8) can succeed.
In an environment where there are thousands of clients this manual process
seems almost impossible or impractical. The only option available now is to
restart the NFS server, which would work since the NFS clients can recover their
state, but this seems like a big-hammer approach.
Ideally, when the umount command is run there is a callback from the VFS layer
to notify the upper protocols, NFS and SMB, to release their state on this file
system so that the umount can complete.
Is there any existing mechanism to allow NFSD to release its states automatically
on unmount?
Unmount is not a frequent operation. Is it justifiable to add a bunch of complex
code for something that is not frequently needed?
I appreciate any opinions on this issue.
Thanks,
-Dai
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-19 18:22 NFSD automatically releases all states when underlying file system is unmounted Dai Ngo
@ 2025-03-19 18:28 ` Chuck Lever
2025-03-19 19:00 ` Dai Ngo
2025-03-19 21:46 ` NeilBrown
1 sibling, 1 reply; 20+ messages in thread
From: Chuck Lever @ 2025-03-19 18:28 UTC (permalink / raw)
To: Dai Ngo
Cc: Jeff Layton, Neil Brown, Tom Talpey, Olga Kornievskaia,
Mike Snitzer, Linux NFS Mailing List
Hi Dai, thanks for starting this conversation.
[ adding Mike -- IIRC localio is facing a similar issue ]
On 3/19/25 2:22 PM, Dai Ngo wrote:
> Hi,
>
> Currently when the local file system needs to be unmounted for maintenance
> the admin needs to make sure all the NFS clients have stopped using any
> files
> on the NFS shares before the umount(8) can succeed.
>
> In an environment where there are thousands of clients this manual process
> seems almost impossible or impractical. The only option available now is to
> restart the NFS server which would works since the NFS client can
> recover its
> state but it seems like this is a big hammer approach.
Well we could do this instead by having the server pretend to reboot for
only clients that have mounted the export that is going away. That way
any clients that don't have an interest in the unexported/unmounted file
system don't have to deal with state recovery.
> Ideally, when the umount command is run there is a callback from the VFS
> layer
> to notify the upper protocols; NFS and SMB, to release its states on
> this file
> system for the umount to complete.
>
> Is there any existing mechanism to allow NFSD to release its states
> automatically on unmount?
Can you explain why you don't believe unexport is the right place to
trigger remote file closure?
> Unmount is not a frequent operation. Is it justifiable to add a bunch of
> complex
> code for something is not frequently needed?
I agree that I/O is significantly more frequent than unexport/unmount.
It suggests we want a solution that does not make a heavy impact on the
I/O code paths.
--
Chuck Lever
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-19 18:28 ` Chuck Lever
@ 2025-03-19 19:00 ` Dai Ngo
2025-03-19 19:24 ` Chuck Lever
0 siblings, 1 reply; 20+ messages in thread
From: Dai Ngo @ 2025-03-19 19:00 UTC (permalink / raw)
To: Chuck Lever
Cc: Jeff Layton, Neil Brown, Tom Talpey, Olga Kornievskaia,
Mike Snitzer, Linux NFS Mailing List
On 3/19/25 11:28 AM, Chuck Lever wrote:
> Hi Dai, thanks for starting this conversation.
>
> [ adding Mike -- IIRC localio is facing a similar issue ]
>
> On 3/19/25 2:22 PM, Dai Ngo wrote:
>> Hi,
>>
>> Currently when the local file system needs to be unmounted for maintenance
>> the admin needs to make sure all the NFS clients have stopped using any
>> files
>> on the NFS shares before the umount(8) can succeed.
>>
>> In an environment where there are thousands of clients this manual process
>> seems almost impossible or impractical. The only option available now is to
>> restart the NFS server which would works since the NFS client can
>> recover its
>> state but it seems like this is a big hammer approach.
> Well we could do this instead by having the server pretend to reboot for
> only clients that have mounted the export that is going away. That way
> any clients that don't have an interest in the unexported/unmounted file
> system don't have to deal with state recovery.
Is there a way to restart the NFS server for just the export that's going
away? How do we specify an export when doing 'systemctl restart nfs-server'?
>
>
>> Ideally, when the umount command is run there is a callback from the VFS
>> layer
>> to notify the upper protocols; NFS and SMB, to release its states on
>> this file
>> system for the umount to complete.
>>
>> Is there any existing mechanism to allow NFSD to release its states
>> automatically on unmount?
> Can you explain why you don't believe unexport is the right place to
> trigger remote file closure?
Yes, unexport is another place that can be enhanced to trigger the release
of all state for the export that is going away. For this to work, the downcall
mechanism between exportfs and the kernel needs to be enhanced to specify
the export that is going away. This approach would eliminate the need for
VFS involvement.
Currently when 'exportfs -u' is called, exportfs makes a downcall to the
kernel to clear the cache of ALL exports and not just the one that is going
away.
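A rough sketch of what that flush amounts to, assuming the sunrpc cache
`flush` files under /proc/net/rpc and the cache names used by nfs-utils
(both are assumptions here, not something stated in this thread). The
function name and the RPC_DIR override are hypothetical. Note that the
timestamp is written per cache, with no way to name a single export:

```shell
#!/bin/sh
# Sketch only: expire every entry in the NFSD-related sunrpc caches.
# The cache names and the RPC_DIR override are assumptions for illustration.
RPC_DIR="${RPC_DIR:-/proc/net/rpc}"

flush_export_caches() {
    now=$(date +%s)
    # Caches that reference other caches go first (nfsd.fh before nfsd.export).
    for cache in auth.unix.ip auth.unix.gid nfsd.fh nfsd.export; do
        f="$RPC_DIR/$cache/flush"
        # Skip caches that are not present or not writable on this system.
        [ -w "$f" ] || continue
        # Entries older than this timestamp are treated as expired.
        printf '%s\n' "$now" > "$f" || return 1
    done
    return 0
}
```

Because the write applies to a whole cache, entries for exports that are
staying get expired too and have to be repopulated on demand.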
>
>
>> Unmount is not a frequent operation. Is it justifiable to add a bunch of
>> complex
>> code for something is not frequently needed?
> I agree that I/O is significantly more frequent than unexport/unmount.
> It suggests we want a solution that does not make a heavy impact on the
> I/O code paths.
Thanks,
-Dai
>
>
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-19 19:00 ` Dai Ngo
@ 2025-03-19 19:24 ` Chuck Lever
2025-03-19 19:44 ` Dai Ngo
0 siblings, 1 reply; 20+ messages in thread
From: Chuck Lever @ 2025-03-19 19:24 UTC (permalink / raw)
To: Dai Ngo
Cc: Jeff Layton, Neil Brown, Tom Talpey, Olga Kornievskaia,
Mike Snitzer, Linux NFS Mailing List
On 3/19/25 3:00 PM, Dai Ngo wrote:
>
> On 3/19/25 11:28 AM, Chuck Lever wrote:
>> Hi Dai, thanks for starting this conversation.
>>
>> [ adding Mike -- IIRC localio is facing a similar issue ]
>>
>> On 3/19/25 2:22 PM, Dai Ngo wrote:
>>> Hi,
>>>
>>> Currently when the local file system needs to be unmounted for
>>> maintenance
>>> the admin needs to make sure all the NFS clients have stopped using any
>>> files
>>> on the NFS shares before the umount(8) can succeed.
>>>
>>> In an environment where there are thousands of clients this manual
>>> process
>>> seems almost impossible or impractical. The only option available now
>>> is to
>>> restart the NFS server which would works since the NFS client can
>>> recover its
>>> state but it seems like this is a big hammer approach.
>> Well we could do this instead by having the server pretend to reboot for
>> only clients that have mounted the export that is going away. That way
>> any clients that don't have an interest in the unexported/unmounted file
>> system don't have to deal with state recovery.
>
> Is there a way to restart the NFS server for just the export that's going
> away?
There is no way to do that currently, though it's a feature that has
been discussed for years.
> How do we specify an export when doing 'systemctl restart nfs-
> server'.
I would think that the emulated restart for select exports would be
handled entirely in the kernel, and not via systemd.
>>> Ideally, when the umount command is run there is a callback from the VFS
>>> layer
>>> to notify the upper protocols; NFS and SMB, to release its states on
>>> this file
>>> system for the umount to complete.
>>>
>>> Is there any existing mechanism to allow NFSD to release its states
>>> automatically on unmount?
>> Can you explain why you don't believe unexport is the right place to
>> trigger remote file closure?
>
> Yes, unexport is another place that can be enhanced to trigger the
> releasing
> of all states of the export that going away. For this to work, the downcall
> mechanism between exportfs and the kernel needs to be enhanced to specify
> the export that is going away. This approach would eliminate the need for
> VFS involvement.
>
> Currently when 'exportfs -u' is called, exportfs makes a downcall to the
> kernel to clear the cache of ALL exports and not just the one that going
> away.
Clearing the export cache is OK. That just means that new client
requests will trigger an upcall to repopulate the kernel's export cache.
That is a much smaller bump in the performance road than a full server
restart.
--
Chuck Lever
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-19 19:24 ` Chuck Lever
@ 2025-03-19 19:44 ` Dai Ngo
0 siblings, 0 replies; 20+ messages in thread
From: Dai Ngo @ 2025-03-19 19:44 UTC (permalink / raw)
To: Chuck Lever
Cc: Jeff Layton, Neil Brown, Tom Talpey, Olga Kornievskaia,
Mike Snitzer, Linux NFS Mailing List
On 3/19/25 12:24 PM, Chuck Lever wrote:
> On 3/19/25 3:00 PM, Dai Ngo wrote:
>> On 3/19/25 11:28 AM, Chuck Lever wrote:
>>> Hi Dai, thanks for starting this conversation.
>>>
>>> [ adding Mike -- IIRC localio is facing a similar issue ]
>>>
>>> On 3/19/25 2:22 PM, Dai Ngo wrote:
>>>> Hi,
>>>>
>>>> Currently when the local file system needs to be unmounted for
>>>> maintenance
>>>> the admin needs to make sure all the NFS clients have stopped using any
>>>> files
>>>> on the NFS shares before the umount(8) can succeed.
>>>>
>>>> In an environment where there are thousands of clients this manual
>>>> process
>>>> seems almost impossible or impractical. The only option available now
>>>> is to
>>>> restart the NFS server which would works since the NFS client can
>>>> recover its
>>>> state but it seems like this is a big hammer approach.
>>> Well we could do this instead by having the server pretend to reboot for
>>> only clients that have mounted the export that is going away. That way
>>> any clients that don't have an interest in the unexported/unmounted file
>>> system don't have to deal with state recovery.
>> Is there a way to restart the NFS server for just the export that's going
>> away?
> There is not a way do that, currently, though it's a feature that has
> been discussed for years.
>
>
>> How do we specify an export when doing 'systemctl restart nfs-
>> server'.
> I would think that the emulated restart for select exports would be
> handled entirely in the kernel, and not via systemd.
>
>
>>>> Ideally, when the umount command is run there is a callback from the VFS
>>>> layer
>>>> to notify the upper protocols; NFS and SMB, to release its states on
>>>> this file
>>>> system for the umount to complete.
>>>>
>>>> Is there any existing mechanism to allow NFSD to release its states
>>>> automatically on unmount?
>>> Can you explain why you don't believe unexport is the right place to
>>> trigger remote file closure?
>> Yes, unexport is another place that can be enhanced to trigger the
>> releasing
>> of all states of the export that going away. For this to work, the downcall
>> mechanism between exportfs and the kernel needs to be enhanced to specify
>> the export that is going away. This approach would eliminate the need for
>> VFS involvement.
>>
>> Currently when 'exportfs -u' is called, exportfs makes a downcall to the
>> kernel to clear the cache of ALL exports and not just the one that going
>> away.
> Clearing the export cache is OK. That just means that new client
> requests will trigger an upcall tp repopulate the kernel's export cache.
> That is a much smaller bump in the performance road than a full server
> restart.
It's not just clearing the export cache. Since we do not know which export is
going away, we have to release the state of all files that are using the exports,
meaning all NFS clients have to recover their state. Perhaps this still has
less impact than a full server restart.
-Dai
>
>
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-19 18:22 NFSD automatically releases all states when underlying file system is unmounted Dai Ngo
2025-03-19 18:28 ` Chuck Lever
@ 2025-03-19 21:46 ` NeilBrown
2025-03-19 22:12 ` Dai Ngo
2025-03-20 17:53 ` Chuck Lever
1 sibling, 2 replies; 20+ messages in thread
From: NeilBrown @ 2025-03-19 21:46 UTC (permalink / raw)
To: Dai Ngo
Cc: Chuck Lever, Jeff Layton, Olga Kornievskaia, Tom Talpey,
Linux NFS Mailing List
On Thu, 20 Mar 2025, Dai Ngo wrote:
> Hi,
>
> Currently when the local file system needs to be unmounted for maintenance
> the admin needs to make sure all the NFS clients have stopped using any files
> on the NFS shares before the umount(8) can succeed.
This is easily achieved with
    echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
Do this after unexporting and before unmounting.
All state for NFSv4 exports, and all NLM locks for NFSv2/3 exports, will
be invalidated and files closed. NFSv4 clients will get
NFS4ERR_ADMIN_REVOKED when they attempt to use any state that was on
that filesystem.
(I don't think this flushes the NFSv3 file cache, so a short delay might
be needed before the unmount when v3 is used. That should be fixed)
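The sequence above can be sketched as a small shell helper. This is only
an illustration: the function name, the RUN dry-run hook, and the retry
settings are hypothetical, and the retry loop is a guess at how to cover
the short NFSv3 delay mentioned above.

```shell
#!/bin/sh
# Sketch only: unexport, release NFSD state, then unmount one filesystem.
# UNLOCK_FILE is the procfs file named above; RUN, UMOUNT_TRIES and
# UMOUNT_DELAY are hypothetical knobs added for illustration.
UNLOCK_FILE="${UNLOCK_FILE:-/proc/fs/nfsd/unlock_filesystem}"
RUN="${RUN:-}"                       # RUN=echo prints commands instead of running them
UMOUNT_TRIES="${UMOUNT_TRIES:-5}"
UMOUNT_DELAY="${UMOUNT_DELAY:-1}"

release_and_unmount() {
    spec=$1    # export as given to exportfs, e.g. '*:/export/data'
    fs=$2      # mount point of the filesystem, e.g. /export/data

    # 1. Unexport, so no new state is created on the filesystem.
    $RUN exportfs -u "$spec" || return 1

    # 2. Ask NFSD to release the state it holds on this filesystem.
    printf '%s\n' "$fs" > "$UNLOCK_FILE" || return 1

    # 3. Unmount, retrying a few times in case files are still held open.
    n=1
    while ! $RUN umount "$fs"; do
        [ "$n" -ge "$UMOUNT_TRIES" ] && return 1
        n=$((n + 1))
        sleep "$UMOUNT_DELAY"
    done
    return 0
}
```

For example, `RUN=echo release_and_unmount '*:/export/data' /export/data`
prints the exportfs and umount commands without running them, though the
write to UNLOCK_FILE still happens unless that variable is pointed
elsewhere.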
NeilBrown
>
> In an environment where there are thousands of clients this manual process
> seems almost impossible or impractical. The only option available now is to
> restart the NFS server which would works since the NFS client can recover its
> state but it seems like this is a big hammer approach.
>
> Ideally, when the umount command is run there is a callback from the VFS layer
> to notify the upper protocols; NFS and SMB, to release its states on this file
> system for the umount to complete.
>
> Is there any existing mechanism to allow NFSD to release its states automatically
> on unmount?
>
> Unmount is not a frequent operation. Is it justifiable to add a bunch of complex
> code for something is not frequently needed?
>
> I appreciate any opinions on this issue.
>
> Thanks,
> -Dai
>
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-19 21:46 ` NeilBrown
@ 2025-03-19 22:12 ` Dai Ngo
2025-03-20 17:53 ` Chuck Lever
1 sibling, 0 replies; 20+ messages in thread
From: Dai Ngo @ 2025-03-19 22:12 UTC (permalink / raw)
To: NeilBrown
Cc: Chuck Lever, Jeff Layton, Olga Kornievskaia, Tom Talpey,
Linux NFS Mailing List
On 3/19/25 2:46 PM, NeilBrown wrote:
> On Thu, 20 Mar 2025, Dai Ngo wrote:
>> Hi,
>>
>> Currently when the local file system needs to be unmounted for maintenance
>> the admin needs to make sure all the NFS clients have stopped using any files
>> on the NFS shares before the umount(8) can succeed.
> This is easily achieved with
> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
>
> Do this after unexporting and before unmounting.
Yes, this works!
>
> All state for NFSv4 exports, and all NLM locks for NFSv2/3 exports, will
> be invalidated and files closed. NFSv4 clients will get
> NFS4ERR_ADMIN_REVOKED when they attempt to use any state that was on
> that filesystem.
In my test, the client gets NFS4ERR_STALE for the PUTFH in the GETATTR compound,
which is expected.
>
> (I don't think this flushes the NFSv3 file cache, so a short delay might
> be needed before the unmount when v3 is used. That should be fixed)
Thank you very much Neil!
-Dai
>
> NeilBrown
>
>
>> In an environment where there are thousands of clients this manual process
>> seems almost impossible or impractical. The only option available now is to
>> restart the NFS server which would works since the NFS client can recover its
>> state but it seems like this is a big hammer approach.
>>
>> Ideally, when the umount command is run there is a callback from the VFS layer
>> to notify the upper protocols; NFS and SMB, to release its states on this file
>> system for the umount to complete.
>>
>> Is there any existing mechanism to allow NFSD to release its states automatically
>> on unmount?
>>
>> Unmount is not a frequent operation. Is it justifiable to add a bunch of complex
>> code for something is not frequently needed?
>>
>> I appreciate any opinions on this issue.
>>
>> Thanks,
>> -Dai
>>
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-19 21:46 ` NeilBrown
2025-03-19 22:12 ` Dai Ngo
@ 2025-03-20 17:53 ` Chuck Lever
2025-03-21 14:36 ` Benjamin Coddington
1 sibling, 1 reply; 20+ messages in thread
From: Chuck Lever @ 2025-03-20 17:53 UTC (permalink / raw)
To: NeilBrown
Cc: Jeff Layton, Dai Ngo, Olga Kornievskaia, Tom Talpey,
Linux NFS Mailing List
On 3/19/25 5:46 PM, NeilBrown wrote:
> On Thu, 20 Mar 2025, Dai Ngo wrote:
>> Hi,
>>
>> Currently when the local file system needs to be unmounted for maintenance
>> the admin needs to make sure all the NFS clients have stopped using any files
>> on the NFS shares before the umount(8) can succeed.
>
> This is easily achieved with
> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
>
> Do this after unexporting and before unmounting.
Seems like administrators would expect that a filesystem can be
unmounted immediately after unexporting it. Should "exportfs" be changed
to handle this extra step under the covers? Doesn't seem like it would
be hard to do, and I can't think of a use case where it would be
harmful.
> All state for NFSv4 exports, and all NLM locks for NFSv2/3 exports, will
> be invalidated and files closed. NFSv4 clients will get
> NFS4ERR_ADMIN_REVOKED when they attempt to use any state that was on
> that filesystem.
I'm wondering if this mechanism also flushes courtesy client state for
the file system that is about to be unexported... it should, if it does
not already take care of that.
> (I don't think this flushes the NFSv3 file cache, so a short delay might
> be needed before the unmount when v3 is used. That should be fixed)
>
> NeilBrown
>
>
>>
>> In an environment where there are thousands of clients this manual process
>> seems almost impossible or impractical. The only option available now is to
>> restart the NFS server which would works since the NFS client can recover its
>> state but it seems like this is a big hammer approach.
>>
>> Ideally, when the umount command is run there is a callback from the VFS layer
>> to notify the upper protocols; NFS and SMB, to release its states on this file
>> system for the umount to complete.
>>
>> Is there any existing mechanism to allow NFSD to release its states automatically
>> on unmount?
>>
>> Unmount is not a frequent operation. Is it justifiable to add a bunch of complex
>> code for something is not frequently needed?
>>
>> I appreciate any opinions on this issue.
>>
>> Thanks,
>> -Dai
>>
>
--
Chuck Lever
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-20 17:53 ` Chuck Lever
@ 2025-03-21 14:36 ` Benjamin Coddington
2025-03-21 14:43 ` Jeff Layton
2025-03-21 14:44 ` Chuck Lever
0 siblings, 2 replies; 20+ messages in thread
From: Benjamin Coddington @ 2025-03-21 14:36 UTC (permalink / raw)
To: Chuck Lever
Cc: NeilBrown, Jeff Layton, Dai Ngo, Olga Kornievskaia, Tom Talpey,
Linux NFS Mailing List
On 20 Mar 2025, at 13:53, Chuck Lever wrote:
> On 3/19/25 5:46 PM, NeilBrown wrote:
>> On Thu, 20 Mar 2025, Dai Ngo wrote:
>>> Hi,
>>>
>>> Currently when the local file system needs to be unmounted for maintenance
>>> the admin needs to make sure all the NFS clients have stopped using any files
>>> on the NFS shares before the umount(8) can succeed.
>>
>> This is easily achieved with
>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
>>
>> Do this after unexporting and before unmounting.
>
> Seems like administrators would expect that a filesystem can be
> unmounted immediately after unexporting it. Should "exportfs" be changed
> to handle this extra step under the covers? Doesn't seem like it would
> be hard to do, and I can't think of a use case where it would be
> harmful.
No. I think that admins don't expect to lose all their NFS clients' state if
they're managing the exports. That would be a really big and invisible change
to existing behavior.
Ben
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-21 14:36 ` Benjamin Coddington
@ 2025-03-21 14:43 ` Jeff Layton
2025-03-21 15:07 ` Benjamin Coddington
2025-03-21 14:44 ` Chuck Lever
1 sibling, 1 reply; 20+ messages in thread
From: Jeff Layton @ 2025-03-21 14:43 UTC (permalink / raw)
To: Benjamin Coddington, Chuck Lever
Cc: NeilBrown, Dai Ngo, Olga Kornievskaia, Tom Talpey,
Linux NFS Mailing List
On Fri, 2025-03-21 at 10:36 -0400, Benjamin Coddington wrote:
> On 20 Mar 2025, at 13:53, Chuck Lever wrote:
>
> > On 3/19/25 5:46 PM, NeilBrown wrote:
> > > On Thu, 20 Mar 2025, Dai Ngo wrote:
> > > > Hi,
> > > >
> > > > Currently when the local file system needs to be unmounted for maintenance
> > > > the admin needs to make sure all the NFS clients have stopped using any files
> > > > on the NFS shares before the umount(8) can succeed.
> > >
> > > This is easily achieved with
> > > echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
> > >
> > > Do this after unexporting and before unmounting.
> >
> > Seems like administrators would expect that a filesystem can be
> > unmounted immediately after unexporting it. Should "exportfs" be changed
> > to handle this extra step under the covers? Doesn't seem like it would
> > be hard to do, and I can't think of a use case where it would be
> > harmful.
>
> No. I think that admins don't expect to lose all their NFS client's state if
> they're managing the exports. That would be a really big and invisible change
> to existing behavior.
>
If we're unexporting the filesystem though, then ISTM we ought to
cancel any state that was held on it. Are you concerned about the admin
inadvertently unexporting something, or is there another use-case you're
worried about?
--
Jeff Layton <jlayton@kernel.org>
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-21 14:36 ` Benjamin Coddington
2025-03-21 14:43 ` Jeff Layton
@ 2025-03-21 14:44 ` Chuck Lever
2025-03-26 0:23 ` NeilBrown
1 sibling, 1 reply; 20+ messages in thread
From: Chuck Lever @ 2025-03-21 14:44 UTC (permalink / raw)
To: Benjamin Coddington
Cc: NeilBrown, Jeff Layton, Dai Ngo, Olga Kornievskaia, Tom Talpey,
Linux NFS Mailing List
On 3/21/25 10:36 AM, Benjamin Coddington wrote:
> On 20 Mar 2025, at 13:53, Chuck Lever wrote:
>
>> On 3/19/25 5:46 PM, NeilBrown wrote:
>>> On Thu, 20 Mar 2025, Dai Ngo wrote:
>>>> Hi,
>>>>
>>>> Currently when the local file system needs to be unmounted for maintenance
>>>> the admin needs to make sure all the NFS clients have stopped using any files
>>>> on the NFS shares before the umount(8) can succeed.
>>>
>>> This is easily achieved with
>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
>>>
>>> Do this after unexporting and before unmounting.
>>
>> Seems like administrators would expect that a filesystem can be
>> unmounted immediately after unexporting it. Should "exportfs" be changed
>> to handle this extra step under the covers? Doesn't seem like it would
>> be hard to do, and I can't think of a use case where it would be
>> harmful.
>
> No. I think that admins don't expect to lose all their NFS client's state if
> they're managing the exports. That would be a really big and invisible change
> to existing behavior.
To be clear, I mean that a file system should be unlocked only when it
is specifically unexported. IMO, unexport is usually an administrator
action that means "I want to stop remote access to this file system now"
and that's what unlock_filesystem does.
IMO administrators would be surprised to learn that NFS clients may
continue to access a file system (via existing open files) after it
has been explicitly unexported.
The alternative is to document unlock_filesystem in man exportfs(8).
And perhaps we need a more surgical mechanism that can handle the case
where the file system is still exported but the security policy has
changed. Because this does feel like a real information leak.
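For the explicit-unexport case discussed above, one way to picture the
extra step is a thin wrapper in which revoking state is opt-in, so that a
plain unexport keeps behaving as it does today. This is a sketch under
assumptions, not a proposed exportfs change: the function name, the -r
option, and the RUN dry-run hook are all hypothetical.

```shell
#!/bin/sh
# Sketch only: unexport one filesystem and, if asked, revoke its NFSD state.
# The -r option, RUN, and the UNLOCK_FILE override are hypothetical.
UNLOCK_FILE="${UNLOCK_FILE:-/proc/fs/nfsd/unlock_filesystem}"
RUN="${RUN:-}"                       # RUN=echo prints commands instead of running them

unexport_fs() {
    revoke=no
    if [ "$1" = "-r" ]; then
        revoke=yes
        shift
    fi
    spec=$1    # export as given to exportfs, e.g. '*:/export/data'
    fs=$2      # mount point of the filesystem, e.g. /export/data

    $RUN exportfs -u "$spec" || return 1

    # Default: leave existing client state alone, as a plain unexport does.
    [ "$revoke" = yes ] || return 0

    # Opt-in: ask NFSD to release the state held on this filesystem.
    printf '%s\n' "$fs" > "$UNLOCK_FILE"
}
```

With this shape, `unexport_fs -r '*:/export/data' /export/data` stops
remote access through existing opens, while omitting -r only removes the
export.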
--
Chuck Lever
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-21 14:43 ` Jeff Layton
@ 2025-03-21 15:07 ` Benjamin Coddington
2025-03-21 15:18 ` Chuck Lever
0 siblings, 1 reply; 20+ messages in thread
From: Benjamin Coddington @ 2025-03-21 15:07 UTC (permalink / raw)
To: Jeff Layton
Cc: Chuck Lever, NeilBrown, Dai Ngo, Olga Kornievskaia, Tom Talpey,
Linux NFS Mailing List
On 21 Mar 2025, at 10:43, Jeff Layton wrote:
> On Fri, 2025-03-21 at 10:36 -0400, Benjamin Coddington wrote:
>> On 20 Mar 2025, at 13:53, Chuck Lever wrote:
>>
>>> On 3/19/25 5:46 PM, NeilBrown wrote:
>>>> On Thu, 20 Mar 2025, Dai Ngo wrote:
>>>>> Hi,
>>>>>
>>>>> Currently when the local file system needs to be unmounted for maintenance
>>>>> the admin needs to make sure all the NFS clients have stopped using any files
>>>>> on the NFS shares before the umount(8) can succeed.
>>>>
>>>> This is easily achieved with
>>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
>>>>
>>>> Do this after unexporting and before unmounting.
>>>
>>> Seems like administrators would expect that a filesystem can be
>>> unmounted immediately after unexporting it. Should "exportfs" be changed
>>> to handle this extra step under the covers? Doesn't seem like it would
>>> be hard to do, and I can't think of a use case where it would be
>>> harmful.
>>
>> No. I think that admins don't expect to lose all their NFS client's state if
>> they're managing the exports. That would be a really big and invisible change
>> to existing behavior.
>>
>
> If we're unexporting the filesystem though, then ISTM like we ought to
> cancel any state that was held on it. Are you concerned the admin
> inadvertently unexporting something or is there another use-case you're
> worried about?
I'm worried about changing existing behavior and the fallout. Today I can
un-export and re-export all day long, and as long as I re-export the
filesystem the applications on those clients are unaffected.
I'm an old sysadmin that knows that I can un-export and re-export stuff and
not have to worry about state loss. There have to be existing systems and
people that also have that knowledge built in by now. If we change this, we
break things.
Ben
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-21 15:07 ` Benjamin Coddington
@ 2025-03-21 15:18 ` Chuck Lever
2025-03-21 15:51 ` Benjamin Coddington
0 siblings, 1 reply; 20+ messages in thread
From: Chuck Lever @ 2025-03-21 15:18 UTC (permalink / raw)
To: Benjamin Coddington, Jeff Layton
Cc: NeilBrown, Dai Ngo, Olga Kornievskaia, Tom Talpey,
Linux NFS Mailing List
On 3/21/25 11:07 AM, Benjamin Coddington wrote:
> On 21 Mar 2025, at 10:43, Jeff Layton wrote:
>
>> On Fri, 2025-03-21 at 10:36 -0400, Benjamin Coddington wrote:
>>> On 20 Mar 2025, at 13:53, Chuck Lever wrote:
>>>
>>>> On 3/19/25 5:46 PM, NeilBrown wrote:
>>>>> On Thu, 20 Mar 2025, Dai Ngo wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Currently when the local file system needs to be unmounted for maintenance
>>>>>> the admin needs to make sure all the NFS clients have stopped using any files
>>>>>> on the NFS shares before the umount(8) can succeed.
>>>>>
>>>>> This is easily achieved with
>>>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
>>>>>
>>>>> Do this after unexporting and before unmounting.
>>>>
>>>> Seems like administrators would expect that a filesystem can be
>>>> unmounted immediately after unexporting it. Should "exportfs" be changed
>>>> to handle this extra step under the covers? Doesn't seem like it would
>>>> be hard to do, and I can't think of a use case where it would be
>>>> harmful.
>>>
>>> No. I think that admins don't expect to lose all their NFS client's state if
>>> they're managing the exports. That would be a really big and invisible change
>>> to existing behavior.
>>>
>>
>> If we're unexporting the filesystem though, then ISTM like we ought to
>> cancel any state that was held on it. Are you concerned the admin
>> inadvertently unexporting something or is there another use-case you're
>> worried about?
>
> I'm worried about changing existing behavior and the fallout, today I can
> un-export and re-export all day long, and as long as I re-export the
> filesystem the applications on those clients are unaffected.
>
> I'm an old sysadmin that knows that I can un-export and re-export stuff and
> not have to worry about state loss.
Is it documented that you can rely on that? If not, then I'd say old
sysadmins should expect that behavior can be changed. 2-cents.
Also, as a sysadmin, I would never unexport and expect there to be no
consequences. Running apps that try to open a file on a recently
unexported share /will/ get ESTALE -- NFSv3 holds no open state at
all, so the next NFS READ on that share will fail with EIO.
So unexport is already not without some consequences. IMO it's not
sensible to expect an unexport / re-export cycle will be safe under all
circumstances.
> There have to be existing systems and
> people that also have that knowledge built in by now. If we change this, we
> break things.
No lies detected. ;-)
Another reality test is to audit other server implementations. I can ask
around.
--
Chuck Lever
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-21 15:18 ` Chuck Lever
@ 2025-03-21 15:51 ` Benjamin Coddington
0 siblings, 0 replies; 20+ messages in thread
From: Benjamin Coddington @ 2025-03-21 15:51 UTC (permalink / raw)
To: Chuck Lever
Cc: Jeff Layton, NeilBrown, Dai Ngo, Olga Kornievskaia, Tom Talpey,
Linux NFS Mailing List
On 21 Mar 2025, at 11:18, Chuck Lever wrote:
> On 3/21/25 11:07 AM, Benjamin Coddington wrote:
>> On 21 Mar 2025, at 10:43, Jeff Layton wrote:
>>
>>> On Fri, 2025-03-21 at 10:36 -0400, Benjamin Coddington wrote:
>>>> On 20 Mar 2025, at 13:53, Chuck Lever wrote:
>>>>
>>>>> On 3/19/25 5:46 PM, NeilBrown wrote:
>>>>>> On Thu, 20 Mar 2025, Dai Ngo wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Currently when the local file system needs to be unmounted for maintenance
>>>>>>> the admin needs to make sure all the NFS clients have stopped using any files
>>>>>>> on the NFS shares before the umount(8) can succeed.
>>>>>>
>>>>>> This is easily achieved with
>>>>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
>>>>>>
>>>>>> Do this after unexporting and before unmounting.
>>>>>
>>>>> Seems like administrators would expect that a filesystem can be
>>>>> unmounted immediately after unexporting it. Should "exportfs" be changed
>>>>> to handle this extra step under the covers? Doesn't seem like it would
>>>>> be hard to do, and I can't think of a use case where it would be
>>>>> harmful.
>>>>
>>>> No. I think that admins don't expect to lose all their NFS client's state if
>>>> they're managing the exports. That would be a really big and invisible change
>>>> to existing behavior.
>>>>
>>>
>>> If we're unexporting the filesystem though, then ISTM like we ought to
>>> cancel any state that was held on it. Are you concerned the admin
>>> inadvertently unexporting something or is there another use-case you're
>>> worried about?
>>
>> I'm worried about changing existing behavior and the fallout, today I can
>> un-export and re-export all day long, and as long as I re-export the
>> filesystem the applications on those clients are unaffected.
>>
>> I'm an old sysadmin that knows that I can un-export and re-export stuff and
>> not have to worry about state loss.
>
> Is it documented that you can rely on that? If not, then I'd say old
> sysadmins should expect that behavior can be changed. 2-cents.
No, I don't know any place it's documented. It's the consequences of this
change I'm worried about, not our ability to say "you should have expected
this!"
> Also, as a sysadmin, I would never unexport and expect there to be no
> consequences. Running apps that try to open a file on a recently
> unexported share /will/ get ESTALE -- NFSv3 holds no open state at
> all, so the next NFS READ on that share will fail with EIO.
>
> So unexport is already not without some consequences. IMO it's not
> sensible to expect an unexport / re-export cycle will be safe under all
> circumstances.
This is true.
>> There have to be existing systems and
>> people that also have that knowledge built in by now. If we change this, we
>> break things.
>
> No lies detected. ;-)
>
> Another reality test is to audit other server implementations. I can ask
> around.
Thanks for taking my worries seriously. Since I'm working on a distro, I'm
sensitive to how many folks might get upset when an upgrade breaks things.
Ben
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-21 14:44 ` Chuck Lever
@ 2025-03-26 0:23 ` NeilBrown
2025-03-26 3:20 ` Dai Ngo
0 siblings, 1 reply; 20+ messages in thread
From: NeilBrown @ 2025-03-26 0:23 UTC (permalink / raw)
To: Chuck Lever
Cc: Benjamin Coddington, Jeff Layton, Dai Ngo, Olga Kornievskaia,
Tom Talpey, Linux NFS Mailing List
On Sat, 22 Mar 2025, Chuck Lever wrote:
> On 3/21/25 10:36 AM, Benjamin Coddington wrote:
> > On 20 Mar 2025, at 13:53, Chuck Lever wrote:
> >
> >> On 3/19/25 5:46 PM, NeilBrown wrote:
> >>> On Thu, 20 Mar 2025, Dai Ngo wrote:
> >>>> Hi,
> >>>>
> >>>> Currently when the local file system needs to be unmounted for maintenance
> >>>> the admin needs to make sure all the NFS clients have stopped using any files
> >>>> on the NFS shares before the umount(8) can succeed.
> >>>
> >>> This is easily achieved with
> >>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
> >>>
> >>> Do this after unexporting and before unmounting.
> >>
> >> Seems like administrators would expect that a filesystem can be
> >> unmounted immediately after unexporting it. Should "exportfs" be changed
> >> to handle this extra step under the covers? Doesn't seem like it would
> >> be hard to do, and I can't think of a use case where it would be
> >> harmful.
> >
> > No. I think that admins don't expect to lose all their NFS client's state if
> > they're managing the exports. That would be a really big and invisible change
> > to existing behavior.
>
> To be clear, I mean that a file system should be unlocked only when it
> is specifically unexported. IMO, unexport is usually an administrator
> action that means "I want to stop remote access to this file system now"
> and that's what unlock_filesystem does.
A problem with that position is that "unexport" isn't a well defined
operation.
It is quite possible to edit /etc/exports then run "exportfs -r". This
may implicitly unexport things.
The kernel certainly doesn't have a concept of "unexport". You can run
"exportfs -f" at any time quite safely. That tells the kernel to forget
all export information, but allows the kernel to ask mountd for anything
it finds that it needs.
>
> IMO administrators would be surprised to learn that NFS clients may
> continue to access a file system (via existing open files) after it
> has been explicitly unexported.
They can't access those files while it remains unexported. But if it is
re-exported, the access they had can continue seamlessly.
The original model is NLM, which is separate from NFS. Unexporting to NFS
doesn't close the locks held by NLM. That can be done separately by the
client with a STATMON request. In fact NLM never drops locks unless
explicitly asked to by the client or forced by the server admin. So it
isn't a good model, but it is what we had.
>
> The alternative is to document unlock_filesystem in man exportfs(8).
Another alternative is to provide new functionality in exportfs. Maybe
a --force flag or a --close-all flag.
It could examine /proc/fs/nfsd/clients/*/states to determine which
filesystems had active state, then examine the export tables
(/var/lib/nfs/etab) to see what was currently exported, then write
something appropriate to unlock_filesystem for any active filesystems
which are no longer exported.
If we did that we would want to find NLM locks in /proc/locks too and
ensure those were discarded if necessary.
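The scan described above could be prototyped outside exportfs first. What
follows is a minimal Python sketch, not a patch against nfs-utils. It assumes
that each line of a /proc/fs/nfsd/clients/*/states file carries a field of the
form superblock: "MAJOR:MINOR:INODE" with the device numbers in hex, and that
each /var/lib/nfs/etab line starts with the exported path. The helper names
(devices_with_state, exported_paths, device_of, unexported_devices) are
hypothetical, and NLM locks in /proc/locks are not covered.

```python
import os
import re

# Assumed field layout in /proc/fs/nfsd/clients/*/states:
#   superblock: "fd:10:13649"   (major:minor in hex, then inode number)
_SUPERBLOCK_RE = re.compile(
    r'superblock:\s*"([0-9a-fA-F]+):([0-9a-fA-F]+):\d+"'
)


def devices_with_state(states_text):
    """Return the set of (major, minor) devices named in a states file."""
    return {
        (int(major, 16), int(minor, 16))
        for major, minor in _SUPERBLOCK_RE.findall(states_text)
    }


def exported_paths(etab_text):
    """Return the exported paths listed in etab-style text.

    Assumes the path is the first whitespace-separated field of each line;
    octal escapes for unusual characters are not decoded here.
    """
    paths = set()
    for line in etab_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            paths.add(line.split()[0])
    return paths


def device_of(path):
    """Return (major, minor) of the filesystem that holds path."""
    dev = os.stat(path).st_dev
    return (os.major(dev), os.minor(dev))


def unexported_devices(state_devices, exported_devices):
    """Devices that still hold NFSv4 state but back no current export."""
    return set(state_devices) - set(exported_devices)
```

A caller would apply devices_with_state to every states file, map the result
of exported_paths through device_of, and then write a path on each remaining
device to unlock_filesystem. Mapping a device number back to a path (for
example via /proc/self/mountinfo) is left out of the sketch.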
There is also the possibility that a filesystem is still exported to
some clients but not to all. In that case writing something to
unlock_ip might be appropriate - though that doesn't revoke v4 state
yet.
Thanks,
NeilBrown
>
> And perhaps we need a more surgical mechanism that can handle the case
> where the file system is still exported but the security policy has
> changed. Because this does feel like a real information leak.
>
>
> --
> Chuck Lever
>
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-26 0:23 ` NeilBrown
@ 2025-03-26 3:20 ` Dai Ngo
2025-03-26 3:41 ` NeilBrown
0 siblings, 1 reply; 20+ messages in thread
From: Dai Ngo @ 2025-03-26 3:20 UTC (permalink / raw)
To: NeilBrown, Chuck Lever
Cc: Benjamin Coddington, Jeff Layton, Olga Kornievskaia, Tom Talpey,
Linux NFS Mailing List
On 3/25/25 5:23 PM, NeilBrown wrote:
> On Sat, 22 Mar 2025, Chuck Lever wrote:
>> On 3/21/25 10:36 AM, Benjamin Coddington wrote:
>>> On 20 Mar 2025, at 13:53, Chuck Lever wrote:
>>>
>>>> On 3/19/25 5:46 PM, NeilBrown wrote:
>>>>> On Thu, 20 Mar 2025, Dai Ngo wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Currently when the local file system needs to be unmounted for maintenance
>>>>>> the admin needs to make sure all the NFS clients have stopped using any files
>>>>>> on the NFS shares before the umount(8) can succeed.
>>>>> This is easily achieved with
>>>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
>>>>>
>>>>> Do this after unexporting and before unmounting.
>>>> Seems like administrators would expect that a filesystem can be
>>>> unmounted immediately after unexporting it. Should "exportfs" be changed
>>>> to handle this extra step under the covers? Doesn't seem like it would
>>>> be hard to do, and I can't think of a use case where it would be
>>>> harmful.
>>> No. I think that admins don't expect to lose all their NFS client's state if
>>> they're managing the exports. That would be a really big and invisible change
>>> to existing behavior.
>> To be clear, I mean that a file system should be unlocked only when it
>> is specifically unexported. IMO, unexport is usually an administrator
>> action that means "I want to stop remote access to this file system now"
>> and that's what unlock_filesystem does.
> A problem with that position is that "unexport" isn't a well defined
> operation.
> It is quite possible to edit /etc/exports then run "exportfs -r". This
> may implicit unexport things.
>
> The kernel certainly doesn't have a concept of "unexport". You can run
> "exportfs -f" at any time quite safely. That tells the kernel to forget
> all export information, but allows the kernel to ask mountd for anything
> it find that it needs.
>
>> IMO administrators would be surprised to learn that NFS clients may
>> continue to access a file system (via existing open files) after it
>> has been explicitly unexported.
> They can't access those file while it remains unexported. But if it is
> re-exported, the access they had can continue seamlessly.
>
> The origin model is NLM which is separate from NFS. Unexporting to NFS
> doesn't close the locks held by NLM. That can be done separately by the
> client with a STATMON request. In fact NLM never drops locks unless
> explicitly asked to by the client or forced by the server admin. So it
> isn't a good model, but it is what we had.
>
>> The alternative is to document unlock_filesystem in man exportfs(8).
> Another alternative is to provide new functionality in exportfs. Maybe
> a --force flag or a --close-all flag.
> It could examine /proc/fs/nfsd/clients/*/states to determine which
> filesystems had active state, then examine the export tables
> (/var/lib/nfs/etab) to see what was currently exported, then write
> something appropriate to unlock_filesystem for any active filesystems
> which are no longer exported.
Is it possible for the kernel, at the time of cache_clean/svc_export_put, to
make an upcall to rpc.mountd to check whether svc_export.ex_path is still
exported? If it's not, then release all the states that use that super_block.
-Dai
>
> If we did that we would want to find NLM locks in /proc/locks too and
> ensure those were discarded if necessary.
>
> There is also the possibility that a filesystem is still exported to
> some clients but not to all. In that case writing something to
> unlock_ip might be appropriate - though that doesn't revoke v4 state
> yet.
>
> Thanks,
> NeilBrown
>
>
>> And perhaps we need a more surgical mechanism that can handle the case
>> where the file system is still exported but the security policy has
>> changed. Because this does feel like a real information leak.
>>
>>
>> --
>> Chuck Lever
>>
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-26 3:20 ` Dai Ngo
@ 2025-03-26 3:41 ` NeilBrown
2025-03-26 13:15 ` Chuck Lever
2025-04-09 21:00 ` Dai Ngo
0 siblings, 2 replies; 20+ messages in thread
From: NeilBrown @ 2025-03-26 3:41 UTC (permalink / raw)
To: Dai Ngo
Cc: Chuck Lever, Benjamin Coddington, Jeff Layton, Olga Kornievskaia,
Tom Talpey, Linux NFS Mailing List
On Wed, 26 Mar 2025, Dai Ngo wrote:
> On 3/25/25 5:23 PM, NeilBrown wrote:
> > On Sat, 22 Mar 2025, Chuck Lever wrote:
> >> On 3/21/25 10:36 AM, Benjamin Coddington wrote:
> >>> On 20 Mar 2025, at 13:53, Chuck Lever wrote:
> >>>
> >>>> On 3/19/25 5:46 PM, NeilBrown wrote:
> >>>>> On Thu, 20 Mar 2025, Dai Ngo wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> Currently when the local file system needs to be unmounted for maintenance
> >>>>>> the admin needs to make sure all the NFS clients have stopped using any files
> >>>>>> on the NFS shares before the umount(8) can succeed.
> >>>>> This is easily achieved with
> >>>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
> >>>>>
> >>>>> Do this after unexporting and before unmounting.
> >>>> Seems like administrators would expect that a filesystem can be
> >>>> unmounted immediately after unexporting it. Should "exportfs" be changed
> >>>> to handle this extra step under the covers? Doesn't seem like it would
> >>>> be hard to do, and I can't think of a use case where it would be
> >>>> harmful.
> >>> No. I think that admins don't expect to lose all their NFS client's state if
> >>> they're managing the exports. That would be a really big and invisible change
> >>> to existing behavior.
> >> To be clear, I mean that a file system should be unlocked only when it
> >> is specifically unexported. IMO, unexport is usually an administrator
> >> action that means "I want to stop remote access to this file system now"
> >> and that's what unlock_filesystem does.
> > A problem with that position is that "unexport" isn't a well defined
> > operation.
> > It is quite possible to edit /etc/exports then run "exportfs -r". This
> > may implicit unexport things.
> >
> > The kernel certainly doesn't have a concept of "unexport". You can run
> > "exportfs -f" at any time quite safely. That tells the kernel to forget
> > all export information, but allows the kernel to ask mountd for anything
> > it find that it needs.
> >
> >> IMO administrators would be surprised to learn that NFS clients may
> >> continue to access a file system (via existing open files) after it
> >> has been explicitly unexported.
> > They can't access those file while it remains unexported. But if it is
> > re-exported, the access they had can continue seamlessly.
> >
> > The origin model is NLM which is separate from NFS. Unexporting to NFS
> > doesn't close the locks held by NLM. That can be done separately by the
> > client with a STATMON request. In fact NLM never drops locks unless
> > explicitly asked to by the client or forced by the server admin. So it
> > isn't a good model, but it is what we had.
> >
> >> The alternative is to document unlock_filesystem in man exportfs(8).
> > Another alternative is to provide new functionality in exportfs. Maybe
> > a --force flag or a --close-all flag.
> > It could examine /proc/fs/nfsd/clients/*/states to determine which
> > filesystems had active state, then examine the export tables
> > (/var/lib/nfs/etab) to see what was currently exported, then write
> > something appropriate to unlock_filesystem for any active filesystems
> > which are no longer exported.
>
> Is it possible that at the time of cache_clean/svc_export_put the kernel
> makes an upcall to rpc.mountd to check if svc_export.ex_path is still
> exported?. If it's not then release all the states that use that super_block.
I suspect that could be done, but then you would hit Ben's concern.
Temporarily unexporting a filesystem would change from the client getting
ESTALE if it happens to access a file while the filesystem is not
exported, to the client definitely getting ADMIN_REVOKED (probably -EIO)
the next time it accesses a file, even if the filesystem has been
exported again.
I agree with Ben that there needs to be a deliberate admin action to
revoke state, not just a side-effect of unexport which historically has
not revoked state.
NeilBrown
>
> -Dai
>
> >
> > If we did that we would want to find NLM locks in /proc/locks too and
> > ensure those were discarded if necessary.
> >
> > There is also the possibility that a filesystem is still exported to
> > some clients but not to all. In that case writing something to
> > unlock_ip might be appropriate - though that doesn't revoke v4 state
> > yet.
> >
> > Thanks,
> > NeilBrown
> >
> >
> >> And perhaps we need a more surgical mechanism that can handle the case
> >> where the file system is still exported but the security policy has
> >> changed. Because this does feel like a real information leak.
> >>
> >>
> >> --
> >> Chuck Lever
> >>
>
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-26 3:41 ` NeilBrown
@ 2025-03-26 13:15 ` Chuck Lever
2025-03-27 22:47 ` NeilBrown
2025-04-09 21:00 ` Dai Ngo
1 sibling, 1 reply; 20+ messages in thread
From: Chuck Lever @ 2025-03-26 13:15 UTC (permalink / raw)
To: NeilBrown, Dai Ngo
Cc: Benjamin Coddington, Jeff Layton, Olga Kornievskaia, Tom Talpey,
Linux NFS Mailing List
On 3/25/25 11:41 PM, NeilBrown wrote:
> On Wed, 26 Mar 2025, Dai Ngo wrote:
>> On 3/25/25 5:23 PM, NeilBrown wrote:
>>> On Sat, 22 Mar 2025, Chuck Lever wrote:
>>>> On 3/21/25 10:36 AM, Benjamin Coddington wrote:
>>>>> On 20 Mar 2025, at 13:53, Chuck Lever wrote:
>>>>>
>>>>>> On 3/19/25 5:46 PM, NeilBrown wrote:
>>>>>>> On Thu, 20 Mar 2025, Dai Ngo wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Currently when the local file system needs to be unmounted for maintenance
>>>>>>>> the admin needs to make sure all the NFS clients have stopped using any files
>>>>>>>> on the NFS shares before the umount(8) can succeed.
>>>>>>> This is easily achieved with
>>>>>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
>>>>>>>
>>>>>>> Do this after unexporting and before unmounting.
>>>>>> Seems like administrators would expect that a filesystem can be
>>>>>> unmounted immediately after unexporting it. Should "exportfs" be changed
>>>>>> to handle this extra step under the covers? Doesn't seem like it would
>>>>>> be hard to do, and I can't think of a use case where it would be
>>>>>> harmful.
>>>>> No. I think that admins don't expect to lose all their NFS client's state if
>>>>> they're managing the exports. That would be a really big and invisible change
>>>>> to existing behavior.
>>>> To be clear, I mean that a file system should be unlocked only when it
>>>> is specifically unexported. IMO, unexport is usually an administrator
>>>> action that means "I want to stop remote access to this file system now"
>>>> and that's what unlock_filesystem does.
>>> A problem with that position is that "unexport" isn't a well defined
>>> operation.
>>> It is quite possible to edit /etc/exports then run "exportfs -r". This
>>> may implicit unexport things.
>>>
>>> The kernel certainly doesn't have a concept of "unexport". You can run
>>> "exportfs -f" at any time quite safely. That tells the kernel to forget
>>> all export information, but allows the kernel to ask mountd for anything
>>> it find that it needs.
>>>
>>>> IMO administrators would be surprised to learn that NFS clients may
>>>> continue to access a file system (via existing open files) after it
>>>> has been explicitly unexported.
>>> They can't access those file while it remains unexported. But if it is
>>> re-exported, the access they had can continue seamlessly.
>>>
>>> The origin model is NLM which is separate from NFS. Unexporting to NFS
>>> doesn't close the locks held by NLM. That can be done separately by the
>>> client with a STATMON request. In fact NLM never drops locks unless
>>> explicitly asked to by the client or forced by the server admin. So it
>>> isn't a good model, but it is what we had.
>>>
>>>> The alternative is to document unlock_filesystem in man exportfs(8).
>>> Another alternative is to provide new functionality in exportfs. Maybe
>>> a --force flag or a --close-all flag.
>>> It could examine /proc/fs/nfsd/clients/*/states to determine which
>>> filesystems had active state, then examine the export tables
>>> (/var/lib/nfs/etab) to see what was currently exported, then write
>>> something appropriate to unlock_filesystem for any active filesystems
>>> which are no longer exported.
>>
>> Is it possible that at the time of cache_clean/svc_export_put the kernel
>> makes an upcall to rpc.mountd to check if svc_export.ex_path is still
>> exported?. If it's not then release all the states that use that super_block.
>
> I suspect that could be done, but then you would hit Ben's concern.
> Temporarily unexported a filesystem would change from the client getting
> ESTALE if it happens to access a file while the filesystem is not
> exported, to the client definitely getting ADMIN_REVOKED (probably -EIO)
> then next time it accesses a file even if the filesystem has been
> exported again.
>
> I agree with Ben that there needs to be a deliberate admin action to
> revoke state, not just a side-effect of unexport which historically has
> not revoked state.
I'm not religiously attached to expunging open/lock state on a simple
unexport operation. But I think it is critical to document the fact that
NFSv4 state remains and that will prevent an unmount (I'm not sure we've
identified any possible security exposures).
Neil, do you happen to know if unlock_filesystem and unlock_ip are
mentioned in man pages? If so, then exportfs(8) should refer to them.
> NeilBrown
>
>
>
>>
>> -Dai
>>
>>>
>>> If we did that we would want to find NLM locks in /proc/locks too and
>>> ensure those were discarded if necessary.
>>>
>>> There is also the possibility that a filesystem is still exported to
>>> some clients but not to all. In that case writing something to
>>> unlock_ip might be appropriate - though that doesn't revoke v4 state
>>> yet.
>>>
>>> Thanks,
>>> NeilBrown
>>>
>>>
>>>> And perhaps we need a more surgical mechanism that can handle the case
>>>> where the file system is still exported but the security policy has
>>>> changed. Because this does feel like a real information leak.
>>>>
>>>>
>>>> --
>>>> Chuck Lever
>>>>
>>
>
--
Chuck Lever
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-26 13:15 ` Chuck Lever
@ 2025-03-27 22:47 ` NeilBrown
0 siblings, 0 replies; 20+ messages in thread
From: NeilBrown @ 2025-03-27 22:47 UTC (permalink / raw)
To: Chuck Lever
Cc: Dai Ngo, Benjamin Coddington, Jeff Layton, Olga Kornievskaia,
Tom Talpey, Linux NFS Mailing List
On Thu, 27 Mar 2025, Chuck Lever wrote:
> On 3/25/25 11:41 PM, NeilBrown wrote:
> > On Wed, 26 Mar 2025, Dai Ngo wrote:
> >> On 3/25/25 5:23 PM, NeilBrown wrote:
> >>> On Sat, 22 Mar 2025, Chuck Lever wrote:
> >>>> On 3/21/25 10:36 AM, Benjamin Coddington wrote:
> >>>>> On 20 Mar 2025, at 13:53, Chuck Lever wrote:
> >>>>>
> >>>>>> On 3/19/25 5:46 PM, NeilBrown wrote:
> >>>>>>> On Thu, 20 Mar 2025, Dai Ngo wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> Currently when the local file system needs to be unmounted for maintenance
> >>>>>>>> the admin needs to make sure all the NFS clients have stopped using any files
> >>>>>>>> on the NFS shares before the umount(8) can succeed.
> >>>>>>> This is easily achieved with
> >>>>>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
> >>>>>>>
> >>>>>>> Do this after unexporting and before unmounting.
> >>>>>> Seems like administrators would expect that a filesystem can be
> >>>>>> unmounted immediately after unexporting it. Should "exportfs" be changed
> >>>>>> to handle this extra step under the covers? Doesn't seem like it would
> >>>>>> be hard to do, and I can't think of a use case where it would be
> >>>>>> harmful.
> >>>>> No. I think that admins don't expect to lose all their NFS client's state if
> >>>>> they're managing the exports. That would be a really big and invisible change
> >>>>> to existing behavior.
> >>>> To be clear, I mean that a file system should be unlocked only when it
> >>>> is specifically unexported. IMO, unexport is usually an administrator
> >>>> action that means "I want to stop remote access to this file system now"
> >>>> and that's what unlock_filesystem does.
> >>> A problem with that position is that "unexport" isn't a well defined
> >>> operation.
> >>> It is quite possible to edit /etc/exports then run "exportfs -r". This
> >>> may implicit unexport things.
> >>>
> >>> The kernel certainly doesn't have a concept of "unexport". You can run
> >>> "exportfs -f" at any time quite safely. That tells the kernel to forget
> >>> all export information, but allows the kernel to ask mountd for anything
> >>> it find that it needs.
> >>>
> >>>> IMO administrators would be surprised to learn that NFS clients may
> >>>> continue to access a file system (via existing open files) after it
> >>>> has been explicitly unexported.
> >>> They can't access those file while it remains unexported. But if it is
> >>> re-exported, the access they had can continue seamlessly.
> >>>
> >>> The origin model is NLM which is separate from NFS. Unexporting to NFS
> >>> doesn't close the locks held by NLM. That can be done separately by the
> >>> client with a STATMON request. In fact NLM never drops locks unless
> >>> explicitly asked to by the client or forced by the server admin. So it
> >>> isn't a good model, but it is what we had.
> >>>
> >>>> The alternative is to document unlock_filesystem in man exportfs(8).
> >>> Another alternative is to provide new functionality in exportfs. Maybe
> >>> a --force flag or a --close-all flag.
> >>> It could examine /proc/fs/nfsd/clients/*/states to determine which
> >>> filesystems had active state, then examine the export tables
> >>> (/var/lib/nfs/etab) to see what was currently exported, then write
> >>> something appropriate to unlock_filesystem for any active filesystems
> >>> which are no longer exported.
> >>
> >> Is it possible that at the time of cache_clean/svc_export_put the kernel
> >> makes an upcall to rpc.mountd to check if svc_export.ex_path is still
> >> exported?. If it's not then release all the states that use that super_block.
> >
> > I suspect that could be done, but then you would hit Ben's concern.
> > Temporarily unexported a filesystem would change from the client getting
> > ESTALE if it happens to access a file while the filesystem is not
> > exported, to the client definitely getting ADMIN_REVOKED (probably -EIO)
> > then next time it accesses a file even if the filesystem has been
> > exported again.
> >
> > I agree with Ben that there needs to be a deliberate admin action to
> > revoke state, not just a side-effect of unexport which historically has
> > not revoked state.
>
> I'm not religiously attached to expunging open/lock state on a simple
> unexport operation. But I think it is critical to document the fact that
> NFSv4 state remains and that will prevent an unmount (I'm not sure we've
> identified any possible security exposures).
>
> Neil, do you happen to know if unlock_filesystem and unlock_ip are
> mentioned in man pages? If so, then exportfs(8) should refer to them.
>
They aren't mentioned in the nfs-utils package at all, or in
linux/Documentation.
So no: no documentation.
They should be mentioned in nfsd.7 (utils/exportfs/nfsd.man)
I guess someone should update that man page. Probably the new netlink
interface could be described there too?
NeilBrown
* Re: NFSD automatically releases all states when underlying file system is unmounted
2025-03-26 3:41 ` NeilBrown
2025-03-26 13:15 ` Chuck Lever
@ 2025-04-09 21:00 ` Dai Ngo
1 sibling, 0 replies; 20+ messages in thread
From: Dai Ngo @ 2025-04-09 21:00 UTC (permalink / raw)
To: NeilBrown
Cc: Chuck Lever, Benjamin Coddington, Jeff Layton, Olga Kornievskaia,
Tom Talpey, Linux NFS Mailing List
On 3/25/25 8:41 PM, NeilBrown wrote:
> On Wed, 26 Mar 2025, Dai Ngo wrote:
>> On 3/25/25 5:23 PM, NeilBrown wrote:
>>> On Sat, 22 Mar 2025, Chuck Lever wrote:
>>>> On 3/21/25 10:36 AM, Benjamin Coddington wrote:
>>>>> On 20 Mar 2025, at 13:53, Chuck Lever wrote:
>>>>>
>>>>>> On 3/19/25 5:46 PM, NeilBrown wrote:
>>>>>>> On Thu, 20 Mar 2025, Dai Ngo wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Currently when the local file system needs to be unmounted for maintenance
>>>>>>>> the admin needs to make sure all the NFS clients have stopped using any files
>>>>>>>> on the NFS shares before the umount(8) can succeed.
>>>>>>> This is easily achieved with
>>>>>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
>>>>>>>
>>>>>>> Do this after unexporting and before unmounting.
>>>>>> Seems like administrators would expect that a filesystem can be
>>>>>> unmounted immediately after unexporting it. Should "exportfs" be changed
>>>>>> to handle this extra step under the covers? Doesn't seem like it would
>>>>>> be hard to do, and I can't think of a use case where it would be
>>>>>> harmful.
>>>>> No. I think that admins don't expect to lose all their NFS client's state if
>>>>> they're managing the exports. That would be a really big and invisible change
>>>>> to existing behavior.
>>>> To be clear, I mean that a file system should be unlocked only when it
>>>> is specifically unexported. IMO, unexport is usually an administrator
>>>> action that means "I want to stop remote access to this file system now"
>>>> and that's what unlock_filesystem does.
>>> A problem with that position is that "unexport" isn't a well defined
>>> operation.
>>> It is quite possible to edit /etc/exports then run "exportfs -r". This
>>> may implicit unexport things.
>>>
>>> The kernel certainly doesn't have a concept of "unexport". You can run
>>> "exportfs -f" at any time quite safely. That tells the kernel to forget
>>> all export information, but allows the kernel to ask mountd for anything
>>> it find that it needs.
>>>
>>>> IMO administrators would be surprised to learn that NFS clients may
>>>> continue to access a file system (via existing open files) after it
>>>> has been explicitly unexported.
>>> They can't access those file while it remains unexported. But if it is
>>> re-exported, the access they had can continue seamlessly.
>>>
>>> The origin model is NLM which is separate from NFS. Unexporting to NFS
>>> doesn't close the locks held by NLM. That can be done separately by the
>>> client with a STATMON request. In fact NLM never drops locks unless
>>> explicitly asked to by the client or forced by the server admin. So it
>>> isn't a good model, but it is what we had.
>>>
>>>> The alternative is to document unlock_filesystem in man exportfs(8).
>>> Another alternative is to provide new functionality in exportfs. Maybe
>>> a --force flag or a --close-all flag.
>>> It could examine /proc/fs/nfsd/clients/*/states to determine which
>>> filesystems had active state, then examine the export tables
>>> (/var/lib/nfs/etab) to see what was currently exported, then write
>>> something appropriate to unlock_filesystem for any active filesystems
>>> which are no longer exported.
>> Is it possible that at the time of cache_clean/svc_export_put the kernel
>> makes an upcall to rpc.mountd to check if svc_export.ex_path is still
>> exported?. If it's not then release all the states that use that super_block.
> I suspect that could be done, but then you would hit Ben's concern.
> Temporarily unexported a filesystem would change from the client getting
> ESTALE if it happens to access a file while the filesystem is not
> exported, to the client definitely getting ADMIN_REVOKED (probably -EIO)
> then next time it accesses a file even if the filesystem has been
> exported again.
>
> I agree with Ben that there needs to be a deliberate admin action to
> revoke state, not just a side-effect of unexport which historically has
> not revoked state.
Would it be useful to add an option, 'R', to exportfs that also revokes state
when the user runs it with '-au'? For example:
# exportfs -Rau
The main purpose of this option is to allow all the underlying file systems
to be unmounted without requiring all clients to unmount the NFS exports first.
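Until such an option exists, the same effect can be approximated with the
steps already mentioned in this thread: unexport, write each path to
unlock_filesystem, then unmount. Below is a minimal Python sketch of that
ordering. It assumes the caller supplies the mount points; the function names
are hypothetical, and '-R' itself is only a proposal, so the sketch uses the
existing 'exportfs -au'.

```python
import subprocess

UNLOCK_FILESYSTEM = "/proc/fs/nfsd/unlock_filesystem"


def plan_unexport_and_unmount(mountpoints):
    """Return the ordered steps as (action, argument) tuples."""
    # 1. Unexport everything so no new state can be created.
    steps = [("run", ["exportfs", "-au"])]
    # 2. Ask nfsd to release the state it holds on each filesystem.
    for mnt in mountpoints:
        steps.append(("write", (UNLOCK_FILESYSTEM, mnt)))
    # 3. Only then attempt the unmounts.
    for mnt in mountpoints:
        steps.append(("run", ["umount", mnt]))
    return steps


def execute(steps):
    """Carry out a plan; this needs root on the NFS server."""
    for action, arg in steps:
        if action == "run":
            subprocess.run(arg, check=True)
        else:
            path, text = arg
            with open(path, "w") as f:
                f.write(text + "\n")
```

Keeping the plan separate from its execution lets an admin print and review
the steps before anything is revoked, which fits the view expressed earlier
that revoking state should be a deliberate action.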
Thanks,
-Dai
>
> NeilBrown
>
>
>
>> -Dai
>>
>>> If we did that we would want to find NLM locks in /proc/locks too and
>>> ensure those were discarded if necessary.
>>>
>>> There is also the possibility that a filesystem is still exported to
>>> some clients but not to all. In that case writing something to
>>> unlock_ip might be appropriate - though that doesn't revoke v4 state
>>> yet.
>>>
>>> Thanks,
>>> NeilBrown
>>>
>>>
>>>> And perhaps we need a more surgical mechanism that can handle the case
>>>> where the file system is still exported but the security policy has
>>>> changed. Because this does feel like a real information leak.
>>>>
>>>>
>>>> --
>>>> Chuck Lever
>>>>
Thread overview: 20+ messages
2025-03-19 18:22 NFSD automatically releases all states when underlying file system is unmounted Dai Ngo
2025-03-19 18:28 ` Chuck Lever
2025-03-19 19:00 ` Dai Ngo
2025-03-19 19:24 ` Chuck Lever
2025-03-19 19:44 ` Dai Ngo
2025-03-19 21:46 ` NeilBrown
2025-03-19 22:12 ` Dai Ngo
2025-03-20 17:53 ` Chuck Lever
2025-03-21 14:36 ` Benjamin Coddington
2025-03-21 14:43 ` Jeff Layton
2025-03-21 15:07 ` Benjamin Coddington
2025-03-21 15:18 ` Chuck Lever
2025-03-21 15:51 ` Benjamin Coddington
2025-03-21 14:44 ` Chuck Lever
2025-03-26 0:23 ` NeilBrown
2025-03-26 3:20 ` Dai Ngo
2025-03-26 3:41 ` NeilBrown
2025-03-26 13:15 ` Chuck Lever
2025-03-27 22:47 ` NeilBrown
2025-04-09 21:00 ` Dai Ngo