* NFSD automatically releases all states when underlying file system is unmounted
@ 2025-03-19 18:22 Dai Ngo
  2025-03-19 18:28 ` Chuck Lever
  2025-03-19 21:46 ` NeilBrown
  0 siblings, 2 replies; 20+ messages in thread
From: Dai Ngo @ 2025-03-19 18:22 UTC (permalink / raw)
  To: Chuck Lever, Jeff Layton, Neil Brown, Olga Kornievskaia, Tom Talpey
  Cc: Linux NFS Mailing List

Hi,

Currently when the local file system needs to be unmounted for maintenance
the admin needs to make sure all the NFS clients have stopped using any files
on the NFS shares before the umount(8) can succeed.

In an environment where there are thousands of clients this manual process
seems almost impossible or impractical. The only option available now is to
restart the NFS server, which would work since the NFS client can recover its
state, but this seems like a big-hammer approach.

Ideally, when the umount command is run there would be a callback from the
VFS layer to notify the upper protocols, NFS and SMB, to release their states
on this file system so that the umount can complete.

Is there any existing mechanism to allow NFSD to release its states
automatically on unmount?

Unmount is not a frequent operation. Is it justifiable to add a bunch of
complex code for something that is not frequently needed?

I appreciate any opinions on this issue.

Thanks,
-Dai

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NFSD automatically releases all states when underlying file system is unmounted 2025-03-19 18:22 NFSD automatically releases all states when underlying file system is unmounted Dai Ngo @ 2025-03-19 18:28 ` Chuck Lever 2025-03-19 19:00 ` Dai Ngo 2025-03-19 21:46 ` NeilBrown 1 sibling, 1 reply; 20+ messages in thread From: Chuck Lever @ 2025-03-19 18:28 UTC (permalink / raw) To: Dai Ngo Cc: Jeff Layton, Neil Brown, Tom Talpey, Olga Kornievskaia, Mike Snitzer, Linux NFS Mailing List Hi Dai, thanks for starting this conversation. [ adding Mike -- IIRC localio is facing a similar issue ] On 3/19/25 2:22 PM, Dai Ngo wrote: > Hi, > > Currently when the local file system needs to be unmounted for maintenance > the admin needs to make sure all the NFS clients have stopped using any > files > on the NFS shares before the umount(8) can succeed. > > In an environment where there are thousands of clients this manual process > seems almost impossible or impractical. The only option available now is to > restart the NFS server which would works since the NFS client can > recover its > state but it seems like this is a big hammer approach. Well we could do this instead by having the server pretend to reboot for only clients that have mounted the export that is going away. That way any clients that don't have an interest in the unexported/unmounted file system don't have to deal with state recovery. > Ideally, when the umount command is run there is a callback from the VFS > layer > to notify the upper protocols; NFS and SMB, to release its states on > this file > system for the umount to complete. > > Is there any existing mechanism to allow NFSD to release its states > automatically on unmount? Can you explain why you don't believe unexport is the right place to trigger remote file closure? > Unmount is not a frequent operation. Is it justifiable to add a bunch of > complex > code for something is not frequently needed? 
I agree that I/O is significantly more frequent than unexport/unmount. It suggests we want a solution that does not make a heavy impact on the I/O code paths. -- Chuck Lever ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NFSD automatically releases all states when underlying file system is unmounted 2025-03-19 18:28 ` Chuck Lever @ 2025-03-19 19:00 ` Dai Ngo 2025-03-19 19:24 ` Chuck Lever 0 siblings, 1 reply; 20+ messages in thread From: Dai Ngo @ 2025-03-19 19:00 UTC (permalink / raw) To: Chuck Lever Cc: Jeff Layton, Neil Brown, Tom Talpey, Olga Kornievskaia, Mike Snitzer, Linux NFS Mailing List On 3/19/25 11:28 AM, Chuck Lever wrote: > Hi Dai, thanks for starting this conversation. > > [ adding Mike -- IIRC localio is facing a similar issue ] > > On 3/19/25 2:22 PM, Dai Ngo wrote: >> Hi, >> >> Currently when the local file system needs to be unmounted for maintenance >> the admin needs to make sure all the NFS clients have stopped using any >> files >> on the NFS shares before the umount(8) can succeed. >> >> In an environment where there are thousands of clients this manual process >> seems almost impossible or impractical. The only option available now is to >> restart the NFS server which would works since the NFS client can >> recover its >> state but it seems like this is a big hammer approach. > Well we could do this instead by having the server pretend to reboot for > only clients that have mounted the export that is going away. That way > any clients that don't have an interest in the unexported/unmounted file > system don't have to deal with state recovery. Is there a way to restart the NFS server for just the export that's going away? How do we specify an export when doing 'systemctl restart nfs-server'. > > >> Ideally, when the umount command is run there is a callback from the VFS >> layer >> to notify the upper protocols; NFS and SMB, to release its states on >> this file >> system for the umount to complete. >> >> Is there any existing mechanism to allow NFSD to release its states >> automatically on unmount? > Can you explain why you don't believe unexport is the right place to > trigger remote file closure? 
Yes, unexport is another place that can be enhanced to trigger the releasing
of all states of the export that is going away. For this to work, the downcall
mechanism between exportfs and the kernel needs to be enhanced to specify
the export that is going away. This approach would eliminate the need for
VFS involvement.

Currently when 'exportfs -u' is called, exportfs makes a downcall to the
kernel to clear the cache of ALL exports and not just the one that is going
away.

>
>
>> Unmount is not a frequent operation. Is it justifiable to add a bunch of
>> complex
>> code for something is not frequently needed?
> I agree that I/O is significantly more frequent than unexport/unmount.
> It suggests we want a solution that does not make a heavy impact on the
> I/O code paths.

Thanks,
-Dai

>
>

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NFSD automatically releases all states when underlying file system is unmounted
  2025-03-19 19:00   ` Dai Ngo
@ 2025-03-19 19:24     ` Chuck Lever
  2025-03-19 19:44       ` Dai Ngo
  0 siblings, 1 reply; 20+ messages in thread
From: Chuck Lever @ 2025-03-19 19:24 UTC (permalink / raw)
  To: Dai Ngo
  Cc: Jeff Layton, Neil Brown, Tom Talpey, Olga Kornievskaia,
      Mike Snitzer, Linux NFS Mailing List

On 3/19/25 3:00 PM, Dai Ngo wrote:
>
> On 3/19/25 11:28 AM, Chuck Lever wrote:
>> Hi Dai, thanks for starting this conversation.
>>
>> [ adding Mike -- IIRC localio is facing a similar issue ]
>>
>> On 3/19/25 2:22 PM, Dai Ngo wrote:
>>> Hi,
>>>
>>> Currently when the local file system needs to be unmounted for
>>> maintenance
>>> the admin needs to make sure all the NFS clients have stopped using any
>>> files
>>> on the NFS shares before the umount(8) can succeed.
>>>
>>> In an environment where there are thousands of clients this manual
>>> process
>>> seems almost impossible or impractical. The only option available now
>>> is to
>>> restart the NFS server which would works since the NFS client can
>>> recover its
>>> state but it seems like this is a big hammer approach.
>> Well we could do this instead by having the server pretend to reboot for
>> only clients that have mounted the export that is going away. That way
>> any clients that don't have an interest in the unexported/unmounted file
>> system don't have to deal with state recovery.
>
> Is there a way to restart the NFS server for just the export that's going
> away?

There is currently no way to do that, though it's a feature that has
been discussed for years.

> How do we specify an export when doing 'systemctl restart nfs-server'.

I would think that the emulated restart for select exports would be
handled entirely in the kernel, and not via systemd.

>>> Ideally, when the umount command is run there is a callback from the VFS
>>> layer
>>> to notify the upper protocols; NFS and SMB, to release its states on
>>> this file
>>> system for the umount to complete.
>>>
>>> Is there any existing mechanism to allow NFSD to release its states
>>> automatically on unmount?
>> Can you explain why you don't believe unexport is the right place to
>> trigger remote file closure?
>
> Yes, unexport is another place that can be enhanced to trigger the
> releasing
> of all states of the export that going away. For this to work, the downcall
> mechanism between exportfs and the kernel needs to be enhanced to specify
> the export that is going away. This approach would eliminate the need for
> VFS involvement.
>
> Currently when 'exportfs -u' is called, exportfs makes a downcall to the
> kernel to clear the cache of ALL exports and not just the one that going
> away.

Clearing the export cache is OK. That just means that new client
requests will trigger an upcall to repopulate the kernel's export cache.
That is a much smaller bump in the performance road than a full server
restart.

--
Chuck Lever

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NFSD automatically releases all states when underlying file system is unmounted
  2025-03-19 19:24     ` Chuck Lever
@ 2025-03-19 19:44       ` Dai Ngo
  0 siblings, 0 replies; 20+ messages in thread
From: Dai Ngo @ 2025-03-19 19:44 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Jeff Layton, Neil Brown, Tom Talpey, Olga Kornievskaia,
      Mike Snitzer, Linux NFS Mailing List

On 3/19/25 12:24 PM, Chuck Lever wrote:
> On 3/19/25 3:00 PM, Dai Ngo wrote:
>> On 3/19/25 11:28 AM, Chuck Lever wrote:
>>> Hi Dai, thanks for starting this conversation.
>>>
>>> [ adding Mike -- IIRC localio is facing a similar issue ]
>>>
>>> On 3/19/25 2:22 PM, Dai Ngo wrote:
>>>> Hi,
>>>>
>>>> Currently when the local file system needs to be unmounted for
>>>> maintenance
>>>> the admin needs to make sure all the NFS clients have stopped using any
>>>> files
>>>> on the NFS shares before the umount(8) can succeed.
>>>>
>>>> In an environment where there are thousands of clients this manual
>>>> process
>>>> seems almost impossible or impractical. The only option available now
>>>> is to
>>>> restart the NFS server which would works since the NFS client can
>>>> recover its
>>>> state but it seems like this is a big hammer approach.
>>> Well we could do this instead by having the server pretend to reboot for
>>> only clients that have mounted the export that is going away. That way
>>> any clients that don't have an interest in the unexported/unmounted file
>>> system don't have to deal with state recovery.
>> Is there a way to restart the NFS server for just the export that's going
>> away?
> There is not a way do that, currently, though it's a feature that has
> been discussed for years.
>
>
>> How do we specify an export when doing 'systemctl restart nfs-server'.
> I would think that the emulated restart for select exports would be
> handled entirely in the kernel, and not via systemd.
>
>
>>>> Ideally, when the umount command is run there is a callback from the VFS
>>>> layer
>>>> to notify the upper protocols; NFS and SMB, to release its states on
>>>> this file
>>>> system for the umount to complete.
>>>>
>>>> Is there any existing mechanism to allow NFSD to release its states
>>>> automatically on unmount?
>>> Can you explain why you don't believe unexport is the right place to
>>> trigger remote file closure?
>> Yes, unexport is another place that can be enhanced to trigger the
>> releasing
>> of all states of the export that going away. For this to work, the downcall
>> mechanism between exportfs and the kernel needs to be enhanced to specify
>> the export that is going away. This approach would eliminate the need for
>> VFS involvement.
>>
>> Currently when 'exportfs -u' is called, exportfs makes a downcall to the
>> kernel to clear the cache of ALL exports and not just the one that going
>> away.
> Clearing the export cache is OK. That just means that new client
> requests will trigger an upcall tp repopulate the kernel's export cache.
> That is a much smaller bump in the performance road than a full server
> restart.

It's not just clearing the export cache. Since we do not know which
export is going away, we have to release the states of all files that
are using the exports, meaning all NFS clients have to recover their
states. Perhaps this still has less impact than a full server restart.

-Dai

>
>

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NFSD automatically releases all states when underlying file system is unmounted
  2025-03-19 18:22 NFSD automatically releases all states when underlying file system is unmounted Dai Ngo
  2025-03-19 18:28 ` Chuck Lever
@ 2025-03-19 21:46 ` NeilBrown
  2025-03-19 22:12   ` Dai Ngo
  2025-03-20 17:53   ` Chuck Lever
  1 sibling, 2 replies; 20+ messages in thread
From: NeilBrown @ 2025-03-19 21:46 UTC (permalink / raw)
  To: Dai Ngo
  Cc: Chuck Lever, Jeff Layton, Olga Kornievskaia, Tom Talpey,
      Linux NFS Mailing List

On Thu, 20 Mar 2025, Dai Ngo wrote:
> Hi,
>
> Currently when the local file system needs to be unmounted for maintenance
> the admin needs to make sure all the NFS clients have stopped using any files
> on the NFS shares before the umount(8) can succeed.

This is easily achieved with

    echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem

Do this after unexporting and before unmounting.

All state for NFSv4 exports, and all NLM locks for NFSv2/3 exports, will
be invalidated and files closed. NFSv4 clients will get
NFS4ERR_ADMIN_REVOKED when they attempt to use any state that was on
that filesystem.

(I don't think this flushes the NFSv3 file cache, so a short delay might
be needed before the unmount when v3 is used. That should be fixed)

NeilBrown

>
> In an environment where there are thousands of clients this manual process
> seems almost impossible or impractical. The only option available now is to
> restart the NFS server which would works since the NFS client can recover its
> state but it seems like this is a big hammer approach.
>
> Ideally, when the umount command is run there is a callback from the VFS layer
> to notify the upper protocols; NFS and SMB, to release its states on this file
> system for the umount to complete.
>
> Is there any existing mechanism to allow NFSD to release its states automatically
> on unmount?
>
> Unmount is not a frequent operation. Is it justifiable to add a bunch of complex
> code for something is not frequently needed?
>
> I appreciate any opinions on this issue.
>
> Thanks,
> -Dai
>

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NFSD automatically releases all states when underlying file system is unmounted 2025-03-19 21:46 ` NeilBrown @ 2025-03-19 22:12 ` Dai Ngo 2025-03-20 17:53 ` Chuck Lever 1 sibling, 0 replies; 20+ messages in thread From: Dai Ngo @ 2025-03-19 22:12 UTC (permalink / raw) To: NeilBrown Cc: Chuck Lever, Jeff Layton, Olga Kornievskaia, Tom Talpey, Linux NFS Mailing List On 3/19/25 2:46 PM, NeilBrown wrote: > On Thu, 20 Mar 2025, Dai Ngo wrote: >> Hi, >> >> Currently when the local file system needs to be unmounted for maintenance >> the admin needs to make sure all the NFS clients have stopped using any files >> on the NFS shares before the umount(8) can succeed. > This is easily achieved with > echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem > > Do this after unexporting and before unmounting. Yes, this works! > > All state for NFSv4 exports, and all NLM locks for NFSv2/3 exports, will > be invalidated and files closed. NFSv4 clients will get > NFS4ERR_ADMIN_REVOKED when they attempt to use any state that was on > that filesystem. In my test, client gets NFS4ERR_STALE for the PUTFH in the GETATTR compound which is expected. > > (I don't think this flushes the NFSv3 file cache, so a short delay might > be needed before the unmount when v3 is used. That should be fixed) Thank you very much Neil! -Dai > > NeilBrown > > >> In an environment where there are thousands of clients this manual process >> seems almost impossible or impractical. The only option available now is to >> restart the NFS server which would works since the NFS client can recover its >> state but it seems like this is a big hammer approach. >> >> Ideally, when the umount command is run there is a callback from the VFS layer >> to notify the upper protocols; NFS and SMB, to release its states on this file >> system for the umount to complete. >> >> Is there any existing mechanism to allow NFSD to release its states automatically >> on unmount? >> >> Unmount is not a frequent operation. 
Is it justifiable to add a bunch of complex >> code for something is not frequently needed? >> >> I appreciate any opinions on this issue. >> >> Thanks, >> -Dai >> >> >> >> >> >> >> ^ permalink raw reply [flat|nested] 20+ messages in thread
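The sequence Neil describes and Dai confirms above (unexport, write the file system path to /proc/fs/nfsd/unlock_filesystem, then unmount) could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names release_nfsd_state() and retire_filesystem(), the injectable run and control_file parameters, and the example export spec are hypothetical and do not come from nfs-utils or the kernel. Only the control file path and the ordering come from the thread.

```python
import subprocess
from pathlib import Path

# Control file named in the thread. Writing a path to it asks nfsd to
# release the state it holds on that file system.
UNLOCK_FS = Path("/proc/fs/nfsd/unlock_filesystem")


def release_nfsd_state(fs_path, control_file=UNLOCK_FS):
    """Equivalent of: echo fs_path > control_file (needs root on a real server)."""
    with open(control_file, "w") as f:
        f.write(f"{fs_path}\n")


def retire_filesystem(export_spec, mount_point, run=subprocess.run,
                      control_file=UNLOCK_FS):
    """Unexport, release nfsd state, then unmount, in that order.

    export_spec is a "client:/path" string as passed to `exportfs -u`.
    `run` and `control_file` are injectable (hypothetical convenience)
    so the sequence can be dry-run without touching a live server.
    """
    # 1. Stop exporting, so no new state can be established.
    run(["exportfs", "-u", export_spec], check=True)
    # 2. Ask nfsd to drop the state it still holds on this file system.
    release_nfsd_state(mount_point, control_file)
    # 3. Nothing in nfsd should now pin the file system.
    run(["umount", mount_point], check=True)
```

On a real server this might be called as retire_filesystem("*:/srv/data", "/srv/data"). Per Neil's caveat about the NFSv3 file cache, a short pause before the umount step may still be needed when v3 clients are involved, and the sketch does not attempt one.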
* Re: NFSD automatically releases all states when underlying file system is unmounted
  2025-03-19 21:46 ` NeilBrown
  2025-03-19 22:12   ` Dai Ngo
@ 2025-03-20 17:53   ` Chuck Lever
  2025-03-21 14:36     ` Benjamin Coddington
  1 sibling, 1 reply; 20+ messages in thread
From: Chuck Lever @ 2025-03-20 17:53 UTC (permalink / raw)
  To: NeilBrown
  Cc: Jeff Layton, Dai Ngo, Olga Kornievskaia, Tom Talpey,
      Linux NFS Mailing List

On 3/19/25 5:46 PM, NeilBrown wrote:
> On Thu, 20 Mar 2025, Dai Ngo wrote:
>> Hi,
>>
>> Currently when the local file system needs to be unmounted for maintenance
>> the admin needs to make sure all the NFS clients have stopped using any files
>> on the NFS shares before the umount(8) can succeed.
>
> This is easily achieved with
>   echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
>
> Do this after unexporting and before unmounting.

Seems like administrators would expect that a filesystem can be
unmounted immediately after unexporting it. Should "exportfs" be changed
to handle this extra step under the covers? Doesn't seem like it would
be hard to do, and I can't think of a use case where it would be
harmful.

> All state for NFSv4 exports, and all NLM locks for NFSv2/3 exports, will
> be invalidated and files closed. NFSv4 clients will get
> NFS4ERR_ADMIN_REVOKED when they attempt to use any state that was on
> that filesystem.

I'm wondering if this mechanism also flushes courtesy client state for
the file system that is about to be unexported... it should, if it does
not already take care of that.

> (I don't think this flushes the NFSv3 file cache, so a short delay might
> be needed before the unmount when v3 is used. That should be fixed)
>
> NeilBrown
>
>
>>
>> In an environment where there are thousands of clients this manual process
>> seems almost impossible or impractical. The only option available now is to
>> restart the NFS server which would works since the NFS client can recover its
>> state but it seems like this is a big hammer approach.
>>
>> Ideally, when the umount command is run there is a callback from the VFS layer
>> to notify the upper protocols; NFS and SMB, to release its states on this file
>> system for the umount to complete.
>>
>> Is there any existing mechanism to allow NFSD to release its states automatically
>> on unmount?
>>
>> Unmount is not a frequent operation. Is it justifiable to add a bunch of complex
>> code for something is not frequently needed?
>>
>> I appreciate any opinions on this issue.
>>
>> Thanks,
>> -Dai
>>
>

--
Chuck Lever

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NFSD automatically releases all states when underlying file system is unmounted
  2025-03-20 17:53   ` Chuck Lever
@ 2025-03-21 14:36     ` Benjamin Coddington
  2025-03-21 14:43       ` Jeff Layton
  2025-03-21 14:44       ` Chuck Lever
  0 siblings, 2 replies; 20+ messages in thread
From: Benjamin Coddington @ 2025-03-21 14:36 UTC (permalink / raw)
  To: Chuck Lever
  Cc: NeilBrown, Jeff Layton, Dai Ngo, Olga Kornievskaia, Tom Talpey,
      Linux NFS Mailing List

On 20 Mar 2025, at 13:53, Chuck Lever wrote:

> On 3/19/25 5:46 PM, NeilBrown wrote:
>> On Thu, 20 Mar 2025, Dai Ngo wrote:
>>> Hi,
>>>
>>> Currently when the local file system needs to be unmounted for maintenance
>>> the admin needs to make sure all the NFS clients have stopped using any files
>>> on the NFS shares before the umount(8) can succeed.
>>
>> This is easily achieved with
>>   echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
>>
>> Do this after unexporting and before unmounting.
>
> Seems like administrators would expect that a filesystem can be
> unmounted immediately after unexporting it. Should "exportfs" be changed
> to handle this extra step under the covers? Doesn't seem like it would
> be hard to do, and I can't think of a use case where it would be
> harmful.

No. I think that admins don't expect to lose all their NFS clients' state if
they're managing the exports. That would be a really big and invisible change
to existing behavior.

Ben

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NFSD automatically releases all states when underlying file system is unmounted
  2025-03-21 14:36     ` Benjamin Coddington
@ 2025-03-21 14:43       ` Jeff Layton
  2025-03-21 15:07         ` Benjamin Coddington
  2025-03-21 14:44       ` Chuck Lever
  1 sibling, 1 reply; 20+ messages in thread
From: Jeff Layton @ 2025-03-21 14:43 UTC (permalink / raw)
  To: Benjamin Coddington, Chuck Lever
  Cc: NeilBrown, Dai Ngo, Olga Kornievskaia, Tom Talpey,
      Linux NFS Mailing List

On Fri, 2025-03-21 at 10:36 -0400, Benjamin Coddington wrote:
> On 20 Mar 2025, at 13:53, Chuck Lever wrote:
>
> > On 3/19/25 5:46 PM, NeilBrown wrote:
> > > On Thu, 20 Mar 2025, Dai Ngo wrote:
> > > > Hi,
> > > >
> > > > Currently when the local file system needs to be unmounted for maintenance
> > > > the admin needs to make sure all the NFS clients have stopped using any files
> > > > on the NFS shares before the umount(8) can succeed.
> > >
> > > This is easily achieved with
> > >   echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
> > >
> > > Do this after unexporting and before unmounting.
> >
> > Seems like administrators would expect that a filesystem can be
> > unmounted immediately after unexporting it. Should "exportfs" be changed
> > to handle this extra step under the covers? Doesn't seem like it would
> > be hard to do, and I can't think of a use case where it would be
> > harmful.
>
> No. I think that admins don't expect to lose all their NFS client's state if
> they're managing the exports. That would be a really big and invisible change
> to existing behavior.
>

If we're unexporting the filesystem though, then ISTM that we ought to
cancel any state that was held on it. Are you concerned about the admin
inadvertently unexporting something, or is there another use-case you're
worried about?

--
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NFSD automatically releases all states when underlying file system is unmounted 2025-03-21 14:43 ` Jeff Layton @ 2025-03-21 15:07 ` Benjamin Coddington 2025-03-21 15:18 ` Chuck Lever 0 siblings, 1 reply; 20+ messages in thread From: Benjamin Coddington @ 2025-03-21 15:07 UTC (permalink / raw) To: Jeff Layton Cc: Chuck Lever, NeilBrown, Dai Ngo, Olga Kornievskaia, Tom Talpey, Linux NFS Mailing List On 21 Mar 2025, at 10:43, Jeff Layton wrote: > On Fri, 2025-03-21 at 10:36 -0400, Benjamin Coddington wrote: >> On 20 Mar 2025, at 13:53, Chuck Lever wrote: >> >>> On 3/19/25 5:46 PM, NeilBrown wrote: >>>> On Thu, 20 Mar 2025, Dai Ngo wrote: >>>>> Hi, >>>>> >>>>> Currently when the local file system needs to be unmounted for maintenance >>>>> the admin needs to make sure all the NFS clients have stopped using any files >>>>> on the NFS shares before the umount(8) can succeed. >>>> >>>> This is easily achieved with >>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem >>>> >>>> Do this after unexporting and before unmounting. >>> >>> Seems like administrators would expect that a filesystem can be >>> unmounted immediately after unexporting it. Should "exportfs" be changed >>> to handle this extra step under the covers? Doesn't seem like it would >>> be hard to do, and I can't think of a use case where it would be >>> harmful. >> >> No. I think that admins don't expect to lose all their NFS client's state if >> they're managing the exports. That would be a really big and invisible change >> to existing behavior. >> > > If we're unexporting the filesystem though, then ISTM like we ought to > cancel any state that was held on it. Are you concerned the admin > inadvertently unexporting something or is there another use-case you're > worried about? I'm worried about changing existing behavior and the fallout, today I can un-export and re-export all day long, and as long as I re-export the filesystem the applications on those clients are unaffected. 
I'm an old sysadmin that knows that I can un-export and re-export stuff and not have to worry about state loss. There have to be existing systems and people that also have that knowledge built in by now. If we change this, we break things. Ben ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NFSD automatically releases all states when underlying file system is unmounted 2025-03-21 15:07 ` Benjamin Coddington @ 2025-03-21 15:18 ` Chuck Lever 2025-03-21 15:51 ` Benjamin Coddington 0 siblings, 1 reply; 20+ messages in thread From: Chuck Lever @ 2025-03-21 15:18 UTC (permalink / raw) To: Benjamin Coddington, Jeff Layton Cc: NeilBrown, Dai Ngo, Olga Kornievskaia, Tom Talpey, Linux NFS Mailing List On 3/21/25 11:07 AM, Benjamin Coddington wrote: > On 21 Mar 2025, at 10:43, Jeff Layton wrote: > >> On Fri, 2025-03-21 at 10:36 -0400, Benjamin Coddington wrote: >>> On 20 Mar 2025, at 13:53, Chuck Lever wrote: >>> >>>> On 3/19/25 5:46 PM, NeilBrown wrote: >>>>> On Thu, 20 Mar 2025, Dai Ngo wrote: >>>>>> Hi, >>>>>> >>>>>> Currently when the local file system needs to be unmounted for maintenance >>>>>> the admin needs to make sure all the NFS clients have stopped using any files >>>>>> on the NFS shares before the umount(8) can succeed. >>>>> >>>>> This is easily achieved with >>>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem >>>>> >>>>> Do this after unexporting and before unmounting. >>>> >>>> Seems like administrators would expect that a filesystem can be >>>> unmounted immediately after unexporting it. Should "exportfs" be changed >>>> to handle this extra step under the covers? Doesn't seem like it would >>>> be hard to do, and I can't think of a use case where it would be >>>> harmful. >>> >>> No. I think that admins don't expect to lose all their NFS client's state if >>> they're managing the exports. That would be a really big and invisible change >>> to existing behavior. >>> >> >> If we're unexporting the filesystem though, then ISTM like we ought to >> cancel any state that was held on it. Are you concerned the admin >> inadvertently unexporting something or is there another use-case you're >> worried about? 
> > I'm worried about changing existing behavior and the fallout, today I can > un-export and re-export all day long, and as long as I re-export the > filesystem the applications on those clients are unaffected. > > I'm an old sysadmin that knows that I can un-export and re-export stuff and > not have to worry about state loss. Is it documented that you can rely on that? If not, then I'd say old sysadmins should expect that behavior can be changed. 2-cents. Also, as a sysadmin, I would never unexport and expect there to be no consequences. Running apps that try to open a file on a recently unexported share /will/ get ESTALE -- NFSv3 holds no open state at all, so the next NFS READ on that share will fail with EIO. So unexport is already not without some consequences. IMO it's not sensible to expect an unexport / re-export cycle will be safe under all circumstances. > There have to be existing systems and > people that also have that knowledge built in by now. If we change this, we > break things. No lies detected. ;-) Another reality test is to audit other server implementations. I can ask around. -- Chuck Lever ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NFSD automatically releases all states when underlying file system is unmounted 2025-03-21 15:18 ` Chuck Lever @ 2025-03-21 15:51 ` Benjamin Coddington 0 siblings, 0 replies; 20+ messages in thread From: Benjamin Coddington @ 2025-03-21 15:51 UTC (permalink / raw) To: Chuck Lever Cc: Jeff Layton, NeilBrown, Dai Ngo, Olga Kornievskaia, Tom Talpey, Linux NFS Mailing List On 21 Mar 2025, at 11:18, Chuck Lever wrote: > On 3/21/25 11:07 AM, Benjamin Coddington wrote: >> On 21 Mar 2025, at 10:43, Jeff Layton wrote: >> >>> On Fri, 2025-03-21 at 10:36 -0400, Benjamin Coddington wrote: >>>> On 20 Mar 2025, at 13:53, Chuck Lever wrote: >>>> >>>>> On 3/19/25 5:46 PM, NeilBrown wrote: >>>>>> On Thu, 20 Mar 2025, Dai Ngo wrote: >>>>>>> Hi, >>>>>>> >>>>>>> Currently when the local file system needs to be unmounted for maintenance >>>>>>> the admin needs to make sure all the NFS clients have stopped using any files >>>>>>> on the NFS shares before the umount(8) can succeed. >>>>>> >>>>>> This is easily achieved with >>>>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem >>>>>> >>>>>> Do this after unexporting and before unmounting. >>>>> >>>>> Seems like administrators would expect that a filesystem can be >>>>> unmounted immediately after unexporting it. Should "exportfs" be changed >>>>> to handle this extra step under the covers? Doesn't seem like it would >>>>> be hard to do, and I can't think of a use case where it would be >>>>> harmful. >>>> >>>> No. I think that admins don't expect to lose all their NFS client's state if >>>> they're managing the exports. That would be a really big and invisible change >>>> to existing behavior. >>>> >>> >>> If we're unexporting the filesystem though, then ISTM like we ought to >>> cancel any state that was held on it. Are you concerned the admin >>> inadvertently unexporting something or is there another use-case you're >>> worried about? 
>> >> I'm worried about changing existing behavior and the fallout, today I can >> un-export and re-export all day long, and as long as I re-export the >> filesystem the applications on those clients are unaffected. >> >> I'm an old sysadmin that knows that I can un-export and re-export stuff and >> not have to worry about state loss. > > Is it documented that you can rely on that? If not, then I'd say old > sysadmins should expect that behavior can be changed. 2-cents. No, I don't know any place it's documented. It's the consequences of this change I'm worried about, not our ability to say "you should have expected this!" > Also, as a sysadmin, I would never unexport and expect there to be no > consequences. Running apps that try to open a file on a recently > unexported share /will/ get ESTALE -- NFSv3 holds no open state at > all, so the next NFS READ on that share will fail with EIO. > > So unexport is already not without some consequences. IMO it's not > sensible to expect an unexport / re-export cycle will be safe under all > circumstances. This is true. >> There have to be existing systems and >> people that also have that knowledge built in by now. If we change this, we >> break things. > > No lies detected. ;-) > > Another reality test is to audit other server implementations. I can ask > around. Thanks for taking my worries seriously. Since I'm working on a distro, I'm sensitive to how many folks might get upset when an upgrade breaks things. Ben ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NFSD automatically releases all states when underlying file system is unmounted 2025-03-21 14:36 ` Benjamin Coddington 2025-03-21 14:43 ` Jeff Layton @ 2025-03-21 14:44 ` Chuck Lever 2025-03-26 0:23 ` NeilBrown 1 sibling, 1 reply; 20+ messages in thread From: Chuck Lever @ 2025-03-21 14:44 UTC (permalink / raw) To: Benjamin Coddington Cc: NeilBrown, Jeff Layton, Dai Ngo, Olga Kornievskaia, Tom Talpey, Linux NFS Mailing List On 3/21/25 10:36 AM, Benjamin Coddington wrote: > On 20 Mar 2025, at 13:53, Chuck Lever wrote: > >> On 3/19/25 5:46 PM, NeilBrown wrote: >>> On Thu, 20 Mar 2025, Dai Ngo wrote: >>>> Hi, >>>> >>>> Currently when the local file system needs to be unmounted for maintenance >>>> the admin needs to make sure all the NFS clients have stopped using any files >>>> on the NFS shares before the umount(8) can succeed. >>> >>> This is easily achieved with >>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem >>> >>> Do this after unexporting and before unmounting. >> >> Seems like administrators would expect that a filesystem can be >> unmounted immediately after unexporting it. Should "exportfs" be changed >> to handle this extra step under the covers? Doesn't seem like it would >> be hard to do, and I can't think of a use case where it would be >> harmful. > > No. I think that admins don't expect to lose all their NFS client's state if > they're managing the exports. That would be a really big and invisible change > to existing behavior. To be clear, I mean that a file system should be unlocked only when it is specifically unexported. IMO, unexport is usually an administrator action that means "I want to stop remote access to this file system now" and that's what unlock_filesystem does. IMO administrators would be surprised to learn that NFS clients may continue to access a file system (via existing open files) after it has been explicitly unexported. The alternative is to document unlock_filesystem in man exportfs(8). 
And perhaps we need a more surgical mechanism that can handle the case where the file system is still exported but the security policy has changed. Because this does feel like a real information leak. -- Chuck Lever
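For readers following along, the ordering discussed in this message (unexport, then write the path to unlock_filesystem, then unmount) can be sketched as below. This is only an illustration, not anything proposed on the list: the helper names unmount_steps and execute, the client name, and the export path are all hypothetical, and the sketch assumes "exportfs -u client:/path" is the form used to unexport. It builds the steps as data so they can be inspected before anything is run as root.

```python
import subprocess
from pathlib import Path

# Control file named in the thread; writing a path to it releases NFSD state.
UNLOCK_FILESYSTEM = Path("/proc/fs/nfsd/unlock_filesystem")


def unmount_steps(client, path):
    """Return the ordered steps as data, so they can be reviewed before running."""
    return [
        ("run", ["exportfs", "-u", f"{client}:{path}"]),  # 1. unexport
        # 2. release state; the trailing newline mirrors what echo writes
        ("write", str(UNLOCK_FILESYSTEM), path + "\n"),
        ("run", ["umount", path]),                        # 3. unmount
    ]


def execute(steps):
    """Carry out the steps in order, stopping at the first failure (needs root)."""
    for step in steps:
        if step[0] == "run":
            subprocess.run(step[1], check=True)
        else:
            _, target, data = step
            Path(target).write_text(data)


if __name__ == "__main__":
    # Hypothetical client and export path; prints the plan without executing it.
    for step in unmount_steps("client.example.com", "/export/data"):
        print(step)
```

Keeping the plan separate from execute() makes the point of the thread visible: the middle step is a deliberate admin action, not a side effect of the first one.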
* Re: NFSD automatically releases all states when underlying file system is unmounted 2025-03-21 14:44 ` Chuck Lever @ 2025-03-26 0:23 ` NeilBrown 2025-03-26 3:20 ` Dai Ngo 0 siblings, 1 reply; 20+ messages in thread From: NeilBrown @ 2025-03-26 0:23 UTC (permalink / raw) To: Chuck Lever Cc: Benjamin Coddington, Jeff Layton, Dai Ngo, Olga Kornievskaia, Tom Talpey, Linux NFS Mailing List

On Sat, 22 Mar 2025, Chuck Lever wrote:
> On 3/21/25 10:36 AM, Benjamin Coddington wrote:
> > On 20 Mar 2025, at 13:53, Chuck Lever wrote:
> >
> >> On 3/19/25 5:46 PM, NeilBrown wrote:
> >>> On Thu, 20 Mar 2025, Dai Ngo wrote:
> >>>> Hi,
> >>>>
> >>>> Currently when the local file system needs to be unmounted for maintenance
> >>>> the admin needs to make sure all the NFS clients have stopped using any files
> >>>> on the NFS shares before the umount(8) can succeed.
> >>>
> >>> This is easily achieved with
> >>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem
> >>>
> >>> Do this after unexporting and before unmounting.
> >>
> >> Seems like administrators would expect that a filesystem can be
> >> unmounted immediately after unexporting it. Should "exportfs" be changed
> >> to handle this extra step under the covers? Doesn't seem like it would
> >> be hard to do, and I can't think of a use case where it would be
> >> harmful.
> >
> > No. I think that admins don't expect to lose all their NFS client's state if
> > they're managing the exports. That would be a really big and invisible change
> > to existing behavior.
>
> To be clear, I mean that a file system should be unlocked only when it
> is specifically unexported. IMO, unexport is usually an administrator
> action that means "I want to stop remote access to this file system now"
> and that's what unlock_filesystem does.

A problem with that position is that "unexport" isn't a well-defined
operation.
It is quite possible to edit /etc/exports then run "exportfs -r". This
may implicitly unexport things.

The kernel certainly doesn't have a concept of "unexport". You can run
"exportfs -f" at any time quite safely. That tells the kernel to forget
all export information, but allows the kernel to ask mountd for anything
it finds that it needs.

>
> IMO administrators would be surprised to learn that NFS clients may
> continue to access a file system (via existing open files) after it
> has been explicitly unexported.

They can't access those files while it remains unexported. But if it is
re-exported, the access they had can continue seamlessly.

The original model is NLM which is separate from NFS. Unexporting to NFS
doesn't close the locks held by NLM. That can be done separately by the
client with a STATMON request. In fact NLM never drops locks unless
explicitly asked to by the client or forced by the server admin. So it
isn't a good model, but it is what we had.

>
> The alternative is to document unlock_filesystem in man exportfs(8).

Another alternative is to provide new functionality in exportfs. Maybe
a --force flag or a --close-all flag.
It could examine /proc/fs/nfsd/clients/*/states to determine which
filesystems had active state, then examine the export tables
(/var/lib/nfs/etab) to see what was currently exported, then write
something appropriate to unlock_filesystem for any active filesystems
which are no longer exported.

If we did that we would want to find NLM locks in /proc/locks too and
ensure those were discarded if necessary.

There is also the possibility that a filesystem is still exported to
some clients but not to all. In that case writing something to
unlock_ip might be appropriate - though that doesn't revoke v4 state
yet.

Thanks,
NeilBrown

>
> And perhaps we need a more surgical mechanism that can handle the case
> where the file system is still exported but the security policy has
> changed. Because this does feel like a real information leak.
>
>
> --
> Chuck Lever
>
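The comparison described above (state listed under /proc/fs/nfsd/clients/*/states versus what /var/lib/nfs/etab says is exported) might look roughly like the sketch below. It is not an existing exportfs feature. It rests on two assumptions that should be checked against a real server: that each state entry carries a field of the form superblock: "MAJOR:MINOR:INODE" with the device numbers in hex, and that each etab line starts with the exported path followed by whitespace. All function names are hypothetical. NLM locks in /proc/locks are not handled, and nothing is written to unlock_filesystem.

```python
import os
import re

# Assumed layout of one field in a clients/*/states entry:
#   superblock: "fd:10:13649"   (major:minor in hex, then the inode number)
SUPERBLOCK_RE = re.compile(r'superblock: "([0-9a-fA-F]+):([0-9a-fA-F]+):\d+"')


def devices_with_state(states_text):
    """Collect (major, minor) pairs for every state entry in the given text."""
    return {(int(major, 16), int(minor, 16))
            for major, minor in SUPERBLOCK_RE.findall(states_text)}


def exported_paths(etab_text):
    """Return the export paths, assumed to be the first field of each etab line."""
    paths = set()
    for line in etab_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            paths.add(line.split()[0])
    return paths


def path_device(path):
    """Map a local path to the (major, minor) of the filesystem holding it."""
    dev = os.stat(path).st_dev
    return (os.major(dev), os.minor(dev))


def unexported_devices_with_state(states_text, etab_text, stat_device=path_device):
    """Devices that still hold NFSD state but back no currently exported path."""
    exported = set()
    for path in exported_paths(etab_text):
        try:
            exported.add(stat_device(path))
        except OSError:
            pass  # export path missing locally; ignore it for this comparison
    return devices_with_state(states_text) - exported
```

The result is a set of device numbers rather than paths. Mapping those back to something that can be written to unlock_filesystem (for example via the mount table) is left out, as is the per-client case where unlock_ip would be the relevant file.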
* Re: NFSD automatically releases all states when underlying file system is unmounted 2025-03-26 0:23 ` NeilBrown @ 2025-03-26 3:20 ` Dai Ngo 2025-03-26 3:41 ` NeilBrown 0 siblings, 1 reply; 20+ messages in thread From: Dai Ngo @ 2025-03-26 3:20 UTC (permalink / raw) To: NeilBrown, Chuck Lever Cc: Benjamin Coddington, Jeff Layton, Olga Kornievskaia, Tom Talpey, Linux NFS Mailing List On 3/25/25 5:23 PM, NeilBrown wrote: > On Sat, 22 Mar 2025, Chuck Lever wrote: >> On 3/21/25 10:36 AM, Benjamin Coddington wrote: >>> On 20 Mar 2025, at 13:53, Chuck Lever wrote: >>> >>>> On 3/19/25 5:46 PM, NeilBrown wrote: >>>>> On Thu, 20 Mar 2025, Dai Ngo wrote: >>>>>> Hi, >>>>>> >>>>>> Currently when the local file system needs to be unmounted for maintenance >>>>>> the admin needs to make sure all the NFS clients have stopped using any files >>>>>> on the NFS shares before the umount(8) can succeed. >>>>> This is easily achieved with >>>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem >>>>> >>>>> Do this after unexporting and before unmounting. >>>> Seems like administrators would expect that a filesystem can be >>>> unmounted immediately after unexporting it. Should "exportfs" be changed >>>> to handle this extra step under the covers? Doesn't seem like it would >>>> be hard to do, and I can't think of a use case where it would be >>>> harmful. >>> No. I think that admins don't expect to lose all their NFS client's state if >>> they're managing the exports. That would be a really big and invisible change >>> to existing behavior. >> To be clear, I mean that a file system should be unlocked only when it >> is specifically unexported. IMO, unexport is usually an administrator >> action that means "I want to stop remote access to this file system now" >> and that's what unlock_filesystem does. > A problem with that position is that "unexport" isn't a well defined > operation. > It is quite possible to edit /etc/exports then run "exportfs -r". 
This > may implicit unexport things. > > The kernel certainly doesn't have a concept of "unexport". You can run > "exportfs -f" at any time quite safely. That tells the kernel to forget > all export information, but allows the kernel to ask mountd for anything > it find that it needs. > >> IMO administrators would be surprised to learn that NFS clients may >> continue to access a file system (via existing open files) after it >> has been explicitly unexported. > They can't access those file while it remains unexported. But if it is > re-exported, the access they had can continue seamlessly. > > The origin model is NLM which is separate from NFS. Unexporting to NFS > doesn't close the locks held by NLM. That can be done separately by the > client with a STATMON request. In fact NLM never drops locks unless > explicitly asked to by the client or forced by the server admin. So it > isn't a good model, but it is what we had. > >> The alternative is to document unlock_filesystem in man exportfs(8). > Another alternative is to provide new functionality in exportfs. Maybe > a --force flag or a --close-all flag. > It could examine /proc/fs/nfsd/clients/*/states to determine which > filesystems had active state, then examine the export tables > (/var/lib/nfs/etab) to see what was currently exported, then write > something appropriate to unlock_filesystem for any active filesystems > which are no longer exported. Is it possible that at the time of cache_clean/svc_export_put the kernel makes an upcall to rpc.mountd to check if svc_export.ex_path is still exported?. If it's not then release all the states that use that super_block. -Dai > > If we did that we would want to find NLM locks in /proc/locks too and > ensure those were discarded if necessary. > > There is also the possibility that a filesystem is still exported to > some clients but not to all. In that case writing something to > unlock_ip might be appropriate - though that doesn't revoke v4 state > yet. 
> > Thanks, > NeilBrown > > >> And perhaps we need a more surgical mechanism that can handle the case >> where the file system is still exported but the security policy has >> changed. Because this does feel like a real information leak. >> >> >> -- >> Chuck Lever >> ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NFSD automatically releases all states when underlying file system is unmounted 2025-03-26 3:20 ` Dai Ngo @ 2025-03-26 3:41 ` NeilBrown 2025-03-26 13:15 ` Chuck Lever 2025-04-09 21:00 ` Dai Ngo 0 siblings, 2 replies; 20+ messages in thread From: NeilBrown @ 2025-03-26 3:41 UTC (permalink / raw) To: Dai Ngo Cc: Chuck Lever, Benjamin Coddington, Jeff Layton, Olga Kornievskaia, Tom Talpey, Linux NFS Mailing List On Wed, 26 Mar 2025, Dai Ngo wrote: > On 3/25/25 5:23 PM, NeilBrown wrote: > > On Sat, 22 Mar 2025, Chuck Lever wrote: > >> On 3/21/25 10:36 AM, Benjamin Coddington wrote: > >>> On 20 Mar 2025, at 13:53, Chuck Lever wrote: > >>> > >>>> On 3/19/25 5:46 PM, NeilBrown wrote: > >>>>> On Thu, 20 Mar 2025, Dai Ngo wrote: > >>>>>> Hi, > >>>>>> > >>>>>> Currently when the local file system needs to be unmounted for maintenance > >>>>>> the admin needs to make sure all the NFS clients have stopped using any files > >>>>>> on the NFS shares before the umount(8) can succeed. > >>>>> This is easily achieved with > >>>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem > >>>>> > >>>>> Do this after unexporting and before unmounting. > >>>> Seems like administrators would expect that a filesystem can be > >>>> unmounted immediately after unexporting it. Should "exportfs" be changed > >>>> to handle this extra step under the covers? Doesn't seem like it would > >>>> be hard to do, and I can't think of a use case where it would be > >>>> harmful. > >>> No. I think that admins don't expect to lose all their NFS client's state if > >>> they're managing the exports. That would be a really big and invisible change > >>> to existing behavior. > >> To be clear, I mean that a file system should be unlocked only when it > >> is specifically unexported. IMO, unexport is usually an administrator > >> action that means "I want to stop remote access to this file system now" > >> and that's what unlock_filesystem does. 
> > A problem with that position is that "unexport" isn't a well defined > > operation. > > It is quite possible to edit /etc/exports then run "exportfs -r". This > > may implicit unexport things. > > > > The kernel certainly doesn't have a concept of "unexport". You can run > > "exportfs -f" at any time quite safely. That tells the kernel to forget > > all export information, but allows the kernel to ask mountd for anything > > it find that it needs. > > > >> IMO administrators would be surprised to learn that NFS clients may > >> continue to access a file system (via existing open files) after it > >> has been explicitly unexported. > > They can't access those file while it remains unexported. But if it is > > re-exported, the access they had can continue seamlessly. > > > > The origin model is NLM which is separate from NFS. Unexporting to NFS > > doesn't close the locks held by NLM. That can be done separately by the > > client with a STATMON request. In fact NLM never drops locks unless > > explicitly asked to by the client or forced by the server admin. So it > > isn't a good model, but it is what we had. > > > >> The alternative is to document unlock_filesystem in man exportfs(8). > > Another alternative is to provide new functionality in exportfs. Maybe > > a --force flag or a --close-all flag. > > It could examine /proc/fs/nfsd/clients/*/states to determine which > > filesystems had active state, then examine the export tables > > (/var/lib/nfs/etab) to see what was currently exported, then write > > something appropriate to unlock_filesystem for any active filesystems > > which are no longer exported. > > Is it possible that at the time of cache_clean/svc_export_put the kernel > makes an upcall to rpc.mountd to check if svc_export.ex_path is still > exported?. If it's not then release all the states that use that super_block. I suspect that could be done, but then you would hit Ben's concern. 
Temporarily unexporting a filesystem would change from the client getting
ESTALE if it happens to access a file while the filesystem is not
exported, to the client definitely getting ADMIN_REVOKED (probably -EIO)
the next time it accesses a file, even if the filesystem has been
exported again.

I agree with Ben that there needs to be a deliberate admin action to
revoke state, not just a side-effect of unexport which historically has
not revoked state.

NeilBrown

>
> -Dai
>
> >
> > If we did that we would want to find NLM locks in /proc/locks too and
> > ensure those were discarded if necessary.
> >
> > There is also the possibility that a filesystem is still exported to
> > some clients but not to all. In that case writing something to
> > unlock_ip might be appropriate - though that doesn't revoke v4 state
> > yet.
> >
> > Thanks,
> > NeilBrown
> >
> >
> >> And perhaps we need a more surgical mechanism that can handle the case
> >> where the file system is still exported but the security policy has
> >> changed. Because this does feel like a real information leak.
> >>
> >>
> >> --
> >> Chuck Lever
> >>
>
* Re: NFSD automatically releases all states when underlying file system is unmounted 2025-03-26 3:41 ` NeilBrown @ 2025-03-26 13:15 ` Chuck Lever 2025-03-27 22:47 ` NeilBrown 2025-04-09 21:00 ` Dai Ngo 1 sibling, 1 reply; 20+ messages in thread From: Chuck Lever @ 2025-03-26 13:15 UTC (permalink / raw) To: NeilBrown, Dai Ngo Cc: Benjamin Coddington, Jeff Layton, Olga Kornievskaia, Tom Talpey, Linux NFS Mailing List On 3/25/25 11:41 PM, NeilBrown wrote: > On Wed, 26 Mar 2025, Dai Ngo wrote: >> On 3/25/25 5:23 PM, NeilBrown wrote: >>> On Sat, 22 Mar 2025, Chuck Lever wrote: >>>> On 3/21/25 10:36 AM, Benjamin Coddington wrote: >>>>> On 20 Mar 2025, at 13:53, Chuck Lever wrote: >>>>> >>>>>> On 3/19/25 5:46 PM, NeilBrown wrote: >>>>>>> On Thu, 20 Mar 2025, Dai Ngo wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> Currently when the local file system needs to be unmounted for maintenance >>>>>>>> the admin needs to make sure all the NFS clients have stopped using any files >>>>>>>> on the NFS shares before the umount(8) can succeed. >>>>>>> This is easily achieved with >>>>>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem >>>>>>> >>>>>>> Do this after unexporting and before unmounting. >>>>>> Seems like administrators would expect that a filesystem can be >>>>>> unmounted immediately after unexporting it. Should "exportfs" be changed >>>>>> to handle this extra step under the covers? Doesn't seem like it would >>>>>> be hard to do, and I can't think of a use case where it would be >>>>>> harmful. >>>>> No. I think that admins don't expect to lose all their NFS client's state if >>>>> they're managing the exports. That would be a really big and invisible change >>>>> to existing behavior. >>>> To be clear, I mean that a file system should be unlocked only when it >>>> is specifically unexported. IMO, unexport is usually an administrator >>>> action that means "I want to stop remote access to this file system now" >>>> and that's what unlock_filesystem does. 
>>> A problem with that position is that "unexport" isn't a well defined >>> operation. >>> It is quite possible to edit /etc/exports then run "exportfs -r". This >>> may implicit unexport things. >>> >>> The kernel certainly doesn't have a concept of "unexport". You can run >>> "exportfs -f" at any time quite safely. That tells the kernel to forget >>> all export information, but allows the kernel to ask mountd for anything >>> it find that it needs. >>> >>>> IMO administrators would be surprised to learn that NFS clients may >>>> continue to access a file system (via existing open files) after it >>>> has been explicitly unexported. >>> They can't access those file while it remains unexported. But if it is >>> re-exported, the access they had can continue seamlessly. >>> >>> The origin model is NLM which is separate from NFS. Unexporting to NFS >>> doesn't close the locks held by NLM. That can be done separately by the >>> client with a STATMON request. In fact NLM never drops locks unless >>> explicitly asked to by the client or forced by the server admin. So it >>> isn't a good model, but it is what we had. >>> >>>> The alternative is to document unlock_filesystem in man exportfs(8). >>> Another alternative is to provide new functionality in exportfs. Maybe >>> a --force flag or a --close-all flag. >>> It could examine /proc/fs/nfsd/clients/*/states to determine which >>> filesystems had active state, then examine the export tables >>> (/var/lib/nfs/etab) to see what was currently exported, then write >>> something appropriate to unlock_filesystem for any active filesystems >>> which are no longer exported. >> >> Is it possible that at the time of cache_clean/svc_export_put the kernel >> makes an upcall to rpc.mountd to check if svc_export.ex_path is still >> exported?. If it's not then release all the states that use that super_block. > > I suspect that could be done, but then you would hit Ben's concern. 
> Temporarily unexported a filesystem would change from the client getting > ESTALE if it happens to access a file while the filesystem is not > exported, to the client definitely getting ADMIN_REVOKED (probably -EIO) > then next time it accesses a file even if the filesystem has been > exported again. > > I agree with Ben that there needs to be a deliberate admin action to > revoke state, not just a side-effect of unexport which historically has > not revoked state. I'm not religiously attached to expunging open/lock state on a simple unexport operation. But I think it is critical to document the fact that NFSv4 state remains and that will prevent an unmount (I'm not sure we've identified any possible security exposures). Neil, do you happen to know if unlock_filesystem and unlock_ip are mentioned in man pages? If so, then exportfs(8) should refer to them. > NeilBrown > > > >> >> -Dai >> >>> >>> If we did that we would want to find NLM locks in /proc/locks too and >>> ensure those were discarded if necessary. >>> >>> There is also the possibility that a filesystem is still exported to >>> some clients but not to all. In that case writing something to >>> unlock_ip might be appropriate - though that doesn't revoke v4 state >>> yet. >>> >>> Thanks, >>> NeilBrown >>> >>> >>>> And perhaps we need a more surgical mechanism that can handle the case >>>> where the file system is still exported but the security policy has >>>> changed. Because this does feel like a real information leak. >>>> >>>> >>>> -- >>>> Chuck Lever >>>> >> > -- Chuck Lever ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NFSD automatically releases all states when underlying file system is unmounted 2025-03-26 13:15 ` Chuck Lever @ 2025-03-27 22:47 ` NeilBrown 0 siblings, 0 replies; 20+ messages in thread From: NeilBrown @ 2025-03-27 22:47 UTC (permalink / raw) To: Chuck Lever Cc: Dai Ngo, Benjamin Coddington, Jeff Layton, Olga Kornievskaia, Tom Talpey, Linux NFS Mailing List On Thu, 27 Mar 2025, Chuck Lever wrote: > On 3/25/25 11:41 PM, NeilBrown wrote: > > On Wed, 26 Mar 2025, Dai Ngo wrote: > >> On 3/25/25 5:23 PM, NeilBrown wrote: > >>> On Sat, 22 Mar 2025, Chuck Lever wrote: > >>>> On 3/21/25 10:36 AM, Benjamin Coddington wrote: > >>>>> On 20 Mar 2025, at 13:53, Chuck Lever wrote: > >>>>> > >>>>>> On 3/19/25 5:46 PM, NeilBrown wrote: > >>>>>>> On Thu, 20 Mar 2025, Dai Ngo wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> Currently when the local file system needs to be unmounted for maintenance > >>>>>>>> the admin needs to make sure all the NFS clients have stopped using any files > >>>>>>>> on the NFS shares before the umount(8) can succeed. > >>>>>>> This is easily achieved with > >>>>>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem > >>>>>>> > >>>>>>> Do this after unexporting and before unmounting. > >>>>>> Seems like administrators would expect that a filesystem can be > >>>>>> unmounted immediately after unexporting it. Should "exportfs" be changed > >>>>>> to handle this extra step under the covers? Doesn't seem like it would > >>>>>> be hard to do, and I can't think of a use case where it would be > >>>>>> harmful. > >>>>> No. I think that admins don't expect to lose all their NFS client's state if > >>>>> they're managing the exports. That would be a really big and invisible change > >>>>> to existing behavior. > >>>> To be clear, I mean that a file system should be unlocked only when it > >>>> is specifically unexported. 
IMO, unexport is usually an administrator > >>>> action that means "I want to stop remote access to this file system now" > >>>> and that's what unlock_filesystem does. > >>> A problem with that position is that "unexport" isn't a well defined > >>> operation. > >>> It is quite possible to edit /etc/exports then run "exportfs -r". This > >>> may implicit unexport things. > >>> > >>> The kernel certainly doesn't have a concept of "unexport". You can run > >>> "exportfs -f" at any time quite safely. That tells the kernel to forget > >>> all export information, but allows the kernel to ask mountd for anything > >>> it find that it needs. > >>> > >>>> IMO administrators would be surprised to learn that NFS clients may > >>>> continue to access a file system (via existing open files) after it > >>>> has been explicitly unexported. > >>> They can't access those file while it remains unexported. But if it is > >>> re-exported, the access they had can continue seamlessly. > >>> > >>> The origin model is NLM which is separate from NFS. Unexporting to NFS > >>> doesn't close the locks held by NLM. That can be done separately by the > >>> client with a STATMON request. In fact NLM never drops locks unless > >>> explicitly asked to by the client or forced by the server admin. So it > >>> isn't a good model, but it is what we had. > >>> > >>>> The alternative is to document unlock_filesystem in man exportfs(8). > >>> Another alternative is to provide new functionality in exportfs. Maybe > >>> a --force flag or a --close-all flag. > >>> It could examine /proc/fs/nfsd/clients/*/states to determine which > >>> filesystems had active state, then examine the export tables > >>> (/var/lib/nfs/etab) to see what was currently exported, then write > >>> something appropriate to unlock_filesystem for any active filesystems > >>> which are no longer exported. 
> >>
> >> Is it possible that at the time of cache_clean/svc_export_put the kernel
> >> makes an upcall to rpc.mountd to check if svc_export.ex_path is still
> >> exported?. If it's not then release all the states that use that super_block.
> >
> > I suspect that could be done, but then you would hit Ben's concern.
> > Temporarily unexporting a filesystem would change from the client getting
> > ESTALE if it happens to access a file while the filesystem is not
> > exported, to the client definitely getting ADMIN_REVOKED (probably -EIO)
> > the next time it accesses a file even if the filesystem has been
> > exported again.
> >
> > I agree with Ben that there needs to be a deliberate admin action to
> > revoke state, not just a side-effect of unexport which historically has
> > not revoked state.
>
> I'm not religiously attached to expunging open/lock state on a simple
> unexport operation. But I think it is critical to document the fact that
> NFSv4 state remains and that will prevent an unmount (I'm not sure we've
> identified any possible security exposures).
>
> Neil, do you happen to know if unlock_filesystem and unlock_ip are
> mentioned in man pages? If so, then exportfs(8) should refer to them.
>

They aren't mentioned in the nfs-utils package at all, or in
linux/Documentation. So no: no documentation.

They should be mentioned in nfsd.7 (utils/exportfs/nfsd.man).
I guess someone should update that man page....

Probably the new netlink interface could be described there too?

NeilBrown
* Re: NFSD automatically releases all states when underlying file system is unmounted 2025-03-26 3:41 ` NeilBrown 2025-03-26 13:15 ` Chuck Lever @ 2025-04-09 21:00 ` Dai Ngo 1 sibling, 0 replies; 20+ messages in thread From: Dai Ngo @ 2025-04-09 21:00 UTC (permalink / raw) To: NeilBrown Cc: Chuck Lever, Benjamin Coddington, Jeff Layton, Olga Kornievskaia, Tom Talpey, Linux NFS Mailing List On 3/25/25 8:41 PM, NeilBrown wrote: > On Wed, 26 Mar 2025, Dai Ngo wrote: >> On 3/25/25 5:23 PM, NeilBrown wrote: >>> On Sat, 22 Mar 2025, Chuck Lever wrote: >>>> On 3/21/25 10:36 AM, Benjamin Coddington wrote: >>>>> On 20 Mar 2025, at 13:53, Chuck Lever wrote: >>>>> >>>>>> On 3/19/25 5:46 PM, NeilBrown wrote: >>>>>>> On Thu, 20 Mar 2025, Dai Ngo wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> Currently when the local file system needs to be unmounted for maintenance >>>>>>>> the admin needs to make sure all the NFS clients have stopped using any files >>>>>>>> on the NFS shares before the umount(8) can succeed. >>>>>>> This is easily achieved with >>>>>>> echo /path/to/filesystem > /proc/fs/nfsd/unlock_filesystem >>>>>>> >>>>>>> Do this after unexporting and before unmounting. >>>>>> Seems like administrators would expect that a filesystem can be >>>>>> unmounted immediately after unexporting it. Should "exportfs" be changed >>>>>> to handle this extra step under the covers? Doesn't seem like it would >>>>>> be hard to do, and I can't think of a use case where it would be >>>>>> harmful. >>>>> No. I think that admins don't expect to lose all their NFS client's state if >>>>> they're managing the exports. That would be a really big and invisible change >>>>> to existing behavior. >>>> To be clear, I mean that a file system should be unlocked only when it >>>> is specifically unexported. IMO, unexport is usually an administrator >>>> action that means "I want to stop remote access to this file system now" >>>> and that's what unlock_filesystem does. 
>>> A problem with that position is that "unexport" isn't a well defined >>> operation. >>> It is quite possible to edit /etc/exports then run "exportfs -r". This >>> may implicit unexport things. >>> >>> The kernel certainly doesn't have a concept of "unexport". You can run >>> "exportfs -f" at any time quite safely. That tells the kernel to forget >>> all export information, but allows the kernel to ask mountd for anything >>> it find that it needs. >>> >>>> IMO administrators would be surprised to learn that NFS clients may >>>> continue to access a file system (via existing open files) after it >>>> has been explicitly unexported. >>> They can't access those file while it remains unexported. But if it is >>> re-exported, the access they had can continue seamlessly. >>> >>> The origin model is NLM which is separate from NFS. Unexporting to NFS >>> doesn't close the locks held by NLM. That can be done separately by the >>> client with a STATMON request. In fact NLM never drops locks unless >>> explicitly asked to by the client or forced by the server admin. So it >>> isn't a good model, but it is what we had. >>> >>>> The alternative is to document unlock_filesystem in man exportfs(8). >>> Another alternative is to provide new functionality in exportfs. Maybe >>> a --force flag or a --close-all flag. >>> It could examine /proc/fs/nfsd/clients/*/states to determine which >>> filesystems had active state, then examine the export tables >>> (/var/lib/nfs/etab) to see what was currently exported, then write >>> something appropriate to unlock_filesystem for any active filesystems >>> which are no longer exported. >> Is it possible that at the time of cache_clean/svc_export_put the kernel >> makes an upcall to rpc.mountd to check if svc_export.ex_path is still >> exported?. If it's not then release all the states that use that super_block. > I suspect that could be done, but then you would hit Ben's concern. 
> Temporarily unexporting a filesystem would change from the client getting
> ESTALE if it happens to access a file while the filesystem is not
> exported, to the client definitely getting ADMIN_REVOKED (probably -EIO)
> the next time it accesses a file even if the filesystem has been
> exported again.
>
> I agree with Ben that there needs to be a deliberate admin action to
> revoke state, not just a side-effect of unexport which historically has
> not revoked state.

Would it be useful to add an option, 'R', to exportfs that also revokes
state when the user runs it with '-au':

# exportfs -Rau

The main purpose of this option is to allow all the underlying file systems
to be unmounted without requiring all clients to unmount the NFS exports first.

Thanks,
-Dai

>
> NeilBrown
>
>
>
>> -Dai
>>
>>> If we did that we would want to find NLM locks in /proc/locks too and
>>> ensure those were discarded if necessary.
>>>
>>> There is also the possibility that a filesystem is still exported to
>>> some clients but not to all. In that case writing something to
>>> unlock_ip might be appropriate - though that doesn't revoke v4 state
>>> yet.
>>>
>>> Thanks,
>>> NeilBrown
>>>
>>>
>>>> And perhaps we need a more surgical mechanism that can handle the case
>>>> where the file system is still exported but the security policy has
>>>> changed. Because this does feel like a real information leak.
>>>>
>>>>
>>>> --
>>>> Chuck Lever
>>>>