* sm notify (nlm) question @ 2024-05-14 20:56 Olga Kornievskaia 2024-05-14 21:08 ` Chuck Lever III 0 siblings, 1 reply; 9+ messages in thread From: Olga Kornievskaia @ 2024-05-14 20:56 UTC (permalink / raw) To: linux-nfs Hi folks, Given that not everything for NFSv3 has a specification, I post a question here (as it concerns the Linux v3 (client) implementation), but I ask a generic question with respect to NOTIFY sent by an NFS server. A NOTIFY message that is sent by an NFS server upon reboot has a monitor name and a state. This "state" is an integer and is modified on each server reboot. My question is: what about state value uniqueness? Is there somewhere some notion that this value has to be unique (as in, say, a random value)? Here's a problem. Say a client has 2 mounts to ip1 and ip2 (both representing the same DNS name) and acquires a lock per mount. Now say each of those servers reboots. Once up, they each send a NOTIFY call and each uses a timestamp as the basis for its "state" value -- which is very likely to produce the same value for 2 servers rebooted at the same time (or for the Linux server, where it looks like a counter). On the client side, once the client processes the 1st NOTIFY call, it updates the "state" for the monitor name (i.e., a client monitors based on a DNS name, which is the same for ip1 and ip2), and then in the current code, because the 2nd NOTIFY has the same "state" value, this NOTIFY call would be ignored. The Linux client would never reclaim the 2nd lock (but the application obviously would never know it's missing a lock) --- data corruption. Who is to blame: is the server not allowed to send a "non-unique" state value? Or is the client at fault here for some reason? I'd appreciate the feedback. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: sm notify (nlm) question 2024-05-14 20:56 sm notify (nlm) question Olga Kornievskaia @ 2024-05-14 21:08 ` Chuck Lever III 2024-05-14 21:21 ` Olga Kornievskaia 2024-05-14 21:36 ` Frank Filz 0 siblings, 2 replies; 9+ messages in thread From: Chuck Lever III @ 2024-05-14 21:08 UTC (permalink / raw) To: Olga Kornievskaia; +Cc: Linux NFS Mailing List > On May 14, 2024, at 2:56 PM, Olga Kornievskaia <aglo@umich.edu> wrote: > > Hi folks, > > Given that not everything for NFSv3 has a specification, I post a > question here (as it concerns linux v3 (client) implementation) but I > ask a generic question with respect to NOTIFY sent by an NFS server. There is a standard: https://pubs.opengroup.org/onlinepubs/9629799/chap11.htm > A NOTIFY message that is sent by an NFS server upon reboot has a monitor > name and a state. This "state" is an integer and is modified on each > server reboot. My question is: what about state value uniqueness? Is > there somewhere some notion that this value has to be unique (as in > say a random value). > > Here's a problem. Say a client has 2 mounts to ip1 and ip2 (both > representing the same DNS name) and acquires a lock per mount. Now say > each of those servers reboot. Once up they each send a NOTIFY call and > each use a timestamp as basis for their "state" value -- which very > likely is to produce the same value for 2 servers rebooted at the same > time (or for the linux server that looks like a counter). On the > client side, once the client processes the 1st NOTIFY call, it updates > the "state" for the monitor name (ie a client monitors based on a DNS > name which is the same for ip1 and ip2) and then in the current code, > because the 2nd NOTIFY has the same "state" value this NOTIFY call > would be ignored. The linux client would never reclaim the 2nd lock > (but the application obviously would never know it's missing a lock) > --- data corruption. > > Who is to blame: is the server not allowed to send "non-unique" state > value? 
Or is the client at fault here for some reason? The state value is supposed to be specific to the monitored host. If the client is indeed ignoring the second reboot notification, that's incorrect behavior, IMO. -- Chuck Lever ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: sm notify (nlm) question 2024-05-14 21:08 ` Chuck Lever III @ 2024-05-14 21:21 ` Olga Kornievskaia 2024-05-14 21:36 ` Frank Filz 1 sibling, 0 replies; 9+ messages in thread From: Olga Kornievskaia @ 2024-05-14 21:21 UTC (permalink / raw) To: Chuck Lever III; +Cc: Linux NFS Mailing List On Tue, May 14, 2024 at 5:09 PM Chuck Lever III <chuck.lever@oracle.com> wrote: > > > > > On May 14, 2024, at 2:56 PM, Olga Kornievskaia <aglo@umich.edu> wrote: > > > > Hi folks, > > > > Given that not everything for NFSv3 has a specification, I post a > > question here (as it concerns linux v3 (client) implementation) but I > > ask a generic question with respect to NOTIFY sent by an NFS server. > > There is a standard: > > https://pubs.opengroup.org/onlinepubs/9629799/chap11.htm Thank you Chuck. This too does not give any limits as to the uniqueness of the state value. > > A NOTIFY message that is sent by an NFS server upon reboot has a monitor > > name and a state. This "state" is an integer and is modified on each > > server reboot. My question is: what about state value uniqueness? Is > > there somewhere some notion that this value has to be unique (as in > > say a random value). > > > > Here's a problem. Say a client has 2 mounts to ip1 and ip2 (both > > representing the same DNS name) and acquires a lock per mount. Now say > > each of those servers reboot. Once up they each send a NOTIFY call and > > each use a timestamp as basis for their "state" value -- which very > > likely is to produce the same value for 2 servers rebooted at the same > > time (or for the linux server that looks like a counter). On the > > client side, once the client processes the 1st NOTIFY call, it updates > > the "state" for the monitor name (ie a client monitors based on a DNS > > name which is the same for ip1 and ip2) and then in the current code, > > because the 2nd NOTIFY has the same "state" value this NOTIFY call > > would be ignored. 
The linux client would never reclaim the 2nd lock > > (but the application obviously would never know it's missing a lock) > > --- data corruption. > > > > Who is to blame: is the server not allowed to send "non-unique" state > > value? Or is the client at fault here for some reason? > > The state value is supposed to be specific to the monitored > host. If the client is indeed ignoring the second reboot > notification, that's incorrect behavior, IMO. State is supposed to help against replays, I think. The client is within its rights to update the state value upon processing a reboot notification. The fact that another sm_notify comes with the same state (and from the same DNS monitor name) suggests it can be a retry, and is thus grounds for ignoring it. > > > -- > Chuck Lever > > ^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: sm notify (nlm) question 2024-05-14 21:08 ` Chuck Lever III 2024-05-14 21:21 ` Olga Kornievskaia @ 2024-05-14 21:36 ` Frank Filz 2024-05-14 21:49 ` Olga Kornievskaia 1 sibling, 1 reply; 9+ messages in thread From: Frank Filz @ 2024-05-14 21:36 UTC (permalink / raw) To: 'Chuck Lever III', 'Olga Kornievskaia' Cc: 'Linux NFS Mailing List' > > On May 14, 2024, at 2:56 PM, Olga Kornievskaia <aglo@umich.edu> wrote: > > > > Hi folks, > > > > Given that not everything for NFSv3 has a specification, I post a > > question here (as it concerns linux v3 (client) implementation) but I > > ask a generic question with respect to NOTIFY sent by an NFS server. > > There is a standard: > > https://pubs.opengroup.org/onlinepubs/9629799/chap11.htm > > > > A NOTIFY message that is sent by an NFS server upon reboot has a > > monitor name and a state. This "state" is an integer and is modified > > on each server reboot. My question is: what about state value > > uniqueness? Is there somewhere some notion that this value has to be > > unique (as in say a random value). > > > > Here's a problem. Say a client has 2 mounts to ip1 and ip2 (both > > representing the same DNS name) and acquires a lock per mount. Now say > > each of those servers reboot. Once up they each send a NOTIFY call and > > each use a timestamp as basis for their "state" value -- which very > > likely is to produce the same value for 2 servers rebooted at the same > > time (or for the linux server that looks like a counter). On the > > client side, once the client processes the 1st NOTIFY call, it updates > > the "state" for the monitor name (ie a client monitors based on a DNS > > name which is the same for ip1 and ip2) and then in the current code, > > because the 2nd NOTIFY has the same "state" value this NOTIFY call > > would be ignored. The linux client would never reclaim the 2nd lock > > (but the application obviously would never know it's missing a lock) > > --- data corruption. 
> > > > Who is to blame: is the server not allowed to send "non-unique" state > > value? Or is the client at fault here for some reason? > > The state value is supposed to be specific to the monitored host. If the client is > indeed ignoring the second reboot notification, that's incorrect behavior, IMO. If you are using multiple server IP addresses with the same DNS name, you may want to set: sysctl fs.nfs.nsm_use_hostnames=0 The NLM will register with statd using the IP address as name instead of host name. Then your two IP addresses will each have a separate monitor entry and state value monitored. Frank ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: sm notify (nlm) question 2024-05-14 21:36 ` Frank Filz @ 2024-05-14 21:49 ` Olga Kornievskaia 2024-05-14 22:13 ` Frank Filz 0 siblings, 1 reply; 9+ messages in thread From: Olga Kornievskaia @ 2024-05-14 21:49 UTC (permalink / raw) To: Frank Filz; +Cc: Chuck Lever III, Linux NFS Mailing List On Tue, May 14, 2024 at 5:36 PM Frank Filz <ffilzlnx@mindspring.com> wrote: > > > > On May 14, 2024, at 2:56 PM, Olga Kornievskaia <aglo@umich.edu> wrote: > > > > > > Hi folks, > > > > > > Given that not everything for NFSv3 has a specification, I post a > > > question here (as it concerns linux v3 (client) implementation) but I > > > ask a generic question with respect to NOTIFY sent by an NFS server. > > > > There is a standard: > > > > https://pubs.opengroup.org/onlinepubs/9629799/chap11.htm > > > > > > > A NOTIFY message that is sent by an NFS server upon reboot has a > > > monitor name and a state. This "state" is an integer and is modified > > > on each server reboot. My question is: what about state value > > > uniqueness? Is there somewhere some notion that this value has to be > > > unique (as in say a random value). > > > > > > Here's a problem. Say a client has 2 mounts to ip1 and ip2 (both > > > representing the same DNS name) and acquires a lock per mount. Now say > > > each of those servers reboot. Once up they each send a NOTIFY call and > > > each use a timestamp as basis for their "state" value -- which very > > > likely is to produce the same value for 2 servers rebooted at the same > > > time (or for the linux server that looks like a counter). On the > > > client side, once the client processes the 1st NOTIFY call, it updates > > > the "state" for the monitor name (ie a client monitors based on a DNS > > > name which is the same for ip1 and ip2) and then in the current code, > > > because the 2nd NOTIFY has the same "state" value this NOTIFY call > > > would be ignored. 
The linux client would never reclaim the 2nd lock > > > (but the application obviously would never know it's missing a lock) > > > --- data corruption. > > > > > > Who is to blame: is the server not allowed to send "non-unique" state > > > value? Or is the client at fault here for some reason? > > > > The state value is supposed to be specific to the monitored host. If the client is > > indeed ignoring the second reboot notification, that's incorrect behavior, IMO. > > If you are using multiple server IP addresses with the same DNS name, you may want to set: > > sysctl fs.nfs.nsm_use_hostnames=0 > > The NLM will register with statd using the IP address as name instead of host name. Then your two IP addresses will each have a separate monitor entry and state value monitored. In my setup I already have this set to 0. But I'll look around the code to see what it is supposed to do. > > Frank > ^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: sm notify (nlm) question 2024-05-14 21:49 ` Olga Kornievskaia @ 2024-05-14 22:13 ` Frank Filz 2024-05-22 13:57 ` Olga Kornievskaia 0 siblings, 1 reply; 9+ messages in thread From: Frank Filz @ 2024-05-14 22:13 UTC (permalink / raw) To: 'Olga Kornievskaia' Cc: 'Chuck Lever III', 'Linux NFS Mailing List' > -----Original Message----- > From: Olga Kornievskaia [mailto:aglo@umich.edu] > Sent: Tuesday, May 14, 2024 2:50 PM > To: Frank Filz <ffilzlnx@mindspring.com> > Cc: Chuck Lever III <chuck.lever@oracle.com>; Linux NFS Mailing List <linux- > nfs@vger.kernel.org> > Subject: Re: sm notify (nlm) question > > On Tue, May 14, 2024 at 5:36 PM Frank Filz <ffilzlnx@mindspring.com> wrote: > > > > > > On May 14, 2024, at 2:56 PM, Olga Kornievskaia <aglo@umich.edu> > wrote: > > > > > > > > Hi folks, > > > > > > > > Given that not everything for NFSv3 has a specification, I post a > > > > question here (as it concerns linux v3 (client) implementation) > > > > but I ask a generic question with respect to NOTIFY sent by an NFS server. > > > > > > There is a standard: > > > > > > https://pubs.opengroup.org/onlinepubs/9629799/chap11.htm > > > > > > > > > > A NOTIFY message that is sent by an NFS server upon reboot has a > > > > monitor name and a state. This "state" is an integer and is > > > > modified on each server reboot. My question is: what about state > > > > value uniqueness? Is there somewhere some notion that this value > > > > has to be unique (as in say a random value). > > > > > > > > Here's a problem. Say a client has 2 mounts to ip1 and ip2 (both > > > > representing the same DNS name) and acquires a lock per mount. Now > > > > say each of those servers reboot. Once up they each send a NOTIFY > > > > call and each use a timestamp as basis for their "state" value -- > > > > which very likely is to produce the same value for 2 servers > > > > rebooted at the same time (or for the linux server that looks like > > > > a counter). 
On the client side, once the client processes the 1st > > > > NOTIFY call, it updates the "state" for the monitor name (ie a > > > > client monitors based on a DNS name which is the same for ip1 and > > > > ip2) and then in the current code, because the 2nd NOTIFY has the > > > > same "state" value this NOTIFY call would be ignored. The linux > > > > client would never reclaim the 2nd lock (but the application > > > > obviously would never know it's missing a lock) > > > > --- data corruption. > > > > > > > > Who is to blame: is the server not allowed to send "non-unique" > > > > state value? Or is the client at fault here for some reason? > > > > > > The state value is supposed to be specific to the monitored host. If > > > the client is indeed ignoring the second reboot notification, that's incorrect > behavior, IMO. > > > > If you are using multiple server IP addresses with the same DNS name, you > may want to set: > > > > sysctl fs.nfs.nsm_use_hostnames=0 > > > > The NLM will register with statd using the IP address as name instead of host > name. Then your two IP addresses will each have a separate monitor entry and > state value monitored. > > In my setup I already have this set to 0. But I'll look around the code to see what > it is supposed to do. Hmm, maybe it doesn't work on the client side. I don't often test NLM clients with my Ganesha work because I only run one VM and NLM clients can’t function on the same host as any server other than knfsd... Frank ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: sm notify (nlm) question 2024-05-14 22:13 ` Frank Filz @ 2024-05-22 13:57 ` Olga Kornievskaia 2024-05-22 16:20 ` Trond Myklebust 0 siblings, 1 reply; 9+ messages in thread From: Olga Kornievskaia @ 2024-05-22 13:57 UTC (permalink / raw) To: Frank Filz; +Cc: Chuck Lever III, Linux NFS Mailing List On Tue, May 14, 2024 at 6:13 PM Frank Filz <ffilzlnx@mindspring.com> wrote: > > > > > -----Original Message----- > > From: Olga Kornievskaia [mailto:aglo@umich.edu] > > Sent: Tuesday, May 14, 2024 2:50 PM > > To: Frank Filz <ffilzlnx@mindspring.com> > > Cc: Chuck Lever III <chuck.lever@oracle.com>; Linux NFS Mailing List <linux- > > nfs@vger.kernel.org> > > Subject: Re: sm notify (nlm) question > > > > On Tue, May 14, 2024 at 5:36 PM Frank Filz <ffilzlnx@mindspring.com> wrote: > > > > > > > > On May 14, 2024, at 2:56 PM, Olga Kornievskaia <aglo@umich.edu> > > wrote: > > > > > > > > > > Hi folks, > > > > > > > > > > Given that not everything for NFSv3 has a specification, I post a > > > > > question here (as it concerns linux v3 (client) implementation) > > > > > but I ask a generic question with respect to NOTIFY sent by an NFS server. > > > > > > > > There is a standard: > > > > > > > > https://pubs.opengroup.org/onlinepubs/9629799/chap11.htm > > > > > > > > > > > > > A NOTIFY message that is sent by an NFS server upon reboot has a > > > > > monitor name and a state. This "state" is an integer and is > > > > > modified on each server reboot. My question is: what about state > > > > > value uniqueness? Is there somewhere some notion that this value > > > > > has to be unique (as in say a random value). > > > > > > > > > > Here's a problem. Say a client has 2 mounts to ip1 and ip2 (both > > > > > representing the same DNS name) and acquires a lock per mount. Now > > > > > say each of those servers reboot. 
Once up they each send a NOTIFY > > > > > call and each use a timestamp as basis for their "state" value -- > > > > > which very likely is to produce the same value for 2 servers > > > > > rebooted at the same time (or for the linux server that looks like > > > > > a counter). On the client side, once the client processes the 1st > > > > > NOTIFY call, it updates the "state" for the monitor name (ie a > > > > > client monitors based on a DNS name which is the same for ip1 and > > > > > ip2) and then in the current code, because the 2nd NOTIFY has the > > > > > same "state" value this NOTIFY call would be ignored. The linux > > > > > client would never reclaim the 2nd lock (but the application > > > > > obviously would never know it's missing a lock) > > > > > --- data corruption. > > > > > > > > > > Who is to blame: is the server not allowed to send "non-unique" > > > > > state value? Or is the client at fault here for some reason? > > > > > > > > The state value is supposed to be specific to the monitored host. If > > > > the client is indeed ignoring the second reboot notification, that's incorrect > > behavior, IMO. > > > > > > If you are using multiple server IP addresses with the same DNS name, you > > may want to set: > > > > > > sysctl fs.nfs.nsm_use_hostnames=0 > > > > > > The NLM will register with statd using the IP address as name instead of host > > name. Then your two IP addresses will each have a separate monitor entry and > > state value monitored. > > > > In my setup I already have this set to 0. But I'll look around the code to see what > > it is supposed to do. > > Hmm, maybe it doesn't work on the client side. I don't often test NLM clients with my Ganesha work because I only run one VM and NLM clients can’t function on the same host as any server other than knfsd... I've been staring and tracing the code and here's what I conclude: the use of nsm_use_hostname toggles nothing that helps. 
No matter what, statd always stores whatever it is monitoring based on the DNS name (git blame says it's due to nfs-utils's commit 0da56f7d359475837008ea4b8d3764fe982ef512 "statd - use dnsname to ensure correct matching of NOTIFY requests"). Now what's worse is that when statd receives a 2nd monitoring request from lockd for something that maps to the same DNS name, statd overwrites the previous monitoring information it had. When a NOTIFY arrives from an IP matching the DNS name, statd does the downcall and it will send whatever the last monitoring information lockd gave it. Therefore all the other locks will never be recovered. What I struggle with is how to solve this problem. Say ip1 and ip2 run an NFS server and both are known under the same DNS name: foo.bar.com. Does it mean that they represent the "same" server? Can we assume that if one of them "rebooted" then the other rebooted as well? It seems like we can't go backwards and go back to monitoring by IP. In that case I can see that we'll get in trouble if the rebooted server indeed comes back up with a different IP (same DNS name); then it would never match the old entry and the lock would never be recovered (but then also I think lockd will only send the lock to the IP it stored previously, which in this case would be unreachable). If statd continues to monitor by DNS name and then matches either IP to the stored entry, then the problem comes with the "state" update. Once statd processes one NOTIFY which matched the DNS name, its state "should" be updated, but then it leads us back into the problem of ignoring the 2nd NOTIFY call. If statd were to be changed to store multiple monitor handles lockd asked to monitor, then when the 1st NOTIFY call comes we can ask lockd to recover "all" the stored handles. But then it circles back to my question: can we assume that if one IP rebooted, it implies all IPs rebooted? 
Perhaps it's lockd that needs to change in how it keeps track of servers that hold locks. The behaviour seems to have changed in 2010 (with commit 8ea6ecc8b0759756a766c05dc7c98c51ec90de37 "lockd: Create client-side nlm_host cache"), when the nlm_host cache was introduced, keyed by a hash of the IP. It seems that before that, things were based on the DNS name, in line with statd. Does anybody have any thoughts as to whether statd or lockd needs to change? ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: sm notify (nlm) question 2024-05-22 13:57 ` Olga Kornievskaia @ 2024-05-22 16:20 ` Trond Myklebust 2024-05-22 17:18 ` Tom Talpey 0 siblings, 1 reply; 9+ messages in thread From: Trond Myklebust @ 2024-05-22 16:20 UTC (permalink / raw) To: ffilzlnx@mindspring.com, aglo@umich.edu Cc: linux-nfs@vger.kernel.org, chuck.lever@oracle.com On Wed, 2024-05-22 at 09:57 -0400, Olga Kornievskaia wrote: > On Tue, May 14, 2024 at 6:13 PM Frank Filz <ffilzlnx@mindspring.com> > wrote: > > > > > > > > > -----Original Message----- > > > From: Olga Kornievskaia [mailto:aglo@umich.edu] > > > Sent: Tuesday, May 14, 2024 2:50 PM > > > To: Frank Filz <ffilzlnx@mindspring.com> > > > Cc: Chuck Lever III <chuck.lever@oracle.com>; Linux NFS Mailing > > > List <linux- > > > nfs@vger.kernel.org> > > > Subject: Re: sm notify (nlm) question > > > > > > On Tue, May 14, 2024 at 5:36 PM Frank Filz > > > <ffilzlnx@mindspring.com> wrote: > > > > > > > > > > On May 14, 2024, at 2:56 PM, Olga Kornievskaia > > > > > > <aglo@umich.edu> > > > wrote: > > > > > > > > > > > > Hi folks, > > > > > > > > > > > > Given that not everything for NFSv3 has a specification, I > > > > > > post a > > > > > > question here (as it concerns linux v3 (client) > > > > > > implementation) > > > > > > but I ask a generic question with respect to NOTIFY sent by > > > > > > an NFS server. > > > > > > > > > > There is a standard: > > > > > > > > > > https://pubs.opengroup.org/onlinepubs/9629799/chap11.htm > > > > > > > > > > > > > > > > A NOTIFY message that is sent by an NFS server upon reboot > > > > > > has a > > > > > > monitor name and a state. This "state" is an integer and is > > > > > > modified on each server reboot. My question is: what about > > > > > > state > > > > > > value uniqueness? Is there somewhere some notion that this > > > > > > value > > > > > > has to be unique (as in say a random value). > > > > > > > > > > > > Here's a problem. 
Say a client has 2 mounts to ip1 and ip2 > > > > > > (both > > > > > > representing the same DNS name) and acquires a lock per > > > > > > mount. Now > > > > > > say each of those servers reboot. Once up they each send a > > > > > > NOTIFY > > > > > > call and each use a timestamp as basis for their "state" > > > > > > value -- > > > > > > which very likely is to produce the same value for 2 > > > > > > servers > > > > > > rebooted at the same time (or for the linux server that > > > > > > looks like > > > > > > a counter). On the client side, once the client processes > > > > > > the 1st > > > > > > NOTIFY call, it updates the "state" for the monitor name > > > > > > (ie a > > > > > > client monitors based on a DNS name which is the same for > > > > > > ip1 and > > > > > > ip2) and then in the current code, because the 2nd NOTIFY > > > > > > has the > > > > > > same "state" value this NOTIFY call would be ignored. The > > > > > > linux > > > > > > client would never reclaim the 2nd lock (but the > > > > > > application > > > > > > obviously would never know it's missing a lock) > > > > > > --- data corruption. > > > > > > > > > > > > Who is to blame: is the server not allowed to send "non- > > > > > > unique" > > > > > > state value? Or is the client at fault here for some > > > > > > reason? > > > > > > > > > > The state value is supposed to be specific to the monitored > > > > > host. If > > > > > the client is indeed ignoring the second reboot notification, > > > > > that's incorrect > > > behavior, IMO. > > > > > > > > If you are using multiple server IP addresses with the same DNS > > > > name, you > > > may want to set: > > > > > > > > sysctl fs.nfs.nsm_use_hostnames=0 > > > > > > > > The NLM will register with statd using the IP address as name > > > > instead of host > > > name. Then your two IP addresses will each have a separate > > > monitor entry and > > > state value monitored. > > > > > > In my setup I already have this set to 0. 
But I'll look around > > > the code to see what > > > it is supposed to do. > > > > Hmm, maybe it doesn't work on the client side. I don't often test > > NLM clients with my Ganesha work because I only run one VM and NLM > > clients can’t function on the same host as any server other than > > knfsd... > > I've been staring and tracing the code and here's what I conclude: > the > use of nsm_use_hostname toggles nothing that helps. No matter what > statd always stores whatever it is monitoring based on the DSN name > (looks like git blame says it's due to nfs-utils's commit > 0da56f7d359475837008ea4b8d3764fe982ef512 "statd - use dnsname to > ensure correct matching of NOTIFY requests". Now what's worse is that > when statd receives a 2nd monitoring request from lockd for something > that maps to the same DNS name, statd overwrites the previous > monitoring information it had. When a NOTIFY arrives from an IP > matching the DNS name, the statd does the downcall and it will send > whatever the last monitoring information lockd gave it. Therefore all > the other locks will never be recovered. > > What I struggle with is how to solve this problem. Say ip1 and ip2 > run > an NFS server and both are known under the same DNS name: > foo.bar.com. > Does it mean that they represent the "same" server? Can we assume > that > if one of them "rebooted" then the other rebooted as well? It seems > like we can't go backwards and go back to monitoring by IP. In that > case I can see that we'll get in trouble if the rebooted server > indeed > comes back up with a different IP (same DNS name) and then it would > never match the old entry and the lock would never be recovered (but > then also I think lockd will only send the lock to the IP is stored > previously which in this case would be unreachable). If statd > continues to monitor by DNS name and then matches either ips to the > stored entry, then the problem comes with "state" update. 
Once statd > processes one NOTIFY which matched the DNS name its state "should" be > updated but then it would leads us back into the problem if ignoring > the 2nd NOTIFY call. If statd were to be changed to store multiple > monitor handles lockd asked to monitor, then when the 1st NOTIFY call > comes we can ask lockd to recover "all" the store handles. But then > it > circles back to my question: can we assume that if one IP rebooted > does it imply all IPs rebooted? > > Perhaps it's lockd that needs to change in how it keeps track of > servers that hold locks. The behaviour seems to have changed in 2010 > (with commit 8ea6ecc8b0759756a766c05dc7c98c51ec90de37 "lockd: Create > client-side nlm_host cache") when nlm_host cache was introduced > written to be based on hash of IP. It seems that before things were > based on a DNS name making it in line with statd. > > Anybody has any thoughts as to whether statd or lockd needs to > change? > I believe Tom Talpey is to blame for the nsm_use_hostname stuff. That all came from his 2006 Connectathon talk https://nfsv4bat.org/Documents/ConnectAThon/2006/talpey-cthon06-nsm.pdf -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: sm notify (nlm) question 2024-05-22 16:20 ` Trond Myklebust @ 2024-05-22 17:18 ` Tom Talpey 0 siblings, 0 replies; 9+ messages in thread From: Tom Talpey @ 2024-05-22 17:18 UTC (permalink / raw) To: Trond Myklebust, ffilzlnx@mindspring.com, aglo@umich.edu Cc: linux-nfs@vger.kernel.org, chuck.lever@oracle.com On 5/22/2024 12:20 PM, Trond Myklebust wrote: > On Wed, 2024-05-22 at 09:57 -0400, Olga Kornievskaia wrote: >> On Tue, May 14, 2024 at 6:13 PM Frank Filz <ffilzlnx@mindspring.com> >> wrote: >>> >>> >>> >>>> -----Original Message----- >>>> From: Olga Kornievskaia [mailto:aglo@umich.edu] >>>> Sent: Tuesday, May 14, 2024 2:50 PM >>>> To: Frank Filz <ffilzlnx@mindspring.com> >>>> Cc: Chuck Lever III <chuck.lever@oracle.com>; Linux NFS Mailing >>>> List <linux- >>>> nfs@vger.kernel.org> >>>> Subject: Re: sm notify (nlm) question >>>> >>>> On Tue, May 14, 2024 at 5:36 PM Frank Filz >>>> <ffilzlnx@mindspring.com> wrote: >>>>> >>>>>>> On May 14, 2024, at 2:56 PM, Olga Kornievskaia >>>>>>> <aglo@umich.edu> >>>> wrote: >>>>>>> >>>>>>> Hi folks, >>>>>>> >>>>>>> Given that not everything for NFSv3 has a specification, I >>>>>>> post a >>>>>>> question here (as it concerns linux v3 (client) >>>>>>> implementation) >>>>>>> but I ask a generic question with respect to NOTIFY sent by >>>>>>> an NFS server. >>>>>> >>>>>> There is a standard: >>>>>> >>>>>> https://pubs.opengroup.org/onlinepubs/9629799/chap11.htm >>>>>> >>>>>> >>>>>>> A NOTIFY message that is sent by an NFS server upon reboot >>>>>>> has a >>>>>>> monitor name and a state. This "state" is an integer and is >>>>>>> modified on each server reboot. My question is: what about >>>>>>> state >>>>>>> value uniqueness? Is there somewhere some notion that this >>>>>>> value >>>>>>> has to be unique (as in say a random value). >>>>>>> >>>>>>> Here's a problem. Say a client has 2 mounts to ip1 and ip2 >>>>>>> (both >>>>>>> representing the same DNS name) and acquires a lock per >>>>>>> mount. 
Now >>>>>>> say each of those servers reboot. Once up they each send a >>>>>>> NOTIFY >>>>>>> call and each use a timestamp as basis for their "state" >>>>>>> value -- >>>>>>> which very likely is to produce the same value for 2 >>>>>>> servers >>>>>>> rebooted at the same time (or for the linux server that >>>>>>> looks like >>>>>>> a counter). On the client side, once the client processes >>>>>>> the 1st >>>>>>> NOTIFY call, it updates the "state" for the monitor name >>>>>>> (ie a >>>>>>> client monitors based on a DNS name which is the same for >>>>>>> ip1 and >>>>>>> ip2) and then in the current code, because the 2nd NOTIFY >>>>>>> has the >>>>>>> same "state" value this NOTIFY call would be ignored. The >>>>>>> linux >>>>>>> client would never reclaim the 2nd lock (but the >>>>>>> application >>>>>>> obviously would never know it's missing a lock) >>>>>>> --- data corruption. >>>>>>> >>>>>>> Who is to blame: is the server not allowed to send "non- >>>>>>> unique" >>>>>>> state value? Or is the client at fault here for some >>>>>>> reason? >>>>>> >>>>>> The state value is supposed to be specific to the monitored >>>>>> host. If >>>>>> the client is indeed ignoring the second reboot notification, >>>>>> that's incorrect >>>> behavior, IMO. >>>>> >>>>> If you are using multiple server IP addresses with the same DNS >>>>> name, you >>>> may want to set: >>>>> >>>>> sysctl fs.nfs.nsm_use_hostnames=0 >>>>> >>>>> The NLM will register with statd using the IP address as name >>>>> instead of host >>>> name. Then your two IP addresses will each have a separate >>>> monitor entry and >>>> state value monitored. >>>> >>>> In my setup I already have this set to 0. But I'll look around >>>> the code to see what >>>> it is supposed to do. >>> >>> Hmm, maybe it doesn't work on the client side. I don't often test >>> NLM clients with my Ganesha work because I only run one VM and NLM >>> clients can’t function on the same host as any server other than >>> knfsd... 
>>
>> I've been staring and tracing the code and here's what I conclude:
>> the use of nsm_use_hostname toggles nothing that helps. No matter
>> what, statd always stores whatever it is monitoring based on the
>> DNS name (git blame says it's due to nfs-utils's commit
>> 0da56f7d359475837008ea4b8d3764fe982ef512 "statd - use dnsname to
>> ensure correct matching of NOTIFY requests"). Now what's worse is
>> that when statd receives a 2nd monitoring request from lockd for
>> something that maps to the same DNS name, statd overwrites the
>> previous monitoring information it had. When a NOTIFY arrives from
>> an IP matching the DNS name, statd does the downcall and it will
>> send whatever the last monitoring information lockd gave it.
>> Therefore all the other locks will never be recovered.
>>
>> What I struggle with is how to solve this problem. Say ip1 and ip2
>> run an NFS server and both are known under the same DNS name:
>> foo.bar.com. Does it mean that they represent the "same" server?
>> Can we assume that if one of them "rebooted" then the other
>> rebooted as well? It seems like we can't go backwards and return to
>> monitoring by IP. In that case I can see that we'll get in trouble
>> if the rebooted server indeed comes back up with a different IP
>> (same DNS name): it would never match the old entry and the lock
>> would never be recovered (but then also I think lockd will only
>> send the lock request to the IP it stored previously, which in this
>> case would be unreachable). If statd continues to monitor by DNS
>> name and matches either IP to the stored entry, then the problem
>> comes with the "state" update. Once statd processes one NOTIFY
>> which matched the DNS name, its state "should" be updated, but that
>> leads us back into the problem of ignoring the 2nd NOTIFY call.
>> If statd were to be changed to store the multiple monitor handles
>> lockd asked it to monitor, then when the 1st NOTIFY call comes we
>> can ask lockd to recover "all" the stored handles. But then it
>> circles back to my question: can we assume that if one IP rebooted,
>> all IPs rebooted?
>>
>> Perhaps it's lockd that needs to change in how it keeps track of
>> servers that hold locks. The behaviour seems to have changed in
>> 2010 (with commit 8ea6ecc8b0759756a766c05dc7c98c51ec90de37 "lockd:
>> Create client-side nlm_host cache") when the nlm_host cache was
>> introduced, written to be keyed on a hash of the IP. It seems that
>> before that, things were based on a DNS name, making it in line
>> with statd.
>>
>> Does anybody have any thoughts as to whether statd or lockd needs
>> to change?
>>
>
> I believe Tom Talpey is to blame for the nsm_use_hostname stuff. That
> all came from his 2006 Connectathon talk:
> https://nfsv4bat.org/Documents/ConnectAThon/2006/talpey-cthon06-nsm.pdf

I deny that!! :)

All that talk intended to do was to point out how deeply flawed the
statmon protocol is, and how badly it was then implemented. However,
hostnames may be a slight improvement over the mess that was 2006. And
it's been kinda sorta working since then.

Personally I still think trying to "fix" nsm is a fool's errand. It's
just never ever going to succeed. Particularly if both the clients
*and* servers have to change. NFS4.1 is the better way.

Tom.

^ permalink raw reply	[flat|nested] 9+ messages in thread
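[Editor's note] The statd failure mode discussed in this thread can be sketched as a small toy model. The names below (`Statd`, `monitor`, `notify`) are invented for illustration and this is not the real nfs-utils code; it only shows why keying monitor records by DNS name alone (combined with ignoring a NOTIFY whose state value equals the stored one) loses one of the two locks, while per-IP keying recovers both.

```python
class Statd:
    """Toy model of statd's monitor table -- NOT the real nfs-utils code."""

    def __init__(self, key_by_ip):
        # key_by_ip=False models the behaviour Olga describes: records are
        # keyed by DNS name only, so a second monitor request for the same
        # name silently overwrites the first.
        self.key_by_ip = key_by_ip
        self.monitors = {}  # key -> (state, reclaim_callback)

    def _key(self, dnsname, ip):
        return (dnsname, ip) if self.key_by_ip else dnsname

    def monitor(self, dnsname, ip, state, reclaim):
        self.monitors[self._key(dnsname, ip)] = (state, reclaim)

    def notify(self, dnsname, ip, new_state):
        """Process an SM_NOTIFY; return the reclaims that actually ran."""
        key = self._key(dnsname, ip)
        if key not in self.monitors:
            return []
        state, reclaim = self.monitors[key]
        if new_state == state:
            return []  # state value unchanged -> NOTIFY ignored
        self.monitors[key] = (new_state, reclaim)
        return [reclaim()]


# DNS-name keying: both IPs collapse onto one record.
s = Statd(key_by_ip=False)
s.monitor("foo.bar.com", "ip1", 7, lambda: "reclaim lock on ip1")
s.monitor("foo.bar.com", "ip2", 7, lambda: "reclaim lock on ip2")  # overwrites ip1's entry
print(s.notify("foo.bar.com", "ip1", 9))  # ['reclaim lock on ip2'] -- ip1's lock is lost
print(s.notify("foo.bar.com", "ip2", 9))  # [] -- same state value, NOTIFY ignored

# Per-IP keying (what nsm_use_hostnames=0 was meant to give): both recovered.
t = Statd(key_by_ip=True)
t.monitor("foo.bar.com", "ip1", 7, lambda: "reclaim lock on ip1")
t.monitor("foo.bar.com", "ip2", 7, lambda: "reclaim lock on ip2")
print(t.notify("foo.bar.com", "ip1", 9))  # ['reclaim lock on ip1']
print(t.notify("foo.bar.com", "ip2", 9))  # ['reclaim lock on ip2']
```

Under this model, storing multiple monitor handles per DNS name (Olga's proposal) would amount to keeping a list per key and running every stored reclaim on the first state change, which is exactly where the "does one IP rebooting imply all rebooted?" question arises.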
end of thread, other threads:[~2024-05-22 17:19 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --)
2024-05-14 20:56 sm notify (nlm) question Olga Kornievskaia
2024-05-14 21:08 ` Chuck Lever III
2024-05-14 21:21   ` Olga Kornievskaia
2024-05-14 21:36   ` Frank Filz
2024-05-14 21:49     ` Olga Kornievskaia
2024-05-14 22:13       ` Frank Filz
2024-05-22 13:57         ` Olga Kornievskaia
2024-05-22 16:20           ` Trond Myklebust
2024-05-22 17:18             ` Tom Talpey