PYNFS LOCK20 Blocking Lock Test Case

linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* PYNFS LOCK20 Blocking Lock Test Case
@ 2025-08-12 16:55 Frank Filz
  2025-08-14  1:29 ` Calum Mackay
  0 siblings, 1 reply; 5+ messages in thread
From: Frank Filz @ 2025-08-12 16:55 UTC (permalink / raw)
  To: linux-nfs; +Cc: 'Calum Mackay', 'Ofir Vainshtein'

I believe this test case is wrong, relevant text from RFC:

Some clients require the support of blocking locks. The NFSv4
protocol must not rely on a callback mechanism and therefore is
unable to notify a client when a previously denied lock has been
granted. Clients have no choice but to continually poll for the
lock. This presents a fairness problem. Two new lock types are
added, READW and WRITEW, and are used to indicate to the server that
the client is requesting a blocking lock. The server should maintain
an ordered list of pending blocking locks. When the conflicting lock
is released, the server may wait the lease period for the first
waiting client to re-request the lock. After the lease period
expires, the next waiting client request is allowed the lock.

Test case:

    # Standard owner opens and locks a file
    fh1, stateid1 = c.create_confirm(t.word(), deny=OPEN4_SHARE_DENY_NONE)
    res1 = c.lock_file(t.word(), fh1, stateid1, type=WRITE_LT)
    check(res1, msg="Locking file %s" % t.word())
    # Second owner is denied a blocking lock
    file = c.homedir + [t.word()]
    fh2, stateid2 = c.open_confirm(b"owner2", file,
                                   access=OPEN4_SHARE_ACCESS_BOTH,
                                   deny=OPEN4_SHARE_DENY_NONE)
    res2 = c.lock_file(b"owner2", fh2, stateid2,
                       type=WRITEW_LT, lockowner=b"lockowner2_LOCK20")
    check(res2, NFS4ERR_DENIED, msg="Conflicting lock on %s" % t.word())
    sleeptime = c.getLeaseTime() // 2
    # Wait for queued lock to timeout
    for i in range(3):
        env.sleep(sleeptime, "Waiting for queued blocking lock to timeout")
        res = c.compound([op.renew(c.clientid)])
        check(res, [NFS4_OK, NFS4ERR_CB_PATH_DOWN])
    # Standard owner releases lock
    res1 = c.unlock_file(1, fh1, res1.lockid)
    check(res1)
    # Third owner tries to butt in and steal lock second owner is waiting
for
    # Should work, since second owner let his turn expire
    file = c.homedir + [t.word()]
    fh3, stateid3 = c.open_confirm(b"owner3", file,
                                   access=OPEN4_SHARE_ACCESS_BOTH,
                                   deny=OPEN4_SHARE_DENY_NONE)
    res3 = c.lock_file(b"owner3", fh3, stateid3,
                       type=WRITEW_LT, lockowner=b"lockowner3_LOCK20")
    check(res3, msg="Grabbing lock after another owner let his 'turn'
expire")

Note that the RFC indicated the client has one lease period AFTER the
conflicting lock is released to retry while the test case waits 1.5 lease
period after requesting the blocking lock before it releases the conflicting
lock...

Am I reading things right?

Thanks

Frank Filz

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: PYNFS LOCK20 Blocking Lock Test Case
  2025-08-12 16:55 PYNFS LOCK20 Blocking Lock Test Case Frank Filz
@ 2025-08-14  1:29 ` Calum Mackay
  2025-08-21 17:30   ` Frank Filz
  0 siblings, 1 reply; 5+ messages in thread
From: Calum Mackay @ 2025-08-14  1:29 UTC (permalink / raw)
  To: Frank Filz, linux-nfs
  Cc: Calum Mackay, 'Ofir Vainshtein', Chuck Lever

On 12/08/2025 5:55 pm, Frank Filz wrote:
> I believe this test case is wrong, relevant text from RFC:
> 
> Some clients require the support of blocking locks. The NFSv4
> protocol must not rely on a callback mechanism and therefore is
> unable to notify a client when a previously denied lock has been
> granted. Clients have no choice but to continually poll for the
> lock. This presents a fairness problem. Two new lock types are
> added, READW and WRITEW, and are used to indicate to the server that
> the client is requesting a blocking lock. The server should maintain
> an ordered list of pending blocking locks. When the conflicting lock
> is released, the server may wait the lease period for the first
> waiting client to re-request the lock. After the lease period
> expires, the next waiting client request is allowed the lock.
> 
> Test case:
> 
>      # Standard owner opens and locks a file
>      fh1, stateid1 = c.create_confirm(t.word(), deny=OPEN4_SHARE_DENY_NONE)
>      res1 = c.lock_file(t.word(), fh1, stateid1, type=WRITE_LT)
>      check(res1, msg="Locking file %s" % t.word())
>      # Second owner is denied a blocking lock
>      file = c.homedir + [t.word()]
>      fh2, stateid2 = c.open_confirm(b"owner2", file,
>                                     access=OPEN4_SHARE_ACCESS_BOTH,
>                                     deny=OPEN4_SHARE_DENY_NONE)
>      res2 = c.lock_file(b"owner2", fh2, stateid2,
>                         type=WRITEW_LT, lockowner=b"lockowner2_LOCK20")
>      check(res2, NFS4ERR_DENIED, msg="Conflicting lock on %s" % t.word())
>      sleeptime = c.getLeaseTime() // 2
>      # Wait for queued lock to timeout
>      for i in range(3):
>          env.sleep(sleeptime, "Waiting for queued blocking lock to timeout")
>          res = c.compound([op.renew(c.clientid)])
>          check(res, [NFS4_OK, NFS4ERR_CB_PATH_DOWN])
>      # Standard owner releases lock
>      res1 = c.unlock_file(1, fh1, res1.lockid)
>      check(res1)
>      # Third owner tries to butt in and steal lock second owner is waiting
> for
>      # Should work, since second owner let his turn expire
>      file = c.homedir + [t.word()]
>      fh3, stateid3 = c.open_confirm(b"owner3", file,
>                                     access=OPEN4_SHARE_ACCESS_BOTH,
>                                     deny=OPEN4_SHARE_DENY_NONE)
>      res3 = c.lock_file(b"owner3", fh3, stateid3,
>                         type=WRITEW_LT, lockowner=b"lockowner3_LOCK20")
>      check(res3, msg="Grabbing lock after another owner let his 'turn'
> expire")
> 
> Note that the RFC indicated the client has one lease period AFTER the
> conflicting lock is released to retry while the test case waits 1.5 lease
> period after requesting the blocking lock before it releases the conflicting
> lock...
> 
> Am I reading things right?

I see what you mean.

But since a waiting blocking lock client obviously doesn't know when the 
lock-holding client is going to release its lock, the waiting client has 
to start polling regularly as soon as its initial blocking lock request 
is denied. It has to poll at least once per lease period.

If the server notices that a waiting client hasn't polled once in a 
lease period, after its initial blocking lock request was denied, then 
it seems reasonable for the server to forget that waiting client's 
interest in the pending lock there and then. There's no need for the 
server to wait a further lease period after the lock is released.

Looking at the current Linux nfsd code, that does seem to be what it 
does. I see that when the server adds the blocking lock request to its 
pending list, it adds the current timestamp to it, i.e. the time that 
the blocking lock was requested.

The nfsd background clean-up thread (which runs at least once per lease 
period) removes any pending blocking lock requests if a lease period has 
passed since they were placed on the list (i.e. when the blocking lock 
was requested). There's a corresponding comment:

	/*
	 * It's possible for a client to try and acquire an already held lock
	 * that is being held for a long time, and then lose interest in it.
	 * So, we clean out any un-revisited request after a lease period
	 * under the assumption that the client is no longer interested.

https://elixir.bootlin.com/linux/v6.16/source/fs/nfsd/nfs4state.c#L6824

There's no pending locks action taken on lock release. The timing is 
based solely on when the blocking READW/WRITEW request occurred, i.e. 
the res2 WRITEW in the pynfs test, which is before the sleep.

So, whilst the RFC may seem to suggest the timer should start at lock 
release, it doesn't seem unreasonable for the NFS server to start the 
timer earlier, at the blocking lock request, to avoid an unnecessary 
delay upon lock release if the client has lost interest in the lock, 
i.e. it isn't polling.

Presumably, the pynfs test was originally written to match NFS server 
behaviour, rather than RFC wording. I'm not sure what other NFS servers 
do in this case. Waiting longer wouldn't change the test result in this 
case, I think.

Does that seem reasonable to you?

thanks,
calum.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: PYNFS LOCK20 Blocking Lock Test Case
  2025-08-14  1:29 ` Calum Mackay
@ 2025-08-21 17:30   ` Frank Filz
  2025-08-21 17:41     ` Calum Mackay
  0 siblings, 1 reply; 5+ messages in thread
From: Frank Filz @ 2025-08-21 17:30 UTC (permalink / raw)
  To: 'Calum Mackay', linux-nfs
  Cc: 'Ofir Vainshtein', 'Chuck Lever'

Ah, I see the logic the test case is expecting.

For Ganesha, we maintain the blocking lock so long as the clientid is being renewed, so we don't start the timer for claiming the lock until the lock becomes available which seems to be allowed per the RFC. Maybe we just need to not run that test case.

But it would be nice to have a similar test case that just takes too long after the lock is available to retry.

Part of the challenge is we share a lot of logic between 4.0 and 4.1 and with the actual callback in 4.1, there is no expectation of the client polling for the lock.

Let me mull this one over more.

Thanks

Frank

> -----Original Message-----
> From: Calum Mackay [mailto:calum.mackay@oracle.com]
> Sent: Wednesday, August 13, 2025 6:30 PM
> To: Frank Filz <ffilzlnx@mindspring.com>; linux-nfs@vger.kernel.org
> Cc: Calum Mackay <calum.mackay@oracle.com>; 'Ofir Vainshtein'
> <ofirvins@google.com>; Chuck Lever <chuck.lever@oracle.com>
> Subject: Re: PYNFS LOCK20 Blocking Lock Test Case
> 
> On 12/08/2025 5:55 pm, Frank Filz wrote:
> > I believe this test case is wrong, relevant text from RFC:
> >
> > Some clients require the support of blocking locks. The NFSv4 protocol
> > must not rely on a callback mechanism and therefore is unable to
> > notify a client when a previously denied lock has been granted.
> > Clients have no choice but to continually poll for the lock. This
> > presents a fairness problem. Two new lock types are added, READW and
> > WRITEW, and are used to indicate to the server that the client is
> > requesting a blocking lock. The server should maintain an ordered list
> > of pending blocking locks. When the conflicting lock is released, the
> > server may wait the lease period for the first waiting client to
> > re-request the lock. After the lease period expires, the next waiting
> > client request is allowed the lock.
> >
> > Test case:
> >
> >      # Standard owner opens and locks a file
> >      fh1, stateid1 = c.create_confirm(t.word(),
> deny=OPEN4_SHARE_DENY_NONE)
> >      res1 = c.lock_file(t.word(), fh1, stateid1, type=WRITE_LT)
> >      check(res1, msg="Locking file %s" % t.word())
> >      # Second owner is denied a blocking lock
> >      file = c.homedir + [t.word()]
> >      fh2, stateid2 = c.open_confirm(b"owner2", file,
> >                                     access=OPEN4_SHARE_ACCESS_BOTH,
> >                                     deny=OPEN4_SHARE_DENY_NONE)
> >      res2 = c.lock_file(b"owner2", fh2, stateid2,
> >                         type=WRITEW_LT, lockowner=b"lockowner2_LOCK20")
> >      check(res2, NFS4ERR_DENIED, msg="Conflicting lock on %s" % t.word())
> >      sleeptime = c.getLeaseTime() // 2
> >      # Wait for queued lock to timeout
> >      for i in range(3):
> >          env.sleep(sleeptime, "Waiting for queued blocking lock to timeout")
> >          res = c.compound([op.renew(c.clientid)])
> >          check(res, [NFS4_OK, NFS4ERR_CB_PATH_DOWN])
> >      # Standard owner releases lock
> >      res1 = c.unlock_file(1, fh1, res1.lockid)
> >      check(res1)
> >      # Third owner tries to butt in and steal lock second owner is
> > waiting for
> >      # Should work, since second owner let his turn expire
> >      file = c.homedir + [t.word()]
> >      fh3, stateid3 = c.open_confirm(b"owner3", file,
> >                                     access=OPEN4_SHARE_ACCESS_BOTH,
> >                                     deny=OPEN4_SHARE_DENY_NONE)
> >      res3 = c.lock_file(b"owner3", fh3, stateid3,
> >                         type=WRITEW_LT, lockowner=b"lockowner3_LOCK20")
> >      check(res3, msg="Grabbing lock after another owner let his 'turn'
> > expire")
> >
> > Note that the RFC indicated the client has one lease period AFTER the
> > conflicting lock is released to retry while the test case waits 1.5
> > lease period after requesting the blocking lock before it releases the
> > conflicting lock...
> >
> > Am I reading things right?
> 
> I see what you mean.
> 
> But since a waiting blocking lock client obviously doesn't know when the lock-
> holding client is going to release its lock, the waiting client has to start polling
> regularly as soon as its initial blocking lock request is denied. It has to poll at
> least once per lease period.
> 
> If the server notices that a waiting client hasn't polled once in a lease period,
> after its initial blocking lock request was denied, then it seems reasonable for
> the server to forget that waiting client's interest in the pending lock there and
> then. There's no need for the server to wait a further lease period after the lock
> is released.
> 
> 
> Looking at the current Linux nfsd code, that does seem to be what it does. I see
> that when the server adds the blocking lock request to its pending list, it adds the
> current timestamp to it, i.e. the time that the blocking lock was requested.
> 
> The nfsd background clean-up thread (which runs at least once per lease
> period) removes any pending blocking lock requests if a lease period has passed
> since they were placed on the list (i.e. when the blocking lock was requested).
> There's a corresponding comment:
> 
> 	/*
> 	 * It's possible for a client to try and acquire an already held lock
> 	 * that is being held for a long time, and then lose interest in it.
> 	 * So, we clean out any un-revisited request after a lease period
> 	 * under the assumption that the client is no longer interested.
> 
> https://elixir.bootlin.com/linux/v6.16/source/fs/nfsd/nfs4state.c#L6824
> 
> 
> There's no pending locks action taken on lock release. The timing is based solely
> on when the blocking READW/WRITEW request occurred, i.e.
> the res2 WRITEW in the pynfs test, which is before the sleep.
> 
> So, whilst the RFC may seem to suggest the timer should start at lock release, it
> doesn't seem unreasonable for the NFS server to start the timer earlier, at the
> blocking lock request, to avoid an unnecessary delay upon lock release if the
> client has lost interest in the lock, i.e. it isn't polling.
> 
> 
> Presumably, the pynfs test was originally written to match NFS server behaviour,
> rather than RFC wording. I'm not sure what other NFS servers do in this case.
> Waiting longer wouldn't change the test result in this case, I think.
> 
> 
> Does that seem reasonable to you?
> 
> thanks,
> calum.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: PYNFS LOCK20 Blocking Lock Test Case
  2025-08-21 17:30   ` Frank Filz
@ 2025-08-21 17:41     ` Calum Mackay
  2025-08-21 17:47       ` Frank Filz
  0 siblings, 1 reply; 5+ messages in thread
From: Calum Mackay @ 2025-08-21 17:41 UTC (permalink / raw)
  To: Frank Filz, linux-nfs
  Cc: Calum Mackay, 'Ofir Vainshtein', 'Chuck Lever'


[-- Attachment #1.1: Type: text/plain, Size: 7196 bytes --]

On 21/08/2025 6:30 pm, Frank Filz wrote:
> Ah, I see the logic the test case is expecting.
> 
> For Ganesha, we maintain the blocking lock so long as the clientid is being renewed, so we don't start the timer for claiming the lock until the lock becomes available which seems to be allowed per the RFC. Maybe we just need to not run that test case.
> 
> But it would be nice to have a similar test case that just takes too long after the lock is available to retry.

Thanks Frank; Chuck also made a similar suggestion to me privately. I'll 
have a look at either adjusting this test to report information that the 
lock was obtained "early", and/or a separate/optional test that more 
closely matches the RFC wording, regardless of how the server might behave.

Of course, the RFC wording, and lack of nominative language, suggests 
this is an implementation choice, and thus difficult for pynfs tests to 
adjudicate on.

thanks,
calum.

> 
> Part of the challenge is we share a lot of logic between 4.0 and 4.1 and with the actual callback in 4.1, there is no expectation of the client polling for the lock.
> 
> Let me mull this one over more.
> 
> Thanks
> 
> Frank
> 
>> -----Original Message-----
>> From: Calum Mackay [mailto:calum.mackay@oracle.com]
>> Sent: Wednesday, August 13, 2025 6:30 PM
>> To: Frank Filz <ffilzlnx@mindspring.com>; linux-nfs@vger.kernel.org
>> Cc: Calum Mackay <calum.mackay@oracle.com>; 'Ofir Vainshtein'
>> <ofirvins@google.com>; Chuck Lever <chuck.lever@oracle.com>
>> Subject: Re: PYNFS LOCK20 Blocking Lock Test Case
>>
>> On 12/08/2025 5:55 pm, Frank Filz wrote:
>>> I believe this test case is wrong, relevant text from RFC:
>>>
>>> Some clients require the support of blocking locks. The NFSv4 protocol
>>> must not rely on a callback mechanism and therefore is unable to
>>> notify a client when a previously denied lock has been granted.
>>> Clients have no choice but to continually poll for the lock. This
>>> presents a fairness problem. Two new lock types are added, READW and
>>> WRITEW, and are used to indicate to the server that the client is
>>> requesting a blocking lock. The server should maintain an ordered list
>>> of pending blocking locks. When the conflicting lock is released, the
>>> server may wait the lease period for the first waiting client to
>>> re-request the lock. After the lease period expires, the next waiting
>>> client request is allowed the lock.
>>>
>>> Test case:
>>>
>>>       # Standard owner opens and locks a file
>>>       fh1, stateid1 = c.create_confirm(t.word(),
>> deny=OPEN4_SHARE_DENY_NONE)
>>>       res1 = c.lock_file(t.word(), fh1, stateid1, type=WRITE_LT)
>>>       check(res1, msg="Locking file %s" % t.word())
>>>       # Second owner is denied a blocking lock
>>>       file = c.homedir + [t.word()]
>>>       fh2, stateid2 = c.open_confirm(b"owner2", file,
>>>                                      access=OPEN4_SHARE_ACCESS_BOTH,
>>>                                      deny=OPEN4_SHARE_DENY_NONE)
>>>       res2 = c.lock_file(b"owner2", fh2, stateid2,
>>>                          type=WRITEW_LT, lockowner=b"lockowner2_LOCK20")
>>>       check(res2, NFS4ERR_DENIED, msg="Conflicting lock on %s" % t.word())
>>>       sleeptime = c.getLeaseTime() // 2
>>>       # Wait for queued lock to timeout
>>>       for i in range(3):
>>>           env.sleep(sleeptime, "Waiting for queued blocking lock to timeout")
>>>           res = c.compound([op.renew(c.clientid)])
>>>           check(res, [NFS4_OK, NFS4ERR_CB_PATH_DOWN])
>>>       # Standard owner releases lock
>>>       res1 = c.unlock_file(1, fh1, res1.lockid)
>>>       check(res1)
>>>       # Third owner tries to butt in and steal lock second owner is
>>> waiting for
>>>       # Should work, since second owner let his turn expire
>>>       file = c.homedir + [t.word()]
>>>       fh3, stateid3 = c.open_confirm(b"owner3", file,
>>>                                      access=OPEN4_SHARE_ACCESS_BOTH,
>>>                                      deny=OPEN4_SHARE_DENY_NONE)
>>>       res3 = c.lock_file(b"owner3", fh3, stateid3,
>>>                          type=WRITEW_LT, lockowner=b"lockowner3_LOCK20")
>>>       check(res3, msg="Grabbing lock after another owner let his 'turn'
>>> expire")
>>>
>>> Note that the RFC indicated the client has one lease period AFTER the
>>> conflicting lock is released to retry while the test case waits 1.5
>>> lease period after requesting the blocking lock before it releases the
>>> conflicting lock...
>>>
>>> Am I reading things right?
>>
>> I see what you mean.
>>
>> But since a waiting blocking lock client obviously doesn't know when the lock-
>> holding client is going to release its lock, the waiting client has to start polling
>> regularly as soon as its initial blocking lock request is denied. It has to poll at
>> least once per lease period.
>>
>> If the server notices that a waiting client hasn't polled once in a lease period,
>> after its initial blocking lock request was denied, then it seems reasonable for
>> the server to forget that waiting client's interest in the pending lock there and
>> then. There's no need for the server to wait a further lease period after the lock
>> is released.
>>
>>
>> Looking at the current Linux nfsd code, that does seem to be what it does. I see
>> that when the server adds the blocking lock request to its pending list, it adds the
>> current timestamp to it, i.e. the time that the blocking lock was requested.
>>
>> The nfsd background clean-up thread (which runs at least once per lease
>> period) removes any pending blocking lock requests if a lease period has passed
>> since they were placed on the list (i.e. when the blocking lock was requested).
>> There's a corresponding comment:
>>
>> 	/*
>> 	 * It's possible for a client to try and acquire an already held lock
>> 	 * that is being held for a long time, and then lose interest in it.
>> 	 * So, we clean out any un-revisited request after a lease period
>> 	 * under the assumption that the client is no longer interested.
>>
>> https://elixir.bootlin.com/linux/v6.16/source/fs/nfsd/nfs4state.c#L6824
>>
>>
>> There's no pending locks action taken on lock release. The timing is based solely
>> on when the blocking READW/WRITEW request occurred, i.e.
>> the res2 WRITEW in the pynfs test, which is before the sleep.
>>
>> So, whilst the RFC may seem to suggest the timer should start at lock release, it
>> doesn't seem unreasonable for the NFS server to start the timer earlier, at the
>> blocking lock request, to avoid an unnecessary delay upon lock release if the
>> client has lost interest in the lock, i.e. it isn't polling.
>>
>>
>> Presumably, the pynfs test was originally written to match NFS server behaviour,
>> rather than RFC wording. I'm not sure what other NFS servers do in this case.
>> Waiting longer wouldn't change the test result in this case, I think.
>>
>>
>> Does that seem reasonable to you?
>>
>> thanks,
>> calum.
> 

-- 
Calum Mackay
Linux Kernel Engineering
Oracle Linux and Virtualisation


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: PYNFS LOCK20 Blocking Lock Test Case
  2025-08-21 17:41     ` Calum Mackay
@ 2025-08-21 17:47       ` Frank Filz
  0 siblings, 0 replies; 5+ messages in thread
From: Frank Filz @ 2025-08-21 17:47 UTC (permalink / raw)
  To: 'Calum Mackay', linux-nfs
  Cc: 'Ofir Vainshtein', 'Chuck Lever'

Thanks

> -----Original Message-----
> From: Calum Mackay [mailto:calum.mackay@oracle.com]
> Sent: Thursday, August 21, 2025 10:41 AM
> To: Frank Filz <ffilzlnx@mindspring.com>; linux-nfs@vger.kernel.org
> Cc: Calum Mackay <calum.mackay@oracle.com>; 'Ofir Vainshtein'
> <ofirvins@google.com>; 'Chuck Lever' <chuck.lever@oracle.com>
> Subject: Re: PYNFS LOCK20 Blocking Lock Test Case
> 
> On 21/08/2025 6:30 pm, Frank Filz wrote:
> > Ah, I see the logic the test case is expecting.
> >
> > For Ganesha, we maintain the blocking lock so long as the clientid is being
> renewed, so we don't start the timer for claiming the lock until the lock becomes
> available which seems to be allowed per the RFC. Maybe we just need to not run
> that test case.
> >
> > But it would be nice to have a similar test case that just takes too long after
> the lock is available to retry.
> 
> Thanks Frank; Chuck also made a similar suggestion to me privately. I'll have a
> look at either adjusting this test to report information that the lock was
> obtained "early", and/or a separate/optional test that more closely matches the
> RFC wording, regardless of how the server might behave.
> 
> Of course, the RFC wording, and lack of nominative language, suggests this is an
> implementation choice, and thus difficult for pynfs tests to adjudicate on.

Yes, that can be tricky. I added a ganesha tag so there could be a few implementation specific tests.

> thanks,
> calum.
> 
> >
> > Part of the challenge is we share a lot of logic between 4.0 and 4.1 and with
> the actual callback in 4.1, there is no expectation of the client polling for the
> lock.
> >
> > Let me mull this one over more.
> >
> > Thanks
> >
> > Frank
> >
> >> -----Original Message-----
> >> From: Calum Mackay [mailto:calum.mackay@oracle.com]
> >> Sent: Wednesday, August 13, 2025 6:30 PM
> >> To: Frank Filz <ffilzlnx@mindspring.com>; linux-nfs@vger.kernel.org
> >> Cc: Calum Mackay <calum.mackay@oracle.com>; 'Ofir Vainshtein'
> >> <ofirvins@google.com>; Chuck Lever <chuck.lever@oracle.com>
> >> Subject: Re: PYNFS LOCK20 Blocking Lock Test Case
> >>
> >> On 12/08/2025 5:55 pm, Frank Filz wrote:
> >>> I believe this test case is wrong, relevant text from RFC:
> >>>
> >>> Some clients require the support of blocking locks. The NFSv4
> >>> protocol must not rely on a callback mechanism and therefore is
> >>> unable to notify a client when a previously denied lock has been granted.
> >>> Clients have no choice but to continually poll for the lock. This
> >>> presents a fairness problem. Two new lock types are added, READW and
> >>> WRITEW, and are used to indicate to the server that the client is
> >>> requesting a blocking lock. The server should maintain an ordered
> >>> list of pending blocking locks. When the conflicting lock is
> >>> released, the server may wait the lease period for the first waiting
> >>> client to re-request the lock. After the lease period expires, the
> >>> next waiting client request is allowed the lock.
> >>>
> >>> Test case:
> >>>
> >>>       # Standard owner opens and locks a file
> >>>       fh1, stateid1 = c.create_confirm(t.word(),
> >> deny=OPEN4_SHARE_DENY_NONE)
> >>>       res1 = c.lock_file(t.word(), fh1, stateid1, type=WRITE_LT)
> >>>       check(res1, msg="Locking file %s" % t.word())
> >>>       # Second owner is denied a blocking lock
> >>>       file = c.homedir + [t.word()]
> >>>       fh2, stateid2 = c.open_confirm(b"owner2", file,
> >>>                                      access=OPEN4_SHARE_ACCESS_BOTH,
> >>>                                      deny=OPEN4_SHARE_DENY_NONE)
> >>>       res2 = c.lock_file(b"owner2", fh2, stateid2,
> >>>                          type=WRITEW_LT, lockowner=b"lockowner2_LOCK20")
> >>>       check(res2, NFS4ERR_DENIED, msg="Conflicting lock on %s" % t.word())
> >>>       sleeptime = c.getLeaseTime() // 2
> >>>       # Wait for queued lock to timeout
> >>>       for i in range(3):
> >>>           env.sleep(sleeptime, "Waiting for queued blocking lock to timeout")
> >>>           res = c.compound([op.renew(c.clientid)])
> >>>           check(res, [NFS4_OK, NFS4ERR_CB_PATH_DOWN])
> >>>       # Standard owner releases lock
> >>>       res1 = c.unlock_file(1, fh1, res1.lockid)
> >>>       check(res1)
> >>>       # Third owner tries to butt in and steal lock second owner is
> >>> waiting for
> >>>       # Should work, since second owner let his turn expire
> >>>       file = c.homedir + [t.word()]
> >>>       fh3, stateid3 = c.open_confirm(b"owner3", file,
> >>>                                      access=OPEN4_SHARE_ACCESS_BOTH,
> >>>                                      deny=OPEN4_SHARE_DENY_NONE)
> >>>       res3 = c.lock_file(b"owner3", fh3, stateid3,
> >>>                          type=WRITEW_LT, lockowner=b"lockowner3_LOCK20")
> >>>       check(res3, msg="Grabbing lock after another owner let his 'turn'
> >>> expire")
> >>>
> >>> Note that the RFC indicated the client has one lease period AFTER
> >>> the conflicting lock is released to retry while the test case waits
> >>> 1.5 lease period after requesting the blocking lock before it
> >>> releases the conflicting lock...
> >>>
> >>> Am I reading things right?
> >>
> >> I see what you mean.
> >>
> >> But since a waiting blocking lock client obviously doesn't know when
> >> the lock- holding client is going to release its lock, the waiting
> >> client has to start polling regularly as soon as its initial blocking
> >> lock request is denied. It has to poll at least once per lease period.
> >>
> >> If the server notices that a waiting client hasn't polled once in a
> >> lease period, after its initial blocking lock request was denied,
> >> then it seems reasonable for the server to forget that waiting
> >> client's interest in the pending lock there and then. There's no need
> >> for the server to wait a further lease period after the lock is released.
> >>
> >>
> >> Looking at the current Linux nfsd code, that does seem to be what it
> >> does. I see that when the server adds the blocking lock request to
> >> its pending list, it adds the current timestamp to it, i.e. the time that the
> blocking lock was requested.
> >>
> >> The nfsd background clean-up thread (which runs at least once per
> >> lease
> >> period) removes any pending blocking lock requests if a lease period
> >> has passed since they were placed on the list (i.e. when the blocking lock was
> requested).
> >> There's a corresponding comment:
> >>
> >> 	/*
> >> 	 * It's possible for a client to try and acquire an already held lock
> >> 	 * that is being held for a long time, and then lose interest in it.
> >> 	 * So, we clean out any un-revisited request after a lease period
> >> 	 * under the assumption that the client is no longer interested.
> >>
> >> https://elixir.bootlin.com/linux/v6.16/source/fs/nfsd/nfs4state.c#L68
> >> 24
> >>
> >>
> >> There's no pending locks action taken on lock release. The timing is
> >> based solely on when the blocking READW/WRITEW request occurred, i.e.
> >> the res2 WRITEW in the pynfs test, which is before the sleep.
> >>
> >> So, whilst the RFC may seem to suggest the timer should start at lock
> >> release, it doesn't seem unreasonable for the NFS server to start the
> >> timer earlier, at the blocking lock request, to avoid an unnecessary
> >> delay upon lock release if the client has lost interest in the lock, i.e. it isn't
> polling.
> >>
> >>
> >> Presumably, the pynfs test was originally written to match NFS server
> >> behaviour, rather than RFC wording. I'm not sure what other NFS servers do
> in this case.
> >> Waiting longer wouldn't change the test result in this case, I think.
> >>
> >>
> >> Does that seem reasonable to you?
> >>
> >> thanks,
> >> calum.
> >
> 
> --
> Calum Mackay
> Linux Kernel Engineering
> Oracle Linux and Virtualisation



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-08-21 18:03 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-12 16:55 PYNFS LOCK20 Blocking Lock Test Case Frank Filz
2025-08-14  1:29 ` Calum Mackay
2025-08-21 17:30   ` Frank Filz
2025-08-21 17:41     ` Calum Mackay
2025-08-21 17:47       ` Frank Filz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).