From: Ichiko Sakamoto <i-sakamoto@pb.jp.nec.com>
To: linux-nfs@vger.kernel.org
Cc: frankvm@frankvm.com, Trond.Myklebust@netapp.com, bfields@fieldses.org
Subject: Re: [NLM] fcntl(F_SETLKW) yields -ENOLCK when grace period expires.
Date: Fri, 16 Mar 2012 19:53:20 +0900
Message-ID: <4F631BA0.7010300@pb.jp.nec.com>
In-Reply-To: <20110805132823.GA32305@janus>
(2011/08/05 22:28), Frank van Maarseveen wrote:
> On Thu, Aug 04, 2011 at 02:17:35PM -0400, Trond Myklebust wrote:
>> On Thu, 2011-08-04 at 19:27 +0200, Frank van Maarseveen wrote:
>> > On Thu, Aug 04, 2011 at 01:10:20PM -0400, Trond Myklebust wrote:
>> > > On Thu, 2011-08-04 at 12:49 -0400, J. Bruce Fields wrote:
>> > > > On Thu, Aug 04, 2011 at 06:43:13PM +0200, Frank van Maarseveen wrote:
>> > > > > On Thu, Aug 04, 2011 at 12:34:52PM -0400, J. Bruce Fields wrote:
>> > > > > > On Thu, Aug 04, 2011 at 12:30:19PM +0200, Frank van Maarseveen wrote:
>> > > > > > > Both client- and server run 2.6.39.3, NFSv3 over UDP (without the
>> > > > > > > relock_filesystem patch proposed earlier).
>> > > > > > >
>> > > > > > > A second client has an exclusive lock on a file on the server. The
>> > > > > > > client under test calls fcntl(F_SETLKW) to wait for the same exclusive
>> > > > > > > lock. Wireshark sees NLM V4 LOCK calls resulting in NLM_BLOCKED.
>> > > > > > >
>> > > > > > > Next the server is rebooted. The second client recovers the lock
>> > > > > > > correctly. The client under test now receives NLM_DENIED_GRACE_PERIOD for
>> > > > > > > every NLM V4 LOCK request resulting from the waiting fcntl(F_SETLKW). When
>> > > > > > > this changes to NLM_BLOCKED after grace period expiration the fcntl
>> > > > > > > returns -ENOLCK ("No locks available.") instead of continuing to wait.
>> > > > > >
>> > > > > > So that sounds like a client bug, and correct behavior from the server
>> > > > > > (assuming the second client was still holding the lock throughout).
>> > > > >
>> > > > > yes.
>> > >
>> > > Is the client actually asking for a blocking lock after the grace period
>> > > expires?
>> >
>> > yes, according to my interpretation of the wireshark output; see reply to Bruce.
>> >
>>
>> OK... Does the following patch help?
>>
>> Cheers
>> Trond
>> ---
>> diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
>> index 8392cb8..40c0d88 100644
>> --- a/fs/lockd/clntproc.c
>> +++ b/fs/lockd/clntproc.c
>> @@ -270,6 +270,9 @@ nlmclnt_call(struct rpc_cred *cred, struct nlm_rqst *req, u32 proc)
>>  			return -ENOLCK;
>>  		msg.rpc_proc = &clnt->cl_procinfo[proc];
>>
>> +		/* Reset the reply status */
>> +		if (argp->block)
>> +			resp->status = nlm_lck_blocked;
>>  		/* Perform the RPC call. If an error occurs, try again */
>>  		if ((status = rpc_call_sync(clnt, &msg, 0)) < 0) {
>>  			dprintk("lockd: rpc_call returned error %d\n", -status);
>>
>
> Negative. I've tried it on the client under test and I'm seeing three
> types of behavior, one good, two bad. In all cases the secondary
> client (unmodified) correctly regains the lock after the server has
> rebooted. Client under test behavior depends on whether it had queued
> the conflicting lock before or after the server reboot. Afterwards it
> seems to work with the above modification (don't know if that was the
> case before though).
>
> When the client under test tries to lock before the server reboot then
> the fcntl(F_SETLKW) returns either right after the NSM NOTIFY with
> -ENOLCK without any NLM traffic or it returns with -ENOLCK when the
> NLM_DENIED_GRACE_PERIOD changes into NLM_BLOCKED (the original report).
>
Hi all,

Was this ever fixed?
I see the same issue on 3.2.9-2.fc16.
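
A minimal sketch of the kind of test program that exercises this path,
assuming a file on an NFSv3 mount that a second client already holds
locked. The helper name lock_whole_file_wait() is hypothetical and this is
not the program used in the original report; it only shows where the
unexpected -ENOLCK would surface in user space.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Request an exclusive lock on the whole file and wait until it is granted.
 * Returns 0 on success or a negative errno value. In the scenario discussed
 * in this thread, the failure shows up here as -ENOLCK.
 */
static int lock_whole_file_wait(int fd)
{
	struct flock fl;

	assert(fd >= 0);

	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;		/* exclusive (write) lock */
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;			/* 0 means "up to end of file" */

	/* F_SETLKW blocks until the lock is granted or an error occurs. */
	if (fcntl(fd, F_SETLKW, &fl) == -1)
		return -errno;
	return 0;
}
```

To use it, open the file on the NFS mount with O_RDWR, call the helper
while another client holds the lock, then reboot the server and watch the
return value.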

When the client receives an NSM NOTIFY, the reclaimer() thread sets
block->b_status to nlm_lck_denied_grace_period.

fs/lockd/clntlock.c
  	/* Now, wake up all processes that sleep on a blocked lock */
  	spin_lock(&nlm_blocked_lock);
  	list_for_each_entry(block, &nlm_blocked, b_list) {
  		if (block->b_host == host) {
* 			block->b_status = nlm_lck_denied_grace_period;
  			wake_up(&block->b_wait);
  		}
  	}
  	spin_unlock(&nlm_blocked_lock);

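The blocked process loops inside nlmclnt_call() during the grace period
and then receives NLM_BLOCKED again.
Because block->b_status is no longer nlm_lck_blocked, the wait condition in
nlmclnt_block() is already true, so nlmclnt_block() copies the stale
block->b_status (== nlm_lck_denied_grace_period) to req->a_res.status.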

fs/lockd/clntlock.c
  	ret = wait_event_interruptible_timeout(block->b_wait,
  			block->b_status != nlm_lck_blocked,
  			timeout);
  	if (ret < 0)
  		return -ERESTARTSYS;
* 	req->a_res.status = block->b_status;
  	return 0;

... and nlmclnt_lock() breaks out of the retry loop and returns -ENOLCK.

fs/lockd/clntproc.c
  		/* Wait on an NLM blocking lock */
  		status = nlmclnt_block(block, req, NLMCLNT_POLL_TIMEOUT);
  		if (status < 0)
  			break;
* 		if (resp->status != nlm_lck_blocked)
* 			break;
  	}
  ...
  	if (resp->status == nlm_lck_denied && (fl_flags & FL_SLEEP))
  		status = -ENOLCK;
  	else
* 		status = nlm_stat_to_errno(resp->status);
  out_unblock:
  	nlmclnt_finish_block(block);
  out:
  	nlmclnt_release_call(req);
* 	return status;

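
The sequence above can be sketched as a small user-space model. This is
only an illustration under simplifying assumptions: every model_* name and
MODEL_* value is hypothetical, there is no real sleeping or RPC, and the
server replies are fed in as an array. It is not the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative stand-ins for the NLM status codes (not the real values). */
enum model_status {
	MODEL_GRANTED,
	MODEL_BLOCKED,
	MODEL_DENIED_GRACE_PERIOD,
};

#define MODEL_STILL_WAITING 1	/* hypothetical "caller keeps waiting" result */

struct model_wait {
	enum model_status b_status;	/* plays the role of block->b_status */
};

/* Models reclaimer(): it overwrites the waiter's status before waking it. */
static void model_reclaimer(struct model_wait *w)
{
	assert(w != NULL);
	w->b_status = MODEL_DENIED_GRACE_PERIOD;
}

/* Models nlmclnt_block(): after the wait, b_status is copied to the reply. */
static enum model_status model_block(const struct model_wait *w)
{
	return w->b_status;
}

/* Models the retry loop in nlmclnt_lock() for a given series of replies. */
static int model_lock(const struct model_wait *w,
		      const enum model_status *replies, size_t n)
{
	enum model_status resp = MODEL_BLOCKED;
	size_t i;

	for (i = 0; i < n; i++) {
		resp = replies[i];
		if (resp == MODEL_DENIED_GRACE_PERIOD) {
			/* The call path retries during the grace period. */
			resp = MODEL_BLOCKED;
			continue;
		}
		if (resp != MODEL_BLOCKED)
			break;
		/* The server said "blocked": wait, then take b_status. */
		resp = model_block(w);
		if (resp != MODEL_BLOCKED)
			break;	/* a stale b_status ends the loop here */
	}

	switch (resp) {
	case MODEL_GRANTED:
		return 0;
	case MODEL_DENIED_GRACE_PERIOD:
		return -ENOLCK;
	default:
		return MODEL_STILL_WAITING;
	}
}
```

In this model, a waiter whose b_status was overwritten by model_reclaimer()
gets -ENOLCK as soon as the replies change from "grace period" to
"blocked", which matches the behaviour described above.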
The following patch works fine on my fc16 system.

--- a/fs/lockd/clntlock.c	2012-01-04 23:55:44.000000000 +0000
+++ b/fs/lockd/clntlock.c	2012-03-16 08:08:03.793687409 +0000
@@ -121,6 +121,7 @@
 int nlmclnt_block(struct nlm_wait *block, struct nlm_rqst *req, long timeout)
 {
 	long ret;
+	u32 nsmstate;
 
 	/* A borken server might ask us to block even if we didn't
 	 * request it. Just say no!
@@ -136,8 +137,10 @@
 	 * a 1 minute timeout would do. See the comment before
 	 * nlmclnt_lock for an explanation.
 	 */
+	nsmstate = block->b_host->h_nsmstate;
 	ret = wait_event_interruptible_timeout(block->b_wait,
-			block->b_status != nlm_lck_blocked,
+			block->b_status != nlm_lck_blocked ||
+			block->b_host->h_nsmstate != nsmstate,
 			timeout);
 	if (ret < 0)
 		return -ERESTARTSYS;
@@ -266,7 +269,6 @@
 	spin_lock(&nlm_blocked_lock);
 	list_for_each_entry(block, &nlm_blocked, b_list) {
 		if (block->b_host == host) {
-			block->b_status = nlm_lck_denied_grace_period;
 			wake_up(&block->b_wait);
 		}
 	}
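
The effect of the changed wait condition can be sketched in user space as
follows. This is a rough model, not kernel code: all sketch_* names are
hypothetical, and it assumes that h_nsmstate changes when the server's
reboot is noticed, which is what the patch relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins for the NLM status codes (not the real values). */
enum sketch_status {
	SKETCH_GRANTED,
	SKETCH_BLOCKED,
};

struct sketch_host {
	unsigned int h_nsmstate;	/* plays the role of host->h_nsmstate */
};

struct sketch_wait {
	enum sketch_status b_status;	/* plays the role of block->b_status */
	const struct sketch_host *b_host;
};

/*
 * Wait predicate as changed by the patch: stop waiting when the status is
 * no longer "blocked" OR when the NSM state differs from the value that
 * was saved just before the wait started.
 */
static bool sketch_wait_done(const struct sketch_wait *w,
			     unsigned int saved_nsmstate)
{
	assert(w != NULL && w->b_host != NULL);
	return w->b_status != SKETCH_BLOCKED ||
	       w->b_host->h_nsmstate != saved_nsmstate;
}
```

With this predicate a reboot still ends the wait, but b_status itself stays
"blocked", so the status copied into the reply lets the retry loop carry on
instead of ending with -ENOLCK.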
Thanks,
Ichiko