From: Rob Gardner <rob.gardner@hp.com>
To: tmtalpey@gmail.com, "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: Huge race in lockd for async lock requests?
Date: Wed, 20 May 2009 10:37:05 -0600
Message-ID: <4A1431B1.6080708@hp.com>
In-Reply-To: <4a140d0a.85c2f10a.53bc.0979-ATjtLOhZ0NVl57MIdRCFDg@public.gmane.org>

Tom Talpey wrote:
> At 02:55 AM 5/20/2009, Rob Gardner wrote:
>
>> Tom Talpey wrote:
>>
>>> At 04:43 PM 5/19/2009, Rob Gardner wrote:
>>>
>>>> I've got a question about lockd in conjunction with a filesystem that
>>>> provides its own (async) locking.
>>>>
>>>> After nlmsvc_lock() calls vfs_lock_file(), it seems to me that we might
>>>> get the async callback (nlmsvc_grant_deferred) at any time. What's to
>>>> stop it from arriving before we even put the block on the nlm_block
>>>> list? If this happens, then nlmsvc_grant_deferred() will print "grant
>>>> for unknown block" and then we'll wait forever for a grant that will
>>>> never come.
>>>>
>>> Yes, there's a race but the client will retry every 30 seconds, so it won't
>>> wait forever.
>>>
>> OK, a blocking lock request will get retried in 30 seconds and work out
>> "ok". But a non-blocking request will get in big trouble. Let's say the
>>
>
> A non-blocking lock doesn't request, and won't get, a callback. So I
> don't understand...
>
>
What do you mean a non-blocking lock doesn't request? Remember that I'm
dealing with a filesystem that provides its own locking functions via
file->f_op->lock(). Such a filesystem might easily defer a non-blocking
lock request and invoke the callback later. At least I don't know of any
rule that says that it can't do this, and clearly the code expects this
possibility:

                case FILE_LOCK_DEFERRED:
                        if (wait)
                                break;
                        /* Filesystem lock operation is in progress
                           Add it to the queue waiting for callback */
                        ret = nlmsvc_defer_lock_rqst(rqstp, block);

>> callback is invoked immediately after the vfs_lock_file call returns
>> FILE_LOCK_DEFERRED. At this point, the block is not on the nlm_block
>> list, so the callback routine will not be able to find it and mark it as
>> granted. Then nlmsvc_lock() will call nlmsvc_defer_lock_rqst(), put the
>> block on the nlm_block list, and eventually the request will time out and
>> the client will get lck_denied. Meanwhile, the lock has actually been
>> granted, but nobody knows about it.
>>
>
> Yes, this can happen, I've seen it too. Again, it's a bug in the protocol
> more than a bug in the clients.

It looks to me like a bug in the server. The server must be able to deal
with async filesystem callbacks happening at any time, however inconvenient.

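As a rough illustration of the ordering described in the quoted text, here
is a minimal userspace sketch in plain C. It is not lockd code: race_block,
race_grant(), race_insert() and race_replay_lost_grant() are hypothetical
names invented for this example. They stand in loosely for an nlm_block,
nlmsvc_grant_deferred(), nlmsvc_insert_block() and the bad interleaving.
The sketch simply replays the callback arriving before the block is on the
list.

```c
#include <stdbool.h>
#include <stddef.h>

#define RACE_MAX_BLOCKS 8

/* Hypothetical stand-in for a blocked lock request (not struct nlm_block). */
struct race_block {
    int  id;
    bool granted;
};

/* Stand-in for the list of blocks that the grant callback searches. */
static struct race_block *race_list[RACE_MAX_BLOCKS];
static size_t race_count;

/* Grant callback: mark the block granted if it is on the list.
 * Returns 0 on success, -1 for "grant for unknown block". */
int race_grant(int id)
{
    for (size_t i = 0; i < race_count; i++) {
        if (race_list[i]->id == id) {
            race_list[i]->granted = true;
            return 0;
        }
    }
    return -1;
}

/* Publish the block so that later callbacks can find it.
 * Returns 0 on success, -1 if the list is full. */
int race_insert(struct race_block *b)
{
    if (race_count >= RACE_MAX_BLOCKS)
        return -1;
    race_list[race_count++] = b;
    return 0;
}

/* Replay the bad ordering: the callback fires before the insert.
 * Returns true if the grant was lost. */
bool race_replay_lost_grant(struct race_block *b)
{
    int cb = race_grant(b->id);   /* arrives too early: block unknown */

    if (race_insert(b) != 0)      /* block is published too late */
        return false;
    return cb != 0 && !b->granted;
}
```

Calling race_replay_lost_grant() on a fresh block reports the grant as
lost: the block ends up on the list with granted still false, which
corresponds to the request that eventually times out even though the
filesystem granted the lock.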
> It gets even worse when retries occur.
> If the reply cache doesn't catch the duplicates (and it never does), all
> heck breaks out.
>
You'll have to explain further what scenario you're talking about. I
don't understand what the reply cache has to do with lockd.

>> by using a semaphore to cover the vfs_lock_file() to
>> nlmsvc_insert_block() sequence in nlmsvc_lock() and also
>> nlmsvc_grant_deferred(). So if the callback arrives at a bad time, it
>> has to wait until the lock actually makes it onto the nlm_block list,
>> and so the status of the lock gets updated properly.
>>
>
> Can you explain this further? If you're implementing the server, how do
> you know your callback "arrives at a bad time", by the DENIED result
> from the client?
>
I sense a little confusion so let me be more precise. By "callback" I am
talking about the callback from the filesystem to lockd via
lock_manager_operations.fl_grant (i.e., nlmsvc_grant_deferred). If this
callback is invoked while the lockd thread is executing code in
nlmsvc_lock() between the call to vfs_lock_file() and the call to
nlmsvc_insert_block(), then the callback routine (nlmsvc_grant_deferred)
will not find the block on the nlm_block list because it's not there
yet, and thus the "grant" is effectively lost. We use a semaphore in
nlmsvc_lock() and nlmsvc_grant_deferred() to avoid this race.
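
To make the idea concrete, here is a hedged userspace sketch of that
serialization. It is not the actual change to lockd, which is not shown in
this message. It substitutes a pthread mutex for the kernel semaphore, and
every name in it (demo_block, demo_lock, demo_grant(), demo_fs_callback(),
demo_request()) is hypothetical. The requester holds the lock from the
point where the filesystem could call back until the block is on the list,
and the callback takes the same lock before searching.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_BLOCKS 8

/* Hypothetical stand-in for a blocked lock request (not struct nlm_block). */
struct demo_block {
    int  id;
    bool granted;
};

/* One mutex plays the role of the semaphore described above. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static struct demo_block *demo_list[DEMO_MAX_BLOCKS];
static size_t demo_count;

/* Grant callback: waits for the lock, then searches the list.
 * Returns 0 if the block was found, -1 for "unknown block". */
int demo_grant(int id)
{
    int ret = -1;

    pthread_mutex_lock(&demo_lock);
    for (size_t i = 0; i < demo_count; i++) {
        if (demo_list[i]->id == id) {
            demo_list[i]->granted = true;
            ret = 0;
            break;
        }
    }
    pthread_mutex_unlock(&demo_lock);
    return ret;
}

/* Simulated filesystem thread that delivers the grant right away. */
static void *demo_fs_callback(void *arg)
{
    const struct demo_block *b = arg;

    return (void *)(intptr_t)demo_grant(b->id);
}

/* Request path: the lock covers "ask the filesystem" through "insert".
 * Returns the callback's result, or -1 on a thread error. */
int demo_request(struct demo_block *b)
{
    pthread_t fs;
    void *res = NULL;

    pthread_mutex_lock(&demo_lock);
    /* From here on the callback may fire at any moment. */
    if (pthread_create(&fs, NULL, demo_fs_callback, b) != 0) {
        pthread_mutex_unlock(&demo_lock);
        return -1;
    }
    sched_yield();  /* widen the window in which the callback can run */
    assert(demo_count < DEMO_MAX_BLOCKS);
    demo_list[demo_count++] = b;
    pthread_mutex_unlock(&demo_lock);   /* now the callback can proceed */

    if (pthread_join(fs, &res) != 0)
        return -1;
    return (int)(intptr_t)res;
}
```

With the lock in place, demo_request() returns 0 and the block ends up
marked granted, however early the callback thread gets scheduled. The
sketch assumes the callback runs in a different thread from the requester;
a callback made synchronously from inside the lock call itself would not
work with this simple scheme.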

Rob Gardner