From: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
To: Jeff Layton <jlayton@poochiereds.net>
Cc: mtk.manpages@gmail.com, NeilBrown <neilb@suse.de>,
"Stefan (metze) Metzmacher" <metze@samba.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
lkml <linux-kernel@vger.kernel.org>,
Ganesha NFS List <nfs-ganesha-devel@lists.sourceforge.net>,
Suresh Jayaraman <sjayaraman@suse.de>,
Trond Myklebust <trond.myklebust@fys.uio.no>,
Christoph Hellwig <hch@infradead.org>,
linux-nfs <linux-nfs@vger.kernel.org>,
"J. Bruce Fields" <bfields@fieldses.org>
Subject: Re: flock() and NFS [Was: Re: [PATCH] locks: rename file-private locks to file-description locks]
Date: Tue, 29 Apr 2014 14:20:03 +0200
Message-ID: <535F98F3.8070101@gmail.com>
In-Reply-To: <20140429073454.220572a8@tlielax.poochiereds.net>

On 04/29/2014 01:34 PM, Jeff Layton wrote:
> On Tue, 29 Apr 2014 11:53:40 +0200
> "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> wrote:
>
>> On 04/29/2014 11:24 AM, NeilBrown wrote:
>>> On Tue, 29 Apr 2014 11:07:16 +0200 "Michael Kerrisk (man-pages)"
>>> <mtk.manpages@gmail.com> wrote:
>>>
>>>> On 04/27/2014 11:28 PM, NeilBrown wrote:
>>>>> On Sun, 27 Apr 2014 13:11:33 +0200 "Michael Kerrisk (man-pages)"
>>>>> <mtk.manpages@gmail.com> wrote:
>>>>>
>>>>>> On Sun, Apr 27, 2014 at 12:04 PM, NeilBrown <neilb@suse.de> wrote:
>>>>>>> On Sun, 27 Apr 2014 11:16:02 +0200 "Michael Kerrisk (man-pages)"
>>>>>>> <mtk.manpages@gmail.com> wrote:
>>>>>>>
>>>>>>>> [Trimming some folk from CC, and adding various NFS people]
>>>>>>>>
>>>>>>>> On 04/27/2014 06:51 AM, NeilBrown wrote:
>>>>>>>>
>>>>>>>> [...]
>>>>>>>>
>>>>>>>>> Note to Michael: The text
>>>>>>>>> flock() does not lock files over NFS.
>>>>>>>>> in flock(2) is no longer accurate. The reality is ... complex.
>>>>>>>>> See nfs(5), and search for "local_lock".
>>>>>>>>
>>>>>>>> Ahhh -- I see:
>>>>>>>> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=5eebde23223aeb0ad2d9e3be6590ff8bbfab0fc2
>>>>>>>>
>>>>>>>> Thanks for the heads up.
>>>>>>>>
>>>>>>>> Just in general, it would be great if the flock(2) and fcntl(2) man pages
>>>>>>>> contained correct details for NFS, of course. So, for example, if there
>>>>>>>> are any current gotchas for NFS and fcntl() byte-range locking, I'd like
>>>>>>>> to add those to the fcntl(2) man page.
>>>>>>>
>>>>>>> The only peculiarities I can think of are:
>>>>>>> - With NFS, locking or unlocking a region forces a flush of any cached data
>>>>>>> for that file (or maybe for the region of the file). I'm not sure if this
>>>>>>> is worth mentioning.
>>>>>>
>>>>>> I agree that it's probably not necessary to mention.
>>>>>>
>>>>>>> - With NFSv4 the client can lose a lock if it is out of contact with the
>>>>>>> server for a period of time. When this happens, any IO to the file by a
>>>>>>> process which "thinks" it holds a lock will fail until that process closes
>>>>>>> and re-opens the file.
>>>>>>> This behaviour is since 3.12. Prior to that the client might lose and
>>>>>>> regain the lock without ever knowing thus potentially risking corruption
>>>>>>> (but only if client and server lost contact for an extended period).
>>>>>>
>>>>>> Do you have a pointer for that commit to 3.12?
>>>>>>
>>>>>
>>>>> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ef1820f9be27b6ad158f433ab38002ab8131db4d
>>>>>
>>>>> did most of the work while the subsequent commit
>>>>>
>>>>> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=f6de7a39c181dfb8a2c534661a53c73afb3081cd
>>>>>
>>>>> changed some details, added some documentation, and inverted the default
>>>>> behaviour.
>>>>
>>>> Thanks for that detail. What do you think of the following text for the
>>>> fcntl(2) man page:
>>>>
>>>> Before Linux 3.12, if an NFS client is out of contact with the
>>>> server for a period of time, it might lose and regain a lock
>>>> without ever being aware of the fact. This scenario
>>>> potentially risks data corruption, since another process might
>>>> acquire a lock in the intervening period and perform file I/O.
>>>> Since Linux 3.12, if the client loses contact with the server,
>>>> any I/O to the file by a process which "thinks" it holds a lock
>>>> will fail until that process closes and reopens the file. A
>>>> kernel parameter, nfs.recover_lost_locks, can be set to 1 to
>>>> obtain the pre-3.12 behavior, whereby the client will attempt
>>>> to recover lost locks when contact is reestablished with the
>>>> server. Because of the attendant risk of data corruption, this
>>>> parameter defaults to 0 (disabled).
>>>>
>>>
>>> Mostly good.
>>>
>>> I'm just a little concerned about "if the client loses contact with the
>>> server" in the middle there. It is no longer qualified and it isn't clear
>>> that the "for a period of time" qualification still applied. And we should
>>> probably quantify the period of time - which defaults to 90 seconds.
>>> I don't remember just now the difference between
>>> /proc/fs/nfsd/nfsv4{lease,grace}time
>>> but this 90 seconds is one of those.
>>>
>>> Also this is NFSv4 specific. With NFSv3 the failure mode is the reverse. If
>>> the server loses contact with a client then any lock stays in place
>>> indefinitely ("why can't I read my mail"... I remember it well).
>>>
>>> Before Linux 3.12, if an NFSv4 client loses contact with the server
>>> (defined as more than 90 seconds with no communication), it might lose
>>> and regain ....
>>
>> Thanks, Neil. Changed as you suggest. I'd quite like to mention
>> which of /proc/fs/nfsd/nfsv4{lease,grace}time is relevant here. I had a
>> quick scan, but could not determine it with complete confidence. My suspicion,
>> looking at fs/lockd/svcproc.c and fs/lockd/grace.c::locks_in_grace()
>> is that it is /proc/fs/nfsd/nfsv4gracetime that is relevant here. Can anyone
>> confirm?
>>
>
> The difference here is subtle. The gracetime is how long after a reboot
> should knfsd allow clients to reclaim state (and deny the creation of
> new locks and opens). The leasetime is how long the NFSv4 lease period
> is. There is a relationship between the two that's illustrated in the
> comments above write_gracetime:
>
> /**
> * write_gracetime - Set or report current NFSv4 grace period time
> *
> * As above, but sets the time of the NFSv4 grace period.
> *
> * Note this should never be set to less than the *previous*
> * lease-period time, but we don't try to enforce this. (In the common
> * case (a new boot), we don't know what the previous lease time was
> * anyway.)
> */
>
> The value you're interested in here is the nfsv4leasetime. If the
> client doesn't renew its lease within that period, then it's subject to
> the server giving up on it and dropping any state that it holds on that
> client's behalf.
>
> Note that this is not a firm timeout. The server runs a job
> periodically to clean out expired stateful objects, and it's likely
> that there is some time (maybe even up to another whole lease period)
> between when the timeout expires and the job actually runs. If the
> client gets a RENEW in there within that window, its lease will be
> renewed and its state preserved.
>
> Also note that all of the above just applies to the Linux knfsd. There
> are many other servers in the field and they have different rules for
> dropping state held by clients that have gone AWOL.

Thanks for the detailed explanation, Jeff. I've updated the draft text to
mention nfsv4leasetime. I won't add the subtleties you mention above
(but they'll go into the commit message).
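
For anyone who wants to check the value on their own server, here is a
minimal Python sketch that reads the lease period from the file Jeff
names. The function name read_lease_seconds is made up for illustration,
and the file exists only on a host running the Linux knfsd; as Jeff
notes, other servers have their own rules.

```python
from pathlib import Path

# Path named in this thread; only present on a host running the Linux knfsd.
LEASETIME_FILE = "/proc/fs/nfsd/nfsv4leasetime"


def read_lease_seconds(path=LEASETIME_FILE):
    """Return the lease period in seconds, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Missing file, no permission, or unexpected contents.
        return None


if __name__ == "__main__":
    lease = read_lease_seconds()
    if lease is None:
        print("no readable nfsv4leasetime here (not a knfsd server?)")
    else:
        print(f"NFSv4 lease period: {lease} seconds")
```

Bear in mind that this is not a firm timeout: the server's cleanup job
may run some time after the lease has expired.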

The text is now:

    Record locking and NFS

    Before Linux 3.12, if an NFSv4 client loses contact with the
    server for a period of time (defined as more than 90 seconds
    with no communication), it might lose and regain a lock without
    ever being aware of the fact. (The period of time after which
    contact is assumed lost is defined by
    /proc/fs/nfsd/nfsv4leasetime, which expresses the period in
    seconds. The default value for this file is 90.) This scenario
    potentially risks data corruption, since another process might
    acquire a lock in the intervening period and perform file I/O.

    Since Linux 3.12, if an NFSv4 client loses contact with the
    server, any I/O to the file by a process which "thinks" it
    holds a lock will fail until that process closes and reopens
    the file. A kernel parameter, nfs.recover_lost_locks, can be
    set to 1 to obtain the pre-3.12 behavior, whereby the client
    will attempt to recover lost locks when contact is
    reestablished with the server. Because of the attendant risk of
    data corruption, this parameter defaults to 0 (disabled).

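From the application side, the "closes and reopens the file" step might
look roughly like the Python sketch below. It is only an illustration:
the name write_locked and the retry count are invented here, and it
assumes the failed I/O shows up as an OSError from the write or the
fsync. The text above says only that the I/O will fail, not with which
error.

```python
import fcntl
import os


def write_locked(path, data, retries=1):
    """Write data at offset 0 under an exclusive fcntl() record lock.

    If the I/O fails, close the file, reopen it, take the lock again
    and retry, up to `retries` extra attempts. Returns the number of
    the attempt that succeeded (0 means the first try).
    """
    for attempt in range(retries + 1):
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX)  # record lock on the whole file
            os.pwrite(fd, data, 0)
            os.fsync(fd)  # push cached data out so a write error surfaces here
            return attempt
        except OSError:
            if attempt == retries:
                raise  # out of retries: let the caller see the failure
        finally:
            os.close(fd)  # closing the descriptor also releases the lock
```

A retry like this only restores the ability to do I/O. Another process
may have taken the lock and changed the file in the meantime, so the
caller should re-read and re-validate any state it had cached.
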
Cheers,
Michael
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/