From: "'bfields@fieldses.org'" <bfields@fieldses.org>
To: "inoguchi.yuki@fujitsu.com" <inoguchi.yuki@fujitsu.com>
Cc: 'Trond Myklebust' <trondmy@hammerspace.com>,
"'linux-nfs@vger.kernel.org'" <linux-nfs@vger.kernel.org>,
"'neilb@suse.de'" <neilb@suse.de>,
"'mbenjami@redhat.com'" <mbenjami@redhat.com>
Subject: Re: client caching and locks
Date: Wed, 5 Jan 2022 17:03:53 -0500
Message-ID: <20220105220353.GF25384@fieldses.org>
In-Reply-To: <OSZPR01MB7050C5098D47514FFEC2DA82EF4B9@OSZPR01MB7050.jpnprd01.prod.outlook.com>
On Wed, Jan 05, 2022 at 09:31:59AM +0000, inoguchi.yuki@fujitsu.com wrote:
> I have understood. So for cache consistency, full file locking is needed if
> multiple clients can write the different parts of the same file concurrently.
>
> I think this kind of information should be documented in somewhere.
> If it is enough to focus on the file locking, I'm assuming it to be under "DATA AND METADATA COHERENCE"
> section in the nfs man page.

That subsection is kind of outdated. It leads with a discussion of the
(increasingly less relevant) NLM and NSM protocols, and despite being a
subsection of the "DATA AND METADATA COHERENCE" section, never gets
around to talking about coherence.

It also makes it sound like "nolock" only affects NLM, which I don't
think is right.

How about this? I also updated the lock/nolock description and deleted
the end of this subsection since it's redundant with that. And I removed
the bit about using nolock to mount /var with v2/v3, as that seems like a
bit of a niche case at this point. If we still want to document that, I
think it belongs elsewhere.

--b.

diff --git a/utils/mount/nfs.man b/utils/mount/nfs.man
index 3bc18e1b30a9..7db043202fcf 100644
--- a/utils/mount/nfs.man
+++ b/utils/mount/nfs.man
@@ -722,10 +722,10 @@ reports the proper maximum component length to applications
in such cases.
.TP 1.5i
.BR lock " / " nolock
-Selects whether to use the NLM sideband protocol to lock files on the server.
+Selects whether to lock files on the server.
If neither option is specified (or if
.B lock
-is specified), NLM locking is used for this mount point.
+is specified), locks are taken on the server.
When using the
.B nolock
option, applications can lock files,
@@ -733,16 +733,9 @@ but such locks provide exclusion only against other applications
running on the same client.
Remote applications are not affected by these locks.
.IP
-NLM locking must be disabled with the
-.B nolock
-option when using NFS to mount
-.I /var
-because
-.I /var
-contains files used by the NLM implementation on Linux.
-Using the
+The
.B nolock
-option is also required when mounting exports on NFS servers
+option is required when using NFSv2 or NFSv3 to mount servers
that do not support the NLM protocol.
.TP 1.5i
.BR cto " / " nocto
@@ -1486,47 +1479,40 @@ the use of the
.B sync
mount option.
.SS "Using file locks with NFS"
-The Network Lock Manager protocol is a separate sideband protocol
-used to manage file locks in NFS version 2 and version 3.
-To support lock recovery after a client or server reboot,
-a second sideband protocol --
-known as the Network Status Manager protocol --
-is also required.
-In NFS version 4,
-file locking is supported directly in the main NFS protocol,
-and the NLM and NSM sideband protocols are not used.
+The nfs filesystem supports advisory byte-range locks acquired with
+.BR fcntl (2) .
+Locks obtained by
+.BR flock (2)
+are implemented as
+.BR fcntl (2)
+locks.
.P
-In most cases, NLM and NSM services are started automatically,
-and no extra configuration is required.
-Configure all NFS clients with fully-qualified domain names
-to ensure that NFS servers can find clients to notify them of server reboots.
+Locking can also provide cache consistency:
.P
-NLM supports advisory file locks only.
-To lock NFS files, use
-.BR fcntl (2)
-with the F_GETLK and F_SETLK commands.
-The NFS client converts file locks obtained via
-.BR flock (2)
-to advisory locks.
+Before acquiring a file lock, the client revalidates its cached data for
+the file. Before releasing a write lock, the client flushes to the
+server's stable storage any data in the locked range.
.P
-When mounting servers that do not support the NLM protocol,
-or when mounting an NFS server through a firewall
-that blocks the NLM service port,
-specify the
-.B nolock
-mount option. NLM locking must be disabled with the
-.B nolock
-option when using NFS to mount
-.I /var
-because
-.I /var
-contains files used by the NLM implementation on Linux.
+A distributed application running on multiple NFS clients can take a
+read lock for each range that it reads and a write lock for each range that
+it writes. On its own, however, that is insufficient to ensure that
+reads get up-to-date data.
.P
-Specifying the
-.B nolock
-option may also be advised to improve the performance
-of a proprietary application which runs on a single client
-and uses file locks extensively.
+When revalidating caches, the client is unable to reliably determine the
+difference between changes made by other clients and changes it made
+itself. Therefore, such an application would also need to prevent
+concurrent writes from multiple clients, either by taking whole-file
+locks on every write or by some other method.
+.P
+The protocol used for file locking differs between versions. In versions
+before NFSv4, locks are implemented using the Network Lock Manager and
+Network Status Manager protocols. In versions since NFSv4, file locking
+is supported directly in the main NFS protocol.
+.P
+In most cases, NLM and NSM services are started automatically,
+and no extra configuration is required. NFSv2 and NFSv3 clients should
+be configured with fully-qualified domain names
+to ensure that NFS servers can find clients to notify them of server reboots.
.SS "NFS version 4 caching features"
The data and metadata caching behavior of NFS version 4
clients is similar to that of earlier versions.
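
For illustration only (not part of the patch): the "whole-file locks on
every write" approach described in the new text might look roughly like
this from an application. This is a minimal sketch, assuming Python's
fcntl.lockf(), which wraps fcntl(2) byte-range locks. The helper names
locked_write and locked_read are hypothetical, and the sketch says nothing
about how a particular client or server version behaves.

```python
import fcntl
import os


def locked_write(path, offset, data):
    """Write data at offset while holding a write lock on the whole file."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # start=0, len=0 covers the file from byte 0 to EOF, however large.
        # Taking the lock is the point where the client revalidates its cache.
        fcntl.lockf(fd, fcntl.LOCK_EX, 0, 0, os.SEEK_SET)
        try:
            view = memoryview(data)
            written = 0
            while written < len(view):
                written += os.pwrite(fd, view[written:], offset + written)
        finally:
            # Releasing the write lock is the point where dirty data is flushed.
            fcntl.lockf(fd, fcntl.LOCK_UN, 0, 0, os.SEEK_SET)
    finally:
        os.close(fd)


def locked_read(path, offset, length):
    """Read length bytes at offset while holding a read lock on that range."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.lockf(fd, fcntl.LOCK_SH, length, offset, os.SEEK_SET)
        try:
            return os.pread(fd, length, offset)
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN, length, offset, os.SEEK_SET)
    finally:
        os.close(fd)
```

Readers lock only the range they read, while every writer takes the
whole-file lock, so two clients never write to the same file concurrently.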