Linux NFS development
 help / color / mirror / Atom feed
From: Chuck Lever <cel@kernel.org>
To: NeilBrown <neilb@ownmail.net>, Jeff Layton <jlayton@kernel.org>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <dai.ngo@oracle.com>, Tom Talpey <tom@talpey.com>
Cc: <linux-nfs@vger.kernel.org>, NeilBrown <neil@brown.name>,
	Tj <tj.iam.tj@proton.me>
Subject: [PATCH v3] lockd: fix TEST handling when not all permissions are available.
Date: Tue, 28 Apr 2026 15:47:44 -0400	[thread overview]
Message-ID: <20260428194744.2360735-1-cel@kernel.org> (raw)

From: NeilBrown <neil@brown.name>

The F_GETLK fcntl can work with either read access or write access or
both.  It can query F_RDLCK and F_WRLCK locks in either case.

However lockd currently treats F_GETLK similar to F_SETLK in that read
access is required to query an F_RDLCK lock and write access is required
to query a F_WRLCK lock.

This is wrong and can cause problems - e.g.  when qemu accesses a
read-only (e.g. iso) filesystem image over NFS (though why it queries
if it can get a write lock - I don't know.  But it does, and this works
with local filesystems).

So we need TEST requests to be handled differently.  To do this:

- change nlm_do_fopen() to accept O_RDWR as a mode and in that case
  succeed if either a O_RDONLY or O_WRONLY file can be opened.
- change nlm_lookup_file() to accept a mode argument from caller,
  instead of deducing base on lock time, and pass that on to nlm_do_fopen()
- change nlm4svc_retrieve_args() and nlmsvc_retrieve_args() to detect
  TEST requests and pass O_RDWR as a mode to nlm_lookup_file, passing
  the same mode as before for other requests.  Also set
   lock->fl.c.flc_file to whichever file is available for TEST requests.
- change nlmsvc_testlock() to also not calculate the mode, but to use
  whatever was stored in lock->fl.c.flc_file.

This behaviour of lockd - requesting O_WRONLY access to TEST for
exclusive locks - has been present at least since git history began.
However it was hidden until recently because knfsd ignored the access
requested by lockd and required only READ access for all locking
requests (unless the underlying filesystem provided an f_op->open
function which checked access permissions).

The commit mentioned in Fixes: below changed nfsd_permission() to NOT
override the access request for LOCK requests and this exposed the bug
that we are now fixing.

Note that there is another issue that this patch does not address.
The flock(.., LOCK_EX) call is permitted on a read-only file descriptor.
Linux NFS maps this to NLM locking as whole-file byte-range locks.
nfsd will see this as though it were fcntl( F_SETLK (F_WRLCK)) and will
now require write access, which it might not be able to get.
It is not clear if this is a problem in practice, or what the best
solution might be.  So no attempt is made to address it.

Reported-by: Tj <tj.iam.tj@proton.me>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1128861
Fixes: 4cc9b9f2bf4d ("nfsd: refine and rename NFSD_MAY_LOCK")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/lockd/lockd.h    |  2 +-
 fs/lockd/svc4proc.c |  9 +++++++--
 fs/lockd/svclock.c  |  4 +---
 fs/lockd/svcproc.c  | 15 ++++++++++++---
 fs/lockd/svcsubs.c  | 31 +++++++++++++++++++++----------
 5 files changed, 42 insertions(+), 19 deletions(-)

Changes since v2:
- Rebase on nfsd-testing (v7.1-rc1 ++)
- Handle "drop reply" properly

diff --git a/fs/lockd/lockd.h b/fs/lockd/lockd.h
index a7c85ab6d4b5..1db6cb352542 100644
--- a/fs/lockd/lockd.h
+++ b/fs/lockd/lockd.h
@@ -332,7 +332,7 @@ int		  nlmsvc_dispatch(struct svc_rqst *rqstp);
  * File handling for the server personality
  */
 __be32		  nlm_lookup_file(struct svc_rqst *, struct nlm_file **,
-					struct nlm_lock *);
+				  struct nlm_lock *, int);
 void		  nlm_release_file(struct nlm_file *);
 void		  nlmsvc_put_lockowner(struct nlm_lockowner *);
 void		  nlmsvc_release_lockowner(struct nlm_lock *);
diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
index 5de41e249534..41cab858de57 100644
--- a/fs/lockd/svc4proc.c
+++ b/fs/lockd/svc4proc.c
@@ -146,8 +146,11 @@ nlm4svc_lookup_file(struct svc_rqst *rqstp, struct nlm_host *host,
 		    struct nlm_lock *lock, struct nlm_file **filp,
 		    struct nlm4_lock *xdr_lock, unsigned char type)
 {
+	bool is_test = (rqstp->rq_proc == NLMPROC4_TEST ||
+			rqstp->rq_proc == NLMPROC4_TEST_MSG);
 	struct file_lock *fl = &lock->fl;
 	struct nlm_file *file = NULL;
+	int mode;
 	__be32 error;
 
 	if (xdr_lock->fh.len > NFS_MAXFHSIZE)
@@ -170,7 +173,8 @@ nlm4svc_lookup_file(struct svc_rqst *rqstp, struct nlm_host *host,
 	fl->c.flc_type = type;
 	lockd_set_file_lock_range4(fl, lock->lock_start, lock->lock_len);
 
-	error = nlm_lookup_file(rqstp, &file, lock);
+	mode = is_test ? O_RDWR : lock_to_openmode(fl);
+	error = nlm_lookup_file(rqstp, &file, lock, mode);
 	switch (error) {
 	case nlm_granted:
 		break;
@@ -184,7 +188,8 @@ nlm4svc_lookup_file(struct svc_rqst *rqstp, struct nlm_host *host,
 	*filp = file;
 
 	fl->c.flc_flags = FL_POSIX;
-	fl->c.flc_file = file->f_file[lock_to_openmode(fl)];
+	fl->c.flc_file = is_test ? nlmsvc_file_file(file)
+				 : file->f_file[mode];
 	fl->c.flc_pid = current->tgid;
 	fl->fl_lmops = &nlmsvc_lock_operations;
 	nlmsvc_locks_init_private(fl, host, (pid_t)lock->svid);
diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index b98b1d0ada35..f4520149d6d7 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -613,7 +613,6 @@ nlmsvc_testlock(struct svc_rqst *rqstp, struct nlm_file *file,
 		struct nlm_lock *conflock)
 {
 	int			error;
-	int			mode;
 	__be32			ret;
 
 	dprintk("lockd: nlmsvc_testlock(%s/%llu, ty=%d, %Ld-%Ld)\n",
@@ -631,14 +630,13 @@ nlmsvc_testlock(struct svc_rqst *rqstp, struct nlm_file *file,
 		goto out;
 	}
 
-	mode = lock_to_openmode(&lock->fl);
 	locks_init_lock(&conflock->fl);
 	/* vfs_test_lock only uses start, end, and owner, but tests flc_file */
 	conflock->fl.c.flc_file = lock->fl.c.flc_file;
 	conflock->fl.fl_start = lock->fl.fl_start;
 	conflock->fl.fl_end = lock->fl.fl_end;
 	conflock->fl.c.flc_owner = lock->fl.c.flc_owner;
-	error = vfs_test_lock(file->f_file[mode], &conflock->fl);
+	error = vfs_test_lock(lock->fl.c.flc_file, &conflock->fl);
 	if (error) {
 		ret = nlm_lck_denied_nolocks;
 		goto out;
diff --git a/fs/lockd/svcproc.c b/fs/lockd/svcproc.c
index 749abf8886ba..c0a3487719e2 100644
--- a/fs/lockd/svcproc.c
+++ b/fs/lockd/svcproc.c
@@ -68,6 +68,8 @@ nlmsvc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
 	struct nlm_host		*host = NULL;
 	struct nlm_file		*file = NULL;
 	struct nlm_lock		*lock = &argp->lock;
+	bool			is_test = (rqstp->rq_proc == NLMPROC_TEST ||
+					   rqstp->rq_proc == NLMPROC_TEST_MSG);
 	int			mode;
 	__be32			error = 0;
 
@@ -83,15 +85,22 @@ nlmsvc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
 
 	/* Obtain file pointer. Not used by FREE_ALL call. */
 	if (filp != NULL) {
-		error = cast_status(nlm_lookup_file(rqstp, &file, lock));
+		mode = lock_to_openmode(&lock->fl);
+
+		if (is_test)
+			mode = O_RDWR;
+
+		error = cast_status(nlm_lookup_file(rqstp, &file, lock, mode));
 		if (error != 0)
 			goto no_locks;
 		*filp = file;
 
 		/* Set up the missing parts of the file_lock structure */
-		mode = lock_to_openmode(&lock->fl);
 		lock->fl.c.flc_flags = FL_POSIX;
-		lock->fl.c.flc_file  = file->f_file[mode];
+		if (is_test)
+			lock->fl.c.flc_file = nlmsvc_file_file(file);
+		else
+			lock->fl.c.flc_file = file->f_file[mode];
 		lock->fl.c.flc_pid = current->tgid;
 		lock->fl.fl_lmops = &nlmsvc_lock_operations;
 		nlmsvc_locks_init_private(&lock->fl, host, (pid_t)lock->svid);
diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
index 344e6c187cde..9da9d6e0b42e 100644
--- a/fs/lockd/svcsubs.c
+++ b/fs/lockd/svcsubs.c
@@ -83,23 +83,36 @@ int lock_to_openmode(struct file_lock *lock)
  *
  * We have to make sure we have the right credential to open
  * the file.
+ *
+ * @mode is O_RDONLY, O_WRONLY, or O_RDWR. O_RDWR means success
+ * is achieved with EITHER O_RDONLY or O_WRONLY; it does not
+ * require both.
  */
 static __be32 nlm_do_fopen(struct svc_rqst *rqstp,
 			   struct nlm_file *file, int mode)
 {
-	struct file **fp = &file->f_file[mode];
-	__be32 nlmerr = nlm_granted;
+	__be32 nlmerr = nlm__int__failed;
+	__be32 deferred = 0;
 	int error;
+	int m;
 
-	if (*fp)
-		return nlmerr;
+	for (m = O_RDONLY; m <= O_WRONLY; m++) {
+		struct file **fp = &file->f_file[m];
+
+		if (mode != O_RDWR && mode != m)
+			continue;
+		if (*fp)
+			return nlm_granted;
+
+		error = nlmsvc_ops->fopen(rqstp, &file->f_handle, fp, m);
+		if (!error)
+			return nlm_granted;
 
-	error = nlmsvc_ops->fopen(rqstp, &file->f_handle, fp, mode);
-	if (error) {
 		dprintk("lockd: open failed (errno %d)\n", error);
 		switch (error) {
 		case -EWOULDBLOCK:
 			nlmerr = nlm__int__drop_reply;
+			deferred = nlmerr;
 			break;
 		case -ESTALE:
 			nlmerr = nlm__int__stale_fh;
@@ -110,7 +123,7 @@ static __be32 nlm_do_fopen(struct svc_rqst *rqstp,
 		}
 	}
 
-	return nlmerr;
+	return deferred ? deferred : nlmerr;
 }
 
 /*
@@ -119,17 +132,15 @@ static __be32 nlm_do_fopen(struct svc_rqst *rqstp,
  */
 __be32
 nlm_lookup_file(struct svc_rqst *rqstp, struct nlm_file **result,
-					struct nlm_lock *lock)
+		struct nlm_lock *lock, int mode)
 {
 	struct nlm_file	*file;
 	unsigned int	hash;
 	__be32		nfserr;
-	int		mode;
 
 	nlm_debug_print_fh("nlm_lookup_file", &lock->fh);
 
 	hash = file_hash(&lock->fh);
-	mode = lock_to_openmode(&lock->fl);
 
 	/* Lock file table */
 	mutex_lock(&nlm_file_mutex);
-- 
2.53.0


                 reply	other threads:[~2026-04-28 19:47 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260428194744.2360735-1-cel@kernel.org \
    --to=cel@kernel.org \
    --cc=dai.ngo@oracle.com \
    --cc=jlayton@kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neil@brown.name \
    --cc=neilb@ownmail.net \
    --cc=okorniev@redhat.com \
    --cc=tj.iam.tj@proton.me \
    --cc=tom@talpey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox