linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] Fix RELEASE_LOCKOWNER
@ 2024-02-01 14:21 Chuck Lever
  0 siblings, 0 replies; 11+ messages in thread
From: Chuck Lever @ 2024-02-01 14:21 UTC (permalink / raw)
  To: stable; +Cc: linux-nfs

Passes pynfs, fstests, and the git regression suite. Please apply
these to origin/linux-5.15.y.

---

Chuck Lever (3):
      NFSD: Modernize nfsd4_release_lockowner()
      NFSD: Add documenting comment for nfsd4_release_lockowner()
      From: NeilBrown <neilb@suse.de>


 fs/nfsd/nfs4state.c | 65 +++++++++++++++++++++++++--------------------
 1 file changed, 36 insertions(+), 29 deletions(-)

--
Chuck Lever


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 0/3] Fix RELEASE_LOCKOWNER
@ 2024-02-01 14:23 Chuck Lever
  0 siblings, 0 replies; 11+ messages in thread
From: Chuck Lever @ 2024-02-01 14:23 UTC (permalink / raw)
  To: stable; +Cc: linux-nfs

Passes pynfs, fstests, and the git regression suite. Please apply
these to origin/linux-5.10.y.

---

Chuck Lever (2):
      NFSD: Modernize nfsd4_release_lockowner()
      NFSD: Add documenting comment for nfsd4_release_lockowner()

NeilBrown (1):
      nfsd: fix RELEASE_LOCKOWNER


 fs/nfsd/nfs4state.c | 65 +++++++++++++++++++++++++--------------------
 1 file changed, 36 insertions(+), 29 deletions(-)

--
Chuck Lever


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 0/3] Fix RELEASE_LOCKOWNER
@ 2024-02-01 19:06 Chuck Lever
  2024-02-01 19:06 ` [PATCH 1/3] NFSD: Modernize nfsd4_release_lockowner() Chuck Lever
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Chuck Lever @ 2024-02-01 19:06 UTC (permalink / raw)
  To: stable; +Cc: linux-nfs

Passes pynfs, fstests, and the git regression suite. Please apply
these to origin/linux-5.4.y.

---

Chuck Lever (2):
      NFSD: Modernize nfsd4_release_lockowner()
      NFSD: Add documenting comment for nfsd4_release_lockowner()

NeilBrown (1):
      nfsd: fix RELEASE_LOCKOWNER


 fs/nfsd/nfs4state.c | 65 +++++++++++++++++++++++++--------------------
 1 file changed, 36 insertions(+), 29 deletions(-)

--
Chuck Lever


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3] NFSD: Modernize nfsd4_release_lockowner()
  2024-02-01 19:06 [PATCH 0/3] Fix RELEASE_LOCKOWNER Chuck Lever
@ 2024-02-01 19:06 ` Chuck Lever
  2024-02-01 19:06 ` [PATCH 2/3] NFSD: Add documenting comment for nfsd4_release_lockowner() Chuck Lever
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Chuck Lever @ 2024-02-01 19:06 UTC (permalink / raw)
  To: stable; +Cc: linux-nfs

From: Chuck Lever <chuck.lever@oracle.com>

[ Upstream commit bd8fdb6e545f950f4654a9a10d7e819ad48146e5 ]

Refactor: Use existing helpers that other lock operations use. This
change removes several automatic variables, so re-organize the
variable declarations for readability.

Stable-dep-of: edcf9725150e ("nfsd: fix RELEASE_LOCKOWNER")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs4state.c |   36 +++++++++++-------------------------
 1 file changed, 11 insertions(+), 25 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index a0aa7e63739d..9a77a3eac4ac 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -6873,16 +6873,13 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 			union nfsd4_op_u *u)
 {
 	struct nfsd4_release_lockowner *rlockowner = &u->release_lockowner;
+	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
 	clientid_t *clid = &rlockowner->rl_clientid;
-	struct nfs4_stateowner *sop;
-	struct nfs4_lockowner *lo = NULL;
 	struct nfs4_ol_stateid *stp;
-	struct xdr_netobj *owner = &rlockowner->rl_owner;
-	unsigned int hashval = ownerstr_hashval(owner);
-	__be32 status;
-	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
+	struct nfs4_lockowner *lo;
 	struct nfs4_client *clp;
-	LIST_HEAD (reaplist);
+	LIST_HEAD(reaplist);
+	__be32 status;
 
 	dprintk("nfsd4_release_lockowner clientid: (%08x/%08x):\n",
 		clid->cl_boot, clid->cl_id);
@@ -6890,30 +6887,19 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 	status = lookup_clientid(clid, cstate, nn);
 	if (status)
 		return status;
-
 	clp = cstate->clp;
-	/* Find the matching lock stateowner */
-	spin_lock(&clp->cl_lock);
-	list_for_each_entry(sop, &clp->cl_ownerstr_hashtbl[hashval],
-			    so_strhash) {
 
-		if (sop->so_is_open_owner || !same_owner_str(sop, owner))
-			continue;
-
-		if (atomic_read(&sop->so_count) != 1) {
-			spin_unlock(&clp->cl_lock);
-			return nfserr_locks_held;
-		}
-
-		lo = lockowner(sop);
-		nfs4_get_stateowner(sop);
-		break;
-	}
+	spin_lock(&clp->cl_lock);
+	lo = find_lockowner_str_locked(clp, &rlockowner->rl_owner);
 	if (!lo) {
 		spin_unlock(&clp->cl_lock);
 		return status;
 	}
-
+	if (atomic_read(&lo->lo_owner.so_count) != 2) {
+		spin_unlock(&clp->cl_lock);
+		nfs4_put_stateowner(&lo->lo_owner);
+		return nfserr_locks_held;
+	}
 	unhash_lockowner_locked(lo);
 	while (!list_empty(&lo->lo_owner.so_stateids)) {
 		stp = list_first_entry(&lo->lo_owner.so_stateids,



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/3] NFSD: Add documenting comment for nfsd4_release_lockowner()
  2024-02-01 19:06 [PATCH 0/3] Fix RELEASE_LOCKOWNER Chuck Lever
  2024-02-01 19:06 ` [PATCH 1/3] NFSD: Modernize nfsd4_release_lockowner() Chuck Lever
@ 2024-02-01 19:06 ` Chuck Lever
  2024-02-01 19:06 ` [PATCH 3/3] nfsd: fix RELEASE_LOCKOWNER Chuck Lever
  2024-02-01 22:24 ` [PATCH 0/3] Fix RELEASE_LOCKOWNER NeilBrown
  3 siblings, 0 replies; 11+ messages in thread
From: Chuck Lever @ 2024-02-01 19:06 UTC (permalink / raw)
  To: stable; +Cc: linux-nfs

From: Chuck Lever <chuck.lever@oracle.com>

[ Upstream commit 043862b09cc00273e35e6c3a6389957953a34207 ]

And return explicit nfserr values that match what is documented in the
new comment / API contract.

Stable-dep-of: edcf9725150e ("nfsd: fix RELEASE_LOCKOWNER")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs4state.c |   23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 9a77a3eac4ac..0dfc45d37658 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -6867,6 +6867,23 @@ check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
 	return status;
 }
 
+/**
+ * nfsd4_release_lockowner - process NFSv4.0 RELEASE_LOCKOWNER operations
+ * @rqstp: RPC transaction
+ * @cstate: NFSv4 COMPOUND state
+ * @u: RELEASE_LOCKOWNER arguments
+ *
+ * The lockowner's so_count is bumped when a lock record is added
+ * or when copying a conflicting lock. The latter case is brief,
+ * but can lead to fleeting false positives when looking for
+ * locks-in-use.
+ *
+ * Return values:
+ *   %nfs_ok: lockowner released or not found
+ *   %nfserr_locks_held: lockowner still in use
+ *   %nfserr_stale_clientid: clientid no longer active
+ *   %nfserr_expired: clientid not recognized
+ */
 __be32
 nfsd4_release_lockowner(struct svc_rqst *rqstp,
 			struct nfsd4_compound_state *cstate,
@@ -6893,7 +6910,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 	lo = find_lockowner_str_locked(clp, &rlockowner->rl_owner);
 	if (!lo) {
 		spin_unlock(&clp->cl_lock);
-		return status;
+		return nfs_ok;
 	}
 	if (atomic_read(&lo->lo_owner.so_count) != 2) {
 		spin_unlock(&clp->cl_lock);
@@ -6909,11 +6926,11 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 		put_ol_stateid_locked(stp, &reaplist);
 	}
 	spin_unlock(&clp->cl_lock);
+
 	free_ol_stateid_reaplist(&reaplist);
 	remove_blocked_locks(lo);
 	nfs4_put_stateowner(&lo->lo_owner);
-
-	return status;
+	return nfs_ok;
 }
 
 static inline struct nfs4_client_reclaim *



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3] nfsd: fix RELEASE_LOCKOWNER
  2024-02-01 19:06 [PATCH 0/3] Fix RELEASE_LOCKOWNER Chuck Lever
  2024-02-01 19:06 ` [PATCH 1/3] NFSD: Modernize nfsd4_release_lockowner() Chuck Lever
  2024-02-01 19:06 ` [PATCH 2/3] NFSD: Add documenting comment for nfsd4_release_lockowner() Chuck Lever
@ 2024-02-01 19:06 ` Chuck Lever
  2024-02-01 22:24 ` [PATCH 0/3] Fix RELEASE_LOCKOWNER NeilBrown
  3 siblings, 0 replies; 11+ messages in thread
From: Chuck Lever @ 2024-02-01 19:06 UTC (permalink / raw)
  To: stable; +Cc: linux-nfs

From: NeilBrown <neilb@suse.de>

[ Upstream commit edcf9725150e42beeca42d085149f4c88fa97afd ]

The test on so_count in nfsd4_release_lockowner() is nonsense and
harmful.  Revert to using check_for_locks(), changing that to not sleep.

First: harmful.
As is documented in the kdoc comment for nfsd4_release_lockowner(), the
test on so_count can transiently return a false positive resulting in a
return of NFS4ERR_LOCKS_HELD when in fact no locks are held.  This is
clearly a protocol violation and with the Linux NFS client it can cause
incorrect behaviour.

If RELEASE_LOCKOWNER is sent while some other thread is still
processing a LOCK request which failed because, at the time that request
was received, the given owner held a conflicting lock, then the nfsd
thread processing that LOCK request can hold a reference (conflock) to
the lock owner that causes nfsd4_release_lockowner() to return an
incorrect error.

The Linux NFS client ignores that NFS4ERR_LOCKS_HELD error because it
never sends NFS4_RELEASE_LOCKOWNER without first releasing any locks, so
it knows that the error is impossible.  It assumes the lock owner was in
fact released so it feels free to use the same lock owner identifier in
some later locking request.

When it does reuse a lock owner identifier for which a previous RELEASE
failed, it will naturally use a lock_seqid of zero.  However the server,
which didn't release the lock owner, will expect a larger lock_seqid and
so will respond with NFS4ERR_BAD_SEQID.

So clearly it is harmful to allow a false positive, which testing
so_count allows.

The test is nonsense because ... well... it doesn't mean anything.

so_count is the sum of three different counts.
1/ the set of states listed on so_stateids
2/ the set of active vfs locks owned by any of those states
3/ various transient counts such as for conflicting locks.

When it is tested against '2' it is clear that one of these is the
transient reference obtained by find_lockowner_str_locked().  It is not
clear what the other one is expected to be.

In practice, the count is often 2 because there is precisely one state
on so_stateids.  If there were more, this would fail.

In my testing I see two circumstances when RELEASE_LOCKOWNER is called.
In one case, CLOSE is called before RELEASE_LOCKOWNER.  That results in
all the lock states being removed, and so the lockowner being discarded
(it is removed when there are no more references which usually happens
when the lock state is discarded).  When nfsd4_release_lockowner() finds
that the lock owner doesn't exist, it returns success.

The other case shows an so_count of '2' and precisely one state listed
in so_stateid.  It appears that the Linux client uses a separate lock
owner for each file resulting in one lock state per lock owner, so this
test on '2' is safe.  For another client it might not be safe.

So this patch changes check_for_locks() to use the (newish)
find_any_file_locked() so that it doesn't take a reference on the
nfs4_file and so never calls nfsd_file_put(), and so never sleeps.  With
this check is it safe to restore the use of check_for_locks() rather
than testing so_count against the mysterious '2'.

Fixes: ce3c4ad7f4ce ("NFSD: Fix possible sleep during nfsd4_release_lockowner()")
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Cc: stable@vger.kernel.org # v6.2+
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs4state.c |   26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 0dfc45d37658..bca22325083c 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -6840,14 +6840,16 @@ check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
 {
 	struct file_lock *fl;
 	int status = false;
-	struct nfsd_file *nf = find_any_file(fp);
+	struct nfsd_file *nf;
 	struct inode *inode;
 	struct file_lock_context *flctx;
 
+	spin_lock(&fp->fi_lock);
+	nf = find_any_file_locked(fp);
 	if (!nf) {
 		/* Any valid lock stateid should have some sort of access */
 		WARN_ON_ONCE(1);
-		return status;
+		goto out;
 	}
 
 	inode = locks_inode(nf->nf_file);
@@ -6863,7 +6865,8 @@ check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
 		}
 		spin_unlock(&flctx->flc_lock);
 	}
-	nfsd_file_put(nf);
+out:
+	spin_unlock(&fp->fi_lock);
 	return status;
 }
 
@@ -6873,10 +6876,8 @@ check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
  * @cstate: NFSv4 COMPOUND state
  * @u: RELEASE_LOCKOWNER arguments
  *
- * The lockowner's so_count is bumped when a lock record is added
- * or when copying a conflicting lock. The latter case is brief,
- * but can lead to fleeting false positives when looking for
- * locks-in-use.
+ * Check if theree are any locks still held and if not - free the lockowner
+ * and any lock state that is owned.
  *
  * Return values:
  *   %nfs_ok: lockowner released or not found
@@ -6912,10 +6913,13 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 		spin_unlock(&clp->cl_lock);
 		return nfs_ok;
 	}
-	if (atomic_read(&lo->lo_owner.so_count) != 2) {
-		spin_unlock(&clp->cl_lock);
-		nfs4_put_stateowner(&lo->lo_owner);
-		return nfserr_locks_held;
+
+	list_for_each_entry(stp, &lo->lo_owner.so_stateids, st_perstateowner) {
+		if (check_for_locks(stp->st_stid.sc_file, lo)) {
+			spin_unlock(&clp->cl_lock);
+			nfs4_put_stateowner(&lo->lo_owner);
+			return nfserr_locks_held;
+		}
 	}
 	unhash_lockowner_locked(lo);
 	while (!list_empty(&lo->lo_owner.so_stateids)) {



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] Fix RELEASE_LOCKOWNER
  2024-02-01 19:06 [PATCH 0/3] Fix RELEASE_LOCKOWNER Chuck Lever
                   ` (2 preceding siblings ...)
  2024-02-01 19:06 ` [PATCH 3/3] nfsd: fix RELEASE_LOCKOWNER Chuck Lever
@ 2024-02-01 22:24 ` NeilBrown
  2024-02-02 14:12   ` Chuck Lever III
  3 siblings, 1 reply; 11+ messages in thread
From: NeilBrown @ 2024-02-01 22:24 UTC (permalink / raw)
  To: Chuck Lever; +Cc: stable, linux-nfs

On Fri, 02 Feb 2024, Chuck Lever wrote:
> Passes pynfs, fstests, and the git regression suite. Please apply
> these to origin/linux-5.4.y.

I should have mentioned this a day or two ago but I hadn't quite made
all the connection yet...

The RELEASE_LOCKOWNER bug was masking a double-free bug that was fixed
by
Commit 47446d74f170 ("nfsd4: add refcount for nfsd4_blocked_lock")
which landed in v5.17 and wasn't marked as a bugfix, and so has not gone to
stable kernels.

Any kernel earlier than v5.17 that receives the RELEASE_LOCKOWNER fix
also needs the nfsd4_blocked_lock fix.  There is a minor follow-up fix
for that nfsd4_blocked_lock fix which Chuck queued yesterday.

The problem scenario is that an nfsd4_lock() call finds a conflicting
lock and so has a reference to a particular nfsd4_blocked_lock.  A concurrent
nfsd4_read_lockowner call frees all the nfsd4_blocked_locks including
the one held in nfsd4_lock().  nfsd4_lock then tries to free the
blocked_lock it has, and results in a double-free or a use-after-free.

Before either patch is applied, the extra reference on the lock-owner
than nfsd4_lock holds causes nfsd4_realease_lockowner() to incorrectly
return an error and NOT free the blocks_lock.
With only the RELEASE_LOCKOWNER fix applied, the double-free happens.
With both patches applied the refcount on the nfsd4_blocked_lock prevents
the double-free.

Kernels before 4.9 are (probably) not affected as they didn't have
find_or_allocate_block() which takes the second reference to a shared
object.  But that is ancient history - those kernels are well past EOL.

Thanks,
NeilBrown


> 
> ---
> 
> Chuck Lever (2):
>       NFSD: Modernize nfsd4_release_lockowner()
>       NFSD: Add documenting comment for nfsd4_release_lockowner()
> 
> NeilBrown (1):
>       nfsd: fix RELEASE_LOCKOWNER
> 
> 
>  fs/nfsd/nfs4state.c | 65 +++++++++++++++++++++++++--------------------
>  1 file changed, 36 insertions(+), 29 deletions(-)
> 
> --
> Chuck Lever
> 
> 
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] Fix RELEASE_LOCKOWNER
  2024-02-01 22:24 ` [PATCH 0/3] Fix RELEASE_LOCKOWNER NeilBrown
@ 2024-02-02 14:12   ` Chuck Lever III
  2024-02-03  0:26     ` NeilBrown
  2024-02-03  1:04     ` Greg KH
  0 siblings, 2 replies; 11+ messages in thread
From: Chuck Lever III @ 2024-02-02 14:12 UTC (permalink / raw)
  To: linux-stable; +Cc: Chuck Lever, Linux NFS Mailing List, Neil Brown



> On Feb 1, 2024, at 5:24 PM, NeilBrown <neilb@suse.de> wrote:
> 
> On Fri, 02 Feb 2024, Chuck Lever wrote:
>> Passes pynfs, fstests, and the git regression suite. Please apply
>> these to origin/linux-5.4.y.
> 
> I should have mentioned this a day or two ago but I hadn't quite made
> all the connection yet...
> 
> The RELEASE_LOCKOWNER bug was masking a double-free bug that was fixed
> by
> Commit 47446d74f170 ("nfsd4: add refcount for nfsd4_blocked_lock")
> which landed in v5.17 and wasn't marked as a bugfix, and so has not gone to
> stable kernels.

Then, instructions to stable@vger.kernel.org:

Do not apply the patches I just sent for 5.15, 5.10, and 5.4. I will
apply 47446d74f170, run the tests again, and resend.


> Any kernel earlier than v5.17 that receives the RELEASE_LOCKOWNER fix
> also needs the nfsd4_blocked_lock fix.  There is a minor follow-up fix
> for that nfsd4_blocked_lock fix which Chuck queued yesterday.
> 
> The problem scenario is that an nfsd4_lock() call finds a conflicting
> lock and so has a reference to a particular nfsd4_blocked_lock.  A concurrent
> nfsd4_read_lockowner call frees all the nfsd4_blocked_locks including
> the one held in nfsd4_lock().  nfsd4_lock then tries to free the
> blocked_lock it has, and results in a double-free or a use-after-free.
> 
> Before either patch is applied, the extra reference on the lock-owner
> than nfsd4_lock holds causes nfsd4_realease_lockowner() to incorrectly
> return an error and NOT free the blocks_lock.
> With only the RELEASE_LOCKOWNER fix applied, the double-free happens.

Our test suite currently does not exercise this use case, apparently.
I didn't see a problem like this during testing.


> With both patches applied the refcount on the nfsd4_blocked_lock prevents
> the double-free.
> 
> Kernels before 4.9 are (probably) not affected as they didn't have
> find_or_allocate_block() which takes the second reference to a shared
> object.  But that is ancient history - those kernels are well past EOL.
> 
> Thanks,
> NeilBrown
> 
> 
>> 
>> ---
>> 
>> Chuck Lever (2):
>>      NFSD: Modernize nfsd4_release_lockowner()
>>      NFSD: Add documenting comment for nfsd4_release_lockowner()
>> 
>> NeilBrown (1):
>>      nfsd: fix RELEASE_LOCKOWNER
>> 
>> 
>> fs/nfsd/nfs4state.c | 65 +++++++++++++++++++++++++--------------------
>> 1 file changed, 36 insertions(+), 29 deletions(-)
>> 
>> --
>> Chuck Lever
>> 
>> 
>> 
> 

--
Chuck Lever



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] Fix RELEASE_LOCKOWNER
  2024-02-02 14:12   ` Chuck Lever III
@ 2024-02-03  0:26     ` NeilBrown
  2024-02-03  3:43       ` Chuck Lever III
  2024-02-03  1:04     ` Greg KH
  1 sibling, 1 reply; 11+ messages in thread
From: NeilBrown @ 2024-02-03  0:26 UTC (permalink / raw)
  To: Chuck Lever III; +Cc: linux-stable, Chuck Lever, Linux NFS Mailing List

On Sat, 03 Feb 2024, Chuck Lever III wrote:
> 
> 
> > On Feb 1, 2024, at 5:24 PM, NeilBrown <neilb@suse.de> wrote:
> > 
> > On Fri, 02 Feb 2024, Chuck Lever wrote:
> >> Passes pynfs, fstests, and the git regression suite. Please apply
> >> these to origin/linux-5.4.y.
> > 
> > I should have mentioned this a day or two ago but I hadn't quite made
> > all the connection yet...
> > 
> > The RELEASE_LOCKOWNER bug was masking a double-free bug that was fixed
> > by
> > Commit 47446d74f170 ("nfsd4: add refcount for nfsd4_blocked_lock")
> > which landed in v5.17 and wasn't marked as a bugfix, and so has not gone to
> > stable kernels.
> 
> Then, instructions to stable@vger.kernel.org:
> 
> Do not apply the patches I just sent for 5.15, 5.10, and 5.4. I will
> apply 47446d74f170, run the tests again, and resend.
> 
> 
> > Any kernel earlier than v5.17 that receives the RELEASE_LOCKOWNER fix
> > also needs the nfsd4_blocked_lock fix.  There is a minor follow-up fix
> > for that nfsd4_blocked_lock fix which Chuck queued yesterday.
> > 
> > The problem scenario is that an nfsd4_lock() call finds a conflicting
> > lock and so has a reference to a particular nfsd4_blocked_lock.  A concurrent
> > nfsd4_read_lockowner call frees all the nfsd4_blocked_locks including
> > the one held in nfsd4_lock().  nfsd4_lock then tries to free the
> > blocked_lock it has, and results in a double-free or a use-after-free.
> > 
> > Before either patch is applied, the extra reference on the lock-owner
> > than nfsd4_lock holds causes nfsd4_realease_lockowner() to incorrectly
> > return an error and NOT free the blocks_lock.
> > With only the RELEASE_LOCKOWNER fix applied, the double-free happens.
> 
> Our test suite currently does not exercise this use case, apparently.
> I didn't see a problem like this during testing.
> 

Our OpenQA testing found it (as did our customer :-).
Quoting from a bugzilla that unfortunately is not public (though might
not be accessible to anyone with an account)

https://bugzilla.suse.com/show_bug.cgi?id=1219349

     LTP test nfslock01.sh randomly fails on the latest SLE-15SP4 and
     SLE-15SP5 KOTD.  The failures appear only when testing NFS protocol
     v4.0, other versions do not seem to be affected.  The test either
     gets stuck or sometimes triggers kernel oops.  The contents of the
     kernel backtrace vary.  All archs appear to be affected. 

Does your test suite cover v4.0?  Does it include LTP ?

Thanks,
NeilBrown


> 
> > With both patches applied the refcount on the nfsd4_blocked_lock prevents
> > the double-free.
> > 
> > Kernels before 4.9 are (probably) not affected as they didn't have
> > find_or_allocate_block() which takes the second reference to a shared
> > object.  But that is ancient history - those kernels are well past EOL.
> > 
> > Thanks,
> > NeilBrown
> > 
> > 
> >> 
> >> ---
> >> 
> >> Chuck Lever (2):
> >>      NFSD: Modernize nfsd4_release_lockowner()
> >>      NFSD: Add documenting comment for nfsd4_release_lockowner()
> >> 
> >> NeilBrown (1):
> >>      nfsd: fix RELEASE_LOCKOWNER
> >> 
> >> 
> >> fs/nfsd/nfs4state.c | 65 +++++++++++++++++++++++++--------------------
> >> 1 file changed, 36 insertions(+), 29 deletions(-)
> >> 
> >> --
> >> Chuck Lever
> >> 
> >> 
> >> 
> > 
> 
> --
> Chuck Lever
> 
> 
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] Fix RELEASE_LOCKOWNER
  2024-02-02 14:12   ` Chuck Lever III
  2024-02-03  0:26     ` NeilBrown
@ 2024-02-03  1:04     ` Greg KH
  1 sibling, 0 replies; 11+ messages in thread
From: Greg KH @ 2024-02-03  1:04 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: linux-stable, Chuck Lever, Linux NFS Mailing List, Neil Brown

On Fri, Feb 02, 2024 at 02:12:11PM +0000, Chuck Lever III wrote:
> 
> 
> > On Feb 1, 2024, at 5:24 PM, NeilBrown <neilb@suse.de> wrote:
> > 
> > On Fri, 02 Feb 2024, Chuck Lever wrote:
> >> Passes pynfs, fstests, and the git regression suite. Please apply
> >> these to origin/linux-5.4.y.
> > 
> > I should have mentioned this a day or two ago but I hadn't quite made
> > all the connection yet...
> > 
> > The RELEASE_LOCKOWNER bug was masking a double-free bug that was fixed
> > by
> > Commit 47446d74f170 ("nfsd4: add refcount for nfsd4_blocked_lock")
> > which landed in v5.17 and wasn't marked as a bugfix, and so has not gone to
> > stable kernels.
> 
> Then, instructions to stable@vger.kernel.org:
> 
> Do not apply the patches I just sent for 5.15, 5.10, and 5.4. I will
> apply 47446d74f170, run the tests again, and resend.

Thanks for letting us know, I've dropped them from my review queue.

greg k-h

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] Fix RELEASE_LOCKOWNER
  2024-02-03  0:26     ` NeilBrown
@ 2024-02-03  3:43       ` Chuck Lever III
  0 siblings, 0 replies; 11+ messages in thread
From: Chuck Lever III @ 2024-02-03  3:43 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-stable, Chuck Lever, Linux NFS Mailing List



> On Feb 2, 2024, at 7:26 PM, NeilBrown <neilb@suse.de> wrote:
> 
> On Sat, 03 Feb 2024, Chuck Lever III wrote:
>> 
>> 
>>> On Feb 1, 2024, at 5:24 PM, NeilBrown <neilb@suse.de> wrote:
>>> 
>>> On Fri, 02 Feb 2024, Chuck Lever wrote:
>>>> Passes pynfs, fstests, and the git regression suite. Please apply
>>>> these to origin/linux-5.4.y.
>>> 
>>> I should have mentioned this a day or two ago but I hadn't quite made
>>> all the connection yet...
>>> 
>>> The RELEASE_LOCKOWNER bug was masking a double-free bug that was fixed
>>> by
>>> Commit 47446d74f170 ("nfsd4: add refcount for nfsd4_blocked_lock")
>>> which landed in v5.17 and wasn't marked as a bugfix, and so has not gone to
>>> stable kernels.
>> 
>> Then, instructions to stable@vger.kernel.org:
>> 
>> Do not apply the patches I just sent for 5.15, 5.10, and 5.4. I will
>> apply 47446d74f170, run the tests again, and resend.
>> 
>> 
>>> Any kernel earlier than v5.17 that receives the RELEASE_LOCKOWNER fix
>>> also needs the nfsd4_blocked_lock fix.  There is a minor follow-up fix
>>> for that nfsd4_blocked_lock fix which Chuck queued yesterday.
>>> 
>>> The problem scenario is that an nfsd4_lock() call finds a conflicting
>>> lock and so has a reference to a particular nfsd4_blocked_lock.  A concurrent
>>> nfsd4_read_lockowner call frees all the nfsd4_blocked_locks including
>>> the one held in nfsd4_lock().  nfsd4_lock then tries to free the
>>> blocked_lock it has, and results in a double-free or a use-after-free.
>>> 
>>> Before either patch is applied, the extra reference on the lock-owner
>>> than nfsd4_lock holds causes nfsd4_realease_lockowner() to incorrectly
>>> return an error and NOT free the blocks_lock.
>>> With only the RELEASE_LOCKOWNER fix applied, the double-free happens.
>> 
>> Our test suite currently does not exercise this use case, apparently.
>> I didn't see a problem like this during testing.
>> 
> 
> Our OpenQA testing found it (as did our customer :-).
> Quoting from a bugzilla that unfortunately is not public (though might
> not be accessible to anyone with an account)
> 
> https://bugzilla.suse.com/show_bug.cgi?id=1219349
> 
>     LTP test nfslock01.sh randomly fails on the latest SLE-15SP4 and
>     SLE-15SP5 KOTD.  The failures appear only when testing NFS protocol
>     v4.0, other versions do not seem to be affected.  The test either
>     gets stuck or sometimes triggers kernel oops.  The contents of the
>     kernel backtrace vary.  All archs appear to be affected. 
> 
> Does your test suite cover v4.0?

pynfs covers v4.0 and v4.1.
I ran fstests and the git regression suite with only NFSv4.0.


> Does it include LTP ?

kdevops doesn't include an LTP workflow at this time.


> Thanks,
> NeilBrown
> 
> 
>> 
>>> With both patches applied the refcount on the nfsd4_blocked_lock prevents
>>> the double-free.
>>> 
>>> Kernels before 4.9 are (probably) not affected as they didn't have
>>> find_or_allocate_block() which takes the second reference to a shared
>>> object.  But that is ancient history - those kernels are well past EOL.
>>> 
>>> Thanks,
>>> NeilBrown
>>> 
>>> 
>>>> 
>>>> ---
>>>> 
>>>> Chuck Lever (2):
>>>>     NFSD: Modernize nfsd4_release_lockowner()
>>>>     NFSD: Add documenting comment for nfsd4_release_lockowner()
>>>> 
>>>> NeilBrown (1):
>>>>     nfsd: fix RELEASE_LOCKOWNER
>>>> 
>>>> 
>>>> fs/nfsd/nfs4state.c | 65 +++++++++++++++++++++++++--------------------
>>>> 1 file changed, 36 insertions(+), 29 deletions(-)
>>>> 
>>>> --
>>>> Chuck Lever
>>>> 
>>>> 
>>>> 
>>> 
>> 
>> --
>> Chuck Lever


--
Chuck Lever



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2024-02-03  3:43 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-01 19:06 [PATCH 0/3] Fix RELEASE_LOCKOWNER Chuck Lever
2024-02-01 19:06 ` [PATCH 1/3] NFSD: Modernize nfsd4_release_lockowner() Chuck Lever
2024-02-01 19:06 ` [PATCH 2/3] NFSD: Add documenting comment for nfsd4_release_lockowner() Chuck Lever
2024-02-01 19:06 ` [PATCH 3/3] nfsd: fix RELEASE_LOCKOWNER Chuck Lever
2024-02-01 22:24 ` [PATCH 0/3] Fix RELEASE_LOCKOWNER NeilBrown
2024-02-02 14:12   ` Chuck Lever III
2024-02-03  0:26     ` NeilBrown
2024-02-03  3:43       ` Chuck Lever III
2024-02-03  1:04     ` Greg KH
  -- strict thread matches above, loose matches on Subject: below --
2024-02-01 14:23 Chuck Lever
2024-02-01 14:21 Chuck Lever

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).